PT - JOURNAL ARTICLE
AU - Rudd, Meghan L.
AU - Hansen, Nancy F.
AU - Zhang, Xiaolu
AU - Urick, Mary Ellen
AU - Zhang, Suiyuan
AU - Merino, Maria J.
AU - National Institutes of Health Intramural Sequencing Center Comparative Sequencing Program
AU - Mullikin, James C.
AU - Brody, Lawrence C.
AU - Bell, Daphne W.
TI - KLF3 and PAX6 are candidate driver genes in late-stage, MSI-hypermutated endometrioid endometrial carcinomas
AID - 10.1101/2021.04.26.21256125
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.04.26.21256125
4099 - http://medrxiv.org/content/early/2021/04/30/2021.04.26.21256125.short
4100 - http://medrxiv.org/content/early/2021/04/30/2021.04.26.21256125.full
AB - Endometrioid endometrial carcinomas (EECs) are the most common histological subtype of uterine cancer. Late-stage disease is an adverse prognosticator for EEC. The purpose of this study was to analyze EEC exome mutation data to identify late-stage-specific statistically significantly mutated genes (SMGs), which represent candidate driver genes potentially associated with disease progression. We exome sequenced 15 late-stage (stage III or IV) non-ultramutated EECs and paired non-tumor DNAs; somatic variants were called using Strelka, Shimmer, Somatic Sniper and MuTect. Additionally, somatic mutation calls were extracted from The Cancer Genome Atlas (TCGA) data for 66 late-stage and 270 early-stage (stage I or II) non-ultramutated EECs. MutSigCV (v1.4) was used to annotate SMGs in the two late-stage cohorts and to derive p-values for all mutated genes in the early-stage cohort. To test whether late-stage SMGs are statistically significantly mutated in early-stage tumors, q-values for late-stage SMGs were re-calculated from the MutSigCV (v1.4) early-stage p-values, adjusting for the number of late-stage SMGs tested. We identified 14 SMGs in the combined late-stage EEC cohorts.
When the 14 late-stage SMGs were examined in the TCGA early-stage data, only KLF3 and PAX6 failed to reach significance as early-stage SMGs, despite the inclusion of enough early-stage cases to ensure adequate statistical power. Within TCGA, nonsynonymous mutations in KLF3 and PAX6 were, respectively, exclusive or nearly exclusive to the microsatellite instability (MSI)-hypermutated molecular subgroup and were dominated by insertions-deletions at homopolymer tracts. In conclusion, our findings are hypothesis-generating and suggest that KLF3 and PAX6, which encode transcription factors, are MSI target genes and late-stage-specific SMGs in EEC.
Competing Interest Statement: DWB receives royalty income from Esoterix Genetic Laboratories resulting from the licensing of U.S. patent No. 7,294,468, which is unrelated to the study reported in this manuscript.
Funding Statement: This work was funded by the Intramural Program of the National Human Genome Research Institute at the National Institutes of Health, grant number ZIA HG200338, Principal Investigator DWB. No payment or services were received from a third party for any aspect of the submitted work.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The National Institutes of Health Office of Human Subjects Research Protections determined that this research was not human subject research, per the Common Rule (45 CFR 46). Anonymized data were analyzed.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Exome sequencing data for the NHGRI tumor-normal cohort have been deposited in dbGaP under controlled access: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001153.v1.p1
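
The abstract describes re-calculating q-values for the late-stage SMGs from early-stage p-values, adjusting only for the number of late-stage SMGs tested. The record does not name the adjustment procedure, so the following is a minimal sketch that assumes a Benjamini-Hochberg correction restricted to the tested genes. The function name `bh_qvalues`, the placeholder gene names, the p-values, and the 0.1 cutoff are all hypothetical and are not taken from the study.

```python
from typing import Dict


def bh_qvalues(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Benjamini-Hochberg q-values computed over only the genes supplied.

    Assumption: the number of tests m is the size of the restricted gene
    set (e.g. the late-stage SMGs), not every gene in the exome.
    """
    m = len(pvalues)
    # Rank genes by ascending p-value.
    ranked = sorted(pvalues.items(), key=lambda kv: kv[1])
    qvalues: Dict[str, float] = {}
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        gene, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        qvalues[gene] = running_min
    return qvalues


# Hypothetical early-stage p-values for a restricted gene set
# (placeholders for illustration, not data from the study).
early_stage_p = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.03, "GENE_D": 0.5}

q = bh_qvalues(early_stage_p)

# Hypothetical significance cutoff; the abstract does not state one.
significant = {gene for gene, value in q.items() if value < 0.1}
```

Restricting the correction to the tested genes keeps the multiple-testing burden proportional to the question being asked: a gene that fails to reach significance under this lighter correction would also fail under an exome-wide Benjamini-Hochberg correction.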