Huntingtin gene CAG repeat size in patients with Lynch syndrome
===============================================================

* Karin Dalene Skarping
* Åsa Petersén
* Samuel Gebre-Medhin

## Abstract

Patients with Lynch syndrome (LS) are prone to cancer due to heterozygous germline pathogenic variants in genes encoding the DNA mismatch repair proteins MLH1, MSH2, MSH6 and PMS2. LS cancer cells exhibit deficient DNA mismatch repair and microsatellite instability due to somatic inactivation of the second copy of the affected gene. To study microsatellite characteristics in non-neoplastic cells in LS, we determined CAG repeat size in the huntingtin gene (*HTT*) microsatellite in lymphocyte DNA from LS patients with germline pathogenic variants in *MLH1* (n = 11), *MSH2* (n = 9) or *MSH6* (n = 7) and from non-LS controls (n = 19). Mean repeat size in LS was 19.55 CAG (*MLH1*), 19.39 CAG (*MSH2*) and 18.07 CAG (*MSH6*), compared to 18.42 CAG in controls. The standard deviation of CAG repeat size in LS was 4.183 CAG (*MLH1*), 5.089 CAG (*MSH2*) and 3.075 CAG (*MSH6*), compared to 3.342 CAG in controls. Peak CAG repeat size in LS was 32 CAG (*MLH1*), 32 CAG (*MSH2*) and 24 CAG (*MSH6*), compared to 27 CAG in controls. Collectively, our data indicate that *HTT* CAG repeat size tends to be larger and more variable in individuals with LS caused by pathogenic variants in *MLH1* and *MSH2*.

## Introduction

Lynch syndrome (LS) is a multiorgan cancer predisposition syndrome caused by germline heterozygous pathogenic variants (PV) in the mismatch repair (MMR) genes *MLH1, MSH2, MSH6* or *PMS2* 1. LS cancers exhibit deficient MMR (dMMR) and microsatellite instability (MSI) due to somatic inactivation of the remaining allele of the affected MMR gene 1. In the vast majority of patients, the underlying germline PV is inherited, and as a consequence there is often an accumulation of LS cancers (predominantly colon cancer and uterine cancer) in the affected family branch.
Average age at onset is lower for LS cancers than for sporadic cancers. In addition, anticipation, i.e. lower age at disease onset in subsequent generations, has been reported in LS 2,3, although the phenomenon has been questioned in the absence of a mechanistic explanation or attributed to a birth-cohort bias 4,5. Yet, as the MMR system acts ubiquitously to maintain nuclear genome stability 6, dysfunctional MMR alleles could conceivably contribute to an increased mutational load, including in germ cells, and acquired genetic aberrations could hence be transmitted to the next generation. Indeed, the state of haploidy in gametes implies that half of the germ cells in patients with LS have no functional copy of the affected MMR gene and could therefore be subject to dMMR. However, a recent whole genome sequencing (WGS) effort revealed no evidence for altered mutational load in non-neoplastic tissue in LS patients 7. Since WGS technologies still lack sufficient resolution in regions with short tandem repeats (STR), i.e. in regions with microsatellites, we have herein used a PCR-based, clinical-grade, high-resolution DNA fragment analysis to determine the range of CAG repeats in the huntingtin gene (*HTT*) microsatellite in patients with LS and matched controls.

## Materials and Methods

Genomic DNA samples extracted from blood from patients investigated for non-polyposis hereditary colorectal cancer (CRC) in the Southern health care region in Sweden during 1997-2012, and shown to have either a dMMR CRC and a PV diagnostic for LS or an MMR-proficient CRC (referred to as controls in the present study), were retrieved from the Skåne University Hospital biobank and anonymized. LS patients with PV in *PMS2* were not included in this study for reasons of integrity, as there were too few such cases in the clinical registry.
*HTT* CAG repeat size estimation was performed using PCR amplification and capillary electrophoresis fragment analysis with a validated accuracy of ± 1 CAG repeat for alleles with < 45 repeats and ± 3 CAG repeats for alleles with 45 or more repeats, as described 8. *HTT* was chosen as it contains a well-characterized STR known to harbor pathogenic CAG repeat expansions causative for Huntington disease (HD) and since the pathogenic *HTT* CAG repeat is thought to further expand somatically in an MMR-dependent manner 9.

### Statistical analyses

*HTT* CAG repeat values were converted to integers according to clinical genetic laboratory diagnostic routines for Huntington disease 8. The CAG repeat size estimation error of ± 1 was not taken into account in the statistical calculations. Due to the limited number of patients and because the histogram data were not fully normally distributed, Mann-Whitney U and Kruskal-Wallis tests were used. CAG repeat size was analyzed either as unpaired values (independent alleles) or as paired values (i.e. as the sum of CAG repeats for both *HTT* alleles in each patient). A CAG repeat size of 18.42 triplets was set as baseline in accordance with values from control patients in the present study and in line with published data from 7379 unselected individuals in Sweden (Sundblom et al., 2020). SPSS Statistics for Windows (SPSS Inc., Chicago, Ill., USA) software was used. *P*-values < 0.05 were considered significant.

### Ethical approvals

Approvals and decisions were received from The Regional Ethical Review Board in Lund (application no. 2013/468) and from the Swedish Ethical Review Agency (application no. 2019-02312 and application no. 2021-06254-02).

## Results

CAG repeat size was determined for both *HTT* alleles in LS patients with PV in *MLH1* (n = 11), *MSH2* (n = 9), *MSH6* (n = 7) and in control patients (n = 19) (Table 1, Figure 1).
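
The distinction between unpaired and paired analysis described under Statistical analyses can be illustrated with a minimal Python sketch. It is not the SPSS workflow used in this study: the allele values are hypothetical placeholders rather than patient data, the function names are invented for illustration, the sample standard deviation (n - 1 denominator) is an assumption, and only the Mann-Whitney U statistic is computed, without a *P*-value.

```python
from statistics import mean, stdev

# Hypothetical (allele 1, allele 2) CAG repeat counts per patient.
# These are illustrative placeholders, NOT data from the study.
group_a = [(17, 22), (15, 19), (18, 27), (17, 17)]
group_b = [(17, 18), (16, 20), (15, 23), (17, 19)]


def unpaired(pairs):
    """Unpaired analysis: every allele is an independent observation."""
    return [allele for pair in pairs for allele in pair]


def paired_sums(pairs):
    """Paired analysis: sum the two allele sizes within each patient."""
    return [allele_1 + allele_2 for allele_1, allele_2 in pairs]


def summarize(values):
    """Mean, sample standard deviation (assumed n - 1) and peak value."""
    return {"mean": mean(values), "sd": stdev(values), "peak": max(values)}


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for x versus y; each tie counts as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


print("unpaired A:", summarize(unpaired(group_a)))
print("paired A:  ", summarize(paired_sums(group_a)))
print("U (A vs B, unpaired):", mann_whitney_u(unpaired(group_a), unpaired(group_b)))
```

Because each paired value is the sum of a patient's two alleles, the mean of the paired sums is always twice the unpaired mean, which agrees with the values reported in this section (for example, 2 × 18.42 = 36.84 CAG in controls). The standard deviations do not scale in the same simple way, since they depend on how the two alleles co-vary within patients. Where *P*-values are needed, `scipy.stats.mannwhitneyu` and `scipy.stats.kruskal` provide the corresponding tests.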
Mean CAG repeat size for unpaired *HTT* alleles in the LS subgroups was 19.55 CAG (*MLH1*), 19.39 CAG (*MSH2*) and 18.07 CAG (*MSH6*), compared to 18.42 CAG in controls (Figure 2). Standard deviation (SD) of CAG repeat size for unpaired *HTT* alleles in the LS subgroups was 4.183 CAG (*MLH1*), 5.089 CAG (*MSH2*) and 3.075 CAG (*MSH6*), compared to 3.342 CAG in controls (Figure 2). SD of the sum of CAG repeat size for paired *HTT* alleles in the LS subgroups was 6.640 CAG (*MLH1*), 6.667 CAG (*MSH2*) and 3.761 CAG (*MSH6*), compared to 4.375 CAG in controls (Figure 3). Peak CAG repeat size in the LS subgroups was 32 CAG (*MLH1*), 32 CAG (*MSH2*) and 24 CAG (*MSH6*), compared to 27 CAG in controls (Table 1; Figure 1). Mean sum of CAG repeat size for paired *HTT* alleles in the LS subgroups was 39.09 CAG (*MLH1*), 38.78 CAG (*MSH2*) and 36.14 CAG (*MSH6*), compared to 36.84 CAG in controls (Figure 3). Differences observed between the LS subgroups or between LS subgroups and controls were not statistically significant (data not shown).

[Table 1.](http://medrxiv.org/content/early/2022/05/29/2022.05.28.22275723/T1) Summary of *HTT* CAG repeat size data for all patients in the study. Patient ID includes Lynch syndrome genetic subcategory (i.e. pathogenic variant in *MLH1, MSH2* or *MSH6*, respectively) and controls (bottom). Clinical value refers to the CAG repeat size that would have been reported in a clinical laboratory routine. Column headed Sum denotes the sum of CAG repeat size for allele 1 and allele 2. Column headed Difference denotes the difference in CAG repeat size between allele 2 and allele 1.

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2022/05/29/2022.05.28.22275723/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2022/05/29/2022.05.28.22275723/F1) Graphical representation of *HTT* CAG repeat size in all patients.
Bar representation of allele 1 (upper end of bar) and allele 2 (lower end of bar) CAG repeat size for each patient in the study. Bars are grouped for patients with Lynch syndrome due to pathogenic variants in *MLH1* (blue field), *MSH2* (green field), *MSH6* (yellow field) and control patients (red field). Mean CAG repeat size for control patients is shown (dashed line).

![Figure 2.](https://www.medrxiv.org/content/medrxiv/early/2022/05/29/2022.05.28.22275723/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2022/05/29/2022.05.28.22275723/F2) Statistics of CAG repeat size for unpaired *HTT* alleles. Statistical evaluation of CAG repeat size data for unpaired *HTT* alleles in patients with Lynch syndrome with a pathogenic variant in *MLH1* (a), *MSH2* (b), *MSH6* (c), and controls (d).

![Figure 3.](https://www.medrxiv.org/content/medrxiv/early/2022/05/29/2022.05.28.22275723/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2022/05/29/2022.05.28.22275723/F3) Statistics of CAG repeat size for paired *HTT* alleles. Statistical evaluation of the sum of CAG repeat size, i.e. paired *HTT* alleles, in patients with Lynch syndrome with a pathogenic variant in *MLH1* (a), *MSH2* (b), *MSH6* (c), and controls (d).

## Conclusion

In this study we hypothesized that heterozygosity for PV in MMR genes could have an impact on microsatellite size in non-neoplastic cells. For this purpose we studied CAG repeat size in the *HTT* microsatellite in genetic subgroups of patients with LS. We found that CAG repeat size in LS patients with PV in *MLH1* and *MSH2* tended to be larger and more variable compared to patients with a PV in *MSH6* and non-LS controls. The differences observed between the groups were, however, not statistically significant.
The study of a larger group of patients with LS or inter-generational studies of LS family members could possibly clarify whether or not PV in MMR genes affect microsatellite length in germline DNA.

## Data Availability

All data produced in the present study are available upon reasonable request to the authors.

## Acknowledgements

We wish to thank the Department of Clinical Genetics and Pathology, Office for Medical Service for support regarding personnel, equipment and materials in this study.

* Received May 28, 2022.
* Revision received May 28, 2022.
* Accepted May 29, 2022.
* © 2022, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/)

## References

1. Curtius K, Gupta S, Boland CR. Review article: Lynch Syndrome-a mechanistic and clinical management update. Aliment Pharmacol Ther. Apr 2022;55(8):960–977. doi:10.1111/apt.16826
2. Nilbert M, Timshel S, Bernstein I, Larsen K. Role for genetic anticipation in Lynch syndrome. J Clin Oncol. Jan 20 2009;27(3):360–4. doi:10.1200/JCO.2008.16.1281
3. von Salome J, Boonstra PS, Karimi M, et al. Genetic anticipation in Swedish Lynch syndrome families. PLoS Genet. Oct 2017;13(10):e1007012. doi:10.1371/journal.pgen.1007012
4. Gruber SB, Mukherjee B. Anticipation in lynch syndrome: still waiting for the answer. J Clin Oncol. Jan 20 2009;27(3):326–7. doi:10.1200/JCO.2008.19.1445
5. Stupart D, Goldberg P, Algar U, Vorster A, Ramesar R. No evidence of genetic anticipation in a large family with Lynch syndrome. Fam Cancer. Mar 2014;13(1):29–34. doi:10.1007/s10689-013-9669-0
6. Lujan SA, Kunkel TA. Stability across the Whole Nuclear Genome in the Presence and Absence of DNA Mismatch Repair. Cells. May 17 2021;10(5). doi:10.3390/cells10051224
7. Lee BCH, Robinson PS, Coorens THH, et al. Mutational landscape of normal epithelial cells in Lynch Syndrome patients. Nat Commun. May 17 2022;13(1):2710. doi:10.1038/s41467-022-29920-2
8. Losekoot M, van Belzen MJ, Seneca S, et al. EMQN/CMGS best practice guidelines for the molecular genetic testing of Huntington disease. Eur J Hum Genet. May 2013;21(5):480–6. doi:10.1038/ejhg.2012.200
9. Iyer RR, Pluciennik A. DNA Mismatch Repair and its Role in Huntington’s Disease. J Huntingtons Dis. 2021;10(1):75–94. doi:10.3233/JHD-200438