PT - JOURNAL ARTICLE
AU - Manigbas, Celine A.
AU - Jadhav, Bharati
AU - Garg, Paras
AU - Shadrina, Mariya
AU - Lee, William
AU - Martin-Trujillo, Alejandro
AU - Sharp, Andrew J.
TI - A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank
AID - 10.1101/2024.01.22.24301630
DP - 2024 Jan 01
TA - medRxiv
PG - 2024.01.22.24301630
4099 - http://medrxiv.org/content/early/2024/01/23/2024.01.22.24301630.short
4100 - http://medrxiv.org/content/early/2024/01/23/2024.01.22.24301630.full
AB - Most genetic association studies focus on binary variants. To identify the effects of multi-allelic variation of tandem repeats (TRs) on human traits, we performed direct TR genotyping and phenome-wide association studies in 168,554 individuals from the UK Biobank, identifying 47 TRs showing causal associations with 73 traits. We replicated 23 of 31 (74%) of these causal associations in the All of Us cohort. While this set included several known repeat expansion disorders, novel associations we found were attributable to common polymorphic variation in TR length rather than rare expansions and include e.g. a coding polyhistidine motif in HRCT1 influencing risk of hypertension and a poly(CGC) in the 5’UTR of GNB2 influencing heart rate. Causal TRs were strongly enriched for associations with local gene expression and DNA methylation. Our study highlights the contribution of multi-allelic TRs to the “missing heritability” of the human genome.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This research has been conducted using the UK Biobank Resource under Application Number 82094 and the NIH All of Us data under research project "Association studies of tandem repeats". This work was supported by NIH grants AG075051, NS105781 and HD103782 to A.J.S.
and NHLBI BioData Catalyst Fellowship #5120339 to A.M.T.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Collection of the UKB data was approved by the Research Ethics Committee of the UKB obtained under application 32568. All study participants provided informed consent, and the protocols for UKB are overseen by The UKB Ethics Advisory Committee, see https://www.ukbiobank.ac.uk/ethics/.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
All genotype data generated in the UKB and All of Us cohorts have been returned and will be available through future data releases.
Code utilized for this study is available as follows:
https://github.com/Illumina/ExpansionHunter
https://github.com/bharatij/Global-Ancestry-Assignment
https://github.com/PacificBiosciences/trgt
https://www.internationalgenome.org/data-portal/data-collection/30x-grch38
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001644.v2.p2
https://biobank.ctsu.ox.ac.uk/crystal/search.cgi
https://www.researchallofus.org/data-tools/workbench/
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE213478
https://storage.googleapis.com/gtex_analysis_v8/rna_seq_data/GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_tpm.gct.gz