Abstract
Examining the downstream molecular consequences of genetic variation significantly enhances our understanding of the heritable determinants of complex traits and disease predisposition. Metabolites are key indicators of biological processes and disease states; they play a crucial role in this systematic mapping and offer opportunities for discovering new biomarkers for disease diagnosis and prognosis. Here, we present a genome-wide association study of 249 circulating metabolite traits quantified by nuclear magnetic resonance spectroscopy across multiple genetic ancestry groups from the Estonian Biobank and the UK Biobank. We generated mixed model associations in the Estonian Biobank and in six major genetic ancestry groups of the UK Biobank, and performed two separate meta-analyses: one across the predominantly European genetic ancestry samples (n = 599,249) and one across all samples (n = 619,372). In total, we identified 89,489 locus-metabolite pairs and 8,917 independent lead variants, of which 4,184 appear to represent novel associated loci. Moreover, 12.4% of the independent lead variants had a minor allele frequency of less than 1%, highlighting the importance of including low-frequency and rare variants in metabolic biomarker studies. Our publicly available results provide a valuable resource for future GWAS interpretation and drug target prioritisation studies.
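As an illustration of how per-cohort association estimates for one variant can be combined in a meta-analysis, the following is a minimal sketch of a fixed-effect inverse-variance-weighted estimate. The abstract does not state which meta-analysis model was used, so the weighting scheme, the function name `ivw_meta` and the example numbers are all assumptions made for illustration; the authors' actual code is in the linked GitHub repository.

```python
import math


def ivw_meta(betas, ses):
    """Combine per-cohort effect sizes with inverse-variance weights.

    Hypothetical helper for illustration only; not the authors' pipeline.
    betas: per-cohort effect estimates for one variant-trait pair
    ses:   matching standard errors (all must be positive)
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Made-up estimates from two cohorts with equal precision.
beta, se, z, p = ivw_meta([0.10, 0.20], [0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} z={z:.3f} p={p:.4f}")
```

With equal standard errors the combined effect is simply the mean of the cohort effects, and the combined standard error shrinks by a factor of the square root of the number of cohorts.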
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
K.A. and I.R. were supported by a grant from the Estonian Research Council (grant no. PSG415). E.A. was supported by the European Union through Horizon 2020 and Horizon Europe research and innovation programs under grants no. 894987, 101137201 and 101137154. R.T., E.A., U.V., J.K. and T.E. were supported by the Estonian Research Council grant no. PRG1291. K.F. and A.K. were supported by the Estonian Research Council grant no. PRG1197. N.T. was supported by the Estonian Research Council grant no. PRG1414. The project was supported by the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 101017802 (OPTOMICS).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Individual-level data from the Estonian Biobank were analysed under ethical approval 1.1-12/624 from the Estonian Committee on Bioethics and Human Research (Estonian Ministry of Social Affairs), using data according to release application 6-7/GI/8988 from the Estonian Biobank. The UK Biobank study was approved by the North West Multi-Centre Research Ethics Committee. This research was conducted using the UK Biobank Resource under application numbers 91233 and 30418.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
↵* These authors jointly supervised this work.
Augmented the Data Availability statement with a link to our PheWeb browser. Fixed some minor typos.
Data Availability
Complete genetic ancestry group-specific and meta-analysis association summary statistics from this study can be downloaded from the GWAS Catalog (Sollis et al., 2023) (accessions GCST90449363 - GCST90451603, Table S5). GWAS lead variants are available from Zenodo (https://dx.doi.org/10.5281/zenodo.13937265). The meta_EUR meta-analysis results can also be viewed in our PheWeb browser (https://nmrmeta.gi.ut.ee/). Meta-analysis code is available from https://github.com/ralf-tambets/EstBB-UKBB-metaanalysis/. The individual-level UK Biobank data are available to approved researchers through the UK Biobank data-access protocol (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access). The individual-level data from the Estonian Biobank can be accessed through a research application to the Institute of Genomics of the University of Tartu (https://genomics.ut.ee/en/content/estonian-biobank).
https://github.com/ralf-tambets/EstBB-UKBB-metaanalysis/blob/main/data/sumstats_paths.tsv