Abstract
Proportional hazards models have previously been proposed to analyse time-to-event phenotypes in genome-wide association studies (GWAS). While proportional hazards models have many useful applications, their ability to identify genetic associations under different generative models when the analysed data are ascertained is poorly understood. This includes widely used study designs, such as case-control and case-cohort designs (e.g. the iPSYCH study design), where cases are commonly ascertained.
Here we examine how recently proposed and computationally efficient Cox regression methods for GWAS perform under different generative models with and without ascertainment. We also propose the age-dependent liability threshold model (ADuLT), first introduced as the underlying model for the LT-FH++ method, as an alternative approach for time-to-event GWAS. We then benchmark ADuLT against SPACox and standard case-control GWAS using simulated data with varying degrees of ascertainment. We find that Cox regression GWAS underperforms when cases are strongly ascertained (cases are oversampled by a factor larger than 5), regardless of the generative model used. In contrast, we find ADuLT to be robust to case-control ascertainment, while being much faster to run. We then use the methods to conduct GWAS for four psychiatric disorders (ADHD, autism, depression, and schizophrenia) in the iPSYCH case-cohort sample, which has strong case ascertainment. Summarising across all four mental disorders, ADuLT found 20 independent genome-wide significant associations, while case-control GWAS found 17 and SPACox found 8, consistent with our simulation results.
As more genetic data are being linked to electronic health records, robust GWAS methods that can make use of age-of-onset information have the opportunity to increase power in analyses. We find ADuLT to be a robust time-to-event GWAS method that performs on par with or better than Cox regression GWAS, both in simulations and in real data analyses of four psychiatric disorders. ADuLT has been implemented in an R package called LTFHPlus, which is available on GitHub.
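To make the notion of case ascertainment in the abstract concrete, the following is a minimal Python sketch and not the authors' simulation code. It assumes a plain liability threshold model without age of onset or genotypes, and it assumes one plausible reading of the "oversampling factor": the case fraction in the analysed sample divided by the case fraction in the population. The function name `simulate_ascertained_sample` and all of its parameters are hypothetical and chosen for illustration only.

```python
import random
from statistics import NormalDist


def simulate_ascertained_sample(n=20000, prevalence=0.05,
                                target_case_fraction=0.5, seed=1):
    """Draw a population under a liability threshold model, then oversample cases.

    Illustrative assumption: liabilities are standard normal and an individual
    is a case when their liability exceeds the threshold implied by `prevalence`.
    """
    rng = random.Random(seed)
    # Threshold chosen so that P(liability > threshold) = prevalence.
    threshold = NormalDist().inv_cdf(1 - prevalence)

    liabilities = [rng.gauss(0.0, 1.0) for _ in range(n)]
    cases = [x for x in liabilities if x > threshold]
    controls = [x for x in liabilities if x <= threshold]

    # Keep every case; subsample controls to reach the target case fraction.
    n_controls = round(len(cases) * (1 - target_case_fraction) / target_case_fraction)
    n_controls = min(n_controls, len(controls))
    kept_controls = rng.sample(controls, n_controls)

    population_case_fraction = len(cases) / n
    sample_case_fraction = len(cases) / (len(cases) + len(kept_controls))
    return {
        "n_cases": len(cases),
        "n_controls": len(kept_controls),
        "population_case_fraction": population_case_fraction,
        "sample_case_fraction": sample_case_fraction,
        # Assumed definition: sample case fraction over population case fraction.
        "oversampling_factor": sample_case_fraction / population_case_fraction,
    }


result = simulate_ascertained_sample()
print(result)
```

With the default values (a population prevalence of 5% and a sample made up of equal numbers of cases and controls), the oversampling factor under this assumed definition is roughly 10, which falls in the strongly ascertained regime described above.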
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
E.M.P., B.J.V. and F.P. were supported by the Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH (R102-A9118, R155-2014-1724 and R248-2017-2003), and a Lundbeck Foundation Fellowship (R335-2019-2339). J.M., B.J.V. and F.P. were also supported by the Danish National Research Foundation (Niels Bohr Professorship to Prof. John McGrath). A.J.S. is supported by a Lundbeckfonden Fellowship (R335-2019-2318), and O.P.R. is supported by a Lundbeck Foundation Fellowship (R345-2020-1588). K.M. is supported by grants from The Lundbeck Foundation (R303-2018-3551) and the Brain & Behavior Research Foundation (Young Investigator Award 2021). A.G. has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 945733), starting grant AI-Prevent. High-performance computer capacity for handling and statistical analysis of iPSYCH data on the GenomeDK HPC facility was provided by the Center for Genomics and Personalised Medicine and the Centre for Integrative Sequencing, iSEQ, Aarhus University, Denmark (grant to A.D.B.).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study was approved by the Danish Data Protection Agency, and data access was approved by Statistics Denmark and the Danish Health Data Authority. Approval by the Ethics Committee and written informed consent were not required for register-based projects [Act no. 1338 of 1 September 2020, section 10 on research ethics for administration of health scientific research projects and health data scientific research projects]. All data were de-identified and not recognizable at an individual level.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
iPSYCH is approved by the Danish Scientific Ethics Committee, the Danish Health Data Authority, the Danish Data Protection Agency, Statistics Denmark, and the Danish Neonatal Screening Biobank Steering Committee. Code used to generate simulation results, analyse iPSYCH, and generate plots and tables can be found at https://github.com/EmilMiP/ADuLTCode. LT-FH++ can be found at https://github.com/EmilMiP/LTFHPlus.