Abstract
When quantitative longitudinal traits are risk factors for disease progression and are subject to random biological variation, joint analysis of time-to-event and longitudinal traits can effectively identify direct and/or indirect genetic associations of single nucleotide polymorphisms (SNPs) with time-to-event traits. We present a joint model that integrates: i) a multivariate linear mixed model describing trajectories of multiple longitudinal traits as a function of time, SNP effects, and subject-specific random effects, and ii) a frailty Cox survival model that depends on SNPs, longitudinal trajectory effects, and a subject-specific frailty accounting for dependence among multiple time-to-event traits. Motivated by the complex genetic architecture of type 1 diabetes complications (T1DC) observed in the Diabetes Control and Complications Trial (DCCT), we implement a two-stage approach to inference with bootstrap joint covariance estimation and develop a hypothesis testing procedure to classify direct and/or indirect SNP association with each time-to-event trait. In a realistic simulation study, we show that joint modelling of two time-to-T1DC traits (retinopathy, nephropathy) and two longitudinal risk factors (HbA1c, systolic blood pressure) reduces estimation bias in genetic effects and improves classification accuracy of direct and/or indirect SNP associations, compared to methods that ignore within-subject risk factor variability and dependence among longitudinal and time-to-event traits. Through DCCT data analysis, we demonstrate the feasibility of candidate SNP modelling and quantify the effects of sample size and Winner’s curse bias on classification for two SNPs identified as having indirect associations with time-to-T1DC traits. Overall, joint analysis of multiple longitudinal and multiple time-to-event traits provides insight into complex trait architecture.
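As a reading aid, the two submodels described above can be written schematically as follows. This is only an illustrative sketch and not the notation of the manuscript: the symbols, the linear time trend, the random intercept and slope, and the multiplicative form of the frailty are all assumptions made here for concreteness.

```latex
% Illustrative notation (assumed, not taken from the manuscript):
%   i = subject, k = longitudinal trait (k = 1..K), l = time-to-event trait (l = 1..L)
%   G_i = SNP genotype, b_i = subject-specific random effects, u_i = subject-specific frailty

% (i) Multivariate linear mixed model for the longitudinal traits
\begin{align*}
  y_{ik}(t) &= m_{ik}(t) + \varepsilon_{ik}(t),
    \qquad \varepsilon_{ik}(t) \sim N(0, \sigma_k^2), \\
  m_{ik}(t) &= \beta_{0k} + \beta_{1k}\, t + \beta_{gk}\, G_i + b_{0ik} + b_{1ik}\, t,
    \qquad b_i = (b_{0i1}, b_{1i1}, \dots, b_{0iK}, b_{1iK})^\top \sim N(0, D).
\end{align*}

% (ii) Frailty Cox model for the time-to-event traits
\begin{equation*}
  \lambda_{il}(t \mid u_i) = \lambda_{0l}(t)\, u_i\,
    \exp\!\Big( \gamma_{gl}\, G_i + \sum_{k=1}^{K} \alpha_{kl}\, m_{ik}(t) \Big).
\end{equation*}

% Under this sketch:
%   direct SNP association with event l:             \gamma_{gl} \neq 0
%   indirect SNP association with event l via trait k: \beta_{gk} \neq 0 and \alpha_{kl} \neq 0
```

Under these assumptions, the SNP can affect a time-to-event trait directly through \gamma_{gl}, or indirectly by shifting a longitudinal trajectory (\beta_{gk}) that is itself associated with the hazard (\alpha_{kl}). The shared frailty u_i induces dependence among the time-to-event traits of the same subject.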
Competing Interest Statement
The authors have declared no competing interest.
Clinical Trial
NCT00360815
Funding Statement
This project was supported by: CIHR Operating/Project Grants (#MOP-84287, #PJT-159509, #PJT-159463), CANSSI Collaborative Research Team in Statistical methods for the analysis of genetic data with survival outcomes, CANSSI postdoctoral fellowship (MB), CIHR STAGE fellowships (MB and OEG, #GET-101831). Computations were performed on the Niagara supercomputer at the SciNet HPC Consortium. SciNet is funded by the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund - Research Excellence; and the University of Toronto.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethical approval was obtained for each of the 29 study centres across the USA and Canada for the DCCT study. Ethical approval for the DCCT Genetics Study was obtained from the Hospital for Sick Children Research Ethics Board (SickKids REB# 1000030543, Title: Genetics of diabetic complications and risk factors). We have current Research Ethics Board approval at Mount Sinai Hospital, Sinai Health System, which includes the Lunenfeld-Tanenbaum Research Institute (MSH REB# 07-0208-E, Title: Genome-wide Association of Common Alleles with Long-term Diabetic Complications).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
This version of the manuscript has been revised as follows: improved exposition and notation for the proposed joint model for multiple longitudinal and multiple time-to-event traits; clarification of the novelty of the proposed model relative to the joint model for one longitudinal and one time-to-event trait; extension of the simulation studies to include comparisons with the joint model of one longitudinal trait and one time-to-event trait; evaluation of the accuracy of classifying SNP associations as direct and/or indirect; regression diagnostics to assess joint model validity in the DCCT application; and evaluation of the effects of sample size and Winner’s curse bias on classification for two SNPs identified as having indirect associations with time-to-T1DC traits in the DCCT Genetics Study.
Data Availability
DCCT data are available to authorized users at https://repository.niddk.nih.gov/studies/edic/ and https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000086.v3.p1 (IRB #07-0208-E). Example R code for the DCCT-data-based simulation and for analysis of the simulated data is provided on GitHub (https://github.com/brossardMyriam/Joint-model-for-multiple-trait-genetics). Supplementary files are available on Figshare at https://figshare.com/s/2b9f6b3da5e1f03e8086. File S1 describes the DCCT dataset and lists the participants of the DCCT/EDIC Research Group; File S2 provides supplemental information for the DCCT-based simulation study; File S3 provides supplemental information for the analysis of the DCCT Genetics Study data; File S4 lists the SNPs analyzed in DCCT; File S5 provides notes on a multi-trait SNP association test for SNP effects estimated under the proposed joint model framework.
Abbreviations
- DAG: directed acyclic graph
- DCCT: Diabetes Control and Complications Trial
- DR: diabetic retinopathy
- DN: diabetic nephropathy
- GWAS: genome-wide association study
- HbA1c: hemoglobin A1c
- LD: linkage disequilibrium
- MAF: minor allele frequency
- PH: proportional hazards
- QT(s): quantitative trait(s)
- SBP: systolic blood pressure
- SNP: single nucleotide polymorphism
- T1DC: type 1 diabetes complications
- TTE: time-to-event