Abstract
Background Life course epidemiology provides a framework for studying the effects of time-varying exposures on health outcomes. The structured life course modeling approach (SLCMA) is a theory-driven analytic method that empirically compares multiple prespecified life course hypotheses characterizing time-dependent exposure-outcome relationships to determine which theory best fits the observed data. However, the statistical properties of inference methods used with the SLCMA have not been investigated with high-dimensional omics outcomes.
Methods We performed simulations and empirical analyses to evaluate the performance of the SLCMA when applied to genome-wide DNA methylation (DNAm). In the simulations, we compared five statistical inference tests used by SLCMA (n=700). For each, we assessed the familywise error rate (FWER), statistical power, and confidence interval coverage to determine whether inference based on these tests was valid in the presence of substantial multiple testing and small effect sizes, two hallmark challenges of inference from omics data. In the empirical analyses, we applied the SLCMA to evaluate the time-dependent relationship of childhood abuse with genome-wide DNAm (n=703).
Results In the simulations, selective inference and max-|t|-test performed best: both controlled FWER and yielded moderate statistical power. Empirical analyses using SLCMA revealed time-dependent effects of childhood abuse on DNA methylation.
Conclusions Our findings show that SLCMA, applied and interpreted appropriately, can be used in the omics setting to examine time-dependent effects underlying exposure-outcome relationships over the life course. We provide recommendations for applying the SLCMA in high-throughput settings, which we hope will encourage researchers to move beyond analyses of exposed versus unexposed.
Key messages
The structured life course modeling approach (SLCMA) is an effective approach to directly compare life course theories and can be scaled up in the omics context to examine nuanced relationships between environmental exposures over the life course and biological processes.
Of the five statistical inference tests assessed in simulations, we recommend the selective inference method and max-|t|-test for post-selection inference in omics applications of the SLCMA.
In an empirical example, we revealed time-dependent effects of childhood abuse on DNA methylation using the SLCMA, with improvement in statistical power when accounting for covariates by applying the Frisch-Waugh-Lovell (FWL) theorem.
Researchers should assess p-values in parallel with effect sizes and confidence intervals, as triangulating multiple forms of statistical evidence can strengthen inferences and point to new directions for replication.
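As a brief illustration of the Frisch-Waugh-Lovell (FWL) theorem mentioned in the key messages, the sketch below checks numerically that regressing the covariate-residualized outcome on the covariate-residualized exposure variables gives the same coefficients as the full regression that includes the covariates. It is not the authors' code: the simulated data, the variable names (`C`, `X`, `y`), the function `fwl_coefficients`, and the effect sizes are all hypothetical, chosen only to make the example self-contained.

```python
import numpy as np

def fwl_coefficients(y, X, C):
    """Return exposure coefficients from (a) the full model y ~ 1 + C + X
    and (b) the FWL route: residualize y and X on [1, C], then regress."""
    n = len(y)
    C1 = np.column_stack([np.ones(n), C])  # covariates plus intercept

    # (a) Full model: keep only the coefficients belonging to X.
    design = np.column_stack([C1, X])
    full = np.linalg.lstsq(design, y, rcond=None)[0][C1.shape[1]:]

    # (b) Residualize on the covariates, then regress residuals on residuals.
    def resid(v):
        return v - C1 @ np.linalg.lstsq(C1, v, rcond=None)[0]

    partial = np.linalg.lstsq(resid(X), resid(y), rcond=None)[0]
    return full, partial

# Hypothetical data: 2 covariates, 3 binary exposure indicators
# (e.g. exposure at three periods), one simulated outcome.
rng = np.random.default_rng(0)
n = 700
C = rng.normal(size=(n, 2))
X = (rng.normal(size=(n, 3)) + 0.5 * C[:, [0]] > 0).astype(float)
y = 0.3 * X[:, 0] + C @ np.array([0.5, -0.2]) + rng.normal(size=n)

full, partial = fwl_coefficients(y, X, C)
print("full model:   ", full)
print("residualized: ", partial)
print("agree:", np.allclose(full, partial))
```

The two coefficient vectors agree up to floating-point error, which is why covariates can be removed once, up front, before fitting many per-site models. Standard errors from the residualized fit would still need a degrees-of-freedom adjustment for the covariates that were removed.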
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the National Institute of Mental Health of the National Institutes of Health [grant number R01MH113930 awarded to ECD]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author Declarations
All relevant ethical guidelines have been followed and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Not Applicable
Any clinical trials involved have been registered with an ICMJE-approved registry such as ClinicalTrials.gov and the trial ID is included in the manuscript.
Not Applicable
I have followed all appropriate research reporting guidelines and uploaded the relevant Equator, ICMJE or other checklist(s) as supplementary files, if applicable.
Not Applicable
Footnotes
Conflict of Interest: None declared.
Data Availability
The code used to perform the simulations is included in the Supplemental Materials. Data for the empirical analysis came from the Avon Longitudinal Study of Parents and Children (ALSPAC). Details about data access, including a fully searchable data dictionary, are available on the ALSPAC website: http://www.bristol.ac.uk/alspac/researchers/our-data/.