ABSTRACT
Background Wheeze is common in early life and often transient. It is difficult to identify which children will experience persistent symptoms and subsequently develop asthma. Machine learning approaches have the potential to offer better predictive performance and generalisability than existing childhood asthma prediction models.
Objective To apply machine learning approaches to predict school-age asthma (at age 10) from information available in early life (Childhood Asthma Prediction in Early life, CAPE model) and at preschool age (Childhood Asthma Prediction at Preschool age, CAPP model).
Methods Data on clinical symptoms and environmental exposures were collected from children enrolled in the Isle of Wight Birth Cohort (N=1368, ∼15% asthma prevalence). Recursive Feature Elimination (RFE) identified the optimal subset of features predictive of school-age asthma for each model. Seven state-of-the-art machine learning classification algorithms were used to develop the models and their results were compared. To optimise the models, training used 5-fold cross-validation, imputation and resampling. Predictive performance was evaluated on the test set and externally validated in the Manchester Asthma and Allergy Study (MAAS) cohort.
Results RFE identified eight and 12 predictors for the CAPE and CAPP models, respectively. The best predictive performance was demonstrated by a Support Vector Machine (SVM) algorithm for both the CAPE model (area under the receiver operating characteristic curve, AUC=0.71) and the CAPP model (AUC=0.82). Both models demonstrated good generalisability in MAAS (AUC for CAPE: 0.71 at 8 years and 0.71 at 11 years; AUC for CAPP: 0.83 at 8 years and 0.79 at 11 years).
Conclusion Machine learning approaches improved upon the predictive performance of existing regression-based models, with good generalisability and ability to rule in asthma.
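As an aid to reading the AUC values reported above: an AUC can be interpreted as the probability that a randomly chosen child with asthma receives a higher predicted score than a randomly chosen child without asthma. The following is a minimal sketch of that pairwise computation in Python. The function name `auc_from_scores` and the labels and scores are hypothetical and purely illustrative; they are not taken from the cohort data or from the authors' code.

```python
def auc_from_scores(labels, scores):
    """Pairwise (rank-based) AUC: the fraction of (positive, negative)
    pairs in which the positive case has the higher score.
    Tied scores count as half a correctly ranked pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only (1 = asthma at school age, 0 = no asthma).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
example_auc = auc_from_scores(labels, scores)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.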
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the National Institute for Health Research through the NIHR Southampton Biomedical Research Centre and a University of Southampton Presidential Research Studentship. Replication analysis in MAAS was supported by the Medical Research Council as part of UNICORN (Unified Cohorts Research Network): Disaggregating asthma MR/S025340/1. Angela Simpson and Clare Murray are supported by the NIHR Manchester Biomedical Research Centre.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The development of the predictive models received approval from the Faculty of Medicine Ethics Committee, University of Southampton (ERGO 46033.R1). All participants in the Isle of Wight Birth Cohort provided informed consent or parental consent. Ethics approval was obtained from the local/national ethics committees at recruitment of the birth cohort between January 1989 and February 1990, and subsequently at each assessment.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Conflicts of interest: The authors have no funding relationships or conflicts of interest related to this article to disclose.
Data availability: Access to data from the Isle of Wight Birth Cohort can be made available upon request. Further information can be found at www.allergyresearch.org.uk/ or from: Cohort Profile: The Isle Of Wight Whole Population Birth Cohort (IOWBC). Int J Epidemiol. 2018 Aug 1;47(4):1043-1044i. doi: 10.1093/ije/dyy023. Source code for the development and use of the prediction models can also be made available upon request.
Abbreviations
- AUC: Area under the receiver operating characteristic curve
- BHR: Bronchial hyper-responsiveness
- RFE: Recursive feature elimination
- LR-: Negative likelihood ratio
- LR+: Positive likelihood ratio
- NPV: Negative predictive value
- PPV: Positive predictive value
- SPT: Skin prick test
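For readers less familiar with the diagnostic measures abbreviated above (PPV, NPV, LR+ and LR-), the sketch below shows how they follow from a 2×2 confusion matrix using their standard definitions. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for illustration; they are not results from this study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from confusion-matrix counts.

    tp/fn: children with the outcome predicted positive/negative.
    fp/tn: children without the outcome predicted positive/negative.
    Note: LR+ is undefined when specificity is 1, and LR- when
    specificity is 0; this sketch does not handle those edge cases.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PPV": tp / (tp + fp),                    # P(outcome | positive prediction)
        "NPV": tn / (tn + fn),                    # P(no outcome | negative prediction)
        "LR+": sensitivity / (1 - specificity),   # higher values help rule in
        "LR-": (1 - sensitivity) / specificity,   # lower values help rule out
    }


# Illustrative counts only.
metrics = diagnostic_metrics(tp=30, fp=20, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike PPV and NPV, the likelihood ratios do not depend on outcome prevalence, which is one reason they are often reported alongside predictive values when a model is applied to a different population.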