RT Journal Article
SR Electronic
T1 Development of Childhood Asthma Prediction Models using Machine Learning Approaches
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2021.03.31.21254678
DO 10.1101/2021.03.31.21254678
A1 Kothalawala, Dilini M.
A1 Murray, Clare S.
A1 Simpson, Angela
A1 Custovic, Adnan
A1 Tapper, William J.
A1 Arshad, S. Hasan
A1 Holloway, John W.
A1 Rezwan, Faisal I.
A1 on behalf of STELAR/UNICORN investigators
YR 2021
UL http://medrxiv.org/content/early/2021/04/06/2021.03.31.21254678.abstract
AB Background: Wheeze is common in early life and often transient. It is difficult to identify which children will experience persistent symptoms and subsequently develop asthma. Machine learning approaches have the potential for better predictive performance and generalisability than existing childhood asthma prediction models.
Objective: To apply machine learning approaches for predicting school-age asthma (age 10) in early life (Childhood Asthma Prediction in Early life, CAPE model) and at preschool age (Childhood Asthma Prediction at Preschool age, CAPP model).
Methods: Data on clinical symptoms and environmental exposures were collected from children enrolled in the Isle of Wight Birth Cohort (N=1368, ∼15% asthma prevalence). Recursive Feature Elimination (RFE) identified the optimal subset of features predictive of school-age asthma for each model. Seven state-of-the-art machine learning classification algorithms were used to develop the models and the results were compared. To optimise the models, training was performed by applying 5-fold cross-validation, imputation and resampling. Predictive performances were evaluated on the test set and externally validated in the Manchester Asthma and Allergy Study (MAAS) cohort.
Results: RFE identified 8 and 12 predictors for the CAPE and CAPP models, respectively.
The best predictive performance was demonstrated by a Support Vector Machine (SVM) algorithm for both the CAPE model (area under the receiver operating characteristic curve, AUC=0.71) and the CAPP model (AUC=0.82). Both models demonstrated good generalisability in MAAS (CAPE: 8YR=0.71, 11YR=0.71; CAPP: 8YR=0.83, 11YR=0.79).
Conclusion: Machine learning approaches improved upon the predictive performance of existing regression-based models, with good generalisability and ability to rule in asthma.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by the National Institute for Health Research through the NIHR Southampton Biomedical Research Centre and a University of Southampton Presidential Research Studentship. Replication analysis in MAAS was supported by the Medical Research Council as part of UNICORN (Unified Cohorts Research Network): Disaggregating asthma MR/S025340/1. Angela Simpson and Clare Murray are supported by the NIHR Manchester Biomedical Research Centre.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The development of the predictive models received approval from the Faculty of Medicine Ethics Committee, University of Southampton (ERGO 46033.R1).
All participants in the Isle of Wight Birth Cohort provided informed consent or parental consent, and ethics approval was obtained from the local/national ethics committees at recruitment of the birth cohort between January 1989 and February 1990, and subsequently at each assessment.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
Access to data from the Isle of Wight Birth Cohort can be made available upon request. Further information can be found at www.allergyresearch.org.uk/ or from: Cohort Profile: The Isle Of Wight Whole Population Birth Cohort (IOWBC). Int J Epidemiol. 2018 Aug 1;47(4):1043-1044i. doi: 10.1093/ije/dyy023. Source code for the development and use of the prediction models can also be made available upon request.
AUC: Area under the receiver operating characteristic curve
BHR: Bronchial hyper-responsiveness
RFE: Recursive feature elimination
LR-: Negative likelihood ratio
LR+: Positive likelihood ratio
NPV: Negative predictive value
PPV: Positive predictive value
SPT: Skin prick test