RT Journal Article
SR Electronic
T1 Video-Audio Neural Network Ensemble For Comprehensive Screening Of Autism Spectrum Disorder in Young Children
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.06.28.23291938
DO 10.1101/2023.06.28.23291938
A1 Natraj, Shreyasvi
A1 Kojovic, Nada
A1 Maillart, Thomas
A1 Schaer, Marie
YR 2023
UL http://medrxiv.org/content/early/2023/07/03/2023.06.28.23291938.abstract
AB A timely diagnosis of autism is paramount to allow early therapeutic intervention in preschoolers. Deep Learning (DL) tools have been increasingly used to identify specific autistic symptoms, and hold promise for automated detection of autism at an early age. Here, we leverage a multi-modal approach by combining two neural networks trained on video and audio features of semi-standardized social interactions in a sample of 160 children aged 1 to 5 years. Our ensemble model performs with an accuracy of 82.5% (F1 score: 0.816, Precision: 0.775, Recall: 0.861) for ASD screening. Additional combinations of our model were developed to achieve higher specificity (92.5%, i.e., few false positives) or sensitivity (90%, i.e., few false negatives). Finally, we found a relationship between the neural network modalities and specific audio versus video ASD characteristics, providing evidence that our neural network implementation was effective in taking into account different features that are currently standardized under the gold standard ASD assessment. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: We are grateful to all the families involved in this study and the psychologists and colleagues who contributed to the data collection. This study was funded by the Sinergia Grant for Digital Phenotyping of Autism Spectrum Disorders in Children (Grant Number: 202235), National Centre of Competence in Research (NCCR) Synapsy, by the Swiss National Science Foundation (SNF, Grant Number: 51NF40_185897), by SNF grants to M.S.
(163859 and 190084), as well as by funds from the Private Foundation of the HUG, by a UNIGE COINF2018 equipment grant and by the Fondation Pole Autisme (https://www.pole-autisme.ch). We are also very grateful to the Alexis for Autism initiative (https://www.alexisforautism.com) for supporting this research. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study and protocols conducted in this research were approved by the Ethics Committee of the Faculty of Medicine at the University of Geneva, Switzerland. The methods employed in this study strictly adhered to the relevant guidelines and regulations set forth by the University of Geneva. Informed written consent was obtained from a parent and/or legal guardian for all children participating in the study. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Code used for developing the video and audio neural networks is available from the corresponding author upon request. The ADOS clinical examination recordings from which the video and audio data were extracted represent sensitive data and thus cannot be shared.