RT Journal Article SR Electronic T1 An automatic diagnostic system for pediatric genetic disorders by linking genotype and phenotype information JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2021.08.26.21261185 DO 10.1101/2021.08.26.21261185 A1 Dong, Xinran A1 Wu, Bingbing A1 Wang, Huijun A1 Yang, Lin A1 Chen, Xiang A1 Ni, Qi A1 Wang, Yaqiong A1 Liu, Bo A1 Lu, Yulan A1 Zhou, Wenhao YR 2021 UL http://medrxiv.org/content/early/2021/08/28/2021.08.26.21261185.abstract AB Background: Quantitatively describing the phenotype spectrum of pediatric disorders has remarkable power to assist genetic diagnosis. Here, we developed a matrix that provides this quantitative description of genomic-phenotypic association and constructed an automatic system to assist the diagnosis of pediatric genetic disorders. Results: A total of 20,580 patients with genetic diagnostic conclusions from the Children’s Hospital of Fudan University from 2015 to 2019 were reviewed. Based on these data, a phenotype spectrum matrix, cGPS (clinical Gene’s Preferential Synopsis), was designed using a Naïve Bayes model to quantitatively describe genes’ contributions to clinical phenotype categories. Further, for patients who have both genomic and phenotype data, we designed a ConsistencyScore based on cGPS. ConsistencyScore aims to identify the genes most likely to be the genetic cause of the patient’s phenotype and to prioritize the causal gene among all candidates. When the ConsistencyScore in each sample was used to predict the causal gene for patients, the AUC reached 0.975 for the ROC curve (95% CI 0.972-0.976) and 0.575 for the precision-recall curve (95% CI 0.541-0.604). Further, the performance of ConsistencyScore was evaluated on another cohort of 2,323 patients, in which it ranked the causal gene first for 75.00% (95% CI 70.95%-79.07%) of the 296 patients with a positive genetic diagnosis. 
The causal gene of 97.64% (95% CI 95.95%-99.32%) of patients could be ranked within the top 10 by ConsistencyScore, which is much higher than with existing algorithms (p < 0.001). Conclusions: cGPS and ConsistencyScore offer useful tools to prioritize disease-causing genes for pediatric disorders and show great potential in clinical applications. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was funded by the Shanghai Hospital Development Center (SHDC2020CR6028-002, Prof. Zhou), National Key R&D Program of China (2020YFC2006402, Prof. Zhou), Shanghai Municipal Science and Technology Major Project (2017SHZDZX01, Dr. Lu; 20Z11900600, Prof. Zhou), National Key Research and Development Program (2018YFC0116903, Prof. Zhou), and Shanghai Key Laboratory of Birth Defects (13DZ2260600, Prof. Zhou). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was approved by the Research Ethics Committee of the Children's Hospital of Fudan University. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Dr. Wenhao Zhou, National Children's Medical Center, Children's Hospital of Fudan University, Shanghai, China, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The cGPS matrix scores generated during the current study are available from the corresponding author upon reasonable request.