PT - JOURNAL ARTICLE AU - Dong, Xinran AU - Wu, Bingbing AU - Wang, Huijun AU - Yang, Lin AU - Chen, Xiang AU - Ni, Qi AU - Wang, Yaqiong AU - Liu, Bo AU - Lu, Yulan AU - Zhou, Wenhao TI - An automatic diagnostic system for pediatric genetic disorders by linking genotype and phenotype information AID - 10.1101/2021.08.26.21261185 DP - 2021 Jan 01 TA - medRxiv PG - 2021.08.26.21261185 4099 - http://medrxiv.org/content/early/2021/08/28/2021.08.26.21261185.short 4100 - http://medrxiv.org/content/early/2021/08/28/2021.08.26.21261185.full AB - Background: Quantitatively describing the phenotype spectrum of pediatric disorders has remarkable power to assist genetic diagnosis. Here, we developed a matrix that provides this quantitative description of genomic-phenotypic associations and constructed an automatic system to assist the diagnosis of pediatric genetic disorders. Results: We reviewed 20,580 patients with genetic diagnostic conclusions from the Children’s Hospital of Fudan University from 2015 to 2019. Based on these data, a phenotype spectrum matrix, cGPS (clinical Gene’s Preferential Synopsis), was designed using a Naïve Bayes model to quantitatively describe genes’ contributions to clinical phenotype categories. Further, for patients who have both genomic and phenotype data, we designed a ConsistencyScore based on cGPS. The ConsistencyScore aims to identify the genes most likely to be the genetic cause of the patient’s phenotype and to prioritize the causal gene among all candidates. When the ConsistencyScore was used in each sample to predict the patient’s causal gene, the AUC reached 0.975 for the ROC curve (95% CI 0.972-0.976) and 0.575 for the precision-recall curve (95% CI 0.541-0.604). Further, the performance of ConsistencyScore was evaluated on another cohort of 2,323 patients, in which it ranked the causal gene first for 75.00% (95% CI 70.95%-79.07%) of the 296 patients with a positive genetic diagnosis. 
The causal gene of 97.64% (95% CI 95.95%-99.32%) of patients could be ranked within the top 10 by ConsistencyScore, which is much higher than with existing algorithms (p < 0.001). Conclusions: cGPS and ConsistencyScore offer useful tools to prioritize disease-causing genes for pediatric disorders and show great potential in clinical applications. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was funded by the Shanghai Hospital Development Center (SHDC2020CR6028-002, Prof. Zhou), National Key R&D Program of China (2020YFC2006402, Prof. Zhou), Shanghai Municipal Science and Technology Major Project (2017SHZDZX01, Dr. Lu; 20Z11900600, Prof. Zhou), National Key Research and Development Program (2018YFC0116903, Prof. Zhou), and Shanghai Key Laboratory of Birth Defects (13DZ2260600, Prof. Zhou). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was approved by the Research Ethics Committee of the Children's Hospital of Fudan University. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Dr. Wenhao Zhou, National Children's Medical Center, Children's Hospital of Fudan University, Shanghai, China, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The cGPS matrix scores generated during the current study are available from the corresponding author upon reasonable request.
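
The abstract reports the share of diagnosed patients whose causal gene is ranked first or within the top 10. As a rough illustration of how such a top-k rate can be computed, the Python sketch below counts a patient as a hit when the known causal gene appears among the first k entries of that patient's ranked candidate list. It is not the authors' code and does not implement ConsistencyScore: the function name `top_k_rate`, the patient IDs, and the gene labels are all hypothetical placeholders, and the rankings are assumed to come from some scoring method applied beforehand.

```python
from typing import Dict, Sequence


def top_k_rate(rankings: Dict[str, Sequence[str]],
               causal: Dict[str, str],
               k: int) -> float:
    """Fraction of diagnosed patients whose causal gene is in the top k.

    rankings: patient ID -> candidate genes, best-scoring first (assumed input).
    causal:   patient ID -> known causal gene (diagnosed patients only).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not causal:
        return 0.0
    hits = 0
    for patient, gene in causal.items():
        ranked = list(rankings.get(patient, ()))
        # A hit means the causal gene sits among the first k candidates.
        if gene in ranked[:k]:
            hits += 1
    return hits / len(causal)


# Hypothetical toy data; gene labels are placeholders, not real results.
rankings = {
    "P1": ["GENE_A", "GENE_B", "GENE_C"],
    "P2": ["GENE_D", "GENE_E", "GENE_F"],
    "P3": ["GENE_G", "GENE_H"],
    "P4": ["GENE_I", "GENE_J", "GENE_K"],
}
causal = {"P1": "GENE_A", "P2": "GENE_E", "P3": "GENE_H", "P4": "GENE_Z"}

for k in (1, 2, 3):
    print(f"top-{k} rate: {top_k_rate(rankings, causal, k):.2f}")
```

A patient whose causal gene never appears in the candidate list (P4 here) counts as a miss at every k, so the rate is always taken over all diagnosed patients.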