Abstract
Objectives To develop a digital phenotyping algorithm (PheIndex) that uses electronic medical records (EMR) data to identify, within a mother-child cohort, children aged 0-3 who have been diagnosed with genetic disorders or present with illness with an increased risk for genetic disorders.
Methods We established 13 criteria for the algorithm, from which two metrics (a quantified score and a classification) were derived. The criteria and the classification were validated through chart review by a pediatrician and a clinical geneticist. To demonstrate the utility of our algorithm in real-world evidence applications, we examined the association between the size of the carrier screening panel (small/≤4 genes [CS-S] vs large/≥100 genes [CS-L]) undertaken by mothers prior to delivery and the classification of their children by our algorithm as presenting with illness with an increased risk for genetic disorders.
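As a minimal sketch of how a set of binary criteria could yield both a score and a classification, consider the Python below. The criterion names, the unweighted count, and the cutoff `SCORE_THRESHOLD` are all hypothetical placeholders chosen for illustration; this abstract does not specify the 13 criteria, how they are weighted, or how the classification is derived from them.

```python
from typing import Mapping

# Hypothetical placeholder names: the abstract states only that there are 13 criteria.
CRITERIA = [f"criterion_{i}" for i in range(1, 14)]

# Hypothetical cutoff for illustration only; not taken from the study.
SCORE_THRESHOLD = 3


def pheindex_score(flags: Mapping[str, bool]) -> int:
    """Count how many of the criteria are met (assumes equal weights)."""
    return sum(1 for name in CRITERIA if flags.get(name, False))


def pheindex_class(flags: Mapping[str, bool]) -> str:
    """Map the score to a binary label using the assumed cutoff."""
    if pheindex_score(flags) >= SCORE_THRESHOLD:
        return "increased risk"
    return "not flagged"


# Example child record with two of the placeholder criteria met.
child = {"criterion_1": True, "criterion_5": True}
print(pheindex_score(child), pheindex_class(child))
```

A weighted sum or a rule that treats some criteria as sufficient on their own would fit the same interface; the equal-weight count is used here only because it is the simplest option.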
Results The PheIndex algorithm identified 1,088 such children out of 93,154 live births and achieved 90% sensitivity, 97% specificity, and 94% accuracy by chart review. We found that children whose mothers received CS-L were less likely to be classified as presenting with illness with an increased risk for genetic disorders, and had a decreased need for multiple specialist visits and multiple ER visits, compared to children whose mothers received CS-S.
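For readers less familiar with the validation metrics reported above, the following Python sketch shows how sensitivity, specificity, accuracy, PPV, and NPV are computed from chart-review counts. The counts in the example call are illustrative only and are not the study's data; the function name `validation_metrics` is likewise an assumption for this sketch.

```python
def validation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics, with chart review as the reference."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged among true cases
        "specificity": tn / (tn + fp),  # not flagged among true non-cases
        "ppv": tp / (tp + fp),          # true cases among those flagged
        "npv": tn / (tn + fn),          # true non-cases among those not flagged
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only, NOT taken from the study.
metrics = validation_metrics(tp=18, fp=3, tn=27, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity do not depend on how common the condition is in the reviewed sample, whereas PPV and NPV do, so predictive values from a chart-review sample need not carry over to the full cohort.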
Conclusions The PheIndex algorithm can help identify when a rare genetic disorder may be present, and has the potential to improve healthcare delivery by alerting providers to consider ordering a diagnostic genetic test and/or referring a patient to a medical geneticist or other specialists.
Article Summary Algorithm using EMR data to identify children who have been diagnosed with a genetic disorder or present with illness with increased risk of genetic disorders.
What’s known on this subject With over 7000 Mendelian disorders, identifying children with a specific rare genetic disorder diagnosis through structured EMR data is challenging because of incomplete records, inaccurate diagnosis coding, and heterogeneity in the clinical symptoms and procedures associated with specific disorders.
What this study adds We developed a digital phenotyping algorithm that uses electronic medical records (EMR) data to identify, within a mother-child cohort, children aged 0-3 who have been diagnosed with genetic disorders or present with illness with an increased risk for genetic disorders.
Competing Interest Statement
BDW, LYL, RAS, DC, LS, SL, JT, SL, ZW, GS, LE, RC, EES, LL are formerly or currently employed by GeneDx. GeneDx is a company that integrates genetic testing and data analytics to improve diagnosis, treatment, and prevention of disease. The Icahn School of Medicine at Mount Sinai holds equity in this for-profit company.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics committee/IRB of Mount Sinai Health Systems gave approval for this work (IRB-20-01771).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Conflict of Interest Disclosures: The authors have no conflicts of interest relevant to this article to disclose.
Funding/Support: None. This project was performed in collaboration with GeneDx. GeneDx is a company that integrates genetic testing and data analytics to improve diagnosis, treatment, and prevention of disease. The Icahn School of Medicine at Mount Sinai holds equity in this for-profit company.
Data Availability
The clinical data used in this study are under license from the Mount Sinai Data Warehouse. As a result, the dataset is not publicly available. Qualified researchers affiliated with the Mount Sinai Health System may apply for access to these data through the Mount Sinai Health System Institutional Review Board.
Abbreviations
- CSER: Clinical Sequencing Exploratory Research
- CS: carrier screening
- CS-L: carrier screening, large panel
- CS-M: carrier screening, medium panel
- CS-S: carrier screening, small panel
- CT: computed tomography
- CTICU: cardiothoracic intensive care unit
- eMERGE: Electronic Medical Records & Genomics
- EHR: electronic health record
- EMR: electronic medical record
- ER: emergency room
- ICD: International Classification of Diseases
- MRI: magnetic resonance imaging
- MSHS: Mount Sinai Health System
- NICU: neonatal intensive care unit
- NPV: negative predictive value
- PPV: positive predictive value