Abstract
Electronic health records (EHRs) represent a rich data source to support precision medicine, particularly in disorders with small and heterogeneous populations where longitudinal phenotypes are poorly characterized. However, the impact of EHR data is often limited by incomplete or imperfect source documentation and the inability to leverage unstructured data. Here, we address these shortcomings through a computational analysis of one of the largest cohorts of developmental and epileptic encephalopathies (DEEs), representing 466 individuals across six genetically defined conditions. The DEEs encompass debilitating pediatric-onset disorders with high unmet needs for which treatment development is ongoing. By applying a platform approach to data curation and annotation of 18 clinical data entities from comprehensive medical records, we characterize variation in longitudinal clinical journeys. Assessments of the relative enrichment of phenotypes and semantic similarity analysis highlight commonalities and differences between the six cohorts. Evaluation of medication use reflects unmet needs, particularly in the management of movement disorders. We also present a novel composite measure of seizure severity that is more robust than existing measures based on seizure frequency alone. Finally, we show that the attainment of developmental outcomes, including the ability to sit independently and the ability to walk, is correlated with seizure severity scores. Overall, the combined analyses demonstrate that patient-centric real-world data generation, including structuring of medical records, holds promise to improve clinical trial success in rare disorders. Applications of this approach support improved understanding of baseline disease progression, selection of relevant endpoints, and definition of inclusion and exclusion criteria.
Competing Interest Statement
All authors are current or former employees and shareholders of Invitae Corp., which owns the medical record data extraction platform Ciitizen.
Funding Statement
This study was funded by Invitae Corp.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Pearl IRB, an independent IRB, gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes