ABSTRACT
Pediatric Long COVID has been associated with a wide variety of symptoms, conditions, and organ systems, but distinct clinical presentations, or subphenotypes, are still being elucidated. In this exploratory analysis, we used electronic health record data from 38 institutions to identify a cohort of pediatric (age <21) patients with evidence of Long COVID and no pre-existing complex chronic conditions, and applied an unsupervised machine learning-based approach to identify subphenotypes. Our method, an extension of the Phe2Vec algorithm, uses tens of thousands of clinical concepts from multiple domains to represent patients’ clinical histories and then groups patients with similar presentations. The results indicate that cardiorespiratory presentations are most common (present in 54% of patients), followed by subphenotypes marked (in decreasing order of frequency) by musculoskeletal pain, neuropsychiatric conditions, gastrointestinal symptoms, headache, and fatigue.
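The general idea of representing each patient by the clinical concepts in their record and then grouping similar patients can be sketched as follows. This is only an illustrative toy under stated assumptions, not the authors' pipeline: the two-dimensional concept vectors are hand-made rather than learned, averaging concept vectors is one simple way to form a patient vector, and plain k-means stands in for the clustering step, which the abstract does not name. All identifiers (CONCEPT_VECTORS, embed_patient, kmeans) are hypothetical.

```python
import math
import random

# Hypothetical 2-D concept embeddings (toy values, NOT learned from data).
CONCEPT_VECTORS = {
    "cough": (1.0, 0.1),
    "dyspnea": (0.9, 0.2),
    "chest_pain": (0.8, 0.0),
    "headache": (0.1, 1.0),
    "migraine": (0.0, 0.9),
    "dizziness": (0.2, 0.8),
}


def embed_patient(concepts):
    """Represent a patient as the mean of their concept vectors (an assumption)."""
    vecs = [CONCEPT_VECTORS[c] for c in concepts]
    return tuple(sum(axis) / len(vecs) for axis in zip(*vecs))


def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means, used here as a stand-in clustering method."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


# Four made-up patients: two with respiratory concepts, two with neurological ones.
patients = {
    "p1": ["cough", "dyspnea"],
    "p2": ["dyspnea", "chest_pain"],
    "p3": ["headache", "migraine"],
    "p4": ["migraine", "dizziness"],
}
points = [embed_patient(c) for c in patients.values()]
labels = kmeans(points, k=2)
for name, label in zip(patients, labels):
    print(name, "-> cluster", label)
```

In this toy example the two patients with respiratory concepts fall into one cluster and the two with neurological concepts into the other; a real analysis would use learned embeddings over many thousands of concepts and a clustering procedure chosen and validated for the data.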
Competing Interest Statement
Dr. Jhaveri is a consultant for AstraZeneca, Seqirus, and Dynavax, and receives an editorial stipend from Elsevier and the Pediatric Infectious Diseases Society and royalties from UpToDate/Wolters Kluwer. Dr. Rao reports prior grant support from GSK and Biofire and is a consultant for Seqirus. Dr. Bailey has received grants from the Patient-Centered Outcomes Research Institute. Dr. Brill received support from Novartis and Regeneron Pharmaceuticals within the last year. Dr. Horne is a member of the advisory boards of Opsis Health and Lab Me Analytics, a consultant to Pfizer regarding risk scores (funds paid to Intermountain), an inventor of risk scores licensed by Intermountain to Alluceo and CareCentra, site PI of a COVID-19 grant from the Task Force for Global Health, and site PI of grants from the Patient-Centered Outcomes Research Institute. All other authors have no conflicts of interest to disclose.
Funding Statement
This research was funded by the National Institutes of Health (NIH) Agreement OTA OT2HL161847-01 as part of the Researching COVID to Enhance Recovery (RECOVER) Initiative.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Institutional Review Board (IRB) approval was obtained under Biomedical Research Alliance of New York (BRANY) protocol #21-08-508. As part of the BRANY IRB process, the protocol has been reviewed in accordance with institutional guidelines. BRANY waived the need for consent and HIPAA authorization.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The results reported here are based on detailed individual-level patient data compiled as part of the RECOVER Program. Due to the high risk of reidentification based on the number of unique patterns in the data, patient privacy regulations prohibit us from releasing the data publicly. The data are maintained in a secure enclave, with access managed by the program coordinating center to remain compliant with regulatory and program requirements. Please direct requests to access the data, either for reproduction of the work reported here or for other purposes, to recover@chop.edu.
https://github.com/PEDSnet/recover_pasc_subphenotype_manuscript