ABSTRACT
Purpose This study assessed the performance of the International Classification of Diseases 10th Revision, Clinical Modification (ICD-10-CM) coronavirus disease 2019 (COVID-19) diagnostic code U07.1 against polymerase chain reaction (PCR) test results (Objective 1), and of an electronic medical record (EMR)-based codified algorithm for severe COVID-19 illness, based on endpoints used in the Pfizer-BioNTech COVID-19 vaccine trial, against chart review (Objective 2).
Methods This retrospective, longitudinal cohort study used EMR data from the Mass General Brigham COVID-19 Data Mart (3/1/2020–11/19/2020) for adult patients with ≥1 PCR test, antigen test, or code U07.1 (Objective 1) and adult patients with a positive PCR test hospitalized with COVID-19 (Objective 2).
Results Among 354,124 patients in Objective 1, 96% had ≥1 PCR test (including 6% with ≥1 positive PCR test; 11% with ≥1 code U07.1). Code U07.1 had low sensitivity (54%) and positive predictive value (PPV; 63%) but high specificity (97%) against the PCR test. Among 300 patients hospitalized for COVID-19 randomly sampled for chart review in Objective 2, the EMR-based case definition for severe COVID-19 illness had high PPV (>95%), showing better performance than severe/critical COVID-19 endpoints defined by the World Health Organization (PPV: 79%).
Conclusions COVID-19 diagnosis based on ICD-10-CM code U07.1 had inadequate sensitivity and requires confirmation by PCR testing. The EMR-based case definition showed high PPV and can be used to identify cases of severe COVID-19 illness in real-world datasets. These findings highlight the importance of validating outcomes in real-world data, and can guide researchers analyzing COVID-19 data when PCR tests are not readily available.
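The sensitivity, specificity, and PPV reported in the Results all derive from a 2x2 table that cross-classifies the index test (for example, presence of code U07.1) against the reference standard (for example, the PCR result). The following is a minimal sketch of those calculations, not the study's actual analysis code. The class name, function names, and cell counts are hypothetical and chosen only for illustration; the counts are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: index test vs. reference standard."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative


def _ratio(numerator: int, denominator: int) -> float:
    """Divide two counts, failing loudly when the metric is undefined."""
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Share of reference-positive patients flagged by the index test."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of reference-negative patients not flagged by the index test."""
    return _ratio(c.tn, c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Share of index-test-positive patients who are reference positive."""
    return _ratio(c.tp, c.tp + c.fp)


# Hypothetical counts for illustration only (NOT taken from the study).
example = ConfusionCounts(tp=40, fp=20, fn=10, tn=930)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"PPV         = {ppv(example):.3f}")
```

Unlike sensitivity and specificity, PPV depends on how common the condition is in the population being tested, so a PPV estimated in one cohort may not carry over to a cohort with a different prevalence.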
This study evaluated the performance of International Classification of Diseases 10th Revision, Clinical Modification (ICD-10-CM) codes and an electronic medical record (EMR)-based algorithm for identifying coronavirus disease 2019 (COVID-19) diagnosis and severe COVID-19 illness in real-world data.
ICD-10-CM code U07.1 for COVID-19 had low sensitivity and positive predictive value (PPV) against PCR tests.
The EMR-based algorithm for severe COVID-19 illness developed from the Pfizer–BioNTech COVID-19 vaccine trial had high PPV against chart review, and may be used to identify severe cases in real-world data.
These results highlight the importance of validating outcomes when conducting analyses of real-world datasets.
PLAIN LANGUAGE SUMMARY
As polymerase chain reaction (PCR) tests for coronavirus disease 2019 (COVID-19) diagnosis are becoming less frequently used and there is no standard definition of severe COVID-19 illness, it is important to have a way of correctly identifying COVID-19 diagnosis or severe COVID-19 illness in real-world data (e.g., electronic medical records [EMRs]). This study examined: 1) how a diagnosis code for COVID-19 used in EMRs (i.e., U07.1) compares to PCR test results in terms of accurately identifying patients with COVID-19; and 2) whether a definition for severe COVID-19 illness developed based on the Pfizer–BioNTech COVID-19 vaccine trial, and a definition used by the World Health Organization (WHO), can be used to accurately identify patients with severe COVID-19 illness in EMRs. The results showed that code U07.1 was not very accurate in identifying patients with COVID-19. On the other hand, the developed definition for severe COVID-19 illness was more accurate than the WHO definition and was able to identify most patients with severe COVID-19 illness in real-world data.
Competing Interest Statement
The authors have the following conflicts of interest to declare: Mei Sheng Duh, Catherine Nguyen, Rose Chang, Maral DerSarkissian, Azeem Banatwala, Louise H. Yu, Bruce E. Stangle, and Pierre Y. Cremieux are employees of Analysis Group, Inc., which received research funding from Pfizer for this study. Heather Rubino and Francesca Kolitsopoulos are full-time employees and stock shareholders of Pfizer. Christopher Herrick, Yichuan Grace Hsieh, Gregory Belsky, Marykate E. Murphy, Janet Boyle-Kelly, Andrew Cagan, and Shawn N. Murphy have nothing to disclose.
Funding Statement
This study was funded by Pfizer, Inc., New York, NY, USA. The study sponsor was involved in several aspects of the research including study design, data interpretation, manuscript writing, and decision to submit the manuscript for publication.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The Institutional Review Board of Mass General Brigham gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript.