ABSTRACT
Background The overlapping clinical presentations of patients with acute respiratory disease can complicate diagnosis. Whilst PCR diagnostic methods to identify SARS-CoV-2 are highly sensitive, they have shortcomings, including a risk of false-positive results and slow turnaround times. Changes in host gene expression can be used to distinguish between disease groups of interest, providing a viable alternative approach to infectious disease diagnosis.
Methods We interrogated the whole blood gene expression profiles of patients with COVID-19 (n=87), bacterial infections (n=88), viral infections (n=36), and not-infected controls (n=27) to identify a sparse diagnostic signature for distinguishing COVID-19 from other clinically similar infectious and non-infectious conditions. The signature was validated in a new cohort using reverse transcription quantitative polymerase chain reaction (RT-qPCR) and then externally validated in an independent in silico RNA-seq cohort.
Findings We identified a 10-gene signature (OASL, UBP1, IL1RN, ZNF684, ENTPD7, NFKBIE, CDKN1C, CD44, OTOF, MSR1) that distinguished COVID-19 from other infectious and non-infectious diseases with an AUC of 87.1% (95% CI: 82.6%-91.7%) in the discovery cohort, and AUCs of 88.7% and 93.6% in the RT-qPCR validation and in silico cohorts, respectively.
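For readers less familiar with the metric, the sketch below illustrates how an AUC and a 95% confidence interval of the kind reported above can be computed from per-patient signature scores. It is a minimal illustration, not the method used in this study: the percentile bootstrap, the function names `auc` and `bootstrap_ci`, and the synthetic scores are all assumptions introduced here.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.
    Assumption: the study's actual CI method is not stated in this section."""
    rng = random.Random(seed)
    idx = list(range(len(scores)))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Synthetic scores for illustration only; these are not study data.
    rng = random.Random(1)
    labels = [1] * 40 + [0] * 60
    scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]
    point = auc(scores, labels)
    lower, upper = bootstrap_ci(scores, labels)
    print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

Here a "positive" would correspond to a COVID-19 case and a "negative" to any comparator patient, with the score being whatever single value the signature assigns to each sample.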
Interpretation Using well-phenotyped samples collected from patients admitted acutely with a spectrum of infectious and non-infectious syndromes, we provide a detailed catalogue of blood gene expression at the time of hospital admission. From this catalogue we identified a 10-gene host diagnostic signature that accurately distinguishes COVID-19 from other infection syndromes presenting to hospital. This could be developed into a rapid point-of-care diagnostic test, providing a valuable syndromic diagnostic tool for use early in future pandemics.
Funding Imperial COVID fund; NIHR Imperial BRC; UKRI (ISARIC-4C).
Evidence before this study Rapid diagnosis is fundamental for ensuring that high consequence infections are identified at an early stage, and that correct and timely treatment is started. Pathogen-focused diagnostic tools may not be available early in a pandemic. To determine whether host-based syndromic diagnostic tools to identify acute COVID-19 in the emergency setting have been developed, we searched PubMed for all hits between January 2020 and July 2023 using the following search terms: “COVID19” AND “viral” AND “whole blood” AND (“RNAseq” OR “RNA-Seq” OR “transcriptomic” OR “transcriptome” OR “gene expression”) AND (“signature” OR “diagnosis” OR “classification” OR “classifier”). This returned 16 studies, with two focused on paediatric populations and one focused on an elderly population. A further two studies explored the utility of host gene expression in predicting viral infection severity, and one study explored whole blood transcriptome profiles of patients with SARS-CoV-2 but contrasted them only with healthy controls rather than clinically similar disease cohorts. One study demonstrated that metabolomic biomarkers can distinguish COVID-19 and viral infections from other disease groups, and a further study showed that host gene expression (nasopharyngeal swabs and whole blood) differs between patients with COVID-19 and those with influenza, other seasonal coronaviruses, and bacterial sepsis, using classifiers with as few as 20 genes to perform diagnosis. These studies show that acute infection with SARS-CoV-2 can give rise to specific gene expression changes in the host that may differ from those seen in clinically similar infectious or non-infectious presentations. However, to date no signature has been adapted to a diagnostic platform, and none has been validated to discriminate SARS-CoV-2 from other infectious syndromes.
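The search described above could be reproduced programmatically along the following lines. This is a sketch under stated assumptions: it builds a request URL for the NCBI E-utilities `esearch` endpoint without sending it, and the choice of publication date (`pdat`) as the date field, the JSON return mode, and the helper name `build_esearch_url` are assumptions, since the section does not say how the search was run.

```python
from urllib.parse import urlencode

# Search terms as quoted in the text (with plain ASCII quotation marks).
SEARCH_TERM = (
    '"COVID19" AND "viral" AND "whole blood" AND '
    '("RNAseq" OR "RNA-Seq" OR "transcriptomic" OR "transcriptome" '
    'OR "gene expression") AND '
    '("signature" OR "diagnosis" OR "classification" OR "classifier")'
)

ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, mindate, maxdate):
    """Return an E-utilities esearch URL for PubMed (hypothetical helper).
    Assumption: the date window is applied to publication date (pdat)."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",
        "mindate": mindate,
        "maxdate": maxdate,
        "retmode": "json",
    }
    return ESEARCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # January 2020 to July 2023, as stated in the text.
    print(build_esearch_url(SEARCH_TERM, "2020/01", "2023/07"))
```

The number of hits returned today may differ from the 16 studies reported, because PubMed indexing changes over time.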
Added value of this study Our study provides a unique snapshot of gene expression in a large cohort of well-phenotyped adults at the point of admission to an emergency department with a range of suspected infections including COVID-19. We identified a 10-gene signature that outperformed common laboratory markers, such as CRP and white cell count, for discriminating patients with COVID-19 from those with clinically similar infectious and non-infectious diseases. This signature was effective in a completely independent cohort of patients recruited in the United States, as well as in a validation cohort from the emergency department using a different quantitation platform (RT-qPCR). Taken together, these findings show that acute COVID-19 can be differentiated from other emergency presentations using a sparse combination of host transcripts in blood. The findings allow a gene expression signature to be developed into a rapid point-of-care diagnostic test to differentiate serious COVID-19-like infection from other similar presentations.
Implications for practice or policy and future research combined with existing evidence. PCR-based diagnostic approaches have high sensitivity and specificity for detecting SARS-CoV-2 and other viruses in the respiratory tract; however, there are many situations where the results may not indicate active disease and can be misleading. Host response-based diagnostics can provide supporting evidence of an active viral infection, and could prove essential in settings where emerging virus variants elude detection by PCR, or where no PCR diagnostic exists.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by: Imperial COVID fund; NIHR Imperial BRC; UKRI (ISARIC-4C).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The national research ethics committee of South Central Oxford C gave ethical approval for this work (references 14/SC/0008 and 19/SC/0116). The national research ethics committee of Wales REC3 also gave ethical approval for this work (reference 17/WA/0161).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability Statement
De-identified participant level data and metadata leading to signature discovery are available on ArrayExpress (E-MTAB-10527 and E-MTAB-13307 respectively). BioAID study protocol and other materials are available at https://imperialbrc.nihr.ac.uk/2020/01/28/bioaid/. De-identified participant level data from BioAID patients can be made available to investigators whose proposed use of the data has been approved by the BioAID governance review committee.