Augmented Curation of Unstructured Clinical Notes from a Massive EHR System Reveals Specific Phenotypic Signature of Impending COVID-19 Diagnosis
=================================================================================================================================================

* FNU Shweta
* Karthik Murugadoss
* Samir Awasthi
* AJ Venkatakrishnan
* Arjun Puranik
* Martin Kang
* Brian W. Pickering
* John C. O’Horo
* Philippe R. Bauer
* Raymund R. Razonable
* Paschalis Vergidis
* Zelalem Temesgen
* Stacey Rizza
* Maryam Mahmood
* Walter R. Wilson
* Douglas Challener
* Praveen Anand
* Matt Liebers
* Zainab Doctor
* Eli Silvert
* Hugo Solomon
* Tyler Wagner
* Gregory J. Gores
* Amy W. Williams
* John Halamka
* Venky Soundararajan
* Andrew D. Badley

## Abstract

Understanding the temporal dynamics of COVID-19 patient phenotypes is necessary to derive fine-grained resolution of the pathophysiology. Here we use state-of-the-art deep neural networks over an institution-wide machine intelligence platform for the augmented curation of 8.2 million clinical notes from 14,967 patients subjected to COVID-19 PCR diagnostic testing. By contrasting the Electronic Health Record (EHR)-derived clinical phenotypes of COVID-19-positive (COVID*pos*, n=272) versus COVID-19-negative (COVID*neg*, n=14,695) patients over each day of the week preceding the PCR testing date, we identify diarrhea (2.8-fold), change in appetite (2-fold), anosmia/dysgeusia (28.6-fold), and respiratory failure (2.1-fold) as significantly amplified in COVID*pos* over COVID*neg* patients. The specific combination of cough and diarrhea has a 4-fold amplification in COVID*pos* patients during the week prior to PCR testing, and along with anosmia/dysgeusia, constitutes the earliest EHR-derived signature of COVID-19 (4-7 days prior to typical PCR testing date). This study introduces an Augmented Intelligence platform for the real-time synthesis of institutional knowledge captured in EHRs.
The platform holds tremendous potential for scaling up curation throughput, with minimal need for training underlying neural networks, thus promising EHR-powered early diagnosis for a broad spectrum of diseases.

Coronavirus disease 2019 (COVID-19) is a respiratory infection caused by the novel Severe Acute Respiratory Syndrome coronavirus-2 (SARS-CoV-2). As of April 15, 2020, according to the WHO, there have been more than 1.9 million confirmed cases worldwide and more than 123,000 deaths attributable to COVID-19 ([https://covid19.who.int/](https://covid19.who.int/)). The clinical course and prognosis of patients with COVID-19 vary substantially, even among patients with similar age and comorbidities1. Following exposure and initial infection with SARS-CoV-2, likely through the upper respiratory tract, patients can remain asymptomatic although active viral replication may be present for weeks before symptoms manifest1,2. The asymptomatic nature of initial SARS-CoV-2 infection in the majority of patients may be exacerbating the rampant community transmission observed3. It remains unknown which patients become symptomatic, and in those who do, the timeline of symptoms remains poorly characterized and non-specific. Symptoms may include fever, fatigue, myalgias, loss of appetite, loss of smell (anosmia), and altered sense of taste, in addition to the respiratory symptoms of dry cough, dyspnea, sore throat, and rhinorrhea, as well as gastrointestinal symptoms of diarrhea, nausea, and abdominal discomfort4. A small proportion of COVID-19 patients progress to severe illness requiring hospitalization or intensive care management; among these individuals, mortality owing to Acute Respiratory Distress Syndrome (ARDS) is higher5. The estimated average time from symptom onset to resolution can range from three days to more than three weeks, with a high degree of variability6.
The COVID-19 public health crisis demands a data science-driven and temporal pathophysiology-informed precision medicine approach for its effective clinical management. Here we introduce a platform for the augmented curation of the full spectrum of patient phenotypes from 8.2 million clinical notes of the Mayo Clinic EHRs for 14,967 patients with confirmed positive/negative COVID-19 diagnosis by PCR testing (see ***Methods***). The platform utilizes state-of-the-art transformer neural networks on the unstructured clinical notes to automate entity recognition (e.g. diseases, drugs, phenotypes), quantify the strength of contextual associations between entities, and characterize the nature of association into positive, negative, or other sentiments. We identify specific gastro-intestinal, respiratory, and sensory phenotypes, as well as some of their specific combinations, that appear to be indicative of impending COVID*pos* diagnosis by PCR testing. This highlights the potential for neural networks-powered EHR curation to facilitate a significantly earlier diagnosis of COVID-19 than currently thought feasible.

## Results

The clinical determination of the COVID-19 status for each patient was conducted using the SARS-CoV-2 PCR (RNA) test approved for human nasopharyngeal and oropharyngeal swab specimens under the U.S. FDA emergency use authorization (EUA)6. This PCR test resulted in 14,695 COVID*neg* patient diagnoses and 272 COVID*pos* patient diagnoses. In order to investigate the time course of COVID-19 progression in patients, we used BERT-based deep neural networks to extract symptoms and their putative synonyms from the clinical notes for a few weeks prior to, and a few weeks after, the date when the COVID-19 diagnostic test was taken (see ***Methods***; **Table 1**).
For the purpose of this analysis, all patients were temporally aligned by setting the date of COVID-19 PCR testing to ‘day 0’, and the proportion of patients demonstrating each symptom derived from the EHR over each day of the week preceding and following PCR testing was tabulated (**Table 2**). As a negative control, we included a non-COVID-19 symptom, ‘dysuria’.

[Table 1.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T1) Augmented curation of the unstructured clinical notes from the EHR reveals specific clinically confirmed phenotypes that are amplified in COVID*pos* patients over COVID*neg* patients in the week prior to the SARS-CoV-2 PCR testing date. The key COVID*pos* amplified phenotypes in the week preceding PCR testing (i.e. day = -7 to day = -1) are highlighted in gray. The ratio of COVID*pos* to COVID*neg* proportions represents the fold change amplification of each phenotype in the COVID*pos* patient set.

[Table 2.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T2) Temporal analysis of the EHR clinical notes for the week preceding PCR testing (i.e. day -7 to day -1), the day of PCR testing (day 0), and the subsequent pair of days (day 1, day 2) in COVID*pos* and COVID*neg* patients. Temporal enrichment for each symptom is quantified using the ratio of COVID*pos* patient proportion over the COVID*neg* patient proportion for each day. The patient proportions in the rows labeled ‘Positive (n = 272)’ and ‘Negative (n = 14695)’ are represented as percentages.

In the COVID*pos* patients, diarrhea occurs in 43 of 272 patients (15.8%) in the week prior to the PCR testing date, whereas in the COVID*neg* patients, only 822 of 14,695 patients (5.6%) have confirmed diarrhea in the week prior to the PCR test date. The amplified probability (**Table 1**; 2.8-fold; p-value = 8.4E-13) of diarrhea in the week preceding PCR testing for COVID*pos* patients is quite noteworthy.
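To make the arithmetic behind these figures concrete, the following is a minimal Python sketch of the fold-change ratio and a two-proportion z-test applied to the diarrhea counts quoted above. The function name is hypothetical, and the choice of a pooled standard error with a two-sided p-value is an assumption; the Methods state only that a standard 2-proportion z-test was used.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (fold_change, z, two_sided_p) for two observed proportions.

    Assumption: pooled standard error and a two-sided alternative.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail of the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 / p2, z, p_value


# Diarrhea in the week before PCR testing: 43 of 272 COVID-positive
# versus 822 of 14,695 COVID-negative patients (counts from the text).
fold, z, p = two_proportion_z_test(43, 272, 822, 14695)
print(f"fold change = {fold:.1f}, z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions the fold change rounds to the 2.8 reported in Table 1, and the p-value is of the same order of magnitude as the reported value.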
Some of these undiagnosed COVID-19 patients who experience diarrhea may be unintentionally shedding SARS-CoV-2 fecally7. Incidentally, epidemiological surveillance by wastewater monitoring conducted recently in the state of Massachusetts observed copious SARS-CoV-2 RNA8. The amplification of diarrhea in COVID*pos* over COVID*neg* patients in the week preceding PCR testing highlights the importance of frequent handwashing. Change in appetite/intake is amplified in the week preceding PCR testing in COVID*pos* over COVID*neg* patients (**Table 1**, 2.0-fold amplification; p-value = 0.0026). Altered or diminished sense of taste or smell (dysgeusia or anosmia) is significantly amplified in COVID*pos* over COVID*neg* patients in the week preceding PCR testing (**Table 1**; 28.6-fold amplification; p-value = 5.1E-36). This result suggests that anosmia is likely a significant early indicator of COVID-19, including in otherwise asymptomatic patients. Respiratory failure is modestly enriched in the week prior to PCR testing in COVID*pos* over COVID*neg* patients (2.1-fold amplification; p-value = 0.01; **Table 1**). Among other common phenotypes, diaphoresis manifests in 31 of 272 patients (11.4%) and fatigue in 37 of 272 patients (13.6%) during the week prior to COVID*pos* PCR testing. In contrast, for the COVID*neg* patients, diaphoresis occurs in 825 of 14,695 patients (5.6%) and fatigue in 1,279 of 14,695 (8.7%) during the week prior to the PCR test. This corresponds to a 2-fold amplification of diaphoresis (p-value = 4.7E-05) and 1.56-fold amplification of fatigue (p-value = 0.005). Headache occurs in 35 of 272 COVID*pos* patients (12.9%) and in 1,023 of 14,695 COVID*neg* patients (7.0%), reflecting a 1.9-fold amplification in COVID*pos* patients in the week prior to the PCR test (p-value = 0.0002). Cough has a 1.3-fold amplification (p-value = 0.01) in COVID*pos* over COVID*neg* patients in the week preceding PCR testing (**Table 1**).
Fever/chills occur in 67 of 272 COVID*pos* patients (24.6%) and in 2,726 of 14,695 COVID*neg* patients (18.6%) in the week prior to the PCR test. This suggests that fever/chills is somewhat nonspecific to COVID-19 patients. Finally, dysuria was included as a negative control for COVID-19, and consistent with this assumption, 1 out of 272 COVID*pos* patients (0.4%) and 91 out of 14,695 COVID*neg* patients (0.6%) had dysuria during the week preceding PCR testing (**Table 1**).

Next, we considered the 351 possible pairwise conjunctions of 27 phenotypes for COVID*pos* versus COVID*neg* patients in the week prior to the PCR testing date (**Table S1**). Given that an altered sense of smell or taste (anosmia/dysgeusia) occurs in very few of these conjunctions within the COVID*pos* patients to date, and that anosmia/dysgeusia is independently already a significant signature of impending COVID*pos* diagnosis (based on the above results), here we only remark on the other 325 possible pairwise symptom combinations. The combination of cough and diarrhea is noted to be particularly significant in COVID*pos* over COVID*neg* patients during the week preceding PCR testing; i.e. cough and diarrhea co-occur in 36 of 272 COVID*pos* patients (13.2%) and in 486 of 14,695 COVID*neg* patients (3.3%), indicating a 4-fold amplification of this specific symptom combination as a signature of impending COVID-19 diagnosis (BH corrected p-value = 9.3E-19). Another enriched combination of symptoms in COVID*pos* over COVID*neg* patients in the week preceding PCR testing is diaphoresis and diarrhea, which co-occur in 21 of 272 COVID*pos* (7.7%) and 204 of 14,695 COVID*neg* (1.4%) patients. This corresponds to a 5.6-fold enrichment (BH corrected p-value = 1.8E-17) in the COVID*pos* patient group and suggests diaphoresis and diarrhea as another symptom combination preceding COVID*pos* PCR test results.
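The pair counts and the multiple-testing correction used in the pairwise analysis above can be illustrated with a short Python sketch. The phenotype names and the raw p-values below are placeholders invented for illustration, and the `benjamini_hochberg` helper is a generic textbook implementation of the Benjamini-Hochberg (BH) step-up adjustment, not necessarily the exact routine used in the study.

```python
from itertools import combinations
from math import comb


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Placeholder phenotype labels; only the count (27) comes from the text.
phenotypes = [f"phenotype_{k}" for k in range(27)]
pairs = list(combinations(phenotypes, 2))
print(len(pairs))   # number of unordered pairs of 27 phenotypes
print(comb(26, 2))  # pairs remaining once one phenotype is set aside

# Made-up raw p-values, purely to show the adjustment.
raw = [0.001, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

The two counts reproduce the 351 and 325 pair totals quoted above. In practice each pair would carry the p-value of its own 2-proportion z-test before adjustment.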
[Table S1.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T3) Collection of treatments undergoing clinical trials for COVID-19.

[Table S2.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T4) Disease Progression of COVID-19 and Associated Treatment.

[Table S3.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T5) Symptoms and their synonyms used for the EHR analysis.

[Table S4.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/T6) Pairwise analysis of phenotypes in the COVID*pos* and COVID*neg* cohorts.

We further investigated the temporal evolution of the proportion of patients with each symptom over the week prior to PCR testing. Cough and diarrhea were found to be the early indicators that significantly discriminate COVID*pos* from COVID*neg* patients. In particular, between 3 and 7 days prior to PCR testing, cough is amplified in the COVID*pos* patient cohort over the COVID*neg* patient cohort with an amplification of 5.5-fold (day -7, p-value = 7.0E-16), 5.3-fold (day -6, p-value = 5.4E-15), 4.7-fold (day -5, p-value = 1.4E-10), 3.9-fold (day -4, p-value = 3.6E-08), and 3.8-fold (day -3, p-value = 7.3E-08). The diminishing odds of cough as a symptom from 7 to 3 days preceding the PCR testing date suggest this may be a notable temporal pattern. Likewise, diarrhea is amplified in the COVID*pos* patient cohort over the COVID*neg* patient cohort with an amplification of 5.7-fold (day -7, p-value = 6.1E-11), 5-fold (day -6, p-value = 5.5E-08), 3.2-fold (day -5, p-value = 3E-03), and 4.2-fold (day -4, p-value = 7E-06). Following the enriched odds of diarrhea and cough, we further find that change in appetite may be considered a subsequent symptom of impending COVID-19 diagnosis.
This is because change in appetite is amplified in the COVID*pos* cohort over the COVID*neg* cohort on day -4 (3.4-fold, p-value = 4.4E-03), day -3 (3.7-fold, p-value = 2.9E-03), and day -2 (4.2-fold, p-value = 6.5E-05). Finally, as the day of PCR testing arrives, fever/chills are enriched in the COVID*pos* over the COVID*neg* cohort, with 1.6-fold (day 0, p-value =), 2.4-fold (day 1, p-value =) and 2-fold (day 2, p-value = 1.7E-04) amplification, respectively. Similarly, cough is also amplified in the COVID*pos* over the COVID*neg* cohort on day 0 (1.9-fold, p-value = 2.1E-06), day 1 (2.8-fold, p-value = 2.3E-12) and day 2 (2-fold, p-value = 2.8E-03) following the PCR testing date. These observations characterize the temporal evolution of specific phenotypes that are enriched in COVID*pos* patients before and after the PCR testing date.

While explicit identification of SARS-CoV-2 in patients prior to the PCR testing date was not conducted, such prospective validation of our augmented EHR curation approach is being initiated. Nevertheless, this high-resolution temporal overview of the EHR-derived clinical phenotypes as they relate to the SARS-CoV-2 PCR diagnostic testing date for 14,967 patients has revealed specific enriched signals of impending COVID-19 onset. These clinical insights can help modulate social distancing measures and appropriate clinical care for individuals exhibiting the specific gastro-intestinal (diarrhea, change in appetite/intake), sensory (anosmia, dysgeusia) and respiratory phenotypes identified herewith, including for patients awaiting conclusive COVID-19 diagnostic testing results (e.g. by SARS-CoV-2 RNA RT-PCR).

## Discussion

In order to identify potential cells and tissue types that may be associated with the EHR-derived clinical phenotypes observed above for COVID-19 patients, we analyzed single cell RNA-seq data using the nferX platform (see **Methods**)9.
Given recent studies implicating the necessity of both ACE2 and TMPRSS2 for the SARS-CoV-2 lifecycle10, we scouted for human cells that co-express both genes. This co-expression analysis revealed that specific cell types from the small intestine/colon, nasal cavity, respiratory system, pancreas, urinary tract, and gallbladder co-express both ACE2 and TMPRSS2 (**Figure 1, Figure S1**). Notably, multiple small intestine cell types co-express the two genes. These cell types include enterocytes, enteroendocrine cells, stem cells, goblet cells, and Paneth cells. In the pancreas, the cell types included ductal cells and acinar cells. The kidney cells co-expressing TMPRSS2 and ACE2 include proximal tubular cells, pelvic epithelial cells and type A intercalated cells. Co-expression of TMPRSS2 and ACE2 is also observed in the epithelial cells of the olfactory nasal cavity and the respiratory tract as well as in type II pneumocytes (albeit at a comparatively lower level). While the identified tissues showing co-expression of ACE2 and TMPRSS2 in the gastro-intestinal, respiratory, and sensory systems correlate with the clinical phenotypes of early COVID-19 infection as described above, these insights are derived from normal/healthy tissues. This highlights the need for meticulous bio-banking of COVID-19 patient-derived biospecimens and their characterization via single cell RNA-seq and other molecular technologies.

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2020/04/23/2020.04.19.20067660/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/F1) Clinical phenotypes of COVID-19 and their connection to single cell RNA-seq co-expression of ACE2-TMPRSS2. Severity of COVID-19 and associated clinical conditions are shown. Cell types co-expressing SARS-CoV-2 infectivity determinants ACE2 and TMPRSS2 determined by single cell RNA-seq are mapped onto the COVID-19 pathophysiology summary.
Primary prevention is the most effective method to minimize the spread of contagious infectious viruses such as SARS-CoV-2 (**Figure S2**). In addition to population-based strategies such as social distancing, there are significant ongoing efforts to develop a prophylactic solution (**Table S1**). As the immunodominant humoral immune response in patients is directed against the SARS-CoV-2 spike protein, many vaccines under investigation target this viral protein. It remains to be determined whether anti-spike protein antibodies induced by natural infection or by vaccines are neutralizing. Chloroquine and its analogues have been shown to inhibit virus replication in-vitro28. Whether Chloroquine or Hydroxychloroquine has meaningful effects on SARS-CoV-2 replication in patients remains to be understood and is the subject of clinical trials, both as post-exposure prophylaxis and as treatment (**Table S1**). Hydroxychloroquine was approved by the FDA on April 7, 2020 for emergency use in hospitalized COVID-19 patients who are not eligible for clinical trials, based on limited clinical data, but concerns have been raised about toxicity and risk of sudden death29.

Our findings from the EHR analysis of COVID-19 progression can aid in a human pathophysiology-enabled summary of the experimental therapies being investigated for COVID-19 (**Figure 2, Table S1**). Some of the earliest phases of intervention attempt to inhibit the entry/replication of SARS-CoV-2 by modulating critical host targets (e.g. renin angiotensin aldosterone system/RAAS inhibitors, ACE2 analogs, serine protease inhibitors) or directly inhibiting the function of viral proteins (e.g. viral RNA-dependent RNA polymerase inhibitors, protease inhibitors, convalescent plasma, synthetic immunoglobulins) (**Box 1, Table S1**).
In patients with more advanced stages of disease progression, who suffer from respiratory abnormalities, therapeutics are being advanced to target the inflammatory response that can lead to Acute Respiratory Distress Syndrome (ARDS) and is associated with high mortality (**Box 1**). These include anti-GM-CSF agents, anti-IL-6 agents, JAK inhibitors, and complement inhibitors. Another emerging option for patients at this stage is convalescent plasma, which has shown some clinical benefits in cases of COVID-19 and related viral diseases (SARS-1, MERS) at various stages of severity (**Box 1**). Administration of convalescent plasma containing active specific antiviral antibodies may prevent or attenuate progression to severe disease. Expanded access to convalescent plasma for treatment of patients with COVID-19 has been approved by the FDA for emergency IND use and is available through a nationwide program led by Mayo Clinic (**Box 1**).

![Figure 2.](https://www.medrxiv.org/content/medrxiv/early/2020/04/23/2020.04.19.20067660/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2020/04/23/2020.04.19.20067660/F2) Pathophysiology of COVID-19, associated treatments, and the underlying molecular mechanisms. While there is no established treatment strategy for COVID-19, several classes of therapeutics have emerged for the medical management of the disease, on the basis of their known mechanisms of action and the pathophysiology of COVID-19.

In those who become symptomatic, it is imperative that diagnostic testing is done, at dedicated testing sites if available, to confirm diagnosis (**Figure S2**). Meanwhile, patients are recommended to self-quarantine at home, use mask protection when social distancing cannot be maintained, and continue supportive measures. For patients with mild symptoms, such measures may be sufficient given the self-limited nature of viral syndromes.
In the event of symptom exacerbation, often marked by worsening respiratory distress, medical evaluation and possibly hospitalization are warranted. The mainstay of treatment for COVID-19 remains supportive care and, as needed, supplemental oxygen. Experimental therapies intended to block SARS-CoV-2 viral entry and inhibit steps in the viral life cycle necessary for viral replication have been proposed at this early stage (**Figure 2**). The goal of these therapies is to reduce viral load, thus reducing the chance of an overwhelming immune reaction by delaying progression of the disease. Among the proposed treatment options for COVID-19, corticosteroids should be avoided outside a clinical trial, as suggested by the IDSA, until further clinical evidence can be established ([www.idsociety.org/practice-guideline/covid-19-guideline-treatment-and-management](http://www.idsociety.org/practice-guideline/covid-19-guideline-treatment-and-management)). This is because there has been conflicting evidence and guidance on steroid use in COVID-1930. While steroids can play a role in control of inflammation, a collection of clinical evidence from steroid use in other coronavirus outbreaks suggests that the use of corticosteroids might exacerbate COVID-19-associated lung injury31.

As patients progress to severe or critical disease, the primary objective of COVID-19 management is to provide respiratory support and control immune overactivation (**Figure 3, Figure S3**). Patients whose condition deteriorates to critical status primarily decompensate from a respiratory standpoint, but may also develop multi-organ failure (respiratory failure, cardiac failure, renal failure, hypercoagulable state, thrombotic microangiopathy), as well as severe inflammatory responses similar to cytokine release syndrome and eventually reactive hemophagocytic lymphohistiocytosis syndrome.
A major manifestation of respiratory decompensation and cytokine release syndrome is acute respiratory distress syndrome (ARDS). Critical care support, such as mechanical support ranging from noninvasive to invasive mechanical ventilation and, in some instances, extracorporeal support, vasopressors, renal replacement therapy, and anticoagulation, is paramount to the survival of these critically ill patients per SCCM guidelines (SCCM/ESICM 2020). On the other hand, drugs such as immunomodulatory agents often used to treat cytokine release syndrome may allow for some degree of improvement or recovery either leading into or during severe and critical disease (**Figure 2**).

This study demonstrates how highly unstructured institutional knowledge can be synthesized using deep learning and neural networks32. Expanding beyond one institution’s COVID-19 diagnostic testing and clinical care to the EHR databases of other academic medical centers and health systems will provide a more comprehensive view of clinical phenotypes enriched in COVID*pos* over COVID*neg* patients in the days preceding confirmed diagnostic testing. This requires leveraging a privacy-preserving federated software architecture that enables each medical center to retain control of its de-identified EHR databases, while enabling the machine learning models from partners to be deployed in its secure cloud infrastructure. Such seamless multi-institute collaborations over an Augmented Intelligence platform that puts patient privacy and HIPAA compliance first are being advanced actively through the Mayo Clinic’s Clinical Data Analytics Platform Initiative (CDAP). The capabilities demonstrated in this study for rapidly synthesizing over 8.2 million unstructured clinical notes to develop an EHR-powered clinical diagnosis framework will be further strengthened through such a universal biomedical research platform.
A caveat of relying solely on EHR inference is that mild phenotypes that may not lead to a presentation for clinical care, such as anosmia, may go unreported in otherwise asymptomatic patients. As at-home serology-based tests for COVID-19 with high sensitivity and specificity are approved, capturing these symptoms will become increasingly important in order to facilitate the continued development and refinement of disease models. EHR-integrated digital health tools may help address this need.

As we continue to understand the diversity of COVID-19 patient outcomes through holistic inference of EHR systems, it is equally important to invest in uncovering the molecular mechanisms and to gain cellular/tissue-scale pathology insights through large-scale patient-derived biobanking and multi-omics sequencing. As the anecdotal single cell RNA-seq (scRNA-seq) based co-expression analysis of ACE2 and TMPRSS2 on normal human samples conducted here highlights, the rich heterogeneity of cell types constituting various host tissues can be investigated in great detail by scRNA-seq. To correlate patterns of molecular expression from scRNA-seq with EHR-derived phenotypic signals of COVID-19 disease progression, a large-scale bio-banking system has to be created. Such a system will enable deep molecular insights into COVID-19 to be gleaned and triangulated with SARS-CoV-2 tropism and patient outcomes. Ultimately, connecting the dots between the temporal dynamics of COVID*pos* and COVID*neg* clinical phenotypes across diverse patient populations and the multi-omics signals from patient-derived biospecimens will help advance a more holistic understanding of COVID-19 pathophysiology. This will set the stage for a precision medicine approach to the diagnostic and therapeutic management of COVID-19 patients.
BOX 1

**Experimental therapies targeting entry and replication of SARS-CoV-2**

***RAAS inhibitors and ACE2 analogs***

One class of experimental therapies intended to inhibit viral entry and early disease in COVID-19 includes Renin Angiotensin Aldosterone System (RAAS) inhibitors and recombinant ACE2 (**Table S1**). ACE2 is the primary host receptor for SARS-CoV-2, while the serine protease TMPRSS2 is implicated in spike protein priming after viral binding10. Recombinant ACE2 has been proposed as an early COVID-19 therapy based on in-vitro data11. At this time, the effect of RAAS inhibitors is uncertain in the context of COVID-19. Studies have investigated how ACE expression is modulated by coronavirus infection, and how that relates to lung injury11. Trials are ongoing with Angiotensin Receptor Blockers (ARBs) for treatment of COVID-19 by diminishing downstream harmful effects of angiotensin receptor activation (**Figure 2**).

***Serine Protease inhibitors***

Given the TMPRSS2 involvement in viral entry (**Figure 2**), serine protease inhibitors such as Camostat are now under evaluation in trials and should also be considered in the early stages of SARS-CoV-2 infection.

***Viral RNA-dependent RNA polymerase inhibitors***

Of these, Remdesivir, a nucleoside analog, has attracted much attention for in-vivo inhibition of SARS-CoV-2, and a recent observational study of 53 patients who received Remdesivir under compassionate use found that 68% of patients demonstrated improvement in respiratory status after a 10-day regimen12. Another nucleoside analog, Galidesivir, is also under evaluation in patients. Yet another viral replication inhibitor in clinical trials is Favipiravir (**Figure 2**). Favipiravir is a broad-spectrum viral RNA-dependent RNA polymerase inhibitor that has been shown to have in-vivo activity against a wide range of RNA viruses.
In one RCT of 240 patients, Favipiravir was found to improve the clinical recovery rate of COVID-19 relative to Umifenovir, a viral entry inhibitor13 (**Table S1**).

***HIV Protease inhibitors***

This class of medication is widely proposed and used off-label based on postulates that HIV and HCV proteases share structural similarities with those of SARS-CoV-214. Of these, Lopinavir/Ritonavir (combination) has shown promise but was found to have a non-significant benefit in a Randomized Clinical Trial (RCT) of 199 patients in China15, while Darunavir has shown no significant activity against SARS-CoV-2 in-vitro (**Table S1**)16. Multiple randomized, controlled clinical trials are now underway in the USA to determine the efficacy of these drugs in the treatment of COVID-19.

***Other Antiviral Agents***

Another emerging option for patients at this stage is convalescent plasma (**Figure 2**), which has shown clinical benefits in cases of COVID-1917 and related viral diseases (SARS-1, MERS) at various stages of severity18,19. Administration of convalescent plasma containing specific antiviral antibodies may prevent or attenuate progression to severe disease. Expanded access to convalescent plasma for treatment of patients with COVID-19 is available through a program led by Mayo Clinic20. Synthetic hyperimmune globulins are also under development and evaluation.

**Agents being advanced that target the inflammatory response in COVID-19**

***Anti-GM-CSF agents***

A xenograft study found that granulocyte-macrophage colony-stimulating factor (GM-CSF) neutralization with Lenzilumab significantly reduced production of inflammatory cytokines21, offering evidence for the efficacy of anti-GM-CSF agents in prevention of CAR-T-induced cytokine release syndrome (CRS).
Lenzilumab has been approved by the FDA for emergency IND use for CRS in COVID-19, while others such as Mavrilimumab and Gimsilumab, aimed at controlling undesired inflammation from myeloid activation, will be evaluated in clinical trials.

***Anti-IL-6 agents***

IL-6 is a pro-inflammatory cytokine, regarded as a driver of CRS22 (**Figure 2A-C**). A recent report suggests IL-6 as a biomarker for respiratory failure in COVID-1922. As such, anti-IL-6 agents including Tocilizumab, Sarilumab and Siltuximab are being evaluated in randomized trials (**Table S1**), and used off-label in severe COVID-19 patients. Tocilizumab was approved for the treatment of CRS in 2017. An observational study of 21 patients with severe COVID-19 pneumonia treated with Tocilizumab showed promising results23,24.

***Anti-JAK agents***

A number of immunomodulatory agents not linked to CRS are also under trial for COVID-19 (**Figure 2**). Janus kinase (JAK) inhibitors such as Baricitinib, Fedratinib, and Ruxolitinib, indicated for Rheumatoid Arthritis and Myelofibrosis, have been tested in xenograft models for Chimeric Antigen Receptor (CAR) T-cell therapy induced CRS25. Ruxolitinib is available under an expanded access program in the USA for severely ill COVID-19 patients (**Table S1**) and trials are underway in other countries.

***Anti-Complement agents***

A recent study found that SARS-CoV-2 also binds to MASP2, a key driver of the complement activation pathway, leading to complement hyperactivation in COVID-19 patients26. Inhibitors of the terminal complement pathway such as Eculizumab have been tried in individuals in China, with improvements observed after administration.
**Agents targeting ventilation/perfusion defects in COVID-19-induced ARDS**

***Vasodilators***

A recent report based on 16 cases in Italy and Germany noted that, contrary to the established understanding in ARDS, COVID-19 patients in ARDS retain relatively high lung compliance27 and demonstrate ventilation/perfusion defects likely arising from perfusion dysregulation and hypoxic vasoconstriction. Therefore, patients with COVID-19 in ARDS may benefit from vasodilators to address this pathophysiologic mechanism. A trial is underway in China for use of inhaled nitric oxide in patients on mechanical ventilation (**Table S1**).

## Methods

### Augmented curation of SARS-CoV-2-positive patient charts

The nferX Augmented Curation technology was leveraged to rapidly curate the charts of SARS-CoV-2-positive patients. First, we read through the charts of 100 patients and identified and grouped symptoms into sets of synonymous words and phrases. For example, “SOB”, “shortness of breath”, and “dyspnea”, among others, were grouped into “shortness of breath”. We did the same for diseases and medications. For the SARS-CoV-2-positive patients, we identified a total of 26 symptom categories (**Table S1**) with 145 synonyms or synonymous phrases. Together, these synonyms and synonymous phrases capture a multitude of ways that symptoms related to COVID-19 are described in the Mayo Clinic Electronic Health Record (EHR) databases.

Next, for charts that had not yet been manually curated, we used state-of-the-art BERT-based neural networks32 to classify symptoms as being present or not present based on the surrounding phraseology. The neural network used to perform this classification was trained using nearly 250 different phenotypes and 20,000 sentences; it achieves over 96% recall for positive/negative sentiment classification. We went through individual sentences and either accepted the sentences or rejected and reclassified them.
The neural networks were actively re-trained as curation progressed, leading to stepwise increases in curation efficiency and model accuracy. In step 1 of this process, we labeled 11,433 sentences, 8,737 of which were labeled as either ‘present’ or ‘not present.’ The model trained on this data set (80%-20% training/test split) achieved F1 scores of 0.93 and 0.84 for ‘present’ and ‘not present’ classifications, respectively. In step 2, the model was applied to an additional 3,688 sentences, its output was rapidly corrected by a human for classification errors, and the model was re-trained to generate a newer version. Step 3 was an iteration of step 2 on an additional 3,369 sentences. The model achieved F1 scores of 0.96/0.91 after step 2 and 0.96/0.96 after step 3 for the classification of ‘present’/‘not present.’ Due to the augmented nature of this approach, steps 2 and 3 required successively less input from the human annotator.

This model was applied to 80,148 clinical notes from the 272 COVID*pos* patients and 8.2 million clinical notes from the 14,695 COVID*neg* patients. First, the difference between the date on which a particular note was written and the PCR testing date of the patient corresponding to that note formed the relative date measure for that note. The PCR testing date was treated as ‘day 0’, with notes preceding it assigned ‘day -1’, ‘day -2’, and so on. BERT-based neural networks were applied to each note to provide a set of symptoms that were present at that point in time for the patient in question. This map was then inverted to determine, for each symptom and relative date, the set of unique patients experiencing that symptom. For each synonymous group of symptoms, we computed the count and proportion of COVID*pos* and COVID*neg* patients that were deemed to have that symptom in at least one note between 1 and 7 days prior to their PCR test.
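The relative-date assignment, map inversion, and per-cohort counting described above can be illustrated with a minimal Python sketch. This is not the nferX implementation: the record layout (patient id, note date, PCR date, set of symptoms already classified as present) and all function names are hypothetical, chosen only to make the steps concrete.

```python
from collections import defaultdict
from datetime import date


def relative_day(note_date, pcr_date):
    """Days from the PCR test (day 0) to the note; earlier notes are negative."""
    return (note_date - pcr_date).days


def invert_symptom_map(notes):
    """Map (symptom, relative day) -> set of unique patient ids.

    `notes` is an iterable of (patient_id, note_date, pcr_date, symptoms)
    tuples; this layout is an assumption made for illustration.
    """
    index = defaultdict(set)
    for patient_id, note_date, pcr_date, symptoms in notes:
        day = relative_day(note_date, pcr_date)
        for symptom in symptoms:
            index[(symptom, day)].add(patient_id)
    return index


def weekly_proportion(index, symptom, cohort_ids, first_day=-7, last_day=-1):
    """Count and proportion of a cohort with the symptom in at least one
    note written 1 to 7 days before the PCR test."""
    patients = set()
    for day in range(first_day, last_day + 1):
        patients |= index.get((symptom, day), set()) & cohort_ids
    count = len(patients)
    return count, count / len(cohort_ids)


# Toy data (fabricated for illustration only, not study data).
example_notes = [
    ("p1", date(2020, 4, 1), date(2020, 4, 5), {"cough", "diarrhea"}),  # day -4
    ("p1", date(2020, 4, 3), date(2020, 4, 5), {"cough"}),              # day -2
    ("p2", date(2020, 3, 20), date(2020, 4, 5), {"cough"}),             # day -16, outside window
    ("p3", date(2020, 4, 4), date(2020, 4, 6), {"fever"}),              # day -2
]
example_index = invert_symptom_map(example_notes)
pos_cohort = {"p1", "p2"}
neg_cohort = {"p3", "p4"}
pos_count, pos_prop = weekly_proportion(example_index, "cough", pos_cohort)
neg_count, neg_prop = weekly_proportion(example_index, "cough", neg_cohort)
```

Because patients are collected into a set, a patient with the same symptom in several notes during the week is counted once, which matches the "at least one note" criterion. Setting `first_day` and `last_day` to the same value gives the counts for a single day.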
We additionally computed the ratio of those proportions, which indicates the extent of prevalence of the symptom in the COVID*pos* cohort as compared to the COVID*neg* cohort. A standard 2-proportion z hypothesis test was performed, and a p-value was reported for each symptom. To capture the temporal evolution of symptoms in the COVID*pos* and COVID*neg* cohorts, the process described above was repeated considering counts and proportions for each day independently.

Pairwise analysis of phenotypes was performed by considering 351 phenotypic pairs from the original set of 27 individual phenotypes. For each pair, we calculated the number of patients in the COVID*pos* and COVID*neg* cohorts in whom both phenotypes occurred at least once in the week preceding PCR testing. With these patient proportions, a 2-proportion z test p-value was computed. Benjamini-Hochberg correction was applied to account for multiple hypothesis testing.

This research was conducted under IRB 20-003278, “Study of COVID-19 patient characteristics with augmented curation of Electronic Health Records (EHR) to inform strategic and operational decisions”. All analysis of EHRs was performed in the privacy-preserving environment secured and controlled by the Mayo Clinic. nference and the Mayo Clinic subscribe to the basic ethical principles underlying the conduct of research involving human subjects as set forth in the Belmont Report and strictly ensure compliance with the Common Rule in the Code of Federal Regulations (45 CFR 46) on the Protection of Human Subjects.

### Analysis of cell-types expressing ACE2 and TMPRSS2 using single cell RNAseq

Since the successful entry of the virus into the cell requires priming by the host cell protease TMPRSS2, we hypothesized that cells that express both TMPRSS2 and ACE2 could harbor SARS-CoV-2 during the course of infection.
Thus, we probed for the expression of ACE2 and TMPRSS2 in all the single-cell studies from human tissues available on the nferX Single Cell platform ([https://academia.nferx.com/](https://academia.nferx.com/)). For all the tissues that we profiled, we ensured that there were a minimum of 100 cells in the cell population and that a minimum of 1% of the cells in the cell population co-expressed both TMPRSS2 and ACE2 (non-zero expression).

## Data Availability

The data analyzed are summarized in the manuscript in the tables and methods sections.

## SUPPLEMENTARY MATERIAL

![Figure S1.](https://www.medrxiv.org/content/medrxiv/early/2020/04/23/2020.04.19.20067660/F3.medium.gif)

Figure S1. Cell-types connected to pathophysiology of COVID-19 as inferred from high expression of ACE2 and TMPRSS2 in human scRNA seq datasets. A scatter plot depicting the expression of ACE2 and TMPRSS2 inferred from the single-cell RNA-seq profiling of human tissues using the nferX single cell platform. The x-axis represents the mean ln(cp10k+1) expression of ACE2 in all the cells and the y-axis represents the mean ln(cp10k+1) expression of TMPRSS2 in the corresponding cell-types from respective tissues. The colors on the scatter plot depict the tissue origins. The size of the points on the scatter plot represents the percentage of single cells in the cell-type that co-express ACE2 and TMPRSS2 (non-zero expression).

![Figure S2.](https://www.medrxiv.org/content/medrxiv/early/2020/04/23/2020.04.19.20067660/F4.medium.gif)

Figure S2. Disease progression of COVID-19 can be divided into multiple stages, and appropriate therapeutics can be chosen based on the specific pathophysiological mechanisms.
Using nferX Knowledge Synthesis, the most associated molecular markers at each step of disease progression are also identified (see *Supplementary Methods* for details on nferX knowledge synthesis). In order to capture biomedical literature-based associations, the nferX platform defines two scores: a “local score” and a “global score”, as described previously (Park, J. et al. Recapitulation and Retrospective Prediction of Biomedical Associations Using Temporally-enabled Word Embeddings. doi:10.1101/627513).

![Figure S3.](https://www.medrxiv.org/content/medrxiv/early/2020/04/23/2020.04.19.20067660/F5.medium.gif)

Figure S3. nferX-derived associations of COVID-19 treatment options to clinical phenotypes. **(A)** Schematic of the derivation of nferX local and global scores quantifying associations between concepts from across the literature. **(B)** Heatmap of nferX Local Scores capturing associations discussed in the literature between select COVID-19 treatment drugs and COVID-19 related phenotypes. In order to capture biomedical literature-based associations, the nferX platform defines two scores: a “local score” and a “global score”, as described previously (Park, J. et al. Recapitulation and Retrospective Prediction of Biomedical Associations Using Temporally-enabled Word Embeddings. doi:10.1101/627513).

## Acknowledgments

We thank Murali Aravamudan, Ajit Rajasekharan, and Rakesh Barve for their thoughtful review and feedback on this manuscript. We also thank Andrew Danielsen, Jason Ross, Jeff Anderson, Ahmed Hadad, and Sankar Ardhanari for their support that enabled the rapid completion of this study.

## Footnotes

* \* Joint first authors
* Received April 19, 2020.
* Revision received April 19, 2020.
* Accepted April 23, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. Guan, W.-J. et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. (2020) doi:10.1056/NEJMoa2002032.
2. Verity, R. et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. Lancet Infect. Dis. (2020) doi:10.1016/S1473-3099(20)30243-7.
3. Hoehl, S. et al. Evidence of SARS-CoV-2 Infection in Returning Travelers from Wuhan, China. N. Engl. J. Med. 382, 1278–1280 (2020).
4. Xiao, F. et al. Evidence for Gastrointestinal Infection of SARS-CoV-2. Gastroenterology (2020) doi:10.1053/j.gastro.2020.02.055.
5. Zhang, B. et al. Clinical characteristics of 82 death cases with COVID-19. medRxiv 2020.02.26.20028191 (2020).
6. COVID - Overview: Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) RNA Detection, Varies. [https://www.mayocliniclabs.com/test-catalog/Overview/608825](https://www.mayocliniclabs.com/test-catalog/Overview/608825).
7. Xu, Y. et al. Characteristics of pediatric SARS-CoV-2 infection and potential evidence for persistent fecal viral shedding. Nat. Med. 26, 502–505 (2020).
8. Wu, F. et al. SARS-CoV-2 titers in wastewater are higher than expected from clinically confirmed cases. medRxiv 2020.04.05.20051540 (2020).
9. Venkatakrishnan, A. J. et al. Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors. bioRxiv 2020.03.24.005702 (2020) doi:10.1101/2020.03.24.005702.
10. Hoffmann, M. et al. SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor. Cell (2020) doi:10.1016/j.cell.2020.02.052.
11. del Pozo, F. P. et al. Inhibition of SARS-CoV-2 infections in engineered human tissues using clinical-grade soluble human ACE2. Cell (2020) doi:10.1016/j.cell.2020.04.004.
12. Grein, J. et al. Compassionate Use of Remdesivir for Patients with Severe Covid-19. N. Engl. J. Med. (2020) doi:10.1056/NEJMoa2007016.
13. Chen, C. et al. Favipiravir versus Arbidol for COVID-19: A Randomized Clinical Trial. medRxiv 2020.03.17.20037432 (2020).
14. Chen, H. et al. First Clinical Study Using HCV Protease Inhibitor Danoprevir to Treat Naive and Experienced COVID-19 Patients. medRxiv 2020.03.22.20034041 (2020).
15. Cao, B. et al. A Trial of Lopinavir-Ritonavir in Adults Hospitalized with Severe Covid-19. N. Engl. J. Med. (2020) doi:10.1056/NEJMoa2001282.
16. De Meyer, S. et al. Lack of Antiviral Activity of Darunavir against SARS-CoV-2. medRxiv 2020.04.03.20052548 (2020).
17. Shen, C. et al. Treatment of 5 Critically Ill Patients With COVID-19 With Convalescent Plasma. JAMA (2020) doi:10.1001/jama.2020.4783.
18. Bloch, E. M. et al. Deployment of convalescent plasma for the prevention and treatment of COVID-19. J. Clin. Invest. (2020) doi:10.1172/JCI138745.
19. Duan, K. et al. The feasibility of convalescent plasma therapy in severe COVID-19 patients: a pilot study. medRxiv 2020.03.16.20036145 (2020).
20. Convalescent Plasma COVID-19 (Coronavirus) Treatment – Mayo Clinic. [https://www.uscovidplasma.org/](https://www.uscovidplasma.org/).
21. Sterner, R. M. et al. GM-CSF inhibition reduces cytokine release syndrome and neuroinflammation but enhances CAR-T cell function in xenografts. Blood 133, 697–709 (2019).
22. Shimabukuro-Vornhagen, A. et al. Cytokine release syndrome. J Immunother Cancer 6, 56 (2018).
23. Herold, T. et al. Level of IL-6 predicts respiratory failure in hospitalized symptomatic COVID-19 patients. medRxiv 2020.04.01.20047381 (2020).
24. Mingfeng, X. X. H. et al. Effective Treatment of Severe COVID-19 Patients with Tocilizumab. ChinaXiv.org [http://www.chinaxiv.org/abs/202003.00026](http://www.chinaxiv.org/abs/202003.00026).
25. Kenderian, S. S. et al. Ruxolitinib Prevents Cytokine Release Syndrome after CART Cell Therapy without Impairing the Anti-Tumor Effect in a Xenograft Model. Blood 128, 652–652 (2016).
26. Gao, T. et al. Highly pathogenic coronavirus N protein aggravates lung injury by MASP-2-mediated complement over-activation. medRxiv 2020.03.29.20041962 (2020).
27. Gattinoni, L. et al. Covid-19 Does Not Lead to a ‘Typical’ Acute Respiratory Distress Syndrome. Am. J. Respir. Crit. Care Med. (2020) doi:10.1164/rccm.202003-0817LE.
28. Vincent, M. J. et al. Chloroquine is a potent inhibitor of SARS coronavirus infection and spread. Virol. J. 2, 69 (2005).
29. Borba, M. et al. Chloroquine diphosphate in two different dosages as adjunctive therapy of hospitalized patients with severe respiratory syndrome in the context of coronavirus (SARS-CoV-2) infection: Preliminary safety results of a randomized, double-blinded, phase IIb clinical trial (CloroCovid-19 Study) (2020).
30. Wang, Y. et al. Early, low-dose and short-term application of corticosteroid treatment in patients with severe COVID-19 pneumonia: single-center experience from Wuhan, China. medRxiv 2020.03.06.20032342 (2020).
31. Russell, C. D., Millar, J. E. & Baillie, J. K. Clinical evidence does not support corticosteroid treatment for 2019-nCoV lung injury. Lancet 395, 473–475 (2020).
32. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. (2018).

## Table S1 - References

1. Bian, Huijie, et al. “Meplazumab treats COVID-19 pneumonia: an open-labelled, concurrent controlled add-on clinical trial.” medRxiv (2020).
2. Hensley, L., Fritz, E., Jahrling, P., Karp, C., Huggins, J. and Geisbert, T., 2020. Interferon-B 1A And SARS Coronavirus Replication.
3. Yamamoto, Mizuki, et al. “Identification of nafamostat as a potent inhibitor of Middle East respiratory syndrome coronavirus S protein-mediated membrane fusion using the split-protein-based cell-cell fusion assay.” Antimicrobial agents and chemotherapy 60.11 (2016): 6532–6539.
4. Cohen, Shira, and Pnina Fishman. “Targeting the A3 adenosine receptor to treat cytokine release syndrome in cancer immunotherapy.” Drug design, development and therapy 13 (2019): 491.
5. Chorny, Alejo, et al. “Vasoactive intestinal peptide induces regulatory dendritic cells with therapeutic effects on autoimmune disorders.” Proceedings of the National Academy of Sciences 102.38 (2005): 13562–13567.
6. Barnard, Dale L., et al. “Inhibition of severe acute respiratory syndrome-associated coronavirus (SARSCoV) by calpain inhibitors and β-D-N4-hydroxycytidine.” Antiviral Chemistry and Chemotherapy 15.1 (2004): 15–22.
7. Zanasi, Alessandro, Massimiliano Mazzolini, and Ahmad Kantar. “A reappraisal of the mucoactive activity and clinical efficacy of bromhexine.” Multidisciplinary respiratory medicine 12.1 (2017): 7.
8. Chen, Guo-Yun, et al. “CD24 and Siglec-10 selectively repress tissue damage–induced immune responses.” Science 323.5922 (2009): 1722–1725.
9. Florence, Jon M., et al. “Inhibiting Bruton’s tyrosine kinase rescues mice from lethal influenza-induced acute lung injury.” American Journal of Physiology-Lung Cellular and Molecular Physiology 315.1 (2018): L52-L58.
10. Dayal, Devi, and Saniya Gupta. “Connecting BCG Vaccination and COVID-19: Additional Data.” medRxiv (2020).