Extending the range of symptoms in a Bayesian Network for the Predictive Diagnosis of COVID-19
==============================================================================================

* Rachel Butcher
* Norman Fenton

## Abstract

Emerging digital technologies have taken an unprecedented position at the forefront of COVID-19 management. This paper extends a previous Bayesian network designed to predict the probability of COVID-19 infection, based on a patient’s profile. The structure and prior probabilities have been drawn from peer-reviewed articles. The network accounts for demographics, behaviours and symptoms, and can mathematically identify multivariate combinations with the highest risk. Potential applications include patient triage in healthcare systems or embedded software for contact-tracing apps. Specifically, this paper extends the set of symptoms that are a marker for COVID-19 infection and the differential diagnosis of other conditions with similar presentations.

Keywords

* SARS-CoV-2
* COVID-19
* Bayesian network
* predictive model

## I. Introduction

In January 2020, the UK welcomed a new decade, unknowingly about to face its second pandemic of the 21st century. By late 2019, a novel strain of coronavirus had made the zoonotic leap from an unknown animal reservoir to humans, in connection with a seafood market in Wuhan, China (WHO, 2020b). The virus (hereinafter ‘SARS-CoV-2’) is homologous to that of the 2002-2003 SARS epidemic, causing asymptomatic to severe respiratory disease (hereinafter ‘COVID-19’) (Yuki et al., 2020). The virulent infection has since reached 213 countries and territories globally. To date, confirmed cases exceed 19 million with over 700,000 reported deaths (worldometers, 2020). On 11th March 2020, the WHO declared COVID-19 a pandemic.

The pandemic has been a huge burden on the global economy.
In the absence of a vaccine, the public health response has been targeted towards disease containment (Wynants et al., 2020). On 23rd March 2020, the UK was placed under strict lockdown in an attempt to “flatten the curve”. Such measures aimed to keep the number of cases below healthcare system capacity (Matrajt and Leung, 2020). Widespread lockdown forced the closure of much of the retail, hospitality, and entertainment sectors. From May to June, the UK’s gross domestic product (GDP) dropped by 20.4%, marking the entrance into technical recession (Office for National Statistics, 2020a). Individuals in such sectors have been disproportionately affected, with unemployment reaching 3.6% and expected to rise (Office for National Statistics, 2020b). Further detrimental impacts have been felt across education, travel (Nicola et al., 2020) and mental health (Liang et al., 2020). Despite a desperate need to bolster the economy, easing of restrictions carries the risk of disease resurgence and unsustainable strain on the NHS (Anderson, 2020).

Whilst transmission surveillance remains an integral part of government response, community diagnosis will continue to be a limiting factor. Clinical tests, such as reverse transcription polymerase chain reaction (RT-PCR), are hindered by significant false-negative rates (West et al., 2020), expensive equipment and the need for specialised professionals (Russo et al., 2020).

The management of COVID-19 has seen an unprecedented dependence on emerging digital technologies. AI-driven predictive models have the potential to provide an immediate diagnosis based on a patient’s profile, without the need for testing (Wynants et al., 2020). Such technologies can be used to reduce triage time in hospitals (Soltan et al., 2020) or improve community outreach where testing is inaccessible.
This study builds on the Bayesian network solution by Fenton et al. (2020) and concurrent work by Prodhan (2020) for prediction of current COVID-19 status coupled with eventual prognoses. The network takes an input of observable symptoms and risk factors to produce a personalised probability score for disease status.

## II. Related Work

### A. Bayesian Networks

Bayesian networks (BNs) exploit Bayes’ probabilistic reasoning to provide insights into the causal relationships between the contributors and outcomes of an event (He, 2014). A BN’s ability to account for uncertainties makes it a useful tool to support clinical decision-making, where the event is the probability of disease in a patient (Wang et al., 2014). The network graphically represents the directed conditional dependencies (arcs) between stochastic variables (nodes) of the event. Given node A represents infection with a disease, e.g. flu, and node B represents some symptom, e.g. cough, the directed arc would point from A to B, and be interpreted as A causes or influences B. A may also be expressed as the parent of B (Fenton and Neil, 2018). Associated with each variable is a predefined set of mutually exclusive states and a node probability table (NPT) (Jensen, 1996). The NPT defines the probability distribution across the node’s states, conditional on the states of its parents; for a node without parents this is simply its marginal distribution. When an observation is set for a known variable, the effect propagates both forwards and backwards to update the posterior probability of every dependent variable. Two important constraints are that a BN must be acyclic, as the inference algorithm does not handle feedback loops, and that it should not be a complete graph: variables that are independent of each other should not share an arc (Fenton and Neil, 2018).

BNs are credible in decision support as they can provide evidence of the reasoning behind their conclusions. Physicians can simply trace the posterior probabilities through the network interface or use secondary tools such as BANTER (Kahn et al., 1997).
BANTER identifies the most influential nodes using sensitivity analysis and the strongest path of influence to the hypothesis. A textual explanation is then generated for the physician to review (Haddawy et al., 1994).

A study by Wang et al. (2014) proposed a BN model for comparison against traditional machine learning methods, namely logistic regression, naive Bayes and support vector machine, for the prediction of lung cancer-induced brain metastasis. Whilst sensitivity was only marginally improved by the BN, the method’s major advantages were identified as easy comprehension, efficient modelling of both linear and non-linear events, the ability to reason from consequence to cause and the ability to handle missing data. Bayes’ theorem provides a means to standardise the interactions between dependent and independent probabilities, allowing for reasoning under uncertainty and incorporation of domain knowledge (Kahn et al., 1997). Uncertainties include missing observations, such as unavailable information on patient family history or diagnostic tests that have not yet been performed. BNs are advantageous as they can maintain accurate predictions even with little observational input (Zheng et al., 2008).

Statistical paradoxes are prevalent in literature and can drive false claims in the media or, more critically, incorrect conclusions in medical studies. Simpson’s paradox occurs when conclusions derived from aggregated data reverse when the same data are stratified. The clinical trial example commonly cited is as follows: when two treatments A and B are administered to a population of patients, treatment A appears more effective overall. However, when the same population is stratified, for example by sex, treatment B paradoxically performs better in each sub-category.
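The reversal is easiest to see with numbers. The following is a minimal sketch using hypothetical recovery counts, chosen purely to exhibit the effect and not taken from any study cited here: treatment A has the higher aggregated recovery rate, yet treatment B has the higher rate within each sex.

```python
# Hypothetical (recovered, treated) counts, chosen only to exhibit the reversal.
counts = {
    "A": {"female": (234, 270), "male": (55, 80)},
    "B": {"female": (81, 87), "male": (192, 263)},
}


def aggregated_rate(treatment):
    """Recovery rate for a treatment with both sexes pooled together."""
    recovered = sum(r for r, _ in counts[treatment].values())
    treated = sum(n for _, n in counts[treatment].values())
    return recovered / treated


def stratum_rate(treatment, stratum):
    """Recovery rate for a treatment within a single stratum."""
    recovered, treated = counts[treatment][stratum]
    return recovered / treated


overall = {t: aggregated_rate(t) for t in counts}
by_sex = {s: {t: stratum_rate(t, s) for t in counts} for s in ("female", "male")}

print("aggregated:", overall)  # A is ahead once the strata are pooled
print("stratified:", by_sex)   # B is ahead inside each stratum
```

The reversal arises because the two treatments were given to very differently sized groups in each stratum, so the pooled rates are weighted averages with different weights.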
Given an assumption about causality, BNs can resolve such paradoxes by computing the necessary, but often counterintuitive, inverse probabilities and simulating the effect of an intervention, or ‘treatment’ in the given scenario (Fenton and Neill, 2019).

Numerous Bayesian models have been published since their popularisation by Judea Pearl in the late eighties (Pearl, 1988). Seixas et al. (2014) developed a network to support dementia and Alzheimer’s diagnosis using predisposing factors, neurophysiological test results, demographics and symptoms, populated by both expert knowledge and supervised learning. The BN was found to outperform other classifiers, such as decision tables, in diagnostic accuracy. Luciani et al. (2003) explored the diagnostic ability of BNs for pulmonary embolism, with nodes representing patient risk factors and pathophysiology, and NPTs populated by a systematic review of literature. A study by Kahn et al. (1997) constructed a BN to aid in the interpretation of mammograms for diagnosis of breast cancer. The model incorporates patient histories, physical symptoms, and mammographic indicators. Probabilities were mined from medical literature and expert mammographers. The network was successful in the early detection of breast cancer, allowing for pre-emptive medical intervention and ultimately improving patient prognosis. In 2001, Kahn *et al*. tailored another BN towards diagnosis of primary bone tumours, with nodes representing patient demographics, physical findings and lesion properties. The NPT values were elicited from peer-reviewed literature and the model returned a 68% diagnostic accuracy.

BNs have demonstrated their value not only in medicine, but also in agriculture. Bi and Chen (2011) developed a model for producing BNs for the diagnosis of crop diseases, such as corn borer, which can reduce corn harvest by up to 15% per year and cause significant economic damage.
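To make the mechanics described at the start of this section concrete, the sketch below works through the two-node example (flu as the parent of cough) by hand. All probabilities are hypothetical and chosen only for illustration; they are not values from the model in this paper, and real BN software performs the same calculation over many nodes at once.

```python
# Hypothetical numbers for illustration only; they are not taken from the model.
p_flu = 0.05                       # prior P(A = flu)
p_cough_given = {True: 0.80,       # NPT entry P(B = cough | A = flu)
                 False: 0.10}      # NPT entry P(B = cough | A = no flu)


def marginal_cough(p_flu):
    """Forward propagation: P(cough), summed over the states of the parent."""
    return p_cough_given[True] * p_flu + p_cough_given[False] * (1 - p_flu)


def posterior_flu(cough_observed, p_flu):
    """Backward propagation: P(flu | observation on cough) via Bayes' theorem."""
    like_flu = p_cough_given[True] if cough_observed else 1 - p_cough_given[True]
    like_no_flu = p_cough_given[False] if cough_observed else 1 - p_cough_given[False]
    evidence = like_flu * p_flu + like_no_flu * (1 - p_flu)
    return like_flu * p_flu / evidence


p_cough = marginal_cough(p_flu)
p_flu_given_cough = posterior_flu(True, p_flu)
p_flu_given_no_cough = posterior_flu(False, p_flu)

print(f"P(cough)          = {p_cough:.3f}")
print(f"P(flu | cough)    = {p_flu_given_cough:.3f}")
print(f"P(flu | no cough) = {p_flu_given_no_cough:.3f}")
```

Observing the symptom raises the probability of the parent above its prior, and observing its absence lowers it, which is the backward update a physician would trace through the network.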
Whilst the use of BNs for diagnostic decision support is not novel, uptake in clinical practice has been poor, thought to be partially due to concerns about accuracy and a lack of evidence supporting clinical credibility (Yet et al., 2017).

### B. Predictive Models for COVID-19 Diagnosis

Several predictive models have recently emerged to help alleviate the burden of COVID-19 on healthcare systems. Such models may diagnose current infection, predict risk of infection or prognose disease progression to inform medical decision-making (Wynants et al., 2020). A systematic review of COVID-19 prediction models by Wynants et al. (2020) found that current models claim moderate to excellent predictive performance, yet are severely limited by overfitting and bias. As adequate data may not always be available, it is suggested that future models comprehensively describe the demographics of the population on which the model is developed. Performance of the model can then be evaluated in relation to its applicability to the future user. Wynants et al. (2020) also propose basing the model on global rather than local patient data, to allow for greater application and generalisability. Both recommendations will be considered in the development of our Bayesian network.

Soltan et al. (2020) of Oxford University have developed two AI-driven models for triage of hospitalized, potential COVID-19 patients awaiting RT-PCR screening. Machine learning methods (namely logistic regression, random forest and extreme gradient boosted trees) were applied to electronic health record data from Oxford University Hospitals. Prediction was based on clinical data typically available for patients presenting to hospital, such as leukocyte counts and respiratory rate. Both models, one for all presenting patients and one for admitted patients only, achieved high sensitivity and specificity, and could reduce triage time from up to ∼48 hours (for RT-PCR) to ∼1 hour (for blood and vitals collection).
However, the method is limited to individuals with symptoms severe enough to present to hospital and the input data requires specialist equipment and trained professionals. The models are currently in clinical trials under the name CURIAL AI, ultimately for use by the NHS.

Menni et al. (2020) of King’s College London studied potential symptoms predictive of COVID-19 infection. Using self-reported symptoms from the ZOE Global smartphone app, logistic regression determined that anosmia, fever, persistent cough, fatigue, shortness of breath, diarrhea, delirium, skipped meals, abdominal pain, chest pain and hoarse voice were associated with a positive SARS-CoV-2 test in UK participants. Whilst the study incorporates a significantly large population and adjusts for age, sex and BMI, Menni et al. (2020) admit that the self-reported nature of the data is a caveat to the study being representative of the whole population. Symptoms have not been clinically verified, leaving opportunity for human error, and the sample population is reflective only of self-selected participants who used the app and tested positive for SARS-CoV-2 in a laboratory test. Similarly, the study only queries known symptoms of COVID-19 and does not contribute to discovery of symptoms based on empirical evidence.

## III. Materials and Methods

### A. Requirements

The aim was to develop a probabilistic causal model for the prediction of COVID-19 infection and progression in an individual. To provide an accurate diagnosis, the model must account for all variables that may contribute to probability of infection (e.g. occupation, ethnicity, health risks or other demographics). As SARS-CoV-2 is thought to be transmitted by droplet infection, behaviours that increase human proximity and contact are also deemed high risk (ECDC, 2020a). The model should mine expert domain knowledge from peer-reviewed research to develop informed prior probabilities.
The knowledge-based structure should also adhere to Bayesian network constraints. Rationale for the model, beyond what is discussed in this paper, is presented by Fenton et al. (2020). See Fig. 2 for the complete model structure.

![Fig. 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F1.medium.gif)

[Fig. 1.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F1)

Fig. 1. Flowchart describing the inclusion criteria for the systematic review.

![Fig. 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F2.medium.gif)

[Fig. 2.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F2)

Fig. 2. Complete structure of proposed Bayesian network.

### B. Design

As the model was a collaborative effort, this paper focuses primarily on symptom discovery and the differential diagnosis of conditions with similar clinical manifestations. Symptoms were mined on an empirical basis. Inclusion was not limited to the three main symptoms cited by the NHS: high temperature, continuous cough or a loss of smell or taste (NHS, 2020). A systematic review mitigates the constraints of a targeted search, revealing several additional symptoms that may improve the model’s sensitivity. The model can also afford to be dynamic, updating as new information comes to light.

Evaluating multiple presentations of COVID-19 was deemed an important step in model development. The novelty of COVID-19 means symptoms are not yet fully defined. Those that are commonly expressed, such as fever and coughing, are extremely non-specific and could point to a number of different diagnoses. To establish how COVID-19 presented in individuals, a systematic review was conducted using the CDC’s COVID-19 Research Articles Downloadable Database (Centers for Disease Control and Prevention, 2020).
The database is updated daily with all publicly available COVID-19 research, including preprints, amalgamated from the following sources: Medline, PubMed Central, Embase, CAB Abstracts, Global Health, PsycInfo, Cochrane Library, Scopus, Academic Search Complete, Africa Wide Information, CINAHL, ProQuest Central, SciFinder, the Virtual Health Library, LitCovid, WHO COVID-19 website, CDC COVID-19 website, Eurosurveillance, China CDC Weekly, Homeland Security Digital Library, ClinicalTrials.gov, bioRxiv, medRxiv, chemRxiv, and SSRN (Centers for Disease Control and Prevention, 2020). The database was captured on 10th July 2020 and initially consisted of 64180 unique studies. Results were first screened for titles containing one, or a combination, of the terms: “symptom” OR “clinical” OR “feature” OR “characteristic” OR “manifestation”. On removal of duplicates, 689 unique titles remained. Each remaining study was screened by abstract and by the quality of the quantitative data available. The inclusion criteria were met if: (a) quantitative symptom data was stratified by disease severity, (b) severe COVID-19 infection was defined by the WHO guidelines (WHO, 2020a), (c) cases were confirmed by a positive RT-PCR test. It is recognised that using RT-PCR alone is a caveat due to high false-negative reports, but it allowed for a standardized method of comparing symptoms. Twelve papers satisfied the criteria and were used in development (Fig. 1). It should also be noted that all studies were in English and preprint papers were not excluded from eligibility.

The research detailed 8095 cases, reported between 11th December 2019 and 20th April 2020. Locations include Henan, Zhejiang and Wuhan in China, California, Spain, Tokyo, and Daegu in Korea, with a total of 4062 male patients and 4033 female patients (Table 1). Frequency analysis was used to determine the most reported symptoms. Typically, if a symptom was found in two or more studies it was eligible for inclusion.
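The frequency rule can be sketched in a few lines. The study names and symptom sets below are hypothetical placeholders rather than the twelve reviewed papers, and the function name is introduced here only for illustration.

```python
from collections import Counter

# Hypothetical symptom sets per study; placeholders, not the reviewed papers.
studies = {
    "study_1": {"fever", "cough", "fatigue"},
    "study_2": {"fever", "cough", "anosmia"},
    "study_3": {"fever", "rash"},
}
MIN_STUDIES = 2  # a symptom is eligible if reported in at least this many studies


def eligible_symptoms(studies, min_studies=MIN_STUDIES):
    """Count how many studies report each symptom and keep the frequent ones."""
    counts = Counter(symptom for reported in studies.values() for symptom in reported)
    return {symptom: n for symptom, n in counts.items() if n >= min_studies}


eligible = eligible_symptoms(studies)
print(eligible)
```

Using sets per study means a symptom is counted once per paper, regardless of how often that paper mentions it.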
The frequency target was low (2+) to mitigate potential effects of researcher bias or expectation, or effects of increased media attention on symptom reporting (Menni et al., 2020).

View this table: [TABLE I.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T1)

TABLE I. Table displaying features of the final articles selected by systematic review.

Each symptom identified by frequency analysis was cross-referenced with its likelihood ratio (LR) from the curation of clinical notes by Wagner et al. (2020a) (Table 2). The likelihood ratio was calculated as follows:

$$\mathrm{LR} = \frac{P(\text{symptom} \mid \text{COVID-19 positive})}{P(\text{symptom} \mid \text{COVID-19 negative})} \tag{1}$$

View this table: [TABLE II.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T2)

TABLE II. Table displaying the likelihood ratios of symptoms in indicating COVID-19 infection (Wagner et al., 2020a).

A likelihood ratio >1 is generally accepted to indicate that the symptom is a predictor for disease (McGee, 2002). Most discovered symptoms had a ratio >1, with the exception of fatigue, chest pain, dermatitis and generalised gastrointestinal symptoms. However, these symptoms appeared frequently in literature and were therefore deemed too indicative to omit. The strongest score was from anosmia (27.08), which is reflected in several other studies (Menni et al., 2020; Williams et al., 2020a).

The model has been designed to capture laboratory findings that explicitly define severe COVID-19 under WHO guidelines. An observation of respiratory rate ≥30 breaths per minute or oxygen saturation of <90% should yield an increased probability of severe disease compared to mild (WHO, 2020a). Multiple studies also suggest elevated serum C-reactive protein (CRP) and lymphocytopenia provide important markers for disease severity (Ali, 2020; Wang, 2020; Chen et al., 2020a; Wagner et al., 2020b; Huang and Pranata, 2020a). Nodes reflecting this have been added as appropriate.

The differential diagnosis of COVID-19 is important in clinical decision-making.
Clinical presentations of COVID-19 are often non-specific and can lead to misdiagnosis. The model considers the prevalence of other diseases with similar symptoms in the UK, as conditioned on both age and level of risk. Such conditions are often respiratory diseases, with commonly cited misdiagnoses being influenza, chronic obstructive pulmonary disease (COPD) and meningitis. If it is observed prior to COVID-19 diagnosis that the subject suffers from any of these, then the model can account for such a scenario and predict the probability of co-infection.

### C. Implementation

Specialised Bayesian network software AGENARISK has been used to host the model. AGENARISK automatically produces the format for NPTs and handles the Bayesian calculations when run. Quantitative data from the systematic review was used to complete the NPTs. As child nodes of ‘Current COVID-19 or Similar’, each symptom required prior probabilities for: mild disease, severe disease, asymptomatic disease, non-COVID-19 similar disease and no disease. Where possible, data from multiple studies of the systematic review were amalgamated to produce robust prior probabilities. Data were only pooled across studies if the parameters were standardized. For example, fever was reported in all studies, but figures were only selected where fever had been defined as >37.3°C. See Table 3 for the full pooled analysis by symptom. Whilst it is expected that asymptomatic disease would produce no symptoms, an estimated probability of 0.01 was included under the assumption that individuals categorised as such may be subclinical (Hu et al., 2020).

Data for conditions with similar symptoms were gathered from Chen et al. (2020b), Menni et al. (2020) and Zayet et al. (2020), as part of a targeted literature search. SARS-CoV-2 negative RT-PCR results in patients presenting to hospital with symptoms were deemed an accurate prediction for the presence of a different disease.
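The pooling rule described earlier in this subsection (combined observations over combined participants, restricted to studies sharing one symptom definition) can be sketched as follows. The records are hypothetical and the function name is an assumption made for this illustration; the actual pooled figures are those reported in Table III.

```python
# Hypothetical records: (study, fever threshold in °C, patients with fever, patients in stratum).
records = [
    ("study_a", 37.3, 40, 50),
    ("study_b", 37.3, 70, 100),
    ("study_c", 38.0, 30, 60),  # different fever definition, so it is not pooled
]


def pooled_probability(records, threshold):
    """Pool observations over participants for studies using the same definition."""
    matching = [(obs, n) for _, thr, obs, n in records if thr == threshold]
    total = sum(n for _, n in matching)
    if total == 0:
        raise ValueError("no study uses the requested definition")
    return sum(obs for obs, _ in matching) / total


p_fever = pooled_probability(records, threshold=37.3)
print(f"pooled P(fever | stratum) = {p_fever:.3f}")
```

Pooling counts rather than averaging per-study percentages weights each study by its sample size, so a small study cannot dominate the resulting NPT entry.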
Such studies also looked at differences in symptom prevalence in COVID-19 vs non-COVID-19 pneumonia, and influenza. The probabilities of symptoms in subjects with no disease were taken from the REACT study by Riley et al. (2020). It has been assumed that symptoms reported by participants testing negative for COVID-19 were circumstantial, since the study was conducted in the community rather than in a clinical setting. Data for ‘Type of Rash’ were taken from studies by Zhao et al. (2020) and Matar et al. (2020), whilst laboratory findings were mined from Zhang et al. (2020b); Tabata et al. (2020); Guan et al. (2020); de Jager et al. (2010); Wilkerson et al. (2020) and J. Xie et al. (2020).

View this table: [TABLE III.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T3)

TABLE III. Table displaying the reference papers for each symptom with the combined number of observations over the combined number of participants.

Quantitative data for ‘Other Conditions with COVID-like Symptoms’ conditioned on age and risk factor could not be readily found for the UK. Probabilities were estimated as follows: the ratio of deaths per age stratum for respiratory disease in the UK, in 2012, was used to predict the prevalence in living individuals (British Lung Foundation, 2013). Age categories were then further discretised, and prevalence was allocated as an exponential increase, as suggested by M. Xie et al. (2020). Finally, the estimates were categorised by high, medium and low risk, based on UK clinical risk ratios elicited from Cromer et al. (2014). The prevalence of influenza and of COPD in the UK were obtained from Public Health England (2020) and Rayner *et al*. (2014) respectively.

### D. Testing

In the initial model, meningitis was included under Conditions with COVID-Like Symptoms, as a study by Packwood *et al*. (2020) reported a case of misdiagnosis.
Meningitis can share several symptoms with COVID-19, namely fever, headache, lymphocytopenia and, most indicatively, rash. However, it was omitted from the model as prevalence in the UK is low due to effective vaccination programs (McGill *et al*., 2020; PHE, 2019). COVID-19 infection has also been mistaken for dengue (Joob and Wiwanitkit, 2020), but it was not deemed appropriate for inclusion as dengue is not endemic to the UK.

AGENARISK supports sensitivity analysis via tornado diagrams. This allowed the relative importance of each symptom in disease outcome to be compared. The analysis found cough, loss of appetite, shortness of breath, chest pain and chills to be the five most important contributors to severe COVID-19 infection (Fig. 3). These symptoms differ, in part, from those recognised by the NHS (NHS, 2020). The difference may reveal significant symptoms that have previously been overlooked or imply a bias in the sample population dynamics. When applied to other conditions with COVID-like symptoms, the five most indicative symptoms were nausea or vomiting, cough, shortness of breath, abdominal pain and chest pain (Fig. 4). Interestingly, the presence of anosmia or ageusia had the greatest negative impact on non-COVID-19 infection outcome, consistent with studies that report loss of sense of smell as a marker for COVID-19 (Daher et al., 2020).

As expected, elevated respiratory rate and reduced oxygen saturation had the highest relative importance to severe infection. Lymphocyte count had the lowest (Fig. 5). For non-COVID infections, increased respiratory rate and elevated C-reactive protein were the most indicative laboratory findings (Fig. 6).

![Fig. 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F3.medium.gif)

[Fig. 3.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F3)

Fig. 3. Tornado graph representing the hierarchy of symptom sensitivities to severe COVID-19 infection.
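The idea behind a tornado diagram can be illustrated with a generic one-at-a-time sensitivity sketch: each symptom is set to present and then absent, the posterior of the target state is recorded for both, and symptoms are ranked by the width of the resulting range. This is not AGENARISK's implementation, which the paper does not describe; the two-state status node, the prior and the NPT values are all hypothetical.

```python
prior_severe = 0.2  # hypothetical prior P(status = severe); the remainder is "mild"

# Hypothetical P(symptom present | status) values, not the model's NPTs.
npt = {
    "cough": {"severe": 0.80, "mild": 0.50},
    "chills": {"severe": 0.30, "mild": 0.10},
    "headache": {"severe": 0.20, "mild": 0.20},
}


def posterior_severe(symptom, present):
    """P(severe | one symptom observed as present or absent) via Bayes' theorem."""
    p_sev = npt[symptom]["severe"] if present else 1 - npt[symptom]["severe"]
    p_mild = npt[symptom]["mild"] if present else 1 - npt[symptom]["mild"]
    numerator = p_sev * prior_severe
    return numerator / (numerator + p_mild * (1 - prior_severe))


def tornado(symptoms):
    """Return (symptom, low, high) bars sorted from widest to narrowest range."""
    bars = []
    for symptom in symptoms:
        low, high = sorted((posterior_severe(symptom, False),
                            posterior_severe(symptom, True)))
        bars.append((symptom, low, high))
    return sorted(bars, key=lambda bar: bar[2] - bar[1], reverse=True)


bars = tornado(npt)
for symptom, low, high in bars:
    print(f"{symptom:10s} {low:.3f} to {high:.3f}")
```

A symptom that is equally likely under both states, such as the hypothetical headache entry here, produces a bar of zero width and sits at the bottom of the diagram.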
![Fig. 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F4.medium.gif)

[Fig. 4.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F4)

Fig. 4. Tornado graph representing the hierarchy of symptom sensitivities to non-COVID infection.

![Fig. 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F5.medium.gif)

[Fig. 5.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F5)

Fig. 5. Tornado graph representing the hierarchy of laboratory finding sensitivities to severe COVID-19 infection.

![Fig. 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F6.medium.gif)

[Fig. 6.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F6)

Fig. 6. Tornado graph representing the hierarchy of laboratory finding sensitivities to non-COVID infection.

## IV. Results

To test the model, subject profiles from case reports were input as scenarios. Study outcomes were then compared against the posterior probabilities of the network. All patients were positive for COVID-19. To assess the model’s ability to estimate infection, probabilities were compared before and after test node observations were added.

### A. Case Study 1

Sachdeva et al. (2020) reported a 71-year-old Caucasian female presenting with cough and deteriorating shortness of breath. The patient also complained of fever, but it was not clinically confirmed, and had no underlying medical conditions. Laboratory findings reported elevated C-reactive protein and normal lymphocyte count. She reportedly lived with an infected individual and was diagnosed with COVID-19 by a positive nasal PCR result. After treatment and respiratory recovery, the patient developed a maculo-papular rash. The nodes set in the network to reflect this can be found in Table 4.

View this table: [TABLE IV.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T4)

TABLE IV.
Bayesian network scenario observations for case study 1.

On initial run, the model predicted the patient to have a 49% probability of severe infection, and 43.85% probability of mild infection. Eventual status was also forecast to be severe (53.16%). With the prediction of severe infection and deteriorating symptom status, the ‘hospitalisation alert’ 60% threshold was breached (69.04%) (Fig. 7). With the addition of a positive RT-PCR observation, the primary diagnosis of severe infection was maintained for current (58.4%) and eventual status (62.26%). Mild infection was predicted at 40.64% (Fig. 8).

To simulate the patient’s rash after respiratory recovery, observations for cough and shortness of breath were removed and rash symptom nodes were set to true for maculo-papular. It was also assumed that ‘current time since infected’ was >5 days and ‘symptom status’ was stable. No other observations were updated. Given these new parameters, our model overwhelmingly supported mild infection (93.47%) (Fig. 9).

![Fig. 7.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F7.medium.gif)

[Fig. 7.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F7)

Fig. 7. Posterior probabilities of COVID-19 status nodes for observations from case study 1, with ‘test type’ not yet observed.

![Fig. 8.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F8.medium.gif)

[Fig. 8.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F8)

Fig. 8. Posterior probabilities of COVID-19 status nodes for observations from case study 1, with ‘test type’ observed.

![Fig. 9.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/10/26/2020.10.22.20217554/F9.medium.gif)

[Fig. 9.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/F9)

Fig. 9. Posterior probabilities of COVID-19 status nodes after respiratory recovery and during rash presentation, for case study 1.
### B. Case Study 2

Chen et al. (2020c) reported a 46-year-old Chinese female presenting with a fever of 37.3°C progressing to sore throat, cough and chest pain after 5 days. The patient had an oxygen saturation of 98% and tested negative for influenza A and B. After admitting to having contact with a confirmed infected person, she was tested for SARS-CoV-2 via nasal RT-PCR and was confirmed positive. The nodes set in the network to reflect this can be found in Table 5.

View this table: [TABLE V.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T5)

TABLE V. Bayesian network scenario observations for case study 2.

Before confirming a positive test, our model strongly predicted mild infection at 83.16%, with a slight decrease in eventual status (77.2%). Severe infection was forecast at 16.57% for current and 22.53% for eventual status. As an observation was not available for symptom status, the prior probability favoured a stable progression and therefore ‘hospitalisation alert’ was not triggered (11.27%). ‘COVID-19 alert’ breached the threshold comfortably at 99.64%. After observing a positive PCR test, the probability of severe infection increased marginally to 20.38%, but mild infection was still the primary diagnosis (79.58%). The same pattern followed for eventual status (26.09% and 73.87% respectively).

### C. Case Study 3

Song et al. (2020) reported a 37-year-old Chinese male comorbid with liver cancer. The patient began presenting with fever, cough and dyspnea, and was confirmed positive for influenza and SARS-CoV-2 via nasal PCR. He did not suffer from lymphocytopenia and observations were not available for C-reactive protein levels. The nodes set in the network to reflect this can be found in Table 6.

View this table: [TABLE VI.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T6)

TABLE VI. Bayesian network scenario observations for case study 3.
View this table: [TABLE VII.](http://medrxiv.org/content/early/2020/10/26/2020.10.22.20217554/T7)

TABLE VII. Bayesian network scenario observations for case study 4.

With the initial observations, the model predicted infection with a non-COVID-19 disease (99.51%) for both current and eventual status. The probabilities of mild and severe infection were almost inconsequential at 0.27% and 0.22% respectively. Given this scenario, neither ‘hospitalisation alert’ (0.13%) nor ‘COVID alert’ (0.49%) was triggered. With the addition of a positive RT-PCR result, infection with a non-COVID-19 disease was still favoured (96.25%). Severe COVID-19 was forecast at 1.95%, progressing to an eventual status of 2.15%. Mild infection was predicted at 1.76% for current status and decreased to 1.6% for eventual. COVID-19 infection was also confirmed by a positive chest CT-scan. When this observation was entered, the probability of mild infection increased to 17.58% and severe infection to 15.05%. ‘Hospitalisation alert’ was not triggered for either test type. The ‘COVID-19 alert’ threshold was not breached, despite the positive COVID-19 test.

### D. Case Study 4

Li et al. (2020b) reported the infection of a 3-month-old Chinese male. The patient presented with cough and rhinorrhea but no fever, diarrhea or vomiting. Chest CT-scan suggested viral pneumonia. The child lived with a confirmed case and had been visited two weeks prior by relatives from Wuhan at the time of outbreak. Laboratory testing revealed elevated lymphocyte count and low CRP levels. COVID-19 infection was confirmed by positive RT-PCR. The nodes set in the network to reflect this can be found in Table 7.

Initially the model strongly estimated ‘none’ for current status of COVID-19 infection at 78.39%. The probability of mild infection was calculated at 20.13% and severe at 1.2%. There was little change predicted in eventual COVID-19 status, with ‘none’ increasing to 78.41%, severe to 1.45% and mild decreasing to 19.94%.
Neither ‘hospitalisation alert’ (0.73%) nor ‘COVID-19 alert’ (21.59%) was triggered. On the observation of a positive RT-PCR test, calculations favoured mild COVID-19 infection (60.88%). The probability of severe infection increased to 4.67%, while ‘none’ decreased to 33.87%. Eventual COVID-19 status followed a similar pattern (60.25%, 5.42% and 33.87% respectively). ‘Hospitalisation alert’ remained untriggered (2.71%) but ‘COVID-19 alert’ breached the threshold (66.13%).

## V. Discussion

For case study 1 the model correctly predicted severe COVID-19 infection. The patient was found to have pneumonia and required oxygen therapy, both indicators of severe disease. However, the prediction is weak, with only a 5.15 percentage-point difference between mild and severe status before testing. This may be due to the conflict between a low-risk health history and ethnicity on the one hand and a high-risk age category and confirmed contact with infected persons on the other. High-risk observations ultimately led to a predicted increase in severe eventual status. However, the network does not account for the intervention of treatment on this node. Eventual status does not reflect the patient’s outcomes by the end of the study period, as she recovered fully after receiving antivirals and oxygen therapy. Treatment as a parent node of ‘eventual COVID-19 status’ could be a potential future development. The model was able to correctly trigger the ‘hospitalisation alert’ node. This prediction is consistent with the study, in which the patient was admitted to the Emergency Department. On the simulation of rash during convalescence, the update in prediction from 58.4% severe to 93.47% mild effectively reflects the patient’s recovery journey.

Mild COVID-19 severity was correctly predicted for the second case study. This is known from the study, as treatment consisted only of antimicrobials and did not require additional ventilation.
The observations did not trigger the hospitalisation alert, which may appear to contradict the report, as the patient presented to the fever clinic of the Third Affiliated Hospital of Sun Yat-sen University. However, the case occurred during the early stages of the pandemic, when clinical uncertainty was high. It is likely that, if the same case occurred today, hospitalisation would not be necessary, and hence the model would be correct.

For case study 3, the model evidently struggled with the non-specific symptom presentation and comorbidity observations. Despite the positive PCR result, a high probability was predicted for a non-COVID-19 disease with a similar presentation. It is hypothesised that this occurred because the model accounts for potential false positives from the RT-PCR method (Tahamtan and Ardebili, 2020). Song et al. (2020) verified the initial diagnosis with a positive chest CT scan. With an observation of CT-scan for ‘test type’, the probability of mild infection increased, supporting the hypothesis. A potential development to the model could be the stratification of ‘test type’ such that PCR and CT-scan are not mutually exclusive and can be observed concurrently.

Even with the updated test type, probabilities were not as explicit as hoped in comparison to the outcome of the study. The patient was admitted to the ICU and administered oxygen, which convincingly points to severe infection, whereas our model predicted a non-COVID-19 disease. As the patient had recently undergone chemotherapy, he may have been temporarily at higher risk for severe infection due to the immunocompromising cancer treatment (Williams et al., 2020b). It could be argued that if the patient had not recently undergone chemotherapy, his disease severity would have been mild, supporting the model outcome. However, research shows this hypothesis is a point of contention, with studies such as Jee et al. (2020) disputing chemotherapy as a risk for COVID-19 severity.
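The relative weight the network gives to each test type in case studies 3 and 4 can be approximated from the reported probabilities alone. The sketch below is a back-of-the-envelope calculation, not part of the AGENARISK model: it assumes the probability of COVID-19 can be collapsed to a single binary figure (mild plus severe current status for case study 3, the reported ‘COVID-19 alert’ probability for case study 4) and computes the ratio of posterior odds to prior odds. The function names are hypothetical, and in a multi-node network this implied factor depends on the other observations already entered.

```python
def odds(p: float) -> float:
    """Convert a probability into odds in favour."""
    return p / (1.0 - p)


def implied_bayes_factor(prior: float, posterior: float) -> float:
    """Ratio of posterior odds to prior odds implied by two reported probabilities."""
    return odds(posterior) / odds(prior)


# Case study 3: P(COVID-19) taken as mild + severe current status, as reported.
cs3_prior = 0.0027 + 0.0022      # before any test observation
cs3_after_pcr = 0.0176 + 0.0195  # positive RT-PCR entered
cs3_after_ct = 0.1758 + 0.1505   # positive chest CT entered as 'test type' instead

# Case study 4: reported 'COVID-19 alert' probabilities.
cs4_prior = 0.2159
cs4_after_pcr = 0.6613

bf_pcr_cs3 = implied_bayes_factor(cs3_prior, cs3_after_pcr)
bf_ct_cs3 = implied_bayes_factor(cs3_prior, cs3_after_ct)
bf_pcr_cs4 = implied_bayes_factor(cs4_prior, cs4_after_pcr)

print(f"Case study 3, RT-PCR: {bf_pcr_cs3:.2f}")
print(f"Case study 3, CT:     {bf_ct_cs3:.2f}")
print(f"Case study 4, RT-PCR: {bf_pcr_cs4:.2f}")
```

Under these assumptions a positive RT-PCR carries an implied factor of roughly 7 to 8 in both case studies, whereas the positive CT observation implies a factor near 98. A very low prior therefore stays low after a positive RT-PCR, which matches the behaviour described for case study 3.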
The subject of case study 4 was predicted to be negative for COVID-19 infection. This is understandable, as fewer than 5% of COVID-19 infections reported in the EU/EEA and UK are in children under 18 years old (ECDC, 2020b). Similarly, many of the subject’s observations pointed away from COVID-19. For example, diarrhea, vomiting, fever and COVID-19-associated laboratory findings were explicitly recorded as absent in the patient, lowering the probability of infection. Bias in the data used to populate the model was often unavoidable and therefore must be considered. Most data available for COVID-19 infection are from adults. Similarly, symptoms conditioned on age were not explored. When a positive RT-PCR test was observed, the model was able to accurately diagnose the subject with mild COVID-19. This is consistent with the study, as recovery only required symptomatic treatments, such as cough medicine and traditional Chinese tonics (Li et al., 2020b).

## VI. Conclusion

As the future of the COVID-19 pandemic remains uncertain, governments and researchers push to develop technologies and strategies to regain control. This paper proposes a Bayesian network to aid in diagnosis and triage of potential COVID-19 positive individuals. The model demonstrated high efficacy in predicting the disease outcomes of initial case studies but should be subject to larger raw datasets for more rigorous testing. One of the main achievements is that the network is not reliant on the observation of a clinical test result for accurate prediction. This has huge potential benefits in monitoring community prevalence, especially in populations where clinical testing is inaccessible. Distinguishing COVID-19 from diseases with similar presentations is identified as an area for improvement and may prove a limitation as we transition into winter influenza season. Treatment as an intervention on ‘eventual COVID-19 status’ and the separation of ‘test type’ to allow for mutual inclusivity should be considered for future developments.
Going forward, it is hoped this model can be implemented in a real-world setting to contribute to COVID-19 relief efforts.

## VII. Future Work

Governments internationally are increasingly turning to novel digital technologies in response to the pandemic. Contact tracing is a well-established protocol in infectious disease management that functions to disrupt transmission among population clusters. The concept involves individuals maintaining a log of recent contacts and locations. If they or a person in their network contracts the disease, the likely transmission route can be traced and appropriate action taken (Braithwaite et al., 2020). Developers are leveraging ubiquitous personal devices, such as smartphones, to efficiently reach the wider population. Across governments, app-based contact tracing for COVID-19 is a growing trend. Whilst England and Scotland are still developing their respective test and trace applications, Northern Ireland has already released the StopCOVID NI Proximity App. The app supports a decentralised framework in which users’ smartphones exchange digital keys, via Bluetooth, if within a two-metre vicinity for longer than fifteen minutes. When a user informs the app of a positive test result, keys collected from the previous fourteen days will be used to alert their respective owners. In line with government guidelines, alerted users must self-isolate for two weeks (nidirect, 2020). A caveat to this approach is that a positive clinical test must be obtained for input into the application. The model developed in this paper could replace the need for a test with a threshold probability score or triggering of the ‘COVID-19 alert’ node. The Bayesian network could be embedded in the application and interact with a graphical user interface via the AGENARISK application programming interface (API). In the interest of data privacy, only the user’s location and disease probability score would be forwarded to the government for monitoring.
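The proposed substitution of a probability threshold for a positive test can be illustrated with a minimal sketch. It is not the StopCOVID NI implementation and does not call the AGENARISK API; the class names, the `keys_to_alert` method and the 0.5 threshold are hypothetical, and the disease probability is assumed to be supplied by the Bayesian network. Only the two-metre, fifteen-minute and fourteen-day rules come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Set

MAX_DISTANCE_M = 2.0                   # "within a two-metre vicinity"
MIN_DURATION = timedelta(minutes=15)   # "longer than fifteen minutes"
LOOKBACK = timedelta(days=14)          # keys "from the previous fourteen days"


@dataclass(frozen=True)
class Encounter:
    key: str              # anonymous digital key received over Bluetooth
    distance_m: float
    duration: timedelta
    when: datetime


@dataclass
class Device:
    encounters: List[Encounter] = field(default_factory=list)

    def record(self, enc: Encounter) -> None:
        """Store a key only if the contact was close enough and long enough."""
        if enc.distance_m <= MAX_DISTANCE_M and enc.duration > MIN_DURATION:
            self.encounters.append(enc)

    def keys_to_alert(self, now: datetime, covid_probability: float,
                      threshold: float) -> Set[str]:
        """Return keys to notify when the model's probability reaches the
        (hypothetical) threshold, in place of a positive test result."""
        if covid_probability < threshold:
            return set()
        cutoff = now - LOOKBACK
        return {e.key for e in self.encounters if e.when >= cutoff}


now = datetime(2020, 8, 25, 12, 0)
phone = Device()
phone.record(Encounter("key-A", 1.5, timedelta(minutes=20), now - timedelta(days=3)))
phone.record(Encounter("key-B", 1.0, timedelta(minutes=5), now - timedelta(days=1)))    # too brief
phone.record(Encounter("key-C", 1.8, timedelta(minutes=30), now - timedelta(days=20)))  # too old

# Probabilities taken from case study 4; the 0.5 threshold is illustrative only.
alerted = phone.keys_to_alert(now, covid_probability=0.6613, threshold=0.5)
not_alerted = phone.keys_to_alert(now, covid_probability=0.2159, threshold=0.5)
print(alerted, not_alerted)
```

Keeping the encounter log on the device and releasing keys only once the threshold is reached follows the decentralised design described above; the appropriate threshold value would need to be chosen and validated separately.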
This application is currently in development by Fenton et al. (2020).

## Data Availability

The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.

## Acknowledgment

I would like to thank my supervisor Norman Fenton for his expert guidance and for kindly providing the license to AGENARISK for the duration of this project. I would also like to thank the group, Norman Fenton, Scott McLachlan, Peter Lucas, Kudakwashe Dube, Graham Hitman, Magda Osman, Evangelia Kyrimi and Martin Neil, whose research this project builds upon. Finally, I would like to thank fellow student Georgina Prodhan for her contributions to the model.

## Appendix

* Received October 22, 2020.
* Revision received October 22, 2020.
* Accepted October 26, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory

The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. Ali, N. (2020) ‘Elevated level of C-reactive protein may be an early marker to predict risk for severity of COVID-19’, Journal of Medical Virology, [https://doi.org/10.1002/jmv.26097](https://doi.org/10.1002/jmv.26097) [Online]. Available at: [https://onlinelibrary.wiley.com/doi/10.1002/jmv.26097](https://onlinelibrary.wiley.com/doi/10.1002/jmv.26097) (Accessed: 29th July 2020). 2. Almazeedia, S., Al-Youhaa, S., Jamala, M.H., Al-Haddada, M., Al-Muhainia, A., Al-Ghimlas, F., et al., (2020) ‘Characteristics, risk factors and outcomes among the first consecutive 1096 patients diagnosed with COVID-19 in Kuwait’, EClinicalMedicine, 24(1), [https://doi.org/10.1016/j.eclinm.2020.100448](https://doi.org/10.1016/j.eclinm.2020.100448) [Online]. Available at: [https://www.sciencedirect.com/science/article/pii/S2589537020301929#](https://www.sciencedirect.com/science/article/pii/S2589537020301929#) (Accessed: 11th July 2020). 3.
Anderson, R.M., Hollingsworth, T.D., Baggaley, R.F., Maddren, R. (2020) ‘COVID-19 spread in the UK: the end of the beginning?’, The Lancet, doi: [https://doi.org/10.1016/S0140-6736(20)31689-5](https://doi.org/10.1016/S0140-6736(20)31689-5). 4. Bi, C., Chen, G. (2011) ‘Bayesian Networks Modeling for Crop Diseases.’ In: Li D., Liu Y., Chen Y. (eds) Computer and Computing Technologies in Agriculture IV. CCTA 2010. IFIP Advances in Information and Communication Technology, vol 344. Springer, Berlin, Heidelberg. 5. Braithwaite, I., Callender, T., Bullock, M., Alderidge, R. (2020) ‘Automated and partly automated contact tracing: a systematic review to inform the control of COVID-19’, The Lancet Digital Health, doi: [https://doi.org/10.1016/S2589-7500(20)30184-9](https://doi.org/10.1016/S2589-7500(20)30184-9). 6. British Lung Foundation (2013) Lung disease in the UK – big picture statistics, Available at: [https://statistics.blf.org.uk/lung-disease-uk-big-picture#numbers-developed-lung-disease-uk](https://statistics.blf.org.uk/lung-disease-uk-big-picture#numbers-developed-lung-disease-uk) (Accessed: 11th August 2020). 7. Centers for Disease Control and Prevention (2020) COVID-19 Research Articles Downloadable Database, Available at: [https://www.cdc.gov/library/researchguides/2019novelcoronavirus/researcharticles.html](https://www.cdc.gov/library/researchguides/2019novelcoronavirus/researcharticles.html) (Accessed: 10th July 2020). 8. Chen, D., Xu, W., Lei, Z., Huang, Z., Liu, J., Goa, Z., et al., (2020c) ‘Recurrence of positive SARS-CoV-2 RNA in COVID-19: A case report’, International Journal of Infectious Disease, 93(1), pp. 297–299. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ijid.2020.03.003&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32147538&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 9.
Chen, W., Zheng, K.I., Liu, S., Yan, Z., Xu, C., & Qiao, Z. (2020a) ‘Plasma CRP level is positively associated with the severity of COVID-19’, Annals of Clinical Microbiology and Antimicrobials, 19(18), [https://doi.org/10.1186/s12941-020-00362-2](https://doi.org/10.1186/s12941-020-00362-2) [Online]. Available at: [https://ann-clinmicrob.biomedcentral.com/articles/10.1186/s12941-020-00362-2](https://ann-clinmicrob.biomedcentral.com/articles/10.1186/s12941-020-00362-2) (Accessed: 29th July 2020). 10. Chen, X., Yang, Y., Huang, M., Liu, L., Zhang, X., Xu, J., et al., (2020b) ‘Differences between COVID-19 and suspected then confirmed SARS-CoV-2-negative pneumonia: A retrospective study from a single center’, Journal of Medical Virology, 92(9), pp. 1572–1579. 11. Cholankeril, G., Podboy, A., Aivaliotis, V.I., Pham, E.A., Spencer, S., Kim, D., et al., (2020) ‘Association of Digestive Symptoms and Hospitalization in Patients with SARS-CoV-2 Infection’, American Journal of Gastroenterology, 115(7), pp. 1129-1132 [Online]. Available at: [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302101/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302101/) (Accessed: 11th July 2020). 12. Cromer, D., van Hoek, A., Jit, M., Edmunds, J., Fleming, D., Miller, E., (2014) ‘The burden of influenza in England by age and clinical risk group: A statistical analysis to inform vaccine policy’, Journal of Infection, 68(4), pp. 363–371. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jinf.2013.11.013&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24291062&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 13. Daher, V., Oliveria, D., Junior, M., Fernandes, E., Bomtempo de Castro, J., Moya, M., Guimaraes, V. (2020) ‘Anosmia: A marker of infection by the new corona virus’, Respiratory Medicine Case Reports, 31(1), [https://doi.org/10.1016/j.rmcr.2020.101129](https://doi.org/10.1016/j.rmcr.2020.101129). 14. 
de Jager, C., van Wijk, P., Mathoera, R., de Jongh-Leuvenink, J., Poll, T., and Wever, P. (2010) ‘Lymphocytopenia and neutrophil-lymphocyte count ratio predict bacteremia better than conventional infection markers in an emergency care unit’, Critical Care, 14(R192), [https://doi.org/10.1186/cc9309](https://doi.org/10.1186/cc9309). 15. European Centre for Disease Prevention and Control (2020a) Transmission of COVID-19, Available at: [https://www.ecdc.europa.eu/en/covid-19/latest-evidence/transmission](https://www.ecdc.europa.eu/en/covid-19/latest-evidence/transmission) (Accessed: 11th August 2020). 16. European Centre for Disease Prevention and Control (2020b) COVID-19 in children and the role of school settings in COVID-19 transmission, Available at: [https://www.ecdc.europa.eu/en/publications-data/children-and-school-settings-covid-19-transmission](https://www.ecdc.europa.eu/en/publications-data/children-and-school-settings-covid-19-transmission) (Accessed: 14th August 2020). 17. Fenton, N.E. and Neil, M. (2018) Risk Assessment and Decision Analysis with Bayesian Networks. Boca Raton, Florida: Chapman and Hall/CRC. 18. Fenton, N.E., McLachlan, S., Lucas, P., Dube, K., Hitman, G.A., Osman, M., Kyrimi, E., Neil, M. (2020) ‘A privacy-preserving Bayesian network model for personalised COVID19 risk assessment and contact tracing’, ResearchGate, doi: 10.1101/2020.07.15.20154286. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoibWVkcnhpdiI7czo1OiJyZXNpZCI7czoyMToiMjAyMC4wNy4xNS4yMDE1NDI4NnYxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMTAvMjYvMjAyMC4xMC4yMi4yMDIxNzU1NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 19. 
Galvan, C., Catala, A., Carretero Hernandez, G., Rodriguez-Jimenez, P., Fernandez Lario, D., Rodriguez-Villa, A., et al., (2020) ‘Classification of the cutaneous manifestations of COVID-19: a rapid prospective nationwide consensus study in Spain with 375 cases’, British Journal of Dermatology, 183(1), pp. 71–77. 20. Guan, W., Ni, Z., Hu, Y., Liang, W., Ou, C., He, J., et al., (2020) ‘Clinical Characteristics of Coronavirus Disease 2019 in China’, New England Journal of Medicine, 382(18), pp. 1708–1720. 21. Haddawy, P., Jacobson, J., and Kahn, C.E. (1994) ‘Generating explanations and tutorial problems from Bayesian networks.’, Proceedings of the Annual Symposium on Computer Applications in Medical Care, 1(1), pp. 770–774. 22. He, P. (2014) ‘Counter Cyber Attacks By Semantic Networks’, in Akhgar, B. and Arabnia, H. (ed.) Emerging Trends in ICT Security. Massachusetts: Morgan Kaufmann, pp. 455–467. 23. Hu, Z., Song, C., Xu, C., Jin, G., Chen, Y., Xu, X., et al., (2020) ‘Clinical characteristics of 24 asymptomatic infections with COVID-19 screened among close contacts in Nanjing, China’, Science China Life Sciences, 4(1), pp. 1–6. 24. Huang, C., Wang, Y., Li, X., Ren, L., Zhoa, J., Hu, Y., et al., (2020b) ‘Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China’, The Lancet, 395(10223), pp. 497–506. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0140-6736(20)30183-5&link_type=DOI) 25. Huang, I., and Pranata, R.
(2020a) ‘Lymphopenia in severe coronavirus disease-2019 (COVID-19): systematic review and meta-analysis’, Journal of Intensive Care, 8(36), [https://doi.org/10.1186/s40560-020-00453-4](https://doi.org/10.1186/s40560-020-00453-4) [Online]. Available at: [https://jintensivecare.biomedcentral.com/articles/10.1186/s40560-020-00453-4](https://jintensivecare.biomedcentral.com/articles/10.1186/s40560-020-00453-4) (Accessed: 29th July 2020). 26. Jee, J., Foote, M., Lumish, M., Stonestrom, A., Wills, B., Narendra, V., et al., (2020) ‘Chemotherapy and COVID-19 Outcomes in Patients With Cancer’, Journal of Clinical Oncology, doi: 10.1200/JCO.20.01307. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1200/JCO.20.01307&link_type=DOI) 27. Jensen, F.V. (1996) ‘AISB’, Bayesian Network Basics, 94(1), pp. 9–22. 28. Joob, B., Wiwanitkit, V. (2020) ‘COVID-19 can present with a rash and be mistaken for dengue’, Journal of the American Academy of Dermatology, 82(5), pp. e117. 29. Kahn, C.E., Laur, J.J., and Carrera, G.F. (2001) ‘A Bayesian network for diagnosis of primary bone tumors’, Journal of Digital Imaging, 14(2), pp. 56–57. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11442121&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 30. Kahn, C.E., Roberts, L.M., Shaffer, K.A., and Haddawy, P., (1997) ‘Construction of a Bayesian network for mammographic diagnosis of breast cancer’, Computers in Biology and Medicine,27(1), pp. 19–29. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0010-4825(96)00039-X&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=9055043&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 31. Lee, Y., Min, P., Lee, S., Kim, S. (2020) ‘Prevalence and Duration of Acute Loss of Smell or Taste in COVID-19 Patients’, Journal of Korean Medical Science, 3(18), pp. 174. 32. Li, C., Luo, F., Wu, B. 
(2020b) ‘A 3-month-old child with COVID-19: a case report’, Medicine, 99(23), doi: 10.1097/MD.0000000000020661. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MD.0000000000020661&link_type=DOI) 33. Li, J., Chen, Z., Nie, Y., Man, Y., Guo, Q., Dai, X. (2020a) ‘Identification of Symptoms Prognostic of COVID-19 Severity: Multivariate Data Analysis of a Case Series in Henan Province’, JMIR Publications, 22(6), doi: 10.2196/19636 [Online]. Available at: [https://www.jmir.org/2020/6/e19636/#Introduction](https://www.jmir.org/2020/6/e19636/#Introduction) (Accessed: 11th July 2020). 34. Liang, L., Ren, H., Cao, R., Hu, Y., Qin, Z., Li, C., Mei, S. (2020) ‘The Effect of COVID-19 on Youth Mental Health’, Psychiatric Quarterly, 91(1), pp. 841–852. 35. Liu, X., Yue, X., Liu, F., Wei, L., Chu, Y., Bao, H., Dong, Y., Cheng, W., Yang, L. (2020) ‘Analysis of clinical features and early warning signs in patients with severe COVID-19: A retrospective cohort study’, PLOS ONE, 1 [https://doi.org/10.1371/journal.pone.0235459](https://doi.org/10.1371/journal.pone.0235459) [Online]. Available at: [https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235459](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235459) (Accessed: 11th July 2020). 36. Luciani, D., Marchesi, M., and Bertolini, G. (2003) ‘The role of Bayesian Networks in the diagnosis of pulmonary embolism’, Journal of Thrombosis and Haemostasis, 1(4), pp. 698–707. 37. Mao, L., Jin, H., Wang, M., Hu, Y., Chen, S., He, Q., et al., (2020) ‘Neurologic Manifestations of Hospitalized Patients With Coronavirus Disease 2019 in Wuhan, China’, JAMA Neurology, 77(6), pp. 683–690. 38. Matar, A., Oules, B., Sohier, P., Chosidow, O., Beylot-Barry, M., Dupin, N., Aracting, S.
(2020) ‘Cutaneous manifestations in SARS-CoV-2 infection (COVID-19): a French experience and a systematic review of the literature’, European Academy of Dermatology and Venereology. [https://doi.org/10.1111/jdv.16775](https://doi.org/10.1111/jdv.16775). 39. Matrajt, L., and Leung, T. (2020) ‘Evaluating the Effectiveness of Social Distancing Interventions to Delay or Flatten the Epidemic Curve of Coronavirus Disease’, Emerging Infectious Diseases, 26(8), pp. 1740–1748. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3201/eid2608.201093&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 40. McGee, S. (2002) ‘Simplifying Likelihood Ratios’, Journal of General Internal Medicine, 17(8), pp. 647–650. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1046/j.1525-1497.2002.10750.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 41. McGill, F., Griffiths, M., Bonnett, L., Geretti, M., Michael, B., Beeching, N., et al., (2018) ‘Incidence, aetiology, and sequelae of viral meningitis in UK adults: a multicentre prospective observational cohort study’, The Lancet Infectious Diseases, 18(9), pp. 922–1003. 42. Menni, C., Valdes, A.M., Freidin, M.B., Sudre, C.H., Nguyen, L.H., Drew, D.A., et al., (2020) ‘Real-time tracking of self-reported symptoms to predict potential COVID-19’, Nature Medicine, 26(1), pp. 1037-1040 [Online]. Available at: [https://www.nature.com/articles/s41591-020-0916-2#Sec2](https://www.nature.com/articles/s41591-020-0916-2#Sec2) (Accessed: 29th July 2020). 
[CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-020-0916-2&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 43. NHS (2020) Check if you or your child has coronavirus symptoms, Available at: [https://www.nhs.uk/conditions/coronavirus-covid-19/symptoms/](https://www.nhs.uk/conditions/coronavirus-covid-19/symptoms/) (Accessed: 14th August 2020). 44. Nicola, M., Alsafi, Z., Sohrabi, C., Kerwan, A., Al-Jabir, A., Iosifidis, C., et al., (2020) ‘The socio-economic implications of the coronavirus pandemic (COVID-19): A review’, International Journal of Surgery, 78(1), pp. 185–193. 45. nidirect (2020) Coronavirus (COVID-19): StopCOVID NI Proximity App, Available at: [https://www.nidirect.gov.uk/articles/coronavirus-covid-19-stopcovid-ni-proximity-app](https://www.nidirect.gov.uk/articles/coronavirus-covid-19-stopcovid-ni-proximity-app) (Accessed: 25th August 2020). 46. Office for National Statistics (2020a) Coronavirus and the impact on output in the UK economy: June 2020, Available at: [https://www.ons.gov.uk/economy/grossdomesticproductgdp/articles/coronavirusandtheimpactonoutputintheukeconomy/june2020](https://www.ons.gov.uk/economy/grossdomesticproductgdp/articles/coronavirusandtheimpactonoutputintheukeconomy/june2020) (Accessed: 20th August 2020). 47. Office for National Statistics (2020b) Unemployment, Available at: [https://www.ons.gov.uk/employmentandlabourmarket/peoplenotinwork/unemployment](https://www.ons.gov.uk/employmentandlabourmarket/peoplenotinwork/unemployment) (Accessed: 20th August 2020). 48. Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, 1st edn., San Francisco: Morgan Kaufmann. 49. Prodhan, G., Fenton N.E., (2020) ‘Extending the range of COVID-19 risk factors in a Bayesian network model for personalised risk assessment’, Queen Mary University pre-print. 50. 
Public Health England (2020) UK flu levels according to PHE statistics: 2019 to 2020, Available at: [https://www.gov.uk/government/news/uk-flulevels-according-to-phe-statistics-2019-to-2020](https://www.gov.uk/government/news/uk-flulevels-according-to-phe-statistics-2019-to-2020) (Accessed: 15th August 2020). 51. Public Health England (2019) Invasive meningococcal disease in England: annual laboratory confirmed reports for epidemiological year 2018 to 2019, Available at: [https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/842368/hpr3819_IMD-ann.pdf](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/842368/hpr3819_IMD-ann.pdf) (Accessed: 20th August) 52. Rayner, L., Sherlock, J., Creagh-Brown, B., Williams, J., de Lusignan, S. (2017) ‘The prevalence of COPD in England: An ontological approach to case detection in primary care’, Respiratory Medicine, 132(1), pp. 217–225. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 53. Riley, S., Ainslie, K., Eales, O., Jeffery, B., Walters, C., Atchison, C., et al., (2020) ‘Community prevalence of SARS-CoV-2 virus in England during May 2020: REACT study’, medRxiv, [https://doi.org/10.1101/2020.07.10.20150524](https://doi.org/10.1101/2020.07.10.20150524). 54. Russo, A., Minichini, C., Starace, M., Astorri, R., Calo, F., Coppola, N. (2020) ‘Current Status of Laboratory Diagnosis for COVID-19: A Narrative Review’, Infection and Drug Resistance, 13(1), pp. 2657–2665. 55. Sachdeva, M., Gianotti, R., Shah, M., Bradanini, L., Tosi, D., Veraldi, S., et al., (2020) ‘Cutaneous manifestations of COVID-19: Report of three cases and a review of literature’, Journal of Dermatological Science, 98(2), pp. 75–81. 
[PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 56. Seixas, F.L., Zadrozny, B., Laks, J., Conci, A., Muchaluat Saade, D.C. (2014) ‘A Bayesian network decision model for supporting the diagnosis of dementia, Alzheimer?s disease and mild cognitive impairment’, Computers in Biology and Medicine, 51(1), pp. 140–158. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 57. Soltan, A., Kouchaki, S., Zhu, T., Kiyasseh, D., Taylor, T., Hussain, Z.B., et al., (2020) ‘Artificial intelligence driven assessment of routinely collected healthcare data is an effective screening test for COVID-19 in patients presenting to hospital’, medRxiv, doi: [https://doi.org/10.1101/2020.07.07.20148361](https://doi.org/10.1101/2020.07.07.20148361) [Online]. Available at: [https://www.medrxiv.org/content/10.1101/2020.07.07.20148361v1.full.pdf](https://www.medrxiv.org/content/10.1101/2020.07.07.20148361v1.full.pdf) (Accessed: 29th July 2020). 58. Song, S.H., Chen, T.L., Deng, L.P., Zhang, Y.X., Mo, P.Z., Gao, S.C., et al., (2020) ‘Clinical characteristics of four cancer patients with SARS-CoV-2 infection in Wuhan, China’, Infectious Diseases of Poverty, 9(82), [https://doi.org/10.1186/s40249-020-00707-1](https://doi.org/10.1186/s40249-020-00707-1). 59. Tabata, S., Mai, K., Kawano, S., Ikeda, M., Kodama, T., Miyoshi, K., et al., (2020) ‘Clinical characteristics of COVID-19 in 104 people with SARS-CoV-2 infection on the Diamond Princess cruise ship: a retrospective analysis’, The Lancet Infectious Diseases, 1(1), [https://doi.org/10.1016/S1473-3099(20)30482-5](https://doi.org/10.1016/S1473-3099(20)30482-5). 60. Tahamtan, A., and Ardebili (2020) ‘Real-time RT-PCR in COVID-19 detection: issues affecting the results’, Expert Review of Molecular Diagnostics, 20(5), pp. 453–454. 
[CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/14737159.2020.1757437&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 61. Wagner, J., DuPont, A., Larson, S., Cash, B., Farooq, A. (2020b) ‘Absolute lymphocyte count is a prognostic marker in Covid-19: A retrospective cohort review’, International Journal of Laboratory Hematology, [https://doi.org/10.1111/ijlh.13288](https://doi.org/10.1111/ijlh.13288) [Online]. Available at: [https://ann-clinmicrob.biomedcentral.com/articles/10.1186/s12941-020-00362-2](https://ann-clinmicrob.biomedcentral.com/articles/10.1186/s12941-020-00362-2) (Accessed: 29th July 2020). 62. Wagner, T., Shweta, F., Murugadoos, K., Awasthi, S., Venkatakrishnan, A., Bade, S., et al., (2020a) ‘Augmented Curation of Clinical Notes from a Massive EHR System Reveals Symptoms of Impending COVID-19 Diagnosis’, medRxiv, [https://doi.org/10.1101/2020.04.19.20067660](https://doi.org/10.1101/2020.04.19.20067660). 63. Wang, K-J., Makond, B., and Wang, K-M. (2014) ‘Modeling and predicting the occurrence of brain metastasis from lung cancer by Bayesian network: A case study of Taiwan’, Computers in Biology and Medicine, 47(1), pp. 147–160. 64. Wang, L. (2020) ‘C-reactive protein levels in the early stage of COVID-19’, Medecine et Maladies Infectieuses, 50(4), pp. 332–334. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.medmal.2020.03.007&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F10%2F26%2F2020.10.22.20217554.atom) 65. West, C., Montori, V., Sampathkumar, P. (2020) ‘COVID-19 Testing: The Threat of False-Negative Results’, Mayo Clinic Proceedings, 95(6), pp. 1127–1129. 
66. WHO (2020a) ‘Clinical Management of COVID-19’. Available at: [https://www.who.int/publications/i/item/clinical-management-of-covid-19](https://www.who.int/publications/i/item/clinical-management-of-covid-19) (Accessed: 10th July 2020).
67. Wilkerson, G., Alder, J., Shah, N., Brown, R. (2020) ‘Silent hypoxia: A harbinger of clinical deterioration in patients with COVID-19’, American Journal of Emergency Medicine, doi: 10.1016/j.ajem.2020.05.044.
68. Williams, F., Freydin, M., Mangino, M., Couvreur, S., Visconti, A., Bowyer, R., et al. (2020b) ‘Self-reported symptoms of covid-19 including symptoms most predictive of SARS-CoV-2 infection, are heritable’, medRxiv, [https://doi.org/10.1101/2020.04.22.20072124](https://doi.org/10.1101/2020.04.22.20072124).
69. Williams, M., Calvez, K., Mi, E., Chen, J., Dadhania, S., Pakzad-Shahabi, L. (2020a) ‘Estimating the Risks from COVID-19 Infection in Adult Chemotherapy Patients’, medRxiv, [https://doi.org/10.1101/2020.03.18.20038067](https://doi.org/10.1101/2020.03.18.20038067).
70. worldometers (2020) COVID-19 Coronavirus Pandemic. Available at: [https://www.worldometers.info/coronavirus/](https://www.worldometers.info/coronavirus/) (Accessed: 29th July 2020).
71. Wynants, L., Calster, B.V., Collins, G., Riley, R.D., Heinze, G., Schuit, E., et al. (2020) ‘Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal’, the bmj, 369(1), pp. 1037-1040 [Online]. Available at: [https://doi.org/10.1136/bmj.m1328](https://doi.org/10.1136/bmj.m1328) (Accessed: 29th July 2020).
72. Xie, J., Covassin, N., Fan, Z., Singh, P., Gao, W., Li, G., et al. (2020a) ‘Association Between Hypoxemia and Mortality in Patients With COVID-19’, Mayo Clinic Proceedings, 95(6), pp. 1138–1147.
73. Xie, M., Liu, X., Cao, X., Guo, M., Li, X. (2020) ‘Trends in prevalence and incidence of chronic respiratory diseases from 1990 to 2017’, Respiratory Research, 21(49), [https://doi.org/10.1186/s12931-020-1291-8](https://doi.org/10.1186/s12931-020-1291-8).
74. Yet, B., Perkins, Z.B., Tai, N.R.M., Marsh, W.R. (2017) ‘Clinical evidence framework for Bayesian networks’, Knowledge and Information Systems, 50(1), pp. 117–143.
75. Yuki, K., Fujiogi, M., Koutsogiannaki, S. (2020) ‘COVID-19 pathophysiology: A review’, Clinical Immunology, 215(108427), doi: 10.1016/j.clim.2020.108427.
76. Zayet, S., Kadiane-Oussou, N., Lepiller, Q., Zahra, H., Royer, P., Toko, L., et al. (2020) ‘Clinical features of COVID-19 and influenza: a comparative study on Nord Franche-Comte cluster’, Microbes and Infection, [https://doi.org/10.1016/j.micinf.2020.05.016](https://doi.org/10.1016/j.micinf.2020.05.016).
77. Zhang, G., Hu, C., Luo, L., Fang, F., Chen, Y., Li, J., et al. (2020b) ‘Clinical features and short-term outcomes of 221 patients with COVID-19 in Wuhan, China’, Journal of Clinical Virology, 127(1), doi: 10.1016/j.jcv.2020.104364.
78. Zhang, S-Y., Lian, J-S., Hu, J-H., Zhang, X-L., Lu, Y-F., Cai, H., et al. (2020a) ‘Clinical characteristics of different subtypes and risk factors for the severity of illness in patients with COVID-19 in Zhejiang, China’, Infectious Diseases of Poverty, 9(85), [https://doi.org/10.1186/s40249-020-00710-6](https://doi.org/10.1186/s40249-020-00710-6) [Online]. Available at: [https://idpjournal.biomedcentral.com/articles/10.1186/s40249-020-00710-6#Tab1](https://idpjournal.biomedcentral.com/articles/10.1186/s40249-020-00710-6#Tab1) (Accessed: 11th July 2020).
79. Zhao, Q., Fang, X., Pang, Z., Zhang, B., Liu, H., Zhang, F. (2020) ‘COVID-19 and cutaneous manifestations: A systematic review’, European Academy of Dermatology and Venereology, [https://doi.org/10.1111/jdv.16778](https://doi.org/10.1111/jdv.16778).
80. Zheng HT., Kang BY., Kim HG. (2008) ‘An Ontology-Based Bayesian Network Approach for Representing Uncertainty in Clinical Practice Guidelines’ In: da Costa P.C.G. et al. (eds) Uncertainty Reasoning for the Semantic Web I. URSW 2006, URSW 2007, URSW 2005. Lecture Notes in Computer Science, vol 5327. Springer, Berlin, Heidelberg.
81. ZOE (2020) Skin rash should be considered as a fourth key sign of COVID-19, Available at: https://covid.joinzoe.com/post/skin-rash-covid#:~:text=Researchers%20discovered%20that%208.8%25%20of,with%20a%20negative%20test%20result. (Accessed: 20th July 2020).

[1]: /embed/graphic-4.gif