Abstract
Background The objectives of this study were to identify risk factors for severe COVID-19 and to lay the basis for risk stratification based on demographic data and health records.
Methods and Findings The design was a matched case-control study. Severe COVID-19 was defined as either a positive nucleic acid test for SARS-CoV-2 in the national database followed by entry to a critical care unit or death within 28 days, or a death certificate with COVID-19 as underlying cause. Up to ten controls per case matched for sex, age and primary care practice were selected from the population register. All diagnostic codes from the past five years of hospitalisation records and all drug codes from prescriptions dispensed during the past nine months were extracted. Rate ratios for severe COVID-19 were estimated by conditional logistic regression.
There were 4272 severe cases. In a logistic regression using the age-sex distribution of the national population, the odds ratios for severe disease were 2.87 for a 10-year increase in age and 1.63 for male sex. In the case-control analysis, the strongest risk factor was residence in a care home, with rate ratio (95% CI) 21.4 (19.1, 23.9).
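As a rough illustration of what an odds ratio of 2.87 per 10-year increase in age implies over other age intervals, the sketch below rescales it. It assumes that age enters the logistic model as a single linear term on the log-odds scale, which the abstract does not state explicitly. The function name `rescale_odds_ratio` is hypothetical.

```python
import math

def rescale_odds_ratio(or_per_step, step_years, target_years):
    """Rescale an odds ratio to a different age interval.

    Assumption: log-odds are linear in age, so the log odds ratio
    scales in proportion to the length of the interval.
    """
    return math.exp(math.log(or_per_step) * target_years / step_years)

# Odds ratio reported in the abstract: 2.87 per 10-year increase in age.
or_per_year = rescale_odds_ratio(2.87, 10, 1)
or_per_40_years = rescale_odds_ratio(2.87, 10, 40)
print(f"per year: {or_per_year:.3f}, per 40 years: {or_per_40_years:.1f}")
```

Under this assumption the odds of severe disease multiply by roughly 1.11 for each year of age, which compounds to a difference of well over an order of magnitude across four decades.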
Univariate rate ratios (95% CIs) for conditions listed by public health agencies as conferring high risk were 2.75 (1.96, 3.88) for Type 1 diabetes, 1.60 (1.48, 1.74) for Type 2 diabetes, 1.49 (1.37, 1.61) for ischemic heart disease, 2.23 (2.08, 2.39) for other heart disease, 1.96 (1.83, 2.10) for chronic lower respiratory tract disease, 4.06 (3.15, 5.23) for chronic kidney disease, 5.4 (4.9, 5.8) for neurological disease, 3.61 (2.60, 5.00) for chronic liver disease and 2.66 (1.86, 3.79) for immune deficiency or suppression.
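Confidence intervals for rate ratios are conventionally constructed on the log scale, so the standard error of the log rate ratio can be recovered from the reported bounds. The sketch below does this for the Type 1 diabetes estimate. It assumes a Wald-type 95% interval (point estimate times or divided by exp(1.96 × SE)), which the abstract does not confirm. The function name `log_se_from_ci` is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(rate ratio), assuming a symmetric Wald
    interval on the log scale: log(upper) - log(lower) = 2 * z * SE."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Type 1 diabetes: rate ratio 2.75 (1.96, 3.88), as reported above.
se = log_se_from_ci(1.96, 3.88)
# On the log scale the point estimate sits midway between the bounds,
# so the geometric mean of the bounds should be close to it.
midpoint = math.sqrt(1.96 * 3.88)
z_score = math.log(2.75) / se
print(f"SE = {se:.3f}, midpoint = {midpoint:.2f}, z = {z_score:.1f}")
```

The geometric mean of the bounds landing close to the reported point estimate is a quick consistency check that an interval was built on the log scale.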
78% of cases and 52% of controls had at least one listed condition (NA of cases and NA of controls under age 40). Severe disease was associated with encashment of at least one prescription in the past nine months and with at least one hospital admission in the past five years [rate ratios 3.10 (2.59, 3.71) and 2.75 (2.53, 2.99) respectively], even after adjusting for the listed conditions. In those without listed conditions, significant associations with severe disease were seen across many hospital diagnoses and drug categories. Age and sex provided 2.58 bits of information for discrimination. A model based on demographic variables, listed conditions, hospital diagnoses and prescriptions provided an additional 1.25 bits (C-statistic 0.825). A limitation of this study is that records from primary care were not available.
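The information for discrimination (in bits) and the C-statistic can be related by a simple approximation. The sketch below assumes that the weight of evidence (log likelihood ratio) is normally distributed in cases and controls, in which case the C-statistic is approximately Φ(√Λ), where Λ is the expected information in natural log units. This relation is an assumption used for illustration and is not stated in the abstract. The function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def c_statistic_from_bits(bits):
    """Approximate C-statistic implied by an expected information for
    discrimination given in bits.

    Assumption: the weight of evidence is Gaussian in cases and in
    controls, giving C = Phi(sqrt(Lambda)) with Lambda in nats.
    """
    nats = bits * math.log(2.0)  # convert bits to natural log units
    return normal_cdf(math.sqrt(nats))

# Additional information reported for the full model: 1.25 bits.
c_approx = c_statistic_from_bits(1.25)
print(f"approximate C-statistic: {c_approx:.3f}")
```

Under the Gaussian assumption, 1.25 bits corresponds to a C-statistic close to the reported 0.825. Zero bits gives 0.5, meaning no discrimination.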
Conclusions Along with older age and male sex, severe COVID-19 is strongly associated with past medical history across all age groups. Many comorbidities beyond the risk conditions designated by public health agencies contribute to this. A risk classifier that uses all the information available in health records, rather than only a limited set of conditions, will more accurately discriminate between low-risk and high-risk individuals who may require shielding until the epidemic is over.
Author summary Most people infected with the SARS-CoV-2 coronavirus do not become seriously ill. The risk of severe or fatal illness is higher in older than in younger people, and is higher in people with conditions such as asthma and diabetes than in people without these conditions. Using Scotland’s capability for linking electronic health records, we report the first systematic study of the relation of severe or fatal COVID-19 to pre-existing health conditions and other risk factors. We show that the strongest risk factor, apart from age, is residence in a care home. The conditions associated with increased risk include not only those already designated by public health agencies – asthma, diabetes, heart disease, disabling neurological disease, kidney disease – but many other diagnoses associated with frailty and poor health. This lays a basis for constructing risk scores based on electronic health records that can be used to advise people at high risk of severe disease to shield themselves when there are cases in their neighbourhood.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
No external funding was received for the work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was conducted under approvals from the Privacy Advisory Committee ref 44/13 and Public Benefit Privacy Protection amendment 1617-0147. Datasets were de-identified before analysis.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Data cannot be directly shared publicly because of ethical and legal concerns as the data derive from deidentified National Health Service Records. The datasets used in this analysis are available via the Public Benefits Privacy Panel for Health at https://www.informationgovernance.scot.nhs.uk/pbpphsc/ for researchers who meet the criteria for access to confidential data.