Abstract
Objective To study whether COVID-19 infection may be associated with increased hospitalization and mortality from other diseases.
Design Cohort study.
Setting The UK Biobank.
Participants All subjects in the UK Biobank with available hospitalization records and alive as of 31-Jan-2020 (N= 412,096; age 50-87).
Main outcome measures We investigated associations of COVID-19 with hospitalization and mortality due to different diseases post-infection. We conducted a comprehensive survey on disorders from all systems (up to 135 disease categories). Multivariable Cox and Poisson regression was conducted controlling for main confounders. For sensitivity analysis, we also conducted separate analysis for new-onset and recurrent cases, and other analysis such as the prior event rate adjustment(PERR) approach to minimize effects of unmeasured confounders. We also performed association analyses stratified by vaccination status. Time-dependent effects on subsequent hospitalization and mortality were also tested.
Results Compared to individuals with no known history of COVID-19, those with severe COVID-19 (requiring hospitalization) exhibited higher hazards of hospitalization and/or mortality due to multiple disorders (median follow-up=608 days), including disorders of respiratory, cardiovascular, neurological, gastrointestinal, genitourinary and musculoskeletal systems. Increased hazards of hospitalizations and/or mortality were also observed for injuries due to fractures, various infections and other non-specific symptoms. These results remained largely consistent after sensitivity analyses. Severe COVID-19 was also associated with increased all-cause mortality (HR=14.700, 95% CI: 13.835-15.619).
Mild (non-hospitalized) COVID-19 was associated with modestly increased risk of all-cause mortality (HR=1.237, 95% CI 1.037-1.476) and mortality from neurocognitive disorders, as well as hospital admission from a few disorders such as aspiration pneumonitis, musculoskeletal pain and other general signs/symptoms.
All-cause mortalities and hospitalizations from other disorders post-infection were generally higher in the pre-vaccination era. The deleterious effect of COVID-19 was observed to wane over time, with maximum HR in the initial phase.
Conclusions In conclusion, this study revealed increased risk of hospitalization and mortality from a wide variety of pulmonary and extra-pulmonary diseases after COVID-19, especially for severe infections. Mild disease was also associated with increased all-cause mortality. Causality however cannot be established due to observational nature of the study. Further studies are required to replicate our findings.
Introduction
There are more than 590 million1 documented cases of COVID-19 worldwide (as at 16-Aug-2022), and accumulating evidence have shown that the disease may be associated with various sequelae. For example, respiratory symptoms and disorders were reported frequently in the post-infection period, such as dyspnoea and cough2, lung function impairment3, and bronchiectasis4. Relatively less attention has been paid on other organ systems. For example, studies have reported that neurological manifestations could persist in patient with COVID-195-7. Increased risks of cardiovascular diseases have also been observed8. Several other studies have also investigated the health consequences of COVID-19 across different systems9-12. Nevertheless, the full spectrum of sequalae from COVID-19 remains unclear, and there are still limited studies which have conducted a comprehensive analysis on potential sequelae from all body systems.
Given the very large number of people affected by COVID-19 worldwide, additional studies are still very much needed to delineate clearly the sequelae of COVID-19. There are also some limitations of existing studies. For example, some studies11,12 are of relatively small sample sizes, which limits the power to detect moderate associations. Another study9 examined the risks of a wide range of sequelae; however the study was restricted to veterans, who are predominantly males and the findings may not be readily generalizable to the general population13. In addition, very few works (except e.g. ref14) have studied how vaccination may change the risk of sequelae from COVID-19.
Here we leveraged a large prospective cohort of the UK biobank (UKBB) to identify associations of COVID-19 with hospitalization and mortality due to different diseases post-infection. We conducted a comprehensive survey on disorders from all systems, and a series of additional analysis, for example on the risk of hospitalization from new-onset and recurrent diseases, risk of sequelae across different follow-up periods, and how vaccination (before or after the infection) may alter the risk of sequelae.
Methods
UK Biobank sample
The UK Biobank (UKBB) is a large prospective cohort comprising ∼500,000 UK participants with continuous follow-up. The current age of subjects in this study ranged from 50 to 87 years. The details of the UKBB sample are described in ref15. As not every participant is linked to hospital inpatient records, our analysis was restricted to those subjects (sample size, N= 412,096) with available hospitalization records and alive as at 31-Jan-2020, the date for the first confirmed case of COVID-19 in UK. The current study was conducted under the project number 28732. The overall analytic flow is presented in Fig.1.
Study Outcomes
Our main study outcomes were hospitalization and mortality due to various disorders. Hospitalization records were extracted from the Hospital Episode Statistics (HES) data of UKBB. Detailed descriptions of HES can be found in https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/HospitalEpisodeStatistics.pdf. The hospital inpatient ‘core’ and diagnoses datasets were updated to 30 Sep 2021 (accessed 21 Dec 2021). The diagnosis codes and corresponding dates were summarized based on each participant’s inpatient record. All the diagnosis codes were converted to 3-character ICD-10 codes. We followed the mapping strategy as described in https://biobank.ndph.ox.ac.uk/showcase/showcase/docs/first_occurrences_outcomes.pdf. The mortality records of UKBB contain the cause of death (ICD10-coded) and were linked to national death registries. Details can be found in https://biobank.ctsu.ox.ac.uk/crystal/crystal/docs/DeathLinkage.pdf. Mortality data were updated to 18 Oct 2021 (accessed 21 Dec 2021).
All ICD-10 codes were covered in analysing the sequelae of COVID-19 infection. We employed the latest version (2022.1) of Clinical Classifications Software Refined (CCSR)16 to classify ICD-10 codes to a smaller number of clinically meaningful categories of diagnoses. Totally 320 clinical categories covering 21 body systems were included as outcomes. Since the number of events may be very small for some outcomes, we only present the results if the number of events>=5 for both exposed and unexposed groups. Totally up to 135 disease categories were included after the filtering.
Only primary causes of admission/mortality were considered. All diagnosis given before the study start-date (i.e. time0; the date a subject entered the cohort study) were regarded as medical history of comorbidities; diagnoses given after time0 were treated as hospitalization/mortality due to ‘new-onset’ diseases. For better statistical power, we primarily present the results from any hospitalization or mortality from diseases, without consideration of past medical history. Nevertheless, we also performed additional stratified analysis on new-onset (subjects with no known prior history) and recurrent/relapsed diseases (subjects with known history of the disease).
Covariates
We performed multivariable regression analysis with adjustment for covariates including basic demographics (age, sex, ethnicity), major risk factors of cardiovascular disorders (lipid and glucose levels, HbA1c), immunological disorders (autoimmune diseases, immunodeficiency, history of receiving immunosuppressants), major comorbidities (coronary artery disease[CAD], stroke, Type 2 diabetes mellitus[T2DM], hypertension[HTN], atrial fibrillation[AF], chronic obstructive pulmonary disease[COPD], dementia, history of cancer, chronic kidney disease[CKD] and pneumonia), overall health status and indicators of multi-morbidities (number of non-cancer illnesses, number of medications prescribed by GP and number of hospitalizations in the past year), indicators of obesity (body mass index [BMI], waist circumference), socioeconomic status (Townsend Deprivation index) and smoking status. Covariates were picked based on their potential relevance to COVID-19 infection and/or its complications. Please refer to Table S1 for the full list and distribution of covariates.
To define history of relevant diseases, we included information from primary care and hospital inpatient data, as well as ICD-10 diagnoses (code 41270) and self-reported illnesses (code 20002). The strategy of integrating all diagnoses were based on instructions from UKBB (https://biobank.ndph.ox.ac.uk/showcase/showcase/docs/first_occurrences_outcomes.pdf).
Missing covariate data
Missing values of features were imputed with “missRanger”. The program is based on missForest, an iterative imputation approach using random forest. The program has been widely used and has been shown to produce low imputation errors and good performance in predictive models 17. The missing rate of each covariate and the OOB (out-of-bag error) of imputation are listed in Table S2.
COVID-19 infection status
COVID-19 testing data were downloaded from the UKBB data portal (please refer to https://biobank.ndph.ox.ac.uk/showcase/exinfo.cgi?src=COVID19). Briefly, the latest COVID test results were downloaded on 21 Dec 2021 (last update on 30 Oct 2021). COVID-19 test results after the study end-date (30 Sep 2021 for hospital admission and 18 Oct 2021 for mortality) were ignored. COVID-19 infection status was defined by any positive test results, or a diagnosis of COVID-19 (ICD-10 code: U071) from inpatient/mortality records or the code “Y2a3b” within the TPP GP records ((https://biobank.ndph.ox.ac.uk/showcase/coding.cgi?id=8708&nl=1) in the follow-up period.
“Severe” COVID-19 was defined by any case who is hospitalized or whose mortality is due to U07.1. Other non-hospitalized and non-fatal cases are considered as ‘mild’. For hospitalized cases, we required both test result and test origin to be 1 (positive test and inpatient origin). For a small number of cases with initial outpatient origin and positive test result, but changed to inpatient origin within 2 weeks, we still considered these subjects as inpatient cases (i.e., we assume the hospitalization was related to COVID-19).
Four models of analysis
Four sets of analysis were performed for hospitalization and mortality as outcome separately (Table 1). For model A, it was restricted to test-positive subjects, while for other models, they were based on the whole population. For Model A, we compared severe vs mild COVID-19 cases for post-infection hospital admission and mortalities. For model B, we compared severe COVID-19 cases against those with no known history of infection. For model C, all infected cases were compared against those with no known infection. Finally, ‘mild COVID-19’ were compared against subjects with no known infection in model D.
Statistical analysis
Cox and Poisson regression were employed for analysis. Cox regression models time to development of event, while Poisson regression models the incidence rate (with consideration of follow-up time). Briefly, the time to hospitalization or mortality from the studied diseases was treated as outcome in Cox regression, controlling for other covariates. For Poisson regression, presence/absence of the event of interest was considered as outcome, with ‘offset’ specified (=number of days of follow-up) to account for the differences in duration of follow-up for each subject.
For subjects with COVID-19 infection, the first positive date was treated as the start-date (time0) of this cohort study. For subjects free of COVID-19 infection, 31-Jan-2020 were assigned as time0. The end of follow-up was set as the time a participant developed an event of interest or 30 Sep 2021 (for analysis of hospitalization) or 18 Oct 2021 (for analysis of mortality). All statistical analysis were performed by R (version number 3.6.1). The false discovery rate (FDR) approach by Benjamini and Hochberg18 was performed to control for multiple testing, which controls the expected proportion of false-positives among the rejected hypotheses.
To avoid convergence issues when large number of predictors are included, we followed the methodology described by Zhao et al.19 to further select covariates for inclusion. We first included several ‘default’ variables, including age, sex and general health status (number of hospitalization and medications received in the past year) into the regression model. For the rest of the covariates, univariate testing was performed with each predictor; those with nominally significant associations (p<0.05) with the outcome were selected into the final model. This approach19 was shown by theory and simulations to control the proportion of false positive predictors.
Prior event rate ratio (PERR) adjustment
Although many confounders have been adjusted for, the chance of residual confounding remains. The PERR approach exploits a before-and-after design to minimize the effects of unmeasured confounders20-22. The basic idea is to compare the differences in hazard ratios between the exposed and unexposed groups before and after the COVID-19 pandemic, as differences in hospitalization/mortality risks prior to the pandemic can be due to baseline differences (unmeasured confounders) between the two groups.
We may assume that before the onset of the pandemic in UK (31 Jan 2020), the hazard ratio of event (e.g. hospitalization due to a specific disease) between the exposed and unexposed groups is HRbefore. Since neither group would have COVID-19 before the baseline, HRbefore is assumed to reflect baseline differences owing to unmeasured confounders, independent of the effect of COVID-19. Similarly, we define the differences between groups after the pandemic onset to be HRafter. The PERR-adjusted HR, calculated as HRafter/HRbefore, estimates the effect of COVID-19 accounting for both unmeasured and measured confounders. Equivalently we may compute the difference in Cox regression coefficients. For details please refer to Supplementary Methods.
Prescription-time distribution matching (PTDM)
In this cohort study, the start-date (time0) for the exposed group is the first date of being tested positive, while an artificial time0 was assigned to the unexposed group. Imbalance of ‘prescription time’ (here ‘prescription’ refers to being infected by COVID-19) distribution may lead to ‘survival bias’. Survival bias may be generated when individuals who die shortly after the start of follow-up might not have a probability of being infected and would only be classified as ‘unexposed’.
In our primary analysis, we excluded the ‘immortal time’ period which gives good estimates of the true coefficients when the exposed group is far smaller than the unexposed group23. In addition, if bias remains, the observed effect size (HR) is always smaller than the true HR. Since COVID-19 tends to increase risk of other diseases (HR>1), a smaller HR means that the bias is conservative23. Secondly, we also employed the PTDM approach which has been shown to effectively reduce survival bias24. In brief, for each person without COVID-19 infection, instead of using the same start-date (31-Jan-2020), an alternative time0 was generated by randomly choosing from the sets of time0 from the COVID-19-exposed group. This ensures a similar distribution of the time0 across both groups (see also Supplementary Methods).
Other analysis to check for robustness of findings
We performed a variety of different analyses to verify the robustness of our findings under different modelling strategies and assumptions:
While we primarily focused on all hospitalizations, we also performed stratified analysis for new-onset and recurrent diseases, as mentioned earlier;
for counting the follow-up end date, we primarily consider the last date of available hospitalization/mortality records as the end of follow-up. However, since vaccination may change the risk of sequelae, we also performed analysis limited to the period for which no vaccines were yet available, i.e. setting 8 Dec 2020 (the date of first vaccination in UK) as our study end-date;
We performed extra analysis to test associations with subsequent hospitalization and mortality 15 or 30 days after being tested positive. The reason is that it is possible that some subjects may be admitted for other disorders and were also tested positive on screening. (However, we note that even in such cases, these patients may be infected by COVID-19 first, and the infection may trigger the disease that leads to the primary admission; this approach may miss such cases).
Association with hospital admission/mortality stratified by vaccination status and follow-up period
We also conducted further (exploratory) analysis stratified by vaccination status. Briefly, we compared hospitalization rates for subjects who have received at least one dose of vaccine before or after the infection, those who are infected but not vaccinated and those with no known COVID-19. One of our main questions is whether prior vaccination (and to a lesser extent, vaccination after infection) may ameliorate the risk of sequelae, when compared to infected subjects who are not vaccinated.
In another additional analysis, we first tested proportional hazards (PH) assumption for associations that were (nominally) significant (p<0.05). For these outcomes we also examined the effect of COVID-19 on hospitalization/mortality risks as stratified by the follow-up period. Three periods were specified, namely <3 months, 3-6 months and >6 months. The PH assumption were tested again after stratification by follow-up time. Moreover, we also evaluated continuous time-dependent coefficients (https://cran.r-project.org/web/packages/survival/vignettes/timedep.pdf), constructed using the time-transformation functionality of the R function coxph. Briefly, we assume the Cox regression coefficient is time-varying and a linear function of log(t), i.e. β(t) = a + b log(t). From this we may also calculate the time taken for the regression coefficient (i.e. the adverse effects of infection) to wane to zero. The CI for the “time to no effect” was calculated by the delta method. We hypothesized that the risk would be most prominent in the initial period after infection and would wane afterwards.
Results
Overview
In this study, we found that severe COVID-19 was associated with elevated hazards ratios (HR) of hospital admission and mortality due to various diseases involving many body systems. These patterns were consistent in a series of sensitivity analyses, and after PTDM and PERR adjustment.
The four sets of analysis are summarized in Table 1 and the number of subjects for each model is presented in Table 2.
Compared to subjects with no known history of COVID-19, association of severe COVID-19 with hospitalization from other disorders (severe disease vs uninfected)
Severe infection was associated with higher hazards of hospitalization and mortality due to a broad array of diseases (median follow-up=608 days). The main results are listed in Table 3 and Fig.2 and full results in Table S3. Here we primarily present the results based on Cox regression model, without further adjustments by PTDM or PERR. Considering all hospitalizations (regardless of history), severe COVID-19 was significantly associated with elevated hazards of hospitalization involving multiple body systems, as described below.
Respiratory diseases
Elevated hazards of hospitalization due to respiratory conditions after severe COVID-19 infection were observed. Increased risk of hospital admission was observed for pneumonia (except that caused by tuberculosis) (Hazard ratio [HR]=2.886 (95% Confidence Interval: 2.381-3.498); FDR-adjusted p-value (p.adj)=2.70e-25) (figures in the parentheses organized in the same way below), chronic obstructive pulmonary disease and bronchiectasis (3.013 (2.206-4.114); p.adj=1.06e-10), aspiration pneumonitis (7.332 (5.142-10.453); p.adj=3.06e-26), respiratory failure/insufficiency/arrest (7.442 (4.109-13.480); p.adj=8.58e-10), pneumothorax (12.799 (5.485-29.867); p.adj=7.07e-08), and other specified and unspecified lower respiratory disease (2.694 (1.941-3.739); p.adj=6.07e-08), which includes interstitial pulmonary disorder, hypostatic pneumonia without specified organism, rheumatoid lung disease, etc.
Cardiovascular diseases
We observed increased hazards of hospital admission from several cardiovascular/cerebrovascular conditions, including acute myocardial infarction (AMI) (2.369 (1.606-3.494); p.adj=1.34e-04), nonspecific chest pain (1.843 (1.133-2.998); p.adj=4.68e-02), cardiac dysrhythmias (2.162 (1.524-3.066); p.adj=1.45e-04), heart failure (2.797 (1.831-4.273); p.adj=2.30e-05), cerebral infarction (3.028 (2.034-4.508); p.adj=7.99e-07), acute hemorrhagic cerebrovascular disease (4.244 (2.524-7.138); p.adj=8.17e-07), peripheral and visceral vascular disease (3.166 (1.913-5.240); p.adj=7.52e-05) and hypotension (3.512 (2.415-5.109); p.adj=1.16e-09).
Genitourinary diseases
There was evidence for increased hazards of hospitalization from genitourinary disorders, including the following conditions: nephritis/nephrosis/renal sclerosis (6.508 (2.731-15.507); p.adj=2.14e-04), acute and unspecified renal failure (5.285 (4.089-6.832); p.adj=5.07e-35), chronic kidney disease (CKD) (2.711 (1.727-4.256); p.adj=1.40e-04), urinary tract infection (UTI) (3.262 (2.635-4.037); p.adj=1.44e-25), calculus of urinary tract (2.284 (1.403-3.716); p.adj=4.63e-03), other specified and unspecified diseases of kidney and ureters (e.g. hydronephrosis, obstructive and reflux uropathy, cyst) (2.693 (1.597-4.540); p.adj=1.34e-03).
Nervous system disorders
Increased hazards of hospitalization from neurological disorders were evident and included Parkinson’s disease (3.877 (2.056-7.312); p.adj=2.49e-04), headache (including migraine) (1.968 (1.150-3.366); p.adj=4.60e-02), neurocognitive disorders (4.364 (3.098-6.146); p.adj=1.39e-15), transient cerebral ischemia (2.649 (1.565-4.484); p.adj=1.79e-03), coma, stupor and brain damage (9.848 (4.045-23.972); p.adj=6.03e-06) and nerve and nerve root disorders (2.804 (1.301-6.046); p.adj=3.14e-02).
Digestive conditions
Our results showed increased hazards of hospitalization due to digestive disorders after severe COVID-19, including oesophageal disorders (1.626 (1.100-2.405); p.adj=4.98e-02), pancreatic disorders (excluding diabetes) (2.980 (1.731-5.132); p.adj=6.20e-04), hemorrhage of digestive tract (1.979 (1.414-2.770); p.adj=5.40e-04), infectious and non-infectious diseases of digestive tract (2.052 (1.176-3.581); p.adj=3.99e-02). Elevated hazards of hospitalization due to other specified and unspecified gastrointestinal disorders (2.895 (1.612-5.199); p.adj=2.24e-03) was also observed.
Musculoskeletal disorders
Severe COVID-19 were associated with increasedhazards of hospitalization from osteomyelitis (5.095 (2.524-10.283); p.adj=5.79e-05), musculoskeletal pain (2.593 (1.955-3.439); p.adj=8.77e-10) and low back pain (2.444 (1.615-3.701); p.adj=2.15e-04). Higher hospitalization risks from initial encounters of fracture were observed, including fracture of the spine and back (3.643 (1.893-7.010); p.adj=7.80e-04), the upper limb (2.526 (1.631-3.912); p.adj=2.85e-04), the lower limb (except hip (3.368 (1.769-6.413); p.adj=1.44e-03) and the neck of the femur (hip), (3.708 (2.604-5.282); p.adj=1.12e-11). High hazards of hospital admission were also seen in subsequent encounter of above-mentioned fracture and other injury sequelae, like dislocation, sprain, etc.
Neoplasms
History of severe COVID-19 was associated with increased hazards of hospitalization from several neoplasms, including stomach cancers (3.787 (1.820-7.879); p.adj=2.22e-03), colorectal cancers (2.664 (1.842-3.853); p.adj=2.78e-06), brain cancers (5.443 (2.345-12.636); p.adj=6.11e-04), secondary malignancies (2.370 (1.540-3.646); p.adj=6.42e-04) and neoplasms of unspecified nature or uncertain behaviour (2.912 (1.727-4.908); p.adj=4.83e-04).
Other infections
History of severe COVID-19 was associated with increased hazards of hospitalization from septicemia (4.089 (3.238-5.164); p.adj=2.71e-30), bacterial infections (4.040 (3.275-4.983); p.adj=7.66e-37), as well as parasitic and other unspecified infections (2.559 (1.793-3.653); p.adj=3.20e-06).
Other sequelae
We also observed increased hazards of hospitalization from nutritional anemia (1.568 (1.151-2.137); p.adj=1.81e-02), aplastic anemia (1.887 (1.364-2.611); p.adj=8.82e-04), metabolic disorders, like fluid and electrolyte disorders (2.746 (1.817-4.150); p.adj=1.96e-05), depressive disorders (5.329 (2.380-11.929); p.adj=3.93e-04), and general poor wellbeing, such as syncope (2.455 (1.722-3.500); p.adj=8.58e-06), fever (3.343 (1.840-6.073); p.adj=5.70e-04), and other general signs and symptoms (7.383 (6.797-8.019); p.adj<9.60e-231), abnormal findings without diagnosis (3.605(2.352-5.526); p.adj=7.56e-08).
When we restricted the outcome to hospitalization from new-onset or recurrent diseases, most of our results were similar (see Fig.S1). Higher risks of hospitalization from a few additional disorders were found when we performed analysis on hospitalizations from new-onset diseases, including coronary atherosclerosis and other heart diseases (2.248 (1.195-4.229); p.adj=4.19e-02), benign neoplasms (1.576 (1.121-2.215); p.adj=3.24e-02), and other specified hereditary and degenerative nervous system conditions (4.763 (1.851-12.257); p.adj=6.10e-03).
We also repeated the analysis using Poisson regression which models the incidence rate of hospitalization. The results were similar to those from Cox regression, with similar significant associations observed (Table S3).
Compared to subjects with mild COVID-19, association of severe COVID-19 with hospitalization from other disorders (severe vs mild infection)
Here we compared the risk of hospitalization (from other disorders) of severe vs mild cases within the infected individuals (Table 4 and Fig.2). Compared with mild COVID-19, higher hazards of hospitalization from multiple diseases were observed after severe infection (median follow-up: 248 days), including for example, hospitalization from aplastic anaemia (hazard ratio [HR]= 3.606 (95% CI: 1.854-7.014); FDR-adjusted p-value (p.adj)= 1.08e-03), acute myocardial infarction [AMI] (HR=3.674 (1.788-7.549); p.adj=2.36e-03), cardiac dysrhythmias (HR=4.364 (2.232-8.531); p.adj=1.54e-04), cerebral infarction (HR=6.307 (2.331-17.061); p.adj= 1.79e-03), hypotension (HR=3.965 (1.867-8.423); p.adj= 2.07e-03), acute and unspecified renal failure (HR=7.009 (3.744-13.119); p.adj= 2.38e-08), chronic kidney disease [CKD] (HR=4.769 (1.970-11.542); p.adj= 3.01e-03), urinary tract infections [UTI] (HR=4.533 (2.915-7.048); p.adj= 4.85e-10), calculus of urinary tract (HR=2.992 (1.321-6.777); p.adj= 3.16e-02), neurocognitive disorders (HR=10.081 (4.201-24.193); p.adj= 3.21e-06), pneumonia (except that caused by tuberculosis) (HR=5.226 (3.446-7.924); p.adj= 2.46e-13), chronic obstructive pulmonary disease and bronchiectasis [COPD] (HR=3.633 (1.904-6.935); p.adj= 6.71e-04), and aspiration pneumonitis (HR=3.573 (1.921-6.644); p.adj= 4.66e-04). We also observed increased risk of hospitalization from several digestive disorders, infectious conditions, injury and general signs/symptoms, which are listed in Table 4.
Compared to subjects with no known history of COVID-19, association of mild COVID-19 with hospitalization from other disorders (mild infection vs uninfected)
We found from this analysis (model D) that history of ‘mild’ (i.e. non-hospitalized) COVID-19 were associated increased hazards of hospital admission due to musculoskeletal pain (excluding low back pain) (HR=1.937 (1.463-2.566); p.adj=4.29e-05), aspiration pneumonitis (HR=2.933 (1.716-5.012); p.adj=6.25e-04), and other general signs and symptoms (like edema, abnormal weight loss, anorexia, and etc.) (HR=1.353 (1.163-1.574); p.adj= 6.59e-04), as shown in Table S3 and Fig.2.
Higher hazards of all-cause mortality (1.237 (1.037-1.476); p.adj=2.27e-02) and mortality due to neurocognitive disorders (9.100 (5.590-14.816); p.adj=7.33e-18) were also observed.
Compared with subjects with no known history of COVID-19, association of any COVID-19 infection with hospitalization from other disorders (any infection vs uninfected)
The results from this analysis (model C) were largely similar to those from model B. Details can be found in Table S3 and Fig.2. In addition to findings from model B, we also found increased hazards of hospitalization from essential hypertension (3.061 (1.397-6.705); p.adj=2.09e-02), obesity (6.609 (2.263-19.298); p.adj=3.10e-03), hematuria (9.131 (3.184-26.186); p.adj=3.27e-04), osteoarthritis (1.515 (1.267-1.813); p.adj=5.75e-05), muscle disorders (6.564 (2.390-18.022); p.adj=1.66e-03), lung disease due to external agents (4.211 (1.558-11.384); p.adj=1.88e-02), abdominal pain and other digestive/abdomen signs and symptoms (1.591 (1.098-2.306); p.adj=4.79e-02) and circulatory signs and symptoms (1.645 (1.104-2.451); p.adj=4.88e-02).
Association of severe COVID-19 with hospitalization from other disorders, with PTDM adjustment
Results with PTDM adjustment are listed in Table S3 and Fig.S2. Since PTDM pushed the start-date of the unexposed to a later date, the number of events was smaller for the unexposed. We relaxed our results presenting criteria (cell count>=3 for at least one cell). Under the PTDM adjustment, increased risks of hospitalization were still observed for multiple pulmonary disorders, such as pneumonia (except that caused by tuberculosis) (3.612 (2.944-4.432); p.adj=1.75e-32), COPD (4.325 (3.102-6.029); p.adj=4.13e-16), aspiration pneumonitis (7.796 (5.303-11.460); p.adj=2.00e-23), respiratory failure; insufficiency; arrest (8.559 (4.559-16.070); p.adj=8.64e-10), pneumothorax (19.199 (7.665-48.088); p.adj=9.23e-09), lung disease due to external agents (8.660 (2.210-33.939); p.adj=1.55e-02).
In addition, increased hazards of hospital admission from cardiovascular and several other diseases were observed, such as AMI (2.167 (1.459-3.217); p.adj=1.56e-03), other and ill-defined heart disease (10.610 (2.696-41.752); p.adj=7.15e-03), cardiac dysrhythmias (2.195 (1.539-3.130); p.adj=2.27e-04), heart failure (2.688 (1.731-4.173); p.adj=1.74e-04), cerebral infarction (3.274 (2.167-4.945); p.adj=4.94e-07), acute hemorrhagic cerebrovascular disease (4.251 (2.480-7.288); p.adj=3.36e-06), peripheral and visceral vascular disease (3.986 (2.354-6.747); p.adj=5.93e-06), hypotension (3.002 (2.034-4.431); p.adj=8.06e-07), acute phlebitis; thrombophlebitis and thromboembolism (8.753 (3.037-25.224); p.adj=7.91e-04), nephritis; nephrosis; renal sclerosis (7.471 (2.967-18.814); p.adj=3.07e-04), acute and unspecified renal failure (5.321 (4.051-6.989); p.adj=5.60e-31), chronic kidney disease (5.915 (3.766-9.291); p.adj=6.69e-13); mental disorders (schizophrenia spectrum and other psychotic disorders (9.199 (3.108-27.224); p.adj=8.14e-04)), and rheumatoid arthritis/related disease (2.992 (1.512-5.923); p.adj=1.34e-02). Elevated risks of hospitalization due to some other infectious diseases, injury, neoplasms, and general poor well-being were also observed.
Association results with PERR adjustment
The primary results with PERR adjustment (with prior exposure period restricted to 30-Jan-2018 to 30-Jan-2020) are listed in Table 5 and Fig.3.
With the PERR adjustment, compared to those with no known history of COVID-19, severe COVID-19 was significantly associated with increased hazards of hospital admission due to pulmonary disorders, like pneumonia (except that caused by tuberculosis) (2.528 (1.725-3.705); p= 1.98e-06), aspiration pneumonitis (2.521 (1.099-5.786); p= 2.91e-02), respiratory failure; insufficiency; arrest (5.496 (1.593-18.966); p= 7.01e-03); and extra-pulmonary system diseases, like AMI (2.432 (1.045-5.658); p= 3.91e-02), cardiac dysrhythmias (2.219 (1.249-3.941); p= 6.57e-03), cerebral infarction (4.120 (1.989-8.535); p= 1.39e-04), peripheral and visceral vascular disease (3.331 (1.129-9.825); p= 2.93e-02), acute and unspecified renal failure (4.941 (2.918-8.364); p= 2.73e-09), UTI (3.569 (2.228-5.716); p= 1.20e-07), neurocognitive disorders (2.786 (1.160-6.689); p= 2.19e-02), transient cerebral ischemia (2.829 (1.080-7.409); p= 3.42e-02), osteomyelitis (8.640 (1.909-39.096); p= 5.12e-03), and low back pain (2.023 (1.097-3.728); p= 2.40e-02). Under PERR adjustment, we also found that higher hospitalization risks due to various infections, endocrine disorders, fever, and other general signs.
When compared with mild (non-hospitalized) COVID-19, severe COVID-19 illness (i.e. model A) were associated with higher hazards of hospitalization due to pneumonia (except that caused by tuberculosis) (2.170 (1.140-4.129); p= 1.83e-02), cardiac dysrhythmias (3.524 (1.288-9.646); p= 1.42e-02), acute and unspecified renal failure (4.025 (1.260-12.854); p= 1.87e-02), UTI (2.296 (1.142-4.614); p= 1.96e-02) and injury and intestinal infection.
Considering any (severe/mild) infection (model C), it was also observed that COVID-19 was associated with elevated hazards of hospitalization from COPD (1.635 (1.010-2.648); p= 4.55e-02), hypotension (2.069 (1.142-3.751); p= 1.66e-02), osteoarthritis (1.388 (1.101-1.750); p= 5.50e-03), and other general respiratory symptoms, in addition to most associations found under model B.
We note that the PERR adjustment tends to be conservative in general in our case. By the assumptions of PERR, the prior events should have no effect on the risk of exposure22; however here the history of many diseases could increase the risk of (severe) COVID-19 infection. As a result, the hazard ratio of hospitalization (for infected vs non-infected groups) will be higher than expected even in the pre-COVID-19 period, since the (severely) infected group are more likely to be hospitalized for diseases even in the pre-COVID period. This will lead to a conservative bias, as also shown in previous simulation studies22.
Association with mortality after severe COVID-19
The primary results of associations with mortality after COVID-19 (without PTDM adjustment) are listed in Table 6 and Fig.4 and full results in Table S4. We found that severe COVID-19 was associated with increased all-cause mortality (HR=14.700 (13.835-15.619); p.adj<9.6e-231); significantly increased risk of mortality was also observed for mild disease, albeit with an attenuated effect size (1.237 (1.037-1.476); p.adj= 2.27e-02).
Considering cause-specific mortality (without considering the past history of disease), severe infection was associated with mortality due to pneumonia (except that caused by tuberculosis) (3.404 (1.434-8.083); p.adj= 7.55e-03), COPD (8.448 (5.434-13.134); p.adj= 3.23e-20), AMI (2.658 (1.432-4.931); p.adj= 2.89e-03), coronary atherosclerosis and other heart disease (2.700 (1.549-4.707); p.adj= 8.25e-04), Parkinson’s disease (3.540 (1.947-6.439); p.adj= 7.37e-05), multiple sclerosis (19.784 (7.588-51.582); p.adj= 4.76e-09), neurocognitive disorders (13.406 (8.729-20.588); p.adj= 4.28e-31), and death due to several cancers (see Table 6).
Associations with mortality after any infection (model C) are largely similar. When we only considered fatalities from those with a history of the disease, most associations were non-significant, likely due to the small number of events and lower statistical power. Associations with mortality with PTDM adjustment is similar to those without PTDM adjustment (Table S4 and Fig.S3).
Analysis restricted to sequelae 15 or 30 days post infection
In an additional sensitivity analysis, we restrict outcomes that occur 15 or 30 days after the date of infection (Table S7). The overall patterns of sequelae were similar.
Analysis restricted to the pre-vaccination period
Association of all hospitalization and mortality after COVID-19 before the launch of vaccination (i.e. prior to 8 Dec 2020) are presented in Tables S5 (also see Fig.S4) and S6 (also see Fig.S5) respectively. The spectrum of sequelae remained similar. As for mortality, increased hazards of all-cause mortality and mortality due to specific common causes such as coronary atherosclerosis and other heart disease, acute hemorrhagic cerebrovascular disease, COPD, and other cancers were also observed when the analysis was limited to the ‘pre-vaccination’ period.
When comparing the HR for various sequelae in the pre-vaccination vs the whole follow-up (FU) periods, the estimated HR for hospitalization in the pre-vaccination period was generally higher (Table S6b). For example under model B, out of the 65 disorders included in this analysis, 55 showed higher HRs for hospitalization in the pre-vaccination era (binomial p, one-tailed= 5.88E-09); similar result was observed for model C (62 out of 73 disorders showing higher HRs in the pre-vaccination period, p=4.55E-10). For individual disease categories, for example, we found higher HRs in the pre-vaccination period for circulatory disorders (8 out of 9) and respiratory disorders (5 out of 5) and several other categories (model B; please refer to Table S6b for details). Note that we restricted to disorders with at least 5 events in each group such that a meaningful comparison can be made. For ‘other general signs and symptoms’, the HR in the pre-vaccination period was substantially higher when compared to the estimate based on the whole FU period, with non-overlapping CIs. In addition, we observed that for all-cause mortality (under model A/B/C), HR of the pre-vaccination period were substantially higher than the estimate from the whole FU period, with non-overlapping CIs and significant p-values (p=5.62E-03, 1.3eE-21 and 1.40E-42 under model A, B and C respectively). We computed standard error (SE) of the difference in coefficients using non-parametric bootstrap. We note however that the HR estimates of the pre-vaccination period were generally less precise, due to smaller number of events.
Associations results stratified by vaccination status
The associations with sequelae stratified by vaccination status are presented in Table S8 and Fig.S6. We will focus on the comparison between subjects with vaccination before (and to a lesser extent, after) the infection against those infected but have not received any vaccination. A few results were nominally significant. For the analysis on breakthrough infections vs infection without vaccination, we observed lower risks of hospitalization from pneumonia, unspecified lower respiratory diseases and other general signs/symptoms from the former group. We observed increased rates of hospitalization from a few non-specific causes, including low back pain, non-specific chest pain and cataracts. However, such results may not reflect genuine causal relationships as the exact reason for admission is unknown (e.g. admission could be for an elective operation). The most significant difference was observed for general signs and symptoms (p=1.83e-15; lower HR for breakthrough infections).
We then compared those vaccinated after infection against those infected and unvaccinated. The former group had lower hazards of hospitalization from injuries, pneumonia, bacterial infections, septicaemia and other general signs and symptoms, under the original model without PTDM adjustment.
Nevertheless, for subjects who received vaccinations after being infected, we only consider events occurring after vaccination, when comparing with infected but unvaccinated subjects (for whom we will consider all events after infection). As such, the PTDM adjustment may produce more reasonable results. No results were nominally significant under PTDM except for urinary tract infections which was marginally significant.
Associations results stratified by follow-up period (time lapsed after infection)
The proportional hazards assumption was violated in some analyses (considering all hospitalization as outcome) (Table S9). We performed time-stratified and time-transformation analysis to estimate time-dependent hazard ratios. As for the analysis stratified by follow-up period, we observed larger effects from COVID-19 infection and more statistically significant associations when restricting analysis of outcome to the first 3 months, and in general the effects of infection attenuated over time (Table S10 and Fig S7). Nevertheless, it is noteworthy that even after 6 months, infections and severe infections in particular, were still associated with elevated hospitalization risks from numerous conditions. These included, for example, aspiration pneumonitis, acute renal failure, injuries and fractures, and various infections such as urinary tract infections, septicaemia, pneumonia and other bacterial infections (Table S10). The results of tests for proportional hazards for time-stratified analysis are shown in Table S11. Analyses using time-transformation are presented in Table S12. For the majority of outcomes, the baseline risk was the highest, and the risk decreased over time. However, the estimated time taken (assuming a linear reduction in risk on the scale of log(t)) for the adverse effects of infection to wane to zero was generally over 6 months. As an example, the median time taken for the deleterious effects of severe infection (i.e. model B) to wane to zero was 403.8 days. Results on mortality using time-split and time-transformation analyses are presented in Table S13.
Patient and public involvement
Patients were not involved in the conception, design or implementation of the study. No patients were involved in data interpretation or preparation of this paper. This work was conducted to address research concerns in view of the escalating COVID-19 pandemic, and we believe is of public health interest, although no patients/public was directly involved due to the need to complete the study in a timely manner.
Discussion
Overview
Overall, we found that COVID-19, especially severe disease, were associated with increased risks of hospitalization and/or mortality due to pulmonary, cardiovascular, digestive, neurological, genitourinary and musculoskeletal disorders in the post-infection period. These results were largely consistent and robust to multiple sensitivity analysis, and after PERR and PTDM adjustment.
Interpretation of findings in the context of previous studies
We noted that a recent study by Al-Aly et al. has characterized the post-acute sequelae of COVID-19 in US veterans9 for a range of outcomes, however the study was on veterans only which may affect its generalizability to the general population and females. The focus of Al-Aly et al. was also somewhat different from ours, as they focused on incident (new) diagnoses post-infection, while we studied hospitalization due to new or recurrent diseases. As such, a person with history of a disorder (e.g. stroke) who was admitted for the same disease post-infection will be considered by our study but not in Al-Aly et al. We also studied mortality due to all disorders covering all systems.
In this study we utilized a large prospective cohort of UKBB to delineate the septum of hospital admission and mortality cause up to 20 months after COVID-19. We have conducted a very comprehensive set of analyses, which includes all hospitalization and mortality from diseases covering all systems, as well as stratified analysis on new-onset and recurrent diseases. We also performed other sensitivity analyses such as PTDM and PEER adjustments. The PEER adjustment for example can help to reduce residual confounding and evaluate causal relationships between COVID-19 and possible sequelae, which has not been employed in the above study9.
A number of studies have been performed on the longer-term sequelae of COVID-19 (mostly focused on specific organ systems), and have revealed elevated risks for a variety of disorders, which corroborate with our findings in general. For example, impaired lung function and respiratory conditions are among the most commonly reported sequelae of COVID-193,4. Two previous studies on US veterans8,9 have revealed that COVID-19 was associated with a wide range of cardiovascular disorders including dysrhythmias, inflammatory and ischemic heart diseases, thrombotic disorders, heart failure, cerebrovascular conditions and others. The current study also revealed a similar spectrum of cardiovascular in a general population cohort based on an independent sample. Cerebrovascular disorders were also commonly reported as complications of COVID-19, as reviewed elsewhere25,26. Also, renal involvement is not uncommon during the acute phase of infection, and subclinical inflammation may persist for many months27,28. Deterioration of renal function following acute infection has been reported11. Although the mechanisms remain poorly understood, neurocognitive impairment was observed for patients after recovery from COVID-1929-31. In a large-scale retrospective cohort study32, it was reported that the incidence of most neurological and psychiatric disorders were raised after COVID-19, compared to those with influenza or other respiratory tract infections, especially for the most severe COVID-19 cases requiring intensive care support.
We also observed that COVID-19 patients were more likely to be admitted for other bacterial or viral infections post-infection. Elevated risk of hospitalization was not only observed for pneumonia, but also other infective conditions such as septicemia, peritonitis and intra-abdominal abscess, intestinal infection, urinary tract infections (UTI), osteomyelitis, and other bacterial infections. The exact mechanisms remain unclear, however for post-influenza infections, it has been suggested that secondary infections could be due to disruption of airway epithelium owing to immune-mediated damage and dysfunctional innate immune responses33,34. Since corticosteroids are often prescribed for severe COVID-19 infections, it may be worthwhile to investigate whether steroid use is associated with increased risks of other infections35.
Another interesting finding is that we observed increased risk of hospitalization from several musculoskeletal problems, especially fractures. Increased risks of fall may be one important contributing factor, which in turn could be due to myriad reasons such as malaise and fatigue, musculoskeletal pain, muscle weakness, neurocognitive impairment, poor general well-being and the increased incidence of other disorders that may increase fall risk post-infection9,36. Another contributing factor may be increased risks of osteoporosis due to COVID-19 itself37 and/or its treatment (e.g. steroids), but further clinical studies are required. We also noted increased risks of hospitalization from several neoplasms, which may be due to, for example, deconditioning and increased risk of secondary infections after the acute COVID-19 infection, given that it is unlikely that COVID-19 would affect the incidence of neoplasms.
While this is an observational study, it is interesting to note that Mendelian randomization studies, which are less prone to confounding, have also shown that COVID-19 may be a causal risk factor for various cardiometabolic and neuropsychiatric disorders38,39.
The mechanisms underlying COVID-19 sequelae are still largely unknown, and different symptoms/disorders may also involve different pathophysiology. It has been postulated that, for example, persistent reservoirs of the virus, on-going inflammation, dysregulated immune response and re-activation of pathogens may all contribute to the long-term sequelae40,41.
Whether vaccination may ameliorate the risk of COVID-19 sequelae is another important question. Interestingly, we observed preliminary evidence that the risks of sequelae were in general higher in the pre-vaccination period, and there was a higher-than-expected proportion of diseases which showed larger HR in the pre-vaccination era. For all-cause mortality, HR was also significantly higher before the introduction of COVID-19 vaccines. Moreover, we observed that for some sequelae (e.g. non-COVID-19 pneumonia, septicaemia, bacterial infections, other general signs/symptoms), subjects who are vaccinated before or after infection had lower risks than infected patients who are not vaccinated.
However, this study is not primarily designed to study COVID-19 vaccinations, and the results require replications in future studies. This study is observational in nature, and the sample size may still be inadequate to capture uncommon events. We cannot exclude the possibility that apparently higher risks of sequelae in the pre-vaccination era could also be due to other time-varying factors (e.g. different circulating variants or treatment strategies). However, our study period was before the emergence of Omicron, and other variants during the post-vaccination period, such as Alpha and Delta, are likely to be of at least similar or higher pathogenicity than the original strain of virus42. Anti-viral medications such as Paxlovid and molnupiravir were not yet available in the study period. In summary, it is encouraging to observe preliminary evidence that vaccination may reduce the risk of sequelae, but the findings should be regarded as tentative before more evidence are available.
Clinical implications
Our findings may have important clinical implications. We observed that COVID-19, especially severe infections requiring hospitalization, is associated with a wide variety of pulmonary and extra-pulmonary sequelae. Alarmingly, elevated risk of admission and mortality was observed for multiple disorders spanning many organ systems. Patients who have recovered from especially severe COVID-19 may therefore require appropriate follow-up, monitoring and risk assessment for different complications. It remains to be investigated what types of interventions may help to reduce the risk of sequelae.
From a public health perspective, when determining policies related to the pandemic (such as lifting restrictions), it may be advisable to consider the total burden caused by COVID-19 that includes its possible complications, and the disabilities and mortalities due to these complications. Our findings also call for allocation of resources to prevent and treat the relevant complications.
Another alarming finding is that even mild (non-hospitalized) COVID-19 may be associated with modestly increased risk of all-cause mortality (HR=1.237, 95% CI 1.037-1.476) and mortality from neurocognitive disorders, as well as hospital admission from a few disorders such as aspiration pneumonitis, musculoskeletal pain and other general signs/symptoms. Such findings are in line with previous studies25,26 which also showed elevated risks of cardiovascular and other sequelae even for mild COVID-19 cases. In Al-Aly et al.9, it was reported that mild (non-hospitalized) COVID-19 had elevated mortality risks (HR=1.59 (1.46-1.73)) beyond the first 30 days of disease, which corroborates with our findings. The difference in the effect size could be due to heterogeneity of the two samples43, different gender proportions, and lengths of follow-up etc.
Given the huge number of people infected by COVID-19, even a modest rise in the hazard for mortality would imply a large (absolute) number of fatalities. Similarly, the absolute burden and disabilities caused by hospitalizations related to mild infections could be large. This suggests containment of infection is still important together with efforts to reduce the number of severe cases.
Strengths and limitations
The major strength of this study included large sample size of the UK Biobank with detailed health records and sufficient period of follow-up. This study is very comprehensive and covers most clinically relevant categories of diagnosis. In addition to any hospitalization/mortality, we also performed additional analysis on hospitalization with/without relevant history of disease (i.e. new-onset and recurrent diseases). Apart from these, we also made use of advanced statistical methods to evaluate whether our findings are consistent to different modeling strategies. For example, the PERR adjustment was employed to minimize unmeasured confounding.
There are several limitations of this study. Firstly, this is an observational study. Although we have included many known confounders in the model and used other methods (e.g. PERR) to minimize residual confounding, confounding effects cannot be excluded completely and causality cannot be concluded. However, many results remained significant after PERR adjustment which provides further support to our findings. Secondly, subclinical disorders may be underestimated and some disorders may be too rare to be included for further analysis. In addition, due to the relatively small number of events for some disorders, absence of significant associations can also be due to inadequate statistical power. Small number of events (e.g. mortality from specific disorders) may also lead to relatively imprecise effect estimates and wide confidence intervals. We also note that mild or asymptomatic infections may not be detected and considered as having no history of COVID-19; however this is likely to result in a conservative bias. Variant information is not known and thus cannot be analyzed, although we expect the Alpha and Delta variants are likely the predominant variants. Further studies are required to investigate the sequelae from other variants, such as Omicron. Finally, UKBB participants are likely healthier and of higher socioeconomic status44, and generalizability of current findings to other populations (e.g. other ethnicities and those aged<50) may require further studies.
Conclusions
In conclusion, this study reveals increased risk of hospitalization and mortality from a wide variety of pulmonary and extra-pulmonary diseases after COVID-19, especially for those with severe disease. On the other hand, we also observed increased all-cause mortality and hospitalization risk from several disorders after mild infections. The findings may have important clinical and public health implications given the huge number of people infected by COVID-19. Proper monitoring and assessments for the risk of sequelae may be warranted for patients recovered from severe COVID-19 in particular. Further studies are required to replicate the current findings and to investigate the mechanisms underlying the sequelae, and interventions for prevention and treatment.
Data Availability
The UK Biobank data is available to all registered researchers upon application. All other data produced in the present work are contained in the manuscript.
Author Contributions
Conception and design: HCS (lead), with input from YX. Study supervision: HCS. Funding acquisition: HCS. Methodology: HCS (lead), YX. Data curation: YX (lead), RZ, JQ. Data analysis: YX (lead), RZ, JQ. Data interpretation: HCS, YX, RZ, JQ. Preparation of first draft of manuscript: YX, HCS, with input from RZ and JQ.
Supplementary Information
All supplementary Tables and notes are available at the journal’s website and at https://drive.google.com/drive/folders/1E2hsN0E7wwuTa7mqxLHmAizz9j_Cxc?usp=sharing
Conflicts of interest
The authors declare no relevant conflicts of interest.
Conflicts of interest
The authors declare no conflict of interest.
Acknowledgements
This work was supported partially a National Natural Science Foundation China (NSFC) grant (81971706) and the Lo Kwee Seong Biomedical Research Fund from The Chinese University of Hong Kong and the KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, China. We thank Prof. Pak Sham for support on data access. We also thank Prof. Stephen Tsui and Prof. Simon Lui for useful discussions.
Footnotes
Additional figures and analysis have been added.