Abstract
Understanding COVID-19 and its risk factors in the Portuguese population is critical to combat this condition. To study the impact of multimorbidity in the population with COVID-19 infection, we performed a descriptive analysis of a dataset extracted from all reported confirmed cases of COVID-19 in Portugal until June 30, 2020. We observed a prevalence of multimorbidity of 6.77% among the 36,244 infected patients. Each additional morbidity was associated with increased odds of hospitalization, ICU admission or mortality (OR 2.22; 95% CI: 2.13-2.32). Further studies should confirm these findings, and special attention should be paid to data collection to ensure proper recording of patient comorbidities.
1 Introduction
COVID-19 is an infectious disease caused by SARS-CoV-2 that started challenging health systems in recent months. Its impact is global, with 21,200,000 cases and 761,000 deaths by August 16, 2020 [1]. Early evidence from this pandemic suggested that older patients with chronic conditions were over-represented and that these characteristics may herald a poor clinical course. However, data are still largely limited to China and Italy [2, 3]. Given that multimorbidity increases with age, the specific risk of different chronic conditions and their combination in terms of poor health outcomes needs to be adjusted for age, which is not routinely done [4]. Understanding which groups are at higher risk is important to better inform public health policies and resource allocation, and to advance knowledge about this novel condition.
DGS (Directorate-General of Health), the Portuguese public health authority, developed and operates SINAVE (National Epidemiological Surveillance System), the national public health surveillance system, to collect, update, analyze, and disclose data related to infectious diseases subject to mandatory reporting and other public health hazards [5]. Symptoms, previous medical history, disease course and laboratory results are entered by the physicians in charge of COVID-19 patients. Although this platform was not specifically designed for this outbreak, it has been used since the beginning of the pandemic as the main official source of data about COVID-19 cases in Portugal.
In this work, we evaluate the prevalence of multimorbidity and age-adjusted risk of hospitalization, ICU admission and death in the Portuguese population from official data, based on a dataset extracted from SINAVE containing all confirmed cases of COVID-19 in Portugal by June 30, 2020.
This paper updates an initial (April) version of the DGS/SINAVE dataset, which is based on cases reported until April 28, 2020. Our analysis is based on the dataset spanning the full period (June version), which adds two more months and provides more data about the initial cases. We will only discuss the April version to address variations between the two datasets.
2 Methods
2.1 Sample
This retrospective observational study used data provided by DGS after the required institutional and ethical approvals. The sample consists of all individuals in Portugal with confirmed SARS-CoV-2 infection, as notified by clinicians by June 30, 2020. A broad range of clinical and demographic variables is present in this dataset. In this study we specifically used age, gender, hospital admission, admission to an intensive care unit, mortality and the patient’s underlying conditions.
2.2 Measures
Chronic conditions were provided as a categorical variable recording the presence, absence or unknown status of the following conditions: asthma, malignancy, cardiac disorder (including hypertension and other cardiovascular diseases), chronic hematological disorder, diabetes, HIV/other immune deficiency, renal disease, liver disease, chronic lung disease, and neuromuscular disorder. A field containing the “raw” free-text input from doctors was also taken into account: a text-mining script that searches for keywords associated with the above-mentioned conditions was developed in order to represent the prevalence of the diseases more accurately. We defined multimorbidity as the presence of two or more conditions in the same individual, following the definition used by other authors [6]. The outcomes addressed were hospitalization, ICU admission, and reported death. A composite outcome of any of these events was also analyzed.
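The keyword-based merging of structured flags and free-text input described above can be sketched as follows. This is only an illustration under stated assumptions: the keyword lists, the function names, and the English keywords are hypothetical, since the study's actual script and keywords are not given in the text.

```python
import re

# Hypothetical keyword lists; the keywords used in the study are not reported.
KEYWORDS = {
    "diabetes": ["diabetes", "diabetic"],
    "asthma": ["asthma"],
    "cardiac disorder": ["hypertension", "heart failure"],
    "renal disease": ["renal", "kidney"],
}

def conditions_from_text(raw_text):
    """Return the set of conditions whose keywords appear in the free text."""
    text = raw_text.lower()
    found = set()
    for condition, words in KEYWORDS.items():
        # \b anchors the keyword at a word start, so "renal" does not match "adrenal".
        if any(re.search(r"\b" + re.escape(w), text) for w in words):
            found.add(condition)
    return found

def merge_conditions(structured, raw_text):
    """Combine structured 'present' flags with conditions mined from free text."""
    present = {c for c, status in structured.items() if status == "present"}
    return present | conditions_from_text(raw_text)

def has_multimorbidity(conditions):
    """Multimorbidity: two or more conditions in the same individual."""
    return len(conditions) >= 2
```

For example, a record flagged only for diabetes whose free text mentions hypertension would count as multimorbid after merging, which it would not on the structured field alone.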
2.3 Statistical analysis
We used Python 3.6 with the NumPy, pandas, Seaborn, and SciPy packages, in combination with Microsoft Excel, to evaluate and plot the prevalence of multimorbidity. Additionally, IBM SPSS Statistics was used to obtain the age-adjusted risk of hospitalization, ICU admission and death.
Categorical variables were presented as counts and percentages with 95% confidence intervals, and comparisons were made using the χ2 test. Univariate regression analysis of each individual chronic condition was performed adjusting only for age. A multivariate logistic regression was performed adjusting for age and every other chronic condition with a statistically significant association on univariate analysis. Results were considered significant when P < 0.05.
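As an illustration of the χ2 comparison and the P < 0.05 threshold, the sketch below computes a Pearson χ2 test for a 2x2 table using only the Python standard library. It is not the authors' code (the study reports using SciPy and SPSS), it applies no continuity correction, and the counts are made up.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Made-up counts: rows = condition present/absent, columns = outcome yes/no.
stat, p = chi2_2x2(30, 70, 10, 90)
significant = p < 0.05
```

The closed-form p-value only holds for one degree of freedom; larger tables would need a general χ2 distribution function.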
3 Results
The overall sample contained 36,244 adult patient cases, of whom 56.66% were women. Among the cases, 18.79% had at least one chronic condition. Cardiac disorder was the most commonly reported condition, present in 43.33% of the patients with any morbidity. Table 1 shows the reported prevalence of different chronic conditions in the studied population. Multimorbidity, as previously defined, was present in 6.77% of the cases. Figure 1 and Figure 2 plot the prevalence of multimorbidity by age group for the COVID-19 infected general population and hospitalized population, respectively. To analyze the odds ratios and the prevalence of co-occurring pairs of chronic diseases, patients with unknown comorbidity status were excluded, which resulted in a population of 33,283 adult patients. The prevalence of co-occurring pairs of chronic health conditions, plotted in Figure 4, shows cardiac disorders and diabetes as the most common dyad of chronic diseases.
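The pairwise co-occurrence analysis can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names and the toy records are hypothetical, and patients with unknown comorbidity status are assumed to have been removed beforehand.

```python
from collections import Counter
from itertools import combinations

def pair_prevalence(patients):
    """Percentage of patients carrying each unordered pair of conditions.

    `patients` is a list of sets of condition names.
    """
    counts = Counter()
    for conditions in patients:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        for pair in combinations(sorted(conditions), 2):
            counts[pair] += 1
    total = len(patients)
    return {pair: 100.0 * n / total for pair, n in counts.items()}

def conditional_prevalence(patients, row, column):
    """Percentage of patients with `column` who also have `row`."""
    with_column = [p for p in patients if column in p]
    if not with_column:
        return 0.0
    return 100.0 * sum(row in p for p in with_column) / len(with_column)

# Toy records, not study data.
toy = [
    {"cardiac disorder", "diabetes"},
    {"cardiac disorder"},
    {"asthma", "cardiac disorder", "diabetes"},
    set(),
]
prevalence = pair_prevalence(toy)
most_common_dyad = max(prevalence, key=prevalence.get)
```

`conditional_prevalence` corresponds to a row-given-column reading of a co-occurrence matrix, while `pair_prevalence` uses the whole population as the denominator.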
Table 1. Percentage of the total population affected by each comorbidity.
Table 2. Odds ratios for the outcome (Death) for the Age variable and the analyzed comorbidities. The actual value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.097 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 9.7%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 3. Odds ratios for the outcome (Hospitalization) for the Age variable and the analyzed comorbidities. The actual value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.056 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 5.6%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 4. Odds ratios for the outcome (ICU stay) for the Age variable and the analyzed comorbidities. The actual value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.052 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 5.2%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 5. Odds ratios for the outcome (Death + Hospitalisation + ICU stay) for the Age variable and the analyzed comorbidities. The actual value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.061 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 6.1%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Figure 1. Prevalence of multimorbidity by age group for the COVID-19 infected population. The lighter shade of blue represents the absence of conditions and the black line represents the prevalence of multimorbidity.
Figure 2. Prevalence of multimorbidity by age group for the COVID-19 infected hospitalized population. The lighter shade of blue represents the absence of conditions and the black line represents the prevalence of multimorbidity.
Figure 3. Prevalence of multimorbidity by age group (Laires et al. 2019) [6] using data from the fifth National Health Interview Survey. The lighter shade of blue represents the absence of conditions and the black line represents the prevalence of multimorbidity.
Figure 4. Prevalence of single (left) and co-occurring pairs (right) of chronic health conditions. Left: prevalence of each disease in the population with at least one disease. Right: prevalence of the disease (rows) in the population affected by another disease (columns). All values are presented as percentages.
Data regarding hospitalization and ICU admission were available for only 32,945 patients (90.90% of the overall study population). Within this population, hospitalization occurred in 12.89% of patients, with a male predominance (50.66%), and ICU admission was required for 4.11% of patients, with a female predominance (51.73%). Observed mortality was 3.19%. All chronic conditions except asthma were associated with increased risk of mortality and hospitalization (Tables 6 and 7, respectively). Age, diabetes, renal disease, lung disease, and neuromuscular disorders were all associated with increased risk of ICU admission (Table 8). Additionally, every additional chronic condition increases the odds of the composite outcome of death, hospitalization or ICU admission by 122.3% (OR 2.22; 95% CI: 2.13-2.32).
Table 6. Odds ratios for the association between age, categories of comorbidity and outcome (Death) in patients with COVID-19. The continuous value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.093 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 9.3%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 7. Odds ratios for the association between age, categories of comorbidity and outcome (Hospitalisation) in patients with COVID-19. The continuous value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.045 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 4.5%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 8. Odds ratios for the association between age, categories of comorbidity and outcome (ICU stay) in patients with COVID-19. The continuous value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.050 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 5.0%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 9. Odds ratios for the association between age, categories of comorbidity and outcome (Death + Hospitalisation + ICU stay) in patients with COVID-19. The continuous value of Age was used instead of an age group. Because Age is a continuous variable, an odds ratio of 1.050 implies that, as Age increases by 1, the odds of the patient having the outcome increase by 5.0%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
Table 10. Odds ratios for the association between age, number of comorbidities and outcome (Death + Hospitalization + ICU stay) in patients with COVID-19. Both Age and the number of comorbidities are continuous variables, i.e., an odds ratio of 2.223 implies that, as the number of comorbidities increases by 1, the odds of the patient having the outcome increase by 122.3%. B, S.E., and Wald are the unstandardized regression coefficient, its standard error, and the test statistic for the individual predictor variable, respectively.
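The arithmetic behind reading these odds ratios can be made explicit with a short sketch. An odds ratio multiplies the odds of the outcome, so the resulting change in probability depends on the baseline probability. The function names and the 10% baseline used in the example are illustrative assumptions, not values from the study.

```python
import math

def percent_change_in_odds(odds_ratio, units=1):
    """Percentage change in the odds for a `units`-step increase in the predictor."""
    return (odds_ratio ** units - 1.0) * 100.0

def coefficient_from_or(odds_ratio):
    """Logistic regression coefficient B corresponding to an odds ratio (B = ln OR)."""
    return math.log(odds_ratio)

def probability_after(baseline_probability, odds_ratio, units=1):
    """Outcome probability after multiplying the baseline odds by OR ** units."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)

# OR 2.223 per additional comorbidity (reported above).
one_more = percent_change_in_odds(2.223)      # about 122.3 (% increase in odds)
two_more = percent_change_in_odds(2.223, 2)   # ORs compound multiplicatively
# Hypothetical 10% baseline probability, chosen only for illustration.
p_new = probability_after(0.10, 2.223)
```

With the hypothetical 10% baseline, one additional comorbidity raises the probability to roughly 20%, i.e. it about doubles rather than increasing by 122.3%; the two readings only coincide approximately when the outcome is rare.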
4 Discussion
This study shows that multimorbidity is significantly associated with adverse outcomes in COVID-19 infection in the Portuguese population, independently of age. All chronic conditions, except asthma, are associated with an increased risk of hospitalization. However, only diabetes, chronic kidney disease, chronic respiratory diseases, and neuromuscular disorders are associated with more severe cases requiring ICU admission. These results are in line with previous reports where chronic diseases are associated with poorer outcomes [3, 7]. Although the strength of association differs between diseases, every additional morbidity is associated with an increased risk of the composite outcome of hospitalization, ICU admission and mortality.
Multimorbidity was previously studied in the general Portuguese population in 2014 by Laires et al., using data from the fifth National Health Interview Survey (Inquérito Nacional de Saúde, INS) [6]. The prevalence of multimorbidity in the Laires et al. study can be compared with our data. Although only individuals aged 25-79 were included in that study, it constitutes a robust sample for morbidity prevalence in Portugal. As expected, Figure 1 to Figure 3 show a rise in chronic health conditions with increasing age. However, multimorbidity is much less prevalent in our study population (6.39% vs 43.9%). Since the beginning of this pandemic there has been significant public awareness regarding the higher risk of older people with comorbidities. This may have induced efforts to protect and isolate this population. The findings of this study could thus confirm the positive effect of such measures, given the younger and healthier population in the COVID-19 dataset. The discrepancy may, however, be related to different reporting methods. For instance, the maximum number of significant reported simultaneous morbidities in the COVID-19 infected population was 5 disorders (there were a total of 5 people with simultaneous morbidities ranging from 6 to 8), which is half of the maximum number of morbidities found in the INS population (10 disorders). Since the total number of conditions in the two datasets is not so different (COVID-19: 10 diseases; INS: 13 diseases), a possible explanation for the higher number of co-occurring conditions in the INS population is the combination of self-diagnosis with the presence of more “subjective” disorders such as lower and upper back pain, allergies, depression, and urinary incontinence.
This study has several limitations. First of all, the cross-sectional nature of the COVID-19 dataset makes it impossible to account for incomplete outcomes, since several patients could ultimately be hospitalized or die after the end of observation. Reported outcomes may therefore be underestimated, so careful interpretation is advised until more data are available. More importantly, despite the fact that no standard set of conditions is established to define multimorbidity, chronic conditions were given in broad groups and there is no specific information on individual conditions. For example, diabetes is given as one group and no distinction is made between type 1 and type 2 diabetes. Therefore, measured morbidities may represent heterogeneous groups of diseases with different degrees of severity, which may influence outcomes. Future datasets should include more accurate information on chronic conditions.
Another important concern is the risk of under-reporting, which becomes evident when analyzing reported cardiac diseases. Given that cardiovascular diseases, particularly hypertension, are very prevalent in the Portuguese population [8], the observed prevalence of 8.14% in our study strongly suggests that under-reporting may have occurred. In addition, the prevalence of reported cardiac diseases has increased considerably since the April version of the DGS dataset, which showed a much lower cardiac disease prevalence in the general population (0.28%). Surprisingly, cardiovascular diseases are absent from the available list of previous conditions in SINAVE’s reporting page, which could have contributed to a lower notification of this comorbidity (see Figure 6b). One possible explanation for the considerable increase in cardiac disorder prevalence is the fact that “raw” input from doctors was included in the June version of the DGS/SINAVE dataset.
Figure 5. Relationship between the observed and expected prevalence of co-occurring pairs of chronic health conditions (Laires et al. 2019) [6] using data from the fifth National Health Interview Survey. All results are given in percentages. The shaded bar depicts the prevalence of each chronic health condition. In the matrix, the first value for each pair is the observed frequency, while the second (italic) is the expected one, obtained by multiplying the respective prevalences of the two disorders. Chi-square tests were used to determine whether observed frequencies were significantly different from expected frequencies. All p-values were below 0.05. LBP: low back pain; HTA: hypertension; UBP: upper back pain; UI: urinary incontinence; COPD: chronic obstructive pulmonary disease; CKD: chronic kidney disease; CHD: coronary heart disease; MI: previous myocardial infarction; LC: liver cirrhosis.
Figure 6. Screenshots of SINAVE’s interface regarding the report of known comorbidities: (a) reporting the presence of comorbidities; (b) reporting the known comorbidities.
Although we acknowledge that the DGS/SINAVE dataset was not primarily generated for research, but rather for public health proceedings and government information, we believe that a better user interface design and a more rational set of chronic conditions could effortlessly improve the quality of recorded data. One important lesson we can learn from this pandemic is the important contribution that quickly gathered relevant clinical data and health information systems, such as SINAVE, can have in this setting.
Overall, we consider SINAVE to be very susceptible to under-reporting, especially regarding multimorbidity. This is due to a confusing and impractical interface that allows doctors to skip some important fields in the entry form. Some suggestions to improve future studies of multimorbidity using SINAVE’s data are:
Redesign the reporting form, making data entry more effective and faster. The current user interface could be simplified, while encouraging the input of relevant comorbidity information;
Integrate SINAVE data with the patient’s health record or with data from the new Trace Covid-19 system, to provide richer data relevant to COVID-19. This could be a way of implementing the redesigned user interface suggested above;
Make it mandatory to input if comorbidities are absent; if present, the filling of the entry form inputs related to comorbidities and the specific chronic conditions should be mandatory. As seen in Figure 6, the list of comorbidities only becomes visible if a previous parameter is filled;
Make the entry of all known chronic conditions by healthcare professionals more persuasive;
Add cardiovascular diseases to the list of comorbidities.
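The suggestion to make comorbidity fields mandatory could take the form of a validation rule on each case report. The sketch below is purely illustrative: the field names (`has_comorbidities`, `conditions`), the function name, and the messages are hypothetical and do not describe SINAVE's actual data model.

```python
def validate_comorbidity_entry(entry, allowed_conditions):
    """Return a list of validation errors for one case report (empty if valid).

    `entry` is a dict with hypothetical keys:
      "has_comorbidities": True, False, or None (question not answered)
      "conditions": list of condition names
    """
    errors = []
    answer = entry.get("has_comorbidities")
    conditions = entry.get("conditions") or []
    if answer is None:
        # Absence must be stated explicitly instead of leaving the field blank.
        errors.append("State whether comorbidities are present or absent.")
    elif answer and not conditions:
        errors.append("Comorbidities marked present: list at least one condition.")
    elif not answer and conditions:
        errors.append("Conditions listed but comorbidities marked absent.")
    unknown = [c for c in conditions if c not in allowed_conditions]
    if unknown:
        errors.append("Unrecognised conditions: " + ", ".join(unknown))
    return errors
```

A rule of this kind would distinguish "no comorbidities" from "not filled in", which is the ambiguity behind the unknown-status records discussed in this paper.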
To conclude, findings in our study show that multimorbidity is significantly associated with poor outcomes in COVID-19 infection. Further data is needed to inform about the strength of this association and about the significance of observed differences in multimorbidity prevalence between infected patients and the general population of Portugal. We believe that data collection problems may have occurred and influenced outcome measurement. We also provide recommendations for improving the data collection user interface that could improve quality of health information about the COVID-19 infected population, while increasing confidence in SINAVE data.
Data Availability
This retrospective observational study used data provided by DGS after the required institutional and ethical approvals. The sample consists of all individuals in Portugal with confirmed SARS-CoV-2 infection, as notified by clinicians by June 30, 2020. A broad range of clinical and demographic variables is present in this dataset. In this study we specifically used age, gender, hospital admission, admission to an intensive care unit, mortality and the patient's underlying conditions.
Abbreviations
- SINAVE: Sistema Nacional de Vigilância Epidemiológica (National Epidemiological Surveillance System)
- COVID-19: Coronavirus disease 2019
- SARS-CoV-2: severe acute respiratory syndrome coronavirus 2
Keywords: multimorbidity, chronic conditions