Abstract
Ethnic differences in the risk of severe COVID-19 may be linked to household composition. We quantified the association between household composition and risk of severe COVID-19 by ethnicity for older individuals. With the approval of NHS England, we analysed ethnic differences in the association between household composition and severe COVID-19 in people aged 67 or over in England. We defined households by number of generations living together, and used multivariable Cox regression stratified by location and wave of the pandemic and accounted for age, sex, comorbidities, smoking, obesity, housing density and deprivation. We included 2 692 223 people over 67 years in wave 1 (01/02/2020-31/08/2020) and 2 731 427 in wave 2 (01/09/2020-31/01/2021). Multigenerational living was associated with increased risk of severe COVID-19 for White and South Asian older people in both waves (e.g. wave 2, 67+ living with 3 other generations vs 67+ year olds only: White HR 1·61 95% CI 1·38-1·87, South Asian HR 1·76 95% CI 1·48-2·10), with a trend for increased risks of severe COVID-19 with increasing generations in wave 2. Multigenerational living was associated with severe COVID-19 in older adults. Older South Asian people are over-represented within multigenerational households in England, especially in the most deprived settings. The number of generations in a household, number of occupants, ethnicity and deprivation status are important considerations in the continued roll-out of COVID-19 vaccination and targeting of interventions for future pandemics.
Funding This research was funded in part, by the Wellcome Trust. For the purpose of open access, the author has applied a CC-BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
Introduction
The composition of a household - the age and number of its members - is a key determinant of infection risk for many infections, including COVID-19.1–5 Households with multiple generations may be at higher risk of infection due to increased routes of household introduction, with increased contact between older adults and working age adults of particular concern. Differences in the proportion of multigenerational households by ethnicity may be an underlying factor in the disproportionate effect that COVID-19 has had on ethnic minority groups in the UK.1,6–11 Analysing how multigenerational living affects the risk of severe COVID-19 in people of retirement age (over 67 in the UK) across wave 1 and wave 2 of the pandemic could improve our understanding of drivers of ethnic disparities in COVID-19 outcomes.
Using the OpenSAFELY platform, we sought to assess whether (1) household composition was associated with severe COVID-19 in older people within individual ethnic groups after accounting for potential confounders and (2) whether any association between household composition and severe COVID-19 and other potential household-level explanatory factors differed by ethnicity.
Research in context
Evidence before this study
We searched PubMed on 28 March 2022 for population-based studies examining the association between household composition and severe COVID-19 in older people. Keywords included (COVID-19 OR coronavirus OR SARS-CoV-2) AND (household) AND (mortality OR hospitalisation OR severe). We identified two studies, one from the UK and one from Sweden, both of which reported increased COVID-19 mortality in older people associated with multigenerational living. However, the study from Sweden was not able to assess whether results differed by ethnicity and did not account for comorbidities or investigate the role of other household-level variables (such as IMD or household size). The UK study did perform analysis by ethnicity but was based upon historical (2011 census) household composition data and did not examine effects separately by pandemic wave.
Added value of this study
Our study is the first to use UK linked NHS primary and secondary electronic health records from February 2020 onwards to study the association between household composition and both hospitalisation and mortality due to COVID-19 in older people separately for wave 1 and wave 2 of the pandemic. Additionally, it is the only population-based study that uses up to date records to account for sociodemographic characteristics and clinical comorbidities while studying effects across ethnic groups, and is the first to allow the impact of increasing numbers of generations in a household on severe COVID-19 to be assessed. This has allowed us to demonstrate that living with more younger generations was associated with increased hazard of severe COVID-19 for White and South Asian older people in both waves, with a trend of increasing severe COVID-19 with increasing generations in wave 2. We were also able to demonstrate strong evidence that the effect of deprivation on severe COVID-19 was substantially greater for South Asian people compared to White people, an effect not seen for non-COVID outcomes, and that despite there being over ten times the number of older White people in England than older South Asian people, nearly twice the number of South Asian older people live in multigenerational houses in the most deprived settings. For older occupants of these households, rates of severe COVID-19 during wave 2 were higher than those for older people with multiple (established severe COVID-19 risk factor) comorbidities.
Implications of all the available evidence
Multigenerational living is associated with severe COVID-19 in White and South Asian older people in England. Deprivation has a larger effect on severe COVID for South Asian people than White people, suggesting that disparities in COVID outcomes by ethnicity cannot be explained by socioeconomic factors alone. Older South Asian people are over-represented within multigenerational households in the most deprived settings, and experienced very high rates of severe COVID-19 during wave 2. Consideration of ethnicity, number of generations in a household (and/or total number of occupants in a household) and deprivation is warranted in planning booster vaccination roll-out and public health interventions (for future COVID-19 variants and other respiratory pandemic viruses).
Methods
Study design and population
We used linked primary care electronic health record data for 24 million people in England from the OpenSAFELY-TPP platform (see supplementary material). We extracted separate study populations for waves one (1 February 2020 to 31 August 2020) and two (1 September 2020 to 31 January 2021). We selected people aged 67 years or older at the start of each wave to represent a population (1) at or over retirement age and (2) at risk of severe COVID-19 due to their age. We wanted to assess if people of retirement age were at a differential risk of severe COVID-19 if they lived with younger generations who were more likely to be working (or attending educational or childcare settings). Participants were followed-up until the earliest of: the outcome of interest, deregistration from their general practice, death from any cause, or the end date for each wave. Our pre-specified study protocol (https://github.com/opensafely/hh-classification-research/tree/master/docs) and (post hoc) deviations from the protocol (supplementary table S1) are available.
Inclusion and exclusion criteria
Participants with at least 3 months of follow-up before the study start date for each wave were included, to ensure adequate capture of baseline factors. We used a TPP-developed pseudonymised household identifier which links people living at the same address on 1 February 2020,12 (see supplementary methods). We excluded people with no household identifier, households with anyone flagged as living in a care home (with care home status derived by matching addresses to Care Quality Commission data13), those living with more than 12 people (possible care homes or other institutions), and those living with more than 4 people who were all over the age of 67 (possible care homes).13 We also excluded people with missing sex, location (based on Middle Layer Super Output Area - MSOA) or index of multiple deprivation (indicators of poor data quality when missing).
Exposure
We classified distinct generations as 0-17 year olds, 18-29 year olds, 30-66 year olds and 67+ year olds before assigning each 67+ year old in our study to one of the following five household composition categories:
67+ living alone: 67+ year old living alone (single occupancy household)
Multiple 67+ year olds: 67+ year old living with up to three other 67+ year olds (reference category)
1 other generation: 67+ year old(s) living with people from just ONE other younger generation
2 other generations: 67+ year old(s) living with people from TWO other younger generations
3 other generations: 67+ year old(s) living with people from all THREE other younger generations
The primary exposure was the household composition where each 67+ year old was resident on 1st February 2020.
Outcomes
The primary outcome was “severe COVID-19” defined as COVID-19 related hospital admission or death between 1 February 2020 and 31 August 2020 (for wave 1), and 1
September 2020 and 31 January 2021 (for wave 2):
Hospital admission with COVID-19: a COVID-19 ICD-10 code for confirmed (U07.1) or suspected (U07.2) COVID-19 in the primary diagnosis field in Secondary Use Service (SUS) data
COVID-19-related death: a COVID-19 ICD-10 code for confirmed or suspected COVID-19 anywhere on the death certificate.
We also analysed each of the above outcomes separately, and included a secondary outcome of “non-COVID-19 death” (death from any other cause on the death certificate) to assess specificity of results.
Covariates
Ethnicity
We investigated the effect of household composition within GP-recorded census ethnicity categories of White, South Asian, Black, mixed and other. Detailed census categories (Table S2) were analysed for any ethnicity where household composition was associated with severe COVID-19.
Other covariates
We included categorical age in years (67-69, 70-74, 75-79, 80-84, 85+), sex, body mass index categories (kg/m2) (underweight, normal, overweight, obese I, obese II, obese III), index of multiple deprivation quintiles (from 1 - most affluent to 5 - most deprived), geographic region defined by Upper Tier Local Authority (UTLA), smoking status (current, former, never), housing density (ONS Rural/Urban classification in categories of urban major/minor conurbation, urban city and town, rural town, rural village) and number of comorbidities shown to be associated with poor COVID-19 outcomes (0, 1 or 2+)14 (see Supplementary Material Tables S3 and S4).
Statistical analysis
Descriptive analysis
Participant characteristics at baseline were summarised by household composition category separately for wave 1 and for wave 2.
Household composition and severe COVID-19/non-COVID-19 death by ethnicity
We used multivariable Cox proportional hazards regression with robust standard errors to account for clustering by household in order to estimate differences by household composition in the hazard of severe COVID-19 and non-COVID-19 death. All models were stratified by UTLA region to account for region-specific variation in infection rates over time,15 and separate analyses were performed for wave 1 and wave 2. An interaction between household composition and ethnicity was included in all models, with all household composition results reported by ethnicity. We hypothesised that associations between other covariates and severe COVID-19 would vary by ethnicity, so performed likelihood ratio tests for interaction (LRT) between each covariate and ethnicity (including interactions as appropriate in our final models). Where deprivation or housing density interacted with ethnicity, we also present results stratified by ethnicity for these variables.
We assessed the proportional hazards assumption by testing for a zero slope in the scaled Schoenfeld residuals and through graphical inspection of plots of the Schoenfeld residuals against time.
Analysis of the impact of household size (number of occupants)
In our main analysis of household composition we decided a-priori not to adjust for household size (see supplementary methods). Instead, to assess how the hazard of severe COVID varied by number of occupants within categories of household composition and vice versa, we cross-tabulated results from a combined household composition-household size exposure variable (Supplementary Table S6).
Absolute rates of severe COVID-19 and non-COVID-19 death by household characteristics
Rates of all outcomes were reported by ethnicity and (1) household composition (2) household size (3) any other household-level explanatory variables found to differ by ethnicity and (4) those with two or more comorbidities (for comparison).
Sensitivity analyses
We tested the impact of including a 5 year “buffer” between the 67+ year old generation and the next youngest generation (to avoid a 67+ year old living with a 61-66 year old being considered multigenerational). We assessed the impact of only including people who lived in households with 100% TPP coverage (i.e. all adults in the household were registered with TPP - see supplementary methods). Our main analysis was a complete ethnicity records analysis - we performed a sensitivity analysis applying multiple imputation to account for missing ethnicity (10 imputations). In our main analysis people with missing body mass index were assumed to be normal weight, and those with missing smoking data were assumed to be never-smokers (on the assumption that obesity and smoking would likely be recorded if present) - a complete records sensitivity analysis for BMI and smoking was performed.
Software and Reproducibility
This analysis was delivered through the OpenSAFELY platform - see supplementary material for details.
Patient and Public Involvement
We have developed a publicly available website https://opensafely.org/ through which we invite any patient or member of the public to contact us regarding this study or the broader OpenSAFELY project.
Role of the funding source
The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.
Results
Descriptive results
From a total of 23 696 832 individuals in the OpenSAFELY database on 1 February, 2020, there were 2 692 223 people aged 67 or over (referred to as: “67+”) at the beginning of wave 1 who met the selection criteria (Figure S1). Of these 1 109 443 (41·2%) lived with other 67+, 920 670 (34·2%) lived alone, 526 037 (19·5%) lived with one other younger generation, 113 553 (4·2%) lived with two other younger generations and 22 540 (0·8%) lived with three other younger generations (Table 1). The final wave 2 cohort was slightly larger (2 731 427 people), with no change in relative proportion of household compositions (Table 1).
London had the highest proportion of 67+ living in multigenerational houses (15·8%), with a much lower proportion in the North West (3·4%) (Figure 1). Younger 67+ and those living in urban areas were more likely to live with multiple younger generations (Table 1). A notably larger proportion of South Asian (66%) and Black people (49%) 67+ were living with one or more other generations than White 67+ (23%) (Figure 1). Household composition by ethnicity for wave 2 was similar to wave 1 (Supplementary Table S7).
Household composition and severe COVID-19 by ethnicity
Wave 1
After accounting for age, sex, comorbidities, housing density, deprivation status, obesity and smoking, and including interactions between ethnicity and household composition and age (Table S8), White 67+ living alone had an increased risk of severe COVID-19 compared to the reference group (HR 1·35 95% CI 1·30-1·41) (Figure 2 (W1)). There was a small increase in hazard for 67+ living with any other younger generation (e.g. 67+ 2 other generations: HR 1·22 95% CI 1·10-1·35) (Figure 2 (W1). For South Asian 67+, there was also an increase in hazard of severe COVID-19 in those living alone (HR 1·47 95% CI 1·18-1·84), and living with either 2 or 3 generations (e.g. 3 other generations HR 1·41 95% CI 1·09-1·83). For all other ethnicities wide confidence intervals limited interpretation, but estimates were generally consistent with an increased HR for multigenerational living (Figure 2 (W1).
For White people, the association of household composition with non-COVID-19 death was similar to the association with severe COVID-19 while for South Asian people associations were specific to COVID-19 (e.g. non-COVID-19 death 67+ & 2 other generations: HR 0·98, 0·83-1·15 (Figure 2 (W1)). Despite wide confidence intervals, results for other ethnicities were consistent with those for South Asian people.
Wave 2
After accounting for sex, comorbidities, housing density and smoking, and including interactions between ethnicity and: household composition, age, deprivation status and obesity (Table S8), multigenerational living for White people in wave 2 was associated with higher hazards of severe COVID-19 compared to wave 1 (e.g. 67+ & 3 other generations: wave 2 HR 1·61 95% CI 1·38-1·87) (Figure 2 (W2)) and there was evidence for an increasing hazard of severe COVID-19 with increasing number of generations (Figure 2 (W2)). For South Asian people, a similar trend was observed (Figure 2 (W2)), with increases in HRs for all categories of multigenerational living compared to wave 1 (e.g. 67+ & 3 other generations wave 2 HR 1·76 95% CI 1·48-2·10) (Figure 2 (W2)).
For South Asian people, the specificity of effect persisted and increased for Wave 2 (e.g. Wave 2, 67+ & 3 other generations: severe COVID-19 HR 1·76 95% CI 1·48-2·10, non-COVID-19 death HR 1·12 95% CI 0·87-1·46) (Figure 2 (W2)). For Black and Mixed ethnicities, although wide confidence intervals limited interpretation, increased multigenerational living seemed to be associated with an increased hazard of non-COVID-19 death but not severe COVID-19 (Figure 2 (W2)).
Results for the separate severe COVID-19 outcomes (death and hospitalisation), for the detailed (16-level) ethnicity analysis and for the analysis of the impact of household size are provided in the supplementary material.
There was no evidence of deviations from the proportional hazards assumption for wave 1 or wave 2 (Table S10).
Other household-level variables and severe COVID-19 by ethnicity
There was strong evidence that the effect of deprivation on severe COVID-19 was different for White 67+ year olds compared to South Asian 67+ year olds in wave 2 (IMD most deprived vs least deprived - White: HR 1·65 95% CI 1·58-1·72, South Asian: HR 2·46 95% CI 2·00-3·03) (Table S10). This difference was specific to severe COVID-19 (Table S11).
Absolute effects
Over 13% of South Asian people lived in multigenerational households in the most deprived settings (i.e. 67+ & 3 generations in the 5th deprivation quintile) compared to less than 1% of White people (Table 2). This meant that despite South Asian people making up only 3·2% of the total study population (compared to 94·7% White) (Table 1), there were a larger number of South Asian 67+ year olds (2841) than White 67+ year olds (2377) living in the highest risk household composition arrangement. South Asian people in this group experienced nearly triple the rate of severe COVID-19 (11802 per 100 000 person years) than White people in this group (4293 per 100 000 person-years) (Table 2), and the rate of severe COVID-19 for South Asian people in this group was higher than in those with two or more comorbidities (9905 per 100 000 person years) (Table 2).
Sensitivity analyses
All sensitivity analyses had minimal impact on results (Table S12).
Discussion
Principal findings
Multigenerational living was associated with increased risk of severe COVID-19 among older South Asian and White people in waves 1 and 2 of the SARS-CoV-2 pandemic in England, with evidence for a trend in increasing risk of severe COVID-19 with increasing number of generations in wave 2. For other ethnicities, results were consistent with harmful effects for wave 1 only. For South Asian older people, results were highly specific to COVID-19 and the effect of deprivation on severe COVID-19 outcomes was greater in South Asian than in White people. Very high rates of severe COVID-19 were observed for older people in multigenerational households in the most deprived settings. Despite there being over ten times the number of White people in England than South Asian people, nearly twice as many South Asian people live in multigenerational households in the most deprived settings compared to White people.
Comparison with other studies
Our findings are consistent with two studies of older people (from the UK and from Sweden) that analysed COVID-19 mortality alone 6,11 and found increased risks associated with multigenerational living. Another UK study found that living with children was a risk factor for COVID-19 mortality in wave 2,12 and a study in the UK Biobank found an increased risk of severe COVID-19 in larger households during combined wave 1 and 2 (in younger people than studied here).16 A number of studies have reported ethnic disparities in COVID-19 outcomes in the UK.17–21 Of these, none studied age-based generational household composition by ethnic group and only two studies analysed effects over both of the first two waves of the pandemic.17,21
Our study complements and advances these previous studies by using both up-to-date household composition and covariate data from a very large population-representative sample of England to illustrate that as the number of generations in a household increased, the risk of severe COVID-19 for older people increased, and that effects were particularly pronounced for White and South Asian ethnicities during wave 2. Absolute rates of severe COVID-19 in wave 2 were very high for people living in multigenerational houses in the most deprived settings, and particularly for South Asian older people.
Strengths and limitations
Our study is the first to use up-to-date household and covariate information to study household composition and severe COVID-19 in England by ethnicity, and was able to analyse effects over the first two waves where lockdown restrictions differed. The key strengths of this study are the scale, detail and completeness of the underlying health record data. We could therefore assess whether there was an increasing trend for COVID-19 harms in older people living with increasing numbers of generations by ethnicity, and assess the impact of other potential household-level explanatory variables (household size and deprivation).
Our study had a number of potential weaknesses. Firstly, 12% of our cohort who did not have a household identifier were excluded. Furthermore, for the main analysis 22% of the cohort were not included due to missing ethnicity data, although conclusions were identical when applying multiple imputation to account for missing ethnicity and the distribution of household composition classification by ethnicity was similar to Census data from 2011 (Table S13). There may have been some misclassification of household composition due to including some houses where not all occupants were registered at GP practices using TPP software, although our sensitivity analysis including only 100% TPP households had no impact on results. We were not able to account for people moving house during the pandemic, as household occupancy was determined only once (in February 2020). Finally, there were a number of potential explanatory factors that we could not include in this analysis, such as occupation22 and household crowding.
Interpretation
Our results suggest that during wave 1 there were harmful effects of living with younger generations that were specific to COVID-19 for South Asian and other ethnicities, but no different to the effect on non-COVID-19 death for White older people. Possible explanations include the (non-specific) health benefits of multigenerational living,10,23 that strict lockdown intervention messaging was particularly effective at reaching White people,24 or that younger White people were more able to adhere to the messaging due to (e.g.) type of occupation. Older White people in households with more generations may have been less impacted specifically by COVID-19 as they had younger co-occupants who were able to provide support to the older occupant but were not coming into contact with other people outside of their own households. For the other ethnic groups, there may have been more contact between households (or in workplaces) during wave 1.24
For wave 2, increased multigenerational living had a large and specific effect on severe COVID-19 for South Asian older people. It is likely that as restrictions eased there were differences by ethnicity in key risk factors for transmission such as occupation type,25 inter-household mixing, religion, and experiences of structural racism.17 This could have led to increased exposure for South Asian people at work (or in education), and also for any South Asian older people in multigenerational houses. The fact that the effect of deprivation on severe COVID-19 was greater for South Asian older people than for White people provides further confirmation that differences in COVID-19 outcomes by ethnicity cannot be explained by deprivation alone.26
Finally, although the relative effects of multigenerational living were similar across White and South Asian groups during wave 2, the absolute effect differed substantially, with over one and a half times more people living in the highest risk household/deprivation category (n=5587 South Asian vs n=3415 White). As this group experienced over twice the rate of severe COVID-19 than their White counterparts and the rate was comparable to that in people with multiple established comorbidity risk factors, it is highly likely that these household-level characteristics are contributing to previously observed imbalances in severe COVID-19 outcomes by ethnicity that particularly impact South Asian people.17,21
Conclusions
Multigenerational living was associated with severe COVID-19, particularly for South Asian and (to a lesser extent) White older people in both waves, with results consistent with harmful effects for other ethnicities in wave 1 only. The established COVID-19 risk factor of deprivation has a greater effect on serious COVID-19 outcomes for South Asian older people than for White older people. Older people in households with more than 3 other younger generations (or with six or more occupants) in the most deprived settings experienced particularly high rates of severe COVID-19. Household characteristics, ethnicity and socioeconomic deprivation are important considerations alongside individual risk factors when considering the roll-out of COVID-19 vaccination boosters and the targeting of interventions for COVID-19 and future pandemics.
Data Availability
All data were linked stored and analysed securely within the OpenSAFELY-TPP platform: https://opensafely.org/. Data include pseudonymised data such as coded diagnoses medications and physiological parameters. No free text data are included. All code is shared openly for review and re-use under MIT open license [https://github.com/opensafely/hh-classification-research]. Detailed pseudonymised patient data is potentially re-identifiable and therefore not shared.
Conflicts of Interest
All authors will complete the ICMJE uniform disclosure form following peer review.
Funding
The OpenSAFELY data science platform is funded by the Wellcome Trust.
BG’s work on clinical informatics is supported by the NIHR Oxford Biomedical Research Centre and the NIHR Applied Research Collaboration Oxford and Thames Valley. BG’s work on better use of data in healthcare more broadly is currently funded in part by: the Wellcome Trust, NIHR Oxford Biomedical Research Centre, NIHR Applied Research Collaboration Oxford and Thames Valley, the Mohn-Westlake Foundation; all DataLab staff are supported by BG’s grants on this work. RM holds a Sir Henry Wellcome fellowship. AS is employed by LSHTM on a fellowship sponsored by GSK. HF holds a UKRI fellowship. KB holds a Wellcome Senior Research Fellowship (220283/Z/20/Z). EW holds grants from MRC. EH holds a fellowship from NIHR. NG is a full-time employee of Aetion, Inc AYSW holds a fellowship from BHF. BMK is also employed by NHS England working on medicines policy and clinical lead for primary care medicines data. IJD holds grants from NIHR and GSK. RME is funded by HDR UK, the MRC and the HPRU in Modelling and Economics.
The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, Public Health England or the Department of Health and Social Care.
Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.
Data access and verification
Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. The data in OpenSAFELY is drawn from General Practice data across England where TPP is the Data Processor. TPP developers initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. DataLab developers and PIs holding contracts with NHS England have access to the OpenSAFELY pseudonymised data tables as needed to develop the OpenSAFELY tools. These tools in turn enable researchers with OpenSAFELY Data Access Agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code. All code for the full data management pipeline—from raw data to completed results for this analysis—and for the OpenSAFELY platform as a whole is available for review at https://github.com/opensafely/hh-classification-research.
The data management and analysis code for this paper was led by KW and contributed to by DG, RM, HPG, EN, AS, HF, KB, EW, HJC, CTR, AYSW.
Acknowledgements
We are very grateful for all the support received from the EMIS and TPP Technical Operations team throughout this work, and for generous assistance from the information governance and database teams at NHS England / NHSX.