Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7 =========================================================================== * Nicholas G. Davies * Christopher I. Jarvis * CMMID COVID-19 Working Group * W. John Edmunds * Nicholas P. Jewell * Karla Diaz-Ordaz * Ruth H. Keogh ## Abstract SARS-CoV-2 lineage B.1.1.7, a variant first detected in the United Kingdom in September 20201, has spread to multiple countries worldwide. Several studies have established that B.1.1.7 is more transmissible than preexisting variants, but have not identified whether it leads to any change in disease severity2. We analyse a dataset linking 2,245,263 positive SARS-CoV-2 community tests and 17,452 COVID-19 deaths in England from 1 September 2020 to 14 February 2021. For 1,146,534 (51%) of these tests, the presence or absence of B.1.1.7 can be identified because of mutations in this lineage preventing PCR amplification of the spike gene target (S gene target failure, SGTF1). Based on 4,945 deaths with known SGTF status, we estimate that the hazard of death associated with SGTF is 55% (95% CI 39–72%) higher after adjustment for age, sex, ethnicity, deprivation, care home residence, local authority of residence and test date. This corresponds to the absolute risk of death for a 55–69-year-old male increasing from 0.6% to 0.9% (95% CI 0.8–1.0%) within 28 days after a positive test in the community. Correcting for misclassification of SGTF and missingness in SGTF status, we estimate a 61% (42–82%) higher hazard of death associated with B.1.1.7. Our analysis suggests that B.1.1.7 is not only more transmissible than preexisting SARS-CoV-2 variants, but may also cause more severe illness. Most community SARS-CoV-2 PCR tests in England are processed by one of six national “Lighthouse” laboratories. Among the mutations carried by lineage B.1.1.7—also known as Variant of Concern (VOC) 202012/01—is a 6-nucleotide deletion that prevents amplification of the S gene target by the commercial PCR assay currently used in three of the Lighthouse labs1. By linking individual records of positive community tests with and without S gene target failure (SGTF) to a comprehensive line list of COVID-19 deaths in England, we estimate the relative hazard of death associated with B.1.1.7 infection. We define confirmed SGTF as a compatible PCR result with cycle threshold (Ct) < 30 for ORF1ab, Ct < 30 for N, and no detectable S (Ct > 40); confirmed non-SGTF as any compatible PCR result with Ct < 30 for each of ORF1ab, N, and S; and an inconclusive (missing) result as any other positive test, including tests processed by a laboratory incapable of assessing SGTF. ## Characteristics of the study population The study sample (**Extended Data Table 1**) comprises 2,245,263 individuals who had a positive community (“Pillar 2”) test between 1 November 2020 and 14 February 2021. Just over half of those tested (1,146,534, 51.1%) had a conclusive SGTF reading and, of these, 58.8% had SGTF. Females comprised 53.6% of the total sample; 44.3% were aged 1–34 years, 34.4% aged 35–54, 15.1% aged 55–69, 4.3% aged 70–84 and 1.9% aged 85 or older. The majority of individuals (93.7%) lived in residential accommodation (defined as a house, flat, sheltered accommodation, or house in multiple occupancy), with 3.1% living in a care or nursing home. Based on self-identified ethnicity, 74.0% were White, 13.6% were Asian, 4.6% were Black and 7.8% were of other, mixed or unknown ethnicity. All seven NHS England regions are represented, with the London region contributing 22.5% of tests and the South West 5.9%. The first three weeks of the study period (1–21 November) contributed 15.5% of the total tests, and the final three weeks (24 January–14 February) 12.8%. The period between 3–23 January contributed 31.6% of tests. In those with SGTF status measured, SGTF prevalence was similar in males and females but lower in the older age groups: 59.0% in 1–34-year-olds compared with 55.4% in those aged 85 and older. In keeping with these age patterns, SGTF prevalence was lower in individuals living in a care or nursing home (54.3%) than those in residential accommodation (58.8%). SGTF prevalence by self-identified ethnicity was 58.0% in the White group, 57.6% in the Asian group, 69.6% in the Black group, and 64.8% in the other, mixed, or unknown ethnicity group. SGTF prevalence was lowest in the most deprived index of multiple deprivation3 (IMD) quintile (53.9%). The highest prevalences of SGTF over the study period were observed in the East of England (77.5%), South East (77.3%) and London (75.4%) regions, and prevalence of SGTF was lowest in the North East and Yorkshire region (41.2%). The prevalence of SGTF increased steeply over time (**Fig. 1a**), from 5.8% during 1–21 November 2020 to 94.3% during 24 January–14 February 2021. ![Fig. 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F1.medium.gif) [Fig. 1.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F1) Fig. 1. Descriptive analyses. **a** The number of samples with and without SGTF by day from 1 November 2020 to 14 February 2021, the period covered by our main analysis. **b** Number of deaths within 28 days of positive test by specimen date included in the analysis. **c** Kaplan-Meier plot showing survival (95% confidence intervals) among individuals tested in the community in England with and without SGTF, in the subset with SGTF measured. Inset shows the full *y*–axis range. **d–i** Crude death rates (point estimates and 95% CIs) among SGTF versus non-SGTF cases (in the subset with SGTF measured, *n* = 1,146,534) for deaths within 28 days of positive test stratified by broad age groups and (**d**) sex, (**e**) place of residence, (**f**) ethnicity, (**g**) index of multiple deprivation, (**h**) NHS England region, and (**i**) specimen date. Horizontal bars show the overall crude death rates (95% CIs) by age group irrespective of SGTF status. Missing SGTF status was strongly associated with age and place of residence. The proportion with SGTF status missing was similar in age groups 1–34 (48.3%), 35–54 (47.8%) and 55–69 (48.2%), and then rose to 54.4% in the 70–84 age group and to 77.7% in the 85 and older age group. SGTF status was missing in 87.9% of tests for individuals living in a care or nursing home, compared to 47.4% of tests among individuals in residential accommodation. This is partly due to more extensive use of lateral flow immunoassay tests in care homes, which do not yield an SGTF reading. Missingness in SGTF status also differed substantially by NHS England region, ranging from 21.2% in the North West to 71.1% in the South West, which is largely explained by proximity to a Lighthouse lab capable of producing an SGTF reading (**Extended Data Fig. 1**). Missingness also depended on specimen date, with the percentage missing being lower for the earlier specimen dates and highest (54.4%) in the 21-day period that contributed the most tests (3–23 January). There were also minor differences in missingness by ethnicity and IMD. Of the 48.9% of tests with missing SGTF status, 5.1% were inconclusive due to high Ct values and the remaining 43.8% were not assessed for SGTF. 19,615 people in the study sample are known to have died (0.87% of 2,245,263). Crude death rates were substantially higher in the elderly and in those living in a care or nursing home (**Supplementary Table 1**). The standard definition of a COVID-19 death in England is any death occurring within 28 days of an individual’s first positive SARS-CoV-2 test; 17,452 of the observed deaths (89.0%) met this criterion (**Fig. 1b**). Among those with known SGTF status, the crude COVID-19 death rate was 1.86 deaths per 10,000 person-days of follow-up in the SGTF group, versus 1.42 deaths per 10,000 person-days in the non-SGTF group (**Fig. 1c**; **Extended Data Table 2**). Stratifying by broad age groups and by sex, place of residence, ethnicity, IMD, region, and specimen date, death rates within 28 days of a positive SARS-CoV-2 test were higher among SGTF than non-SGTF cases in 98 of the 104 strata assessed (94%; **Figs. 1d–i**; see also **Supplementary Table 2**). ## Cox regression analyses To estimate the effect of SGTF on mortality while controlling for observed confounding (**Extended Data Fig. 2**), we fitted a series of Cox proportional hazards models4 to the data. We stratified the analysis by lower tier local authority (LTLA) and specimen date to control for geographical and temporal differences in the baseline hazard—for example, due to changes in hospital pressure during the study period—and used spline terms for age and IMD and fixed effects for sex, ethnicity, and residence type. All models were fitted twice, once using complete cases only, i.e. by simply excluding individuals with missing SGTF status, and once using inverse probability weighting (IPW), i.e. accounting for missingness by upweighting individuals whose characteristics—age, sex, IMD, ethnicity, residence type, NHS England region of residence and sampling week—are underrepresented among complete cases. This analysis assumes that, holding these characteristics constant, whether an individual dies is independent of missingness in SGTF status5. For the complete-cases analysis, the estimated hazard ratio for SGTF was 1.55 (95% CI 1.39– 1.72), indicating that the hazard of death in the 28 days following a positive test is 55% (39– 72%) higher for SGTF than for non-SGTF cases. To assess the model assumption of proportional hazards, we added an interaction term between SGTF and time since positive test. There was strong evidence of non-proportionality of hazards (likelihood ratio test ![Graphic][1]; **Fig. 2a**; **Extended Data Fig. 3**), with the estimated time-varying hazard ratio increasing over time: 1.14 (0.92–1.40) one day after the positive test, 1.58 (1.42–1.75) on day 14, and 2.24 (1.75–2.87) on day 28. Adding higher-order functions of time into the interaction terms did not significantly improve model fit (likelihood ratio test ![Graphic][2]). We found no evidence that the effect of SGTF varied by age group (likelihood ratio test ![Graphic][3]), sex ![Graphic][4], IMD ![Graphic][5], ethnicity ![Graphic][6], or residence type ![Graphic][7]. We note, however, that the relatively small number of deaths among 1–34-year-olds over the study period (44 deaths) does not permit robust assessment of the impact of SGTF in this age group. Other time-covariate interactions suggested that the delay from positive test to death was slightly shorter among females, care home residents, and the elderly; see **Supplementary Note 1** for more details on models with interaction terms. ![Fig. 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F2.medium.gif) [Fig. 2.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F2) Fig. 2. Survival analyses. **a–d** Estimated hazard ratio of death (mean and 95% CIs) within 28 days of positive test for (**a**) SGTF, complete-cases analysis; (**b**) SGTF, IPW analysis; (**c**) *p*VOC, complete-cases analysis; and (**d**) *p*VOC, IPW analysis, in model stratified by LTLA and specimen date and adjusted for the other covariates. **e** Estimated hazard ratio of death (point estimates and 95% CIs) across each model investigated. Death types are coded as follows: d*X*, all deaths within *X* days of a positive test; c28, death-certificate-confirmed COVID-19 deaths within 28 days; e60, all deaths within 60 days plus all death-certificate-confirmed COVID-19 deaths within any time period. S, spline term (for Age or IMD); L, linear term (for Age or IMD); NHSE, NHS England region (*n* = 7); UTLA, upper-tier local authority (*n* = 150); LTLA, lower-tier local authority (*n* = 316). LTLA start date signifies a start date chosen separately for each LTLA (see Methods). For IPW analysis, a model to predict missingness is required. We evaluated a series of such models, including a cauchit model, which is a robust alternative to logistic regression suitable for IPW5. We selected the cauchit model as it fit well and resulted in less extreme weights than other models (**Extended Data Fig. 4**). The IPW analysis gave similar results to the complete-cases analysis, yielding a hazard ratio of 1.58 (1.40–1.78). Like the complete-cases analysis, the IPW analysis recovered an increasing hazard with time since positive test, but the increase was less marked (**Fig. 2b**) and did not significantly differ from zero (Wald test ![Graphic][8]. ## Misclassification analysis Prior to the emergence of B.1.1.7, a number of minor circulating SARS-CoV-2 lineages with spike mutations could also cause SGTF1. Our main analyses are restricted to specimens from 1 November 2020 onwards to avoid diluting the measured effect of B.1.1.7 on mortality due to non-B.1.1.7 lineages causing SGTF. As an alternative approach, we undertook a misclassification analysis6, modelling the relative frequency of SGTF over time for each NHS England region as a low, time-invariant frequency of non-B.1.1.7 samples with SGTF plus a logistically growing2 frequency of B.1.1.7 samples. This allowed us to estimate the probability *p*VOC that a given SGTF sample was B.1.1.7 based upon its specimen date and NHS England region (**Extended Data Fig. 5**). Again restricting the analysis to specimens from 1 November 2020 onward, we find a hazard ratio associated with *p*VOC of 1.58 (1.42–1.76) for the complete-cases analysis and 1.61 (1.42–1.82) for the IPW analysis (**Fig. 2c–d**). ## Absolute risks To put these results into context, we calculated absolute mortality risks by applying hazard ratios for SGTF to the baseline risk of death among individuals tested in the community between August–October 2020 (assumed to be representative of the CFR associated with preexisting variants of SARS-CoV-2; **Table 1**). For the complete-cases analysis, in females aged 70–84, the estimated risk of death within 28 days of a positive SARS-CoV-2 test increases from 2.9% without SGTF to 4.4% with SGTF (95% CI 4.0–4.9%) and for females 85 or older increases from 13% to 19% (17–21%). For males aged 70–84 the risk of death within 28 days increases from 4.7% to 7.2% (6.4–7.9%) and for males 85 or older it increases from 17% to 25% (23– 27%). Estimates based on the IPW analysis corrected for misclassification were marginally higher. These estimates reflect a substantial increase in absolute risk amongst older age groups, but the risk of COVID-19 death following a positive test in the community remains below 1% in most individuals younger than 70 years old. Note that these estimates capture the fatality ratio among people tested in the community, and are thus likely to be higher than the infection fatality ratio, as many infected individuals are never tested. View this table: [Table 1.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/T1) Table 1. Absolute 28-day mortality risk associated with B.1.1.7, as expressed by case fatality ratio (%) among individuals testing positive in the community. The baseline risk (i.e., for preexisting SARS-CoV-2 variants) is derived using linked deaths within 28 days for all individuals testing positive in the community from 1 August – 31 October 2020. Adjusted risks are presented for the SGTF analysis for complete cases and for the misclassification-adjusted (*p*VOC) IPW analysis, which yielded the lowest and highest mortality estimates, respectively, of the main models assessed (**Fig. 2a–d**). ## Further investigations We conducted a number of sensitivity analyses to verify the robustness of our results. Our main results were largely insensitive to: restriction to death-certificate-confirmed COVID-19 deaths only; any follow-up time of 21 days or longer; coarseness of geographical and temporal stratification; use of linear versus spline terms for age and IMD; analysis start date; follow-up time–covariate interactions; removal of the 10-day death registration cutoff; and restriction of the analysis to individuals with a full 28-day follow-up period (**Fig. 2e**). Generally, the IPW analysis yielded marginally higher hazard ratios, with greater uncertainty. As a further sensitivity analysis, we adjusted for an indicator in Pillar 2 testing data for whether the subject was tested because of symptoms or due to asymptomatic screening. Although we caution that symptomatic screening status may lie on the causal pathway between SGTF status and death, we found that this adjustment had no effect on the relative hazard of SGTF (1.54 [1.39–1.71], complete-cases analysis). ## Discussion We previously found that B.1.1.7 is substantially more transmissible than preexisting SARS-CoV-2 variants, but could not robustly identify any associated change in disease severity using population-level analysis of early data2. This analysis of individual-level data, which controls for factors that could confound the association between B.1.1.7 infection and death, reveals an increase in COVID-19 mortality associated with lineage B.1.1.7. We stratify our analyses by test time and geographical location—mimicking matching on these variables—to account for changes in testing rates and changing pressures on hospital services over time and by region. Our findings are consistent with earlier reports7 by ourselves and other groups assessing the risk of death among individuals with SGTF. Crucially, our study is limited to individuals tested in the community. Indicators for B.1.1.7 infection are not currently available for most people who die from COVID-19 in England, as they are tested in the hospital rather than in the community and hospitals do not routinely collect genotypic data. However, this restricted focus allows us to capture the combined effect of an altered risk of hospitalisation given a positive test and an altered risk of death given hospitalisation, while only the latter would be measurable in a study of hospitalised patients only. Unfortunately, we were unable to account for vaccination status in this analysis. We do not identify any mechanism for increased mortality here. B.1.1.7 infections are associated with higher viral concentrations on nasopharyngeal swabs, as measured by Ct values from PCR testing (**Extended Data Fig. 6**). Higher viral load could therefore be partly responsible for the observed increase in mortality; this could be assessed using a mediation analysis. Alternatively, changes in test-seeking behaviour could, in principle, explain our results. If B.1.1.7 infections were less likely to cause symptoms, but symptomatic B.1.1.7 cases were more severe, then our study could overestimate changes in the infection fatality rate. However, we find no clear difference in SGTF frequency among community tests relative to a random sample of SARS-CoV-2 infections in the population (**Extended Data Fig. 7**), suggesting that variant-associated changes in test-seeking propensity do not explain our findings. ## Data Availability All analysis code is publicly available. An anonymised data set allowing replication of the analysis is publicly available. [https://github.com/nicholasdavies/cfrvoc](https://github.com/nicholasdavies/cfrvoc) [https://zenodo.org/record/4579857](https://zenodo.org/record/4579857) ## CMMID COVID-19 Working Group authors The CMMID COVID-19 working group is (randomized order) Kevin van Zandvoort1, Samuel Clifford1, Fiona Yueqian Sun1, Sebastian Funk1, Graham Medley1, Yalda Jafari1, Sophie R Meakin1, Rachel Lowe1, W John Edmunds1, Matthew Quaife1, Naomi R Waterlow1, Rosalind M Eggo1, Nicholas G. Davies1, Jiayao Lei1, Mihaly Koltai1, Fabienne Krauer1, Damien C Tully1, James D Munday1, Alicia Showering1, Anna M Foss1, Kiesha Prem1, Stefan Flasche1, Adam J Kucharski1, Sam Abbott1, Billy J Quilty1, Thibaut Jombart1, Alicia Rosello1, Gwenan M Knight1, Mark Jit1, Yang Liu1, Jack Williams1, Joel Hellewell1, Kathleen O’Reilly1, Yung-Wai Desmond Chan1, Timothy W Russell1, Christopher I Jarvis1, Simon R Procter1, Akira Endo1, Emily S Nightingale1, Nikos I Bosse1, C Julian Villabona-Arenas1, Frank G Sandmann1, Amy Gimma1, Kaja Abbas1, William Waites1, Katherine E. Atkins1,4, Rosanna C Barnard1, Petra Klepac1, Hamish P Gibbs1, Carl A B Pearson1, and Oliver Brady1. 4. Centre for Global Health, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK. ## Methods ### Ethical approval Approved by the Observational / Interventions Research Ethics Committee at the London School of Hygiene and Tropical Medicine (reference number 24020). Subject consent is not required for national infectious disease notification data sets in England. ### Data sources We linked three datasets provided by Public Health England: a line list of all positive tests in England’s “Pillar 2” (community) testing for SARS-CoV-2, containing specimen date and demographic information on the test subject; a line list of cycle threshold (Ct) values for the ORF1ab, N (nucleocapsid), and S (spike) genes for positive tests that were processed in one of the three national laboratories (Alderley Park, Glasgow, or Milton Keynes) utilising the Thermo Fisher TaqPath COVID-19 assay; and a line list of all deaths due to COVID-19 in England, which combines and deduplicates deaths reported by hospitals in England, by the Office for National Statistics, via direct reporting from Public Health England Health Protection Team, and via Demographic Batch Service tracing of laboratory-confirmed cases8. We link these datasets using a numeric identifier for Pillar 2 tests (‘FINALID’) common to all three datasets. We define S gene target failure (SGTF) as any test with Ct < 30 for ORF1ab and N targets but no detectable S gene, and non-SGTF as any test with Ct < 30 for ORF1ab, N, and S targets. A small proportion (10.4%) of SGTF tests are inconclusive. The study population of interest is defined as all individuals who received a positive Pillar 2 test between 1 November 2020 and 14 February 2021. For our main analysis, we included only tests from after 1 November 2020 to avoid including an excess of tests with SGTF not resulting from infection by lineage B.1.1.7. In sensitivity analyses, we also consider extending the population to include tests performed between 1 September and 31 October 2020. Our analysis does not include individuals who first tested positive in hospital, that is, those who presented to hospital after symptom onset without first being tested in the community. This is because cycle threshold values used to ascertain SGTF status are not available for individuals who were not tested in the community. Of the 57,750 COVID-19 deaths in England during the study period, 17,642 deaths (44%) can be linked to a positive Pillar 2 test; among these, 4,945 have non-missing SGTF status. So, while our study includes 1,098,729 Pillar 2 tests with non-missing SGTF status, which represents 51.1% of the 2,245,263 Pillar 2 tests over this period and 40.2% of the 2,736,806 combined Pillar 1 (hospital) and Pillar 2 (community) SARS-CoV-2 tests over this period, we can only assess SGTF status for 9% (4,945 / 57,750) of the individuals who died from COVID-19 over the study period. This is explained by differing mortality rates among individuals who first test positive in a hospital compared to those who are tested in the community, as the former group are much more likely to have severe illness, as well as by missingness in the SGTF data. There was a small amount of missing data for sex (*n* = 14, <0.01%), age (*n* = 171, <0.01%), and IMD and regional covariates (*n* = 3,817, 0.16%). There were no missing specimen dates. Individuals with missing age, sex, or geographical location were excluded. We also excluded individuals from the dataset whose age was recorded as zero, as there were 17,913 age-0 individuals compared to 10,132 age-1 individuals in the dataset, suggesting that many of these age-0 individuals may have been miscoded. There was some missing data on ethnicity (*n* = 47,491, 2%) and we created a category that combines missing values with “Other” and “Mixed”. Missing values for residence type (*n* = 63,905, 3%) were also combined with an “Other” category. The full data set used for the main analysis comprises 2,245,263 individuals, with SGTF status missing or inconclusive for 1,098,729 (48.9%). Missing data on the exposure is addressed in the analysis, described below. We grouped residence types into three categories: Residential, which included the “Residential dwelling (including houses, flats, sheltered accommodation)” and “House in multiple occupancy (HMO)” groups; Care/Nursing home; and Other/Unknown, which included the “Medical facilities (including hospitals and hospices, and mental health)”, “No fixed abode”, “Other property classifications”, “Overseas address”, “Prisons, detention centres, secure units”, “Residential institution (including residential education)”, and “Undetermined” groups, as well as unspecified residence type. We grouped ethnicities into four categories according to the broad categories used in the 2011 UK Census: Asian, which included the “Bangladeshi (Asian or Asian British)”, “Chinese (other ethnic group)”, “Indian (Asian or Asian British)”, “Pakistani (Asian or Asian British)”, and “Any other Asian background” groups; Black, which included the “African (Black or Black British)”, “Caribbean (Black or Black British)”, and “Any other Black background” groups; White, which included the “British (White)”, “Irish (White)”, and “Any other White background” groups; and Other / Mixed / Unknown, which included the “Any other ethnic group”, “White and Asian (Mixed)”, “White and Black African (Mixed)”, “White and Black Caribbean (Mixed)”, “Any other Mixed background”, and “Unknown” groups. ### Statistical methods There are several factors that we expect to be associated with both SGTF and with risk of death, thus confounding the association between SGTF and risk of death in those tested. Area of residence and specimen date were expected to be potentially strong confounders. Area of residence is expected to be strongly associated with SGTF status due to different virus variants circulating in different areas, and specimen date because the prevalence of SGTF is known to have greatly increased over time. Area of residence and specimen date are also expected to be associated with risk of death following a test, including due to differential pressure on hospital resources by area and time. The following variables were also identified as potential confounders: sex, age, place of residence (Residential, Care/Nursing home, or Other/Unknown), ethnicity (White, Asian, Black, or Other/Mixed/Unknown), index of multiple deprivation (IMD). The potential confounders are referred to collectively as the covariates. For descriptive analyses, age (in years) was categorised as 1–34, 35–54, 55–69, 70–84, or 85 and older. Descriptive analyses were performed. We tabulated the distribution of the covariates in the whole study sample, the association between each covariate and SGTF status in the subset with SGTF measured, and the association between each covariate and missing data in SGTF status **(Extended Data Table 1)**. The subset with SGTF status measured are referred to as the complete cases. The unadjusted association between SGTF and mortality in the complete cases was assessed using a Kaplan-Meier plot (**Fig. 1c**), and Kaplan-Meier plots and crude 28-day mortality rates are also presented separately according to categories of the covariates (**Extended Data Table 2, Extended Data Fig. 2**). Crude overall mortality rates (i.e., not restricted to 28 days after a positive test) were obtained for the whole sample, by SGTF status in the complete cases, and in those with missing SGTF status, according to categories of each covariate (**Supplementary Table 1**). We also obtained mortality rates by SGTF status (in the complete cases) for categories of each covariate stratified by age group (**Fig. 1d–i**). Exact Poisson CIs are used for mortality rates, assuming constant rates. Approximately 49% of individuals in the study sample are missing data on SGTF status, due to their test not being sent to one of the three laboratories utilising the Thermo Fisher TaqPath COVID-19 assay or the test being inconclusive. We performed complete cases analysis, restricted to the subset with SGTF status measured. This complete case analysis assumes that for each analysis, the missing data, in this case missing SGTF status, is independent from the outcome of interest, given the variables included in the models. This is a specific type of missing not at random (MNAR) assumption, as in particular it is allowed to depend on the underlying value of SGTF. We also performed an analysis of the complete cases using inverse probability weights5 (IPW) to address the missing data on SGTF, under a missing at random (MAR) assumption. In the analysis, each individual with SGTF status measured is weighted by the inverse of their probability of having SGTF status measured based on their covariates. For the IPW, the missingness model estimated the probability of missingness using logistic regression with age (restricted cubic spline), sex, IMD decile (restricted cubic spline), ethnicity, residence type by asymptomatic screening indicator, and NHS region by specimen week as predictors. We also considered a cauchit and a Gosset link for the missingness model, including the same predictors, as this was expected to provide better stability for the weights5. The fit of the missingness model was assessed using a Q-Q plot (**Extended Data Fig. 4**), and Hosmer-Lemeshow and Hinkley tests were used to choose the most appropriate model. Cox regression4 was used to estimate the association between SGTF and the hazard of mortality, conditioning on the potential confounders listed above. The analyses described here were applied to the complete cases and using IPW. For IPW analyses, the standard errors (SEs) accounted for the weights, though the fact that the weights were estimated was not accounted for; this results in conservative SEs. The baseline hazard in the Cox model was stratified by both specimen date and LTLA, therefore finely controlling for these variables. The stratification gives a large number of strata matched by specimen date and LTLA. Only those strata that contain individuals who die and individuals who survive contribute to the analysis. The analysis is therefore similar to that which would be performed had we created a matched nested case-control sample. The remaining variables were included as covariates in the model (sex, age, place of residence, ethnicity, IMD decile). Age was included as a restricted cubic spline with 5 knots, and IMD decile was included as a restricted cubic spline with 3 knots. The time origin for the analysis was specimen date and we considered deaths up to 28 days after the specimen date. Individuals who did not die within 28 days were censored at the earlier of 28 days post specimen date and the administrative censoring date, which we chose as the date of the most recent death linkable to SGTF status minus 10 days (i.e., 14 February 2021) in order to minimise any potential bias due to late reporting of deaths. We began by assuming proportionality of hazards for SGTF and the covariates included in the model. The proportional hazards assumption was assessed by including in the model an interaction between each covariate and time, which was performed separately for SGTF and for each other covariate. Schoenfeld residual plots were also obtained for each covariate (**Extended Data Fig. 3**). We assessed whether the association between SGTF and the hazard was modified by age, sex, IMD, ethnicity, and place of residence. Models with and without interactions were compared using likelihood ratio tests for the complete cases analyses. For the analysis using IPW we used Wald tests based on robust standard errors9. The analysis assumes that censoring is uninformative, which is plausible as all censoring is administrative. ### Misclassification analysis The exposure of SGTF is subject to misclassification, because a number of minor circulating SARS-CoV-2 lineages in addition to B.1.1.7 are also associated with failure to amplify the spike gene target. Accordingly, a positive test with SGTF is not necessarily indicative of infection with B.1.1.7. A negative test of SGTF is assumed to be indicative of absence of infection with B.1.1.7. Misclassification of an exposure can result in bias in its estimated association with the outcome. We fitted a logistic model to Pillar 2 SGTF frequencies by NHS region to estimate a “background” rate of SGTF in the absence of B.1.1.7, assuming a beta binomial prior. This model is then used to estimate the probability that an individual testing positive with SGTF is infected with B.1.1.7, separately for individuals in each NHS region. These probabilities can then be used in place of the indicator of SGTF exposure in the Cox models. This is the regression calibration approach6 to correcting for bias due to measurement error in an exposure. We fitted models accounting for false positives (modelled as regionally-varying background rates of SGTF associated with non-B.1.1.7 variants) to the SGTF data. Our logistic model for B.1.1.7 growth over time is as follows: ![Formula][9] Here, *f*(*t*) is the predicted frequency of B.1.1.7 among positive tests at time *t* (in days since 1 September 2020) based on the terms *slope* and *intercept*; *s*(*t*) is the predicted frequency of S gene target failure at time *t* due to the combination of B.1.1.7 and a background false positive rate *falsepos, conc* is the “concentration” parameter (= α + β) of a beta distribution with mode *s*(*t*); *k**t* is the number of S gene target failures detected at time *t*; and *n**t* is the total number of tests at time *t*. All priors above are chosen to be vague, and the truncation of *conc* to values greater than 2 ensures a unimodal distribution for the proportion of tests that are SGTF. The model above is fitted separately for each NHS England region. Then, *p*VOC for a test with SGTF = 1 at time *t* is equal to *f(t)/s(t)*, and *p*VOC = 0 for all tests with SGTF = 0. The model was fitted using Markov chain Monte Carlo with 10,000 iterations of burn-in and 5,000 iterations of sampling. The model above was fitted using the same data source (i.e. SGTF frequencies among Pillar 2 community tests for SARS-CoV-2) as our survival analysis. To verify the robustness of this model, we performed a sensitivity analysis using sequencing data from the COVID-19 UK Genomics Consortium10 downloaded from the Microreact platform11 on 11 January 2020 to estimate *p*VOC. In this alternative analysis we estimated *p*VOC for each NHS England region and date as the number of samples that were VOC 202012/01 (i.e. lineage B.1.1.7 with mutations Δ69/Δ70 and N501Y in Spike) divided by the number of samples that were SGTF (i.e. any lineage with Δ69/Δ70, the deletion that causes SGTF) for that NHS England region and date, setting *p*VOC = 1 for all dates later than 31 December 2020 as there were no sequencing data available past this date, and filling any gaps in the data using linear interpolation. This yielded nearly identical results in our survival analysis compared to using the modelled *p*VOC described above (**Fig. 2e**). ### Absolute risks Estimates from the final Cox models were used to obtain estimates of absolute risk of death for 28 and 60 days with SGTF and *p*VOC. Given the strong influence of age on risk of death, we present absolute risks by sex and age group (1–34, 35–54, 55–69, 70–84, 85+). Absolute risks of death (case fatality rate) within 28 days were estimated by age group and sex using data on individuals tested during August–October 2020; this is referred to as the baseline risk. The absolute risks of death for individuals with SGTF were then estimated as follows. If the baseline absolute risk of death in a given age group is 1− *A*, then the estimated absolute risk of death with SGTF is 1−*A**HR*, where HR denotes the estimated hazard ratio obtained from the Cox model assuming proportional hazards. Standard errors are obtained via the delta method, and CIs based on normal approximations. ### Sensitivity analyses Several sensitivity analyses were performed. After establishing the final model through using the process outlined above, we investigated the impact of using different variables for stratification of the baseline hazard measuring region at a coarser level (UTLA, or NHS England region), as well as coarser test specimen time (week rather than exact date). Adjusting for these variables instead of using stratification was also explored. We also repeated the main analysis restricting data to specimens collected from September onwards, October onwards, November onwards, or December onwards. To assess the impact of imposing an administrative cutoff to follow-up time of 10 days prior to data extraction, we first reanalysed the data without this cutoff, as well as reanalysing the data restricting the analysis to individuals with at least 28 days’ follow-up. Finally, we adjusted for symptomatic status associated with the test (asymptomatic versus symptomatic), which relates to whether the test was given for asymptomatic screening purposes or on the basis of a request by a (presumed symptomatic) individual, as only symptomatic individuals may request a community SARS-CoV-2 test in England. ## Author contributions NGD, CIJ, KDO and RHK performed analyses; all authors designed the study and wrote the manuscript. ## Competing interests The authors declare no competing interests. ## Additional information Correspondence should be addressed to Nicholas G. Davies. ## Data availability An anonymised data set allowing replication of the analysis is available at [https://zenodo.org/record/4579857](https://zenodo.org/record/4579857). ## Code availability Analysis code deposited at time of publication is available at [https://zenodo.org/record/4579857](https://zenodo.org/record/4579857). The repository is maintained at [https://github.com/nicholasdavies/cfrvoc](https://github.com/nicholasdavies/cfrvoc). ## Supplementary Information Supplementary Tables 1–2; Supplementary Notes 1–2. ## Extended Data View this table: [Extended Data Table 1.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/T2) Extended Data Table 1. Characteristics of study subjects, 1 November 2020–14 February 2021. **All**, N (%); **Missing**, N (%); **SGTF**, N with SGTF (%) in subset of tests with non-missing SGTF status; **Non-SGTF**, N with non-SGTF (%) in subset of tests with non-missing SGTF status; **SGTF prevalence**, N with SGTF / total (%) in subset with known SGTF status; **Missingness**, N with missing SGTF status / total (%). View this table: [Extended Data Table 2.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/T3) Extended Data Table 2. Rates of death within 28 days of positive test among study subjects. Total number of deaths, number of days of followup, and deaths per 10,000 days of followup reported for: **All** deaths, **Missing** SGTF status deaths, known **SGTF** deaths and known **Non-SGTF** deaths. Missingness among deaths (%) and SGTF prevalence among deaths with non-missing SGTF status (%) are also reported. ![Extended Data Fig. 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F3.medium.gif) [Extended Data Fig. 1.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F3) Extended Data Fig. 1. Missingness in SGTF status and proximity to SGTF-capable Lighthouse lab. The geographical location of the six Lighthouse Labs in the United Kingdom; missingness is higher in the lower-tier local authorities (shaded regions) which are closer to a Lighthouse lab that is not capable of producing an SGTF reading. ![Extended Data Fig. 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F4.medium.gif) [Extended Data Fig. 2.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F4) Extended Data Fig. 2. Kaplan-Meier plots of survival within 60 days of positive test for SGTF versus non-SGTF. Plots are stratified by sex, age group, place of residence, ethnicity, NHS England region, IMD decile (in 5 groups), and specimen date. Note that *y*–axis ranges differ among panels. These curves show the crude survival within each group (unadjusted for other covariates), and so do not necessarily signify differences in the effect of SGTF on survival for any specific group due to possible confounding. Shaded areas show 95% CIs. ![Extended Data Fig. 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F5.medium.gif) [Extended Data Fig. 3.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F5) Extended Data Fig. 3. Schoenfeld residuals for survival model by SGTF stratified by LTLA and specimen date. Model uses linear terms for age and IMD and a 28-day followup using complete cases. Schoenfeld residual tests (two-sided) give (**a**) *P* = 0.001 for SGTF; (**b**) *P* = 0.039 for age; (**c**) *P* = 0.101 for sex; (**d**) *P* = 0.937 for IMD decile; (**e**–**g**) *P* = 0.969 for ethnicity; (**h**–**i**) *P* = 0.064 for residence type; and *P* = 0.027 globally. Trend line shows mean and 95% CIs of a loess regression. ![Extended Data Fig. 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F6.medium.gif) [Extended Data Fig. 4.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F6) Extended Data Fig. 4. Comparison of missingness models. Q-Q plot (left; mean and 95% CIs) and distribution of weights (right) under different missingness models assessed for IPW with **(a)** a cauchit link, **(b)** a robit link (Student’s *t* distribution with d.f. = 4), and **(c)** a logit link. ![Extended Data Fig. 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F7.medium.gif) [Extended Data Fig. 5.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F7) Extended Data Fig. 5. Misclassification model. For each NHS England region, we fit a beta-binomial model (purple, Modelled SGTF) to the observed SGTF frequencies among Pillar 2 tests (black, Observed SGTF), which estimates a constant proportion of “false positive” SGTF samples among non-VOC 202012/01 (i.e., non-B.1.1.7) specimens (orange, Modelled non-VOC SGTF) and a logistically growing proportion of VOC 202012/01 (i.e., B.1.1.7) specimens over time (blue, Modelled VOC). This allows us to model the conditional probability that a specimen with SGTF represents VOC 202012/01 (teal, P(VOC|SGTF)). For our misclassification survival analysis, *p*VOC = 0 for non-SGTF specimens and *p*VOC = P(VOC|SGTF) for SGTF specimens. Lines show medians and shaded areas show 95% credible intervals. ![Extended Data Fig. 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F8.medium.gif) [Extended Data Fig. 6.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F8) Extended Data Fig. 6. Ct values for SGTF versus non-SGTF. The distribution of Ct values for (**a**) ORF1ab and (**b**) N gene targets among specimens collected between 1 January–14 February 2021. ![Extended Data Fig. 7.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/03/05/2021.02.01.21250959/F9.medium.gif) [Extended Data Fig. 7.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/F9) Extended Data Fig. 7. S gene dropout in community tests relative to a random sample of SARS-CoV-2 infections in the community. Comparison of the proportion of samples with S gene dropout in the Pillar 2 (i.e., community testing) sample used in this analysis compared to Office for National Statistics (ONS) random sampling of the community. This comparison suggests that S gene dropout samples are not overrepresented in testing data relative to the prevalence of S gene dropout in the community, suggesting that the increased hazard of death among positive community tests estimated in this study is not the result of a decrease in the average propensity for test-seeking among individuals infected with B.1.1.7. Point and ranges for ONS data show mean and 95% credible intervals. ## Supplementary Information for View this table: [Supplementary Table 1.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/T4) Supplementary Table 1. Rates of death within any time period following positive test among study subjects, including missing SGTF status. Total number of deaths, number of days of followup, and deaths per 10,000 days of followup reported. The maximum observed followup was 105 days. View this table: [Supplementary Table 2.](http://medrxiv.org/content/early/2021/03/05/2021.02.01.21250959/T5) Supplementary Table 2. Cases, deaths, followup days, and deaths per 10,000 days of followup, by SGTF and non-SGTF, cross-tabulated by date range, geographical region, age group, and IMD group. The number in parentheses is the rate of death per 10,000 days of follow-up. EE, LD, SE: East of England, London, South East (regions in which B.1.1.7 was first detected); ML, NEY, NW, SW: Midlands, North East & Yorkshire, North West, South West (other regions). ## Supplementary Note 1 ### Models with interaction terms In the main text (section “Cox regression analyses”), we briefly describe results obtained from models including interactions between SGTF and other covariates, as well as interactions between other covariates and time since positive test, in our complete-cases analysis. As stated in the main text, in our analysis of the effect of SGTF on the hazard of death due to COVID-19, we found no evidence that the effect of SGTF varied by age group, sex, IMD, ethnicity, or residence type. In an earlier analysis using data up to 25 January 2021, we did find a marginally significant interaction (likelihood ratio test ![Graphic][10] between SGTF and residence type indicating a higher hazard of death due to SGTF in care/nursing home residents, but this interaction is no longer present (likelihood ratio test ![Graphic][11]) when analysing the more complete data set up to 14 February 2021. We also describe in the main text a significant interaction between SGTF and time since positive test (henceforth, “time since positive test” is abbreviated as “time”, not to be confused with the date upon which the specimen was taken). In additional analyses, we also found significant interactions between age and time, sex and time, and place of residence and time. Specifically, in our complete-cases analysis: * – the SGTF:time coefficient was estimated at 0.025 (standard error 0.0076), indicating a slightly longer time from specimen to death for individuals with SGTF (likelihood ratio test ![Graphic][12]); * – the age:time coefficient was estimated at –8.3×10−4 (SE 1.9×10−4), indicating a slightly shorter time from specimen to death for older individuals (likelihood ratio test ![Graphic][13]; * – the sexmale:time coefficient was estimated at 9.4×10−3 (SE 4.7×10−3), indicating a slightly longer time from specimen to death for males (likelihood ratio test ![Graphic][14]; and * – the residencecare/nursing home:time coefficient was estimated at –0.024 (SE 0.010), indicating a slightly shorter time from specimen to death for care home residents (likelihood ratio test ![Graphic][15]. By contrast, we did not find any significant interaction between ethnicity and time (likelihood ratio test ![Graphic][16] or between IMD decile and time (likelihood ratio test ![Graphic][17]. ## Supplementary Note 2 ### CMMID COVID-19 Working Group acknowledgements Funding statements for the CMMID COVID-19 working group are as follows. KvZ: KvZ is supported by the UK Foreign, Commonwealth and Development Office (FCDO)/Wellcome Trust Epidemic Preparedness Coronavirus research programme (ref. 221303/Z/20/Z), and Elrha’s Research for Health in Humanitarian Crises (R2HC) Programme, which aims to improve health outcomes by strengthening the evidence base for public health interventions in humanitarian crises. The R2HC programme is funded by the UK Government (FCDO), the Wellcome Trust, and the UK National Institute for Health Research (NIHR). SC: Wellcome Trust (grant: 208812/Z/17/Z). FYS: NIHR EPIC grant (16/137/109). SFunk: Wellcome Trust (grant: 210758/Z/18/Z), NIHR (NIHR200908). GFM: NTD Modelling Consortium by the Bill and Melinda Gates Foundation (OPP1184344). YJ: LSHTM, DHSC/UKRI COVID-19 Rapid Response Initiative. SRM: Wellcome Trust (grant: 210758/Z/18/Z). RL: Royal Society Dorothy Hodgkin Fellowship. WJE: European Commission (EpiPose 101003688), NIHR (NIHR200908). MQ: European Research Council Starting Grant (Action Number #757699); Bill and Melinda Gates Foundation (INV-001754). NRW: Medical Research Council (grant number MR/N013638/1). RME: HDR UK (grant: MR/S003975/1), MRC (grant: MC_PC 19065), NIHR (grant: NIHR200908). NGD: UKRI Research England; NIHR Health Protection Research Unit in Immunisation (NIHR200929); UK MRC (MC_PC_19065). JYL: Bill & Melinda Gates Foundation (INV-003174). MK: Foreign, Commonwealth and Development Office / Wellcome Trust. FK: Innovation Fund of the Joint Federal Committee (Grant number 01VSF18015), Wellcome Trust (UNS110424). DCT: No funding declared. JDM: Wellcome Trust (grant: 210758/Z/18/Z). AS: No funding declared. AMF: No funding declared. KP: Gates (INV-003174), European Commission (101003688). SFlasche: Wellcome Trust (grant: 208812/Z/17/Z). AJK: Wellcome Trust (grant: 206250/Z/17/Z), NIHR (NIHR200908). SA: Wellcome Trust (grant: 210758/Z/18/Z). BJQ: This research was partly funded by the National Institute for Health Research (NIHR) (16/137/109 & 16/136/46) using UK aid from the UK Government to support global health research. The views expressed in this publication are those of the author(s) and not necessarily those of the NIHR or the UK Department of Health and Social Care. BJQ is supported in part by a grant from the Bill and Melinda Gates Foundation (OPP1139859). TJ: RCUK/ESRC (grant: ES/P010873/1); UK PH RST; NIHR HPRU Modelling & Health Economics (NIHR200908). AR: NIHR (grant: PR-OD-1017-20002). GMK: UK Medical Research Council (grant: MR/P014658/1). MJ: Gates (INV-003174, INV-016832), NIHR (16/137/109, NIHR200929, NIHR200908), European Commission (EpiPose 101003688). YL: Gates (INV-003174), NIHR (16/137/109), European Commission (101003688). JW: NIHR Health Protection Research Unit and NIHR HTA. JH: Wellcome Trust (grant: 210758/Z/18/Z). KO’R: Bill and Melinda Gates Foundation (OPP1191821). YWDC: No funding declared. TWR: Wellcome Trust (grant: 206250/Z/17/Z). CIJ: Global Challenges Research Fund (GCRF) project ‘RECAP’ managed through RCUK and ESRC (ES/P010873/1). SRP: Bill and Melinda Gates Foundation (INV-016832). AE: The Nakajima Foundation. ESN: Gates (OPP1183986). NIB: Health Protection Research Unit (grant code NIHR200908). CJVA: European Research Council Starting Grant (Action number 757688). FGS: NIHR Health Protection Research Unit in Modelling & Health Economics, and in Immunisation. AG: European Commission (EpiPose 101003688). KA: Bill & Melinda Gates Foundation (OPP1157270, INV-016832). WW: MRC (grant MR/V027956/1). KEA: European Research Council Starting Grant (Action number 757688). RCB: European Commission (EpiPose 101003688). PK: This research was partly funded by the Royal Society under award RP\EA\180004, European Commission (101003688), Bill & Melinda Gates Foundation (INV-003174). HPG: This research was produced by CSIGN which is part of the EDCTP2 programme supported by the European Union (grant number RIA2020EF-2983-CSIGN). The views and opinions of authors expressed herein do not necessarily state or reflect those of EDCTP. This research is funded by the Department of Health and Social Care using UK Aid funding and is managed by the NIHR. The views expressed in this publication are those of the author(s) and not necessarily those of the Department of Health and Social Care (PR-OD-1017-20001). CABP: CABP is supported by the Bill & Melinda Gates Foundation (OPP1184344) and the UK Foreign, Commonwealth and Development Office (FCDO)/Wellcome Trust Epidemic Preparedness Coronavirus research programme (ref. 221303/Z/20/Z). OJB: Wellcome Trust (grant: 206471/Z/17/Z). ## Acknowledgements We gratefully acknowledge the assistance of Public Health England in providing the analysis data and authorising release of an anonymised data set. This work was supported by grants from UK Research and Innovation, the National Institutes of Health Research (NIHR) Health Protection Research Unit in Immunisation [NIHR200929], and the UK Medical Research Council [MC_PC_19065] (NGD); Global Challenges Research Fund project ‘RECAP’ managed through Research Councils UK and the Economic and Social Research Council [ES/P010873/1] (CIJ); European Commission project ‘EpiPose’ [101003688] and the NIHR [NIHR200908] (WJE); the National Institutes of Health / National Institute of Allergy and Infectious Diseases [R01AI148127] (NPJ); a Royal Society-Wellcome Trust Sir Henry Dale Fellowship [218554/Z/19/Z] (KDO); and a UKRI Future Leaders Fellowship [MR/S017968/1] (RHK). See Supplementary Note 2 for working group acknowledgements. ## Footnotes * § A list of authors and their affiliations appears at the end of the paper. * Updated version. * Received February 1, 2021. * Revision received March 5, 2021. * Accepted March 5, 2021. * © 2021, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. 1.Public Health England. Investigation of novel SARS-COV-2 variant Variant of Concern 202012/01: Technical briefing 1. [https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/959438/Technical\_Briefing\_VOC\_SH\_NJL2\_SH2.pdf](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/959438/Technical\_Briefing\_VOC_SH_NJL2_SH2.pdf) (2020). 2. 2.Davies, N. G. et al. Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science eabg3055 (2021). 3. 3.Ministry of Housing & Communities & Local Government. English indices of deprivation 2019: technical report. [https://www.gov.uk/government/publications/english-indices-of-deprivation-2019-technical-report](https://www.gov.uk/government/publications/english-indices-of-deprivation-2019-technical-report) (2019). 4. 4.Cox, D. R. Regression Models and Life-Tables. Journal of the Royal Statistical Society: Series B (Methodological) 34, 187–202 (1972). [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1972N572600003&link_type=ISI) 5. 5.Seaman, S. R. & White, I. R. Review of inverse probability weighting for dealing with missing data. Stat. Methods Med. Res. 22, 278–295 (2013). [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1177/0962280210395740&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21220355&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F03%2F05%2F2021.02.01.21250959.atom) 6. 6.Keogh, R. H. et al. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 1-Basic theory and simple methods of adjustment. Stat. Med. 39, 2197–2231 (2020). 7. 7.Scientific Advisory Group for Emergencies. NERVTAG: Update note on B.1.1.7 severity, 11 February 2021. [https://www.gov.uk/government/publications/nervtag-update-note-on-b117-severity-11-february-2021](https://www.gov.uk/government/publications/nervtag-update-note-on-b117-severity-11-february-2021) (2021). ## Methods references 1. 8.COVID-19 confirmed deaths in England: report. [https://www.gov.uk/government/publications/covid-19-reported-sars-cov-2-deaths-in-england/covid-19-confirmed-deaths-in-england-report](https://www.gov.uk/government/publications/covid-19-reported-sars-cov-2-deaths-in-england/covid-19-confirmed-deaths-in-england-report). 2. 9.Lumley, T. & Scott, A. Partial likelihood ratio tests for the Cox model under complex sampling. Stat. Med. 32, 110–123 (2013). [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.5492&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22806928&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F03%2F05%2F2021.02.01.21250959.atom) 3. 10.Home - COG-UK Consortium. [https://www.cogconsortium.uk/](https://www.cogconsortium.uk/). 4. 11.Imperial College London. Microreact - Open data visualization and sharing for genomic epidemiology. [https://microreact.org/](https://microreact.org/). [1]: /embed/inline-graphic-1.gif [2]: /embed/inline-graphic-2.gif [3]: /embed/inline-graphic-3.gif [4]: /embed/inline-graphic-4.gif [5]: /embed/inline-graphic-5.gif [6]: /embed/inline-graphic-6.gif [7]: /embed/inline-graphic-7.gif [8]: /embed/inline-graphic-8.gif [9]: /embed/graphic-4.gif [10]: /embed/inline-graphic-9.gif [11]: /embed/inline-graphic-10.gif [12]: /embed/inline-graphic-11.gif [13]: /embed/inline-graphic-12.gif [14]: /embed/inline-graphic-13.gif [15]: /embed/inline-graphic-14.gif [16]: /embed/inline-graphic-15.gif [17]: /embed/inline-graphic-16.gif