The SARS-CoV-2 Alpha variant causes increased clinical severity of disease
==========================================================================

* David J. Pascall
* Guy Mollett
* Rachel Blacow
* Naomi Bulteel
* Robyn Campbell
* Alasdair Campbell
* Sarah Clifford
* Chris Davis
* Ana da Silva Filipe
* Ludmila Fjodorova
* Ruth Forrest
* Emily Goldstein
* Rory Gunson
* John Haughney
* Matthew T.G. Holden
* Patrick Honour
* Joseph Hughes
* Edward James
* Tim Lewis
* Samantha Lycett
* Martin McHugh
* Yusuke Onishi
* Ben Parcell
* David L Robertson
* Noha El Sakka
* Sharif Shabaan
* James G. Shepherd
* Katherine Smollett
* Kate Templeton
* Elen Vink
* Elizabeth Wastnedge
* Thomas Williams
* The COVID-19 Genomics UK (COG-UK) consortium
* Emma C. Thomson

## Abstract

**Background** The B.1.1.7 (Alpha) SARS-CoV-2 variant of concern was associated with increased transmission relative to other variants present at the time of its emergence and several studies have shown an association between B.1.1.7 lineage infection and increased 28-day mortality. However, to date none have addressed the impact of infection on severity of illness or the need for oxygen or ventilation.

**Methods** In this prospective clinical cohort sub-study of the COG-UK consortium, 1475 samples from hospitalised and community cases were collected between the 1st November 2020 and 30th January 2021. These samples were sequenced in local laboratories and analysed for the presence of B.1.1.7-defining mutations. We prospectively matched sequence data to clinical outcomes as the lineage became dominant in Scotland and modelled the association between B.1.1.7 infection and severe disease using a 4-point scale of maximum severity by 28 days: 1. no support, 2. oxygen, 3. ventilation and 4. death. Additionally, we calculated an estimate of the growth rate of B.1.1.7-associated infections following introduction into Scotland using phylogenetic data.

**Results** B.1.1.7 was responsible for a third wave of SARS-CoV-2 in Scotland, and rapidly replaced the previously dominant second wave lineage (B.1.177) due to a significantly higher transmission rate (∼5 fold). Of 1475 patients, 364 were infected with B.1.1.7, 1030 with B.1.177 and 81 with other lineages. Our cumulative generalised linear mixed model analyses found evidence (cumulative odds ratio: 1.40, 95% CI: 1.02, 1.93) of a positive association between increased clinical severity and lineage (B.1.1.7 versus non-B.1.1.7). Viral load was higher in B.1.1.7 samples than in non-B.1.1.7 samples as measured by cycle threshold (Ct) value (mean Ct change: -2.46, 95% CI: -4.22, -0.70).

**Conclusions** The B.1.1.7 lineage was associated with more severe clinical disease in Scottish patients than co-circulating lineages.

**Funding** COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) and Genome Research Limited, operating as the Wellcome Sanger Institute. Funding was also provided by UKRI through the JUNIPER consortium (grant number MR/V038613/1). Sequencing and bioinformatics support was funded by the Medical Research Council (MRC) core award (MC UU 1201412).

The B.1.1.7 SARS-CoV-2 Pango lineage (termed the Alpha variant by the World Health Organisation) was first identified in the UK in September 2020 and at the time of writing has been reported in 150 countries (1). It is defined by 21 genomic mutations or deletions, including 8 characteristic changes within the spike gene (Table S1) (2). These are associated with increased ACE-2 receptor binding affinity and innate and adaptive immune evasion (3-6). The B.1.1.7 lineage, the first variant of concern (VOC), was estimated to be 50-100% more transmissible than others present at the time of its emergence (7), explaining the transient dominance of variants in this lineage globally.
The presence of a spike gene deletion (Δ69-70) results in spike-gene target failure (SGTF) in real-time reverse transcriptase polymerase chain reaction (RT-PCR) diagnostic assays and provides a useful proxy for the presence of B.1.1.7 for epidemiological analysis (2). Recently, three large community analyses have shown a positive association between 28-day mortality and the presence of SGTF, with hazard ratios of 1.55 (CI 1.39-1.72), 1.64 (CI 1.32-2.04) and 1.67 (CI 1.34-2.09) (8-10). Two other large-scale analyses found a greater risk of hospitalisation in cases with SGTF (hazard ratio 1.52; CI 1.47-1.57) or confirmed B.1.1.7 infection (hazard ratio 1.34; CI 1.07-1.66) (11, 12). In contrast, a smaller analysis of 341 hospitalised patients with confirmed COVID-19 and matched sequences found no association between B.1.1.7 and increased clinical severity on a composite score of severe COVID-19 at day 14 and 28-day mortality (PR 1.02, CI 0.76-1.38, p=0.88) (13). Limited data are available on the full clinical course of disease with B.1.1.7 in relation to other variants. Understanding the clinical pattern of disease with B.1.1.7 infection is important for a number of reasons. Firstly, if B.1.1.7 is more pathogenic in younger people than previous variants, this has implications for easing of lockdown in partially vaccinated populations, especially vaccination focused on targeting older age groups. Secondly, much of the world, particularly in low- and middle-income countries, is unlikely to achieve vaccination coverage until well into 2022. A better understanding of a lineage with increased severity is important in modelling the impact of unmitigated infection in these settings. 
Finally, a clear understanding of the behaviour of this lineage, which has emerged as a dominant variant, is needed as a baseline to compare the clinical phenotype of newly emerging variants such as B.1.351 (Beta variant) and the B.1.617 sublineages (particularly B.1.617.2, Delta variant) which may be better able to evade vaccine-induced immunity than B.1.1.7 and therefore may have the potential to spread even in immunised populations (14).

We aimed to quantify the clinical features and rate of spread of B.1.1.7-lineage infections in Scotland in a comprehensive national dataset. We used whole genome sequencing data to analyse patient presentations between the 1st November 2020 and 30th January 2021 as the virus emerged in Scotland and used cumulative generalised additive models to compare 28-day maximum clinical severity for B.1.1.7 against other lineages over the same period.

## METHODS

### Sequencing

Sequencing was performed using amplicon-based next generation sequencing as previously described (15) as part of the COG-UK consortium (16).

### Bioinformatics

Sequence alignment, lineage assignment, tree generation and estimates of growth rate were performed using the COG-UK data pipeline ([https://github.com/COG-UK/datapipe](https://github.com/COG-UK/datapipe)) and phylogenetic pipeline ([https://github.com/cov-ert/phylopipe](https://github.com/cov-ert/phylopipe)) with pangolin lineage assignment ([https://github.com/cov-lineages/pangolin](https://github.com/cov-lineages/pangolin)) (17). Lineage assignments were performed on 18/03/2021 and phylogenetic analysis was performed using the COG-UK tree generated on 25/02/2021. Estimates of growth rates of major lineages in Scotland were calculated from time-resolved phylogenies for lineages B.1.1.7, B.1.177 and the sub-clades B.1.177.5, B.1.177.8, and another minor B.1.177 sub-clade (W.4).
The estimates were carried out utilising sequences from November 2020 – March 2021 in BEAST with an exponential growth rate population model, a strict molecular clock model and the TN93 substitution model with four gamma rate distribution categories. Each lineage was randomly subsampled to a maximum of 5 sequences per epiweek (resulting in 52 to 103 sequences per subsample, depending on the lineage), and 10 subsample replicates were analysed per lineage in a joint exponential growth rate population model.

### Clinical data

We included all Scottish COG-UK pillar 1 samples sequenced at the MRC-University of Glasgow Centre for Virus Research (CVR) and the Royal Infirmary of Edinburgh (RIE) between 1st November 2020 and 30th January 2021. These samples derived from hospitalised patients (59%) as well as community testing (41%). Core demographic data (age, sex, partial postcode) were collected via linkage to electronic patient records and a full prospective review of case notes was undertaken. Collected data included residence in a care home; occupation in care home or healthcare setting; admission to hospital; date of admission, discharge and/or death; and maximum clinical severity at 28 days from sample collection date, recorded on a 4-point ordinal scale (1. No respiratory support; 2. Supplemental oxygen; 3. Intubation and ventilation or non-invasive ventilation or high-flow nasal cannula; 4. Death) as previously used in Volz et al 2020 and Thomson et al 2021 (18-19). Where available, PCR cycle threshold (Ct) and the PCR testing platform were recorded. Hospital acquired COVID-19 in patients admitted to hospital was defined as a first positive PCR occurring greater than 48 hours following admission to hospital. Discharge status was followed up until 15th April 2021 for the hospital stay analysis. For the co-morbidity subanalysis, delegated research ethics approval was granted for linkage to National Health Service (NHS) patient data by the Local Privacy and Advisory Committee at NHS Greater Glasgow and Clyde.
Cohorts and de-identified linked data were prepared by the West of Scotland Safe Haven at NHS Greater Glasgow and Clyde.

### Severity analyses

Four-level severity data were analysed using cumulative (per the definition of Bürkner and Vuorre (2019)) generalised additive mixed models (GAMMs) with logit links, following Volz et al (2020) (18,20). We analysed three subsets of the data: 1. the full dataset, 2. the dataset excluding care home patients, and 3. exclusively the hospitalised population. Further details regarding these analyses are provided in Supplementary Appendix 1.

### Ct analysis

Ct value was compared between B.1.1.7 and non-B.1.1.7 lineage infections for those patients where the TaqPath assay (Applied Biosystems) was used. This platform was used exclusively for this analysis because different platforms output systematically different Ct values, and this was the most frequently used platform in our dataset (n = 154, B.1.1.7 = 38, non-B.1.1.7 = 116). We used a generalised additive model with a Gaussian error structure and identity link, and the same covariates as in the severity analysis, to model the Ct value. The model was fitted using the brms (v. 2.14.4) R package (22). The presented model had no divergent transitions and effective sample sizes of over 200 for all parameters. The intercept of the model was given a t-distribution (location = 20, scale = 10, df = 3) prior, the fixed effect coefficients were given normal (mean = 0, standard deviation = 5) priors, and random effects and spline standard deviations were given exponential (mean = 5) priors.

### Hospital length of stay analysis

Hospital length of stay was compared for B.1.1.7 and non-B.1.1.7 lineage patients while controlling for age and sex using a Fine and Gray competing risks regression model fitted with the crr function in the cmprsk (v. 2.2-10) R package (24-25). Nosocomial infections were excluded.
In total, this analysis had 521 cases (B.1.1.7 = 187, non-B.1.1.7 = 334), of which 4 were censored; 352 patients were released from hospital and 165 died.

## Results

### Emergence of the B.1.1.7 lineage in Scotland

Between 01/11/2020 and 31/01/2021, 1863 samples from individuals tested in pillar 1 facilities underwent whole genome sequencing for SARS-CoV-2. Of these, 1475 (79%) could be linked to patient records and were included in the analysis. The contribution of patients infected with the B.1.1.7 variant increased over the course of the study, in line with dissemination across the UK during the study period (Figure 1a and 1b). Two peaks of SARS-CoV-2 infection have occurred in the UK to date: the first (wave 1) in March 2020 (13) and the second in summer 2020 (26), both in association with hundreds of importations following travel to Central Europe (27). The second peak incorporated two variant waves (waves 2 and 3), initially of B.1.177 (Figure 1c) and then B.1.1.7, radiating from the South of England (Figure 1e). This B.1.1.7 “takeover” (Figure 1d) corresponded to a five-fold increase in growth rate on an epidemiological scale relative to non-B.1.1.7 lineages (Figure 1f).

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2021/08/24/2021.08.17.21260128/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/F1) Introduction and growth of lineage B.1.1.7 in the UK. A) Waves of SARS-CoV-2 confirmed cases in the UK. B) Seven-day rolling average of daily PCR positive cases (orange) and total number of patients hospitalised (dark blue) with COVID-19 in Scotland during the study period.
Grey shaded area represents the period of lockdown beginning 26/12/2020. C) Variants in the UK. D) Proportion of cases by lineage in the clinical severity cohort. E) Variants in Scotland showing three distinct waves in winter and early spring 2020, summer 2020 and autumn/winter, attributed to the shifts from B1 and other variants (light blue) to B.1.177 (dark blue) and then B.1.1.7 (orange). Waves one and two closely mirror the broader UK situation as they are linked to introductions from both continental Europe and England. Wave three has a single origin in Kent, so Scotland lags behind England in numbers of cases. F) Estimates of growth rates of major lineages in Scotland from time-resolved phylogenies. Estimates were carried out on a subsample of the named lineages using sequences from Scotland only from November 2020-March 2021 using BEAST and an exponential growth effective population size model.

### Demographics of the clinical cohort

The age of the clinical cohort ranged from 0-105 years (mean 66.8 years) and was slightly lower in the B.1.1.7 group (65.6 years vs. 67.2 years). Overall, 59.1% were female; this preponderance occurred in both subgroups and was higher in the B.1.1.7 subgroup (60.4% vs 58.6%). In the full cohort, 3.0% were care home workers and 10.4% were NHS healthcare workers. Of those infected with the B.1.1.7 variant, 5.5% and 5.8% were care home and other healthcare workers respectively, compared with 2.2% and 12.0% of those infected with non-B.1.1.7 lineages. Care home residents made up 12.9% of the B.1.1.7 subgroup, compared with 21.7% of the non-B.1.1.7 subgroup. There was also a difference in the proportion of cases admitted to Intensive Care Units: 6.3% of the B.1.1.7 group compared with 3.4% for non-B.1.1.7. Full details of the demographic data of the cohort can be found in Table 1 and full lineage assignments can be found in Table S2.
[Table 1:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T1) Demographic characteristics of Scottish patients infected with SARS-CoV-2 by lineage

### Clinical severity analysis

Within the clinical severity cohort there were 364 B.1.1.7 infections, 1030 B.1.177 infections and 81 infections from 19 other lineages (Figure 2). Consistent with previous research comparing mortality and hospitalisation in SGTF detected by PCR versus absence of SGTF, we found that B.1.1.7 lineage viruses were associated with more severe disease on average than those from other lineages circulating during the same time period. In the full dataset, we observed a positive association with severity (median cumulative odds ratio: 1.40, 95% CI: 1.02, 1.93). In both the subset excluding care home patients and the subset limited to hospitalised patients, the mean estimate of the increase in severity of B.1.1.7 lineage viruses was smaller, and the variance in the posterior distribution higher, likely due to the smaller sample sizes. Given this uncertainty, we cannot determine whether the association of B.1.1.7 with severity in the populations corresponding to these subsets is the same as that in the population described by the full dataset, but in all cases, the most likely direction of the effect is positive. Model estimates from severity models from all subsets can be found in Tables S3-5.

Bernoulli models looking at each severity category individually suggested that, for our cohort, there was no evidence that B.1.1.7 was associated with increased mortality at 28 days (median odds ratio: 1.04; 95% central credible interval: 0.67, 1.59), but that infection with B.1.1.7 lineage viruses was associated with a moderate increase in the risk of requiring supplemental oxygen (median odds ratio: 1.77; 95% central credible interval: 1.12, 2.83). An individual model looking at high flow oxygen/ventilation could not be fitted due to the low numbers of events in some cells.
Estimates of the severity across the phylogeny are shown in Figure 3; see Supplementary Appendix 2 for more discussion of this analysis. An analysis including comorbidities, for the subset of patients where they were available, implied that the inclusion of comorbidities had no impact on the results obtained; see Supplementary Appendices 1 and 3.

![Figure 2:](https://www.medrxiv.org/content/medrxiv/early/2021/08/24/2021.08.17.21260128/F2.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/F2) Comparison of disease severity between B.1.1.7 and other lineages. Clinical severity was measured on a four-level ordinal scale based on the level of respiratory support received for 1454 patients stratified by age group: death; invasive or non-invasive ventilatory support including high flow nasal cannulae (I&V/NIV/HFNC); supplemental oxygen delivered by low flow mask devices or nasal cannulae; no respiratory support.

![Figure 3:](https://www.medrxiv.org/content/medrxiv/early/2021/08/24/2021.08.17.21260128/F3.medium.gif)

[Figure 3:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/F3) The estimated maximum likelihood phylogenetic tree and a measure of estimated severities of infection. Estimated severities for each viral isolate are means and 95% credible intervals of the linear predictor change under infection with that viral genotype from the phylogenetic random effect in the cumulative severity model under a Brownian motion model of evolution. This model constrains genetically identical isolates to have identical effects, so changes should be interpreted across the phylogeny rather than between closely related isolates, which necessarily have similar estimated severities. The dataset was downsampled to 100 random samples for this figure to aid readability. Figure was generated using ggtree (28).
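To make the cumulative odds ratio reported above more concrete, the following is a minimal Python sketch of how a cumulative logit model converts a linear predictor into probabilities for the four severity categories. It assumes the parameterisation P(Y ≤ k) = logistic(τ_k − η) described by Bürkner and Vuorre. The threshold values are hypothetical and chosen only for illustration (they are not estimates from this study); only the odds ratio of 1.40 is taken from the text.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, eta):
    """Cumulative logit model: P(Y <= k) = logistic(threshold_k - eta).

    Returns one probability per ordered category (len(thresholds) + 1).
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def odds_of_exceeding(probs, k):
    """Odds of an outcome more severe than category k (1-indexed)."""
    p_above = sum(probs[k:])
    return p_above / (1.0 - p_above)


# Hypothetical thresholds separating: no support | oxygen | ventilation | death.
THRESHOLDS = [0.5, 1.5, 2.0]
# Cumulative odds ratio for B.1.1.7 reported in the text.
CUMULATIVE_OR = 1.40

baseline = category_probabilities(THRESHOLDS, eta=0.0)
shifted = category_probabilities(THRESHOLDS, eta=math.log(CUMULATIVE_OR))

labels = ["no support", "oxygen", "ventilation", "death"]
for label, p0, p1 in zip(labels, baseline, shifted):
    print(f"{label:12s} baseline={p0:.3f} with_effect={p1:.3f}")

for k in (1, 2, 3):
    ratio = odds_of_exceeding(shifted, k) / odds_of_exceeding(baseline, k)
    print(f"odds ratio of exceeding category {k}: {ratio:.2f}")
```

Under this parameterisation the odds of an outcome more severe than any given category are multiplied by the same factor, while the absolute change in each category's probability depends on the thresholds.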
We also found that infection with B.1.1.7 lineage viruses was associated with lower Ct values than infection with non-B.1.1.7 lineage viruses (median Ct change: -2.46, 95% CI: -4.22, -0.70), as previously observed (8). Model estimates for all parameters can be found in Table S6.

We found no evidence that B.1.1.7 was associated with longer hospital stays after controlling for age and sex (HR: -0.02; 95% CI: -0.23, 0.20; p = 0.89).

## Discussion

In this prospective analysis of hospitalised and community patients with B.1.1.7 and non-B.1.1.7 lineage SARS-CoV-2 infection, carried out as B.1.1.7 became dominant in Scotland, we provide evidence of increased clinical severity associated with this variant. This was observed across all adult age groups, incorporating the spectrum of COVID-19 disease, from no requirement for supportive care, through supplemental oxygen requirement and the need for invasive or non-invasive ventilation, to death. This analysis is the first to assess the full clinical severity spectrum of B.1.1.7 infection in relation to other prevalent lineages circulating during the same time period.

Our study supports recent community testing analyses that have reported an increased 28-day mortality associated with SGTF as a proxy for B.1.1.7 status (8-10). A smaller study found no effect of the lineage on 28-day mortality (13), but we note that we would not have detected an effect in a population of the size used in Frampton et al. 2021, indicating that while there is evidence for an effect, it is not large enough to be observed in smaller detailed studies.

The association between higher viral load, higher transmission and lineage may reflect changes in the biology of the virus; for example, the B.1.1.7 asparagine (N) to tyrosine (Y) mutation at position 501 of the spike protein receptor binding domain (RBD) is associated with an increase in binding affinity to the human ACE2 receptor (29). In addition, a deletion at position 69–70 may increase virus infectivity (30).
The P681H mutation found at the furin cleavage site is associated with more efficient furin cleavage, enhancing cell entry (31). An alternative explanation for the higher viral loads observed in B.1.1.7 infection may be that clinical presentation occurs earlier in the illness. Further modelling, animal experiments and studies in healthy volunteers may help to unravel the mechanisms behind this phenomenon. Our data indicate an association between B.1.1.7 and an increased risk of requiring supplemental oxygen and ventilation; two factors that are critical determinants of healthcare capacity during a period of high incidence of SARS-CoV-2 infection. This means that countries where B.1.1.7 is not yet dominant, in particular those with weaker public health control of the virus, will need to factor the requirement for supportive treatment into models of clinical severity and pandemic response decision planning. In regions where B.1.1.7 is dominant it should be used as the comparison lineage for clinical severity analysis of emergent variants of concern, such as B.1.351 and B.1.617.2. There are some limitations to our study. Our dataset is drawn from first-line local NHS diagnostic (pillar 1) testing which over-represents patients presenting for hospital care (59%) while those sampled in the community represented 41% of the dataset. Further, the analysis dataset employed a non-standardised approach to sampling across the study period as sequencing was carried out both as systematic randomised national surveillance and sampling following outbreaks of interest. Finally, the cumulative model used in this analysis assumes a homogenous application of therapeutic intervention across the population. Despite these limitations, our results remain consistent with previous work on the mortality of Alpha, and this study provides new information regarding differences in infection severity. 
In summary, the B.1.1.7 lineage was found to be associated with a rapid increase in SARS-CoV-2 cases in Scotland and an increased risk of severe infection requiring supportive care. This has implications for planning for outbreaks in countries with low vaccine uptake where the B.1.1.7 lineage is not yet dominant. Our study has shown the value of the collection of higher resolution patient outcome data linked to genetic sequences when looking for clinically relevant differences between viral variants.

## Data Availability

Due to the analysis of patient identifiable data, please contact the authors for data requests.

## Supplementary Appendix

[Table S1:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T2) Characteristic mutations of B.1.1.7

[Table S2:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T3) Full lineage characterisation of clinical severity dataset

[Table S3:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T4) Parameter estimates (on the linear predictor scale) from the severity model from the full dataset

[Table S4:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T5) Parameter estimates (on the linear predictor scale) from the severity model from the data subset excluding patients in nursing homes

[Table S5:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T6) Parameter estimates (on the linear predictor scale) from the severity model from the data subset only including hospitalised patients

[Table S6:](http://medrxiv.org/content/early/2021/08/24/2021.08.17.21260128/T7) Parameter estimates from the Ct value model

## Acknowledgements

We would like to thank all NHS staff that looked after patients during the COVID-19 pandemic in Scotland. The authors would like to acknowledge that this work uses data provided by patients and collected by the National Health Service (NHS) as part of their care and support. The authors would also like to acknowledge the work of the West of Scotland Safe Haven team in supporting extractions and linkage to de-identified NHS patient datasets.

## Appendix 1

### Further methods

Four-level severity data were analysed using cumulative (per the definition of Bürkner and Vuorre (2019)) generalised additive mixed models (GAMMs) with logit links, following Volz et al (2020) (18,20). We analysed three subsets of the data: 1. the full dataset, 2. the dataset excluding care home patients, and 3.
exclusively the hospitalised population. These GAMMs included B.1.1.7 status and patient sex as fixed effects, with county and partial postcode included as random effects. We included patient age and the days since the first diagnosis in the dataset as non-linear penalised regression splines. The k parameter of the penalised regression splines was set to the maximum possible value in each case, with the intention that regularisation occur through the prior.

The full dataset was additionally analysed using a phylogenetic cumulative generalised additive mixed model (PGAMM). The PGAMM was a modification of the GAMMs described above, where instead of including B.1.1.7 status as a fixed effect, we included a random effect of phylogenetic relationship between viral isolates (using a variance-covariance matrix calculated from the virus phylogeny under a Brownian motion assumption using the vcv.phylo function in ape (v. 5.5) (21)). All severity models were fitted using the brms (v. 2.14.4) R package (22). All presented models had no divergent transitions and effective sample sizes of over 200 for all parameters. Additionally, we fitted Bernoulli models with the same covariate set as the cumulative model for supplemental oxygen and mortality individually (an individual model for high flow oxygen/ventilation was attempted but could not be fitted due to the low numbers of events in some cells).

Comorbidities were only available for patients from the Greater Glasgow and Clyde health board (n = 639). Comorbidities used were those previously identified as important for COVID-19 severity by the ISARIC4C consortium (23). To test whether the lack of comorbidity data for the rest of the sample was leading to biased estimates of the impact of B.1.1.7 lineage infection, we performed three analyses on the Greater Glasgow and Clyde patient population. We fitted the above model with the number of comorbidities a patient exhibited included as a non-linear penalised regression spline.
While the exact form of the relationship between severity of infection and the number of comorbidities a patient exhibits is unknown, we would expect the relationship to be monotonically increasing; however, for mathematical simplicity, we do not enforce this constraint on the spline. We also fitted the model to this patient population without the comorbidities included and with the comorbidities permuted in order to estimate the change in the estimate of the B.1.1.7 effect caused by the inclusion of comorbidities. As the inclusion of comorbidities was found not to change the estimated effect of B.1.1.7, this analysis is presented in Supplementary Appendix 3.

Priors were defined over classes of parameters. Priors were designed to be informative for the scale of the parameters, but not for the precise values. The same classes received the same priors in each model. The intercepts of the models were given t-distribution (location = 0, scale = 2.5, df = 3) priors, fixed effects were given normal (mean = 0, standard deviation = 2.5) priors, and random effects and spline standard deviations were given exponential (mean = 2.5) priors.

## Appendix 2

### Phylogenetic severity model

The estimates of the severity per isolate shown in Figure 3 were generated by a model making several assumptions, which were violated. The key assumptions used and their impacts will be discussed in this appendix (see (1) for deeper discussion of some of the issues involved). Despite the violation of the assumptions, the answer generated was consistent with the non-phylogenetic method and the output is illustrative, so the results are included in the main text, though not stressed.

The first major assumption is that the source phylogeny is known without error. This can be practically broken into two assumptions. Firstly, that tree-like evolution is the correct description of the underlying evolutionary process, i.e. that horizontal gene transfer is unimportant.
This appears to be a relatively safe assumption in SARS-CoV-2. Secondly, that the phylogenetic tree is correctly estimated. This is likely to be violated as there may be error in both the discrete branching structure (or topology) and real-valued branch lengths. While the topology may be correctly estimated, the probability of estimating all the branch lengths correctly is vanishingly small. This is unlikely to be a large practical issue, however, as small errors in the branch lengths of the phylogeny are unlikely to have large impacts relative to other model misspecification issues present in all statistical analyses.

If we are willing to assume that the estimated phylogeny is good enough for our purposes, we then must assume some model of the evolution of the trait of interest across that phylogeny. This model of the change in the trait (severity) across the phylogeny is what allows the conversion of the phylogenetic tree into a variance-covariance matrix. This matrix describes the expected covariances (rescaled to correlations) between the severities associated with infection with different genetic variants. Here we made a common simple choice and assumed Brownian motion evolution of the trait across the phylogeny. However, this model has been acknowledged as often suboptimal since its inception (1), and we can consider it particularly so here. The number of observed changes across SARS-CoV-2 genomes is relatively small, and the number of amino acid changes even smaller, with some mutations occurring repeatedly in different lineages. Few mutations combined with semi-frequent homoplasy represent a particularly problematic case for this model, as severity would be expected to change discretely with mutations and in consistent directions when convergent changes occur (in the absence of extreme epistatic effects on severity), two things that simple Brownian motion does not allow.
Theoretically, model extensions using Lévy processes may allow discrete jumps in trait value along a phylogenetic tree; however, implementing such a model was beyond the scope of this study. Future work will explore more realistic evolutionary models for change in severity with genomes, which will reduce the error potentially imposed by this assumption.

## Appendix 3

### Comorbidities

In the Greater Glasgow and Clyde population for which comorbidity data was available, the model without inclusion of comorbidities estimated the odds ratio for the impact of B.1.1.7 on severity as 1.06 (95% CI: 0.70, 1.58). When the number of relevant comorbidities a patient had was included but permuted, so as to break any relationship with the response, a similar odds ratio was estimated (1.06; 95% CI: 0.70, 1.60). The inclusion of the number of relevant comorbidities a patient exhibited did not substantially change this result (odds ratio for impact of B.1.1.7 lineage viruses: 1.13; 95% CI: 0.73, 1.72). This is not unexpected, as the distribution of comorbidities was similar between those patients infected with B.1.1.7 lineage viruses and those infected with non-B.1.1.7 lineage viruses.
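The permutation check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the toy records and the helper `permute_covariate` are hypothetical, and the model refitting step is omitted. It shows only that shuffling comorbidity counts across patients preserves their marginal distribution while breaking their pairing with each patient's outcome.

```python
import math
import random


def permute_covariate(values, seed=0):
    """Return a shuffled copy of a covariate column; the input is left untouched."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def pearson(x, y):
    """Pearson correlation, used here only as a quick check of association."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: severity on the 4-point scale and comorbidity counts.
severity = [1, 1, 2, 2, 3, 3, 4, 4]
comorbidities = [0, 1, 1, 2, 2, 3, 3, 4]

permuted = permute_covariate(comorbidities, seed=42)

# The set of values is unchanged; only the pairing with the outcome differs.
print("same values:", sorted(permuted) == sorted(comorbidities))
print("correlation, original pairing:", round(pearson(severity, comorbidities), 3))
print("correlation, permuted pairing:", round(pearson(severity, permuted), 3))
```

In the analysis described here, the model would then be refitted with the permuted column in place of the observed one, and the B.1.1.7 estimate compared across the three fits.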
## Footnotes

* david.pascall{at}mrc-bsu.cam.ac.uk
* Guy.Mollett{at}ggc.scot.nhs.uk
* Rachel.Blacow{at}ggc.scot.nhs.uk
* naomi.bulteel2{at}nhs.scot
* robyn.campbell{at}nhslothian.scot.nhs.uk
* alasdair.campbell5{at}nhslothian.scot.nhs.uk
* sarah.clifford{at}nhslothian.scot.nhs.uk
* chris.davis{at}glasgow.ac.uk
* ana.filipe{at}glasgow.ac.uk
* Ludmila.Fjodorova2{at}nhs.scot
* ruth.forrest2{at}nhs.scot
* Emily.Goldstein{at}ggc.scot.nhs.uk
* Rory.Gunson{at}ggc.scot.nhs.uk
* John.Haughney{at}ggc.scot.nhs.uk
* matt.holden{at}nhs.scot
* patrick.honour3{at}borders.scot.nhs.uk
* joseph.hughes{at}glasgow.ac.uk
* edward.james{at}borders.scot.nhs.uk
* timothy.lewis{at}nhslothian.scot.nhs.uk
* samantha.lycett{at}ed.ac.uk
* martin.mchugh{at}nhslothian.scot.nhs.uk
* yusuke.onishi{at}nhs.scot
* ben b.j.parcell{at}dundee.ac.uk
* david.l.robertson{at}glasgow.ac.uk
* noha.elsakka{at}nhs.scot
* Sharif.Shaaban2{at}phs.scot
* James.Shepherd.2{at}glasgow.ac.uk
* katherine.smollett{at}glasgow.ac.uk
* Kate.Templeton{at}nhslothian.scot.nhs.uk
* Elen.Vink{at}glasgow.ac.uk
* elizabeth.wastnedge{at}nhslothian.scot.nhs.uk
* thomaschristiewilliams{at}gmail.com
* Received August 17, 2021.
* Revision received August 17, 2021.
* Accepted August 24, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/)

## References

1. Global Report B.1.1.7. PANGO Lineages 2021; June 30 published online ([https://cov-lineages.org/global_report_B.1.1.7.html](https://cov-lineages.org/global_report_B.1.1.7.html))
2. Investigation of SARS-CoV-2 variants of concern in England, Technical briefing 6. London, UK. Public Health England (PHE), 13 February 2021 ([https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/961299/Variants_of_Concern_VOC_Technical_Briefing_6_England-1.pdf](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/961299/Variants_of_Concern_VOC_Technical_Briefing_6_England-1.pdf))
3. Rees-Spear C, Muir L, Griffith SA, et al. The effect of spike mutations on SARS-CoV-2 neutralisation. Cell Reports 2021;34(12):108890.
4. Shen X, Tang H, McDanal C, et al. SARS-CoV-2 variant B.1.1.7 is susceptible to neutralizing antibodies elicited by ancestral spike vaccines. Cell Host & Microbe 2021;29(4):529–539.
5. Wang Z, Schmidt F, Weisblum Y, et al. mRNA vaccine-elicited antibodies to SARS-CoV-2 and circulating variants. Nature 2021;592:616–622.
6. Thorne LG, Bouhaddou M, Reuschl AK, et al. Evolution of enhanced innate immune evasion by the SARS-CoV-2 B.1.1.7 UK variant. bioRxiv June 7 2021:2021.06.06.446826.
7. Volz E, Mishra S, Chand M, et al. Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England. Nature 2021;(ePub ahead of print)
8. Davies NG, Jarvis CI, CMMID COVID-19 Working Group, et al. Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7. Nature 2021;(ePub ahead of print)
9. Challen R, Brooks-Pollock E, Read J, Dyson L, Tsaneva-Atanasova K, Danon L. Risk of mortality in patients infected with SARS-CoV-2 variant of concern 202012/1: matched cohort study. BMJ 2021;372:n579.
10. Grint DJ, Wing K, Williamson E, et al. Case fatality risk of the SARS-CoV-2 variant of concern B.1.1.7 in England, 16 November to 5 February. Euro Surveill. 2021;26(11):2100256.
11. Nyberg T, Twohig KA, Harris RJ, et al. Risk of hospital admission for patients with SARS-CoV-2 variant B.1.1.7: cohort analysis. BMJ 2021;373:n1412.
12. Dabrera G, Allen H, Zaidi A, et al. Assessment of Mortality and Hospital Admissions Associated with Confirmed Infection with SARS-CoV-2 Variant of Concern VOC-202012/01 (B.1.1.7) a Matched Cohort and Time-to-Event Analysis. SSRN. ([https://ssrn.com/abstract=3802578](https://ssrn.com/abstract=3802578))
13. Frampton D, Rampling T, Cross A, et al. Genomic characteristics and clinical effect of the emergent SARS-CoV-2 B.1.1.7 lineage in London, UK: a whole-genome sequencing and hospital-based cohort study. Lancet Infectious Diseases 2021;(ePub ahead of print)
14. Madhi SA, Baillie V, Cutland CL, et al. Efficacy of the ChAdOx1 nCoV-19 Covid-19 Vaccine against the B.1.351 Variant. N Eng J Med 2021;(ePub ahead of print)
15. da Silva Filipe A, Shepherd JG, Williams T, et al. Genomic epidemiology reveals multiple introductions of SARS-CoV-2 from mainland Europe into Scotland. Nat Microbiol 2021;6:112–122.
16. The COVID-19 Genomics UK (COG-UK) consortium. An integrated national scale SARS-CoV-2 genomic surveillance network. Lancet Microbe 2020;1(3):e99–e100.
17. Data Pipeline. COG-UK Consortium March 18 2021. ([https://githubmemory.com/repo/COG-UK/grapevine_nextflow#pipeline-overview](https://githubmemory.com/repo/COG-UK/grapevine_nextflow#pipeline-overview))
18. Volz E, Hill V, McCrone JT, et al. Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity. Cell 2021;184(1):64-75.e11.
19. Thomson EC, Rosen LE, Shepherd JG, et al. Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity. Cell 2021;184(5):1171-1187.e20.
20. Bürkner P-C, Vuorre M. Ordinal Regression Models in Psychology: A Tutorial. Advances in Methods and Practices in Psychological Science 2019;2(1):77–101.
21. Paradis E, Schliep K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 2019;35(3):526–528.
22. Bürkner P-C. brms: An R Package for Bayesian Multilevel Models using Stan. Journal of Statistical Software 2017;80:1–28.
23. Knight SR, Ho A, Buchan I, et al. Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score. BMJ 2020;370:m3339.
24. Fine JP, Gray RJ. A Proportional Hazards Model for the Subdistribution of a Competing Risk. Journal of the American Statistical Association 1999;94(446):496–509.
25. Gray B. cmprsk: Subdistribution Analysis of Competing Risks. R Project 2020 ([https://CRAN.R-project.org/package=cmprsk](https://CRAN.R-project.org/package=cmprsk))
26. Hodcroft EB, Zuber M, Nadeau S, et al. Spread of a SARS-CoV-2 variant through Europe in the summer of 2020. Nature 2021. ([https://doi.org/10.1038/s41586-021-03677-y](https://doi.org/10.1038/s41586-021-03677-y))
27. Lycett SA, Hughes J, McHugh MP, et al. Epidemic waves of COVID-19 in Scotland: a genomic perspective on the impact of the introduction and relaxation of lockdown on SARS-CoV-2. medRxiv Jul 01 2021:2021.01.08.20248677.
28. Yu G, Smith DK, Zhu H, et al. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2016;8(1):28–36.
29. Ali F, Kasry A, Amin M. The new SARS-CoV-2 strain shows a stronger binding affinity to ACE2 due to N501Y mutant. Medicine in Drug Discovery 2021;10:100086.
30. Kemp SA, Collier DA, Datir RP, et al. SARS-CoV-2 evolution during treatment of chronic infection. Nature 2021;592:277–282.
31. Brown JC, Goldhill DH, Zhou J, et al. Increased transmission of SARS-CoV-2 lineage B.1.1.7 (VOC 2020212/01) is not accounted for by a replicative advantage in primary airway cells or antibody escape. bioRxiv Jun 22 2021:2021.02.24.432576.