Abstract
Objectives The SARS-CoV-2 Alpha variant was associated with increased transmission relative to other variants present at the time of its emergence and several studies have shown an association between Alpha variant infection and increased hospitalisation and 28-day mortality. However, none have addressed the impact on maximum severity of illness in the general population classified by the level of respiratory support required, or death. We aimed to do this.
Methods In this retrospective multi-centre clinical cohort sub-study of the COG-UK consortium, 1475 samples from Scottish hospitalised and community cases collected between 1st November 2020 and 30th January 2021 were sequenced. We matched sequence data to clinical outcomes as the variant became dominant in Scotland and modelled the association between Alpha variant infection and severe disease using a 4-point scale of maximum severity by 28 days: 1. no respiratory support, 2. supplemental oxygen, 3. ventilation and 4. death.
Results Our cumulative generalised linear mixed model analyses found evidence (cumulative odds ratio: 1.40, 95% CI: 1.02, 1.93) of a positive association between increased clinical severity and lineage (Alpha variant versus non-Alpha variant).
Conclusions The Alpha variant was associated with more severe clinical disease in the Scottish population than co-circulating lineages.
Introduction
The Alpha variant of SARS-CoV-2 (Pango lineage B.1.1.7) was first identified in the UK in September 2020 and has since been reported in 184 countries (1). It is defined by 21 genomic mutations or deletions, including 8 characteristic changes within the spike gene (Table S1) (2). These are associated with increased ACE-2 receptor binding affinity and innate and adaptive immune evasion (3-6) compared to preceding lineages. The Alpha variant, the first variant of concern (VOC), was estimated to be 50-100% more transmissible than others present at the time of its emergence (7), explaining the transient dominance of this lineage globally.
The presence of a spike gene deletion (Δ69-70) results in spike-gene target failure (SGTF) in real-time reverse transcriptase polymerase chain reaction (RT-PCR) diagnostic assays and provided a useful proxy for the presence of the Alpha variant for epidemiological analysis during this time period (2). Four large community analyses have shown a positive association between the presence of SGTF and 28-day mortality, with hazard ratios of 1.55 (CI 1.39-1.72), 1.64 (CI 1.32-2.04), 1.67 (CI 1.34-2.09) and 1.73 (CI 1.41-2.13) (8-10). Both SGTF (hazard ratios of 1.52 (CI 1.47-1.57), 1.62 (CI 1.48 - 1.78)) (11,12) and confirmed Alpha variant infection (hazard ratios of 1.34 (CI 1.07-1.66) and 1.61 (CI 1.28 - 2.03) (12,13) were associated with an increased risk of hospitalisation in community cases, and a smaller study of hospitalised patients found a greater risk of hypoxia at admission in those with confirmed Alpha variant infection (14). In contrast, other smaller analyses of hospitalised patients found no association between confirmed Alpha variant infection and increased clinical severity based on a variety of indices (15-17). Limited data are available on the full clinical course of disease with the Alpha variant in relation to other variants.
Understanding the clinical pattern of disease with new variants of concern is important for several reasons. Firstly, if a variant is more pathogenic than previous variants, this has implications for public health restrictions and the functioning of health care systems. Secondly, large numbers of low- and middle-income countries still have less than 50% of their populations having been vaccinated against SARS-CoV-2 (18). A better understanding of a variant with increased severity is important in modelling the impact of unmitigated infection in these settings. A clear understanding of the behaviour of the Alpha variant, which emerged as a dominant variant, is needed as a baseline to compare the clinical phenotype of later variants, such as the currently dominant Omicron sub-variant BA.5. Post-Alpha variants have been shown to be able to evade vaccine-induced immunity and therefore have the potential to spread even in immunised populations (19), so a historical understanding of severity remains important, as it seems unlikely that the SARS-CoV-2 infections will be brought under control in the near future.
We aimed to quantify the clinical features and rate of spread of Alpha variant infections in Scotland in a comprehensive national dataset. We used whole genome sequencing data to analyse patient presentations between 1st November 2020 and 30th January 2021 as the variant emerged in Scotland and used cumulative generalised additive models to compare 28-day maximum clinical severity for the Alpha variant against other lineages over the same period.
Materials and Methods
Sequencing
sequencing was performed as part of the COG-UK consortium using amplicon-based next generation sequencing (20,21).
Bioinformatics
sequence alignment, lineage assignment and tree generation were performed using the COG-UK data pipeline (https://github.com/COG-UK/datapipe) and phylogenetic pipeline (https://github.com/cov-ert/phylopipe) with pangolin lineage assignment (https://github.com/cov-lineages/pangolin) (22). Lineage assignments were performed on 18/03/2021 and phylogenetic analysis was performed using the COG-UK tree generated on 25/02/2021. Estimates of growth rates of major lineages in Scotland were calculated from time-resolved phylogenies for lineages B.1.1.7 (Alpha), B.177 and the sub-clades B.177.5, B.177.8, and another minor B.177 sub-clade (W.4). The estimates were carried out utilising sequences from November 2020 – March 2021 in BEAST (Bayesian Evolutionary Analysis by Sampling Trees) with an exponential growth rate population model, strict molecular clock model and TN93 with four gamma rate distribution categories. Each lineage was randomly subsampled to a maximum of 5 sequences per epiweek (resulting in 52 to 103 sequences per subsample, depending on the lineage), and 10 subsamples replicates analysed per lineage in a joint exponential growth rate population model.
Clinical data
we included all Scottish COG-UK pillar 1 samples sequenced at the MRC-University of Glasgow Centre for Virus Research (CVR) and the Royal Infirmary of Edinburgh (RIE) between 1st November 2020 and 30th January 2021. These samples derived from both hospitalised patients (59%) and community testing (41%).
Core demographic data (age, sex, partial postcode) were collected via linkage to electronic patient records and a full retrospective review of case notes was undertaken. Collected data included residence in a care home; occupation in care home or healthcare setting; admission to hospital; date of admission, discharge and/or death and maximum clinical severity at 28 days sample collection date via a 4-point ordinal scale (1. No respiratory support; 2. Supplemental oxygen; 3. Invasive ventilation, non-invasive ventilation or high-flow nasal canula; 4. Death) as previously used in Volz et al 2020 and Thomson et al 2021 (23,24).
Where available, PCR (Polymerase Chain Reaction) cycle threshold (Ct) and the PCR testing platform were recorded. Nosocomial COVID-19 was defined as a first positive PCR occurring greater than 48 hours following admission to hospital. Discharge status was followed up until 15th April 2021 for the hospital stay analysis. For the co-morbidity subanalysis, delegated research ethics approval was granted for linkage to National Health Service (NHS) patient data by the Local Privacy and Advisory Committee at NHS Greater Glasgow and Clyde. Cohorts and de-identified linked data were prepared by the West of Scotland Safe Haven at NHS Greater Glasgow and Clyde.
Severity analyses
four level severity data was analysed using cumulative (per the definition of Bürkner and Vuorre (2019)) generalised additive mixed models (GAMMs) with logit links, specifically, following Volz et al (2020) (23,25). We analysed three subsets of the data: 1. the full dataset, 2. the dataset excluding care home patients, and 3. exclusively the hospitalised population. Further details regarding these analyses are provided in Supplementary Appendix 1.
Ct analysis
Ct value was compared between Alpha variant and non-Alpha variant infections for those patients where the TaqPath assay (Applied Biosystems) was used. This platform was used exclusively for this analysis because different platforms output systematically different Ct values, and this was the most frequently used in our dataset (n = 154, Alpha = 38, non-Alpha = 116). We used a generalised additive model with a Gaussian error structure and identity link, and the same covariates used as in the severity analysis to model the Ct value. The model was fitted using the brms (v. 2.14.4) R package (26). The presented model had no divergent transitions and effective sample sizes of over 200 for all parameters. The intercept of the model was given a t-distribution (location = 20, scale = 10, df = 3) prior, the fixed effect coefficients were given normal (mean = 0, standard deviation = 5) priors, random effects and spline standard deviations were given exponential (mean = 5) priors.
Hospital length of stay analysis
hospital length of stay was compared for Alpha variant and non-Alpha variant patients while controlling for age and sex using a Fine and Gray model competing risks regression using the crr function in the cmprsk (v. 2.2-10) R package (27,28). Nosocomial infections were excluded. In total, this analysis had 521 cases (Alpha = 187, non-Alpha = 334), of which 4 were censored; 352 patients were discharged from hospital and 165 died.
Results
Emergence of the Alpha variant in Scotland
Between 01/11/2020 and 31/01/2021 1863 samples from individuals tested in pillar 1 facilities in Scotland underwent whole genome sequencing for SARS-CoV-2. Of these, 1475 (79%) could be linked to patient records and were included in the analysis. The contribution of patients infected with the Alpha variant increased over the course of the study, in line with dissemination across the UK during the study period (Figure 1a and 1b). At the time of the data used in this analysis two peaks of SARS-CoV-2 infection had occurred in the UK: the first (wave 1) in March 2020 (15) and the second in summer 2020 (29), both in association with hundreds of importations following travel to Central Europe (30). The second peak incorporated two variant waves (waves 2 and 3), initially of B.1.177 (Figure 1c) and then B.1.1.7/Alpha, radiating from the South of England (Figure 1e). This Alpha variant “takeover” (Figure 1d) corresponded to a five-fold increase in growth rate on an epidemiological scale relative to non-Alpha lineages (Figure 1f).
Demographics of the clinical cohort
The age of the clinical cohort ranged from 0-105 years, (mean 66.8 years) and was slightly lower in the Alpha group (65.6 years vs. 67.2 years). Overall, 59.1% were female; this preponderance occurred in both subgroups and was higher in the Alpha subgroup (60.4% vs 58.6%). In the full cohort, 3.0% were care home workers and 10.4% were NHS healthcare workers. 5.5% and 5.8% of those infected with the Alpha variant were care home and other healthcare workers respectively, compared with 2.2% and 12.0% of those infected with non-Alpha lineages. 12.9% of those in the Alpha subgroup were care home residents, compared with 21.7% in non-Alpha. There was also a difference in the proportion of cases admitted to Intensive Care Units: 6.3% of the Alpha group compared with 3.4% for non-Alpha. Full details of the demographic data of the cohort can be found in Table 1 and full lineage assignments can be found in Table S2.
Clinical severity analysis
Within the clinical severity cohort there were 364 Alpha, 1030 B.1.177 and 81 of 19 other lineage infections (Figure 2). Consistent with previous research comparing mortality and hospitalisation in SGTF detected by PCR versus absence of SGTF, we found that Alpha variant viruses were associated with more severe disease on average than those from other lineages circulating during the same time period. In the full dataset, we observed a positive association with severity (posterior median cumulative odds ratio: 1.40, 95% CI: 1.02-1.93). In both the subsets excluding care home patients, or limiting to hospitalised patients, the mean estimate of the increase in severity of the Alpha variant was smaller, and the variance in the posterior distribution higher likely due to the smaller sample sizes. Given this uncertainty, we cannot determine whether the association of the Alpha variant with severity in the populations corresponding to these subsets is the same as that in the population described by the full dataset, but in all cases, the most likely direction of the effect is positive. Model estimates from severity models from all subsets can be found in Tables S3-5.
Bernoulli models looking at sequential severity categories provided weak evidence that the proportional odds assumption of the cumulative logistic model was violated. The odds ratios for the no oxygen versus low flow oxygen, and low flow oxygen versus were similar to those estimated under the cumulative model (posterior median odds ratio for no oxygen versus low flow oxygen: 1.77, CI: 1.12-2.80; posterior median odds ratio for low flow oxygen versus high flow oxygen: 1.26, CI: 0.43-3.67) but with correspondingly higher posterior variances given the smaller sample size. The odds ratios for the high flow oxygen versus death model suggested that the preponderance of evidence was in favour of Alpha infection associated with lower risk of death, conditional on having received high flow oxygen (posterior median odds ratio: 0.64, CI: 0.22-1.90). However, the credible intervals here are wide, given the sample size and do include the estimated global effect. A similar but more extreme effect was observed for the effect of biological sex, with male sex being associated with worse outcomes for the first two sequential category models (posterior median odds ratio for no oxygen versus low flow oxygen: 1.32, CI: 0.96-1.80; posterior median odds ratio for low flow oxygen versus high flow oxygen: 3.10, CI: 1.37-7.08), but better outcomes for the last (posterior odds ratio: 0.62, CI: 0.19-099). Given other research on the topic has consistently identified male sex as a risk factor, this potentially indicates the existence of an important unmeasured confounder only relevant for those who receive high flow oxygen.
Estimates of the severity across the phylogeny are visible in Figure 3, see Supplementary Appendix 2 for more discussion of this analysis. An analysis including comorbidities for the subset of patients where they were available implied that the inclusion of comorbidities had no impact on the results obtained, see Supplementary Appendices 1 and 3.
We also found that the Alpha variant was associated with lower Ct values than infection with non-Alpha variants (posterior median Ct change: -2.46, 95% CI: (−4.22)-(−0.70)) as previously observed (8). Model estimates for all parameters can be found in Table S6.
We found no evidence that the Alpha variant was associated with longer hospital stays after controlling for age and sex (HR: -0.02; 95% CI: (−0.23)-0.20; p = 0.89).
Discussion
In this analysis of hospitalised and community patients with Alpha variant and non-Alpha variant SARS-CoV-2 infection, carried out as the Alpha variant became dominant in Scotland, we provide evidence of increased clinical severity associated with this variant. This was observed across all adult age groups, incorporating the spectrum of COVID-19 disease; from no requirement for supportive care to supplemental oxygen requirement, the need for invasive or non-invasive ventilation, and to death. This analysis is the first to assess the full clinical severity spectrum of confirmed Alpha variant infection in both community and hospitalised cases in relation to other prevalent lineages circulating during the same time period.
Our study supports the community testing analyses that have reported an increased 28-day mortality associated with SGTF as a proxy for Alpha variant status (8-10). Smaller studies found no effect of lineage on various measures of severity (15-17), but these were studies of patients already admitted to hospital and therefore would not pick up the granular detail of increasing disease severity resulting in a need for increasing levels of respiratory support and consequently admission to hospital.
The association between higher viral load, higher transmission and lineage may reflect changes in the biology of the virus; for example, the Alpha variant asparagine (N) to tyrosine (Y) mutation at position 501 of the spike protein receptor binding domain (RBD) is associated with an increase in binding affinity to the human ACE2 receptor (32). In addition, a deletion at position 69–70 may increase virus infectivity (33). The P681H mutation found at the furin cleavage site is associated with more efficient furin cleavage, enhancing cell entry (34). An alternative explanation for the higher viral loads observed in Alpha variant infection may be that clinical presentation occurs earlier in the illness. Further modelling, animal experiments and studies in healthy volunteers may help to unravel the mechanisms behind this phenomenon.
Our data indicate an association between the Alpha variant and an increased risk of requiring supplemental oxygen and ventilation; two factors that are critical determinants of healthcare capacity during a period of high incidence of SARS-CoV-2 infection. This illustrates the importance of countries, in particular those with weaker public health control of the virus, factoring the requirement for supportive treatment into models of clinical severity and pandemic response decision planning for future SARS-CoV-2 variants of concern. This granular analysis of disease severity based on genomic confirmation of diagnosis should be used as a baseline study for clinical severity analysis of the inevitable future variants of concern.
There are some limitations to our study. Our dataset is drawn from first-line local NHS diagnostic (Pillar 1) testing which over-represents patients presenting for hospital care (59%) while those sampled in the community represented 41% of the dataset. Further, the analysis dataset employed a non-standardised approach to sampling across the study period as sequencing was carried out both as systematic randomised national surveillance and sampling following outbreaks of interest. Finally, the cumulative model used in this analysis assumes a homogenous application of therapeutic intervention across the population. Despite these limitations, our results remain consistent with previous work on the mortality of Alpha, and this study provides new information regarding differences in infection severity.
In summary, the Alpha variant was found to be associated with a rapid increase in SARS-CoV-2 cases in Scotland and an increased risk of severe infection requiring supportive care. This has implications for planning for future variant driven waves of infection, especially in countries with low vaccine uptake or if variants evolve with significant vaccine-escape. Our study has shown the value of the collection of higher resolution patient outcome data linked to genetic sequences when looking for clinically relevant differences between viral variants.
Data Availability
Due to the analysis of patient identifiable data, please contact the authors for data requests.
Declaration of interest
none
Funding
COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) and Genome Research Limited, operating as the Wellcome Sanger Institute. Funding was also provided by UKRI through the JUNIPER consortium (grant number MR/V038613/1). Sequencing and bioinformatics support was funded by the Medical Research Council (MRC) core award (MC UU 1201412).
Appendix 2 – Phylogenetic severity model
The estimates of the severity per isolate shown in Figure 3 were generated by a model making several assumptions, which were violated. The key assumptions used, and their impacts will be discussed in this appendix (see 1 for deeper discussion of some the issues involved). Despite the violation of the assumptions, the answer generated was consistent with the non-phylogenetic method and the output is illustrative, so the results are included in the main text, though not stressed.
The first major assumption is that the source phylogeny is known without error. This can be practically broken into two assumptions. Firstly, that tree-like evolution is the correct description of the underlying evolutionary process, i.e., that horizonal gene transfer is unimportant. This is a relatively safe assumption in SARS-CoV-2. Secondly, that the phylogenetic tree is correctly estimated. This is likely to be violated as there may be error in both the discrete branching structure (or topology) and real-valued branch lengths. While the topology may be correctly estimated, the probability of estimating all the branch lengths correctly is vanishingly small. This is unlikely to be a large practical issue however, as small errors in the branch lengths of the phylogeny are unlikely to have large impacts relative to other model misspecification issues present in all statistical analyses.
If we are willing to assume that the estimated phylogeny is good enough for our purposes, we then must assume some model of the evolution of the trait of interest across that phylogeny. This model of the change in the trait (in this case, severity) across the phylogeny is what allows the conversion of the phylogenetic tree into a variance-covariance matrix. This describes the expected covariances (rescaled to correlations) between the severities associated with infection with different genetic variants. Here we made a common simple choice and assumed Brownian motion evolution of the trait across the phylogeny. However, this model has been acknowledged as often suboptimal since its inception (1), and we can consider it particularly so here. The number of observed changes across SARS-CoV-2 genomes are relatively few, and the number of amino acid changes even fewer, with some mutations occurring repeatedly in different lineages. Few mutations with combined with semi-frequent homoplasy represent a particularly problematic case for this model, as severity would be expected to change discretely with mutations and in consistent directions when convergent changes occur (in the absence of extreme epistatic effects on severity), two things that simple Brownian motion does not allow. Future work will explore more realistic evolutionary models for change in severity with genomes, such as matching criteria, which will reduce the error potentially imposed by this assumption.
Appendix 3 - Comorbidities
In the Greater Glasgow and Clyde population for which comorbidity data was available, the model without inclusion of comorbidities estimated the odds ratio for the impact of the Alpha variant on severity as 1.06 (95% CI: 0.70, 1.58). When number of relevant comorbidities a patient had were included but permuted, to break any relationship with the response, a similar odds ratio was estimated (1.06: 95% CI: 0.70, 1.60). The inclusion of the number of relevant comorbidities a patient exhibited did not substantially change this result (odds ratio for impact of the Alpha variant: 1.13; 95% CI: 0.73, 1.72). This is not unexpected, as the distribution of comorbidities was similar between those patients infected with the Alpha variant and those infected with non-Alpha lineage viruses.
Acknowledgements
We would like to thank all NHS staff that looked after patients during the COVID-19 pandemic in Scotland. The authors would like to acknowledge that this work uses data provided by patients and collected by the National Health Service (NHS) as part of their care and support. The authors would also like to acknowledge the work of the West of Scotland Safe Haven team in supporting extractions and linkage to de-identified NHS patient datasets.
Appendix
Supplementary Appendix
Appendix 1 – Further methods
Following Volz et al (2020), we modelled the ordinal severity data using cumulative generalised additive mixed models (per the definition of (per the definition of Bürkner and Vuorre (2019)) (18,20). We analysed three subsets of the data: 1. the full dataset, 2. the dataset excluding care home patients, and 3. exclusively the hospitalised population. The impact of Alpha variant infection and patient sex were modelled with fixed effects. County and partial postcode were modelled as random effects. Patient age and the days since the first diagnosis in the dataset were modelled using non-linear penalised regression splines with the k parameter set to its maximum value. The full dataset was additionally analysed using a phylogenetic cumulative generalised additive mixed model (PGAMM). The PGAMM was a modification of the GAMMs described above, where instead of including Alpha variant status as a fixed effect, we included a random effect of phylogenetic relationship between viral isolates (using a variance-covariance matrix calculated from the virus phylogeny under a Brownian motion assumption using the vcv.phylo function in ape (v. 5.5) (1)). All severity models were fitted using the brms (v. 2.14.4) R package (2). All presented models had no divergent transitions and effective sample sizes of over 200 for all parameters. Additionally, we fitted Bernoulli models with the same covariate set as the cumulative model for no oxygen vs. low flow supplemental oxygen, low flow supplemental oxygen vs. high flow supplemental oxygen and, high flow supplemental oxygen vs. mortality individually to test the proportional odds assumption.
Comorbidities were only available for patients from the Greater Glasgow and Clyde health board (n = 639). Comorbidities used were those previously identified as important for COVID-19 severity by the ISARIC4C consortium (3). To test whether the lack of comorbidity data for the rest of the sample was leading to biased estimates of the impact of Alpha variant infection, we performed three analyses on the Greater Glasgow and Clyde patient population. We fit the above model with the number of comorbidities a patient exhibited included as non-linear penalised regression spline. While the exact form of the relationship between severity of infection and the number of comorbidities a patient exhibits is unknown, we would expect the relationship to be monotonically increasing, however, for mathematical simplicity, we do not enforce this constraint on the spline. We also fit the model to this patient population without the comorbidities included and with the comorbidities permuted to estimate the change in the estimate of the Alpha variant effect by the inclusion of comorbidities. As the inclusion of comorbidities was found not to change the estimated effect of the Alpha variant, this analysis is presented in Supplementary Appendix 4.
Model intercepts were given t-distribution (location = 0, scale = 2.5, df = 3) priors, fixed effects were given normal (mean = 0, standard deviation = 2.5) priors, random effects and spline standard deviations were given exponential (mean = 2.5) priors.
Footnotes
david.pascall{at}mrc-bsu.cam.ac.uk
Elen.Vink{at}glasgow.ac.uk
Rachel.Blacow{at}ggc.scot.nhs.uk
naomi.bulteel2{at}nhs.scot
alasdair.campbell5{at}nhslothian.scot.nhs.uk
robyn.campbell{at}nhslothian.scot.nhs.uk
sarah.clifford{at}nhslothian.scot.nhs.uk
chris.davis{at}glasgow.ac.uk
ana.filipe{at}glasgow.ac.uk
noha.elsakka{at}nhs.scot
Ludmila.Fjodorova2{at}nhs.scot
ruth.forrest2{at}nhs.scot
Emily.Goldstein{at}ggc.scot.nhs.uk
Rory.Gunson{at}ggc.scot.nhs.uk
John.Haughney{at}ggc.scot.nhs.uk
matt.holden{at}nhs.scot
patrick.honour3{at}borders.scot.nhs.uk
joseph.hughes{at}glasgow.ac.uk
edward.james{at}borders.scot.nhs.uk
timothy.lewis{at}nhslothian.scot.nhs.uk
samantha.lycett{at}ed.ac.uk
Oscar.MacLean{at}glasgow.ac.uk
martin.mchugh{at}nhslothian.scot.nhs.uk
Guy.Mollett{at}ggc.scot.nhs.uk
yusuke.onishi{at}nhs.scot
b.j.parcell{at}dundee.ac.uk
surajit.ray{at}glasgow.ac.uk
david.l.robertson{at}glasgow.ac.uk
Sharif.Shaaban2{at}phs.scot
James.Shepherd.2{at}glasgow.ac.uk
katherine.smollett{at}glasgow.ac.uk
Kate.Templeton{at}nhslothian.scot.nhs.uk
elizabeth.wastnedge{at}nhslothian.scot.nhs.uk
Craig.Wilkie{at}glasgow.ac.uk
thomaschristiewilliams{at}gmail.com
emma.thomson{at}glasgow.ac.uk
Updated to make consistent with newly submitted version - checks to proportional odds assumption, added authors, updated references.