Abstract
As severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spreads, the susceptible subpopulation is depleted causing the incidence of new cases to decline. Variation in individual susceptibility or exposure to infection exacerbates this effect. Individuals that are more susceptible or more exposed tend to be infected earlier, depleting the susceptible subpopulation of those who are at higher risk of infection. This selective depletion of susceptibles intensifies the deceleration in incidence. Eventually, susceptible numbers become low enough to prevent epidemic growth or, in other words, the herd immunity threshold (HIT) is reached. Although estimates vary, simple calculations suggest that herd immunity to SARS-CoV-2 requires 60-70% of the population to be immune. By fitting epidemiological models that allow for heterogeneity to SARS-CoV-2 outbreaks across the globe, we show that variation in susceptibility or exposure to infection reduces these estimates. Accurate measurements of heterogeneity are therefore of paramount importance in controlling the COVID-19 pandemic.
One Sentence Summary: Models that curtail individual variation in susceptibility or exposure to infection overestimate epidemic sizes and herd immunity thresholds.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in China in late 2019 and spread worldwide, causing the ongoing pandemic of coronavirus disease (COVID-19). As of 06 May 2020, more than 3.5 million cases have been confirmed and almost 250,000 people have died (1). Scientists throughout the world have engaged with governments, health agencies, and each other to address this emergency. Mathematical models have been central to important decisions concerning contact tracing, quarantine, and social distancing, to mitigate or suppress the initial pandemic spread (2). Successful suppression, however, leaves populations at risk of resurgent waves due to insufficient acquisition of immunity. Models have thus also addressed longer-term SARS-CoV-2 transmission scenarios and the requirements for continued adequate response (3). This is especially timely as countries begin to relax lockdown measures that have been in place over recent weeks with varying levels of success in tackling national outbreaks.
Here we demonstrate that individual variation in susceptibility or exposure (connectivity) accelerates the acquisition of immunity in populations due to selection by the force of infection. More susceptible and more connected individuals have a higher propensity to be infected and thus are likely to become immune earlier. Due to this selective immunization, heterogeneous populations require fewer infections to cross their herd immunity thresholds (HITs) than homogeneous (or not sufficiently heterogeneous) models would suggest. We integrate continuous distributions of susceptibility or connectivity in otherwise basic epidemic models for COVID-19 and show that as the coefficient of variation (CV) increases from 0 to 4, the herd immunity threshold declines from over 60% (4, 5) to less than 10%. Measures of individual variation are urgently needed to narrow the estimated ranges of HITs and plan accordingly.
SARS-CoV-2 transmission in heterogeneous populations
SARS-CoV-2 is transmitted primarily by respiratory droplets and modelled as a susceptible-exposed-infectious-recovered (SEIR) process.
Individual variation in susceptibility is integrated as a continuously distributed factor that multiplies the force of infection upon individuals as

\[
\begin{aligned}
\frac{dS(x)}{dt} &= -\lambda x S(x), \\
\frac{dE(x)}{dt} &= \lambda x S(x) - \delta E(x), \\
\frac{dI(x)}{dt} &= \delta E(x) - \gamma I(x),
\end{aligned}
\tag{1}
\]

where S(x) is the number of individuals with susceptibility x, E(x) and I(x) are the numbers of individuals who originally had susceptibility x and became exposed and infectious, δ is the rate of progression from exposed to infectious, γ is the rate of recovery or death, and λ = (β/N) ∫[ρE(x) + I(x)] dx is the average force of infection upon susceptible individuals in a population of size N. The basic reproduction number is

\[
R_0 = \langle x \rangle \, \beta \left( \frac{\rho}{\delta} + \frac{1}{\gamma} \right),
\]

where ρ is a factor measuring the infectivity of individuals in compartment E in relation to those in I, and 〈x〉 is the mean susceptibility factor at epidemic onset. Prior to the epidemic, susceptibility is described by a probability density function q(x) with mean 1 and CV = √〈(x − 1)²〉 explored as a parameter. The effective reproduction number (Reff, also denoted by Re or Rt by other authors) is a time-dependent quantity obtained by multiplying R0 by the susceptibility of the population over time.
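The SEIR process with distributed susceptibility can be sketched numerically by replacing the continuous distribution q(x) with a finite sample of susceptibility groups. The code below is a minimal illustration, not the fitting procedure used for the figures: the gamma form of q(x), the parameter values (R0 = 3, ρ = 0.5, δ = γ = 1/4 per day), the seed fraction, the number of groups and the step size are all assumptions chosen for illustration, and the epidemic is left unmitigated.

```python
import numpy as np

def final_size(cv, r0=3.0, rho=0.5, delta=0.25, gamma=0.25,
               n_groups=1000, days=600.0, dt=0.1, seed=0):
    """Fraction of the population ever infected in an unmitigated epidemic.

    All default values are illustrative assumptions, not fitted estimates.
    """
    rng = np.random.default_rng(seed)
    if cv > 0:
        shape = 1.0 / cv**2                      # gamma shape that gives the requested CV
        x = rng.gamma(shape, 1.0 / shape, size=n_groups)
        x /= x.mean()                            # rescale so the sampled mean is exactly 1
    else:
        x = np.ones(n_groups)                    # homogeneous limit
    # beta from R0 = <x> beta (rho/delta + 1/gamma) with <x> = 1
    beta = r0 / (rho / delta + 1.0 / gamma)
    w = 1.0 / n_groups                           # each group holds an equal population share
    seed_frac = 1e-6                             # assumed initial infectious fraction
    S = np.full(n_groups, w * (1.0 - seed_frac))
    E = np.zeros(n_groups)
    I = np.full(n_groups, w * seed_frac)
    for _ in range(int(days / dt)):
        # force of infection; compartments are population fractions, so N = 1
        lam = beta * (rho * E.sum() + I.sum())
        # exponential update (lam held fixed over the step) keeps S non-negative
        new_exposed = S * (1.0 - np.exp(-lam * x * dt))
        progressing = dt * delta * E
        removed = dt * gamma * I
        S -= new_exposed
        E += new_exposed - progressing
        I += progressing - removed
    return 1.0 - S.sum()

for cv in (0.0, 1.0, 3.0):
    print(f"CV = {cv}: final size = {final_size(cv):.3f}")
```

Under these assumptions the final epidemic size shrinks as CV grows, because the most susceptible groups are depleted first and the mean susceptibility of those who remain falls faster than their number.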
Figure 1 depicts model trajectories fitted to suppressed epidemics in Italy and Austria, assuming coefficients of variation 1 and 3. The difference in the size of second waves between the two levels of variation is substantial. In the case of Italy, where suppression was less successful, the pandemic appears mostly resolved when CV = 3. However, a large second wave (or a series of smaller waves, depending on possible containment strategies) remains on the horizon when CV = 1. Countries where suppression of the initial outbreak was more successful, such as Austria, have acquired less immunity and therefore the potential for future transmission in the respective populations remains naturally larger. However, also in these situations, the expected potential for subsequent waves is much reduced by variation in susceptibility to infection.
In a directly transmitted infectious disease, such as COVID-19, variation in exposure to infection is primarily governed by patterns of connectivity among individuals. We incorporate this in the system (Equation 1) by adding variation in infectivity and assuming a positive correlation between susceptibility and infectivity. Formally this corresponds to modifying the force of infection as λ = (β/N)(∫ x[ρE(x) + I(x)] dx/∫ xq(x) dx) and the basic reproduction number as

\[
R_0 = \frac{\langle x^2 \rangle}{\langle x \rangle} \, \beta \left( \frac{\rho}{\delta} + \frac{1}{\gamma} \right),
\]

where 〈x〉 and 〈x²〉 are the first and second moments of the distribution q(x) prior to the epidemic. Applying this model to the epidemics in Italy and Austria (Figure 2) leads to similar results to those obtained when variation was in susceptibility to infection.
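The two definitions of the basic reproduction number differ only in the moment factor: 〈x〉 for variable susceptibility and 〈x²〉/〈x〉 for variable connectivity, as implied by the respective forces of infection. With mean 1, 〈x²〉 = 1 + CV², so the same β yields an R0 that is (1 + CV²) times larger in the connectivity model. The sketch below makes that scaling explicit; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def r0_susceptibility(beta, rho, delta, gamma, mean_x=1.0):
    """R0 when x scales susceptibility only: <x> beta (rho/delta + 1/gamma)."""
    return mean_x * beta * (rho / delta + 1.0 / gamma)

def r0_connectivity(beta, rho, delta, gamma, cv, mean_x=1.0):
    """R0 when x scales both susceptibility and infectivity: (<x^2>/<x>) beta (rho/delta + 1/gamma)."""
    second_moment = mean_x**2 * (1.0 + cv**2)    # <x^2> = variance + mean^2
    return (second_moment / mean_x) * beta * (rho / delta + 1.0 / gamma)

def beta_for_target_r0(r0, rho, delta, gamma, cv=0.0, connectivity=False):
    """Transmission coefficient that reproduces a target R0 (mean of x assumed to be 1)."""
    scale = (1.0 + cv**2) if connectivity else 1.0
    return r0 / (scale * (rho / delta + 1.0 / gamma))

# Illustrative values only: rho = 0.5, delta = gamma = 1/4 per day, target R0 = 3
beta = beta_for_target_r0(3.0, 0.5, 0.25, 0.25, cv=3.0, connectivity=True)
print(beta, r0_connectivity(beta, 0.5, 0.25, 0.25, cv=3.0))
```

Comparing the two models at a fixed R0 therefore means using a smaller β in the connectivity model.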
The herd immunity threshold
Individual variation in risk of acquiring infection is under selection by the force of infection, whether individual differences are due to biological susceptibility, physical exposure, or a combination of the two traits. Selection results in the removal of the most at-risk individuals from the susceptible pool as they become infected and eventually recover (some die). This selective acquisition of infection and immunity results simultaneously in decelerated epidemic growth and accelerated induction of immunity in the population. The herd immunity threshold (HIT) defines the percentage of the population that needs to be immune to reverse epidemic growth and prevent future waves. Figure 3 shows the expected downward trends in the HIT for SARS-CoV-2 as the coefficients of variation of the gamma-distributed susceptibility or exposure are increased between 0 and 4 (to assess robustness to changing the type of distribution see Figure S22 for equivalent plots with lognormal distributions). While herd immunity is expected to require 60-70% of a homogeneous population to be immune given an R0 between 2.5 and 3, these percentages drop to the range 10-20% for CVs between 2 and 4. Therefore, a critically important question is: how variable are humans in their susceptibility and exposure to SARS-CoV-2? As yet, there is no definite answer to this question.
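The downward trend of the HIT with CV can be reproduced with a short calculation. The closed forms used below are not stated in the text; they follow from the model when q(x) is gamma distributed with mean 1, in which case Reff = R0 (S/N)^(1 + CV²) for variable susceptibility and Reff = R0 (S/N)^(1 + 2CV²) for variable connectivity, and the threshold is the non-susceptible fraction at which Reff = 1. Treat this as a sketch under that gamma assumption; other distributions (such as the lognormal) require numerical solution.

```python
def hit_susceptibility(r0, cv):
    """HIT for gamma-distributed susceptibility: 1 - (1/R0)^(1/(1 + CV^2))."""
    return 1.0 - (1.0 / r0) ** (1.0 / (1.0 + cv**2))

def hit_connectivity(r0, cv):
    """HIT for gamma-distributed connectivity: 1 - (1/R0)^(1/(1 + 2 CV^2))."""
    return 1.0 - (1.0 / r0) ** (1.0 / (1.0 + 2.0 * cv**2))

# R0 = 3 is an illustrative value within the 2.5-3 range quoted in the text
for cv in (0, 1, 2, 3, 4):
    print(cv, round(hit_susceptibility(3.0, cv), 3), round(hit_connectivity(3.0, cv), 3))
```

At CV = 0 both expressions reduce to 1 − 1/R0, and with R0 = 3 the susceptibility form gives roughly 20% at CV = 2.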
As the pandemic unfolds, evidence will accumulate in support of low or high coefficients of variation, but soon it will be too late for this to impact public health strategies unless we act pragmatically. We searched the literature for estimates of individual variation in the propensity to acquire or transmit several infectious diseases including COVID-19 and overlaid these estimates as vertical lines in Figure 3. Most CV estimates lie between 2 and 4, a range where naturally acquired immunity to SARS-CoV-2 may place populations over the HIT once as few as 10-20% of their individuals are immune. This depends, however, on which specific transmission traits are variable and how the trait values are distributed.
Variation in infectiousness was critical to attribute the scarce and explosive outbreaks to superspreaders when SARS-CoV-1 emerged in 2002 (9), but infectiousness does not respond to selection as susceptibility or exposure do. Models with individual variation in infectiousness perform equivalently to homogeneous versions when implemented deterministically (Figure S21). They diverge when stochasticity is added in the sense that disease extinction becomes more likely and outbreaks become rarer and more explosive (9-11), but this is an entirely different phenomenon from that presented in this paper.
Among the estimates of individual variation plotted in Figure 3, those corresponding to SARS-CoV viruses, with coefficients of variation in the range 2.6-3.2, have been described as variation in individual infectiousness (9, 10), but the way authors describe superspreaders is suggestive that higher infectiousness may stem from higher connectivity with other individuals who may be susceptible. This would support the scenarios displayed in Figure 2 with CV = 3 for connectivity, although little is known about how this might have been modified by social distancing.
Discussion
The concept of herd immunity is most commonly used in the design of vaccination programs (12, 13). Defining the percentage of the population that must be immune to cause infection incidences to decline, herd immunity thresholds constitute convenient targets for vaccination coverage. In idealized scenarios of vaccines delivered at random and individuals mixing at random, herd immunity thresholds are given by a simple formula (1 − 1/R0) which, in the case of SARS-CoV-2, suggests that 60-70% of the population would need to be immunized to halt spread considering estimates of R0 between 2.5 and 3. A crucial caveat in exporting these calculations to immunization by natural infection is that natural infection does not occur at random. Individuals who are more susceptible or more exposed are more prone to be infected and become immune, which lowers the threshold (14). In our model, the herd immunity threshold declines sharply when coefficients of variation increase from 0 to 2 and remains below 20% for more variable populations. The amplitude of the decline depends on what property is heterogeneous and how it is distributed, but the downward trend is robust (Figures 3 and S22).
Studies of heterogeneity in the transmission of respiratory infections have traditionally focused on variation in exposure summarized into age-structured contact matrices. Besides overlooking differences in susceptibility given exposure, the aggregation of individuals into age groups curtails coefficients of variation with important downstream implications. We calculated CV for the landmark POLYMOD matrices (15,16) and obtained values between 0.3 and 0.5. Recent studies of COVID-19 integrated contact matrices with age-specific susceptibility to infection (structured in three levels) (17) or with social activity (three levels also) (18) which, again, resulted in coefficients of variation less than 1. We show that models with coefficients of variation of this magnitude would appear to differ only moderately from homogeneous approximations when compared with those that incorporate CVs between 2 and 3, as estimated for a variety of infectious diseases (Figure 3) and supported by detailed mobility data in the city of Portland, Oregon, USA (19) (we obtained an estimate of approximately CV = 2 based on data extracted with WebPlotDigitizer). It is therefore crucial that variation in susceptibility and exposure to infection is included in epidemic models at the finest resolution of individuals. This has required agent-based models which are computationally intensive and not amenable to mathematical treatment (19). Here, we introduce mathematical formalisms that enable the entire individual variation to be captured while maintaining the analytical tractability of the simplest homogeneous models. This is especially relevant when dealing with major crises such as the current pandemic where optimal strategies rely on a capacity to quickly rationalize the best compromise between protecting health and safeguarding the economy. The larger the individual variation, the more optimistic the public health prognostics and the milder the required health policies.
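The way aggregation curtails the coefficient of variation can be illustrated with synthetic data. In the sketch below, individuals are assigned to three hypothetical strata with different mean contact rates, and within each stratum contact rates follow a gamma distribution with CV = 2. The strata, their means and the sample size are invented for illustration and do not correspond to POLYMOD or any other dataset. Replacing each individual by the stratum mean discards the within-stratum variance, so the CV computed from the aggregated values is necessarily smaller.

```python
import numpy as np

def cv(values, weights=None):
    """Coefficient of variation (standard deviation / mean), optionally weighted."""
    values = np.asarray(values, dtype=float)
    mean = np.average(values, weights=weights)
    variance = np.average((values - mean) ** 2, weights=weights)
    return np.sqrt(variance) / mean

rng = np.random.default_rng(1)
n = 100_000
n_strata = 3
stratum = rng.integers(0, n_strata, size=n)       # hypothetical strata (e.g. age groups)
stratum_mean = np.array([0.5, 1.0, 1.5])          # hypothetical mean contact rates per stratum
shape = 1.0 / 2.0**2                              # within-stratum gamma with CV = 2 (assumed)
contacts = stratum_mean[stratum] * rng.gamma(shape, 1.0 / shape, size=n)

cv_individual = cv(contacts)                      # CV at the resolution of individuals

# Aggregate: keep only the observed mean of each stratum, weighted by stratum size
observed_means = np.array([contacts[stratum == g].mean() for g in range(n_strata)])
sizes = np.bincount(stratum, minlength=n_strata)
cv_aggregated = cv(observed_means, weights=sizes)

print(f"individual-level CV: {cv_individual:.2f}")
print(f"stratum-level CV:    {cv_aggregated:.2f}")
```

With these invented inputs the stratum-level CV falls well below 1 even though the individual-level CV exceeds 2, which is the pattern described above for age-structured contact matrices.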
Interventions themselves have potential to manipulate individual variation. Current social distancing measures may be argued to either increase or decrease variation in exposure, depending on the roles of different functional strata in societies and the compliance of individuals who are normally more highly connected in relation to the average. Datasets that describe connectivity patterns before and during movement restrictions, such as those in (17) could, in principle, inform relevant changes in distributions of individual connectivity but surveys must be applied on representative samples of the population and the information cannot be collapsed into age-group averages. A deeper understanding of the putative patterns is crucial not only to develop more accurate predictive models, but also to refine control strategies and to interpret data resulting from prevalence studies and serological surveys.
An analysis of the outbreak on board the Diamond Princess cruise ship reported a cumulative infected percentage of 17% (20). Seroprevalence estimates from various settings are now widely available and reportedly range from less than 1% to just over 20%, including estimates from Kobe, Japan (3.3%) (21) and Guilan province, Iran (22%) (22). While seropositivity estimates are limited by epidemiological context and current estimates are undoubtedly affected by testing uncertainties, our results suggest that some estimated values may be closer to herd immunity thresholds than otherwise predicted, if populations are sufficiently heterogeneous. It is worth noting, however, that these estimates may have been offset by the social distancing measures.
Given current uncertainties, a high level of pragmatism may be required in incorporating results from serological surveys into policy decisions. We have assumed that infection elicits persistent adaptive immunity. This assumption is justified by encouraging reports on animal models and humans recovered from SARS-CoV-2 infection, even though volatile immunity has not been ruled out yet. Our results are robust as long as recovered individuals remain immune for several months. Any test that allows for retrospectively detecting past infections is therefore a convenient tool for monitoring the prevalence and distribution of individuals who may have acquired immunity. It would be imperative to conduct repeated serological studies in representative samples of the population (23) especially as control measures are relaxed, not necessarily to imply that antibodies themselves are neutralizing but to identify past infection and potential for immune protection. Given a percent positivity in an initial survey, the curve traced by subsequent measurements could indicate if and how rapidly a population is moving towards the herd immunity threshold, and simultaneously advise which control measures should be enforced.
Data Availability
All data referred to in the manuscript are publicly available.
Funding
RMC and MUF receive scholarships from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil. CSM was supported by the Intramural Research Program of the NIH, The National Heart Lung and Blood Institute.
Author contributions
MGMG conceived the study and wrote the first draft; RA, RMC and JGK performed the analyses; all authors wrote the paper.
Competing interests
Authors declare no competing interests.
Data and materials availability
All data is available in the main text or the supplementary materials.
Acknowledgements
We thank Jan Hasenauer (Institute of Computational Biology, Helmholtz Zentrum München, München, Germany) for helpful discussions.