Abstract
The effective reproduction number R is a prominent statistic for inferring the transmissibility of infectious diseases and effectiveness of interventions. R purportedly provides an easy-to-interpret threshold for deducing whether an epidemic will grow (R>1) or decline (R<1). We posit that this interpretation can be misleading and statistically overconfident when applied to infections accumulated from groups featuring heterogeneous dynamics. These groups may be delineated by geography, infectiousness or sociodemographic factors. In these settings, R implicitly weights the dynamics of the groups by their number of circulating infections. We find that this weighting can cause delayed detection of outbreak resurgence and premature signalling of epidemic control because it underrepresents the risks from highly transmissible groups. Applying E-optimal experimental design theory, we develop a weighting algorithm to minimise these issues, yielding the risk averse reproduction number E. Using simulations, analytic approaches and real-world COVID-19 data stratified at the city and district level, we show that E meaningfully summarises transmission dynamics across groups, balancing bias from the averaging underlying R with variance from directly using local group estimates. An E>1 generates timely resurgence signals (upweighting risky groups), while an E<1 ensures local outbreaks are under control. We propose E as an alternative to R for informing policy and assessing transmissibility at large scales (e.g., state-wide or nationally), where R is commonly computed but well-mixed or homogeneity assumptions break down.
Author Summary How can we meaningfully summarise the transmission dynamics of an infectious disease? This question, although fundamental to epidemiology and crucial for informing the design and implementation of interventions (e.g., quarantines), is still not resolved. Current practice is to estimate the effective reproduction number R, which counts the average number of new infections generated per past infection, at large scales (e.g., nationally). An estimated R>1 signals epidemic growth. While R is easily interpreted and computed in real time, it averages infections across diverse locations or socio-demographic groups that likely possess different transmission dynamics. We prove that this averaging in R reduces sensitivity to resurgence, making R>1 slow to reflect realistic epidemic growth. This delay can substantially misinform policymakers and impede interventions. We apply optimal design theory to derive the risk averse reproduction number E as an alternative summary of diverse transmission dynamics. Using mathematical arguments, simulations and empirical COVID-19 datasets, we show that E>1 is an improved threshold for resurgence, providing timelier signals for informing policy or interventions and better uncertainty quantification. Further, E maintains the computability and interpretability of R. We propose E as meaningful statistic at large scales, where the averaging within R likely misrepresents the diversity of transmission dynamics.
Introduction
The effective reproduction number R, summarises the time-varying transmissibility or spread of an infectious disease by the average number of secondary infections that it generates per effective primary infection [1]. A value of R above or below 1 is interpreted as a threshold [2] signifying, respectively, that the epidemic is growing (spread is supercritical) or under control (subcritical). This interpretation is widely used, and estimates of R computed at various scales, ranging from e.g., the district to country level when scale is defined spatially, have yielded valuable insights into the transmission dynamics of numerous pathogens including pandemic influenza, malaria, Ebola virus and SARS-CoV-2 [3–5]. During outbreaks, R is monitored and reported in real time to assess the effectiveness of interventions [2,6], signal the emergence of pathogenic variants [7], estimate the probability of sustained outbreaks [8], increase public awareness [9] and inform public health policymaking [10].
The benefits of R mainly stem from two properties: it is easily interpretable as a threshold parameter, and it is easily computable in real time, requiring only routine surveillance data such as epidemic curves of cases [11,12]. However, these must be balanced against its core limitation – R is commonly derived under a well-mixed assumption in which individuals are homogeneous and have equal probabilities of encountering one another. Generalising this assumption to account for the fact that realistic contact rates are heterogenous, leading to assortative and preferential mixing [13], often necessitates some loss in either interpretability or computability [14]. As examples, agent-based [15,16], network [17,18] and compartmental [19,20] models are three common approaches to incorporating heterogeneity that possess different and sometimes complementary characteristics.
Agent-based models explicitly describe individual-level dynamics of the epidemic [17]. These completely simulate transmission heterogeneities but require extensive high-resolution data to fit and can be computationally intractable [21]. Network models instead capture patterns of connections among individuals or larger groups and hence model contact heterogeneities without being as expensive as agent-based approaches (although there is notable overlap among these models [20]). Unfortunately, network models still need detailed contact tracing data and there are questions about the interpretability and even existence of reproduction numbers for these approaches [22,23]. Compartmental models class or group individuals by epidemiological state (e.g., susceptible or infected) and assume homogeneous transmission in their simplest form, but can be extended to include multiple infectious classes to describe heterogeneous transmission [19,20] without substantial computational overhead. However, additional classes necessitate more epidemiological rate and distribution data.
An alternative that can reduce the severity of these complexity-interpretability trade-offs is to model the epidemic as a metapopulation or multitype process, in which local scales are well-mixed but diverse. Global scales then enforce structural heterogeneity [14] and compose the overall R as a weighted function of the local effective reproduction numbers of each group, denoted Rj for group j [24]. We use renewal or autoregressive processes (see Methods) to describe transmission at local scales. These model the relationships among routine infection incidence data and the Rj directly and so minimise complexity and computational expense. We consider spatial scales, though analyses will often equally apply to sociodemographic and other types of heterogeneity. In our context, local scales may represent regions and the global scale a country composed of those regions. While metapopulation approaches have allowed more informative generalisations of R (e.g., using next generation matrices) [19,25], two key questions remain understudied and form the focus of this paper.
First, can we derive an alternative global statistic that better captures the salient dynamics of local scales than R? Standard formulations of R lose sensitivity to key events such as local resurgences [24], defined as a sustained Rj > 1, because these usually initialise in groups with small incidence, which are down-weighted by R. Second, can we optimally trade-off the uncertainty among estimates at local and global scales? In estimating the dynamics of local groups, we reduce the data informing each estimate and are subject to bias-variance trade-offs. If we assess local resurgence using individual Rj estimates, false positives arising from increased uncertainty are more likely and it is unclear how to combine individual estimates to describe overall transmissibility. However, if we neglect local heterogeneities and rely on R, we may be statistically overconfident in our estimates of transmissibility.
Using experimental design theory [26], which provides a framework for optimising and fusing estimates of Rj according to cost functions of interest, we derive the risk averse reproduction number, E. We prove that E upweights groups with likely resurging dynamics because it optimises a cost function that results in the maximum variance among the Rj estimates being minimised. E solves what is known as an E-optimal design problem (see Methods) [26]. Hence E is risk averse and formally trades off bias from R with variance from each Rj. We demonstrate, via analytic arguments, simulations and investigation of empirical COVID-19 datasets from Israel, Norway, New Zealand, states within the USA and regions of the UK, that E achieves more meaningful consensus across local-scale dynamics than R, improving uncertainty quantification and resurgence detection without losing either interpretability or computability. Given its transparent, design-optimal properties, we believe E can help inform policy at large scales (e.g., state or nationwide) where well-mixed assumptions are invalid.
Results
Pitfalls of standard reproduction numbers
We consider the transmissibility of a pathogen at two scales: a local scale, in which we may model a well-mixed population (e.g., within specific geographic or sociodemographic groups) and a global scale, which integrates the dynamics of p ≥ 1 local groups. Our local scale, for example, may refer to spread at a district level, whereas the corresponding global scale is countrywide and covers all districts that compose that country. We assume that subdivision into local groups is based on prior knowledge and logistical constraints. We denote the time-varying reproduction number within local group j as Rj and model the dynamics in this group with a Poisson (Pois) renewal model (see Methods) [11] as on the left side of Eq. (1).
This is a widely used framework for modelling time-varying transmissibility [10], with Ij as the new infections and Λj as the total infectiousness within group j. Λj measures the circulating (active) infections as a weighted sum of past infections in group j, with weights set by the generation time distribution of group j. We allow this distribution, which describes the times between primary and secondary infections, to vary among the groups [1]. All variables are functions of time e.g., Ij is explicitly Ij(t), but we disregard time indices to simplify notation. An outbreak in group j is growing or controlled if the sign of Rj − 1 is, respectively, positive or negative. If a positive sign is sustained, we define group j as being resurgent [24].
We do not directly model inter-group reproduction numbers (e.g., reproduction numbers for infections arising from individuals in group j emigrating into group i ≠ j). These additional parameters are rarely identifiable from routine surveillance data. However, we can include their impact by distinguishing local from imported infections without altering our methodology [27]. We describe how our formulation includes importations or introductions and implicitly accounts for interconnectivity in later sections (see Eq. (4) and the Methods). Moreover, Eq. (1) is widely used in practice [28] to compute R (see below) and we aim to derive statistics that are comparable to R. We find, when applied to empirical data (see later COVID-19 case studies), that our simple but completely identifiable statistics work well.
This renewal model approach is also commonly applied over global scales (e.g., to compute national reproduction numbers during the COVID-19 pandemic [28]) by summing infections from every constituent group. This amounts to a well-mixed assumption at this global scale. If we define and as the new infections and total infectiousness on this global scale, then the transmission model used is I ∼ Pois(RΛ), with R as the effective reproduction number on that scale. However, if we instead develop a global model from our local models, we obtain the right side of Eq. (1). This simple observation has an important ramification – by assuming that this single R summarises global scale dynamics, we make an implicit judgment about the relative importance of the dynamics in different local groups. We can expose this judgment by simply equating both global models to get Eq. (2).
We see that group j is assigned a weight wj, which is the ratio of active infections in group j to the total active infections at the global scale. Note that 0 ≤ wj ≤ 1 and . This weighting has two key consequences. First, groups with outsized infection loads dominate R. This means a group with a large Λj and small Rj < 1, can mask potentially important resurgent groups, which likely possess small Λj and Rj > 1, until those groups generate an appreciable number of infections [24]. Consequently, using R may lead to lagging indicators of resurgence i.e., late warnings. The alternative – to scrutinise each local region for signs of concentrated upticks in Rj – may also be suboptimal. Higher stochasticity is expected from data at smaller scales, potentially causing false positive resurgence alarms. This is similar to the classic bias-variance trade-off commonly encountered in statistical modelling [29].
Second, R is only fully representative of local dynamics at two boundary conditions – when the Rj are highly similar and when there is only one active group (i.e., effectively p = 1). The latter case is trivial, while the former is unlikely because epidemics commonly traverse connected regions in waves [30] and different groups often possess heterogeneous contact patterns and risks of infection. These all result in diverse Rj time series and desynchronised epidemic curves [31]. Hence, we argue that this commonly estimated R [9,10] may neither be sufficient nor representative for communicating overall transmission risks or informing policymaking. We bolster our argument by showing that R is also statistically overconfident as a summary statistic i.e., its estimates have underestimated variance.
We analyse the properties of R by computing maximum likelihood estimates (MLEs) and Fisher information (FI) values. The left side of Eq. (3) defines , the MLE of R, in terms of the MLEs of the Rj of every group. These are [1] under Eq. (1) (see Methods for derivations). The smallest asymptotic uncertainty around these MLEs (or any consistent Rj estimator) is delineated by the inverse of the FI i.e., larger FI values imply smaller estimate uncertainties [32]. For the renewal models studied here, we know that [33]. When comparing across scales, it is easier to work under the robust or variance stabilising transform , as it yields FI (see [33] and Methods for details). We will often switch between the FI of Rj and as needed to clarify comparisons, but ultimately will provide main results in the standard Rj formulation.
Substituting our FI expressions into Eq. (2), we find the global FI linearly sums the local FI contributions as on the right side of Eq. (3) and is an increasing function of p.
Consequently, the uncertainty around is likely to be substantially smaller than that around any . This formulation underestimates overall uncertainty because the FI acts as a weight that is inversely proportional to the variance of the Rj estimates. Estimates of R are therefore statistically overconfident as measures of global epidemic transmissibility. We demonstrate this point in later sections via the credible interval widths obtained from simulations.
The goal of our study is to design alternatives to R that attain a better consensus across heterogeneous dynamics, with defined properties over diverse locales and without inflated estimate confidence. To achieve these objectives, we must make a principled bias-variance trade-off among signals from R and every Rj, deciding how to best emphasise actionable dynamics from local groups without magnifying noise. We apply optimal design theory to develop new consensus reproduction numbers with these tailored properties. As we show in the next section, this involves optimising the weights multiplying every Rj according to cost functions that encode the uncertainty properties and trade-offs that we desire globally.
D and E optimal reproduction numbers and their properties
The consensus problem of deriving a statistic that is representative of local dynamics can be reframed as an optimal design on the weights mapping the Rj to that statistic, based on a cost function of interest. The uncertainty around estimates of Rj, encoded (inversely) by , fundamentally relates to key dynamics of the epidemic e.g., resurgence events likely occur at small Λj and large Rj, minimising FI[Rj] [24]. Hence, we focus our designs on the Fisher information matrix FIR of Eq. (4), which summarises the uncertainty from all local Rj estimates. There we replace Λj with a factor αj > 0 that redistributes the information across the p groups, subject to its sum being equal to Λ (see Eq. (3)).
This formulation facilitates the description of several important scenarios. When αj = Λj, we recover the standard formulation of R (Eq. (2)). If we additionally model introductions among groups using probabilities of transporting active infections as in [34], then αj measures the active infections that are informative about Rj i.e., all infections that are generated in group j, including those that are introduced into other groups. This models interconnectedness or inter-group transmissions [27]. If we assume that infections observed in group j are actually a random sample from multiple groups drawn from some multinomial distribution, then αj corresponds to the fraction of Λ assigned by that distribution to group j. In the Methods we expand on these points mathematically, showing how Eq. (4) and the optimal designs below are valid (assuming knowledge of the introductions) when the p groups interact.
Since the αj are design variables subject to the conservation constraint in Eq. (4), we can leverage experimental design theory [26,35] to derive novel consensus statistics to replace the default formulation from Eq. (2). We examine A, D and E-optimal designs, which have standard definitions of how the total uncertainty on our p parameters is optimised. If p = 2, this uncertainty can be circumscribed by an ellipse in the space spanned by R1 and R2, and designs have a geometric interpretation as we show in Fig 1. A-optimal designs minimise the bounding box of the ellipse, while D and E-optimal designs minimise its area (or volume, when extending to higher dimensions) and largest chord respectively [35,36]. These designs yield optimal versions of , computed as shown in Eq. (5), where tr[.], det[.] and eig[.] indicate the trace, determinant and eigenvalue of their input matrix.
These designs can be done with robust transforms by replacing FIR with , yielding the diagonal FI matrix . We then observe that tr . The default allocation of is hence, trivially, an A-optimal design under this transform. We compute D and E-optimal designs without transforms because we want to work with Rj directly. As , with the bracketed term as a constant, we find that, subject to our constraint, . This follows from majorization theory and solutions are adapted from [36,37]. Since FIR is diagonal, its eigenvalues are and deriving equates to solving for (see Methods and [36,37] for derivations). Our E optimal design is then .
We formulate the new consensus reproduction numbers, D and E, by substituting the above optimised into Eq. (2) to derive Eq. (6), which forms our main result. This corresponds to using optimised weights and in Eq. (2). In Eq. (6) we compute these statistics as convex sums and .
We refer to D as the mean reproduction number because it is the first moment or arithmetic mean of the effective reproduction numbers of each group i.e., it weights the dynamics of each group equally. This construction ensures that the overall uncertainty volume over the estimates of every Rj is minimised. We define E as a risk averse reproduction number, and derive it as the ratio of the second to first raw moment of the group reproduction numbers. This is also known as the contraharmonic mean. E weights each group reproduction number by the fraction of the total reproduction number sum attributable to that group. This weighting emphasises groups with large Rj, which are considered to be high risk. While D and E do not explicitly include Λj as in R, both are still informed by the active infections because the Λj (which are proportional to the FI) control the variance of the Rj estimates. This variance or uncertainty propagates into and Ê as in Eq. (6) (also see Methods).
All three statistics possess important similarities that define them as proxies for reproduction numbers. Because they are all convex sums of the local Rj, the value of each statistic lies inside a simplex with vertices at the Rj. At the boundary conditions of one dominant group (i.e., essentially p = 1) or of highly similar group dynamics (i.e., the Rj are roughly the same over time) this simplex collapses and we find R = D = E. Moreover, if all the Rj = 1, then R = D = E = 1, signifying convergence to the reproduction number threshold. Thus, D and E are alternative strategies to R for combining local reproduction numbers with different properties that may offer benefits when making decisions at large scales. We visualise how these statistics determine our global estimate of transmissibility via the simplex in Fig 1. Although we present several reproduction number formulae for comparison, E and its risk averse properties form the main interest of this work.
We refer to E as risk averse because it ensures that the most uncertain Rj is upweighted as compared to the standard formulation of R in Eq. (2). This protects against known losses of sensitivity to resurgent dynamics [24] that occur due to averaging across groups, because the FI is expected to be smallest for resurgent groups. As opposed to simply interrogating the individual Rj estimates to identify resurgent groups, E weights those groups, while also accounting for the uncertainty in their estimates, to obtain a consensus on overall epidemic transmissibility risk. Consequently, E reduces false positives that may occur due to the noise in individual resurgent groups and provides a framework for interpreting situations where some groups may be concurrently resurgent while others are under control. We illustrate these points in Fig 2 below and explore their ramifications in the next section.
In Fig 2 we apply the renewal model framework from [11,24], which is based on Eq. (1) but models estimates of the local reproduction numbers according to the posterior gamma (Gam) distribution . This yields a mean estimate of Rj that is equal to the MLE with standard deviation . We consider two local groups (p = 2) and compute the probabilities of resurgence P(X > 1) for X = R1, R and E for scenarios that likely represent a resurgence (i.e., small Λ1 and increasing I1 with the second group stable at Λ2 = I2). In these cases, E is able to signal resurgence substantially earlier than R but is distinct from simply observing dynamics in the resurging group 1, where R1 = max Rj. We confirm this by noting that σ(R1) is appreciably larger than σ(E), the standard deviation from E. Hence responding to R1 is maximally sensitive but also magnifies noise. In contrast R is overconfident, with usually a much smaller σ(R).
Risk averse E is more representative of key transmission dynamics
We test our D and E reproduction numbers against R on epidemics that are simulated from renewal models with Ebola virus generation times from [38] in Fig 3 and Fig 4. We also compute max Rj to benchmark how a naive risk averse statistic derived from observing individual groups performs. We consider p = 3 groups with various true Rj dynamics (black, dashed) that fluctuate across controlled and resurgent stages. We use the EpiFilter package [39], which applies Bayesian smoothing algorithms, to obtain local estimates (blue) from the incidence curves Ij. Similarly, we estimate the overall R (blue) from the total incidence , which is how this statistic is evaluated in practice. We infer D (green) and E (red) by sampling from posterior distributions of local Rj estimates and combining them according to Eq. (6). Taking maxima across these local samples gives max Rj (cyan). All estimates include 95% equal tailed Bayesian credible intervals, and we use default EpiFilter settings.
The simulations in Fig 3 examine abrupt changes in transmissibility. Disease transmission in every group is first controlled. Infections then either resurge (j = 1,3) or are driven towards elimination (j = 2). Because of its weighting by active infections, R proposes a false, lengthy period of subcritical spread at t ≈ 70, even though the majority of groups have Rj > 1. This causes R to be slow to indicate resurgences at t ≈ 100 and t ≈ 220. In contrast, E is quick to signal resurgence at t ≈ 220, without losing the capacity to indicate that the epidemic is under control at t ≈ 140. D largely interpolates between R and E, showing the mean of all the Rj and serves as a null model. As expected from the theory, R is overconfident about its transmissibility estimates, which is apparent from its narrow credible intervals. In contrast, max Rj is very noisy, with considerably larger credible intervals limiting its use.
We further investigate fluctuating but anti-synchronised epidemics (j = 1,2) against the backdrop of a much larger monotonically increasing and then decreasing outbreak (j = 3) in Fig 4. The two out-of-phase groups approximately average to a constant value in both their incident infections and Rj. Consequently, we infer a mostly monotonic R and D, with R being overconfident in its assessment of transmissibility. In contrast, estimates of E highlight the transmission potential from the fluctuating but smaller epidemics within other groups, while incorporating their uncertainties. It recognises the overall risk across 170 ≤ t ≤ 270 posed by groups with fluctuating infections. Additionally, E rapidly signals the transmissibility risk that dominates from t > 270, which is only indicated by R after a substantial delay. The max Rj statistic is again the most uncertain and prone to false positives.
Empirical application to COVID-19 across 20 cities in Israel
We compare our consensus statistics with the standard reproduction number on empirical data from the Delta strain outbreak of COVID-19 in Israel across May–December 2021. This dataset provides a convenient case study because daily positive tests results are available from different cities in Israel and both non-pharmaceutical interventions and restrictions were mild during this period. The main intervention deployed, which was highly successful at reducing cases, was the booster vaccine campaign [40,41]. This campaign started July 30 and gradually extended to all ages across August. We examine COVID-19 incidence curves from [42] by date of test for the p = 20 cities with the most cases of this wave. These cities account for 49% of the entire caseload in Israel and are plotted in log scale in Fig 5.
We estimate the standard, maximum group, mean and risk averse reproduction numbers (labelled as (t)) and the probability of resurgence at any time as in above sections but with the serial interval distribution in [43], which is consistent with the Israel-specific Delta wave parameters estimated in [41]. We assume case reporting is stable (i.e., any under-reporting is constant) and that serial intervals provide good approximations for the generation times. These assumptions are reasonable given high fidelity surveillance in Israel during this wave and are consistent with the analyses of [40,41]. As we only focus on relative trends, we make no further corrections to the dataset but note that accounting for issues such as testing delays generally cause incidence curves to be back-shifted and increase uncertainty, but necessitate auxiliary data [44,45].
Fig 5 demonstrates that D, R and E all agree that the wave was curbed across the booster period and that the epidemic was controlled. The max Rj is overly sensitive to worst case local dynamics and signals false or early resurgences across October-November. There is substantial disagreement among the statistics prior to the booster campaign. The standard suggests that the wave is under control in May due to decreasing total COVID-19 incidence.
However, the risk averse E (and to an extent D) highlights potential resurgence and may have contributed evidence to support starting the booster campaign earlier (see Fig A of the S1 Appendix for prospective E and R estimates at key timepoints during that period). The max Rj statistic is too susceptible to noise to provide actionable information.
E also better aligns with the emergence of the Delta strain or variant, which is signalled by with substantial delay. Given that counterfactual analyses from [40] showed that the success of the campaign was strongly dependent on the timing of its implementation, this earlier signalling of resurgence could have important ramifications as part of policy response. Using either mean and more conservative statistics (see Fig 7 for visualisation), we find E signals resurgence and supports starting the booster campaign between 2-12 days earlier than R (corresponding to the 8-20 June 2021). The convergence of D, R and E across September follows as the epidemic curves of many cities became synchronised and shows that E also recognises periods when dynamics are homogeneous. E, via its optimised design, uses the epidemic data to dynamically balance between averaging and emphasising heterogeneous group dynamics. We present a similar analysis on COVID-19 data from Norway that yields qualitatively consistent conclusions in Fig B of the S1 Appendix.
Improved resurgence detection for multiple COVID-19 datasets
We quantify the risk averse behaviour of E in realistic epidemic scenarios by examining its performance on 6 empirical COVID-19 datasets. These include the Israel data above and epidemic curves from districts in Norway (also explored in Fig B of the S1 Appendix) and New Zealand, regions in the US states of New York and Illinois and local UK authorities. We plot group level and total incidence for all datasets in Fig A of the S1 Appendix. We select the top 20 groups by infection counts for each dataset (or all groups when fewer than 20). All curves present cases by date of test (data sourced from [42,46–50]) after weekly smoothing and include resurgences that started locally before propagating. We estimate R (blue) and E (red) for all datasets (retrospectively), under the serial interval distribution from [43] and plot our results in Fig 6 together with total incidence (black). We indicate some key resurgence periods with vertical lines. We analyse these periods prospectively in Fig 7.
There we sequentially re-compute our estimates over time and test the resurgence detection ability of both X = R and X = E. Estimates at any timepoint in Fig 7 are informed by data up to that time only, illustrating the resurgence signals we would have inferred if our time series ended at that timepoint. We can decide if reproduction numbers have signalled resurgence in multiple ways. The simplest compares mean estimates to 1 or resurgence probabilities to 0.5. While we do not show these explicitly, these differences are visible from Fig 6 and are substantial, often of the order of 1-2 weeks. For example, the relative delay of R in signifying resurgence in the Israel study is 12 days. However, as our estimates possess uncertainty, we may prefer to indicate resurgence more conservatively by finding when the lower limit of the 95% credible interval of crosses 1 or when [24].
Fig 7 plots these estimates. Delays in resurgence detection signals from R (blue) relative to those from E (red) are visible from the separation of the coloured circles and squares. We find that, except for the null case of the UK regions, where local epidemics are synchronised (see Fig A of the S1 Appendix), E always provides earlier resurgence signals, confirming its risk averse nature. In the cases of New York, Illinois and New Zealand R fails, across the 2-week period analysed, to ever indicate resurgence. For Israel the lag between signals from R and E is 2 days. The converse occurs if assessing subcritical spread as E is slower than R to fall below 1 when groups show appreciable heterogeneity (while not explicitly shown we can see this in Fig 6). This also confirms the risk averse properties of E. We provide general mathematical arguments for why E has these properties in the S1 Appendix.
Discussion
The value of reproduction numbers or similar measures of transmissibility (e.g., growth rates [12]) as statistics for providing actionable information about the state of an epidemic, lies in their ability to accurately identify changepoints between subcritical and supercritical spread [51]. However, the meaning of a changepoint across scales is ambiguous and understudied. For example, if we have p local groups, how many need to resurge before we decide the epidemic has become supercritical? Is it a changepoint if these groups resurge at different times? A related question is, if those local groups are heterogeneous, is there any meaning in an overall average [22] such as the standard effective reproduction number R? Here we have explored such questions and their implications for describing epidemics at large scales.
We modelled epidemics at two scales: a local scale, over which the well-mixed assumption likely holds, and a global scale, where this assumption is almost surely invalid. Reproduction numbers are commonly computed and reported at global scales. Using this framework, we analysed how changepoints in local reproduction numbers, Rj, influence the properties of global statistics. We showed that, due to its weighting of each Rj by the infections circulating in that group, , R is generally controlled by the dynamics of the groups with the most extant infections. This causes loss of sensitivity to resurgent changepoints (which may often occur at small Λj) and means that estimates of R are usually overconfident or oversmoothed (Eq. (3)). We attempted to counter these undesirable properties by applying experimental design theory to develop algorithms that optimise the weights on the Rj.
We derived a novel reproduction number, E, by selecting weightings on the Rj that minimise the maximum uncertainty from Rj estimates. Consequently, E upweights more uncertain estimates (often associated with resurgent groups [24]) and incorporates the local circulating infections according to their impact on overall estimate uncertainty. This prevents estimate overconfidence and presents a principled method for combining the local Rj changepoints. An E > 1 ensures resurging groups are emphasised without being overly sensitive to individual group noise (Fig 2), while E < 1 indicates that groups are under control with high likelihood. Interestingly, E weights each R by its transmissibility ratio, , which results in a formula (Eq. (6)) that seems consistent with that derived from network epidemic models when individuals have heterogeneous contact rates [22].
Eq. (6), which is the contraharmonic mean of local reproduction numbers, also suggests that E will have general risk averse properties. This follows as contraharmonic means are known to behave as envelope detectors [52] i.e., they detect peaks in waveforms. E also has some resurgence prediction qualities as its weights correlate (in rank) with what the weights in R would converge to if resurgence occurs. We detailed these general properties of E in the S1 Appendix. We further illustrated and validated these properties using multiple simulated and empirical datasets (Fig 3–7). There we demonstrated how E provides a better consensus than R of local group dynamics (converging to R when transmission is homogeneous) but is not as vulnerable to noise as the maximum local statistic max Rj. We found that the earlier resurgence detection provided by E could be substantial, leading R by up to 2 weeks in several case studies.
This earlier resurgence signalling of E may be important given the sensitivity of the cost and effectiveness of many epidemic control actions to their implementation times [40,53] and the growing evidence supporting earlier but data-driven intervention choices [54,55]. If earlier resurgence signals are ignored, then eventually will approach and R will gradually indicate that supercritical spread has occurred. Due to its risk averse properties, E will also signal subcritical spread at global scales when there is a higher likelihood of groups being under control. This may be more conservative than R but avoids premature relaxations of interventions, which have been correlated with more costly and less effective exit strategies [56]. However, all these benefits depend on socio-political and other factors as reproduction numbers are one of many metrics informing public health decisions.
While E is a promising addition to the suite of infectious disease outbreak statistics, it is not perfect. First, its formulation depends on Poisson noise models (Eq. (1)). While such models are commonly applied [10], in some cases they may only offer simplistic representations of the stochasticity of epidemics. Although E will likely maintain its risk averse properties due to its contraharmonic formulation, its optimality is unknown for general stochastic descriptions. Second, we defined resurgence and control indicators based on the reproduction number threshold value of 1. This definition is almost universal but other measures (e.g., the early warning signals of [57]) may circumvent some problems that R presents as a statistic for informing decision-making in real time. In some instances, the disease under study may be uncontrollable (e.g., if it possesses long incubation periods and substantial pre-symptomatic spread [58]) and no metric (including E) will be able to meaningfully inform health policy.
Third, E requires infection time and incidence data (or proxies [45]) at the resolution of the local scale. While this is becoming the norm with steadily improving surveillance [59], it is not guaranteed and may be scarce for emerging infectious diseases. In this scenario, R is still directly computable from global scale data but the same data resolution limits that prevent inferring E here will also preclude any other finer-scale analysis. Last, the added value of E in informing real-time decision-making depends on the quality of data. Practical biases such as delays in ascertaining cases can hinder timely responses to transmission changepoints [44]. This bottleneck is fundamental and will equally limit R and other statistics. However, in these scenarios, E may still be of use retrospectively.
Overall, we propose E as a consensus statistic that better encapsulates salient dynamics across heterogeneous groups without losing the interpretability or computability of R. With late responses to epidemic resurgence often associated with larger epidemic burden [53], increasing interest in early warning signals [57] and reproduction numbers commonly being computed on vast scales in real time [9], the risk averse properties of E may be impactful. Public health policymaking is a complex process combining inputs from diverse data and models spanning epidemiology, economics and behavioural science. Given this complexity, we think that statistics designed with optimal and deliberate properties, such as E, can facilitate more transparent and robust data-driven decision-making.
Methods
Renewal models and estimation statistics
The renewal model [1] is a popular approach for tracking dynamics of infectious diseases [10]. It describes how the number of new or incident infections at time t, I(t), depends on the effective reproduction number at that time, R(t), and the total infectiousness Λ(t) as in Eq. (7), assuming that infectious individuals mix homogeneously. We commonly define time in days, but the model may be applied at other timescales (e.g., weeks).
Here Pois indicates a Poisson noise distribution and Λ(t) defines the active or circulating infections as the convolution of earlier infections with the generation time distribution of the disease. This distribution defines the time interval between primary and secondary infections [1] so that ω(t − s) is the probability of this interval being of length t − s. We assume that we have access to good estimates of the generation time distribution and infections [11].
Eq. (7) has been applied to model many epidemics across a wide range of scales spanning from small communities to entire countries. Its major use has been to facilitate the inference of the time-varying R(t). Fluctuations in estimates of R(t) are frequently associated with interventions or other epidemiologically important events such as the emergence of novel pathogenic variants. We can derive the key statistics of these R(t) estimates from the log-likelihood function of R(t), ℓ, which follows from the Poisson formulation of Eq. (7) as in Eq. (8). Here ζ(t) collects all terms that are independent of R(t).
We can construct the maximum likelihood estimate (MLE) of R(t), denoted (t), by solving to obtain the left expression of Eq. (9). This estimator is asymptotically unbiased. The (expected) Fisher information (FI), FI[R(t)], defines the best achievable precision (i.e., the smallest variance) around the MLE [60], and is computed from Eq. (8) as [60,61], with E[.] as an expectation over the incidence data. This gives . Substituting E[I(t)] = Λ(t)R(t) from Eq. (7) produces the expression in the middle of Eq. (9).
The FI depends on the unknown R(t). We can remove this dependence by applying a robust or variance stabilising transform [36,62]. We can derive this by using the FI change of variables formula as in [33,61]. We consequently obtain the right equation in Eq. (9), under the square root transform . In the main text we often use this transformed FI to make comparisons clearer but present all key results in the standard R(t) formulation.
While we have outlined the core of renewal model estimation, most practical studies tend to apply Bayesian methodology [10]. Accordingly, we explain the two approaches used in this paper. The first follows [11] and assumes some gamma (Gam) conjugate prior distribution over R(t), leading to the posterior estimate of R(t) being described as Gam(I(t), Λ(t)−1) (where we ignore prior hyperparameters). The mean of this posterior is and its variance is the inverse of FI[R(t)] evaluated at . This formulation holds for group reproduction numbers Rj(t) as well, which have posteriors Gam(Ij(t), Λj(t)−1). This methodology has been applied to estimate real-time resurgence probabilities [24]. We use it to generate Fig 2.
The second approach is EpiFilter [39], which combines the statistical benefits of two popular R(t) estimation methods – EpiEstim [11] and the Wallinga-Teunis approach [31] – within a Bayesian smoothing algorithm [63] to derive optimal estimates in a minimum mean squared error sense. We apply EpiFilter to obtain all effective reproduction number estimates with their 95% equal-tailed Bayesian credible intervals in Figs 3–5. This assumes a random walk prior distribution on R(t) and sequentially computes estimates via forward-backward algorithms [63]. We run EpiFilter at default settings.
It outputs posterior for group j with as the incidence curve I(t): 1 ≤ t ≤ T.
This provides retrospective analysis of reproduction numbers, using all the information up to present time T. If we set T to earlier timepoints we can recover past, real-time estimates that reflect the information available up to that timepoint. We compute these real-time estimates at key timepoints for the Israel case study in Fig A of the S1 Appendix. We construct our consensus posterior distributions , with X as D, R, max R, or E (see the next section) by sampling from all and applying appropriate weightings. Resurgence probabilities are evaluated as .
Optimal experimental design and consensus metrics
In the above section we outlined how to model and estimate R(t) across time. However, this assumes that all individuals mix randomly. This rarely occurs and realistic epidemic patterns are better described with hierarchical modelling approaches as in [14]. We investigated such a model at a local and global scale in the main text. There we assumed that p local groups do obey a well-mixed assumption and have local reproduction numbers, Rj(t) for group j that all conform to Eqs. (7-9). We additionally modelled a global scale, as in Eqs. (1-3) that combines the heterogeneous dynamics of the groups. We drop explicit time indices and note that this formulation, which considers weighted means of the Rj, requires a p × p FI matrix to describe estimate uncertainty as in Eq. (4). We now explain how the consensus statistics, D and E, emerge as optimal designs of this matrix.
For convenience, we reproduce the FI matrix FIX and the weighting for some reproduction number or consensus statistic X in Eq. (10). X can be D or E, and we apply a constraint on factors αj such that . When all the αj = Λj, we obtain the global effective reproduction number, now denoted R. The total information of our model is Λ.
The mean reproduction number D is derived as the D-optimal design of FIX. This maximizes the determinant of this matrix, which is . As the first term is independent of the weights, we simply need to maximise subject to a constraint on . This is known as an isoperimetric constraint and is solved when the factors are equalised i.e., [26,36]. Substitution of this optimal design leads to an equal weighting in Eq. (10) and we get the formulation in Eq. (6).
The risk averse reproduction number E is accordingly the solution to the E-optimal design of FIX. This maximises the minimum eigenvalue of FIX (i.e., minimises the maximum estimate uncertainty). Because FIX is diagonal we must maximise the minimum diagonal element subject to the constraint on . This has a known solution because the objective function is Schur concave. This objective function is maximised when is constant for all j (under our constraint) and yields , which follows from majorization theory. More details can be found in [36,37]. Substituting this optimal design into Eq. (10) yields the weight and we recover the result in Eq. (6).
We infer the mean and risk averse reproduction numbers by combining estimates of the group reproduction numbers, Rj, generated from EpiFilter. We achieve this by sampling from the posterior distributions of these local estimates to construct consensus posteriors for D and E in a Monte Carlo manner according to Eq. (6). This involves computing the arithmetic and contraharmonic means of the samples at each time point. The contraharmonic mean is the ratio of the second to first raw moments of its inputs. We also use as a reference, the simple statistic max Rj, which involves taking maxima over the group samples. These consensus estimates underlie the plots in Figs 3–6.
Importantly, we observe that as both D and E are means, they have two key properties that define them as reproduction numbers. First, when all the local Rj = a then R = D = E = a. Second, if we reduce transmissibility globally by (i.e., every R is scaled by , with a > 1) then all three statistics are also reduced by . These properties ensure that our consensus statistics have the same interpretability as R. Specifically, D and E have a threshold around 1 and their estimated values reflect changes resulting from public health interventions, more transmissible variants (if instead the Rj scale up by a) and population behaviours.
Optimal experimental design with interconnected groups
Our framework above does not explicitly consider connections among the groups. Here we outline how our optimal designs can remain valid under models of realistic interconnections. Let ρx→j be the probability of an infection being introduced into a sink group j from source group x as in [34] with ρx→x as the probability of remaining within the source group. For p groups, the renewal process that describes the incidence of new infections in group j is Ij ∼ , ignoring explicit time indices. Consequently, Ij contains information about the Rx. If we assume that introductions have the infectiousness of their source group and that we know the source group of the introductions, then the informative component of Ij is then Ij|Rx ∼ Pois(ρx→jΛxRx) (via a Bernoulli thinning of Poisson distributions).
Using earlier results, the Fisher information that Ij contains about Rx is . We may collect the information about Rx available from all the infection data by summing these terms as . This follows from the infinite divisibility property of the Poisson formulation and as . Consequently, the total Fisher information about Rx from all the incidence data is . This yields the same Fisher matrix as in Eq. (4). Consequently, our D- and E-optimal designs and other results are unchanged and valid under this formulation, which assumes that introductions have the infectiousness of their source group. This assumption holds for example when transmission heterogeneity arises from regional pathogenic variants, with variants forming groups with distinct Rx.
The converse, where introductions have the reproduction number of their sink group leads to . If we let , we get a diagonal Fisher matrix but now with terms . This fits our framework in (see Eq. (10)) if we constrain the sum of all the axΛx = αx to still be Λ. Sink-based reproduction numbers may occur, for example, if groups demarcate areas with different population density or contact patterns and have distinct Rx (that is acquired on entering that group). Both source and sink assumptions require knowledge of the introductions or their ρx→j values. When the ρx→j are unknown or cannot be estimated, an alternative is to treat the introductions as input data as in [27].
This requires that we redefine the total infectiousness of group j as with Ij as the local infections of group j and Mj counting introductions into that group. As Λj is treated as known we do not depart the framework of the main text and we recover our optimal designs (albeit with this redefined Λj). This convergence of results emerges because once we can ascertain the source and sink of infections, we can correctly assign them to their respective Rx and construct a diagonal Fisher matrix. Other models of interconnectivity which instead propose inter-group reproduction numbers (e.g., [64]) do not directly fit our framework or possess non-diagonal Fisher information matrices and can be over-parametrised or non-identifiable without additional data or constraints.
Data Availability
We provide open-source software to reproduce all analyses at https://github.com/kpzoo/risk-averse-R-numbers. While the main code for generating the figures in this text is written in MATLAB, we also include functions in R to compute E on user-defined datasets.
Data and code availability
We provide open-source software to reproduce all analyses at https://github.com/kpzoo/risk-averse-R-numbers. While the main code for generating the figures in this text is written in MATLAB, we also include functions in R to compute E on user-defined datasets.
Funding
KVP acknowledges funding from the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO), under the MRC/FCDO Concordat agreement and is also part of the EDCTP2 programme supported by the European Union. UO was supported by a grant from Tel Aviv University Center for AI and Data Science 417 (TAD) in collaboration with Google, as part of the initiative of AI and DS for social good. The funders had no role in study design, data collection and analysis, decision to publish, or manuscript preparation. For the purpose of open access, the author has applied a ‘Creative Commons Attribution’ (CC BY) licence to any Author Accepted Manuscript version arising from this submission.
Footnotes
Additional analyses of 5 empirical datasets and mathematical arguments on the properties of the proposed statistic.