Abstract
We derive and introduce the angular reproduction number, Ω, which measures time-varying changes in epidemic transmissibility resulting from variations in both the effective reproduction number, R, and the generation time distribution, w. Predominant approaches for tracking the dynamics of pathogen spread infer either R or the epidemic growth rate, r. However, R is easily biased by mismatches between the assumed and true w, while r is difficult to interpret in terms of the individual-level branching process underpinning transmission. Moreover, R and r may disagree on the relative transmissibility of two epidemics or variants (i.e., $r_A > r_B$ does not imply $R_A > R_B$ for variants A and B). We find that Ω responds meaningfully to mismatches in w while maintaining most of the interpretability of R. Additionally, we prove that Ω > 1 if and only if R > 1 and that Ω agrees with r on the relative transmissibility of pathogens. Estimating Ω is no harder than inferring R, uses existing software, and requires no generation time measurement. These advantages come at the expense of selecting one free parameter. We propose Ω as a useful statistic for tracking and comparing the spread of infectious diseases that may better reflect the impact of interventions when those interventions concurrently change both R and w or alter the relative risk of co-circulating pathogens.
Introduction
Estimating the rate of spread or transmissibility of an infectious disease is a fundamental and ongoing challenge in epidemiology [1]. Identifying salient changes in pathogen transmissibility can contribute important information to policymaking, providing warnings of resurgent epidemics, assessments of the efficacy of interventions and signals about the emergence of new variants of concern [1–3]. The effective or instantaneous reproduction number, R, and time-varying growth rate, r, are commonly used to characterise pathogen transmissibility. The former statistic is an estimate of the average number of new infections per active (circulating) past infection, while the latter describes the exponential rate of new infection accumulation [4].
Although R and r are important and popular means of tracking the dynamics of epidemics, they suffer from key limitations that diminish their fidelity and interpretability. Specifically, the meaningfulness of R depends on our ability to measure the generation time distribution of the infection under study, w. This distribution captures the inter-event times among primary and secondary infections [5] and, jointly with the history of infection times, defines Λ, the time-varying total infectiousness of the disease. The total infectiousness serves as the denominator when inferring R, which is effectively a ratio of new infections to Λ. However, infection times and hence w are difficult to measure and depend on the availability of detailed transmission chain data from contact tracing or transmission studies [6]. Even if these data are available, the estimated w (and hence Λ) depends on how inter-event times are sampled or interpreted (e.g., there are forward, backward, intrinsic and realised generation intervals) [7,8].
Workarounds, such as approximating w by the serial interval distribution [9], which describes inter-event times between the onset of symptoms, or inferring w from this distribution [10], do exist but also suffer from related problems [6]. Consequently, w and Λ are often misspecified, biasing R and likely misrepresenting the true branching process dynamics of epidemics. While r is more robust to w misspecification (it only depends on the log gradient of the smoothed infection time series) [4], it lacks the individual-level informativeness and interpretation of R. Given estimates of r, it is unclear how to derive the proportion of new infections that need to be suppressed (roughly $R - 1$), herd immunity thresholds (related to $1 - R^{-1}$) or the probability of epidemic elimination and establishment (both linked to $R^{-N}$ for N infections) [11–13]. The only known means of attaining such information converts r into R using estimates of w [14].
Difficulties in accurately inferring generation times therefore cause practical bottlenecks that constrain our ability to measure pathogen transmissibility. These problems are worsened as recent studies have empirically found that generation times also vary substantially with time (i.e., w is non-stationary) [15]. These variations may correspond to different epidemic phases [16], emerging variants of concern [17] and coincide with the implementation of interventions [18]. These are precisely the situations in which we also want to infer R. However, concurrent changes in R and w are rarely identifiable, and r inextricably groups the effects of w and R on transmissibility. While high quality, longitudinal contact tracing data [19] can potentially resolve these identifiability issues, this is an expensive and logistically hard solution. Here we propose another means of alleviating the above problems – the angular reproduction number, Ω.
The angular reproduction number defines transmissibility as a ratio of new infections to M, the root mean square number of past infections over a user-defined window δ. Because it replaces Λ with M, a quantity that does not require knowledge of generation times, Ω is not subject to the problems of inferring w. We demonstrate that Ω is able to measure the overall changes in transmissibility caused by fluctuations in both R and w. Moreover, we prove that Ω has similar threshold properties to R, maintains much of its individual-level interpretation and is potentially a better metric for communicating transmissibility. This last point follows as we only need to quote Ω and the known window δ to generalise our estimates of transmissibility to different settings. In contrast, the meaningfulness of R is contingent on the unknown or uncertain w. Downstream studies sometimes use R outside of its generation time context [20], while dashboards aiming at situational awareness often quote R without w, devaluing this statistic as a robust means of communicating disease spread [21].
Additionally, we demonstrate how r and R can easily disagree on relative transmissibility, both across time and for co-circulating variants. Unmeasured changes in w over time can cause R and r to vary in opposite directions (one signals an increase in transmissibility and the other a decrease). Similarly, co-circulating pathogens with different but stationary and known w may possess contradictory R and r rankings, i.e., for variants A and B, $r_A > r_B$ does not imply $R_A > R_B$. These issues are further complicated when interventions (which can change w, R or both [18]) occur, obscuring notions of the relative risk of spread. However, we find that $r_A > r_B$ guarantees $\Omega_A > \Omega_B$ and that Ω agrees with r across time even if w changes. This consistency reinforces the usefulness of Ω for tracking and comparing outbreak spread.
Results
Angular reproduction numbers
The epidemic renewal model [22] provides a generalised and flexible representation of disease transmission. It defines how the incidence of new infections at time t, denoted $I_t$, depends on the effective or instantaneous reproduction number, $R_t$, and the past incident time series of infections, $I_1^{t-1}$. This results in the conditional moment relationship in Eq. (1) [9]. Generally, we use $X_a^b$ to denote the time series $\{X_a, X_{a+1}, \ldots, X_{b-1}, X_b\}$ and E[X|Y] for the expectation of X over possible epidemic trajectories given known variables Y. Where obvious, and for convenience, we sometimes drop Y in E[X|Y], writing E[X].
$$\mathrm{E}[I_t \mid I_1^{t-1}] = R_t \Lambda_t, \qquad \Lambda_t = \sum_{u=1}^{m} w_u I_{t-u}. \tag{1}$$
Here $\Lambda_t$ is the total infectiousness and summarises the weighted influence of past infections. The set of weights $w_u$ for all u defines the generation time distribution of the infectious disease with $\sum_{u=1}^{m} w_u = 1$, and m as the support of this distribution, which we assume to be practically finite [14]. When the time series is shorter than m we truncate and renormalise the $w_u$. Commonly, the stochasticity around the expectation $R_t \Lambda_t$ is modelled using either Poisson or negative binomial count distributions [1,12].
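As a minimal sketch of Eq. (1), the following Python fragment computes the total infectiousness with the truncation and renormalisation described above and iterates the conditional mean $R_t \Lambda_t$ forward. The function names, the seed incidence and the example weights are illustrative assumptions, and the Poisson or negative binomial noise around the mean is deliberately omitted.

```python
def total_infectiousness(past, w):
    """Lambda_t = sum_u w_u * I_{t-u} for past = [I_1, ..., I_{t-1}].

    When fewer than len(w) past points exist, w is truncated and renormalised.
    """
    k = min(len(past), len(w))
    if k == 0:
        return 0.0
    w_trunc = w[:k]
    total = sum(w_trunc)
    recent_first = past[::-1][:k]  # I_{t-1}, I_{t-2}, ..., I_{t-k}
    return sum(wu * iu for wu, iu in zip(w_trunc, recent_first)) / total


def expected_incidence(R, w, i0=10.0):
    """Iterate the conditional mean E[I_t | past] = R_t * Lambda_t (no noise)."""
    incidence = [i0]
    for t in range(1, len(R)):
        incidence.append(R[t] * total_infectiousness(incidence, w))
    return incidence


w = [0.2, 0.5, 0.3]  # hypothetical generation time weights w_1, w_2, w_3
growing = expected_incidence([1.5] * 12, w)  # R_t > 1 throughout
steady = expected_incidence([1.0] * 12, w)   # R_t = 1 throughout
```

With $R_t = 1$ and a flat starting incidence the mean trajectory stays flat, which is the equilibrium case discussed later in the text.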
Although Eq. (1) has successfully been applied to model many diseases including COVID-19, Ebola virus disease, pandemic influenza and measles, among others, it has one major flaw – it assumes that the generation time distribution is fixed or stationary and known [9]. If this assumption holds (we ignore surveillance biases [9,23] until the Discussion), Eq. (1) allows epidemic transmissibility to be summarised in fluctuations of the time-varying Rt parameters. This follows because the sign of Rt − 1 determines if It will increase or decline relative to the total infectiousness Λt. This reproduction number can be linked to the instantaneous epidemic growth rate, rt, using the moment generating function of the generation time distribution [14].
Consequently, from $R_t$, we obtain temporal information about the rate of pathogen spread and its mechanism, i.e., we learn how many new infections we can expect per circulating infection because $R_t = \mathrm{E}[I_t]/\Lambda_t$. Since $R_t$ is a threshold parameter, we know that a fraction of at least $1 - R_t^{-1}$ of new infections must be blocked to suppress epidemic growth ($R_t = 1$ signifies that $r_t = 0$). The time scale over which this suppression is achievable [14] and our ability to detect changes in $R_t$ [24] in the first place, however, are determined by the generation times.
Recent works emphasise that the assumption of a known or fixed generation time distribution is often untenable, with appreciable fluctuations caused by interventions [15,18] and emerging pathogenic variants [17] or occurring as the epidemic progresses through various stages of its lifetime [5]. Substantial biases in Rt can result (because its denominator Λt is incorrectly specified [4]), which even impede optimal Bayesian inference algorithms [25]. As Rt is a predominant metric of transmissibility, contributing key evidence towards infectious disease policymaking [1], this may potentially obscure situational awareness or misinform intervention planning. While improved and intensive contact tracing can provide updated generation time information, this is usually difficult and expensive. We propose a robust alternative.
We redefine the total infectiousness by recognising that it is a dot product between the vector of generation time probabilities $w_1^m$ and the past incidence $I_{t-m}^{t-1}$ over the support of the generation time distribution, m. This gives the left equality of Eq. (2) with the Euclidean norm of $w_1^m$ as $\|w_1^m\|_2$ and $\theta_t$ as the time-varying angle between $w_1^m$ and $I_{t-m}^{t-1}$. This equality holds for non-stationary generation times (i.e., both $w_1^m$ and $I_{t-m}^{t-1}$ can have elements that change over time). Eq. (2) implies that the count of new infections (for any given $R_t$) is maximised when the angle between $w_1^m$ and $I_{t-m}^{t-1}$ is minimised, i.e., when the temporal profile of past infections matches the shape of the generation time distribution.
We can compute the root mean square incidence across the support of the generation time distribution as $M_t = \frac{1}{\sqrt{m}}\|I_{t-m}^{t-1}\|_2$. Under the constraint that $\sum_{u=1}^{m} w_u = 1$ (when the time series is shorter than m we truncate this distribution to sum to 1; this is an edge effect of the epidemic), the minimum possible value of the generation time norm is $\|w^*\|_2 = 1/\sqrt{m}$. This is achieved by the maximum entropy generation time distribution $w^*$, which is uniform (has m entries of $1/m$).
$$\mathrm{E}[I_t \mid I_1^{t-1}] = R_t \|w_1^m\|_2 \|I_{t-m}^{t-1}\|_2 \cos\theta_t = \left(R_t \sqrt{m}\,\|w_1^m\|_2 \cos\theta_t\right) M_t. \tag{2}$$
Combining these definitions with Eq. (1), we derive the second expression in Eq. (2) for the expected number of new infections at time t. This may seem an unnecessarily complicated manipulation of the standard renewal model, but it admits a novel and important insight – we can separate the influences of the reproduction numbers and the generation time distribution (together with its changes) on epidemic transmissibility. These multiply Mt, which defines a new denominator – the root mean square number of past infections (this is also the average signal power of the past infection time series) – that replaces the total infectiousness Λt.
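Consequently, we define a new metric in Eq. (3), the angular reproduction number $\Omega_t$, which multiplies $R_t$ by the scaled projection of the generation time distribution, $\sqrt{m}\,\|w_1^m\|_2 \cos\theta_t$, onto $I_{t-m}^{t-1}$, the past incidence vector. This means that $\Omega_t$ is a time-varying ratio between the expected infection incidence and the past root mean square incidence $M_t$.
$$\Omega_t = R_t \sqrt{m}\,\|w_1^m\|_2 \cos\theta_t \quad\Longrightarrow\quad \mathrm{E}[I_t \mid I_1^{t-1}] = \Omega_t M_t. \tag{3}$$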
This metric captures all possible variations that impact the ability of the epidemic to transmit. It responds to both changes in $R_t$ and the generation time distribution. The latter would scale $\|w_1^m\|_2$ and rotate $\cos\theta_t$ (which is why we term this angular). The benefit of compactly describing both types of transmissibility changes does come with a trade-off in interpretability as it may be harder to intuit the meaning behind $\mathrm{E}[I_t] = \Omega_t M_t$ than the more usual $\mathrm{E}[I_t] = R_t \Lambda_t$.
We argue that this is not the case practically because $\Lambda_t$ is frequently and easily misspecified [15,26], obscuring the meaning of $R_t$. In contrast, $M_t$ does not depend on generation time assumptions (beyond characterising its support m), and $\Omega_t$ is always defined as a reproduction number relative to root mean square incidence. We remove structural uncertainty induced by the often unknown $w_u$ since $M_t$ is a maximum entropy version of $\Lambda_t$, i.e., subject to $\sum_{u=1}^{m} w_u = 1$. We find that $M_t = \Lambda_t$ and hence $\Omega_t = R_t$, when the generation time distribution is degenerate (i.e., at some u = g, $w_g = 1$ and $w_{u \neq g} = 0$) or past incidence is flat (as then $\Lambda_t = M_t$ and $w_u$ has no effect). The first occurs in branching process epidemic models [27] (with fixed generation time g). The second defines the important equilibrium condition $\Omega_t = R_t = 1$.
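The dot-product view of total infectiousness can be checked numerically. The sketch below, with hypothetical weights and incidence values, computes $\Lambda_t$, $M_t$, $\cos\theta_t$ and the factor $\sqrt{m}\,\|w\|_2\cos\theta_t$ that multiplies $R_t$ to give $\Omega_t$. The function name and inputs are illustrative assumptions rather than part of any released software.

```python
import math


def angular_terms(w, past_recent_first):
    """Return (Lambda_t, M_t, cos(theta_t), scale) where scale = Omega_t / R_t.

    w holds w_1..w_m and past_recent_first holds I_{t-1}, ..., I_{t-m}.
    """
    m = len(w)
    lam = sum(wu * iu for wu, iu in zip(w, past_recent_first))
    norm_w = math.sqrt(sum(wu * wu for wu in w))
    norm_i = math.sqrt(sum(iu * iu for iu in past_recent_first))
    cos_theta = lam / (norm_w * norm_i)
    rms = norm_i / math.sqrt(m)                # root mean square incidence M_t
    scale = math.sqrt(m) * norm_w * cos_theta  # so that Lambda_t = scale * M_t
    return lam, rms, cos_theta, scale


w = [0.5, 0.3, 0.2]  # hypothetical generation time weights
lam, rms, cos_theta, scale = angular_terms(w, [30.0, 20.0, 10.0])
flat = angular_terms(w, [25.0, 25.0, 25.0])  # flat incidence: Lambda_t = M_t
```

For flat past incidence the scale factor equals 1 whatever the weights, which matches the statement that $\Omega_t = R_t$ in that case.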
Relationship to popular transmissibility metrics
Having defined the angular reproduction number above, we explore its properties and show why it is an interesting and viable measure of transmissibility. We examine an exponentially growing epidemic with incidence $I_t = I_0 e^{rt}$ and constant growth rate r. This model matches the dynamics of fundamental compartmental models such as the SIR and SEIR (in the limit of an excess of susceptible individuals) and admits the equation $gr = (R - 1)$ [28], with g as the mean generation time. We assume growth occurs over a period of δ and compute $\Omega_t$ as in Eq. (3) by noting that $\mathrm{E}[I_t] = I_t$ and $M_t = \sqrt{\frac{1}{\delta}\int_{t-\delta}^{t} I_s^2\,ds}$ (δ = m and for continuous time models $M_t$ involves an integral). After some algebra we get the left relation in Eq. (4).
$$\Omega_t = \sqrt{\frac{2\delta r}{1 - e^{-2\delta r}}} = \sqrt{\frac{2\delta g^{-1}(R - 1)}{1 - e^{-2\delta g^{-1}(R - 1)}}}. \tag{4}$$
Several important points follow. First, as $x \geq 1 - e^{-x}$ for every $x \geq 0$, $\Omega_t - 1$ is positive whenever r is positive (an analogous argument proves the negative case). Second, we substitute the compartmental R-r relationship $gr = (R - 1)$ to get the right-side relation of Eq. (4). Applying L'Hôpital's rule we find $\lim_{R \to 1} \Omega_t = 1$. We hence confirm the threshold behaviour of $\Omega_t$, i.e., the signs of $\Omega_t - 1$ and $R_t - 1$ are always consistent (for all values of δ > 0).
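The algebra behind the left relation of Eq. (4) and the limit at the threshold can be sketched as follows. This is a worked derivation under the stated assumptions only (deterministic exponential incidence $I_s = I_0 e^{rs}$ and a continuous-time root mean square over a window of length δ); the substitution $x = 2\delta r$ is introduced here for convenience.

```latex
\begin{align}
M_t^2 &= \frac{1}{\delta}\int_{t-\delta}^{t} I_0^2 e^{2rs}\,ds
       = \frac{I_0^2 e^{2rt}}{2\delta r}\left(1 - e^{-2\delta r}\right)
       = I_t^2\,\frac{1 - e^{-2\delta r}}{2\delta r}, \\
\Omega_t &= \frac{I_t}{M_t}
       = \sqrt{\frac{2\delta r}{1 - e^{-2\delta r}}}, \\
\lim_{r \to 0}\Omega_t^2 &= \lim_{x \to 0}\frac{x}{1 - e^{-x}}
       = \lim_{x \to 0}\frac{1}{e^{-x}} = 1, \qquad x = 2\delta r .
\end{align}
```

Because $x/(1 - e^{-x})$ is increasing in x and equals 1 in the limit $x \to 0$, $\Omega_t$ exceeds 1 exactly when r is positive, for any δ > 0.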
Third, we see that constant growth rates imply constant angular reproduction numbers. The converse is also true, and we may input time-varying growth rates, $r_t$, into Eq. (4) to estimate $\Omega_t$. These properties hold for any choice of δ, which is now a piecewise-constant window. In later sections we show that the bijection between $r_t$ and $\Omega_t$ has important consequences when comparing outbreaks. We plot key R-r-Ω relationships in Figure 1 below. We may also invert this correspondence to estimate $r_t$ from $\Omega_t$. This involves solving Eq. (5), where $W_k(x)$ is the Lambert W function with index $k \in \{0, -1\}$ (this range results from the indicator $\mathbb{1}(y)$) [29].
$$r_t = \frac{1}{2\delta}\left(\Omega_t^2 + W_k\!\left(-\Omega_t^2 e^{-\Omega_t^2}\right)\right), \qquad k = -\mathbb{1}(\Omega_t < 1). \tag{5}$$
Panels A and B show how growth rates (r) and reproduction numbers (R) have diverse functional relationships (see [14]) for SEIR models with an excess of susceptible individuals (A) and branching processes (B). Coloured lines indicate R at different mean generation times (g). Black lines highlight a single functional relationship between angular reproduction numbers Ω and r at all g, using a window δ of 20d. Panel C shows that while Ω varies with choice of δ (increasing from blue to red), we have a bijective relationship with r. Panel D demonstrates that R and r do not correspond e.g., if an NPI reduces R and g (see [15,18]), but Ω properly converts r into a transmissibility metric.
A central implication of Eq. (4) and Eq. (5) is that we can infer angular reproduction numbers directly from growth rates or vice versa, without requiring knowledge of the generation times. We provide an example of this in the Appendix, together with the derivation of Eq. (5).
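The conversion between growth rates and angular reproduction numbers can be sketched in a few lines of Python. The forward map uses $\Omega = \sqrt{x/(1 - e^{-x})}$ with $x = 2\delta r$. The inverse solves $y e^{y} = -\Omega^2 e^{-\Omega^2}$ on the Lambert W branch 0 (for Ω > 1) or −1 (for Ω < 1) and returns $r = (\Omega^2 + y)/(2\delta)$. The helper `lambert_w` is a hypothetical bisection solver written for this illustration, not a library routine, and the tolerance near Ω = 1 is an assumption.

```python
import math


def omega_from_r(r, delta):
    """Omega = sqrt(x / (1 - exp(-x))) with x = 2 * delta * r."""
    x = 2.0 * delta * r
    if abs(x) < 1e-12:
        return 1.0
    return math.sqrt(x / -math.expm1(-x))


def lambert_w(z, branch):
    """Real Lambert W on branch 0 or -1 for -1/e < z < 0, by bisection."""
    if not (-1.0 / math.e < z < 0.0):
        raise ValueError("this sketch only handles -1/e < z < 0")

    def f(y):
        return y * math.exp(y) - z

    if branch == 0:
        lo, hi = -1.0, 0.0
    else:
        lo, hi = -2.0, -1.0
        while f(lo) <= 0.0:  # walk left until the root is bracketed
            lo *= 2.0
    lo_positive = f(lo) > 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r_from_omega(omega, delta):
    """Invert the map above; branch 0 if Omega > 1, branch -1 if Omega < 1."""
    a = omega * omega
    if abs(a - 1.0) < 1e-6:  # treat values at the threshold as r = 0
        return 0.0
    k = 0 if a > 1.0 else -1
    return (a + lambert_w(-a * math.exp(-a), k)) / (2.0 * delta)


omega_up = omega_from_r(0.1, 20)      # growing epidemic, window of 20 days
omega_down = omega_from_r(-0.05, 20)  # declining epidemic
```

No generation time enters either function, which is the point made in the text: only the window δ is needed to move between r and Ω.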
We also comment on the relationship of angular and effective reproduction numbers using a deterministic branching process model, which is also foundational in epidemiology. We again focus on growth, which is geometric since this is a discrete time process with time steps scaled in multiples of the mean generation time g. We can write incidence as $I_t = R^t$ leading to $\Omega_t = \left(\frac{1}{\delta}\sum_{u=1}^{\delta} R^{-2u}\right)^{-1/2}$, where window δ is in units of g. If δ = 1 we recover $\Omega_t = R$. Further, if R = 1, then $\Omega_t = R$ for all δ. For growing epidemics, as δ increases, $\Omega_t > R$ because we reference present incidence to smaller past infections (or denominators). The opposite occurs if the epidemic declines. This may seem undesirable, but we argue that $\Omega_t$ improves overall practical transmissibility measurement because g will likely be misspecified or vary with time.
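The branching process case is easy to verify numerically. The sketch below evaluates $\Omega_t = I_t / M_t$ for $I_t = R^t$, with $M_t$ taken as the root mean square of the δ preceding generations; the function name and example values are illustrative assumptions.

```python
import math


def omega_branching(R, delta):
    """Omega_t for geometric incidence I_t = R**t over a window of delta generations."""
    # I_{t-u} / I_t = R**(-u), so M_t**2 / I_t**2 is the mean of R**(-2u), u = 1..delta.
    mean_sq = sum(R ** (-2 * u) for u in range(1, delta + 1)) / delta
    return 1.0 / math.sqrt(mean_sq)


one_step = omega_branching(1.5, 1)   # delta = 1 recovers R
critical = omega_branching(1.0, 4)   # R = 1 gives Omega = 1 for any delta
growing = omega_branching(1.5, 3)    # exceeds R for a growing epidemic
declining = omega_branching(0.8, 3)  # falls below R for a declining epidemic
```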
Any g mismatches will bias R, limiting its interpretation and meaningfulness as well as making comparisons among outbreaks or pathogenic variants difficult (since we cannot be certain that our denominators correspond). This can be particularly problematic if estimates of R obtained from a modelling study are incorporated as parameters into downstream studies without accounting for the generation time context on which those estimates depend. However, by communicating Ω and δ, we are sure that denominators match and, further, that we properly include the influences of any g mismatches. Choosing δ is also no worse (and more explicit) than equivalent window assumptions made when inferring R and r [4].
Last, we illustrate how $\Omega_t$ relates to other key indicators of epidemic dynamics such as herd immunity and elimination probabilities. As our derivation replaces Eq. (1) with $\mathrm{E}[I_t] = \Omega_t M_t$ for the same observed incidence, these indicators are also readily obtained. Assuming Poisson noise, the elimination probability $\exp\left(-\sum_{s>t} R_s \Lambda_s\right)$ is replaced by $\exp\left(-\sum_{s>t} \Omega_s M_s\right)$, and has analogous properties [30]. Herd immunity, which traditionally occurs when a fraction $1 - R^{-1}$ of the population is immune, is approximated by $1 - \Omega^{-1}$ (since both metrics possess the same threshold behaviour) [11]. In a subsequent section we demonstrate that one-step-ahead incidence predictions from both approaches are also comparable.
Responding to variations in generation time distributions
We demonstrate the practical benefits of $\Omega_t$ using simulated epidemics with non-stationary or time-varying generation time distributions. Such changes lead to misspecification of $\Lambda_t$ in Eq. (1), making estimates of the effective reproduction number $R_t$, denoted $\hat{R}_t$, a poor reflection of the true underlying $R_t$. In contrast, variations in the estimated $\hat{\Omega}_t$ are a feature (see Eq. (3)) and not a bug (for some chosen δ we control $M_t$, which is not misspecified). We simulate epidemics with Ebola virus or COVID-19 generation times from [31,32] using renewal models with Poisson noise [9]. We estimate both the time-varying $R_t$ and $\Omega_t$ using EpiFilter [25], which applies Bayesian algorithms that minimise mean square estimation error.
Inferring $\Omega_t$ from incident infections, $I_1^t$, requires only that we replace the input $\Lambda_t$ with $M_t$ in the estimation function and that we choose a window δ for computing $M_t$. We provide software for general estimation of $\Omega_t$ at https://github.com/kpzoo/EpiFilter. Code for reproducing this and all other analyses in this paper is also freely available at https://github.com/kpzoo/Omega. We heuristically set $\delta \approx 2g_0$ as our window with $g_0$ as the original mean generation time of each disease from [31,32]. We find (numerically) that this δ ensures that $\sum_{u=1}^{\delta} w_u$ is close to 1 over many possible gamma distributed generation times, i.e., we cover most of the probability mass of likely but unknown changes to the generation time distributions, which cause time-varying means $g_t$, without expanding much beyond their supports or incurring large edge effects.
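The idea of swapping the denominator can be illustrated without EpiFilter. The sketch below is a deliberately simpler stand-in, a sliding-window gamma-Poisson conjugate update of the kind used for window-based reproduction number estimates, and is not the EpiFilter algorithm. The function names, the smoothing window `tau` and the prior hyperparameters are hypothetical choices made only for this example.

```python
import math


def rms_incidence(incidence, delta):
    """M_t: root mean square of the delta most recent past counts (None for t < delta)."""
    out = [None] * len(incidence)
    for t in range(delta, len(incidence)):
        window = incidence[t - delta:t]
        out[t] = math.sqrt(sum(x * x for x in window) / delta)
    return out


def omega_posterior_mean(incidence, delta, tau=7, shape0=1.0, rate0=0.2):
    """Posterior mean of Omega under I_s ~ Poisson(Omega * M_s) over tau recent days.

    A gamma(shape0, rate0) prior gives a gamma posterior with
    shape0 + sum(I_s) and rate0 + sum(M_s).
    """
    M = rms_incidence(incidence, delta)
    est = [None] * len(incidence)
    for t in range(delta + tau - 1, len(incidence)):
        idx = range(t - tau + 1, t + 1)
        shape = shape0 + sum(incidence[s] for s in idx)
        rate = rate0 + sum(M[s] for s in idx)
        est[t] = shape / rate
    return est


flat = [100] * 40
growth = [round(10 * math.exp(0.1 * t)) for t in range(40)]
est_flat = omega_posterior_mean(flat, delta=10)
est_growth = omega_posterior_mean(growth, delta=10)
```

Replacing `rms_incidence` with a total infectiousness computed from an assumed w would turn the same update into an estimate of R, which is the sense in which only the denominator changes.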
Our results are plotted in Figure 2. We show that $\hat{\Omega}_t$ responds as expected to both changes in the true $R_t$ and w, subject to the limits on what can be inferred [24]. In Figure 2 we achieve changes in w by altering the mean generation time $g_t$ by ratios that are similar in size to those reported from empirical data [15]. In contrast, we observe that $\hat{R}_t$ provides incorrect and overconfident transmissibility estimates, which emerge because its temporal fluctuations also have to encode structural differences due to misspecification of $\Lambda_t$. These can strongly mislead our interpretation and understanding of the risk posed by a pathogen.
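To see why a change in generation times alone moves Ω, the sketch below evaluates $\Omega_t = R_t \Lambda_t / M_t$ for the same R and the same past incidence under two generation time distributions. The gamma-shaped weights are a crude discretisation (density evaluated at integer lags and renormalised), and the means, standard deviations, growth rate and R are hypothetical values chosen only for illustration.

```python
import math


def gamma_weights(mean, sd, m):
    """Weights w_1..w_m proportional to a gamma density at lags u = 1..m."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    raw = [u ** (shape - 1.0) * math.exp(-u / scale) for u in range(1, m + 1)]
    total = sum(raw)
    return [x / total for x in raw]


def omega_given_w(R, w, past_recent_first):
    """Omega_t = R_t * Lambda_t / M_t for known w and past incidence."""
    m = len(w)
    lam = sum(wu * iu for wu, iu in zip(w, past_recent_first))
    rms = math.sqrt(sum(iu * iu for iu in past_recent_first) / m)
    return R * lam / rms


m, r, R = 30, 0.1, 1.3
past = [100.0 * math.exp(-r * u) for u in range(1, m + 1)]  # I_{t-1}, ..., I_{t-m}
omega_long = omega_given_w(R, gamma_weights(8.0, 4.0, m), past)
omega_short = omega_given_w(R, gamma_weights(4.0, 4.0, m), past)
omega_flat = omega_given_w(R, gamma_weights(8.0, 4.0, m), [50.0] * m)
```

With growing incidence, the shorter generation time places more weight on recent, larger counts, so Ω rises even though R is unchanged. With flat incidence the weights have no effect and Ω equals R.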
We simulate epidemics using generation time distributions of Ebola virus disease (EVD) [31] and COVID-19 [32] in panels A and B. The means of these distributions (g) vary over time (grey), but we fix their variance at their original values. We find substantial bias in estimates of R (red with 95% credible intervals, true value in black), with estimates attempting to compensate for generation time mismatches in an uncontrolled manner that obscures interpretation. However, Ω responds as we expect (blue with 95% credible intervals, window δ, true value in black) and we infer change-points due to both R and g fluctuations (subject to bounds induced by noise i.e., at low incidence inference is more difficult [24]). Our estimates derive from EpiFilter [25] with default settings and we truncate the time series to start from δ to remove any edge effects.
Ranking epidemics or variants by transmissibility
Misspecification of generation time distributions, and corresponding misestimation of R as in Figure 2, also plays a crucial role when assessing the relative transmissibility of pathogens, variants of concern or even outbreaks (where we may want to contrast the spread of contagion among key demographic or spatial groups). As shown in Figure 1, these variations can mean that increases in the growth rate rt actually signify decreases in the effective reproduction number Rt or that a pathogen with a larger rt can have a smaller Rt. Here we illustrate that these issues can persist even if the generation time distributions of pathogens are correctly specified and remain static, obscuring our understanding of relative transmission risk.
In Figure 3 we simulate epidemics under two hypothetical variants of two pathogens. We use EVD and COVID-19 generation time distributions from [31,32] to define our respective base variants. For both pathogens we specify the other variant by reducing the mean generation time of each base but hold the generation time variance fixed. Reductions such as these are plausible and have been measured for COVID-19 variants [17]. All distributions are stationary and known in this analysis. We discover that changes in $R_t$ alone can initiate inversions in the relative growth rate of different variants or epidemics. As far as we can tell, this phenomenon has not been explicitly investigated. Given that interventions can change $R_t$ in isolation or in combination with w [15,18], this effect has the potential to be widespread.
We simulate epidemics in blue (with estimates of metrics also in blue) under standard generation time distributions of Ebola virus disease (EVD) [31] and COVID-19 [32] in panels A and B. In red (with estimates also in red) we overlay simulations in which the generation time of these diseases is 40% and 50% shorter, which may indicate a new co-circulating variant or another epidemic with different properties (e.g., due to being in a higher risk group). We show (for the first time to our knowledge) that changes in R due to an intervention (or release of one) may alter the relative growth rates (r) of the epidemics. The mismatches in the R-r rankings alter perceptions of relative risk, making comparisons of transmissibility difficult. However, Ω is able to classify the risk of these epidemics in line with their realised growth rates, while still offering the individual-level interpretability of a reproduction number. True values are in black and all estimates (with 95% credible intervals) are outputs from EpiFilter [25] with default settings. We truncate the time series to start from δ to remove any edge effects.
Interestingly, the angular reproduction numbers of Figure 3 do preserve an ordering that is consistent with the relative growth rates, while maintaining the interpretability (e.g., threshold properties) of a reproduction number. Hence, we argue that Ωt blends advantages from both Rt and rt [4] and serves as a useful outbreak analytic for understanding and conveying the relative risk of spread of differing pathogens or pathogen strains, or of spread among different spatial and demographic groups. As recent research has only started to disentangle the component drivers of transmission, including the differing influences that interventions may introduce (e.g., by defining notions of the strength and speed of control measures [33]) and the diverse properties of antigenic variants [17], we believe that Ω can play a key role in accelerating these investigations.
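A small numerical example makes the ranking argument concrete. It assumes the compartmental relation $gr = R - 1$ quoted earlier and the growth-rate map $\Omega = \sqrt{2\delta r/(1 - e^{-2\delta r})}$; the two variants, their growth rates and their mean generation times are hypothetical values, not estimates from any data set.

```python
import math


def omega_from_r(r, delta):
    """Omega = sqrt(x / (1 - exp(-x))) with x = 2 * delta * r."""
    x = 2.0 * delta * r
    if abs(x) < 1e-12:
        return 1.0
    return math.sqrt(x / -math.expm1(-x))


def R_from_r(r, g):
    """Compartmental relation g * r = R - 1 with mean generation time g."""
    return 1.0 + g * r


delta = 20.0
variants = {
    "A": {"r": 0.10, "g": 5.0},   # faster growth, shorter generation time
    "B": {"r": 0.08, "g": 10.0},  # slower growth, longer generation time
}
summary = {
    name: {
        "r": v["r"],
        "R": R_from_r(v["r"], v["g"]),
        "Omega": omega_from_r(v["r"], delta),
    }
    for name, v in variants.items()
}
```

Here variant A grows faster yet has the smaller R, while Ω (computed with a shared window δ) ranks the variants in the same order as r.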
Reproduction numbers for explanation or prediction?
We highlight an important but underappreciated subtlety when inferring the transmissibility of epidemics – that the value of accurately estimating R, r and Ω largely depends on if our aim is to explain or predict [34] the dynamics of epidemics. The above analyses have focussed on characterising transmissibility to explain mechanisms of spread and design interventions. For these problems, misestimation of parameters, such as R, can affect our understanding of outbreak risk and consequently might misinform the implementation of control measures. An important concurrent problem aims to predict the likely incidence of infections from these estimates. This involves projecting epidemic dynamics forward in time to infer upcoming patterns in incidence of infections.
Here we present evidence that the solution of this problem, at least over short projection time horizons, is robust to misspecification of generation times provided both the incorrect estimate and the misspecified denominator are used in conjunction. We repeat the analyses of Figure 2 for 200 replicate epidemics and apply EpiFilter [25] to obtain the one-step-ahead predictive distributions for every t. We compute the predicted mean square error (PMSE) and the accumulated predictive error (APE). These scores, which we denote as $D_{\mathrm{PMSE}}$ and $D_{\mathrm{APE}}$, average square errors between mean predictions and true incidence and sum log probabilities of observing the true incidence from the predicted distribution respectively [35,36]. We plot the distributions of scores over replicates and illustrate individual predictions in Figure 4.
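Both scores can be sketched for a sequence of one-step-ahead predictions. Two assumptions are made here that the text does not fix: the predictive distribution is taken to be Poisson with the predicted mean (a full Bayesian predictive would also integrate over the posterior of the estimate), and APE is reported as the negative sum of log probabilities so that smaller values indicate better predictions. Function names are illustrative.

```python
import math


def poisson_logpmf(k, mu):
    """log P(X = k) for X ~ Poisson(mu), with mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def pmse(pred_means, observed):
    """Average squared error between predicted means and observed incidence."""
    return sum((o - p) ** 2 for p, o in zip(pred_means, observed)) / len(observed)


def ape(pred_means, observed):
    """Negative sum of log predictive probabilities of the observed incidence."""
    return -sum(poisson_logpmf(o, p) for p, o in zip(pred_means, observed))


observed = [12, 18, 25]
good = [11.0, 19.0, 24.0]  # hypothetical one-step-ahead predicted means
poor = [20.0, 10.0, 40.0]
scores = {name: (pmse(p, observed), ape(p, observed))
          for name, p in (("good", good), ("poor", poor))}
```

The predicted mean at each t would be the estimate multiplied by its own denominator ($\hat{R}_t \Lambda_t$ or $\hat{\Omega}_t M_t$), which is why a misspecified denominator need not harm prediction when it is paired with the estimate derived from it.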
We simulate 200 replicates of the epidemics from Figure 2, which involve non-stationary changes to EVD and COVID-19 generation times. We use estimates of effective, R, and angular, Ω, reproduction numbers to produce successive one-step-ahead predictions and assess their accuracy to the simulated (true) incidence. Panels A-D provide a representative example of a single simulated epidemic (true incidence shown as black dots) and the R and Ω one-step ahead predictions (red and blue respectively with 95% credible intervals). In panels E-F we formally compute accuracy using distance metrics, D, based on accumulated prediction errors (APE, dashed) and prediction mean square errors (PMSE, solid) for all 200 replicates from R, Ω and R given knowledge of the generation time changes i.e., R|w. We obtain distributions of D by applying kernel smoothing. We find negligible differences in predictive power from all approaches.
We find only negligible differences among the one-step-ahead predictive accuracies of the R estimated given knowledge of the changing generation times (R|w), the R estimated assuming an unchanged (and hence wrongly specified) w and our inferred Ω. As APE and PMSE also measure model suitability, their similarity across the three estimates demonstrates that, if the problem of prediction is of interest, then incorrect generation time choices are not important as long as the erroneous denominator ($\Lambda_t$) and estimate ($R_t$) are used together. If this estimate is, however, used outside of the context of its denominator (e.g., if it is simply input into other studies), then inaccurate projections will occur (in addition to poor estimates). As multi-step-ahead predictions can be composed from iterated one-step-ahead ones [37], we conjecture that subtleties between prediction and explanation are likely to also apply on longer horizons.
Empirical example: COVID-19 in mainland China
We complete our analysis by illustrating the practical usability of Ω on an empirical case study where generation time changes are known to have occurred. In [15], the dynamics of COVID-19 in mainland China were tracked across January and February 2020. Transmission pair data indicated that the serial interval of COVID-19 shortened across this period, leading to biases in the inferred R if updated serial intervals were not used. Here serial intervals, which measure the lag between the symptom onset times of an infector and infectee, are used as a proxy for the generation time. Figure 5 presents our main results. We find Ω (blue), which requires no serial interval information, behaves similarly to the R (red) inferred from the time-changing w. Both metrics appear less biased than estimates of R (green) that assume a fixed serial interval. This is largely consistent with the original investigation in [15].
We analyse COVID-19 data from [15], which spans 9th January 2020 to 13 February 2020 and is known to feature a serial interval distribution that shortened in mean substantially from 7.8d to 2.6d (change times are shown as grey vertical lines). We assume that the serial interval approximates the generation time well and replicate the analysis from Figure 2 of [15]. In panel A, we compare estimates (green) of effective reproduction numbers, R, using fixed generation time distributions inferred in [15] (specified by their means g) against those of our angular reproduction number Ω (blue). We use EpiFilter [25] to obtain all estimates (means shown with 95% credible intervals) and find relative trends similar to those in Figure 2 of [15]. In panel B we plot the incidence (black) and the denominators we use to compute an R that does account for the generation time changes (Λ, red) and for Ω (M, blue). This R uses the different distributions inferred at the grey vertical change times (their means are in panel B and are also the fixed distributions of panel A in sequence). We plot these R and Ω estimates in panel C. In panel D we show the growth rates that are inferred from the R and Ω estimates of C (red and blue respectively) against that obtained from taking the smoothed log derivative (black).
We see that Ω provides a lower assessment of the initial transmissibility as compared to the R that is best informed by the changing w but that both agree in general and in particular at the important threshold between super- and subcritical spread. Interestingly, Ω indicates no sharp changes at the w change-times. This likely follows because the incidence is too small for those changes to influence overall transmissibility. We confirm this point by comparing r estimates derived from R (red, from [14]), Ω (blue, from Eq. (5)) and from the empirical log gradient of smoothed incidence (black, [4]). We find that the r from Ω agrees closely with the empirical growth rate, suggesting that it is more reliable than the r from R, which somewhat by design shows jumps at the w change-points. While this analysis is not meant as a detailed study of COVID-19 in China, it does demonstrate the practical utility of Ω.
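The empirical growth rate used as the reference here is the log gradient of smoothed incidence. A minimal sketch follows, assuming a centred moving average for the smoothing and a central difference for the derivative; both are illustrative choices and the smoother used in [4] may differ. The synthetic incidence is hypothetical.

```python
import math


def moving_average(x, width):
    """Centred moving average; width must be odd, edges are dropped."""
    half = width // 2
    return [sum(x[t - half:t + half + 1]) / width
            for t in range(half, len(x) - half)]


def log_gradient(x):
    """Central difference of log(x), an estimate of the growth rate per time step."""
    logs = [math.log(v) for v in x]
    return [(logs[t + 1] - logs[t - 1]) / 2.0 for t in range(1, len(logs) - 1)]


incidence = [5.0 * math.exp(0.07 * t) for t in range(30)]  # synthetic, r = 0.07
growth = log_gradient(moving_average(incidence, 7))
```

On noise-free exponential incidence this recovers the growth rate exactly, because a moving average of an exponential is the same exponential up to a constant factor.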
Discussion
Quantifying the time-varying transmissibility of a pathogen remains an enduring challenge in infectious disease epidemiology. Changes in transmissibility may signify shifts in the dynamics of an epidemic of relevance to both preparedness and policymaking. While this challenge has been longstanding, the statistics that we use to summarise transmissibility have evolved from dispersibility [38] and incidence to prevalence ratios [39] to cohort [40] and instantaneous [22] reproduction numbers. While the last, which we have denoted R, has become the predominant metric of transmissibility, all of these proposed statistics ultimately involve a ratio between new infections and a measure of active infections (i.e., the denominator). Deciding on appropriate denominators necessitates some notion (implicit or explicit) of a generation time [41].
Difficulties in characterising these generation times and their changes substantially bias [6] estimates of transmissibility and have motivated recent works to propose the instantaneous growth rate, r, as a more reliable approach for inferring pathogen spread [20]. However, on its own, r is insufficient to resolve many of the transmission questions that R can answer and its computation may employ smoothing assumptions that are in some instances equivalent to the generation time ones behind R [4]. We formulated a novel statistic, the angular reproduction number Ω, to merge advantages from both R and r and to provide a more comprehensive view of transmissibility. By applying basic vector algebra (Eqs. 1-3), we encoded both changes to R and the generation time distribution, w, into a single time-varying metric, deriving Ω.
We found that Ω maintains the threshold properties and individual-level interpretability of R but responds to variations in w in a manner consistent with r (Figure 1). Moreover, Ω indicates variations in transmissibility caused by R and w without requiring measurement of generation times (Figure 2). This is a consequence of its denominator, which is the root mean square of infections over a user-specified window δ that is relatively simple to tune (see Methods). We can interpret Ω = a > 1 as indicating that infections over δ need to be reduced by a factor of a⁻¹ (this reduces mean and root mean square infections by a⁻¹). Further, Ω circumvents identifiability issues surrounding the joint inference of R and w [42] by refocussing on only estimating the net changes produced by both. This improves our ability to explain the shifts in transmissibility that underpin observed epidemic dynamics and results in Ω serving as a reproduction number that provides an individual-level interpretation of growth rates (Eqs. 4-5).
The benefits of this r-Ω correspondence are twofold. First, as interventions may alter R, w or R and w concurrently [15,18] situations can arise where r and R disagree across time on both the drivers and magnitude of transmissibility. Second, this disagreement can also occur when comparing pathogenic variants or epidemics (e.g., from diverse spatial or sociodemographic groups) with different but known and unchanging w. As far as we can tell, this study is among the earliest to highlight these discrepancies. Realistic transmission landscapes feature all of these complexities, meaning that conventional measures of relative transmissibility can be fraught with contradictions. In contrast to R, we found that Ω consistently orders epidemics by growth rate while capturing notions of the average new infections per past infection (Figure 3). This suggests Ω blends the advantages of R and r, with clear assumptions (choice of δ).
However, Ω offers no advantage if our goal is to predict epidemic dynamics (see [34] for more on the prediction-explanation distinction). For this problem even R inferred with a misspecified denominator performs equally well (Figure 4). This follows because only the product of any reproduction number and its denominator matters when determining the next incidence value, and iterations of this underpin multi-step ahead predictions [37]. This may be the reason why autoregressive models, which ignore some characteristics of w, can serve as useful predictive models [43]. Other instances where Ω will not improve analysis are at times earlier than δ (due to edge effects [9]) and in periods of near zero incidence (there is no information to infer R either [24]). We summarise and compare key properties of R, r and Ω in Table 1 below.
We list important relationships among the instantaneous growth rate (r), the instantaneous or effective reproduction number (R) and the angular reproduction number (Ω) and assess their value as measures of transmissibility.
There are several limitations to our study. First, we only examined biases inherent to R due to the difficulty of measuring the generation time accurately and across time. While this is a major limitation of existing transmissibility metrics [15], practical surveillance data are also subject to under-reporting and delays, which can severely diminish the quality of any transmissibility estimates [23,42,44]. While Ω ameliorates issues due to generation time mismatch, it is as susceptible as R and r to surveillance biases, so corrective algorithms (e.g., deconvolution methods [45]) should be applied before inferring Ω. Second, our analysis depends on renewal and compartmental epidemic models [22]. These assume random mixing and cannot account for realistic contact patterns. Despite this key structural uncertainty, there is evidence that well-mixed and network models are comparable when estimating transmissibility [46].
Although the above limitations can, in some instances, reduce the added value of improving the statistics summarising transmissibility, we believe that Ω will be of practical and theoretical benefit. Its similarity in formulation to R means it is as easy to compute using existing software and therefore can be deployed on dashboards and updated in real time to improve situational awareness. Further, Ω makes comparison and communication of the relative risk of circulating variants or epidemics among diverse groups more reliable because it avoids R-r contradictions provided a known parameter, δ, is fixed. This is in contrast to R, which is hard to contextualise [20] because its value depends on an often-unknown w, and hence hard to compare across groups, as each group may have distinct and correspondingly poorly specified denominators. Last, Ω can help probe analytical questions about how changes in R and w interact because it presents a common framework for testing how variations in either influence overall transmissibility.
Methods
Inferring angular reproduction numbers across time
We outline how to estimate Ωt given a time series of incident infections, with T defining the present or last available data timepoint i.e., 1 ≤ t ≤ T. Because Ωt simply replaces the total infectiousness Λt, used for computing Rt, with the root mean square of the new infection time series, Mt, we can obtain Ωt from standard Rt estimation packages with minor changes. This requires evaluating Mt over some user-defined backward sliding window of size δ. Under a Poisson (Pois) renewal model this follows as in Eq. (6) for timepoint t.
The choice of δ is mostly arbitrary but should be sufficiently long to capture most of the likely probability mass of the unknown generation time but not overly long since it induces an edge effect (similar to the windows in [9,36]). We found a suitable heuristic to be twice or thrice the initial expected mean generation time (g0). We can then input Mt and It into packages such as EpiEstim [9] or EpiFilter [25] to estimate Ωt with 95% credible intervals.
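As a minimal sketch of the denominator described above, the snippet below computes a root mean square of incidence over a backward window of size δ and a crude point ratio It/Mt. The function names, the example incidence values and the exact index convention (the window covers the δ values strictly before t) are assumptions made for illustration. The ratio is only a stand-in for the posterior estimates that EpiEstim or EpiFilter would return.

```python
import math

def rms_denominator(incidence, delta):
    """Root mean square of the `delta` incidence values preceding each timepoint.

    Assumption: the window for time t is incidence[t-delta:t] (excludes t itself).
    Entries with fewer than `delta` past values are None (the edge effect).
    """
    M = []
    for t in range(len(incidence)):
        if t < delta:
            M.append(None)
            continue
        window = incidence[t - delta:t]
        M.append(math.sqrt(sum(x * x for x in window) / delta))
    return M

def naive_omega(incidence, delta):
    """Point ratio I_t / M_t; a crude stand-in for a full posterior estimate."""
    M = rms_denominator(incidence, delta)
    return [None if m is None or m == 0 else i / m
            for i, m in zip(incidence, M)]

# Hypothetical incidence curve, for illustration only.
incidence = [2, 3, 5, 8, 12, 18, 27, 40]
delta = 3
M = rms_denominator(incidence, delta)
omega = naive_omega(incidence, delta)
```

With constant incidence the ratio equals 1, which matches the threshold behaviour described for Ω.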
Due to the similarity between computing Rt and Ωt we only specify the latter but highlight that replacing Mt with Λt yields the expressions for evaluating any equivalent quantities from Rt. The only difference relates to how the growth rates rt are computed. We estimate rt from Rt by applying the generation time distribution based transformation from [14]. For a correctly specified generation time distribution this gives the same result as the smoothed derivative of the incidence curve [4]. We derive r from Ω using Eq. (5), which follows by rearranging Eq. (4) into a form that admits Lambert W function solutions. In all estimates of rt we propagate uncertainty from the posterior distributions (see below) over Rt or Ωt.
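The link between Ω and r can be illustrated numerically under one explicit assumption that is ours and not necessarily the exact form of Eqs. (4)-(5): incidence grows exponentially at a constant rate r across the window δ, treated in continuous time. The root mean square denominator then gives Ω² = x / (1 − e^(−x)) with x = 2rδ. The sketch below inverts this relation by bisection using only the standard library, instead of evaluating the Lambert W function; all function names are hypothetical.

```python
import math

def _f(x):
    """x / (1 - exp(-x)), arranged to avoid overflow for negative x."""
    if abs(x) < 1e-9:
        return 1.0 + x / 2.0              # first-order Taylor expansion near zero
    if x > 0:
        return x / -math.expm1(-x)        # -expm1(-x) = 1 - exp(-x)
    return x * math.exp(x) / math.expm1(x)

def omega_from_r(r, delta):
    """Omega implied by constant exponential growth r over a window delta (assumed model)."""
    return math.sqrt(_f(2.0 * r * delta))

def r_from_omega(omega, delta, iterations=200):
    """Invert omega_from_r by bisection; _f is monotonically increasing in x."""
    if omega <= 0:
        raise ValueError("omega must be positive")
    target = omega * omega
    if target >= 1.0:
        # For x > 0 we have x < _f(x) < x + 1, which brackets the root.
        lo, hi = max(0.0, target - 1.0), target
    else:
        lo, hi = -1.0, 0.0
        while _f(lo) > target:            # expand until the root is bracketed
            lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if _f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / (2.0 * delta)

delta = 14
r_grow = r_from_omega(omega_from_r(0.1, delta), delta)
r_decline = r_from_omega(omega_from_r(-0.05, delta), delta)
```

Under this assumed model, Ω > 1 maps to r > 0 and Ω < 1 to r < 0. Uncertainty can be propagated by applying `r_from_omega` to samples or quantiles of the Ω posterior.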
We applied EpiFilter in this study due to its improved extraction of information from the incidence time series. This method assumes a random walk state model for our transmissibility metric as in Eq. (7) with εt−1 as a normally distributed (Norm) noise term and η as a free parameter (default 0.1).
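To give a feel for how such a state model behaves, the following sketch simulates one random walk path. The square-root scaling of the noise, the standard normal noise term and the positivity floor are assumptions made for illustration. They should not be read as the exact form of Eq. (7) or as EpiFilter's implementation, which works with a discretised grid of states.

```python
import math
import random

def simulate_random_walk(omega0, steps, eta=0.1, seed=0, floor=1e-6):
    """Simulate omega_t = omega_{t-1} + eta * sqrt(omega_{t-1}) * eps_{t-1}.

    Assumptions (illustrative only): eps is standard normal, the noise scales
    with sqrt(omega), and values are floored to stay positive.
    """
    rng = random.Random(seed)
    path = [omega0]
    for _ in range(steps):
        prev = path[-1]
        step = eta * math.sqrt(prev) * rng.gauss(0.0, 1.0)
        path.append(max(floor, prev + step))
    return path

path = simulate_random_walk(omega0=1.5, steps=50, eta=0.1)
```

Larger η lets the metric change faster between timepoints, while smaller η yields smoother trajectories.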
The EpiFilter approach utilises Bayesian smoothing algorithms incorporating the models of Eqs. (6)-(7) and outputs the complete posterior distribution of Ωt given all the incidence data, with T as the complete length of all available data (i.e., 1 ≤ t ≤ T). We compute our mean estimates and 95% credible intervals from this posterior distribution and these underlie our plots in Figures 2-3.
EpiFilter also outputs the one-step-ahead predictive distributions of incidence, which we use in Figure 4. There we also quantify predictive accuracy using the predicted mean square error (PMSE) and the accumulated prediction error (APE), defined as in Eq. (8) [35,36], which compare the posterior mean estimate from the predictive distribution against the true simulated incidence. These are computed with only the data available up to each prediction time and not the complete time series, ensuring no future information is used.
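The two accuracy scores can be sketched as follows, under assumptions that may differ from Eq. (8): PMSE is taken as the average squared difference between true incidence and the one-step-ahead predictive mean, and APE as the accumulated negative log predictive probability of the true incidence. For simplicity the predictive distribution is approximated here by a Poisson with the predicted mean, whereas EpiFilter provides a full predictive distribution. All names and example values are hypothetical.

```python
import math

def pmse(true_incidence, predicted_means):
    """Average squared error of one-step-ahead predictive means (assumed normalisation)."""
    errors = [(i - m) ** 2 for i, m in zip(true_incidence, predicted_means)]
    return sum(errors) / len(errors)

def poisson_log_pmf(k, lam):
    """Log probability of count k under a Poisson distribution with mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def ape(true_incidence, predicted_means):
    """Accumulated negative log predictive probability (Poisson predictive assumed)."""
    return -sum(poisson_log_pmf(i, m)
                for i, m in zip(true_incidence, predicted_means))

# Each predicted mean must use only data before its timepoint.
true_incidence = [4, 6, 9, 13]
predicted_means = [3.5, 6.2, 8.1, 14.0]
scores = (pmse(true_incidence, predicted_means),
          ape(true_incidence, predicted_means))
```

Lower values of either score indicate better one-step-ahead predictions.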
We collectively refer to these as distance metrics, D, and construct their distributions, P(D), over many replicates of simulated epidemics. Last, we use the posterior distribution of Ωt to compute the posterior distribution of the growth rate rt and hence its estimates as in Eq. (5). More details on the EpiFilter algorithms are available at [25,30,47]. We supply open-source code to reproduce all analyses at https://github.com/kpzoo/Omega and a function in MATLAB and R to allow users to estimate Ωt from their own data at https://github.com/kpzoo/EpiFilter.
Data Availability
Data and code used to perform all analyses are freely available at https://github.com/kpzoo/Omega.
Funding
KVP acknowledges funding from the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO), under the MRC/FCDO Concordat agreement and is also part of the EDCTP2 programme supported by the European Union. The funders had no role in study design, data collection and analysis, decision to publish, or manuscript preparation.