Epidemic dynamics in inhomogeneous populations and the role of superspreaders
=============================================================================

* K. Kawagoe
* M. Rychnovsky
* S. Chang
* G. Huber
* L. M. Li
* J. Miller
* R. Pnini
* B. Veytsman
* D. Yllanes

## Abstract

A variant of the SIR model for an inhomogeneous population is introduced in order to account for the effect of variability in susceptibility and infectiousness across a population. An initial formulation of this dynamics leads to infinitely many differential equations. Our model, however, can be reduced to a single first-order one-dimensional differential equation. Using this approach, we provide quantitative solutions for different distributions. In particular, we use GPS data from ∼10<sup>7</sup> cellphones to determine an empirical distribution of the number of individual contacts and use this to infer a possible distribution of susceptibility and infectivity. We quantify the effect of superspreaders on the early growth rate ℛ of the infection and on the final epidemic size, the total number of people who are ever infected. We discuss the features of the distribution that contribute most to the dynamics of the infection.

## I. INTRODUCTION

A strong temptation in modeling a system consisting of many similar parts is to make the assumption that these parts have identical properties. Accordingly, the classical models in epidemiology assume (often implicitly) that everyone has the same propensity to be infected and, if infected, the same propensity to infect others [1]. This assumption may be justified when differences in the salient parameters are small. However, one of the interesting features of the current COVID-19 pandemic is the huge variation in infectivity: small numbers of infectious events or individuals seem to be responsible for a large number of cases [2–14]. This feature seems to be present in other coronavirus epidemics including SARS [15–17] and MERS [18–20].
One can point to different explanations for this phenomenon: individual variations in viral load and shedding [14, 21, 22], in droplet production (see the review in [23]), in contact networks [8, 10, 12, 24, 25], and differences in the features of ventilation systems at certain events and venues [26, 27]. Inhomogeneity seems to have played an important role for other epidemics as well [28–30], leading to the rule of thumb that “20% of patients produce 80% of infections” [31]. However, it seems that for coronavirus-related infections the variability is even higher than that heuristic [13, 14]. In a recent book [32], the historian Lepore has noted that “the study of the human condition is not the same as the study of the spread of viruses and the density of clouds and the movement of the stars,” which is incontrovertible. The converse, however, is not: it appears that the spread of viruses *is* dependent on at least one aspect of the human condition, namely the intrinsic variability and lack of uniformity of human behavior. There are two related, but distinct, notions of superspreading in this literature, namely, *superspreading events* and *superspreading individuals*. Superspreading events are events that produce many infections. Super-spreading individuals (*superspreaders*) are specific people that produce many infections (such as Typhoid Mary in the early 1900s). As one might imagine, in reality, some combination of these two processes is present. In this paper, however, we set our sights on the latter phenomenon: a superspreader is always an individual, rather than an event. It is reasonable to assume that a variability in infectivity is accompanied by a variability in susceptibility. 
Common explanations of variability in individual infectivity — increased shedding due to higher rate of virus multiplication in the given host, increased exposure period, and increased personal contacts — suggest that increased infectivity may correlate with increased susceptibility. We note that there are arguments for the opposite correlation: for example, some studies indicate that older age may correspond to higher susceptibility but lower infectivity [33], while other studies seem to contradict this finding [34]. The change of contact patterns caused by mitigation measures further confounds this issue [35]. However, the assumption of positive correlation between infectivity and susceptibility seems to be a reasonable one. One conclusion of this is that superspreaders might be more prominent at the early stages of an epidemic. During the course of an epidemic, the fraction of super-spreaders will typically decrease with time. This would lead to a change in the apparent value of the average transmission rate, which could make it difficult to evaluate the effectiveness of mitigation measures. This effect might be quite large and is not captured by many standard models. Generally speaking, inhomogeneity may lead to a large change in the mean behavior of a system, especially when fast growth is involved. Understanding the effect of inhomogeneity would increase the fidelity of models based on real-world data, and lead to more effective public policy. Several recent works (see, e.g., [9–12, 25, 36–38]) have addressed the issue of heterogeneity in the population, but they either concentrate on specific distributions or treat the variability in infectivity and susceptibility separately, without considering the effect of a possible correlation between the two. Ref. [10] in particular reports numerical experiments with heterogeneous quenched contact probabilities (ours are uniform). 
A rich set of outcomes was obtained that calls for an analytical treatment in the flavor of this manuscript. In this work we discuss the epidemic dynamics for a population with variable infectivity potential accompanied by variable individual susceptibility. We obtain the results for the general case of an arbitrary distribution of susceptibility and infectivity. We also provide a calculation of ℛ that quantifies the unintuitive effect of superspreaders on the early growth rate of the epidemic and find that it depends strongly on the correlation between susceptibility and infectivity. Furthermore, one of the distributions holds a special interest. If we assume that the main driver of inhomogeneity is diversity in the number of social contacts for an individual, then data [39] on the distribution of these contacts suggests a very wide distribution of infectivity and susceptibility. An important question for modeling the inhomogeneity is whether the result depends only on the moments of the distribution (mean, variance, skewness, …) or on the behavior of the tails of the distribution. The answer to this question could inform the construction of realistic predictive models in the future. We discuss both the cases of fat tails and skinny tails, and the transition between these regimes. We recently became aware of work by Tkachenko and collaborators [40] that also employs a model with distributed infectivity and susceptibility of the population. The treatments of the issue in [40] and the present paper are, however, different. We attempt to study the range of effects caused by a varying degree of correlation in a systematic way, in the framework of a minimal model. Tkachenko et al., on the other hand, postulate a one-to-one correspondence between susceptibility and infectivity. This assumption is a limiting case for our model, corresponding to the “worst-case scenario” for the epidemic, as we show in Appendix C. In terms of fits to real-world data, Tkachenko et al. 
consider time series of hospitalizations in Chicago and NYC to calibrate their model, while we take a different approach and work from anonymized GPS data from cell phones to infer empirical distributions of the number of social contacts. In situations with matching assumptions, the conclusions of [40] and the present paper are in agreement. We believe that, together, these two works present a nicely complementary view of correlation between susceptibility and infectivity in epidemics.

The rest of the paper is organized as follows: In Section II, we give a mathematical description of the dynamics of our model. In Section III, we reduce our model to a one-dimensional integro-differential equation, analyze the long-time dynamics, and describe an early-time criterion for epidemic outbreak. In Section IV, we compare the results of our model for different distributions of population attributes, including an empirical one from anonymized cell phone data. We end with our discussion and conclusions in Section V. In the Appendices, we provide derivations which are relevant to the main text and we discuss some of the methodological aspects of our empirical data.

## II. THE MODEL

Classic SIR models [1] divide the population into three compartments: susceptible *S*, infected *I*, and recovered (or deceased) *R*. The rate of new infections in this model is proportional to the number of encounters of susceptible persons with the infected persons, while the rate of recovery is proportional to the number of infected persons.
This gives us the well-known SIR equations ![Formula][1] where *S, I*, and *R* are the fractions of susceptible, infected, and recovered persons to the constant population size, dot means the time derivative, *β* and *γ* are non-negative constants, and we use the fact that, with our normalization, the fraction of susceptible persons *S* satisfies the equation ![Formula][2] We use the simplest version of the model, which accounts neither for additional births and deaths, nor for population migration. Additionally, we do not allow for the possibility of recovered individuals being reinfected. We now allow the parameters to be different for different individuals. Namely, let the infection rate *β* in equation (1) be the product of individual susceptibility *s* and infectivity *σ*. To obtain the rate of infection, we integrate over the values of *s* for susceptible individuals and over the values of *σ* for infected individuals. Note that in our model the values of *s, σ*, and *γ* are fixed for each person and do not change with time. Let *p*(*σ, s, γ*) d*σ* d*s* d*γ* be the probability that a person selected uniformly at random from the population has susceptibility *s*, and, when infected, has infectivity *σ* and recovery rate *γ*. Note that *p* does not change with time in our model. We will have reason to make repeated use of the averaging operator 𝔼: for any function *f* (*σ, s, γ*), we define ![Formula][3] Equations (1) and (2) should now be rewritten, because *I, R* and *S* are not just functions of time *t*, but also depend on *s, σ*, and *γ*. Namely, let *I*(*σ, s, γ, t*) *dσ ds dγ* be the probability that a person selected uniformly from the entire population at time *t* is infected and has (initial) susceptibility *s*, infectivity *σ* and recovery rate *γ*. Similarly we introduce *S*(*σ, s, γ, t*) and *R*(*σ, s, γ, t*). 
Then equation (2) becomes ![Formula][4] and equations (1) become ![Formula][5] ![Formula][6] When the proportion of infected individuals is small, *S*(*σ, s, γ, t*) in equation (5) is close to *p*(*σ, s, γ*), giving a linear approximation of equation (5). For distributions where *γ* is a constant, it can be shown (Appendix B) that the early behavior of an epidemic is determined by ℛ = 𝔼[*σs*]/*γ*.

The total fraction *Ω*(*t*) of persons who have ever been infected at time *t* is the sum of currently infected and recovered individuals. If we stratify *Ω* by *s, σ*, and *γ*, we can write down ![Formula][7] with ![Formula][8] The final epidemic size is ![Formula][9] We will use index 0 for the initial conditions in equations (5) and (6), so *I*<sub>0</sub>(*σ, s, γ*) = *I*(*σ, s, γ*, 0), etc.

## III. ANALYTIC RESULTS

In this section we discuss the general properties of our model. We assume that the distribution of infectivity and susceptibility is such that the moments 𝔼[*σ*], 𝔼[*s*], and 𝔼[*σs*] as defined in equation (3) exist. If the distribution is so heavy-tailed that these moments do not exist, then important integrals in our analysis will not converge. This is not merely a technical restriction. For instance, the short-time behavior of the model should be quite different if 𝔼[*σs*] is infinite.

Let us introduce the notation: ![Formula][10] ![Formula][11] An individual has infectivity *σ* if infected and 0 if not. Therefore, 𝔼[*σ*] is the maximal average infectivity (when everyone is infected simultaneously), and *ϕ*(*t*) is the ratio of the current average infectivity and the maximal one. Further, *ψ*(*t*) is the historical average of *ϕ*(*t*). Both these quantities are thus between zero and one.
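
As a rough numerical illustration of the stratified dynamics and of *ϕ*(*t*) and *ψ*(*t*), the following minimal sketch integrates a finite-group version of the model with constant *γ*. It assumes that equations (5) and (6) have the form described in the text (a susceptible person is infected at a rate equal to *s* times the infectivity-weighted infected fraction, and infected persons recover at rate *γ*), that *ϕ* is this infectivity-weighted infected fraction divided by 𝔼[*σ*], and that *ψ* is its time average. The function name `simulate`, the forward-Euler scheme, the step size, the value *γ* = 0.2 per day, and the two example populations are illustrative choices and are not taken from the paper.

```python
def simulate(groups, gamma, eps=1e-4, dt=0.01, t_max=300.0):
    """Forward-Euler integration of a finite-group version of the model.

    groups: list of (p_i, sigma_i, s_i) tuples with the p_i summing to 1.
    Returns (omega, nu): the fraction ever infected at t_max and the time
    integral of phi up to t_max, i.e. psi(t_max) * t_max.
    All names and numerical settings are hypothetical.
    """
    p = [g[0] for g in groups]
    sigma = [g[1] for g in groups]
    s = [g[2] for g in groups]
    mean_sigma = sum(pi * sig for pi, sig in zip(p, sigma))  # E[sigma]

    infected = [eps * pi for pi in p]   # small seed proportional to p
    recovered = [0.0] * len(p)          # nobody has recovered at t = 0
    nu = 0.0                            # running integral of phi(t)

    for _ in range(int(round(t_max / dt))):
        # Infectivity-weighted infected fraction (discrete integral of sigma * I).
        force = sum(sig * i for sig, i in zip(sigma, infected))
        nu += dt * force / mean_sigma   # phi(t) = force / E[sigma]
        for k in range(len(p)):
            susceptible = p[k] - infected[k] - recovered[k]
            new_infections = s[k] * susceptible * force
            recoveries = gamma * infected[k]
            infected[k] += dt * (new_infections - recoveries)
            recovered[k] += dt * recoveries

    omega = sum(infected) + sum(recovered)
    return omega, nu


if __name__ == "__main__":
    gamma = 0.2  # hypothetical recovery rate, per day
    homogeneous = [(1.0, 0.6, 0.6)]
    # Same mean sigma and s (0.6), but a wide, fully correlated two-group population.
    heterogeneous = [(0.9, 0.4, 0.4), (0.1, 2.4, 2.4)]
    for name, groups in [("homogeneous", homogeneous),
                         ("heterogeneous", heterogeneous)]:
        omega, nu = simulate(groups, gamma)
        print(name, round(omega, 3), round(nu, 3))
```

In this toy setting the two populations share the same 𝔼[*σ*] and 𝔼[*s*], so any difference in the outcome comes from the shape of the distribution and from the correlation between *σ* and *s*. A production calculation would likely use an adaptive ODE solver rather than a fixed Euler step.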
In our model (without births or immigration and no persons with zero recovery rate), there are no infected persons at *t* → ∞, so in this limit ![Formula][12] It is shown in Appendix A that the stratified fraction of people who ever have been infected at time *t* [see equations (7) and (8)] is ![Formula][13] For outbreaks started with a small number of infected persons, almost all remaining individuals are susceptible, so *S* ≈ *p*. The number of currently infected individuals is ![Formula][14] As a result, if we know *ψ*(*t*), then we know the full solution. It is shown in Appendix A that *ψ*(*t*) is a solution of the equation ![Formula][15] To study the behavior of equation (15) we will make several simplifying assumptions. First, we assume a constant recovery rate across the population: ![Formula][16] This means that the other variables (*S, I, R*) are also proportional to *δ*(*γ* − *γ′*); we will use the same notation for them as functions of *σ* and *s*. Second, we assume the initial number of recovered individuals is zero, ![Formula][17] Third, we assume that the initial distribution of infected persons is proportional to *p*(*σ, s*), and is small: ![Formula][18] To see why any other initial distribution *I* that is small should behave similarly see Appendix B. With these assumptions equation (15) can be further transformed from an integro-differential equation to a first-order differential equation ![Formula][19] for the function *ν*(*t*) = *ψ*(*t*)*t* (See equation (A14)). To numerically solve equation (15) it is convenient to rewrite it as two first-order differential equations (See Appendix E). In the rest of this section we discuss the properties of the solution of this equation. Let us start with the final epidemic size [equation (9)]. It can be shown (Appendix A) that at *t*→ ∞ the function *ψ*(*t*) in equation (11) behaves as 1/*t*. 
Choose *L* so that at large *t*, ![Formula][20] Then equation (9) with *T* from equation (13) becomes (see Appendix A) ![Formula][21] where *L* is the unique nonnegative root of the equation ![Formula][22] We are interested in an infection started with a small number of initial cases, which corresponds to *ε* → 0. If in this limit equation (22) has a strictly positive root, the final epidemic size ![Formula][23] is non-zero, and does not depend on *ε*: in other words, the epidemic takes off. If the limit does not have a strictly positive root then the infection immediately dies out and the final epidemic size is 0. In this *ε* → 0 limit *F* (0) = 0 and *F* (1/*γ*) *>* 0, so equation (22) has a positive (non-zero) root if d*F* (0)/d*L <* 0. Taking the derivative, we see that a non-zero root corresponds to the condition ![Formula][24] Given this result, we take a brief detour from our discussion of *t*→ ∞. Another way to look at epidemic spread is to study the short term behavior of the solution. Our analysis (Appendix B) shows that the initial small infection spreads with exponential rate ℛ = 𝔼[*σs*]/*γ* determined by equation (24). The upshot is that the growth rate of the epidemic is highly dependent on how correlated the infectivity and susceptibility are. One naive generalization of ℛ from the SIR model, i.e., the average number of secondary infections produced by a typical infection would be ![Graphic][25]. To explain why ℛ, rather than ![Graphic][26], determines the exponential growth rate of the infected population we will illustrate what the two quantities measure. If we choose a person from the *entire* population uniformly at random and infect them, then the average number of secondary infections would be ![Graphic][27]. For instance if a cruise ship travels somewhere and almost everyone is infected, then when they return home the expected number of secondary infections each person produces will be ![Graphic][28]. 
On the other hand a person who was infected via community spread (early in the epidemic) will cause on average ℛ secondary infections. The difference between these cases is that in the first case almost all travelers are infected so the fact that someone is infected tells us little about their susceptibility, whereas in the second case people are infected via community spread which occurs with a probability proportional to their susceptibility early in the epidemic. See Appendix B for details. We will now continue our discussion of the final epidemic size with some limiting cases. As mentioned above, for an epidemic to spread, it is necessary that ℛ = 𝔼[*σs*]/*γ* ≥ 1. Near this transition, where ℛ ≈ 1, we may write down an approximation for *L*. Again, we will be interested in the limit of small initial epidemic size *ε* → 0, although it is not difficult to generalize the following result for non-zero *ε*. Let ℛ *>* 1. Assuming that *L* is small, and that *p*(*σ, s*) falls off quickly enough for large *s*, we may approximate equation (22) as ![Formula][29] Therefore, if we get close enough to the transition where 𝔼[*σs*] − *γ* is small ![Formula][30] In this regime, equation (21) gives the total epidemic size as ![Formula][31] Let us now briefly discuss the opposite limit. Instead of *γ* being so large that the epidemic almost doesn’t start, we study *γ* so small that the epidemic infects almost everyone. It is expected that if *γ* = 0, then the entire population will eventually become infected; that is, *Ω*∞ = 1. Equation (21) shows that in this case *L* → ∞. It is easy to show that for small *γ, L* ≈ 1/*γ*, and equation (21) predicts an exponentially small number of individuals not infected. This framework allows one to make predictions for a number of specific distributions discussed in the next section. 
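
To make the distinction between a uniformly chosen index case and one infected through community spread concrete, the sketch below compares the expected number of early secondary infections in the two situations. It assumes that an infected person with infectivity *σ* in an almost fully susceptible population causes about *σ* 𝔼[*s*]/*γ* secondary infections, which follows from the product form *σs* of the infection rate, and that early community spread selects people with probability proportional to their susceptibility *s*. The function names, the log-normal sample, and all parameter values are hypothetical.

```python
import math
import random


def sample_population(n, mean, log_sd, correlated, seed=0):
    """Draw n (sigma, s) pairs with log-normal marginals of the given mean.

    correlated=True sets sigma = s; otherwise sigma and s are independent.
    """
    rng = random.Random(seed)
    mu = math.log(mean) - 0.5 * log_sd ** 2  # makes the log-normal mean equal `mean`
    people = []
    for _ in range(n):
        s = rng.lognormvariate(mu, log_sd)
        sigma = s if correlated else rng.lognormvariate(mu, log_sd)
        people.append((sigma, s))
    return people


def reproduction_numbers(population, gamma):
    """Return (uniform_case, community_case, r_formula) for a sample.

    uniform_case:   index case drawn uniformly from the population.
    community_case: index case drawn with probability proportional to s.
    r_formula:      E[sigma * s] / gamma, the growth criterion in the text.
    """
    n = len(population)
    mean_sigma = sum(sig for sig, _ in population) / n
    mean_s = sum(s for _, s in population) / n
    mean_sigma_s = sum(sig * s for sig, s in population) / n

    uniform_case = mean_sigma * mean_s / gamma

    # Size-biased average of sigma: each person is weighted by their share of s.
    total_s = sum(s for _, s in population)
    biased_sigma = sum((s / total_s) * sig for sig, s in population)
    community_case = biased_sigma * mean_s / gamma

    return uniform_case, community_case, mean_sigma_s / gamma


if __name__ == "__main__":
    gamma = 0.2  # hypothetical recovery rate, per day
    for correlated in (True, False):
        pop = sample_population(20000, 0.6, 0.5, correlated, seed=1)
        print(correlated, reproduction_numbers(pop, gamma))
```

The size-biased average reproduces 𝔼[*σs*]/*γ* identically, whereas the uniform average only matches it when *σ* and *s* are uncorrelated; this is the sense in which correlation between infectivity and susceptibility speeds up the early epidemic.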
We conclude the general discussion with one very interesting case: when the distribution has a very small number of “superspreaders”, individuals with anomalously high infectivity. (Here very small means small enough to not appreciably change 𝔼[*σs*].) A relevant question is whether these individuals have an oversized contribution in the epidemic. Equations (21) and (22) show that this is *not* the case, and the contribution of superspreaders is limited by the linear term in the average value of 𝔼[*σs*] (see Appendix D). Therefore, while superspreaders still contribute to the dynamics, they are a primary driver of infection in our model only when their number is large enough to significantly change ℛ.

## IV. RESULTS FOR DIFFERENT DISTRIBUTIONS OF INFECTIVITY AND SUSCEPTIBILITY

Let us further illustrate the general results using specific distributions for *s* and *σ*. First, consider an *N*-component SIR model. That is, there are *N* different types of individuals who have parameters *σ*<sub>*i*</sub>, *s*<sub>*i*</sub>, *γ*<sub>*i*</sub> and represent a portion of the population *p*<sub>*i*</sub>, and ![Formula][32] *δ*(*x*) being Dirac’s delta-function. In the case where *N* = 1, this reduces to the standard SIR model. We see in Appendix B that this model is a limiting case of the model presented in this paper [41].

Another useful distribution to study is the Gamma distribution with *σ* = *s*. In particular, we are interested in the distribution ![Formula][33] where ![Formula][34] and *α, β* are positive constants. This system is interesting to study because the integrals involved in solving for *L* are analytically tractable. In the case where *α* = 1 we recover the exponential distribution and we can find *Ω*<sub>∞</sub> exactly (equation (F6)). We analyze the case of the Gamma distribution in Appendix F.

We further illustrate the dynamics of epidemics using several special cases of distributions of infectivity *σ* and susceptibility *s* with the assumption of constant recovery rate *γ*.
(See Appendix C for an analysis of which distributions lead to the worst outcomes for the final epidemic size.) Even with constant *γ* the answer depends on the probability distribution *p*(*σ, s*). We discuss three limiting cases: (i) completely independent *σ* and *s*, with *p*(*σ, s*) = *p*<sub>*σ*</sub>(*σ*)*p*<sub>*s*</sub>(*s*); (ii) completely positively correlated *σ* and *s* with *σ* ∝ *s*; and (iii) positively correlated *σ* and *s* with a correlation coefficient *ρ*. Note that since only the product *σs* enters the equations, we always can multiply *σ* by a constant factor *f*, and *s* by the factor 1/*f*. We choose this factor to ensure that 𝔼[*σ*] = 𝔼[*s*]. In the numerical calculations in this section we used the following parameters, roughly following [42–44]: ![Formula][35]

At present, our understanding of variability in individual susceptibility and infectivity is far from complete. While the consensus is that they have a wide distribution (see the discussion in the Introduction), the shape of this distribution is not known, and most studies assume a convenient one for their calculations. Since we want to explore the dependence of the dynamics on the distribution itself, rather than on its parameters, we compare two reasonable *a priori* assumptions: a log-normal distribution with the parameters *µ* and ![Graphic][36], and a Gamma distribution with the parameters *α* and *β*.

Another approach is to suggest some mechanism for the variability and choose a distribution that follows this mechanism. One such mechanism is the variability of individual contacts: the more contacts a person has, the higher their *s* and *σ*. It is important to note that in this model *s* is completely correlated with *σ* because they are caused by the same mechanism. We are fortunate to be able to use empirical data about the number of contacts from the “path-crossing” network described in Looi et al. [39].
Their network is constructed from the mobility data provided by SafeGraph, a company that aggregates and anonymizes geolocation data from cell phone applications. SafeGraph collects GPS location pings for millions of adult smartphone users in the United States, where each ping represents the latitude and longitude of one user at one timestamp. Looi et al. [39] transform the set of location pings into a dynamic network, where users are represented as nodes, and edges indicate the number of times two users crossed their paths (see Appendix G for the details). We use the number of path crossings as a proxy for the number of users’ social contacts, which is in turn a proxy for susceptibility and infectivity.

Due to the number of assumptions here, one should be careful with the interpretation of the results. We do not claim that the SafeGraph data provide *the* distribution of *σ* and *s*. Rather we think they suggest features of the real distribution. Moreover, it should be stressed that this is just one of many possible mechanisms for susceptibility and infectivity heterogeneity, see, e.g., the discussion in [13]. We do not claim that this is the only, or even the main, mechanism; it is just the one for which we have data.

An interesting feature of the SafeGraph distribution is that it is very wide. The average number of contacts per user is 0.342 × 10<sup>3</sup>, while the standard deviation is 1.04 × 10<sup>3</sup>. We can try to approximate the empirical distribution of contacts using a theoretical distribution. In Figure 1 we show log-normal and gamma approximations together with the empirical distribution with the same mean and variance.

[Figure 1.](http://medrxiv.org/content/early/2021/09/21/2021.02.08.21251386/F1) Comparison of empirical, log-normal and gamma distributions with the same average infectivity 𝔼[*s*] = 0.6 day<sup>−1/2</sup> and variance *ζ*<sup>2</sup> with *ζ* = 4.16 day<sup>−1/2</sup>.

In the remainder of this section we discuss the numerical solutions of the model equations for the log-normal, Gamma, and empirical distributions obtained with the approach discussed in Appendix E. See Appendix F for analytical solutions in special cases.

In Figure 2 we compare the epidemic’s progression for log-normal and Gamma distributions with the same mean *s* and varying distribution widths. We see that a wider distribution leads to a lower epidemic size. When the width of the distribution decreases, the curve approaches the one for the classical SIR model. An interesting feature is that a wide correlated distribution of *s* and *σ* leads to an earlier start of the epidemic instead of the S-like curve of the standard SIR model.

[Figure 2.](http://medrxiv.org/content/early/2021/09/21/2021.02.08.21251386/F2) Comparison of epidemic spread for Gamma [panels (a)–(b)] and log-normal [panels (c)–(d)] distributions of infectivity and susceptibility with standard deviation *ζ* and parameters in equation (31). The cases of independent or completely correlated *σ* and *s* are shown.

In Figure 3 we study the influence of the positive correlation between infectivity and susceptibility. For simplicity we show just the final size *Ω*<sub>∞</sub>. As demonstrated by this figure, the more correlated these parameters are, the larger the final size is, as predicted by the analysis in the previous section.

[Figure 3.](http://medrxiv.org/content/early/2021/09/21/2021.02.08.21251386/F3) Dependence of final epidemic size *Ω*<sub>∞</sub> on *ρ* where [ln(*s*), ln(*σ*)] is a Gaussian vector with mean 𝔼[*s*] = 𝔼[*σ*] = 0.6 day<sup>−1/2</sup>, Var(*s*) = Var(*σ*) = *ζ*<sup>2</sup> and correlation 𝔼[*sσ*] = 𝔼[*s*] 𝔼[*σ*] exp(*ρς*<sup>2</sup>) with *ς*<sup>2</sup> ≡ ln[1 + *ζ*<sup>2</sup>/(𝔼[*s*] 𝔼[*σ*])]. Note that *ρ* is the correlation coefficient for ln(*s*) and ln(*σ*) rather than for *s* and *σ*.

For another comparison we take the empirical number of contacts between the individuals (Appendix G) as a proxy for both *s* and *σ*. We renormalize the number of contacts to obtain the average infectivity 𝔼[*s*] in equation (31). This leads to variance *ζ*<sup>2</sup> = 17.27 day<sup>−1</sup> (*ζ* = 4.16 day<sup>−1/2</sup>). Then we fit the parameters of log-normal and Gamma distributions to get the same 𝔼[*s*] and *ζ*. All three distributions are shown in Figure 1. The results are shown in Figure 4 together with the solution for the classical SIR model with the infectivity and susceptibility equal to the averages 𝔼[*s*] and 𝔼[*σ*].

[Figure 4.](http://medrxiv.org/content/early/2021/09/21/2021.02.08.21251386/F4) Epidemic progression for the distributions shown in Figure 1 with parameters in equation (31). A classical SIR solution for the same susceptibility and infectivity is also shown.

The figures suggest that, generally speaking, variability in susceptibility and infectivity lowers the final epidemic size, and the correlation between them increases it. Important special cases of this statement are proven in Appendix C, and based on the figures, we expect it to hold more generally.

Of special interest is the question of whether individuals with high infectivity (“superspreaders”) influence the epidemic dynamics and final epidemic size.
To model the effect of superspreaders we can discuss a special bimodal distribution of infectivity, ![Formula][37] where *p*<sub>n</sub> describes “normal” persons with low *σ*, and *p*<sub>s</sub> describes superspreaders with high *σ*. In our numerical experiments we modeled superspreaders using a power-law distribution ![Formula][38] with the parameters *a* = 4, *b* = 1.2 day<sup>−1/2</sup>. With these parameters the average infectivity of superspreaders is 1.8 day<sup>−1/2</sup>, i.e., three times the average infectivity in our simulations. The results are shown in Figure 5. We see that the influence of superspreaders is at most linear in their proportion *λ*. This is not coincidental: as shown in Appendix D, the effect of superspreaders is at most linear.

[Figure 5.](http://medrxiv.org/content/early/2021/09/21/2021.02.08.21251386/F5) Final epidemic size for a mix of normal individuals (same distribution as in Figure 2) and superspreaders described by equation (33). The effect of superspreaders is at most linear in their proportion.

## V. DISCUSSION AND CONCLUSIONS

The aim of any idealized model is to provide insights about the “real world”. We believe our model provides several important insights beyond the assumptions involved in its derivation and treatment.

First, the variation in individual susceptibility and infectivity does matter. All examples studied in Section IV have the same average susceptibility and infectivity, but the outcomes differ greatly. Generally, wider distributions lead to a lower final epidemic size and, in the case of correlated infectivity and susceptibility, a faster initial outbreak.

Second, the correlation between infectivity and susceptibility is important: the higher the correlation, the larger the epidemic size.
Third, the average and the width of infectivity and susceptibility are *not* enough to predict the outcome: the actual shape of the distribution matters too. The comparisons of log-normal and Gamma distributions in Figure 2, and of three different distributions having the same first and second moments in Figure 4, demonstrate this clearly.

This conclusion shows that a prediction of the epidemic’s spread is a hard task from the practical point of view. Indeed, one never knows the exact shape of the distribution, since it involves the measurement of individual infectivities and susceptibilities of a great many people. The sensitivity to the shape of the distribution beyond a couple of moments is bad news for precise predictions.

Having said this, we still need to answer the question of which features of the distribution are the most salient for predictions. A number of works have stressed the importance of superspreaders: individuals or events with anomalously high potential for spreading (see the Introduction). Our model suggests a more nuanced view. On the one hand, because the susceptibility and infectiousness of individuals are correlated through how many people someone interacts with, increasing the number of superspreaders in a way that does not change the average infectivity or susceptibility will increase ℛ = 𝔼[*σs*]/*γ*, which greatly increases how fast the infection takes off, lowers the threshold for an outbreak, and somewhat increases the final epidemic size. On the other hand, in the unrealistic case where we add pairs of one superspreader and one unusually careful person so that the variance increases and ℛ is unchanged, adding both these people will actually tend to decrease the final epidemic size. This can be seen in equations (21) and (22) where we have exponentials suppressing the contribution of individuals with anomalously high susceptibility (or high infectivity, if these parameters are correlated).
This can also be seen in Appendix C and in Figure 4. The final result is determined by the average 𝔼[*σs*] *and* the distribution shape at low to moderate susceptibilities. It should be noted that, for wide distributions, the median *s* and mean *s* are quite different, and our conclusion concerns mean, rather than median, susceptibility. Perhaps the following analogy may help to understand the meaning of this result. In comic books the outcome of a war is determined by a handful of superheroes and supervillains. In reality, by contrast, the final result is determined by the combined effort of many people at the lowest rungs of the military hierarchy: privates, petty and junior officers, and so forth. Our conclusion is that epidemic spread is like the “real war” rather than the “comic-book one”. The foregoing analysis has an essential implication for public health policy. While the prevention of superspreading is important (it changes the exponential growth rate ℛ = 𝔼[*σs*]/*γ* and drives down the averages in equations (21) and (22)), it is the mundane everyday efforts that matter most. Therefore, a way to localize the outbreak before mass vaccination becomes an option is to drive down the spread during many daily activities and perform rigorous tracing of everyday contacts. It is important to note that this discussion concerns superspreading *individuals*. The suppression of superspreading *events*, on the other hand, might be a powerful tool in preventing the spread. In sum, we provide a simple, but efficient mathematical apparatus to calculate the epidemic dynamics for a population with variable infectivity and susceptibility, and cast it in a form suitable for numerical estimates. We hope this apparatus might turn out to be useful beyond the insights formulated in this paper. Our model has some limitations that provide ample motivation for future work. One of the most important among them is the neglect of spatial inhomogeneity (the panmictic assumption). 
In reality each person has their own network of contacts, which is a source of inhomogeneity in infectivity [11]. The spread of infection is significantly influenced by the finite size of the individual’s contact network compared to the full population [7, 11, 43–46]. It would be very interesting to model the combination of spatial inhomogeneity *and* inhomogeneity in infectivity and susceptibility.

Lastly, operating in the current pandemic context, where long-lasting immunity has been observed, we have operated within the SIR paradigm, where a recovered person can never be infected again. If we allow for the reinfection of recovered individuals, such as in an SIRS or SIS model, we would expect superspreaders to have a much greater impact on the course of the epidemic. This is because their removal from the system at early times is now only temporary. Therefore, considering the possibility of reinfection will be a very important future application of our methods.

## Data Availability

Not applicable

## ACKNOWLEDGMENTS

The authors acknowledge the generous support of Chan Zuckerberg Initiative and Chan Zuckerberg Biohub. We are grateful to Dan Zigmond and Rob Phillips for their provocative questions and creative curiosity. We are also greatly indebted to interactions within the COVID-19 Discussion Group, including D. S. Fisher, M. Kamb, G. Le Treut and A. McGeever. M. Rychnovsky is partially supported by I. Corwin’s NSF grant DMS:1811143 as well as the Fernholz Foundation’s “Summer Minerva Fellows” program. D.Y. acknowledges support by MINECO (Spain) through Grant No. PGC2018-094684-B-C21, partially funded by the European Regional Development Fund (FEDER).

## Appendix A Derivation of main equations

This Appendix is dedicated to the derivation of the main equation and the results of the general analysis in Section III. First, we derive equation (13).
Let us add equations (5) and (6) and use the definitions of *T*(*σ, s, γ, t*) to obtain ![Formula][39] By inspection, we may verify that equation (13) is a solution to this differential equation. We see that this solution satisfies the initial conditions ![Formula][40]

We now turn to the derivation of equation (14). First, we use equation (5) and the definitions of *T*(*σ, s, γ, t*) and *ϕ*(*t*) to write down ![Formula][41] Substituting this expression into equation (A1), we arrive at ![Formula][42] or, equivalently ![Formula][43] This differential equation admits a solution ![Formula][44] We now perform integration by parts in the above expression. In the second step and the second-to-last step, we will use our solution for *T*(*σ, s, γ, t*) from equation (13). ![Formula][45] and therefore ![Formula][46] This final line matches equation (14).

Finally, we derive the equations of motion for *tψ*(*t*) as written in (15). We begin by substituting our solution for *I*(*σ, s, γ, t*) into the definition of *ϕ*(*t*) in equation (10): ![Formula][47] Noticing that *ϕ*(*t*) = d(*ψ*(*t*)*t*)/d*t*, we get ![Formula][48] Integrating this equation by parts, we get ![Formula][49] which matches equation (15).

Let us now derive equation (22) and propose an iterative algorithm for its numerical solution. Assuming constant *γ* (equation (16)), we multiply both sides of equation (15) by $e^{\gamma t}$ and take a time derivative of both sides: ![Formula][50] Taking the derivative of the left hand side, multiplying by $e^{-\gamma t}$ and integrating over time, we get ![Formula][51] where *C* is a constant based on initial conditions. With the initial conditions (18), we get *C* = 𝔼[*σ*]. As an aside, we may alternatively write equation (A13) as a first-order, time-independent equation using *ν*(*t*) = *ψ*(*t*)*t*.
![Formula][52] We rewrite this as ![Formula][53] It is not hard to see that the right hand side is Lipschitz in *ν*, so the solution exists and is unique on $\mathbb{R}_{\ge 0}$ by a standard application of the Picard–Lindelöf theorem. In fact we have a bijection between solutions to the system (4), (5), (6) and solutions to (A15) given by ![Formula][54] in one direction and by equations (13) and (14) in the other. Thus, existence and uniqueness of solutions to (A15) implies the existence and uniqueness of solutions to the system (4), (5), (6).

We already know that $\lim_{t\to\infty}\psi(t) = 0$ (equation (12)). Suppose that ![Graphic][55] *dt* converges, and thus the following limit exists: ![Formula][56] Then in the case *S*(*σ, s*) = 1 − *ε* we obtain equation (22).

To justify the assumption (A17) we construct an algorithm to calculate *L* and prove it converges to a non-negative root of equation (22). We use the following iterations: ![Formula][57] ![Formula][58] Below we will prove that the sequence $L_i$ converges to the relevant root.

Lemma 1. *Suppose equation* (22) *has non-negative roots, and* ![Graphic][59] *is the largest root. Then the sequence* $L_0, L_1, \dots$ *converges to* ![Graphic][60].

*Proof*. We will prove that for all *i* ![Formula][61] Then the sequence $L_0, L_1, \dots$ is bounded and non-increasing, and therefore converges. The limit of this sequence is a root of equation (22), and due to inequality (A20) and the fact that ![Graphic][62] is the largest root, it converges to ![Graphic][63]. First, note that from equations (22) and (A18) it follows that ![Graphic][64]. For *i* = 1 we have from the iteration equation (A19) $L_1 \le L_0$ and, since ![Graphic][65], ![Formula][66] so inequality (A20) is true. Suppose this inequality is true for *i* − 1, i.e. ![Formula][67] Then we will prove it for *i*.
Indeed, ![Formula][68] and ![Formula][69] In other words if the inequality is true for *i* − 1, it is true for *i*, so it is true for all *i*. □

Lemma 2. *Equation* (22) *always has a non-negative root no smaller than ε/γ*.

*Proof*. Similarly to the proof of Lemma 1 we can prove the inequality ![Formula][70] Indeed, for any *i* we can iteratively prove that ![Formula][71] Therefore the sequence $L_0, L_1, \dots$ converges to a number no smaller than *ε/γ*. This number is a root of equation (22), which, according to Lemma 1, is the largest root. □

The last lemma shows that the assumed behavior of *ψ*(*t*) at large *t* is indeed *ψ* ≈ *L/t*.

## Appendix B Short-time behavior and initial conditions

In this section we show that in a mixed population the parameter that determines whether an infection grows exponentially or dies out is ![Formula][72] We also show that the long term behavior of the epidemic does not depend on the initial conditions.

At early time, when the proportion of the population infected and the proportion of the population recovered are very small, equations (5) and (6) can be linearized as ![Formula][73] and ![Formula][74] We consider the case where *γ* is fixed for the entire population, and the distribution ![Graphic][75] is a finite combination of delta functions. With the notation $I_i(t) = I(\sigma_i, s_i, t)$, equations (B1) and (B2) become a finite-dimensional system of equations ![Formula][76] We rewrite this as ![Formula][77] with $I = (I_1(t), \dots, I_n(t))^T$ and $A_{ij} = p_i s_i \sigma_j - \gamma \mathbb{1}_{i=j}$. Let ![Formula][78] Let ![Formula][79] From this we see that the largest eigenvalue of *A* is ![Graphic][80] with the associated eigenvector |(*sp*)⟩, and that all other eigenvectors are perpendicular to *σ* and have eigenvalue −*γ*.
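The eigenvalue structure just described can be checked directly on a small example. The following is a minimal sketch, assuming the matrix has the rank-one-plus-diagonal form $A_{ij} = p_i s_i \sigma_j - \gamma\,\delta_{ij}$; the three-group population and all numerical values are hypothetical and chosen only for illustration.

```python
# Hypothetical three-group population (illustrative values, not data from the paper).
p = [0.5, 0.3, 0.2]       # fraction of the population in each group
s = [0.5, 1.0, 4.0]       # susceptibility of each group
sigma = [0.4, 1.2, 3.0]   # infectivity of each group
gamma = 1.0               # recovery rate, assumed equal for all groups
n = len(p)

# Assumed linearized matrix: A_ij = p_i * s_i * sigma_j - gamma * delta_ij
A = [[p[i] * s[i] * sigma[j] - (gamma if i == j else 0.0) for j in range(n)]
     for i in range(n)]

def matvec(M, v):
    """Multiply the square matrix M by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

# Candidate leading eigenvalue: E[s*sigma] - gamma
growth = sum(p[i] * s[i] * sigma[i] for i in range(n)) - gamma

# Candidate leading eigenvector: v_i = p_i * s_i
v = [p[i] * s[i] for i in range(n)]
Av = matvec(A, v)

# A vector w perpendicular to sigma (sum_j sigma_j * w_j = 0) should decay at rate gamma
w = [sigma[1], -sigma[0], 0.0]
Aw = matvec(A, w)

print("growth rate E[s*sigma] - gamma =", growth)
print("A v / v =", [Av[i] / v[i] for i in range(n)])
```

In this toy case the early growth rate is positive exactly when 𝔼[*sσ*] exceeds *γ*, which is the threshold condition discussed in this Appendix.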
Now a general distribution *p*(*σ, s*) can be approximated by a sum of point masses, so we conclude that the linear equations (B1) and (B2) have the largest eigenvalue ![Formula][81] with corresponding eigenvector *I*(*σ, s*) = *sp*(*σ, s*) and all other eigenvalues negative. If *p*(*σ, s*) is a compactly supported distribution we conclude that if a small enough proportion of the total population is infected at time zero, then until the proportion of the population that is susceptible drops appreciably below 1, we have ![Formula][82]

The quantity ℛ is also what epidemiologists measure when they measure the number of secondary infections produced by a typical infection in the very early stages of the epidemic. The key to understanding why this number is 𝔼[*sσ*]/*γ* instead of 𝔼[*s*]𝔼[*σ*]/*γ* comes from the word “typical.” Based on equation (B8), early in the epidemic the probability *q*(*σ, s*) that a person with infectivity *σ* and susceptibility *s* is infected is proportional to *sp*(*σ, s*), so ![Formula][83] To find the number of secondary infections per unit time this “typical infection” produces, we take this person’s infectivity and multiply by the average susceptibility in the population to get $\sigma_{\text{typical}}\,\mathbb{E}[s]$. Averaging $\sigma_{\text{typical}}$ over the measure *q*(*σ, s*) gives ![Formula][84] Multiplying by the typical recovery time ![Graphic][85] gives the expected number of secondary infections. As with the usual SIR model, if ℛ > 1 the infection will spread and if ℛ < 1 the infection will die out.

This allows us to see that the growth rate of an epidemic is highly dependent on how correlated *s* and *σ* are, with higher correlation leading to a higher growth rate. In a true population we expect a person’s infectivity *σ* and susceptibility *s* to be highly correlated through factors like how many people someone interacts with.
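The difference between 𝔼[*sσ*]/*γ* and 𝔼[*s*]𝔼[*σ*]/*γ* can be made concrete with a toy calculation. The sketch below assumes a hypothetical two-group population in which 10% of people have ten times the contact rate of everyone else; the group sizes, contact rates, and recovery rate are illustrative assumptions, not values from the SafeGraph data.

```python
# Toy two-group population (all numbers are illustrative assumptions).
fractions = [0.9, 0.1]    # ordinary individuals, high-contact individuals
contacts = [1.0, 10.0]    # relative contact rate of each group
gamma = 2.0               # recovery rate, assumed identical for everyone

def weighted_mean(values, weights):
    """Expectation of `values` under the discrete distribution `weights`."""
    return sum(w * x for w, x in zip(weights, values))

# Fully correlated case: s = sigma = contact rate, so E[s*sigma] = E[contacts^2].
mean_s_sigma = weighted_mean([c * c for c in contacts], fractions)
R_correlated = mean_s_sigma / gamma

# Independent case with the same marginals: E[s*sigma] = E[s] * E[sigma].
mean_contacts = weighted_mean(contacts, fractions)
R_independent = mean_contacts * mean_contacts / gamma

print("R (s and sigma fully correlated):", R_correlated)
print("R (s and sigma independent):     ", R_independent)
```

With identical marginal distributions, the correlated population has a reproduction number roughly three times larger in this example, purely because the high-contact group is both more likely to be infected and more likely to infect.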
In particular, superspreaders have an outsize effect on the early growth of the epidemic in the most realistic case where *s* and *σ* are highly correlated, because in this case ℛ grows like the second moment $\mathbb{E}[\sigma^2]$ of the infectivity rather than the first moment.

The second takeaway is that if the proportion of the population that is infected at time 0 is small enough, the solution for the system (5), (6) does not significantly depend on the details of the initial conditions. This can be seen by writing the initial profile of infected *I*(*σ, s*) as a sum of eigenvectors for equations (B1) and (B2), ![Formula][86] and comparing with (B8) to see that ![Graphic][87] has minimal effect, and the long term solution is almost identical to the solution starting from initial condition ![Formula][88]

## Appendix C Worst-case distributions

In this section we discuss which distributions provide the highest possible epidemic size $\Omega_\infty$ (the “worst-case scenarios”). We prove two statements:

1. *Variability is good*. If *s* and *σ* are independent, then the final epidemic size is less than or equal to the final epidemic size of the classical SIR model with *s* = 𝔼[*s*], *σ* = 𝔼[*σ*]. This conclusion agrees with the other studies of heterogeneity based on different assumptions and models [10, 11, 25, 29, 36, 37, 40].
2. *Strong positive correlation is bad*. If the marginal distributions of *s* and *σ* are known, then the joint distribution *p*(*σ, s*) that maximizes the final epidemic size is given by the “percentile coupling”, where the *n*th most infectious person is also the *n*th most susceptible person.

Both these statements follow from the following lemma:

Lemma 3. *Let µ and ν be two possible joint distributions for* (*s, σ*). *Let* $\mathbb{E}_\mu$ *and* $\mathbb{E}_\nu$ *denote the expectation with respect to µ and ν respectively, and similarly for final epidemic sizes* ![Graphic][89] *and* ![Graphic][90].
*If* ![Formula][91] *and for all c >* 0, ![Formula][92] *and also* ![Formula][93] *then* ![Formula][94]

*Proof*. Using equations (C2) and (C1) together with equation (22) we see that for any *L >* 0, ![Formula][95] Let $L_\mu$ be the unique positive zero of $F_\mu(L)$ if such a zero exists, and otherwise let $L_\mu = 0$. Now $F_\mu(0) = F_\nu(0) = 0$ and both are convex functions of *L*, which together with equation (C5) gives $L_\mu \ge L_\nu$. Then from equations (C3) and (C1) we obtain ![Formula][96] □

To prove (i) let us take a distribution *ν* with independent *σ* and *s*, and let $\mu = \delta(\sigma - \mathbb{E}_\nu[\sigma])\,\delta(s - \mathbb{E}_\nu[s])$. We have $\mathbb{E}_\mu[\sigma] = \mathbb{E}_\nu[\sigma]$ by definition. From Jensen’s inequality [47, §1.7(iv)] ![Formula][97] and from Jensen’s inequality and independence of *s* and *σ* under distribution *ν* we have ![Formula][98] Thus the final epidemic size for our arbitrary distribution with independent *s* and *σ* is not greater than the final epidemic size of a delta mass with the same mean.

To prove (ii) let *ν* be an arbitrary measure with the correct marginal distributions, and let *µ* be the percentile coupling: the most susceptible person is the most infectious, the second most susceptible person is the second most infectious and so on. In particular if we sample twice from *µ* and obtain ($s_1$, $\sigma_1$) and ($s_2$, $\sigma_2$), then with probability 1, the statement $s_1 \ge s_2$ implies $\sigma_1 \ge \sigma_2$. This property implies that if *f* is an arbitrary decreasing function, and *g* is an arbitrary increasing function, then the percentile coupling is the coupling that minimizes the expectation 𝔼[*f*(*s*)*g*(*σ*)] for the given marginal distributions of *s* and *σ*. In particular this distribution minimizes $\mathbb{E}[\sigma e^{-sc}]$ for all *c >* 0, so it satisfies equation (C3). It also has the same marginals as the other measure *ν*, thus inequalities (C1) and (C2) are satisfied.
Thus for the given marginal distributions of *s* and *σ* the percentile coupling is the worst possible joint law in that it maximizes the final epidemic size of the infection.

To understand the meaning of statement (i), consider the case of a population in which everyone has the same infectivity, where some individuals have zero susceptibility, while all other individuals have the same large susceptibility $s_1$. Let $f_1$ be the fraction of the latter individuals. The epidemic size in this population does not exceed $f_1$. To increase the variance of *s* while keeping the mean susceptibility constant, we must increase $s_1$ and decrease $f_1$, so large variance corresponds to lower epidemic size. Similarly one can consider a population with infectivity *σ* being either zero or a single large value, and show that when this value increases and the fraction of infectious individuals decreases the epidemic size also decreases.

Statement (ii) can be explained in the following way. Highly susceptible individuals become infected at the beginning of the epidemic, when the supply of susceptible individuals is high. If these highly susceptible individuals are also highly infectious, they can cause many secondary infections among the naïve population, thus increasing the total size of the epidemic.

## Appendix D The effect of superspreaders

In this Appendix we discuss the effect of superspreaders: a small subpopulation of people with anomalously high infectivity. Consider the distribution of infectivity *σ* and susceptibility *s* as a sum of the “normal” distribution $p_n$ and the superspreaders $p_s$, with the latter having support at $\sigma > \sigma_s$ with large $\sigma_s$, as shown in equation (32). The short term behavior is determined by the value of 𝔼[*sσ*], which can be represented as ![Formula][99] where subscripts *n* and *s* denote averaging with the distributions $p_n$ and $p_s$, respectively.
This equation shows that (i) the only way superspreaders come into short term behavior is the renormalization of average *σs*, and (ii) their influence is linear in the proportion of superspreaders *λ*. Let us discuss the case where the number of superspreaders is low enough that their contribution to the averages is small, i.e. ![Formula][100] In this case the contribution of superspreaders to the short term dynamics is small according to equation (3). We are going to show that there is no anomalous contribution to the long term dynamics either. We are looking into the final epidemic size, which is determined by equations (22) and (21).

First, consider the case where superspreaders have the same susceptibility distribution as the other individuals. In this case *s* and *σ* are independent, and our equations become ![Formula][101] ![Formula][102] We see that in this case the only way superspreaders contribute is by changing 𝔼[*σ*]. Now consider the case where superspreaders have anomalous susceptibility *s*, and higher *σ* corresponds to higher *s*. Then the contribution of superspreaders is asymptotically small in both equations (22) and (21), i.e., again no worse than linear in the number of superspreaders.

## Appendix E Numerically solvable equations

In this Appendix we will recast equation (15) into a set of differential equations suitable for numerical analysis.
With the constant *γ* assumption (16) and initial conditions (17) and (18), we can write down equation (15) as ![Formula][103] with ![Formula][104] and ![Formula][105] The initial condition is ![Formula][106] We introduce the function *ν*(*t*): ![Formula][107] We multiply both parts of equation (E1) by $e^{\gamma t}$ and divide by 𝔼[*σ*]: ![Formula][108] We differentiate this equation with respect to *t* and multiply by $e^{-\gamma t}$: ![Formula][109] Let us introduce a new variable ![Formula][110] then we can write down equation (E7) as ![Formula][111] We need initial conditions for equations (E9). By definition (E5), *ν*(0) = 0. From equations (E8), (E5) and (E4) we get *ξ*(0) = *ψ*(0) = *ε*, so we can write initial conditions as ![Formula][112] Equations (E9) with the initial conditions (E10) depend at any moment *t* on *ξ*(*t*) and *ν*(*t*) only, and therefore can be solved by any suitable method for differential equations.

## Appendix F Special distributions

For several important distributions we can provide analytical results. These results can be used for more sophisticated models, so we provide them below. We are particularly interested in the low-*γ* limit, where outbreaks are large and not easily controlled. We discuss the completely correlated case when *σ*(*s*) is a monotonic function. Since we always can rescale them keeping *σs* constant, let us assume *σ* = *s*, so ![Formula][113]

### 1. The Gamma distribution

Consider a Gamma distribution with fixed *γ* and *s* = *σ*, so *p*(*s*) in equation (F1) becomes ![Formula][114] *α* and *β* being positive constants. First, let us calculate *L*, the root of equation (22). In our case we have ![Formula][115] where 𝔼[*σ*] = *α/β*. This gives for *L* the equation ![Formula][116] which can be easily solved numerically. The final epidemic size is given by equation (21), and may be written as ![Formula][117] In the case of the exponential distribution (i.e., *α* = 1) equation (F5) becomes ![Formula][118] when ℛ > 1.
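As a rough numerical illustration of the Gamma case, the sketch below runs a fixed-point iteration in the spirit of (A19). It assumes, as one reading of equations (21) and (22) with *s* = *σ*, that the limit satisfies $\gamma L = \mathbb{E}[\sigma] - (1-\varepsilon)\,\mathbb{E}[\sigma e^{-sL}]$ and that the final size is $1 - (1-\varepsilon)\,\mathbb{E}[e^{-sL}]$. In this sketch *L* is not divided by 𝔼[*σ*], so it may differ from the *L* of the text by that constant factor. For a Gamma density with shape *α* and rate *β* the two expectations are $(\alpha/\beta)(1+L/\beta)^{-(\alpha+1)}$ and $(1+L/\beta)^{-\alpha}$. The function name `gamma_final_size` and all parameter values are hypothetical.

```python
import math

def gamma_final_size(alpha, beta, gamma, eps=0.0, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for the assumed final-size equation with s = sigma ~ Gamma(alpha, rate=beta).

    Returns (L, final_size), where L is the un-normalized limit (an assumption of this sketch).
    """
    mean_sigma = alpha / beta
    # Start at an upper bound of every root, so the iterates decrease to the largest root.
    L = mean_sigma / gamma
    for _ in range(max_iter):
        # E[sigma * exp(-s L)] for the Gamma distribution with s = sigma
        damped = mean_sigma * (1.0 + L / beta) ** (-(alpha + 1.0))
        L_next = (mean_sigma - (1.0 - eps) * damped) / gamma
        converged = abs(L_next - L) < tol
        L = L_next
        if converged:
            break
    # Final size: 1 - (1 - eps) * E[exp(-s L)]
    final_size = 1.0 - (1.0 - eps) * (1.0 + L / beta) ** (-alpha)
    return L, final_size

# Exponential distribution (alpha = 1) with R = E[s^2] / gamma = 2 / (beta^2 * gamma) = 2
L_sup, size_sup = gamma_final_size(alpha=1.0, beta=1.0, gamma=1.0)
# Same distribution with a faster recovery rate, so that R = 2/3 < 1
L_sub, size_sub = gamma_final_size(alpha=1.0, beta=1.0, gamma=3.0)

print("supercritical final size:", size_sup)
print("subcritical final size:  ", size_sub)
```

Under these assumptions, for *α* = 1 and *ε* = 0 the positive root solves a quadratic, which gives a final size of $1 - 4/(\mathcal{R} + \sqrt{\mathcal{R}^2 + 8\mathcal{R}})$ with $\mathcal{R} = 2/(\beta^2\gamma)$; the iteration can be compared against this value as a consistency check.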
We emphasize that (F5) and (F6) are exact formulas. In the low *γ* limit we may approximate *L* by *L* = 1/*γ* − *f*(*γ*) (see equation (A19) and Lemma 1), where the second term can be written as ![Formula][119] Since *α >* 0, *f*(*γ*) is well defined near *γ* = 0 and the approximation is well-controlled.

### 2. Low-recovery-rate limit for the log-normal distribution

Let us discuss a log-normal distribution with fixed *γ* and *s* = *σ*, where *p*(*s*) in equation (F1) becomes: ![Formula][120] with the constants *τ >* 0 and *µ*. Note that due to equation (F1), ![Formula][121] Using equations (A19) and (F9), we obtain the iterative equation for *ε* → 0: ![Formula][122] where we have defined ![Formula][123] In principle, these equations are enough to construct an iterative solution for *L*. However, we may take this a step further for the low *γ* (large *L*) limit. In particular, if *L* is large, then so is each $L_i$. For $a \equiv e^{\mu}\,\mathbb{E}[s]\,L \gg 1$, equation (F11) can be evaluated by a standard saddle point approximation [48, 49]. Setting ![Graphic][124] and expanding around ![Graphic][125] gives ![Formula][126] where ![Graphic][127] is the principal branch of the Lambert W-function, satisfying *W*(*ρ*) exp *W*(*ρ*) = *ρ*. This expression is valid up to a small correction of order $O(\tau^2/W) \sim O(\tau^2/\ln a) \ll 1$. Returning to our iterative solution for *L* in equation (F11), we will now plug in the previous expression. Note that ![Graphic][128] ![Formula][129] In particular, ![Formula][130] One may continue this iteration procedure to arbitrary precision.

## Appendix G SafeGraph Data

In this Appendix we describe the approach by Looi et al. [39] to transform the set of location pings into a dynamic network. In this network users are represented as nodes, and an edge (*u, v, t*) indicates that user *u* crossed paths with user *v* at time *t*.
A path crossing is defined to occur when two users have pings which are separated by less than 50 meters and less than 5 minutes. It should be stressed that a path here is the same as a world line in relativity theory: it encompasses spatial and temporal dimensions, so the users cross paths if they are at the same place at the same time. To ensure that users are represented accurately, various filters are applied; for example, excluding users with fewer than 500 pings or removing duplicate users, which could potentially occur if a single person carries multiple mobile devices. To compute the path crossings efficiently, the authors apply a sliding time window, and, within each time slice, use a k-d tree to identify all pairs of points within 50 meters of each other. We refer the reader to the original paper for details of the network-construction methodology. The constructed network captures 1 613 884 111 path crossings between 9 451 697 users across three evenly spaced months in 2017 (March, July, and November). The network provides an estimate of the true contact network, where each user’s number of contacts represents how many people they could possibly transmit the virus to or from. Thus, we can use each user’s degree in the path crossing network to estimate their susceptibility and infectivity. Previous analyses of SafeGraph data have shown that it is representative of the US population, in that it does not systematically over-represent users from certain income levels, racial demographics, degrees of educational attainment, or geographic regions [50]. Recently, their mobility patterns have been instrumental in helping researchers study responses to the COVID-19 pandemic and to model the role of mobility in the spread of disease [7, 45, 51, 52]. Even so, there are caveats to the data that we use. 
Most notably, the path-crossing network covers three months in 2017, but individuals’ mobility patterns may have changed substantially following the onset of the pandemic. Furthermore, different types of noise may affect an individual’s number of observed crossings; for example, the frequency with which their phone pings. Filtering for only well-represented users can help to mitigate this issue.

## Footnotes

* † david.yllanes{at}czbiohub.org
* Received February 8, 2021.
* Revision received September 20, 2021.
* Accepted September 21, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/)

## References

1. [1]. H. W. Hethcote, SIAM Rev. 42, 599 (2000).
2. [2]. T. R. Frieden and C. T. Lee, Emerging Infectious Diseases 26, 1059 (2020).
3. [3]. K. Kupferschmidt, Science 368, 808 (2020).
4. [4]. D. Adam, P. Wu, J. Wong, E. Lau, T. Tsang, S. Cauchemez, G. Leung, and B. Cowling, “Clustering and superspreading potential of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections in Hong Kong,” Preprint (Version 1) available at Research Square (2020).
5. [5]. D. Miller, M. A. Martin, N. Harel, T. Kustin, O. Tirosh, M. Meir, N. Sorek, S. Gefen-Halevi, S. Amit, O. Vorontsov, D. Wolf, A. Peretz, Y. Shemer-Avni, D. Roif-Kaminsky, N. Kopelman, A. Huppert, K. Koelle, and A. Stern, Nature Communications 11, 5518 (2020).
6. [6]. A. Endo, S. Abbott, A. J. Kucharski, and S. Funk, Wellcome Open Research 5, 67 (2020).
7. [7]. S. Chang, E. Pierson, P. W. Koh, J. Gerardin, B. Redbird, D. Grusky, and J. Leskovec, Nature 589, 82 (2020).
8. [8]. A. Aleta, D. Martín-Corral, M. A. Bakker, A. Pastore y Piontti, M. Ajelli, M. Litvinova, M. Chinazzi, N. E. Dean, M. E. Halloran, I. M. Longini, A. Pentland, A. Vespignani, Y. Moreno, and E. Moro, medRxiv (2020), doi:10.1101/2020.12.15.20248273.
9. [9]. B. M. Althouse, E. A. Wenger, J. C. Miller, S. V. Scarpino, A. Allard, L. Hébert-Dufresne, and H. Hu, PLoS Biol. 18, e3000897 (2020).
10. [10]. T. Britton, F. Ball, and P. Trapman, Science 369, 846 (2020).
11. [11]. B. F. Nielsen, L. Simonsen, and K. Sneppen, Phys. Rev. Lett. 126, 118301 (2021).
12. [12]. K. Sun, W. Wang, L. Gao, Y. Wang, K. Luo, L. Ren, Z. Zhan, X. Chen, S. Zhao, Y. Huang, Q. Sun, Z. Liu, M. Litvinova, A. Vespignani, M. Ajelli, C. Viboud, and H. Yu, Science 371 (2021), doi:10.1126/science.abe2424.
13. [13]. S. S. Lakdawala and V. D. Menachery, Trends in Microbiology (2021), doi:10.1016/j.tim.2021.05.002.
14. [14]. Q. Yang, T. K. Saldi, P. K. Gonzales, E. Lasda, C. J. Decker, K. L. Tat, M. R. Fink, C. R. Hager, J. C. Davis, C. D. Ozeroff, D. Muhlrad, S. K. Clark, W. T. Fattor, N. R. Meyerson, C. L. Paige, A. R. Gilchrist, A. Barbachano-Guerrero, E. R. Worden-Sapper, S. S. Wu, G. R. Brisson, M. B. McQueen, R. D. Dowell, L. Leinwand, R. Parker, and S. L. Sawyer, Proceedings of the National Academy of Sciences 118 (2021), doi:10.1073/pnas.2104547118.
15. [15]. M.-S. Ho, in Tropical Infectious Diseases: Principles, Pathogens and Practice, edited by R. L. Guerrant, D. H. Walker, and P. F. Weller (W.B. Saunders, Edinburgh, 2011), third ed., pp. 392–397.
16. [16]. S. X. Wang, Y. M. Li, B. C. Sun, S. W. Zhang, W. H. Zhao, M. T. Wei, K. X. Chen, X. L. Zhao, Z. L. Zhang, M. Krahn, A. C. Cheung, and P. P. Wang, Epidemiol. Infect. 134, 786 (2006).
17. [17]. Y. Li, I. T. S. Yu, P. Xu, J. H. W. Lee, T. W. Wong, P. L. Ooi, and A. C. Sleigh, American Journal of Epidemiology 160, 719 (2004).
18. [18]. C.-K. Min, S. Cheon, N.-Y. Ha, K. M. Sohn, Y. Kim, A. Aigerim, H. M. Shin, J.-Y. Choi, K.-S. Inn, J.-H. Kim, J. Y. Moon, M.-S. Choi, N.-H. Cho, and Y.-S. Kim, Scientific Reports 6, 25359 (2016), doi:10.1038/srep25359.
19. [19]. K. H. Kim, T. E. Tandi, J. W. Choi, J. M. Moon, and M. S. Kim, J. Hosp. Infect. 95, 207 (2017).
20. [20]. S. Y. Cho, J.-M. Kang, Y. E. Ha, G. E. Park, J. Y. Lee, J.-H. Ko, J. Y. Lee, J. M. Kim, C.-I. Kang, I. J. Jo, J. G. Ryu, J. R. Choi, S. Kim, H. J. Huh, C.-S. Ki, E.-S. Kang, K. R. Peck, H.-J. Dhong, J.-H. Song, D. R. Chung, and Y.-J. Kim, Lancet 388, 994 (2016).
21. [21]. J. L. Santarpia, D. N. Rivera, V. Herrera, M. J. Morwitzer, H. Creager, G. W. Santarpia, K. K. Crown, D. Brett-Major, E. Schnaubelt, M. J. Broadhurst, J. V. Lawler, S. P. Reid, and J. J. Lowe, Scientific Reports 10, 12372 (2020).
22. [22]. B. Li, A. Deng, K. Li, Y. Hu, Z. Li, Q. Xiong, Z. Liu, Q. Guo, L. Zou, H. Zhang, M. Zhang, F. Ouyang, J. Su, W. Su, J. Xu, H. Lin, J. Sun, J. Peng, H. Jiang, P. Zhou, T. Hu, M. Luo, Y. Zhang, H. Zheng, J. Xiao, T. Liu, R. Che, H. Zeng, Z. Zheng, Y. Huang, J. Yu, L. Yi, J. Wu, J. Chen, H. Zhong, X. Deng, M. Kang, O. G. Pybus, M. Hall, K. A. Lythgoe, Y. Li, J. Yuan, J. He, and J. Lu, medRxiv (2021), doi:10.1101/2021.07.07.21260122.
23. [23]. J. Fiegel, R. Clarke, and D. A. Edwards, Drug Discovery Today 11, 51 (2006).
24. [24]. A. Allard, C. Moore, S. V. Scarpino, B. M. Althouse, and L. Hébert-Dufresne, “The role of directionality, heterogeneity and correlations in epidemic risk and spread,” (2020), arxiv:2005.11283 [physics.soc-ph].
25. [25]. L. Hébert-Dufresne, B. M. Althouse, S. V. Scarpino, and A. Allard, J. R. Soc. Interface 17, 20200393 (2020).
26. [26]. J. Lu, J. Gu, K. Li, C. Xu, W. Su, Z. Lai, D. Zhou, C. Yu, B. Xu, and Z. Yang, Emerg. Infect. Dis. 26, 1628 (2020).
27. [27]. P. Y. Chia, K. K. Coleman, Y. K. Tan, S. W. X. Ong, M. Gum, S. K. Lau, X. F. Lim, A. S. Lim, S. Sutjipto, P. H. Lee, T. T. Son, B. E. Young, D. K. Milton, G. C. Gray, S. Schuster, T. Barkham, P. P. De, S. Vasoo, M. Chan, B. S. P. Ang, B. H. Tan, Y.-S. Leo, O.-T. Ng, M. S. Y. Wong, and K. Marimuthu, Nat. Commun. 11 (2020), doi:10.1038/s41467-020-16670-2.
28. [28]. A. P. Galvani and R. M. May, Nature 438, 293 (2005).
29. [29]. J. O. Lloyd-Smith, S. J. Schreiber, P. E. Kopp, and W. M. Getz, Nature 438, 355 (2005).
30. [30]. A. James, J. W. Pitchford, and M. J. Plank, Roy. Soc. Proc. Biol. Sci. 274, 741 (2007).
31. [31]. R. A. Stein, International Journal of Infectious Diseases 15, e510 (2011).
32. [32]. J. Lepore, If Then (Liveright Books, 2020).
33. [33]. F. Li, Y.-Y. Li, M.-J. Liu, L.-Q. Fang, N. E. Dean, G. W. K. Wong, X.-B. Yang, I. Longini, M. E. Halloran, H.-J. Wang, P.-L. Liu, Y.-H. Pang, Y.-Q. Yan, S. Liu, W. Xia, X.-X. Lu, Q. Liu, Y. Yang, and S.-Q. Xu, The Lancet Infectious Diseases 21, 617 (2021).
34. [34]. S. Hu, W. Wang, Y. Wang, M. Litvinova, K. Luo, L. Ren, Q. Sun, X. Chen, G. Zeng, J. Li, L. Liang, Z. Deng, W. Zheng, M. Li, H. Yang, J. Guo, K. Wang, X. Chen, Z. Liu, H. Yan, H. Shi, Z. Chen, Y. Zhou, K. Sun, A. Vespignani, C. Viboud, L. Gao, M. Ajelli, and H. Yu, Nature Communications 12, 1533 (2021).
35. [35]. M. Monod, A. Blenkinsop, X. Xi, D. Hebert, S. Bershan, S. Tietze, M. Baguelin, V. C. Bradley, Y. Chen, H. Coupland, S. Filippi, J. Ish-Horowicz, M. McManus, T. Mellan, A. Gandy, M. Hutchinson, H. J. T. Unwin, S. L. van Elsland, M. A. C. Vollmer, S. Weber, H. Zhu, A. Bezancon, N. M. Ferguson, S. Mishra, S. Flaxman, S. Bhatt, and O. Ratmann, Science 371, eabe8372 (2021).
36. [36]. M. G. M. Gomes, R. M. Corder, J. G. King, K. E. Langwig, C. Souto-Maior, J. Carneiro, G. Goncalves, C. Penha-Goncalves, M. U. Ferreira, and R. Aguas, medRxiv (2020), doi:10.1101/2020.04.27.20081893.
37. [37]. M. Lachiany and Y. Louzoun, Phys. Rev. E 94, 022409 (2016).
38. [38]. J. Neipel, J. Bauermann, S. Bo, T. Harmon, and F. Jülicher, PLOS One 15, e0239678 (2020).
39. [39]. W. Looi, E. Pierson, B. Redbird, B. Villanueva, N. Fishman, Y. Chen, J. Sholar, J. Leskovec, and D. Grusky, Working paper (2020), doi:10.1101/2020.06.15.20131979, medRxiv.
40. [40]. A. V. Tkachenko, S. Maslov, A. Elbanna, G. N. Wong, Z. J. Weiner, and N. Goldenfeld, Proceedings of the National Academy of Sciences 118 (2021), doi:10.1073/pnas.2015972118.
41. [41]. In Appendix B, we set $\gamma_i = \gamma$, but our claim that the *N*-component SIR model is a special case of our model does not rely on this assumption.
42. [42]. Y. M. Bar-On, A. Flamholz, R. Phillips, and R. Milo, eLife 9, e57309 (2020).
43. [43]. G. Huber, M. Kamb, K. Kawagoe, L. M. Li, B. Veytsman, D. Yllanes, and D. Zigmond, Physical Biology 17, 065010 (2020).
44. [44]. G. Huber, M. Kamb, K. Kawagoe, L. M. Li, A. McGeever, J. Miller, B. A. Veytsman, and D. Zigmond, Physical Biology 18, 045002 (2021).
45. [45]. S. Gao, J. Rao, Y. Kang, Y. Liang, and J. Kruse, SIGSPATIAL Special 12, 16 (2020).
46. [46]. Chu, G. Huber, A. McGeever, B. Veytsman, and D. Yllanes, medRxiv (2020), doi:10.1101/2020.11.04.20226308.
47. [47]. F. W. Olver, D. W. Lozier, R. F. Boisvert, and C. W. Clark, NIST Handbook of Mathematical Functions, 1st ed. (Cambridge University Press, USA, 2010).
48. [48]. L. R. Nandayapa, Risk Probabilities: Asymptotics and Simulation, Ph.D. thesis, University of Aarhus (2008).
49. [49]. S. Asmussen, J. L. Jensen, and L. Rojas-Nandayapa, Methodology and Computing in Applied Probability 18, 441 (2016).
50. [50]. R. F. Squire (2019), available at [https://safegraph.com/blog/what-about-bias-in-the-safegraph-dataset](https://safegraph.com/blog/what-about-bias-in-the-safegraph-dataset).
51. [51]. S. G. Benzell, A. Collis, and C. Nicolaides, Proceedings of the National Academy of Sciences 117, 14642 (2020).
52. [52]. W. Yang, S. Kandula, M. Huynh, S. K. Greene, G. Van Wye, W. Li, H. T. Chan, E. McGibbon, A. Yeung, D. Olson, A. Fine, and J. Shaman, The Lancet Infectious Diseases 21 (2020), doi:10.1016/S1473-3099(20)30769-6.