Abstract
A population can be immune to epidemics even if not all of its individual members are immune to the disease, just as long as sufficiently many are immune—this is the traditional notion of herd immunity. In the smartphone era a population can be immune to epidemics even if not a single one of its members is immune to the disease—a notion we propose to call “digital herd immunity”, which is similarly an emergent characteristic of the population. This immunity arises because contact-tracing protocols based on smartphone capabilities can lead to highly efficient quarantining of infected population members and thus the extinguishing of nascent epidemics. When the disease characteristics are favorable and smartphone usage is high enough, the population is in this immune phase. As usage decreases there is a novel “contact tracing” phase transition to an epidemic phase. We present and study a simple branching-process model for COVID-19 and show that digital immunity is possible regardless of the proportion of non-symptomatic transmission. We believe this is a promising strategy for dealing with COVID-19 in many countries such as India, whose challenges of scale motivated us to undertake this study.
I. INTRODUCTION
Recent events have challenged the public health infrastructure worldwide for controlling the spread of contagious diseases. This difficulty is partly due to the novel pathogen involved and partly due to some unusual characteristics of COVID-19 [1]. Specifically, the infection appears to be transmitted through a large number of asymptomatic and pre-symptomatic cases [1, 2].
This leaves two approaches to controlling an exponential growth in the number of infected people. The first is continuous monitoring of entire populations via regular testing, which can identify new infections already during their latent phase and thus end non-symptomatic transmission. The second is the established method of “contact tracing” [3], in which people who have been exposed to newly identified infected people are isolated before they have a chance to infect others. In principle, either approach is capable of ending the epidemic.
Population-level testing is not possible as we write this. Traditional contact tracing, done by teams of health officials relying on interviews with newly identified cases, is also not up to the task today, as it fails if non-symptomatic transmission is too frequent [3, 4]. Fortunately, we live in the smartphone era, and it has been noted that using these devices to record contacts can make the task of tracing them entirely solvable by automating it. This idea has been spelled out in a series of papers [5, 6] (see also [7–9]) and is the basis for a rapidly expanding set of contact-tracing apps and the recent announcement by the Apple-Google duopoly of their intention to build the technology into their operating systems [10].
In this paper we contribute to this emerging field of infectious disease control along two axes.
First, we present a simple model of the early stages of the spread of COVID-19, which allows us to obtain estimates, as a function of a varying amount of non-symptomatic transmission, of the fraction of the population that needs to participate in a digital contact-sharing network in order to prevent new epidemics. Our interest in this question was seeded by the very practical question of whether a country like India can use this technology to achieve epidemic control today. Indeed, India is now launched on this enterprise [11]. While modeling with various differences from our own became available while we were working on this problem [5, 12], we feel that our approach has the virtue of making the existence and values of the estimated compliance thresholds transparent. Our estimates for the fraction of the population that needs to own a contact-tracing app to avert a COVID-19 epidemic range from 75% – 95% for R0 = 3, depending on the fraction of asymptomatic transmission, θ = 20% – 50%, that takes place. For smaller R0 produced by social distancing this fraction is lower.
Our second contribution is to frame the overall discussion in a language more familiar to physicists and students of complex phenomena more generally—that of phases, phase transitions and emergent properties. The bottom line here is the idea that the immunity of a population to epidemic growth is an emergent, or collective, property of the population. For traditional vaccination or epidemic-induced “herd immunity”, this feature gets conflated with the fact that individuals can be immune to the disease at issue. But mass digital contact tracing now makes it possible for the population to be immune to epidemic growth even as no individual has immunity to the underlying disease. We propose to refer to this as the existence of a “digital herd immunity”. This fits well into the general idea of an emergent property, which does not exist at the level of the microscopic constituents but exists for the collective [13, 14]. We would be remiss if we did not note that epidemiologists have previously referred to this state of affairs as “herd protection” and “sustained epidemic control” [5]. Our intention with the proposed terminology is both to frame a public health goal by including the word digital and to emphasize the emergent nature of a herd immunity.
Our application of ideas from statistical physics to the theory of contact tracing yields three main dividends compared to earlier approaches. First, we are able to analytically quantify the effectiveness of contact tracing for any recursive depth n, from the limit of traditional, manual contact tracing, for which n = 1, to perfect digital tracing, for which n = ∞ in principle. Describing this crossover, which was beyond existing theoretical techniques [3, 15, 16], allows us to sharply formulate an important, general principle of disease control: for any proportion of non-symptomatic transmission, tracing and isolation based on a sufficiently widespread contact-tracing network, of sufficient recursive depth, can prevent epidemic spread. The second insight gained through our approach is a point of principle that has been neglected in the public discourse surrounding this issue, namely that with effective contact-tracing protocols in place, nascent epidemics can be extinguished with probability one. Third, our study of the universal properties of the contact-tracing phase transition allows us to capture the full probability distribution for epidemic sizes as the critical threshold for epidemic control is approached. Our results imply that the contact-tracing phase boundary is not even visible at the resolution of previous studies for COVID-19 [12], that attempted to estimate the critical threshold for contact tracing by numerically simulating the epidemic-size distribution function.
In the balance of this paper we do the following. We begin by summarizing the case for contact tracing as an effective strategy for combating COVID-19. We then introduce a simple branching-process model for the spread of this disease that incorporates the key features of asymptomatic transmission, pre-symptomatic transmission and recursive contact tracing. We find that there is always a critical fraction 0 ≤ ϕc < 1 of app ownership, such that take-up of contact-tracing apps by a fraction ϕ > ϕc of the population is sufficient to prevent epidemic spread. We provide an analytical formula for this threshold, which is verified against detailed numerical simulations, and characterize the universal features of the resulting “contact-tracing phase transition”. Finally, we address the applicability of our results to the specific context of India.
II. A MODEL FOR APP-BASED CONTACT TRACING
A. Motivation and relevance to COVID-19
Traditional contact tracing is a multi-stage process. First, one identifies symptomatic, infected individuals. Next, one finds the people they came into close contact with during their infectious period. Finally, one treats or isolates these people before they can go on to infect others. Manual contact tracing becomes difficult for infections that have a period before the onset of symptoms when an exposed person is contagious (the Ω period). Further delay in finding the symptomatic person and their contacts could lead to tertiary infections, making it difficult to control an outbreak. For COVID-19, the incubation period is thought to be around 5-6 days, while the Ω period is estimated to be 1-3 days [2]. The time before becoming contagious, or the latent period L, is around 4 days. Stochasticity of these times aside, it is reasonable to expect that if Ω < L on average, and if the exposed contacts of an individual can be traced before they become infectious, then an epidemic could be prevented. However, the delays typical for manual contact tracing, even just one or two days, can render contact tracing completely ineffective for COVID-19, given the typical L and Ω periods; this conclusion is supported by detailed numerical simulations [5, 12].
This is where digital contact tracing comes in. A smartphone application could enable instant isolation of an infected person and their network of contacts. This halts the transmission chain, because infected contacts cannot infect others during their latent period. The question immediately arises of how widespread such tracing needs to be in order to prevent an epidemic, and this question is the focus of our paper. Below we present a simple model that captures the essential features of disease spread necessary to tackle this problem.
To place our work in context, the classic quantitative analyses of the efficacy of contact tracing [3, 4], from before the smartphone era, showed that traditional, manual contact-tracing protocols become useless when the rate of non-symptomatic spreading, θ, is too high. By contrast, app-based approaches allow for “recursive” contact tracing, whereby contacts of contacts can be traced to an arbitrary recursive depth, at no additional cost. The effectiveness of recursive contact tracing has been studied in previous work; mathematically rigorous results exist in simple limits [16, 17] and detailed numerical simulations have been performed in analytically inaccessible regimes [15]. Some recent works have provided quantitative estimates for the effectiveness of non-recursive contact-tracing in the specific context of COVID-19 [5, 6, 12]. Our results should be viewed as complementary to these studies. One advantage of the model that we propose is its simplicity; this allows for a more thorough analytical understanding of the contact-tracing phase transition than in previous works, for any recursive depth, 0 ≤ n ≤ ∞.
B. A branching-process model
Suppose we have an epidemic spreading through an infinite population of susceptibles, in discrete time, and infecting a number R of the population at each time step (here, R is an “effective reproduction number” that depends on the detailed properties of the epidemic spread, including the basic reproduction number R0). This is a generic model for a spreading epidemic at short times. The total number of infections Itot scales as
$$I_{\mathrm{tot}} \sim R^{t}, \tag{1}$$
and if R < 1, the epidemic has been controlled.
We want to understand which R best captures the effect of mobile-phone-based contact tracing. From a statistical physics perspective, R is the single relevant parameter controlling the epidemic spread, and drives a phase transition from an “epidemic phase” to an “immune phase” as R decreases below R = 1, which we shall elaborate on below. For the purposes of modelling epidemic spread, the key question is which “microscopic” degrees of freedom must be included to obtain a realistic estimate for R.
To this end, we consider three parameters that implicitly determine R: the fraction of the population that will present asymptomatic cases (θ), the fraction of the population using a contact-tracing application (ϕ), and the basic reproduction number for an individual who eventually shows symptoms (RS). RS is a combined measure of the number of pre-symptomatic infections and the efficacy of quarantine: in the limit of perfect isolation after showing symptoms, RS is precisely the number of people that a symptomatic individual infects during their Ω period, as defined above. We assume that RS is independent of whether a symptomatic individual is on the contact-tracing network or not.
The effects of these parameters on the growth of the epidemic (or R) are studied using a simple branching-process model, where all infectious individuals are either symptomatic (S) or asymptomatic (A), and either on the app-based contact tracing network (C) or not (N). In an uncontrolled setting, all types of infectious cases are assumed to proliferate with R0 = 3, which is a reasonable estimate [18] for COVID-19 [19]. Suppose that the outbreak starts from a single infected individual at time t = 0 (“Patient Zero”). In our discrete time (generational) model, Patient Zero infects R0 other people at time t = 1, and each new infection is assigned to one of the categories {CA, CS, NA, NS} randomly, with probabilities that are determined by the values of θ and ϕ. Individuals infected at the beginning of each generation are assumed not to infect anyone else after that generation has elapsed. Whenever a symptomatic individual on the contact network (CS) is encountered during this branching process, the contact network is triggered, and all people connected to the CS individual by the network, through either past or present infections, are placed in quarantine. As discussed earlier, since pre-symptomatic infections are common for COVID-19, our model includes the possibility that a CS individual infects RS people by the time they trigger the contact network. As a consequence, non-CS individuals in the same generation are also allowed to infect the next generation before the activation of the contact network (see Fig. 1c). A few timesteps of the model are illustrated explicitly in Fig. 1, together with the implementation of recursive contact tracing via removing connected components of the contact graph. Different combinations of the parameters θ, ϕ and RS lead to an effective reproduction number R distinct from the bare reproduction number R0, and we expect epidemic growth to be suppressed whenever R < 1.
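The random assignment of new infections to the four categories can be sketched as follows. This is a minimal illustration, not the simulation code behind Fig. 2: it assumes that app ownership and asymptomatic status are statistically independent, so that the category probabilities are simple products of ϕ and θ, and the function names `category_probabilities` and `sample_offspring` are hypothetical. Contact tracing itself is not modelled here.

```python
import random

def category_probabilities(phi, theta):
    """Probabilities of the four infection types.

    Assumption (not stated explicitly in the text): being on the app
    network (probability phi) is independent of being asymptomatic
    (probability theta), so the probabilities factorize.
    """
    return {
        "CA": phi * theta,
        "CS": phi * (1.0 - theta),
        "NA": (1.0 - phi) * theta,
        "NS": (1.0 - phi) * (1.0 - theta),
    }

def sample_offspring(n_children, phi, theta, rng):
    """Draw the categories of n_children new infections independently."""
    probs = category_probabilities(phi, theta)
    labels = list(probs)
    weights = [probs[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n_children)

rng = random.Random(0)
probs = category_probabilities(phi=0.8, theta=0.3)
# Patient Zero infects R0 = 3 people in an uncontrolled setting.
children = sample_offspring(3, phi=0.8, theta=0.3, rng=rng)
```

In a full simulation, each sampled CS individual would additionally trigger quarantine of everyone connected to it on the contact network.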
Numerical simulations of this model were performed on 10,000 nodes with 100 initial infections, without replacement; the results are summarized in Fig. 2. The locations of the phase boundaries were verified to be independent of both doubling the system size and doubling the number of samples averaged per point shown on the phase diagram, to within the resolution of the phase diagram. Our numerics are consistent with the hypothesis that for any given fraction of asymptomatic transmission 0 ≤ θ < 1, and any presymptomatic reproduction number 0 ≤ RS < R0, there is a critical point ϕc(θ), corresponding to the onset of “digital herd immunity”: epidemic control occurs for a fraction of app owners ϕc(θ) < ϕ ≤ 1. For realistic COVID-19 parameter values, R0 = 3, RS = 1 and θ = 0.2 − 0.5 [5, 18], we find that ϕc(θ) = 75% − 95%, illustrating that when both presymptomatic and asymptomatic transmission are taken into account, the rate of app coverage necessary to prevent an epidemic can be rather high. Some practical implications of this point are raised in the final discussion.
C. The contact-tracing phase transition
We now describe the sense in which our branching-process model exhibits a phase transition. Consider the disease dynamics seeded by a single initial infection, Patient Zero, at t = 0. As the disease spreads, there are two possibilities: either the epidemic seeded by Patient Zero terminates at some finite time, or it continues to spread indefinitely. In branching process theory [20], this dichotomy is captured by the “probability of ultimate extinction”, q, which is the probability that the epidemic seeded by Patient Zero terminates at some t < ∞.
To make the connection with statistical physics, consider the quantity ρ = 1 − q, which is the probability that the epidemic seeded by Patient Zero spreads for all time. This defines an order parameter for the epidemic-to-immune phase transition, in the following sense. If ρ > 0, an epidemic can spread with non-zero probability, and the population is in an “epidemic phase”. If ρ = 0, epidemics are almost surely contained, and the population is in an “immune phase”. In fact, ρ is precisely the order parameter for a site percolation phase transition [21] on the infinite, rooted, Cayley tree with bulk co-ordination number z = 1 + R0.
We argued above that the epidemic-to-immune phase transition is driven by a single relevant parameter, the effective reproduction number, R. Let us now make this statement precise. For simple epidemic models, the underlying branching process is Markovian, and we can define R to be the mean number of new infections generated by an infected node. It is then a rigorous result that ρ = 0 for R < 1 and ρ > 0 otherwise [20]. For such models, the connection between epidemic spread and percolation transitions has been known for some time [22-24]. By contrast, for the model studied in this paper, the possibility of tracing successive contacts means that the disease dynamics is no longer Markovian; there are correlations between generations that preclude a simple definition of R, and earlier theoretical results do not apply.
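The Markovian statement above can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the model of this paper: it takes a branching process in which every infected node has R0 potential contacts, each infected independently with probability p (site percolation on a rooted tree), so that R = p·R0. The extinction probability q is then the smallest fixed point of the offspring generating function, and ρ = 1 − q. The function names are hypothetical.

```python
def extinction_probability(r0, p, tol=1e-12, max_iter=100_000):
    """Smallest fixed point of q = (1 - p + p*q)**r0, found by iterating from q = 0.

    (1 - p + p*s)**r0 is the generating function of a Binomial(r0, p)
    offspring distribution, an assumption made for this illustration.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 - p + p * q) ** r0
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

def order_parameter(r0, p):
    """rho = 1 - q: probability that the outbreak never terminates."""
    return 1.0 - extinction_probability(r0, p)

rho_subcritical = order_parameter(3, 0.2)    # R = 0.6 < 1
rho_supercritical = order_parameter(3, 0.5)  # R = 1.5 > 1
```

For R0 = 3 and p = 0.5 the fixed-point equation factorizes and gives ρ = 3 − √5 exactly, which the iteration reproduces. Convergence becomes slow close to R = 1, so the iteration should not be trusted exactly at the critical point.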
The main technical innovation in our work is surmounting this breakdown of the Markov property: we develop generating function methods that allow for exact summation of non-Markovian contact-tracing processes, to any desired order. We show that despite intergenerational correlations, the critical behaviour is determined by a function Rn(ϕ, θ), which can be viewed as a “mean number of new infections”, suitably averaged over time. In particular, Rn(ϕ, θ) controls the ultimate fate of the epidemic, and the critical line for n-step contact tracing is given by the implicit equation
$$R_n(\phi, \theta) = 1 \tag{2}$$
in ϕ and θ; details of the calculation and a full expression for Rn(ϕ, θ) are presented in the Supplementary Material. Taking the limit as n → ∞ yields the exact critical line for contact tracing to arbitrary recursive depth; however, for any n ≥ 1, this does not seem to be expressible in closed form, except at its endpoints. Fortunately, the function Rn is found to converge rapidly in its arguments with increasing n. Fig. 2 depicts results from Eq. (2) with ten-step contact tracing, n = 10, and shows excellent agreement with stochastic numerical simulations.
We now discuss the universal properties of this phase transition. For concreteness, let us parameterize the critical line as (ϕc(θ), θ). For fixed θ in the domain of ϕc, a transition from an epidemic to an immune phase occurs as ϕ → ϕc(θ)−. We call this transition the “contact-tracing phase transition”, because it is controlled by the population fraction on the contact-tracing network. To capture the universal properties of this transition, it is helpful to consider the random variable |C|, which is the size of the infected cluster seeded by Patient Zero in a single realization of the branching process. On the immune side of the transition, |C| is almost surely finite, and the risk of epidemics is captured by the mean cluster size, ⟨|C|⟩. On the epidemic side of the transition, the mean cluster size diverges, and the order parameter ρ better quantifies the risk of epidemic spread. In the vicinity of the critical point ϕ = ϕc(θ), both of these quantities are characterized by universal critical exponents,
$$\langle |C| \rangle \sim (\phi - \phi_c(\theta))^{-\gamma}, \quad \phi \to \phi_c(\theta)^{+}, \tag{3}$$
$$\rho \sim (\phi_c(\theta) - \phi)^{\beta}, \quad \phi \to \phi_c(\theta)^{-}. \tag{4}$$
In the Supplementary Material, we show that γ = β = 1 for any finite n, demonstrating that the contact-tracing phase transition lies in the universality class of mean-field site percolation [25]. The scaling theory of percolation transitions then implies a universal scaling form for the distribution of epidemic sizes,
$$\mathbb{P}(|C| = N) \sim N^{-3/2} f(N/N_\xi), \tag{5}$$
where f is a scaling function with exponentially decaying tails, and the cluster correlation length $N_\xi \sim (\phi - \phi_c(\theta))^{-2}$ as ϕ → ϕc(θ).
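As a consistency check, the exponent γ = 1 can be recovered from the scaling theory. The short derivation below assumes the standard mean-field form, in which the probability that Patient Zero seeds a cluster of size N decays as N^{-3/2} up to a cutoff at N_ξ, and approximates the sum over cluster sizes by an integral.

```latex
\begin{align}
\langle |C| \rangle
  &= \sum_{N \geq 1} N \, \mathbb{P}(|C| = N)
   \sim \int_{1}^{\infty} dN \, N^{-1/2} f(N/N_\xi) \\
  &= N_\xi^{1/2} \int_{1/N_\xi}^{\infty} du \, u^{-1/2} f(u)
   \sim N_\xi^{1/2}
   \sim |\phi - \phi_c(\theta)|^{-1} .
\end{align}
```

The integral over u converges at both ends, because u^{-1/2} is integrable at the origin and f decays exponentially, so the divergence of the mean cluster size is controlled entirely by N_ξ, giving γ = 1.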
As the recursive tracing depth n → ∞, we find that such mean-field behaviour breaks down at the endpoint of the critical line (ϕ, θ) = (1, 1), which exhibits a discontinuous phase transition, reflecting the non-locality of the underlying branching process; see Fig. 3. The emergence of a discontinuous percolation transition on the Bethe lattice is highly unusual, and suggests that the non-local character of the n = ∞ contact-tracing transition fundamentally distinguishes it from the percolation-type phase transitions that have arisen in related settings [22–24, 26, 27].
Although Eq. (2) is exact, the full expression for Rn is a little cumbersome to rapidly adapt and use.
We therefore present a simpler, approximate formula for R = R∞(ϕ, θ), based on linear interpolation and a “mean-field” assumption, whereby inter-generational correlations are neglected. The resulting approximation to R, derived in Appendix D, is given by
In Fig. 2, the critical threshold R = 1 predicted by Eq. (6) is compared to the exact result, Eq. (2), for 10-step contact tracing, and shows reasonably good quantitative agreement.
III. DISCUSSION
We have introduced a simple branching-process model for early-stage epidemic spread, which both retains a degree of analytical and numerical tractability and is sufficiently expressive to model complicated features of COVID-19 spreading and control, for example pre-symptomatic transmission, as distinct from asymptomatic transmission, and recursive contact tracing. Using this model, we have obtained predictions for the app take-up fraction needed to provide digital herd immunity as a function of R0 and the asymptomatic transmission frequency θ.
We now consider the practical applicability of our results to India, whose particular challenges provided the initial stimulus for this work. India’s overall smartphone coverage is around 40% [28]. However, this percentage is much higher in the major cities, which are the primary challenge, and since the start of the epidemic R0 has not gone much higher than 2. Further, almost 90% [28] of Indians use some sort of wireless phone, which can be included on a digital contact-tracing network to various degrees using cell-tower-based triangulation and SMS broadcast. So when compared with the challenges of manual contact tracing, it is clear that the digital route is much more scalable. Indeed, India has had considerable success with this strategy already, with over 100 million downloads and 100,000 contacts traced.
We end with some comments towards the future. In terms of statistical mechanics, it would be useful to generalize our computations to take real-world heterogeneity of R0 into account [29] and to examine the course of epidemics on realistic graphs when ϕ < ϕc. In terms of infectious disease control one should start to think of the herd immunity of a population as deriving from a combination of natural immunity, vaccination and digital immunity. For example, it seems realistic to aim at greatly reducing the annual incidence of influenza worldwide by adding digital control to the existing toolkit.
Data Availability
All data shown in the paper are available to readers.
ACKNOWLEDGMENTS
We would like to thank the Principal Scientific Adviser to the Government of India, Professor K. VijayRaghavan, for interesting us in this question and for discussions of India’s Aarogya Setu contact tracing app, Professor Bryan Grenfell for sharing his wisdom regarding epidemiology at an extremely hectic time and Dr. Shoibal Chakravarty for continuing discussions on all aspects of India’s COVID-19 challenges.
Appendix A: Critical behaviour along the line θ = 0
In this appendix, we derive the critical behaviour along the line θ = 0. Since all infections on this line are symptomatic, we denote RS = R. As we demonstrate below, the exact critical point can be obtained in closed form, and is found to be
$$\phi_c = \frac{R - 1 + \sqrt{(R-1)(R+3)}}{2R}, \tag{A1}$$
which matches our numerical phase diagram to within the resolution of the plot; see Fig. 2 for an example with R = 2.
Let us now summarize some basic facts about the critical behaviour of percolation-type transitions. Recall [25] that standard site percolation on the Bethe lattice exhibits five metric-independent critical exponents, which we may denote {α, β, γ, δ, Δ}. The three scaling relations imply that only two of these are independent. There are three more metric-dependent critical exponents, {ν, ρ, η}, which can be expressed in terms of the previous exponents using the hyperscaling relations in six dimensions.
We will focus on the two independent exponents γ and β, since these are the easiest to calculate. They control the critical behaviour of the mean cluster size ⟨|C|⟩ and the percolation probability ℙp(|C| = ∞) (probability of formation of an infinite cluster) respectively. In terms of the site occupation probability p and its critical value pc, they are defined by
$$\langle |C| \rangle \sim (p_c - p)^{-\gamma}, \quad p \to p_c^{-}, \qquad \mathbb{P}_p(|C| = \infty) \sim (p - p_c)^{\beta}, \quad p \to p_c^{+}.$$
Site percolation on the Bethe lattice with z ≥ 3 lies in the universality class of mean-field percolation, and exhibits critical exponents γ = β = 1 [25].
1. Mean cluster size
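To compute the mean cluster size, we follow a method introduced by Fisher and Essam for counting clusters of a given size in the Bethe lattice [31]. We define a probability generating function for the cluster size
$$B(\phi; x) = \sum_{s=1}^{\infty} \mathbb{P}_\phi(|C| = s \,|\, \text{initial node infected})\, x^{s},$$
where |C| denotes the cluster size and ℙϕ(|C| = s | initial node infected) denotes the probability of obtaining a cluster of size s from an initial infected node. Since susceptible individuals are on the network (C) with probability ϕ and off the network (N) with probability (1 − ϕ), B(ϕ; x) can be expressed as
$$B(\phi; x) = \phi\, B_C(\phi; x) + (1 - \phi)\, B_N(\phi; x), \tag{A4}$$
where BC and BN denote the generating functions conditioned on the initial node being of type C or N respectively.
Using Eq. (A4), the expression for the mean cluster size reads
$$\langle |C| \rangle = \partial_x B \big|_{x=1} = \phi\, \partial_x B_C \big|_{x=1} + (1 - \phi)\, \partial_x B_N \big|_{x=1}. \tag{A5}$$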
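If a node of type C infects another node of type C, the latter cannot transmit infection further. However, a node of type N can infect other nodes freely. This implies the recurrence relations
$$B_C = x \left[ \phi x + (1 - \phi) B_N \right]^{R}, \tag{A6}$$
$$B_N = x \left[ \phi B_C + (1 - \phi) B_N \right]^{R} \tag{A7}$$
for the probability generating functions. Differentiating at x = 1, and using the normalization constraint BC/N(ϕ, x = 1) = 1, we obtain
$$\partial_x B_C \big|_{x=1} = 1 + R \left[ \phi + (1 - \phi)\, \partial_x B_N \big|_{x=1} \right],$$
$$\partial_x B_N \big|_{x=1} = 1 + R \left[ \phi\, \partial_x B_C \big|_{x=1} + (1 - \phi)\, \partial_x B_N \big|_{x=1} \right].$$
Solving these linear equations, we find that
$$\partial_x B_C \big|_{x=1} = \frac{1 + R\phi - R^2 \phi (1 - \phi)}{1 - R(1-\phi)(1 + R\phi)}, \qquad \partial_x B_N \big|_{x=1} = \frac{1 + R\phi + R^2\phi^2}{1 - R(1-\phi)(1 + R\phi)}. \tag{A10}$$
Thus, using Eqs. (A10) and (A5), we obtain the mean cluster size:
$$\langle |C| \rangle = \frac{1 + R\phi}{1 - R(1-\phi)(1 + R\phi)}.$$
Denoting the roots of the denominator by
$$\phi_\pm = \frac{R - 1 \pm \sqrt{(R-1)(R+3)}}{2R}, \tag{A11}$$
we may write
$$\langle |C| \rangle = \frac{1 + R\phi}{R^2 (\phi - \phi_+)(\phi - \phi_-)}. \tag{A12}$$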
Since 0 ≤ ϕ ≤ 1, it follows that the mean cluster size has a simple pole at ϕ = ϕ+, and assumes a physical, positive value only when ϕ > ϕ+. Thus, the exact critical point lies at ϕc = ϕ+. (The unphysical, negative value in Eq. (A12) in the percolating regime ϕ < ϕc reflects the divergence of the mean cluster size due to the infinite cluster. Meaningful results in the percolating regime can be recovered by conditioning on the event {|C| < ∞} [25], but we will not pursue this here.)
In the vicinity of the critical point ϕc = ϕ+, we obtain
$$\langle |C| \rangle \sim (\phi - \phi_+)^{-1}, \quad \phi \to \phi_+^{+}, \tag{A13}$$
from which the critical exponent γ = 1 can be read off.
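The mean cluster size on the line θ = 0 can be checked by direct Monte Carlo sampling. The sketch below is illustrative only and rests on two assumptions that should be read as such: a traced contact (a C node infected by a C node) is counted in the cluster but transmits no further, and the closed-form mean quoted in `mean_cluster_size_formula` is the one that follows from solving the linear equations above under that counting convention. All function names are hypothetical.

```python
import random
from collections import deque

def simulate_cluster_size(R, phi, rng, max_size=1_000_000):
    """Size of the infected cluster seeded by one random initial node (theta = 0)."""
    size = 1
    # Queue of nodes that can still transmit; each entry records app ownership.
    active = deque([rng.random() < phi])
    while active and size < max_size:
        parent_on_network = active.popleft()
        for _ in range(R):
            child_on_network = rng.random() < phi
            size += 1  # assumption: every infected node counts, traced or not
            if parent_on_network and child_on_network:
                continue  # traced and quarantined before transmitting
            active.append(child_on_network)
    return size

def mean_cluster_size_formula(R, phi):
    """Mean cluster size in the immune phase, under the assumptions above."""
    return (1 + R * phi) / (1 - R * (1 - phi) * (1 + R * phi))

rng = random.Random(1)
R, phi, n_samples = 2, 0.9, 40_000
estimate = sum(simulate_cluster_size(R, phi, rng) for _ in range(n_samples)) / n_samples
exact = mean_cluster_size_formula(R, phi)
```

The comparison is only meaningful for ϕ above the critical value, where clusters are almost surely finite; the `max_size` cap is a safeguard in case the function is called in the epidemic phase.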
2. Percolation probability
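To derive β, we define the probabilities of infinite cluster formation from a source infection that is respectively on or off the contact network:
$$\rho_C = \mathbb{P}_\phi(|C| = \infty \,|\, \text{initial node of type C}), \qquad \rho_N = \mathbb{P}_\phi(|C| = \infty \,|\, \text{initial node of type N}). \tag{A14}$$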
We now derive recurrence relations to compute ρC and ρN. Since a C node can give rise to an infinite cluster only through infecting an N node, we can obtain an expression for ρC in terms of ρN as follows. Note that (1 − (1 − ϕ)ρN) is the probability that a given infection caused by the C node does not lead to an infinite cluster, and hence (1 − (1 − ϕ)ρN)^R is the probability that none of the R infected nodes lead to infinite clusters. Thus, the probability that at least one of the nodes infected by an initial C node leads to an infinite cluster is given by
$$\rho_C = 1 - \left( 1 - (1 - \phi) \rho_N \right)^{R}. \tag{A15}$$
Similarly, noting that an N node can lead to an infinite cluster via either C or N nodes, the probability that at least one of the infected nodes leads to an infinite cluster is given by
$$\rho_N = 1 - \left( 1 - \phi \rho_C - (1 - \phi) \rho_N \right)^{R}. \tag{A16}$$
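Eqs. (A15) and (A16) reduce to a single equation for ρN:
$$\rho_N = 1 - \left[ 1 - \phi + \phi \left( 1 - (1 - \phi)\rho_N \right)^{R} - (1 - \phi)\rho_N \right]^{R}. \tag{A17}$$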
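Expanding to second order in ρN yields
$$\rho_N = R(1-\phi)(1 + R\phi)\, \rho_N - \frac{R(R-1)}{2} (1-\phi)^2 \left( 1 + 3R\phi + R^2\phi^2 \right) \rho_N^2 + O(\rho_N^3). \tag{A18}$$
In terms of the roots defined in Eq. (A11), we have
$$\rho_N \left[ R^2 (\phi - \phi_+)(\phi - \phi_-) + \frac{R(R-1)}{2} (1-\phi)^2 \left( 1 + 3R\phi + R^2\phi^2 \right) \rho_N \right] = O(\rho_N^3). \tag{A19}$$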
Note that in Eq. (A19), since ρN ≥ 0, the only physical solution for ϕ ≥ ϕ+ is ρN = 0, whereas ρN > 0 if ϕ < ϕ+. Hence we recover the result ϕc = ϕ+. It further follows that ρN vanishes linearly at the critical point, $\rho_N \sim (\phi_+ - \phi)$ as $\phi \to \phi_+^{-}$, which suggests a critical exponent β = 1. To confirm this exponent, we define the probability of formation of an infinite cluster from a single infected site, ρ = ϕρC + (1 − ϕ)ρN. Since ρN is small in the vicinity of ϕc, we can linearize Eq. (A15), to obtain $\rho_C \approx R(1-\phi)\rho_N$ and hence $\rho \approx (1-\phi)(1+R\phi)\rho_N \sim (\phi_c - \phi)$, from which the critical exponent β = 1 is immediate.
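The percolation-probability argument can be explored numerically. The sketch below is an illustration under the assumptions of this appendix (θ = 0, every node infects R others) and iterates the two recurrences described above: a C node seeds an infinite cluster only through N nodes, while an N node can do so through either type. Linearizing them for small ρN gives the growth factor R(1 − ϕ)(1 + Rϕ), and the threshold is located where this factor equals 1. Function names are hypothetical.

```python
import math

def growth_factor(R, phi):
    """Linearized map rho_N -> R(1 - phi)(1 + R*phi) * rho_N near rho_N = 0."""
    return R * (1.0 - phi) * (1.0 + R * phi)

def critical_phi(R, tol=1e-12):
    """Bisection for growth_factor(R, phi) = 1 on [0, 1], assuming R > 1."""
    lo, hi = 0.0, 1.0  # growth_factor(R, 0) = R > 1 and growth_factor(R, 1) = 0 < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if growth_factor(R, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def percolation_probability(R, phi, tol=1e-13, max_iter=100_000):
    """rho = phi*rho_C + (1 - phi)*rho_N, iterating downwards from rho_N = 1."""
    rho_n = 1.0
    for _ in range(max_iter):
        rho_c = 1.0 - (1.0 - (1.0 - phi) * rho_n) ** R
        rho_n_next = 1.0 - (1.0 - phi * rho_c - (1.0 - phi) * rho_n) ** R
        converged = abs(rho_n_next - rho_n) < tol
        rho_n = rho_n_next
        if converged:
            break
    rho_c = 1.0 - (1.0 - (1.0 - phi) * rho_n) ** R
    return phi * rho_c + (1.0 - phi) * rho_n

phi_c = critical_phi(2)
closed_form = (2 - 1 + math.sqrt((2 - 1) * (2 + 3))) / (2 * 2)
rho_immune = percolation_probability(2, 0.9)
rho_epidemic = percolation_probability(2, 0.5)
```

For R = 2 the bisection returns ϕc = (1 + √5)/4 ≈ 0.809, the positive root of the quadratic obtained by setting the growth factor to one. The fixed-point iteration converges slowly close to ϕc, so it is best used away from the threshold.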
Appendix B: Exact critical line for digital herd immunity
Here, we derive the exact critical line for digital herd immunity using two complementary approaches. In App. B1, we obtain recurrence relations for the full probability generating function for the size of infected clusters, which allows us to identify when the mean size of an infected cluster diverges. In App. B2, we study the processes involved in the cluster growth and determine when the percolation probability of the infected cluster approaches zero.
1. Mean cluster size approach
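First, it is useful to define one probability generating function per type of initial node:
$$B_\alpha(\phi, \theta; x) = \sum_{s=1}^{\infty} \mathbb{P}(|C| = s \,|\, \text{initial node of type } \alpha)\, x^{s}, \tag{B1}$$
where α ∈ {CA, CS, NA, NS}. The probability generating function for cluster sizes, given any type of infected initial node, is then
$$B = \sum_{\alpha} p_\alpha B_\alpha, \tag{B2}$$
where pα denotes the probability, determined by ϕ and θ, that an infected node is of type α.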
We shall find the exact critical line for n-step contact tracing by determining when the mean size of an infected cluster, ⟨|C|⟩ = ∂xB | x=1, diverges.
We first obtain exact recurrence relations for the generating functions Bα by enumerating possibilities at a given node. When the initial infected node is off the contact network, i.e. of type NA or NS, we obtain
$$B_{NA} = x B^{R_0}, \qquad B_{NS} = x B^{R_S}, \tag{B3}$$
since nodes off the network can infect any other type of node (cf. Eq. (A7)).
When the initial infected node is of type CS, any infections in the next generation that are on the network will be detected (cf. Eq. (A6)), and we obtain where it is useful to define
The analogous result for BCA is rather more involved. The essential difficulty is that for n-step contact tracing, the recurrence relation for BCA involves n generations beyond the initial node, rather than just one. This is because the possibility arises of multi-generational clusters of CA nodes that escape detection until they infect a CS node at some later generation 1 < m ≤ n. (Put differently, the underlying branching process is not Markovian.) The simplest way to proceed is to study all the configurations of clusters originating from an initial infected node of type CA, and organize the sum according to the generation in which the CA cluster connected to the initial CA node is detected, schematically
We first define a function and its composition to recursively “propagate” the generating function from one generation to the next, after the addition of a CA node:
The generating function for the processes without detection in generations j ≤ n reads where g(1) is the generating function for all processes that do not lead to a CS node in one generation of disease spread. Similarly the generating function for all processes that end in detection in generation j, reads where g(2) (resp. g(3)) are generating functions for all processes that lead to (resp. do not lead to) creation of a CS node in generation j, and the subtraction ensures that only processes that give rise to at least one CS node in generation j are included. Using Eqs. (B6), (B8), and (B9), we find that
It can be verified that BCA is correctly normalized, i.e. that BCA | x=1 = 1. To compute the mean cluster size, we first compute derivatives of the generating functions Bα. Noting that Bα | x=1 = 1 and using Eqs. (A7) and (B4), we obtain
Eliminating all variables other than ∂xB | x=1, we can write the derivative of BCA in the form and in terms of these coefficients {En, Fn}, the mean cluster size reads where it is useful to define
It is clear that the mean cluster size diverges when
It remains to compute Fn, as defined in Eq. (B12). To this end, let us introduce functions of ϕ and θ, with arguments other than x suppressed. By the chain rule, these satisfy the recurrence relations where we defined . Since we are only concerned with the coefficient of ∂xB | x=1 in ∂xBCA | x=1, let us write , as in Eq. (B12). Upon making this substitution in Eq. (B17), we obtain the recurrence for the terms of interest. Combining the above expressions, we find where the are defined recursively for j > 0 via with and
The exact critical line for n-step contact tracing is thus given by the implicit equation for ϕ and θ.
2. Percolation probability approach
We now outline a procedure to obtain the critical line using the percolation probability of an infected initial node. The probability of formation of an infinite cluster is given by where ρα is the probability of formation of an infinite cluster starting from a node of type α. Since nodes off the network (NS and NA) can infect any type of node, we obtain the recurrence relations
Further, since a CS node can only infect nodes outside the network, we obtain
Similar to the case of the generating function in the previous section, obtaining the recurrence relation for ρCA is more involved. We proceed by enumerating the minimum number of processes such that the expression for the formation of an infinite cluster ρCA can be expressed in terms of the ρα’s. For depth n contact tracing, this yields a term that schematically reads
To determine the critical line, it is sufficient to linearize the recurrence relations for {ρα}, as in the calculation in App. A. We thus obtain where Rn is defined in Eq. (B14). To count the different kinds of processes that enter Eq. (B27), we divide all processes of j generations into three types, which we denote as follows.
Note that processes in and ℇ(j) lead to no detection in generation j and leads to a detection in generation j that activates the contact network. Examples of these processes for j = 2 are shown in Fig. 4. To linear order in ρ, we find that the terms in ρCA in Eq. (B27) can be expressed as where τ runs over all the processes in the sets described in Eq. (B29), α is summed over all types of nodes unless otherwise stated, and we have defined nα(τ), Pτ, and as follows. nα(τ) denotes the number of nodes of type α on the edge of a process τ. Pτ is the probability of having a process τ, i.e. the product of the individual probabilities {pα} of all the nodes in the process including the root CA node, and in Eq. (B30) we divide by a factor of pCA in order to determine the probabilities of the processes given that the root is of type CA. is the probability of formation of an infinite cluster due to a CA node that has been detected due to a CS node in the same generation, which reads where we have used Eq. (B28).
To evaluate the terms in Eq. (B30), we define generating functions corresponding to each of the processes in Eq. (B29) as
These generating functions can be enumerated recursively as with
In terms of these generating functions, Eq. (B30) reads
Using Eqs. (B24), (B28), (B31), and (B35), we finally obtain an expression for ρ of the form using which we identify the critical line to be
While we are not able to derive a more explicit expression for Rn(ϕ; θ) using this approach, we have verified using Mathematica that it yields the same critical line of Eq. (B23) for several values of n. That is, we find that
3. Critical exponents
Following the derivation of critical exponents in App. A, it is clear that for any finite n, the critical exponents β and γ are controlled by the behaviour of the function Rn(ϕ; θ) near the critical line Rn(ϕ; θ) = 1. Unfortunately, it does not seem possible to obtain Rn(ϕ; θ) in closed form. However, since Rn(ϕ; θ) is analytic for any finite n, the critical behaviour on the transition line is expected to be determined by the leading terms in its Taylor expansion, which are linear in ϕ for given θ and vice versa. Numerically, we find that this is indeed the case; see Fig. 5. The numerical data suggest that β = γ = 1 everywhere except on the line ϕ = 1, whose critical behaviour is discussed below.
Appendix C: Critical behaviour along the line ϕ = 1
Here, we study the critical behaviour along the line ϕ = 1. In the limit of infinite tracing depth, we find that there is a discontinuous phase transition at θc = 1. However, for any finite contact-tracing depth n, the critical point satisfies θc < 1, and the transition shows the same critical exponents as mean-field percolation.
1. Infinite tracing depth
When the tracing depth is infinite, there is a discontinuous transition as θ → 1−, in the sense that the “order parameter”, i.e. the probability ρ(θ) of formation of an infinite infected cluster, jumps discontinuously from 0 to 1 at θc = 1. To see this, note that when ϕ = 1, the evolution of an asymptomatic cluster from an asymptomatic source is described by the following branching process where denotes the total number of new asymptomatic infections in generation n, the indicator functions reflect the fact that a single symptomatic case will terminate the branching process, and
The probability generating function of then satisfies the recurrence relation where f(y) is the probability generating function for the descendants of a single asymptomatic node,
Using these results, it can be shown by induction that
The probability that the branching process is extinct in generation n is then
The extinction probability is thus
It follows that the critical point occurs at θ = θc = 1, and that the probability of formation of an infinite cluster, which is usually regarded as the order parameter for percolation transitions, behaves like
Such discontinuous behaviour of the order parameter indicates that the transition has a first-order character. For example, at θ = 1 all asymptomatic clusters are infinite and the critical exponent δ is not even defined. However, several other critical exponents are defined, a scenario reminiscent of one-dimensional site percolation, which also has a critical probability pc = 1 and a mixture of continuous and discontinuous behaviour as p → pc.
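As a rough numerical illustration of this discontinuity, the following sketch assumes (this is not stated explicitly in the text above) that every asymptomatic node infects exactly R0 others, that each new infection is independently asymptomatic with probability θ, and that a single symptomatic case in any generation terminates the process. Under these assumptions the cluster is still alive after n generations only if all R0 + R0^2 + ... + R0^n descendants are asymptomatic. The function names and the choice R0 = 3 are hypothetical.

```python
def nodes_below_root(r0: int, n: int) -> int:
    """Descendants in a full r0-ary tree of depth n, excluding the root."""
    return sum(r0**j for j in range(1, n + 1))


def survival_probability(theta: float, r0: int, n: int) -> float:
    """Probability that no symptomatic case has appeared up to generation n.

    Assumes a fixed offspring number r0 per asymptomatic node and
    infinite tracing depth at phi = 1 (a modelling assumption of this sketch).
    """
    return theta ** nodes_below_root(r0, n)


if __name__ == "__main__":
    # For theta < 1 the survival probability collapses rapidly with n;
    # only theta = 1 keeps the cluster alive indefinitely.
    for theta in (0.9, 0.99, 1.0):
        row = [survival_probability(theta, 3, n) for n in (1, 2, 4, 6)]
        print(theta, ["%.3g" % p for p in row])
```

Because the exponent grows geometrically in n, the survival probability tends to zero for every θ < 1, which is one way to see why the order parameter can only be nonzero at θ = 1 in this simplified picture.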
2. Finite tracing depth
For any finite tracing depth n, the exact critical point along the line ϕ = 1 is found to lie at with mean-field critical exponents, γ = β = 1. While this can be obtained from the results of App. B, we provide an intuitive explanation here. When ϕ = 1, all nodes are on the contact network, and hence only two types need to be considered: symptomatic (S) and asymptomatic (A), which occur with probabilities (1 − θ) and θ respectively. Since any infinite cluster must consist entirely of A nodes, we can directly obtain the recurrence relation for the probability ρ of formation of an infinite cluster. For contact tracing with recursive depth n, the only event that does not rule out the existence of an infinite cluster is the formation of an n-generation tree in which every node is asymptomatic. This occurs with probability θ^{Nn}, where Nn is the total number of nodes in such a tree excluding the root node.
Given such a tree, infinite clusters can originate from any of the asymptomatic leaves in generation n. This yields the following recurrence relation for ρ:
Expanding Eq. (C11) to second order in ρ, it is straightforward to derive Eq. (C9) and that
Thus, β = 1 for any finite n. A similar argument yields the mean cluster size which has the same critical point as the percolation probability, and yields a critical exponent γ = 1, since
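The tree-counting argument of this subsection can be explored numerically with the sketch below. It rests on assumptions that are not confirmed by the text: each asymptomatic node is taken to infect exactly R0 others, so that the tree has Nn = R0 + ... + R0^n nodes below the root and R0^n leaves, and the recurrence is taken to be ρ = θ^{Nn} [1 − (1 − ρ)^{R0^n}]. This assumed form may differ in detail from Eq. (C11). Linearizing it in ρ gives the critical condition θc^{Nn} R0^n = 1. All function names are hypothetical.

```python
def tree_sizes(r0: int, n: int) -> tuple[int, int]:
    """Return (nodes below the root, leaves) of a full r0-ary tree of depth n."""
    return sum(r0**j for j in range(1, n + 1)), r0**n


def critical_theta(r0: int, n: int) -> float:
    """Critical point of the assumed recurrence: theta^N * r0^n = 1."""
    n_nodes, _ = tree_sizes(r0, n)
    return r0 ** (-n / n_nodes)


def percolation_probability(theta: float, r0: int, n: int,
                            max_iter: int = 100_000, tol: float = 1e-14) -> float:
    """Largest fixed point of rho = theta^N * (1 - (1 - rho)^leaves).

    Iterating downward from rho = 1 converges monotonically to the largest
    solution, which is zero below the critical point.
    """
    n_nodes, leaves = tree_sizes(r0, n)
    prefactor = theta ** n_nodes
    rho = 1.0
    for _ in range(max_iter):
        new_rho = prefactor * (1.0 - (1.0 - rho) ** leaves)
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    return rho
```

For example, calling `percolation_probability(theta, 3, 1)` for values of θ slightly above `critical_theta(3, 1)` should show ρ growing roughly linearly in θ − θc, consistent with β = 1, although convergence of the iteration becomes slow very close to the critical point.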
Appendix D: Mean-field-like estimate for the critical line
In this section, we derive a simple, approximate formula for R = R∞(ϕ; θ), based on linear interpolation and a “mean-field” assumption, whereby inter-generational correlations are neglected. It is first helpful to label the possible types of infected individual by α ∊ {CA, CS, NA, NS}, and note that susceptible individuals of each type occur with the independent probabilities given in Table D1:
Now suppose that there are no correlations between generations. Then the effective reproduction number is simply an average over the possible types of node:
Here, the probabilities pα are given as in Table D1, while RNA = R0, RNS = RS, and RCS = RS(1 − ϕ), corresponding to the average number of live nodes generated by a symptomatic individual on the contact network. However, RCA is essentially undetermined in this approach, since discarding correlations in time also discards contact tracing, to which the effective value of RCA is highly sensitive. We will therefore treat RCA as a variational parameter, to be estimated self-consistently. In the “best” case, asymptomatic transmission within the network is completely suppressed, and RCA = (1 − ϕ)R0. In the “worst” case, asymptomatic transmission within the network is not suppressed at all, and RCA = R0. A simple way to proceed is to solve for the unique linear interpolation between these cases that passes through the known endpoint (ϕ, θ) = (1, 1) of the non-perturbative critical line (see Appendix C). The resulting approximation to R is given by
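The two bounding cases described above can be sketched numerically as follows. The sketch assumes that the type probabilities are products of independent events, pCA = ϕθ, pCS = ϕ(1 − θ), pNA = (1 − ϕ)θ and pNS = (1 − ϕ)(1 − θ); this is an assumption about the contents of Table D1, which is not reproduced here. It evaluates only the "best" and "worst" values of RCA and does not attempt the linear interpolation. The function names and the sample values R0 = 3, RS = 1 are hypothetical.

```python
def type_probabilities(phi: float, theta: float) -> dict[str, float]:
    """Assumed independent probabilities for the four node types."""
    return {
        "CA": phi * theta,
        "CS": phi * (1.0 - theta),
        "NA": (1.0 - phi) * theta,
        "NS": (1.0 - phi) * (1.0 - theta),
    }


def mean_field_r(phi: float, theta: float, r0: float, rs: float, r_ca: float) -> float:
    """Effective reproduction number as an average over node types."""
    p = type_probabilities(phi, theta)
    r = {"CA": r_ca, "CS": rs * (1.0 - phi), "NA": r0, "NS": rs}
    return sum(p[a] * r[a] for a in p)


def mean_field_bounds(phi: float, theta: float, r0: float, rs: float) -> tuple[float, float]:
    """(best, worst) estimates of R, using R_CA = (1 - phi) R0 and R_CA = R0."""
    best = mean_field_r(phi, theta, r0, rs, (1.0 - phi) * r0)
    worst = mean_field_r(phi, theta, r0, rs, r0)
    return best, worst


if __name__ == "__main__":
    for phi in (0.0, 0.5, 0.9, 1.0):
        print(phi, mean_field_bounds(phi, theta=0.5, r0=3.0, rs=1.0))
```

The two bounds coincide at ϕ = 0, where there is no contact network, and separate as ϕ grows, which illustrates how sensitive the estimate is to the effective value of RCA.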