ABSTRACT
As the coronavirus disease 2019 (COVID-19) spread globally, emerging variants such as B.1.1.529 quickly became dominant worldwide. Sustained community transmission favors the proliferation of mutated sub-lineages with pandemic potential, due to cross-national mobility flows, which are responsible for consecutive cases surge worldwide. We show that, in the early stages of an emerging variant, integrating data from national genomic surveillance and global human mobility with large-scale epidemic modeling allows to quantify its pandemic potential, providing quantifiable indicators for pro-active policy interventions. We validate our framework on worldwide spreading variants and gain insights about the pandemic potential of BA.5 and BA.2.75 sub-lineages. Country-level epidemic intelligence is not enough to contrast the pandemic of respiratory pathogens such as SARS-CoV-2 and a scalable integrated approach, i.e. pandemic intelligence, is required to enhance global preparedness.
Introduction
The coronavirus disease (COVID-19) outbreak, caused by the SARS-CoV-2 virus and first detected in China in early 2020, likely originated from the Huanan seafood wholesale market in Wuhan [1] and continues to spread worldwide. It has forced national governments to pursue country-level elimination strategies [2, 3, 4] or mitigation policies relying on both non-pharmaceutical interventions (NPI) – e.g., physical distancing, wearing masks, hand hygiene, limit large gathering of people, curfews and, in the worst cases, lockdowns [5] – and pharmaceutical ones, such as massive vaccination campaigns and antiviral therapies [6, 7, 8]. Early strict interventions have been shown to be more effective than longer moderate ones in containing national outbreaks in curbing epidemic growth [9], for similar intermediate distress and infringement on individual freedom [10].
In contrast to policy during the early stages of the pandemic, when pharmaceutical interventions were not yet available, most current national efforts to control the virus rely on reactive strategies which alternate enhancement and lifting of NPIs, with the ultimate goal of prevention, or reduction, of pressure on national health systems. To achieve successful containment, such reactive strategies require high capacity for testing and sequencing to continuously monitor the potential emergence of novel viral strains of SARS-CoV-2, whose mutations might be responsible for more severe and/or more transmissible variants with pandemic potential. We define pandemic potential as the ability of a variant to escape population immunity acquired by vaccination or previous infections and to quickly spread worldwide.
Although the emergence of within-host variants with immune escape is likely to be relatively rare [11], sustained community transmission might favor it. When a new variant emerges, it is crucial for policy and decision-making to characterize novel mutations [12, 13, 14], estimate the growth advantage of the new variant with respect to the existing ones [15] and quantify the effectiveness of currently available vaccines [16, 17]. Consequently, any delay in identifying an emerging variant and in determining its key epidemiological parameters introduces uncertainties in the timeline of community transmissions and imported cases which limit, if not completely prevent, effective mitigation responses to take place, similarly to the cryptic transmission of the wild type SARS-CoV-2 which led to the first COVID-19 wave [18]. Combined with limited testing capacity, porous travel screening [19] – at national and, overall, cross-national levels, where international travel play a significant role to amplify the pandemic potential [20, 21, 18] – and lifting of national NPI, the same delays might seriously hinder the timely detection of an emerging variant. The COVID-19 pandemic has been characterized by the regular emergence of such variants [22]. Three important questions arise during the early stages of such a variant, at which point data is missing and noisy: i) can we reconstruct its geographical origin? ii) can we estimate how long it has been spreading undetected in that location? iii) can we quantify the risk of importation to other locations?
In this work, we devise a protocol to quantitatively answer these questions. We show that, by integrating phylogenetic, epidemiological and behavioral analyses within a framework for data-driven and model-informed pandemic intelligence, it is possible to quantify the pandemic potential of an emerging variant and predict the dynamics of subsequent national outbreaks with satisfactory precision.
Results
Blueprint for a pandemic intelligence framework
Reliably quantifying the pandemic potential of an emerging variant requires data, and acquiring data requires time. Between the time t0 of the first undetected case and the time t1 of the first reported case and its subsequent lineage designation at time t2, an emerging variant can silently spread within its country of origin and beyond. For example, let us consider the B.1.1.529 (in the following referred to BA.1) lineage of the Omicron variant (also known as BA.1). This was first reported by genomic surveillance teams in South Africa and Botswana on November 25th 2021. Priority actions have been established by the World Health Organization (WHO) for member states on November 26th, with designation as a variant of concern (VOC) [23] required to raise the level of international alert (t3). By December 16th 2021, there were several reports of an estimated reduction in both vaccine effectiveness against infection and severe disease [24, 25, 26, 17, 27], together with characterisations of the epidemiology of the variant in South Africa [28], Denmark [29] and Norway [30]. Early phylogenetic analysis placed t0 during the third week of October 2021, about one month before t1. Three weeks later it had been identified in 87 countries [28].
Figure 1 summarizes this timeline for BA.1, while highlighting the main analytical steps required to define a self-consistent protocol to characterize the pandemic potential of an emerging variant (see Supplementary Fig. S1 for more mechanistic scheme). Figure 1B illustrates how genomic surveillance data and epidemic modeling can be used to infer the spatio-temporal coordinates of the variant’s origin, thus providing information on t0. This information is used to estimate the importation risk for all countries in the world due to cross-national human flows. Finally, imported cases are used as seeds for community transmission leading to country-level outbreaks, while accounting for the epidemiological parameters characterizing the new variant. Unavoidable uncertainties about t0 and epidemiological parameters are propagated through the workflow. Plausible scenarios are presented, accounting for distinct levels of case under-reporting in each destination country (Fig. 1C).
(A) Evolutionary dynamics of SARS-CoV-2 variants, coded by colour. The panel is obtained from nextstrain.org, based on GISAID data. (B) For the B.1.1.529 lineage (or BA.1, Omicron, according to the WHO nomenclature), we identify four distinct time points in the process of characterising the variant, from the time of the first undetected case to the designation as Variant of Concern. This illustrates how genomic surveillance data is used in combination with global human movement data and epidemic modeling to: i) perform a spatiotemporal reconstruction of the patient zero to identify the country of origin of an emerging variant and estimate its epidemiological parameters and ii) calculate the importation risk for all other countries worldwide. (C) For a subset of about 50 countries worldwide (depending on sequencing data availability), we forecast the increase in the number of cases due to the consequent community transmission according to what-if scenarios, accounting for distinct levels of under-reporting. For a more mechanistic workflow scheme see Supplementary Fig. S1.
In the following, we describe each step of the procedure, detailing our pandemic intelligence framework and the underlying modeling assumptions.
Reconstructing the origin of an emerging variant in space and time
For all SARS-CoV-2 sequences belonging to the B.1.1.7 (Alpha), B.1.617.2 (Delta), BA.1, BA.2, BA.5, and BA.2.75 (Omicron) lineages from GISAID [31, 32, 33], we retained only those generated from cases reported during the early stage of the corresponding wave from the country of evolutionary origin, from 20 up to a total of 100 sequences per lineage. Where there were multiple candidate countries of origin, we estimated the outbreak country by a simple trait model. We then generated 3 alignments, comprised of respectively 20%, 50% and 100% of the sequence set. These were subsequently cleaned by trimming the 5′ and 3′ untranslated regions and gap-only sites. Bayesian evolutionary reconstruction of the dated phylogenetic history [34] was used to obtain posterior distributions of the growth rate t, the parameters of the molecular clock, and the time of the most recent common ancestor (tMRCA). See Materials and Methods for details.
In this way obtain an estimate of t0, the time of the first unreported case, as well as of other epidemic parameters such as the growth rate. From these we estimated the effective reproduction number and generation interval. Indicating the number of infected individuals and number of deaths at time t by I(t) and D(t) respectively, we consider the time period during which there is co-circulation of an existing variant v and an emerging one ω. We approximate the epidemic evolution by
where Ix(t) is the number of infections due to variant x at time t, Rx(t0) is the effective reproduction number at time t0, and GIx is the generation interval. Similarly, the deaths due to the co-circulating variants are approximated by D(t) = Dv(t) + Dω(t), with
where IFRx denotes the infection fatality rate of variant x and τx is the time lag between infection and death. To fit the unknown epidemiological parameters, i.e. the ones related to variant ω for which we obtain a joint probability distribution, we use an optimization procedure (see Materials and Methods)
In the case of BA.1, we obtain t0 = 28 October 2021 (95% HPD: 20 October–5 November) and a daily growth rate estimate of 0.582 (95% HPD: 0.117–1.035) from the phylogenetic analysis and t0 = 19 October 2021 (95% CL: 15 October–23 October) from epidemic modelling, with Rt = 2.56 (95% CL: 2.16–3.19) and GI = 7.36 (95% CL: 6.12–9.17). Our results are in good agreement with the literature, reporting t0 = 9 October 2021 (95% HPD: 30 September–20 October), exponential growth rate of 0.137 (95% HPD: 0.099–0.175) per day [28] and GI = 6.84 days (95% credible intervals: 5.72–8.60) [35].
For further details, refer to Materials and Methods and Supplementary Figs. S2–S3.
Estimating the importation risk of an emerging variant by country
We use monthly seat capacities of flights between airports from the Official Airline Guide [36], encoding how many people could have travelled if all seats were occupied on flights from airport A to B in the month of the estimated t0. We indicate the corresponding flow matrix by F, where entry Fij describes the maximal passenger flow to i from j. The travelling population in the catchment area of an airport is obtained by Ni = Fi, with , i.e., we assume that the population in the catchment area of the airport is equal to the airports outflow. For each emerging variant, the resulting large-scale network of international travels corresponding to the month of t0 is used. The import risk is calculated as in [37]: based on a reconstructed effective distance graph [21] and a random walk with an exit option we estimate how likely it is that one infected individual reaches any airport worldwide given they set off from an airport in the country where the variant emerged. To work at country level, we aggregate the import risk of all airports of the outbreak country by computing the weighted mean with weights
where C(n) the set of airports that belong to the same country as airport n does. We have performed an extensive analysis to validate the estimated import risk against available data, such as the official arrival times as obtained from the WHO, for each emerging variant. We find considerable correlation between arrival time and import risk distance for different variants (Alpha, Beta, Delta, Gamma) with a median of r = 0.55 (range r ∈ [0.41, 0.56]). This median is the largest we found, when compared with several alternative distance measures (see Supplementary Figs. S2, S3, S4). The stochastic nature of the reported variant arrival times (based on the by-country rate of genome sequencing, the probabilistic distribution of infected individuals among passengers, etc.) is one possible reason for the imperfect correlation, but another possibility is that the assumed outbreak location is incorrect. To test this, we attempted to identify outbreak locations by recomputing the correlation for all countries (similarly to [21]). For Beta, Gamma, and BA.1 the country declared by the WHO as the outbreak source had the greatest degree of correlation. For Delta and Alpha the WHO candidate had the 2nd and 5th best correlation respectively (see Suplementary Figs. S4, S5). We extended the analysis to sublineages of Omicron and previously circulating variants of interest (VOIs) by estimating arrival times and outbreak countries from GISAID data (see Material and Methods). For 13 of 17 variants the suspected outbreak location from GISAID had at least the 3rd-largest correlation coefficient (of 183), and for all variants the GISAID candidate was at least on the 12th rank (see Supplementary Figs. S7, S8, S9).
Modeling country-level epidemic spread of an emerging variant under distinct scenarios
We use results from the previous step of the pipeline as inputs for an epidemic model in order to forecast the potential surge in cases due to an emerging variant in a target country. First, we estimate the daily number of infected people (seeds) traveling to the target country from the country where the VoC emerged (source country), based upon four elements: 1) results of our phylogenetic analysis, which inform both the growth rate and the time of emergence of the variant of concern, 2) genomic surveillance in the source country, 3) estimates of prevalence in the source country (incoporating underreporting), and 4) the import risk score of the target based on estimates from our analysis. Then, we produce short term estimates of the daily incidence of the VoC in the target country by means of a Renewal process [38, 39, 40], in which we take into account both the introductions of seeds from the source country and the local epidemic dynamics caused by secondary cases. The renewal equation approach comes with three main advantages with respect to other models, such as SIR [41]. In fact, 1) it does not require to include in the dynamics the immunological status of the population in the target country; 2) the VoC dynamics can be considered as independent from the ones of the co-circulating VoCs, thus avoiding the need of estimating additional parameters for concurring spreading processes; 3) the model explicitly includes the most relevant epidemiological observables, such as Rt, the serial interval distribution [42], and the immune escape of the VoC. For further details we refer to Materials and Methods and Supplementary Figs. S11–S12.
Assessing the pandemic potential of emerging variants
In Fig. 2 we show the result of each step described above in determining the genomic and epidemiological parameters of the BA.1 lineage and, accordingly, quantify its pandemic potential. We refer the reader to Supplementary Figs. S13-S14 for a more detailed analysis of errors in these estimates. Figure 2A displays a time-resolved maximum clade credibility phylogeny of the lineage. Panel B is the map of import risk across the world. Panels C and D show, for two example countries, the simulated epidemic projections, plotted as weekly incidence. For each reproduction number, the shaded area represents the interval between the estimates derived using the minimum and maximum values of underreporting in the source country. Panel E provides model estimates of case counts in all considered countries.
(A) Phylogenetic reconstruction and estimation of the most recent common ancestor (MRCA), identified South Africa on 28 October 2021 (95% HPD: 20 October– 5 November) as the most likely MRCA. (B) Import risk map: countries are colored by their probability to import infectious individuals carrying the BA.1 lineage. (C, D) Projected weekly incidence in Estonia and the U.S. obtained from epidemic modeling, under different Rt scenarios indicated by coloured lines. Line thickness represents the range between the minimum and maximum assumed values of underreporting in the source country (here South Africa). Points represent the observed incidence. (E) Case counts simulated using the Rt scenario that corresponds to the mean growth rate from the phylogenetic analysis. For each country, the date of the first reported case is indicated with a grey circle.
Figure 3 shows the results obtained for the SARS-CoV-2 lineages B.1.1.7 (Alpha), B.1.617.2 (Delta), BA.1, BA.2 and BA.5 (Omicron). The date of the most recent common ancestors and the growth rate are shown, together with the temporal evolution of the number of expected cases around 50 countries (varies depending on available sequencing data; Alpha: 59, Delta: 55, BA.2: 51, BA.5: 49 countries). Point estimates of the mean and 95% HPD regions are further provided in Table 1.
(A–B) tMRCA and growth rate estimates for Alpha, Delta, BA.1, BA.2 and BA.5 from phylogenetic analysis. (C–F) Estimates of case numbers in all the considered countries for the same variants. For each lineage and country, the epidemic simulation starts at the time of infection t0 of the first undetected case as identified using the phylogenetic analysis. The simulation stops at the third date at which sequences belonging to the considered lineages are greater than zero. Results are provided in logarithmic scale and times at which the first case is reported are marked with grey circles.
Phylogenetic estimates of the time of most recent common ancestor (tMRCA) and daily growth rate for SARS-CoV-2 B.1.1.7 (Alpha), B.1.617.2 (Delta), BA.1, BA.2, BA.5 and BA.2.75 (Omicron) lineages.
To assess the prediction error of our workflow we computed the normalized root-mean-square error (nRMSE) between prediction scenarios and observations. The nRMSE is zero, if the observation lies in between the simulation scenarios. Otherwise, the nRMSE is the RMSE between observation and the closest prediction scenario, normalized to the range that is spanned by the observations in the respective target country (for details see Material and Methods). Figure 4 captures the absolute and relative frequency of countries according to their nRMSE. Our predictions are in very good agreement (nRMSE = 0) for Alpha in 81.4%, BA.1 in 53.1%, BA.2 in 52.9%, BA.5 in 49% and for Delta in 12.7% of all considered countries. Note that even though Delta has the smallest amount of countries with incidences falling within scenarios prediction, more than 75% of the countries have a nRMSE ≤ 2.5.
Absolute (bars) and relative frequency (segments) of countries according to their normalized root-mean-square error (nRMSE) for the Alpha (B.1.1.7), Delta (B.1.617.2), BA.1, BA.2, BA.5 (Omicron) lineages. The normalized RMSE is zero if the number of infected people evaluated from data is inside the range spanned by the epidemic scenarios. Otherwise, it is the RMSE between observed incidence and the incidence of the closest epidemic scenario, normalized to the range spanned by observed incidences in the respective country. The order and color of the bars and segments is identical, i.e. the bars serves as color legend for the segments. For example, dark blue always represents the number or percentage of countries with the smallest nRMSE.
Discussion and conclusions
Here we have presented an integrated framework that combines phylogenetic analysis of genomic surveillance data with international human mobility data and large-scale epidemic modeling, in order to characterize in nearly real time the pandemic potential of an emerging variant. This concept is intended to provide quantitative indicators about the ability of a variant to escape population immunity acquired previous infections and/or vaccination, and quickly spread at a global level through human activities.
Our framework naturally deals with missing and noisy information to infer, through a Bayesian approach, the most likely origin – in space (on the country level) and time – of an emerging variant and its growth rate. Spatial and temporal coordinates are used to feed an analytical technique to estimate the probability that a given number of infectious individuals, departing from the country where the variant first appeared, travel to other countries with no exposure to it. This crucial step is based on international travel data, providing information about human movements between countries. Note that our approach is more powerful than naive estimates based only origin-destination pairs: in fact, we make full use of the knowledge we have about the underlying international travel network and its latent geometry [21, 43, 44], known to play a crucial role to amplify the spread of an emerging pathogen [18]. The last stage of our framework is to use importation risk to quantify the number of imported infectious cases to each country and, accordingly, estimate the consequent unfolding of the epidemic due to the emerging variant. The epidemic model is intended to quantify undetected infections that occur well before the first genomic sequence is isolated from a case in a country. Note that it is also possible to estimate the time at which the emerging variant will become dominant in the destination country, as shown in Supplementary Figs. S15-S16. The estimation is based on a logistic growth equation for the relative fraction of a new variant. These predictions will be less accurate if growth advantages in different countries are heterogeneous, for example due to immune escape. In this scenario, they would require calibration for each variant.
Only the early phase of spread of a new lineage is estimated and the proposed model can safely take advantages of assumptions like a homogeneous mixing and the lack of feedback loops in the epidemic dynamics.
We have validated our integrated framework on existing variants, including B.1.1.7 (Alpha), B.1.617.2 (Delta), BA.1, BA.2 and BA.5 (Omicron), finding excellent agreement with independent estimates of the relevant phylogenetic and epidemiological parameters. By accounting for different scenarios in the progress of the epidemic in each country, we provide quantifiable indicators to inform decision makers and support pro-active policy interventions to mitigate the potentially harmful effect of an emerging variant. For the current variant of most concern, BA.5, we estimate that its most recent common ancestor existed in early January 2022 (9 January 2022, 95% HPD: 19 December 2021–29 January 2022), with a daily growth rate of 0.113 (95% HPD: 0.051–0.177).
Overall, our findings show that it is possible to aim at pandemic intelligence, even with partial and noisy data. We must caution that the estimates of the pandemic potential of an emerging SARS-CoV-2 variant are largely driven by the uncertainty in the spatio-temporal coordinates of its origin. Our most unreliable estimates are obtained for countries where genomic surveillance is poor, propagating uncertainties to short-term projections which, in turn, exhibit larger variability. Importantly, note that only the validation of our scenario predictions relies on large enough sequencing rates in the target country, but not its application. That means our framework is perfectly suitable for low- and middle- income countries with little to no genomic surveillance.
Failures in international cooperation with a view to finding global solutions have undoubtedly shaped the COVID-19 pandemic. We have provided robust evidence that epidemic intelligence at country level is not enough, alone, to contrast the pandemic of respiratory pathogens such as SARS-CoV-2, in the absence of well-coordinated genomic surveillance – especially in low-income and middle-income countries currently lacking and adequate response capacity [45] – and global projections of variant’s pandemic potential. Our approach is inherently integrated and scalable, adding to ongoing modeling efforts and pan-viral analyses [46, 47, 48, 49, 22] and responding to global calls for coordinated action [45, 50, 51]. The data-driven approach provides a vital step in the path towards pandemic intelligence – where the interconnected and interdependent nature of human activities [21, 18, 52] is naturally accounted for at a global level – as well a means of enhancing global preparedness against future emerging variants.
Funding
A.D.N. acknowledges support from the Department for Environment, Food and Rural Affairs (Defra), United King-dom [research grant: SE2945], and the Biotechnology and Biological Research Council (BBSRC), United Kingdom [project: BBS/E/I/0007035].
Author contributions
M.D.D. designed the study; P.K., V.D., F.D.L., A.Z., S.B., A.D.N., M.H. and L.F. performed the numerical experiments; all authors analyzed the data and wrote the manuscript.
Competing interests
The authors declare no competing interests.
Data and materials availability
Both the data and analysis material will be available online at Zenodo upon publication. This work is licensed under a Creative Commons Attribution 4.0International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
Acknowledgments
The authors acknowledge the GISAID initiative and all the authors from the originating laboratories where genetic sequence data were generated for sharing such data through GISAID, which has made this work possible.
SI Appendix Material and Methods and Supplementary Information
Materials and Methods
I Phylogenetic Reconstruction
I.1 Genomic dataset compilation
We retrieved all SARS-CoV-2 sequences belonging to the Alpha B.1.1.7, Delta B.1.617.2, Omicron BA.1, BA.2, BA.5, and BA.2.75 lineages from GISAID. Each genomic dataset was filtered by only retaining those sequences that were generated from cases reported during the initial wave and from the country of evolutionary origin, up to a total of 100 sequences per lineage. We then generated 3 alignments using MAFFT 7.505 [1], each comprised of 20%, 50% and 100% of the total number of sequences, which were subsequently cleaned by trimming the 5′ and 3′ untranslated regions and gap-only sites.
I.2 Phylogenetic estimates of epidemiological parameters
We performed a common Bayesian evolutionary reconstruction of timed phylogenetic history using BEAST 1.10.5 [2] that was source compiled from its github repository (https://github.com/beast-dev/beast-mcmc). We modelled the nucleotide substitution process according to an HKY 85 + Γ parameterisation, setting a strict molecular clock and an exponential growth model as coalescent prior. We used a Lognormal(µ = 9 × 10−4, σ2 = 1 × 10−5) prior for the molecular rate of evolution, a Laplace(µ = 0, b = 100) prior for the rate of exponential growth and a Lognormal(µ = 5.7, σ2 = 2.3) prior for the exponentially growing viral population size. We further set an initial calibration for the time of the most recent common ancestor (tMRCA) at an age of ∼ 6 months before the most recent sample included in the alignment. All the remaining priors were left at their default values.
Bayesian inference through Markov chain Monte Carlo (MCMC) was performed for 2 × 108 generations, sampling every 20,000 generations and using the BEAGLE 3.1.2 library to increase computational performance [3]. MCMC convergence and mixing properties were inspected using Tracer 1.7.2 [4] to ensure that effective sample size (ESS) values associated with estimated parameters were all >200. After discarding 10% of sampled trees as burn-in, estimates of the growth rate, molecular clock and tMRCA were extracted along with their posterior distributions (Figure S2).
The three pieces of our pipeline (black dashed boxes) are illustrated and what input they get from external data sources (orange colored) or from the output of earlier parts of the pipeline (blue arrows). The t0 at the orange arrows means that data from the external sources is used at or prior to t0, which is the estimated time of the most recent ancestor (the output of the first part of the pipeline, the phylogenetic reconstruction).
I.3 Estimates based on epidemic modeling
We obtain an independent estimate for t0, the time of the first unreported case, and for other epidemic parameters, such as the effective reproduction number and the generation interval. By indicating with I(t) the number of infected individuals at time t and with D(t) the number of deaths, we consider the stage with the co-circulation of an existing variant v and the emerging one ω. Since we consider the final stage of the contagions due to v and the early stage of the contagions due to ω, we approximate the epidemic evolution by
where Ix(t) is the number of infections due to variant x at time t, Rx(t0) is the effective reproduction number at time t0 and GIx is the generation interval. Similarly, the deaths due to the co-circulating variants are approximated by
where IFRx denotes the infection fatality rate of variant x and τx is the lag between infection and death. To fit the unknown parameters, i.e. the ones related to variant ω, we use particle swarm optimization [5] to minimize the loss function
where Iobs(t) and Dobs(t) are the number of infected individuals and deaths from empirical data [6], Var indicates the variance in time and θ = {t0; Rω(t0); GIω; IFRω; τω} is the vector of the epidemiological parameters characterizing the emerging variant, for which we obtain a joint probability distribution.
Posterior distributions of the time of the most recent common ancestor (tMRCA), daily growth rate and doubling time estimated for each of the Alpha B.1.1.7, Delta B.1.617.2, Omicron BA.1, BA.2, BA.5, and BA.2.75 SARS-Cov-2 lineages using alignments of 20, 50 and 100 sequences.
II Import Risk estimation
II.1 International travel dataset compilation
We retrieve the monthly seat capacities between airports from the OAG (Official Airline Guide). Note, that it does not represent the actual passengers that flew from airport A to B in one month, but the maximal capacity, i.e. how many could have travelled if all seats were occupied. It is therefore an upper limit for the passenger flux and we refer to it as the flow matrix F, where Fij describes the maximal passenger flow to i from j. We estimate the travelling population in the catchment area of an airport by Ni = Fi, with Fi = ∑j Fji, i.e. we assume that the population is proportional to the outflux of the airport. For each variant we use the WAN at the month of the outbreak day of the respective variant.
The distance measures are the geographic distance Dgeo (A), the import risk distance DIR (B), the effective distance (C), the random walk distance
(D) and the information diffusion distance
(E) whereby the latter three (C, D, E) are generalized to weighted multiple paths.
II.2 Quantifying the Import Risk
The import risk method is introduced in a separate study [9] where it is compared to another data-driven estimate. Here we present a short outline of the method. To know how many passengers leave at node j given they started at node i, we introduce the shortest path exit probability q(j|i) (SPEx). It is based on the shortest path tree of the effective distance [10], and combines the exit probability with all possible paths that end in j. The resulting import risk is therefore an extension of the SPEx.
In order to compute the SPEx we first define, with the flow matrix (maximal passenger flux) F and the travelling population of the catchment area Ni, the transition matrix P, where the element Pij = Fij/∑ i Fij = Fij/Fj is the probability to transition to i from j. Now, the effective distance graph [10] is Dij = d0 − log(Pij), with d0 as the distance offset which we set to d0 = 1 (the larger d0 the more Dij increases with increasing hop-distance). Let T(n0) be the shortest path tree on D for the point of origin n0. With respect to node n the downstream nodes Ω(n|n0) are those nodes that can be reached from the source n0 through node n on T(n0).
The distance measures are the geographic distance Dgeo, the import risk distance DIR, the effective distance , the random walk distance
and the information diffusion distance
whereby the latter three are generalized to weighted multiple paths.
DIR. For the import risk distance DIR(m|n0) = − log(p∞(m|n0)) the WAN of the WHO outbreak month is used and the WHO outbreak location as source country. The arrival time are taken from the “cov-lineages.org” [7, 8] project.
Now we compute the SPEx q(i|n0) by assuming that all passenger that start at n0, travel along the shortest path tree T(n0) and distribute to other airports according to their respective populations Nn. We assume that the exit probability at i is proportional to the ratio of the population at i (i.e. Ni) to the population of all of i’s downstream nodes plus Ni:
The r-value between the import risk distance DIR(m|n0) = − log(p∞(m|n0)) and the arrival time for the 10 best ranked outbreak counries (n0). The 2 Letters in the circles are the countries ISO alpha-2 codes. The red circle marks the country declared as outbreak country by the WHO.
The frequency of the r-value between the import risk distance DIR(m|n0) = − log(p∞(m|n0)) and the arrival time for all possible outbreak counries. The red vertical line marks the r-value using the country declared as outbreak country by the WHO.
![Embedded Image](https://www.medrxiv.org/sites/default/files/highwire/medrxiv/early/2022/08/22/2022.08.19.22278981/embed/graphic-21.gif)
Now, we use the SPEx on a random walk that starts at n0 and the walker exits at node i with probability q(i|n0) or continues its walk with probability 1 − q(i|n0). Thus, the probability to be at node m if the walker was before at node m − 1 is
Consequently, the probability to take a path Γ starting at n0 and exiting at m is
The probability to exit at node m from all possible paths (of all possible lenghts) is
Note that is the probability sum of all paths that started in n0 and end after k steps in m. We aggregate all airports of the same country by computing the weighted mean with weights
with C(n) as the set of airports that belong the same country as node n does.
II.3 Relation to distance and arrival time
In order to assess the quality of the import risk, we compare it with the arrival time of past variants. Clearly, the higher the import risk to a country, the earlier it is to arrive and the direct relation between the probability of travel to a city m from a city n0 and the mean first arrival time t1 is
which is the effective distance [11, 12]. Thus, we define the import risk distance as
which is proportional to the mean first arrival time.
II.4 Alternative distance measures
There are alternative measures to estimate the arrival time [10, 13, 14], and we want to compare our import risk distance to these established measures. However, please note that the alternative measures have a clear qualitative relation to the arrival time, but it is not possible to directly infer the number of passengers that travel between airports from them (what the import risk is especially designed for). The already introduced alternative measure is the effective distance [11] that uses the flow between airports to estimate the probability to travel from airport n to m
Now, the distance along a specific path Γ that connects m and n0 is the sum of the path elements distances
Finally the effective distance from airport n0 to m, also not directly connected airports, is the minimal effective distance of all possible paths Ω(m, n0) they are connected through
An extension to the effective distance is the random-walk effective distance [14] that considers all possible paths connecting two airports Ω(m, n0) instead of only taking the dominant path with the shortest distance:
Note that the sum of path distances via their exponential is due to the linkage to the arrival time as explained in [14].
We also add a comparison with a metric derived from Diffusion Distance [13] which exploits the definition of a random walk Laplacian on top of the WAN. We further explain this Information Distance DID in the dedicated section V.
Country-Level aggregation
The country-level aggregation of the import risk distance DIR is done by first aggregating the import risk on country-level (as described in Sect. II.2) and then applying Eq. S13.
To aggregate the other distances (Deff, DRW) we could either take (along the line of Deff) the minimal distance between two countries (of all relevant airport pairs), or use a weighted multipath approach as used in the derivation of DRW. We will highlight the latter in the following; however, we also computed the minimal measure and found that it is outperformed by the multipath distance (not shown, but it is the basic finding in [14]).
As shown in [12], the effective distance of two paths combined is
Thus, the multipath (MP) effective distance that considers all shortest paths from country S to M is:
with M as the set of all target airports in country M and S all source airports of country S.
Since the distance of source airports with a larger population are more important, we additionally weight the source airport with wi = ∑Fi/s∈s Fs, which represents the probability of an infected to start in location n. Now, we compute the population weighted multipath effective distance by
Note that the weighting for the effective distance can be reformulated to
which corresponds to multiplying the probability to start at the source airport s to the first step of each path. Analogously the
II.5 Data for arrival time and outbreak region
We compare the import risk to measured arrival times of different variants. Therefore, we need to define the outbreak-country and -month and the arrival times. We defined these variables in different ways.
(I) external sources Here we rely on peer reviewed [8] or official [15] sources. The outbreak country and the outbreak month are taken from the website of the World Health Organization (WHO) “Tracking SARS-CoV-2 variants”[15] and the arrival times of the variants Alpha, Beta, Delta, Gamma and Omicron were externally computed with “grinch”[8] and taken from their project website[7]. If arrival times are before the official outbreak they are removed from the analysis (for Delta=1, Gamma=1 and Omicron=19 countries are removed).
(II) GISAID data To also use the other variants to validate our import risk method we design a simple arrival time algorithm. First, we need to define the outbreak day. Instead of relying on an official definition from the WHO, we use GISAID data. The outbreak time TX,out of variant X is defined by
with FX (t) beeing the fraction of variant X to all sequenced probes at time t and T (FX (t) ≥ g · max(FX)) the time when FX (t) crosses the first time the threshold gcdot max(FX) where g ∈]0, 1[and we set g = 0.025. In other words, the outbreak is defined by 30 days before the variant reached 2.5% of its world wide peak. We estimate the arrival time of variant X in an country by the most simple way: the first time the variant is detected (according to GISAID data). In Fig. S8 the estimated outbreak time, official WHO and arrival times of each country are shown. Since for some variants (Alpha, Delta, BA.2) many arrival times fall clearly before our estimated and even the official outbreak date, we recomputed for these countries the arrival time to the first GISAID-detection after the outbreak date. We argue that either (i) the sequencing of the variant in these countries was error-prone (1. count is very sensitive to any wrong predetection) or (ii) the spreading was slow and the variant did not dominate the local epidemic until it reached a susceptible country (low NPIs) from where it did spread more easily (probably the case for Delta).
The outbreak date (black dashed vertical line) of a variant can be defined by the first time the fraction of a variant X of all sequenced probes reaches 2.5% of its current world wide peak. To exclude maldetections of 1st. arrival times in countries, we exclude all arrival times (blue short vertical lines) that are before the outbreak date and set the arrival time as the first detection in the respective country after the outbreak date. The official outbreak date by WHO is marked by a red dashed vertical line.
II.6 Outbreak detection based on 1st count GISAID data
If we repeat the outbreak detection method using all variants and the arrival times estimated via GISAID data (arrival by first detection, Fig. S8), we see that the outbreak detection via the best correlation between import risk distance DIR and arrival times Tarrival in general confirms the outbreak regions declared by the WHO (see Figs. S10, S9). There is a discrepancy for Delta. While using WHO and “cov-lineages.org” data, the official outbreak country India (IN) was second best, it is only on rank 12 if our GISAID estimates are used. A possible explanation is, that our outbreak date estimation is 5 months after the WHO date. In order to not lose the countries with arrivals before the outbreak date, we recompute the arrivals by the first count after the estimated outbreak date. One can argue that Delta did locally spread much stronger in South Africa (ZA, the top ranked country), and therefore is ZA for the worldwidee distribution of larger importance than India. An alternative explanation is that the passenger flow in the WAN was too low and when it increased ZA had a more active Delta epidemic.
The r-value between the import risk distance d∞(m|n0) = − log(p∞(m|n0)) and the arrival time for the 10 best ranked outbreak counries (n0). The 2 Letters in the circles are the countries ISO alpha-2 codes. The red circle marks the country estimated as outbreak country based on GISAID arrival times. In contrast to Fig. S6: the arrival times and outbreak dates are estimated via GISAID data (arrival by first count, outbreak date by reaching the first time 2.5% of world wide peak of the respective variant).
The frequency of the r-value between the import risk distance DIR(m|n0) = − log(p∞(m|n0)) and the arrival time for all possible outbreak countries. The red vertical line marks the r-value using the country estimated as outbreak country based on GISAID arrival times. In contrast to Fig. S7: the arrival times and outbreak dates are estimated via GISAID data (arrival by first count, outbreak date by reaching the first time 2.5% of world wide peak of the respective variant).
III. Epidemic Scenarios
We consider two distinct models to project the number of daily new infected people, namely, a renewal equation based model and a multi-strain SIR-like model. The first one is actually part of the pipeline, while the second one is used as validation.
III.I Renewal equation
The renewal equation approach is a well-known technique, widely used in epidemiology [16, 17, 18]. The reason why renewal equations are such strong candidates for early projection of new cases, is the fact that informing them requires only the reproduction number of the new variant of concern, its generation interval distribution, and the number of people infected by the new variant who travel into the target country from the source country. This allows easily to explore scenarios with different values of epidemiological quantities of interest, such as the effective reproduction number of a new variant as it spreads from the source country to others through travelers.
For now, we assume that the susceptible population is much larger than the number of active cases, and that the mixing between the infected and the susceptible is homogeneous. This allows to exclude feedback loops in the dynamics, e.g. the fact that immunity to the new variant builds up through infection, which would modify the dynamics itself. Such strong assumptions are acceptable as long as we restrict our projections to the very first few weeks from the introduction of the new variant in the target country.
The model assumes that the number of newly infected people at day t, I(t), is given by two distinct processes: a) the arrival of infected individuals from the source country (Iout(t)) and b) the daily new infections (Iin(t)) happening in the target country due to the endogenous spreading. The former is estimated from section II, while the latter can be estimated through the renewal equation
where t0 is the day the first infected cases arrived in the target country, Rs is the daily reproductive number on day s, and Γs is the generation time distribution, i.e. the fraction of transmissions that would occur on day s after infection.
Finally I(t) = Iout(t) + Iin(t). This is the simplest renewal process, which does not include the fact that the target population might have an inhomogeneous immunological landscape, due to previous infections or vaccination. To model this phenomenon, we reinterpret the term on the right side of equation (S23) as the number of inoculations spreading from currently infecting people, which will turn into infections depending on the susceptibility of the recipients. If we assume that previous infections (with other variants) protect against reinfection with an efficacy of ne, and, analogously, vaccination has an effectiveness of νe, then we can explicitly account for removals by modifying equation (S23) into
where Rold(t) is the number of recovered people from previous variants that still have some protection against infections, and V (t) is the total number of vaccinated people. This assumes that the number of recovered or vaccinated people is uniformly distributed across the population, and that the events ‘being vaccinated’ and ‘having been infected’ are independent. This also assumes no gradual waning of protection against infection. However, we can consider as recovered or vaccinated only people who were infected or vaccinated recently, rather than from the beginning of the pandemic. For instance, considering only people who got either infected or their second dose up to six months prior to t is equivalent to assuming that there is an abrupt waning of efficacy against protection six months after getting infected or vaccinated.
Although these hypothesis might seem unrealistic, the lack of readily available data about waning and immunological landscapes of various countries, and the fact that this should be used only for short-term scenario explorations, allow us to avoid introducing further complexity into the model.
The cumulative number of cases and amount of fully vaccinated individuals at each day are the ones reported in the public repository at 1. We select the values for vaccine efficacy and protection from previous infection from available works. In particular we set the vaccine efficacy νe to 0 for Alpha, 0.5 for Delta, BA1 and BA2 and to 0.12 for BA.5 ([19, 20, 21, 22]). The selected protection against reinfection ne is 1 for Delta, 0.56 for BA.1 and BA.2 Omicron lineages and 0.13 for BA.5 ([23, 24, 22]).
The second model is a multi-strain SIR inspired by [25]. This is a two-strain model in which people who recover after being infected with the former variant are not completely immune to infection from the latter variant. The equations governing this system are
where
being the transmission rate of the variant i, and γ being the recovery rate. The initial condition
. Note that, since Iout(t) represents the arrivals from the source country at the beginning of each day, the system is not closed. This is not a problem because we are considering countries, so
. Since the dynamics does not include, per se, the fact that the initial condition changes every day due to arrivals, we can solve this system on a daily basis, updating the initial condition and restarting the system accordingly. The advantage of this system is that it includes feedback phenomena, which is good when validating the model, as it may need to run for more than a few weeks. The drawbacks are that informing the model requires good point estimates of the various compartments, and the interpretation of the trans-missibility coefficient related to the measured ℛt, which may not be straight-forward. For such reasons, this model is used to validate the renewal equation approach, in particular for countries where no new cases were observed after a few weeks from their emergence (not shown). Projections errors valuated with the SIR model relative to Alpha lineage are shown in
III.2 A fully worked out example: the Alpha variant
We apply our pipeline to a real case, the Alpha variant of concern (VOC), that was identified in the UK on 20 September 2020 [8]. We assume that the UK is the source country and we demonstrate how the pipeline works. In the following, we consider as the generation time interval distribution the one inferred from the literature [26].
Starting from the phylogenetic part of our pipeline, we take the time of emergence estimated when n = 20 sequences were collected, to simulate a realistic scenario where only few information is available. This gives a central estimate for the time of emergence of the Alpha variant around the 9th of November 2020. The daily growth rate estimated is r = 0.097 (95% HPD: 0.008–0.202). To translate this into Rt in the source country, we assume that all the growth rate advantage of Alpha relative to the previous circulating variants is given only by transmission advantage (limited capacity of reinfections with Alpha). Further, typical generation time distributions are Gammas, as in [26]. This allows us to estimate the Rt using formula 2.2 in [16]:
where b and a are the shape and rate of the Gamma distribution generation time. In our case, a = 5.9, b = 1.13, therefore Rt(α) = 1.62(1.04, 2.63).
For any target country, the projection of the number of cases infected with Alpha in the next weeks is performed in two steps: first, we estimate the number of infected travellers (referred to as seeds) who arrive in the target country from the source country, then we use the renewal equation (S24) on each possible scenario, to account for endogenous transmission of the secondary cases in the target country. The first step consists in using the import risk estimates described in section II to compute the number of daily travellers from source country to other target countries. We use import risk probability from source to target times the average daily outflow of passengers from source country using WAN data. We then determined the number of travellers infected with Alpha. This is done by considering the proportion of sequenced cases that are Alpha times the 7 − day moving average of daily incidence of new cases, assuming that sequences are taken randomly from the infected population. This estimate does not include undercounting in the source country, which we can estimate as follows.
For a given country, we use the daily new estimated COVID-19 infections from the IHME model which is a hybrid with two main components: a statistical “death model” component produces death estimates that are used to fit an SEIR model component 2. For a complete overview of this model and a comparison with other estimates we refer to OWID3. The data we used for our estimation are publicly available4. In a given temporal window, we integrate over time the confirmed number of cases (7d moving average) and the estimated true number of cases, as well as the estimates for its lower and upper bounds defining the 95% uncertainty interval. The mean undercounting factor is estimated by the ratio between the integrated estimate of the true cases and the confirmed ones in the temporal window, and similarly we estimate the corresponding uncertainty interval. We show in Fig. SS11 the undercounting factor obtained for all countries for which the data is available, whereas Fig. SS12 shows the evolution of this factor along periods of 6 months for some representative countries.
Estimates of the factor accounting for missing confirmed cases: values larger than 1 indicate that a country is counting and confirming less COVID-19 cases than the real number. The reference period is the first semester of 2022. See the text for further details.
To allow for variability in undercounting, we consider two extreme scenarios: the best one, where undercounting is assumed to be 2.27, and the worst one, where undercounting is assumed to be 2.97. The number of infected travelers from the source country to the target country is then computed by multiplying the number of travellers into the target country by the proportion of infected people in the source country. This is often not a natural number. This is not a problem, as the renewal equation does not need to use integer number of infected people, and we interpret this as the results of the various averaging performed through all the steps. The model produces the total number of infected people in the target country given the seeds and the ℛt by day of infection. To validate the model, we need to estimate how many people infected with the VOC were present in the target country during the considered period. We do so in the same way we estimate prevalence in the source country: by multiplying the proportion of sequenced cases that turned out to be Alpha times the daily incidence in the target country, scaled by the estimated undercounting factor.
Estimates of the factor accounting for missing confirmed cases as in Fig. S11, where each panel describes the evolution along periods of 6 months for some representative countries. The dashed line indicates the value 4. See the text for further details.
The total number of different scenarios computed is, in this case 2 × 2 × 3: undercounting in both the source and the target countries, and the different reproduction number of the VOC. Results are shown in Figure 3C and in Figure S13A.
III.3 Prediction error
For each lineage we evaluate different scenarios with a) low and high values of under reporting in both source and target country b) three different basic reproduction numbers Rt that correspond to the range of growth rate values estimated from the phylogenetic reconstruction.
We infer from data the number of infected individuals with the emerging lineage in the target country m and we evaluate the prediction error as zero if this estimated number is included in the range identified by different epidemic scenarios. If the number of infected people evaluated from data is out of the range spanned by the epidemic curves, then the prediction error is evaluated as the root-mean-square error, normalized to the range of the data observed in the target country m, between observed and the closest simulated epidemic curve:
where nt is the number of weeks with number of sequences greater than zero for the selected lineage in the considered country m, that is nt is the number of available data points with not null infected people. Since the scenario simulations stop at the 3 week after sequencing was reported in country m, nt is always nt = 2. The idea behind the normalization by the data range is that it reflects the noise of reported sequences, i.e. if the sequencing rate is low, we expect a large variation and the sequencing data is less reliable. Prediction errors evaluated for all the considered lineages are shown in Figure 3 of the main document. All the panels report the nRMSE in each country as a function of both the number of daily passengers normalized to the total country population (x-axis, values for 100000 individuals) and the number of total collected daily sequences normalized to the total number of confirmed cases (y-axis, values for 100000 cases).
Estimated errors between the number of individuals infected with an emerging lineage and the epidemic curves simulated in the considered scenarios. X-axis show the number of daily passengers normalized to the population in each country (for 100, 000 individuals), y-axis report the number of collected daily sequences, without any classification per lineage, normalized to the total number of confirmed cases (for 100, 000 cases). Inset panels show the map of prediction errors in each country. Panels A-E refer to, respectively, Alpha, Delta, BA.1, BA.2 and BA.5 lineages.
Insets show the evaluated error in each country. Results assess that, in most of the country, the simulated scenarios encompass the data and the prediction error is evaluated as zero. Moreover, error values greater than zero can be found for countries with higher passenger flows.
Estimated errors between the number of individuals infected with an emerging lineage and the epidemic curves simulated in the considered scenarios. X-axis show the number of daily passengers normalized to the population in each country (for 100, 000 individuals), y-axis report the number of collected daily sequences, without any classification per lineage, normalized to the total number of confirmed cases (for 100, 000 cases). Inset panels show the map of prediction errors in each country.
IV Variant Dominance
As already described for the Alpha variant by Fort [27], the relative fraction of a new variant can be described by a simple and well known logistic growth equation. We use an equivalent formula with a slightly different convention, using the growth rate g of the variants as the fitness
This equation is solved by the logistic growth function
Assuming the initial import is small, we can simplify this to
Solving this equation for the inflection point of the sigmoid curve, the point where the new variant becomes the dominant variant in the system (where x(t = tinf) ≈ 0.5), gives us:
For our evaluation we used the GISAID variant data aggregated to weekly values. We show the validity of the inflection time approximation in Eq. S29 by estimating the ratio x0 by the ratio between imported cases by domestic cases x0 ∝ cases/import in Fig. S15.
This function can be directly fitted to available sequence data using logistic regression methods. This will generally perform well in the absence of sampling biases. However, in the presence of sampling bias often present in the first data points of a new outbreak, we found robust regression methods to be more reliable. Transformation using the logit function results in a linear relationship between time and the sample fraction of the new variant:
Then we can perform a robust regression using the RANSAC algorithm. We used the regression method of the python sklearn library [28] and extracted the fitted values for tinf.
The fraction of seq. on GISAID attributed to the Delta variant for four example countries. As described for the Alpha variant by Fort [27], the relative fraction of a new variant can be accurately described by a simple logistic growth equation (Eq. S28).
The date at which four example variants became dominant in different target countries, and an easy to obtain ratio of target country cases and target country’s import risk. Assuming a constant growth advantage f, the logistic growth equation can be solved for the inflection time tinf (new variant constitutes 50% of new cases), describing a linear relationship between the infection time and the logarithm of the fraction x0 of imported cases to domestic cases, in the limit of small x0. The fraction x0 can be approximated using the Import Risk and case numbers. For the delta variant, the relationship is linear and highly correlated (Pearson’s r = 0.81, p < 10−10), for the other variants the relationship is less clear. The time a variant becomes dominant is based on weekly aggregates of GISAID data, transformed with the logit function to obtain a linear relationship between time and the fraction of variant sequences and fitted using a robust regression (RANSAC algorithm, using the scikit-learn python package [28]). For the BA.2 variant, the absolute number of cases is multiplied by the fraction of Omicron cases, both for the fitted data and the cases in the ratio.
V. Information Distance
We also devise an alternative definition of distance on top of a network which embeds information from multiple-pathways diffusion as an additional comparison to the import risk measure. Distances based on the diffusive properties of the system have been of interest in the recent years [10, 14]. Another key example is the Diffusion Distance [13] which estimates a metric distance between nodes based on how similarly the random walkers explore the network by using those nodes as sources, under the assumption that a mesoscale structure is recovered during the time scales in which the random walker explores its functional community.
Starting from Diffusion Distance definition, we propose an educated rewrite of the measure that fits the problem under study to predict arrival times of a random walker on the network, such as an infectious traveller from a source country. The probability p(t | i) of a walker to be in any point in the network at time t, starting from node i, embeds information of multiple paths via successive applications of the Laplacian operator. We introduce a new measure that merges this concept from Diffusion Distance and also embeds information from Effective Distance [10], namely, the idea that low probabilities pk(t | i) are associated with large distances. This can be embedded by taking the negative of the logarithm of the probability, in analogy with Shannon’s entropy. We now introduce this candidate measure for diffusive dynamics which we define Information Distance:
in which pk (t | s) represents the k − th entry associated with node k of the probability state
. Here vs is the initial condition probability for the walker starting from node s, the canonical vector with s-th component equal to 1. The random walk normalized Laplacian (LRW) [29] term encodes the probability to move from node i to node j in its matrix elements. Its off-diagonal terms can be computed as the negative value of Pij, which is directly estimated from the WAN weighted links as stated in subsection II.2. Given the multiple timescales involved in this definition, we evaluate the metric at different scales t to find the timescale at which DID(t) performs better.
Footnotes
References
- [1].↵
- [2].↵
- [3].↵
- [4].↵
- [5].↵
- [6].↵
- [7].↵
- [8].↵
- [9].↵
- [10].↵
- [11].↵
- [12].↵
- [13].↵
- [14].↵
- [15].↵
- [16].↵
- [17].↵
- [18].↵
- [19].↵
- [20].↵
- [21].↵
- [22].↵
- [23].↵
- [24].↵
- [25].↵
- [26].↵
- [27].↵
- [28].↵
- [29].↵
- [30].↵
- [31].↵
- [32].↵
- [33].↵
- [34].↵
- [35].↵
- [36].↵
- [37].↵
- [38].↵
- [39].↵
- [40].↵
- [41].↵
- [42].↵
- [43].↵
- [44].↵
- [45].↵
- [46].↵
- [47].↵
- [48].↵
- [49].↵
- [50].↵
- [51].↵
- [52].↵