Summary
Following initial declines, in mid-2020 a resurgence in transmission of novel coronavirus disease (COVID-19) has occurred in the United States and parts of Europe. Despite the wide implementation of non-pharmaceutical interventions, it is still not known how they are impacted by changing contact patterns, age and other demographics. As COVID-19 disease control becomes more localised, understanding the age demographics driving transmission and how these affect the loosening of interventions such as school reopening is crucial. Considering dynamics for the United States, we analyse aggregated, age-specific mobility trends from more than 10 million individuals and link these mechanistically to age-specific COVID-19 mortality data. In contrast to previous approaches, we link mobility to mortality via age-specific contact patterns and use this rich relationship to reconstruct accurate transmission dynamics. Contrary to anecdotal evidence, we find little support for age-shifts in contact and transmission dynamics over time. We estimate that, until August, 63.4% [60.9%-65.5%] of SARS-CoV-2 infections in the United States originated from adults aged 20-49, while 1.2% [0.8%-1.8%] originated from children aged 0-9. In areas with continued, community-wide transmission, our transmission model predicts that re-opening kindergartens and elementary schools could facilitate spread and lead to additional COVID-19 attributable deaths over a 90-day period. These findings indicate that targeting interventions to adults aged 20-49 is an important consideration in halting resurgent epidemics and preventing COVID-19-attributable deaths when kindergartens and elementary schools reopen.
One sentence summary
Adults aged 20-49 are a main driver of the COVID-19 epidemic in the United States; yet, in areas with resurging epidemics, opening schools will lead to more COVID-19-attributable deaths, so more targeted interventions in the 20-49 age group could bring epidemics under control, avert deaths, and facilitate the safe reopening of schools.
1 Introduction
In 2020 a novel pathogen, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Hubei Province, China [1]. Spread within China occurred in January 2020 and the resultant disease was named COVID-19. Following worldwide spread, the implementation of large-scale non-pharmaceutical interventions has led to sustained declines in the number of reported SARS-CoV-2 infections and deaths. However, since mid-June, the daily number of reported COVID-19 cases has re-surged in the United States, surpassing 40,000 daily reported cases on June 26 [2], and increasing daily cases are beginning to be reported in Europe [3]. Demographic analyses of reported cases have suggested that individuals aged 20-49 may be driving the re-surging epidemic [4, 5]. Here, we use detailed, longitudinal, and age-specific population mobility and COVID-19 mortality data to estimate how non-pharmaceutical interventions, changing contact intensities, age and other factors have interplayed and led to resurgent disease spread. We identify the population age groups driving SARS-CoV-2 spread in 35 U.S. states, the District of Columbia and New York City through August 23, 2020, and quantify the likely impact of school reopening on case and death counts under the scenario that transmission from the age groups that primarily drive transmission continues uninterrupted.
Similar to many other respiratory diseases, the spread of SARS-CoV-2 occurs primarily through close human contact, which, at a population level, is highly structured [6]. Prior to the implementation of COVID-19 interventions, contacts were concentrated among individuals of similar age, were highest among school-aged children, and were also common between children and their parents, and between middle-aged adults and the elderly [7]. Since the beginning of the pandemic, these contact patterns have changed substantially [8, 9, 10]. In the United States, the Berkeley Interpersonal Contact Study suggests that in late March 2020 after stay-at-home orders were issued, the average number of daily contacts made by a single individual, also known as contact intensity, dropped to four or fewer contacts per day [10]. Data from China indicate that infants and school-aged children had almost no contact with similarly aged children in the first weeks after stay-at-home orders, and reduced contact intensities with older individuals [8]. However, detailed age-specific population-level contact and mobility data have remained scarce, especially longitudinally, and this has impeded a better understanding of the age-specific sources driving COVID-19 transmission.
2 Results
Fine scale mobility trends across the United States
We compiled a national-level, aggregate mobility data set using cell phone data from >10 million individuals with Foursquare’s location technology, Pilgrim [11], which leverages a wide variety of mobile device signals to pin-point the time, duration, and location of user visits to locations such as shops, parks, or universities. Unlike the population-level mobility trends published by Google from cell phone geolocation data [12], the data are disaggregated by age. User visits were analyzed from February 1, 2020, aggregated, and projected to estimate for each state and two metropolitan areas daily foot traffic for individuals aged 18-24, 25-34, 35-44, 45-54, 55-64, and 65+ years. To obtain age-specific mobility trends, the data were divided by the corresponding averages in the baseline period February 3 - February 9, 2020 per age band and state or metropolitan area (see Supplementary Material S1).
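The baseline normalisation described above can be illustrated with a minimal sketch. It assumes a hypothetical input layout (one dictionary of daily projected visits per age band and location) and uses made-up visit counts; it is not the processing pipeline of the study, which is described in Supplementary Material S1.

```python
from datetime import date, timedelta
from statistics import mean

# Baseline period stated in the text: February 3 to February 9, 2020.
BASELINE_START = date(2020, 2, 3)
BASELINE_END = date(2020, 2, 9)


def mobility_trend(daily_visits):
    """Express daily foot traffic relative to the baseline-period average.

    daily_visits maps a date to the projected number of visits for ONE
    age band in ONE location (hypothetical input layout).
    """
    baseline = [v for d, v in daily_visits.items()
                if BASELINE_START <= d <= BASELINE_END]
    if not baseline:
        raise ValueError("no observations fall in the baseline period")
    baseline_mean = mean(baseline)
    if baseline_mean <= 0:
        raise ValueError("baseline average must be positive")
    # A value of 1.0 means "same as baseline"; 0.5 means half the baseline visits.
    return {d: v / baseline_mean for d, v in sorted(daily_visits.items())}


# Made-up numbers for illustration only:
# 100 visits per day in the baseline week, 50 visits on April 1.
example_visits = {BASELINE_START + timedelta(days=i): 100 for i in range(7)}
example_visits[date(2020, 4, 1)] = 50
example_trend = mobility_trend(example_visits)
```

In this toy input, the trend equals 1.0 on each baseline day and 0.5 on April 1, i.e. a 50% reduction relative to baseline for that age band and location.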
Across the US as a whole, the mobility trends indicate substantial initial declines in extra-household visits (visits to any location other than an individual's primary residence) followed by a subsequent rebound for all age groups (Figure 1A; see also Supplementary Figure S12). During the initial phase of the epidemic, trends declined most strongly among individuals aged 18-24 years across almost all states and metropolitan areas, and subsequently tended to increase most strongly among individuals aged 18-24 in the majority of states and metropolitan areas (Supplementary Figure S1), consistent with re-opening policies for restaurants, night clubs, and other venues [13]. Yet, by the last observation week August 15, 2020 - August 21, 2020, the data suggest mobility levels continue to be below those observed in the baseline period February 3 to February 9, 2020, in most states and metropolitan areas (Figure 1B). In addition, considering both the initial decline and subsequent rebound, our data indicate that mobility levels among individuals aged < 35 years have not increased significantly above those observed among individuals aged 35-44, and that as of August 2020 there have been no significant shifts in the relative levels of mobility between age groups (Figure 1A-B, and Supplementary Figure S13).
Mobile phone signals are challenging to analyse, owing e.g. to daily fluctuations in the user panel providing location data, imprecise geolocation measurements, and changing user behaviour [14]. We cross-validated the inferred mobility trends against age-specific mobility data from a second mobile phone intelligence provider, Emodo. This second data set also showed no evidence for significant shifts in relative mobility levels between age groups (see Supplementary Material S1), leading us to hypothesize that the resurgent epidemics in the United States may not be a result of changes in the contribution of different age groups to SARS-CoV-2 transmission.
Bayesian semi-mechanistic contact and infection model to characterise age-specific SARS-CoV-2 transmission
To test this hypothesis, we incorporated the mobility data into a Bayesian contact-and-infection model that describes time-changing contact and transmission dynamics at state and metropolitan area-level across the United States (see Supplementary Figure S2 and Supplementary Text S3). For the time period prior to changes in mobility trends, we used data from pre-COVID-19 contact surveys [6], and each state or metropolitan area’s age composition and population density to predict contact intensities between individuals grouped in 5-year age bands. On weekends, contact intensities between school-aged children are lower than on weekdays, while inter-generational contact intensities are higher. In the model, the observed age-specific mobility trends of Figure 1 are then used to estimate in each location (state or metropolitan area) daily changes in age-specific contact intensities for individuals aged 15 and above. We assumed that the effect of the observed mobility trends on changing contact intensities was the same across age groups. For younger individuals, for whom mobility trends are not recorded, contact intensities during school closure periods were set to estimates from two contact surveys conducted post-lockdown [9, 8]. In turn, the contact intensities are used to estimate the rate of SARS-CoV-2 transmission, and subsequently infections and deaths.
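To make the link between mobility trends and contact intensities concrete, the following is a minimal sketch and not the model specification, which is given in Supplementary Text S3. It assumes, purely for illustration, that each age group's mobility trend is turned into a multiplier `trend ** beta` with a single hypothetical parameter `beta` shared by all age groups (mirroring the stated assumption that the mobility effect is the same across ages), and that a contact between two groups is scaled by the multipliers of both groups. The function name and the numbers are made up.

```python
def scale_contacts(baseline_contacts, mobility_trend, beta=1.0):
    """Scale a baseline contact matrix by age-specific mobility multipliers.

    baseline_contacts[i][j]: pre-pandemic contact intensity from age group i
    to age group j (square matrix).
    mobility_trend[i]: mobility of age group i relative to baseline.
    beta: hypothetical effect-size parameter, shared across age groups.
    """
    n = len(baseline_contacts)
    if any(len(row) != n for row in baseline_contacts) or len(mobility_trend) != n:
        raise ValueError("contact matrix must be square and match the trend vector")
    multipliers = [m ** beta for m in mobility_trend]
    # Assumption of this sketch: both the contacting and the contacted
    # group's multiplier reduce (or increase) the contact intensity.
    return [[multipliers[i] * baseline_contacts[i][j] * multipliers[j]
             for j in range(n)]
            for i in range(n)]


# Made-up two-group example: group 0 halves its mobility, group 1 is unchanged.
example_baseline = [[4.0, 1.0],
                    [1.0, 2.0]]
example_scaled = scale_contacts(example_baseline, [0.5, 1.0])
```

With these toy inputs, contacts within group 0 fall to a quarter of baseline (4.0 to 1.0), contacts between the two groups halve, and contacts within group 1 are unchanged.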
An important feature of SARS-CoV-2 transmission is that similarly to other coronaviruses but unlike pandemic influenza [15], susceptibility to SARS-CoV-2 infection increases with age [8, 16, 17]. Here, we used contact tracing data from Hunan province, China [8] to specify lower susceptibility to SARS-CoV-2 infection among children aged 0-9, and higher susceptibility among individuals aged 60+, when compared to the 10-59 age group. Previously infected individuals are assumed to be immune to re-infection within the 6-month analysis period, consistent with mounting evidence for sustained antibody responses to SARS-CoV-2 antigens [18].
In the United States, COVID-19 epidemic trajectories differ substantially across locations and over time, and apart from mobility trends, other factors such as adherence to social distancing guidelines and consistent face mask use contribute to the extent to which spread of SARS-CoV-2 is limited [19, 20]. Thus, and following earlier work [21], the model incorporates random effects in space and time to allow for unobserved factors that could modulate disease-relevant behaviour and contact patterns.
Age groups sustaining SARS-CoV-2 spread in the United States
To disentangle the contribution of different age groups to onward infection, we recorded age-specific, COVID-19-attributed mortality data from 40 U.S. states, the District of Columbia and New York City since March 15, 2020 (Supplementary Text S2 and [22]). Then, we fitted the contact-and-infection model in a Bayesian framework to the mobility trends and the mortality time series data from 35 U.S. states, the District of Columbia and New York City with at least 300 COVID-19-attributed deaths. Kansas was excluded due to atypical mobility trend data, giving a total of 5,579 observation days. The estimated disease dynamics closely reproduced the age-specific COVID-19 death counts (Supplementary Figure S3).
Figure 2 illustrates the model fits for New York City, Florida, California, and Arizona, showing that the inferred epidemic dynamics differed markedly across states and metropolitan areas. In New York City, the epidemic accelerated for at least 4 weeks since the 10th cumulative death and until age-specific reproduction numbers started to decline, resulting in an epidemic of large magnitude as shown through the estimated number of infectious individuals (Figure 2, mid column). Subsequently, we find that reproduction numbers for all age groups were controlled to below one except a two-week period in June (Figure 2, rightmost column), resulting in a steady decline of infectious individuals. In Florida, we estimate reproduction numbers remained above one for individuals aged 20-49, and in June increased substantially above one for individuals aged 10-64, resulting in a moderate initial decline in infectious individuals followed by a peak in the number of infectious individuals in late July, and subsequent decline. In California, we estimate that reproduction numbers for individuals aged 35-49 remained above one throughout the pandemic, and in June increased to above one for individuals aged 20-64, resulting in a similar but less marked increase in infectious individuals when compared to Florida. In Arizona, we estimate reproduction numbers remained above one for individuals aged 10-49, and fell below one in August, resulting in a sustained increase in infectious individuals until August, and subsequent decline. More detailed situation analyses for all locations are presented in Supplementary Text S7.
Figure 3 summarises the epidemic situation for all states and metropolitan areas evaluated. Children aged 0-9 and adults aged 65+ consistently had the lowest estimated reproduction numbers, and these typically remained below one since mobility trends began to decline in March 2020 (Supplementary Table S1), which is consistent with the low contact intensities from these age groups during school closure periods. By August 17, 2020, the estimated reproduction number across all locations evaluated was above one only for individuals aged 35-49 (1.10 [1.04-1.17]), and close to one for individuals aged 10-19 or 20-34 (Supplementary Table S2). This suggests that targeted interventions to these age groups, and in particular adults aged 35-49, could bring resurgent COVID-19 epidemics under control.
To quantify the contribution of each age group to onward transmission, we also considered the reconstructed transmission flows, because reproduction numbers estimate the number of secondary infections per infected individual, and the number of infectious individuals varies by age as a result of age-specific susceptibility gradients and age-specific contact exposures. Cumulating over time and across all locations evaluated, we estimate that the percent contribution to onward spread was 35.4% [34.2%-36.5%] from individuals aged 35-49, compared to 1.3% [0.8%-2.0%] from individuals aged 0-9, 10.1% [9.2%-11.0%] from individuals aged 10-19, 28.3% [26.9%-29.5%] from individuals aged 20-34, 18.6% [18.1%-19.2%] from individuals aged 50-64, 5.5% [3.7%-8.1%] from individuals aged 65-79, and 0.6% [0.4%-0.9%] from individuals aged 80+ (Table 1). Supplementary Figure S4 compares the contributions of each age group to SARS-CoV-2 transmission against the population age composition in each state. Over time, the model estimates that the mean age of new SARS-CoV-2 infections has been remarkably constant, showing that shifts in age-specific transmission dynamics are not required to explain heterogeneous and resurgent disease dynamics across the United States (Supplementary Figures S5 and S6).
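The distinction drawn above between reproduction numbers (per infectious individual) and contributions to onward spread (shares of all transmission flows) can be illustrated with a small sketch. The flow matrix, the counts of infectious individuals, and the function names are hypothetical, and the per-case quantity is a simplification that ignores the timing of infections; this is not the estimation procedure of the study.

```python
def source_contributions(flows):
    """Percent of all transmissions that originate from each source age group.

    flows[i][j]: number of infections transmitted from age group i to age group j.
    """
    row_totals = [sum(row) for row in flows]
    total = sum(row_totals)
    if total <= 0:
        raise ValueError("flow matrix must contain at least one transmission")
    return [100.0 * r / total for r in row_totals]


def secondary_infections_per_case(flows, infectious):
    """Crude per-case analogue of a reproduction number for each source group."""
    if any(n <= 0 for n in infectious):
        raise ValueError("each group needs a positive number of infectious individuals")
    return [sum(row) / n for row, n in zip(flows, infectious)]


# Made-up example with two age groups.
example_flows = [[1, 1],   # group 0 caused 2 infections in total
                 [3, 5]]   # group 1 caused 8 infections in total
example_infectious = [1, 8]

example_shares = source_contributions(example_flows)
example_per_case = secondary_infections_per_case(example_flows, example_infectious)
```

In this toy example group 0 has the higher number of secondary infections per case (2.0 versus 1.0) but accounts for only 20% of all transmissions, because it contains far fewer infectious individuals than group 1.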
School opening scenarios
The epidemic situation could change as schools re-open across the United States in August and September, 2020, especially in areas with resurgent, community-wide transmission primarily from adults. Re-opening kindergartens and elementary schools for children aged 0-11 is a national priority [23]. We thus focused on school opening scenarios in which children aged 0-11 return to engage in typical contact patterns with their peers and older individuals, while mobility levels, reproduction numbers, and the transmission potential of all other age groups were kept fixed as inferred by the end of August 2020 for the forecast period. We assumed disease transmission from and to children aged 0-11 is reduced by 50% due to face mask use and other non-pharmaceutical interventions [19], considering also the range 0%-80%. The scenarios were evaluated over 90 days and contrasted to continued school closure scenarios. Across all 37 states and metropolitan areas evaluated, we estimate by November 24, 2020 a 253.7% [199.3%-366.9%] increase in infections among children aged 0-11, and 24 [13, 42] excess COVID-19 attributable deaths among children aged 0-11, resulting in pediatric COVID-19 attributable mortality figures that are similar to pediatric influenza-like mortality (Table S4). The forecasts further estimate 6,181 [3,286, 11,925] excess COVID-19 attributable deaths in the total population, which is a 12.6% [7.4%-22.7%] increase compared to the continued school closure scenario, by November 24, 2020 (Table S4). In the central analysis, the predicted excess COVID-19 attributable deaths are concentrated in areas with resurgent epidemics, most notably Texas, California and Florida, and few additional COVID-19 attributable deaths are predicted in areas where reproduction numbers from individuals aged 20-49 are below one or close to one (Figure S7).
We emphasise that the predictions depend on the assumed level of transmission reductions in kindergartens and elementary schools, with no substantial increases in COVID-19 deaths when transmission from and to children aged 0-11 is reduced by 66% or more, and substantial increases in COVID-19 attributable deaths in most states and metropolitan areas when transmission from and to children aged 0-11 is not reduced due to precautionary measures (Figures S8-S10, Tables S5-S7).
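The school opening results are reported both as excess deaths and as a percent increase relative to the continued school closure scenario. The arithmetic linking the two can be sketched as follows, using made-up death counts rather than any figure from Table S4. Note that the study reports posterior summaries, which need not combine exactly in this way.

```python
def excess_deaths(deaths_reopening, deaths_closure):
    """Additional deaths under school re-opening relative to continued closure."""
    return deaths_reopening - deaths_closure


def percent_increase(deaths_reopening, deaths_closure):
    """Excess deaths expressed as a percentage of the continued-closure count."""
    if deaths_closure <= 0:
        raise ValueError("the comparison scenario needs a positive death count")
    return 100.0 * excess_deaths(deaths_reopening, deaths_closure) / deaths_closure


# Hypothetical counts over a 90-day forecast period, for illustration only.
example_closure = 1000
example_reopening = 1126
example_excess = excess_deaths(example_reopening, example_closure)
example_pct = percent_increase(example_reopening, example_closure)
```

With these toy counts, re-opening adds 126 deaths, which corresponds to an increase of about 12.6% over the continued closure scenario.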
Limitations
The findings of this study need to be considered in the context of the following limitations. First, we rely on limited data from two contact surveys performed in the United Kingdom and China to characterise contact patterns from and to younger individuals during school closure periods [9, 8]. We explored the impact of higher inter-generational contact intensities involving children during school closure periods, and in these analyses the estimated contribution of children aged 0-9 to onward spread until August 2020 remained below 2% (Supplementary Material S6). Second, while COVID-19 deaths are considered a more robust measure of SARS-CoV-2 spread than reported cases due to the high proportion of asymptomatic cases [24], epidemiologic models are sensitive to assumptions on the infection fatality ratio (IFR) that relates infections to deaths. We reconsidered a recent meta-analysis of estimates from large-scale seroprevalence studies [25], and found greater uncertainty associated with IFR estimates for individuals below age 40 (Supplementary Text S3). Using these uncertainty ranges in the model, we estimate greater COVID-19 burden among individuals below age 40, and we are able to match data from several sero-prevalence surveys conducted by the Centers for Disease Control and Prevention [26] (Supplementary Table S3 and Supplementary Text S5). Third, we cannot rule out that the observed time evolution of age-specific COVID-19 attributable deaths is also consistent with models that predict substantial age-shifts in transmission dynamics. However, in this case age-shifts in mobility levels are also expected, and we found no evidence for such changes in two independent mobility trend data sets. We further compared model outputs to the number of daily reported COVID-19 cases in each state and metropolitan area, and find that the ratio of estimated, actual cases to reported cases decreases substantially over time (Supplementary Text S7). 
This suggests that increased testing and increased awareness and test-seeking of individuals aged 20-49 could explain the observed shifts in the age composition of reported cases over the past months [4, 5, 3, 27], because infections among younger individuals are more frequently associated with no or mild symptoms than in older individuals [17, 28]. Fourth, the COVID-19 epidemic is more granular than considered in our spatial modelling approach. Substantial heterogeneity in disease transmission exists at county level [29], and our situation analyses by state and metropolitan areas need to be interpreted as averages. Fifth, the contact and infection model also fails to account for population structure other than age, such as household settings, where attack rates have been estimated to be substantially higher than in non-household settings [30]. It is possible that we over-estimated the impact of re-opening kindergartens and elementary schools on transmission dynamics. In line with this possibility, contact tracing in elementary schools and further data from countries that have re-opened schools have provided no evidence for substantial transmission in schools, nor increased community-level infection rates [23, 31], although most reports stem from locations with no resurgent epidemics.
3 Conclusions
This study provides evidence that the resurgent COVID-19 epidemics in the United States are driven by adults aged 20-49. By August 17, 2020, an estimated 62.7% [60.1%-65.1%] of SARS-CoV-2 infections originated from adults aged 20-49 whereas less than 2% originated from children aged 0-9. We find heterogeneity in age-specific reproduction numbers across locations, with highest reproduction numbers from individuals aged 35-49, followed by individuals aged 20-34. We find no evidence for substantial shifts in contact and transmission dynamics between age groups over time. This suggests that working adults who need to support themselves and their families have been driving the resurging epidemics in the United States. Re-opening kindergartens and elementary schools is essential, but is predicted to facilitate the spread of SARS-CoV-2 in areas with sustained community-wide transmission from adults. This study indicates that targeting interventions at adults aged 20-49 could bring resurgent epidemics under control, avert deaths, and facilitate the safe re-opening of schools.
4 Data
The national-level, aggregate mobility data used in this study are described in Supplementary Text S1. The age-specific COVID-19 attributable mortality data used in this study are described in Supplementary Text S2.
5 Methods
The contact-and-infection model and further methods are described in Supplementary Text S3.
6 Location-specific COVID-19 situation reports
Detailed situation reports for the 37 states and metropolitan areas evaluated in this study are in Supplementary Text S7.
7 Comparison to external contact data and COVID-19 seroprevalence data
To gain further insights into the model outputs, we reviewed data from contact surveys during the pandemic, and from several large-scale COVID-19 seroprevalence surveys in the United States. The model outputs are compared to the data from contact surveys in Supplementary Text S4, and to the COVID-19 seroprevalence survey data in Supplementary Text S5.
8 Sensitivity analyses
Sensitivity analyses are presented in Supplementary Text S6.
Data Availability
The COVID-19 mortality data used in this study are available on GitHub, https://github.com/ImperialCollegeLondon/US-covid19-agespecific-mortality-data, under the Creative Commons Attribution 4.0 International Public License. Code and further data are available on GitHub, https://github.com/ImperialCollegeLondon/covid19model, under the MIT License.
Contributors
OR conceived the study. AG, SM, SF, SB, NF, OR oversaw the study. MM, DH, SBe, ST, YC, McM, MH, HZ, ABe, OR oversaw and performed data collection. MM, ABl, XX, OR led the analysis. VCB, HC, SF, JIH, TM, AG, HJTU, MV, SW, SM contributed to the analysis. All authors discussed the results and contributed to the revision of the final manuscript.
Contributors, Imperial College COVID-19 Response Team
We would like to thank the Imperial College COVID-19 Response Team for their insightful comments, Kylie E C Ainslie, Marc Baguelin, Adhiratha Boonyasiri, Olivia Boyd, Lorenzo Cattarino, Laura V Cooper, Zulma Cucunubá, Gina Cuomo-Dannenburg, Bimandra Djaafara, Ilaria Dorigatti, Sabine L van Elsland, Richard FitzJohn, Katy A M Gaythorpe, Lily Geidelberg, William D. Green, Arran Hamlet, Wes Hinsley, Ben Jeffrey, Edward Knock, Daniel Laydon, Gemma Nedjati-Gilani, Pierre Nouvellet, Kris V Parag, Igor Siveroni, Hayley A Thompson, Robert Verity, Caroline E. Walters, Haowei Wang, Yuanrong Wang, Oliver J Watson, Peter Winskill, Charles Whittaker, Patrick GT Walker, Christl A. Donnelly, Lucy Okell, Sangeeta Bhatia, Nicholas F. Brazeau, Oliver D Eales, David Haw, Natsuko Imai, Elita Jauneikaite, John Lees, Andria Mousa, Daniela Olivera, Janetta Skarp, Lilith Whittles
Declaration of interests
SB acknowledges the National Institute for Health Research (NIHR) BRC Imperial College NHS Trust Infection and COVID themes, the Academy of Medical Sciences Springboard award and the Bill and Melinda Gates Foundation. OR reports grants from the Bill & Melinda Gates Foundation during the conduct of the study.
Supplementary Tables and Figures
9 Acknowledgements
This study was supported by the Imperial College COVID-19 Response Fund, the Imperial College Research Computing Service DOI:10.14469/hpc/2232, the Bill & Melinda Gates Foundation, and the EPSRC through the EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning at Imperial and Oxford, the UK Medical Research Council under a concordat with the UK Department for International Development, the NIHR Health Protection Research Unit in Modelling Methodology and Community Jameel. We would like to thank Microsoft and Amazon for providing cloud computing services.