Abstract
At the start of the COVID-19 pandemic, most US K-12 schools shutdown and millions of students began remote learning. By September 2020, little guidance had been provided to school districts to inform fall teaching. This indecision led to a variety of teaching postures within a given state. In this report we examine Ohio school districts in-depth, to address whether on-premises teaching impacted COVID-19 disease outcomes in that community. We observed that counties with on-premises teaching had more cumulative deaths at the end of fall semester than counties with predominantly online teaching. To provide a measure of disease progression, we developed an observational disease model and examined multiple possible confounders, such as population size, mobility, and demographics. Examination of micropolitan counties revealed that the progression of COVID-19 disease was faster during the fall semester in counties with predominantly on-premises teaching. The relationship between increased disease prevalence in counties with on-premises teaching was not related to deaths at the start of the fall semester, population size, or the mobility within that county. This research addresses the critical question whether on-premises schooling can impact the spread of epidemic and pandemic viruses and will help inform future public policy decisions on school openings.
Main text
The impact of school closures on community spread and mortality of COVID-19 is largely unknown. During the COVID-19 pandemic school posture has been thought to have a low impact on COVID-19 public health outcomes based on analysis from large-scale epidemiological studies1,2 and the low incidence of SARS-CoV-2 in school age children3. However, children are known to act as a vector for influenza virus transmission4 and given the lack of testing in asymptomatic individuals it is feasible that infections in kids have gone undiagnosed.
Each school district across the country chose their own teaching method with input from state governments. But the lack of federal or state mandates created scenarios where school districts within a single state had different teaching postures, either online (or virtual), hybrid format, or completely on-premises. The initial teaching mode for each school district across the US was captured by MCH Strategic Data (www.mchdata.com/covid19/schoolclosings), with verification occurring on a rolling basis throughout the Fall 2020 academic semester. Supplemental Fig 1A highlights the heterogeneity of teaching methods across the US school districts. The diversity of teaching methods for each state can be leveraged to examine the relationship between teaching posture, on-premises, hybrid, or online and its impact on COVID-19 deaths within the community.
In this study, we analyzed the impact of school teaching posture on COVID-19 deaths in Ohio. Ohio was selected for its small size with uniform geography and climate as well as two other important reasons. First, multiple public health interventions including mask orders, restaurant/bar restrictions, and gathering ban sizes were declared by the Governor and implemented statewide (Supplemental Fig 2, www.PhightCOVID.org). The statewide implementation of interventions allow for direct comparison between counties without the confounder of public-health mandate disparity. Second, Ohio school districts had diverse teaching postures (Supplemental Fig 1B), which allows for comparison between different teaching methods. The vast majority of counties within Ohio have similar population densities, but different COVID-19 death incidences based on population size (Supplemental Fig 3). Each county within Ohio has multiple school districts within it, although not all school districts within a county had the same teaching posture (Fig 1). Each county was categorized based on the largest proportion of students in a specific teaching mode. This strategy led to 16 ‘on-premises’, 11 ‘online only’, and 59 ‘hybrid model’ counties (Fig 1D).
The cumulative COVID-19 deaths per 1,000 individuals was examined based on the primary teaching method in that county (Fig 2). The deaths in counties where the primary teaching mode was online or hybrid were higher in the summer months than those with on-premises teaching (Fig 2A). The levels of COVID-19 deaths in the community likely influenced the teaching method decision for the coming academic year within a given county. However, during the course of the fall term and after, there is a larger increase in the incidence of COVID-19 death in counties with predominate on-premises teaching compared to online teaching (Fig 2A). A comparison of all counties based on teaching posture and the cumulative deaths per 1,000 individuals throughout the fall semester indicate a significant difference between on-premises teaching and hybrid/online teaching (Fig 2C). This is a striking observation, but may be misleading given potential confounders within each county, such as the differences in deaths prior to the start of school that may have influenced not only the teaching method implemented but also the subsequent trajectory of COVID cases. To further explore the relationship between teaching posture and the counties’ disease state, we developed a generative model to better characterize the instantaneous progression of the disease.
Our observational model describing the disease state within a given community is explained in detail in the Supplemental information. Briefly, observed daily deaths in each county are expressed as a function of the latent variable of new infections on day t (It) and the equation It ≈ exp(Bt) It−1 models infections over time, which provides an easily interpretable value describing the growth of the disease. The growth factor is expressed as exp(Bt) to ensure a positive value to fit the model to observed daily COVID-19 death data, where we can then obtain estimates of Bt, herein referred to as ‘exponential growth coefficient’. A positive Bt implies that the number of new infections on day t exceeds the number of new infections on day t-1, indicating that the disease in that county is worsening. Conversely, a negative Bt suggests that the county’s disease state is improving. The exponential growth coefficients for all counties with a given teaching method were calculated and depicted via smoothed curves in Fig 2B. Curves above the y=0 line indicate time periods where Bt is positive, and thus exp(Bt) > 1 and the disease state is worsening. When the curve dips below y=0, Bt is negative, exp(Bt) <1, and the disease state is getting better.
Comparing the exponential growth coefficient for counties based on teaching method suggests that counties with online or hybrid teaching postures in the Fall of 2020 had expanding COVID-19 disease growth in March-April of 2020, (Fig 2B). However, at the start of the school year (Aug 2020) the pandemic in these counties was shrinking (Fig 2B, curves below y=0 dashed line) compared to those with on-premises teaching. In Oct of 2020, the three groups of counties were experiencing a similar growth in the pandemic. Counties with on-premises teaching had an increase in the growth of the pandemic earlier than the other counties, but the maximum growth in the three groups, all in mid-November, were similar (Fig 2B). Further exploration of the maximum disease growth (max of B1) for each county per teaching posture in the fall semester (Fig 2D) suggests that online counties had worse disease than on-premises counties. This observation is seemingly in contrast to analysis of cumulative deaths per 1,000 individuals (Fig 2C), but cumulative death measures all deaths from the entire fall semester where the max Bt represents the most severe growth of the disease at a single point in time. In addition, the disease state may also be influenced by confounders such as population size, mobility, and age of county residents such that comparison of the aggregated counties distorts the disease state present within individual counties. Thus exploring the variation in possible confounders and refining the analysis to counties with similar mobility, populations, or demographics are needed to better assess the effects of teaching posture on COVID-19 disease. Below we investigate the effects population density and mobility as potential confounders, but other covariates including county demographics and intervention adherence could impact the trajectory of COVID-19 deaths
To explore the role of mobility on COVID-19 disease during the Fall 2020 semester we utilized cell phone data from COVIDcast (https://delphi.cmu.edu/covidcast/). The COVIDcast data was used to calculate the proportion of cellphones within a given county that are away from their home zipcode for >6 hours a day, utilized as a metric for general mobility within a county. We observed that mobility in on-premises counties increased during the school term (yellow shaded box), likely due to the ability of parents to go to work places in person; whereas in online and hybrid counties, many parents may have to coordinate child care during the work day (Supplemental Fig 4). To incorporate increased mobility into the analysis of COVID-19 deaths and teaching method, the average mobility during the fall semester was plotted against the max disease growth coefficient during the fall semester per county (Fig 3A). Surprisingly, no positive correlation was observed between mobility and disease state, suggesting that the increased mobility, as measured by cell phone data, of the population may not be a strong driving force in disease outcomes. Rather mobility may be a consequence of the current disease state within a community.
The max disease growth coefficient during the Fall semester for each county was compared to the log population density for that county using linear regression for each teaching method group (Fig 3B). Consistent with other reports, we observed a correlation with population density and higher COVID-19 disease state3,5. Counties with higher population densities were primarily online or hybrid teaching methods. Looking at counties with log population density in the 4-5 range, the three teaching postures are represented and the maximum disease growth in on-premises teaching counties had a higher slope compared to the other teaching modes. This means that population density had a larger impact on COVID-19 deaths in the counties that chose on-premises teaching posture. However, this effect could be due to confounders rather than to teaching posture. For example, comparison of population age found that on-premises counties had more individuals >65, and based on the age disparity of COVID-19 deaths6 it is feasible that this would influence the increase in COVID-19 deaths (Supplemental Fig5). In addition, the proportion of uninsured individuals was higher in online counties compared to counties with primarily hybrid and on-premises, which could impact access to healthcare and/or behavior patterns (Supplemental Fig 5B).
In order to explore county demographics, we utilized the National Center for Health Statistics Urban-Rural classification scheme for counties based on the Office of Management and Budget delineation of metropolitan statistical areas and micropolitan statistical areas7. As expected, large and medium metropolitan areas were primarily online or hybrid (Supplemental Fig 6). Note that the large central metropolitan areas were largely online only while the suburbs to these large metropolitans (defined as large fringe metropolitans) tended to have more hybrid learning modes. Importantly, micropolitan counties, which are defined as urban areas with a population between 10,000-50,000, were distributed across teaching modes: on-premises (7), hybrid (23), and online (3) teaching counties (Fig 4A),which allows comparison of COVID-19 dynamics for the three teaching modes within a homogeneous group of counties. A similar analysis of the max disease growth coefficient restricted to micropolitan counties suggests that the epidemic got worse in on-premises counties (Fig 4B).
Comparing the impact of teaching posture on max growth between all counties in Fig 2D and only micropolitan counties in Fig 4B reveals a striking difference. Fig 2D (max growth in all counties) suggests that online teaching counties had more COVID-19 disease during the Fall semester; however, in Fig 4B (only micropolitan counties) on-premises school posture is associated with higher disease spread. The apparent contradiction between Figs 2D and 4B is known as Simpson’s paradox8 and in this instance is due to the confounder of urban type. Urban type is associated with population density which may contribute to COVID-19 death incidence (higher incidence in more populous areas), adherence and interest in non-pharmaceutical interventions (NPI) compliance, and teaching posture. These two plots highlight the importance of accounting for confounders at multiple levels and strengthens the observations made by comparing micropolitan counties.
Another consideration, is that maximum disease growth during the fall semester occurred after the 6 week mark closer to Thanksgiving and may be influenced by other factors in addition to teaching posture. To narrow down the window on COVID-19 disease growth in the micropolitan counties, we examined the acceleration of COVID-19 disease at two different short time intervals at the beginning of the semester when covariates, other than school starting, are unlikely to have an impact. In this way, differences in disease growth observed can be attributed to school posture with reasonable confidence. First, we use the slope (or change in disease growth) over the first three weeks of the fall semester (week 0 – 3; first section of shaded box in Fig 2B) to explore differences in the disease state independent of teaching method. All micropolitan counties show similar behavior for change in disease for the first three weeks independent of teaching posture (Fig 5A). In contrast, the change in disease growth during the next three weeks (week 3-6) of the fall semester was consistently higher in counties with on-premises teaching compared to those with online or hybrid teaching (Fig 5B). That is, the acceleration of the disease was higher in counties with on-premises teaching compared to counties with hybrid and online-only teaching postures. This result is independent of mobility in each of these counties, as depicted by the size and color saturation in the data points in Fig 5A and B.
Fig 5B reveals some variation in COVID-19 disease growth during the fall semester for on-premises micropolitan counties, where some counties had much more severe change in growth compared to other counties (such as Defiance). In contrast counties such as Mercer and Champaign had COVID-19 disease growths similar to counties with online or hybrid learning (Fig 5B). Both Champaign and Defiance counties contain five school districts with similar percentages of students in hybrid (46-49%) and >50% in on-premises learning modes (Supplemental Table 1). In contrast, Mercer county had 89% of school age children in on-premises teaching but had one of the lowest COVID-19 growths during 3 to 6 weeks after the fall semester (Supplemental Table 1 and Fig 4B). Thus, the overall percentage of students within a county in on-premises learning does not directly correlate with an acceleration in disease growth. However, changes of NPI compliance and adherence to interventions could influence disease acceleration and was not captured in this study.
Ohio was chosen based on the implementation of universal state-wide interventions on masks orders, dining restrictions, and gathering bans. However, confounders such as adherence to interventions may be county specific and could not be captured within our data sets. In addition, it is unclear if students are functioning as the vector and spreading COVID-19 or whether teachers and school staff are the vehicles of SARS-CoV-2. Additionally, we cannot rule out that behaviors, other than mobility, in communities with predominantly on-premises teaching were different than those with predominantly online learning. These differences may also contribute to the spread of COVID-19. A recent study using self-reporting surveys conducted by the CMU Delphi group revealed that households with students in on-premises teaching had a higher odds ratio of COVID-19 like illness9. However, self-reported data may be inherently biased, and more information regarding infections in school age children is needed.
The analysis of Ohio counties based on teaching methods reveals a positive relationship between COVID-19 disease growth and on-premises teaching. Interestingly, counties with hybrid learning, while having similar cumulative deaths to on-premises counties at the end of the fall semester (Fig 1A), had better disease outcomes in micropolitan counties based on the change in disease growth during the fall semester (Fig 5B). Thus mitigation strategies such as decreased occupancy could impact the community burden of COVID-19. The increase of COVID-19 disease burden in on-premises counties was unlikely due to lack of general interventions as practices of mask usage among teachers and students was implemented by all school districts within Ohio in accordance with the Ohio Department of Health Director’s order effective Aug 14, 2020 (Supplemental material).
Importantly, Lesser et al9 used survey data to capture implementation of intervention strategies from masking, distancing, and extracurricular sports to suggest that some NPIs are more effective than others. In Ohio the vast majority of on-premises micropolitan Ohio school districts implemented similar interventions such as masking and distancing (Supplemental material). However, few details regarding air exchange and ventilation were available on the school district websites. In addition, it is unclear whether schools required health screenings to ensure that symptomatic children were kept home, this strategy could also help limit transmission.
Despite the positive correlation of on-premises education and COVID-19 disease growth in Ohio counties, we remain optimistic that schools can and should return with on-premises learning in the Fall of 2021. To mitigate transmission risk, we recommend implementation of interventions detailed in10. Briefly, we encourage the continued use of masks, increased air ventilation, air purifiers, as well as reduced occupancy within individual classrooms when feasible. In addition, vaccination of both teachers and pediatric populations will increase the barriers to transmission, which should increase the safety of reopening schools for on-premises learning. Ohio in particular has been creative with incentivizing vaccination among its residents with a vaccine lottery, where vaccines are registered for a monetary award, and at the time of writing this article ∼ 45% of all residents were fully vaccinated.
Methods
US school district COVID-19 interventions and status data
Data on the teaching status and COVID-19 interventions of each school district implemented in the Fall 2020 was obtained from MCH Strategic Data. COVID-19 impact: school district status—updates for fall 2020, accessed on Feb 2, 2021 (www.mchdata.com/covid19/schoolclosings). MCH data surveyed school districts around the country in Fall of 2020. Approximately 70% of school districts responded and include information of 35 different variables including the school district name, county, and teaching method (Full on premises, Hybrid/partial, remote learning only, unknown/undecided, Other, pending). Most counties in Ohio had information on the school districts within that county, the exception was Harrison and Vinton County, which were excluded from our analysis, because their school status information was missing. The data for school policies is at the individual school level. However, there can be multiple schools in a school district and multiple school districts in a county, which led us to aggregated the information about the schools to the county level. The variable of most interested in is “Majority_teaching_posture”; which was calculated based on the proportion of students in each teaching mode within a given county.
COVID-19 death data
Death data per county per day was obtained from data compiled by the Johns Hopkins Center for Systems Science and Engineering (JHU-CSSE) COVID-19 resource repository available at: github.com/CSSEGISandData/COVID-19.
Mobility data
Mobility data was obtained from covidcast program within the DELPHI group at Carnegie Melon University (https://delphi.cmu.edu/covidcast/) in the SafeGraph COVIDcast epidata API for assessing social distancing metrics. Data was updated until April 19, 2021. This data source uses anonymized location data from mobile phones to measure mobility. The data is provided for individual census block groups with differential privacy at aggregated county or state levels. The data set contains 9 variables regarding mobility within a county or state. In our analysis variable ‘full_time_work_prop’ was used to infer the mobility within Ohio counties, and provides the fraction of mobile devices that spent more than 6 hours at a location other than their home zipcode during the daytime. The earliest data available is January 1, 2019. Safegraphs’s measures mobility each day which will produce biases on the weekend, where there is considerable more mobility, than on the weekday.
County Level Covariates
Data for county population was obtained in the R_package ‘USmap’ which contains 2015 census data. Other county level covariates such as percent of 65+ population and uninsured population was obtained by the CDC Data Tracker “Trends in COVID-19 Cases and Deaths in the United States, by County-level Population Factors” available (https://covid.cdc.gov/covid-data-tracker/#pop-factors_7daynewcases).
Data Availability and Code Availability
Details on model describing COVID-19 disease is included in Supplemental material. Our analysis is based on four publicly available datasets listed below:
Ohio K12 data Data source: https://www.mchdata.com/covid19/schoolclosings
County-level daily deaths and cases from COVID-19 in Ohio State Source: John Hopkins COVID Resource (https://github.com/CSSEGISandData/COVID-19)
County-level population mobility (SafeGraph data from COVID-CAST API created by CMI DELPHI group). Data source: https://delphi.cmu.edu/covidcast/
County-level demographic information of Ohio State (Ohio profile data from CDC). Data source: https://data.cdc.gov and https://covid.cdc.gov/covid-data-tracker/#pop-factors_7daynewcases
Cleaned and combined data sets, as well as the code for our generative model is available at https://github.com/alexazhu/36726-PIGHT-COVID.git
Data Availability
All data is publicly available at these sources: 1. Ohio K12 data Data source: https://www.mchdata.com/covid19/schoolclosings 2. County-level daily deaths and cases from COVID-19 in Ohio State Source: John Hopkins COVID Resource (https://github.com/CSSEGISandData/COVID-19) 3. County-level population mobility (SafeGraph data from COVID-CAST API created by CMI DELPHI group). Data source: https://delphi.cmu.edu/covidcast/ 4. County-level demographic information of Ohio State (Ohio profile data from CDC). Data source: https://data.cdc.gov and https://covid.cdc.gov/covid-data-tracker/#pop-factors_7daynewcases All our code and repositories are available in the data availability link below.
Author contributions
CE, YL, ZY, ZZ, VV performed the data analysis, SD and AA collected data, JL, RN, VV, SL conceived of the project, RN, VV, SSL wrote and edited the manuscript.
Supplemental Figures
Acknowledgements
We thank Dr. Carter Mecher of the Public Health Company for useful discussions and members of the Lakdawala lab for critical review of this manuscript. SSL is funded by National Institute of Allergy and Infectious Diseases (CEIRS HHSN272201400007C).