ABSTRACT
Background In the months following the global spread of SARS-CoV-2, the lack of effective pharmaceutical interventions led to widespread implementation of behavioral interventions aimed at reducing contacts and transmission. In the US, state and local governments introduced and enforced the bulk of interventions, including university closures. As universities closed, student departures decreased the total population size of college towns while state-level interventions decreased contacts among remaining residents. Though the pandemic continues without pharmaceutical interventions, businesses have begun to reopen, and many universities have resumed operations. These actions have increased contacts and population sizes in college towns. Monitoring movement to implement adaptive policies will be critical for outbreak management.
Methods We use publicly available remotely-sensed nighttime lights and traffic cameras to measure the impact of restriction policies on movement and activities in the university town of State College, and the surrounding areas of Centre County, Pennsylvania, USA.
Results At the county level, nighttime radiance did not differ significantly across restriction phases and largely reflected seasonal fluctuations seen in previous years. Throughout the county, traffic volumes were lowest during the most severe period of restrictions (‘Red phase’ in Pennsylvania). As restrictions eased, traffic volumes grew, indicating increased movement within and between population centers. We show that real-time, publicly available traffic data captured behavioral responses to, and compliance with, different restriction phases. We also demonstrate that these increases in activity levels precede increases in reported COVID-19 cases.
Discussion Passively collected data can measure population-level movement in response to restrictions and changes in these measured movements are reflected in observed SARS-CoV-2 transmission. Measuring these changes in movements and contacts in near real time can inform local adaptive interventions to curtail outbreaks.
MAIN TEXT
Introduction
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) was first detected in Wuhan, China, in December 2019.1 Global spread by March 2020 resulted in a pandemic (Figure 1).2 In humans, it causes the respiratory disease known as Coronavirus disease 2019, or COVID-19, and its clinical presentation includes a variety of symptoms, ranging from mild to severe or fatal.3 There are currently no pharmaceutical interventions available for disease prevention or treatment. Outbreak management is focused on behavioral interventions to reduce transmission.4,5
Throughout the pandemic, behavioral intervention policies have varied widely between and within countries. China enforced severe interventions early, closing schools, workplaces, roads, and public transit, and requiring mandatory quarantine of uninfected people across entire provinces.6 These reductions in contacts between individuals were effective and interrupted transmission in Hubei Province.7 Other countries have imposed less strict lockdowns, with many relying on recommendations instead of legally enforceable restrictions.
Most universities in the United States cancelled face-to-face instruction mid-semester in the spring of 2020. Closures sent many students to family homes, emptying campuses and reducing population size and density in college towns. However, almost half of US universities have returned or will return to some degree of in-person teaching in the fall of 2020, with many more planning a return in the spring of 2021.8 University reopenings increase population sizes as students return to campus, and are compounded by increased mixing as activities resume. College campuses are fundamentally designed to facilitate interactions, which must now be deliberately and drastically curtailed to prevent SARS-CoV-2 transmission.9 Some universities rapidly pivoted to increase online engagement in response to large numbers of reported cases shortly after reopening. Initial modelling suggests that effective outbreak management on college campuses will require high compliance and widespread adoption of behavioral restrictions.10 Testing capacity is still insufficient in most settings and delayed results prevent effective contact tracing.11 It is essential for universities and surrounding communities to be able to measure the increases in population size and changes in human movement patterns as campuses reopen. Measuring movement and contacts, which precede reported cases by about two weeks3,12, helps create early warnings, rapid adaptive control strategies, and preventative public health messaging to curb outbreaks.
There are many ways to measure human populations and movement to inform disease transmission. Data resolution varies across spatiotemporal scales, from targeted individual surveys and censuses to the passive surveillance of satellites and mobile phone usage data. Epidemiological efforts to estimate human movement and contacts have included tracking currency,13 commercial air traffic to model long distance flows,14 anthropogenic illumination to quantify seasonal population changes,15 and mobile phones for individual mobility traces.4 Privately owned mobile device data are anonymized and confined to national boundaries16 and, due to their confidential nature, cannot be shared with policy makers. Aggregate device data obtained through third party vendors can be expensive, rely on opaque algorithms, and are proprietary. To overcome limitations on data sharing and increase replicability, we use publicly available, passively collected satellite data and traffic cameras to measure indicators of human movement.
State College, in Centre County, is home to The Pennsylvania State University’s University Park campus (referred to as Penn State and PSU). It is the largest campus of the state’s largest public institution of higher education. In Pennsylvania, policies to minimize movement and local contacts were first implemented at the county level in late March (Figure 1, Table 1). We assessed changes in population-level characteristics using remotely-sensed nighttime lights, both across the three restriction phases (Table 1) and relative to previous years. To measure movement across restriction phases, we applied machine learning algorithms to ongoing traffic camera data collection.
We demonstrate the use of open access data to monitor near-real-time behavioral changes related to institutional and governmental COVID-19 response policies and outbreak trajectories through July 2020. Our methods are broadly applicable across university towns and the approaches presented here can be adapted to inform policies and messaging going forward.
Methods
Study area
Centre County is located in central Pennsylvania and has an estimated population of 162,000 (Fig 1).17 Approximately 40,000 of these residents are undergraduate students enrolled at Penn State.18 The majority of undergraduates leave during school holidays, resulting in seasonal population fluctuations.
Study period
We used radiance data, traffic cameras, and epidemiological records to measure activity and responses to restriction policies focusing on the period from February 14 to July 2, 2020 (Figure 1). We divided our surveillance efforts into six temporal phases, which align with events and policies that were expected to change movement and behavior at the University and across the county (Table 1, Fig 1):
Baseline (Feb 14 – March 5): before restrictions were in effect and while undergraduate students were on campus
Population Decline (March 6 – March 17): before local restrictions were in effect, after students left campus for spring break, and encompassing March 16th transition to online instruction
Local Restrictions (March 18-March 27): Mandated closure of all non-essential businesses (no county-wide restrictions)
County Red (March 28-May 7): red phase restrictions (Table 1)
County Yellow (May 8 to May 28): yellow phase restrictions
County Green (May 29 to July 2): green phase restrictions
Radiance data covered the study period from Feb 14 to July 3, traffic data collection spanned April 26 to July 2, 2020, and epidemiological records covered Feb 14 to July 16.
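The phase boundaries listed above can be encoded as a simple date lookup for labelling daily observations. The following is a minimal Python sketch under that assumption, not the code used in this study; the names `PHASES` and `phase_for` are hypothetical.

```python
from datetime import date

# Phase boundaries for 2020 as listed in the Study period section (inclusive).
# PHASES and phase_for are illustrative names, not taken from the study code.
PHASES = [
    ("Baseline", date(2020, 2, 14), date(2020, 3, 5)),
    ("Population Decline", date(2020, 3, 6), date(2020, 3, 17)),
    ("Local Restrictions", date(2020, 3, 18), date(2020, 3, 27)),
    ("County Red", date(2020, 3, 28), date(2020, 5, 7)),
    ("County Yellow", date(2020, 5, 8), date(2020, 5, 28)),
    ("County Green", date(2020, 5, 29), date(2020, 7, 2)),
]


def phase_for(day):
    """Return the name of the phase containing `day`, or None if outside all phases."""
    for name, start, end in PHASES:
        if start <= day <= end:
            return name
    return None
```

For previous years, the same lookup would need year-specific boundaries aligned to the academic calendar (Table S1).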
Radiance Data
We acquired nighttime radiance data from NASA’s Black Marble calibrated product.19 Black Marble quantifies daily anthropogenic nighttime radiance at 500-meter spatial resolution. The radiance data are captured by the day night band (DNB) of the Visible Infrared Imaging Radiometer Suite (VIIRS). They have been collected and publicly available since 2012.19 These data reveal information on changes in population-level distribution, abundance, movement, behavior, and economic activity.19 Here, we use images from February 14 to July 2 of the pandemic year (2020, Fig. 1) and compare them to typical seasonal variation using images collected across the same academic calendars from 2016 to 2019 (Table S1).20 We downloaded NASA VNP46A1 Daily At-Sensor Top of Atmosphere (TOA) Nighttime Radiance products as Hierarchical Data Format 5 (HDF5) files for central Pennsylvania. We extracted radiance values from each HDF5 file and saved the extracted values to GeoTiff files at a resolution of 351 m x 463 m. Only cloud-free pixels were used in the radiance analyses and pixels were masked using Black Marble’s cloud cover values. Acquisition and pre-processing operations were completed in Python version 3·7·6 (SI).21
Radiance rasters were cropped to Centre County boundaries to match the spatial extent of epidemiological and traffic data (Fig 2A). Gridded population density (1 km2) estimates for 202022 were used to identify pixels with fewer than 5 people/ km2 (i.e. state forests, golf courses; Fig 2A; S1). The remainder of the county was divided into population quantiles that each represented approximately 20% of the population (Table S2). Images contaminated by lunar radiance (those captured on the day of a full moon, one day prior, or up to three days following) were removed (Fig S2).
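The lunar exclusion window can be expressed as a date filter. The sketch below is illustrative only: it assumes full-moon dates are supplied by the user from an external almanac, and the function names `lunar_contaminated` and `drop_contaminated` are hypothetical.

```python
from datetime import date, timedelta


def lunar_contaminated(full_moons, days_before=1, days_after=3):
    """Return the set of dates from `days_before` before to `days_after` after each full moon."""
    excluded = set()
    for full_moon in full_moons:
        for offset in range(-days_before, days_after + 1):
            excluded.add(full_moon + timedelta(days=offset))
    return excluded


def drop_contaminated(image_dates, full_moons):
    """Keep only image dates that fall outside every lunar exclusion window."""
    excluded = lunar_contaminated(full_moons)
    return [d for d in image_dates if d not in excluded]


# Illustrative input: one full-moon date and a week of nightly images.
images = [date(2020, 4, 6) + timedelta(days=i) for i in range(7)]
kept = drop_contaminated(images, [date(2020, 4, 8)])
```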
We calculated mean radiance values for each of the six temporal phases and each population quantile from 2016–2020. PSU events (i.e. Spring Break) do not occur on the same calendar dates each year, so the dates were adjusted for each year to overlap with corresponding academic periods and cover the same number of days in previous years (Table S1). Previous years were included to quantify typical fluctuations in radiance driven by fluxes in the resident student population and seasonal influences of snow albedo during winter and occlusion from emergent vegetation in spring. The distributions of radiance values were visualized as density plots to illustrate the frequency and range of radiance values observed in each phase, population density, and year.
Traffic Cameras
We collected data from 19 traffic cameras across Centre County to quantify numbers of vehicles on roads beginning on April 27, 2020 (Fig 3A). Twelve of these cameras were on interstates, state highways, or other roads that link towns in Centre County, which we refer to as ‘connector’ roads. The remaining seven cameras are on ‘internal’ roads for travel within towns. Cameras produce 24-hour live streams accessible online but are not archived. We captured and stored images from these live streams every 20 seconds. Using the Python package cvlib as a high-level interface to OpenCV23,24 and Google’s open source TensorFlow software stack,25 we identified and counted vehicles in each image. Parked vehicles were excluded. The number of vehicles captured by each camera was standardized by the images captured within that hour (SI).
We fit a series of generalized additive models (GAMs) to these standardized hourly counts. The effect of hour of the day in local time was modelled as a cyclic cubic regression spline to account for the continuity of the variable so the same smoother operates at the 12:00 am ‘start’ of a day and the 11:00 pm ‘end’ of a day (Fig S3). We included the following as predictor variables, with interactions between variables: day of week, weekend (binary variable), restriction phase, camera identity, number of lanes visible in camera image, and road type (internal or connector road). Multivariate GAMs were implemented in the package mgcv in R version 3·6·2.26,27 All cameras had some infrequent and short duration gaps in image acquisition (Fig S4). Missing hourly data were predicted using the best fit GAM. The combined data (both observed and predicted) were used to estimate vehicle traffic volumes.
COVID-19 diagnostic testing results
We acquired total daily, county-level confirmed cases of COVID-19 in Centre County and the surrounding counties, as reported by the Pennsylvania Department of Health.28 Cases are confirmed using the CDC’s diagnostic tests to detect active infections using reverse transcriptase polymerase chain reaction (RT-PCR). When linking epidemiological data to movement data, we assumed a two-week lag between a transmission event and a reported case. This incorporates a 5-8 day12 incubation period and 5.5-10.93 day3 delay between the onset of symptoms and case confirmation. The case data included here extend to July 16, 2020, two weeks after movement data collection ends.
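To compare cases with movement on a common time axis, reported cases can be re-indexed by their estimated transmission date. The following is a minimal sketch assuming a fixed 14-day lag; the names `LAG_DAYS` and `shift_to_transmission` are hypothetical and a fixed lag is a simplification of the ranges given above.

```python
from datetime import date, timedelta

LAG_DAYS = 14  # approximate transmission-to-report lag assumed in the text


def shift_to_transmission(cases_by_report_date, lag_days=LAG_DAYS):
    """Re-index reported case counts by estimated transmission date."""
    return {
        report_date - timedelta(days=lag_days): count
        for report_date, count in cases_by_report_date.items()
    }


# Illustrative counts, not study data.
shifted = shift_to_transmission({date(2020, 7, 16): 5, date(2020, 7, 15): 3})
```

Under this assumption, cases reported on July 16 are attributed to transmission on July 2, the last day of movement data collection.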
Results
Radiance fluctuations throughout the pandemic period compared to previous years
During 2020, the majority (86·18%) of the study area has low population density (5-99 people km−2), which accounts for 21·98% of the total population in Centre County (Fig 2A, Table S2). Low population density areas have low radiance values (< 2 nW cm−2 sr−1) across all years and little variation between phases (Fig 2B).
The highest density areas (>2000 people km−2) encompass a similar percentage of the total population (20·97%) in a much smaller percentage of the study area (0·69%). In these highest density areas, nighttime radiance was more variable between years and across phases of all years, not just 2020. In Baseline and during the Population Decline phase (Feb 14-March 17), we observed the greatest variation in radiance values, which we expect to relate to snowfall in these respective years (Table S4). Previous years included a proportion of pixels with high radiance values, above 100 nW cm−2 sr−1. During the red, yellow, and green restriction phases, we observed lower radiance values across all years, less variation between years, and little variation between phases within years.
Although we expected a reduction in radiance in 2020, particularly during local and county restriction phases, we did not observe this. While 2020 was darker in the earlier phases compared to 2016, 2018 and 2019, the mean and variation of radiance were not significantly different from 2017 or 2018. During restriction phases, 2020 radiance values were indistinguishable from all other years, suggesting no detectable changes in nighttime lights in response to policies.
Changes in traffic patterns through restriction phases
Data collection on traffic patterns began during the red phase, from April 27, 2020 forward. Throughout restriction phases observed in this study, daily traffic volume followed a bell-shaped pattern, with troughs at 02:00 h and peaks between 12:00 and 18:00 h (Fig. 3B, Fig S3). The best-fit generalized additive model included splines fit to hourly average counts per camera, a binary weekend predictor, intervention phase, and an interaction between phase and road type (connector or internal). This model explained 87·9% of variation observed in vehicle traffic.
Traffic volume significantly increased in each subsequent phase of easing behavioral restrictions, with greater increases during weekdays and on connector roads (Fig 3B,C). During the red phase, mean daily weekday traffic totals were 10,947 vehicles on internal roads and 7,409 vehicles on connector roads. After movement to the yellow phase, mean daily weekday traffic totals increased by 23% on internal roads (increased to 13,265 vehicles) and 31% on connector roads (increased to 9,709 vehicles). While there was only a small increase from yellow to green on internal roads (3% increase), there was a 22% increase in vehicles on connector roads during this transition.
During the red phase (March 28-May 7), the governor announced on May 2 that Centre County would shift to yellow on May 8, six days before the actual lifting of restrictions. Traffic increased when restrictions officially eased, rather than when the announcement was made on May 2, suggesting that behavior followed governmental policies during this observation period.
Spatial and temporal patterns of confirmed cases in PA counties
Through July 17, Centre County confirmed a total of 208 cases, which is 166·5 cumulative confirmed cases per 100,000 census population. Increases in case counts were not synchronous with surrounding counties (Fig 1). Different counties within the state also experienced vastly different levels of morbidity. By July 17, the most severely impacted county in PA, Philadelphia County, reported 1,468·5 confirmed cases per 100,000 people.28 The epidemic is ongoing, with a surge in cases in western Pennsylvania, where Allegheny county (location of Pittsburgh) doubled the cumulative cases in two weeks prior to July 17 to 502·1 per 100,000 residents.
Epidemiological trends in relation to movement measured through traffic and radiance fluctuations
Transmission precedes official case reporting by approximately two weeks (see Methods).3,12 Although there is a decrease in the seven-day moving average of daily radiance in higher population density areas (≥100 people km−2) from Baseline to the Red phase, this is consistent with typical seasonal patterns observed in previous years during the transition from winter to spring (Fig 2B, 4A). There is no discernible pattern in low population areas, where the mean radiance is much lower.
When focusing on the ten weeks of traffic data, there is an observed increase in daily traffic volume from red to green phases, with significant increases in vehicle volume observed on all roads and the greatest increases on connector roads.
The daily COVID-19 case totals are low in Centre County throughout the study period. Cases initially increase during the population decline and local restriction phases. Although stochastic, red and yellow phases correlate with a downward trajectory or plateau in daily incidence. In the green phase, larger increases in road traffic precede increases in incidence in Centre County and the highest 7-day averages in this period (Fig 4B, C).
Discussion
Behavioral interventions are currently the primary public health tool for reducing SARS-CoV-2 transmission. We used publicly available remotely-sensed nighttime lights and traffic camera data to measure behavioral responses to state- and local-COVID-19 restrictions in a University town and surrounding areas. We use traffic data to measure changes in movement and compliance with policies. This data source can be used to monitor movement and adapt mitigation strategies during the fall semester. Centre County has not yet experienced substantial community spread and monitoring movement levels throughout the region will be important for preventing sustained transmission following early increases in movement and populations.
In the United States, approximately half of universities are planning some level of fall 2020 return to campus operations,8 with many more resuming in the spring of 2021. University towns face three challenges with a return to operations: 1) an initial surge in susceptible (and possibly infected) individuals in university towns, as students and employees arrive on campus from different locations, many of which may have higher incidence of COVID-19 than the relatively empty college towns to which they will return, 2) rapid increases in mixing in formal educational and informal social settings between people who are returning from many different locations and local permanent residents, and 3) higher daily contact rates between and among students and employees as academic and social activities resume in some form. University towns are experiencing simultaneous increases in local population size and numbers of contacts, raising the likelihood of introduced cases and local transmission, respectively. Quantifying population fluxes and changes in movement and contacts will be key for developing adaptive policies and preventing ongoing transmission in these settings.
Passively collected traffic camera images measured movement in response to phased restriction policies. Traffic volume increased, particularly on connector roads, as restrictions were loosened. Internal road traffic increased, but not as much as connector road traffic, which may be due to the function of each road type. In Centre County, internal roads provide access to essential businesses that remained open throughout the restrictions, whereas connector roads are used for travel between places and may reflect a return to business, childcare and in-person work. These results suggest that Centre County residents largely complied with county-level restrictions. However, a caveat is that this region has not yet transitioned from a less restrictive phase to a more restrictive phase, so compliance with such a transition cannot be assessed. The increase in movement, as measured by vehicle volume, precedes an overall increase in the incidence of COVID-19 cases.
Although traffic camera data were able to detect changes in movement from vehicle volume, nighttime radiance data did not show changes in response to population declines or pandemic restriction policies. While there was a decline in radiance from Baseline to Red Phase Restrictions in 2020, this is a typical seasonal trend seen in all other years from 2016-2019. The higher radiance values in the early phases are likely driven by albedo effects from snow, which is also highly variable between years (Table S4). Although we expected to see decreases in nighttime radiance values during restriction phases, reflecting the closing of non-essential businesses and stay-at-home mandates, this was not detected for high or low population density areas. It is possible that the rural, dimly lit area made these changes difficult to detect. Changes in population sizes in this county are concentrated in a very small minority of high population density pixels, which would be difficult to detect in radiance values across this large area. We find that highly localized population changes do not have a county-level impact on radiance values. These findings are in contrast to large scale changes detected in China, but over an area with much higher population density.29
In Pennsylvania, restrictions were implemented across the state within a few days at the end of March. A surge in cases in Philadelphia, the most populous county in the state, located 300 km southeast of Centre County, catalyzed the statewide lockdown. Only 2 cases had been confirmed in Centre County when the Red Phase of restrictions was implemented. Transmission of cases decreased during the red phase and was lowest, as a daily average, during the yellow phase. The increase in movement during the green phase, as observed in traffic data, particularly on connector roads, coincided with a second wave of infections which was increasing through July 16.
Testing capacity increased in the later phases of this study, but the bimodal pattern of infections most likely demonstrates increased transmission, which was preceded by increased movement, rather than simply an increase in testing. Increases in movement following easing restrictions, particularly the transition from red to yellow, increased contacts and led to the uptick in cases during the green phase in Centre County.
The local hospital is a relatively small care facility with limited COVID-19 patient care capacity28 and it will quickly become overwhelmed if cases surge, particularly with the influx of over 40,000 students. Quantifying changes in traffic, which correlate with transmission and precede increases in reported cases by about two weeks, provides an actionable metric for proactive policy and behavioral restrictions. This approach will be an important part of a larger plan for staying ahead of COVID-19 in college towns.10
The epidemic in the United States has been managed by local governments implementing policies at state and county levels. COVID-19 has spread widely through the nation’s big cities and small towns. Moving forward, it will be important to monitor local situations and implement locally responsive interventions. In addition to improving testing capacity and reducing delays in returning results, measuring local population movement and size at multiple scales offers a useful strategy to measure behavior and adapt policies and messaging targeting case prevention. University towns will face a large increase in populations and contacts whenever they resume campus operations, which will inevitably increase local transmission of COVID-19. Monitoring movements and contacts through integrated university and community efforts will be necessary to avoid overwhelming local health care capacities.
Data Availability
All data used for this study are available from publicly available sources and are attributed in the methods section. Code to generate models and figures is available at: https://github.com/cfaustus/centre_co_movement
Contributors
NB and AR designed the study. BL and CK contributed to data collection. CLF, BL, and NB analyzed the data and interpreted the results. All authors contributed to writing and approved the final version of the report.
Declaration of interests
All authors declare no competing interests.
SUPPLEMENTARY INFORMATION
Passive, open access data measures movement and predicts COVID-19 cases
Supplementary Methods
I. Detailed timeline of events
Figure 1 in the main text details many of the key disease and policy events that are relevant to changes in policies and movement at the Penn State University Park campus and surrounding Centre County. However, here we provide citations for information in Figure 1 and additional events to give additional context to the policies and communication throughout.
Timeline of important disease and policy events and local study periods:
Dec 31, 2019: China reports to WHO new pneumonia
Jan 20, 2020: 1st case in US1
[BASELINE PERIOD: Feb 18 – March 6]
February 12: first death in US (Santa Clara, CA, retrospectively identified)2
March 6: first 2 reported cases of COVID-19 in PA
[LOCAL POPULATION DECLINE: March 6 – March 17]
March 9: PSU spring break begins3
March 11: WHO declares COVID-19 a pandemic4; PSU announces no resident instruction following spring break5
March 16: PSU begins remote instruction5
March 19: all non-life-sustaining businesses ordered to close statewide by PA Governor6
March 21: enforcement of closure of non-life-sustaining businesses statewide6
[RED PHASE: March 28 – May 7]
March 28: Centre County receives stay at home order from PA Governor7
April 3: PA Governor calls for “universal masking” 7
April 9: PA schools officially closed through end of academic year
May 1: announcement from PA Governor that on May 8 PA will lift some restrictions in 24 counties in the Northcentral and Northwest health districts; all counties will remain in “red” until May 8, these counties will move to “yellow” on May 8 7
[YELLOW PHASE: May 8 – May 28]
May 8: PA lifts some restrictions in 24 counties in the Northcentral and Northwest health districts as determined by PA Governor in moving from “red” (Stay-at-home) to “yellow” (aggressive mitigation) 7
May 28: US death toll from COVID-19 passes 100,000
[GREEN PHASE: May 29 – July 3]
May 29: Some of Northcentral and Northwest health districts (Fig S2) move from “yellow” to “green”
June 30: First PSU student death from COVID-19
July 1: Order of face coverings in public places in Pennsylvania8
II. Black Marble Pre-Processing Details
NASA VNP46A1 Daily At-Sensor Top of Atmosphere (TOA) Nighttime Radiance products were downloaded as Hierarchical Data Format 5 (HDF5) with a bounding box encompassing Centre County. Images are stored with timestamps between 00:00:00 and 23:59:59 for a given date. We assume that the images are from the PM and thus reflect changes that happen during the day. However, it is possible that an image was captured in the morning nighttime hours (00:00:00 to 07:00:00) and therefore reflects changes in policy from a day prior. CSV files for the radiance values, cloud mask values, and pixel latitude and longitude values were split into separate Pandas DataFrames for each day.9 We added missing dates (where there was no HDF5 data on a given day) and populated all data for dates added with Not a Number (NaN) values. We derived the plotting extent, transform, and array shape for the study area from the pixel latitude and longitude values DataFrame, for use with radiance data intermediate storage and export. We stored each day of radiance data as a NumPy array10 in a Python dictionary, indexed by YEAR, MONTH, and DAY (e.g. radiance_data[YEAR][MONTH][DAY]). During the storage process, we masked the data for clouds based on the cloud mask values, scaled the data based on the scale factor (0.1), and shaped the NumPy array based on the derived spatial information from the previous pre-processing step. All preprocessing operations with radiance data were completed using Python 3.7.6.11
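The masking, scaling, and nested-dictionary storage steps described above can be sketched as follows. This is an illustration rather than the study code: it uses plain Python lists instead of NumPy arrays to stay dependency-free, it assumes the cloud mask has already been reduced to one boolean per pixel, and the names `prepare_radiance` and `store_day` are hypothetical.

```python
import math

SCALE_FACTOR = 0.1  # scale factor stated in the text


def prepare_radiance(raw, cloudy, n_cols):
    """Scale raw values, set cloudy pixels to NaN, and split into rows of n_cols."""
    flat = [
        math.nan if is_cloudy else value * SCALE_FACTOR
        for value, is_cloudy in zip(raw, cloudy)
    ]
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


def store_day(radiance_data, year, month, day, grid):
    """Store one day's grid under radiance_data[year][month][day]."""
    radiance_data.setdefault(year, {}).setdefault(month, {})[day] = grid
    return radiance_data


# Illustrative values only: four pixels, the second flagged as cloudy.
grid = prepare_radiance([10, 20, 30, 40], [False, True, False, False], n_cols=2)
radiance_data = store_day({}, 2020, 4, 27, grid)
```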
III. Details of traffic camera acquisition, standardization, and cleaning
Although we tried to capture images every 20 seconds, cameras were sometimes offline and images were not captured for every time interval. To standardize variation in images between cameras for a given hour, the sum of vehicles (labeled over all images captured by that camera) from that hour is divided by the number of images captured and scaled by the number of images we would expect if we were capturing one image per minute. This helps us to avoid overcounting cars stuck at traffic lights but likely underestimates total vehicle volume on interstates. Therefore, count data should be interpreted as relative increases rather than absolute.
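The standardization described above amounts to a mean count per image multiplied by 60. A minimal sketch follows; the function name `standardized_hourly_count` is hypothetical, and returning None for an hour with no images is an assumption made here for illustration.

```python
EXPECTED_IMAGES_PER_HOUR = 60  # one image per minute, as described in the text


def standardized_hourly_count(vehicles_per_image):
    """Return the standardized hourly count for one camera and one hour.

    `vehicles_per_image` holds the number of vehicles detected in each image
    captured during the hour. Returns None if no images were captured.
    """
    n_images = len(vehicles_per_image)
    if n_images == 0:
        return None
    return sum(vehicles_per_image) / n_images * EXPECTED_IMAGES_PER_HOUR
```

For example, an hour with only four captured images containing 2, 4, 0, and 2 vehicles yields a mean of 2 vehicles per image and a standardized count of 120.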
Occasionally, live stream images from traffic cameras would freeze or cameras would go offline. The standardized hourly counts were cleaned to remove strings of zeros or repeated integers (usually frozen images or parked cars). The incorrect data were replaced with NAs and predicted with the best fit generalized additive model (see Methods in Main Text; Fig S4).
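One way to flag such strings of repeated values is to mask runs of identical consecutive hourly counts. The sketch below is an assumption about how this could be done, not the procedure used in this study: the run-length threshold `min_run` is not specified in the text and is chosen here only for illustration, and the function name is hypothetical.

```python
def mask_constant_runs(counts, min_run=6):
    """Replace runs of at least `min_run` identical consecutive values with None."""
    cleaned = list(counts)
    start = 0
    for i in range(1, len(counts) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(counts) or counts[i] != counts[start]:
            if i - start >= min_run:
                cleaned[start:i] = [None] * (i - start)
            start = i
    return cleaned
```

The masked entries play the role of the NAs that are later filled by model predictions.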
IV. Explanation of cleaning and interpretation of COVID-19 data
COVID-19 cases are reported as daily cumulative cases. However, these cumulative cases did not monotonically increase over the observation period. When false positives are discovered, they are subtracted from the cumulative case total, resulting in some dates with negative new daily cases. To incorporate these data, we assumed that false positives were from the most recent previous day of reporting and that no new cases occurred on the current day. There was no way to determine the exact date of the false positive test, so this was our best estimate.
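The correction described above can be sketched as follows. This is one interpretation of the rule, with "most recent previous day of reporting" read as the most recent earlier day with cases remaining; the function name `daily_from_cumulative` is hypothetical and the input values in the note below are illustrative, not study data.

```python
def daily_from_cumulative(cumulative):
    """Convert cumulative counts to daily new cases, moving negative corrections backwards."""
    if not cumulative:
        return []
    daily = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    for i, value in enumerate(daily):
        if value >= 0:
            continue
        deficit = -value
        daily[i] = 0  # no new cases assumed on the day of the correction
        j = i - 1
        # Subtract the retracted cases from the most recent earlier days with cases.
        while deficit > 0 and j >= 0:
            removed = min(daily[j], deficit)
            daily[j] -= removed
            deficit -= removed
            j -= 1
    return daily
```

For a cumulative series of 0, 2, 5, 4, 6, the raw daily differences are 0, 2, 3, -1, 2; the correction moves the retraction to the previous day, giving 0, 2, 2, 0, 2, which still sums to the final cumulative total.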
Centre County has one prison and six long term care homes with a bed capacity of ∼500. Both of these settings are likely hotspots for COVID-19 transmission.12 However, during the course of the study there were not major outbreaks at these locations and therefore the majority of Centre Count In Huntingdon County, there was a prison outbreak and two days of data (with cases in excess of 30) were presumed to represent reporting from this outbreak. We removed these dates from the time series in Figure 1 in the main text to focus on community-driven transmission.
Acknowledgements
We would like to thank the Pennsylvania Department of Transportation, the Pennsylvania Department of Health, and NASA for making data publicly available in real-time. We also acknowledge financial support from Penn State University’s Huck Institutes of the Life Sciences and the Institute for Computational and Data Sciences. The funding sources had no role in the design, execution or interpretation of the results presented here.