COVID-19’s U.S. Temperature Response Profile

Richard T. Carson; Samuel L. Carson; Thayne Keegan Dye; Samuel A. Mayfield; Daniel C. Moyer; Chu A. (Alex) Yu

doi:10.1101/2020.11.03.20225581

Abstract

We estimate the U.S. temperature response curve for COVID-19 and show transmission is quite sensitive to temperature variation, even though summer outbreaks are widely assumed to show otherwise. By largely replacing the death counts states report daily with counts based on death certificate date, we build a week-ahead statistical forecasting model that explains most of the daily variation (R² = 0.97) and isolates COVID-19’s temperature response profile (p < 0.001). These counts normalized at 31°C (U.S. mid-summer average) scale up nearly 160% at 5°C. Positive cases are more temperature sensitive, scaling up by almost 400% between 31°C and 5°C. Dynamic feedback amplifies these effects. There is a short window to get COVID-19 under control before cooler weather makes the task substantially more challenging.

Significance Statement

COVID-19 transmission, in the form of daily state-level death counts and positive cases, ramps up and down with changes in maximum daily temperature. This relationship, known as COVID-19’s temperature response profile, is reliably estimated by assembling a new dataset that largely repairs the sizable divergence between when COVID-19 deaths occur and when they are reported. Over time this effect can be quite large and, as such, is important to efforts to forecast and manage COVID-19.

Introduction

The question of whether COVID-19 exhibits a pronounced temperature response profile has garnered attention since the early days of the pandemic (1,2,3). Kissler et al.’s examination of medium and long-term management of the pandemic assigns a prominent role to understanding how the temperature response profile (hereafter “TRP”) of COVID-19 might influence the pandemic’s progression in the United States (3,4). This follows conventional wisdom regarding the strong seasonal weather-driven pattern of influenza (5,6), which helped mask COVID-19’s early U.S. ascent (7). The recent rise in positive COVID-19 cases and related deaths across much of the United States has called into question whether COVID-19 transmission is adversely sensitive to summer weather. Nevertheless, some modeling groups such as the University of Washington’s Institute for Health Metrics and Evaluation (IHME) have assumed COVID-19 activity will increase as temperatures decline and are now using information from the U.S. influenza monitoring network (8), which was previously shown to be predictive of the pandemic’s path last spring (7), to help incorporate that effect.

Daily state reports often bear little resemblance to the actual number of COVID-19-related deaths that occurred on that day. The resulting temporal data misalignment creates a substantial impediment to recovering any relationships with a crucial dependence on event timing. By reconstructing the set of death counts states report daily, largely by substituting in retroactive corrections based on death certificate dates, we are able to reliably estimate the TRP for COVID-19 deaths.

We assemble over 2,500 state-level daily observations from April 16–July 15, 2020, after COVID-19 became well-established across the United States. We largely follow the literature in environmental economics on estimating pollution and temperature related impacts on a range of health and other outcomes (9). Because SARS-CoV-2 (the virus which causes COVID-19) is a novel virus, full implementation of the standard approach is infeasible: multiple years of both spatially and temporally delineated data are not available. The specific issue that we cannot currently resolve is whether the TRP over a given temperature range is the same in both directions: the cooler to warmer direction reflected in our data set, and the warmer to cooler direction, occurring as the U.S. enters its fall and winter seasons. In what follows, it is also important to recognize that we estimate the joint, not separate, effect of any biological response by the virus and any behavioral response by the public.

Prior efforts

COVID-19’s temperature response profile (TRP) has proven elusive in early attempts to pin it down. The now well-accepted approach for estimating influenza’s TRP is epitomized by Barreca and Shimshack (6), which draws heavily on the modelling of climate impacts on human populations (9,10,11). Under this approach, a panel data set of political entities, such as countries or their political subregions (states, counties, etc.), is assembled and the outcome of interest observed across a long time horizon (e.g., a 20-year period). The ability to employ fixed-effect indicator variables to correct for time-invariant differences between political jurisdictions, coupled with the ability to use short-run weather variability, provides statistical identification of a variety of impact response functions.

Prior research has three important limitations. First, cross-sectional data cannot statistically identify the desired function without making the implausible assumption that all possible confounding variables are adequately controlled for. Routinely updated time-series models slowly incorporate environmental conditions into their forecasts without ever isolating them. Short panel datasets, where the stimulus of interest has limited range (temperature, humidity, UV light, air pollution in each location), often lack the statistical power to pin down such response functions. Consequently, these estimated response functions are often fragile in the sense that statistical significance is lost when time trends or demographic variables are added to models (12,13,14,15).

Second, early work, often using Chinese or cross-country data, focused on the speed at which the pandemic ramped up in different locations, using variants of derivative statistics like growth rates or R0 as the dependent variable (16,17). These works point to temperatures in the 0-10°C range as being most conducive to spreading COVID-19, with a possible humidity effect (17,18,19). Current interest is now focused on situations where COVID-19 is spatially well-seeded and its effective R0 can move up or down with actions like state reopening plans.

Third, the quality of reported COVID-19 statistics is often suspect, particularly from the early phase of the pandemic. We move past this period to an observational window where reporting has stabilized. However, this reveals a different problem: temporal mismatches between when an event (e.g., a COVID-19-related death) was reported and the weather variables potentially influencing that event (11). When reported event dates significantly differ from actual event dates, the resulting measurement error can overwhelm standard sources of biological variation such as individual differences in incubation periods.

Correcting state-level COVID-19 statistics and why it matters

Our ability to isolate the TRP for COVID-19 related deaths stems largely from our reconstruction of state-level COVID-19 data, which involves three steps. First, and most important, we replace the daily death counts initially reported by states with the actual daily counts based on death certificate date where possible. Second, if death certificate data is not available, we use the retroactively corrected data series that a number of states have produced. When available, this type of data generally rectifies many initial reporting errors. Third, we use a consistent protocol to correct other implausible data reports, such as implicit negative daily death counts and zero counts on one day followed by a clear double-count the following day. This effort is described in detail in the Supplementary Information (SI) section on Data Preparation.

Figure 1 displays why repairing daily state-level COVID-19 death counts matters. Panel (A) shows the originally reported daily death counts (COVID Tracking Project [CTP], CTPDailyDead_it) for Florida in blue, with the “Actual” death counts by death certificate date overlaid in red. Actual curves follow the general shape predicted by epidemic models; curves based on the originally reported counts have clear day-of-the-week patterns and large spikes, neither of which are predicted by biological models. Panel (B) shows two implications. First, the confidence interval from fitting a simple quadratic trend model is dramatically larger for CTPDailyDead_it (i.e., the Reported data) than that from fitting the same model using the Actuals. Second, the model based on CTPDailyDead_it is slow to pick up the sharp rise in deaths in Florida because the reported data are temporally misaligned; specifically, originally reported counts temporally lag the actual counts. Correct temporal alignment of the death count data is also critical to being able to isolate COVID-19’s temperature response profile. Figure 1, Panels (C) and (D) display results for Georgia, which show how use of CTPDailyDead_it can lead to missing both a downturn and an upturn, while supporting an incorrect story of slow, steady progress. Figure S1 provides similar graphs for Arizona and Texas.

Fig. S1. COVID-19 deaths in Arizona and Texas.

April 16-July 15. (A) Arizona Actual vs. Reported. (B) Arizona quadratic time trend forecasts and confidence intervals. (C) Texas Actual vs. Reported. (D) Texas quadratic time trend forecasts and confidence intervals.

Fig. 1. COVID-19 Deaths in Florida and Georgia.

April 16-July 15. (A) Florida Actual vs. Reported. (B) Florida quadratic time trend forecasts and confidence intervals. (C) Georgia Actual vs. Reported. (D) Georgia quadratic time trend forecasts and confidence intervals.

Figure 2(A) shows the week-ahead forecast from the University of Washington’s Institute for Health Metrics and Evaluation (UW-IHME) plotted against our DailyDead_it at the state level for the entire United States over the three-month period we examine. This model explains a reasonable amount of the variance in the data, as reflected in the R² of .66 obtained by regressing the Actuals on their forecasts. Substituting in the week-ahead forecasts from one of the other heavily used forecasting models results in a similar overall impression.

Fig. 2. DailyDead_it vs. predicted values using information available 7 days earlier.

U.S. States April 16-July 15. (A) UW-IHME forecasts (R² = .66). (B) Eq. 1 base model predictions (R² = .97).

Figure 2(B) plots the predicted DailyDead_it using Eq. 1 which only uses information available a week earlier. This regression’s R² of .97 is reflected in the very tight scatter plot.

The difference between Figure 2 (A) and (B) is striking. Notably it is not driven by our superior modelling ability, since our model is intentionally designed to be simple and to transparently isolate the TRP. Rather, the difference is driven by our use of dramatically higher-quality, almost textbook-like temporally aligned death counts. It suggests this pandemic’s short run daily trajectory at the state-level is (now) predictable with high accuracy, if resources were devoted to ensuring that actual daily death counts were reported in a timely enough manner for use by the modeling community, rather than trickling in over a period of months as they do now.

Modelling approach

Our objective is to isolate COVID-19’s TRPs with respect to its two visible indicators, daily death counts (DailyDead_it) and new positive test counts (NewPositives_it) at the state level. Any TRP specification should allow for the possibility of non-responsiveness (i.e., zero temperature dependence). TRPs should not incorporate fixed factors like state demographics, nor other factors associated with calendar date or a clear temporal profile. Rather, daily exogenous variation in temperature on any specific day should be the source of statistical information for identifying the TRP of interest. Intuitively, the TRP is being statistically identified by having days where the lagged count of the COVID-statistic of interest is approximately the same magnitude but a range of different temperatures is observed. The number of such comparable days is substantially increased by the introduction of controls for conditions that remain fixed across states and a flexible time trend. Statistical modeling issues revolve around functional forms and specification of relevant lag structure. The slow-moving systematic changes in temperature over time imply that, over short time horizons, temperature will not be the driving force behind DailyDead_it and NewPositives_it. However, over longer time horizons, a virus’s TRP can be a major factor in its long-run trajectory.

We focus on two directly observable COVID-19 statistics: daily deaths (DailyDead_it) and new (test-diagnosed) positive cases (NewPositives_it), indexed by state (i) and day (t). For each we build a simple week-ahead forecasting model, controlling for state-level fixed effects and including a quadratic time trend. We then examine how lagged (t-k) maximum daily temperature (MaxTemp_it-k) scales the model’s predicted DailyDead_it or NewPositives_it. Attention is restricted to conditions where MaxTemp_it ≥ 5°C, since data below this level is sparse during our sample period and concentrated in a few sparsely populated states like Alaska and Montana. To aid interpretability, our empirical estimates of the TRPs are normalized to 100 at 31°C (∼88°F), the U.S. population-weighted average for the last week of our sample.

The pandemic modelling community has largely concentrated on DailyDead_it, believing it to be a more reliable indicator of the COVID-19 infection pool than NewPositives_it due to the large differences in testing regimes across states and time. We proceed in a similar manner, but also produce the TRP for NewPositives_it, conditioning on available testing information, since it is positive cases that are potentially directly influenced by temperature.

Model components

Our base DailyDead_it statistical model (Eq. 1) is comprised of two multiplicative components. The first produces expected current period deaths as a function of past observed deaths at a fixed temperature. The second allows expected DailyDead_it to (potentially) vary with past values of MaxTemp_it-k.

The terms inside the first component are the infection pool proxy, DailyDead_it-7, a set of state-level fixed effect indicators, StateIndicator_i and a quadratic time trend in Days_t (t=1, …, 137; initialized to March 1 to aid interpretability). The StateIndicator_i captures the influence of a wide range of variables which remain constant over the period examined, such as demographic composition, geographic links between locales that influence infections, and public health infrastructure. The time trend variables pick up the decline in the case fatality rate and the average effect over time of initial lockdowns, social distancing, and state re-openings. The first component is exponentiated to incorporate the restriction that expected DailyDead_it should be positive if COVID-19 transmission is active anywhere in the set of connected units being examined. Commensurately, we use LogDailyDead_it-k and LogDays_t as regressors so this component can be interpreted as a log-log regression model with state-level fixed effects.

The second component is a logistic function scaling predicted DailyDead_it up or down with MaxTemp_it-k. Deaths on any specific day are the result of infections propagated over an earlier period. We use LogMaxTemp_it-7 and LogMaxTemp_it-14 to roughly encompass the relevant period for temperature influencing current period deaths. Our panel data model specification uses an additive error term, which necessitates using nonlinear least squares to solve the model (20,21), but decouples conditional mean estimates from the estimated error component.

Formally, our base model for U.S. state i on day t is given by: where estimated coefficients for the set of StateIndicator_i, and the Greek letter parameters minimize the sum of the squared error term of the estimated ε_it. The NewPositives_it model is similarly structured.
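Based on the verbal description of the two components above, the base model can be sketched in the following form. This is a reconstruction under stated assumptions rather than the published equation: the labels α_i, β, γ_1, γ_2, δ_0, δ_1 and δ_2 are notation assumed here, the quadratic trend is written in LogDays_t (one reading of the text), and the exact parameterization of the logistic term (for instance, whether it contains a constant, and its sign convention) should be taken from Table S1.

```latex
\begin{equation*}
\mathrm{DailyDead}_{it} =
\exp\!\Big(\alpha_i
  + \beta\,\mathrm{LogDailyDead}_{i,t-7}
  + \gamma_1\,\mathrm{LogDays}_t
  + \gamma_2\,\mathrm{LogDays}_t^{2}\Big)
\times
\frac{1}{1+\exp\!\big(\delta_0
  + \delta_1\,\mathrm{LogMaxTemp}_{i,t-7}
  + \delta_2\,\mathrm{LogMaxTemp}_{i,t-14}\big)}
+ \varepsilon_{it}
\end{equation*}
```

Here α_i stands for the coefficient on StateIndicator_i, the exponentiated term is the first (infection pool) component, and the logistic fraction is the second (temperature scaling) component; the additive ε_it is what calls for nonlinear least squares.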

We investigate (Table S6) the sensitivity of our results to a range of alternative specifications (e.g., different infection pool indicators, different temperature scaling functions, adding absolute humidity, relative humidity, ultraviolet radiation, the inclusion of shelter-in-place/reopening orders, lagged cumulative death counts, and use of the CTP death counts) and provide further discussion of modelling issues in the Supplementary Information (SI) section on Modeling Approach.

Data

Our analysis uses three main types of data:

COVID-19 statistics for state-level death counts, positive cases, and tests, using The COVID Tracking Project (CTP; covidtracking.com) as our base information source.
Temperature data from the U.S. National Weather Service Integrated Surface Database.
State-level indicators and time variables.

We undertook extensive repair of the COVID-19 data reported daily by states, particularly those involving death counts. A dominant feature of these data is the substantial lags between when many of these events occurred and when they are reported. It is not uncommon to see states include deaths that occurred several weeks prior in any given day’s count.

Over half the U.S. states have made the number of COVID-related deaths publicly available by their death certificate dates in some manner (SI Data Preparation). A simple OLS regression of death counts by death certificate date on originally reported death counts for many of these states yields an R² of less than 0.5. Other states have made corrections to originally reported COVID-19 statistics. These updated counts tend to retroactively correct a myriad of reporting errors by states contained in the COVID Tracking Project’s daily data snapshot. These issues range from failing to report any information on specific days, to decisions on “probable” COVID-related deaths, to the resolution of duplicated death certificates. We use states’ self-corrected counts when available.
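The R² comparison described above can be illustrated with a minimal sketch of a simple OLS regression of one series on another. The function name `r_squared` and the two toy series are hypothetical and are not taken from any state's data; they only show how a reported series with gaps and spikes can explain little of the variation in a smooth actual series.

```python
def r_squared(y, x):
    """R-squared from a simple OLS regression of y on x (with intercept)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy series (illustration only, not real state data):
# "reported" has zero days and spikes, "actual" is smooth.
reported = [0, 0, 30, 5, 12, 0, 25, 8]
actual = [9, 10, 11, 10, 9, 10, 11, 10]
print(round(r_squared(actual, reported), 3))
```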

In instances where neither of these two sources of information were available, we undertook a consistent set of data repair operations. These include: averaging across days with no reporting, prorating backwards in time large batches of deaths (or other statistics) reported on an arbitrary date when it is known those deaths occurred over a much longer period (typically from nursing homes), and making the minimum set of changes to effectively correct logical violations such as negative counts of new deaths, positive cases, and tests. Details on our data repairs are contained in the SI Data Preparation section; we make this dataset available for use by other researchers, along with a line-by-line account of the corrections made.
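Two of the repair operations listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the protocol documented in the SI: the function names, the `unreported` set of analyst-flagged day indices, and the backward-capping rule for cumulative series are all hypothetical choices.

```python
def spread_unreported_days(daily, unreported):
    """Average each run of flagged non-reporting days with the next reported day.

    Assumption: deaths from flagged days were all included in the first
    reported count that follows, so the total over the span is preserved.
    """
    repaired = [float(v) for v in daily]
    n = len(repaired)
    i = 0
    while i < n:
        if i in unreported:
            j = i
            while j < n and j in unreported:
                j += 1
            if j < n:  # a reported day follows the gap
                span = j - i + 1
                avg = sum(repaired[i:j + 1]) / span
                repaired[i:j + 1] = [avg] * span
            i = j + 1
        else:
            i += 1
    return repaired


def daily_from_cumulative(cumulative):
    """Difference a cumulative series after removing downward revisions.

    Assumption: when a cumulative total is later revised down, earlier
    totals are capped at the revised value so no daily count is negative.
    """
    capped = list(cumulative)
    for t in range(len(capped) - 2, -1, -1):
        capped[t] = min(capped[t], capped[t + 1])
    return [b - a for a, b in zip(capped, capped[1:])]


print(spread_unreported_days([10, 0, 20, 10], {1}))
print(daily_from_cumulative([100, 110, 105, 120]))
```

Both helpers keep the overall total unchanged, which is one way to read "the minimum set of changes" needed to remove a logical violation.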

For each state, weather variables are taken from the airport with the highest volume of commercial traffic. We focus on maximum daily temperature. Results with mean temperature are quite similar. State-level aggregation requires taking weather data from a single station. The measurement error induced by this compromise is likely to be less than one might think. Many states are small spatially, or have a single concentrated metropolitan area (e.g., Illinois). In spatially large states with large populations, most of the population often lives within reasonable proximity to the largest airport. Even in Texas, most people live along the corridor between Dallas (DFW is our representative airport) and Houston. As a result, over 60% of the American population lives within 300km of the representative airport for their state. Similarly, positive cases are also concentrated in major metropolitan areas near those airports. More generally, a single source of classical measurement error attenuates parameter estimates toward finding no effect.
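The attenuation result invoked in the last sentence is usually stated for a linear regression. The textbook case is sketched below as an illustration only, since Eq. 1 is nonlinear; the symbols x*, u and the variance terms are notation assumed here. If the observed temperature x equals the true exposure x* plus an error u that is uncorrelated with x* and with the regression error, then the least squares slope satisfies:

```latex
x_{it} = x^{*}_{it} + u_{it}, \qquad
\operatorname{plim}\,\hat{\beta}
  = \beta\,\frac{\sigma^{2}_{x^{*}}}{\sigma^{2}_{x^{*}} + \sigma^{2}_{u}}
```

Because the ratio is below one whenever the error variance is positive, the estimated slope is pulled toward zero, that is, toward finding no temperature effect.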

Daily dead model results

Eq. 1 is estimated using non-linear least squares and has an R² of 0.97. All parameter estimates (provided in Table S1) using robust standard errors clustered at the state-level are significant at p < 0.001. Figure 2(B) displays the actual (corrected) DailyDead_it versus the model’s in-sample predicted values. States where death certificate dates were available have considerably smaller prediction errors (p < 0.001, Model 2 in Table S6).

Table S1. Eq. (1) base model predicting DailyDead_it.

Table S2. Model predicting NewPositives_it.

Table S3. Model predicting StateBase_i taken from DailyDead_it model.

Table S4. Alternative specifications for DailyDead_it model.

StateIndicator_i for models provided in Table S6. (Robust standard errors clustered at the state level).

Table S5. AR(7) DailyDead_it models using different Reported/Actual combinations.

U.S. states: April 16-July 15. June 25 NJ death count (1877) set to missing in reported. (Robust standard errors clustered at state level).

Table S6. Parameter estimates including state-level fixed effects for all models.

The model’s quadratic time trend suggests DailyDead_it has fallen over time: at a declining rate from mid-April through the end of May, and then almost flat (rising at a slow rate) thereafter. LogDailyDead_it-7 is the dominant predictor and its coefficient estimate of 0.8686 (t=35.32) has a standard elasticity interpretation.

Figure 3 shows the estimated TRP implied by the parameter estimates for the two MaxTemp_it lags in the logistic scaling function. The vertical axis represents expected DailyDead_it at each temperature value when both lags are set to the same temperature and the TRP is normalized to 100 at 31°C. (Figure S9 shows the corresponding TRP using the original CTP death counts.) Figure S2 provides the contour plot which allows the values of the two MaxTemp_it-k lags to independently vary.

Fig. S2. DailyDead TRP contour plot representation.

Fig. 3. COVID-19 daily dead temperature response profile.

U.S. states: April 16-July 15. Temperatures for 7th & 14th lags set equal.

Differentiating Eq. 1 with respect to DailyDead_it-7 produces a measure of how the expected DailyDead_it increases from a one death increase in DailyDead_it-7. Setting t and MaxTemp_it-k to chosen values yields [TRP*β*EXP(StateBase_it)]/(DailyDead_it-7)^(1-β), where the TRP directly scales the elasticity parameter, β, on LogDailyDead_it-7 and StateBase_it is the temporally varying sum of the fixed StateIndicator_i and the quadratic time trend. As an example, for Georgia on July 15 (StateBase_it = 3.9184, MaxTemp_it-7 = 29.4, MaxTemp_it-14 = 27.8 and the corresponding non-normalized TRP = 0.0368), changing DailyDead_it-7 from 27 to 28 is predicted to increase the expected DailyDead_it in Georgia on July 15 by 1.0457. Figure S3(B) displays a variant of this calculation for individual states as DailyDead_it-7 increases from 0 to 1.
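The marginal-effect expression above can be evaluated directly. The sketch below plugs in the Georgia values quoted in the text together with the β estimate of 0.8686 reported for LogDailyDead_it-7; the function name `marginal_effect` is hypothetical. Evaluated at DailyDead_it-7 = 27, it gives a value of about 1.04, which is close to but not identical to the reported 1.0457; the gap may reflect rounding of the quoted inputs or details of the log transformation that are not reproduced here.

```python
import math


def marginal_effect(trp, beta, state_base, lagged_deaths):
    """TRP * beta * exp(StateBase) / lagged_deaths ** (1 - beta)."""
    return trp * beta * math.exp(state_base) / lagged_deaths ** (1.0 - beta)


# Inputs quoted in the text for Georgia on July 15.
effect = marginal_effect(trp=0.0368, beta=0.8686,
                         state_base=3.9184, lagged_deaths=27)
print(f"{effect:.3f}")
```

Because β is below one, the denominator grows with the lagged count, so each additional lagged death adds slightly less to the expected current count as the infection pool proxy rises.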

Fig. S3. Visual representation of information contained in the parameter estimates for base model (Eq. 1).

(A) displays the infection pool for each state in the continental U.S. implied if observed DailyDead_it-7 is set to zero on July 15 and the two lagged MaxTemp_it-k are set to 31°C. (B) shows, under the same conditions, how the expected DailyDead_it changes if DailyDead_it-7 moves from 0 to 1.

We simulate series of DailyDead_it with both static and dynamic variants of our estimated base model (Eq. 1). To do so, we take the DailyDead_it-7 from the last week of our sample period and MaxTemp_it = 31°C as the initial values to propagate the simulation. We fix MaxTemp_it-7 and MaxTemp_it-14 at 31°C from day 1 to day 45 to mimic the rest of the summer period, then progressively decrease them by 0.2°C each day until 5°C is hit, which occurs on day 175, after which temperature is held constant.
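The temperature schedule used to drive the simulation can be written out as a short sketch. The function name `temperature_path` and its parameter names are hypothetical; the values (31°C held through day 45, a 0.2°C daily decline, a 5°C floor) are those stated in the text, and the arithmetic confirms the floor is first reached on day 175, since (31 − 5)/0.2 = 130 days of cooling follow the 45-day hold.

```python
def temperature_path(n_days, hold_days=45, start=31.0, floor=5.0, step=0.2):
    """MaxTemp for days 1..n_days: hold at start, then cool by step per day to floor."""
    path = []
    for day in range(1, n_days + 1):
        cooled = start - step * max(0, day - hold_days)
        path.append(max(floor, cooled))
    return path


temps = temperature_path(200)
# First simulation day (1-indexed) on which the 5°C floor is reached.
first_floor_day = next(d for d, t in enumerate(temps, start=1) if t <= 5.0 + 1e-9)
print(first_floor_day)
```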

The brown solid curve in Fig. 4 provides a stylized static [left vertical axis] representation of information contained in our base model (Eq. 1) using Georgia as an example since it starts out at just below our 31°C normalization point and in a cold year hits our 5°C endpoint. The two MaxTemp_it-k change in tandem, with all other variables fixed at their initial values. In this static response mode, expected DailyDead_it increases through only one channel – the direct impact of lowering temperature.

Fig. 4. Stylized static and dynamic TRPs for Georgia derived from Eq. 1 parameter estimates.

The brown solid curve [left vertical axis] is stylized static TRP where temperature influences DailyDead_it through changes to MaxTemp_it-k, with DailyDead_it-7 held constant. The blue dashed curve [right vertical axis] is stylized dynamic TRP under the same assumptions, except dynamic feedback is allowed in the form of temperature influencing subsequent DailyDead_it-7, so that temperature affects expected death count both directly and through lagged death counts.

The blue dashed curve in Figure 4 shows the dynamic counterpart [right vertical axis] that allows MaxTemp_it-k to influence DailyDead_it, which is then used as the lagged model input for subsequent projections. In this dynamic mode, temperature affects expected DailyDead_it through two channels: first, the direct effect of lagged temperature on daily death count, as shown in the static model; second, the indirect compounding effect of these temperature-driven daily death counts when the initial death count of each cycle of the simulation is set to a lag of these outputs. The indirect effect of maximum temperature is the dominant mechanism, accounting for more than 90% of the increase in the expected death count as maximum temperature falls from 31°C to 5°C. The inset in Fig. 4 plots the two curves under the same scale and conveys the magnitude of the differences between the pure static and dynamic responses, both of which are hard to achieve in practice. The static response requires continual reductions in effective contact rates that exactly offset the increase in transmission potential, in the precise sense that the current death count is always held to be equal to last week’s death count. Observing the full dynamic effect would require the absence of both offsetting government actions and endogenous actions by the public, such as increased social distancing and face mask adherence, as COVID-19 activity quickly increased. A possible example of the dynamic feedback our model allows when this is not done is the early path of the pandemic through Northern Italy in late February and March 2020.
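The difference between the static and dynamic modes can be made concrete with a toy recursion. The sketch below is not the estimated model: the linear `multiplier` function (1.0 at 31°C rising to 2.6 at 5°C, loosely motivated by the roughly 160% scale-up reported in the abstract), the single temperature lag, the starting count of 27, and the constant chosen so that deaths stay flat at 31°C are all assumptions made for illustration. Only β = 0.8686 and the cooling schedule come from the text.

```python
def multiplier(temp_c):
    """Hypothetical TRP: 1.0 at 31°C rising linearly to 2.6 at 5°C."""
    temp_c = min(31.0, max(5.0, temp_c))
    return 1.0 + 1.6 * (31.0 - temp_c) / 26.0


def simulate(n_days=200, beta=0.8686, start_deaths=27.0, dynamic=True):
    """Expected daily deaths under a static or dynamic lagged-input rule."""
    # Constant chosen so the series stays flat at 31°C (toy assumption).
    const = start_deaths ** (1.0 - beta)
    # Days 1-45 at 31°C, then cooling 0.2°C per day down to 5°C.
    temps = [max(5.0, 31.0 - 0.2 * max(0, d - 45)) for d in range(1, n_days + 1)]
    deaths = []
    for t in range(n_days):
        lag_temp = temps[t - 7] if t >= 7 else 31.0
        # Static: lagged deaths held at the starting value.
        # Dynamic: lagged deaths are the model's own output a week earlier.
        lag_deaths = deaths[t - 7] if (dynamic and t >= 7) else start_deaths
        deaths.append(multiplier(lag_temp) * const * lag_deaths ** beta)
    return deaths


static_path = simulate(dynamic=False)
dynamic_path = simulate(dynamic=True)
print(round(static_path[-1], 1), round(dynamic_path[-1], 1))
```

In the static run only the direct temperature channel operates, so the final value is the starting count times the multiplier. In the dynamic run each week's output feeds the next week's input, so the same multiplier compounds and the final value is far larger, which is the qualitative pattern the text attributes to the indirect channel.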

StateIndicator_i values for some geographically isolated states with small populations like Hawaii are small enough to suggest their COVID-19 death counts would be unsustainable in warm enough weather. This is not true, though, of most states, which is consistent with Baker et al.’s finding from examining earlier emergent viruses that warming weather is not enough by itself to stop their spread (22).

The SI section on Alternative Specifications for Base Death Count Model describes additional analyses that (a) look at alternative temperature scaling functions, (b) substitute DailyDead_it-14 or NewPositives_it-7 for DailyDead_it-7, and (c) add the dates of state actions such as shelter-in-place orders as control variables. This work shows that our finding of a strong TRP for DailyDead_it is quite robust. Some specifications suggest the TRP is flatter in the 10-20°C range but steeper in the 5-10°C range (Fig. S8).

Fig. S4. NewPositives TRP contour plot representation.

Fig. S5. LogMaxTemp vs. LogUV.

(A) displays Georgia. (B) displays New York.

Fig. S6. DailyDead_it responsiveness to lagged per capita total dead.

Fig. S7. Bivariate relationship DailyDead & MaxTemp.

U.S. states: April 16-July 15. Lowess smoother: .2 bandwidth.

Fig. S8. Alternative DailyDead TRPs.

U.S. states: April 16-July 15.

Fig. S9. Death count TRPs based on DailyDead vs. CTP originally reported.

U.S. states: April 16-July 15.

New positive case model results

NewPositives_it are modeled similarly to Eq. 1, substituting LogNewPositives_it-7 as the infection pool regressor. Testing information is required to interpret reported state-level positive cases. We use LogNewTest_it to control for current testing intensity, LogNewTest_it-7 (which allows for a past positivity rate interpretation), total tests administered per thousand lagged by one week (PerCapitaTests_it-7) to help control for prior testing intensity, and an indicator variable for systematically lower reporting on Monday. We chose LogMaxTemp_it-7 for consistency with the death count model. For the other temperature lag, the 2nd lag fits best.

The resulting model’s (Table S2) R² is 0.95. All regressors are significant at p < 0.001, except for some of the test related variables: LogNewTest_it-7 (p=0.005), PerCapitaTests_it-7 (p=0.002) and Monday (p=0.015). Estimated parameters for the two lagged temperature variables are considerably larger than their DailyDead_it counterparts. The implied TRP for NewPositives_it is displayed in Fig. 5, and the contour plot that allows the two temperature lags to vary independently is provided in Fig. S4.

Fig. 5. COVID-19: new positives temperature response profile.

U.S. states: April 16-July 15. Temperatures for 2nd and 7th lags set equal.

Discussion

We show that DailyDead_it predictably varies with changes in maximum daily temperature (Fig. 3). This relationship is considerably more pronounced for new positive cases (Fig. 5). Our TRPs are normalized to 100 at 31°C, which is near the maximum of U.S. summer temperatures. These two TRPs suggest current COVID-19 infection pools in many states need to be brought under control while high temperatures are helping to reduce the virus’s transmission. Cooler temperatures with the progression of fall and winter will dramatically ramp up the number of new positive cases and the deaths that follow unless current infection pools are dramatically reduced. Dynamic feedback between rising infection pool indicators and cooling temperatures (Fig. 4) suggests delay in responding to increased virus activity signs will result in rapid escalation. This is already being seen in the current spatial pattern of outbreaks and the ever-increasing medium-run death count forecasts (e.g., UW-IHME (8)). Investment in providing the pandemic modeling community with timely counts based on death certificate dates would allow them to deliver substantially more accurate and timely warnings of impending upturns.

Warming temperatures during the spring and summer actively aided efforts to reduce the spread of COVID-19 and may have contributed to a false sense of how effective those efforts were. Cooling temperatures going into the fall and winter will present a very different challenge. Figure 6 shows the average date over the past 30 years when each U.S. county enters the particularly dangerous 10°C to 5°C range for COVID-19. Effects of temperature amplification show up with a lag.

Fig. 6. Expected date by U.S. county for entering 5-10°C range (30-year average).

There has long been fear of facing COVID-19 during influenza’s October to March season (23). Our results suggest that, in addition to this concern, COVID-19 will transmit increasingly more efficiently than it did during the summer. Further, if the TRPs for deaths and positive cases behave like influenza (6), both will continue to increase from 5°C until a few degrees below freezing.

Data Availability

All data and code from the paper will be archived on GitHub before publication.

Supporting Information

Data Preparation

The data set used in this paper starts with the COVID-19 statistics reported by individual states as aggregated daily by The COVID Tracking Project (covidtracking.com). In preliminary work, we (and other modelling groups) found there was no plausible epidemiological model that would produce the wild up and down jumps in the state-level COVID-19 statistics reported daily. As a result, there is a major fork in the path of any modeling effort. Do you build a predictive model for the reported COVID-19 daily death count that policymakers see, where success is judged by minimizing the forecast error around those observable quantities? Or do you attempt to model the actual underlying process with an eye toward understanding key aspects of it? When the reported dependent variable systematically diverges from that being generated by the underlying process, these two approaches fundamentally diverge. For the first path, success, to some degree, comes in uncovering the administrative procedures influencing the divergence. The second path led us to undertake a major effort to rectify and repair reported COVID-19 statistics focused on recovering the temporal structure needed to provide TRP estimates.

The most important correction was replacing originally reported death counts with death counts by date of death as reflected on death certificates. Figures 1 and S1 illustrate the nature of the differences between the original CTPDailyDead_it and DailyDead_it for four important states: Arizona, Florida, Georgia and Texas. The originally reported data results in substantially larger confidence intervals in predictive models and, because it lags the actual death counts, substantially reduces the ability to detect and respond to changes in COVID-19 activity. We were able to make this correction for 26 states where deaths by death certificate information could be located. Our cutoff date of July 15 allowed for two and a half months for states to make these death counts available, which should subsume most of the COVID-19 deaths in these states over our sample period. The states where we have been unable to obtain deaths by death certificate dates tend to be smaller and to have had relatively fewer COVID deaths. Typically, they did not face large peaks, which appear to be associated with increased testing delays. However, we have been unable to obtain this data from three large states (California, Illinois, and New York), and of the 20 observations (out of 4,567) with residuals from Eq. 1 whose absolute value is 50 or more, over half are from these three states.

The methods for collecting data backed by death certificates varied based on how the individual states chose to present them. Some states publish their counts based on death certificates in a downloadable format on their official COVID-19 website. Others publish them inside of longer reports in such a way that they can be hard to find, or present tables or graphs that needed to be extracted by hand into spreadsheets. Because there are no official rules for how to publish this information, even the charts themselves vary in structure and presentation. Commonality across states is largely dependent on the specific software vendor they are using for their public-facing COVID-19 dashboards. Data collection procedures sometimes required hunting through source code to find full datasets within, or by hovering cursors over each bar in a bar chart individually over the collection window to make them display the precise value represented. In each case we were able to verify via labeling or official statements that the data was compiled using verified death certificate dates. It’s worth noting here that, while the CDC collects deaths by death certificate date (https://www.cdc.gov/nchs/nvss/vsrr/COVID19/index.htm), they only do so at the weekly level, which makes it unusable for the sort of modeling effort we have undertaken and, relative to the data we have obtained from individual states, suffers from even longer reporting lags.

For states where deaths by death certificate date were not available, we first sought to determine if an individual state, either on their COVID-19 reporting “dashboard” or in a downloadable file, had deaths by reported date. These datasets often contain substantial corrections to what a state originally reported on a specific day. These include reports that missed covidtracking.com’s daily reporting deadline, more accurate end-of-day tallies (e.g., late reporting counties/hospitals), correction of testing dumps that resulted in the appearance of testing spikes on certain days, the removal of duplicate death certificates, and the resolution of probable cases. When these differed from the COVID-19 death counts originally reported (COVIDTracking.com has a set of “snapshots” of the originally reported information) we used the revised state data. In some smaller states, we believe, but have been unable to fully verify, that these corrected datasets are “close” to deaths by death certificate date.

Two of these types of corrections turn out to be particularly important. First, when a state misses a reporting deadline, this is typically recorded as there having been no COVID-19 events (new deaths, positives, or tests) on that date. Because covidtracking.com (and similar aggregator sites) operate off cumulative counts, the next day’s count then includes the events that happened during the previous day. Second, “probable” cases have often been treated differently across states and time. State-level resolution of this issue removes an extraneous source of variation. As with the data on death certificate dates, these corrections by states of their originally reported COVID-19 statistics were obtained through a variety of sources ranging from reading counts off interactive bar charts to downloadable *.csv files.

Next, we corrected two obvious problems with the data. First, there were occasional, exceptionally large spikes accompanied by auxiliary information (e.g., a news/Twitter release by the state’s department of health) noting that the spike was due to an accumulation of deaths over an extended time period, typically from one or more congregate living facilities, e.g., nursing homes and prisons. For these, the initial correction involved proportionately increasing death counts over the relevant period, with days where this would result in negative death counts not included in the reallocation. Second, we corrected reports of zero deaths on days surrounded by death counts that were sufficiently large that a zero-death count was highly unlikely. Such days are generally also characterized by a failure to report some other COVID-19 statistics (like new positive cases) and by an abnormally high death count on the following day (or two if it is a weekend). Our approach for these was to assume no reporting followed by double reporting, so that the indicated correction is to average counts across the two and, in some instances, three days.

We also corrected data in situations where a state had “corrected” an earlier report, but where an entire rectified data series from the state was not available. A typical example is a state that initially reported all deaths except for those in the state’s largest county, with the corrected version containing the death count for the whole state contained in a press/Twitter release. A similar situation was when a state failed a logical consistency check by having the difference between the reported cumulative death counts on two consecutive days generate a negative daily death count. This typically occurs in a state with a small population and few COVID-19 deaths, which, without comment, reduces its cumulative death count by one or two. Here we “roll back” that correction to the closest date that no longer produces a negative daily death count.
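One way to read the rollback rule is as a reverse running minimum on the cumulative series: a downward revision is pushed back to earlier dates until no daily difference is negative. The sketch below illustrates that reading only; the function names are hypothetical and the paper does not state that this exact algorithm was used.

```python
def roll_back_cumulative(cumulative):
    """Make a cumulative series non-decreasing by pushing downward
    revisions back to earlier dates (reverse running minimum)."""
    fixed = list(cumulative)
    for i in range(len(fixed) - 2, -1, -1):
        # No earlier day may exceed the (possibly revised) next day.
        fixed[i] = min(fixed[i], fixed[i + 1])
    return fixed


def daily_from_cumulative(cumulative):
    """First differences of a cumulative series."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


reported = [10, 12, 11, 13]            # cumulative count drops by one on day 3
repaired = roll_back_cumulative(reported)
print(repaired, daily_from_cumulative(repaired))
```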

Similar corrections have been made to daily positive case counts and new daily test counts. The major difference is that very few states have made available positive case counts by day of test administration (rather than the day the test result is reported). This means that few states publish datasets that can readily subsume the positive case count data from the COVID Tracking Project. This is likely due to these types of information having different reporting standards. Specifically, a state eventually knows the actual date of a COVID-related death, but this information is subject to reporting delays due to waiting for confirming test results or autopsies. Because the date of death is on the death certificate, obtaining actual death counts for all states is eventually feasible. However, while the lab doing the test knows the date of test administration, this information is often not shared with the state. States could require more complete reporting, but it is unclear whether the past is capturable. Total test counts are even messier due to the common practice (particularly in the earlier part of our sample period) of reporting all newly returned positive test results daily but reporting negative results inconsistently or in batches. Some negative test results (often from the state’s lab) were reported along with the positives, while other negative tests (often those from private labs) were reported once a week. We performed rollbacks of antibody tests that for a while were mixed with diagnostic tests, but these are messy because information on the period over which antibody tests were included has rarely been disclosed. We have endeavored to average and prorate testing data where the nature of this practice could be reasonably inferred.

The influence of our data correction effort can quickly be gleaned from Table S5, which describes four simple autoregressive models with a constant term and the 7th death count lag. The first uses the original “Reported” dataset (covidtracking.com) as the source of both the dependent variable, CTPDailyDead_it, and the lagged death count. The second uses our corrected version as the “Actuals” for the dependent variable and uses the originally Reported counts for the lagged regressor. The third uses CTPDailyDead_it as the dependent variable and the Actuals for the lagged regressor. The fourth uses Actuals for both. The parameter estimates for lagged deaths are similar in versions using the same lagged variable and substantively larger in the two versions using lagged actual counts. The R² starts at 0.69 for the Reported/Reported model, stays roughly the same (0.68) if Actuals are predicted from CTPDailyDead_it-7, and increases somewhat to 0.74 for the Reported/Actual combination. The Actual/Actual combination, though, has an R² of 0.91, which clearly illustrates that the large gain in explanatory power in the base Eq. (1) model in Table S1 comes mainly from our data repair and rectification effort. Note that the models in Table S5 set the massive NJ (June 25) reported death count outlier of 1877 (Actual death count is 16) to missing, since it is so large that many modeling groups have either dropped it or prorated it over earlier time periods. The R² of the Reported/Reported model with this observation included falls to 0.42.
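The four Table S5 variants differ only in which series supplies the dependent variable and which supplies the 7th lag. A minimal sketch of that comparison for a single state's series is given below using ordinary least squares in plain Python. The function name `fit_lag7` is hypothetical, the series in the example is synthetic rather than drawn from the paper's data, and the paper's models pool all states rather than fitting one series.

```python
def fit_lag7(dependent, lag_source, lag=7):
    """OLS of dependent[t] on a constant and lag_source[t - lag].

    Both inputs are equal-length daily series for one state.
    Returns (intercept, slope, r_squared).
    """
    y = dependent[lag:]
    x = lag_source[:len(lag_source) - lag]
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Synthetic "Actuals" that follow y[t] = 2 + 0.5 * y[t-7] exactly.
actual = [3.0, 5.0, 2.0, 8.0, 6.0, 4.0, 7.0]
for t in range(7, 28):
    actual.append(2.0 + 0.5 * actual[t - 7])

# "Actual/Actual" variant; swapping either argument for a reported
# series would give the other three combinations.
print(fit_lag7(actual, actual))
```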

Temporal misalignment of NewPositives_it has received more attention than that of DailyDead_it because of the often-large gap in time between when a diagnostic test is administered and when the result is returned (24). Indeed, this gap, coupled with different state testing regimes, has led most modeling groups to concentrate on predicting DailyDead_it. We have made substantial repairs to NewPositives_it and NewTests_it data by reference to corrected state reports, and prorated rollbacks of initially reported antibody tests. We have been much less successful in locating information on NewPositives_it by date of test administration. However, temporal misalignment of NewPositives_it-k is somewhat less important than it might first appear, and less so than for deaths, because there is a shorter window for positive-to-positive transmission than for the positive-to-death transition and because test results for hospital patients, the persistent high positivity pool, are typically returned quickly. Further, even very noisy testing information can be helpful in serving as controls for variation in state-level testing behavior over time.

Construction of temperature, humidity and ultraviolet radiation data

Weather data for our main analysis are drawn from the National Centers for Environmental Information (NCEI) Integrated Surface Database (ISD), which reports hourly temperature and humidity data for most airports in the world. For each state, weather variables are taken from the airport with the highest volume of commercial traffic, where the volume information is found in the Federal Aviation Administration’s 2018 Commercial Service Enplanements report. Our key variable of interest is daily maximum temperature (MaxTemp). We also look at measures of humidity and ultraviolet radiation. Hourly relative humidity is calculated as a function of hourly observed temperature and dewpoint temperature. Under the assumption of ideal gas behavior, we calculate hourly absolute humidity as well (details can be found at https://www.hatchability.com/Vaisala.pdf). We then pick the highest readings within each 24-hour period as daily MaxTemp, MaxRelativeHumidity and MaxAbsoluteHumidity. Minimum daily temperature is obtained by picking the lowest reading and the mean by averaging the hourly readings. Our measure of ultraviolet radiation is the UV index, which provides a forecast of the expected risk of overexposure to UV radiation from the sun. UV index data at our representative airports are obtained from OpenWeather Ltd., which publishes a daily UV index forecast calculated by the National Weather Service.
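The humidity construction can be illustrated with a short sketch. It assumes a Magnus-type approximation for saturation vapor pressure over water (the coefficients 6.112, 17.62 and 243.12 are an assumption made here and may differ from those in the cited document) and the ideal-gas conversion from vapor pressure to vapor density. All function names are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus-type approximation over water; coefficients are assumed."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def relative_humidity_pct(temp_c, dewpoint_c):
    """Relative humidity (%) from air temperature and dewpoint (°C)."""
    return (100.0 * saturation_vapor_pressure_hpa(dewpoint_c)
            / saturation_vapor_pressure_hpa(temp_c))


def absolute_humidity_g_m3(temp_c, dewpoint_c):
    """Ideal-gas water vapor density in g/m^3: 2.16679 * e[Pa] / T[K]."""
    vapor_pressure_pa = 100.0 * saturation_vapor_pressure_hpa(dewpoint_c)
    return 2.16679 * vapor_pressure_pa / (temp_c + 273.15)


def daily_summary(hourly):
    """hourly: list of (temp_c, dewpoint_c) readings for one 24-hour period."""
    temps = [t for t, _ in hourly]
    return {
        "MaxTemp": max(temps),
        "MinTemp": min(temps),
        "MeanTemp": sum(temps) / len(temps),
        "MaxRelativeHumidity": max(relative_humidity_pct(t, d) for t, d in hourly),
        "MaxAbsoluteHumidity": max(absolute_humidity_g_m3(t, d) for t, d in hourly),
    }


print(daily_summary([(10.0, 5.0), (20.0, 10.0), (15.0, 15.0)]))
```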

To calculate the expected date by U.S. county for entering the 5-10°C range (Fig. 6), we use data provided by the PRISM Climate Group (https://prism.oregonstate.edu/), which provides reliable weather data at a high spatial resolution of 4 km by 4 km for the contiguous U.S. We extract daily maximum temperature from 1990 to 2019 and compute, averaged over those years, the number of days after Oct 1 that it takes for the daily maximum temperature to fall below 10°C.
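A minimal sketch of the day-count calculation for one county follows. It assumes the per-year first-crossing dates are computed and then averaged; averaging temperatures across years first and then finding the crossing is another plausible reading. The function name and the choice to skip years that never cross the threshold are assumptions.

```python
def mean_days_to_threshold(years, threshold_c=10.0):
    """Average number of days after Oct 1 until MaxTemp first drops
    below threshold_c.

    years: list of lists; each inner list holds one year's daily maximum
    temperatures (°C), with index 0 corresponding to Oct 1.
    Years that never cross the threshold are skipped (an assumption).
    """
    first_crossings = []
    for temps in years:
        for day, temp in enumerate(temps):
            if temp < threshold_c:
                first_crossings.append(day)
                break
    if not first_crossings:
        return None
    return sum(first_crossings) / len(first_crossings)


# Two toy "years": first crossing on day 2 and on day 1.
print(mean_days_to_threshold([[15.0, 12.0, 9.0, 8.0], [11.0, 9.5, 12.0]]))
```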

Modelling approach

There are two competing modelling approaches to predicting future COVID-19 deaths and positive cases. The first builds on a standard SEIR model; the other, a production function approach. The first has a strong epidemiological conceptualization and is clearly better in the early phase of a pandemic when data is scarce. The second is largely agnostic as to the underlying structure of transmission but requires dramatically more data to offset this flexibility. We follow the second, with Eq. 1 using a simple production function approach. Conceptually, there is an infection pool, the seeds planted, and various inputs, ranging from a state’s health care system to temperature, that influence the output, the number of deaths observed today. We primarily use past death counts as the infection pool proxy, and take into account state-level fixed effects (amalgamating effects due to constant factors such as demographic characteristics, fixed resources like health care and transportation networks, and average differences in factors including climate variables and mobility) and a simple quadratic time trend. We exponentiate the right-hand-side variables to effectively impose the restriction that under the conditions we observe, expected death counts must be positive.

The use of a multiplicative scaling function incorporates the logic that temperature by itself cannot generate changes in DailyDead_it. Because those dying at t became infected not on one day but rather over an extended period, some means of representing temperature in this setting is required. The two options, due to the strong correlation between closely adjacent MaxTemp_it-k, are a distributed lag structure that imposes structure on a set of MaxTemp_it-k, or parameterizing the scaling function as the product of individual scaling functions, each with a different MaxTemp_it-k. We find two lags, the 7th and 14th, are sufficient and reasonably consistent with what is known about the biology of the virus and its methods of attack.
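The structure described in the two paragraphs above, an exponentiated production function multiplied by a product of lag-specific temperature scaling functions, can be sketched as follows. This is an illustration of the general form only and not a transcription of Eq. 1: the use of log(1 + deaths) for the infection pool term, the parameter names, and every numeric value in the example are assumptions made for the sketch.

```python
import math


def logistic_scale(log_temp, gamma):
    """Logistic scaling function 1 / (1 + EXP(X)^gamma), X = log temperature."""
    return 1.0 / (1.0 + math.exp(log_temp) ** gamma)


def expected_daily_dead(state_effect, beta, lag7_dead, day, trend1, trend2,
                        max_temp_lag7, max_temp_lag14, gamma1, gamma2):
    """Production function times the two-lag temperature scaling function.

    Temperatures are in °C and must be positive for the log to be defined.
    """
    # Assumption: log(1 + deaths) keeps the term defined when the lag is zero.
    production = math.exp(state_effect
                          + beta * math.log1p(lag7_dead)
                          + trend1 * day + trend2 * day ** 2)
    # Product of individual scaling functions, one per temperature lag.
    scale = (logistic_scale(math.log(max_temp_lag7), gamma1)
             * logistic_scale(math.log(max_temp_lag14), gamma2))
    return production * scale


# Hypothetical parameter values, for illustration only.
params = dict(state_effect=1.0, beta=0.86, lag7_dead=20, day=100,
              trend1=-0.01, trend2=0.00004, gamma1=0.5, gamma2=0.3)
warm = expected_daily_dead(max_temp_lag7=31.0, max_temp_lag14=31.0, **params)
cool = expected_daily_dead(max_temp_lag7=5.0, max_temp_lag14=5.0, **params)
print(warm, cool)
```

Because the scaling function multiplies the production function, temperature can only amplify or dampen an existing infection pool; it cannot produce deaths on its own.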

Model results are reasonably robust to small shifts in temperature lag positions, with the following two caveats. First, the 7th lag is a natural one to use; many people’s lives follow a typical weekly pattern of contacts and activities, and administrative reporting procedures often have a day-of-the-week pattern. Second, lags too far back are insignificant. Notably, we use the 18th lag of MaxTemp_it in the weekly model rather than its naturally shifted counterpart LogMaxTemp_it-21.

The most commonly used scaling function for our purposes is the logistic function 1/(1 + EXP(X)^ϒ), where X is the variable of interest, and ϒ is the single estimated parameter. This function converges to a constant as ϒ goes to zero. Use of MaxTemp_it rather than LogMaxTemp_it-k as the stimulus variable provides a function with a different curvature. Statistically, it provides a similar fit, largely because the corresponding shifts in the estimate of ϒ makes the two scaling functions reasonably similar after normalization.

We also consider another commonly used scaling function, X/(X + ψ), where an estimate of ψ > 0 results in smaller values of X being scaled up more than large values of X and an estimate of ψ < 0 results in larger values of X being scaled up more than smaller values of X, and an estimate of ψ = 0 results in a function exhibiting no temperature responsiveness. This function can be used with either LogMaxTemp_it-k or MaxTemp_it-k, and both will approximate a reasonable range of weakly monotonic scaling functions. It is well behaved as long as the estimate of −ψ is bounded away from MIN(X), which appears not to be an issue in the situations we examine. Table S4 provides estimates for a set of competing specifications and their TRPs are displayed in Fig. S8.
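The ratio scaling function and its well-behavedness condition can be made concrete with a small sketch. The function names and the `margin` parameter (how far the pole at X = −ψ must sit below the smallest observed X) are hypothetical choices for illustration.

```python
def ratio_scale(x, psi):
    """Ratio scaling function X / (X + psi); undefined at X = -psi."""
    if x + psi == 0:
        raise ValueError("ratio scaling function is undefined at X = -psi")
    return x / (x + psi)


def is_well_behaved(xs, psi, margin=1.0):
    """True when the pole at X = -psi lies at least `margin` below MIN(X)."""
    return -psi <= min(xs) - margin


temps = [5.0, 10.0, 31.0]
print(ratio_scale(20.0, 0.0))          # psi = 0: no temperature response
print(is_well_behaved(temps, -2.0))    # pole at 2, below the data range
print(is_well_behaved(temps, -4.5))    # pole at 4.5, too close to MIN(X)
```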

Our first major empirical decision was to use the three-month period April 16-July 15. This time window allows for the seeding of the virus (to different degrees) across the i=1, …, 51 U.S. states (including DC) over a three-month period. We effectively start tracking the observable outcomes (deaths, positive cases, and tests) on April 9, because we generally use a one-week lag of the COVID-19 statistics of interest. Our time variable, t=1, …, 137, denoted in days, starts with 1 on March 1, the approximate date individual states first started reporting COVID-19 statistics.

Our second major decision involved how to implement the core epidemiological concept of a pool of infected individuals who can potentially infect other individuals. This pool is dynamic; existing positives transition to being no longer infectious, while newly infected individuals enter the pool. The totality of the currently infected individuals is unobservable without universal administration at each point in time of a 100% accurate diagnostic test. The question, thus, is whether to use a lagged variant of the test diagnosed positives, NewPositives_it, which logically are part of the infection pool but not necessarily representative of it due to differential testing, or a lagged variant of DailyDead_it.

The lagged DailyDead_it measure is one step removed but was previously assumed to always be observed (25,26). This assumption is demonstrably false. In the early run of the pandemic many deaths failed to be classified as COVID-related, in part because some instances were thought to be influenza-related, while COVID’s role in inducing cardiac and kidney failure was not widely recognized. We avoid many of these problems by only using DailyDead_it data generated after the pandemic was well established.

Other problems with using lagged DailyDead_it (or NewPositives_it) as the infection pool indicator (such as a state having a proportionately larger elderly or black population) are readily addressed using the standard statistical approach of including time-invariant state-level fixed effects.

One way the use of a lagged version of DailyDead_it vs. NewPositives_it will vary is with respect to timing. Current positives are temporally closer to future positives or deaths than current deaths are and can potentially pick up the virus spreading among healthy young adults. The shorter link from a lagged NewPositives_it is reinforced by the short period over which a current positive can influence future COVID outcomes, which makes TRP estimation less sensitive to the choice of temperature lags.

Subject to the same measurement error, using NewPositives_it-7 should be preferred to DailyDead_it-7 as the infection pool indicator. These results are provided in Table S6 (Model 16). The main lesson is that the model using LogNewPositives_it-7 is a good predictor of DailyDead_it (R² = 0.94), but not as good as LogDailyDead_it-7. This suggests that either the positives have more measurement error than our corrected version of death counts or that the DailyDead_it-7 is a better reflection of the current risk-adjusted infection pool than NewPositives_it-7. The other result worth noting is that with the infection pool indicator temporally advanced, the coefficient on LogMaxTemp_it-14, as expected, is no longer significant.

The parameter estimates for the base model (Table S1) can be used to provide an estimate of the minimum infection pool each state faces by setting DailyDead_it-7 = 0 (from a statistical perspective), obtaining expected DailyDead_it, and dividing it by the CDC’s point estimate of the infection fatality rate (0.0065) (27). Setting the value of Days_t equal to the end of the sample period, across states, the sum of these minimum infection pools is 9853 cases. Fig. S3(A) provides a visual display of this information for the continental U.S. states.
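The last step of this calculation, converting an expected daily death count into an implied infection pool, is a single division by the infection fatality rate. The sketch below shows only that step; the state labels and expected-death values in the example are made up, and the function names are hypothetical.

```python
CDC_IFR = 0.0065  # point estimate of the infection fatality rate quoted in the text


def minimum_infection_pool(expected_daily_dead, ifr=CDC_IFR):
    """Infections implied by an expected daily death count at a given IFR."""
    return expected_daily_dead / ifr


def total_minimum_pool(expected_by_state, ifr=CDC_IFR):
    """Sum of state-level minimum infection pools."""
    return sum(minimum_infection_pool(v, ifr) for v in expected_by_state.values())


# Illustrative values only: expected deaths with the lagged death count set to zero.
example = {"State A": 0.65, "State B": 1.30}
print(total_minimum_pool(example))
```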

Interpretation of State-level fixed effects and time-related effects

We calculate a variant of the state-level fixed effects from our main regression model (Table S1) by setting the time trend equal to the last day of our sample period, the lagged number of deaths equal to zero, and calculating the expected number of deaths for each state. Small isolated states like Hawaii generate estimates close to zero while the most populous states tend to generate estimates above 2. These estimates suggest the minimum infection pool in each state varies by a factor of approximately 20 (Fig. S3(A)). A regression (Table S3) of the minimum expected death count on a small set of state-level demographics taken from the U.S. Census Bureau (LogPopulation, %Black, %Hispanic, and %Age80+) explains 79% of the variation in these counts.

We do not want our TRP estimates to be confounded by other factors changing over time (ranging from state and locally mandated shelter-in-place orders, reopening plans, endogenous social distancing, propensity to wear face masks and changing fatality rates). Since we are agnostic as to the mechanism, the straightforward specification is a polynomial time trend, where we find a quadratic term justified. Higher order terms add little insight or predictive power. The quadratic trend for predicting DailyDead_it falls sharply from mid-April to the end of May, after which it very slowly starts to turn up.

A model which adds (a) the number of days since a state issued a mandatory shelter-in-place order and (b) the number of days since a state started to formally reopen its economy can be found in Table S6 (Model 8). The coefficients on these two variables are small and insignificant. The two LogMaxTemp_it-k parameters fell on average by less than 1%.

This might seem puzzling until it is recognized that the two sets of time-related variables are highly correlated. Dropping the quadratic trend (Model 9 in Table S6) provides a different picture because, while still statistically significant at conventional levels, it was substantially diminished. The earlier a state shelter-in-place order was issued, the lower the predicted death count (p = 0.028), and the earlier a state started to reopen, the higher the predicted death count (p < 0.001). We are reluctant to provide any substantive interpretation of this result. Papers that have tried to determine the role of state actions and the behavior they are intended to influence show the need to (a) extensively model both state and local orders, and (b) incorporate spatially disaggregated mobility data in order to separate the effects of these government mandates from endogenous social distancing by the public that often occurs before these actions (28,29,30). The two temperature coefficients in this model fall on average by less than 20% and remain highly significant.

Alternative specifications for base death count model

Table S4 compares our base model to alternative specifications. These alternative specifications were chosen to look at the sensitivity of the implied TRP because they all have reasonably similar fits relative to our base model. This allows us to observe how robust the TRP is to a substantial shift in various modelling decisions that we made. The first specification replaces the LogMaxTemp_it-k with their linear counterparts. On the surface, this seems like a decision about which of two different scales fits better, but because the models have estimated parameters in the scale function, it may be possible for the two different specifications to provide reasonably similar TRPs. The second uses a popular ratio scaling function LogMaxTemp_it-k/(LogMaxTemp_it-k + α), where α is an estimated parameter. The third replaces LogMaxTemp_it-k with MaxTemp_it-k in this ratio scaling function. A potential issue with Eq. 1 is the possibility that LogMaxTemp_it-14 directly influences our main infection pool indicator LogDailyDead_it-7. Our next specification replaces our infection pool indicator with an alternative, LogDailyDead_it-14, so both temperature variables are now clearly exogenous from the temporal perspective of the infection pool indicator.

The last is a weekly variant of the model (Eq. 2), where the dependent variable is the sum of the daily death counts over the next seven days, with corresponding pushbacks of the lagged variables. This model averages out much of the daily variation and many types of administrative reporting practices. In forecasting the pandemic’s progression, it has become common to use data aggregated to a weekly level in an effort to average out many of the administratively induced reporting issues that our extensive reconstruction and repair of the daily death count data sought to alleviate. It is possible to estimate a variant of Eq. (1) that uses death count data aggregated into seven-day periods, WeeklyDead_it = Σ_t DailyDead_it, where the summation is over t=1 to t=7. This makes WeeklyDead_it-7 the sum of the 7th through 13th lags of DailyDead_it. Importantly, this aggregation does not reduce the number of observations because on each day, the weekly aggregation at t=1 adds a new observation DailyDead_i1 and drops DailyDead_i8. Lags of the WeeklyDead_it variable can then be used in the standard way. One implication of this specification is that the temperature variables also need to shift backwards. For conceptual consistency, we use LogMaxTemp_it-14 in place of LogMaxTemp_it-7. Empirically, the model fits best with the second temperature lag being LogMaxTemp_it-18. The model fit is reasonably similar using the 13th and 19th lags, but beyond the 19th lag, the temperature variable becomes insignificant, suggesting temperature information farther back in time than this is not useful. The model we report is thus Eq. 2.

The overall impression from Table S4 is the general stability of most of the common parameters for time and the infection pool. Some of the specifications offer insights into the latter variable. It is little influenced by whether the logistic or ratio scaling function is used, or by whether LogMaxTemp_it or MaxTemp_it is the stimulus variable. Using LogDailyDead_it-14 in place of LogDailyDead_it-7 results in an estimated TRP similar to that of the base model (Fig. S6), although it does rise more sharply between 10°C and 5°C, suggesting that, while there may be an endogeneity effect, it is not large and that our base TRP is conservative. In this model, the coefficient on the infection pool indicator shifts from the .86 of the base model to .80; in the weekly version of the model it jumps to .92. The R² for the weekly model increases to 0.99 (from the base model’s 0.97) and falls to 0.94 in the model using LogDailyDead_it-14, which is less informative than LogDailyDead_it-7 in terms of the infection pool influencing current death counts.
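The overlapping seven-day aggregation used in the weekly variant can be sketched as a rolling sum, which shows why the number of observations is essentially preserved: each step forward adds one day and drops one day. The function name is hypothetical and the input series is a toy example.

```python
def weekly_dead(daily, horizon=7):
    """Rolling sum of `horizon` consecutive days starting at each day t.

    Returns one value for every day that has a complete window, so
    successive windows overlap in all but one day.
    """
    return [sum(daily[t:t + horizon])
            for t in range(len(daily) - horizon + 1)]


daily = list(range(1, 11))     # ten toy daily counts: 1, 2, ..., 10
weekly = weekly_dead(daily)
print(weekly)
```

Only the final six days of a series lose their observation, because they lack a complete seven-day window.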

Constructing temperature response profiles (TRPs)

To compare the TRPs implied by the models in Table S4, we plot the functions using two independent random uniform variables RTemp and RTemp2, defined over the range 5°C to 40°C. The logistic scaling function with two temperature variables, LogMaxTemp_it-7 and LogMaxTemp_it-14, and corresponding Eq. 1 estimated parameters ϒ1 and ϒ2 (Table S4), results in the following scaling function in the base model:

[1/(1 + EXP(LOG(RTemp))^ϒ1)] × [1/(1 + EXP(LOG(RTemp2))^ϒ2)]    (Eq. 3)

There is a fundamental indeterminacy in such a scaling function, in that multiplying the production function part of Eq. 1 by a constant will result in an offsetting change in the scaling function which maintains the same expected value for the dependent variable. Note that each of the two multiplicative components of Eq. 3 converges to .5 irrespective of temperature values as ϒ1 and ϒ2 approach zero. Often logistic functions are normalized to lie between 0 and 1 by changing the “1” in the numerator to “2”, but this is not needed with our normalization to 31°C, which solves the indeterminacy from the perspective of comparing curves. This is done by calculating the value of the estimated scale function at 31°C:

[1/(1 + EXP(LOG(31))^ϒ1)] × [1/(1 + EXP(LOG(31))^ϒ2)]    (Eq. 4)

Dividing Eq. 3 by Eq. 4 produces a function which equals 1 at 31°C. Multiplying this quantity by 100 produces a function which has a natural percentage interpretation and equals 100 at 31°C. Note that for forecasting purposes, the original scaling function parameters need to be used.
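The normalization to 31°C can be sketched as follows. The sketch takes the logistic form 1/(1 + EXP(X)^ϒ) with X equal to log temperature, as described in the text; the parameter values 0.5 and 0.3 in the example are hypothetical and are not the estimates from Table S4.

```python
import math


def logistic_scale(temp_c, gamma):
    """1 / (1 + EXP(LOG(temp))^gamma); temp_c must be positive."""
    return 1.0 / (1.0 + math.exp(math.log(temp_c)) ** gamma)


def trp_percent(temp7, temp14, gamma1, gamma2, ref_c=31.0):
    """Two-lag scaling function divided by its value at ref_c, times 100."""
    raw = logistic_scale(temp7, gamma1) * logistic_scale(temp14, gamma2)
    ref = logistic_scale(ref_c, gamma1) * logistic_scale(ref_c, gamma2)
    return 100.0 * raw / ref


# Hypothetical parameters; both lags are held at the same temperature.
for temp in (5.0, 10.0, 20.0, 31.0, 40.0):
    print(temp, round(trp_percent(temp, temp, 0.5, 0.3), 1))
```

Any constant multiplying the raw scaling function cancels in the ratio, which is why the normalized curves from different specifications can be compared directly.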

The ratio scaling function with estimated parameters α₁ and α₂ using RTemp and RTemp2 is:

[RTemp/(RTemp + α₁)] × [RTemp2/(RTemp2 + α₂)]    (Eq. 5)

with LOG(RTemp) and LOG(RTemp2) replacing RTemp and RTemp2 in the specification based on LogMaxTemp_it-k. As α₁ and α₂ converge to zero, both multiplicative components of Eq. 5 converge to 1 irrespective of temperature values. The value of this function at 31°C can be calculated in a manner similar to that described for the logistic. Dividing Eq. 5 by the value of that function at 31°C and multiplying by 100 produces the desired TRP.

Figure S8 graphs the daily death TRPs from the model specifications in Table S4. The alternative specifications produce TRPs remarkably similar to that of our base model and tend to bracket it. The one systematic difference is that TRPs based on the ratio scaling functions tend to be flatter over 10°C to 20°C but rise more sharply in the 5°C to 10°C range.

Further DailyDead_it specifications involving weather, positives and cumulative dead

The other weather variables which have received considerable attention are absolute humidity, relative humidity, and ultraviolet (UV) radiation. Details of their construction are provided in the section on the construction of temperature, humidity and ultraviolet radiation data. The limitation with all these variables is that they are correlated with MaxTemp_it (i.e., absolute humidity: 0.58, relative humidity: −0.21, UV: 0.79). Maximum absolute humidity has a reasonable size effect (p < 0.001) in the model where it replaces the corresponding MaxTemp_it variables. However, the LogAbsoluteHumidity_it-k parameter estimates are close to zero and are no longer significant when the two parallel logistic functions comprising the temperature scaling function are added (Model 10 in Table S6). Relative humidity has a marginally significant relationship with DailyDead_it when MaxTemp_it is not in the model, and its effect is close to zero in a model with MaxTemp_it (Model 11 in Table S6).

The situation with UV_it, for which (15) found support, is more complex (Table S4 and Table S6, Models 12 and 13). The MaxTemp_it lags are marginally better predictors in a head-to-head comparison. These two variables are strongly correlated. Adding UV_it marginally decreases prediction errors, but the signs of high multicollinearity are obvious; the first MaxTemp_it lag is still highly significant while the second is insignificant and one of the UV_it lags is only marginally significant (Table S4). It is effectively impossible to disentangle the influence of LogMaxTemp_it and UV_it. Their relationship is displayed in Fig. S5, which plots LogMaxTemp_it and LogUV_it for two states over our sample period: Georgia (Atlanta) and New York (New York City). We cast our results in terms of MaxTemp_it rather than UV_it because it is more widely reported and understood, without making any claim that our work supports a joint versus singular causal mechanism.

The specification in Eq. (1) is cast in terms of DailyDead_it. If MaxTemp_it-k influences COVID transmission via the link between positive cases, then we should be able to replace DailyDead_it-7 with NewPositives_it-7. Estimates for this model (Table S4) show LogNewPositives_it-7 being significant at p < 0.001 and the R² measure falling a bit. The coefficient on MaxTemp_it-14 is insignificant, which would be expected if NewPositives_it-7 incorporates that information, while the coefficient on LogMaxTemp_it-7 is larger than in the base specification.

A different aspect of the COVID-19 death statistics that has not been incorporated into the model is the lagged cumulative death count, TotalDead_it-k. If DailyDead_it-k can be seen as a proxy for the infection pool influencing DailyDead_it, then TotalDead_it-k (normalized by population) is a proxy for the fraction of the population that is no longer at risk, in the sense of having been removed by death or having recovered. Adding LogPCTotalDead_it-7 effectively makes StateBase_it dynamic in a potentially different way than the quadratic time trend, by letting each state evolve according to its own pattern of deaths. Table S6 (Models 14 and 15) displays the results of this model with LogPCTotalDead_it-7 entering as (a) a second order polynomial and (b) a fourth order polynomial.

Three results are worth noting. First, the increase in explanatory power is small, with the most noticeable changes being, as expected, in the StateIndicator_i estimates and a substantial reduction in the importance of the overall quadratic time trend. Second, the LogMaxTemp_it-k parameter estimates are similar to those of Eq. (1), suggesting that our TRP is robust to a substantial dynamic reparameterization of the model. Third, in the quadratic specification, DailyDead_it declines with LogPCTotalDead_it-7 at a declining rate. In the fourth order specification, all the LogPCTotalDead_it-7 terms are individually insignificant (although jointly significant). Figure S6 displays, starting at 0.05, the two response functions for LogPCTotalDead_it-7. Like the quadratic time trend, they suggest a sharp drop in how DailyDead_it is influenced by DailyDead_it-7 as LogPCTotalDead_it-7 increases from low levels, with the fourth order polynomial being flatter than the quadratic at high levels of LogPCTotalDead_it-7. The influence of this factor is reasonably small by the time a state hits 10 deaths per 100,000, a condition that characterizes 80% of the states at the end of our sample period. Earlier, we noted that one interpretation of the quadratic time trend was that medical care (and hence death rates) had improved sharply at first and then at a declining rate. This specification has a similar interpretation but suggests that some of that learning is state specific and related to a state's prior COVID-19 caseload. Looking at the fourth order polynomial, which should allow this feature to emerge if the data support it, there is no indication that this rate of decline is accelerating, even in the hard-hit Northeastern states where deaths per 100,000 can be as high as 175 (NJ). This suggests that the fraction of the population previously infected is still too small to be a substantial factor in slowing transmission of the virus.

Univariate DailyDead_it and MaxTemp_it-7 relationship

If we have succeeded in isolating the TRP through use of the set of StateIndicator_i and a quadratic time trend, the simple regression of DailyDead_it on lagged MaxTemp_it-7 without the state fixed effects and quadratic time trend should reveal a substantially different curve than our estimated TRP displayed in Fig. 4. Figure S7 displays this relationship using a robust LOWESS smoother (bandwidth 0.2) on MaxTemp_it-7. This curve is dramatically more sensitive to temperature between 10°C and 30°C. Below 10°C it drops, which was expected since states with temperatures near 5°C tend to be more isolated and smaller population-wise. The curve bends up near 35°C and beyond where most of the observations come from a few states with (relatively) high June and July death counts.
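For readers unfamiliar with the smoother, the following is a minimal pure-Python sketch of a robust LOWESS fit with a bandwidth of 0.2: a local weighted line is fitted around each point using tricube weights, and the fit is then repeated with bisquare robustness weights that discount large residuals. It is not the Stata routine used for Figure S7, and the data are simulated purely for illustration; the variable names `temps` and `deaths`, the linear trend and the injected outlier are all assumptions.

```python
import random


def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_line(x, y, w, x0):
    """Weighted least-squares line through (x, y), evaluated at x0."""
    sw = sum(w)
    if sw == 0.0:
        return sum(y) / len(y)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        return my
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + (sxy / sxx) * (x0 - mx)


def lowess(x, y, frac=0.2, robust_iters=2):
    """Robust LOWESS: returns the fitted value at every x[i]."""
    n = len(x)
    k = max(3, int(round(frac * n)))   # number of points in each local window
    robust = [1.0] * n                 # robustness weights, updated after each pass
    fitted = [0.0] * n
    for _ in range(robust_iters + 1):
        for i in range(n):
            # Window half-width: distance to the k-th nearest point.
            h = sorted(abs(xj - x[i]) for xj in x)[k - 1] or 1e-12
            w = [tricube((xj - x[i]) / h) * rj for xj, rj in zip(x, robust)]
            fitted[i] = local_line(x, y, w, x[i])
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = sorted(abs(r) for r in resid)[n // 2]   # (upper) median absolute residual
        if s == 0.0:
            break
        # Bisquare weights: residuals beyond 6 * s get zero weight.
        robust = [(1.0 - (r / (6.0 * s)) ** 2) ** 2 if abs(r) < 6.0 * s else 0.0
                  for r in resid]
    return fitted


# Simulated example: a downward trend in temperature plus one gross outlier.
rng = random.Random(0)
temps = sorted(rng.uniform(0.0, 40.0) for _ in range(200))
deaths = [50.0 - t + rng.gauss(0.0, 1.0) for t in temps]
deaths[100] += 30.0
smooth = lowess(temps, deaths, frac=0.2)
```

The robustness passes are what keep a single extreme observation from pulling the curve toward itself, which matters when a few state-days have unusually high counts.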

Pinning down the TRP further will require: (a) obtaining more data over a longer time horizon with more temperature variation, in particular in the important −5°C to 5°C range, where there has been little U.S. experience since the virus became widely dispersed; (b) obtaining death certificate date information from the few remaining large states (California, Illinois and New York) where it is not yet available, since states with death certificate date reporting have prediction errors that are substantially smaller (p < 0.001) than those without (Table S6, Model 2); or (c) having high quality, temporally aligned COVID-19 statistics at the county level, which would provide a better temperature match and dramatically increase sample size.

Death Count TRPs based on DailyDead vs. The COVID Tracking Project’s originally reported death counts

What does a TRP based on The COVID Tracking Project’s (CTP) originally reported death counts look like compared to our base (Eq. 1) Model 1, which uses DailyDead? To examine this issue, we estimate Model 26 (Table S6), a direct analogue of Model 1 that substitutes CTPDailyDead_it and CTPDailyDead_it-7 for their DailyDead counterparts. Consistent with Model 18 (Table S6) and other models employing CTPDailyDead, we exclude the two observations that contain the large New Jersey outlier as either the dependent variable or a regressor, and further exclude twenty observations where LogCTPDailyDead_it-7 is undefined because CTPDailyDead_it-7 is negative.

There are some clear differences between Model 1 and Model 26. First, in Model 26 there is a large drop in the R² from 0.97 to 0.81 and the RMSE measure more than doubles. Second, the elasticity parameter on lagged LogCTPDailyDead is just over 60% as large as it is in the DailyDead version. This is the expected result of introducing substantial measurement error, but one that also has large implications for drawing inferences about the size of the infection pool or for undertaking any dynamic forecasting exercise. Third, the quadratic time trend, while still sizeable, is substantially diminished in both magnitude and statistical significance. Fourth, there are differences between the TRPs implied by the temperature response parameters in Eq. (1).

The TRPs for Model 1 and Model 26 are plotted in Fig. S9. The TRP for the CTPDailyDead variant effectively lies above its DailyDead counterpart. At 5°C, it predicts almost 80% more deaths than our base Model 1. This result may seem counterintuitive given the usual belief that measurement error attenuates a parameter estimate toward zero. That intuition, generally correct, helps explain the fall in the coefficient when CTPDailyDead_it-7 rather than DailyDead_it-7 is used as the infection pool indicator. However, it says nothing about how measurement error in one covariate affects the estimates for the other covariates in the model. It is easy to show that as the scale of the measurement error in the lagged death count increases, the magnitude of the temperature responsiveness parameters adjusts to incorporate covariance with DailyDead_it-k. All else held constant, this would increase the magnitude of the estimated temperature effect under classical measurement error, since the parameters on the infection pool indicator and the temperature variables should have the same sign. The situation here is more complicated because measurement error from using CTPDailyDead also substantially reduces the ability to pin down the quadratic time trend. Thus, neither the sign nor the magnitude of the difference between the temperature response parameters across comparable specifications is known a priori. Our parameter estimates (Table S6) suggest an upward bias is likely in temperature effect estimates obtained from models similar to ours that (a) use reported death counts and (b) have specifications where the infection pool and temperature response variables are expected to share the same sign. Note, though, that in a simple OLS regression model, the infection pool and temperature variables should have opposite signs, resulting in an estimate of the temperature parameter that is biased in the direction of finding no effect.
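The spillover argument can be illustrated with a small simulation. The sketch below is a stylized linear model, not Eq. (1): `pool` stands in for the infection pool indicator, `temp` for a temperature variable, and the two are assumed to be positively correlated. All coefficients, variances and names are hypothetical. Classical measurement error is added to `pool` only, and OLS is run with and without that error, once with same-sign true coefficients and once with opposite signs.

```python
import random


def ols_two_regressors(y, x1, x2):
    """Slopes (b1, b2) from OLS of y on a constant, x1 and x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate(beta_temp, n=20000, seed=42):
    """Return (pool, temp) slope estimates without and with measurement error."""
    rng = random.Random(seed)
    temp = [rng.gauss(0, 1) for _ in range(n)]
    # Assumed positive covariance between the two regressors.
    pool = [0.6 * t + rng.gauss(0, 0.8) for t in temp]
    # True coefficient on the pool variable is 0.8.
    y = [0.8 * p + beta_temp * t + rng.gauss(0, 0.5) for p, t in zip(pool, temp)]
    # Classical measurement error in the pool variable only.
    noisy_pool = [p + rng.gauss(0, 1) for p in pool]
    clean = ols_two_regressors(y, pool, temp)
    noisy = ols_two_regressors(y, noisy_pool, temp)
    return clean, noisy


same_sign_clean, same_sign_noisy = simulate(beta_temp=0.5)
opp_sign_clean, opp_sign_noisy = simulate(beta_temp=-0.5)

print("same sign:     clean", same_sign_clean, "noisy", same_sign_noisy)
print("opposite sign: clean", opp_sign_clean, "noisy", opp_sign_noisy)
```

Under these assumed parameters the large-sample limits are easy to work out: the pool coefficient shrinks from 0.8 to about 0.31, the same-sign temperature coefficient rises from 0.5 to about 0.79, and the opposite-sign temperature coefficient moves from −0.5 to about −0.21, that is, toward zero. The direction of the spillover depends on the assumed positive covariance between the two regressors.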

Structure of Table S6

The parameter estimates and standard summary statistics for the models discussed in this paper are provided in Table S6. It contains three columns for each model. The first column contains the variables included in the model, the second the parameter estimates, and the third the standard errors. After the parameter estimates, the model’s R², root mean square error (RMSE), and the number of observations on which the model was fit are provided.

The order in which the models appear is:

1. The base model represented by Eq. (1), using LogDays_t, LogDays_t², and LogDailyDead_it-7 as the predictors along with a set of state-level indicator variables, and using LogMaxTemp_it-7 and LogMaxTemp_it-14 in the temperature scaling function.
2. A model which regresses the squared residuals from (1) on the log of state population and an indicator variable, GOOD_DATE, for DailyDead_it representing death counts by death certificate date.
3. Base model using linear versions of the two temperature variables.
4. Base model using the alternative ratio scaling function Temp/(Temp + α), where Temp is the temperature variable and α is the estimated parameter.
5. Base model using the alternative ratio scaling function and linear versions of MaxTemp_it-k.
6. Base model with LogDailyDead_it-14 substituted for LogDailyDead_it-7.
7. A weekly version of the base model (see Eq. 1) substituting LogDailyDead_it-14 for LogDailyDead_it-7 and LogDailyDead_it-18 for LogDailyDead_it-14.
8. A version of the base model that adds two state-level government policy variables: the log of the number of days since a shelter-in-place order was first issued (LogDaysShelterInPlace_it) and the log of the number of days since a state began to formally reopen its economy (LogDaysReopen_it).
9. A version of (8) that drops the quadratic time trend of the base model.
10. Base model adding parallel scaling functions using LogMaxAbsoluteHumidity_it-7 and LogMaxAbsoluteHumidity_it-14.
11. Base model adding parallel scaling functions using LogMaxRelativeHumidity_it-7 and LogMaxRelativeHumidity_it-14.
12. Base model adding additional parallel scaling functions for LogUV_it-7 and LogUV_it-14.
13. Base model substituting LogUV_it-k for LogMaxTemp_it-k.
14. Base model adding a quadratic in terms of LogTotalDead_it-7.
15. Base model adding a 4th order polynomial in LogTotalDead_it-7.
16. Base model substituting LogNewPositives_it-7 and testing variables in place of LogDailyDead_it-7.
17. The model used in the paper to predict NewPositives_it.
18. The AR(7) regression of the originally reported death counts from the COVID Tracking Project (CTPDailyDead_it) on the 7th lag of itself (CTPDailyDead_it-7) for Table S5.
19. The same as (18) but with DailyDead_it regressed on the 7th lag of CTPDailyDead_it.
20. The same as (18) but with CTPDailyDead_it regressed on the 7th lag of DailyDead_it.
21. The same as (18) but with DailyDead_it regressed on its 7th lag.
22. The StateBase_it calculation for each state.
23. The model reported in Table S3, which predicts the state-level fixed-effect estimates obtained in (1) as a function of a small set of demographic variables.
24. State-level estimates of the minimum infection pool.
25. State-level change in expected DailyDead_it when DailyDead_it-7 shifts from 0 to 1 on July 15 at 31°C.
26. Base model (1) using the uncorrected death count data.
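The mechanics of regressing a daily count on its own 7th lag, as in the AR(7) regression labelled (18) and its variants in the list above, can be sketched as follows. This is only an illustration of the lag construction and a simple OLS fit; it does not assume anything about the exact specification behind Table S5 (for example, whether logs or state indicators are used), and the series and function names are hypothetical.

```python
def lag_pairs(series, lag=7):
    """Pair each observation with the value `lag` days earlier: (x, y) = (y[t-lag], y[t])."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]


def ols_line(pairs):
    """Intercept and slope from a simple OLS of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical daily counts for one state: a week of starting values, after
# which each day equals 2 + 0.5 times the count seven days earlier.
counts = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0, 14.0]
for t in range(7, 60):
    counts.append(2.0 + 0.5 * counts[t - 7])

intercept, slope = ols_line(lag_pairs(counts, lag=7))
print(f"intercept = {intercept:.3f}, slope on 7th lag = {slope:.3f}")
```

Because the toy series is generated without noise, the fit recovers the generating intercept and slope; with reported counts that contain timing errors, the slope on the lag would be expected to fall, in line with the attenuation discussed in the preceding section.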

Data and materials availability

The data used in this study are archived at https://github.com/xxx in the form of a Stata “.dta” file. An Excel version of this file was created using StatTransfer. The Stata “do” file creating the data set contains a line-by-line set of the changes made to the original CovidTracking.com data set, and the provenance of those changes. Three additional Stata “do” files are available in this archive. The first contains the code (Stata 16.1) for the regression models reported in this paper. The second provides an example of how to estimate the static and dynamic temperature response profiles for an individual state, using Georgia as an example. The third contains Stata code for creating the basic versions of the figures in this paper (fine labeling was done using Stata’s graph editor). We also provide a further Stata dataset and corresponding Excel files at the state level (using our representative airport) with maximum daily temperature readings from the last thirty years.

Footnotes

Funding: The University of California, San Diego provided partial financial support for the work reported on in this paper. No outside support was received.
Competing interests: The authors declare no competing interests.

References

1. National Academies of Sciences, Engineering, Medicine. Rapid expert consultation on SARS-CoV-2 survival in relation to temperature and humidity and potential for seasonality for the COVID-19 pandemic. (2020).
2. M. Kanzawa, H. Spindler, A. Anglemyer, G. W. Rutherford, Will coronavirus disease 2019 become seasonal? J. Infect. Dis. 222, 719–721 (2020).
3. Trump, D.J. February 11, 2020 statement. https://www.factcheck.org/2020/02/will-the-new-coronavirus-go-away-in-april/, retrieved 30 May 2020.
3a. S.M. Kissler, C. Tedijanto, E. Goldstein, Y. H. Grad, M. Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period. Science 368, 860–868 (2020).
4. C. J. Carlson, A. C. R. Gomez, S. Bansal, S. J. Ryan, Misconceptions about weather and seasonality must not misguide COVID-19 response. Nat. Commun. 11, 1–4 (2020).
5. J. Shaman, E. Goldstein, M. Lipsitch, Absolute humidity and pandemic versus epidemic influenza. Am. J. Epidemiol. 173, 127–135 (2011).
6. A.I. Barreca, J.P. Shimshack, Absolute humidity, temperature, and influenza mortality: 30 years of county-level evidence from the United States. Am. J. Epidemiol. 176, S114–S122 (2012).
7. J.D. Silverman, N. Hupert, A. D. Washburne, Using influenza surveillance networks to estimate state-specific prevalence of SARS-CoV-2 in the United States. Sci. Transl. Med. 12, (2020).
8. Institute for Health Metrics, University of Washington, http://www.healthdata.org/news-release/ihme-models-show-second-wave-covid-19-beginning-september-15-us
9. Graff Zivin, J., M. Neidell, Environment, health, and human capital. J. Econ. Lit. 51, 689–730 (2013).
10. M. Auffhammer, S. M. Hsiang, W. Schlenker, A. Sobel, Using weather data and climate model output in economic analyses of climate change. Rev. Environ. Econ. Policy 7, 181–198 (2013).
11. Hsiang, S. Climate econometrics. Annu. Rev. Resour. Econ. 8, 43–75 (2016).
12. R. Xu, H. Rahmandad, M. Gupta, C. DiGennaro, N. Ghaffarzadegan, H. Amini, M. S. Jalali, The modest impact of weather and air pollution on COVID-19 transmission. doi: http://dx.doi.org/10.2139/ssrn.3593879 (5 May 2020).
13. Á. Briz-Redón, Á. Serrano-Aroca, The effect of climate on the spread of the COVID-19 pandemic: A review of findings, and statistical and modelling techniques. Progress in Physical Geography: Earth and Environment, 0309133320946302 (2020).
14. P. Jüni, M. Rothenbühler, P. Bobos, K. E. Thorpe, B. R. da Costa, D. N. Fisman, A. S. Slutsky, D. Gesink, Impact of climate and public health interventions on the COVID-19 pandemic: a prospective cohort study. Can. Med. Assoc. J. 192, E566–E573 (2020).
15. R.H.L. Pedrosa, The dynamics of COVID-19: weather, demographics and infection timeline. doi: https://doi.org/10.1101/2020.04.21.20074450 (10 May 2020).
16. T. Carleton, J. Cornetet, P. Huybers, K. Meng, J. Proctor, Evidence for Ultraviolet Radiation Decreasing COVID-19 Growth Rates: Global Estimates and Seasonal Implications. doi: http://dx.doi.org/10.2139/ssrn.3588601 (28 April 2020).
17. J. Wang, K. Tang, K. Feng, W. Lv, High Temperature and High Humidity Reduce the Transmission of COVID-19. doi: http://dx.doi.org/10.2139/ssrn.3551767 (9 March 2020).
18. M.M. Sajadi, P. Habibzadeh, A. Vintzileos, S. Shokouhi, F. Miralles-Wilhelm, A. Amoroso, Temperature, Humidity and Latitude Analysis to Predict Potential Spread and Seasonality for COVID-19. doi: http://dx.doi.org/10.2139/ssrn.3550308 (5 March 2020).
19. G.F. Ficetola, D. Rubolini, Climate affects global patterns of COVID-19 early outbreak dynamics. doi: https://doi.org/10.1101/2020.03.23.20040501 (27 March 2020).
20. S.M. Goldfeld, R. E. Quandt, The estimation of Cobb-Douglas type functions with multiplicative and additive errors. Int. Econ. Rev. 11, 251–257 (1970).
21. J.M. Wooldridge, Econometric analysis of cross section and panel data. (MIT Press, Cambridge, MA, 2010).
22. R.E. Baker, W. Yang, G. A. Vecchi, C. J. E. Metcalf, B. T. Grenfell, Susceptible supply limits the role of climate in the early SARS-CoV-2 pandemic. Science 369, 315–319 (2020).
23. U.S. Centers for Disease Control and Prevention. When is flu season? https://www.cdc.gov/flu/about/season/flu-season.htm, retrieved 30 May 2020.

Additional References

24. E. C. Schneider, Failing the test—the tragic data gap undermining the US pandemic response. N. Engl. J. Med. 383, 299–302. doi: 10.1056/NEJMp2014836 (2020).
25. M. Salerno, F. Sessa, A. Piscopo, A. Montana, M. Torrisi, F. Patanè, P. Murabito, G. L. Volti, C. Pomara, No autopsies on COVID-19 deaths: a missed opportunity and the lockdown of science. J. Clin. Med. 9, 1472 (2020).
26. D.M. Weinberger, J. Chen, T. Cohen, F.W. Crawford, F. Mostashari, D. Olson, V.E. Pitzer, N.G. Reich, M. Russi, L. Simonsen, A. Watkins, Estimation of excess deaths associated with the COVID-19 pandemic in the United States, March to May 2020. JAMA Intern. Med., doi:10.1001/jamainternmed.2020.3391 (2020).
27. U.S. Centers for Disease Control and Prevention. Planning scenarios. https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html, retrieved 30 June 2020.
28. C. Courtemanche, J. Garuccio, A. Le, J. Pinkston, A. Yelowitz, Strong social distancing measures in the United States reduced the COVID-19 growth rate: study evaluates the impact of social distancing measures on the growth rate of confirmed COVID-19 cases across the United States. Health Affairs 10, 1377 (2020).
29. A. Goolsbee, N. B. Luo, R. Nesbitt, C. Syverson, Lockdown Policies at the State and Local Level. doi: http://dx.doi.org/10.2139/ssrn.3682144 (20 Aug 2020).
30. R. Chetty, J. N. Friedman, N. Hendren, M. Stepner, How did COVID-19 and stabilization policies affect spending and employment? A new real-time economic tracker based on private sector data. doi: http://dx.doi.org/10.3386/w27431 (25 June 2020).

Posted November 05, 2020.

Subject Area

Epidemiology
