Abstract
Background Preliminary evidence has shown inequities in COVID-19-related cases and deaths in the US.
Objective We explored the emergence of spatial inequities in COVID-19 testing, positivity, incidence, and mortality in New York City, Philadelphia, and Chicago during the first six months of the pandemic.
Design Ecological, observational study at the zip code tabulation area (ZCTA) level from March to September 2020.
Setting Chicago, New York City and Philadelphia.
Participants All populated ZCTAs in the three cities.
Measures Outcomes were ZCTA-level COVID-19 testing, positivity, and incidence, measured monthly from pandemic onset through the end of September, and ZCTA-level COVID-19 mortality, measured cumulatively through the end of September. Predictors were the CDC social vulnerability index and its four domains, obtained from the 2014-2018 American Community Survey. We examined spatial clusters of COVID-19 outcomes using local Moran’s I and estimated associations using negative binomial models.
Results We found spatial clusters of high and low positivity, incidence and mortality, co-located with clusters of low and high social vulnerability. We also found evidence of spatial inequities in testing, positivity, incidence and mortality in the three cities. Specifically, neighborhoods with higher social vulnerability had lower testing rates and higher positivity ratios, incidence rates and mortality rates. Inequities in testing and incidence changed over time in the three cities, whereas inequities in positivity stayed consistent over time.
Limitations ZCTAs are imperfect and heterogeneous geographical units of analysis. We rely on surveillance data, which may be incomplete.
Conclusion We found spatial inequities in COVID-19 testing, positivity, incidence, and mortality in three large cities of the US.
Registration N/A
Funding source NIH (DP5OD26429) and RWJF (77644)
Introduction
As of October 20th, 2020, the COVID-19 pandemic had taken the lives of more than a million people worldwide, and deaths in the US had surpassed 200,000 (1). Cities across the globe have emerged as especially vulnerable to COVID-19. Cities are characterized by diverse populations and are home to pronounced differences in health by race and socioeconomic position, often referred to as health inequities because they are avoidable and unjust(2). The presence of large racial and ethnic differences in COVID-19 within US cities has already been documented. For example, in New York City, both Blacks and Hispanics have double the age-adjusted mortality rate of non-Hispanic whites(3); in Chicago, 50% of deaths have occurred in Blacks, who make up only 30% of the population(4); and in Philadelphia, age-specific incidence, hospitalization, and mortality rates for Blacks and Hispanics are 2-3 times higher than for non-Hispanic whites(5). These stark differences by race are consistent with racial health inequities in many health outcomes and likely reflect multiple interrelated processes linked to structural inequity, historical racist policies, and residential segregation(6-8).
US cities are characterized by strong residential segregation by both race/ethnicity and income, one of the most visible manifestations of structural racism(9). Residential segregation results in stark differences across neighborhoods in multiple factors that could be related to both the incidence and severity of COVID-19, including factors related to transmission (e.g. overcrowding, jobs that do not allow social distancing) and factors related to severity of disease (higher prevalence of chronic health conditions related to neighborhood environments, greater air pollution exposures, and limited access to quality health care)(6-8,10,11). Few studies have systematically characterized spatial inequities in COVID-19-related outcomes in cities over the course of the pandemic.
Characterizing social and spatial inequities in cities is critical to developing appropriate interventions and policies to prevent COVID-19 deaths in the future and mitigate economic and racial inequities. Yet it is rendered complex during an evolving pandemic because of the interrelated nature of access to testing and diagnosis and because the social patterning of the pandemic is likely to change as it advances through the population. We used data from three large US cities to (1) characterize spatial and social inequities in testing, positivity, incidence, and mortality, and (2) examine how the social patterning has evolved in different cities as the pandemic progressed through them.
Methods
Setting
We used data on total numbers of tests, confirmed cases, and deaths by zip code tabulation area (ZCTA) of residence from Chicago, New York City (NYC), and Philadelphia. For Chicago, we downloaded data from the Chicago Department of Public Health(12), including weekly cumulative data from the beginning of the epidemic through October 3rd, 2020. For NYC, we downloaded cumulative data made available daily by the NYC Department of Health and Mental Hygiene in their GitHub repository(13) from April 1st through October 1st, 2020. Data on deaths by ZCTA were only available from May 18th onwards. For Philadelphia, we downloaded data from the Philadelphia Department of Public Health(5) on April 24th, including all tests and confirmed cases prior to that date, by zip code and result date. From then on, we downloaded daily cumulative data made available in OpenDataPhilly, from April 24th through October 1st(5). Data on deaths by ZCTA were only available from May 22nd onwards.
A summary of the data availability timeline along with the evolution of the epidemic in each of the three cities is available in Figure 1.
Footnote: shaded areas correspond to dates of availability of monthly data. Cumulative data were available through the entire period (from onset to October 1st [NYC and Philadelphia] or October 3rd [Chicago], 2020).
Outcomes
Study outcomes included four indicators measuring testing as well as incidence and mortality from COVID-19: (1) testing per capita (total tests/population); (2) positivity ratio for tests (14) (confirmed cases/total tests); (3) incidence (confirmed cases/population); and (4) mortality (deaths/population). For testing, positivity, and incidence, we computed seven monthly values for March through the end of September, as well as a cumulative value through the end of the period (October 3rd in Chicago and October 1st in NYC and Philadelphia). For mortality we only computed rates cumulatively through the end of the period because data availability did not allow assessment of monthly measures across all three cities (see Figure 1).
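The four indicators defined above can be illustrated with a minimal sketch. The counts below are hypothetical, and expressing the population-based indicators per 100,000 residents is an assumption made here for readability, not a scale stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZctaCounts:
    """Counts for one ZCTA over one period (all values hypothetical)."""
    population: int
    tests: int
    cases: int
    deaths: int


def outcome_indicators(c: ZctaCounts, per: int = 100_000) -> dict:
    """Return the four study indicators for one ZCTA."""
    positivity: Optional[float] = c.cases / c.tests if c.tests else None
    return {
        # Total tests / population, scaled per `per` residents (assumed scale).
        "testing_per_capita": per * c.tests / c.population,
        # Confirmed cases / total tests; undefined when no tests were done.
        "positivity": positivity,
        # Confirmed cases / population.
        "incidence": per * c.cases / c.population,
        # Deaths / population.
        "mortality": per * c.deaths / c.population,
    }


# Hypothetical ZCTA: 40,000 residents, 8,000 tests, 1,000 cases, 40 deaths.
example = ZctaCounts(population=40_000, tests=8_000, cases=1_000, deaths=40)
indicators = outcome_indicators(example)
print(indicators)
```

Because positivity uses tests rather than population as its denominator, it can be high in an area with low testing even when recorded incidence is modest.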
Predictors
To obtain a summary of social conditions in each area of residence we used the CDC’s 2018 Social Vulnerability Index (SVI)(15). Recent research has found the SVI is predictive of COVID-19 incidence and mortality at the county level(16). The SVI reflects the community’s ability to prevent human suffering and financial loss in the event of disaster, including disease outbreaks(15). It includes four domains: socioeconomic status, household composition & disability, minority status & language, and housing type & transportation, along with a summary score combining all four domains. The four domains and summary score were calculated by the CDC at the census tract level using data from the 2014-2018 American Community Survey. To aggregate the SVI to the ZCTA level, we used the Census Bureau’s ZCTA to Census Tract Relationship File, and computed a weighted mean of the SVI by ZCTA, using the population of the census tract in the ZCTA as the weight. A higher value of the SVI or of its component scores signifies higher vulnerability, either overall or in the corresponding domain. For example, a higher vulnerability in the “socioeconomic status” domain reflects a higher proportion of people living in poverty, unemployed, with lower income or without a high school diploma. A higher vulnerability in the “housing type & transportation” domain reflects a higher number of people living in multi-unit structures, mobile homes, in crowded situations, without a vehicle, or living in group quarters. See Appendix 2 for more details.
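The tract-to-ZCTA aggregation described above amounts to a population-weighted mean. The following is a minimal sketch under stated assumptions: the row layout (ZCTA, tract, tract population falling inside the ZCTA, tract SVI) and all values are hypothetical stand-ins for what a relationship file would provide.

```python
from collections import defaultdict


def aggregate_svi_to_zcta(rows):
    """Population-weighted mean SVI per ZCTA.

    rows: iterable of (zcta, tract, tract_pop_in_zcta, tract_svi) tuples,
    one per tract-ZCTA overlap (hypothetical layout).
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for zcta, _tract, pop_in_zcta, svi in rows:
        weighted_sum[zcta] += pop_in_zcta * svi
        total_weight[zcta] += pop_in_zcta
    # Skip ZCTAs whose overlapping population is zero (mean is undefined).
    return {z: weighted_sum[z] / total_weight[z]
            for z in weighted_sum if total_weight[z] > 0}


# Hypothetical overlaps: ZCTA "A" draws on two tracts, ZCTA "B" on one.
rows = [
    ("A", "t1", 3000, 0.80),
    ("A", "t2", 1000, 0.40),
    ("B", "t2", 2000, 0.40),
]
zcta_svi = aggregate_svi_to_zcta(rows)
print(zcta_svi)
```

In this toy example ZCTA "A" ends up closer to the SVI of tract t1, because three quarters of its population lives in that tract.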
Analysis
We conducted our analysis in three steps. First, we explored the spatial distribution of each of the five predictors (four domains and summary score) and the four outcomes (testing, positivity, incidence and mortality) cumulative through the end of September, using choropleth maps (see Appendix Figures 1 and 2). To explore whether there was spatial clustering, we computed global Moran’s I(17). To show the location of spatial clusters, we computed the local indicator of spatial association (LISA) or local Moran’s I(18) and displayed clusters with a p-value<0.05.
Footnote: clusters calculated using local Moran’s I statistic; clusters shown have a p-value <0.05.
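To make the clustering statistic concrete, the sketch below computes global Moran’s I for a toy set of areas. It assumes binary contiguity weights (1 for neighbors, 0 otherwise), which may differ from the weighting used in the study, and it omits the permutation-based p-values; the area names and values are hypothetical.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary contiguity weights.

    values: dict area -> observed value.
    neighbors: dict area -> set of neighboring areas (symmetric).
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {a: v - mean for a, v in values.items()}
    # Cross-products of deviations over all ordered neighbor pairs (w_ij = 1).
    cross = sum(dev[i] * dev[j] for i in neighbors for j in neighbors[i])
    s0 = sum(len(js) for js in neighbors.values())  # sum of all weights
    denom = sum(d * d for d in dev.values())
    return (n / s0) * cross / denom


# Four hypothetical areas arranged in a line: a - b - c - d.
line = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

# A smooth gradient (similar values sit next to each other) gives I > 0.
i_clustered = morans_i({"a": 1, "b": 2, "c": 3, "d": 4}, line)
# An alternating pattern (dissimilar neighbors) gives I < 0.
i_alternating = morans_i({"a": 1, "b": 4, "c": 1, "d": 4}, line)
print(i_clustered, i_alternating)
```

Positive values indicate that neighboring areas tend to have similar outcomes, which is the pattern the cluster maps summarize locally.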
Second, we examined the relations between SVI and each of the outcomes through the end of the study period using scatterplots and smoothed loess lines. To estimate the strength of the association between each predictor and outcome, we first considered a Poisson model. However, after exploring the distribution of the outcomes (see Appendix Table 1), and after checking for overdispersion in Poisson models using the approach by Gelman and Hill(19) (see Appendix Table 2), we opted for a negative binomial model. Negative binomial models relax the assumption of equality between the mean and variance, allowing for overdispersion. We fitted a separate model for each city and included the five predictors (four domains and summary score) in separate models. To make coefficients comparable and aid in the interpretation of the results, we standardized all predictors by subtracting the mean and dividing by the standard deviation (SD) for each city. To account for the role of age in determining testing practices and its causal role in mortality, we adjusted all models for the percentage of people aged 65 or above in the ZCTA.
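Two of the steps above lend themselves to a short illustration: standardizing a predictor, and checking a Poisson fit for overdispersion through the sum of squared standardized residuals divided by the residual degrees of freedom. This is a minimal sketch, not the study code: the observed and fitted counts are hypothetical, the fitted values are taken as given rather than estimated, and the use of the sample SD is an assumption.

```python
import statistics


def standardize(xs):
    """Center on the mean and scale by the sample SD (assumed here)."""
    mean, sd = statistics.mean(xs), statistics.stdev(xs)
    return [(x - mean) / sd for x in xs]


def overdispersion_ratio(observed, fitted, n_params):
    """Sum of squared Pearson residuals over residual degrees of freedom.

    Under a Poisson model the variance equals the mean, so each residual is
    (y - mu) / sqrt(mu) and the ratio should be close to 1.
    """
    z2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return z2 / (len(observed) - n_params)


# Hypothetical ZCTA counts and Poisson fitted values (2 model parameters).
observed = [12, 30, 7, 55, 21, 90]
fitted = [20.0, 25.0, 15.0, 40.0, 30.0, 60.0]
ratio = overdispersion_ratio(observed, fitted, n_params=2)
print(round(ratio, 2))

# Hypothetical SVI scores for the ZCTAs of one city, standardized.
svi_z = standardize([0.2, 0.4, 0.5, 0.7, 0.9])
print([round(z, 2) for z in svi_z])
```

A ratio well above 1, as in this toy example, signals more variability than a Poisson model allows and motivates the negative binomial alternative.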
Relative rates of cumulative testing, positivity and incidence associated with zip code social vulnerability index and its components in Chicago, NYC, and Philadelphia.
To account for spatial autocorrelation of the outcomes, we fitted a Besag-York-Mollié (BYM)(20) conditional autoregressive model, including a structured and an unstructured ZCTA random effect, both following an intrinsic Gaussian Markov random field (IGMRF)(21). The structured spatial random effect takes into consideration that ZCTAs are more similar to neighboring ZCTAs than to those further away. We defined neighboring ZCTAs as regions with contiguous boundaries, that is, sharing one or more boundary points. We fitted this model using integrated nested Laplace approximations (INLA)(22), a method approximating Bayesian inference(23). While this approach is approximation-based, it has been shown to be accurate while minimizing computational time(21-23). Details on model specification are provided in the Appendix.
Third, to explore temporal trends in inequities, we used monthly data on testing, positivity and incidence for each ZCTA. For this, we fitted a spatio-temporal model that takes into consideration, simultaneously, the spatial correlation across ZCTAs and the temporal correlation within them. We tested alternative parametrizations of the temporal part of the model(24,25), and selected the one with the lowest deviance information criterion (DIC). Details on these parametrizations are provided in the Appendix. Results are shown as rate ratios for each outcome and month associated with a one SD higher value of the SVI, separately for each city.
All analyses were conducted using R v4.0.2(26). Spatial analyses were conducted using the R package INLA(22,27). More details on data management, the social vulnerability index, and the models are available in Appendices 1, 2, and 3, respectively. Code for replication is available at: https://github.com/usamabilal/COVID_Disparities
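The neighbor definition used for the structured random effect (areas sharing one or more boundary points) can be sketched as follows. This is a simplified illustration that treats two areas as neighbors when their boundaries have at least one identical vertex; real boundary files can touch along an edge without sharing identical vertices, so a GIS library would normally be used. The polygons are hypothetical.

```python
from itertools import combinations


def contiguity_neighbors(polygons):
    """Neighbors = areas sharing at least one boundary vertex.

    polygons: dict area -> list of (x, y) boundary vertices.
    Returns a dict area -> set of neighboring areas.
    """
    vertex_sets = {a: set(pts) for a, pts in polygons.items()}
    neighbors = {a: set() for a in polygons}
    for a, b in combinations(polygons, 2):
        if vertex_sets[a] & vertex_sets[b]:  # any shared boundary point
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


# Hypothetical unit squares: A and B share an edge, B and C touch only at
# the corner (2, 1), and D is isolated.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(2, 1), (3, 1), (3, 2), (2, 2)],
    "D": [(5, 5), (6, 5), (6, 6), (5, 6)],
}
adjacency = contiguity_neighbors(polygons)
print(adjacency)
```

Under this definition a single shared corner is enough to make two areas neighbors, and an area with no neighbors (such as D here) contributes nothing to the structured spatial effect.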
This study was funded with support from the NIH under grant DP5OD26429 and the Robert Wood Johnson Foundation under grant RWJF 77644. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Results
We included a total of 58, 177, and 46 zip codes in Chicago, NYC and Philadelphia, respectively. From the beginning of the outbreak up to the latest available date (October 3rd in Chicago and October 1st in NYC and Philadelphia), a total of 674,929, 2,383,919, and 411,559 tests had been conducted in Chicago, New York City and Philadelphia, respectively. There were 81,657, 233,397, and 37,307 confirmed cases, and 2,974, 19,149, and 1,803 COVID-19 deaths, respectively. As shown in Figure 1, most of the cases in NYC and Philadelphia occurred in March and April, while most cases in Chicago occurred in April and May. Deaths followed a similar pattern, lagging behind cases.
We found that testing, positivity, incidence, and mortality were spatially clustered in the three cities (see Appendix Table 3), with the exception of mortality in Philadelphia, for which we did not find evidence of significant clustering (Moran’s I of 0.062, p=0.143 for the null hypothesis of no spatial clustering of mortality). These clustering patterns held after taking into consideration the spatial distribution of the SVI, with the exception of incidence in Philadelphia, which did not show significant spatial clustering after controlling for the SVI (Appendix Table 4). Figure 2 shows the spatial patterning of clusters of testing, positivity, incidence and mortality in the three cities. Areas of the West and Southwest sides of Chicago had clusters of high positivity, incidence and mortality. Conversely, the Central and North sides of Chicago had clusters of low positivity, incidence and mortality, along with high testing. In NYC, there were clusters of high positivity, incidence and mortality in the Bronx, Brooklyn and Queens. There was also a cluster of high testing and low positivity, incidence and mortality in Manhattan and the adjacent areas of Brooklyn and Queens. In Philadelphia, we found clusters of high testing and low positivity and incidence in Center City. Most of North and Northeast Philadelphia was contained in a cluster of low testing and high positivity. Generally, clusters of high positivity and incidence were spatially co-located with clusters of high social vulnerability (see Appendix Figure 3).
Footnote: solid lines are loess smoothers for each city separately
We visually explored the relationship between the SVI and cumulative testing rates, positivity ratios, incidence rates and mortality rates for the three cities (Figure 3). Testing rates were slightly lower in areas of higher vulnerability in Chicago, NYC and Philadelphia. Positivity, incidence and mortality all increased monotonically with increasing social vulnerability in Chicago. A similar pattern was observed in NYC, but the increase was less marked in ZCTAs above mean vulnerability. Similar patterns were observed in Philadelphia for positivity and incidence, but the SVI was not consistently associated with mortality. Appendix Figures 5-8 show the relationship between the four social vulnerability domains (socioeconomic status, household composition & disability, minority status & language, and housing type & transportation) and the four COVID-19 outcomes. Similar patterns were observed across the domains of socioeconomic status, household composition & disability, and minority status & language. On the other hand, higher vulnerability in the housing type & transportation domain showed weaker or opposite patterns compared to the other domains or the overall SVI in the three cities.
Table 1 shows rate ratios of each outcome (cumulatively across the full study period) associated with a one SD higher value of the SVI and its four domains. A higher social vulnerability was associated with 13%, 3%, and 9% lower testing rates in the three cities, although confidence intervals crossed the null. Associations of SVI with positivity, incidence, and mortality (adjusted for age) were similar in the three cities. A 1 SD higher SVI was associated with 40%, 37% and 40% higher positivity in Chicago, NYC and Philadelphia, 22%, 33%, and 27% higher incidence, and 44%, 56%, and 58% higher mortality. For the three cities, we found that the social vulnerability domains of socioeconomic status, household composition & disability, and minority status & language were associated with the study outcomes similarly to the overall index. However, weaker or even opposite associations were observed for the housing type & transportation component.
Figure 4 shows associations of the SVI with testing, positivity and incidence over the course of the pandemic, using the best fitting time parametrization (see Appendix Figures 9, 10 and 11 for the results using other specifications). The associations of the SVI with testing changed over time. Testing rates were not associated with the SVI in the first months of the pandemic in Chicago, but an inverse association (higher SVI, less testing) emerged in June and strengthened over time. A similar pattern was observed in NYC, although differences over time in the association of SVI with testing were smaller than in Chicago. In Philadelphia, a higher SVI was associated with lower testing through the entire period. Higher SVI was consistently associated with higher positivity over the period studied in the three cities. Associations of SVI with incidence followed slightly different patterns by city. In Chicago, the association of higher SVI with higher incidence was strongest at early stages of the epidemic and weakened over time, while in both NYC and Philadelphia the same associations became stronger over the initial period of the pandemic, peaking around May/June, and then became weaker, although a higher SVI remained associated with a higher incidence in later months in both cities.
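The percentages reported above follow from the rate ratios by simple arithmetic, which the sketch below makes explicit. In a log-link count model such as the negative binomial, a coefficient is exponentiated to obtain a rate ratio, and (rate ratio - 1) x 100 gives the percent difference per 1 SD higher SVI. The rate ratios used here (1.40 and 0.87) are back-calculated from the percentages in the text for illustration only, and the helper names are hypothetical.

```python
import math


def rate_ratio_from_coef(beta):
    """Exponentiate a log-link coefficient to obtain a rate ratio."""
    return math.exp(beta)


def percent_difference(rate_ratio):
    """Percent difference in the outcome implied by a rate ratio."""
    return (rate_ratio - 1.0) * 100.0


# A rate ratio of 1.40 per 1 SD higher SVI corresponds to 40% higher
# positivity; the coefficient on the log scale is log(1.40).
beta_positivity = math.log(1.40)
pct_positivity = percent_difference(rate_ratio_from_coef(beta_positivity))

# A rate ratio of 0.87 corresponds to 13% lower testing.
pct_testing = percent_difference(0.87)

# Rate ratios multiply across SDs: 2 SD higher SVI implies 1.40 ** 2.
pct_two_sd = percent_difference(1.40 ** 2)

print(round(pct_positivity), round(pct_testing), round(pct_two_sd))
```

Because effects are multiplicative on the rate scale, a 2 SD difference in SVI corresponds to roughly 96% higher positivity in this example rather than 80%.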
Footnote: rate ratios (and 95% credible intervals) for each outcome associated with a 1 SD higher value of the summary index
Discussion
We documented large spatial inequities in COVID-19 through the end of September 2020 in three large US cities, Chicago, NYC and Philadelphia, with more vulnerable neighborhoods having higher positivity, incidence and mortality, and lower testing rates. We also found clusters of high and low positivity, incidence, and mortality, co-located with areas of high and low vulnerability, respectively. These associations changed over time, with initial inequities in testing (less testing in higher vulnerability neighborhoods) followed by a reversal (more testing in higher vulnerability neighborhoods) during April and May, and a reappearance of inequities from June onwards. Inequities in positivity were consistently wide over the period, and even strengthened over time in NYC and Philadelphia. Inequities in incidence existed throughout the period, but tended to be stronger during the periods in which a higher SVI was associated with more testing. Notably, we observed very strong inequities in cumulative mortality, with mortality increasing by ∼50% for each SD higher SVI.
Findings from this study are consistent with other studies that have examined inequities in COVID-19 incidence by ZCTA in other cities. For example, Chen & Krieger reported a monotonic increase in confirmed cases in ZCTAs of Illinois and NYC with decreasing levels of area-level socioeconomic status(10). Analysis at the county level by the same authors showed similar gradients(10), consistent with other research(28,29), including a study using the SVI as a predictor at the county level(16). We found that, within these large cities, clusters of high and low positivity and incidence were mostly co-located with clusters of high and low vulnerability, respectively. These include areas of concentrated poverty and with a history of extreme racial segregation(7), including West and North Philadelphia, the West Side of Chicago, and the Bronx in NYC. Notably, Chicago, Philadelphia, and NYC are among the top 10 most segregated cities in the country(30).
As others have noted(6-8), potential explanations for neighborhood inequities in incidence may include differential exposure to the virus as well as differential susceptibility to infection. Residents of higher SVI neighborhoods likely have higher exposure to the virus because of the types of jobs they have (such as essential workers within the healthcare, personal care, production, or service industries(31), or personal care or service occupations(32)), lack of telecommuting options(33), dependence on mass transit use(34), and because of overcrowding within households(35). Whether there are factors associated with differential susceptibility to infection is still unclear, but prior research on respiratory viruses has documented that stress linked to disadvantage may increase the likelihood of developing disease after exposure(36,37).
We also found inequities in testing at the beginning stages of the epidemic, by which vulnerable neighborhoods had less per capita testing, although these inequities were reduced as the epidemic progressed and re-emerged shortly afterwards. Barriers to testing when resources are constrained can include unequal location of testing sites(38), lack of vehicle ownership(39), lack of health insurance(40), lack of a usual source of care for referrals(41), and potential mistrust of the medical system(42).
It is possible that the social patterning of infection has been changing over time as the pandemic progressed, beginning in wealthier areas (possibly linked to business travel(43)) and subsequently shifting to more deprived areas. The greater testing in less vulnerable areas early in the epidemic could in part reflect this, given that testing was initially strongly linked to symptoms. However, the strong association of positivity with disadvantage even early in the pandemic suggests that access to testing was, at least initially, lower than what was needed, based on high levels of transmission likely occurring in more vulnerable areas. As a result, incidence rates in more deprived neighborhoods early in the pandemic could be underestimated. We also observed that in all three cities the inequities in incidence over time mirrored the inequities in testing over time: in Chicago the associations weakened over time, while in NYC and Philadelphia inequities became stronger over time and then weakened. This highlights how our ability to adequately characterize inequities in incidence necessarily requires equal access to testing.
A major finding was the substantially higher mortality rate in neighborhoods with a higher SVI. Vulnerability to severe disease and death from COVID-19 is related to the presence of previous comorbidities, such as cardiovascular disease, diabetes and hypertension(44). Since these comorbidities are more prevalent in people of lower socioeconomic status and racial/ethnic minorities(45,46), it is expected that, at equal levels of exposure, these groups will suffer more severe consequences from COVID-19. Other factors may also affect the severity of disease and the case-fatality rates, including access to and quality of health care, co-occurring social factors (e.g. stressors) and environmental factors (e.g. air pollution). In fact, a study with 17 million records in the UK has shown that, even after adjusting for a number of comorbidities, racial/ethnic minorities and people living in socioeconomically deprived areas had a higher risk of death after infection(44). However, two recent studies using data from Michigan and the Veterans Affairs health system suggest that inequities in mortality are driven by differences in infection rates, rather than differential vulnerability(47,48). In our study, we found that the relative risks of mortality associated with higher zip code social vulnerability were slightly higher than those observed for incidence, but underestimation of incidence in higher SVI neighborhoods (because of lower testing) could partly explain this difference.
Last, we found that the domain of social vulnerability due to housing type & transportation showed inconsistent associations as compared to the other domains or the overall summary social vulnerability index. This domain includes (see Appendix 2) variables detailing the proportion of the population living in multi-unit structures, mobile homes, group quarters, in crowded situations, or without a vehicle. Whether these variables can proxy this type of vulnerability in large metropolitan areas is unclear.
An important limitation of our study is the likely underestimation of inequities in incidence due to the lack of systematic widespread testing. We also lack individual-level data, and rely on aggregated surveillance data. In addition, zip codes are very imperfect proxies for neighborhoods. Heterogeneity in the sociodemographic composition within zip codes may have led to underestimation of inequities. However, zip codes represent easy-to-collect data in the middle of a public health emergency when more detailed geocoding is less available. Moreover, because of data limitations we were unable to investigate how inequities in mortality changed over time, and especially how they manifested during the peak of the epidemic in NYC and Philadelphia.
Conclusion
We found large spatial inequities in COVID-19 testing, positivity, incidence and mortality in three large cities of the US and strong associations of COVID-19 incidence and mortality with higher neighborhood social vulnerability. These within-city neighborhood differences in COVID-19 outcomes emerge from differences across neighborhoods generated and reinforced by residential segregation linked to income inequality and structural racism(49-51), coupled with decades of systematic disinvestment in segregated neighborhoods(7-10,50,52).
Addressing these structural factors linked to income inequality, racism and segregation will be fundamental to minimizing the toll of the pandemic but also to promoting population health and health equity across many other health conditions.
Reproducible research statement
Protocol
Not available
Statistical Code
Available at https://github.com/usamabilal/COVID_Disparities
Data
Available at https://github.com/usamabilal/COVID_Disparities
Funding
UB was supported by the Office of the Director of the National Institutes of Health under award number DP5OD26429. UB, SB, and ADR were also supported by the Robert Wood Johnson Foundation under award number 77644. The funding sources had no role in the analysis, writing or decision to submit the manuscript.
Conflicts of interest
The authors declare no conflict of interest.
Preprint server
This article was published and updated on medRxiv on October 23rd, 2020: https://doi.org/10.1101/2020.05.01.20087833
Acknowledgements
The authors want to acknowledge help from Alyssa Furukawa on data collection, and Dr. Rene Najera for useful code to calculate spatial clustering.
Footnotes
Updated using data through September 2020, added new statistical and spatial analysis, added new author (Dr. Loni Tabb).