Data Availability
The database generated for this study consisted of 226 indicators for 3,141 counties (the complete set of indicators from the Centers for Disease Control and Prevention (CDC) PLACES, Environmental Protection Agency (EPA) EJSCREEN, and EPA AirToxScreen databases), integrated into a single dataframe in Python (version 3.9) using pandas (version 1.3.4).
Chronic Disease data: Health-related indicators for 3,141 US counties, including rates of chronic disease, participation in preventive services, and risk factors, were extracted from the Behavioral Risk Factor Surveillance System (BRFSS), as made available through the 2023 CDC PLACES database22 (Supplementary Table 1). From these datasets we identified 11 disease and health-related measures for analysis, based on the leading contributors to disability-adjusted life years (DALYs) in the United States23: arthritis, asthma, chronic obstructive pulmonary disease (COPD), cancer, coronary heart disease, depression, diabetes, hypertension, obesity, renal disease, and stroke. Stroke mortality data for adults aged 35 years or older for 2017-2019 were downloaded from the CDC Stroke Death Rates database24. Counties with high disease prevalence or high stroke mortality were defined as those with age-adjusted rates at or above the 70th percentile.
Pollution, SDOH, Demographic, and Geographical Data: Data for 9 pollution indicators and 7 census-tract-level social determinants of health (SDOH) / health equity measures were extracted from the EPA Environmental Justice (EJSCREEN) 2021 database25. Ambient air concentrations for 177 chemicals, reported at the census block group level (in µg/m³), were taken from the EPA's 2018 AirToxScreen database26. These block-group concentrations were aggregated to the county level by population-weighting each block group's exposure and summing the weighted values across the block groups in each county. Together, the 9 EJSCREEN indicators and 177 AirToxScreen concentrations yielded the 186 pollution measures examined in this study.
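The county-level aggregation of block-group air concentrations described above can be sketched in pandas as follows. This is a minimal illustration, not the study's code. It assumes each block group's weight is its share of the county population, so the summed weighted values form a population-weighted mean. The column names (`county_fips`, `population`, `benzene_ugm3`) and the function name are hypothetical.

```python
import pandas as pd


def county_population_weighted(df, value_col, pop_col="population",
                               county_col="county_fips"):
    """Aggregate block-group values to counties with population weights.

    Assumption: weight = block-group population / county population,
    so the county value is sum(value * population) / sum(population).
    """
    weighted = df[value_col] * df[pop_col]
    numerator = weighted.groupby(df[county_col]).sum()
    denominator = df[pop_col].groupby(df[county_col]).sum()
    # Counties with zero total population would yield NaN here.
    return (numerator / denominator).rename(value_col)


# Hypothetical block-group records for two counties.
block_groups = pd.DataFrame({
    "county_fips": ["01001", "01001", "01003"],
    "population": [1000, 3000, 2000],
    "benzene_ugm3": [0.2, 0.4, 0.5],
})

county_values = county_population_weighted(block_groups, "benzene_ugm3")
print(county_values)
```

In this toy input, the larger block group in county 01001 pulls the county value toward its own concentration. The same function could be applied column by column to each of the 177 chemicals.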
Geographical boundary information for counties, in GeoJSON format, was obtained from the US Census TIGER database27. The 9 EJSCREEN pollution indicators28 were particulate matter 2.5 (PM2.5; µg/m³), ozone (parts per billion), traffic proximity (vehicles per day / meters), lead paint exposure (% of housing units built before 1960), Superfund proximity (Superfund site count / km), Risk Management Plan (RMP) facility proximity (facility count / km), hazardous waste proximity (count of hazardous waste facilities within 5 km (or nearest beyond 5 km), each divided by distance in km), underground storage tanks (count of facilities (multiplied by a factor of 7.7) within a 1,500-foot buffered block group), and wastewater discharge (modeled toxic concentrations at stream segments within 500 meters, divided by distance in km) (Supplementary Table 1). The year of pollution exposure was selected to precede the year in which chronic disease rates were reported.
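The definition of high-prevalence (or high stroke-mortality) counties as those with age-adjusted rates at or above the 70th percentile can be illustrated with a short pandas sketch. The function name, the example rates, and the county identifiers are hypothetical. The percentile uses pandas' default linear interpolation, which is an assumption rather than a detail stated in the text.

```python
import pandas as pd


def flag_high_counties(rates, percentile=0.70):
    """Return a boolean Series marking counties at or above the percentile.

    Missing rates are ignored when computing the threshold and are
    flagged False, because NaN >= threshold evaluates to False.
    """
    threshold = rates.quantile(percentile)
    return rates >= threshold


# Hypothetical age-adjusted rates (%) for ten counties.
rates = pd.Series(
    [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0],
    index=[f"county_{i}" for i in range(10)],
)

high = flag_high_counties(rates)
print(rates[high])
```

With these ten evenly spaced rates the interpolated 70th percentile falls between 11.0 and 12.0, so the three counties with the highest rates are flagged.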