Abstract
Electronic health records (EHRs) provide rich data for diverse populations but often lack information on social and environmental determinants of health (SEDH) that are important for the study of complex conditions such as asthma, a chronic inflammatory lung disease. We integrated EHR data with seven SEDH datasets to conduct a retrospective cohort study of 6,656 adults with asthma. Using Penn Medicine encounter data from January 1, 2017 to December 31, 2020, we identified individual-level and spatially varying factors associated with asthma exacerbations. Black race and prescription of an inhaled corticosteroid were strong risk factors for asthma exacerbations according to a logistic regression model of individual-level risk. A spatial generalized additive model (GAM) identified a hotspot of increased exacerbation risk (mean OR = 1.41, SD = 0.14, p < 0.001). Including EHR-derived variables in the model attenuated the spatial variance in exacerbation odds by 34.0%, and additionally adjusting for the SEDH variables attenuated it by 66.9%. Additional spatial GAMs, each adjusted for one variable at a time, revealed that neighborhood deprivation (OR = 1.05, 95% CI: 1.03, 1.07), Black race (OR = 1.66, 95% CI: 1.44, 1.91), and Medicaid health insurance (OR = 1.30, 95% CI: 1.15, 1.46) contributed most to the spatial variation in exacerbation odds. In spatial GAMs stratified by race, adjusting for neighborhood deprivation and health insurance type did not change the spatial distribution of exacerbation odds. Thus, while some EHR-derived and SEDH variables explained a large proportion of the spatial variance in asthma exacerbations across Philadelphia, a more detailed understanding of SEDH variables that vary by race is necessary to address asthma disparities.
More broadly, our findings demonstrate how integration of information on SEDH with EHR data can improve understanding of the combination of risk factors that contribute to complex diseases.
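As a reading aid for the effect sizes quoted in the abstract, the following minimal Python sketch shows how an odds ratio and its 95% confidence interval relate on the log-odds scale, and one common way to express a percent attenuation in variance. It is not the authors' analysis code. It assumes the intervals are Wald-type (symmetric on the log scale), and the attenuation formula and all function names are illustrative assumptions, since the abstract does not state how either quantity was computed.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def wald_se_from_ci(lower, upper, z=Z_95):
    """Log-odds standard error implied by a Wald CI given on the OR scale.

    Assumption: the interval is symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_ci(odds_ratio, se, z=Z_95):
    """Wald confidence interval on the OR scale from an OR and a log-odds SE."""
    beta = math.log(odds_ratio)
    return math.exp(beta - z * se), math.exp(beta + z * se)


def percent_attenuation(var_unadjusted, var_adjusted):
    """Percent reduction in variance after adjustment.

    Assumption: attenuation is the relative drop from the unadjusted model;
    the abstract does not spell out the exact definition used.
    """
    return 100.0 * (var_unadjusted - var_adjusted) / var_unadjusted


# Reported in the abstract for Black race: OR = 1.66, 95% CI: 1.44, 1.91.
se = wald_se_from_ci(1.44, 1.91)
low, high = wald_ci(1.66, se)
print(f"implied SE on the log-odds scale: {se:.3f}")
print(f"reconstructed 95% CI: ({low:.2f}, {high:.2f})")
```

Under the Wald assumption, the reconstructed interval agrees with the reported one to two decimal places. This suggests the reported estimate and interval are consistent with a symmetric interval on the log scale, but it does not confirm which method the authors used.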
Author summary
Electronic health records constitute an important source of data for understanding the health of large and diverse real-world populations; however, they do not routinely capture socioeconomic and environmental factors known to affect health outcomes. We show how electronic health record data can be augmented, using patients’ residential addresses, to include individual measures of air pollution exposure, neighborhood socioeconomic status, and the natural and built environment. We apply this approach to study asthma exacerbations, episodes of worsening disease that remain a major public health challenge in the United States. We found that at the individual patient level, Black race and prescription of an inhaled corticosteroid were the factors most strongly associated with asthma exacerbations. In contrast, neighborhood deprivation, race, and health insurance type accounted for the most spatial variation in exacerbation risk across Philadelphia. Our findings provide insight into factors that contribute to asthma disparities in our region and present a framework for future efforts to expand the scope of electronic health record data.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Research reported in this publication was supported by the National Institutes of Health (NIH) National Heart, Lung, and Blood Institute (NHLBI, https://www.nhlbi.nih.gov/) under Award Numbers R01HL162354 (BEH and RAH), R01HL143364 (AJA), and R03HL171424 (GEW) and the National Institute of Environmental Health Sciences (NIEHS, https://www.niehs.nih.gov) under Award Numbers P30ES013508 (BEH) and T32ES019851 (BEH). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funding agencies.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The University of Pennsylvania Institutional Review Board (IRB) gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Updated formatting of some sections.
Data Availability
The social and environmental determinants of health datasets that support the findings of this study are publicly available in Sensor-based Analysis of Pollution in the Philadelphia Region with Information on Neighborhoods and the Environment (SAPPHIRINE), offered as a web application (http://sapphirine.org) and R package (https://github.com/HimesGroup/sapphirine). Because of ethical and legal considerations, including that the data were not collected with informed consent, the electronic health record (EHR) data used in this study cannot be shared widely. To request access to EHR data at the Penn Medicine hospital system, contact the PennDnA office (https://www.med.upenn.edu/penndna/). The EHR data cleaning methodology is provided as supplementary files to support reproducibility, even though the data themselves cannot be shared.