ABSTRACT
IMPORTANCE Rural communities are among the most underserved and resource-scarce populations in the United States (US), yet there are limited data on COVID-19 mortality in rural America. Furthermore, rural data are rarely centralized, precluding comparability across urban and rural regions.
OBJECTIVE The purpose of this study is to assess hospitalization rates and all-cause inpatient mortality among persons with definitive COVID-19 diagnoses residing in rural and urban areas.
DESIGN, SETTINGS, AND PARTICIPANTS This retrospective cohort study from the National COVID Cohort Collaborative (N3C) examines a cohort of 573,018 patients from 27 US hospital systems presenting with SARS-CoV-2 infection between January 2020 and March 2021, of whom 117,897 were hospitalized. A sample of 450,725 hospitalized persons without COVID-19 diagnoses was identified for comparison.
EXPOSURES ZIP Codes provided by source hospital systems were classified by urban-rural gradient through a crosswalk to the US Department of Agriculture Rural-Urban Commuting Area Codes.
MAIN OUTCOMES AND MEASURES Primary outcomes were hospitalization and all-cause mortality among hospitalized patients. Kaplan-Meier analysis and mixed effects logistic regression were used to estimate 30-day survival in hospitalized patients and associations between rurality, hospitalization, and inpatient mortality while controlling for major risk factors.
RESULTS Rural patients were more likely to be older, white, have higher body mass index, and diagnosed with SARS-CoV-2 later in the pandemic compared with their urban counterparts. Rural compared with urban inhabitants had higher rates of hospitalization (23% vs. 19%) and all-cause mortality among hospitalized patients (16% vs. 11%). After adjustment for demographic and baseline differences, rural residents (both urban adjacent and non-adjacent) with COVID-19 were more likely to be hospitalized (Adjusted Odds Ratio (AOR) 1.41, 95% Confidence Interval (CI), 1.37-1.45 and AOR 1.42, CI 1.35-1.50) and to die or be transferred to hospice (AOR 1.62, CI 1.30-1.49 and 1.38, CI 1.30-1.49), respectively. Similar differences in mortality were noted for hospitalized patients without SARS-CoV-2 infection.
CONCLUSIONS Hospitalization and inpatient mortality are higher among rural compared with urban persons with COVID-19, even after adjusting for several factors, including age and comorbidities. Further research is needed to understand the factors that drive health disparities in rural populations.
QUESTION Do mortality and hospitalization rates among rural SARS-CoV-2 infected patients in the United States differ from that of their urban counterparts?
FINDINGS Hospitalization and mortality rates were significantly greater among rural compared with urban patients, even after adjusting for rural-urban differences including age and comorbidities. Similarly, in persons without SARS-CoV-2 infection, mortality rates were also higher among rural patients.
MEANING These findings demonstrate a pervasive rural-urban discrepancy in all-cause inpatient mortality among rural individuals in the U.S. infected with SARS-COV-2 as well as among those who were uninfected. Further research is needed to understand factors driving health disparities among rural populations.
Introduction
The novel coronavirus (SARS-CoV-2) was the third leading cause of death in the United States (US) in 20201 and is responsible for more than 650,000 deaths to date.2 During the initial four months of the US SARS-CoV-2 epidemic, cases were most concentrated in urban areas. However, by late 2020, rural communities experienced a surge in SARS-CoV-2 infections with some of the highest case rates in the nation.3 Nonmetropolitan areas constitute 97% of the US land area, with approximately 20% of the population residing in rural areas.4 Rural compared with urban inhabitants are older, less likely to engage in behaviors to prevent SARS-CoV-2 infection,5 and have a higher prevalence of comorbidities (e.g., obesity) associated with more severe COVID-19 (C19) and death.6 They have also experienced a greater disparity in over life expectancy over the last fifty years,7 which is likely linked to reduction rural providers8 and services that necessitate an increase in transfers to distant acute care facilities.9 Rural residents also face various social, economic, and environmental factors distinct from urban communities.10
To better understand potential drivers of C19 outcomes in rural America, we assessed hospitalization and mortality using the National COVID Cohort Collaborative (N3C), a National Institutes of Health-supported data enclave containing electronic health record information on more than 6 million persons tested for SARS-CoV-2 across 55 US sites and more than 2 million patients with a definitive diagnosis of SARS-CoV-2 infection. To our knowledge, this is the largest cohort of C19 cases in North America using data at the patient-level. Previous large-scale studies have been restricted to single states11 or utilized public health reporting systems.12 While the relationship between rural and urban hospitalization and mortality has been studied for chronic conditions, limited research has evaluated SARS-CoV-2 infected rural-urban discrepancies.
Methods
This retrospective cohort study received Institutional Review Board approval from each investigator’s institution and was reviewed and approved by the N3C Data Access Committee. Our study cohort includes patients diagnosed between January 1, 2020, and March 31, 2021, and a demographically matched (2:1 SARS-CoV-2 uninfected-to-infected matching done at the site level as part of the N3C ingestion process based on age, gender, race, and ethnicity)13 comparison group of SARS-CoV-2 uninfected persons. This study followed the Enhancing the Quality and Transparency of Health Research (EQUATOR) reporting guidelines, Reporting of Studies Conducted Using Observational Routinely Collected Health Data (RECORD).14
N3C Data Enclave
N3C has broad inclusion criteria, harmonizing data from 55 sites across the US.15 N3C collects longitudinal Electronic Health Record (EHR) or Health Information Exchange (HIE) data (with a 2-year “lookback” period to January 1, 2018) on all patients with a C19 diagnostic code (22% of all patients) or a positive SARS-CoV-2 polymerase chain reaction (PCR) or antigen test (78% of all patients) as well as uninfected patients serving as controls. Source system C19 testing protocols are mapped to standard terminologies for labs (LOINC) and conditions (ICD-10 CM and SNOMED CT) by the N3C Data Ingestion and Harmonization Workstream, which maintains a computable phenotype for defining presence of C19.16 To capture patients during the early stages of the pandemic (before 5/1/2020), patients with two weak diagnostic codes (such as ICD10 J80* Acute Respiratory Distress Syndrome and R43.0 Anosmia) probabilistic of C19 are also included.17
Cohort Identification
Rural and urban categories were identified by 5-digit ZIP Codes that were then mapped to the 2010 Rural-Urban Continuum Codes (RUCA), distinguishing by population density, degree of urbanization, and adjacency to metropolitan areas.18 For purposes of this study, three categories are defined: urban, urban-adjacent rural (UAR), and nonurban-adjacent rural (NAR).19, 20 This classification has been commonly used to attribute rurality based on census tract or ZIP Code.21, 22 To validate representativeness of the cohort population with the overall US population, we compared the population percentages for each category in N3C with the US population using public datasets.2,23
N3C data partners are contributing institutions that encompass multiple providers and potentially multiple care sites. We developed and utilized a data robustness screening matrix to determine minimum fact reporting per patient across key domains for each data partner. This follows a similar approach used by the four source data models that all rely on data quality dashboards to enhance site reporting for inclusion into network studies: Observational Medical Outcomes Partnership (OMOP)24, Accrual to Clinical Trials (ACT)25, TriNetX26, and Patient-Centered Clinical Research Network (PCORnet).27
Where possible, we categorically excluded data providers rather than individual participants based on minimum data reporting requirements and robustness measures. We excluded data partners who pre-shifted dates by >45 days, reported limited death information (less than one standard deviation below mean reporting), had insufficient measurement information to calculate BMI on most of their population (>50%), and did not provide 5-digit ZIP Codes. We excluded patients with missing age and gender across all data partners.
Data Extraction
Data were extracted on June 4, 2021, (N3C release 32) in the OMOP Common Data Model version 5.3.1.13 This facilitates a 2-month window for data reporting from our diagnostic cutoff (March 31, 2021) to support 30-day outcomes analyses and comprehensive reporting from data partners. All clinical concept sets were created collaboratively within the N3C Enclave, with at least one informatician and one clinical subject-matter expert reviewing each relevant concept set. Concept sets28 contain standardized terminology corresponding to clinical domains (e.g., LOINC, SNOMED CT, ICD-10, RxNorm). Logistic models were calculated with all C19 patients. All-cause mortality was computed on hospitalized patients as published literature suggests that the most reliable and timely death data are available in EHRs of hospitalized patients, representing 64% of all death certificates in the final quarter of 2020.29
Covariates
N3C provides patient ZIP Codes for most patients (∼66% of all subjects). The majority of missing ZIP Code information is from specific data providers who elect not to provide 5-digit ZIP Codes in their data transmissions. We relied on a RUCA Codes crosswalk to match ZIP Codes and RUCA classifications based on data availability.30 Current RUCA codes derive from the 2010 Census and the 2006-2010 American Community Survey.18 We defined rural areas broadly according to the Office of Management and Budget and Federal Office of Rural Health Policy (FORHP) definitions (primary RUCA code between 1-3 urban, and 4-10 rural). Based on FORHP definitions, we divide rural areas into two categories: UAR (RUCA codes 4-5, 7-8) and NAR (RUCA codes 6, 9-10).31
Outcomes
Primary outcomes are hospital admission and all-cause mortality (any reported death or a discharge to hospice) among hospitalized patients. Survival analyses assessed mortality at 30 days post-hospitalization. Secondary outcomes included length of stay, supplemental oxygen, mechanical ventilation, and major adverse cardiovascular event (MACE) or extracorporeal membrane oxygenation (ECMO).
Statistical Analyses
Summary statistics using chi-square tests of independence were calculated on all C19 patients (Appendix A), hospitalized C19 patients (Table 1), and hospitalized non-C19 patients (Appendix B), stratified by rural-urban categories.32 Covariates examined include gender, age, race, ethnicity, BMI, Charlson Comorbidity Index (CCI) 33 composite score (a higher score indicating worse health), comorbidity categories, tobacco usage, hospitalization, and death/discharge to hospice. Mixed-effects logistic regression models were calculated for the entire C19 and hospitalized C19 groups. Both models included fixed effects for gender, race, ethnicity, BMI, age at visit start, CCI score, and rural category, with an outcome of all-cause inpatient mortality, and random effects for data partner.
We assessed model specification using a methodology that checked for specification change influence on the estimate rural versus urban effect. Among the possible specification changes assessed were (a) transformations of main effects included in the model, (b) all two-way interactions of main effects included in the model, (c) potential addition of available comorbidities not included in the model, and (d) include of data providing organization as a fixed or random effect. In addition, a stepwise assessment approach was used to determine if combinations of any such potential changes resulted in change in the estimated rural versus urban effect. The only specification change identified was the need to include the data provider organization as a fixed or random effect, with the choice of effect type (fixed versus random) found to be irrelevant. As a result, the initial model was modified to include data providing organization as a random effect. This helps mitigate the possibility that our final model’s estimated rural effects are artifacts of mortality differences across those organizations’ patient populations.
The risk adjustment process employed in our modeling used information about pre-COVID comorbidities. As patient data in N3C differ in availability of pre-COVID clinical data, ranging from none to two years of pre-COVID clinical data, we examined the possibility that estimated rural effects stemmed from rural and urban patients differing in extent of pre-COVID comorbidity information.
The variables associated with worse outcomes in the logistic model – rurality, Charlson Comorbidity Index, age, and period of SARS-CoV-2 diagnosis – were secondarily evaluated using Kaplan-Meier estimates of the overall time to death, starting from hospital admission, censored at 30 days.
Results
Demographics
Our final cohort included data from 27 data partners (Figure 1) with 573,018 C19 diagnosed,117,897 hospitalized C19, and 450,725 hospitalized non-C19 patients. Patient demographics in the entire C19 cohort demonstrated rural inhabitants to be older and less racially and ethnically diverse in all groups examined, i.e., the hospitalized C19 positive group (Table 1), non-hospitalized Covid19 positive group (Appendix A), and hospitalized Covid19 negative comparison group (Appendix B).
Underlying Health Disparities and Observable Vulnerabilities
Among all patients, rural populations had higher rates of comorbidities across 14 of the 15 comorbidity categories (Appendix C) and had notably higher rates of obesity (Table.1). We also noted that among our C19 cohorts, between 65% and 80% of patients in each category had prior visit history and between 63% and 74% had prior conditions reported in the pre-COVID period (Figure 1), suggesting that our data robustness matrix sufficiently captured patients with high-fidelity data.
Assessing date of hospitalization, as a proxy for changes in clinical practice and treatment, we found that rural C19 patients were more likely to be diagnosed, and subsequently hospitalized later in the pandemic (Appendix C) when treatment practices were leading to better outcomes. Urban dwellers had higher caseloads in quarter 2 (17% urban, 10% UAR, 10% NAR; p<0.001) and quarter 3 (18% urban, 16% UAR, 15% NAR; p<0.001) of 2020. Urban patients also had higher hospitalization rates in quarter 2 (24% urban, 14% UAR, 16% NAR; p<0.001), but more rural patients were hospitalized in quarter 3 (15% urban, 19% UAR, 18% NAR; p<0.001).
Hospitalization
Persons in rural areas were more likely to be hospitalized, UAR (OR 1.62, 95% CI, 1.58, 1.66) and NAR (OR 1.61, 95% CI, 1.54-1.69), compared to urban dwellers (Figure 3.A). After adjusting for differences in gender, race, ethnicity, BMI, age, CCI, and quarter of diagnosis, C19 patients in rural areas had approximately a 40% increased risk of hospitalization, UAR (AOR 1.41, 95% CI, 1.37-1.45) and NAR (AOR 1.42, 95% CI, 1.35-1.50) (Figure 3.B).
Mortality
The model estimates of all-cause inpatient mortality or transfer to hospice after C19 diagnosis were significantly greater for rural compared to urban patients, (OR 1.45, 95% CI, 1.37, 1.54) and NAR (OR 1.56, 95% CI, 1.41-1.74) (Figure 3.C). After adjusting for differences in gender, race, ethnicity, BMI, age, CCI, and quarter of diagnosis, mortality remained approximately 40% greater for rural C19 hospitalized patients, UAR (AOR 1.39, 95% CI, 1.30-1.49) and NAR (AOR 1.38, 95% CI, 1.22-1.55) (Figure 3.D).
Kaplan Meier survival curves demonstrate significantly higher mortality 30 days after hospitalization among rural C19 patients compared to their urban counterparts. (Figure 4.A). Hospitalized C19 patients with more comorbidities, older age, and diagnosis earlier in the pandemic demonstrated significantly higher mortality (Figure 4.B, 4.C, 4.D).
SARS-CoV-2 Uninfected Comparison Cohort
The same analyses on the hospitalized, uninfected comparison group demonstrated similar, albeit attenuated, results as the C19 positive group. This cohort had similar background characteristics (Appendix B) and similar likelihood of being hospitalized throughout the study period. Hospitalized uninfected patients in rural areas had significantly greater mortality compared with urban uninfected patients, UAR (OR 1.30, 95% CI, 1.26-1.34) and NAR (OR 1.34, 95% CI, 1.27-1.42) compared to urban dwellers (Figure 3.E). After adjusting for differences in gender, race, ethnicity, BMI, age, CCI, and quarter of diagnosis, rurality was still associated with approximately 20% increased odds of mortality in UAR (AOR 1.22, 95% CI, 1.17-1.26) and NAR (AOR 1.20, 95% CI, 1.13-1.28) (Figure 3.F).
Kaplan Meier survival curves show similar differences across the SARS-CoV-2 uninfected cohort as the infected cohort, albeit with a 3-fold higher likelihood of survival in all categories. Uninfected rural patients face significantly higher mortality 30 days after hospitalization compared to their urban counterparts (Figure 4.E). As in the infected cohort, uninfected hospitalized persons with more comorbidities, older age, and hospitalization earlier in the pandemic demonstrated higher mortality (Figure 4.F, 4.G, 4.H).
Secondary Outcomes
Treatment with ECMO or occurrence of MACE were relatively infrequent among C19 patients but ranged from 0.9% in urban to 1.6% in UAR and 1.7% in NAR (p<0.001). Invasive mechanical ventilation and oxygen support were also significantly more frequent among rural than among urban patients (Table 1). However, the mean time to death was similar ranging between 12-14 days after C19 diagnosis among those hospitalized. Urban patients were more likely to be readmitted to the hospital at any point after initial hospitalization in both the infected and uninfected cohorts.
Discussion
In this retrospective cohort study from a large representative data base of both C19 positive and negative patients, we found significantly higher mortality rates among hospitalized rural C19 patients. Mortality was approximately 40% higher among rural C19 patients after adjustment for age, gender, race, ethnicity, BMI, Charlson Comorbidity Index composite score, date of diagnosis, and differences derived from the data contributor.
The SARS-CoV-2 pandemic provides an important lens to examine all-cause mortality by area of residence. SARS-CoV-2 is a new virus, rapidly transmitted in an immunologically naïve human population and initially without known effective treatment. The rapidity with which first urban then rural communities in the US experienced surges of SARS-CoV-2, led us to hypothesize that an urban-rural mortality differential would not be observed. That, however, is not the case. Although rural populations are older and have higher rates of comorbidities,34 including diabetes mellitus and obesity35 which have been associated with increased disease severity and death in SARS-CoV-2 infection, adjustment for these factors did not change the finding of higher rural mortality. Notably, mortality among hospitalized patients without C19 diagnoses, after adjustment for several factors, was also significantly higher in rural compared to urban areas, a finding consistent with previous studies.36, 37
The gradient of risk for chronic diseases and mortality between urban and rural inhabitants is a relatively recent development in the United States. Prior to 1980, mortality rates in rural and urban areas of the US were comparable. Since then, mortality rates in rural America have exceeded urban rates, and the gap has accelerated since 1999, even when adjusted for age.38,34,35 This mortality disparity has been referred to as “the nonmetropolitan penalty,” 39, and some experts believe that structural urbanism is widening the gap.40 Similar findings are noted among nonmetropolitan counties for common causes of death, including heart disease, cancer, chronic lower respiratory disease, unintentional injury, and stroke.6
Given the modeling objectives, it would be illogical to adjust the rural versus urban comparison for clinical severity, such as clinical severity at the time of hospital admission. If a rural versus urban effect exists, it almost certainly would lead to rural versus urban differences in clinical severity at admission. Therefore, adjusting for clinical severity at admission would result in adjusting away the effect which the modeling seeks to evaluate. Our modeling showed that the rural versus urban effect was larger among patients hospitalized for COVID than among patients hospitalized for non-COVID reasons. This suggests that rurality played a greater relative role in COVID outcomes than it plays in the outcomes resulting from most other (non-COVID) hospitalizations.
The etiology of this “penalty” is likely multi-factorial. Poverty, unemployment, and lower levels of educational attainment are more prevalent in rural areas, and these factors may partially explain the urban-rural mortality disparity. Diminished access to care has also been described as contributing to rural-urban mortality differences. One study concluded that having one or more specialist visits during the previous year was associated with 16.6% lower mortality for those with chronic conditions.39,41 Among counties with defined shortages in primary care delivery, 56% are nonmetropolitan while 19% are metropolitan, according to data reported by the Health Resources and Services Administration.42
Although limited research has been done on socioeconomic risk factors and case-fatality rates over the first year of the SARS-Cov-2 pandemic, a recent study from Japan observed higher C19-related mortality in prefectures with the lowest household incomes.43 A recent study conducted using spatial models in the US found rurality to be one of several ecologic determinants of C19 mortality.44 Our results again raise the question of why rural populations experience higher mortality rates after adjustment for multiple factors, even with a new pathogen such as SARS-CoV-2. To what extent delays in care contribute to increased SARS-CoV-2 related mortality among rural populations is unclear as is the potential impact of environmental risk factors in rural areas. The increase in hospitalization and hospitalized mortality in our uninfected cohort demonstrates systemic discrepancies in rural-urban outcomes in tertiary care centers that are likely exacerbated by C19.
Identifying and understanding potential causes of urban-rural health disparities including mortality across both chronic and acute conditions may inform study of rational and cost-effective mitigation strategies. Others45, 46, 47 have demonstrated that attribution of rurality does not provide a one-size-fits-all means to prescribe approaches to reduce health disparities. There are many other possible contributors to the observed disparity, highlighting the urgent need for a robust research agenda that will address the root causes which may differ by geographic region.
While confirming worse baseline health status and outcomes of rural dwellers from chronic conditions, the N3C data also suggest rural dwellers have incremental vulnerability to C19 as an acute health condition. The disparately adverse C19 acute outcomes among rural dwellers do not appear to be fully explained by the worse baseline health measures. The C19 pandemic appears to both highlight and extend the apparent relative vulnerability of rural dwellers to acute health conditions. Recent scholarship points to a dynamic, ecological model for addressing rural adversity based on tailored approaches to individual community needs.48 Additional observational and experimental research is needed to potentially identify practical evidence-based steps to improve health outcomes among rural dwellers for both acute and chronic health conditions.
Limitations
N3C is an observational registry compiling data from multiple diverse participating sites. Therefore, some information may be entirely or partially and non-randomly missing from the database in rural versus urban residents. In our C19 positive cohort, we report incidence of comorbidities in more than two-thirds of our study population, which is similar to those reported in a COVID-19 study across OCHIN, a network of 396 community health centers across 14 states.49 Nonetheless, we examined the possibility that estimated rural effects stemmed from rural and urban patients differing in extent of pre-COVID comorbidity information and found this not to be the case. To some degree, such potential bias can be partly assessed by future analyses, which would include data based on all diagnosed C19+ populations in geographic areas, rather than data limited to C19 patients who received care at N3C collaborating provider systems.50 A second potential confounder is that rural residents with C19 may have been diagnosed later in their disease course than urban residents, leading to a greater risk of serious complications. Additionally, healthcare organizations contributing data to N3C may have cared for more severely ill patients, providing a potential source of bias.
Other limitations of the N3C data source include data aggregated from different health systems with different local practices, regulations, and data models, resulting in potential reporting differences, despite our application of data robustness checks. Additionally, the comparison group is limited only to those pre-matched at the site level from patients who have had a confirmed negative SARS-CoV-2 test. Although 54% of American Indian and Alaska Native populations in the US live in rural areas and have been disproportionately affected by Covid19,51, 52 their racial demographic is unavailable for explicit study in accordance with tribal sovereignty policies. Finally, these analyses are limited to residents in the US and may not be generalizable to other countries.
Conclusions
Hospitalization and death were significantly more frequent among rural C19 patients than their urban counterparts after adjusting for multiple factors, including age, sex, race, and comorbidities. An analysis of adjusted mortality among hospitalized SARS-CoV-2 uninfected patients demonstrated similar results. These data provide evidence-based documentation of rural health disparities. Further research is needed to understand this disparity for both acute and chronic health conditions.
Data Availability
All data included in this manuscript are available publicly in the National COVID Cohort Collaborative (N3C). Data utilized in this study are from release 32. All code and concepts described herein are available on our project GitHub repository: https://github.com/National-COVID-Cohort-Collaborative/CS-Rural-Health/tree/main/rural-mortality-and-hospitalization
Appendix A: Baseline Characteristics of All N3C COVID-19 Positive Population by Rurality Category, January 2020 – March 2021
Appendix B: Baseline Characteristics of All Hospitalized COVID-19 Negative Patients by Rurality Category, January 2020 – March 2021
Appendix C: Quarter of Diagnosis, COVID-19 therapies, BMI Categories, Age Group, and Comorbidities by Rural Category
Appendix D: US Summary Table of COVID-19 Caseload and Case Fatality Compared with N3C Using Public Datasets, January 2020 – March 2021
Footnotes
Authorship was determined using ICMJE recommendations