Abstract
Background The U.S. Southeast has a high burden of SARS-CoV-2 infections and COVID-19 disease. We used public data sources and community engagement to prioritize county selections for a precision population health intervention to promote a SARS-CoV-2 testing intervention in rural Alabama during October 2020 and March 2021.
Methods We modeled factors associated with county-level SARS-CoV-2 percent positivity using covariates thought to associate with SARS-CoV-2 acquisition risk, disease severity, and risk mitigation practices. Descriptive epidemiologic data were presented to scientific and community advisory boards to prioritize counties for a testing intervention.
Results In October 2020, SARS-CoV-2 percent positivity was not associated with any modeled factors. In March 2021, premature death rate (aRR 1.16, 95% CI 1.07, 1.25), percent Black residents (aRR 1.00, 95% CI 1.00, 1.01), preventable hospitalizations (aRR 1.03, 95% CI 1.00, 1.06), and proportion of smokers (aRR 0.231, 95% CI 0.10, 0.55) were associated with average SARS-CoV-2 percent positivity. We then ranked counties based on percent positivity, case fatality, case rates, and number of testing sites using individual variables and factor scores. Top ranking counties identified through factor analysis and univariate associations were provided to community partners who considered ongoing efforts and strength of community partnerships to promote testing to inform intervention.
Conclusions The dynamic nature of SARS-CoV-2 proved challenging for a modelling approach to inform a precision population health intervention at the county level. Epidemiological data allowed for engagement of community stakeholders implementing testing. As data sources and analytic capacities expand, engaging communities in data interpretation is vital to address diseases locally.
Introduction
From the outset of the global pandemic, the burden of COVID-19 disease has been dynamic across person, time, and space. In the months of the U.S. pandemic prior to vaccine availability, SARS-CoV-2 testing demand outweighed supply[1]. Further, those most vulnerable to COVID-19 disease were not the most likely to be tested, and states with larger numbers of people living in poverty and with poorer infrastructure for public health and workers’ rights struggled to roll out testing and contact tracing [2].
Alabama is one of the lowest ranked states in many health metrics compared to the rest of the U.S. [3]. Social determinants of health (SDH) including socioeconomic status, educational attainment, racial discrimination, and restrictive governmental policies underlie these poor outcomes [4], with rural regions and Black communities disproportionately impacted [5-7]. Statewide geographic, socioeconomic, and racial disparities are reflected in how Alabama experiences the COVID-19 pandemic. In mid-July 2021, prior to the start of the SARS-CoV-2 delta strain surge, an estimated 11% of Alabama’s nearly five million people had tested for SARS-CoV-2, with approximately 4,500 tests conducted daily, or 148 per 100,000 people. The case rate per 100,000 in some rural counties was nearly triple the rate of metropolitan areas such as Mobile and Birmingham, with SARS-CoV-2 testing percent positivity approaching 20% in a majority of rural counties in July. While 27% of Alabama’s population is Black, nearly 45% of the state’s lab-confirmed SARS-CoV-2 cases and deaths are among Black people [8, 9]. Grounded in a legacy of slavery manifested in institutionalized racism [10], contemporary barriers to health include social and structural factors that heighten risk for COVID-19 disease among Black communities, particularly those in rural areas [11].
In Alabama and similar settings, there is a need to optimize benefits of testing without exacerbating stigma and marginalization in underserved communities in order to control disease spread. We were funded to collaborate with academic institutions, local service organizations, and community health workers to promote SARS-CoV-2 testing in underserved communities in Alabama. We planned to use publicly available data on SARS-CoV-2, including risk mitigation practices and vulnerability to SARS-CoV-2 infection, to identify rural counties that would benefit most from community-based testing interventions.
Here we describe how we used public data sources to explore strategies for identifying communities most at need for SARS-CoV-2 testing in two waves of selection during October 2020 and March 2021. We present our considerations and how our approach evolved with the pandemic, and lessons learned regarding the limitations of percent positivity in a state with inconsistent testing uptake to inform future work in other more rural settings with poorer public health infrastructure and healthcare.
Methods
Setting
The Alabama Department of Public Health announced the first SARS-CoV-2 positive test result on March 13, 2020 [12]. As of September 7, 2021, nearly 18 months later, there have been a total of 727,360 cases and 12,420 deaths in Alabama [13] and about 39% of the population is vaccinated (2 doses) [13, 14]. Alabama’s population is approximately 69% White, 27% Black, with 86% high school graduates, and approximately 16% live in poverty [15]. For reference, the U.S. census estimates that the nation’s population is approximately 76% White, 14% Black, with 89% high school graduates, and approximately 12% live in poverty [16].
SARS-CoV-2 testing data
Data were downloaded periodically (approximately quarterly) from publicly available websites that collated and created visual representations of data reported by the Alabama Department of Public Health (ADPH). For the October 2020 county testing selections, data were downloaded from bamatracker.com[17], a website that collated and displayed data from ADPH through May 2021. In the March 2021 round of selections, data were downloaded from bamatracker.com[13] as well as the New York Times website[18].
Summary models reviewed
We reviewed existing models summarizing, modeling, and predicting patterns in SARS-CoV-2 testing data provided by Johns Hopkins University[19], University of Washington (Institute for Health Metrics and Evaluation, IHME)[20], and the Pandemic Vulnerability Index (PVI) from National Institute of Environmental Health Sciences [21]. We reviewed the COVID Health Equity interactive dashboard summarized by Emory University[22]. None of these were able to distinguish testing intervention need in Alabama at the county level as the entire state was in highest risk category for COVID-19 disease with widespread need for increased SARS-CoV-2 testing in both rounds of selections. However, elements of data sources used in these models and collated on their websites were incorporated as described in model covariates below.
Model building using local data
Outcome
Percent positivity, or the number of positive tests per 100 tests performed, has been widely used as a measure of SARS-CoV-2 disease burden since the beginning of the pandemic, providing insights into transmission within specific geographic areas[23]. The CDC recommends percent positivity as a measure of disease surveillance for public health decision-making[23]. We modeled factors associated with SARS-CoV-2 percent positivity with the goal to identify factors associated with higher percent positivity to inform intervention county selections. At the time of these selections, most Alabama counties reported high percent positivity rates and were uniformly identified as highest risk by the available models. We therefore intended to identify factors that drove that risk and focus selections on counties based on the distributions of factors associated with higher SARS-CoV-2 percent positivity. Our models differed slightly with the evolution in the pandemic, knowledge of SARS-CoV-2 risk factors, and features of the ADPH testing data between October 2020 and March 2021.
October 2020
We used linear regression models with dispersion with the primary outcome of average 14-day SARS-CoV-2 percent positivity summarized at the county level from October 2 – October 16, 2020. Model covariates included factors associated with risk of acquiring SARS-CoV-2 (Daytime Population Density [21, 24], SVI Housing Score [4, 21], % Below Poverty Line [24, 25], % Unemployed [24], % Black [21], % High School graduates [24, 25], % Limited English [24, 25], COVID Testing Sites [13], increased risk of COVID-19 disease severity (Air pollution [21, 26], Premature Death [24], % Smoking [21, 27], % Over 65 [21], % Obese [21, 27], % Diabetic [21, 28], mortality due to influenza and/or pneumonia [25, 29, 30], Number Preventable Hospitalization [25, 31, 32], COPD Prevalence [33], Heart Disease Prevalence [33]), and factors associated with mitigation behavior (% Republican voters [34]).
March 2021
We used negative binomial regression modeling with dispersion to evaluate factors associated with average 7-day SARS-CoV-2 percent positivity at the county level from 06/17/2020 -3/31/2021. Between October and March, we learned that for some months, the 14-day percent positivity result produced in the state data eliminated all individuals with prior SARS-CoV-2 testing data from the denominator, thus inflating the 14-day percent positivity. This did not occur with the 7-day summary value. Model covariates included factors associated with increased risk of acquiring SARS-CoV-2, increased risk of COVID-19 disease severity, and factors associated with less risk mitigation behavior as above.
Alternative Approach Using Descriptive Epidemiology
In October 2020, we summarized past 14-day average SARS-CoV-2 percent positivity, 14-day case fatality, and 14-day cases/100k population. We also included a cross sectional description of the number of testing sites per county based on collated data provided by ADPH. We described the counties in the top quartile for percent positivity, case fatality, and case rates and below the median for number of testing sites. We also conducted a factor analysis including 3-month percent SARS-CoV-2 positivity, SARS-CoV-2 case fatality rate, COVID-testing sites per 100k population. One factor explained 41% of the variability and was used to provide an additional ranking of counties for selection.
In March 2021, we summarized first quarter 2021 (01/01/2021-03/31/2021) 90-day average case fatality, case rate, and overall mortality rate by county. We identified the counties in the top quartile for percent positivity, case fatality, case rates, and mortality.
In both rounds of selections, we also described race[24, 25], poverty[24, 25], and rurality[21, 24] to inform selections. In the March 2021 selections, we also described vaccine uptake, reported by ADPH.
Preliminary county selections
For both the October 2020 and March 2021 selections, the counties with the greatest COVID-19 burden as described above were shared with our Scientific and Community Advisory Boards to further hone selections and avoid overlapping outreach with other efforts (e.g., ADPH focus areas, other RADX-Up projects in the region) and to prioritize counties in need with known partners to promote community outreach efforts. Final selections were made by the Scientific and Community Advisory Boards in collaboration with the scientific leadership for this project.
Results
In October 2020, we modeled the predictors of past 14-day percent SARS-CoV-2 positivity and most recent county-level estimates for clinical factors, social determinants of health, and other measures of behavioral risk mitigation strategies we expected to inform testing and case rates. No covariates were significantly associated with county level percent positivity (Table 1).
In March 2021, we modeled predictors of average 7-day case positivity from 6/18/2020 – 3/31/2021 using similar covariates (Table 1). In this adjusted model, premature death rate was associated with 16% higher SARS-CoV-2 percent positivity (aRR 1.16, 95% CI 1.07, 1.25) per 1,000 deaths, 10% greater proportion of county residents identifying as Black was associated with a <1% higher in SARS-CoV-2 percent positivity (aRR 1.003, 95% CI 1.00, 1.01), and number of preventable hospitalizations was associated with a slightly higher in SARS-CoV-2 percent positivity (aRR 1.03, 95% CI 1.00, 1.06) per 500 hospitalizations). A 10% higher proportion of the county population identifying as smokers was associated with a 77% lower percent positivity (aRR 0.231, 95% CI 0.10, 0.55).
Given the limited predictive power of the existing models and the models we built to identify counties with the greatest need, our county selections were determined based on descriptive epidemiology and input from scientific and community advisory boards. We described the counties in the top quartile for percent positivity, case fatality and case rates and below the median for number of testing sites and organized counties based on the number of parameters for which they were in the top quarter of the 67 counties in the state (Table 2). Table 3 demonstrates the October 2020 summary of past 14-day SARS-CoV-2 percent positivity, 14-day case fatality, and 14-day cases/100k population. We included a cross sectional description of the number of testing sites per county based on collated data provided by ADPH.
Factor scores provided a ranking of counties with respect to 3 months SARS-CoV-2 positivity, case fatality rate, and SARS-CoV-2 testing sites/100k population. The factor rankings indicated in Table 3 can be interpreted as how much they differ from the average county on the composite measure of SARS-CoV-2 test positivity, case fatality, and testing sites. For example, Franklin County was estimated to be about 1.8 standard deviations worse compared to the average. The factor analysis ranking is described in Table 3 and identified the top 10 counties with the highest ranking based on the combined factor analysis.
Fig 1 indicates which counties were prioritized for the intervention when considering the overlapping counties identified by the factor analysis and the descriptive epidemiology and with consideration of strensgth of partnerships and other ongoing efforts to promote testing across the state.
Priority counties selected in both phases of the project based on the descriptive epidemiology and mapped by the service areas of our community partners (Area Health Education Centers, AHEC) who conducted the testing.
Given the ongoing lack of discrimination between counties in the models we again used descriptive epidemiology and input from scientific and community advisory boards for the March 2021 selections. Table 4 demonstrates the March 2021 summaries of first quarter 2021 (01/01/2021-03/31/2021) 90-day average case fatality, case rate, and overall mortality rate by county. We identified the counties in the top quartile for percent positivity, case fatality, case rates, and mortality. Fig 1 indicates the counties prioritized for the intervention when considering the descriptive epidemiology, strength of partnerships, and other ongoing efforts to promote testing across the state.
Fig 1 highlights the priority counties selected in phase 1 in gray. This figure demonstrates how candidate counties (shades of green) were presented to the community and scientific advisory boards for the March 2021 selections, including data on the regions served by the service areas of our community partners (Area Health Education Centers, AHEC) who conducted the testing, in blue. For the March 2021 phase, the final decision was to intervene on regions rather than counties given widespread need and the nature of our community partnerships.
Discussion
Here we present our approach to pragmatic identification of rural counties in need of a SARS-CoV-2 testing intervention in a Southern U.S. state at the peak of the 2019-first quarter 2021 phase of the U.S. SARS-CoV-2 epidemic. In order to select counties for rapid deployment of a testing intervention in a state with high testing need, we were tasked with quickly identifying priority counties using publicly-available data. Based on global and national guidance, we focused on SARS-CoV-2 percent positivity[17, 22]. As models evaluating factors associated with percent positivity did not discriminate counties for selection, we pivoted to analyses of descriptive epidemiological data to inform county selection. This approach allowed us to categorize counties based upon the count of four severity criteria with four and 12 of Alabama’s 68 counties meeting at least three of these criteria in October 2020 and March 2021, respectively. The presentation of these data in tabular and geospatial fashion was easily interpreted by the community and scientific advisory boards, and community testing implementation partners, with prioritization of counties for testing further benefitting from an understanding of the local community context in identified high-severity counties. Equipped with this information, as depicted in Fig 1, community testing partners were able to prioritize rural counties for outreach and the identification of venues and local partners for testing delivery [35]. As our data sources and analytic capacities expand in scope and sophistication, leveraging traditional epidemiological data and analysis retains a vital role in engaging communities to address communicable and non-communicable conditions in their local context, particularly when response to emerging diseases when time for sophisticated analyses is limited.
In October 2020, no factors were associated with percent positivity in our models. This speaks to the fact that in these first waves of COVID-19 no population was immune, and testing was roughly similar across the counties, so it is not surprising that specific factors driving COVID transmission and positivity were not identified. In addition, more localized outbreaks and “super-spreader” events were randomly distributed in time and geographic area and therefore the time periods included in the models represent variable features of a dynamic epidemic. In March 2021, factors associated with increases in percent positivity were county level rates of premature death[24] and preventable hospitalizations[25, 31, 32]. In addition, having a higher proportion of the population who identified as Black increased the likelihood of case positivity. These associations may reflect that those in poorer resourced counties were at higher risk of acquiring SARS-CoV-2 and also less likely to have access to or seek testing in the absence of infection. In addition, an increase in the proportion of county residents who self-described as smokers was associated with a decrease in the SARS-CoV-2 percent positivity. This may reflect more screening on the part of smokers and those living with smokers who are at higher risk for respiratory disease, in general, and possibly for SARS-CoV-2 complications [36-38]. However, there are limits to this model given county level population-level associations which are broad and subject to ecological fallacy. In both selection rounds, the models were not helpful in informing key areas to focus outreach efforts and therefore, we pivoted to descriptive epidemiology to identify areas most in need of testing with the assumption that the testing intervention might improve testing, contact tracing and thus reduce disease spread, case fatality and morbidity.
Identifying the appropriate metric to track in order to identify areas of testing demand during a rapidly evolving pandemic proved to be challenging. Case positivity was originally thought to be the best indicator of the next outbreak which would therefore inform aggressive testing strategies[39]. This did not prove to be the case due to the often-asymptomatic nature of SARS-CoV-2 and the overwhelming task of contract tracing in our strained public health setting [40, 41]. In addition, (a) errors in reporting, (b) state and institutional guidance to not test if household contact has tested positive at times when testing capacity was limited, (c) workers unwilling to test and be forced to quarantine and lose pay, (d) poorer health resourced state where people are anxious about potential test-related fees, (e) erratic approach to testing where some institutions and organizations required weekly tests for workers, (f) variations in data with large numbers of tests in Summer and Fall 2020 with push resumption of in-person university activities in a state with over 300,000[42] university students[43] changed the meaning of case positivity.
This team is led by HIV researchers who routinely wait years for public health data to be cleaned and available, in part due to the sensitive nature of HIV testing and related data [44, 45]. The excitement of real-time data releases that are key to managing the COVID-19 pandemic was tempered by frustrations when rapid data availability was associated with less reliable data. For instance, Alabama datasets, like many other datasets, had noise related to large data dumps due to institutions reporting testing data en masse (e.g. 2352 new cases reported on 3 March 2021 reflecting data from months prior) changing denominators, and, at times, incomplete data. Our process highlights the need to proceed with caution and engage with community stakeholders when using real-time, inherently messy data to inform public health interventions.
In recent years there has been a push for precision population health, leveraging big data to identify those at greatest risk to optimize delivery of the right intervention, to the right population, at the right time. Unlike diseases with more stable epidemiological and risk profiles, the dynamic nature of SARS-CoV-2, over geography and time, proved challenging for a model-based approach to inform precision population health tailored SARS-CoV-2 testing. A simplified approach based upon descriptive epidemiological data proved informative while allowing for meaningful engagement of essential community stakeholders and implementation partners ultimately charged with implementing testing in their local communities. As data sources and analytic capacities expand in scope and sophistication, leveraging traditional epidemiological data and analysis retains a vital role in engaging communities and building trust in data in order to optimally address communicable and non-communicable conditions in their local context.
Limitations
This is a methods paper to share lessons learned in selecting counties for a SARS-CoV-2 testing intervention but not an in-depth exploration of SARS-CoV-2 testing patterns and epidemiology in Alabama. The findings are unlikely to generalize to all settings and will be most applicable to other counties or states with strained public health systems in the U.S. healthcare context. Details regarding our testing intervention and findings are published elsewhere [35].
Conclusions
We present a pragmatic approach to inform an evolving, pandemic community-level intervention. We also share lessons learned regarding the limits of test positivity in an outbreak where testing uptake is poor and where case positivity rates generally exceeded a threshold level to discern need in areas where nearly all counties had high need. Our current and future work highlights the importance of community partnerships, local knowledge to inform testing outreach and perhaps highlights the need for more centralized pandemic preparedness.
Data Availability
Data were downloaded periodically (approximately quarterly) from publicly available websites that collated and created visual representations of data reported by the Alabama Department of Public Health (ADPH). For the October 2020 county testing selections, data were downloaded from bamatracker.com, a website that collated and displayed data from ADPH through May 2021. In the March 2021 round of selections, data were downloaded from bamatracker.com as well as the New York Times website.
Conflicts of Interest/Disclosures
LT Matthews, Operational support from Gilead Sciences for unrelated projects.
E. Levitan. Research funding from Amgen, Inc. unrelated to this work. Personal fees for a research project funded by Novartis, for work unrelated to this publication.
MJ Mugavero, grant support to UAB from Merck Foundation for unrelated project.
This is work supported by NIH: the funders had no role in the interpretation, analysis, or communication of the findings.
Acknowledgements
Community partners.
Larger RADxUP team:
Greer Burkholder, MD, MSPH 1,2,3, Andrea Cherrington, MD, MPH 2,3, William Curry, MD 2, Latesha Elopre, MD, MSPH 1,2, Faith Fletcher, PhD, MA 4, Eric Ford, PhD, MHHA 3,5, Allyson Hall, PhD 2,3,6, Larry Hearld, PhD 2,6, Bertha Hidalgo, PhD, MPH 1,2,3, Dione King, PhD, MSW 1,3,7, Emily Levitan, PhD, MS 1,2,3, Max Michael, MD 3, Trisha Parekh, DO 2, Donna Porter, PhD 1,2, David Redden, PhD, MS 3, Michael Saag, MD 2, Bisaka “Pia” Sen, PhD, MA 2,3,6, Barbara Van Der Pol, PhD, MPH 2, Jeffery Walker, PhD, MA 3,7
COVID COMET RADXUP Team Affiliations:
University of Alabama at Birmingham, Center for AIDS Research, School of Medicine, Birmingham, Alabama, USA.
University of Alabama at Birmingham, School of Medicine, Birmingham, Alabama, USA.
University of Alabama at Birmingham, School of Public Health, Birmingham, Alabama, USA.
Baylor University, College of Medicine, Waco, Texas, USA.
University of Alabama at Birmingham, School of Business, Birmingham, Alabama, USA.
University of Alabama at Birmingham, School of Health Professions, Birmingham, Alabama, USA.
University of Alabama at Birmingham, College of Arts and Sciences, Birmingham, Alabama, USA.
Footnotes
↵^ Michael Mugavero (mmugavero{at}uabmc.edu) is the lead author for the COVID COMET RADXUP Team. Membership of the COVID COMET RADXUP Team is provided in the Acknowledgements.
Funding NIH RADx-Up initiative, UAB Center for AIDS Research, COVID COMET AL, Project #P30AI027767-32S1 (PI Michael Saag)
The funder provided support in the form of salaries for authors [LTM, DML, YY, SH, EBL, TC, MJM, and SEJ], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section