Abstract
Extensive empirical health research leverages variation in the timing and location of policy changes as quasi-experiments. Multiple social policies may be adopted simultaneously in the same locations, creating co-occurrence which must be addressed analytically for valid inferences. The pervasiveness and consequences of co-occurring policies have received limited attention. We analyzed a systematic sample of 13 social policy databases covering diverse domains including poverty, paid family leave, and tobacco. We quantified policy co-occurrence in each database as the fraction of variation in each policy measure across different jurisdictions and times that could be explained by co-variation with other policies (R2). We used simulations to estimate the ratio of the variance of effect estimates under the observed policy co-occurrence to variance if policies were independent. Policy co-occurrence ranged from very high for state-level cannabis policies to low for country-level sexual minority rights policies. For 65% of policies, greater than 90% of the place-time variation was explained by other policies. Policy co-occurrence increased the variance of effect estimates by a median of 57-fold. Co-occurring policies are common and pose a major methodological challenge to rigorously evaluating health effects of individual social policies. When uncontrolled, co-occurring policies confound one another, and when controlled, resulting positivity violations may substantially inflate the variance of estimated effects. Tools to enhance validity and precision for evaluating co-occurring policies are needed.
INTRODUCTION
Evaluating the health effects of social policies is critical to researchers, funders, and decision-makers seeking to promote healthful, evidence-based programs. Study designs such as differences-in-differences and panel fixed effects (1), which exploit variation in the timing and location of policy changes, have the potential to deliver causal inferences. Changes in health outcomes that are tied to the jurisdictions and times at which a particular policy is adopted can be used to isolate the causal effect of the policy (1). Empirical health research on social policies using these methods has grown rapidly and yielded influential findings in recent years in epidemiology and other fields (2–4).
One major concern with study designs that leverage variation in the timing and location of policy changes is that co-occurrence of policies can render it difficult to separately identify the causal effects of each policy. Isolating individual policy effects is crucial for delivering evidence to decision-makers on whether or not to adopt a policy. Yet multiple related policies are often adopted or implemented in the same jurisdiction simultaneously or in quick succession, rendering it difficult to isolate the effect of one policy from the other. For example, a government that moves to overhaul its social safety net is likely to change multiple welfare-related policies in a single wave of legislative changes (5). Consequently, bundles of related policies, selected to address a particular set of health or social priorities and thus with similar potential health effects, are adopted concurrently, creating co-occurring policies.
Co-occurring policies confound one another. Thus, if the co-occurring policies are relevant to the health outcome of interest, failing to account for co-occurring policies can severely bias estimated effects of specific social policies. For example, if an effective policy A and an ineffective policy B are routinely adopted as a set, and their true effects are unknown, when researchers analyze effects of policy B without accounting for policy A, findings are likely to spuriously indicate that policy B is effective. Yet if jurisdictions typically adopt both policies together, adjustment for policy A to isolate the effect of policy B can lead to imprecise or unstable estimates and bias resulting from data sparsity (6–8). In extreme cases, estimates may be severely biased, undefined, or rely entirely on extrapolation because there is no independent variation in the policy of interest (6).
Strong confounding and consequent data sparsity arising from co-occurring policies can be conceptualized as lack of common support in the data, also known as a violation of the “positivity assumption” (9). Lack of positivity implies that some confounder strata do not have variation in the exposure—for example, because places and times with the confounding policy always adopt the policy of primary interest (the “index” policy). A rich literature exists on the problem of positivity and the use of propensity scores to assess and address it (e.g. by restricting to units that are “on-support”) (9–14). However, several aspects of the policy co-occurrence problem make it important to consider separately from positivity issues that arise with other exposures. First, due to the nature of policymaking (5), the levels of co-occurrence among policy variables may be far greater than those typically observed in non-policy studies (15, 16). For example, governments adopt similar policies at similar times in part because they are responding to the desires and values of their constituents. Second, the most relevant analytic solutions may be distinct. For example, analytic solutions such as data-adaptive parameters (17, 18) that rely on large sample sizes may not be feasible for policy studies that are typically based on a small, fixed set of jurisdictions. Meanwhile, stronger theories or substantive knowledge about the mechanisms by which a particular social policy operates could guide analyses leveraging mediating variables for causal effect estimation (19). For example, how education policy affects educational attainment may be better understood than how educational attainment affects health. Furthermore, if a set of policies are always adopted together, then modifying the exposure definition to encompass both policies and evaluating their combined effect may be the most policy-relevant research question, as opposed to attempting to disentangle their individual effects. 
Re-conceptualizing the exposure in this way may be less relevant to research in other substantive areas. Thus, the policy co-occurrence problem presents unique challenges and potential analytic solutions beyond typical confounding.
Characterizing the extent and impact of policy co-occurrence is a crucial step for the development of rigorous evidence on social policy effects. Yet, to our knowledge, no epidemiologic research has directly addressed this issue. Prior applied studies of social policies in fields including epidemiology, economics, and political science have acknowledged the issue by critiquing existing policy studies or, in some cases, applying solutions—e.g. studying aggregate measures of policy stringency (20–22). Similar methodological challenges have arisen in environmental epidemiology when studying correlated and multipollutant exposures, but the emphasis of this research has been on identifying analytic solutions appropriate for pollutant measures, rather than on quantifying the extent of the problem (7,8,23). To our knowledge, no prior studies have examined how frequently related policies co-occur, a necessary step to lay the foundation for rigorous analytic solutions. For researchers aiming to estimate individual policy effects, guidance is needed on how to evaluate whether the impacts of policy co-occurrence on estimation are likely to undermine the study. In some cases, the challenge of co-occurring policies may require a modified analytic approach or even altering the research question.
In this paper, we addressed these gaps by proposing and applying an approach to assess the extent of policy co-occurrence and to quantify the impact of policy co-occurrence on the precision of effect estimates for individual policies. Using 13 exemplar social policy databases covering diverse domains, we visually depicted and quantified the extent of policy co-occurrence in each database, and used simulations to estimate impacts on precision. This paper illustrates a novel method that can be used in applied research to determine when policy co-occurrence is so severe that alternative analytic approaches are needed.
METHODS
Overview
We developed a systematic sample of social policy databases covering diverse health-related domains that capture measures of policy adoption or implementation across jurisdictions and time. To evaluate the extent and impacts of policy co-occurrence, we applied three analyses to each database. First, we visualized the degree of policy co-occurrence in each database by plotting heatmaps of pairwise correlations among the measured policies. Second, building on the positivity literature, we quantified the overall degree of co-occurrence in each database as the amount of variability in each policy measure across jurisdictions and time that could be explained by the other policy measures in the same database. This indicates how much independent variation remained with which to study the policy of interest. Finally, we used simulations to estimate the impacts of policy co-occurrence on precision by comparing the variance of estimated effects given the observed co-occurrence with the variance expected if all policies were adopted independently.
Database identification
We sought to characterize the extent of policy co-occurrence across diverse social policy domains. Because no registry of all available social policy databases exists, we identified an exemplar set by evaluating contemporary research on social policies and health, and selecting domain-specific policy databases corresponding to those studies.
We identified all studies of social policies published in 2019 in top medical, public health, and social science journals, emphasizing general-topic journals that publish research on the health effects of social policies: Journal of the American Medical Association, American Journal of Public Health, American Journal of Epidemiology, New England Journal of Medicine, Lancet, American Journal of Preventive Medicine, Social Science and Medicine, Health Affairs, Demography, and American Economic Review. After these journals had been selected, we asked a convenience sample of 66 researchers from diverse disciplines to rank relevant journals. Responses confirmed that our selected journals reflect common perceptions of most relevant venues for research on the health effects of social policies (detailed results in Appendix: “Survey assessing relevant journals for inclusion”; Appendix Table 1).
We identified original, empirical studies aiming to estimate the causal effects of one or more social policies on health-related outcomes in any country, state, or locality (areas, neighborhoods, or sub-state units such as counties or cities). Although the definition of “social policies” varies across the literature, a priori we defined “social policy” to mean any non-medical, population-based or targeted policies that are adopted at a community or higher level, and hypothesized to affect health or health inequalities via changes in social or behavioral determinants. A priori, we defined health-related outcomes broadly, to include morbidity, mortality, health conditions, and factors such as smoking, homelessness, and sales of unhealthy products. Given our focus on social interventions, we excluded studies that pertained to health care, health insurance, interventions delivered in the clinical setting, medications, or medical devices, including studies of the Affordable Care Act or Medicaid expansion. For reproducibility, additional detail is presented in the Appendix (“Social policy study inclusion and exclusion criteria”). An independent analyst reviewed a subset of candidate articles to confirm that our strategy to identify relevant papers was reproducible. Concordance between reviewers upon initial review was 90% (for details, see Appendix “Assessment of inter-rater reliability for inclusion and exclusion”).
For each social policy study, we identified any corresponding quantitative databases capturing the content, locations, and times of adoption of the index policy and related policies in the same domain. We searched the scientific literature; websites of domain-relevant research institutions, scientific centers, and organizations; and the internet to identify relevant, publicly available databases. We also asked the authors of each index social policy study for policy database recommendations. When possible, we included databases provided on request from individual investigators. If more than one policy database was available, we selected the one that was most amenable to this analysis: first, the database requiring the least data cleaning or manipulation (i.e., panel data structure and variables coded); then, among those remaining, the database with the greatest clarity of variable definitions, followed by the least missingness, and most comprehensiveness (number of policies and time points). We excluded domains for which we could not identify or access any corresponding database. Figure 1 presents information on the number of articles considered, studies and corresponding databases included in the final sample, and studies and databases excluded. Additional detail is presented in the Appendix (“Database selection”).
Database coding
Each database is formatted with one row per jurisdiction (country, state, or locality) and time period (month or year), and one column per policy measure. The types of policy information varied across databases. Some included exclusively binary indicators of policy adoption while others provided information on benefit generosity, implementation, access, and/or scope (e.g. number of Supplemental Nutrition Assistance Program (SNAP) participants by state and year). We included all available policy measures for the heatmaps (see below). For subsequent analyses, when multiple measures of the same policy were available (e.g. year of adoption and number of participants), we selected the measure used in the publication in the original search which invoked the policy, if relevant, or the measure we judged to be the most representative. Some policies were subordinate to umbrella policies. For example, provisions regulating cannabis delivery services are only applicable in jurisdictions where recreational cannabis is legal. For jurisdictions and times in which the umbrella policy was not active, we included these observations in the analysis and coded provisions conditional on that umbrella policy to 0. Additional details are provided in the Appendix (“Database coding”).
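The umbrella-policy coding rule described above can be illustrated with a minimal sketch. This is not the authors' code (they report using R); the function name, the column names, and the use of None for "not applicable" are hypothetical choices made only for illustration.

```python
def code_conditional_provisions(rows, umbrella, provisions):
    """Return copies of rows with each provision set to 0 wherever the
    umbrella policy is inactive (hypothetical helper, for illustration)."""
    coded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if not new_row[umbrella]:
            for name in provisions:
                new_row[name] = 0
        coded.append(new_row)
    return coded

# Hypothetical panel: one row per jurisdiction and time period.
rows = [
    {"state": "A", "month": 1, "recreational_legal": 0, "delivery_allowed": None},
    {"state": "A", "month": 2, "recreational_legal": 1, "delivery_allowed": 1},
    {"state": "B", "month": 1, "recreational_legal": 0, "delivery_allowed": None},
]
coded = code_conditional_provisions(rows, "recreational_legal", ["delivery_allowed"])
```

Rows in which the umbrella policy is inactive stay in the panel, so they still contribute to the correlations and regressions that follow.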
Statistical analysis
First, to visually depict policy co-occurrence in each database, we plotted heatmaps of the Pearson’s correlation matrix for each pairwise combination of policy measures (hereafter, “heatmaps”). Although numerous measures are appropriate, we selected Pearson’s correlation because it is common, intuitive, and accommodates continuous-continuous, continuous-binary, and binary-binary variable comparisons. Although the distribution of Pearson’s correlation between continuous and binary variables is constrained, this constraint is appropriate in this context.
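As a minimal sketch of the matrix underlying such a heatmap, the pairwise Pearson correlations can be computed directly from the panel columns. The toy panel and policy names below are hypothetical, and the plotting step is omitted; the authors' own analysis was done in R.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a policy measure never varies
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(panel):
    """panel maps policy name -> values, one per jurisdiction-period row."""
    names = list(panel)
    return {a: {b: pearson(panel[a], panel[b]) for b in names} for a in names}

# Hypothetical toy panel: 6 jurisdiction-periods, 3 policy measures.
toy_panel = {
    "retail_tax": [0, 0, 1, 1, 1, 0],
    "product_limits": [0, 0, 1, 1, 1, 0],  # always adopted with retail_tax
    "benefit_level": [10.0, 12.0, 11.0, 15.0, 9.0, 14.0],  # continuous
}
corr = correlation_matrix(toy_panel)
```

The same function handles binary-binary, binary-continuous, and continuous-continuous pairs, which is the property the text cites as the reason for choosing Pearson's correlation.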
Second, we assessed the degree of unique variation available to estimate individual policy effects, when considering each individual policy while controlling for all others. To do this, we estimated an R2 in regression models of each policy regressed on the set of all other policies in the same database. We modeled continuous policy variables using linear regression and used R2 adjusted for the number of predictor variables. We modeled binary policy variables using logistic regression and used McFadden’s pseudo R2 (24). For both types of regression, we included all predictor policy variables in the database as main terms. This step quantified the amount of variability in each policy across jurisdiction-times that could be explained by the other policy measures and results in a distribution of R2 values—one for each policy in each database. This step is also conceptually very similar to estimating propensity scores to assess positivity, except that it accommodates continuous exposure variables.
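The two R2 measures named above can be sketched as follows, using only the standard definitions: adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1) for a linear model with p predictors, and McFadden's pseudo R2 = 1 - lnL(model)/lnL(null) for a logistic model. The helper names and toy data are hypothetical. The logistic fit itself is not shown, so mcfadden_r2 assumes the two log-likelihoods come from a model fitted elsewhere.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting.
    Fails (division by zero) if predictors are perfectly collinear."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def adjusted_r2(y, predictors):
    """Adjusted R2 from OLS of y on an intercept plus the predictor columns."""
    n, p = len(y), len(predictors)
    cols = [[1.0] * n] + [[float(v) for v in c] for c in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * cols[j][i] for j in range(p + 1)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R2 from fitted and intercept-only log-likelihoods."""
    return 1.0 - loglik_model / loglik_null

# Hypothetical example: a continuous policy fully determined by two others.
other_1 = [0, 0, 1, 1, 0, 1, 0, 1]
other_2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
benefit = [2.0 + 3.0 * a + 0.5 * b for a, b in zip(other_1, other_2)]
adj = adjusted_r2(benefit, [other_1, other_2])
```

In this constructed case the index policy has no independent variation, so the adjusted R2 is essentially 1. Perfectly collinear predictors would make the normal equations singular, which is why standard software drops variables in that situation.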
Third, we estimated the impacts of policy co-occurrence on precision using simulations. For each policy measure, in each policy database, we applied the following procedure:
Step A: Assign a simulated outcome of N observations, where N is the number of jurisdiction-periods in the policy database (Table 1). To simulate the outcome, we assumed (a) a random normal distribution with mean 100 and standard deviation 5; (b) a null effect of the index policy on the outcome (because using an alternative would not substantively affect the results); and (c) 10% of the variance of the outcome was explained by a randomly selected non-index policy (the “explanatory policy”). We incorporate this last component because the precision of the estimated effect of the index policy depends on the proportion of the variance in the outcome that is explained by the other variables in the model. Because large-scale social programs are recognized to have small individual-level effects (25, 26), we considered 10% explained to be optimistic in the setting of the health effects of social policies. We assumed no other confounding was present.
Step B: Apply a linear regression, modeling the simulated outcome as a function of the index policy, the non-index policies, jurisdiction fixed effects, and time fixed effects. From this regression, record the variance of the regression coefficient corresponding to the effect estimate of the index policy (variance = (standard error)²). This is the variance under the real-world, co-occurring policy data.
Step C: To estimate the variance if there were no co-occurrence, randomly redistribute the values of all policy measures across jurisdictions and time (i.e. for each policy measure, randomly shuffle its values among the rows of the database). This process preserves the overall mean and variance of each policy measure but eliminates systematic co-occurrence.
Step D: Apply the same regression model as in Step B to the redistributed policy data and record the variance of the effect estimate of the index policy.
Step E: Take the ratio of the variance of the effect estimate of the index policy, under the real-world policy regime (derived in Step B) versus under the randomly redistributed regime (derived in Step D) (ratio = variance from Step B / variance from Step D). This ratio is an estimate of the variance inflation due to policy co-occurrence.
We conducted Steps A-E 1,000 times for each policy measure in each database, resulting in a set of estimates of the variance inflation. We summarized the variance inflation due to policy co-occurrence for each database by stacking all the variance inflation estimates for all the policy measures in that database and plotting their distribution. We summarized the variance inflation due to policy co-occurrence overall by stacking all the variance inflation estimates for all policy measures in all databases and calculating their summary statistics.
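Steps A-E can be sketched in simplified form as below. This is an illustration under stated assumptions, not the authors' implementation (which was in R): jurisdiction and time fixed effects are omitted for brevity, the toy policies are constructed rather than drawn from a real database, the way the 10% explained variance is imposed is one plausible reading of Step A, and the simulated outcome is held fixed when the policies are shuffled, which the source does not specify. All function and variable names are hypothetical.

```python
import random
import statistics

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def coef_variance(y, cols, j):
    """Squared standard error of coefficient j in an OLS fit of y on cols."""
    n, k = len(y), len(cols)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    resid = [y[i] - sum(beta[c] * cols[c][i] for c in range(k)) for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - k)
    unit = [1.0 if i == j else 0.0 for i in range(k)]
    return sigma2 * solve(xtx, unit)[j]  # sigma^2 times the j-th diagonal of inv(X'X)

def variance_ratio(index, others, rng):
    """One pass through Steps A-E (no fixed effects: a simplification)."""
    n = len(index)
    # Step A: outcome with mean 100, SD about 5, null index-policy effect,
    # and 10% of variance explained by a randomly chosen non-index policy.
    expl = rng.choice(others)
    mean_e = sum(expl) / n
    var_e = sum((v - mean_e) ** 2 for v in expl) / n
    slope = (0.10 * 25.0 / var_e) ** 0.5
    noise_sd = (0.90 * 25.0) ** 0.5
    y = [100.0 + slope * (v - mean_e) + rng.gauss(0.0, noise_sd) for v in expl]
    ones = [1.0] * n
    # Step B: variance of the index-policy coefficient under observed co-occurrence.
    var_observed = coef_variance(y, [ones, index] + others, 1)
    # Step C: shuffle each policy column independently across rows.
    shuffled = [rng.sample(col, n) for col in [index] + others]
    # Step D: same regression on the shuffled policies (outcome held fixed: assumption).
    var_independent = coef_variance(y, [ones] + shuffled, 1)
    # Step E: variance inflation ratio.
    return var_observed / var_independent

# Hypothetical toy policies: two measures that mostly track the index policy.
n_rows = 120
index_policy = [i % 2 for i in range(n_rows)]
other_a = [1 - v if i % 10 == 0 else v for i, v in enumerate(index_policy)]
other_b = [1 - v if i % 7 == 3 else v for i, v in enumerate(index_policy)]
rng = random.Random(2024)
ratios = [variance_ratio(index_policy, [other_a, other_b], rng) for _ in range(200)]
median_ratio = statistics.median(ratios)
print(f"median variance inflation: {median_ratio:.1f}-fold")
```

Because shuffling preserves each policy's mean and variance, the ratio in this sketch is driven mainly by how well the other policies predict the index policy, which is the quantity the R2 analysis measures.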
All analyses were conducted using R version 3.6.2.
RESULTS
We identified 55 studies evaluating links between social policies and health that met our inclusion criteria (27–82), amongst which there were 36 unique policies or databases invoked, and 13 social policy databases that could be identified and accessed (Appendix Table 1; Table 1). Studies included, for example, a panel data analysis of the impacts of changes in the level and duration of paid maternity leave on fertility, workforce participation, and infant mortality across 18 African and Asian countries (37) and a synthetic control evaluation of the effect of raising state-level beer excise taxes on young adult firearm homicides (65).
The sample of 13 identified social policy databases (83–95) (Table 1) included 5 country-level databases, 6 state-level databases, and 2 local-level databases. Domains included poverty and social welfare; family and child welfare; worker welfare; pensions; unemployment; fertility; immigration; LGBT (lesbian, gay, bisexual, and transgender) rights; firearms; alcohol; tobacco; and recreational cannabis. The number of unique policies per database ranged from 6 to 134. Some databases had multiple umbrella policies while others focused exclusively on provisions relating to one umbrella policy. For example, the PROSPERED database included overarching policies and specific provisions for breastfeeding breaks, child health leave, family leave, maternity leave, parental leave, paternity leave, and sick leave, while the recreational cannabis policy database focused exclusively on provisions for states where recreational cannabis is legal (e.g. retail sales taxes).
Visualizing policy co-occurrence
The degree of policy co-occurrence varied by database (Figures 2 and 3; Appendix Figures 1-11). Figure 2 shows an example of an intermediate degree of co-occurrence, among unemployment, sick leave, and pension benefits policies across 40 years in 22 countries. Figure 3 displays an example of a high level of co-occurrence, among recreational cannabis policies across 108 months in 50 states. Darker colors indicate higher degrees of co-occurrence. Because the correlations are calculated on panel data at the level of the jurisdiction and time unit, higher correlations indicate that jurisdictions which adopt one policy are more likely to adopt the other (or vice versa) and that the policies are likely to be adopted in closer temporal succession.
State cannabis policies displayed the highest co-occurrence (median absolute correlation across all pairwise policy combinations: 0.65; 4 policies perfectly aligned) while national LGBT rights policies showed the lowest co-occurrence (Appendix Figure 8; median correlation: 0.04; no policies perfectly aligned). For example, states that restrict which recreational cannabis products can be sold at retail also tend to tax retail cannabis sales, whereas whether a country allowed same-sex marriage was relatively independent of whether it banned LGBT-related employment discrimination. Tobacco policies at the state, county, and city jurisdiction levels showed similar degrees and patterns of co-occurrence among policies. For example, comprehensive clean air laws for bars and comprehensive clean air laws for restaurants frequently co-occurred at the state, county, and city levels (Appendix Figures 9-11).
Most policy measures were positively correlated, but we also found pockets of negative correlations. For example, country-years with child tax credits tended not to have child tax allowances (Appendix Figure 1). The heatmaps also reveal groups of co-occurring and independent policies. For example, labor policies requiring licensing for different professions frequently co-occurred, but this set was relatively independent of policies regarding collective bargaining and minimum wages (Appendix Figure 5).
Quantifying policy co-occurrence
Most of the variability in policy measures across jurisdiction-times was explained by the other policies in the same database. Figure 4 displays the distributions of R2 values: the higher the R2, the less unique variation there is for an individual policy, to a maximum of 1.0. The impacts of policy co-occurrence on identifiability were generally substantial: of all 502 policies considered, 65% had R2 values greater than 0.90 when regressed on other policies in the same database. Child benefits had the lowest R2 distribution, with a median of 0.19; poverty and social welfare, family leave, fertility/immigration, firearms, cannabis, alcohol, state tobacco control, and county tobacco control policies had R2 distributions with medians around 0.9 or higher. In some cases, correlations between predictor policy variables were so strong that the statistical software dropped certain variables from the model (frequency reported in Appendix Table 2).
Impacts of policy co-occurrence on precision
Policy co-occurrence substantially reduced the precision of possible effect estimates in all cases (Figure 5). Across policy measures, databases, and simulation iterations, policy co-occurrence effectively increased the variance of effect estimates by a median of 57-fold. Across policies, the lowest degree of variance inflation observed was 7% (median across simulations) for country child tax rebates. For other policies, particularly family leave, variance inflation was so substantial as to render estimates effectively indeterminate. Again, some predictors were dropped from models due to strong correlations with other predictors (Appendix Table 2).
DISCUSSION
We analyzed 13 social policy databases drawn from contemporary research in top epidemiology, clinical, and social science journals. These exemplar databases represented diverse policy domains, geographies, and time periods to describe the pervasiveness and impacts of policy co-occurrence on estimation of health effects. We found that high degrees of co-occurrence were the norm rather than the exception. For a majority of policies, greater than 90% of the variation across jurisdiction-times was explained by other related policies in the same database. Unbiased studies attempting to isolate individual policy effects must control for these related policies, so for many applications, there may be little independent variation left with which to study the policy of interest. Consistent with this, we found that adequate control for co-occurring policies is also likely to substantially reduce the precision of estimated effects, often so dramatically that informative effect estimates are unlikely to be derived.
Interpretations
Several factors make the pervasiveness and consequences of policy co-occurrence likely to be even greater than we have estimated. First, we only examined policy co-occurrence within domain-specific databases. Yet social policy changes may happen in multiple domains simultaneously. For health outcomes affected by diverse types of policies (e.g. both unemployment policies and firearm policies may affect suicide rates), researchers must consider policy co-occurrence across domains, which likely implies even more severe co-occurrence. Second, each policy database we considered included only one jurisdictional level, but true policy environments involve complex overlays of national, state/province, county, municipal, employer, and/or school policies. Third, we did not incorporate lagged effects or nonlinear relationships between variables. Fourth, policy variables that perfectly or near-perfectly predict one another were dropped from the regression models. Finally, we did not consider the multitude of social, economic, political, or societal factors (e.g. a recession, migration, gentrification) that may also co-occur with policies of primary interest, including changes in social norms, implementation, or enforcement that can be conflated with policy changes. Some such confounders can be controlled with jurisdiction or time fixed effects; measured confounders that are jurisdiction-specific and time-varying could be evaluated using the same methods illustrated here. This is a formidable task; data sharing efforts would facilitate its assessment and handling.
We found that the overall degree of policy co-occurrence varied across databases, ranging from very high for state-level recreational cannabis policies to low for country-level sexual minority rights policies. Several different factors may drive this variation. Our finding that tobacco policies at the state, county, and city levels showed similar degrees and patterns of co-occurrence among similar sets of policies suggests that co-occurrence may be a characteristic of the domain. Political polarization may result in greater co-occurrence for certain policy domains (e.g. firearms) versus others (e.g. family leave). Databases with rarer policies, fewer umbrella policies (e.g. recreational cannabis), or more nested policies (e.g. firearm policies that apply to all guns versus handguns) also tended to have more co-occurrence. Databases with more unique policies also generally had more co-occurrence; with a fixed number of jurisdiction-times of observation, considering more policies creates more opportunities for alignment. Importantly, these patterns highlight that the measured degree of co-occurrence depends not only on the policies themselves but also on the investigator’s choices of policy measures. Further, policies that could be considered alternatives rather than complements (e.g. child tax credits and child tax allowances) co-occurred less frequently and may offer the opportunity for more robust studies of causal impacts. Differences in the ways that policies are adopted across different political systems and different jurisdictional levels may also matter. In our examples, country-level policies appeared to co-occur less frequently than state-level policies, implying that estimating causal effects of country-level policies may be more feasible. Similar considerations apply to the temporal scale of analyses as well: the feasibility of estimating health effects may depend on whether analyses are conducted at the level of the year, month, or even election cycle. 
Our analysis could not determine which of these factors drives variation in policy co-occurrence; this would be a fruitful area for future research.
Limitations
Several other limitations of this study must be noted. Certain policy domains were not covered, either because no social policy studies for that domain were sampled, or because no corresponding policy database was identified or accessed. We did not review all potentially relevant journals. Our results may not generalize to policy domains or journals not examined. Our approach also assumes that all the policies in each domain-specific database are relevant to the health outcome of interest; this is plausible for social interventions that likely affect a broad range of health outcomes, but for some outcomes, only a subset of the policies in a database may need to be controlled to isolate the effect of the index policy. Additionally, our approach is only relevant when a database of the relevant policies exists or can be constructed. Developing policy databases is often an arduous task requiring systematic review of legal language. We did not consider the quality of the underlying databases. Our selections serve to illustrate the policy co-occurrence problem, but for applied researchers, the optimal policy database may differ from the one used here. The problem of correlated exposures arises in many domains, including environmental health, and although social policies are distinct in important regards, methods in other domains may nonetheless prove helpful. Furthermore, our analysis did not examine the distinctions between policy adoption, implementation, promulgation, or changes in norms that precede or follow from policy changes, but these considerations are essential in applied policy research.
Finally, data sparsity arising from co-occurring policies can lead to bias, not just imprecision. Our simulations did not incorporate this because this type of bias is less relevant to studies of the health effects of social policies, and highly context-specific. Simulation results on the magnitude of bias from positivity violations are therefore unlikely to be generalizable. Specifically, bias arising from positivity problems depends on the estimation method. For methods that rely on modeling the outcome (e.g. with regression), positivity-related bias arises from model-based extrapolation. For methods that involve modeling the exposure mechanism (e.g. propensity score matching, inverse probability of treatment weighting), bias can result from disproportionate reliance on the experiences of just a few units, or the absence of certain confounder strata (i.e. certain groups are never exposed and thus cannot be weighted to de-bias the estimate). Since our simulations are based on outcome regressions (the most common approach for differences-in-differences, panel fixed effects, and related designs), bias would only arise from model-based extrapolation. However, for the vast majority of policies identified in this study, measures were binary and thus extrapolation cannot occur. For continuous policy measures (e.g. the amount of a tax), model-based extrapolation is possible, but application-dependent. Thus, simulating the potential degrees of bias resulting from model-based extrapolation requires either tenuous generalizations or substantive assumptions about each policy area. We suspect that extremely non-linear relationships that would lead to large extrapolation bias are rare for policy effects, but this remains an open question.
Implications
Researchers should be cautious when seeking to make causal inferences about the health effects of single social policies using methodological approaches premised on “arbitrary” or quasi-random variation in policies across jurisdictions and time. Not every policy change offers a valid differences-in-differences or panel fixed effects study design. These methods are most compelling when policy implementation is staggered across jurisdictions and dates independently from other policies and for plausibly like-random or arbitrary reasons. For example, there could be differing timing of elections, legislative sessions, “crises” that provoke specific policy changes, or “lottery”-type rollouts. In these cases, such research can be very persuasive, or at least constrain the set of co-occurring policies. Our finding of pervasive policy co-occurrence across numerous databases suggests that many policies do not fit this criterion.
Inadequate control for co-occurring policies or differences in the set of policies controlled may explain surprising or conflicting results in previous studies. Investigators should base interpretations of social policy research on the plausibility that policy adoption is distributed arbitrarily with respect to other uncontrolled policies or social changes, a phenomenon that in reality may be rare. This evaluation should be based on deep content knowledge of law, politics, and society—a compelling argument for involving policymakers in the design and interpretation of studies.
Potential solutions
We illustrate an approach for researchers to assess whether the effects of individual policies can be estimated. While other simulation-based methods for assessing positivity exist (9), the approach we propose is tailored to the policy co-occurrence problem and facilitates examining how a full set of policies substantively occur together. For a given application, if the heatmap indicates high correlations, and estimated R2 values and variance inflation are high, it may be necessary to alter the research question and corresponding analytic approach.
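As a rough illustration of the diagnostic described above, the following Python sketch regresses one policy indicator on the others and converts the resulting R2 into the textbook variance inflation factor, 1 / (1 − R2). It is a simplified analogue, not the simulation code used in this study: the panel and the policy names (`policy_a`, `policy_b`, `policy_c`) are hypothetical, and place and time fixed effects, which an applied analysis would likely include, are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 from an ordinary least squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical panel: one entry per place-time unit, 1 = policy in force.
policy_a = [0, 0, 1, 1, 0, 1, 1, 1]  # index policy
policy_b = [0, 0, 1, 1, 0, 1, 1, 0]  # co-occurring policy
policy_c = [0, 1, 0, 1, 0, 1, 0, 1]  # co-occurring policy

r2 = r_squared(policy_a, [policy_b, policy_c])
vif = 1.0 / (1.0 - r2)
print(round(r2, 3), round(vif, 3))  # 0.667 3.0
```

In this toy panel, two thirds of the variation in the index policy is explained by the other two policies, which corresponds to a threefold inflation of the variance of its estimated effect relative to the case of independent policies. The sketch fails if the predictor policies are perfectly collinear with one another, which is itself a signal that individual effects cannot be separated.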
Researchers have applied numerous analytic approaches to address the challenge of highly co-occurring policies, ranging from machine learning algorithms that identify policy measures most strongly related to an outcome of interest to methods that characterize overall policy environments based on expert panels. Relevant methods have been discussed in diverse prior work (see for example (7,8,10,13,23,96–99)). The second paper in this series on policy co-occurrence provides a systematic assessment of available methods. We briefly review three promising analytic options here, and refer the reader to the second paper in this series for more detail.
One approach is to focus on outcomes that are affected by the index policy of interest but not the co-occurring policies. For example, changes in state Earned Income Tax Credits (EITC) co-occur with changes in other social welfare policies (100). Rehkopf and colleagues took advantage of seasonality in the disbursement of EITC cash benefits (typically delivered in February, March, and April), versus benefits without the same seasonal dispersal pattern, to examine the association of EITC with health using a differences-in-differences design (101). By comparing health outcomes that can change on a monthly basis (e.g. health behaviors) for EITC-eligible versus non-eligible individuals in months of income supplementation versus non-supplementation, the authors measured potential short-term impacts of EITC independent of other social welfare policies.
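The comparison logic in this example can be sketched as a simple two-by-two difference of means. The Python snippet below is only an illustration of the differences-in-differences contrast (eligible versus non-eligible, supplementation versus non-supplementation months); the function name `did_estimate` and all outcome values are hypothetical, and the sketch ignores covariates, clustering, and standard errors that the cited analysis may have used.

```python
from statistics import mean


def did_estimate(records):
    """Difference-in-differences of cell means.

    records: iterable of (eligible, supplement_month, outcome) tuples, where
    eligible and supplement_month are booleans and outcome is numeric.
    """
    cells = {}
    for eligible, supplement_month, outcome in records:
        cells.setdefault((eligible, supplement_month), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_eligible = m[(True, True)] - m[(True, False)]
    change_ineligible = m[(False, True)] - m[(False, False)]
    return change_eligible - change_ineligible


# Hypothetical monthly outcome values (for example, a health behavior score).
records = [
    (True, True, 5.0), (True, True, 6.0),      # eligible, supplementation months
    (True, False, 4.0), (True, False, 4.0),    # eligible, other months
    (False, True, 3.0), (False, True, 3.5),    # not eligible, supplementation months
    (False, False, 3.0), (False, False, 3.0),  # not eligible, other months
]

estimate = did_estimate(records)
print(estimate)  # 1.25
```

Because other social welfare policies are assumed not to follow the same monthly pattern, their influence cancels in the within-group seasonal differences.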
Another approach is to move beyond binary measures of policy adoption to more detailed characterizations, such as the amount of funding, generosity, participation rate, or population reach of a program; the size of a tax; or the duration of a policy. These measures may co-occur less frequently with related policies, or provide opportunities to examine dose-response effects among jurisdictions adopting the policy. For example, the adoption of certain unemployment benefit policies co-occurs frequently with other social welfare policies across state-years. Researchers have successfully assessed their health impacts by comparing varying levels of unemployment benefit generosity, measured as the total maximum allowable benefit (in US dollars) per bout of unemployment, across states and years (102, 103). Heatmaps like those presented in this study may help researchers to identify specific policy measures that are more independent from related policies.
A final option is to conceptualize policy clusters, instead of individual policies, as exposures. This is promising if policies are typically adopted as a group, as is the case with the large omnibus bills that are increasingly common at the state and federal levels. This approach is also particularly relevant when studying the provisions of a single umbrella policy. For example, for provisions of recreational cannabis legalization, exposure categories based on the overall approach to legalization in one state versus another may be of greater interest than the effects of individual provisions. Similarly, Erickson and colleagues categorized states into four groups based on stringency of the overall alcohol policy environment and found that these categories were associated with levels of past-month alcohol consumption (104). Several options are available to define clusters, including manual selection, hierarchical cluster analysis, latent class analysis, and principal components analysis (105). Heatmaps like those presented here can help inform the selection of appropriate clusters by offering an intuitive visual reference for the likelihood that sets of policies were adopted together. Evaluating situations when each clustering approach might be preferable is a future research direction.
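To make the clustering idea concrete, the sketch below groups policies whose pairwise correlation across place-time units exceeds a cutoff, which amounts to single-linkage clustering cut at a fixed threshold. It is a deliberately minimal stand-in for the hierarchical cluster analysis mentioned above, not a recommendation among the listed methods. The policy names, the panel values, and the 0.7 and 0.8 cutoffs are all hypothetical.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cluster_policies(panel, threshold):
    """Group policies linked by |correlation| >= threshold (single linkage)."""
    parent = {name: name for name in panel}

    def find(name):
        # Follow parent pointers to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(panel, 2):
        if abs(pearson(panel[a], panel[b])) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in panel:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical provisions, one value per place-time unit (1 = in force).
panel = {
    "home_delivery": [0, 0, 1, 1, 1, 0, 1, 1],
    "retail_sales": [0, 0, 1, 1, 1, 0, 1, 1],
    "public_use_ban": [0, 0, 0, 1, 1, 0, 1, 1],
    "sales_tax": [1, 0, 1, 0, 1, 0, 1, 0],
}

strict = cluster_policies(panel, threshold=0.8)
loose = cluster_policies(panel, threshold=0.7)
print(strict)
print(loose)
```

With the stricter cutoff only the two identical provisions are merged; lowering the cutoff also pulls in the third provision, which shows how the chosen threshold changes the exposure definition.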
Conclusions
Overall, our findings suggest that co-occurring policies are a major methodological challenge to rigorously evaluating the health effects of individual social policies. Rigorous design and interpretation of studies that seek to isolate individual policy effects require careful attention to co-occurring policies and their impacts on identifiability and precision. Evaluating the health effects of policies is both a powerful strategy for addressing confounding and an important substantive domain, because social policies are a natural avenue for translating epidemiologic findings into public health practice. Study designs, statistical methods, and data collection efforts to enhance statistical power for evaluating co-occurring policies or to circumvent the co-occurrence are a high priority for future work.
Data Availability
This study involves only secondary data on social policies that is publicly available online.
Appendix
Supplemental Methods
Survey assessing relevant journals for inclusion
We developed a short online survey of population health researchers, disseminated through the authors’ professional networks via email, Twitter, and LinkedIn. The survey stated: “We’d like to identify an illustrative set of journals where important research on the health effects of social policies is published. We want journals that are (a) high impact, (b) reflect diverse disciplines, and (c) publish original quantitative research on how social policies influence population health.” For each of 5 disciplines (epidemiology, public health, clinical medicine, sociology/demography, and economics/health policy), we provided 8 candidate journals, informed by the highest ranking general-topic journals in each field across all countries/regions according to Scimago Journal Rankings. We asked respondents to rank the most important journals to include in each of the 5 fields, with the option to skip fields with which the respondent is not familiar. Response options were ordered randomly for each respondent. This survey was deemed not to be human subjects research by the University of California, San Francisco Human Research Protection Program.
We received 66 anonymous responses, with self-identified primary disciplines of epidemiology (27 respondents), public health (15), medicine (7), sociology/demography (9), economics/health policy (7), and other disciplines (7). The table below summarizes the rankings results.
A clear “top journal” for research on the health effects of social policies emerged in each discipline and our study included all of these top-ranking journals. The 5 other journals we included were ranked 2nd or 3rd. There were 29 additional journals suggested by respondents, none of which were suggested by more than one or two respondents, except Health Services Research, which was suggested by 4 of 66 respondents. There were 11 additional disciplines suggested by respondents (e.g. geography, social work) but none was suggested by more than one respondent.
Social policy study inclusion and exclusion criteria
Included:
Original, empirical studies aiming to make inferences about the causal effects of one or more social policies on health-related outcomes
○ Although the boundaries of “social policies” are fuzzy, we focused on non-medical, population-based or targeted policies that are adopted at a community or higher level and affect social determinants of health or social inequalities in health.
○ We defined health-related outcomes broadly, to include morbidity, mortality, and health conditions, as well as factors related to health outcomes such as smoking, neighborhood availability of healthy foods, health expenditure, utilization of health services, homelessness, and sales of healthy and unhealthy products
Other inclusions:
○ Economic, welfare, unemployment, family and child-related policies
○ Food and nutrition-related policies
○ Drug and alcohol policies
○ Road safety policies
○ Immigration policies
○ Firearm policies
○ Policies regulating federal school meals
Excluded:
Studies that pertained to health care, health insurance, interventions delivered in the clinical setting, medications, medical procedures, or medical devices, including studies of preventive or treatment-oriented health services, the Affordable Care Act, and Medicaid expansion
Systematic reviews and meta-analyses
Qualitative and ethnographic studies
Hypothesis-generating studies
Simulation studies, unless they estimated the effects of existing policies
Studies that leverage policy changes as an instrument to study the effects of endogenous variables such as educational attainment or income (these studies do not estimate the effects of social policies)
Other exclusions:
○ Interventions delivered in the clinical setting, even if they are delivering social services (e.g. housing supports)
○ Surveys about potential responses to hypothetical regulations used to project the impacts of future/potential policies
○ RCTs unless used to evaluate an actual, real-world policy
○ Studies of the effects of lawsuits
○ Studies about cancer screening or communicable disease testing incentives, guidelines, recommendations and policies
○ Mechanisms and structures of health care payments; health care finance mechanisms and models
○ Studies of industrial/macroeconomic policies with productivity, employment, income, or wages as outcomes
○ Studies on health care spending
○ Studies of the health information exchange
○ Millennium development goals
○ Infectious disease harm reduction
○ Mass drug administration
○ Democracy, regime type, or free and fair elections
○ Community-based interventions around medication access and adherence
○ Vaccination policies
○ Federal funding for research
○ Physician shortages
○ Expansions of family planning services
○ Mental health policies, even if they have components that are not about health insurance or health services – e.g. stigma reduction campaigns (still mainly a health services policy)
○ Funding for prevention programs or primary care
○ Incentives for behavior change in Medicaid
○ Moving to Opportunity
○ Government funding earmarked for different purpose but not associated with a particular policy
Assessment of inter-rater reliability for inclusion and exclusion
We sought to assess the replicability and reliability of classifying each journal’s 2019 articles as “studies of the health effects of social policies” or not. We recruited one analyst to independently review a random subsample of the journal articles we initially reviewed for inclusion. The analyst’s training background was an MPH in Environmental Health. We provided her with some background on the project, some general training in what constitutes a social policy, and the definitions and inclusion/exclusion criteria provided in the main text and appendix of our paper. We provided the analyst with 31 full-text manuscripts: 10 meeting inclusion criteria and 21 not meeting inclusion criteria. The analyst classified 28 of the 31 studies in concordance with our determination of inclusion/exclusion. Conflicts for the three discordant studies were easily resolved with the determination to exclude after further clarifying that a composite index of country-level participation in conflict and individually-initiated abstinence-based fertility control were not measures of social policies, and that the exclusion of instrumental variables studies only applied to studies that leverage policy changes as an instrument to study the effects of endogenous variables such as educational attainment or income.
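For readers who want to summarize agreement of this kind numerically, the sketch below computes percent agreement and Cohen's kappa from a two-by-two classification table. The percent agreement (28 of 31) follows directly from the counts reported above. The kappa value rests on an assumption that is not stated explicitly in the text: that all three discordant studies were ones the analyst included and the original reviewers excluded. The function name `agreement_and_kappa` is illustrative.

```python
def agreement_and_kappa(both_include, only_first, only_second, both_exclude):
    """Percent agreement and Cohen's kappa for two raters and two categories.

    only_first: included by the first rater only.
    only_second: included by the second rater only.
    """
    n = both_include + only_first + only_second + both_exclude
    observed = (both_include + both_exclude) / n
    first_include = (both_include + only_first) / n
    second_include = (both_include + only_second) / n
    # Agreement expected by chance from each rater's marginal inclusion rate.
    expected = (first_include * second_include
                + (1 - first_include) * (1 - second_include))
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# First rater: original reviewers (10 included, 21 excluded).
# Second rater: analyst. ASSUMPTION: the 3 discordant studies were
# included by the analyst only.
observed, kappa = agreement_and_kappa(
    both_include=10, only_first=0, only_second=3, both_exclude=18
)
print(round(observed, 3), round(kappa, 3))  # 0.903 0.795
```

If the discordant studies were distributed differently across the two cells, the percent agreement would be unchanged but kappa would differ.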
Database selection
For each social policy study, we identified any corresponding quantitative databases capturing the content, locations, and times of adoption of that index policy and related policies in the same domain. We searched the scientific literature; websites of domain-relevant research institutions, scientific centers, and organizations; and the internet to identify relevant, publicly available databases. We also asked the authors of each index social policy study for policy database recommendations. When possible, we included databases that were provided on request from individual investigators.
If more than one policy database was available, we selected the one that was most amenable to this analysis: first, the one requiring the least data cleaning or manipulation (i.e., panel data structure and variables coded); then, among those remaining, the one with the greatest clarity of variable definitions, followed by the least missingness, and most comprehensiveness (number of policies and time points). For example, the American Journal of Preventive Medicine published a January 2019 study on the association of state firearm legislation with intimate partner homicide using the RAND State Firearm Law Database (97); several state firearm law databases exist, and we selected the one developed by Siegel and colleagues because it had the cleanest data, most precise variable definitions, and covered the greatest number of policies (98, 99). The database selected for our analysis was not always the one used in the index policy study.
For studies evaluating national-level policies, we treated the corresponding database as the one with country-level panel data, even if the corresponding study did not include data from other countries—for example, if the authors used an interrupted time series design. If we could not find a database that included the location in the index study (e.g. China), we used the best available database covering other geographic units, if one was available (e.g. OECD countries). We excluded studies and domains for which we could not identify or access a corresponding database.
Database coding
Each database was formatted with one row per place and time period. Each policy was a separate column indicating the value of the policy measure for the given place and time. The time periods were defined based on the finest resolution available in the data set (in most cases, years). The types and formats of policy information varied across databases. For example, some included exclusively binary indicators of when each policy was adopted while others provided information on benefit generosity, implementation, access, or scope (e.g. number of Supplemental Nutrition Assistance Program (SNAP) participants by state and year). Some databases coded laws very specifically (e.g. whether a country's child family leave policy requires a minimum employee tenure) while others were more general (e.g. whether a country has any policy to increase immigration). Some included dates of enactment, dates the laws became effective, and dates that the laws were amended, if relevant; we included these as separate policy measures. Most included laws, resolutions, and regulations, but excluded executive orders and determinations resulting from legal cases. We included all available policy measures for the heatmaps. For subsequent analyses, when multiple measures of the same policy were available (e.g. year of adoption and number of participants), we selected the measure used in the publication in the original search which invoked the policy, if relevant, or the measure we judged to be the most representative.
We converted categorical policy variables to a series of dichotomous variables, with “no policy” serving as the reference category. Some policies were subordinate to umbrella policies. For example, provisions regulating cannabis delivery services are only applicable in jurisdictions where recreational cannabis is legal. For jurisdictions and times in which the umbrella policy was not active, provisions conditional on that umbrella policy were coded to 0. We treated two policies where one is nested in the other as separate policy variables. For example, handguns are a type of gun; we treated policies requiring a waiting period on purchases of all handguns and policies requiring a waiting period on purchases of all guns as separate policy variables. Additional details on database coding are provided in the Appendix (“Database coding”).
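The two coding rules in this paragraph (dichotomizing categorical measures against a “no policy” reference, and zeroing provisions when the umbrella policy is inactive) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis; the function names, category labels, and values are hypothetical.

```python
def to_dummies(values, reference="no policy"):
    """Convert a categorical policy variable to 0/1 indicators.

    One indicator is created per non-reference category; rows in the
    reference category are 0 on every indicator.
    """
    levels = sorted({v for v in values if v != reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


def code_conditional(umbrella, provision):
    """Set a provision to 0 wherever its umbrella policy is not active."""
    return [p if u == 1 else 0 for u, p in zip(umbrella, provision)]


# Hypothetical categorical measure, one value per place-time unit.
waiting_period = ["no policy", "handguns only", "all guns", "no policy", "all guns"]
dummies = to_dummies(waiting_period)
print(dummies)

# Hypothetical umbrella policy and a provision that only applies under it.
recreational_legal = [0, 0, 1, 1, 1]
delivery_allowed = [1, 0, 1, 0, 1]  # the first value is inapplicable
delivery_coded = code_conditional(recreational_legal, delivery_allowed)
print(delivery_coded)  # [0, 0, 1, 0, 1]
```

Missing values are not handled here; in this study they were addressed separately through complete case analysis.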
Some databases simply provided a list of the dates of adoption of a set of policies, leaving ambiguity about how many time-units prior to policy adoption should be included in the analysis. In applied work, this would likely depend on the health outcome, data availability, and number of time units needed to establish the pre-policy outcome trends. For our analyses, when converting these to place-time period datasets, we chose to include three time-units before adoption of the first policy and after adoption of the last policy (when relevant), as similar duration run-in and run-out periods are common in social policy studies. If exact dates of policies were provided, we converted the data to a jurisdiction-month-level panel database. We treated missing data—for example, missingness arising because a country was not an entity in a given year—as missing completely at random. We reported the degree of missingness in Appendix Table 2 and conducted complete case analysis. Some databases lacked information on all policies for all available years; in this case, we restricted the range of years or set of policies included in the final database to maximize the number of state-year-policy data points included.
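The conversion from a list of adoption dates to a place-time panel can be sketched as below, using years as the time unit and a three-unit run-in and run-out window. The sketch assumes that a policy stays in force once adopted and that places with no recorded adoption are coded 0 throughout; both are simplifying assumptions made for illustration, and the place names, policy names, and years are hypothetical.

```python
def adoption_panel(adoptions, pad=3):
    """Expand adoption years into one row per place and year.

    adoptions: {place: {policy: adoption_year}}
    pad: number of years to include before the first adoption and
         after the last adoption in the database.
    """
    all_years = [year for by_policy in adoptions.values() for year in by_policy.values()]
    start, end = min(all_years) - pad, max(all_years) + pad
    policies = sorted({p for by_policy in adoptions.values() for p in by_policy})

    rows = []
    for place in sorted(adoptions):
        for year in range(start, end + 1):
            row = {"place": place, "year": year}
            for policy in policies:
                adopted = adoptions[place].get(policy)
                # 1 from the adoption year onward, 0 before or if never adopted.
                row[policy] = int(adopted is not None and year >= adopted)
            rows.append(row)
    return rows


# Hypothetical adoption dates.
adoptions = {
    "State A": {"policy_1": 2005, "policy_2": 2008},
    "State B": {"policy_1": 2007},
}

panel = adoption_panel(adoptions)
print(len(panel))  # 20 rows: 2 places x 10 years (2002 to 2011)
```

The resulting rows match the one-row-per-place-and-time layout described in the database coding section, so they can feed directly into correlation or regression diagnostics.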
Supplemental results
R statistical code for simulating variance inflation due to policy co-occurrence
Footnotes
Funding: This work was supported by the Evidence for Action program of the Robert Wood Johnson Foundation, the National Institute on Alcohol Abuse and Alcoholism (grant number K99 AA028256), and the National Institute on Drug Abuse (grant number T32 DA007233).
Conflicts of interest: None declared.
Revised terminology; additional detail and clarifications on methods; clarification on study goals; added validation of choice of journals and inter-rater reliability for application of inclusion criteria.