Adjustment for Unmeasured Spatial Confounding in Settings of Continuous Exposure Conditional on the Binary Exposure Status: Conditional Generalized Propensity Score-Based Spatial Matching =========================================================================================================================================================================================== * Honghyok Kim * Michelle Bell ## ABSTRACT Propensity score (PS) matching to estimate causal effects of exposure is biased when unmeasured spatial confounding exists. Some exposures are continuous yet dependent on a binary variable (e.g., level of a contaminant (continuous) within a specified radius from residence (binary)). Further, unmeasured spatial confounding may vary by spatial patterns for both continuous and binary attributes of exposure. We propose a new generalized propensity score (GPS) matching method for such settings, referred to as conditional GPS (CGPS)-based spatial matching (CGPSsm). A motivating example is to investigate the association between proximity to refineries with high petroleum production and refining (PPR) and stroke prevalence in the southeastern United States. CGPSsm matches exposed observational units (e.g., exposed participants) to unexposed units by their spatial proximity and GPS integrated with spatial information. GPS is estimated by separately estimating PS for the binary status (exposed vs. unexposed) and CGPS on the binary status. CGPSsm maintains the salient benefits of PS matching and spatial analysis: straightforward assessments of covariate balance and adjustment for unmeasured spatial confounding. Simulations showed that CGPSsm can adjust for unmeasured spatial confounding. Using our example, we found positive association between PPR and stroke prevalence. Our *R* package, *CGPSspatialmatch*, has been made publicly available. Propensity scores (PS) are widely used to estimate causal effects of binary treatments/exposures with observational data (1) and extended for continuous treatments/exposures, referred as generalized PS (GPS) (2, 3). Valid estimation of causal effects using (G)PS relies on the assumption of no unmeasured confounding (4). Exposures, health outcomes, and unmeasured confounders commonly exhibit spatial patterns, introducing potential unmeasured spatial confounding (5-8). In regression-based methods, unmeasured spatial confounding may be addressed through adjustment for spatially correlated errors such as spatial random effects (6, 7, 9, 10). Researchers may follow this practice in estimating (G)PS to minimize unmeasured spatial confounding (11). PS trimming may also be useful to address unmeasured confounding in general (12). Researchers are often confronted with settings where continuous exposure is conditional on binary exposure status. Examples are accessibility to emergency services (e.g., emergency medical care, fire stations) (13-15), exposure to nearby environmental factors (e.g., power plants, oil and gas extractions, high voltage cables, overhead powerlines, greenness) (16-20), and surrounding built environments (e.g., walkability, recreational facilities, transportation networks) (21). If exposure focuses only on proximity to exposure, analysis may include a continuous variable indicating distance from the location of an observational unit (e.g., residence of a subject) or a binary variable indicating whether exposure is located within a specified radius (i.e., “buffer”) from the unit. Researchers may want to consider proximity along with other quantity-wise characteristics of exposure (e.g., number of ambulance/fire trucks available, air pollution level, number of oil and gas wells nearby, level of greenness nearby, number of recreation facilities nearby). In this case, exposure may be based on a continuous variable conditional on a binary variable (e.g., buffer). For example, inverse-distance weighted average number of oil production wells within a 10-mile from maternal residence (19) and the sum of production volumes at oil and gas wells within 1-km from maternal residence (18) were used to assess oil production wells in relation to birth outcomes. Other examples are the number of fast-food restaurants within 1-mile from schools in relation to childhood obesity (22), and tree cover within 50m of patient residences in relation to mortality during tuberculosis treatment (23). Spatial accessibility measured by the two-step floating catchment area method (24), which is a continuous metric where travel time from an observational unit to healthcare providers within a pre-specified maximum buffer distance is a determinant of this metric, was used to explore associations with many health outcomes (13-15). All these continuous exposure metrics are defined as 0 if the exposure does not exist within a specified buffer from observational units (i.e., binary exposure status). Otherwise, the metrics are defined as >0 with a continuous variable. Thus, exposure distribution can be bimodal and/or skewed (Figure 1). Both binary and continuous exposures can commonly exhibit spatial patterns raising concerns of unmeasured spatial confounding. ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F1) Figure 1. Hypothetical Examples of Bimodal and/or Skewed Distribution of Continuous Exposure Conditional on Its Binary Exposure Status To our knowledge, (G)PS methods are limited to settings for either binary or continuous exposure, but not for continuous exposure conditional on its binary exposure status. Researchers may use GPS by treating such exposure as a single continuous variable to account for the degree of exposure in which the binary exposure status is subsumed. However, GPS estimation may be biased when appropriate consideration for bimodal or skewed exposure distribution is not made. To estimate causal effects of such exposure while addressing unmeasured spatial confounding, we introduce a new GPS-matching method. We revamp GPS estimation for continuous exposure conditional on the binary exposure status. GPS is estimated by separately estimating PS and GPS conditional given the binary exposure status and then combining them. We refer to GPS conditional given the binary exposure status as conditional GPS (CGPS). Spatial information is integrated into estimation of PS and CGPS. Exposed observational units are matched to unexposed units by spatial proximity and GPS. We refer to our method as CGPS-based spatial matching (CGPSsm). The benefits of CGPSsm include mimicking randomized clinical trials with covariate balancing and benefits of spatial analyses integrating spatial information in adjusting for unmeasured spatial confounding. Our method development was motivated by emerging research for which unmeasured spatial confounding may be of concern in evaluating causal effects of oil and gas development and refining (18, 19, 25), although our method is applicable to other exposures conditional on their binary exposure status. Our motivating example is to investigate possible links between proximity to refineries with high petroleum production and refining (PPR) and stroke risk in the southeastern United States (U.S.). The high stroke risk in this region was first recognized in the 1960s, giving rise to the name “Stroke Belt” (26). Stroke risk had decreased over time, but the risk is still higher in the Stroke Belt, and the region of higher risk has expanded to include the eastern Texas (27). We note that this region largely overlaps with the Petroleum Administration of Defense Districts-3 region (PADD3) including Alabama, Arkansas, Mississippi, Louisiana, Texas, and New Mexico where approximately two-thirds of U.S. petroleum production and refining (PPR) takes place. Byproducts in a series of operations (i.e., oil extraction, transportation, refinement, storage and distribution) include air, water, and soil pollution (28, 29), which may be linked to cardiovascular and cerebrovascular risks (30, 31). Figure 2 presents locations of 59 operating petroleum refineries for 2015–2017 in the six PADD-3 states and Oklahoma based on data from the U.S. Energy Information Administration (US-EIA). Oklahoma was included due to proximity of their refineries to northern Texas. While (G)PS matching is useful in that it does not strongly depend on a parametric model in confounding adjustment and enables empirically assessed covariate balance of measured confounders (11, 32-34), we should be careful about unmeasured spatial confounding in terms of both proximity to refineries and the level of emissions. The entrenched stroke risk in the southeastern U.S. is not fully explained by many factors such as socio-demographics, health behaviors (e.g., smoking, diet, physical activity), hypertension, diabetes, and structural racism (27, 35, 36). Figure 3 shows semi-variograms for census tract-level stroke prevalence in the seven states in 2018 and for residual of this prevalence after adjustment for measured census tract-level potential confounders. These plots suggest potential unmeasured spatial confounding if the unmeasured determinants of these spatial patterns are correlated with but not determined by proximity to refineries nor pollutants emitted (7, 8). Standard error regarding the association between PPR and stroke prevalence may also be underestimated (7). ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F2) Figure 2. Locations of Petroleum Refineries in the Petroleum Administration of Defense Districts-3 region (PADD3) Including Alabama, Arkansas, Mississippi, Louisiana, Texas, and New Mexico and Oklahoma. ![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F3.medium.gif) [Figure 3.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F3) Figure 3. Variograms for Census Tract-Level Stroke Prevalence (A) and Residual (B) in Seven States. Note: Residual of stroke prevalence was obtained from a regression model that predicts stroke prevalence using demographic variables, socioeconomic variables, and smoking variable. Data sources and variables are presented in Web Table 1. The Matérn covariance function with the smoothness and range parameters are presented. Below we introduce CGPSsm and two applications of this method: a simulation and a real-world case using our motivating example. ## CONDITIONAL GENERALIZED PROPENSITY SCORE-BASED SPATIAL MATCHING ### Notations and Average Treatment Effects in the Treated Let ![Graphic][1] denote a binary indicator of whether the *i*th of *n* observational unit (e.g., subject, area) is exposed ![Graphic][2] or unexposed ![Graphic][3]. Let ![Graphic][4] be a continuous variable of the degree of exposure (![Graphic][5] if ![Graphic][6] if ![Graphic][7]). In our motivating example, ![Graphic][8] is the indicator of whether any refinery is located within a specified distance from census tract *i*. ![Graphic][9] is the actual petroleum production that is assumed to be proportionate to emissions of pollutants from refineries in census tract *i*. Let *Y**i* be a continuous variable of a health outcome (e.g., stroke prevalence) of census tract *i*. Suppose that each observational unit has potential outcomes by an exposure contrast, ![Graphic][10]. We denote *Y**i*(*w*) as the potential outcome by ![Graphic][11]. Assuming that indexing is random, we omit subscript *i*. We define the average treatment effect in the treated (ATT) for *Z**c* as ![Formula][12] In estimation, ATT is averaged over *w* and presented as per Δ unit increase of *Z**c*. ATT answers causal questions such as “to what degree *Z* increases the risk of *Y* in the exposed?” by a potential experiment: what if the exposed had not been exposed? To estimate ATT, the weak unconfoundness assumption is needed (3). Let ***C*** be a minimal set of confounding variables. The weak unconfoundness assumption is: ![Formula][13] Under this assumption, ATT can be estimated by comparing health outcomes across *Z**c*, conditional on ***C***. As the dimension of ***C*** increases, this assumption becomes more difficult to meet. Researchers compress the information of ***C*** into GPS as a balancing score that can be used for confounding adjustment. Let *f*(*Z**c*|***C***) denote GPS as the conditional density of *Z**c* given ***C*** (3). With the weak unconfoundness assumption and theorems proved by Hirano and Imbens (2004), we can write ATT using GPS as: ![Formula][14] (See Appendix). We replace *E*[*Y*|*Z**c* = *t, Z**b* = 1, *f*(*Z**c* = *t*|***C***) = *r*] with ![Graphic][15] by assuming that the observed outcome of the unexposed can serve as the potential outcome of the exposed if the propensity of a counterfactual *Z, Z**cf* = *w*, of the unexposed, ![Graphic][16], is similar to that of the exposed. Use of *Z**cf* = *w* is analogous to standard PS matching (2) where the exposed is matched to the unexposed by *P*(*Z**b* = 1|***C***) where *Z**b* of the unexposed is 0, not 1. For the exposed, *Z**cf* = *Z*. Then, for *w* ∈ (*t, W*], Eq.1 becomes ![Formula][17] This is ATT (See Appendix). We introduce estimation of GPS and ATT using CGPSsm below. ### Estimation of ATT Let ***U*** be a set of observed confounders and ***U*** a set of unmeasured confounders that have spatial patterns such that ***C*** = (***X, U***). We propose CGPSsm to estimate ATT of *Z**c* while addressing unmeasured spatial confounding. #### Double-Matching by spatial proximity and GPS We augment confounding adjustment by utilizing spatial information of observational units in matching and estimation of GPS. The first step of CGPSsm is to match each exposed unit to unexposed unit(s) by spatial proximity with replacement, referred to as one-to-n distance-matching. Proximity is defined as distance measure, *d*, between two locations. Spatially neighboring areas have similar characteristics such that matching by proximity will have better covariate balance in ***C*** = (***X, U***). CGPSsm further addresses imbalance in ***C*** by GPS-matching in each distance-matched stratum. Several matching techniques may be applied. We focus on one-to-one nearest-neighbor matching with/without replacement and one-to-one nearest neighbor caliper matching with/without replacement (37). Finally, CGPSsm creates a one-to-one distance- and GPS-matched sample with/without replacement. For this, in GPS-matching without replacement, if the same unexposed units exist in multiple distance-matched strata after one-to-n distance-matching with replacement, once an unexposed unit is matched to the exposed unit in one stratum by GPS, then the same unexposed unit in the other strata is deleted. Web Figure 1 illustrates this process. Matching with replacement may reduce bias more than matching without replacement when unexposed units are sparse (33) but it may reduce precision if fewer unexposed units in the original sample are matched (e.g., multiple exposed units are matched to one unexposed unit) (33). #### Estimation of GPS GPS estimation may be biased if the skewed/bimodal distribution of *Z**c* due to *Z**b* is not adequately considered. In CGPSsm, GPS is estimated by separately estimating PS for *Z**b* and estimating CGPS that is GPS for *Z**c* given *Z**b*. By the law of total probability, GPS can be decomposed as: ![Formula][18] We are interested in only *w* > 0. When *w* > 0, *f*(*Z**c* = *w*|***C***, *Z**b* = 0) = 0, such that ![Formula][19] We introduce estimation of CGPS. Of the exposed, CGPS can be estimated as standard GPS estimation (3, 38). We assume that CGPS follows a normal distribution and, 1. Fit a CGPS model to predict *Z**c* and ***C*** for *only the exposed* and get ![Graphic][20] and ![Graphic][21]. 2. CGPS of the exposed is ![Graphic][22] where ![Graphic][23] is a set of regression coefficients to predict *Z**c*, ![Graphic][24] is a standard deviation of residuals, and ![Graphic][25] is ***C*** of the exposed. Recall *Z**cf* = *Z* = *w* for the exposed. We estimate CGPS of the unexposed by ![Formula][26] where ![Graphic][27] is ***C*** of the unexposed. In the original dataset, there are many different values of *w* over the exposed. After distance-matching, there would be many strata for the distance-matched pairs of one exposed unit to one or several unexposed unit(s). In each stratum, CGPS of the unexposed is estimated according to *w* of the exposed. GPS is then estimated. Web Figure 1 illustrates this process. To build PS and CGPS models, spatial coordinates may be used to adjust for *U* such as spatial regression models and gradient boosting algorithms (11). This augments spatial confounding adjustment in addition to distance-matching. We consider generalized additive models (GAM) with spatial smoother using *mgcv* R package (39) and the extreme gradient boosting (XGBoost) using *xgboost* R package (40) including coordinates as covariates. Spatial models with a random effect as implemented using *spBayes* R package (41) may be an alternative but we did not consider this because of its poor performance in PS matching to adjust for spatial confounding (11). #### Diagnostics of Covariate Balance Within each stratum of the matched pairs, one exposed unit having *Z**c* = *w* is compared to one unexposed unit having ![Graphic][28] to estimate ATT. Therefore, we suggest standardized mean difference (SMD) of ***X*** between the exposed and unexposed (34, 42). Cut-off values to check covariate balance of 0.1 or 0.25 are often used in PS methods (42). However, there is no clear threshold although use of a smaller cut-off may intuitively result in stronger adjustment (32). Correlation in the original dataset with causal knowledge (8) may be informative to build a CGPS model and thereby to achieve covariate balance regarding SMD (43, 44). #### Selecting spatial distance threshold in the distance-matching and caliper in the GPS-matching Researchers can pick *d* to augment confounding adjustment to address unmeasured spatial confounding. The lower *d* matches exposed units to more closely located unexposed units and potentially makes stronger adjustment for *U*. Visual inspection of semi-variograms (Figure 3) can inform selection of *d*. In our example, we use *d*=0.1. Users may select the lowest *d* that minimizes SMD across different *d* values to address covariate balance in ***X*** as well. Too low *d* may result in many exposed units being unmatched. The omission of unmatched exposed units in analysis can lead to loss of precision (33, 45). This can also alter the estimand if ATT for the dropped exposed units is not equal to ATT for the included exposed units (33). In caliper matching, researchers select caliper width. We refer to *cw* as a factor of the standard deviation of GPS. The selection of *cw* is important for covariate balancing. Intuitively, a tighter caliper may reduce bias and precision (33). A too narrow caliper with low *d* may result in many exposed units being unmatched. Again, the omission of unmatched exposed units in analysis can also alter the estimand (33). For *cw* → ∞, nearest neighbor caliper matching becomes nearest neighbor matching. #### R Package CGPSsm is provided as *R* package, *CGPSspatialmatch* in the first author’s ([https://github.com/HonghyokKim/CGPSspatialmatch](https://github.com/HonghyokKim/CGPSspatialmatch)). An illustrative example with R codes is provided at [https://hkimresearch.com](https://hkimresearch.com). ## SIMULATION STUDY ### Methods We conducted a simulation study to evaluate CGPSsm. We briefly overview methods below. See the Web Material for details. #### Data generation We generated three i.i.d confounders (*X**1*, *X**2*, *X**3*) from the standardized normal distribution and one spatial confounder (*U*) from a Gaussian process with the Matérn covariance function. Nine pairs of the two spatial parameters (*k*=smoothness, *π*=range) in the Matérn covariance function were considered: *k*=0.1, 0.5, 1; *π*=0.1, 0.5, or 1. We call these the nine simulation scenarios. Higher *k* reflects smoother spatial patterns. Higher *π* reflects more correlated spatial units. See Minasny and McBratney (2005) (46) for details. Web Figure 2 shows nine spatial confounding patterns. For each spatial pattern, we generated 200 samples with 484 (22**×**22) fixed locations. *Z**b* was generated using logistic regression with the four confounders (i.e., approximately 15% exposed). *Z**c*>0 was generated using linear regression with the four confounders if *Z**b*=1. Otherwise, *Z**c*=0. *Y* was generated using Poisson regression with the four confounders and *Z**c*. #### CGPSsm We conducted CGPSsm with *U* measured, referred to as true CGPSsm, and CGPSsm with *U* unmeasured. One-to-n distance-matching with replacement was conducted with *d* as standardized Euclidian distance of 0.1. For GPS-matching, one-to-one nearest neighbor matching with/without replacement and one-to-one nearest neighbor caliper matching with/without replacement were conducted. We fit PS models using either a GAM with a binomial distribution and the logistic link including *X**1*, *X**2*, *X**3*, and a spatial smoother or XGBoost including *X**1*, *X**2*, *X**3*, and coordinates as covariates. We used grid search with cross-validation to select hyperparameters of XGBoost, which is embedded in our *R* package. We fit CGPS models using GAM with a normal distribution and the identity link including *X**1*, *X**2*, *X*3 and a spatial smoother using only the exposed of each simulated sample. We generated 500 bootstrapped samples from the distance-and GPS-matched samples. Coefficient estimates and standard error estimates for ATT over the bootstrapped samples were obtained. #### Other methods To illustrate the degree of unmeasured spatial confounding in simulation samples, we fit Poisson regression including only *X**1*, *X**2*, and *X**3* as covariates (i.e., no adjustment for spatial confounding). We conducted standard GPS-based inverse probability weighting (IPW) to see the degree of unmeasured spatial confounding when dependency of *Z**c* on *Z**b* is not considered. GPS was estimated by a GAM with spatial smoother (Naïve IPW-GAM) or XGBoost (Naïve IPW-XGBoost). GPS was estimated without consideration of dependency of *Z**c* on *Z**b*. We acknowledge that IPW and conventional Poisson regression is not designed to estimate ATT. The average treatment effect in the population and conditional average effect were set to be identical to ATT in our simulation. ### Results Bias by unmeasured spatial confounding in our simulation settings was approximately -60% based on Poisson regression without spatial adjustment for all nine simulation scenarios (Web Figure 3). For Naïve IPW methods, bias ranged from -20% to -45% in scenarios of *k*=0.5 or 1 with *π* =0.5 or 1 but bias increased for other scenarios. Nominal coverage of 95% confidence intervals ranged from 1% to 10.5% for all the scenarios (Web Figure 4). ![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F4.medium.gif) [Figure 4.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F4) Figure 4. Percentage of Exposed Units Matched to Unexposed Units by CGPSsm Methods With Replacement. Note: True=True PS and CGPS models in CGPSsm (i.e., *U* is known); GAM=GAM with spatial smoother for PS and CGPS estimations; XGBoost=XGBoost for PS estimation and GAM with spatial smoother for CGPS estimation; The number in brackets indicate *cw* in one-to-one nearest neighbor caliper matching; *cw* = ∞ indicates one-to-one nearest neighbor matching. Figure 4 presents the percentage of exposure units matched to unexposed units by CGPSsm methods with replacement. The number in brackets in Figure 4 indicates *cw* in one-to-one nearest neighbor caliper matching. *cw* = ∞ indicates one-to-one nearest neighbor matching. Fewer exposed units were matched when the caliper was narrower. Figure 5 presents bias by CGPSsm methods with replacement. Figure 6 presents balance for *X**1*, *X**2*, *X**3*, and *U* by CGPSsm methods with replacement regarding three selected scenarios. CGPSsm with *U* known showing bias around 0% when *cw*=0.4, 0.6, or 0.8 confirms that CGPSsm can estimate ATT. The bias was around -5% to -10% when *cw*=∞. Covariates were more balanced with lower *cw*. Note that for PS matching in general, use of the true PS model may not completely reduce bias, depending on matching techniques (37, 47). ![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F5.medium.gif) [Figure 5.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F5) Figure 5. Bias by CGPSsm Methods With Replacement. Note: True=True PS and CGPS models in CGPSsm (i.e., *U* is known); GAM=GAM with spatial smoother for PS and CGPS estimations; XGBoost=XGBoost for PS estimation and GAM with spatial smoother for CGPS estimation; The number in brackets indicate *cw* in one-to-one nearest neighbor caliper matching; *cw* = ∞ indicates one-to-one nearest neighbor matching. ![Figure 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F6.medium.gif) [Figure 6.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F6) Figure 6. Covariate Balance Before and After CGPSsm With Replacement. Note: True=True PS and CGPS models in CGPSsm (i.e., *U* is known); GAM=GAM with spatial smoother for PS and CGPS estimations; XGBoost=XGBoost for PS estimation and GAM with spatial smoother for CGPS estimation; The number in brackets indicate *cw* in one-to-one nearest neighbor caliper matching; *cw* = ∞ indicates one-to-one nearest neighbor matching; Grey dashed lines indicate ±0.1. Grey dotted lines indicate ±0.25. When *U* is missing, CGPSsm with GAM and XGBoost showed good adjustment for *U* in scenarios of smoother spatial patterns of *U* (i.e., *k*=0.5 or 1) (Figure 5). Specifically, covariates were more balanced with lower *cw* (Figure 6), resulting in stronger adjustment (Figure 5). Bias ranged roughly from -60% to 5% for caliper matching. For *k*=0.1 where *U* acted more like a randomly distributed confounder (but is still a spatial confounder), CGPSsm adjusted for *U* but to a limited degree. For XGBoost, we needed to use higher value of *cw* than that for CGPSsm with GAM because XGBoost in PS estimation greatly distinguished PS of the exposed and of the unexposed. Figure 7 presents root mean squared error (RMSE) by CGPSsm methods with replacement. In scenarios of *k*=0.5 or 1 with *π* =0.5 or 1, for *cw* from 0.2 to 0.8, CGPSsm with GAM showed slightly higher RMSE than CGPSsm with *U* known. CGPSsm with GAM generally outperformed CGPSsm with XGBoost. ![Figure 7.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F7.medium.gif) [Figure 7.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F7) Figure 7. Root Mean Squared Error by CGPSsm Methods With Replacement. Note: True=True PS and GPS models in CGPSsm; GAM=GAM with spatial smoother for PS and CGPS estimations; XGBoost=XGBoost for PS estimation and GAM with spatial smoother for CGPS estimation; The number in brackets indicate *cw* in one-to-one nearest neighbor caliper matching; *cw* = ∞ indicates one-to-one nearest neighbor matching. Figure 8 presents nominal coverage of 95% confidence intervals by CGPSsm methods with replacement. For less biased methods and scenarios (Figure 5), coverage was higher than 95%. ![Figure 8.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F8.medium.gif) [Figure 8.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F8) Figure 8. Nominal Coverage of 95% Confidence Intervals by CGPSsm Methods With Replacement. Note: True=True PS and CGPS models in CGPSsm (i.e., *U* is known); GAM=GAM with spatial smoother for PS and CGPS estimations; XGBoost=XGBoost for PS estimation and GAM with spatial smoother for CGPS estimation; The number in brackets indicate *cw* in one-to-one nearest neighbor caliper matching; *cw* = ∞ indicates greedy one-to-one nearest neighbor matching. Web Figures 5–9 shows CGPSsm methods without replacement regarding bias, RMSE, nominal coverage, covariate balance in selected scenarios, and the percentage of exposure units being matched to unexposed units. They showed good adjustment for *U* while CGPSsm with replacement performed slightly better regarding bias and RMSE. ## APPLICATION STUDY ### Methods We applied CGPSsm to our motivating example of refineries in relation to stroke prevalence with PPR of refineries conditional on whether refineries are located within a buffer distance from census tracts. We briefly overview methods below; see the Web Material for details. #### Exposure classification Higher overall production capacity is assumed in this example to reflect higher emission of pollutants (18). To estimate residential exposure to PPR, we considered potential petroleum production of each refinery, as a dimension of exposure in addition to distance between census tracts and refineries. We calculated inverse-distance weighted average of the amount of petroleum production (APP) within a pre-specified buffer distance from the centroid of census tracts as: ![Formula][29] *X**ct* is the inverse squared distance-weighted average of APP (barrels/day) at census tract *ct* for 2015-2017. APP was obtained from US-EIA. *D**c,r* is the distance between the centroid of census tract *ct* and refinery *r*. R is a set of refineries within a pre-specified buffer distance from the census tracts’ centroid. We calculated *X**ct* values corresponding to 5-km buffers. This threshold was chosen based on air pollution dispersion from smokestacks of refineries and empirical correlation between petroleum production and measured levels of SO2, which is a byproduct of petroleum production and refining (48). #### Outcome and Confounding variables Outcome of interest is census tract-level stroke prevalence (%) in 2018. Census tract-level potential confounders include subpopulation distributions by age, sex, race/ethnicity, education, health insurance, median household income, and current smoking. Data sources and details are in Web Table 1. View this table: [Table 1.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/T1) Table 1. Associations Between Exposure to Petroleum Production Refining and Stroke Prevalence by Different Methods #### Statistical analysis To estimate census tract-level cross-sectional association of PPR with stroke prevalence in seven states (Figure 1), we used CGPSsm with GAM and CGPSsm with XGBoost. One-to-one nearest neighbor matching with/without replacement were used. One-to-one nearest neighbor with caliper width of 0.2 with/without replacement was used for only CGPSsm with GAM. We also fit a regression model without spatial adjustment to examine degree of unmeasured spatial confounding. Detailed methods are in the Web Material. ### Results Figure 9 presents exposure distribution (*X**ct*) that is bimodal: Figure 9A for all census tracts; Figure 9B for exposed census tracts. ![Figure 9.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/02/02/2022.02.01.22270282/F9.medium.gif) [Figure 9.](http://medrxiv.org/content/early/2022/02/02/2022.02.01.22270282/F9) Figure 9. Exposure Distribution in the Motivating Example: All 10381 Census Tracts (A); 316 Exposed Census Tracts (B). Table 1 shows association between PPR exposure and stroke prevalence. Without spatial adjustment, PPR exposure was associated with 0.02 percent point increase (95% CI: -0.03, 0.07) in stroke prevalence per standard deviation increase of PPR. CGPSsm methods consistently showed approximately 0.20 percent point increase in stroke prevalence. The CGPSsm method with the most balanced covariates produced the estimate as 0.20 (0.04, 0.35) while 34.2% of the exposed were dropped in analysis. SMDs for 14 potential confounders (Web Table 1) before and after CGPSsm are presented in Web Figure 10. CGPSsm with GAM with replacement with *cw*=0.2 or ∞ showed the most balanced covariates with cut-off of ±0.1. Other CGPSsm methods achieved covariate balance with cut-off of ±0.25. ## DISCUSSION We developed a new GPS-matching method, CGPSsm, to estimate ATT of continuous exposure conditional on binary exposure status while adjusting for unmeasured spatial confounding. CGPSsm enables researchers to empirically assess balance in measured covariates and adjust for unmeasured spatial confounding by leveraging spatially indexed data in matching and GPS estimation procedures. Simulations under various spatial patterns confirm that CGPSsm can greatly reduce unmeasured spatial confounding. We applied CGPSsm to our motivating example, finding that PPR is associated with stroke prevalence in seven southern U.S. states. CGPSsm has potential for many epidemiological studies where exposure has both binary and continuous attributes (e.g., examples in the first section). We note that many areas and possibilities of CGPSsm remain unaddressed, which warrant future investigations. There exist many other matching techniques (e.g., optimal matching) (37). We focused on one-to-one nearest neighbor matching with/without caliper and with/without replacement because they are frequently used in epidemiological studies (33, 37). Bootstrapping methods could be studied regarding standard error estimation (49-52). Theoretical and simulation studies suggest that m-out-of-n bootstrapping may be superior regarding PS matching with replacement (51, 52). Regression techniques to build a PS and CGPS model other than GAM and XGBoost may be applied. PS trimming may be integrated into CGPSsm to augment adjustment for unmeasured confounding (12). CGPSsm currently shares limitations of PS such as ambiguity of variable specification and selection (43, 44, 47) and inability to check balance in unmeasured variables (53). ## Supporting information Web Materials [[supplements/270282_file02.pdf]](pending:yes) ## Data Availability All data produced in the present study are available upon reasonable request to the authors. ## ACKNOWLEDGEMENT This publication was developed under Assistance Agreement No. RD835871 awarded by the U.S. Environmental Protection Agency to Yale University. It has not been formally reviewed by EPA. The views expressed in this document are solely those of the authors and do not necessarily reflect those of the Agency. EPA does not endorse any products or commercial services mentioned in this publication. Research reported in this publication was also supported by the National Institute On Minority Health And Health Disparities of the National Institutes of Health under Award Number R01MD012769. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2021R1A6A3A14039711) ## Appendix ### Derivations of Average treatment effect in the treated (ATT) in Eq.1 and Eq.2 We note two theorems proved by Hirano and Imbens (2004) using our notations: Theorem 1. *E*[*Y*(*w*)] = *E*[*β*(*w, f*(*Z**c* = *w*|***C***)*t*] and; Theorem 2. *β*(*w, f*(*Z**c* = *w*|***C***)*t* = *E*[*Y*(*w*)|*f*(*Z**c* = *w*|***C***) = *r*] = *E*[*Y*|*Z**c* = *w, f*(*Z**c* = *w*|***C***) = *r*] With the weak unconfoundness assumption and these theorems, we can write ATT using GPS as follows. ![Formula][30] *E*[*Y*|*Z**c* = *t, Z**b* = 1, *f*(*Z**c* = 0|***C***) = *r*] cannot exist. So, we leverage a concept in PS matching. In PS matching, *P*(*Z**b* = 16***C***) is used for matching. ATT can be expressed as *E*[*Y*(1) − *Y*(0)|*Z**b* = 1] = *E*[*Y*(1)|*Z**b* = 1] − *E*[*Y*(0)|*Z**b* = 1] = *E*[*Y*|*Z**b* = 1, *P*(*Z**b* = 1|***C***)= − *E*[*Y*|*Z**b* = *t, P*(*Z**b* = 1|***C***). Similarly, we replace *E*[*Y*|*Z**c* = 0 *Z**b* = 1, *f*(*Z**c* = 0|***C***) = *r*] in Eq.1 with ![Graphic][31]. Note that *Z**cf* of the exposed units equals to *Z*. Then, Eq.1 will be replaced with ![Formula][32] Eq.2 is still ATT as follows: ![Formula][33] ## Footnotes * Email: honghyok.kim{at}yale.edu, Telephone: (203) 432-9869, Address: 195 Prospect Street, New Haven, CT, 06511, United States. * Received February 1, 2022. * Revision received February 1, 2022. * Accepted February 2, 2022. * © 2022, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/) ## REFERENCES 1. 1.Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. American journal of epidemiology 1999;150(4):327–33. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/oxfordjournals.aje.a010011&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10453808&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000081967600001&link_type=ISI) 2. 2.Imbens GW. The role of the propensity score in estimating dose-response functions. Biometrika 2000;87(3):706–10. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/biomet/87.3.706&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000089678500017&link_type=ISI) 3. 3.Hirano K, Imbens GW. The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives 2004;226164:73–84. 4. 4.Stürmer T, Schneeweiss S, Brookhart MA, et al. Analytic strategies to adjust confounding using exposure propensity scores and disease risk scores: nonsteroidal antiinflammatory drugs and short-term mortality in the elderly. American journal of epidemiology 2005;161(9):891–8. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwi106&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15840622&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000228635100012&link_type=ISI) 5. 5.Flanders WD, Klein M, Darrow LA, et al. A method to detect residual confounding in spatial and other observational studies. Epidemiology (Cambridge, Mass) 2011;22(6):823. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21968772&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 6. 6.Paciorek CJ. The importance of scale for spatial-confounding bias and precision of spatial regression estimators. Statistical science: a review journal of the Institute of Mathematical Statistics 2010;25(1):107. 7. 7.Beale L, Abellan JJ, Hodgson S, et al. Methodologic issues and approaches to spatial epidemiology. Environmental health perspectives 2008;116(8):1105–10. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1289/ehp.10816&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18709139&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000258270200034&link_type=ISI) 8. 8.Hernán MA, Hernández-Díaz S, Werler MM, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. American journal of epidemiology 2002;155(2):176–84. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/155.2.176&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11790682&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000173389100011&link_type=ISI) 9. 9.Hodges JS, Reich BJ. Adding spatially-correlated errors can mess up the fixed effect you love. The American Statistician 2010;64(4):325–34. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1198/tast.2010.10052&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000286451800007&link_type=ISI) 10. 10.Lee D, Sarran C. Controlling for unmeasured confounding and spatial misalignment in long-term air pollution and health studies. Environmetrics 2015;26(7):477–87. 11. 11.Papadogeorgou G, Choirat C, Zigler CM. Adjusting for unmeasured spatial confounding with distance adjusted propensity score matching. Biostatistics 2019;20(2):256–72. 12. 12.Stürmer T, Rothman KJ, Avorn J, et al. Treatment effects in the presence of unmeasured confounding: dealing with observations in the tails of the propensity score distribution—a simulation study. American journal of epidemiology 2010;172(7):843–54. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwq198&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20716704&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000282166900014&link_type=ISI) 13. 13.Min S, Kim D, Lee CK. Association between spatial accessibility to fire protection services and unintentional residential fire injuries or deaths: a cross-sectional study in Dallas, Texas. BMJ open 2019;9(5):e023780. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYm1qb3BlbiI7czo1OiJyZXNpZCI7czoxMToiOS81L2UwMjM3ODAiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8wMi8wMi8yMDIyLjAyLjAxLjIyMjcwMjgyLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 14. 14.Cao Y, Stewart K, Wish E, et al. Determining spatial access to opioid use disorder treatment and emergency medical services in New Hampshire. Journal of substance abuse treatment 2019;101:55–66. 15. 15.Kim H, Bell ML. Air Pollution and COVID-19 Mortality in New York City. American Journal of Respiratory and Critical Care Medicine 2021(ja). 16. 16.de Vocht F, Lee B. Residential proximity to electromagnetic field sources and birth weight: Minimizing residual confounding using multiple imputation and propensity score matching. Environment international 2014;69:51–7. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.envint.2014.04.012&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24815339&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 17. 17.Deziel NC, Brokovich E, Grotto I, et al. Unconventional oil and gas development and health outcomes: A scoping review of the epidemiological research. Environmental research 2020;182:109124. 18. 18.Tran KV, Casey JA, Cushing LJ, et al. Residential proximity to oil and gas development and birth outcomes in California: a retrospective cohort study of 2006–2015 births. Environmental health perspectives 2020;128(6):067001. 19. 19.McKenzie LM, Guo R, Witter RZ, et al. Birth outcomes and maternal residential proximity to natural gas development in rural Colorado. Environmental health perspectives 2014;122(4):412–7. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1289/ehp.1306722&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24474681&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 20. 20.Rojas-Rueda D, Nieuwenhuijsen MJ, Gascon M, et al. Green spaces and mortality: a systematic review and meta-analysis of cohort studies. The Lancet Planetary Health 2019;3(11):e469–e77. 21. 21.Brownson RC, Hoehner CM, Day K, et al. Measuring the built environment for physical activity: state of the science. American journal of preventive medicine 2009;36(4):S99–S123. e12. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.amepre.2009.01.005&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19285216&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000264407900004&link_type=ISI) 22. 22.Alviola IV PA, Nayga Jr RM, Thomsen MR, et al. The effect of fast-food restaurants on childhood obesity: a school level analysis. Economics & Human Biology 2014;12:110–9. 23. 23.Blount RJ, Pascopella L, Barry P, et al. Residential urban tree canopy is associated with decreased mortality during tuberculosis treatment in California. Science of The Total Environment 2020;711:134580. 24. 24.Luo W, Wang F. Measures of spatial accessibility to health care in a GIS environment: synthesis and a case study in the Chicago region. Environment and Planning B: Planning and Design 2003;30(6):865–84. 25. 25.Tustin AW, Hirsch AG, Rasmussen SG, et al. Associations between unconventional natural gas development and nasal and sinus, migraine headache, and fatigue symptoms in Pennsylvania. Environmental health perspectives 2017;125(2):189–97. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 26. 26.Casper ML, Wing S, Anda RF, et al. The shifting stroke belt: changes in the geographic pattern of stroke mortality in the United States, 1962 to 1988. Stroke 1995;26(5):755–60. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6OToic3Ryb2tlYWhhIjtzOjU6InJlc2lkIjtzOjg6IjI2LzUvNzU1IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMDIvMDIvMjAyMi4wMi4wMS4yMjI3MDI4Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 27. 27.Howard G, Howard VJ. Twenty years of progress toward understanding the stroke belt. Stroke 2020;51(3):742–50. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 28. 28.Lynch MJ, Stretesky PB, Burns RG. Determinants of environmental law violation fines against petroleum refineries: Race, ethnicity, income, and aggregation effects. Society and Natural Resources 2004;17(4):333–47. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/08941920490278782&link_type=DOI) 29. 29.Radelyuk I, Tussupova K, Klemeš JJ, et al. Oil refinery and water pollution in the context of sustainable development: Developing and developed countries. Journal of Cleaner Production 2021;302:126987. 30. 30.Zhang H-W, Kok VC, Chuang S-C, et al. Long-term ambient hydrocarbons exposure and incidence of ischemic stroke. PloS one 2019;14(12):e0225363. 31. 31.Ljungman PL, Mittleman MA. Ambient air pollution and stroke. Stroke 2014;45(12):3734–41. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6OToic3Ryb2tlYWhhIjtzOjU6InJlc2lkIjtzOjEwOiI0NS8xMi8zNzM0IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMDIvMDIvMjAyMi4wMi4wMS4yMjI3MDI4Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 32. 32.Austin PC. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Statistics in medicine 2009;28(25):3083–107. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.3697&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19757444&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000271409300008&link_type=ISI) 33. 33.Lunt M. Selecting an appropriate caliper can be essential for achieving good balance with propensity score matching. American journal of epidemiology 2014;179(2):226–35. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwt212&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24114655&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 34. 34.Zhang Z, Kim HJ, Lonjon G, et al. Balance diagnostics after propensity score matching. Annals of translational medicine 2019;7(1). 35. 35.Esenwa C, Ilunga Tshiswaka D, Gebregziabher M, et al. Historical slavery and modern-day stroke mortality in the United States stroke belt. Stroke 2018;49(2):465–9. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6OToic3Ryb2tlYWhhIjtzOjU6InJlc2lkIjtzOjg6IjQ5LzIvNDY1IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMDIvMDIvMjAyMi4wMi4wMS4yMjI3MDI4Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 36. 36.Liao Y, Greenlund KJ, Croft JB, et al. Factors explaining excess stroke prevalence in the US Stroke Belt. Stroke 2009;40(10):3336–41. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6OToic3Ryb2tlYWhhIjtzOjU6InJlc2lkIjtzOjEwOiI0MC8xMC8zMzM2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMDIvMDIvMjAyMi4wMi4wMS4yMjI3MDI4Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 37. 37.Austin PC. A comparison of 12 algorithms for matching on the propensity score. Statistics in medicine 2014;33(6):1057–69. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.6004&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24123228&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 38. 38.Zhu Y, Coffman DL, Ghosh D. A boosting algorithm for estimating generalized propensity scores with continuous treatments. Journal of causal inference 2015;3(1):25–40. 39. 39.Wood S, Wood MS. Package ‘mgcv’. R package version 2015;1:29. 40. 40.Chen T, He T, Benesty M, et al. Xgboost: extreme gradient boosting. R package version 04-2 2015;1(4):1–4. 41. 41.Finley AO, Banerjee S, Carlin BP. spBayes: an R package for univariate and multivariate hierarchical point-referenced spatial models. Journal of statistical software 2007;19(4):1. 42. 42.Stuart EA, Lee BK, Leacy FP. Prognostic score–based balance measures can be a useful diagnostic for propensity score methods in comparative effectiveness research. Journal of clinical epidemiology 2013;66(8):S84–S90. e1. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jclinepi.2013.01.013&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23849158&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 43. 43.Brookhart MA, Schneeweiss S, Rothman KJ, et al. Variable selection for propensity score models. American journal of epidemiology 2006;163(12):1149–56. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwj149&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16624967&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000238424900011&link_type=ISI) 44. 44.Zhu Y, Schonbach M, Coffman DL, et al. Variable selection for propensity score estimation via balancing covariates. Epidemiology 2015;26(2):e14–e5. 45. 45.Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician 1985;39(1):33–8. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2307/2683903&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1985ABK0900009&link_type=ISI) 46. 46.Minasny B, McBratney AB. The Matérn function as a general model for soil variograms. Geoderma 2005;128(3-4):192–207. [GeoRef](http://medrxiv.org/lookup/external-ref?access_num=2005062861&link_type=GEOREF) 47. 47.King G, Nielsen R. Why propensity scores should not be used for matching. Political Analysis 2019;27(4):435–54. 48. 48.Chen JA, Zapata AR, Sutherland AJ, et al. Sulfur dioxide and volatile organic compound exposure to a community in Texas City, Texas evaluated using AERMOD and empirical monitoring data. American Journal of Environmental Science 2012;8(6):622–32. 49. 49.Austin PC, Small DS. The use of bootstrapping when using propensity-score matching without replacement: a simulation study. Statistics in medicine 2014;33(24):4306–19. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.6276&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25087884&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) 50. 50.Abadie A, Imbens GW. On the failure of the bootstrap for matching estimators. Econometrica 2008;76(6):1537–57. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3982/ECTA6474&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000261055400010&link_type=ISI) 51. 51.Bickel PJ, Sakov A. On the choice of m in the m out of n bootstrap and confidence bounds for extrema. Statistica Sinica 2008:967–85. 52. 52.Alin A, Martin MA, Beyaztas U, et al. Sufficient m-out-of-n (m/n) bootstrap. Journal of Statistical Computation and Simulation 2017;87(9):1742–53. 53. 53.Reiffel JA. Propensity Score Matching: The ‘Devil is in the Details’ Where More May Be Hidden than You Know. The American journal of medicine 2020;133(2):178–81. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F02%2F02%2F2022.02.01.22270282.atom) [1]: /embed/inline-graphic-1.gif [2]: /embed/inline-graphic-2.gif [3]: /embed/inline-graphic-3.gif [4]: /embed/inline-graphic-4.gif [5]: /embed/inline-graphic-5.gif [6]: /embed/inline-graphic-6.gif [7]: /embed/inline-graphic-7.gif [8]: /embed/inline-graphic-8.gif [9]: /embed/inline-graphic-9.gif [10]: /embed/inline-graphic-10.gif [11]: /embed/inline-graphic-11.gif [12]: /embed/graphic-4.gif [13]: /embed/graphic-5.gif [14]: /embed/graphic-6.gif [15]: /embed/inline-graphic-12.gif [16]: /embed/inline-graphic-13.gif [17]: /embed/graphic-7.gif [18]: /embed/graphic-8.gif [19]: /embed/graphic-9.gif [20]: /embed/inline-graphic-14.gif [21]: /embed/inline-graphic-15.gif [22]: /embed/inline-graphic-16.gif [23]: /embed/inline-graphic-17.gif [24]: /embed/inline-graphic-18.gif [25]: /embed/inline-graphic-19.gif [26]: /embed/graphic-10.gif [27]: /embed/inline-graphic-20.gif [28]: /embed/inline-graphic-21.gif [29]: /embed/graphic-16.gif [30]: /embed/graphic-19.gif [31]: /embed/inline-graphic-22.gif [32]: /embed/graphic-20.gif [33]: /embed/graphic-21.gif