A comparison of different methods for handling measurements affected by medication use ====================================================================================== * Jungyeon Choi * Olaf M. Dekkers * Saskia le Cessie ## Abstract In epidemiological research it is common to encounter measurements affected by medication use, such as blood pressure lowered by antihypertensive drugs. When one is interested in the relation between the variables not affected by medication, ignoring medication use can cause bias. Several methods have been proposed, but the problem is often ignored or handled with generic methods, such as excluding individuals on medication or adjusting for medication use in the analysis. This study aimed to investigate methods for handling measurements affected by medication use when one is interested in the relation between the unaffected variables and to provide guidance for how to optimally handle the problem. We focused on linear regression and distinguish between the situation where the affected measurement is an exposure, confounder or outcome. In the Netherlands Epidemiology of Obesity study and in several simulated settings, we compared generic and more advanced methods, such as substituting or adding a fixed value to the treated values, regression calibration, censored normal regression, Heckman’s treatment model and multiple imputation methods. We found that often-used methods such as adjusting for medication use could result in substantial bias and that methods for handling medication use should be chosen cautiously. Keywords * Medication effect * measurement error * censored data * missing data * regression calibration * multiple imputation ## 1. Introduction Measurements affected by medication use are commonly encountered in epidemiological research. Examples are glucose levels lowered by glucose lowering medications or blood pressure relieved by antihypertensive drugs. Depending on research questions, these measurements can be an outcome of interest or covariates. Although researchers often are interested in the effect of certain drugs, the relation between the values not affected by medication can also be the primary scientific interest. However, the value of a variable had an individual not been treated is often not available. Using the values affected by medication instead may lead to biased results. In clinical research, however, medication use is often ignored or handled with naïve methods such as excluding medication users or adjusting for medication use. For outcomes affected by medication use, these naïve methods may introduce bias (1-4). Several methods have been proposed to handle measurements affected by medication use. Relatively simple methods are adding an expected medication effect to treated values or substituting the treated values for other values (1, 4, 5). More sophisticated methods include censored normal regression, Heckman’s treatment model, quantile regression, measurement error methods or advanced imputation techniques (1, 2, 6, 7). However, these methods are seldom used in applied research. Additionally, many of the suggested methods are limited to outcomes affected by medication, and little has been known on how to handle exposures or confounders affected by medication. The aim of this study is to investigate methods for handling measurements affected by medication use when the unaffected values are of interest. We focused on etiological studies where effects are estimated by linear regression. We discuss different methods and compare these methods in a large cross-sectional study of the Netherlands Epidemiology of Obesity (NEO) study and in several simulation scenarios generated based on the NEO data. The scenarios vary on whether the exposure, confounder or the outcome is affected by medication use. Based on the results of simulation study, we provide guidance on how to optimally handle the medication effect. ## 2. Methods to handle measurements affected by medication use We will consider the situation where for some individuals a variable is affected by medication use (e.g. blood pressure affected by antihypertensive drug), while the relation between variables when no one is affected by medication is of interest. For convenience, we assume that medication is taken when values are high, aiming to lower the values. Depending on the research question, the variable affected by medication use can be the exposure, a confounder or the outcome in an analysis. The problem of measurements affected by medication can be viewed from different perspectives; it can be viewed as a missing data problem, because for people on medication their untreated values are unobserved. It may be viewed as a measurement error problem, as the observed values differ systematically from the values had the treated individuals not been treated. It could also be viewed as a censoring problem, if we assume that the unobserved untreated values are at least as high as the observed values under treatment. Depending on how one approaches the problem, methods for missing data, for measurement error or for censored observations can be used. Table 1 summarizes methods for handling measurements affected by medication use. The methods can be categorized as generic methods [M1-M5], a method for the exposure affected by medication [M6], methods for the outcome affected by medication [M7-M10], and multiple imputation approaches [M11-M13]. Detailed descriptions on each method and underlying assumptions are available in Supplementary Material 1. All methods are applied to empirical and simulated data in the following sections. View this table: [Table 1.](http://medrxiv.org/content/early/2022/04/27/2022.04.23.22273899/T1) Table 1. Overview of Methods for Handling Measurements Affected by Medication use. ## 3. Example: the Netherlands Epidemiology of Obesity Study The Netherlands Epidemiology of Obesity (NEO) study is a population-based study designed to investigate pathways that lead to obesity-related diseases. From 2008 to 2012, 6,671 individuals aged 45–65 years were included in the study. Participants brought all medication they were using to the NEO study site, which was coded using the Anatomical Therapeutic Chemical Classification (8). Details can be found elsewhere (9). The NEO study data includes several measurements affected by medication, for example, 31% of the participants used antihypertensive medication and 15% used lipid lowering medication. To illustrate the effect of different methods for handling medication use, we use data collected at baseline and consider three research questions: 1. The effect of systolic blood pressure (SBP) on the intima-media thickness (IMT), where the exposure is affected by medication. 2. The effect of BMI on SBP, where the outcome is affected by medication. 3. The effect of BMI on IMT, adjusted for SBP, where the confounder is affected by medication. All methods described in Table 1 were applied to estimate the regression models corresponding to the three research questions stated above. The analyses were adjusted for potential confounders: BMI, sex, age, education level and smoking status. In the Netherlands physicians prescribe blood pressure medication generally aiming at values below 140 mmHg (10). Therefore, we replaced treated SBP values by 150 mmHg in the substitution method [M4], and repeated it using 170 mmHg. For adding medication effect [M5], we followed previous literatures using the values 10 mmHg and 15 mmHg (4, 11). For regression calibration [M6], the assumed mean treatment effect was 15 mmHg; SD=10 mmHg. For inverse probability weighting [M7], logistic regression was used to estimate the probability of medication use based on 21 covariates (see Supplementary Material 2 for details). The same covariates were used in the probit part of Heckman’s treatment model [M10] and in the multiple imputation approaches [M11-M13]. For quantile regression [M8], values of treated individuals were replaced by 150, 170 and 190 mmHg. For censored regression [M9] and imputation [M12], we used 140 mmHg and 160 mmHg as a clinical threshold for treatment prescription. For research question i) and iii) the outcome variable IMT was added to the imputation models (12). Ten imputed datasets were created in each imputation. All analyses were performed using R version 3.6.1, with packages Survival v3.1-8 for (13) [M9], SampleSelection v1.2-6 (function treatreg) (14) for [M10], Quantreg v5.54 (15) for [M11], MICE v3.7.0 (16) with default options for [M11] and miceMNAR (17) for [M13]. R code for [M12] is provided in Supplementary Material 2. Figure 1 presents effect estimates from the different methods for the three research questions. The results show that different methods can lead to quite different effect estimates in all three considered situations. This signals that choosing an appropriate method for handling measurements affected by medication use is essential for the validity of study results. ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/04/27/2022.04.23.22273899/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2022/04/27/2022.04.23.22273899/F1) Figure 1. Regression coefficients and their 95% confidence interval estimated from the NEO data using the different methods to handle medication effect. 1Question 1: effect of SBP (mmHg) on IMT (mm), 2Question 2: effect of BMI (kg/m2) on SBP, Question 3: effect of BMI on IMT where SBP is one of the confounder. In all analyses, SBP was the variable affected by medication. ## 4. Simulation studies To understand the results of the NEO study and provide recommendations, we performed several so-called real-life simulation studies. To mimic the NEO study as closely as possible, we used the baseline variables of the NEO participants (BMI, sex, age education and LDL cholesterol), but replaced SBP, antihypertensive medication prescription and IMT by simulated values which we let depend on the observed values of the other baseline variables. We generated different scenarios where blood pressure could be the exposure (scenario 1), the outcome (scenario 2) or the confounder (scenario 3). In each scenario we considered the research questions i), ii) and iii) of Section 3 respectively. ### 4.1 Simulation setting 1: Medication effect on the exposure In this simulation setting, we are interested in the effect of SBP on IMT, with SBP affected by antihypertensive drug in some individuals. The *untreated SBP* depended linearly on *BMI, sex* (*man*=0, *women*=1), *age* and *education* (*low*=0, *high*=1), with parameter values closely corresponding to observed values in the NEO study: ![Formula][1] with the residual error *ε**SBP* normally distributed with mean 0 and SD 15.9 mmHg. The probability to receive medication depended on *BMI, sex, education* and the *untreated SBP* values: ![Formula][2] In this way, approximately 28% of the participants were treated for high SBP. For a SBP of 150 mmHg, the probability of receiving medication was approximately 11%, while for 180 mmHg the probability was 88%. The *Observed SBP* was lowered when medication was used: ![Formula][3] where the *medication effect* was generated from a normal distribution (30 mmHg, SD=10 mmHg). The outcome, IMT was generated as: ![Formula][4] with *ε**IMT* following a normal distribution (0, SD= 9.2 mm). The relation between medication use and IMT is confounded by sex and BMI. ### 4.2 Simulation setting 2: Medication effect on the outcome In Simulation setting 2, we consider the effect of BMI on untreated SBP. BMI was taken directly from the NEO data. Untreated SBP, medication prescription and the observed SBP were generated in the same way as in Simulation setting 1. ### 4.3 Simulation setting 3: Medication effect on a confounder Here, we consider the effect of BMI on IMT measurement when adjusted for SBP. Untreated and observed SBP, medication prescription and IMT were generated as in Simulation setting 1. ### 4.4 Alternative simulation scenarios Simulation setting 1, 2 and 3 were repeated while changing three parameters: i) The size of the mean treatment effect decreased from 30 mmHg to 10 mmHg. In this simulation 16% of the treated individuals’ SBP increased after medication. ii) The standard deviation of the treatment effect changed from 10 mmHg to 1 mmHg. iii) The percentage of individuals on medication increased from approximately 28% to 50% by changing the intercept of the logistic model for medication use. ### 4.5 Analysis All methods [M1-M13] were applied to the simulated data sets in the same way as described in Section 3, except we used 20 mmHg and 30 mmHg to add to the treated SBP in [M5]. Analyses were adjusted for BMI, sex, age, education level and smoking status. Each simulation was repeated 1000 times. The estimates obtained when using the untreated SBP values were used as reference. Mean bias and mean squared error were calculated as overall measure of performance. ## 5. Results ### 5.1 Simulation setting 1: Medication effect on the exposure Figure 2 (left) and Table 2 display the results of simulation setting 1. The results show that medication use cannot be ignored [M1]. Restricting the analysis to untreated individuals [M2] yielded estimates very close to the true values. In this setting medication use was affected by the exposure and several covariates, in which case one should adjust for all variables both affecting medication use and the outcome to prevent selection bias (18). Furthermore, there was no effect modification, meaning that the effect of SBP on the outcome in the subgroup of untreated individuals is the same as in the total population. View this table: [Table 2.](http://medrxiv.org/content/early/2022/04/27/2022.04.23.22273899/T2) Table 2. Mean Coefficient, Standard Deviation, Bias and Mean Squared Error (MSE) for Three Main Simulation Settings. ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/04/27/2022.04.23.22273899/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2022/04/27/2022.04.23.22273899/F2) Figure 2. Regression coefficients and their 5th and 95th percentile estimated from simulation setting 1, 2 and 3. Results are standardized based on the mean and standard deviation of the true coefficients in each simulation setting. One grid unit represents 2.5 standard deviation. Simulation 1: effect of SBP on IMT, Simulation 2: effect of BMI on SBP, Simulation 3: effect of BMI IMT, where SBP is one of the confounders. In all scenarios, SBP was the variable affected by medication use. Binary adjustment for medication use [M3] did not work well. In our simulation, medication effect was generated with a large variability. This random variability in medication effect attenuated the association between SBP and IMT in the *treated* individuals and led to bias toward the null in the overall effect. The method worked better when the variance of the medication effect was smaller (Supplementary material 3). Substituting treated values [M4] did not perform well in any of the scenarios. The method cannot reconstruct the original distribution of the exposure and therefore in general will yield biased results. Adding 30 mmHg [M5], which was the true mean medication effect in our simulations did not perform well either. The reason is that the medication effect was generated with SD=10 mmHg. Therefore, by adding 30 mmHg to all treated SBP values, we reconstruct untreated SBP with random measurement error. Random measurement error in exposures will bias the estimates in a regression model (14). The method performed better when the random variation of the medication effect was smaller (Supplementary material 3). Regression calibration [M6] yielded unbiased results in all our simulations scenarios, assuming that true medication effect and standard deviation are known. None of the multiple imputation methods [M11-M13] yielded valid results. A possible explanation is that the imputation models included the outcome which does not corresponds to how medication use was generated in our simulations. ### 5.2 Simulation setting 2: Medication effect on the outcome Figure 2 (middle) and Table 2 show the results of Simulation setting 2. Ignoring mediation use [M1], restricting to untreated subgroup [M2] and binary adjustment for medication use [M3] yielded biased results. As the outcome determines medication use directly, adjusting or selecting based on medication use [M2 & M3] implies selection based on outcome values, and this will in general lead to selection bias (19, 20). Substituting method [M4] using 150 mmHg led to a large underestimation. It performed better when 170 mmHg was used, which is slightly higher than the mean untreated SBP in the treated individuals (164 mmHg). Regardless of the substituting values, however, the method cannot reconstruct the original distribution of the outcome Adding 30 mmHg [M5] yielded unbiased results in all simulation settings (Supplementary Material 4). Unlike in Simulation setting 1, adding the true mean medication effect yields valid results irrespectively of the amount of variance in the medication effect. Inverse probability weighting [M7] resulted in large bias. Quantile regression [M8], for all chosen replacement values performed poorly. In our simulation setting, more than 50% were using antihypertensive drug among the individuals with very high BMI. Therefore, the median SBP conditional on high BMI was affected by whichever value used for substitution. Censored normal regression [M9] performed reasonably well when the simple censoring method was used or when clinical guideline set to 140 mmHg was applied. However, in alternative scenarios with a smaller medication effect, the results were off (Supplementary Material 4). One reason is that in these scenarios the treated SBP was sometimes higher than the untreated SBP. This violates the assumption that untreated values are at least as high as untreated values (1). Heckman’s treatment model [M10] performed less well in our main scenario, which contrasts the results reported by Spieker et al. (2, 7). This model assumes that the residual variances of two linear regression models, one for untreated individuals and the other for treated individuals, are equal. This assumption was violated in our main simulation scenario, as we simulated a medication effect with a large random variability. This reflects the reported instability of Heckman’s treatment (21, 22). In the scenarios with a smaller variance in the medication effect, Heckman’s treatment model outperformed the censored regression (Supplementary Material 4). Multiple imputation with predictive mean matching [M11] resulted in bias. Results of multiple imputation with censored regression [M12] were only slightly biased, but for smaller medication effect the method performed less well. Multiple imputation with Heckman’s model [M13] yielded in some cases a large underestimation of the effect in all scenarios. ### 5.3 Simulation setting 3: Medication effect in a confounder Figure 2 (right) and Table 2 show the results of Simulation setting 3. Ignoring the medication effect [M1] resulted in bias. Restricting to untreated individuals [M2], which is the same as adjusting for confounding by restriction, performed well. The method will yield valid estimation under the conditions as in Simulation setting 1, that is with proper adjustment for variables affecting both medication use and the outcome. Binary adjustment for medication use [M3] yielded results close to the truth. Substitution methods [M4] were biased. As the distribution of untreated SBP could not be correctly reconstructed, its effect was not correctly adjusted for. Adding 30 mmHg [M5] yielded a very small upward bias. This is due to the random measurement error introduced by the method. It has been known random measurement error in exposures attenuates the effect, while random measurement in confounders can lead to overestimation (23, 24). Multiple imputation with censored regression [M12] yielded results close to the truth, especially when clinical guideline information was incorporated, and performed better than multiple imputation with Heckman’s model [M13]. All results were consistent in the alternative simulation scenarios (Supplementary Material 5). ## 6. Guidance on how to optimally handle measurements affected by medication use When interest is in the relation between the unaffected variables, ignoring medication use will in general yield biased results regardless whether the exposure, outcome or confounder is affected by medication. To obtain valid estimates, methods for handling medication use are needed. In this section, we provide guidance on how to handle an exposure, an outcome or a confounder affected by medication use. ### What to do when the exposure is affected by medication? * Performing analysis on the untreated individuals [M2] is a valid approach and will not lead to a large loss in power if the number of treated individuals is relatively low. However, there are two things to consider when applying this method: i) One should adjust for variables which both affect medication use and the outcome. ii) The result cannot be generalized to the total population if the effect of the exposure on the outcome is heterogenous. * Regression calibration [M6] may be used, but requires an external estimate of the medication effect with its standard deviation. ### What to do when the outcome is affected by medication? * When an estimate of the mean medication effect is available, it could be added to the measurements of treated individuals. This method was also advocated by Tobin et al. (1). Like them, we also highly recommend to perform sensitivity analysis with several different values to determine the stability of effect estimates. * Quantile (median) regression [M8] can be used when less than 50% of the individuals are treated at any value of the exposure. The method does not require knowledge on the medication effect and can yield robust estimates but with lower power (6) than other methods. * The advantage of censored normal regression [M9] or multiple imputation with censored normal regression [M12] is that no treatment effect needs to be specified. However, the method assumes that the observed values are lower than the untreated values, which could be violated when the treatment is ineffective. Furthermore, the method assumes non-informative censoring which is likely to be violated in most clinical settings. In our simulation, we relaxed this assumption by incorporating knowledge from a clinical guideline into a censoring mechanism. Both in study of Tobin et al. and in our main simulation study, the method was rather robust against the violation of non-informative censoring assumption. * Heckman’s treatment model works well only if the treatment effect has small variance. ### What to do when the confounder is affected by medication? * Restricting the analysis to untreated individuals [M2] is a valid approach, with the same considerations as for the exposure affected by treatment. * Using a binary indicator [M3] is a reasonable solution. * Adding the true mean medication effect to the treated individuals [M5] performs relatively well. ## 7. Discussion Our simulation study showed that the problem of variables affected by medication use should not be ignored, and proper methods are needed to avoid potential bias. Different methods are needed depending on whether the exposure, the outcome or a confounder is affected by medication. Additional information, such as medication prescription patterns in clinical settings and presence of effect heterogeneity should also be considered carefully. Accordingly, all methods need to be used with caution. One important consideration is the trade-off between the robustness of a method and availability of external information. Methods that use external information on the medication effect, such as adding the mean medication effect or regression calibration, performed well when the external information was correct. However, such information is not always available. Other methods such as censored regression, Heckman’s treatment model or multiple imputation methods, do not require assumptions on the medication effect. However, they rely on other assumptions and can perform suboptimal if they are violated. We aimed our simulation scenarios to resemble realistic clinical situations, instead of creating an ideal scenario for a particular method. It is likely that assumptions required for statistical methods will not all be met in clinical data. Therefore, it is relevant to know which methods are robust against violation of assumptions. We encourage researchers to perform more often real-life simulations as we did when generating simulations based on the NEO study data. One limitation of our study is that we did not considered situations where more than one variable is affected by medication. Additionally, our study focused on the methods applicable for cross-sectional analyses. Other approaches may be available when there is an interaction by medication (25), when effect modifiers are associated with medication use (7), in longitudinal settings (26) or in the presence of interaction or mediation by time-varying treatment (27). Furthermore, we focussed on linear regression models, but our recommendations for exposures and confounders will also hold for regression models with a binary or survival outcome. In summary, the optimal strategy for handling measurements affected by medication depends on whether the medication effect is on the exposure, the outcome or a confounder. When deciding which strategy to use, we urge researchers to critically consider the processes of medication prescription and what information on medication effects are available. ## Supporting information Supplementary Material 1 [[supplements/273899_file07.pdf]](pending:yes) Supplementary Material 2 [[supplements/273899_file08.pdf]](pending:yes) Supplementary Material 3 [[supplements/273899_file09.pdf]](pending:yes) Supplementary Material 4 [[supplements/273899_file10.pdf]](pending:yes) Supplementary Material 5 [[supplements/273899_file11.pdf]](pending:yes) ## Data Availability The simulated data that support the findings of this study are available from the corresponding author upon reasonable request. The Netherlands Epidemiology of Obesity (NEO) data are not publicly available due to privacy or ethical restrictions. ## Footnotes * O.M.Dekkers{at}lumc.nl * S.le_Cessie{at}lumc.nl * Declarations of interest: none * Received April 23, 2022. * Revision received April 23, 2022. * Accepted April 27, 2022. * © 2022, Posted by Cold Spring Harbor Laboratory The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission. ## References 1. 1.Tobin MD, Sheehan NA, Scurrah KJ, et al. Adjusting for treatment effects in studies of quantitative traits: antihypertensive therapy and systolic blood pressure. Statistics in Medicine 2005;24(19):2911–35. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.2165&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16152135&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000232215600001&link_type=ISI) 2. 2.Spieker AJ, Delaney JAC, McClelland RL. Evaluating the treatment effects model for estimation of cross-sectional associations between risk factors and cardiovascular biomarkers influenced by medication use. Pharmacoepidemiology and drug safety 2015;24(12):1286–96. 3. 3.Tanamas SK, Hanson RL, Nelson RG, et al. Effect of different methods of accounting for antihypertensive treatment when assessing the relationship between diabetes or obesity and systolic blood pressure. Journal of Diabetes and its Complications 2017;31(4):693–9. 4. 4.Cui JS, Hopper JL, Harrap SBJH. Antihypertensive treatments obscure familial contributions to blood pressure variation. 2003;41(2):207–10. 5. 5.Hunt Steven C, Ellison RC, Atwood Larry D, et al. Genome Scans for Blood Pressure and Hypertension. Hypertension 2002;40(1):1–6. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1161/01.HYP.0000022660.28915.B1&link_type=DOI) 6. 6.White IR, Koupilova I, Carpenter J. The use of regression models for medians when observed outcomes may be modified by interventions. 2003;22(7):1083–96. 7. 7.Spieker AJ, Delaney JA, McClelland RL. A method to account for covariate-specific treatment effects when estimating biomarker associations in the presence of endogenous medication use. Statistical Methods in Medical Research 2018;27(8):2279–93. 8. 8.NEO codebook baseline measurements. ([https://www.lumc.nl/org/neo-studie/researchers/variables/codebook-neo1/medicalhistory/](https://www.lumc.nl/org/neo-studie/researchers/variables/codebook-neo1/medicalhistory/)). (Accessed). 9. 9.de Mutsert R, den Heijer M, Rabelink TJ, et al. The Netherlands Epidemiology of Obesity (NEO) study: study design and data collection. European journal of epidemiology 2013;28(6):513–23. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s10654-013-9801-3&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23576214&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) 10. 10.NHG-Standaaren. Cardiovasculair risicomanagement (CVRM). 2019. ([https://www.nhg.org/standaarden/samenvatting/cardiovasculair-risicomanagement](https://www.nhg.org/standaarden/samenvatting/cardiovasculair-risicomanagement)). (Accessed). 11. 11.Neaton JD, Grimm RH, J., Prineas RJ, et al. Treatment of Mild Hypertension Study: Final Results. JAMA 1993;270(6):713–24. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1001/jama.1993.03510060059034&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=8336373&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1993LQ33800026&link_type=ISI) 12. 12.Moons KGM, Donders RART, Stijnen T, et al. Using the outcome for imputation of missing predictor values was preferred. Journal of Clinical Epidemiology 2006;59(10):1092–101. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jclinepi.2006.01.009&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16980150&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000241064500012&link_type=ISI) 13. 13.Therneau TM, Lumley T. Package ‘survival’. 2015;128:112. 14. 14.Henningsen A, Toomet O, Henningsen MA, et al. Package ‘sampleSelection’. 2019. 15. 15.Koenker R, Portnoy S, Ng PT, et al. Package ‘quantreg’. 2019. 16. 16.Buuren Sv, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. Journal of statistical software 2010:1–68. 17. 17.Galimard J-E, Chevret S, Protopopescu C, et al. A multiple imputation approach for MNAR mechanisms compatible with Heckman’s model. Statistics in Medicine 2016;35(17):2907–20. 18. 18.Hernán M, Robins J. Causal inference: What if. Boca Raton: Chapman & Hall/CRC; 2020. 19. 19.Hernán MSH-D, Robins J. A Structural Approach to Selection Bias. Epidemiology 2004;15(5):615–25. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/01.ede.0000135174.63482.43&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15308962&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000223382600020&link_type=ISI) 20. 20.Cole SR, Platt RW, Schisterman EF, et al. Illustrating bias due to conditioning on a collider. International Journal of Epidemiology 2009;39(2):417–20. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19926667&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000276303800019&link_type=ISI) 21. 21.Stolzenberg RM, Relles DA. Theory Testing in a World of Constrained Research Design:The Significance of Heckman’s Censored Sampling Bias Correction for Nonexperimental Research. 1990;18(4):395–415. 22. 22.Lillard L, Smith JP, Welch F. What Do We Really Know about Wages? The Importance of Nonreporting and Census Imputation. 1986;94(3, Part 1):489–506. 23. 23.White IR. Commentary: Dealing with measurement error: multiple imputation or regression calibration? International Journal of Epidemiology 2006;35(4):1081–2. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyl139&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16847023&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) 24. 24.Brakenhoff TB, van Smeden M, Visseren FLJ, et al. Random measurement error: Why worry? An example of cardiovascular risk factors. PLOS ONE 2018;13(2):e0192298. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pone.0192298&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) 25. 25.Masca N, Sheehan NA, Tobin MD. Pharmacogenetic interactions and their potential effects on genetic analyses of blood pressure. Statistics in Medicine 2011;30(7):769–83. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21394752&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) 26. 26.McClelland RL, Kronmal RA, Haessler J, et al. Estimation of risk factor associations when the response is influenced by medication use: An imputation approach. Statistics in Medicine 2008;27(24):5039–53. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.3341&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18613245&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F27%2F2022.04.23.22273899.atom) 27. 27.Schmidt AF, Heerspink HJL, Denig P, et al. When drug treatments bias genetic studies: Mediation and interaction. PLOS ONE 2019;14(8):e0221209. [1]: /embed/graphic-3.gif [2]: /embed/graphic-4.gif [3]: /embed/graphic-5.gif [4]: /embed/graphic-6.gif