Achieving behavior change at scale: Causal evidence from a national lifestyle intervention program for pre-diabetes in the UK
=============================================================================================================================

* Julia M. Lemp
* Christian Bommer
* Min Xie
* Anant Jani
* Justine I. Davies
* Till Bärnighausen
* Sebastian Vollmer
* Pascal Geldsetzer

## Abstract

There remains widespread doubt among clinicians that mere lifestyle advice and counseling provided in routine care can achieve improvements in health. We aimed to determine the health effects of the largest behavior change program for pre-diabetes globally (the English Diabetes Prevention Programme) when implemented at scale in routine care. We exploited the threshold in glycated hemoglobin (HbA1c) used to decide on program eligibility by applying a regression discontinuity design, one of the most credible quasi-experimental strategies for causal inference, to electronic health data from approximately one-fifth of all primary care practices in England. Program referral led to significant improvements in patients’ HbA1c and body mass index. This analysis provides causal, rather than associational, evidence that lifestyle advice and counseling implemented in a national health system can achieve important health improvements.

By 2030, the number of adults with diabetes globally is expected to reach 578 million, representing 10.2% of the global adult population (*1*). Diabetes is an important cause of mortality, morbidity, and health-system costs (*2*). There is, thus, an urgent need to implement population-based interventions that prevent diabetes, enhance its early detection, and address cardiovascular risk factors to prevent or delay its progression to complications.
In particular, type 2 diabetes (T2DM), which accounts for approximately 90% of the total diabetes burden, is a major risk factor for cardiovascular disease, with people with diabetes having a more than twofold increase in the risk of incident coronary heart disease compared to those without diabetes (*3*). Worryingly, and at least partly driven by the increase in excess weight and obesity, diabetes prevalence and diabetes-related deaths continue to rise in most parts of the world (*4*). Often resulting from socioecological contexts, individual behaviors such as poor nutrition, hazardous alcohol consumption, and physical inactivity play a key role in the development of T2DM (*5*). While targeting individual behavior as a preventive strategy for T2DM has been controversial (*6*), behavior change programs (sometimes referred to as lifestyle interventions or similar) have been efficacious in controlled clinical trials (*7, 8*). In the seminal US Diabetes Prevention Program study (which serves as a model for many behavior change programs in the US and elsewhere) (*9, 10*), targeting changes in lifestyle behavior was even more effective than metformin in preventing or delaying diabetes. However, trials such as the US Diabetes Prevention Program study have mainly focused on efficacy, supplying proof of principle that the intervention works when one-to-one sessions with specialists and a range of incentives are provided (*11*). Thus, while a recent meta-analysis concluded that there is strong evidence that lifestyle modification can reverse pre-diabetes in adults (*12*), it remains important to establish the transferability of behavior change programs into real-world settings. Establishing that behavior change programs work in routine care is essential for several reasons.
First, although lifestyle counseling is the recognized first-line treatment option for people presenting with pre-diabetes and other cardiovascular risk factors, clinicians often revert to prescribing preventive medication due to limited time resources in primary care (*13*), insufficient knowledge and referral options for promoting healthy lifestyles (*14, 15*), and a predominance of the biomedical model with clinicians being uncertain about the success of counseling (*14, 16*). In particular, doubt that behavior change at the level required for substantial weight loss is possible to achieve for most patients is prevalent among primary care clinicians (*17*). Second, participants enrolling in clinical trials for behavior change programs are unlikely to be representative of the broader patient population. For example, patients enrolled in clinical cardiology trials, compared with patients encountered in everyday practice, had a lower risk profile as they were younger, more likely to be male, and less likely to have a comorbid disease (*18*). Individuals drawn from an unselected, general population may respond differently, given their lower health literacy and willingness to engage, higher comorbidities, and greater ethnic diversity (*19, 20*). Hence, we advance the argument that the impact of behavior change programs on population health must relate to real-world effectiveness and should, thus, be evaluated in an “observational, non-interventional trial in a naturalistic setting” akin to phase IV in drug development (*21, 22*). However, conventional observational studies that are generally applied in health research have the disadvantage that they may fail to account for major confounding factors such as selection biases and, thus, preclude causal interpretations. In contrast, in this study, we provide causal evidence for the effectiveness of a behavior change program using an innovative causal inference method in large-scale routine data. 
We establish causality by applying a regression discontinuity approach that combines a rich set of electronic health records from the English National Health Service with variation in treatment probabilities generated by guidelines from the National Institute for Health and Clinical Excellence (NICE) that recommend intensive lifestyle counseling for people at high risk of progression to T2DM (*23*). Specifically, we exploit the fact that the NHS Diabetes Prevention Programme (NHS DPP), a behavior change program with weight loss, diet, and physical activity goals consisting of at least 13 group sessions over the course of nine months and the largest DPP globally to achieve universal population coverage, is only open to patients above a prespecified threshold of HbA1c or fasting plasma glucose (*24*). This way, we can take advantage of existing large-scale routine health data while still obtaining causal effect estimates that are not vulnerable to confounding and measurement error (*25, 26*). Ultimately, our study aims to determine the real-world health effects of the NHS DPP, investigating whether a routine behavior change program in a national health system has the potential to lead to improvements in key cardiovascular risk factors such as HbA1c, excess weight, raised blood pressure, and blood lipid levels.

## Results

Our primary outcome was change in HbA1c. Secondary outcomes included changes in body mass index (BMI), body weight, blood pressure, serum cholesterol levels, and serum triglyceride levels. We also conducted exploratory analyses investigating the effect of program referral on the probability of diabetes, hypertension, and hyperlipidemia incidence; newly prescribed medications for these conditions; diabetic complications; all-cause mortality; and emergency hospitalization for a major adverse cardiovascular event. A detailed definition of each outcome is provided in table S1.
### Data source and sample selection

Our study used data from the Clinical Practice Research Datalink (CPRD) Aurum and NHS England’s Hospital Episode Statistics (HES). CPRD Aurum is a large primary care database of de-identified electronic health records from a network of approximately one-fifth of General Practitioner (GP) practices across England. To ensure sufficient implementation of the NHS DPP during the study period after the start of the phased roll-out in mid-2016, our population of interest consisted of adults (aged 18 to 80 years) who received an HbA1c test between January 1st, 2017, and December 31st, 2018. Data were available until the end of June 2020. We identified 2 106 376 patients who had a baseline HbA1c test during the enrolment period and met inclusion criteria for our primary cohort (see Methods and Materials). 2 052 480 of these (97.4%) had been registered with their GP for at least 6 months following the index date. Patient characteristics are described in Table 1. Their mean age was 51 years, 915 717 (43.5%) were men and 1 190 659 (56.5%) were women, and their mean baseline HbA1c level was 36.6 mmol/mol. The median time to endline HbA1c during follow-up was 20.5 months (interquartile range, 13.5-26.8). Of the 2 052 480 patients who had a follow-up time of at least 6 months, 2 037 384 (99.3%) were linkable to HES hospitalization data. At baseline, 749 884 (36.5%) people had already received at least one prescription for a blood pressure-lowering medication and 411 288 (20.0%) for a lipid-lowering medication.

[Table 1.](http://medrxiv.org/content/early/2023/06/12/2023.06.08.23291126/T1) Sample characteristics. N = 2 106 376. Abbreviations: BMI, body mass index; GP, General Practitioner. Age, gender, and number of consultations were available for the full sample. Missingness in baseline BMI, blood pressure, and lipids is shown in table S2.
### Meeting the statistical assumptions

In the first part of the analysis, we ensured that all necessary assumptions for a regression discontinuity analysis were met. Importantly, we tested the validity of the continuity assumption (*25*). First, the density distribution of baseline HbA1c should be continuous around the threshold; this would be violated if patients (or providers) could precisely manipulate baseline HbA1c. As shown in fig. S1, there was no indication of heaping or manipulation of HbA1c values around the threshold. Secondly, baseline covariates (such as age or previous contact with health service providers) should be balanced, i.e., continuous, at the threshold. As in randomized controlled clinical trials, evidence of balance on baseline observables provides confidence that patients assigned to treatment and control conditions are exchangeable. We showed that this assumption is met by plotting the relationship between baseline HbA1c and potential confounders using third-order global polynomial regressions (Fig. 1). The illustrated balance on baseline observables was statistically supported by local linear regressions that yielded no significant discontinuities at the eligibility threshold (Fig. 1, table S3).

[Fig. 1.](http://medrxiv.org/content/early/2023/06/12/2023.06.08.23291126/F1) Association between baseline HbA1c and intensive lifestyle counseling and potential confounders. BMI = Body mass index. p.p. = percentage points. (**A**) Primary exposure (‘First stage’) at 3, 6, and 12 months after baseline HbA1c. (**B**), (**C**), and (**D**) Potential confounders. The blue lines show the local linear regression models within the bandwidth used in our primary analysis. The orange dotted lines show the global polynomial relationship. The blue circles represent the mean value for individual patients and the dotted vertical lines indicate the HbA1c cutoff. The estimate represents the discontinuity at the HbA1c threshold; discontinuities in potential confounders would jeopardize the assumptions underlying regression discontinuity.

### Program referrals in routine care increase at the eligibility threshold

While increasing HbA1c is associated with an increase in risks of diabetes and cardiovascular disease (*27*), the HbA1c eligibility threshold of 42 mmol/mol (6%) to enter the program does not represent a pathophysiological phenomenon at that specific threshold value. Rather, since the association is continuous and HbA1c was measured with random measurement error, patients just below and above the threshold are close to identical in their underlying characteristics and, thus, effectively randomized to being referred to the program or not. Using a local linear approach, we compared patients lying close to the threshold on either side, which allows for the interpretation of differences in clinical outcomes as causal (*26*). Local linear regression demonstrated a 10.8 percentage point increase in treatment assignment at the HbA1c eligibility threshold. In relative terms, patients just above the threshold were five times more likely to be referred compared to patients just below the threshold (Fig. 1). Receipt of treatment was defined as a record of a referral to a behavior change program or intensive lifestyle counseling during the 12 months after the baseline HbA1c test. Treatment primarily included referrals to the NHS DPP, but we also included referrals to other structured programs and intensive lifestyle counseling as they are likely to serve as an alternative where placement in NHS DPP is not possible.
26 970 patients with a baseline HbA1c between 42 and 47 mmol/mol were referred to a behavior change program or intensive lifestyle counseling, of whom 20 963 (77.7%) were referred to the NHS DPP. 4 800 patients declined NHS DPP referrals offered by their GP. All records considered as treatments are listed in table S4. For convenience only, we henceforth refer to these treatments simply as intensive lifestyle counseling.

### Glycemic control is improved

Patients who were referred to intensive lifestyle counseling significantly improved their HbA1c levels. Specifically, we evaluated the effect of referral to intensive lifestyle counseling on glycemic control by fitting separate regression lines of the association between baseline HbA1c and change in HbA1c above and below the eligibility threshold. The difference in where these lines intersect the threshold quantifies the discontinuity in the outcome and can be described as the intention-to-treat effect of the threshold rule (−0.10 mmol/mol, 95% CI −0.16, −0.03). The intention-to-treat effect measures the effect of being eligible for intensive lifestyle counseling as determined by the guideline rather than the effect of actually being referred and is therefore dependent on the probability of treatment at the threshold, i.e., how many GPs adhere to the guidelines. Thus, to obtain the true effect of being referred to intensive lifestyle counseling (the complier average causal effect), it is necessary to scale the intention-to-treat effect by the difference in the probability of treatment at the threshold. When doing this, we find a significant negative effect of referral to intensive lifestyle counseling on HbA1c at follow-up (−0.85 mmol/mol, 95% CI −1.46, −0.24).
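The scaling step described above (dividing the intention-to-treat discontinuity in the outcome by the discontinuity in the probability of referral) can be illustrated with a minimal sketch on simulated data. The function names, the simulated referral probabilities (3% below and 14% above the cutoff), the noise level, and the sample size are illustrative assumptions and are not taken from the study's data or code; the sketch uses a fixed bandwidth and triangular kernel weights but does not reproduce the MSE-optimal bandwidth selection or the heteroskedasticity-robust inference used in the analysis.

```python
import numpy as np


def local_linear_jump(x, y, cutoff, bandwidth):
    """Jump in E[y | x] at the cutoff from a triangular-kernel local linear fit."""
    xc = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)
    weights = np.clip(1.0 - np.abs(xc) / bandwidth, 0.0, None)  # triangular kernel
    keep = weights > 0
    xc, y, weights = xc[keep], y[keep], weights[keep]
    above = (xc >= 0).astype(float)
    # Columns: intercept, jump at cutoff, slope below, slope change above.
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    root_w = np.sqrt(weights)  # weighted least squares via rescaled rows
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta[1]


def fuzzy_rd(x, treated, y, cutoff, bandwidth):
    """Return (ITT, first stage, CACE), where CACE = ITT / first stage."""
    itt = local_linear_jump(x, y, cutoff, bandwidth)
    first_stage = local_linear_jump(x, treated, cutoff, bandwidth)
    return itt, first_stage, itt / first_stage


# Simulated data only; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 400_000
cutoff, bandwidth = 42.0, 3.8
hba1c = rng.uniform(30.0, 54.0, n)
referral_prob = np.where(hba1c >= cutoff, 0.14, 0.03)
referred = rng.random(n) < referral_prob
true_effect = -0.85
change = 0.05 * (hba1c - cutoff) + true_effect * referred + rng.normal(0.0, 1.0, n)

itt, first_stage, cace = fuzzy_rd(hba1c, referred, change, cutoff, bandwidth)
print(f"ITT {itt:.3f}, first stage {first_stage:.3f}, CACE {cace:.3f}")

# Re-estimate at multiples of the chosen bandwidth as a simple robustness check.
for multiple in (0.75, 1.25, 1.5, 2.0):
    _, _, cace_m = fuzzy_rd(hba1c, referred, change, cutoff, multiple * bandwidth)
    print(f"{multiple:.2f} x bandwidth: CACE {cace_m:.3f}")
```

Because only a minority of patients above the cutoff are referred, the intention-to-treat estimate in such a setup is much smaller in magnitude than the complier effect, which is why the ratio rather than the raw discontinuity is interpreted as the effect of referral.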
While the clinical significance of a 0.85 mmol/mol reduction in HbA1c is difficult to judge at an individual level due to limited available data in the non-diabetic range, observational clinical data suggest a linear association between HbA1c and cardiovascular disease, even in non- or prediabetic individuals (*27*). For example, as a reference point, after adjusting for major conventional cardiovascular risk factors, individuals having HbA1c levels of 5.5% to 5.7% (which roughly translates to 37 to 39 mmol/mol) were almost twice as likely to be diagnosed with coronary artery disease compared to individuals with less than 5.5% (*28*). Thus, it is likely that a reduction of 0.85 mmol/mol is meaningful at the population level. The intention-to-treat effects of the threshold rule and the causal effects of intensive lifestyle counseling for all outcomes are displayed alongside optimal bandwidth and sample size in Table 2. To ensure that the finding of improved glycemic control is not sensitive to our selected bandwidth or functional form (linear, quadratic, or polynomial), we show that effect sizes were robust to different choices (Fig. 2, table S5-6).

[Fig. 2.](http://medrxiv.org/content/early/2023/06/12/2023.06.08.23291126/F2) Robustness of the effects of being referred to intensive lifestyle counseling on HbA1c and BMI across bandwidth choices. MSE = mean-squared error. The MSE-optimal bandwidth is 3.8 mmol/mol below and above the threshold. The figure displays the estimated complier average causal effect of referral to intensive lifestyle counseling on (**A**) change in HbA1c and (**B**) change in BMI from local linear regressions with bandwidths set to varying percentages (i.e., 75%, 125%, 150%, or 200%) of the MSE-optimal bandwidth, with heteroskedasticity-robust 95% CI and triangular kernel weights. The sample size of patients in each bandwidth is given alongside the effect estimates. All effect estimates are statistically significant (p<0.05).

[Table 2.](http://medrxiv.org/content/early/2023/06/12/2023.06.08.23291126/T2) Effect of being eligible for, and effect of being referred to, intensive lifestyle counseling on primary and secondary outcomes. BMI = Body mass index. BP = Blood pressure. MACE = Major adverse cardiovascular event. MSE = mean-squared error. RD = Risk difference (i.e., difference in the probability of the outcome in percentage points). The effects were estimated in local linear regressions with heteroskedasticity-robust standard errors and triangular kernel weights. The definition of all outcomes is detailed in table S1. We compared statistical significance (p < 0.05) to results using robust bias-corrected confidence intervals, which yielded the same statistical inferences except for diabetes medication, which was no longer significant (table S8). a Effects for HbA1c and diabetic complication were adjusted for diabetes medication prescription; effects for lipid levels were adjusted for lipid-lowering medication prescription; effects for diastolic and systolic blood pressure were adjusted for blood pressure-lowering medication; and effects for mortality and MACE hospitalization were adjusted for all three medication groups. All relevant medications are listed in our Open Science Framework project (see code availability statement). b Sample size within MSE-optimal bandwidth. c Sample restricted to those without prior lipid-lowering medication prescription. d Sample restricted to those without prior blood pressure-lowering medication prescription.

### Confounding by medications is unlikely

It is unlikely that our finding of a beneficial effect of being referred to intensive lifestyle counseling on HbA1c is due to confounding by medication prescription or use.
To rule out that we may falsely attribute improvements in glycemic control to intensive lifestyle counseling when they were in fact induced by diabetes medication, we adjusted our results for newly prescribed diabetes medication. In general, having an HbA1c level above the eligibility threshold for the NHS DPP was associated with a small increase in the probability of being prescribed diabetes medication shortly after treatment assignment (risk difference in percentage points [RD] = 0.04, 95% CI 0, 0.09), which increased to 0.3 percentage points at follow-up (Table 2). However, the discontinuity in the probability of being prescribed diabetes medication was not significant when using robust bias-corrected confidence intervals for inference (table S8). There was no discontinuity in newly prescribed lipid-lowering medication (RD = 0.29, 95% CI −0.23, 0.82) or blood pressure-lowering medication (RD = 0.11, 95% CI −0.38, 0.60). Specifically, out of 26 513 patients with a baseline HbA1c between 42 and 47 mmol/mol who were referred to intensive lifestyle counseling, only 882 (3.3%) were prescribed diabetes medication during the 12 months following treatment assignment, with numbers increasing with increasing HbA1c levels; these numbers are unlikely to substantially impact improvements in glycemic control. Indeed, when adjusting our results for being prescribed diabetes medication during follow-up, the estimated causal effect of intensive lifestyle counseling on glycemic control indicated a larger reduction in HbA1c, suggesting that improvements in HbA1c were not driven by increased uptake of diabetes medication (Table 2). We refrained from excluding patients who initiated medication from our analysis as this is likely to introduce bias given that the probability of being prescribed diabetes medication changed discontinuously at the threshold.
### Secondary outcomes provide additional evidence for health improvements at scale

In secondary analyses, we found evidence that other key cardiovascular risk factors improved. Results showed a significant association of intensive lifestyle counseling with reduced BMI (−1.35 kg/m², 95% CI −1.88, −0.83) and reduced weight (−2.99 kg, 95% CI −4.38, −1.61). Albeit not significant, effect estimates were also in the direction of benefit for blood pressure levels (diastolic: −1.35 mmHg, 95% CI −3.31, 0.61; systolic: −2.03 mmHg, 95% CI −4.96, 0.91). When adjusting results for the prescription of blood pressure-lowering medication, improvements in systolic blood pressure persisted and became marginally significant (p = 0.092; Table 2). Referral to intensive lifestyle counseling also significantly reduced triglyceride levels (−0.33 mmol/l, 95% CI −0.54, −0.12) and increased HDL levels (0.04 mmol/l, 95% CI 0, 0.09). There was no significant effect on other serum cholesterol levels (LDL and the total cholesterol-to-HDL ratio), and no effect on the probability of being prescribed lipid-lowering medication. Results were robust to adjusting for baseline observables and using different bandwidths (table S6-7, fig. S2-S12), a global polynomial approach (table S5), and alternative treatment definitions (table S8).

### Exploratory analyses suggest limited short-term downstream health effects

Diabetic complications, emergency hospitalization for major adverse cardiovascular events, and mortality were not significantly reduced by being referred to intensive lifestyle counseling in exploratory analyses (Table 2, fig. S13−15). The low incidence of adverse downstream events during our relatively short follow-up period, with 26 567 patients (1.3%) dying and 36 567 (1.8% of those 2 037 384 linkable to HES data) having an emergency hospitalization for a major adverse cardiovascular event, resulted in low statistical power for detecting small, short-term changes in these outcomes.
Further analyses using T2DM, hypertension, and hyperlipidemia incidence as outcomes are shown in the Supplemental Text. We interpreted these results cautiously since diagnoses in electronic health records may be less reliable than biochemical measures (*29*) and a stronger focus on identifying people who are at risk of diabetes is likely to initially increase the incidence of T2DM independent of health improvements (*30*). Lastly, we present a set of additional sensitivity analyses in the Supplement ruling out that potential sample selection effects due to immortal time bias or differential loss to follow-up may be responsible for our results (Supplementary Text, fig. S16−20, table S9−11).

### Men may be benefitting more than women

Results stratified by gender, age group, ethnicity, socioeconomic status (based on the Index of Multiple Deprivation which is derived from the patient’s postcode), and rural or urban practice location are presented in the Supplement (table S12−26). Stratification led to relatively small sample sizes for practices in rural locations and patients with Asian, Black, or Mixed ethnicity and in the youngest age group. Being referred to an intensive lifestyle intervention led to significant improvements in HbA1c, weight, blood pressure, and triglycerides in men, but not women (table S12−16). Both men and women significantly improved their BMI although effect estimates suggest larger improvements in men compared to women. There was no indication that a higher socioeconomic status was consistently associated with greater benefits (table S13−14).

## Discussion

In our study using routine, population-based electronic health records, we found causal evidence that referral to intensive lifestyle counseling led to improved glycemic control and reductions in BMI and weight.
While prescriptions of diabetes medication increased slightly at the same time, sensitivity analyses demonstrated that observed improvements in glycemic control were not driven by medication. While we were not able to demonstrate a significant reduction in mortality or emergency hospitalization for major adverse cardiovascular events in the fairly short timeframe since program implementation, improvements in intermediary health markers that are key to progression to T2DM establish the potential of intensive lifestyle counseling to improve population health when implemented at scale.

### Our “real world” results are largely comparable to effects in clinical trials

An intervention should only be considered effective if it works in people who have been offered the intervention in a routine setting rather than generalizing from participants who receive the intervention in a controlled research setting (*31*). Thus, while previous studies have merely shown that intensive lifestyle counseling is *efficacious* in improving cardiovascular risk factors when performed in controlled research settings, we now present evidence that these health benefits can be successfully translated to and scaled up in routine care. Importantly, and not self-evidently, reductions in HbA1c, BMI, and weight in our study are comparable to effect sizes from clinical trials. Meta-analyses of controlled clinical trials studying effects on weight loss and blood pressure found effect sizes comparable to those presented in our study (*32–35*). These meta-analyses showed mixed effects on glucose outcomes, with both positive and null estimates having been reported (*32–34, 36*). In an early process evaluation (without a control group) of the NHS DPP (*24*), adults who attended at least one of 13 group-based intervention sessions had an HbA1c reduction of 1.26 mmol/mol and a mean weight loss of 2.3 kg.
Given that the availability and uptake of behavior change programs are still limited, our study lends support to calls for further investment in behavioral, population-based interventions and targeted strategies for individuals at risk for diabetes who are currently not reached through care pathways. Cardiovascular risk factors may be managed more effectively by integrating health professionals who are trained in the delivery of behavior change into primary health care teams and by improving workflow and referral processes, rather than relying on limited lifestyle advice by GPs (*37*). Our results further suggest a decrease in the probability of being prescribed lipid-lowering medication, which appeared to have contributed to an increasing, and thus adverse, effect on serum LDL at the eligibility threshold. A potential explanation may be that GPs are hesitant to concurrently refer patients to a behavior change program and prescribe lipid-lowering medication, presuming positive effects of lifestyle changes on blood lipids. At the same time, evidence from clinical trials suggests that lifestyle changes are more likely to consistently improve HDL and triglyceride levels while LDL is less likely to be affected (*35*). However, the evidence is not conclusive (*38*). Potential unintentional downstream effects of disease-specific behavior change programs, such as reduced prescriptions of medications that are likely beneficial independent of behavior change, warrant further research attention.

### Applying rigorous quasi-experimental methods to routine data can advance our understanding of health service interventions

We demonstrated the feasibility of using a regression discontinuity approach for evaluating a population-wide health service intervention by leveraging a threshold in treatment assignment induced by clinical guidelines.
Thresholds are ubiquitous in clinical medicine and, thus, represent vast opportunities to generate causal, rather than associational, evidence of treatment effectiveness. In conjunction with increasing access to routine electronic health records and detailed health information, regression discontinuity analyses have the potential to greatly contribute to advancing evidence-based health care and implementation science for several reasons. First, many conceivable research questions that are of interest to clinical medicine and health systems research cannot be studied in conventional or pragmatic randomized trials, either for feasibility (such as very long follow-up periods, for example, to establish the effectiveness of anti-aging agents) or ethical reasons (such as potential harmful side effects of medical treatments) (*39, 40*). Second, since later-stage translation research questions required for population impact are frequently underfunded and understudied (*41*), applying quasi-experimental methods such as regression discontinuity to routine data could be a pragmatic approach to assessing the sustained benefit of programs similar to the NHS DPP. While social scientists are often concerned with the fact that effects estimated by regression discontinuity designs are not generalizable to observations further away from the threshold (*42*), this is less relevant to applications in clinical medicine because often patients close to the threshold are precisely those in whom we are most interested. Lastly, regression discontinuity designs may be used to investigate aspects of health equity, for example, heterogenous treatment effects between patient groups concerning sociodemographic characteristics and comorbidities, for which randomized controlled trials usually have too small of a sample size or an insufficiently diverse study population. 
### Limitations and challenges in our study

Inherent to any regression discontinuity analysis is the limitation that we can only estimate the causal effect for those who initiate the treatment *because* they crossed the threshold. This effect may differ from the (unobserved) treatment effects for patients who would have received lifestyle counseling regardless of baseline HbA1c levels (the “always takers”), for example, because of clinical symptoms, or patients who would not have participated in any program or counseling even if eligible (the “never takers”). Additionally, effects may not be generalizable to those further away from the HbA1c threshold that defines prediabetes. In our analysis, we mainly relied on the complier average causal effect (CACE), estimating the effect for those who were actually referred to an intervention. Given that there was a large percentage of individuals presenting above the HbA1c threshold who were not referred to any lifestyle counseling opportunity, it is important to not mistake the observed CACE for the population health effect of the NHS DPP. While the NHS DPP is operating at a large scale, with 100 000 referrals being offered in 2021 (*43*), there remains a substantial share of adults in England with impaired glycemic control who are presently not taking part in intensive lifestyle counseling, whether because placements in the NHS DPP are insufficiently available or because other barriers exist that decrease compliance in populations that are at risk of diabetes. Another potential limitation is the violation of the exclusion restriction, whereby treatments or exposures other than intensive lifestyle counseling are affected by crossing the cutoff.
While we could not precisely control for the relationship between drug dosage and secondary outcomes, we are confident that the observed health effects were not attributable to increased medication following treatment assignment as effect estimates were robust to adjusting for drug prescriptions. Further, it is likely that patients classified as high risk for progression to T2DM have been under closer observation by GPs and that GPs may have taken additional steps to mitigate the risk. While we cannot entirely preclude that the effects on clinical endpoints may have been caused by closer monitoring through the GP or self-care rather than intensive lifestyle counseling, we believe this is very unlikely because a placebo analysis using data before the NHS DPP roll-out provided no evidence that increased monitoring above the threshold (due to the already implemented guideline) would have led to the observed health improvements. Similarly, using electronic health records may have introduced a so-called informative observation scheme or informed presence due to differential loss to follow-up, leading to a biased observation of outcomes (*44*). Outcomes such as HbA1c or BMI were more likely to be observed in the follow-up period if patients had crossed the threshold. Since the intensity of healthcare utilization may be considered a marker of health, it may be that patients (in particular below the threshold) were more likely to consult their GP if they had other underlying health conditions or showed symptoms during the follow-up period, making patients with many visits systematically different from those with few and potentially contributing to the observed discontinuities in health outcomes. However, we addressed this concern by conditioning our results on the number of GP visits, which did not substantially change our results, and performing a sensitivity analysis restricting our sample to patients with regular GP visits before their treatment assignment. 
Finally, we had to rely on what is recorded in electronic health records; for example, we had no detailed information about adherence to behavior change programs and lifestyle counseling. However, this is only of minor concern, as we are most interested in whether the positive effects of such a program persist in the “real world”, precisely despite non-adherence to the program.

## Conclusion

The initially described skepticism about the effectiveness of lifestyle counseling for successful behavior change may stem from clinicians’ experience that brief lifestyle counseling, which is often the only feasible approach in GP consultations with pressing time constraints, may be of no or very limited effectiveness. However, we show that referrals for intensive lifestyle counseling in routine care, in the form of a population-wide diabetes prevention program, appear to be effective. Ultimately, investments in structured, intensive behavior change programs may not only reduce the risk of complications from diabetes and cardiovascular events, but their positive effects may also extend to other non-communicable diseases such as cancer, which is increasingly thought to be connected to unhealthy lifestyle habits and environments (*45*), or communicable diseases such as influenza or COVID-19, which more gravely affect people with known cardiovascular risk factors such as diabetes (*46*). Thus, our study not only demonstrates that intensive lifestyle counseling targeted at pre-diabetes can achieve important health improvements in routine care but also shows a potential route for improving population health more broadly.

## Data Availability

The data that support the findings of this study are available from Clinical Practice Research Datalink, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Due to Clinical Practice Research Datalink license restrictions, we are unable to share data.
## Methods

### Description of the English Diabetes Prevention Programme

UK NICE public health guideline 38 recommends that people who have non-diabetic hyperglycaemia and are thus at high risk of progression to type 2 diabetes (HbA1c level of 42–47 mmol/mol [6.0–6.4%] or fasting plasma glucose of 5.5–6.9 mmol/l) be referred to a “local, evidence-based, quality-assured intensive lifestyle change programme” to prevent or delay the onset of T2DM (*1*). The NHS DPP began phased roll-out in 2016. The provider contracts require the intervention to be delivered face-to-face to groups of 15–20 adults over at least 13 sessions (totalling 16 hours) with a minimum of 9 months’ duration, with the aim of supporting behaviour change to result in improved diet, increased physical activity and weight loss. Activities include a mixture of education, group support, knowledge testing, visual activities, and activities led by patients (*2, 3*). Individuals were identified for inclusion in the NHS DPP following an NHS Health Check, through retrospective searches of general practice records, or through routine clinical practice (*4*). According to guidelines, people below the threshold should be offered brief advice or intervention and receive information about services that could help them change their lifestyle, bearing in mind their risk profile.

### Data source and study population

The study used data from the Clinical Practice Research Datalink (CPRD) Aurum (*5*) and the NHS England Hospital Episode Statistics Admitted Patient Care (HES APC) (*6*). CPRD Aurum is a large primary care database of de-identified electronic health records from a network of General Practitioner (GP) practices across the UK. The data are representative of the broader English population in terms of geographical spread, deprivation, age, and gender (*5*). In July 2020, anonymized longitudinal data from 35.9 million patients and 1 296 currently contributing English GP practices were available.
HES APC is a secondary care database in England that records patient data related to presentations in NHS hospitals or private healthcare institutions where the NHS provides partial funding. In contrast to CPRD Aurum, which does not contain lifetime follow-up because patients enter the database when they register with a contributing GP practice and exit it when they leave that practice, patients in HES APC maintain the same ID throughout their time in the database (*7*). As this information was only available for patients linkable to HES APC, we kept all patients in the analysis regardless of whether they appeared under multiple CPRD IDs. Results were insensitive to dropping all patients with multiple CPRD IDs for a single HES ID from the study population.

To ensure sufficient implementation of the NHS DPP during the study period after the phased roll-out in 2016, the population of interest consists of adults (aged 18 to 80 years) who received an HbA1c test between January 1st, 2017, and December 31st, 2018. Data were available until the end of June 2020. We followed a target trial approach to mimic, as closely as possible, the randomised controlled trial that would ideally be conducted to estimate the causal program impact (table S1)82,83. We excluded all patients who (i) exceeded the HbA1c threshold for diabetes or prediabetes prior to their index date (i.e., the date of their baseline HbA1c record), or (ii) received any diabetes medication prior to their index date (all codes available in our OSF repository). Although fasting plasma glucose is specified in the NICE guideline as an alternative entry requirement, we did not use it as the treatment assignment variable, since routine testing of fasting plasma glucose to determine non-diabetic hyperglycemia is much less frequent than HbA1c testing (*8*).
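The inclusion and exclusion rules above can be sketched as a small cohort filter. This is only an illustration under stated assumptions, not the authors' code: the `Patient` record and its field names are hypothetical, the baseline record is assumed to be the first HbA1c test inside the study window, and "exceeded the threshold" is read as a value at or above 42 mmol/mol.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Values taken from the text; all names below are hypothetical.
HBA1C_THRESHOLD = 42.0            # mmol/mol, lower bound of non-diabetic hyperglycaemia
WINDOW_START = date(2017, 1, 1)   # first admissible date for the baseline test
WINDOW_END = date(2018, 12, 31)   # last admissible date for the baseline test
MIN_AGE, MAX_AGE = 18, 80


@dataclass
class Patient:
    birth_date: date
    hba1c_tests: List[Tuple[date, float]]      # (test date, HbA1c in mmol/mol)
    first_diabetes_rx: Optional[date] = None   # first diabetes prescription, if any


def age_on(birth_date: date, day: date) -> int:
    """Age in completed years on a given day."""
    had_birthday = (day.month, day.day) >= (birth_date.month, birth_date.day)
    return day.year - birth_date.year - (0 if had_birthday else 1)


def index_test(patient: Patient) -> Optional[Tuple[date, float]]:
    """Assumption: the baseline record is the earliest test inside the window."""
    in_window = [t for t in patient.hba1c_tests if WINDOW_START <= t[0] <= WINDOW_END]
    return min(in_window) if in_window else None


def is_eligible(patient: Patient) -> bool:
    baseline = index_test(patient)
    if baseline is None:
        return False
    index_date = baseline[0]
    if not MIN_AGE <= age_on(patient.birth_date, index_date) <= MAX_AGE:
        return False
    # Exclusion (i): an earlier HbA1c at or above the threshold (">=" is an assumption).
    if any(d < index_date and v >= HBA1C_THRESHOLD for d, v in patient.hba1c_tests):
        return False
    # Exclusion (ii): any diabetes medication before the index date.
    if patient.first_diabetes_rx is not None and patient.first_diabetes_rx < index_date:
        return False
    return True
```

In this sketch a patient with a normal HbA1c in 2016 and a first in-window test in 2017 is retained, whereas a patient whose 2016 test already reached 42 mmol/mol is dropped.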
### Outcome and treatment variables

The primary outcome was glycemic control, measured as the change in HbA1c between baseline and the final HbA1c taken during follow-up. Secondary outcomes included change in BMI, body weight, systolic and diastolic blood pressure, serum cholesterol levels (HDL cholesterol, LDL cholesterol, and the ratio between total and HDL cholesterol), and serum triglyceride levels. We conducted exploratory analyses investigating the effect of program entry on the probability of newly prescribed diabetes, blood pressure-lowering, and/or lipid-lowering medication (evaluated separately), the probability of any diabetic complication (ophthalmic, neurological, or renal), all-cause mortality, and emergency hospitalization for a major adverse cardiovascular event (MACE (*9*)) during follow-up. The follow-up started at six months and continued until the date of an outcome or censoring event (such as death or transfer-out of the patient). Details on how each outcome was defined are shown in table S1.

Receipt of treatment was captured as the record of a referral to a behavior change program or intensive lifestyle counseling during the 12 months after the baseline HbA1c test. Treatment primarily included referrals to the NHS DPP, but we also included referrals to other structured programs and intensive lifestyle counseling, as they are likely to serve as an alternative where placement in the NHS DPP is not possible. All records considered as treatments are listed in table S4.

### Statistical analysis: Main analysis

We used a regression discontinuity analysis to estimate the association of intensive lifestyle counseling with different outcomes quantifying patients’ health status. The analysis consists of two steps: First, we estimated the association between individuals’ baseline HbA1c and being referred to intensive lifestyle counseling (“first stage”).
Second, we estimated the association between baseline HbA1c and each outcome by fitting separate regression lines above and below the HbA1c eligibility threshold. The difference in where these lines intersect the threshold quantifies the discontinuity in the outcome, our effect of interest.

Specifically, our analysis represents a fuzzy regression discontinuity (FRD) design, where the treatment is not assigned deterministically but probabilistically. In the FRD design, we can estimate the intent-to-treat effect $\mathrm{ITT}_{\mathrm{FRD}}$, that is, the effect of the patient presenting just above the eligibility threshold. $\mathrm{ITT}_{\mathrm{FRD}}$ measures the effect of treatment eligibility, as determined by the threshold rule, and is often of interest in its own right. In particular, $\mathrm{ITT}_{\mathrm{FRD}}$ can be interpreted as the effect of lowering the threshold on outcomes for the full population of patients close to the threshold. To obtain the effect of intensive lifestyle counseling itself on those induced to take up the intervention because of the threshold rule (so-called compliers), it is necessary to scale $\mathrm{ITT}_{\mathrm{FRD}}$ by the difference in the probability of treatment at the threshold. This results in the complier average causal effect ($\mathrm{CACE}_{\mathrm{FRD}}$).

We used a local linear approach, which minimizes bias by limiting the study sample to a defined bandwidth around the threshold in which a linear regression can be estimated (*10*). The size of the bandwidth was automatically selected using a data-driven method that seeks to optimally balance the bias-variance tradeoff (*10*). In addition, a triangular kernel was applied, such that individuals closer to the threshold were more heavily weighted than those further away. For estimation, we used a robust variance estimator and controlled for fixed effects at the practice level. Additional analyses were performed to assess the sensitivity of results to bandwidth size and to the use of polynomials of varying degrees.
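To make the scaling step concrete, the following is a minimal sketch of a fuzzy regression discontinuity point estimate with a triangular kernel and local linear fits on each side of the cutoff. It is not the authors' implementation (the paper uses the rdrobust package in R): the function names are hypothetical, the bandwidth is passed in by hand rather than selected by a data-driven method, and practice fixed effects and robust variance estimation are omitted.

```python
def triangular_weight(distance, bandwidth):
    """Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at the bandwidth."""
    return max(0.0, 1.0 - abs(distance) / bandwidth)


def local_linear_limit(xs, ys, cutoff, bandwidth, above):
    """Kernel-weighted linear fit on one side of the cutoff; returns the fitted value at the cutoff."""
    sw = swx = swy = swxx = swxy = 0.0
    for x, y in zip(xs, ys):
        d = x - cutoff                          # centre the running variable at the cutoff
        on_side = d >= 0 if above else d < 0    # units at the cutoff count as "above"
        w = triangular_weight(d, bandwidth)
        if not on_side or w <= 0.0:
            continue
        sw += w
        swx += w * d
        swy += w * y
        swxx += w * d * d
        swxy += w * d * y
    denom = sw * swxx - swx ** 2
    if denom == 0:
        raise ValueError("need at least two distinct running-variable values on this side")
    slope = (sw * swxy - swx * swy) / denom
    return (swy - slope * swx) / sw             # intercept = fitted value at the cutoff


def jump_at_cutoff(xs, ys, cutoff, bandwidth):
    """Difference between the right-hand and left-hand fitted values at the cutoff."""
    upper = local_linear_limit(xs, ys, cutoff, bandwidth, above=True)
    lower = local_linear_limit(xs, ys, cutoff, bandwidth, above=False)
    return upper - lower


def fuzzy_rd(running, treated, outcome, cutoff, bandwidth):
    """ITT = jump in the outcome; first stage = jump in treatment probability; CACE = ratio."""
    itt = jump_at_cutoff(running, outcome, cutoff, bandwidth)
    first_stage = jump_at_cutoff(running, treated, cutoff, bandwidth)
    return {"itt": itt, "first_stage": first_stage, "cace": itt / first_stage}
```

Calling `fuzzy_rd(hba1c, referred, delta_hba1c, cutoff=42.0, bandwidth=5.0)` with three equal-length lists (all three argument names and the bandwidth of 5 are placeholders) returns the three quantities described above. Dividing by the first-stage jump is what converts the eligibility effect into an effect among compliers, so the ratio becomes unstable when that jump is close to zero.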
In regression discontinuity studies, unbiased visual presentation of the data is essential (*11*). We plot the relationship between baseline HbA1c and referral to intensive lifestyle counseling to show the discontinuity in treatment assignment. We also provide visual evidence in support of key identifying assumptions that can be tested in the data. The first is that the density of the data should be continuous around the threshold; this would be violated if patients (or providers) could precisely manipulate the running variable, baseline HbA1c. The second is that baseline covariates should be balanced (i.e., continuous) at the threshold. As in randomized controlled clinical trials, evidence of balance on baseline observables provides confidence that patients assigned to treatment and control conditions are exchangeable. We used a global polynomial approach over the entire support of the data to illustrate potential unknown relationships. We did not use the global polynomial approach for causal inference because, unlike local lower-order regressions, it does not deliver point estimators with good properties for the treatment effect (*12*).

### Statistical analysis: Sensitivity analyses

In sensitivity analyses, we adjusted for variables that may be associated with outcomes to show robustness and improve the precision of effect estimates (*13*). These were age, gender, and number of GP consultations during follow-up. Importantly, for the primary and secondary outcomes we included time to follow-up record, i.e., months between baseline and endline measurement, and prescription of relevant medication: for HbA1c, we adjusted results for diabetes medication prescription; for blood pressure, we adjusted for blood pressure-lowering medication prescription; and for cholesterol and triglyceride levels, we adjusted for lipid-lowering medication prescription, in particular of statins, before the endline measure.
Estimates for mortality and MACE hospitalization were adjusted for whether patients received any of those three medications prior to their death or first MACE hospitalization. Because this is a retrospective study relying on routinely collected data, we expected considerable missingness in outcomes. We assessed whether this gives rise to selection bias by evaluating whether missingness in outcomes or time to follow-up changed discontinuously at the eligibility threshold. We did further sensitivity analyses to show that results are robust when (i) limiting treatment to the NHS DPP only as opposed to all intensive lifestyle counseling treatments, (ii) restricting our sample to patients for whom we observe a follow-up period of at least 24 months, and (iii) restricting our sample to patients with at least three GP consultations in the three years before their baseline HbA1c. All analyses were limited to complete cases and performed using R statistical software (version 4.1.1). Mean-squared error (MSE) optimal bandwidths were automatically selected by the *rdbwselect* command of the rdrobust package (*14*). Two-sided alpha was set at 0.05.

### Heterogeneous treatment effects

We present stratified, unadjusted regression discontinuity analyses, using local linear regressions with automatically selected optimal bandwidths for each subgroup, in the Supplement. Stratifying variables used in this analysis were gender (male vs. female), age group (18–39, 40–59, 60–80), ethnicity (Asian, black, mixed or other, white; based on the HES APC dataset), socioeconomic status, and practice residency (rural vs. urban). Socioeconomic status is based on the 2015 English Index of Multiple Deprivation composite score, grouped into quintiles ranging from 1 (= “least deprived”) to 5 (= “most deprived”) and mapped via postcode of residence for patients in English practices that have consented to participate in the linkage scheme.
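The stratified analyses can be pictured as running the same estimator separately within each subgroup. The sketch below shows only that scaffolding and rests on several assumptions: the row fields (`hba1c`, `outcome`, `gender`) are hypothetical, and `naive_jump` is a crude difference in means near the threshold that stands in for the local linear regression with subgroup-specific optimal bandwidths used in the paper.

```python
from collections import defaultdict
from statistics import mean


def naive_jump(rows, cutoff, bandwidth):
    """Crude stand-in estimator: mean outcome just above minus just below the cutoff."""
    above = [r["outcome"] for r in rows if 0 <= r["hba1c"] - cutoff < bandwidth]
    below = [r["outcome"] for r in rows if -bandwidth <= r["hba1c"] - cutoff < 0]
    if not above or not below:
        return None                      # stratum has no data on one side of the cutoff
    return mean(above) - mean(below)


def stratified_estimates(rows, stratum_key, estimator, **kwargs):
    """Group rows by a stratifying variable and apply the estimator within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[stratum_key]].append(row)
    return {level: estimator(group, **kwargs) for level, group in sorted(groups.items())}
```

Any function with the same call shape can be passed as `estimator`, so a local linear estimator could replace `naive_jump` without changing the grouping logic; for example, `stratified_estimates(rows, "gender", naive_jump, cutoff=42, bandwidth=2)` returns one estimate per gender category.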
### Ethics

The CPRD Independent Scientific Advisory Committee approved the study protocol (20_000052) in accordance with the Declaration of Helsinki.

## Funding

* National Institute of Allergy and Infectious Diseases (1DP2AI171011; PG)
* Chan Zuckerberg Biohub investigator award (PG)
* Alexander von Humboldt Foundation Alexander von Humboldt Professor award (TB)
* Ministry of Science, Research, and the Arts Baden-Wuerttemberg (MWK), Germany, German Research Foundation (DFG), the state of Baden-Wuerttemberg, Germany, and the German Research Foundation (DFG) grant INST 35/1314-1 FUGG (as part of data storage and computing resources)

## Author contributions

* Conceptualization: JL, CB, AJ, JD, SV, PG
* Data curation: JL, MX
* Formal analysis: JL, CB, MX
* Visualization: JL, PG
* Funding acquisition: TB, SV, PG
* Project administration: JL, TB, SV, PG
* Supervision: CB, TB, SV, PG
* Writing – original draft: JL, PG
* Writing – review & editing: JL, CB, MX, JD, AJ, SV, TB, PG

## Competing interests

The authors declare no competing interests.

## Additional Information

Supplementary Information is available for this paper. Correspondence and requests for materials should be addressed to Pascal Geldsetzer, Email: pgeldsetzer{at}stanford.edu.

## Acknowledgments

This study is based on data from the Clinical Practice Research Datalink obtained under license from the UK Medicines and Healthcare products Regulatory Agency. The data is provided by patients and collected by the NHS as part of their care and support. The interpretation and conclusions contained in this study are those of the authors alone.

* Received June 8, 2023.
* Revision received June 8, 2023.
* Accepted June 12, 2023.
* © 2023, Posted by Cold Spring Harbor Laboratory. The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. P. Saeedi, I. Petersohn, P. Salpea, B. Malanda, S. Karuranga, N. Unwin, S.
Colagiuri, L. Guariguata, A. A. Motala, K. Ogurtsova, J. E. Shaw, D. Bright, R. Williams, Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Research and Clinical Practice. 157, 107843 (2019).
2. World Health Organization, Global action plan for the prevention and control of noncommunicable diseases 2013–2020 (2013), (available at [http://apps.who.int/gb/ebwha/pdf\_files/WHA66/A66_R10-en.pdf?ua=1](http://apps.who.int/gb/ebwha/pdf_files/WHA66/A66_R10-en.pdf?ua=1)).
3. The Emerging Risk Factors Collaboration, Diabetes mellitus, fasting blood glucose concentration, and risk of vascular disease: a collaborative meta-analysis of 102 prospective studies. The Lancet. 375, 2215–2222 (2010).
4. NCD Risk Factor Collaboration (NCD-RisC), Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4·4 million participants. The Lancet. 387, 1513–1530 (2016).
5. M. Asif, The prevention and control the type-2 diabetes by changing lifestyle and dietary pattern. Journal of Education and Health Promotion. 3, 1 (2014).
6. E. Barry, S. Roberts, S. Finer, S. Vijayaraghavan, T. Greenhalgh, Time to question the NHS diabetes prevention programme. BMJ, h4717 (2015).
7. S. Taheri, H. Zaghloul, O. Chagoury, S. Elhadad, S. H. Ahmed, N. El Khatib, R. A. Amona, K. El Nahas, N. Suleiman, A. Alnaama, A. Al-Hamaq, M. Charlson, M. T. Wells, S. Al-Abdulla, A. B. Abou-Samra, Effect of intensive lifestyle intervention on bodyweight and glycaemia in early type 2 diabetes (DIADEM-I): an open-label, parallel-group, randomised controlled trial.
The Lancet Diabetes & Endocrinology. 8, 477–489 (2020).
8. D. E. Laaksonen, J. Lindstrom, T. A. Lakka, J. G. Eriksson, L. Niskanen, K. Wikstrom, S. Aunola, S. Keinanen-Kiukaanniemi, M. Laakso, T. T. Valle, P. Ilanne-Parikka, A. Louheranta, H. Hamalainen, M. Rastas, V. Salminen, Z. Cepaitis, M. Hakumaki, H. Kaikkonen, P. Harkonen, J. Sundvall, J. Tuomilehto, M. Uusitupa, Physical Activity in the Prevention of Type 2 Diabetes: The Finnish Diabetes Prevention Study. Diabetes. 54, 158–165 (2005).
9. Diabetes Prevention Program Research Group, Long-term effects of lifestyle intervention or metformin on diabetes development and microvascular complications over 15-year follow-up: the Diabetes Prevention Program Outcomes Study. The Lancet Diabetes & Endocrinology. 3, 866–875 (2015).
10. A. Ramachandran, C. Snehalatha, S. Mary, B. Mukesh, A. D. Bhaskar, V. Vijay, The Indian Diabetes Prevention Programme shows that lifestyle modification and metformin prevent type 2 diabetes in Asian Indian subjects with impaired glucose tolerance (IDPP-1). Diabetologia. 49, 289–297 (2006).
11. S. Brink, The Diabetes Prevention Program: How The Participants Did It. Health Affairs. 28, 57–62 (2009).
12. K. I. Galaviz, M. B. Weber, K. Suvada, U. P. Gujral, J. Wei, R. Merchant, S. Dharanendra, J. S. Haw, K. M. V. Narayan, M. K. Ali, Interventions for Reversing Prediabetes: A Systematic Review and Meta-Analysis. American Journal of Preventive Medicine (2022), doi:10.1016/j.amepre.2021.10.020.
13. G. Irving, A. L. Neves, H. Dambha-Miller, A. Oishi, H. Tagashira, A. Verho, J. Holden, International variations in primary care physician consultation time: a systematic review of 67 countries. BMJ Open. 7, e017902 (2017).
14. M. Rubio-Valera, M. Pons-Vigués, M. Martínez-Andrés, P. Moreno-Peral, A. Berenguera, A. Fernández, Barriers and Facilitators for the Implementation of Primary Prevention and Health Promotion Activities in Primary Care: A Synthesis through Meta-Ethnography. PLoS ONE. 9, e89554 (2014).
15. C. Keyworth, T. Epton, J. Goldthorpe, R. Calam, C. J.
Armitage, British Journal of Health Psychology, in press, doi:10.1111/bjhp.12368.
16. E. T. Hébert, M. O. Caughy, K. Shuval, Primary care providers’ perceptions of physical activity counselling in a clinical setting: a systematic review. British Journal of Sports Medicine. 46, 625–631 (2012).
17. A. Dewhurst, S. Peters, A. Devereux-Fitzgerald, J. Hart, Physicians’ views and experiences of discussing weight management within routine clinical consultations: A thematic synthesis. Patient Education and Counseling. 100, 897–908 (2017).
18. T. Kennedy-Martin, S. Curtis, D. Faries, S. Robinson, J. Johnston, A literature review on the representativeness of randomized controlled trial samples and implications for the external validity of trial results. Trials. 16, 495 (2015).
19. J. G. Ford, M. W. Howerton, G. Y. Lai, T. L. Gary, S. Bolen, M.
C. Gibbons, J. Tilburt, C. Baffi, T. P. Tanpitukpongse, R. F. Wilson, N. R. Powe, E. B. Bass, Barriers to recruiting underrepresented populations to cancer clinical trials: A systematic review. Cancer. 112, 228–242 (2008).
20. J. R. Rogers, C. Liu, G. Hripcsak, Y. K. Cheung, C. Weng, Comparison of Clinical Characteristics Between Clinical Trial Participants and Nonparticipants Using Electronic Health Record Data. JAMA Network Open. 4, e214732 (2021).
21. V. Suvarna, Phase IV of Drug Development. Perspectives in clinical research. 1, 57–60 (2010).
22. M. S. Hagger, M. Weed, DEBATE: Do interventions based on behavioral theory work in the real world? International Journal of Behavioral Nutrition and Physical Activity. 16, 36 (2019).
23. NICE (National Institute for Health and Care Excellence), Type 2 diabetes: prevention in people at high risk. 2012, (available at [https://www.nice.org.uk/guidance/ph38](https://www.nice.org.uk/guidance/ph38)).
24. J. Valabhji, E. Barron, D. Bradley, C. Bakhai, J. Fagg, S. O’Neill, B. Young, N. Wareham, K. Khunti, S. Jebb, J. Smith, Early Outcomes From the English National Health Service Diabetes Prevention Programme. Diabetes Care. 43, 152–160 (2020).
25. M. D. Cattaneo, N. Idrobo, R.
Titiunik, *A Practical Introduction to Regression Discontinuity Designs* (Cambridge University Press, 2019; [https://www.cambridge.org/core/product/identifier/9781108684606/type/element](https://www.cambridge.org/core/product/identifier/9781108684606/type/element)).
26. G. W. Imbens, T. Lemieux, Regression discontinuity designs: A guide to practice. Journal of Econometrics. 142, 615–635 (2008).
27. E. Selvin, M. W. Steffes, H. Zhu, K. Matsushita, L. Wagenknecht, J. Pankow, J. Coresh, F. L. Brancati, Glycated Hemoglobin, Diabetes, and Cardiovascular Risk in Nondiabetic Adults. New England Journal of Medicine. 362, 800–811 (2010).
28. N. Garg, N. Moorthy, A. Kapoor, S. Tewari, S. Kumar, A. Sinha, A. Shrivastava, P. K. Goel, Hemoglobin A1c in Nondiabetic Patients: An Independent Predictor of Coronary Artery Disease and Its Severity. Mayo Clinic Proceedings. 89, 908–916 (2014).
29. R. Persson, C. Vasilakis-Scaramozza, K. W. Hagberg, T. Sponholtz, T. Williams, P. Myles, S. S. Jick, CPRD Aurum database: Assessment of data quality and completeness of three important comorbidities. Pharmacoepidemiology and Drug Safety. 29, 1456–1464 (2020).
30. Public Health England, Diabetes UK, NHS England, NHS DPP overview and FAQ. NHS England Publications Gateway Reference 05728 (2016), (available at [https://www.england.nhs.uk/wp-content/uploads/2016/08/dpp-faq.pdf](https://www.england.nhs.uk/wp-content/uploads/2016/08/dpp-faq.pdf)).
31. K. S. Courneya, Efficacy, effectiveness, and behavior change trials in exercise research. International Journal of Behavioral Nutrition and Physical Activity. 7, 81 (2010).
32. K. I. Galaviz, M. B. Weber, A. Straus, J. S. Haw, K. M. V. Narayan, M. K. Ali, Global Diabetes Prevention Interventions: A Systematic Review and Network Meta-analysis of the Real-World Impact on Incidence, Weight, and Glucose. Diabetes Care. 41, 1526–1534 (2018).
33. M. Cardona-Morrell, L. Rychetnik, S. L. Morrell, P. T. Espinel, A. Bauman, Reduction of diabetes risk in routine clinical practice: are physical activity and nutrition interventions feasible and are the outcomes from reference trials replicable? A systematic review and meta-analysis. BMC Public Health. 10, 653 (2010).
34. U. Mudaliar, A. Zabetian, M. Goodman, J. B. Echouffo-Tcheugui, A. L. Albright, E. W. Gregg, M. K. Ali, Cardiometabolic Risk Factor Changes Observed in Diabetes Prevention Programs in US Settings: A Systematic Review and Meta-analysis. PLOS Medicine. 13, e1002095 (2016).
35. D. E. Jonas, K. Crotty, J. D. Y. Yun, J. C. Middleton, C. Feltner, S. Taylor-Phillips, C. Barclay, A. Dotson, C. Baker, C. P. Balio, C. E. Voisin, R. P. Harris, Screening for Prediabetes and Type 2 Diabetes. JAMA. 326, 744 (2021).
36. N. P. Pronk, Structured diet and physical activity programmes provide strong evidence of effectiveness for type 2 diabetes prevention and improvement of cardiometabolic health. Evidence Based Medicine. 21, 18–18 (2016).
37. A. C. Blonstein, V. Yank, R. S. Stafford, S. R. Wilson, L. G. Rosas, J. Ma, Translating an Evidence-Based Lifestyle Intervention Program Into Primary Care. Health Promotion Practice. 14, 491–497 (2013).
38. L. Aucott, D. Gray, H. Rothnie, M. Thapa, C. Waweru, Effects of lifestyle interventions and long-term weight loss on lipid outcomes - a systematic review. Obesity Reviews. 12, e412–e425 (2011).
39. R. Goulden, B. H. Rowe, M. Abrahamowicz, E. Strumpf, R. Tamblyn, Association of Intravenous Radiocontrast With Kidney Function. JAMA Internal Medicine. 181, 767 (2021).
40. A. A. Soukas, H. Hao, L. Wu, Metformin as Anti-Aging Therapy: Is It for Everyone? Trends in Endocrinology & Metabolism. 30, 745–755 (2019).
41. E. Proctor, D. Luke, A. Calhoun, C. McMillen, R. Brownson, S. McCrary, M. Padek, Sustainability of evidence-based healthcare: research agenda, methodological advances, and infrastructure support. Implementation Science. 10, 88 (2015).
42. B. de la Cuesta, K. Imai, Misunderstandings About the Regression Discontinuity Design in the Study of Close Elections. Annual Review of Political Science. 19, 375–396 (2016).
43. NHS Digital, Diabetes Prevention Programme: Non-Diabetic Hyperglycaemia, January to December 2021. National Diabetes Audit (2022), (available at [https://files.digital.nhs.uk/1E/852B7B/NDA\_NDH\_2021-22\_Q3\_01.01.21to31.12.21.xlsx](https://files.digital.nhs.uk/1E/852B7B/NDA_NDH_2021-22_Q3_01.01.21to31.12.21.xlsx)).
44. B. A. Goldstein, N. A. Bhavsar, M. Phelan, M. J. Pencina, Controlling for Informed Presence Bias Due to the Number of Health Encounters in an Electronic Health Record. American Journal of Epidemiology. 184, 847–855 (2016).
45. V. A. Katzke, R. Kaaks, T. Kühn, Lifestyle and Cancer Risk. The Cancer Journal. 21, 104–110 (2015).
46. A. Silverio, M. Di Maio, R. Citro, L. Esposito, G. Iuliano, M. Bellino, C. Baldi, G. De Luca, M. Ciccarelli, C. Vecchione, G. Galasso, Cardiovascular risk factors and mortality in hospitalized patients with COVID-19: systematic review and meta-analysis of 45 studies and 18,300 patients. BMC Cardiovascular Disorders. 21, 23 (2021).

## Methods references

1. NICE (National Institute for Health and Care Excellence), Type 2 diabetes: prevention in people at high risk. 2012, (available at [https://www.nice.org.uk/guidance/ph38](https://www.nice.org.uk/guidance/ph38)).
2. R. E. Hawkes, E. Cameron, S. Cotterill, P. Bower, D. P.
French, The NHS Diabetes Prevention Programme: an observational study of service delivery and patient experience. BMC Health Services Research. 20, 1098 (2020).
3. L. Penn, A. Rodrigues, A. Haste, M. M. Marques, K. Budig, K. Sainsbury, R. Bell, V. Araújo-Soares, M. White, C. Summerbell, E. Goyder, A. Brennan, A. J. Adamson, F. F. Sniehotta, NHS Diabetes Prevention Programme in England: formative evaluation of the programme in early phase implementation. BMJ Open. 8, e019467 (2018).
4. J. Valabhji, E. Barron, D. Bradley, C. Bakhai, J. Fagg, S. O’Neill, B. Young, N. Wareham, K. Khunti, S. Jebb, J. Smith, Early Outcomes From the English National Health Service Diabetes Prevention Programme. Diabetes Care. 43, 152–160 (2020).
5. A. Wolf, D. Dedman, J. Campbell, H. Booth, D. Lunn, J. Chapman, P. Myles, Data resource profile: Clinical Practice Research Datalink (CPRD) Aurum. International Journal of Epidemiology. 48, 1740–1740g (2019).
6. A. Herbert, L. Wijlaars, A. Zylbersztejn, D. Cromwell, P. Hardelid, Data Resource Profile: Hospital Episode Statistics Admitted Patient Care (HES APC). International Journal of Epidemiology. 46, 1093–1093i (2017).
7. C. J. Sammon, T. P. Leahy, S. Ramagopalan, Nonindependence of patient data in the clinical practice research datalink: a case study in atrial fibrillation patients. Journal of Comparative Effectiveness Research. 9, 395–403 (2020).
8. NHS Digital, “National Diabetes Audit: Non-Diabetic Hyperglycaemia, 2019-2020, Diabetes Prevention Programme. Main report” (2021), (available at [https://files.digital.nhs.uk/31/C59C4B/NDA\_NDH\_MainReport\_2019-20\_V1.pdf](https://files.digital.nhs.uk/31/C59C4B/NDA\_NDH_MainReport_2019-20_V1.pdf)).
9. J. Davidson, “Clinical codelist - HES - Major Adverse Cardiovascular Event. [Data Collection]” (London, United Kingdom, 2021), doi:10.17037/DATA.00002198.
10. G. Imbens, K. Kalyanaraman, Optimal Bandwidth Choice for the Regression Discontinuity Estimator. The Review of Economic Studies. 79, 933–959 (2012).
11. D. S. Lee, T. Lemieux, Regression Discontinuity Designs in Economics. Journal of Economic Literature. 48, 281–355 (2010).
12. M. D. Cattaneo, N. Idrobo, R.
Titiunik, A Practical Introduction to Regression Discontinuity Designs (Cambridge University Press, 2019; [https://www.cambridge.org/core/product/identifier/9781108684606/type/element](https://www.cambridge.org/core/product/identifier/9781108684606/type/element)).
13. S. Calonico, M. D. Cattaneo, M. H. Farrell, R. Titiunik, Regression Discontinuity Designs Using Covariates. The Review of Economics and Statistics. 101, 442–451 (2019).
14. S. Calonico, M. D. Cattaneo, M. H. Farrell, R. Titiunik, rdrobust: Robust Data-Driven Statistical Inference in Regression-Discontinuity Designs. R package version 1.0.8, (available at [https://cran.r-project.org/package=rdrobust](https://cran.r-project.org/package=rdrobust)).