The Proportional Treatment Effect: A Metric That Empowers and Connects
======================================================================

* Guoqiao Wang
* Yijie Liao
* Caiyan Li
* Kun Jin
* Yan Li
* Gary Cutter

## Abstract

Clinical trials with continuous endpoints evaluate efficacy by comparing the difference in mean changes from baseline between groups. However, clinicians often interpret results in terms of a proportional reduction rather than an absolute difference. An alternative approach is to reparameterize this difference as a proportional treatment effect, calculated by dividing the difference by the placebo mean change. We demonstrate that, in theory, the proportional treatment effect can be more powerful than the simple difference in means while still controlling the type I error rate. This is achieved using the delta method as implemented in well-established computational tools such as the R package ‘msm’ and the SAS procedure ‘NLMIXED’. By analyzing data from phase III trials, we illustrate how a proportional treatment effect connects treatment outcomes across various endpoints and different presentation formats. The availability of these well-established statistical tools for estimating proportional treatment effects, combined with this theoretical demonstration, suggests an alternative test statistic for clinical trials with continuous endpoints.

## 1. Introduction

In clinical trials with continuous endpoints, efficacy inference has traditionally been based on comparing the difference ($\mu_P - \mu_T$) in the mean change from baseline to the last study visit between the placebo group ($\mu_P$) and the treatment group ($\mu_T$) using a two-sample t-test. Recently, an alternative approach has been proposed, which involves reparameterizing the difference as a proportional treatment effect: $\theta = (\mu_P - \mu_T)/\mu_P$.
It has been demonstrated that, under the same model setting, this reparameterization can increase power compared with the difference between means.<sup>1, 2, 4</sup> However, this conclusion currently rests on simulations and has not yet been derived theoretically in the literature. This proportional treatment effect essentially relates to the ratio of two means, a topic that has been extensively investigated,<sup>6-9</sup> including the concise confidence interval formula provided by Fieller’s theorem.<sup>10</sup> Nonetheless, none of these investigations has directly explored why a proportional treatment effect can potentially have greater power than the difference between the means of the treatment and placebo groups while controlling Type I error.

In this study, we provide theoretical justification for the following points, based on the delta method as implemented in well-established computational tools such as the R package ‘msm’ and the SAS procedure ‘NLMIXED’: (i) A proportional treatment effect, under certain conditions, can lead to greater power than the difference in means (henceforth referred to as the placebo-treatment difference). (ii) A proportional treatment effect does not inflate Type I error. (iii) A proportional treatment effect connects various ways of measuring treatment efficacy across different endpoints within the same clinical trial.

## 2. Materials and Methods

A conventional clinical trial with a continuous endpoint typically features a randomized (often 1:1 ratio), double-blind, parallel-group design with a placebo control (or an active comparator; for simplicity, and without loss of generality, we refer only to placebo). For a given primary continuous endpoint, either higher or lower values indicate better outcomes. Let $\mu_P$ denote the mean change from baseline for the placebo group, and $\mu_T$ for the treatment group. Without loss of generality, we assume that the investigational drug leads to improvement in the endpoint.
That means $\mu_P \le \mu_T < 0$ when higher values indicate better outcomes, and $\mu_P \ge \mu_T > 0$ when lower values indicate better outcomes. We define the proportional treatment effect as $\theta = (\mu_P - \mu_T)/\mu_P$. Therefore, regardless of the direction (negative or positive) of the mean change from baseline, $\theta$ is positive (i.e., $\theta > 0$) when the treatment effectively slows down the disease progression.

The test statistic for the difference in the mean change from baseline between the placebo and treatment groups was compared with the test statistic for a proportional treatment effect under this conventional clinical trial setting. The connections between various measures of the treatment effect, and between the treatment effects across different endpoints, are demonstrated using published data from the Clarity AD<sup>11</sup> and TRAILBLAZER-ALZ2<sup>12</sup> trials for Alzheimer’s disease.

## 3. Results

### 3.1 A Proportional Treatment Effect Empowers and Controls Type I Error

Let us consider a two-sample situation where the random variables are assumed to be mutually independent and normally distributed with an unknown but common variance $\sigma^2$. Specifically, let $X_{jP} \sim N(\mu_P, \sigma^2)$ be the $N$ observations of the change from baseline to the last visit in the placebo group and $X_{jT} \sim N(\mu_T, \sigma^2)$ be the $N$ observations in the treatment group, with $j = 1, \ldots, N$. Without loss of generality, we assume that lower values indicate better outcomes and $\mu_P \ge \mu_T$.

Traditionally, the efficacy inference is based on the difference between the two group means. Let $\bar{X}_P = \frac{1}{N}\sum_{j=1}^{N} X_{jP}$ be the mean of these $X_{jP}$ observations; then $\bar{X}_P \sim N(\mu_P, \sigma^2/N)$. Similarly, $\bar{X}_T \sim N(\mu_T, \sigma^2/N)$. The test statistic of the difference $\mu_P - \mu_T$ can be estimated as:

![Formula][6]

When the proportional treatment effect is defined as $\theta = (\mu_P - \mu_T)/\mu_P$, then $\mu_T = \mu_P(1 - \theta)$. Subsequently, $X_{jT} \sim N((1 - \theta)\mu_P, \sigma^2)$.
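
The setup above, together with the first-order (delta-method) variance of a ratio of independent sample means that the derivation goes on to use, can be sketched as follows. This is a hedged reading of the argument rather than a transcription of the displayed formulas: it assumes equal group sizes $N$, treats $\sigma$ as known, takes $\mu_P > 0$, and evaluates the variance at the true means. The labels $Z_d$ and $Z_\theta$ are introduced here for illustration only.

```latex
% Sketch under stated assumptions: sigma known, equal group sizes N, mu_P > 0.
% Z_d and Z_theta are illustrative labels, not notation taken from the source.
\begin{align*}
\bar{X}_P &\sim N\!\left(\mu_P, \tfrac{\sigma^2}{N}\right), \qquad
\bar{X}_T \sim N\!\left(\mu_T, \tfrac{\sigma^2}{N}\right), \\
Z_d &= \frac{\bar{X}_P - \bar{X}_T}{\sqrt{2\sigma^2/N}}, \\
\hat{\theta} &= 1 - \frac{\bar{X}_T}{\bar{X}_P}, \\
\operatorname{Var}(\hat{\theta}) &\approx
  \left(\frac{\partial\theta}{\partial\mu_P}\right)^{2}\frac{\sigma^2}{N}
  + \left(\frac{\partial\theta}{\partial\mu_T}\right)^{2}\frac{\sigma^2}{N}
  = \frac{\sigma^2}{N\mu_P^2}\left[1 + (1-\theta)^2\right], \\
Z_\theta &= \frac{\hat{\theta}}{\sqrt{\operatorname{Var}(\hat{\theta})}}
  \approx Z_d \sqrt{\frac{2}{1 + (1-\theta)^2}}.
\end{align*}
```

Under this sketch the multiplier equals 1 at $\theta = 0$ and exceeds 1 whenever $0 < \theta < 2$; for example, $\theta = 0.27$ gives a factor of about 1.14.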
By the classical delta method,<sup>10</sup>

![Formula][8]

and, furthermore, $X_{jT}$ is independent of $X_{jP}$, thus

![Formula][9]

It is worth noting that the use of this formula (i.e., the first-order derivative) to approximate the variance of a ratio has been widely implemented across various studies<sup>6, 9, 13, 14</sup> and statistical packages, including the R packages “msm”<sup>15</sup> and “car”<sup>16</sup>, as well as SAS “proc nlmixed”.<sup>3, 17</sup> Our goal is to establish a connection between the test statistic of $\theta$ and the test statistic of the difference by applying this well-established variance formula. Given the variance, the test statistic of $\theta$ can be estimated as:

![Formula][10]

When ![Graphic][11] can be related to ![Graphic][12] in the following way:

![Formula][13]

Therefore, a proportional parameterization will lead to a larger test statistic, consequently yielding a smaller p-value and greater power. Under the null hypothesis, when $\mu_P = \mu_T$, we have ![Graphic][14]. Therefore, both test statistics result in the same type I error control.

It is worth noting that the power gain might be attributed to the delta method, which is used to approximate the distribution of the test statistic. Under these circumstances, our derivation reveals a potential limitation in commonly used statistical software packages when estimating nonlinear proportional treatment effects. It is crucial for statisticians to be aware of this issue. To the best of our knowledge, this matter has not been addressed in the existing literature.

### 3.2 Reparameterization Matters

When $\mu_P \ge \mu_T$ and ![Graphic][15], Section 3.1 demonstrates that this proportional treatment effect can have greater power than the difference. However, consider the case in which the proportional treatment effect is defined differently, as $\theta_T = (\mu_P - \mu_T)/\mu_T$.
The subscript $T$ in $\theta_T$ indicates that the proportional treatment effect is relative to the mean of the treatment group rather than the mean of the placebo group, as in $\theta$. Following the same derivation process described in Section 3.1, it can be shown that:

![Formula][17]

and

![Formula][18]

Therefore, the test statistic of $\theta_T$ can be estimated as:

![Formula][19]

Similarly, when ![Graphic][20] can be related to ![Graphic][21] in the following way:

![Formula][22]

Therefore, this proportional parameterization will lead to a smaller test statistic, consequently yielding a larger p-value and less power. Under the null hypothesis, when ![Graphic][23]. Therefore, the type I error is still controlled.

### 3.3 Simulation Results

To illustrate the theoretical conclusions presented in Sections 3.1 and 3.2, simple simulations were performed to model a two-sample scenario as described in Section 3.1. Three test statistics were evaluated: (i) the proportional effect with the larger mean as the denominator, $(\mu_P - \mu_T)/\mu_P$; (ii) the proportional effect with the smaller mean as the denominator, $(\mu_P - \mu_T)/\mu_T$; and (iii) the difference between the two means, $\mu_P - \mu_T$. Table 1 presents the simulation results, which were generated using the SAS procedure *nlmixed* (supplemental material includes the corresponding SAS code).

[Table 1](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/T1): Power and Type I error for various simulated scenarios.

These results agree with the theoretical conclusions: when the nonlinear model is estimated with the standard SAS procedure, the proportional treatment effect shows greater power than the traditional difference while maintaining Type I error control. Furthermore, the proportional treatment effect exhibits greater power when the larger mean is used as the denominator. In other words, reparameterization is critical.
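
The comparison of the three test statistics can also be explored with a small Monte Carlo sketch. The code below is not the authors' SAS `nlmixed` program: it substitutes plug-in Wald z-statistics whose standard errors come from the first-order delta method, with a pooled variance estimate. The function names (`wald_statistics`, `rejection_rates`), the standard deviation of 2.0, the sample size of 300 per arm, and the number of replicates are illustrative assumptions; only the placebo mean of 1.66 and the difference of 0.45 are taken from the Clarity AD figures quoted in the Discussion.

```python
import numpy as np


def wald_statistics(x_p, x_t):
    """Plug-in Wald z-statistics for the difference, theta, and theta_T.

    x_p, x_t: arrays whose last axis holds the N observations per arm.
    Assumes equal group sizes and a common variance (illustrative setup).
    """
    n = x_p.shape[-1]
    m_p = x_p.mean(axis=-1)
    m_t = x_t.mean(axis=-1)
    # Pooled variance estimate; var_mean is the variance of each sample mean.
    s2 = 0.5 * (x_p.var(axis=-1, ddof=1) + x_t.var(axis=-1, ddof=1))
    var_mean = s2 / n

    # (iii) Difference between means.
    z_diff = (m_p - m_t) / np.sqrt(2.0 * var_mean)

    # (i) theta = 1 - m_t / m_p; gradient is (m_t / m_p^2, -1 / m_p).
    theta = 1.0 - m_t / m_p
    se_theta = np.sqrt(var_mean * (m_t**2 / m_p**4 + 1.0 / m_p**2))

    # (ii) theta_T = m_p / m_t - 1; gradient is (1 / m_t, -m_p / m_t^2).
    theta_t = m_p / m_t - 1.0
    se_theta_t = np.sqrt(var_mean * (1.0 / m_t**2 + m_p**2 / m_t**4))

    return z_diff, theta / se_theta, theta_t / se_theta_t


def rejection_rates(mu_p, mu_t, sigma, n, n_sim=5000, crit=1.959964, seed=0):
    """Two-sided rejection rates, ordered as (difference, theta, theta_T)."""
    rng = np.random.default_rng(seed)
    x_p = rng.normal(mu_p, sigma, size=(n_sim, n))
    x_t = rng.normal(mu_t, sigma, size=(n_sim, n))
    z = np.stack(wald_statistics(x_p, x_t))
    return (np.abs(z) > crit).mean(axis=1)


if __name__ == "__main__":
    # Alternative: placebo mean 1.66, treatment mean 1.66 - 0.45 = 1.21.
    power = rejection_rates(mu_p=1.66, mu_t=1.21, sigma=2.0, n=300)
    # Null: both arms share the placebo mean.
    type1 = rejection_rates(mu_p=1.66, mu_t=1.66, sigma=2.0, n=300, seed=1)
    for name, p, a in zip(("difference", "theta", "theta_T"), power, type1):
        print(f"{name:>10}: power={p:.3f}  type I error={a:.3f}")
```

With both sample means positive, each simulated data set satisfies `z_theta = z_diff * sqrt(2 / (1 + r**2))` with `r = m_t / m_p`, so the ordering of the three rejection rates under the alternative follows directly from whether `r` is below or above 1. Moving `mu_p` toward zero in this sketch is one way to see the instability discussed later in the limitations.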
### 3.4 A Proportional Treatment Effect Connects

For clinical trials with a continuous primary endpoint, such as the Clarity AD<sup>11</sup> and TRAILBLAZER-ALZ2<sup>12</sup> trials for Alzheimer’s disease, various efficacy measures have been used to present the treatment effect, including the difference in the mean change from baseline between groups, time savings in disease progression, and reduction in hazard ratio. Despite these diverse representations of the treatment effect, when converted to a proportional treatment effect, they converge to very similar values within the same trial (Table 2).

[Table 2](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/T2): Illustration of various efficacy measures and their interconnections.

Additionally, each clinical trial typically employs multiple key secondary endpoints with various scale ranges. Comparing the treatment effects (i.e., the placebo-treatment difference) across primary and secondary endpoints can be challenging due to these differing scales. Figure 1 illustrates the comparison of two different representations of treatment effects for both the Clarity AD and TRAILBLAZER-ALZ2 (low/medium tau population) trials. Despite the large variation in the differences between means for the four endpoints in each trial, converting these to proportional treatment effects ($\theta$) relative to the placebo mean change makes the treatment effects more comparable and easier to interpret, both within the same trial and across trials. Table 2 and Figure 1 demonstrate that using a proportional treatment effect not only unifies various efficacy measures but also aligns treatment effects across different endpoints within the same trial.

[Figure 1](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/F1): Comparison of two different types of presentation of treatment effects. Panels A and C: the difference between means; Panels B and D: the proportional treatment effect.
Panels A and B were generated using data from the Clarity AD trial. Panels C and D were generated using data from the TRAILBLAZER-ALZ2 trial (low/medium tau population).

### 3.5 Bias and Asymmetry

It has been shown that the distribution of ![Graphic][26] can be asymmetric, and its estimate is asymptotically unbiased with a bias given by ![Graphic][27]. Consequently, the proportional treatment effect ![Graphic][28] can also exhibit asymmetry, and its estimate is asymptotically unbiased. The bias of $\theta$ is ![Graphic][29]. In other words, the delta-method estimate of $\theta$ tends to be smaller than the true $\theta$ when both means are either positive or negative and $\theta$ is positive.

## 4. Discussion

The use of a proportional treatment effect is well established in the analysis of survival time-to-event endpoints (e.g., the Cox proportional hazards model)<sup>19</sup> and categorical endpoints (e.g., the proportional odds model<sup>20</sup>). A proportional effect provides a flexible way to evaluate the average treatment efficacy between groups and has become a standard tool for both survival and categorical endpoints. For continuous endpoints, efficacy inference traditionally relies on the difference in mean change from baseline between groups.<sup>11, 12</sup> The methodologies and computational packages needed to analyze this difference are well established. Additionally, estimating sample size when using the difference for efficacy inference is straightforward. In contrast, although the proportional treatment effect obtained by reparameterizing the difference relative to the placebo mean, $\theta = (\mu_P - \mu_T)/\mu_P$, has been extensively used to communicate treatment effects, it has not been widely employed as a formal efficacy inference test statistic.
For example, in the Clarity AD trial, a 27% reduction relative to the placebo decline in CDR-SB, obtained by dividing the difference between groups by the placebo mean (27% = 0.45/1.66), has been used to communicate the treatment effect to clinicians, patients, and the media. Similarly, in the TRAILBLAZER-ALZ2 trial (low/medium tau population), a 36% reduction has been widely reported. A proportional treatment effect is often easier to communicate than the difference in primary endpoints, especially for non-researchers.

In this report, we demonstrate that a proportional treatment effect can potentially have greater power under certain circumstances than the difference, even though the former is a reparameterization of the latter. Our demonstration was based on the popular nonlinear approximation method employed in various well-established computational packages. We also show how reparameterization matters. Because the variance of the proportion is a function of the ratio of the two means, the larger mean should be the denominator in order to reduce the variance.

Furthermore, we illustrate that regardless of how treatment effects are presented, and which endpoint is used, they are interconnected once converted to a proportional treatment effect. This interconnection not only enables comparison of treatment effects across trials and endpoints but also offers new possibilities and flexibility in analyzing clinical trial data. For example, a shared proportional treatment effect can be used to model multiple endpoints with different scale ranges simultaneously, instead of modeling them sequentially. We hope that the enhanced capabilities of a proportional treatment effect to empower and connect, compared to the traditional test statistic of the difference between means, will attract the attention of statisticians so that this approach can be used more broadly.

There are some limitations to our study.
First, the derivation is based on the delta method that is used to approximate the test statistic, and the result might not hold for other approximation approaches. If the power gain is attributable to the delta method, our derivation reveals a potential limitation in commonly used statistical software packages when estimating nonlinear proportional treatment effects. It is crucial for statisticians to be aware of this issue. To the best of our knowledge, this matter has not been addressed in the existing literature. Second, although, in theory, a proportional treatment effect can yield more power than the difference between groups, it can become unstable when the mean change in the denominator (i.e., $\mu_P$) is close to zero, an issue inherent to any test statistic based on a ratio. Just as Fieller’s theorem does not work well when the denominator is close to zero, we do not recommend the use of a proportional treatment effect under this circumstance either.<sup>10</sup>

To apply a proportional treatment effect, we recommend the following: (i) check whether a reasonably large placebo change over time is observed in natural history data or proof-of-concept phase I/II clinical trials, if available; and (ii) conduct extensive simulations to evaluate the effectiveness and stability of a proportional treatment effect if the placebo change is small.

We hope that the availability of well-established statistical packages for estimating a proportional treatment effect, together with the theoretical demonstration of its potential to empower and connect, provides an alternative test statistic for clinical trials with continuous, longitudinal endpoints.

## Data Availability

All data produced in the present work are contained in the manuscript and can be generated using the SAS code provided in the supplemental material.

## Disclosure of Conflict of Interests

Guoqiao Wang, PhD, is the biostatistics core co-leader for the DIAN-TU.
He reports serving on a Data Safety Committee for Eli Lilly and Company, Amydis Corporate, and Abata Therapeutics, and as a statistical consultant for Eisai Inc. and Alector Inc. The other authors report no COIs.

## Supplemental Materials

### Simulation Sample Code

* [SAS code image 1](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/F2/graphic-13)
* [SAS code image 2](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/F2/graphic-14)
* [SAS code image 3](http://medrxiv.org/content/early/2025/02/13/2025.02.12.25322182/F2/graphic-15)

* Received February 12, 2025.
* Revision received February 12, 2025.
* Accepted February 13, 2025.
* © 2025, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at [http://creativecommons.org/licenses/by-nc/4.0/](http://creativecommons.org/licenses/by-nc/4.0/)

## 5. References

1. Wang G, Berry S, Xiong C, et al. A novel cognitive disease progression model for clinical trials in autosomal-dominant Alzheimer’s disease. Statistics in Medicine 2018.
2. Wang G, Liu L, Li Y, et al. Proportional constrained longitudinal data analysis models for clinical trials in sporadic Alzheimer’s disease. Alzheimer’s & Dementia: Translational Research & Clinical Interventions 2022; 8:e12286.
3. Wang G, Wang W, Mangal B, et al. Novel non-linear models for clinical trial analysis with longitudinal data: A tutorial using SAS for both frequentist and Bayesian methods. Statistics in Medicine 2024.
4. Raket LL. Progression models for repeated measures: Estimating novel treatment effects in progressive diseases. Statistics in Medicine 2022; 41:5537–5557.
5. Wang G, Cutter G, Oxtoby NP, et al. Statistical considerations when estimating time-saving treatment effects in Alzheimer’s disease clinical trials.
Alzheimer’s & Dementia 2024.
6. Van Kempen G and Van Vliet L. Mean and variance of ratio estimators used in fluorescence ratio imaging. Cytometry: The Journal of the International Society for Analytical Cytology 2000; 39:300–305.
7. Hauschke D, Kieser M, Diletti E, et al. Sample size determination for proving equivalence based on the ratio of two means for normally distributed data. Statistics in Medicine 1999; 18:93–105.
8. Friedrich JO, Adhikari NK and Beyene J. Ratio of means for analyzing continuous outcomes in meta-analysis performed as well as mean difference methods. Journal of Clinical Epidemiology 2011; 64:556–564.
9. Pham-Gia T, Turkkan N and Marchand E. Density of the ratio of two normal random variables and applications. Communications in Statistics-Theory and Methods 2006; 35:1569–1591.
10. Cox C. Fieller’s theorem, the likelihood and the delta method. Biometrics 1990: 709–718.
11. Van Dyck CH, Swanson CJ, Aisen P, et al. Lecanemab in early Alzheimer’s disease. New England Journal of Medicine 2023; 388:9–21.
12. Sims JR, Zimmer JA, Evans CD, et al.
Donanemab in early symptomatic Alzheimer disease: the TRAILBLAZER-ALZ 2 randomized clinical trial. JAMA 2023.
13. Bartlett JW, De Stavola BL and Frost C. Linear mixed models for replication data to efficiently allow for covariate measurement error. Statistics in Medicine 2009; 28:3158–3178.
14. Gu K, Ng HKT, Tang ML, et al. Testing the ratio of two Poisson rates. Biometrical Journal: Journal of Mathematical Methods in Biosciences 2008; 50:283–298.
15. Jackson C. Multi-state models for panel data: the msm package for R. Journal of Statistical Software 2011; 38:1–28.
16. Fox J, Friendly GG, Graves S, et al. The car package. R Foundation for Statistical Computing 2007; 1109:1431.
17. Singer JD. Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press, 2003.
18. Scott A and Wu C-F. On the asymptotic distribution of ratio and regression estimators. Journal of the American Statistical Association 1981; 76:98–102.
19. Allison PD. Survival analysis using SAS: a practical guide. SAS Institute, 2010.
20. Agresti A. Categorical data analysis. John Wiley & Sons, 2012.
[1]: /embed/inline-graphic-1.gif [2]: /embed/inline-graphic-2.gif [3]: /embed/inline-graphic-3.gif [4]: /embed/inline-graphic-4.gif [5]: /embed/inline-graphic-5.gif [6]: /embed/graphic-1.gif [7]: /embed/inline-graphic-6.gif [8]: /embed/graphic-2.gif [9]: /embed/graphic-3.gif [10]: /embed/graphic-4.gif [11]: /embed/inline-graphic-7.gif [12]: /embed/inline-graphic-8.gif [13]: /embed/graphic-5.gif [14]: /embed/inline-graphic-9.gif [15]: /embed/inline-graphic-10.gif [16]: /embed/inline-graphic-11.gif [17]: /embed/graphic-6.gif [18]: /embed/graphic-7.gif [19]: /embed/graphic-8.gif [20]: /embed/inline-graphic-12.gif [21]: /embed/inline-graphic-13.gif [22]: /embed/graphic-9.gif [23]: /embed/inline-graphic-14.gif [24]: /embed/inline-graphic-15.gif [25]: /embed/inline-graphic-16.gif [26]: /embed/inline-graphic-17.gif [27]: /embed/inline-graphic-18.gif [28]: /embed/inline-graphic-19.gif [29]: /embed/inline-graphic-20.gif [30]: /embed/inline-graphic-21.gif