RT Journal Article SR Electronic T1 An evaluation of reproducibility and errors in published sample size calculations performed using G*Power JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2024.07.15.24310458 DO 10.1101/2024.07.15.24310458 A1 Thibault, Robert T A1 Zavalis, Emmanuel A A1 Malicki, Mario A1 Pedder, Hugo YR 2024 UL http://medrxiv.org/content/early/2024/07/16/2024.07.15.24310458.abstract AB Background. Published studies in the life and health sciences often employ sample sizes that are too small to detect realistic effect sizes. This shortcoming increases the rate of false positives and false negatives, giving rise to a potentially misleading scientific record. To address this shortcoming, many researchers now use point-and-click software to run sample size calculations. Objective. We aimed to (1) estimate how many published articles report using the G*Power sample size calculation software; (2) assess whether these calculations are reproducible and (3) error-free; and (4) assess how often these calculations use G*Power's default option for mixed-design ANOVAs, which can be misleading and output sample sizes that are too small for a researcher's intended purpose. Method. We randomly sampled open access articles from PubMed Central published between 2017 and 2022 and used a coding form to manually assess 95 sample size calculations for reproducibility and errors. Results. We estimate that more than 48,000 articles published between 2017 and 2022 and indexed in PubMed Central or PubMed report using G*Power (i.e., 0.65% [95% CI: 0.62% - 0.67%] of articles). We could reproduce 2% (2/95) of the sample size calculations without making any assumptions, and likely reproduce another 28% (27/95) after making assumptions. 
Many calculations were not reported transparently enough to assess whether an error was present (75%; 71/95) or whether the sample size calculation was for a statistical test that appeared in the results section of the publication (48%; 46/95). Few articles that performed a calculation for a mixed-design ANOVA unambiguously selected the non-default option (8%; 3/36). Conclusion. Published sample size calculations that use G*Power are not transparently reported and may not be well-informed. Given the popularity of software packages like G*Power, they present an intervention point to increase the prevalence of informative sample size calculations. Competing Interest Statement. The authors have declared no competing interest. Clinical Protocols. https://doi.org/10.17605/OSF.IO/UJXHW Funding Statement. Robert Thibault was supported by a general support grant awarded to METRICS from Arnold Ventures and a postdoctoral fellowship from the Canadian Institutes of Health Research. Robert Thibault will serve as guarantor for the contents of this paper. Hugo Pedder was supported by the UK National Institute for Health and Social Care Excellence (NICE) via the Bristol Technology Assessment Group and the NICE Technical Support Unit. The funders had no role in the preparation of this manuscript or the decision to publish.
Data, data dictionaries, analysis scripts, and other materials related to this study are publicly available at https://osf.io/msz24/. The study protocol was registered on 31 May 2022 at https://doi.org/10.17605/OSF.IO/UJXHW. Discrepancies between this manuscript and the registered protocol are outlined in Supplementary Material A. The analysis script can be rerun by selecting "Reproducible Run" in the Code Ocean container available at https://doi.org/10.24433/CO.4349082.v1.
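The percentages in the Results are reported alongside their raw counts, so their arithmetic can be checked directly. The following is a minimal sketch and not the authors' analysis script: the dictionary labels and the helper names `percent` and `check_reported` are hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were rounded.

```python
# Counts copied from the abstract: (numerator, denominator, reported percent).
# The labels are illustrative paraphrases, not the authors' variable names.
reported = {
    "reproducible without assumptions": (2, 95, 2),
    "likely reproducible after assumptions": (27, 95, 28),
    "too opaque to assess for errors": (71, 95, 75),
    "unclear whether calculation matched a reported test": (46, 95, 48),
    "unambiguously chose non-default mixed ANOVA option": (3, 36, 8),
}


def percent(numerator, denominator):
    """Return the proportion as a whole-number percentage (assumed rounding)."""
    return round(100 * numerator / denominator)


def check_reported(entries):
    """Map each label to True if the recomputed percent equals the reported one."""
    return {
        label: percent(num, den) == pct
        for label, (num, den, pct) in entries.items()
    }


results = check_reported(reported)
for label, consistent in results.items():
    status = "consistent" if consistent else "MISMATCH"
    print(f"{label}: {status}")
```

Under these assumptions, each reported percentage agrees with its count. The 0.65% prevalence estimate cannot be checked this way because the abstract does not give its underlying counts.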