RT Journal Article
SR Electronic
T1 GPT for RCTs?: Using AI to measure adherence to reporting guidelines
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.12.14.23299971
DO 10.1101/2023.12.14.23299971
A1 Wrightson, J.G.
A1 Blazey, P.
A1 Khan, K.M.
A1 Ardern, C.L.
YR 2023
UL http://medrxiv.org/content/early/2023/12/15/2023.12.14.23299971.abstract
AB Background: Adherence to established reporting guidelines can improve clinical trial reporting standards, but attempts to improve adherence have produced mixed results. This exploratory study aimed to determine how accurately a Large Language Model generative AI system (AI-LLM) could measure reporting guideline compliance in a sample of sports medicine clinical trial reports. Methods: The OpenAI GPT-3.5 AI-LLM was evaluated for its ability to determine reporting guideline adherence in a sample of 113 published sports medicine and exercise science clinical trial reports. For each paper, the model was prompted to answer a series of nine reporting guideline questions. The dataset was randomly split (80/20) into a TRAIN and TEST dataset. Hyperparameter and model fine-tuning were performed using the TRAIN dataset. Model performance (F1-score, classification accuracy) was assessed using the TEST dataset. Results: Across all questions, the AI-LLM demonstrated acceptable performance (F1-score = 86%). However, there was significant variation in performance between different reporting guideline questions (accuracy between 70-100%). The model was most accurate when asked to identify a defined primary objective or endpoint and least accurate when asked to identify an effect size and related confidence interval. Discussion: The AI-LLM showed promise as a tool for assessing reporting guideline compliance.
Next steps should include developing a cost-effective, open-source AI-LLM and exploring methods to improve model accuracy. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by a CIHR Research Operating Grant (Scientific Directors) held by Karim Khan. The funder had no role in the design and conduct of the study. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Code notebooks and data are available on the Open Science Framework (relevant links within the text). Copyright issues prevent the sharing of some of the text extracted from the papers used in this analysis; however, details of the steps needed to reproduce the extracted text from open and closed papers can be found within these notebooks.