Abstract
Background Artificial intelligence (AI) can greatly enhance efficiency in systematic literature reviews and meta-analyses, but its accuracy in screening titles/abstracts and full-text articles is uncertain.
Objectives This study evaluated the performance metrics (sensitivity, specificity) of a GPT-4 AI program, Review Copilot, against human decisions (gold standard) in screening titles/abstracts and full-text articles from four published systematic reviews/meta-analyses.
Research Design Participant data from four already-published systematic literature reviews were used for this validation study. This was a study comparing Review Copilot to human decision-making (gold standard) in screening titles/abstracts and full-text articles for systematic reviews/meta-analyses. The four studies that were used in this study included observational studies and randomized control trials. Review Copilot operates on the OpenAI, GPT-4 server. We examined the performance metrics of Review Copilot to include and exclude titles/abstracts and full-text articles as compared to human decisions in four systematic reviews/meta-analyses. Sensitivity, specificity, and balanced accuracy of title/abstract and full-text screening were compared between Review Copilot and human decisions.
Results Review Copilot’s sensitivity and specificity for title/abstract screening were 99.2% and 83.6%, respectively, and 97.6% and 47.4% for full-text screening. The average agreement between two runs was 95.4%, with a kappa statistic of 0.83. Review Copilot screened in one-quarter of the time compared to humans.
Conclusions AI use in systematic reviews and meta-analyses is inevitable. Health researchers must understand these technologies’ strengths and limitations to ethically leverage them for research efficiency and evidence-based decision-making in health.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study received no funding outside the authors' regular salaries.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes