PT - JOURNAL ARTICLE
AU - Liu, Taotao
AU - Duan, Yaocong
AU - Li, Yanchun
AU - Hu, Yingying
AU - Su, Lingling
AU - Zhang, Aiping
TI - ChatGPT achieves comparable accuracy to specialist physicians in predicting the efficacy of high-flow oxygen therapy
AID - 10.1101/2023.10.12.23296773
DP - 2023 Jan 01
TA - medRxiv
PG - 2023.10.12.23296773
4099 - http://medrxiv.org/content/early/2023/10/12/2023.10.12.23296773.short
4100 - http://medrxiv.org/content/early/2023/10/12/2023.10.12.23296773.full
AB - Rationale: The failure of high-flow nasal cannula (HFNC) oxygen therapy can necessitate endotracheal intubation. Timely prediction of the risk of endotracheal intubation due to HFNC failure is critical for avoiding delays in intubation, thereby potentially decreasing mortality.
Objectives: To investigate the accuracy of ChatGPT in predicting the risk of endotracheal intubation within 48 hours after HFNC therapy and compare it with the predictive accuracy of specialist and non-specialist physicians.
Methods: We conducted a prospective multicenter cohort study based on the data of 71 adult patients who received HFNC therapy. We recorded patient baseline data, the results of blood gas analysis, and physiological parameters after 6 hours of HFNC therapy. For each patient, this information was used to create a 6-alternative forced-choice natural language questionnaire that asked participants to predict the risk of 48-hour endotracheal intubation using graded options from 1 to 6, with higher scores indicating a higher risk. GPT-3.5, GPT-4.0, respiratory and critical care specialist physicians, and non-specialist physicians each completed the same 71 questionnaires.
We then determined the optimal diagnostic cutoff point for each of them, as well as for the 6-hour ROX index, using the Youden index, and compared their predictive performance using receiver operating characteristic (ROC) analysis.
Results: The optimal diagnostic cutoff points for GPT-4.0 and specialist physicians were both ≥4. The overall accuracy of GPT-4.0 was 76.1% [specificity=78.6% (95%CI=52.4-92.4%); sensitivity=75.4% (95%CI=62.9-84.8%)]. The overall accuracy of specialist physicians was 80.3% [specificity=71.4% (95%CI=45.4-88.3%); sensitivity=82.5% (95%CI=70.6-90.2%)]. The optimal diagnostic cutoff points for GPT-3.5 and non-specialist physicians were both ≥5, with overall accuracies of 73.2% and 64.8%, respectively. The area under the ROC curve (AUROC) of GPT-4.0 was 0.821 (95%CI=0.698-0.943), which was greater than, but not significantly different from (P>0.05), the AUROCs of GPT-3.5 [0.775 (95%CI=0.652-0.898)] and specialist physicians [0.782 (95%CI=0.619-0.945)], and was significantly higher than that of non-specialist physicians [0.662 (95%CI=0.518-0.805), P=0.011]. When patients were grouped by GPT-4.0's prediction value into a high-risk group (≥4) and a low-risk group (≤3), the 28-day cumulative intubation rate (56.00% vs. 15.22%, P<0.001) and 28-day mortality (44.00% vs. 10.87%, P<0.001) were significantly higher in the high-risk group than in the low-risk group.
Conclusion: GPT-4.0 achieves an accuracy level comparable to that of specialist physicians in predicting the 48-hour endotracheal intubation risk in patients after HFNC therapy, based on patient baseline data and parameters recorded after 6 hours of HFNC therapy.
Large-scale studies are needed to further inspect whether GPT-4.0 can provide reliable clinical decision support.
Competing Interest Statement: The authors have declared no competing interest.
Clinical Trial: ChiCTR2100053027
Funding Statement: This study was funded by Shenyang RMS Medical Tech Company [20210901].
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Medical research ethics approval was obtained from each center (the First Affiliated Hospital of Henan University of Science and Technology 2021-0241; Jiangyan Hospital Affiliated to Nanjing University of Chinese Medicine 2021-016).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
All data produced in the present work are contained in the manuscript.
HFNC: high-flow nasal cannula
NIV: noninvasive ventilation
ICU: intensive care unit
BMI: body mass index
ABG: arterial blood gas
IQR: interquartile range
AUC: area under the receiver operating characteristic curve
ROX index: ratio of SpO2/FiO2 to respiratory rate
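The abstract names two calculations without spelling them out: the ROX index (defined in the abbreviation list as the ratio of SpO2/FiO2 to respiratory rate) and the choice of a diagnostic cutoff by the Youden index. The following is a minimal Python sketch of both, not the authors' code. It assumes SpO2 is given in percent and FiO2 as a fraction between 0 and 1, and that a higher score means a higher risk, as with the 1-to-6 questionnaire grades. The function names and the toy data are hypothetical and are not taken from the study.

```python
from __future__ import annotations

from typing import Sequence


def rox_index(spo2_percent: float, fio2_fraction: float, resp_rate: float) -> float:
    """ROX index = (SpO2 / FiO2) / respiratory rate.

    Assumed units: SpO2 in percent, FiO2 as a fraction (0-1],
    respiratory rate in breaths per minute.
    """
    if not 0 < fio2_fraction <= 1:
        raise ValueError("FiO2 must be a fraction in (0, 1]")
    if resp_rate <= 0:
        raise ValueError("respiratory rate must be positive")
    return (spo2_percent / fio2_fraction) / resp_rate


def youden_cutoff(
    scores: Sequence[int], outcomes: Sequence[bool]
) -> tuple[int, float, float, float]:
    """Return (cutoff, sensitivity, specificity, J) maximizing Youden's J.

    J = sensitivity + specificity - 1, evaluated for the rule
    "predict positive when score >= cutoff". Ties keep the lowest cutoff.
    """
    positives = sum(1 for y in outcomes if y)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative outcome")

    best = None
    for cutoff in sorted(set(scores)):
        # True positives: positive outcome and score at or above the cutoff.
        tp = sum(1 for s, y in zip(scores, outcomes) if y and s >= cutoff)
        # True negatives: negative outcome and score below the cutoff.
        tn = sum(1 for s, y in zip(scores, outcomes) if not y and s < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


# Hypothetical toy data (not from the study): risk grades 1-6 and outcomes.
toy_scores = [1, 2, 3, 4, 5, 6, 2, 5]
toy_outcomes = [False, False, False, True, True, True, False, False]

toy_rox = rox_index(spo2_percent=95, fio2_fraction=0.5, resp_rate=25)
toy_cutoff, toy_sens, toy_spec, toy_j = youden_cutoff(toy_scores, toy_outcomes)
```

The decision rule runs in the opposite direction for the ROX index, where lower values are generally read as higher risk of HFNC failure, so the comparison in `youden_cutoff` would need to be reversed (or the scores negated) before applying it to ROX values.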