RT Journal Article
SR Electronic
T1 A hybrid-computer vision model to predict lung cancer in diverse patient populations
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.10.07.24315011
DO 10.1101/2024.10.07.24315011
A1 Zakkar, Abdul J.
A1 Perwaiz, Nazia
A1 Zhong, Weiheng
A1 Krule, Alex
A1 Burrage-Burton, Mason
A1 Kim, Daniel
A1 Miglani, Mehak
A1 Narra, Vijeth
A1 Yousef, Farah
A1 Gadi, V.K.
A1 Korpics, Mark C.
A1 Kim, Sage J.
A1 Khan, Aly A.
A1 Molina, Yamilé
A1 Dai, Yang
A1 Marai, G. Elisabeta
A1 Meidani, Hadi
A1 Nguyen, Ryan H.
A1 Salahudeen, Ameen A.
YR 2024
UL http://medrxiv.org/content/early/2024/10/07/2024.10.07.24315011.abstract
AB Importance: Lung cancer disparities affect minoritized groups, particularly Black populations, who face increased risk yet are screened at lower rates. Standards set by the United States Preventive Services Task Force (USPSTF) are derived from a predominantly White cohort, the National Lung Screening Trial (NLST), which exacerbates disparities in lung cancer screening (LCS) and diagnosis. Objective: To evaluate individualized risk assessment using highly accurate risk models that integrate clinical and imaging-based risk factors for lung cancer prediction, with the aim of improving LCS accuracy and reducing disparities among minoritized populations. Design, Setting, and Participants: A retrospective real-world patient cohort was assembled from University of Illinois Health (UIH) using available low-dose computed tomography (LDCT) scans (January 1, 2015 to March 16, 2024). We then evaluated the performance of a ResNet-18 model, trained on LDCTs from the predominantly White NLST cohort, on the diverse UIH patient population of 65,106 patients, of whom 8,823 identify as Black. Inclusion criteria for the UIH cohort were based on CPT codes and on ICD-9 and ICD-10 criteria for neoplasm of the bronchus or lung. The proposed hybrid model was assessed for its predictive accuracy across racial groups and Body-Mass Index (BMI) categories. Main Outcomes and Measures: The primary outcomes included the hybrid AI model’s ability to improve lung cancer screening adherence, its effectiveness across diverse racial groups (highlighting disparities in performance between Black and White populations), and its performance in individuals with varying BMI, particularly those with BMI ≥ 30. 
Secondary outcomes were the hybrid model’s sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) compared with traditional USPSTF guidelines. Results: The hybrid AI model was trained using clinical and imaging data from the NLST cohort and tested on a diverse urban and suburban population in the Chicago metropolitan area (UIH cohort). The clinical model, optimized to 7 clinical features, achieved ROC-AUC values of 0.64-0.67 in the NLST test set and 0.60-0.65 in the UIH cohort. The inclusion of ResNet-based image predictors significantly improved the model’s performance, achieving ROC-AUC values of 0.78-0.91 and PR-AUC values of 0.25-0.33 in NLST. However, the hybrid model’s performance deteriorated when applied to Black patients in the UIH cohort, with ROC-AUC values of 0.65-0.75, and deteriorated to 0.67 in obese patients (BMI ≥ 30). Further investigation found that the ResNet-18 model was the underlying cause of the disparate results, with higher performance among White patients compared to Black UIH patients. Attempts to optimize the ResNet-18 outputs revealed a domain shift, where model optimization in Black patients resulted in deterioration in White patients, reflecting the limited representation of Black patients in the model’s original training dataset. Model performance also deteriorated for individuals with a BMI ≥ 30 in both the NLST and UIH data sets. Conclusions and Relevance: The hybrid AI model shows promise in providing personalized lung cancer risk predictions with improved accuracy compared to clinical risk models alone. However, biases in training data, particularly regarding race and BMI, limit its generalizability. Future work should focus on developing more inclusive training datasets and further validating the model in diverse prospective cohorts to enhance its applicability in reducing lung cancer disparities. Competing Interest Statement: A.Z. and A.A.S. 
have filed an invention disclosure related to this work. A.A.S. has an equity interest in Tempus AI. Funding Statement: This study was funded by UIC institutional funds. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained: Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: IRB of University of Illinois at Chicago gave ethical approval for this work under protocol numbers listed in this manuscript. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable: Yes. Requests for data related to this study are subject to IRB approval and review by the UIC Office of Research.