RT Journal Article
SR Electronic
T1 A Machine Learning Framework for Cancer Prognostics: Integrating Temporal and Immune Gene Dynamics via ARIMA-CNN
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.12.09.24318717
DO 10.1101/2024.12.09.24318717
A1 Lin, Rui-Bin
A1 Zhou, Linlin
A1 Lin, Yu-Chun
A1 Yu, Yu
A1 Yang, Hung-Chih
A1 Yu, Chen-Wei
YR 2024
UL http://medrxiv.org/content/early/2024/12/10/2024.12.09.24318717.abstract
AB Hepatocellular carcinoma (HCC) poses a significant global health challenge due to its high incidence and mortality rates. Our study investigates the prognostic significance of chemokine (C-C motif) ligand 5 (CCL5) and various immune gene signatures in HCC using an innovative combination of Autoregressive Integrated Moving Average (ARIMA) and Convolutional Neural Network (CNN) models. Time series data were utilized to apply an ARIMA model that captures the temporal dynamics of CCL5 expression. This model’s residual was integrated with immune signature expression data, including lymphocytes and macrophages, to extract features using a CNN model. Our study demonstrates that CNN-extracted features yield a statistically more robust association with patient survival compared to the traditional median split method, which primarily focuses on single-gene analysis. Specifically, CNN-extracted features from CD8 T cells and effector T cells resulted in a hazard ratio (HR) of 0.7324 (p = 0.0008) with log-rank p-value (0.0131), underscoring their pivotal role in the anti-tumor immune response. This methodology highlights the superior prognostic value obtained through integrated multi-gene analyses, providing deeper insights into tumor-immune interactions than conventional single-gene approaches. Moreover, clustering immune genes based on non-parametric correlations unveiled distinct survival patterns.
A cluster comprising B cells, Th2 cells, T cells, and NK cells exhibited a moderate protective effect (HR: 0.8714, p = 0.1093) alongside a significant log-rank p-value (0.0233). However, the cluster, including granulocytes, Tregs, macrophages, and myeloid-derived suppressor cells, showed no significant survival association, highlighting the intricate immune regulation within the tumor microenvironment. These findings emphasize the necessity of incorporating temporal dynamics and synergistic immune gene interactions for more accurate prognostic evaluations. Our integrated ARIMA-CNN framework represents a significant advancement, leveraging both linear and nonlinear modeling to uncover the dynamic influence of multiple immune genes. This framework holds excellent potential for identifying robust biomarkers and personalizing immunotherapy strategies, ultimately paving the way for innovative cancer management solutions.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study was funded by NSTC Taiwan.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: TCGA. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
Data Availability: All data produced in the present study are available upon reasonable request to the authors.