Abstract
Pancreatic cancer (PC) is associated with high mortality. Recent literature has investigated long noncoding RNAs (lncRNAs) in several cancers, but studies of their functions in PC are lacking. To identify lncRNAs with significantly altered expression in PC, I extracted RNA-sequencing (RNA-seq) transcriptomic profiles of pancreatic carcinomas from The Cancer Genome Atlas (TCGA) and performed differential gene expression analysis. Out of 60,660 gene transcripts shared across 151 PC patients, I identified 38 lncRNAs that were significantly differentially expressed. To further investigate the functions of these genes, gene set enrichment analysis (GSEA) was performed on the lncRNA panel. GSEA revealed enrichment of several terms implicated in proliferation. To assess the contribution of these lncRNAs to metastatic progression, I applied four machine learning (ML) algorithms: logistic regression (LR), support vector machine (SVM), random forest classifier (RFC) and eXtreme Gradient Boosting Classifier (XGBC). Restricting the features to the significantly differentially expressed lncRNA genes, tuning hyperparameters, and reducing class-imbalance bias through the synthetic minority oversampling technique improved the accuracy of the ML models. Of the four algorithms, SVM and RFC each predicted metastatic progression with 76% accuracy. To the best of my knowledge, this is the first study to identify this lncRNA panel as differentiating between nonmetastatic PC and metastatic PC, and the panel includes many lncRNAs not previously linked to PC. The ML accuracy scores suggest that the detected lncRNAs are involved in metastatic progression. Based on these findings, I suggest further in vitro and in vivo investigation of this gene panel, as these lncRNAs could be targeted for improved outcomes in PC patients and could assist in diagnosing metastatic progression from RNA-seq data of primary pancreatic tumors.
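The synthetic minority oversampling technique mentioned above can be illustrated with a minimal, standard-library Python sketch. This is not the study's implementation: the abstract does not name the software used, and the function names `smote` and `balance`, the neighbor count `k`, and the fixed seed are assumptions made only for illustration. The sketch creates each synthetic minority sample (for example, a metastatic case) by interpolating between an existing minority sample and one of its nearest minority neighbors.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    Hypothetical helper for illustration; not taken from the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)
```

In a classification workflow, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the data used to report accuracy.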
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
All data used in this study were acquired from The Cancer Genome Atlas (TCGA), available from https://portal.gdc.cancer.gov/projects/TCGA-PAAD. The search algorithm for retrieving the data can be provided on request. Only open-access data were selected; controlled-access data were excluded.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes