PT - JOURNAL ARTICLE
AU - Arsalan Riaz
AU - Maryam Shah
AU - Saad Zaheer
AU - Abdus Salam
AU - Faisal F Khan
TI - Dynamic, stage-course protein interaction network using high power CpG sites in Head and Neck Squamous Cell Carcinoma
AID - 10.1101/2021.06.30.21259548
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.06.30.21259548
4099 - http://medrxiv.org/content/early/2021/07/05/2021.06.30.21259548.short
4100 - http://medrxiv.org/content/early/2021/07/05/2021.06.30.21259548.full
AB - Head and neck cancer is the sixth most common cancer worldwide and is significantly more prevalent in South Asian countries, including Pakistan. Predicting the pathological stage of a cancer can play a pivotal role in early diagnosis and personalized medicine. This project predicts the stages of head and neck squamous cell carcinoma (HNSCC) using prioritized DNA methylation patterns. DNA methylation profiles for each HNSCC stage (stages I-IV) were used to analyze 485,577 methylation CpG sites and prioritize them by predictive power, using a wrapper-based feature selection method together with different classification models. We identified 68 high-power methylation sites that predicted the pathological stage of HNSCC samples with 90.62% accuracy using a Random Forest classifier. We constructed a protein-protein interaction network for the proteins encoded by the 67 genes associated with these sites and studied its network topology. We also undertook enrichment analysis of the nodes in their immediate neighborhood for GO and KEGG Pathway annotations, which revealed their role in cancer-related pathways, cell differentiation, signal transduction, and metabolic and biosynthetic processes. With information on the predictive power of each of the 67 genes in each HNSCC stage, we unveil a dynamic stage-course network for HNSCC. 
We also intend to study these genes further in light of functional datasets from CRISPR, RNAi, and drug screens for their putative role in HNSCC initiation and progression.
Competing Interest Statement: The authors have declared no competing interest.
Clinical Trial: No clinical trial undertaken in this study.
Funding Statement: We are grateful to the Higher Education Commission (HEC) of Pakistan, which through its National Center for Big Data and Cloud Computing (NCBC) has provided funding and support for this project. We would also like to show our gratitude to CECOS University and Rehman Medical Institute (RMI) for being partners in this unique academia-industry complex at the Precision Medicine Lab (PML).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: No IRB approval required for this study. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
All the data is publicly available at The Cancer Genome Atlas (TCGA). Case IDs of randomly selected patients are available in the supporting information S3. 
The R code for building machine learning models is available at Arsalan_Riaz/Rcode_patterns.zip at https://github.com/PML-research/Arsalan_Riaz
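
The wrapper-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' R code and does not reproduce their pipeline. It is a greedy forward selection written in plain Python, under several assumptions made only for illustration: the data are synthetic beta values rather than TCGA profiles, a nearest-centroid classifier stands in for the Random Forest, and the names `make_synthetic`, `nearest_centroid_accuracy`, and `forward_select` are hypothetical. The idea it shows is the wrapper principle itself: candidate CpG sites are scored by the cross-validated accuracy of a classifier trained on them, not by a standalone statistic.

```python
import random
from statistics import mean


def make_synthetic(n=40, seed=0):
    """Synthetic stand-in for methylation beta values (NOT real TCGA data).

    Six hypothetical CpG sites per sample; only sites 1 and 4 carry stage
    signal, and neither separates all four stages on its own.
    """
    rng = random.Random(seed)
    site1_levels = [0.1, 0.1, 0.5, 0.9]  # cannot tell stage 0 from stage 1
    site4_levels = [0.1, 0.9, 0.5, 0.5]  # cannot tell stage 2 from stage 3
    X, y = [], []
    for i in range(n):
        stage = i % 4
        row = [rng.random() for _ in range(6)]  # uninformative sites
        row[1] = site1_levels[stage] + rng.uniform(-0.05, 0.05)
        row[4] = site4_levels[stage] + rng.uniform(-0.05, 0.05)
        X.append(row)
        y.append(stage)
    return X, y


def nearest_centroid_accuracy(X, y, features, folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier on a subset."""
    n = len(X)
    correct = 0
    for k in range(folds):
        train_idx = [i for i in range(n) if i % folds != k]
        test_idx = [i for i in range(n) if i % folds == k]
        # Class centroids are computed from the training rows only.
        centroids = {}
        for label in set(y[i] for i in train_idx):
            rows = [X[i] for i in train_idx if y[i] == label]
            centroids[label] = [mean(r[f] for r in rows) for f in features]
        for i in test_idx:
            point = [X[i][f] for f in features]
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
            )
            correct += pred == y[i]
    return correct / n


def forward_select(X, y, max_features=3):
    """Greedy wrapper: add the site that most improves accuracy, stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(nearest_centroid_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, feature = max(scored)
        if score <= best_score:
            break  # no candidate improves the current subset
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


X, y = make_synthetic()
selected, score = forward_select(X, y)
print("selected sites:", sorted(selected), "cv accuracy:", score)
```

Greedy forward selection is only one kind of wrapper, and the abstract does not say which search strategy the authors used. With 485,577 candidate sites, an exhaustive scan like this one would need a pre-filtering step or a cheaper search to be practical.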