RT Journal Article
SR Electronic
T1 Repurposing digitised clinical narratives to discover prognostic factors and predict survival in patients with advanced cancer
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.10.28.20214627
DO 10.1101/2020.10.28.20214627
A1 Frank PY Lin
A1 Osama SM Salih
A1 Nina Scott
A1 Michael B Jameson
A1 Richard J Epstein
YR 2020
UL http://medrxiv.org/content/early/2020/10/30/2020.10.28.20214627.abstract
AB Electronic medical records (EMR) represent a rich informatics resource that remains largely unexploited for improving healthcare outcomes. Here we report a systematic text mining analysis of EMR correspondence for 4791 cancer patients treated between 2001 and 2017. Meaningful groups of text descriptors correlating with poor survival outcomes were systematically identified, and applying machine learning analysis to clinical text accurately predicted cancer patient survival at selected timepoints up to 12 months. In a validation cohort of 726 patients, inclusion of EMR descriptors to machine learning models outperformed the predictivity of conventional clinical symptom scores by 4.9% (p = 0.001). These results prove that labour-intensive EMR data collection can be repurposed to add clinical value. Extension of this approach to a broader spectrum of digital health data should transform the real-time utility of such latent informatics resources, enabling healthcare systems to be more adaptive and responsive to patient circumstances. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: FPL was supported by Shine Translational Fellowship 2016, Garvan Institute of Medical Research. This project was partly supported by a research project grant of Waikato Research Foundation 2018.
FPL and RE acknowledge the support from the Wolf family. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study was approved by Northern Health & Disability Ethics Committee, New Zealand (#16/STH/251). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The original data set is not available, with the exception of summarised and non-reidentifiable datasets supplied as supplementary text. Computer code associated with this manuscript is released as open-source software and is freely available.