Abstract
Diffuse large B-cell lymphoma (DLBCL) is characterised by pronounced genetic and biological heterogeneity. Several partially overlapping classification systems exist – developed from mutation, rearrangement or gene expression data. We apply a customised network analysis to nearly five thousand DLBCL cases to identify and quantify modules indicative of tumour biology. We demonstrate that network-level patterns of gene co-expression can enhance the separation of DLBCL cases. This allows the resolution of communities of related cases which correlate with genetic mutation and rearrangement status, supporting and extending existing concepts of disease biology and delivering insight into relationships between differentiation state, genetic subtypes, rearrangement status and response to therapeutic intervention. We demonstrate how the resulting fine-grained resolution of expression states is critical to accurately identify potential responses to treatment.
Significance statement
We demonstrate how data integration and network analysis of gene expression can enhance the segregation of diffuse large B-cell lymphoma, resolving patterns of disease biology and showing how the resolution of heterogeneity can improve the understanding of treatment response.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by Cancer Research UK program grants (C7845/A17723 and C7845/A29212) (M.C., G.D., D.W., and R.T.). HMRN is supported by Cancer Research UK program grant A29685 (D.P., S.C., A.S., E.R.). D.J.H. was supported by a fellowship from Cancer Research UK (CRUK) (RCCFEL\100072) and received core funding from Wellcome (203151/Z/16/Z) to the Wellcome-MRC Cambridge Stem Cell Institute and from the CRUK Cambridge Centre (A25117). D.J.H. is supported by the National Institute for Health and Care Research (NIHR) Cambridge Biomedical Research Centre (BRC-1215-20014). R.T., G.D., E.R., A.S., D.P., and D.W. are supported by the National Institute for Health and Care Research Leeds Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
https://www.ncbi.nlm.nih.gov/geo/GSE4475, GSE4732, GSE10846, GSE12195, GSE19246_FF, GSE22470, GSE31312, GSE32918, GSE34171, GSE53786, GSE87371, GSE9858, GSE181063, GSE117566; PubMed ID 15550490; PubMed ID 29641966 (NCICCR-DLBCL); https://ega-archive.org/EGAS00001002606
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Lead contact: Reuben Tooze, Wellcome Trust Brenner Building, Leeds Institute of Medical Research, University of Leeds, Leeds, LS9 7TF, UK, tel: (44)-113-3438639, E-mail: r.tooze{at}leeds.ac.uk
The manuscript has been revised as follows. 1) The correlation between expression patterns in modules and neighbourhoods of the network and driver gene mutation state is now addressed in new Figure 3 and new Supplemental Figure 4, with associated text discussing these results (lines 180-238 of the revised draft). 2) We address in detail how selection of features from the network affects clustering of cases into lymphoma communities. New Figure 4A shows that use of network information can enhance the selection of attributes for consistent clustering of DLBCL cases. The approach is discussed in the results section (lines 240-261), and details are provided in new supplemental methods. 3) The new clustering approach developed in point 2 largely confirms the clustering results used in the original manuscript draft, but there are subtle differences. These result in downstream changes to the analyses of lymphoma communities in individual data sets: new Figure 4B and C, new Figures 5 and 6, and Supplemental Figures 5, 6, 7 and 8. There are associated revisions to text and nomenclature in lines 263-383. 4) In a new addition to the manuscript, the lymphoma community structure is used to analyse outcome data from the REMoDL-B study. These results are shown in new Figures 7 and 8 and new Supplemental Figures 9 and 10, with corresponding new results sections in lines 385-412. 5) The discussion section has been changed, for example in lines 450-465 and lines 481-497. 6) An updated web resource is provided for the new version of the manuscript; the old version is retained to allow comparisons.
Data Availability
All data produced are available at https://mcare.link/DLBCL2