Abstract
The progression of cancer—including the acquisition of therapeutic resistance and the fatal metastatic spread of therapy-resistant cell populations—is an evolutionary process that is challenging to monitor between sampling timepoints. Here we apply mutational signature analysis to clinically correlated cancer chronograms to detect and describe the shifting mutational processes caused by both endogenous (e.g. mutator mutation) and exogenous (e.g. therapeutic) factors between tumor sampling timepoints. In one patient, we find that cisplatin therapy can introduce mutations that increase the likelihood of genetic adaptation to subsequent targeted therapeutics. In another patient, we trace the emergence of known driver mutation CTNNB1 S37C to specific detection of defective mismatch repair associated mutational signature SBS3. Metastatic lineages were found to emerge from a single ancestral lineage arising during therapy—a finding that argues for the consideration of local consolidative therapy over other therapeutic approaches in EGFR-positive non-small cell lung cancer. Broadly, these results demonstrate the utility of phylogenetic analysis that incorporates clinical time course and mutational signature detection to inform clinical decision making and retrospective assessment of disease etiology.
Introduction
Cancer progression is an evolutionary process, enabling tumors to evade our best therapeutics and, ultimately, to recur and metastasize. The majority of cancer mortality occurs through the metastatic spread of therapy-resistant cell populations1. The evolution of therapeutic resistance in vivo remains enigmatic in part because it must be reconstructed from patient biopsies whose timing and location are dictated by patient care, rather than from design to empower discovery. This limitation prevents direct observation of cancer’s evolution between clinically-appropriate sampling timepoints and limits investigation of tumor evolution in response to medical intervention. By understanding the genetic evolution of clinically-important characteristics within these incidental sampling intervals, we can potentially gain insight into both the personal and the general etiology of disease progression.
First-line targeted therapies for epidermal growth factor receptor (EGFR)-positive Non-Small Cell Lung Cancer (NSCLC) offer excellent, initially sustained clinical response. Following initial treatment, patients are at risk of developing delayed metastatic disease2, which can be precipitated by the common EGFR T790M mutation. EGFR T790M arises within a tumor due to two distinct evolutionary processes: mutation, which leads to its appearance in single cells of the tumor, and selection, which increases its frequency in the tumor in response to treatment. Several in vitro experiments have suggested EGFR-inhibitor resistance arises due to strong selection during treatment3–5. However, determining the cause of the mutation and whether it arises in a single ancestral cell lineage before a metastatic cascade or separately among multiple metastatic lineages remains uncertain. Patient care seldom takes the form of a single, monolithic intervention. A better understanding could circumvent clinical trajectories that increase the frequency of resistance, or support pursuit of local consolidative treatment (LCT) over alternative approaches such as maintenance therapy6.
Here, we use techniques from evolutionary biology that reconstruct ancestral states of genetic mutations to infer relationships among clonal lineages and reveal temporal patterns of divergent events in metastatic progression7. We combine Bayesian inference of cancer chronograms with mutational signature analysis to deeply probe tumor evolution in two patients with metastatic lung adenocarcinoma, providing the first-in-human, clinically correlated tumor phylogenetic analysis of the metastatic cascade in EGFR-mutated NSCLC. Through examination of the phylogenetic topology and branchwise mutational signature analysis, we contextualize the effect of known endogenous and exogenous mutational processes within the evolution of individual patient metastatic cascades. Specifically, we investigate whether exogenous exposure to platinum therapy preceding erlotinib could precipitate T790M mutation. We then examine whether endogenous mutational processes could play a similar role introducing new mutations into a dynamically evolving tumor. Lastly, we test whether such mutations arising in therapy-associated tumor lineage bottlenecks lead to single lineages that form the basis of multiple therapy-resistant metastatic events. We use these results to evaluate the clinical relevance of shifts in endogenous and exogenous mutational processes and to motivate clinical consideration of potential opportunities for local consolidative treatment.
Materials and Methods
Tumor Sampling and Sequencing
The tissues assessed in this study were obtained from the Yale Pathology Archives based on Yale Human Investigation Committee at Yale University, Protocol no. 0304025173 to Dr. David L. Rimm, which enabled retrieval of tissue from archives that was consented or had been approved for use with waiver of consent. Sample collection and tumor sequencing were performed as described in Zhao et al8. Clinical information was retrospectively obtained for this investigation from electronic medical records after approval from the Yale University Human Investigation Committee (HIC 1508016314).
Variant Calling
Variant calling was performed using the protocol described in Zhao et al.8 using the same, but updated, SNP reference databases (accessed July 2019). Briefly this approach involves alignment of whole-exome capture Illumina HiSeq reads to the hg19 human reference genome using Eland. Somatic single-nucleotide variants are then provisionally called using a combination of GATK and freebayes, filtering out polymorphisms within the National Heart, Lung, and Blood Institute Exome Sequencing Project, 1000 Genomes, or Yale Human Exome Database. These somatic variants are then analyzed for false negatives using the multinomial approach also outlined in Zhao et al.8
Phylogenetic Analysis
Evolutionary relationships and divergence times were simultaneously estimated in BEAST v.2.5.2 9 assuming an uncorrelated model of molecular rates with a lognormal distribution (UCLN) and a coalescent exponential population growth branching process prior. Divergence times were calibrated using 1) a lower bound for the most recent common ancestor of the germline lineages at the patients’ year of birth, and 2) a non-contemporaneous tip age for primary tumor sequences at the time of biopsy and contemporaneous tip ages for all metastatic samples taken during autopsy. Each cancer chronogram was inferred by three independent runs lasting 5.0 × 108 iterations, sampling every 1,000 generations. Sufficient sampling of the posterior distribution for each parameter was evaluated via computation of effective sample size (ESS) values, with ESS values greater than 200 indicating adequate sampling of a target parameter. Independent runs were then assessed for convergence and appropriate levels of burn-in through visual inspection of the marginal posterior probabilities versus the generation state using Tracer v.1.6. Sampled posteriors from these three consistent executions were combined with TreeAnnotator v.2.4 to generate a maximum clade-credibility tree that summarized the posterior distribution of estimated evolutionary relationships and branch lengths.
To ensure the robustness of the tumor tree topology to the choice of alternate phylogenetic inference approaches, we validated the tree topology using maximum-likelihood in IQ-TREE10. We conducted 1000 bootstrap iterations of a GTR model on the alignment and found that the bootstrap support corresponded to the strength of topology indicated by the Bayesian distributions. Ancestral states of variant nucleotides were then reconstructed via maximum likelihood in FastML11 using the maximum clade-credibility tree under a GTR substitution model specifying a gamma distribution of rates discretized into eight categories. These ancestral states were found to be entirely consistent with those obtained by computing the ancestral states using the empirical-Bayes method in IQ-TREE and when taking the average probability determined by FastML across sampled posterior tree states.
Tracing of Mutational Processes Associated with COSMIC Mutational Signatures
Mutational signatures were assessed using deconstructSigs (v.1.9.0) 12, referencing only lung-associated endogenous and any known exogenous signatures (COSMIC Exome reference signatures, version 3, May 2019) and enforcing a standard minimum signature contribution threshold of 0.05 and ensuring cosine-similarity between called signatures does not exceed COSMIC’s own. The proportions of mutations attributable to these mutational signatures were traced along patient chronograms for each set of inferred ancestral-state variants along the branches of the phylogeny.
Code and Data Availability
Input files for phylogenetic analyses, as well as other code, intermediate files, and diagnostic images are available at https://github.com/Townsend-Lab-Yale/LUAD-PhyLCT. The sequencing reads have been made publicly available through the NCBI Sequence Read Archive, published under the BioProject accession PRJNA674368 on November 3rd, 2020.
Results
Platinum Therapy Preceding Erlotinib Therapy Precipitates T790M Mutation
We applied our Bayesian phylogenetic inference and mutational signature analysis to two patients. Patient 435 received cisplatin, a platinum-based chemotherapeutic, coupled with pemetrexed (Pe), shortly after primary tumor biopsy, then over a year later received the EGFR tyrosine kinase inhibitor erlotinib. Our phylogenetically informed mutational signature inference (Figure 1) illustrates that the known platinum-associated mutational signature SBS35 (light grey) is coincident with cisplatin therapy: there is no SBS35 signature attributed to the ancestor of the primary tumor and metastatic lineages, and the highest proportion of mutations attributable to this signature (19.1%) is observed in the inferred ancestor immediately subsequent to cisplatin exposure. The EGFR T790M resistance mutation is coincident with this peak proportion of cisplatin mutations, occurring along the same branch in the phylogeny via the nucleotide substitution c.2369C→T. Signature SBS35 is especially associated with C→T mutations, which constitute 20.3% of mutations associated with the signature. Preceding erlotinib therapy with cisplatin therapy presumably reduces the cancer cell population size, but simultaneously induces specific genetic heterogeneity that potentially includes initially undetectable low-frequency cancer cells with the T790M mutation. While signature SBS35 persisted until end-of-life, the cessation of cisplatin therapy led to serially smaller average proportions (≤12.4%) of mutations attributable to SBS35 across the entire metastatic clade, as mutations arising from other processes increased while mutations from SBS35 had already plateaued—as early as three years prior to sampling of the metastases at autopsy.
CTNNB1 S37C Mutation Follows Treatment with Bevacizumab and Precipitates Defective Mismatch-Repair
We conducted tumor-tree inference and mutational signature analysis on a second patient. This patient received first-line erlotinib therapy before disease progression at 1 year, prompting a switch to carboplatin, pemetrexed, and bevacizumab. Following the onset of bevacizumab treatment, a lineage diverged and gave rise to a highly vascular splenic metastatic mass and a distinct metastatic clade with an epithelial-mesenchymal transition- and angiogenesis-associated CTNNB1 S37C mutation. Less than a year before the time of death, lineages of this CTNNB1-mutant metastatic clade diverged genetically and colonized adrenal, lung, kidney, paratracheal lymph node, and liver tissues (Fig. 2). The known defective mismatch repair mutational signature SBS3 (light grey) followed acquisition of CTNNB1 S37C: there is no detectable SBS3 signature on any lineage lacking the mutation (Fig. 2), including the contemporaneous lesion in the highly vascularized spleen. Unlike the exogenously generated SBS35 signature in patient 435— which attenuated after discontinuation of cisplatin therapy (Fig. 1)—the endogenously generated SBS3 signature only becomes more prominent over time subsequent to the clonal growth of the CTNNB1 S37C lineage and its metastatic lineage diversification. Only 7.8% of mutations in the common ancestor of the non-splenic metastases were attributable to signature SBS3. In contrast, 17.8% and 16.9% of the attributable mutations in the liver lesion and the lineage leading to the other CTNNB1 S37C metastases were attributable to signature SBS3. This signature of DNA mismatch repair deficiency is specific to the metastatic lineages that arose subsequent to the appearance of the CTNNB1 S37C mutation.
Discussion
Here we have demonstrated that placing patient clinical treatment data and disease progression into the context of the molecular evolutionary history of tumor lineages provides clinically relevant insights. In one case, we showed how cisplatin-therapy induced a temporally localized burst of mutations, precipitating the rapid evolution of EGFR T790M resistance to erlotinib. In a second case, a CTNNB1 S37C mutation arose during the administration of Bevacizumab, preceding the rapid diversification of tumor lineages with a signature of DNA mismatch repair deficiency and leading to a clade of metastases. In both cases, metastatic lineages were found to emerge from a single ancestral lineage arising during therapy. Collectively, these results demonstrate the clinical relevance of shifts in mutational signatures that result from shifting endogenous and exogenous mutational processes, and support pursuit of local consolidative treatments in EGFR-driven non-small cell lung cancer. Where mutagenic chemotherapy and targeted systemic agents are initial first-line options, traditional chemotherapeutics in the first-line setting may decrease the time to progression associated with targeted agents due to the higher likelihood of extant chemotherapy-induced low-frequency escape mutations.
These two cases illustrate that endogenous and exogenous mutational processes provide the raw material for selective evolution enabling the resistance to therapy. The EGFR T790M mutation is known to convey resistance to erlotinib13,14, and our results suggest that CTNNB1 S37C is rescuing a tumor lineage from the anti-angiogenic selection pressure applied by therapy with bevacizumab 15–17. However, confirmation of this role necessitates estimation of the strength of selection for resistance or rescue mutations which requires application of cancer effect size estimation18 to distinct branches of tumor trees that are contemporaneous with clinical treatment across a larger cohort with samples of normal, primary, and recurrent or metastatic tumor tissue. Future clinical trials or other studies incorporating metachronous sampling that apply such approaches have great potential to illuminate the clinical implications of serial therapeutic treatments on progression.
It might be counterintuitive that a CTNNB1 mutation could be considered a rescue mutation from treatment with bevacizumab when treatment with bevacizumab is clinically indicated by the presence of CTNNB1. However, both bevacizumab and CTNNB1 operate—at least in part—by shifting the tumor along a phenotypic axis of angiogenesis via modulation of the VEGF pathway. CTNNB1 S37C is a putative gain-of-function mutation associated activation of the Wnt pathway, which is associated with loss of mismatch repair function19–21. Through its role in Wnt signaling, CTNNB1 also increases VEGF expression by binding its gene promoter22—which features seven confirmed consensus binding sites for the beta-catenin/TCF complex—stimulating angiogenesis23,24. Treatment with bevacizumab inhibits VEGF function. In the context of a tumor with CTNNB1 mutation, treatment with bevacizumab abrogates the increase in angiogenesis and consequent benefits to tumor growth and proliferation of CTNNB1 mutation. In the context of a CTNNB1 wildtype tumor, bevacizumab decreases angiogenesis, throttling the tumor, and CTNNB1 mutation rescues angiogenesis, benefitting tumor growth and proliferation. Our results lend support to the hypothesis that CTNNB1 S37C can act as a rescue mutation from VEGF-inhibitors.
Patient 459 exhibited no detectable SBS31 or SBS35 signature despite receiving a short end-of-life treatment with carboplatin. It is possible that the signatures of low-frequency mutations may not have been detected by constantly improving machine-learning oriented approaches to deconvolve mutational signatures within tumors25–31. However, the presence of SBS35 in all extant and inferred tumor samples subsequent to cisplatin therapy in patient 435 supports the consistency and specificity of signature extraction from tree branches, as does the absence of SBS3 on the early diverging spleen lineage in patient 459, which lacks mutations associated with defective homologous recombination repair. Instead, this lack of immediate detection following treatment is more likely attributable to the lag time before mutations that arise in single cells drift or are selected to clonal fixation32. Mutations require time to reach detectable prominence in the tumor via sustained replication and clonal growth.
The observation of a clade of lineages giving rise to metastases with a single common ancestor with the primary tumor implies a strong selective bottleneck associated with erlotinib therapy. Extant low-frequency mutations can form the basis of evolving resistance to targeted therapies, suggesting there is opportunity for local consolidative treatment (LCT) to prevent later metastatic disease. Compared to utilizing first-line treatment in isolation, multi-modality management (such as high-precision ablative radiotherapy or surgery) could alter the trajectory of disease and provide a survival benefit. Notably, recent randomized controlled phase-II trials have confirmed that patients with oligometastatic EGFR-positive NSCLC experience prolonged overall survival and progression-free survival with LCT 6,33,34. This prolonged survival suggests that early intervention with LCT to extinguish the reservoir of drug-tolerant cells, and prevent the emergence of resistant subclones, prior to divergence of a metastatic lineage within the primary tumor could result in significant therapeutic benefit. Anticipatory treatment of evolving cancers, such as is provided by early LCT, may improve clinical outcomes. Given the potential for radiation to induce heterogeneous mutations, this work also suggests that ablative radiation doses should be used as less-than-ablative doses may also have the unintended potential to accelerate potential tumor escape from targeted therapy.
The impact of therapeutic intervention and shifts in mutational processes driven by genic aberration can be deconvolved using mutational signature analyses that provide deeper insights into the tumor evolution than the same analyses devoid of phylogenetic information. These insights include several important clinical implications regarding the timing of DNA-damaging therapy and targeted cancer treatment for which acceleration of mutational heterogeneity may be detrimental. The presence of SBS35 subsequent to cisplatin therapy in patient 435 in particular has clinical implications regarding the sequencing of chemotherapeutics that increase mutational loads and targeted treatment. The same can be said for pre-systemic therapy or concurrent radiation treatment, if radiation doses are used that are unlikely to be ablative. Our study serves as a template for future evolutionary studies that seek to uncover the etiology of metastases that are essential steps to understanding general rules that shape the evolutionary trajectory of cancer. Developing an understanding of how dynamic mutational processes evolve and result in the diversification of metastatic lineages can inform clinical decision making, guide the design of treatment paradigms, and improve patient outcomes.
Data Availability
Input files for phylogenetic analyses, as well as other code, intermediate files, and diagnostic images are available at https://github.com/Townsend-Lab-Yale/LUAD-PhyLCT. The sequencing reads have been made publicly available through the NCBI Sequence Read Archive, published under the BioProject accession PRJNA674368 on November 3rd, 2020.
Author contributions
JNF and ARM contributed equally to this work. JPT and JY conceived of the study. DRM and JNC provided materials and helped with clinical data collection. ARM, SA, and JY reviewed, gathered, and organized patient clinical data. SGG performed initial bioinformatic analysis of sequence data. JNF and AD performed phylogenetic and cancer chronogram inference. JNF performed phylogenetic analyses of mutational signatures. All authors contributed to interpretation of results. JNF, ARM, JPT, and AD drafted the manuscript, with contributions from SGG and JY. All authors approved the final version of the manuscript.
Data and code availability
All data and code used in the analysis is referenced within the Methods and Supplementary Tables.
Conflicts of Interest
Sanjay Aneja reposts grants from the Agency for Healthcare Research and Quality, the American Cancer Society, the American Society of Clinical Oncology, the National Cancer Institute, and the National Science Foundation and personal fees for work as an associate editor from JCO Clinical Cancer Informatics. Joseph N. Contessa reports a research agreement with Spring Bank Pharmaceuticals. David Rimm reports his work as a consultant, advisor, or position on a Scientific Advisory Board for Amgen, Astra Zeneca, Agendia, Biocept, BMS, Cell Signaling Technology, Cepheid, Daiichi Sankyo, GSK, Merck, NanoString, Perkin Elmer, PAIGE, Sanofi, and Ultivue. He has received research funding from Astra Zeneca, Cepheid, Nanostring, Navigate/Novartis, NextCure, Lilly, Ultivue, and Perkin Elmer. James Yu reports onsulting and speaker fees from Augmenix / Boston Scientific. Jeffrey Townsend reports research funding from Gilead Sciences, Inc and consulting with Black Diamond, Agios, and Servier.
Acknowledgements
JPT and DR acknowledges Gilead Sciences, Inc, for funding supporting this research, as well as the Elihu endowment and the Notsew Orm Sands Foundation. JNF acknowledges NLM Grant 5T15LM007056-32 and NCI Grant 1F31CA257288-01. We also thank the Yale Center for Research Computing for their continued support.
Footnotes
Disclosures: JNF: None
ARM: None
AD: None
SGG: None
SA: Grants from the Agency for Healthcare Research and Quality, the American Cancer Society, the American Society of Clinical Oncology, the National Cancer Institute, and the National Science Foundation and personal fees for work as an associate editor from JCO Clinical Cancer Informatics.
JNC: Research agreement with Spring Bank Pharmaceuticals
DR: Consultant, advisor, or served on a Scientific Advisory Board for Amgen, Astra Zeneca, Agendia, Biocept, BMS, Cell Signaling Technology, Cepheid, Daiichi Sankyo, GSK, Merck, NanoString, Perkin Elmer, PAIGE, Sanofi, and Ultivue. He has received research funding from Astra Zeneca, Cepheid, Nanostring, Navigate/Novartis, NextCure, Lilly, Ultivue, and Perkin Elmer.
JBY: Consulting and speaker fees from Augmenix / Boston Scientific
JPT: Research funding from Gilead Sciences, Inc; Consulting with Black Diamond, Agios, and Servier.