Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing
- Sergey Aganezov¹
- Sara Goodwin²
- Rachel M. Sherman¹
- Fritz J. Sedlazeck³
- Gayatri Arun²
- Sonam Bhatia²
- Isac Lee⁴
- Melanie Kirsche¹
- Robert Wappel²
- Melissa Kramer²
- Karen Kostroff⁵
- David L. Spector²
- Winston Timp⁴
- W. Richard McCombie²
- Michael C. Schatz¹,²,⁶
- ¹Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21211, USA
- ²Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- ³Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- ⁴Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21211, USA
- ⁵Northwell Health, Lake Success, New York 11042, USA
- ⁶Department of Biology, Johns Hopkins University, Baltimore, Maryland 21211, USA
Abstract
Improved identification of structural variants (SVs) in cancer can lead to more targeted and effective treatment options as well as advance our basic understanding of the disease and its progression. We performed whole-genome sequencing of the SKBR3 breast cancer cell line and patient-derived tumor and normal organoids from two breast cancer patients using Illumina/10x Genomics, Pacific Biosciences (PacBio), and Oxford Nanopore Technologies (ONT) sequencing. We then inferred SVs and large-scale allele-specific copy number variants (CNVs) using an ensemble of methods. Our findings show that long-read sequencing allows for substantially more accurate and sensitive SV detection, with between 90% and 95% of variants supported by each long-read technology also supported by the other. We also report high accuracy for long reads even at relatively low coverage (25×–30×). Furthermore, we integrated SV and CNV data into a unifying karyotype-graph structure to present a more accurate representation of the mutated cancer genomes. We find hundreds of variants within known cancer-related genes detectable only through long-read sequencing. These findings highlight the need for long-read sequencing of cancer genomes for the precise analysis of their genetic instability.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.260497.119.
- Received December 22, 2019.
- Accepted August 7, 2020.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.