Abstract
Whole genome sequencing (WGS) using tissue and matched blood samples from cancer patients is coming within reach as the most complete genetic tumor diagnostic test. With a trend towards the availability of only small biopsies, and at the same time the need to screen for an increasing number of (complex) biomarkers, the use of a single all-inclusive test is preferred over multiple consecutive assays. To meet the high-quality diagnostics standards, we have optimized and validated the performance of a clinical grade WGS workflow, resulting in a technical success rate of 95.6% for samples with sufficient (≥20%) tumor cell percentage.
Independent validation of identified biomarkers against commonly used diagnostic assays showed a high sensitivity (98.5%) and specificity (98.4%) for detection of somatic SNV and indels, and high concordance (93.3%) for gene amplification detection. Gene fusion analysis showed a concordance of 91.3% between DNA-based WGS and an orthogonal RNA-based gene fusion assay.
Microsatellite (in)stability assessment showed a sensitivity of 100% with a specificity of 97%, and high-risk human papillomavirus detection showed an accuracy of 95.8% compared to standard pathological tests.
In conclusion, whole genome sequencing has a >95% sensitivity and specificity compared to routinely used DNA techniques in diagnostics and all relevant mutation types can be detected reliably in a single assay.
Introduction
Needs and complexity in molecular cancer diagnostics are rapidly increasing, driven by a growing number of targeted drugs and developments towards more personalized treatments 1,2. Simultaneously, advances in next-generation DNA sequencing technology have greatly enhanced the capability of cancer genome analyses, thereby rapidly progressing diagnostic approaches from small targeted panels to large panels and exome sequencing. Currently, whole genome sequencing (WGS) using tissue and matched blood samples from patients with (metastatic) cancer 3 is coming within reach as the most complete genetic tumor diagnostics test. In the context of the Dutch national CPCT-02 clinical study (NCT01855477) Hartwig Medical Foundation has established a national WGS facility including robust sampling procedures and logistics in more than 45 (of the 87) hospitals located across the Netherlands for the centralized analysis of tumor biopsies by WGS. Since the start in 2016, more than 5,000 tumors and matched control samples have been analyzed by WGS, of which the first cohort of 2,500 patients has been extensively characterized and described 4. Originally, this clinical study aimed to analyse data for biomarker discovery, but with growing clinical demands for more extensive and broader DNA analysis for patient stratification towards targeted treatments 5, the scope of WGS is now entering routine diagnostic usage. As part of this development, both the amount of tumor tissue required for the WGS procedure and its turn-around-time were decreased, together with implementation of more extensive quality control metrics and independent validation required for accreditation. Currently, there is an ongoing trend towards the availability of only small biopsies, especially for advanced stage cancer where metastatic lesions are sampled using core needle biopsies, with at the same time a growing need to screen for an increasing number of (complex) biomarkers.
For future-proof and efficient molecular diagnostics, the use of a single all-inclusive test is preferred over multiple consecutive assays that, together, often take more time, require more tissue and provide a far less complete profile of the molecular characteristics. For complex molecular diagnostic indications (e.g. non-small cell lung cancer) the expected cost of WGS is now in the same order as the combined multiple individual tests, especially when also taking into account (technical) personnel cost and costs of maintaining and updating multiple test setups 6,7.
To meet the high-quality diagnostics standards, we have optimized and clinically validated the performance of the WGS workflow, both technically and bioinformatically, as these are highly interconnected in determining the specificity and sensitivity of the test. The validation efforts include current standard-of-care biomarkers (oncogenic hotspots, inactivating mutations in tumor suppressor genes), but also broader analyses of gene fusions and other genomic rearrangements as well as emerging genome-wide or complex biomarkers like tumor mutational burden estimation, microsatellite instability (MSI) 8, and homologous recombination deficiency (HRD) signatures 9,10. Importantly, an open-source and data-driven filtering and reporting strategy has been put into place to reduce the wealth of information into a diagnostically manageable size and to provide an overview of all clinically relevant DNA aberrations.
Here we show that WGS has an overall >95% sensitivity and specificity as compared to other targeted detection techniques that are routinely used in cancer diagnostics and that all relevant mutation types can be readily and reliably detected in a single assay. Although WGS requires minimal quantity of input material and can be applied pan-cancer, the tumor purity can be a limiting factor below 20% tumor cells as well as the availability of fresh frozen tumor material, which is a prerequisite for high-quality results as described here. Together, WGS has now matured from a research technology into an ISO accredited test that is ready to be used for clinical decision making in routine cancer care.
Methods
Patient selection
For this study, samples were used from patients that were included as part of the CPCT-02 (NCT01855477), DRUP (NCT02925234) and WIDE (NL68609.031.18) clinical studies, which were approved by the medical ethical committees (METC) of the University Medical Center Utrecht and the Netherlands Cancer Institute.
Whole Genome Sequencing
Whole Genome Sequencing (WGS) was performed under ISO-17025 accreditation at the Hartwig Medical Foundation laboratory (Amsterdam). The WGS test uses high quality DNA extracted from tumor tissue and blood samples. Input tissue type includes fresh-frozen or frozen archived samples from solid metastatic or primary tumor samples. In addition to tissue samples, frozen cell pellets from pleural fluid samples and ascites can be used.
DNA extraction is performed on the QiaSymphony following standard reagents and protocols. 50-200 ng gDNA is fragmented by sonication on the Covaris LE220 Focused ultrasonicator (median fragment size 450 bp) for NGS Truseq nano library preparation including PCR amplification (8 cycles). All procedures are automated on the Beckman Coulter Biomek4000 and i7 liquid handling robots. The Illumina® HiSeqX and NovaSeq6000 platforms are used for sequencing >90x and >30x average read coverage of tumor and normal genomes, respectively. To improve cost effectiveness, shallow whole-genome sequencing (8-15x coverage depth) has been used to estimate the tumor purity of the received tumor sample, before continuing to “deep” sequencing (90 – 100x in total) in case of sufficient tumor cell content (≥20%).
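The purity-based triage described above can be sketched in a few lines. This is only an illustration of the stated ≥20% rule, not the laboratory's actual software; the function name and the sample values are hypothetical.

```python
# Minimal sketch of the shallow-sequencing triage step.
# The 20% threshold comes from the text; everything else is illustrative.
MIN_TUMOR_PURITY = 0.20


def proceed_to_deep_sequencing(shallow_purity: float) -> bool:
    """Return True when the purity estimated from shallow WGS is sufficient."""
    if not 0.0 <= shallow_purity <= 1.0:
        raise ValueError("purity must be a fraction between 0 and 1")
    return shallow_purity >= MIN_TUMOR_PURITY


# Hypothetical purity estimates from shallow (8-15x) sequencing.
samples = {"sample_A": 0.45, "sample_B": 0.12, "sample_C": 0.20}
selected = [name for name, purity in samples.items()
            if proceed_to_deep_sequencing(purity)]
print(selected)
```

Only samples that pass this check would continue to deep sequencing, which is where the cost saving comes from.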
Sequencing data is analyzed with an in-house developed bioinformatic open source software-based pipeline (https://github.com/hartwigmedical/). The pipeline is designed to detect all types of somatic alterations, including single and multiple nucleotide substitutions (SNV and MNV), small insertions and deletions (indels), copy number alterations (aneuploidies, amplifications and gene copy losses), genomic rearrangements, and structural variants (e.g. gene fusions, mobile element insertions) 11.
The blood sample is used to filter out the ubiquitously present germline polymorphisms to be able to report somatic variants only.
Additionally, genome-wide mutational characteristics are determined and reported including microsatellite instability (MSI), tumor mutational load (ML), mutational burden (TMB), and Homologous Recombination DNA repair-deficiency (HRD). Further, viral integrations in the tumor genome are detected and reported. All code and scripts used for analysis of the WGS data are available at GitHub (https://github.com/hartwigmedical/). The raw and analyzed WGS data used in this manuscript are available for validation and cancer research purposes through a standardized controlled access procedure (see https://www.hartwigmedicalfoundation.nl/applying-for-data/ for details).
Orthogonal validation tests
Independent validation was performed for all to-be-reported types of clinically relevant DNA aberrations, including mutations (SNV, MNV and indels) with specific focus on BRAF, gene copy number (ERBB2 as example), microsatellite (in)stability, gene fusions, and viral infection (HPV as example). WGS results were retrospectively compared against (as far as possible) routine diagnostic assays performed independently in ISO15189 accredited pathology laboratories. If a clinical assay was not available for the validation purpose, a custom research-use-only test was performed. The validation experiments described below were each performed independently. An overview of the tumor samples used per validation assay is available as Suppl Data 1.
Overview validation samples
Validation of SNV, MNV and indel detection
A custom (research-use-only) single molecule Molecular Inversion Probe (smMIP) sequencing panel was designed for independent confirmation of variants detected by WGS in an independent lab (Radboudumc). The smMIP panel was designed and processed similarly to previous reports 12,13. In total 415 smMIPs were designed to test 192 randomly selected variants (including driver and passenger variants) that were detected by WGS across 29 tumor samples. smMIP validation was performed using the same isolated DNA as was used for WGS, and SeqNext (JSI medical systems) was used for analysis.
Orthogonal clinical validation of variant detection was performed using 48 samples and compared against a custom-made Oncomine NGS gene-panel (ThermoFisher), processed independently (double blind) in a routine pathology laboratory under ISO15189 accreditation (Erasmus MC). The custom Oncomine assay covered 25.2 kb exonic regions across 40 genes (design (v5.1) available in supplementary data14) and was performed using the same isolated DNA as was used for WGS, thereby ruling out potential tumor heterogeneity. Analysis was done using SeqNext (JSI medical systems) and a formal clinical report was generated. Additionally, for 10 samples a comparison was made between the WGS based mutational load (ML) assessment and the Oncomine Tumor Mutational Load (TML) assay (Thermofisher).
Validation of copy number assessment
WGS based copy number assessment was validated against fluorescent in situ hybridization (FISH) using COLO829 and a cohort of diagnostic tumor samples. For COLO829, a comparison was made at the Amsterdam UMC for the ploidy of chromosomes 9, 13, 16 and 18, and of the 9p24 (CD274/PDCD1LG2) and 2p23 (ALK) loci. Chromosome Enumeration Probes (CEP) for the centromeric regions of chromosomes 9, 13, 16 and 18 (CEP9, CEP13, CEP16, CEP18) were used, as well as locus specific break-apart probes for 2p23 (ALK) fusion (Abbott Vysis) and 9p24 (CD274/PDCD1LG2) fusion (Leica Biosystems). Slides were visualized on a Leica DM5500 fluorescence microscope and for each marker, 100 cells/slide were scored for the percentages of cells with respective numbers of chromosomes (signals) counted.
Diagnostic ERBB2 copy number readout was validated using 16 tumor samples and using HER2/neu FISH analysis at an independent routine pathology laboratory under ISO15189 accreditation (University Medical Center Utrecht). FISH scoring was performed according to guidelines 15. New tumor sections (fresh-frozen) were used for probe hybridization (Cytocell LPS001), scanned using the Leica DM6000 scanner and analyzed with Cytovision software (Leica Biosystems). A formal clinical report was generated that was compared with the WGS results.
Validation of fusion gene detection
Validation of gene fusion detection by WGS was performed against RNA-based Anchored Multiplex PCR NGS assay (Archer FusionPlex Solid Tumor, ArcherDx). Twenty-four samples were selected based on the WGS results to include multiple fusion genes. Matching RNA (200 ng), isolated from the same tissue as the DNA that was used for WGS, was analyzed according to routine pathological procedures (ISO15189 certified) (Erasmus MC). A formal clinical report was generated and was compared with the WGS results.
Validation of microsatellite (in)stability readout
For a set of 48 tumor samples, the microsatellite status was validated using the MSI analysis system (Promega), performed at a routine pathology laboratory (Erasmus MC) 16. This fluorescent multiplex PCR assay analyzes five nearly monomorphic mononucleotide microsatellite loci (BAT-25, BAT-26, NR-21, NR-24, and MONO-27). Matching tumor and blood samples were analyzed for accurate detection. Both the number of positive loci and the binary classification as microsatellite instable (MSI) or stable (MSS) were reported.
Validation of tumor associated virus detection
WGS based detection of genomically integrated high-risk Human Papillomavirus (HPV) DNA was validated against routine pathological testing (Netherlands Cancer Institute) using the QIAscreen HPV PCR Test (Qiagen). If available, results of routine testing were used for comparison with WGS. If not, HPV status was determined retrospectively using an aliquot of the DNA (20 ng) that was used for WGS.
Results
Analytical performance and reproducibility of clinical-grade WGS
In addition to the orthogonal clinical validation experiments that are described in the next paragraphs, the analytical performance and consistency of our WGS setup is continuously monitored using a Genome-in-a-bottle (GIAB) mix-in sample (tumor 30% NA12878: normal 100% NA24385) for which all DNA aberrations are known. The accuracy of GIAB genome-wide variant detection (SNV and short indels) by WGS was very high and stable across different runs and using multiple sequencers (in a time period of eight months) with a precision of 0.998 (range 0.994-0.998) and a sensitivity of 0.989 (range 0.973-0.990) (Table 1). Most importantly, all F-scores for variant detection exceeded the pre-set 0.98 lower limit for high-quality sequencing data (median 0.993, range 0.985-0.994). WGS coverage analysis across a set of 25 randomly selected tumor samples indicated stable and high coverage across the entire genome (median coverage 106x, range 84-130) (Table 1).
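As a worked check of the numbers above, the F-score can be recomputed from the reported precision and sensitivity. The sketch assumes the F-score is the standard F1 measure (harmonic mean of precision and sensitivity), which the text does not state explicitly, and that combining the two summary values approximates the median F-score.

```python
# Assumption: "F-score" means F1, the harmonic mean of precision and sensitivity.
F_SCORE_LOWER_LIMIT = 0.98  # pre-set lower limit quoted in the text


def f_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (F1)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


f1 = f_score(precision=0.998, sensitivity=0.989)
print(round(f1, 3))               # 0.993
print(f1 >= F_SCORE_LOWER_LIMIT)  # True
```

The result agrees with the reported median F-score of 0.993 and lies above the 0.98 lower limit.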
Performance characteristics for clinical-grade WGS using Genome-in-a-bottle (GIAB) and tumor biopsy samples. The GIAB sample has been analyzed in duplicate runs using multiple sequencers and across a time period of eight months. Data from 25 randomly selected tumor samples were used for coverage performance and a bioinformatics reanalysis of another set of 18 tumor samples (selected across a period of six months) was used to determine the pipeline reproducibility.
Robustness and reproducibility of the bioinformatic data analysis pipeline was assessed by re-analysis of 18 samples (selected across a period of six months) starting from raw sequencer output files. Compared to the initial output, the reanalysis gave near identical results, with a percentage positive agreement (PPA) of 99.98% for SNVs, 99.96% for MNVs and 99.88% for indels (Table 1). The observed small differences are partially caused by random seeds used by the algorithms but mainly by (periodic) improvements in the bioinformatics pipeline. The reproducibility of the complete workflow was furthermore confirmed on two diagnostic cases (non-small cell lung cancer and an undifferentiated pleomorphic sarcoma) in which the replicated tests starting from independent biopsy/blood isolation provided highly similar molecular profiles with identical diagnostic reports (Figure 1).
Representation of all tumor specific DNA aberrations as detected using WGS. For each case the complete CIRCOS is shown as well as the reported genomics events, including the mutational burden and microsatellite readout. WGS is performed in duplicate (starting with DNA isolation) for 2 tumor samples (A, non-small cell lung cancer; B, undifferentiated pleomorphic sarcoma).
Sample quality and overall WGS success rate
Samples used for WGS analysis currently consist predominantly of fine needle biopsies taken from a metastatic lesion from patients with stage IV cancer. To determine whether WGS quality is dependent on the (primary) tumor type, a large-scale analysis was performed on samples that were processed as part of the Dutch CPCT-02 trial. Eighty-six percent of the analyzed samples (n=2,520) 4 passed all quality criteria, with the lowest success rates for kidney (72.3%), liver (77.3%), and lung (79.1%) cancer patients (Figure 2A). An insufficient amount of tumor cells (<20% based on WGS-derived tumor purity) was the most prevalent reason for failure: 6.4% of samples showed a tumor DNA purity between 5-20% and for 2.9% of the cases a seeming absence (<5%) of tumor DNA was observed despite prior pathological assessment. As a consequence of the restricted use of only fresh frozen biopsies as input material, insufficient sequencing data quality was only observed for 4.4% of the samples, indicating a high technical success rate of 95.6% for samples with sufficient (≥20%) tumor purity (Figure 2A). Of note, technical success rate has further increased to >98% in a currently ongoing prospective clinical study 17.
(A) WGS success rates for different primary tumor types. Success rates are shown for all samples and for samples that have sufficient tumor content. The average overall success rate across all tumor types is indicated by the vertical lines. (B) Global Imbalance Value G to T scores (GIVG>T) (n=2520). As a reference the GIVG>T score range is depicted for the 1000 Genomes Project (1000-GP) and a TCGA subset that are described previously 18. (C) Comparison of pathological tumor percentage scoring (pTCP) with sequencing based tumor DNA purity. (D) Comparison of tumor purity assessment using shallow sequencing (grey) (∼15x) and based on deep whole genome sequencing (black) (∼100x) (n=43).
Although the physical damage to the DNA is expected to be much lower for fresh-frozen samples as compared to formalin-fixed paraffin-embedded (FFPE) samples, we used the previously described Global Imbalance Value (GIV) score as a measure indicative for DNA damage 18. The analyzed set of 2,520 samples showed very low GIVG>T scores with a median of only 1.02 (range 0.495 - 2.495) indicating only 3 samples (0.11%) were considered as damaged samples with a GIV score >1.5 (Figure 2B). In comparison, 41% of the 1000 Genomes Project samples had a GIVG>T score of at least 1.5, while 73% of the TCGA samples showed a GIVG>T score >2 18.
For accurate determination of absolute tumor-specific allele frequencies and copy number status, it is crucial to correctly assess the tumor cell contribution to a sample (tumor purity). Traditionally, pathological tumor cell percentages (pTCP) are used as representation for tumor DNA purity. However, the tumor purity can be determined more accurately from WGS data by genome-wide determination of the ratio of normal and aberrant genomic segments or nucleotides (mTCP). While the pTCP scores show a modest but significant correlation with the tumor DNA purity for samples with higher tumor content (r=0.40, p=0.002), this association was absent for samples with lower (<30%) tumor purity (r=0.08, p=0.76) (Figure 2C). Instead of using pTCP, molecular tumor cell purities (mTCP) that are based on analysis of shallow sequencing data (∼8-15x average coverage) of the tumor’s genome were found to be a more reliable measurement. Validation of the mTCP assessments by shallow sequencing showed a very good correlation with the mTCP of deep WGS (∼90-110x) (R2 of 0.931, n=43, Figure 2D), with an average deviation between both purities of only 3.2% (range 0% to 35%, the maximum caused by an outlying non-small cell lung cancer case). These data confirm that shallow sequencing is sufficient for reliable initial tumor purity estimation and can be a valuable and cost-effective approach for upfront selection of suitable samples for deep whole genome sequencing.
Specificity and sensitivity of SNV, MNV and indel detection
Specificity of the variants detected by WGS was assessed by tailored single molecule Molecular Inversion Probe (smMIP) panel sequencing 12,13. Across 29 samples, 192 randomly selected variants were sequenced and analyzed by a custom designed smMIP panel (no reliable panel design was possible for 17.6% of the initially selected WGS variants). Nearly all (98.4%) of the variants were confirmed by smMIP sequencing, indicating a very high specificity of WGS for small variant detection. In addition, the observed variant allele frequencies showed a high correlation (R2=0.733) between both assays (Figure 3A).
(A) Variant allele frequencies (VAF) for SNV, MNV and short indel variants that are detected using WGS and confirmed by smMIP NGS panels sequencing. (B) Overview of all protein-changing mutations that are detected by WGS and or the custom-made Oncomine NGS assay. Mutations reported by both assays are marked in green, variants only reported by WGS in blue and only using the panel NGS assay in orange. For BRAF, also mutations detected by WGS but which are not included in the panel assay design are shown (in grey). For all other genes, only mutations included in the panel design are considered. (C) Comparison of WGS based mutational load (ML) readout with NGS panel based tumor mutational burden (TMB).
Orthogonal clinical validation of mutations in a specific oncogene, BRAF, was performed using 48 selected samples and compared against the custom-made Oncomine gene-panel NGS assay (ThermoFisher). Twenty-five samples showed a BRAF exon 15 or exon 11 mutation by WGS that were all confirmed by panel NGS (Figure 3B). Vice versa, 26 BRAF mutations that were detected using panel-based sequencing were also identified using WGS. A single BRAF p.Gly469Ala mutation identified by panel NGS was not confirmed using the WGS analysis due to low mutation frequency (∼2%). On the other hand, WGS identified two less common BRAF variants (p.Ala762Val and p.Pro403fs) that were not found by the custom Oncomine assay as the panel design does not include the corresponding exons. Both variants are unlikely to result in BRAF activation and are predicted passenger variants, especially because both tumors were MSI with a high TMB. All other 20 BRAF wild-type samples by WGS were confirmed by panel sequencing.
Next, all somatic non-synonymous mutations across the NGS panel design were evaluated (25.2 kb covering hotspot exons of 40 genes). Combined with the BRAF results, in total 139 mutations were detected by at least one of the tests of which 137 (98.6%) were reported by WGS and 133 using panel sequencing (Figure 3B) resulting in an overall 98.5% sensitivity for WGS compared to panel based NGS and 95.6% for panel compared to WGS. A PTEN p.Lys327Arg mutation that was identified using the panel, was not reported by the WGS test. Re-analysis of the WGS read data confirmed the presence of this variant at a low VAF (7% with a coverage of 8 out of 116 reads). On the contrary, the panel assay did not report a pathogenic PTEN variant (p.Tyr27Ser), which was identified by WGS (VAF of 12%) using the same input DNA. The variant was present in the NGS panel data (VAF 6%), but did not meet the criteria for clinical reporting. The panel also missed identification of the APC p.Thr1556fs inactivating mutation in three samples. This APC codon lies within a homopolymeric DNA region and the IonTorrent sequencing technology used for the panel sequencing is known to face more difficulties in repetitive DNA regions.
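The reported percentages follow from the three counts given above. The sketch below assumes that the number of mutations reported by both assays can be obtained by inclusion-exclusion from the totals (137 + 133 - 139); that intermediate count is not stated in the text, and the variable names are illustrative.

```python
# Counts taken from the text.
total_detected = 139   # mutations reported by at least one assay
wgs_reported = 137     # mutations reported by WGS
panel_reported = 133   # mutations reported by the panel NGS assay

# Assumption: overlap derived by inclusion-exclusion, not stated in the text.
reported_by_both = wgs_reported + panel_reported - total_detected

# Sensitivity of each assay, taking the other assay as the reference.
wgs_vs_panel = reported_by_both / panel_reported
panel_vs_wgs = reported_by_both / wgs_reported
wgs_fraction_of_all = wgs_reported / total_detected

print(reported_by_both)                       # 131
print(round(100 * wgs_vs_panel, 1))           # 98.5
print(round(100 * panel_vs_wgs, 1))           # 95.6
print(round(100 * wgs_fraction_of_all, 1))    # 98.6
```

Under this assumption, 2 mutations were reported only by the panel and 6 only by WGS, which reproduces the 98.5% and 95.6% figures.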
Although the performance of tumor mutational load (ML) estimation follows directly from the performance of non-synonymous variant calling (analytically, ML is simply a summation of the observed variants), the mutational burden readout was compared on 10 additional samples between WGS and the Oncomine Tumor Mutational Load (TML) assay (Thermofisher). Both readouts showed a high correlation (R2=0.94) but this was mainly caused by a single high ML sample (ML > 1200) (Figure 3C). Binary classification based on both tests (WGS based ML cutoff of 140 mut vs. TML based TMB cutoff of 10 mut/Mb) indicated a concordance for 7 out of 9 samples (1 sample was not evaluable by Oncomine TML), but also indicated a lower correlation in the cutoff region (R2=0.16 when excluding the 2 highest ML/TMB samples). This result illustrates the challenges of accurate mutational burden readout using a more limited gene panel as compared to exome or genome-wide measurements, as discussed elsewhere 19,20.
Taken together, the Oncomine and smMIP NGS validation results indicate both a high sensitivity (98.5%) and a high specificity (98.4%) of detection of SNV, MNV and indels using WGS and biopsies with ≥20% tumor purity, which is similar to commonly used panel-based approaches.
Copy number alterations
WGS-based chromosomal ploidy and local genomic copy number analytical performance were initially assessed by independent FISH analysis on 6 genomic locations of the COLO829 tumor cell line (centromeric region of chromosomes 9, 13, 16, and 18, and 2p23 ALK and 9p24 CD274/PDCD1LG2 (PD-L1/PD-L2) using diagnostic ‘break-apart’ probes). WGS and FISH analysis showed highly similar purity and ploidy calculations with Chr9 showing 4x in ∼55% of cells, Chr13 3x in ∼55%, Chr18 3x in ∼60%, the 2p23 locus 3x in 70-80%, and completely diploid Chr16 and 9p24 locus for all cells (Figure 4).
Comparison of COLO829 copy number analysis based on WGS and using FISH probes for copy number assessment of chromosomes 9, 13, 16 and 18, and for 9p24 (CD274/PDCD1LG2) and 2p23 (ALK). For both tests the copy number as well as the percentage of tumor cells is determined.
Further orthogonal clinical validation focussed on accurate detection of ERBB2 (Her2/neu) amplification. Sixteen samples from various tumor types were used (11 breast, 2 colorectal, 1 stomach, 1 bladder and 1 melanoma). Importantly, samples were representative of the full spectrum of ERBB2 amplifications, also including samples with only marginal amplification and samples with increased ploidy of the complete chromosome 17. New tissue sections from the same biopsy or a second biopsy obtained at the same moment as the samples used for WGS were analyzed by FISH at an independent routine pathology laboratory (Table 2). For one sample (#5) FISH analysis failed due to insufficient tumor cells (confirmed by immunohistochemistry). All other FISH results were considered representative. All samples with a WGS-based ERBB2 copy-number greater than 6x were confirmed by FISH to harbor substantial ERBB2 amplified signals (defined as ERBB2 >6). For ERBB2 WGS copy-numbers between 2-6, at best an ERBB2 gain was observed by FISH but considered insufficient for classification as ERBB2 amplified (classified as ERBB2 gain or equivocal). A borderline discordant ERBB2 status was observed for a single case (sample #16, FISH 2-4x in 83% compared to WGS 6x). No technical explanation could be identified, but this might be caused by tumor heterogeneity between the sections used for WGS and FISH. Of note, this specific case involved a colorectal tumor for which the FISH assay is not used in routine practice.
ERBB2 copy number analysis by WGS and FISH. ERBB2 FISH results were scored solely on tumor cells and categorized as; normal signals, 2-4 signals, 4-6 signals and more than 6 ERBB2 signals (according to guidelines 15). For WGS, the ERBB2 copy number as well as the median ploidy of the complete chr17 is shown.
The copy number validation data showed a high concordance (93%, 14 of the 15 cases) of WGS and FISH analysis indicating that WGS can reliably be used for detection of sufficiently high gene amplifications. For lower copy numbers (range 2 to 6) the concordance showed more variability but the question remains whether such low gains are biologically and/or clinically relevant 21.
Detection of fusion genes
Detection of gene fusions by WGS was compared with results obtained with an RNA-based Anchored Multiplex PCR NGS assay (Archer FusionPlex, ArcherDx) and was performed independently on 24 samples using matching DNA and RNA from the same biopsy. Samples were selected based on the WGS results to include one or more clinically relevant fusion genes. The Archer NGS assay confirmed the WGS findings for 21 of the 23 samples (91.3%), including fusion of ALK, NRG1 and ROS1 (Table 3). For one sample no comparison could be made, as the TMPRSS2-ERG fusion is not covered by the used Archer NGS assay.
Fusion genes detected by WGS and the Archer FusionPlex on matching DNA and RNA samples of 24 tumor biopsies.
An NTRK1 fusion detected by Archer NGS (MEF2D-NTRK1; 22 reads, 60% VAF) could not be identified using WGS, possibly due to a complex structural variation pattern involving multiple break-junctions in the intronic regions, which is more difficult to call from WGS data than from analysis of RNA. Vice versa, one fusion (SPAG17-ALK) detected by WGS showed no evidence in the tumor RNA. Although a viable in-frame fusion protein was predicted based on the fusion at DNA level, it can very well be that the corresponding RNA was expressed at low levels (e.g. due to temporal or spatial expression variation) that are insufficient for reliable detection by the Archer assay.
Quantification of microsatellite instability (MSI)
WGS-based MSI classification was validated independently using 48 selected samples including multiple tumor types (32 colorectal, 5 prostate, 3 esophagus, 2 pancreatic and 6 other) and using the routinely used 5-marker PCR MSI panel 16,22. Assessment of microsatellite (in)stability by WGS, defined as the number of small indels per million bases occurring in ≥5-mer homopolymers and in di-, tri- and tetranucleotide repeats 8, showed an average microsatellite instability (MSI) score of 1.11 with the vast majority of samples having a low score and a long tail towards higher MSI scores (range 0.004 to 93, n=2520, Figure 5A). In total, 2.7% of the samples were classified as MSI using a cutoff of 4 (the cutoff was based on the apparent bimodal distribution of the MSI scores). On the validation set (n=48) the sensitivity of WGS MSI classification was 100% (95%CI 82.6-100%) with a specificity of 97% (95%CI 88.2-96.9%) and a Cohen’s kappa score of 0.954 (95%CI 0.696-0.954). In addition to the binary MSI/MSS concordance, the MSI score correlated with the number of positive PCR markers, in which samples with only 1 or 2 positive PCR markers showed a marginal MSI score (Figure 5B). The only discordant result was from a lymphoma sample with a complex pathology showing 1/5 positive PCR markers (classified as MSS) but a WGS MSI score of 5.9 (classified MSI). IHC analysis showed no substantial loss of mismatch repair (MMR) proteins although WGS analysis indicated a somatic PMS2 p.Ile193Met variant in combination with a likely inactivating PMS2 structural variant. The p.Ile193Met mutation is classified with a high prior in the Leiden Open Variation Database (LOVD, https://databases.lovd.nl/shared/variants/PMS2) and thus likely represents a pathogenic variant. Neither the MSI PCR test nor the MMR IHC had been validated for use in lymphoma cases, so a definitive conclusion remained difficult.
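The MSI score definition and cutoff described above can be illustrated with a short sketch. It is not the pipeline's implementation: the function names, the indel count and the genome size are hypothetical, and treating a score exactly at the cutoff as MSI is an assumption.

```python
# Illustrative sketch only. The cutoff of 4 is from the text; whether the
# comparison is inclusive (>=) is an assumption.
MSI_CUTOFF = 4.0


def msi_score(repeat_indel_count: int, genome_size_bases: int) -> float:
    """Small indels in repeat contexts per million bases."""
    return repeat_indel_count / (genome_size_bases / 1_000_000)


def classify_msi(score: float) -> str:
    """Binary MSI/MSS call from the MSI score."""
    return "MSI" if score >= MSI_CUTOFF else "MSS"


# Hypothetical input: 17,700 qualifying indels on a 3,000 Mb genome.
score = msi_score(repeat_indel_count=17_700, genome_size_bases=3_000_000_000)
print(round(score, 1), classify_msi(score))  # 5.9 MSI
print(classify_msi(1.11))                    # MSS (cohort average score)
```

With these made-up inputs the score matches the 5.9 reported for the discordant lymphoma sample, which falls above the cutoff, while the cohort average of 1.11 falls well below it.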
WGS based microsatellite instability (MSI) quantification across a cohort of 2520 metastatic cancer samples (A), and compared to the 5-marker PCR based test using an independent set of 48 validation samples (B).
Tumor-genome integrated virus detection
Recently it has been shown that the presence of viruses can be detected with great accuracy using WGS 23. Assessment of the presence of integrated viral DNA was validated against standard routine pathological assessment, typically a PCR test. We focused on Human papillomavirus (HPV) due to its prevalence and clinical importance and the availability of HPV routine testing (e.g. QIAscreen HPV PCR assay, Qiagen). Twenty-four tumor samples (including 10 GI-tract, 6 female reproductive, 3 head-neck, 2 male reproductive and 3 other cancer types) were used for independent validation between WGS and PCR assay. The concordance of WGS and standard pathology was very high with an accuracy of 95.8%, a sensitivity of 90.9% (95%CI 67.6-90.9%) and a specificity of 100% (95%CI 80.3-100%). A Cohen’s kappa score of 0.915 (95%CI 0.48-0.92) indicates an ‘almost perfect agreement’, in which also the HPV high-risk types were concordant between both tests (Table 4).
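The agreement statistics above can be reproduced from a 2x2 table. The counts used below (10 concordant positives, 1 PCR-positive sample missed by WGS, 0 WGS-only positives, 13 concordant negatives) are not stated in the text; they are inferred from the 24 samples and the reported percentages, so treat them as an assumption.

```python
def agreement_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity, specificity and Cohen's kappa for a 2x2 table,
    with the PCR assay taken as the reference test."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the marginal totals of both tests.
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) \
        + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}


# Assumed counts, inferred from the reported percentages (n=24).
hpv = agreement_metrics(tp=10, fn=1, fp=0, tn=13)
print(round(100 * hpv["accuracy"], 1))     # 95.8
print(round(100 * hpv["sensitivity"], 1))  # 90.9
print(round(100 * hpv["specificity"], 1))  # 100.0
print(round(hpv["kappa"], 3))              # 0.915
```

These inferred counts are consistent with all four reported values, including the kappa of 0.915.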
Detection and typing of HPV in tumor biopsies using WGS and PCR analysis.
A single sample showed a discordant result: the PCR assay indicated HPV type 16, while no such evidence was found by WGS. A follow-up PCR test on the same DNA that was used for WGS analysis showed the same result, thereby ruling out sample heterogeneity. This result is most likely explained either by a non-integrated HPV infection, as the WGS analysis pipeline only considered viral DNA fragments that were integrated in the host genome (shared viral-human read pairs), or by integration into a non-sequenceable part of the genome.
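The reported HPV concordance figures can be reproduced from a 2x2 table, as sketched below. The counts are not given explicitly in the text. They are inferred here from the 24 samples, the single PCR-positive/WGS-negative discordance and the reported percentages, and should be read as an assumption. The function name is hypothetical.

```python
def concordance_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, accuracy and Cohen's kappa from a 2x2 table.

    The reference test (here PCR) defines true status; tp/fp/fn/tn describe
    the index test (here WGS) against it.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the marginal totals of both tests.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Inferred counts (assumption): 11 PCR-positive samples of which 10 were
# detected by WGS, and 13 PCR-negative samples that were all WGS-negative.
metrics = concordance_metrics(tp=10, fp=0, fn=1, tn=13)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.909, specificity: 1.000, accuracy: 0.958, kappa: 0.915
```

With these inferred counts the sketch returns the sensitivity, specificity, accuracy and kappa reported above, which supports the inferred table but does not replace the per-sample data in Table 4.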
Discussion
During the past few years, whole genome sequencing (WGS) and the associated bioinformatic data processing and interpretation have matured from a research-use-only tool to a diagnostic-level technology 24. Together with the clinical need to screen for an increasing number of (complex) biomarkers in an increasing number of tumor types (or even pan-cancer) 1,25 and the limited amounts of biopsy material available, the use of a single all-inclusive DNA test is a welcome development for efficient molecular diagnostics. While costs are currently still relatively high, sequencing technology continues to evolve and costs continue to decrease. Here we report on (retrospective) orthogonal validation efforts of WGS and show, to our knowledge for the first time, that the performance of WGS matches that of the range of routinely used diagnostic tests, with technical concordances of >95%. More specifically, we show that a single WGS-based tumor-normal test can replace separate tests for 1) actionable small variant (SNV, indel) driver mutations (previously detected by targeted PCR-based or NGS panel-based tests), 2) gene amplifications (FISH), 3) fusion genes (FISH or RNA panels), 4) microsatellite instability (amplicon fragment analysis), 5) HPV infection, and 6) tumor mutational load determination (NGS panels). Prospective clinical validation and integration into the routine workflow are currently being evaluated by a direct comparison of simultaneously obtained routine diagnostics and WGS-based test results 17. To make the WGS test suitable for diagnostic use, the turn-around-time has already been reduced to a clinically acceptable 10 working days.
The good performance of WGS for diagnostic use is primarily the result of two aspects of the workflow that are fundamentally different from most existing molecular diagnostics procedures for cancer: 1) the use of only fresh frozen tumor material, yielding consistently high-quality DNA and sequencing results, and 2) parallel processing of the patient’s fresh blood sample to serve as a control/baseline for the matching tumor sample. In this way, all germline variants can be subtracted automatically from the tumor data, allowing precise pinpointing of all tumor-specific changes. Even with a focus on a set of ∼500 cancer-related (driver) genes 4, the bulk of all missense variants observed in the tumor are in fact inherited germline polymorphisms without clinical significance, making comprehensive (manual) tumor-only interpretation and filtering a daunting task. This challenge is not unique to WGS but in principle also applies to all large NGS panels 26,27. Filtering out germline variants using population database information is challenging for various reasons (e.g. biases in such databases toward Caucasian populations, rare patient- or sub-population-specific variants). Although known driver mutations are readily detected by panel-based tests, tumor mutational load measurements are likely to be severely affected when the germline or somatic status of a variant cannot be discriminated accurately.
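The principle of subtracting germline variants using the matched blood sample can be illustrated with a simple set difference. This is a conceptual sketch only: production somatic callers typically model tumor and normal reads jointly rather than subtracting two call sets, and the `Variant` type, the function name and the example coordinates are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumor_calls: Iterable[Variant],
                       normal_calls: Iterable[Variant]) -> List[Variant]:
    """Keep tumor variants that are absent from the matched normal sample."""
    germline = set(normal_calls)
    return [v for v in tumor_calls if v not in germline]


# Hypothetical coordinates, for illustration only.
normal = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 2000, "C", "T")]
tumor = normal + [Variant("chr7", 3000, "T", "A")]

print(somatic_candidates(tumor, normal))
# Only the chr7 variant remains; both inherited variants are removed.
```

A tumor-only analysis would have to classify all three variants using external population databases, which is the difficulty described above.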
With the increase in (technical) sequencing capabilities, the bioinformatics part (‘dry-lab’) has become essential for good analysis and interpretation of sequencing data, both for WGS and for the emerging larger comprehensive panels. Traditionally, (hospital) laboratories have focused mostly on wet-lab performance and automation, but it has become clear that the downstream bioinformatics, and the ICT infrastructure to handle (and store) all data, pose the greatest challenge. Complex bioinformatics and high-end reporting tools are essential for understandable communication of the results to the (clinical) end-users. An example of our current WGS report, which is multilayered to serve the different end-users (oncologist, pathologist), is provided (Suppl Data 2). Currently, WGS still requires a tumor content that is somewhat higher than focused panel-based approaches (minimally 20% for WGS versus 5-10% for panel NGS). This limitation is caused by the lower sequencing depth of WGS, but with ongoing price reductions we anticipate that WGS with ∼250x coverage will be feasible in the coming years and will thus also be able to analyse samples with lower tumor content and to detect minor tumor subclones. A more challenging limitation is the need for fresh-frozen (or freshly lysed) samples for WGS analysis, as this will, for most hospitals, require an adaptation in the pathology laboratories, which are currently mostly FFPE-oriented. The adoption of WGS in a routine pathology workflow is currently being evaluated and optimized in a prospective clinical study 17.
Example WGS report
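The trade-off between tumor content and sequencing depth discussed above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a clonal variant on one copy in a diploid region of both tumor and normal cells. The function name and the 100x depth are illustrative assumptions, whereas the ∼250x coverage and the 10% and 20% tumor contents are the values mentioned in the text.

```python
def expected_alt_reads(depth: float, tumor_purity: float,
                       mutated_copies: int = 1,
                       tumor_copy_number: int = 2,
                       normal_copy_number: int = 2) -> float:
    """Expected number of variant-supporting reads for a clonal variant."""
    if not 0 < tumor_purity <= 1:
        raise ValueError("tumor_purity must be in (0, 1]")
    # Fraction of all DNA copies at the locus that carry the variant.
    total_copies = (tumor_purity * tumor_copy_number
                    + (1 - tumor_purity) * normal_copy_number)
    expected_vaf = tumor_purity * mutated_copies / total_copies
    return depth * expected_vaf


# Illustrative depth of 100x (assumption) at 20% and 10% tumor content,
# compared with the ~250x coverage anticipated in the text.
print(expected_alt_reads(100, 0.20))  # about 10 supporting reads
print(expected_alt_reads(100, 0.10))  # about 5 supporting reads
print(expected_alt_reads(250, 0.10))  # about 12.5 supporting reads
```

Under these assumptions, raising the depth to ∼250x gives a sample with 10% tumor content at least as many supporting reads as a 20% sample has at the lower depth, which is consistent with the expectation that deeper WGS will tolerate lower tumor content.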
DNA sequencing tests are often performed as laboratory-developed tests (LDTs), and the technical parameters, validation requirements and quality assurance are typically governed by national regulation and legislation, which can differ between countries. Various expert groups have drafted guidelines and recommendations for the standardization of multigene panel testing 2,28, and for our validation efforts we have followed the guidelines for setup and validation of (new) sequencing tests in ISO-accredited pathological laboratories in the Netherlands. However, with the ongoing approval of NGS panel assays by the FDA 29 and the upcoming new European Regulation for in-vitro diagnostic medical devices, IVDR (2017/746), in 2022 30, it is anticipated that (whole) genome sequencing tests will become regulated following international guidelines, standardization and quality schemes. Clinical validation by comparison with common standards, as described here, will be a key component of such regulations.
With the rapid development of more targeted drugs and their associated biomarkers, it is important, next to standardization of the (complex) test results, to be able to add new biomarkers/genes to the clinical reports efficiently and quickly (e.g. NRG1 and NTRK fusions and PIK3CA activating mutations). WGS allows such rapid and efficient co-development of (all) future diagnostic DNA markers, because it ‘only’ requires an update of the bioinformatics and reporting aspects, without the need for laborious and costly new test developments or adaptations of panel designs, including the required laboratory analytical validation experiments. In addition, the data from previously tested patients can, in principle and upon request from the treating physician, be reanalyzed for the presence of any new biomarkers, and recontacting of the patient can be considered 31.
Setting aside the direct impact WGS can have on routine clinical use and comprehensive screening for clinical study eligibility, a whole-genome view of the tumor will yield a wealth of valuable research data and provide the opportunity to increase our insight into oncogenic processes and to better explain or predict the response to targeted therapy or immunotherapy. Such a learning health-care system, in which we learn from today’s patients, will greatly enhance our understanding of this complex disease and facilitate the discovery of new (complex) biomarkers and targeted therapies, and improve treatment decision making for future patients.
Data Availability
The raw and analyzed WGS data used in this manuscript are available for validation and cancer research purposes through a standardized controlled access procedure (see https://www.hartwigmedicalfoundation.nl/applying-for-data/ for details).
Acknowledgements
The authors would like to thank Peggy Atmodimedjo, Isabelle Meijssen, Ronald van Marion and Hanna Schoep for (technical) assistance with collecting the data, Sandra van den Broek for data analysis support and Immy Riethorst for sample logistics. This publication and the underlying study have been made possible partly on the basis of the data that Hartwig Medical Foundation and the Center of Personalised Cancer Treatment (CPCT) have made available to the study.