PT - JOURNAL ARTICLE
AU - Paulin, Luis F
AU - Fan, Jeremy
AU - O’Neill, Kieran
AU - Pleasance, Erin
AU - Porter, Vanessa L.
AU - Jones, Steven J.M
AU - Sedlazeck, Fritz J.
TI - The benefit of a complete reference genome for cancer structural variant analysis
AID - 10.1101/2024.03.15.24304369
DP - 2024 Jan 01
TA - medRxiv
PG - 2024.03.15.24304369
4099 - http://medrxiv.org/content/early/2024/03/18/2024.03.15.24304369.short
4100 - http://medrxiv.org/content/early/2024/03/18/2024.03.15.24304369.full
AB - The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remains challenging.
We hypothesized that use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumour/normal matched benchmark sample and two patient samples show that the CHM13-T2T improves SV detection and prioritization accuracy compared to GRCh38, with a notable reduction in false positive calls. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, therefore combining both improved alignment and advanced annotations.
In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identify 49 somatic SVs, which appear to be stable as they are consistently present across the four replicates.
As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. The benchmark is available at: 10.5281/zenodo.10819636. Our work demonstrates new approaches to optimize somatic SV prioritization in cancer with potential improvements in other genetic diseases.
Competing Interest Statement
The following authors disclose relevant potential competing interests: Kieran O’Neill, Vanessa Porter, Luis F Paulin and Steven J.M. Jones received travel funding from Oxford Nanopore Technologies to present at conferences in 2022 and/or 2023. Fritz J Sedlazeck receives research support from ONT, PacBio, Illumina and Genentech. Luis F Paulin received research support from Genentech from 2021 to 2023.
Funding Statement
This study was in part supported by funding from the Canada Research Chairs Program, Terry Fox Research Institute Marathon of Hope and the British Columbia Cancer Foundation. FJS and LFP were supported by NIH (UM1DA058229, 1UG3NS132105-01, 1U01HG011758-01). This study was conducted with the financial support of The Terry Fox Research Institute and the Terry Fox Foundation. The views expressed in the publication are the views of the authors and do not necessarily reflect those of the Terry Fox Research Institute or the Terry Fox Foundation.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Cancer samples were derived from the Personal Oncogenomics (POG) program (Pleasance et al. 2020), clinical trial NCT02155621, approved by and conducted under the University of British Columbia BC Cancer Research Ethics Board (H1200137, H1400681).
Samples are deidentified.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
Supplementary Table 1 summarizes the samples used in this study and includes links for accessing the data. The new COLO829/COLO829BL proposed benchmark and merged VCF files that include the eight tumor/normal samples can be found at 10.5281/zenodo.10819636. Nutty, a Sniffles2 companion app for parsing the VCF, was used (https://github.com/lfpaulin/nutty); it contains commands to reproduce the COLO829 and POG analysis.