Abstract
Copy number variants (CNVs) are significant contributors to rare genetic diseases and, with new methods, can now be reliably identified from exome sequencing. Challenges remain in the accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing, collected at the Broad Institute Center for Mendelian Genomics of the GREGoR consortium. Each family’s CNV data were analyzed using the seqr platform, and candidate CNVs were classified using the 2020 ACMG/ClinGen CNV interpretation standards. We developed additional evidence criteria to address situations not covered by the current standards. The addition of CNV calling to exome analysis identified causal CNVs for 173 families (2.6%). The estimated sizes of the CNVs ranged from 293 bp to 80 Mb, and an estimated 44% would not have been detected by standard chromosomal microarrays. The causal CNVs consisted of 141 deletions, 15 duplications, 4 suspected complex structural variants (SVs), 3 insertions and 10 complex SVs, the latter two groups being identified by orthogonal validation methods. We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework for assessing the pathogenicity of CNVs.
Competing Interest Statement
H.L.R. has received funding from Illumina and Microsoft to support rare disease gene discovery and diagnosis. A.O-D.L. has consulted for Tome Biosciences and Ono Pharma USA Inc. D.G.M. is a paid advisor to GlaxoSmithKline, Insitro, Variant Bio and Overtone Therapeutics, and has received research support from AbbVie, Astellas, Biogen, BioMarin, Eisai, Google, Merck, Microsoft, Pfizer, and Sanofi-Genzyme. C.A.W. is a paid advisor to Maze Therapeutics. M.E.T. receives research funding from Microsoft Inc, Illumina Inc and Levo Therapeutics. The remaining authors declare no competing interests.
Funding Statement
Sequencing and analysis were provided by the Broad CMG, funded by the National Human Genome Research Institute grants UM1HG008900, U01HG011755 and R01HG009141.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The ethics committee of Massachusetts General Brigham gave ethical approval for this work. Families recruited and sequenced through this project are enrolled in research studies with local institutional review board (IRB) approval, including approval for sharing de-identified samples for sequencing and analysis (MassGeneralBrigham 2013P001477).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript. The CNVs that were interpreted as causal in this cohort were submitted to ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) (submitter ID 506627, Broad Rare Disease Group). The ClinVar accession number of each CNV is listed in Table S1.