Abstract
Copy number variations (CNVs) are a type of structural variant in which specific regions of DNA are deleted or duplicated, altering their copy number. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. Several methods for CNV detection are currently in use, from conventional cytogenetic analysis through microarray-based methods (aCGH) to next-generation sequencing (NGS). We present GenomeScreen, an NGS-based CNV detection method for low-coverage whole-genome sequencing. We determined the theoretical limits of its accuracy and confirmed them with an extensive in-silico study and with real patient samples with known genotypes. Theoretically, at least 6M uniquely mapped reads are required to detect a CNV of 100 kilobases (kb) or longer with high confidence (Z-score > 7). In practice, the in-silico analysis showed that at least 8M reads are required to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a mean resolution of 200 kb. GenomeScreen and aCGH both detected 59 deviations; GenomeScreen additionally detected 134 other, usually smaller, variations. The performance of the proposed GenomeScreen tool is comparable or superior to that of aCGH in accuracy, turnaround time, and cost-effectiveness, offering a reasonable benefit, particularly in a prenatal diagnosis setting.
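The order of magnitude of the theoretical read requirement can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' derivation; it assumes that read counts in a region follow a Poisson distribution, that reads are spread uniformly over an effective genome of 3.0 Gb, and that a heterozygous, non-mosaic CNV changes the expected count by 50%. The function names and default values are hypothetical and chosen only for illustration.

```python
import math


def expected_z(reads, cnv_length, genome_size=3.0e9, relative_change=0.5):
    """Approximate Z-score of a CNV under a simple Poisson model.

    Assumptions (not taken from the paper): uniform read placement,
    Poisson-distributed counts, and a heterozygous event that shifts
    the expected count by `relative_change` (50% by default).
    """
    # Expected number of reads falling into the region at normal copy number.
    lam = reads * cnv_length / genome_size
    # Shift in the count divided by the Poisson standard deviation sqrt(lam).
    return relative_change * lam / math.sqrt(lam)


def reads_needed(z, cnv_length, genome_size=3.0e9, relative_change=0.5):
    """Invert expected_z: total reads needed to reach a target Z-score."""
    # Z = relative_change * sqrt(lam)  =>  lam = (Z / relative_change)^2
    lam = (z / relative_change) ** 2
    return lam * genome_size / cnv_length


if __name__ == "__main__":
    print(expected_z(6_000_000, 100_000))  # Z-score for 6M reads, 100 kb CNV
    print(reads_needed(7.0, 100_000))      # reads needed for Z = 7 at 100 kb
```

Under these assumptions, 6M reads give a Z-score of about 7.07 for a 100 kb event, and reaching Z = 7 takes roughly 5.9M reads, which is consistent with the theoretical figure quoted in the abstract. Real data deviate from the Poisson model (GC bias, mappability, sample-specific noise), which is one plausible reason why the in-silico requirement is higher.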
Competing Interest Statement
We declare a potential competing financial interest in the form of employee contracts (see affiliations for each author) with Geneton Ltd. and TrisomyTest Ltd. Geneton Ltd. participated in the development of a commercial NIPT test in Slovakia; however, it is not a provider of this commercial test, but continues to do basic and applied research in the field of NIPT. On the other hand, TrisomyTest Ltd. is the commercial provider of NIPT testing in Slovakia. Its participation in the study was limited to the routine NIPT testing that generated the genomic results reused in our research. Related to this work, there are no patents, products in development, or marketed products to declare. The authors declare no other conflict of interest.
Clinical Trial
Each included individual agreed to use of their genomic data in an anonymized form for general biomedical research. The NIPT study (study ID 35900_2015) was approved by the Ethical Committee of the Bratislava Self-Governing Region (Sabinovska ul.16, 820 05 Bratislava) on 30th April of 2015 under the decision ID 03899_2015.
Funding Statement
This publication was supported by the project "Long term strategic research and development focused on the occurrence of Lynch syndrome in the Slovak population and possibilities of prevention of tumors associated with this syndrome" (ITMS 313011V578) co-financed by the European Regional Development Fund (ERDF). The article was also created with the support of the OP Integrated Infrastructure for the project: Introduction of an innovative test for screening and monitoring of cancer patients - GenoScan LBquant, ITMS: NFP313010Q927, co-financed by the ERDF. The data infrastructure was built with the support of the Operational Program Integrated Infrastructure within the project: "Horizontal ICT support and centralized infrastructure for research and development institutions", ITMS code 313011F988, co-financed by the ERDF.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Each included individual agreed to use of their genomic data in an anonymized form for general biomedical research. The NIPT study (study ID 35900_2015) was approved by the Ethical Committee of the Bratislava Self-Governing Region (Sabinovska ul.16, 820 05 Bratislava) on 30th April of 2015 under the decision ID 03899_2015.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Following is a short summary of responses to the reviewers' comments:
● Comment (multiple reviewers): The thesis is unfocused and incomplete; the placeholder text "Please add" is still visible in the "Conclusion" section; the Conclusion is incomplete; the first figure displayed is Figure 4. Response: We added a proper Conclusion section, fixed the figure numbering error, and expanded the Discussion.
● Comment (multiple reviewers): Unclear purpose; poor attention to added value or contribution; clearly differentiate the contribution of this manuscript from the previous two papers. Response: We rewrote the Introduction and parts of the Discussion to clarify the state of the art, the purpose and added value of our tool, and its differentiation from our previous articles in the field.
● Comment (reviewer 3): The authors did not compare their method with other existing CNV detection methods (e.g., DECoN, CoNVaDING, ExomeDepth). Response: We added an explanation that most CNV detection tools cannot be used in, or are poorly suited to, the specific low-coverage WGS scenario. Furthermore, we believe such a comparison is outside the scope of the article, since we focused on the comparison with the aCGH method.
● Comment (reviewer 3): The authors have mentioned that their tool can identify larger CNVs, but can it also identify small CNVs affecting only one or a few small exons with the same efficiency? Response: We explained in the Discussion that this is not possible in the low-coverage WGS scenario and that the tool will never have the required accuracy to do so.
● Comment (reviewer 3): CNVs are very sensitive to the datasets used. The authors should test their method both in house and on another dataset. Response: We were unable to test our method on another dataset because of the demanding training procedure and the unavailability of public datasets that contain both healthy samples (for training) and samples with known CNVs (for testing) generated with the same procedure and on the same sample type. We explained this disadvantage and the sensitivity of GenomeScreen training in the Discussion.
Data Availability
Data and scripts (Python 3.7) are available on GitHub at https://github.com/marcelTBI/GenomeScreen.
Abbreviations
- aCGH: array-based comparative genomic hybridization
- CBS: circular binary segmentation
- CNV: copy number variant
- NGS: next-generation sequencing
- NIPT: non-invasive prenatal testing
- WGS: whole-genome sequencing