Abstract
Background Autism spectrum disorder (ASD) is a heterogeneous, multifactorial neurodevelopmental condition with a significant genetic susceptibility component. Thus, identifying genetic variations associated with ASD is a complex task. Whole-exome sequencing (WES) is an effective approach for detecting extremely rare protein-coding single-nucleotide variants (SNVs) and short insertions/deletions (INDELs). However, interpreting the functional and clinical consequences of these variants requires integrating multifaceted genomic information.
Methods We compared the concordance and effectiveness of three bioinformatics tools in detecting ASD candidate variants (SNVs and short INDELs) from WES data of 220 ASD family trios registered in the National Autism Database of Israel. We studied only rare (<1% population frequency) proband-specific variants. The pathogenicity of variants was evaluated with the InterVar and TAPES tools, both of which apply the American College of Medical Genetics and Genomics (ACMG) guidelines. In addition, likely gene-disrupting (LGD) variants were detected with an in-house bioinformatics tool, Psi-Variant, which integrates results from seven in-silico prediction tools.
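The filtering step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` record, its field names, and the identifier format are hypothetical, and "proband-specific" is assumed here to mean present in the proband and absent from both parents, which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    key: str          # hypothetical identifier, e.g. "chr1:100:A>G"
    pop_freq: float   # population allele frequency in [0, 1]
    in_proband: bool
    in_mother: bool
    in_father: bool


def rare_proband_specific(variants, max_freq=0.01):
    """Keep variants below max_freq seen in the proband but in neither parent."""
    return [
        v for v in variants
        if v.pop_freq < max_freq
        and v.in_proband
        and not (v.in_mother or v.in_father)
    ]


# Toy trio data, invented purely for illustration.
trio = [
    Variant("chr1:100:A>G", 0.0002, True, False, False),  # rare, proband only
    Variant("chr2:200:C>T", 0.0500, True, False, False),  # too common
    Variant("chr3:300:G>A", 0.0010, True, True, False),   # also in the mother
]
kept = rare_proband_specific(trio)
```

With this toy input only the first variant passes both the 1% frequency threshold and the assumed proband-only rule.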
Results Overall, 605 variants in 499 genes across 193 probands were detected by these tools. The overlap between the tools was 64.1%, 17.0%, and 21.6% for InterVar–TAPES, InterVar–Psi-Variant, and TAPES–Psi-Variant, respectively. The intersection between InterVar and Psi-Variant (I∩P) was the most effective approach for detecting variants in known ASD genes (OR = 5.38, 95% C.I. = 3.25–8.53), while the union of InterVar and Psi-Variant (I∪P) achieved the highest diagnostic yield (30.9%).
Conclusions Our results suggest that integrating different variant interpretation approaches to detect ASD candidate variants from WES data is superior to using any single approach alone. The inclusion of additional criteria could further improve the detection of ASD candidate variants.
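The quantities reported in the Results (pairwise overlap between tools, and an odds ratio with a 95% confidence interval) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis: overlap is computed here as the Jaccard index (intersection over union), the confidence interval uses the Wald approximation on the log odds ratio, and all function names and inputs are hypothetical. The abstract does not say which overlap definition or interval method was used.

```python
import math


def pairwise_overlap(calls_a, calls_b):
    """Jaccard overlap between two sets of variant identifiers (assumed definition)."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: flagged variants in known ASD genes
    b: flagged variants in other genes
    c: unflagged variants in known ASD genes
    d: unflagged variants in other genes
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Toy inputs, invented purely for illustration.
intervar = {"v1", "v2", "v3"}
psi_variant = {"v2", "v3", "v4"}
overlap = pairwise_overlap(intervar, psi_variant)
both = intervar & psi_variant     # analogous to I∩P
either = intervar | psi_variant   # analogous to I∪P
or_, lo, hi = odds_ratio_ci(10, 20, 5, 40)
```

The intersection trades sensitivity for specificity, while the union does the reverse, which is consistent with the pattern the abstract reports for I∩P and I∪P.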
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was supported by a grant from the Israel Science Foundation (1092/21).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Soroka University Medical Center (SOR-076-15; 17 April 2016).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Additional results on tool-specific diagnostic yield (highlighted in Fig. 2(C)) have been added. In addition, a list of the LP/P/LGD variants (N = 605) detected by at least one of InterVar, TAPES, and Psi-Variant has been provided in Supplementary Table S2.
List of abbreviations
- ACMG/AMP: American College of Medical Genetics and Genomics/Association for Molecular Pathology
- ASD: autism spectrum disorder
- C.I.: confidence interval
- GATK: Genome Analysis Toolkit
- LGD: likely gene-disrupting
- LoF: loss of function
- LP: likely pathogenic
- ML: machine learning
- NADI: National Autism Database of Israel
- NGS: next-generation sequencing
- OR: odds ratio
- P: pathogenic
- PPV: positive predictive value
- SNV: single-nucleotide variant
- VEP: Variant Effect Predictor
- vcf: variant call format
- VUS: variant of uncertain significance
- WES: whole-exome sequencing