PT - JOURNAL ARTICLE
AU - Beaumont, Robin N.
AU - Wright, Caroline F.
TI - Estimating diagnostic noise in panel-based genomic analysis
AID - 10.1101/2022.03.18.22272595
DP - 2022 Jan 01
TA - medRxiv
PG - 2022.03.18.22272595
4099 - http://medrxiv.org/content/early/2022/03/21/2022.03.18.22272595.short
4100 - http://medrxiv.org/content/early/2022/03/21/2022.03.18.22272595.full
AB - Background: Gene panels with a series of strict variant filtering rules are often used for clinical analysis of exomes and genomes. Panels vary in size, which affects the sensitivity and specificity of the test. We sought to investigate the background rate of candidate diagnostic variants in a population setting using gene panels developed to diagnose a range of heterogeneous monogenic diseases. Methods: We used the Genotype-2-Phenotype database with the Variant Effect Predictor plugin to identify rare non-synonymous variants in exome sequence data from 200,643 individuals in UK Biobank. We evaluated five clinically curated gene panels: developmental disorders (DD; 1708 genes), heritable eye disease (536 genes), skin disorders (293 genes), cancer syndromes (91 genes) and cardiac conditions (49 genes). We further tested the DD panel in 9,860 proband-parent trios from the Deciphering Developmental Disorders (DDD) study. Results: As expected, bigger gene panels resulted in more variants being prioritised, varying from an average of ∼0.3 per person in the smallest panels, to ∼3.5 variants per person using the largest panel. The number of individuals with prioritised variants varied linearly with coding sequence length for monoallelic disease genes (∼300 individuals per 1000 base pairs) and quadratically for biallelic disease genes, with some notable outliers. Based on cancer registry data from UK Biobank, there was no detectable difference between cases and controls in the number of individuals with prioritised variants using the cancer panel, presumably due to the predominance of sporadic disease.
However, we observed a marked increase in the number of prioritised variants in the DD panel in the DDD study (∼5 variants per proband). Phasing of compound heterozygotes in biallelic genes resulted in a modest reduction in the number of prioritised variants. Conclusions: Although large gene panels may be the best strategy to maximize diagnostic yield in genetically heterogeneous diseases, they will frequently prioritise false positive candidate variants potentially requiring additional clinical follow-up. Most individuals will have at least one rare nonsynonymous variant in panels containing >500 monogenic disease genes. Extreme caution should therefore be applied when interpreting potentially pathogenic variants found in the absence of relevant phenotypes. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by the MRC [MR/T00200X/1] and the Wellcome Trust [200990/A/16/Z] and was conducted using the UK Biobank Resource under Application number 49847. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003], a parallel funding partnership between the Wellcome Trust and the Department of Health, and the Wellcome Trust Sanger Institute [grant no. WT098051]. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The UK Biobank has approval from the North West Multi-centre Research Ethics Committee (21/NW/0157) as a Research Tissue Bank approval. The DDD study has UK Research Ethics Committee approval (10/H0305/83, granted by Cambridge South REC, and GEN/284/12 granted by the Republic of Ireland REC).
All individuals included in this study gave appropriate consent. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data produced in the present work are contained in the manuscript.