Abstract
Loss-of-function genetic variants (LoFs) often result in severe phenotypes, including autosomal dominant diseases driven by haploinsufficiency. Due to low carrier frequencies, their penetrance is generally unknown but typically variable. Here, we investigate the penetrance of >6,000 predicted LoFs (pLoFs) linked to 91 haploinsufficient diseases using a cohort of ≈24,000 carriers with linked electronic health record data. We find evidence for widespread reduced penetrance, which persisted after accounting for variant annotation artifacts, missed diagnoses, and incomplete clinical data. We thus hypothesized that many pLoFs have incomplete penetrance, which may be driven by residual allelic activity. To test this, we trained machine learning models to predict pLoF penetrance using variant-specific genomic features that may correlate with incomplete loss-of-function. The models were predictive of pLoF penetrance across a range of diseases and variant types, including those with prior clinical evidence for pathogenicity. This suggests that many pLoFs have incomplete penetrance due to residual allelic activity, complicating disease prognostication in asymptomatic carriers.
Competing Interest Statement
The first author (D. Blair) has received research funding from BioMarin, Idorsia, QED Therapeutics and Sanofi in the last 36 months. None of the research reported in this manuscript was funded by these entities.
Funding Statement
This work was supported by grants from the National Heart, Lung, and Blood Institute (K38HL164956) and the George Banks and Sarah Ellen Huntington Memorial Fund.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The UK Biobank has approval from the North West Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB). Approved researchers do not require separate ethical clearance and can operate under the RTB approval. The UKBB was accessed via approved Application Number 99922. Authorization for access to participant-level data in All of Us is based on a "data passport" model, through which authorized researchers do not need IRB review for each research project. The data passport is required for gaining data access to the Researcher Workbench and for creating workspaces to carry out research projects using All of Us data.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes