Abstract
A novel algorithm, AlphaMissense, has been shown to improve prediction of the pathogenicity of rare missense genetic variants. However, it is not known whether AlphaMissense improves the ability of gene-based testing to identify disease-causing genes. Using whole-exome sequencing data from the UK Biobank, we compared gene-based association analysis strategies that differed in the set of deleterious variants included: predicted loss-of-function (pLoF) variants only; pLoF plus AlphaMissense pathogenic variants; pLoF plus missense variants predicted to be deleterious by any of five commonly used annotation methods (Missense (1/5)); or pLoF plus only those missense variants predicted to be deleterious by all five methods (Missense (5/5)). We measured how well each strategy identified 519 positive control genes, which are known to cause Mendelian diseases or are the targets of successfully developed medicines. These strategies identified 850k pLoF variants and 5 million deleterious missense variants, including 22k likely pathogenic missense variants identified exclusively by AlphaMissense. The gene-based association tests found 608 significant gene associations (at P < 1.25×10⁻⁷) across 24 common traits and diseases. Compared to tests using pLoF plus Missense (5/5) variants, tests using pLoF plus AlphaMissense variants found slightly more significant gene-disease and gene-trait associations, albeit with a marginally lower proportion of positive control genes. Nevertheless, the overall performance of the two strategies was similar. Merging AlphaMissense with Missense (5/5), whether through their intersection or union, did not yield any further improvement in performance. In summary, employing AlphaMissense to select deleterious variants for gene-based testing did not improve the ability to identify genes that are known to cause disease.
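The variant-selection strategies compared above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the `Variant` record, its field names, the mask labels, and the example variants are all hypothetical, and the sketch assumes that each variant already carries a pLoF flag, an AlphaMissense pathogenic flag, and a count of how many of the five annotation methods call it deleterious. Only the significance threshold is taken from the abstract.

```python
from dataclasses import dataclass

# Significance threshold quoted in the abstract.
P_THRESHOLD = 1.25e-7


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant annotation record (field names are illustrative)."""
    variant_id: str
    is_plof: bool                   # predicted loss-of-function
    alphamissense_pathogenic: bool  # classified as pathogenic by AlphaMissense
    n_tools_deleterious: int        # how many of the five methods call it deleterious (0-5)


# Each mask always keeps pLoF variants; the rule below decides which
# missense variants are added on top. Labels are assumptions for this sketch.
MISSENSE_RULES = {
    "pLoF": lambda v: False,
    "pLoF+AlphaMissense": lambda v: v.alphamissense_pathogenic,
    "pLoF+Missense(1/5)": lambda v: v.n_tools_deleterious >= 1,
    "pLoF+Missense(5/5)": lambda v: v.n_tools_deleterious == 5,
    # Intersection and union of AlphaMissense with Missense (5/5).
    "pLoF+AlphaMissense_AND_5/5": lambda v: v.alphamissense_pathogenic and v.n_tools_deleterious == 5,
    "pLoF+AlphaMissense_OR_5/5": lambda v: v.alphamissense_pathogenic or v.n_tools_deleterious == 5,
}


def select_variants(variants, mask):
    """Return the IDs of variants included in a gene-based test under `mask`."""
    rule = MISSENSE_RULES[mask]
    return [v.variant_id for v in variants if v.is_plof or rule(v)]


def is_significant(p_value):
    """Apply the gene-level significance threshold from the abstract."""
    return p_value < P_THRESHOLD


# Made-up variants in a single gene, for illustration only.
EXAMPLE_VARIANTS = [
    Variant("v1", is_plof=True, alphamissense_pathogenic=False, n_tools_deleterious=0),
    Variant("v2", is_plof=False, alphamissense_pathogenic=True, n_tools_deleterious=2),
    Variant("v3", is_plof=False, alphamissense_pathogenic=False, n_tools_deleterious=5),
    Variant("v4", is_plof=False, alphamissense_pathogenic=True, n_tools_deleterious=5),
    Variant("v5", is_plof=False, alphamissense_pathogenic=False, n_tools_deleterious=0),
]

if __name__ == "__main__":
    for mask in MISSENSE_RULES:
        print(mask, select_variants(EXAMPLE_VARIANTS, mask))
```

In this toy example, the masks differ only in which missense variants join the pLoF variants, which is the single factor varied across the strategies compared here.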
Competing Interest Statement
J.B.R. is the CEO of 5 Prime Sciences (www.5primesciences.com), which provides research services for biotech, pharma, and venture capital companies for projects unrelated to this research. He has served as an advisor to GlaxoSmithKline and Deerfield Capital. The institution of J.B.R. has received investigator-initiated grant funding from Eli Lilly, GlaxoSmithKline, and Biogen for projects unrelated to this research. Y.C. is an employee of 5 Prime Sciences.
Funding Statement
The Richards research group is supported by the Canadian Institutes of Health Research (CIHR: 365825, 409511, 100558, 169303), the McGill Interdisciplinary Initiative in Infection and Immunity (MI4), the Lady Davis Institute of the Jewish General Hospital, the Jewish General Hospital Foundation, the Canadian Foundation for Innovation, the NIH Foundation, Cancer Research UK, Genome Quebec, the Public Health Agency of Canada, McGill University, and the Fonds de Recherche Quebec Sante (FRQS). J.B.R. is supported by an FRQS Merite Clinical Research Scholarship. Support from Calcul Quebec and Compute Canada is acknowledged. TwinsUK is funded by the Wellcome Trust, the Medical Research Council, the European Union, and the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust in partnership with King's College London. Y.C. is supported by an FRQS doctoral training fellowship and the Lady Davis Institute/TD Bank Studentship Award. G.B.L. is supported by scholarships from the FRQS, the CIHR, and the Quebec Ministry of Health and Social Services.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The UK Biobank was approved by the North West Multi-centre Research Ethics Committee, and informed consent was obtained from all participants prior to participation. This research has been conducted using UK Biobank data under application ID 27449.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data availability
Individual-level genotype, exome sequencing, and phenotype data are available to approved researchers via the UK Biobank at https://www.ukbiobank.ac.uk. ExWAS summary statistics will be made available in the GWAS Catalog (https://www.ebi.ac.uk/gwas/).