Abstract
Background Genomic matchmaking, the process of identifying multiple individuals with overlapping phenotypes and rare variants in the same gene, is an important tool for gene discovery in patients with unsolved rare genetic disease (RGD). Current approaches are two-sided, meaning that both patients being matched must have the same candidate gene flagged, which limits the number of unsolved RGD patients eligible for matchmaking. A one-sided approach, in which a gene of interest is queried directly in the genome-wide sequencing data of RGD patients, would make matchmaking possible for individuals who were previously undiscoverable. However, platforms and workflows for this approach have not been well established.
Results We released a beta version of the One-Sided Matching Portal (OSMP), a platform capable of performing one-sided matchmaking queries across thousands of participants stored in genomic databases. The OSMP returns variant-level and participant-level information on each variant occurrence (VO) identified in a queried gene and displays this information in a customizable data table. We developed a workflow for one-sided matchmaking so that researchers could effectively prioritize the many VOs returned from a given query. This workflow was then tested in pilot studies in which two sets of genes were queried in over 2,500 individuals: 130 genes that were newly associated with disease in OMIM, and 178 candidate genes that were not yet associated with a described disease in OMIM. Both pilots returned a large number of initial VOs (12,872 and 20,308, respectively); however, the workflow filtered out over 99.8% of these VOs before they were sent for review by a patient’s clinician. Filters on participant-level information, such as variant zygosity, participant phenotype, and whether a variant was also present in unaffected participants, were especially effective at reducing the number of false positive matches.
Conclusions As demonstrated through the two pilot studies, one-sided matchmaking queries can be efficiently performed using the OSMP. The availability of variant-level and participant-level data is key to ensuring this approach is practical for researchers. In the future, the OSMP will be connected to additional rare disease databases to increase the accessibility of matchmaking for unsolved RGD patients.
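The participant-level filters named in the Results (variant zygosity, participant phenotype, and presence in unaffected participants) can be illustrated with a minimal Python sketch. This is not the OSMP implementation: the `VariantOccurrence` fields, the `COMPATIBLE_ZYGOSITY` table, and the `prioritize` function are hypothetical names introduced here, the zygosity rules are a simplifying assumption (compound heterozygosity is not modelled), and the example records are invented for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VariantOccurrence:
    """One variant observed in one participant (hypothetical fields)."""
    variant_id: str
    participant_id: str
    zygosity: str                          # "het", "hom", or "hemi"
    hpo_terms: frozenset = field(default_factory=frozenset)
    in_unaffected: bool = False            # also seen in an unaffected participant


# Zygosities treated as compatible with each inheritance mode.
# Assumption for illustration only; compound heterozygosity is not modelled.
COMPATIBLE_ZYGOSITY = {
    "AD": {"het"},
    "AR": {"hom"},
    "XLR": {"hemi", "hom"},
}


def prioritize(vos, inheritance, phenotype_terms):
    """Keep VOs that pass the zygosity, phenotype-overlap, and unaffected filters."""
    allowed = COMPATIBLE_ZYGOSITY[inheritance]
    wanted = frozenset(phenotype_terms)
    return [
        vo for vo in vos
        if vo.zygosity in allowed          # zygosity fits the inheritance mode
        and vo.hpo_terms & wanted          # shares at least one HPO term
        and not vo.in_unaffected           # not observed in an unaffected participant
    ]


# Invented example records; the HPO IDs are used purely as sample values.
example = [
    VariantOccurrence("v1", "p1", "het", frozenset({"HP:0001250"})),
    VariantOccurrence("v2", "p2", "hom", frozenset({"HP:0001250"})),   # zygosity mismatch for AD
    VariantOccurrence("v3", "p3", "het", frozenset({"HP:0001263"})),   # no phenotype overlap
    VariantOccurrence("v4", "p4", "het", frozenset({"HP:0001250"}), in_unaffected=True),
]

kept = prioritize(example, "AD", {"HP:0001250"})
print([vo.variant_id for vo in kept])      # prints ['v1']
```

In this sketch each filter is a simple boolean test, so the filters can be applied in any order with the same result. Taken at face value, removing over 99.8% of the 12,872 and 20,308 initial VOs would leave fewer than 26 and 41 VOs, respectively, for clinician review.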
Competing Interest Statement
OJB and MB have an equity interest in, and OJB is an employee of, PhenoTips®, which licenses software used in the Genomics4RD database.
Funding Statement
This study was funded through the Genomics and Precision Health Top-up grant GPT-174518 titled “Care4Rare-Solve: Efficient cross-border matchmaking to deliver diagnoses for rare genetic diseases”, awarded by the Canadian Institutes of Health Research. The development of Genomics4RD and the production of its housed GWS data was performed under the Care4Rare Canada Consortium funded by Genome Canada and the Ontario Genomics Institute (OGI-458 147), the Canadian Institutes of Health Research, Ontario Research Fund, Genome Alberta, Genome British Columbia, Genome Quebec, and Children’s Hospital of Eastern Ontario Foundation. T.H. was supported by a Frederick Banting and Charles Best Canada Graduate Scholarship Doctoral Award from the Canadian Institutes of Health Research. MB is a CIFAR AI Chair. KMB was supported by a Canadian Institutes of Health Research Foundation grant FDN-154279 and a Tier 1 Canada Research Chair in Rare Disease Precision Health. The work at Children’s Mercy Kansas City is supported by generous donors to Children’s Mercy Research Institute and Genomic Answer For Kids program.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The institutional review board of the Children’s Hospital of Eastern Ontario gave ethical approval for this work. The institutional review board of Clinical Trials Ontario gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The source code for the OSMP is available at https://github.com/ccmbioinfo/osmp. Access to the beta version of the OSMP is currently limited to a small set of Genomics4RD users. The genotypic and phenotypic data that support the findings of this study are located in the controlled access database Genomics4RD. Genomics4RD open access data is available at https://www.genomics4rd.ca/.
List of abbreviations
- AD: Autosomal dominant
- AR: Autosomal recessive
- CADD: Combined Annotation Dependent Depletion
- CMA: Chromosomal microarray
- DECIPHER: DatabasE of genomiC varIation and Phenotype in Humans using Ensembl Resources
- FORGE: Finding of Rare Disease Genes
- GPAP: Genome-Phenome Analysis Platform
- GWS: Genome wide sequencing
- HPO: Human Phenotype Ontology
- MME: Matchmaker Exchange
- OMIM: Online Mendelian Inheritance in Man
- OSMP: One-Sided Matching Portal
- RGD: Rare genetic disease
- UCSC: University of California Santa Cruz
- VEP: Variant Effect Predictor
- VO: Variant occurrence
- XLR: X-linked recessive