RT Journal Article
SR Electronic
T1 Critical assessment of variant prioritization methods for rare disease diagnosis within the Rare Genomes Project
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.08.02.23293212
DO 10.1101/2023.08.02.23293212
A1 Stenton, Sarah L.
A1 O’Leary, Melanie
A1 Lemire, Gabrielle
A1 VanNoy, Grace E.
A1 DiTroia, Stephanie
A1 Ganesh, Vijay S.
A1 Groopman, Emily
A1 O’Heir, Emily
A1 Mangilog, Brian
A1 Osei-Owusu, Ikeoluwa
A1 Pais, Lynn S.
A1 Serrano, Jillian
A1 Singer-Berk, Moriel
A1 Weisburd, Ben
A1 Wilson, Michael
A1 Austin-Tse, Christina
A1 Abdelhakim, Marwa
A1 Althagafi, Azza
A1 Babbi, Giulia
A1 Bellazzi, Riccardo
A1 Bovo, Samuele
A1 Carta, Maria Giulia
A1 Casadio, Rita
A1 Coenen, Pieter-Jan
A1 De Paoli, Federica
A1 Floris, Matteo
A1 Gajapathy, Manavalan
A1 Hoehndorf, Robert
A1 Jacobsen, Julius O.B.
A1 Joseph, Thomas
A1 Kamandula, Akash
A1 Katsonis, Panagiotis
A1 Kint, Cyrielle
A1 Lichtarge, Olivier
A1 Limongelli, Ivan
A1 Lu, Yulan
A1 Magni, Paolo
A1 Mamidi, Tarun Karthik Kumar
A1 Martelli, Pier Luigi
A1 Mulargia, Marta
A1 Nicora, Giovanna
A1 Nykamp, Keith
A1 Pejaver, Vikas
A1 Peng, Yisu
A1 Pham, Thi Hong Cam
A1 Podda, Maurizio S.
A1 Rao, Aditya
A1 Rizzo, Ettore
A1 Saipradeep, Vangala G
A1 Savojardo, Castrense
A1 Schols, Peter
A1 Shen, Yang
A1 Sivadasan, Naveen
A1 Smedley, Damian
A1 Soru, Dorian
A1 Srinivasan, Rajgopal
A1 Sun, Yuanfei
A1 Sunderam, Uma
A1 Tan, Wuwei
A1 Tiwari, Naina
A1 Wang, Xiao
A1 Wang, Yaqiong
A1 Williams, Amanda
A1 Worthey, Elizabeth A.
A1 Yin, Rujie
A1 You, Yuning
A1 Zeiberg, Daniel
A1 Zucca, Susanna
A1 Bakolitsa, Constantina
A1 Brenner, Steven E.
A1 Fullerton, Stephanie M
A1 Radivojac, Predrag
A1 Rehm, Heidi L.
A1 O’Donnell-Luria, Anne
YR 2023
UL http://medrxiv.org/content/early/2023/08/04/2023.08.02.23293212.abstract
AB Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis.
The average “diagnostic odyssey” lasts over five years, and causal variants are identified in under 50% of cases. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge that places variant prioritization models head-to-head in a real-life clinical diagnostic setting.
Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on the rank position of true positive causal variants, and the maximum F-measure, based on precision and recall of causal variants across EPCR thresholds.
Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top-performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high-quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with the reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange.
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband whose phenotype overlapped with asparagine synthetase deficiency.
Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.
Competing Interest Statement: Authors S.Z., I.L., E.R., P.M., and R.B. own shares of enGenome srl. Authors F.D.P. and G.N. are employees of enGenome srl. Authors T.J., R.S., S.G.V., N.S., A.R., U.S., and N.T. are employees of TCS Ltd. Authors P.J.C., C.K., K.N., and P.S. are employees of Invitae Ltd. H.L.R. receives support from Illumina and Microsoft for rare disease gene discovery and diagnosis. A.O’D-L. is a member of the scientific advisory board for Congenica Inc and the Simons Foundation SPARK for Autism study and co-chairs the clinical advisory board for CAGI. S.E.B. receives support at UC Berkeley from a research agreement from TCS. All other authors report no competing interests.
Funding Statement: S.L.S. is supported by a fellowship from the Manton Center for Orphan Disease Research at Boston Children’s Hospital. G.L. was supported by Fonds de recherche en sante du Quebec. V.S.G. was supported by the Mass General Brigham Training Program in Precision and Genomic Medicine (NHGRI T32 HG10464). Data and diagnoses were provided by Broad Institute of MIT and Harvard Center for Mendelian Genomics with funding to H.L.R.
and A.O’D-L., by the National Human Genome Research Institute (NHGRI) grants UM1HG008900, U01HG011755, and R01HG009141 and by the Chan Zuckerberg Initiative through an advised fund of the Silicon Valley Community Foundation grant 2020-224274. This study was also supported by the NHGRI CAGI grant U24 HG007346 (to S.E.B. and P.R.), National Institute of Child Health and Human Development grant 1R01HD103805-01, and National Institute of General Medical Sciences R35GM124952, along with funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) grants URF/1/4355-01-01, URF/1/4675-01-01, FCC/1/1976-34-01.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The Rare Genomes Project study is approved by the Mass General Brigham Institutional Review Board (IRB) protocol 2016P001422. Written informed consent for the publication of clinical details was obtained from the participants or legal guardians.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
Abbreviations:
ACMG/AMP: American College of Medical Genetics and Genomics and the Association for Molecular Pathology
AD: autosomal dominant
AFR: African/African American
AMR: Admixed American
AR: autosomal recessive
ASJ: Ashkenazi Jewish
CAGI: Critical Assessment of Genome Interpretation
CSF: cerebrospinal fluid
DM: disease mutation
EPCR: estimated probability of causal relationship
F-max: maximum F-measure
HPO: Human Phenotype Ontology
IGV: Integrative Genomics Viewer
indel: small insertion/deletion
LP: likely pathogenic
NFE: Non-Finnish European
P: pathogenic
PHS: Pitt-Hopkins syndrome
RGP: Rare Genomes Project
SAS: South Asian
SE: standard error
SNV: single nucleotide variant
SV: structural variant
VCF: variant call file
VEP: Variant Effect Predictor
VUS: variant of uncertain significance
XLR: X-linked recessive