PT - JOURNAL ARTICLE
AU - Hiatt, Susan M.
AU - Lawlor, James M.J.
AU - Handley, Lori H.
AU - Latner, Donald R.
AU - Bonnstetter, Zachary T.
AU - Finnila, Candice R.
AU - Thompson, Michelle L.
AU - Boston, Lori Beth
AU - Williams, Melissa
AU - Nunez, Ivan Rodriguez
AU - Jenkins, Jerry
AU - Kelley, Whitley V.
AU - Bebin, E. Martina
AU - Lopez, Michael A.
AU - Hurst, Anna C. E.
AU - Korf, Bruce R.
AU - Schmutz, Jeremy
AU - Grimwood, Jane
AU - Cooper, Gregory M.
TI - Long-read genome sequencing and variant reanalysis increase diagnostic yield in neurodevelopmental disorders
AID - 10.1101/2024.03.22.24304633
DP - 2024 Jan 01
TA - medRxiv
PG - 2024.03.22.24304633
4099 - http://medrxiv.org/content/early/2024/03/26/2024.03.22.24304633.short
4100 - http://medrxiv.org/content/early/2024/03/26/2024.03.22.24304633.full
AB - Variant detection from long-read genome sequencing (lrGS) has proven to be considerably more accurate and comprehensive than variant detection from short-read genome sequencing (srGS). However, the rate at which lrGS can increase molecular diagnostic yield for rare disease is not yet precisely characterized. We performed lrGS using Pacific Biosciences “HiFi” technology on 96 short-read-negative probands with rare disease that was suspected to be genetic. We generated hg38-aligned variants and de novo phased genome assemblies, and subsequently annotated, filtered, and curated variants using clinical standards. New disease-relevant or potentially relevant genetic findings were identified in 16/96 (16.7%) probands, eight of which (8/96, 8.33%) harbored pathogenic or likely pathogenic variants. Newly identified variants were visible in both srGS and lrGS in nine probands (∼9.4%) and resulted from changes to interpretation, mostly from recent gene-disease association discoveries.
Seven cases included variants that were only interpretable in lrGS, including copy-number variants, an inversion, a mobile element insertion, two low-complexity repeat expansions, and a 1 bp deletion. While evidence for each of these variants is, in retrospect, visible in srGS, they were not called within srGS data, were represented by calls with incorrect sizes or structures, or failed quality control and filtration. Thus, while reanalysis of older data clearly increases diagnostic yield, we find that lrGS allows for substantial additional yield (7/96, 7.3%) beyond srGS. We anticipate that as lrGS analysis improves, and as lrGS datasets grow, allowing for better variant frequency annotation, the additional lrGS-only rare disease yield will grow over time. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The CSER1 project was supported by a grant from the US National Human Genome Research Institute (NHGRI; UM1HG007301). The SouthSeq project (U01HG007301) was supported by the Clinical Sequencing Evidence-Generating Research (CSER2) consortium, which is funded by the National Human Genome Research Institute with co-funding from the National Institute on Minority Health and Health Disparities and the National Cancer Institute. The Alabama Genomic Health Initiative is an Alabama-State earmarked project (F170303004) through the University of Alabama in Birmingham. The PGEN cohort was funded by the Alabama Pediatric Genomics Initiative. lrGS of some samples was supported by a Research Grant from the Muscular Dystrophy Association (MDA 963255).
Some reagents were provided by PacBio as part of an early-access testing program. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Western IRB gave ethical approval for this work (WIRB 0071). IRB of the University of Alabama in Birmingham gave ethical approval for this work (UAB IRB protocols 170303004, 300000328, and 130201001). I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. For participants who consented to controlled-access sharing, the lrGS data generated in this study will be submitted to dbGaP and/or AnVIL under accession number phs003537.v1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs003537.v1).
srGS data for participants who consented to controlled-access sharing in NIH-funded studies are available via dbGaP and/or AnVIL (CSER1: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001089.v3.p1, dbGaP accession phs001089; SouthSeq: https://anvilproject.org/data/studies/phs002307, dbGaP accession phs002307).