RT Journal Article
SR Electronic
T1 Spinal muscular atrophy diagnosis and carrier screening from whole-genome sequencing data
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 19006635
DO 10.1101/19006635
A1 Chen, Xiao
A1 Sanchis-Juan, Alba
A1 French, Courtney E
A1 Connell, Andrew J
A1 Chawla, Aditi
A1 Halpern, Aaron L
A1 Taft, Ryan J
A1 NIHR BioResource
A1 Bentley, David R
A1 Butchbach, Matthew ER
A1 Raymond, F Lucy
A1 Eberle, Michael A
YR 2019
UL http://medrxiv.org/content/early/2019/10/02/19006635.abstract
AB Purpose: Spinal muscular atrophy (SMA), caused by loss of the functional SMN1 gene, is a leading genetic cause of early childhood death. Due to the near-identical sequences of SMN1 and its paralog SMN2, analysis of this region is challenging. Population-wide SMA screening to quantify the SMN1 copy number (CN) is recommended by the American College of Medical Genetics. Methods: We developed an informatics method that accurately identifies the CN of SMN1 and SMN2 using whole-genome sequencing (WGS) data. This algorithm calculates the CNs of SMN1 and SMN2 using read depth and eight informative reference genome differences between SMN1/2. Results: We characterized SMN1/2 in 12,747 genomes across five ethnic populations and identified 251 (1317) samples with SMN1 losses (gains) and 6241 (374) samples with SMN2 losses (gains). We calculated a pan-ethnic carrier frequency of 2%, consistent with previous studies.
Additionally, we validated our calls against digital PCR: all (48/48) SMN1 and 98% (47/48) of SMN2 CN calls agreed. Conclusion: This WGS-based SMN copy number caller can be used to identify both carrier and affected status of SMA, enabling SMA testing to be offered as a comprehensive test in neonatal care and an accurate carrier screening tool in large-scale WGS projects. Competing Interest Statement: Xiao Chen, Aditi Chawla, Aaron L Halpern, Ryan J Taft, David R Bentley, and Michael A Eberle are all employed by Illumina, a maker of genome sequencing instruments. Funding Statement: This work was supported by the Cambridge Biomedical Research Centre and the National Institute for Health Research (NIHR) for the NIHR BioResource (grant number RG65966), the National Institute of General Medical Sciences of the National Institutes of Health (P30GM114736 and P20GM103446; to MERB) and the Nemours Foundation (to MERB). We thank the New York Genome Center (supported by NHGRI Grant 3UM1HG008901-03S1), and the Coriell Institute for Medical Research for generating and releasing the 1kGP WGS data. Author Declarations: All relevant ethical guidelines have been followed and any necessary IRB and/or ethics committee approvals have been obtained. Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. Any clinical trials involved have been registered with an ICMJE-approved registry such as ClinicalTrials.gov and the trial ID is included in the manuscript. Not Applicable. I have followed all appropriate research reporting guidelines and uploaded the relevant Equator, ICMJE or other checklist(s) as supplementary files, if applicable. Not Applicable. The 1kGP data can be downloaded from https://www.ncbi.nlm.nih.gov/bioproject/PRJEB31736/. Data from the NIHR BioResource participants have been deposited in the European Genome-phenome Archive (EGA) at the EMBL European Bioinformatics Institute.
Data from those NIHR BioResource participants who enrolled in the 100,000 Genomes Project-Rare Diseases Pilot can be accessed via Genomics England Limited, following the procedure outlined at: https://www.genomicsengland.co.uk/about-gecip/joining-research-community. The BAM files from the NGC individuals have been deposited in EGA under accession number EGAD00001004357.