Abstract
Fully understanding the genetic factors involved in Autism Spectrum Disorder (ASD) requires whole-genome sequencing (WGS), which theoretically allows the detection of all types of genetic variants. With the aim of generating an unprecedented resource for resolving the genomic architecture underlying ASD, we analyzed genome sequences and phenotypic data from 5,100 individuals with ASD and 6,212 additional parents and siblings (total n=11,312) in the Autism Speaks MSSNG Project, as well as additional individuals from other WGS cohorts. WGS data and autism phenotyping were based on high-quality short-read sequencing (>30x coverage) and clinically accepted diagnostic measures for ASD, respectively. For initial discovery of ASD-associated genes, we used exonic sequence-level variants from MSSNG as well as whole-exome sequencing-based ASD data from SPARK and the Autism Sequencing Consortium (>18,000 trios plus additional cases and controls), identifying 135 ASD-associated protein-coding genes with false discovery rate <10%. Combined with ASD-associated genes curated from the literature, this list was used to guide the interpretation of all other variant types in WGS data from MSSNG and the Simons Simplex Collection (SSC; n=9,205). We identified ASD-associated rare variants in 789/5,100 individuals with ASD from MSSNG (15%) and 421/2,419 from SSC (17%). Considering the genomic architecture, 57% of ASD-associated rare variants were nuclear sequence-level variants, 41% were nuclear structural variants (SVs) (mainly copy number variants, but also including inversions, large insertions, uniparental isodisomies, and tandem repeat expansions), and 2% were mitochondrial variants. Several of the ASD-associated SVs would have been difficult to detect without WGS, including an inversion disrupting SCN2A and a nuclear mitochondrial insertion impacting SYNGAP1. 
Polygenic risk scores did not differ between children with ASD in multiplex families versus simplex, and rare, damaging recessive events were significantly depleted in multiplex families, collectively suggesting that rare, dominant variation plays a predominant role in multiplex ASD. Our study provides a guidebook for exploring genotype-phenotype correlations in the 15-20% of ASD families who carry ASD-associated rare variants, as well as an entry point to the larger and more diverse studies that will be required to dissect the etiology in the >80% of the ASD population that remains idiopathic. All data resulting from this study are available to the medical genomics research community in an open but protected manner.
Competing Interest Statement
E.A. has received consultation fees from Roche, Quadrant, and Oron; grant funding from Roche; in-kind support from AMO Pharma and CRR; editorial honoraria from Wiley; and book royalties from APPI and Springer. She co-holds a patent for the device Anxiety Meter (patent # US20160000365A1). S.W.S. is on the Scientific Advisory Committee of Population Bio and serves as a Highly Cited Academic Advisor for King Abdulaziz University; intellectual property from aspects of his research held at The Hospital for Sick Children is licensed to Athena Diagnostics and Population Bio. These relationships did not influence data interpretation or presentation during this study but are disclosed for potential future considerations.
Funding Statement
This work was supported by the University of Toronto McLaughlin Centre; Genome Canada/Ontario Genomics; Genome BC; the Government of Ontario; the Canadian Institutes of Health Research (CIHR); the Canada Foundation for Innovation (CFI); Autism Speaks; Autism Speaks Canada; Brain Canada; Kids Brain Health Network; Qatar National Research Fund (NPRP10 0202 170320); the Ontario Brain Institute; and SickKids Foundation. B.T. has been supported by the CIHR Banting Postdoctoral Fellowship and the Canadian Open Neuroscience Platform (CONP) Research Scholar Award. L.O.L. held the Lap-Chee Tsui Fellowship for Research Excellence. S.G. has been supported by the CONP Research Scholar Award. M.H.Z. acknowledges the generous support of the Fonds de recherche du Québec - Santé Junior 1 Research Scholar programme. J.S. and the REACH project were supported by grants from the National Institutes of Health (MH113715, MH119746, 1MH109501) and the Simons Foundation Autism Research Initiative (SFARI 606768). L.Z. holds the Stollery Children's Hospital Foundation Chair in Autism. M.E.S.L. holds a BC Children's Hospital Research Institute Investigator Grant Award. S.W.S. holds the Northbridge Chair in Pediatric Research at The Hospital for Sick Children and the University of Toronto.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics committee/IRB of WCG IRB (https://www.wcgirb.com); Montreal Children's Hospital/McGill University Health Centre; McMaster University/Hamilton Integrated; Memorial University/Eastern Health; Holland Bloorview Kids Rehabilitation Hospital; Queen's University; University of Alberta; University of British Columbia; IWK Health Centre; University of California Davis; University of California San Diego; University of Miami; and The Hospital for Sick Children gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Access to the data contained in MSSNG and SSC can be obtained by completing data access agreements at https://research.mss.ng and https://www.sfari.org/resource/sfari-base, respectively. The 1000G WGS data are publicly available via Amazon Web Services (https://docs.opendata.aws/1000genomes/readme.html).