The impact of damaging epilepsy and cardiac genetic variant burden in sudden death in the young =============================================================================================== * Megan J. Puckelwartz * Lorenzo L. Pesce * Edgar J. Hernandez * Gregory Webster * Lisa M. Dellefave-Castillo * Mark W. Russell * Sarah S. Geisler * Samuel D. Kearns * Felix K Etheridge * Susan P. Etheridge * Tanner O. Monroe * Tess D. Pottinger * Prince J. Kannankeril * M. Benjamin Shoemaker * Darlene Fountain * Dan M. Roden * Heather MacLeod * Kristin M. Burns * Mark Yandell * Martin Tristani-Firouzi * Alfred L. George * Elizabeth M. McNally ## Abstract **Background** Sudden unexpected death in children is a tragic event. Understanding the genetics of sudden death in the young (SDY) enables family counseling and cascade screening. The objective of this study was to characterize genetic variation in an SDY cohort using whole genome sequencing. **Methods** The SDY Case Registry is a National Institutes of Health/Centers for Disease Control surveillance effort to discern the prevalence, causes, and risk factors for SDY. The SDY Case Registry prospectively collected clinical data and DNA biospecimens from SDY cases <20 years of age. SDY cases were collected from medical examiner and coroner offices spanning 13 US jurisdictions from 2015-2019. The cohort included 211 children (mean age 1 year; range 0-20 years), determined to have died suddenly and unexpectedly and in whom DNA biospecimens and next-of-kin consent were ascertained. A control cohort consisted of 211 randomly sampled, sex-and ancestry-matched individuals from the 1000 Genomes Project. Genetic variation was evaluated in epilepsy, cardiomyopathy and arrhythmia genes in the SDY and control cohorts. American College of Medical Genetics/Genomics guidelines were used to classify variants as pathogenic or likely pathogenic. Additionally, genetic variation predicted to be damaging was identified using a Bayesian-based artificial intelligence (AI) tool. **Results** The SDY cohort was 42% European, 30% African, 17% Hispanic, and 11% with mixed ancestries, and 39% female. Six percent of the cohort was found to harbor a pathogenic or likely pathogenic genetic variant in an epilepsy, cardiomyopathy or arrhythmia gene. The genomes of SDY cases, but not controls, were enriched for rare, damaging variants in epilepsy, cardiomyopathy and arrhythmia-related genes. A greater number of rare epilepsy genetic variants correlated with younger age at death. **Conclusions** While damaging cardiomyopathy and arrhythmia genes are recognized contributors to SDY, we also observed an enrichment in epilepsy-related genes in the SDY cohort, and a correlation between rare epilepsy variation and younger age at death. These findings emphasize the importance of considering epilepsy genes when evaluating SDY. Keywords * sudden death in the young * genome sequencing * epilepsy * arrhythmia * cardiomyopathy * gene burden ## BACKGROUND Sudden death in children has immediate and sustained medical and emotional consequences for affected families. Elucidating the etiology of sudden death can inform risk for surviving family members. The Sudden Death in the Young (SDY) Case Registry was created as a joint initiative between the National Institutes of Health (NIH) and the Centers for Disease Control and Prevention (CDC) to enable a population-based surveillance of SDY and facilitate research through the collection of clinical data and biospecimens (1). The SDY Case Registry engaged public health agencies in thirteen diverse US sites to collect information on children, ages 0-20 years, who died suddenly and unexpectedly. Data from the SDY Registry indicates that sudden death mortality rates are greater for infants less than 1 year (120/100000 live births) than for children 1-17 years (1.9/100000 children) (2). SDY can be attributed to primary cardiac causes including arrhythmias, cardiomyopathies, congenital heart defects, and vascular disease, as well as noncardiac causes such as epilepsy. While extensive work has established social and environmental factors that contribute to sudden infant death syndrome (SIDS), the genetic contributors to sudden unexpected infant death (SUID), sudden cardiac death, and sudden unexpected death in epilepsy (SUDEP) remain incompletely described. There is increasing evidence that SDY has a genetic component and postmortem genetic screening can aid in identifying the cause of death. Testing of genes implicated in cardiac rhythm and function identifies a pathogenic or likely pathogenic (P/LP) variant in 10-25% of individuals with sudden unexplained death <40 years old (3, 4). Approximately 4% of SIDS may be due to clinically actionable genetic cardiac causes (5). In a cohort of sudden unexpected death in pediatrics (SUPD), contributory genetic variants were identified in 11% of decedents, and the authors also found an excess burden of rare, damaging SUDEP gene variants compared to a control group (6). We previously used genome sequencing (GS) to analyze an SDY cohort (n=103) covering a larger age range (1-44 years) and found ∼13% carried a pathogenic/likely pathogenic (P/LP) variant in an arrhythmia or cardiomyopathy gene. Younger decedents carried an excess of suspicious variants of uncertain significance (VUS) and P/LP variants compared to a control population. Furthermore, for decedents >2 years of age at death, a younger age was associated with harboring more rare cardiac variants (7). Here, we interrogated the genomes of 211 decedents (≤19 years old) ascertained through the NIH/CDC SDY Registry with a median age at death of 1 year, with an analysis of genes associated with epilepsy, cardiomyopathies and arrhythmias. ## METHODS ### SDY Case Registry The SDY Case Registry was previously described (1). Briefly, the SDY Case Registry aggregated information from 13 participating states and jurisdictions in collaboration with public health agencies, including medical examiner offices. Clinical data were collected via the US National Child Death Review Case Reporting System. The Institutional Review Board (IRB) at the Office of Research Integrity at the Michigan Public Health Institute (MPHI) provided ethical approval for the study (initial approval 2015 with subsequent annual renewals). The MPHI was the central IRB, and for those jurisdictions who did not rely on the Central IRB, local IRBs provided approval for the study. Parents/next of kin provided written informed consent for research use of the decedent’s DNA and linked information with consent to broad, future, unspecified research on their child’s DNA and linked information to create a research repository. The MPHI also served as the Data Coordinating Center (DCC) for the SDY Case Registry. Biospecimens for DNA extraction were obtained at autopsy. All data was deidentified for the analysis performed in this work. #### Genome Sequence (GS) Analysis *Sequencing methods are detailed in the Supplement.* Variants were called with either the Genome Analysis Tool Kit (GATK v3.3.0) using the MegaSeq pipeline or Sentieon joint variant calling software (8, 9, 10). Variant call files were annotated using SnpEff (11). Variants were annotated using ClinVar (accessed March 2022) and global and ancestry-specific allele frequencies using the Genome Aggregation Database (gnomAD) (12). M-CAP was used to annotate variant pathogenicity (13). #### Genetic Evaluation of Ancestry Ancestry was determined using principal component analysis (PCA) conducted using singular-value decomposition of ∼5 million biallelic variants across the genome. PCAs were generated using the first 2 components for global ancestry estimates. Asian ancestry was determined using the 3rd PC. Analyses were performed using PLINK v1.9 and R v4.1. #### Gene Panel Analysis Four previously-established gene panels were analyzed (two epilepsy panels and two arrhythmia and cardiac function panels, eTables 2 and 3 in the Supplement): 1) The Early Infantile Epileptic Encephalopathy (EIEE)- Online Mendelian Inheritance in Man (OMIM) panel (82 genes) derived from the OMIM phenotypic series for early infantile epileptic encephalopathy, a curated epilepsy list; 2) The Epilepsy gene panel (191 genes, overlaps with the EIEE panel), and is a curated list derived from the Invitae Epilepsy Panel (Invitae, San Francisco, CA); 3) CMAR1, CardioMyopathy and Arrhythmia (gene panel including 118 arrythmia and cardiac genes previously described (7); and 4) CMAR2, a Pan-Cardiomyopathy Arrhythmia and Cardiomyopathy Comprehensive gene panel including 143 genes with overlap with the CMAR1 panel (14). Control genes were selected from uniformly distributed housekeeping genes (15, 16). To create burden-matched control gene lists, a burden ratio measurement was calculated for each gene in RefSeq by counting the number of rare variants (MAF ≤ 0.005) found in the gnomAD database over the largest coding transcript of every gene divided by transcript length (12). The mean burden ratio and standard deviation from each list were matched by randomly sampling from the whole genome (excluding genes in the test list) (15, 17). #### Artificial-Intelligence (AI)-Based Variant Prioritization AI-based prioritization and scoring of candidate disease genes and diagnostic conditions were performed using GEM (18) in the commercially available Fabric Enterprise platform (Fabric Genomics, Oakland, CA). Briefly, GEM aggregates data from multiple sources and clinical data sets to identify pathogenic variants relevant to phenotype terms input by the user. The GEM pipeline requires a variant call file, affection status, and human phenotype ontology (HPO) terms for each individual. A log10 Bayes-factor score (GEM score) was generated that calculated the extent of support for a given model using multiple lines of evidence from open-source tools and databases. A GEM score ≥0.69 was used to define a likely deleterious genetic variant, based on the observation this threshold recovered 95% of true positive cases in a cohort of critically ill newborns undergoing rapid genome sequencing (18). #### Control Cohort The control cohort of 211 genome samples derived from the 1000 Genomes Project ([https://www.internationalgenome.org/](https://www.internationalgenome.org/)) that were variant called using the Sentieon pipeline described above against the GRCh37/hg19 human reference genome(19). This control cohort matched the ancestry and sex composition as the SDY cohort. #### Enrichment Analysis Since detailed clinical information was lacking from the SDY cohort, we used HPO terms expected to be associated with SDY: Sudden Cardiac Death (HP:0001645), Sudden Death (HP:0001699), Cardiac arrest (HP:0001695), Cardiomyopathy (HP:0001638), Abnormal QT interval (HP:0031547), and Seizure (HP:0001250). We also performed the analysis using a control root phenotype, Phenotypic Abnormality (HP:0000118) which was used as a normalization factor to control for ontology biases and was expected to behave similarly across different cohorts. GEM analysis was performed for each HPO term with each iteration identifying potentially causal variants for the three phenotypes described. To assess enrichment between cases and the controls a resampling analysis was performed as previously described (17). Briefly, for each investigated gene list (CMAR1, CMAR2, EIEE-OMIM, and Epilepsy Panels) of size N, 100,000 random samples of equal size were drawn from the 18,876 RefSeq genes and intersected with the corresponding GEM list of damaged genes producing the number of damaged genes in the resampled list. Fisher exact test was used to determine if there were significant differences in the number of genes between cases and controls using the permutation test to normalize the data across runs and produce an empirical p-value (17). Multiple comparison correction was performed using the False Discovery Rate (FDR) approach (20). Each null distribution was scaled and centered producing Z-scores to directly compare the p values of different gene sets. #### Age at death burden analysis Age at death dependence was correlated with the cumulative number of nonsynonymous variants in the Epilepsy or CMAR gene panels. Variants were binned by global gnomAD frequency [<0.001, 0.001-0.01, 0.01-0.1, 0.1-0.25, 0.25-0.5]. Multivariate linear models adjusting for ancestry (PCs 1-6) using custom code based on the *lm* function in R 4.1 were regressed to test each bin. P-values were adjusted for multiple comparisons. An extreme value sensitivity test was performed for decedents at the ends of the rare variant distribution (gnomAD <0.001); p values were adjusted for ancestry PCs 1-6. #### Variants of Uncertain Significance (VUS) Analysis The number of nonsynonymous, rare variants (<0.001) found in gnomAD (V2) for each gene in the Epilepsy and CMAR1 gene panels was compiled. To determine expected number of variants in each gene for the gene set used (Epilepsy/CMAR), we created a multinomial model using the observed number of missense variants in the same gene set in gnomAD. We estimated the probability of observing a variant in each gene in the gene set as the observed number of variants in gnomAD by dividing the total number of variants in the gene set in gnomAD. ## RESULTS ### The Sudden Death in the Young (SDY) Cohort In total, 230 decedents from the SDY registry had full genome sequencing (**Figure 1**). Of the 230, 19 were excluded; 4 due to failed inclusion criteria and 15 due to failed sequencing where no additional sample was available. Of the remaining 211, 152 (72%) had partial phenotype data available from the SDY Case Registry (**Table 1**). The median age at death was 1 year [0.33, 0.96] (median, IQR), indicating an enriched sample of cases who died in the first year of life. Genetic ancestry analysis was used to classify SDY decedents as: 42% European, 30% African, 17% Hispanic, and 11% with mixed ancestries. The cohort was 61% male and 39% female (**Table 1**). The SDY Case Registry provided cause of death information by standardized case report forms. The cause of death was included for 152 decedents, including 104 whose death remained unexplained and 48 with a known cause of death reported. Detailed cause of death is noted in **eTable 1** in the Supplement with sudden unexplained death as the most common cited cause (43%). ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2023/03/29/2023.03.27.23287711/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/F1) Figure 1. CONSORT diagram of decedents selected for genome sequencing. GS = genome sequencing. View this table: [Table 1.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T1) Table 1. Demographics of study cohort (n=211) ### Pathogenic/likely pathogenic variants in cardiac and epilepsy genes To identify variants that may have contributed to SDY, we employed gene panels to filter for rare, high impact variants. We hypothesized that the most likely candidates for SDY would be epilepsy, and cardiomyopathy and arrhythmia gene variants. We identified nonsynonymous genetic variants in 161 out of 191 genes for the Epilepsy panel and in 99 out of 118 of the CMAR1 genes (**eTable 2** and 3 in the Supplement). Variants were annotated using ACMG classification and ClinVar interpretations to designate variants as pathogenic (P), likely pathogenic (LP), or variant of uncertain significance (VUS). Within the two gene sets we identified P/LP variants in ∼6% of decedents (11 of 211). In these 11 individuals, the same *TTR* variant appeared in 4 decedents (**eTable 4** in the Supplement). In the epilepsy panel, we identified 4 variants in 3 decedents, but these genes generally associate with autosomal recessive inheritance. Six individuals had at least one P or LP variant in genes from the CMAR1 panel. One variant was identified in the *FKRP* gene which associates with recessive disease. One individual had a variant in the *KCNH2* gene, which is found in both the Epilepsy and CMAR1 panels (**eTable 4** in the Supplement). To identify potentially pathogenic variants in a broader set of genes without constraining to the Epilepsy and CMAR panels, we deployed GEM, an AI tool designed for whole genome-based diagnosis of Mendelian conditions (18), using HPO terms related to seizures and sudden cardiac death. Thirty-eight decedents (18%) harbored ClinVar-designated P/LP variants in genes linked to a Mendelian disorder (13 related to the HPO term seizure and 25 related to sudden cardiac death; **eTable 5** in the Supplement). Four of these decedents harbored compound heterozygous damaging variants in genes associated with recessive disorders, including *CFTR* (Cystic Fibrosis), *BCDHA* (Maple Syrup Urine Disease), *MPDZ* (Congenital Hydrocephalus-2), and *GDAP1* (Charcot-Marie-Tooth Disease, Type 4A). ### Damaging Genetic Variation in Cardiac and Epilepsy Genes was Enriched in the SDY cohort We compared the burden of damaging cardiac and epilepsy genes in SDY cases versus controls, using two CMAR and two Epilepsy gene panels (see Methods). A GEM score ≥0.69 was used to define a likely deleterious genetic variant (see Methods). We also used a known housekeeping gene list and a random sample of genes matched for similar variant burden (16). Figure 2 provides the distributions for the random gene samples (gray bars) and the number of genes identified for enrichment in the gene lists (dark arrows) and the burden-matched controls (light arrows). This analysis revealed an increased burden of genes with damaging variants for both Epilepsy and CMAR1 panels when compared with the control gene set (**Figure 2A** and **B**, respectively). A similar pattern was seen when we analyzed the EIEE-OMIM and CMAR2 gene panels (**e**Figure 1 in the Supplement). The Fisher exact-test p-values FDR adjusted are Epilepsy p<0.027, EIEE-OMIM p<0.019, CMAR1 p<0.001, CMAR2 p<0.001 (**eTable 6** in the Supplement). ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2023/03/29/2023.03.27.23287711/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/F2) Figure 2. **Relative enrichment of damaging variants in epilepsy and Cardiomyopathy/Arrhythmia (CMAR) genes.** The SDY cohort has enriched damaging (A) epilepsy and (B) cardiac gene burden compared to a sex-and ancestry-matched control cohort. Histograms (gray bars) represent distributions of damaged genes (GEM Score >0.69) sampled randomly from RefSeq genes using a root phenotype from the SDY cohort (n=211) (bottom panels) and 1000 Genomes Project Cohort (control) matched for sex and ancestry (n=211) (top panels). Damaged genes identified in the Epilepsy (n=191 genes, green) **(A)** and CMAR1 (n=118 genes, orange) **(B)** gene lists were significantly different between the SDY and control cohorts (dark arrows, epilepsy, p=0.027; cardiac p<0.001). Light arrows represent the number of damaged genes identified in a control housekeeping gene set. **(C)** The results in Figure 1A and B and eFigure 1 were Z-score transformed to make the gene list findings comparable. The plot reveals a considerable enrichment in both cardiac and epilepsy gene lists (+8.0- +13.0 SDs). Black line represents the expected normal distribution. To compare the findings among the distributions shown in **Figure 2A** and **B**, each distribution was normalized by Z-score transformation (Figure 2C). The standardized Z-score values of the enrichment observed in the gene panels (Figure 2C) exceeded the expected normal distribution (shown in black), indicating a strong enrichment for damaging variation in Epilepsy and CMAR genes in the SDY cohort (+8.0-+13.0 standard deviations). ### Rare Variants in Epilepsy Genes Correlated with Age at Death The enrichment of predicted damaging variation in Epilepsy and CMAR genes and the few decedents with P/LP variants, led us to determine if there is an association between rare Epilepsy and CMAR variants and age at death. We aggregated variants across five gnomAD allele frequency bins (<0.001, 0.001-0.01, 0.01-0.1, 0.1-0.25, 0.25-0.5) for the Epilepsy and CMAR1 gene panels. We found that having more rare, nonsynonymous variants in Epilepsy genes was associated with younger age at death. Nonsynonymous Epilepsy variants with an allele frequency <0.001 were significantly associated with age at death, (p=0.0053, adjusted for ancestry) (Figure 3A and **eTable 7** in the Supplement). Results remained significant (p=0.026) following correction for multiple hypothesis testing for all 5 frequency bins. To confirm that these data were not driven by a few outliers, we performed a sensitivity analysis that supported the results of the main analysis (**eTable 7** in the Supplement). A similar analysis of rare, nonsynonymous variants in the CMAR1 gene panel did not show an association for age at death in any frequency bin (Figure 3B). ![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2023/03/29/2023.03.27.23287711/F3.medium.gif) [Figure 3.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/F3) Figure 3. **Number of rare, nonsynonymous, epilepsy gene variants correlated with younger age at death**. The age at death was plotted against against the number of rare variants (gnomAD allele frequency <0.001) in either the **(A)** Epilepsy or **(B)** CMAR1 gene list. Number of variants in the epilepsy gene list significantly correlated with age at death, (p=0.0053, adjusted for the first 6 principal components of ancestry). A similar analysis of rare, predicted damaging variation in the CMAR1 genes did not show association with age at death (p=0.85, p value adjusted for ancestry). Inset is a histogram showing number of decedents <1 year old by variant burden, box plots (black) represent decedent age distribution for each variant burden. Dark gray line = regression line of unadjusted model; dark gray shading represents 95% confidence intervals. ## Rare VUS in Epilepsy and Cardiac Genes were enriched in the SDY cohort We next considered if variants of uncertain significance (VUS) might associate with SDY. VUS are rare variants about which there is insufficient information as to whether they are pathogenic or benign, and the major determinant of VUS status is rare population frequency. Figure 4 plots the number of rare, nonsynonymous variants (<0.001 gnomAD allele frequency) not reported as pathogenic or benign in ClinVar. Variants were scored using the *in silico* predictor M-CAP and scored as damaging using an M-CAP score >0.025. We plotted total variants, damaging variants, expected number of variants based on gnomAD distribution and age at death (3rd quartile) of each decedent, and we found that genes in both Epilepsy and CMAR1 panels had greater than expected numbers of variants. Some genes had very few damaging variants, while others had more than expected. Those genes with greater than expected variation may potentially drive or modify risk for SDY. ![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2023/03/29/2023.03.27.23287711/F4.medium.gif) [Figure 4.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/F4) Figure 4. **Rare VUS in Epilepsy and CMAR Genes.** The number of rare (<0.001 gnomAD allele frequency) variants not classified as pathogenic or benign in ClinVar are reported for genes from either the **(A)** Epilepsy or **(B)** CMAR1 gene list (light gray bars). Variants were scored using the *in silico* tool M-CAP. The number of damaging VUS for each gene (>0.025 M-CAP score) is shown by the dark gray bars. The gold line indicates the expected number of variants based on gnomAD allele frequency data. Gene lists are ranked by the difference between the observed and expected number of variants, when ratios are approximately the same, ranking is determined by number of damaging gene variants. ## DISCUSSION This SDY cohort is unique for its very young age (mean age 1 year) and its diverse ancestry. In the SDY cohort, we identified P/LP variants in Epilepsy and CMAR genes in ∼6% of the cohort. An unrestrained AI-driven analysis identified 42 P/LP variants in 38 decedents suggesting that a purely gene panel-based approach may be diagnostically limited. Previous studies identified that the prevalence of P/LP gene variants is less in younger cohorts (3, 4, 5, 21), consistent with findings in our young cohort. In adults, the prevalence is likely 15-20%; whereas, in infants, our data agrees with others that the prevalence is < 10% when cardiomyopathy and arrhythmia genes are considered with a panel approach. Analysis of 278 SIDS cases for ultrarare variants in noncardiac, SIDS-susceptibility genes did not identify a monogenic basis for SIDS, even when using pathway burden analyses (22). A recent study of SDY examined decedents >1 to 44 years of age, and specifically excluded the group under 1 year of age (7). A similar burden analysis found more CMAR1 gene variants and an association between gene variant burden and younger age at death (7). There is evidence that epilepsy-related mechanisms associate with SDY pathogenesis (23, 24). This younger SDY cohort, with a median age of 1 year at time of death, displayed enrichment of both epilepsy and cardiac damaging variants, but only the epilepsy gene variant burden correlated with younger age at death. These data are in line with another study showing an increased burden in epilepsy, cardiac and metabolic genes with an excess of rare variants in all three gene classes in SUDP cases ≤1 year old (6). ## CONCLUSIONS Together, these findings support epilepsy etiologies for SDY, particularly in infants with a broader genetic contribution including cardiac and epilepsy variants in those >1 year of age. Even more relevant, the genetic associations for SDY appear to derive less from single gene P/LP variants and instead correlate with an aggregation of potentially damaging variants within an individual genome that predispose to SDY. ## Data Availability Both phenotype (where available) and sequence data will be available via dbGAP after publication or by reasonable request to the corresponding author. Relevant data and scripts will be made available upon request to the corresponding author. ## DECLARATIONS ### ETHICS APPROVAL AND CONSENT TO PARTICIPATE We attest that data included in this manuscript was conducted in a manner consistent with principles of research ethics per the standards of the Belmont Report. This research was performed with the voluntary, informed consent of the next of kin of the decedent, free from coercion. Registry activities involving biospecimen collection and consent of surviving family members for research were approved by the institutional review boards at the Data Coordinating Center and participating states/jurisdictions. ### CONSENT FOR PUBLICATION NA #### AVAILABILITY OF DATA AND MATERIALS Both phenotype (where available) and sequence data will be available via dbGAP after publication or by reasonable request to corresponding author. Relevant data and scripts will be made available upon request to the corresponding author. ## COMPETING INTERESTS MY serves as consultant to Fabric Genomics Inc. and has received consulting fees and stock grants from Fabric Genomics Inc. EMM is a consultant for Amgen, Avidity, AstraZeneca, Cytokinetics, PepGen, Pfizer, Tenaya Therapeutics, Stealth BioTherapeutics, Invitae, and is the founder of Ikaika Therapeutics. ## FUNDING This work was supported by grants from the National Institutes of Health (U01HL131914, U01HL131698, HL128075, and U01HL131911), the American Heart Association Strategically Focused Research Network on Arrhythmia and Sudden Cardiac Death and Career Development Award (MJP). This study was made possible by the Sudden Death in the Young Case Registry, a collaboration between the National Heart, Lung, and Blood Institute and National Institute of Neurologic Disorders and Stroke of the National Institutes of Health and the Centers for Disease Control and Prevention, and its Data Coordinating Center at the Michigan Public Health Institute and biorepository at the University of Michigan. ## AUTHOR CONTRIBUTIONS MJP, GW, LMDC, KMB, MTF, ALG, and EMM conceptualized the project. MJP, LMM, EJH, GW,SDK, MY developed methods including scripts for data analysis. MJP, LLP, GW, LMDC, SDK, FKF, HM, KMB participated in data curation. MJP, LLP, EJH, performed formal analysis of the data. MJP, LLP, EJH, GW, LMDC, FKF, HM, KMB conducted experiments through data collection and in silico analysis. TOM, TDP validated data and experiments. MJP was responsible for the first draft of the manuscript. LLP, EJH, GW, LMDC, SPE, PJK, MBS, DF, DMR, KMB, HM, MY, MTF, ALG and EMM carefully reviewed data and experiments and edited the manuscript. PJK, HM, KMB coordinated collection of decedent samples and information. MJP, PJK MTF, ALG and EMM secured funding. ## SUPPLEMENT ### Supplemental Methods WGS was performed on DNA obtained from the SDY Case Registry using an Illumina XTen sequencer (Garvan Institute of Medical Research NSW, Australia) with a yield of >100 GB sample, correlating to >30-fold coverage across the genome. The Burrows- Wheeler Aligner was used to align reads to human reference sequence GRCh37/hg19 (28). ![Supplemental Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2023/03/29/2023.03.27.23287711/F5.medium.gif) [Supplemental Figure 1.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/F5) Supplemental Figure 1. The SDY cohort had enriched damaging (A) epilepsy and (B) CMAR2 gene burden compared to a sex- and ancestry-matched control cohort. Histograms (gray bars) represent distributions of damaged genes (GEM Score >0.69) sampled randomly from RefSeq genes using a root phenotype from the SDY cohort (n=211) (bottom panels) and 1000 Genomes Project Cohort (control) matched for sex and ancestry (n=211) (top panels). Damaged genes identified in the Epilepsy EIEE (n=82 genes, green) (A) and CMAR2 (n=143 genes, orange) (B) gene lists were significantly different between the SDY and control cohorts (dark arrows, epilepsy, p=0.019; cardiac p<0.001). Light arrows represent the number of damaged genes identified in a control gene set. View this table: [Supplemental Table 1.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T2) Supplemental Table 1. Detailed Cause of Death View this table: [Supplemental Table 2.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T3) Supplemental Table 2. Epilepsy Gene panels View this table: [Supplemental Table 3.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T4) Supplemental Table 3. Cardiomyopathy and Arrhythmia Gene panels View this table: [Supplemental Table 4.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T5) Supplemental Table 4. Pathogenic and Likely Pathogenic Variants Identified in Epilepsy and Cardiac Genes View this table: [Supplemental Table 5.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T6) Supplemental Table 5. Pathogenic and Likely pathogenic, Mendelian variants as ranked by GEMS View this table: [Supplemental Table 6.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T7) Supplemental Table 6. Enrichment of variants in epilepsy or cardiac genes in the SDY cohort compared to a ancestry and sex matched 1000 genomes cohort. View this table: [Supplemental Table 7.](http://medrxiv.org/content/early/2023/03/29/2023.03.27.23287711/T8) Supplemental Table 7. Linear regression of age at death against number of rare epilepsy variants*. ## ACKNOWLEDGEMENTS The views expressed in this manuscript are those of the authors and do not reflect official positions of the National Heart, Lung, and Blood Institute, the National Institutes of Health, or the United States Department of Health and Human Services. This work used Stampede2 at The Texas Advanced Computing Center (TACC) through allocation XRAC - BIO220012 (PI MJP) from the Extreme Science and Engineering Discovery Environment (XSEDE), which was supported by National Science Foundation grant number #1548562 (25). We thank Victor EijKhout for his assistance with porting and optimization, which was made possible through the XSEDE Extended Collaborative Support Service (ECSS) program (26). We also thank Justin Wozniak for his support in installing and optimizing our swift-t workflow (27). *The information that is the basis of this presentation and/or publication was provided by the NCRPCD/DCC, which is funded in part by the US Centers for Disease Control and Prevention Division of Reproductive Health (CDC) and the Health Resources and Services Administration. The data is part of the Sudden Death in the Young (SDY) Case Registry, which is funded by the National Heart, Lung, and Blood Institute and the National Institute of Neurologic Disorders and Stroke of the National Institutes of Health (NIH) and the Centers for Disease Control and Prevention. The contents are solely the responsibility of the authors and do not necessarily represent the official views of NCRPCD, CDC, NIH or the participating states*. ## LIST OF ABBREVIATIONS *SDY* : Sudden death in the young NIH : National Instistutes of Health CDC : Center for Disease Control SIDS : Sudden infant death syndrome SUID : Sudden unexprected infant death SUDEP : Sudden unexplained death in epilepsy P/LP : Pathogenic or likely pathogenic SUDP : Sudden unexpected death in pediatrics GS : Genome sequencing VUS : Variant of unknown significance gnomAD : Genome aggregation database PCA : Principal component analysis EIEE : Early infantile epileptic encephalopathy OMIM : Online Mendelian inheritance in man CMAR1 : CardioMyopathy and Arrhythmia panel 1 CMAR2 : CardioMyopathy and Arrhythmia panel 2 AI : Artificial intelligence HPO : Human phenotype ontology FDR : False discovery rate * Received March 27, 2023. * Revision received March 27, 2023. * Accepted March 29, 2023. * © 2023, Posted by Cold Spring Harbor Laboratory The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission. ## REFERENCES 1. 1.Burns KM, Cottengim C, Dykstra H, Faulkner M, Lambert ABE, MacLeod H, et al. Epidemiology of Sudden Death in a Population-Based Study of Infants and Children. J Pediatr X. 2020;2:100023. 2. 2.Burns KM, Cottengim C, Dykstra H, Faulkner M, Lambert ABE, MacLeod H, et al. Epidemiology of Sudden Death in a Population-Based Study of Infants and Children. J Pediatr X. 2020;2. 3. 3.Lahrouchi N, Raju H, Lodder EM, Papatheodorou E, Ware JS, Papadakis M, et al. Utility of Post-Mortem Genetic Testing in Cases of Sudden Arrhythmic Death Syndrome. J Am Coll Cardiol. 2017;69(17):2134–45. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6MzoiUERGIjtzOjExOiJqb3VybmFsQ29kZSI7czo0OiJhY2NqIjtzOjU6InJlc2lkIjtzOjEwOiI2OS8xNy8yMTM0IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjMvMDMvMjkvMjAyMy4wMy4yNy4yMzI4NzcxMS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 4. 4.Bagnall RD, Weintraub RG, Ingles J, Duflou J, Yeates L, Lam L, et al. A Prospective Study of Sudden Cardiac Death among Children and Young Adults. N Engl J Med. 2016;374(25):2441–52. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1510687&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27332903&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 5. 5.Tester DJ, Wong LCH, Chanana P, Jaye A, Evans JM, FitzPatrick DR, et al. Cardiac Genetic Predisposition in Sudden Infant Death Syndrome. J Am Coll Cardiol. 2018;71(11):1217–27. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6MzoiUERGIjtzOjExOiJqb3VybmFsQ29kZSI7czo0OiJhY2NqIjtzOjU6InJlc2lkIjtzOjEwOiI3MS8xMS8xMjE3IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjMvMDMvMjkvMjAyMy4wMy4yNy4yMzI4NzcxMS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 6. 6.Koh HY, Haghighi A, Keywan C, Alexandrescu S, Plews-Ogan E, Haas EA, et al. Genetic Determinants of Sudden Unexpected Death in Pediatrics. Genet Med. 2022;24(4):839–50. 7. 7.Webster G, Puckelwartz MJ, Pesce LL, Dellefave-Castillo LM, Vanoye CG, Potet F, et al. Genomic Autopsy of Sudden Deaths in Young Individuals. JAMA Cardiol. 2021;6(11):1247–56. 8. 8.McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiZ2Vub21lIjtzOjU6InJlc2lkIjtzOjk6IjIwLzkvMTI5NyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIzLzAzLzI5LzIwMjMuMDMuMjcuMjMyODc3MTEuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 9. 9.Puckelwartz MJ, Pesce LL, Nelakuditi V, Dellefave-Castillo L, Golbus JR, Day SM, et al. Supercomputing for the parallelization of whole genome analysis. Bioinformatics. 2014;30(11):1508–13. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btu071&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24526712&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 10. 10.Freed D, Aldana R, Weber JA, Edwards JS. The Sentieon Genomics Tools - A fast and accurate solution to variant calling from next-generation sequence data. bioRxiv. 2017:115717. 11. 11.Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1101/2021.03.09.21252822&link_type=DOI) 12. 12.Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alfoldi J, Wang Q, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581(7809):434–43. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41586-020-2308-7&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32461654&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 13. 13.Jagadeesh KA, Wenger AM, Berger MJ, Guturu H, Stenson PD, Cooper DN, et al. M- CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nat Genet. 2016;48(12):1581–6. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.3703&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27776117&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 14. 14.Dellefave-Castillo LM, Cirino AL, Callis TE, Esplin ED, Garcia J, Hatchell KE, et al. Assessment of the Diagnostic Yield of Combined Cardiomyopathy and Arrhythmia Genetic Testing. JAMA Cardiol. 2022;7(9):966–74. 15. 15.Eilbeck K, Quinlan A, Yandell M. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet. 2017;18(10):599–612. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nrg.2017.52&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28804138&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 16. 16.Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet. 2013;29(10):569–74. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.tig.2013.05.010&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23810203&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000325300800005&link_type=ISI) 17. 17.Watkins WS, Hernandez EJ, Wesolowski S, Bisgrove BW, Sunderland RT, Lin E, et al. De novo and recessive forms of congenital heart disease have distinct genetic and phenotypic landscapes. Nat Commun. 2019;10(1):4722. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41467-019-12582-y&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 18. 18.De La Vega FM, Chowdhury S, Moore B, Frise E, McCarthy J, Hernandez EJ, et al. Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases. Genome Med. 2021;13(1):153. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s13073-021-00965-0&link_type=DOI) 19. 19.Auton A, Abecasis GR, Altshuler DM, Durbin RM, Abecasis GR, Bentley DR, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nature15393&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26432245&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) 20. 20.Benjamini Y, Hochberg Y. On the Adaptive Control of the False Discovery Rate in Multiple Testing with Independent Statistics. Journal of Educational and Behavioral Statistics. 2000;25(1):60–83. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3102/10769986025001060&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000087561900004&link_type=ISI) 21. 21.Steinberg C, Padfield GJ, Champagne J, Sanatani S, Angaran P, Andrade JG, et al. Cardiac Abnormalities in First-Degree Relatives of Unexplained Cardiac Arrest Victims: A Report From the Cardiac Arrest Survivors With Preserved Ejection Fraction Registry. Circ Arrhythm Electrophysiol. 2016;9(9). 22. 22.Perrone S, Lembo C, Moretti S, Prezioso G, Buonocore G, Toscani G, et al. Sudden Infant Death Syndrome: Beyond Risk Factors. Life (Basel). 2021;11(3). 23. 23.Brownstein CA, Goldstein RD, Thompson CH, Haynes RL, Giles E, Sheidley B, et al. SCN1A variants associated with sudden infant death syndrome. Epilepsia. 2018;59(4):e56–e62. 24. 24.Halvorsen M, Petrovski S, Shellhaas R, Tang Y, Crandall L, Goldstein D, et al. Mosaic mutations in early-onset genetic diseases. Genet Med. 2016;18(7):746–9. 25. 25.Towns J, Cockerill T, Dahan M, Foster I, Gaither K, Grimshaw A, et al. XSEDE: Accelerating scientific discovery. Computing in Science and Engineering. 2014;16(5):62–74. 26. 26.Wilkins-Diehr N, Sanielevici S, Alameda J, Cazes J, Crosby L, Pierce M, et al., editors. An overview of the XSEDE extended collaborative support program. Communications in Computer and Information Science; 2016. 27. 27.Wozniak JM, Armstrong TG, Wilde M, Katz DS, Lusk E, Foster IT, editors. Swift/T: Scalable data flow programming for many-task applications. Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP; 2013. 28. 28.Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btp324&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19451168&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F29%2F2023.03.27.23287711.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000267665900006&link_type=ISI)