Abstract
Congenital heart defects (CHDs) are the most common structural birth defect and are present in 40-50% of children born with Down syndrome (DS). To characterize the genetic architecture of DS-associated CHD, we sequenced genomes of a multiethnic group of children with DS and a CHD (n=886: atrioventricular septal defects (AVSD), n=438; atrial septal defects (ASD), n=122; ventricular septal defects (VSD), n=170; other types of CHD, n=156) and DS with a structurally normal heart (DS+NH, n=572). We performed four GWAS for common variants (MAF>0.05) comparing DS with CHD, stratified by CHD-subtype, to DS+NH controls. Although no SNP achieved genome-wide significance, multiple loci in each analysis achieved suggestive significance (p<2×10−6). Of these, the 1p35.1 locus (near RBBP4) was specifically associated with ASD risk and the 5q35.2 locus (near MSX2) was associated with any type of CHD. Each of the suggestive loci contained one or more plausible candidate genes expressed in the developing heart. While no SNP replicated (p<2×10−6) in an independent cohort of DS+CHD (DS+CHD: n=229; DS+NH: n=197), most SNPs that were suggestive in our GWASs remained suggestive when meta-analyzed with the GWASs from the replication cohort. These results build on previous work to identify genetic modifiers of DS-associated CHD.
Introduction
Congenital heart defects (CHDs) are a heterogenous group of structural malformations that arise during embryonic heart development. They affect 8-10 of every 1000 live births and account for 25% of infant mortality (Boneva et al., 2001; Hoffman, 2013; Shuler et al., 2013), making them the most common structural birth defect in humans. CHDs have a heterogeneous etiology that includes genetic and environmental factors. One of the largest established contributors to CHD pathogenesis are chromosomal abnormalities (Hartman, Rasmussen, et al., 2011; Zaidi & Brueckner, 2017). It is estimated that 9-18% of CHD cases occur in infants with an aneuploidy, most often trisomy 21 (Abdulla, 1998; Bosi et al., 2003; Dadvand et al., 2009; Hartman, Rasmussen, et al., 2011; Schellberg et al., 2004; Zaidi & Brueckner, 2017). Trisomy 21, also known as Down syndrome (DS), is most often the consequence of nondisjunction during meiosis, resulting in an extra copy of chromosome 21. About 40-50% of infants with DS have a CHD (Freeman et al., 2008). The most common type of CHDs in this population are septal defects, the most prevalent being the atrioventricular septal defect (AVSD), followed by the ventricular septal defect (VSD) and the atrial septal defect (ASD) (Freeman et al., 2008).
AVSD is a severe form of CHD. It arises when the endocardial cushions fail to fuse during fetal valvulogenesis. The resulting defect is due to the combined effects of an ASD, VSD, and abnormalities of the mitral or tricuspid valves. In the euploid population, AVSD affects only 1 in 10,000 infants (Hartman, Riehle-Colarusso, et al., 2011), making it challenging to accumulate large cohorts needed for genetic studies. As such, little is known about the genetic architecture and etiology of AVSD. However, AVSD affects 1 in 5 infants with Down syndrome (DS) (Hartman, Riehle-Colarusso, et al., 2011), thus allowing more feasible assembly of cohorts for gene discovery. The 2000-fold increase in DS-associated AVSD compared to the euploid population suggests that aneuploidy of chromosome 21 is a principal risk factor for AVSD. However, since nearly half of infants with DS have a structurally normal heart, the extra copy of chromosome 21 alone is insufficient to cause AVSD or even abnormal cardiac development in general, suggesting that there are other factors that could modify this risk.
Many studies have previously sought to identify common variants associated with CHD and the various subtypes in euploid (Agopian et al., 2014; Cordell, Bentham, et al., 2013; Cordell, Topf, et al., 2013; Hu et al., 2013; Lahm et al., 2021) and DS populations (Ramachandran, Mulle, et al., 2015; Ramachandran, Zeng, et al., 2015; Sailani et al., 2013) with mixed results. A recent GWAS that studied over 4,000 cases with CHD and over 8,000 controls identified 20 genome-wide significant loci (Lahm et al.). As observed in prior studies, few loci could be identified when different CHD subtypes were combined. However, stratification into subtypes resulted in multiple genome-wide significant loci per group.
In DS, there has not been strong evidence that common variants contribute to CHD risk, although these studies were only powered for variants of relatively large effect. In 187 cases with DS and AVSD, ASD, or VSD and 151 controls with DS and structurally normal hearts, Sailani et al. identified no genome-wide significant loci when all CHD types were combined into a single analysis (Sailani et al.). Ramachandran et al. performed a genome-wide association study (GWAS) in 210 DS+AVSD cases and 242 DS+NH controls (Ramachandran, Zeng, et al., 2015) and also found no variants that exceeded genome wide significance, though they identified four non-chromosome 21 loci with nominal evidence of association (Ramachandran, Zeng, et al., 2015). Additionally, both Sailani et al. and Ramachandran (Ramachandran, Mulle, et al., 2015) found evidence that copy-number variants can contribute to AVSD risk in DS. In a comprehensive study of CNVs on chromosome 21, no evidence for their role in DS+AVSD was found (Rambo-Martin et al., 2018).
These studies show that phenotypic heterogeneity in CHD could negatively influence gene discovery efforts and that there may be etiologic heterogeneity and subtype-specific associations that can only be found in stratified analyses. Consistent with this, Ripoll et al., were able to distinguish the transcriptomes of lymphoblastoid cell lines established from individuals with DS with an AVSD and those with DS and an ASD or VSD, suggesting that there may be different molecular causes for different CHD subtypes (Ripoll et al., 2012). In this study, we focus on common variants (MAF > 5%) as modifiers of CHD risk in DS. Here we report the results of the largest GWAS of CHD in DS to date comparing individuals with DS and CHD (discovery: n=886; replication: n=229) to individuals with DS and structurally normal hearts (discovery: n=572; replication: n=197).
Methods
Study Subjects and Phenotyping
Discovery Cohort
Samples for this cohort were collected from four existing cohorts: the DS360 project at Emory University; the Human Trisome Project (HTP) at the Linda Crnic Institute for Down Syndrome at the University of Colorado; the Children’s Oncology Group (COG); and the Pediatric Cardiac Genomic Consortium (PCGC). Samples from individuals with Down syndrome and their unaffected parents were collected in accordance with ethical standards of human subjects’ research and with informed consent or appropriate assent, following approval by the Institutional Review Boards of the participating institutions. About 78% of samples are of predominantly European ancestry, 8% black or African American, 1% Asian ancestry and 0.5% American Indian. We collected data on the CHD phenotype from clinician and/or parent reports or medical records. Detailed CHD descriptions were collapsed into five broad categories: atrioventricular septal defect (AVSD; n=438) that included complete and incomplete AVSD, atrial septal defect (ASD; n=122), ventricular septal defect (VSD; n=170), other types of CHD (n=156), and structurally normal heart (NH; n=672). Individuals with only a patent ductus arteriosus (PDA) or patent foramen ovale (PFO) were grouped with normal heart for the purposes of this analysis.
Replication Cohort
The samples for this independent cohort were from the Oxford Down Syndrome Cohort Study (ODSCS) and includes 436 children with DS who were collected prospectively from birth, enrolled across 18 hospitals in the UK between 2006 and 2012, and with serial clinical and hematological data from birth to age 4 years (Roberts I, 2017). The majority (64%) of newborns are of predominantly European ancestry, with smaller numbers with African (12%) or Asian (14%) ancestry. CHD phenotypes were coded into the same five categories: AVSD (n=77), ASD (n=72), VSD (n=37), other CHD (n=14), and NH (n=201). Parents gave written informed consent in accordance with the Declaration of Helsinki. The study was approved by the Thames Valley Research Ethics Committee (06MRE12-10; NIHR portfolio no. 6362).
Whole Genome Sequencing (WGS) and Variant Calling
For both Discovery and Replication cohorts, paired-end whole genome sequencing (WGS) was completed by the Broad Institute as part of funded NIH INCLUDE and Gabriella Miller Kids First projects on an Illumina HiSeqX with a target read depth of ∼30x coverage. The data were aligned to hg38 and harmonized by the Kids First Data Research Center via Cavatica using a custom pipeline based on GATK best practices (Van der Auwera et al., 2013).The GATK genotyping workflow included base quality score recalibration (BQSR), simultaneous calling of SNPs and indels using single-sample variant calling (HaplotypeCaller), multiple-sample joint variant calling, and finally refinement with variant quality score recalibration (VQSR) and filtering of called variants. Kids First DRC pipelines are open source and made available to the public via GitHub (Alignment workflow: https://github.com/kids-first/kf-alignment-workflow and Joint genotyping workflow: https://github.com/kids-first/kf-jointgenotyping-workflow)
Quality Control Steps
Discovery Cohort
Quality control (QC) was performed on 2,394 samples and 98,049,632 variants using VCFtools v0.1.13. Variant calls with genotype quality (GQ) < 20 or depth (DP) < 10 were set to missing. Variants were then filtered to remove flagged sites (filter flag not equal to “PASS”), multiallelic sites, monomorphic sites (minor allele count (MAC) < 1) and sites with > 3% missingness, leaving 74,841,842 variants. Samples were dropped for quality metrics outside of three standard deviations from the mean for missingness, theta, transition/transversion (Ts/Tv) ratio, silent/replacement ratio, and heterozygous/homozygous ratio. Sex and family relationships were confirmed by X chromosome heterozygosity and identity-by-descent analyses in PLINK v1.9. The final analysis dataset consisted of 2,344 samples. Variants were then excluded from analysis if they had MAF < 5%, deviated from Hardy-Weinberg (p-value < 1×10−7), or had site missingness > 5%.
Replication Cohort
QC was performed on 436 samples and 51,213,158 variants using VCFtools v0.1.14 and PLINK v1.9 & v2.0. Variants were filtered to exclude non-autosomal and chromosome 21 variants, multiallelic variants, those with missingness >5%, significant differential missingness between cases and controls, variants deviating from Hardy-Weinberg, and those with MAF less than 1%. Five samples were removed from the analyses following sex checks and relationship checks (only unrelated samples were retained), and low call rates (>10% missing data). Related samples were identified if the kinship coefficient was greater than 0.177 (monozygotic twins, duplicate, first-degree relatives). Sex was inferred using X chromosome inbreeding coefficient F, where smaller than 0.25 indicated female and greater than 0.8 indicated male. An additional five samples were missing information on CHD and were not included in the GWAS. The top 20 principal components were calculated using pruned SNPs. After filtering, replication datasets included 12,601,590, 12,291,231, 12,726,561, and 12,214339 variants respectively for further AVSD, ASD, VSD, and any CHD vs. NH case-control analyses.
Statistical Analyses
Discovery GWAS
We performed four within-DS case-control association tests using a logistic regression with an additive model, including sex and the first 7 principal components of ancestry as covariates in Plink v 2.00a2.3LM. The top 30 principal components were calculated in Plink v 2.00a2.3LM using a list of pruned SNPs. Each group of “cases” (DS+AVSD, n=438; DS+ASD, n=122; DS+VSD, n=170; and any DS+CHD, n=886) were contrasted with DS+NH controls (n=572). All discovery GWASs included 6,095,966 variants. Variants on chromosome 21 were excluded because genotype calls on this chromosome require special treatment due to their triploid state.
Replication GWAS
GWAS were performed for AVSD (n=77), ASD (n=72), VSD (n=37), and any CHD (n=229) vs. NH (n=197) using logistic regression with an additive model, including sex and first 10 principal components of ancestry as covariates in Plink v2.00a3.7LM.
Meta-analysis
Summary statistics from the discovery and replication GWASs were used to perform a fixed effects meta-analysis on all variants present in both the discovery and replication cohorts (Any CHD: 5,600,511 variants; AVSD: 5,595,672 variants; ASD: 5,598,098 variants; VSD: 5,594,911 variants). Meta-analysis was implemented in PLINK v 1.90b5.3.
Results
We sought to identify common variants associated with various types of CHD in DS. First, we performed a GWAS for any DS-associated CHD by comparing individuals with DS and any type of CHD (any DS+CHD) to “controls” with DS and a structurally normal heart (DS+NH). We also performed stratified subtype-specific GWAS for AVSD, ASD, and VSD to unmask SNPs associated with a specific type of CHD that may otherwise be undetected by combining all CHD types into a single analysis. For all four discovery analyses, no SNP achieved genome wide significance (pD < 5×10−8), but many reached suggestive evidence of association (pD < 2×10−6) (Supplemental Table 1). We also performed GWASs in an independent replication cohort and combined the results for the discovery and replication studies by meta-analysis. For all replication GWASs, no SNP that had suggestive evidence of association in the discovery GWASs achieved p-values with nominal significance in replication analyses (pR < 2×10−6), but for several the direction and magnitude of effects were consistent and the p-values in the meta-analysis (pM) were more significant than in the discovery GWAS alone. QQ-plots are shown in Supplemental Figure 1 and a comparison of suggestive SNPs from the discovery GWASs to the replication GWASs and meta-analyses are listed in Supplemental Table 2. In the next sections, we describe the major results for each GWAS.
DS+CHD
In the any DS+CHD discovery GWAS, we identified two regions with suggestive evidence of association: 5q11.2 and 5q35.2 (Figure 1). At 5q11.2, the lead SNP was rs11747003 (pD = 1.78×10−6, OR = 0.58), located within an intron of ITGA1. Although not significant in the replication (pR = 0.153, OR = 0.73), the meta-analysis p-value improved (pM = 9.74×10−7 OR = 0.60). The lead SNP at 5q35.2 was rs76104490 (pD = 7.67×10−8, OR = 1.97). This locus is located near MSX2, a homeobox protein. This SNP did not achieve suggestive significance in the replication GWAS (pR = 0.625, OR = 1.12), but remained suggestive in the meta-analysis (Pm = 7.46×10−7, OR = 1.73).
The top panels show locus zoom plots for the lead SNP in each region with the left y-axis displaying the -log10(p-value) and the right y-axis displaying the recombination rate. Points are colored according to their strength of linkage disequilibrium with the lead SNP (purple diamond). The bottom panels show the odds ratio with a 95% confidence interval for the lead SNP in each region from four of the GWAS’: AVSD+DS (blue), ASD+DS (pink), VSD+DS (purple), anyCHD+DS (black).
DS+AVSD
In addition to 5q35.2 (described above), three loci were nominated by the AVSD discovery GWAS: 12q21.2, 13q21.33, and 19q13.3 (Figure 2 A-D, top). Three SNPs were suggestive at the 12q21.2 locus (lead SNP rs7956838; pD = 9.76×10−7, OR = 0.57), about 84 kb from NAV3, a neuronal regulator (Lv et al., 2022). SNPs in this region had p-values between 2.26×10−5 and 8.37×10−5 in the meta-analysis. The 13q21.33 locus (lead SNP rs9592842; pD = 1.64×10−6, OR = 1.69) contains multiple genes predicted to be actively transcribed during heart development based on chromatin state segmentation data from human embryonic hearts (VanOudenhove et al., 2020). The lead SNP at this locus was nominally significant in the meta-analysis (pM = 1.45×10−6, OR = 1.63). Lastly, there were nine suggestive SNPs at 19q13.3, The strongest signal was rs4807577 (pD = 1.22×10−6, OR = 1.92), approximately 2 kb from FSD1, a centrosome-associated protein, and overlaps a known embryonic heart enhancer (Supplemental Figure 2) (Whyte et al., 2013).
The top panels show locus zoom plots for the lead SNP in each region with the left y-axis displaying the -log10(p-value) and the right y-axis displaying the recombination rate. Points are colored according to their strength of linkage disequilibrium with the lead SNP (purple diamond). The bottom panels show the odds ratio with a 95% confidence interval for the lead SNP in each region from four of the GWAS’: AVSD+DS (blue), ASD+DS (pink), VSD+DS (purple), anyCHD+DS (black).
Two additional loci (11p15.4 and 15q36.3) reached suggestive significance in the replication GWAS which did not in the discovery GWAS. The lead SNP at the 11p15.4 locus, rs2179 (pR = 1.87×10−06, OR = 3.25), is located within an intron of TRIM22. The lead SNP (rs7169638) at the 15q36.3 locus approached genome-wide significance (pR = 7.44×10−8, OR = 4.15). SNPs at this locus are in the vicinity of embryonic heart enhancers, but the nearby genes (MEX2B, ELF1) are not yet known to be involved in human heart conditions.
DS+ASD
In the DS+ASD discovery, we identified six loci: 1p35.1, 1q41, 8p11.21, 10p11.21, 10q26.3, and 12q13.12 (Figure 3 A-F). Two SNPs were suggestive within 1p35.1. The lead SNP was rs75999698 (pD = 3.32×10−7, OR = 3.15), 365 bp from RBBP4, a histone binding subunit. This locus was the only one to maintain suggestive significance in the meta-analysis (pM = 1.87×10−6, OR = 2.45). The 1q41 signal was in an intergenic region between USH2A and ESRRG (lead SNP rs34506231, pD = 1.98×10−6, OR = 0.42). At 8p11.21, there was one suggestive SNP, rs12545518 (pD = 6.46×10−7, OR = 3.44), near KAT6A. At locus 10p11.21, 13 SNPs had p-value < 2×10−6, including lead SNP rs72794622 (pD = 1.93×10−6, OR = 3.61). This locus is 938 kb from FZD8, a Wnt receptor expressed in the heart and associated with hypertrophic cardiomyopathy (Bergmann, 2010). Two SNPs were suggestive at the 10p26.3 locus (lead SNP rs12415529; pD = 9.04×10−7, OR =3.09). While there are many genes at this locus, none are known to be associated with the heart in the existing literature. Finally, the 12q13.12 locus contained four suggestive SNPs (rs608233; pD = 2.57×10−7, OR = 2.69). This locus contains many genes predicted to be actively transcribed during human heart development (VanOudenhove et al., 2020). Further, one SNP (rs632151) overlaps with a known embryonic heart enhancer (Supplemental Figure 3) (Whyte et al., 2013) and shows some evidence of being an eQTL in GTEx (Supplemental Figure 4-5) (Consortium, 2013).
The top panels show locus zoom plots for the lead SNP in each region with the left y-axis displaying the -log10(p-value) and the right y-axis displaying the recombination rate. Points are colored according to their strength of linkage disequilibrium with the lead SNP (purple diamond). The bottom panels show the odds ratio with a 95% confidence interval for the lead SNP in each region from four of the GWAS’: AVSD+DS (blue), ASD+DS (pink), VSD+DS (purple), anyCHD+DS (black).
One SNP (rs1005999) at 2q12.1 in a gene desert achieved suggestive significance in the replication GWAS (pR = 1.38×10−6, OR=3.82). SNPs at two additional loci achieved suggestive evidence of association in the meta-analysis: rs19793 at 3p26.3 (pM = 1.07×10−6, OR = 0.47), located within CNTN6, and rs78087419 at 10p15.3 (pM = 8.36×10−7, OR=3.04), located in a gene desert.
DS+VSD
In the DS+VSD analysis, we identified four regions with suggestive evidence of association: 4q34.3, 5q33.1, 14q21.2, and 17q24.1 (Figure 4 A-D, top). At 5q33.1, two SNPs had p-values < 2×10−6, including the lead SNP rs9764868 (pD = 8.74×10−7, OR = 1.98). The lead SNP is approximately 1.08 Mb from HAND1 (Reamon-Buettner et al., 2008; Reamon-Buettner et al., 2009) and 1.05 Mb from SAP30L (Teittinen et al., 2012); the former is an essential gene for cardiac morphogenesis. The lead SNPs at 4q34.3 (rs12499181; pD = 1.98×10−6, OR = 2.07) and 14q21.2 (rs4906494; pD = 1.14×0-6, OR = 1.95), were located within gene deserts. Only one SNP was suggestive at the 17q24.1 locus (rs72838611; pD = 1.22×10−6, OR = 2.07), located in an intron of PRKCA, a regulator of cardiac contractility not known to be involved in CHD. One additional locus achieved suggestive evidence of association in the meta-analysis: 4p16.2. At this locus, there were two SNPs which had p-values < 2×10−6 (lead SNP rs34336375; pM = 6.91×10−7, OR = 0.49), which is 1.32 Mb from EVC.
The top panels show locus zoom plots for the lead SNP in each region with the left y-axis displaying the -log10(p-value) and the right y-axis displaying the recombination rate. Points are colored according to their strength of linkage disequilibrium with the lead SNP (purple diamond). The bottom panels show the odds ratio with a 95% confidence interval for the lead SNP in each region from four of the GWAS’: AVSD+DS (blue), ASD+DS (pink), VSD+DS (purple), anyCHD+DS (black).
Comparison of Effect Sizes of Discovery GWAS
To determine if any of these SNPs were specific for the CHD subtype in which they were identified, we compared the estimated odds ratio with a 95% confidence interval for the lead SNPs from any discovery subtype GWASs to the odds ratios for that SNP in each of the other three discovery GWASs. For the lead SNPs that were suggestive in the DS+AVSD discovery analysis, we found little evidence of association in the DS+ASD or DS+VSD phenotypes based on p-values. However, because the 95% CIs around the effect sizes are overlapping (Figure 2 A-D), we cannot exclude the possibility that these SNPs could also contribute a similar magnitude of risk for other types of CHD and that we have insufficient power to detect an association. We found similar results for the ASD-associated loci except for one locus. At 1p35.1, associated with DS+ASD, the confidence interval for rs75999698 in DS+ASD does not overlap that of DS+AVSD or DS+VSD (Figure 3A). The CIs for DS+AVSD and DS+VSD cross 1 (and p-values < 0.5), indicating no association. Taken together, this suggests that the 1p35.1 locus may increase risk specifically for DS+ASD. For DS+VSD discovery analysis, we observed a different pattern (Figure 4). Although all of the CIs overlapped for each subtype, the p-values were always nominally significant in AVSD (p < 0.001) but not in ASD (p < 0.01). This suggests that there may be a shared etiology between AVSD and VSD that is different from risk for ASD.
Replication of previously published SNPs in discovery GWAS
We investigated 35 SNPs from a published GWAS of CHD and its subtypes in DS (Sailani et al., 2013) for replication in our subtype specific discovery GWASs [Table 1]. While there are other GWASs of CHD, only this study was investigated for replication since it was in an independent cohort of individuals with DS and overlapping phenotypes. Five SNPs achieved nominal significance (p < 0.05) in at least one of subtype GWASs. Two SNPs, rs2983879 (p = 5.06 × 10−6, OR= 0.37) and rs2983882 (p= 6.21 × 10−6, OR= 2.55), from the DS+AVSD GWAS in Sailani et al. (Sailani et al., 2013) replicated with nominal significance in the same phenotype in our analyses (rs2983879: p = 0.042, OR=0.82; rs2983882: p=0.004, OR=0.77). In addition, rs11102433, from the DS+ASD GWAS in Sailani et al. (Sailani et al., 2013) (p = 8.34 × 10−6, OR=3.15) replicated with nominal significance in DS+AVSD (p = 1.53×10−2, OR=0.74) and DS+VSD (p = 4.86×10−2, OR=0.71).
Discussion
CHDs occur more often in children with DS than euploid children, indicating that the dosage of genes on chromosome 21 is an important risk factor for abnormal heart development. However, since not every child with DS has a CHD, we hypothesized that there are other genetic and environmental factors that modify the risk of developing a CHD in the presence of an extra copy of chromosome 21. In this study, we sought to identify common variants (MAF > 5%) outside of chromosome 21 that are modifiers of CHD risk in DS. We performed a within-DS case-control GWAS for all types of CHD. Then, we stratified the CHD cases by CHD type and performed additional GWASs of AVSD, ASD, and VSD. While no SNP achieved genome-wide significance, multiple SNPs from each discovery GWAS achieved suggestive significance. We attempted to replicate these findings in an independent cohort of DS and CHD, however most of these failed to replicate. This may be a consequence of insufficient power resulting from a small sample size, differences in race/ethnicities between the discovery and replication cohorts, heterogeneity in CHD risk, differences in CHD phenotyping, or a combination of these factors. There may be non-genetic or epigenetic factors that influence the development of CHD in DS (Mouat et al., 2023).
Although not genome-wide significant, many loci contained genes known to play key roles in heart development. These included 5q35.2, associated with all types of CHD and AVSD. The associated region contains multiple conserved and putative human cardiac enhancers approximately 115 kb downstream from MSX2. MSX2 encodes a homeobox protein that, together with its partner MSX1, protects secondary heart field precursors against apoptosis and regulates proliferation of cardiac neural crest cells (Chen et al., 2007). In support of the critical role for MSX2 in heart development, of the 37 individuals in the DECIPHER database with a CNV in MSX2, ∼10% (4/37) present with an ASD, ∼10% (4/37) present with a VSD, and ∼5% (2/37) present with an abnormality of the cardiovascular system (Firth et al., 2009). Similarly, the 5q33.1 locus (associated with VSD), contained many genes in this region expressed in embryonic cardiac tissue, including HAND1. HAND1, with its partner HAND2, has a role in the formation of the right ventricle and aortic arches. Variants in HAND1 have been identified in tissue samples from individuals with septal defects (AVSD, ASD, and VSD) (Reamon-Buettner et al., 2009) and hypoplastic hearts (Reamon-Buettner et al., 2008).
Multiple lines of evidence from human genetics to animal models demonstrate an important role for cilia in heart development. In an ENU screen in mice, half of the mutations identified from animals screened for congenital heart defects were in cilia related genes (Li et al., 2015). Recessive alleles in structural components of cilia and signaling machinery are associated with a variety of syndromes including CHD, laterality defects, and other cardiac abnormalities (Djenoune et al., 2022). In this study, we identified associations with several loci containing cilia genes including 13q21.33 (AVSD) and 19q13.3 (AVSD). The 13q21.33 locus contains PIBF1, a component of centriolar satellites (Kumar et al., 2021; Mansour et al., 2021) and the 19q13.3 locus contains FSD1, encoding a centrosome protein (Tu et al., 2018). Both genes result in cardiac looping defects when deleted or perturbed in animal models (Kumar et al., 2021; Liu et al., 2019).
We also identified associations near genes implicated in heart development but not yet associated with human phenotypes. An example of this is the 12q21.2 locus (AVSD), which contains NAV3. NAV3 encodes a microtubule-binding protein that modulates neural cell migration during nervous system development (Maes et al., 2002). While NAV3 is primarily expressed in the brain, it is also expressed in the heart, kidney, and liver (Maes et al., 2002). Although not previously associated with CHD in humans, nav3-/- mutant zebrafish present with severe cardiac defects (Lv et al., 2022). Furthermore, expression of nav3 in the developing zebrafish heart occurs from 24 to 48 hours post fertilization (Lv et al., 2022), when cardiac looping and the formation of the atrioventricular canals occurs (Bakkers*, 2011), suggesting a role for nav3 in atrial-ventricle differentiation (Lv et al., 2022). Expression data based on chromHMM marks shows that NAV3 is expressed in human cardiac tissue throughout various Carnegie stages (VanOudenhove et al., 2020).
The 1p35 locus was the only to show a clear specificity of association with one subtype. This locus was associated with ASD and contained the gene RBBP4, a component of the nucleosome remodeling and histone deacetylase (NuRD) complex, which interacts with known heart development gene TBX5 (Boogerd & Evans, 2016). RBBP4 has no reported associations with cardiac abnormalities in humans; however, it is transcribed in human embryonic cardiac tissue at all Carnegie stages (VanOudenhove et al., 2020). Notably, TBX5 is a marker of a subset of cells that ultimately becomes the atrial septum (Nadeau et al., 2010) and mutations in TBX5 cause Holt-Oram syndrome, which frequently includes ASD or VSD among the clinical presentation (McDermott et al., 1993).
This study only focused on variants outside of chromosome 21. However, genetic variation on chromosome 21 may also contribute to CHD risk in DS. Analysis of trisomic genotypes on chromosome 21 is essential and ongoing. Similarly, this study only focused on single nucleotide polymorphisms and did not consider that CNVs may modify risk for CHD in DS.
In summary, this study identified multiple loci that are worthy of follow-up in larger cohorts of individuals with DS+CHD and also in cohorts of chromosomally typical CHD, as we have no reason yet to suspect these loci may be specific to DS. The many loci identified suggest the possibility of a significant polygenic or genetically heterogeneous risk for CHD in DS (and possibly in CHD in general). It also underscores the importance of inclusion of individuals with DS in genetic studies for CHD and the potential power of stratifying by specific type of CHD.
Acknowledgments
This work is a collaborative effort supported by X01-HL145686 (PJL), X01-HD117361 (JME), X01-HD110902 (JME), X01-HD107382 (JME), X01-HD107380 (AD), R01-CA249867 (KRR, PJL), R03-HL156476 (EJL, DJC), U01-HL153009 (BDG), and the Global Down Syndrome Foundation. We also thank the individuals with Down syndrome and their families for participating in this research along with the efforts of the many staff members who engaged with and collaborated with families.