Abstract
Neural tube defects (NTDs) remain among the most common congenital anomalies. Contributing risk factors include genetics and nutrient deficiencies, however, a comprehensive assessment of nutrient-gene interactions in NTDs is lacking. We applied a nutrient-focused gene expression analysis pipeline to identify nutrient-sensitive gene regulatory networks in amniocyte gene expression data (GSE4182) from fetuses with NTDs (cases; n=3) and fetuses with no congenital anomalies (controls; n=5). Differentially expressed genes (DEGs) were screened for having nutrient cofactors. Nutrient-dependent transcriptional regulators (TRs) that regulated DEGs, and nutrient-sensitive miRNAs with a previous link to NTDs, were identified. Of the 880 DEGs in cases, 10% had at least one nutrient cofactor. DEG regulatory network analysis revealed that 39% and 52% of DEGs in cases were regulated by 22 nutrient-sensitive miRNAs and 10 nutrient-dependent TRs, respectively. Zinc- and B vitamin-dependent gene regulatory networks (Zinc: 10 TRs targeting 50.6% of DEGs; B vitamins: 4 TRs targeting 37.7% of DEGs, 9 miRNAs targeting 17.6% of DEGs) were dysregulated in cases. We identified novel, nutrient-sensitive gene regulatory networks not previously linked to NTDs, which may indicate new targets to explore for NTD prevention or to optimise fetal development.
Introduction
Neural tube defects (NTDs) are among the most prevalent congenital anomalies1. Complex genetic and environmental contributions underly around 75% of NTDs, which have multifactorial causation2 and can associate with multiple comorbidities, including placental maldevelopment and function3, fetal growth restriction, and preterm birth3,4. Folate and vitamin B12 have a well-recognised contribution to the etiology of isolated NTDs5, while other nutritional factors, including inositol6, vitamin D7, and choline8 and essential trace elements9, may also contribute but are less well-studied. Accordingly, widespread and targeted nutritional prevention strategies, such as folic acid fortification and periconceptional supplementation, have decreased NTD incidence10. However, that more than 300,000 infants remain affected by NTDs annually1 highlights the importance of improving knowledge on their causation to inform further strategies for prevention.
The role of genetics in NTD etiology is supported by associations between NTDs and chromosomal anomalies, heritable susceptibility to NTDs11, and the identification of the C677T polymorphism in the methylenetetrahydrofolate reductase (MTHFR) gene as a genetic risk factor for NTDs12. Distinct genetic signatures that associate with NTDs have also been revealed through large-scale genomic analysis13,14, which can provide an integrated, global view of the mechanisms regulating normal and pathological development and aid in the identification of biomarkers for prevention and diagnosis15. Some research on gene expression signatures in NTDs has been performed using amnioctyes13,14, which are a heterogeneous population of fetal cells in the amniotic fluid16. Amniocytes comprise pluripotent, lineage-committed, and mature differentiated cells, and are mostly derived from embryonic/fetal tissues16,17. Amniocyte transcriptome profiling has detected the expression of genes with known tissue-specific expression in fetal organs, including the brain and placenta16, suggesting that amniocytes may be uniquely positioned to inform on gene expression signatures in NTDs, and potentially placental-related comorbidities. While transcriptome profiling can help provide a gene-, pathway-, and network-level understanding of the gene expression signatures that associate with NTDs, there is also a need to integrate knowledge on gene-environment, including nutrient, interactions to work towards a more complete understanding of NTD phenotypes, and to identify target candidates for prevention.
Adequate nutritional resources are essential for ensuring normal embryonic and fetoplacental growth and development. Nutrients play important roles in DNA stability and repair and gene expression regulation, and host genetics influence nutrient metabolism and bioavailability18. Research to-date has largely focused on the contribution of genetic variants in the folate pathway to NTD risk11. However, knowledge is limited on whether, and the extent to which, other nutrients and their interactions with genes can explain NTD etiology. This is a critical gap to fill, given that fetal NTDs persist even in countries with folic acid food fortification19, and in pregnant people with normal blood folate levels20. Comorbidities associated with NTDs, including altered placental phenotype3, poor fetal growth4, and early birth3,4, may also be responsive to, or driven by, altered nutrient bioavailability21. However, to date, no study has integrated transcriptome network analysis with field knowledge on nutrient-gene interactions in the context of NTDs or their comorbidities.
Here, we applied an untargeted, nutrient-focused gene expression analysis pipeline to help address knowledge gaps and generate new hypotheses on the role of nutrient-gene interactions in NTDs. Using data from a small cohort study, we first aimed to examine the degree to which differentially expressed genes (DEGs) in fetuses with spina bifida (cases) were micronutrient dependent. We next aimed to discover novel nutrient-sensitive gene regulatory networks dysregulated in NTDs. We hypothesised that multiple nutrient-gene interactions would be evident in gene signatures associated with NTDs, and that clear nutrient-sensitive miRNA and TR ‘regulons’ (a group of genes belonging to a common regulatory network) would emerge among DEGs.
Methods
Data source and pre-processing
Amniocyte gene expression data for fetuses with spina bifida (cases; n=4) and fetuses with no congenital anomalies (controls; n=5) were obtained from Gene Expression Omnibus (GEO; GSE4182) on June 28, 2021, and re-analysed. Secondary analysis of anonymized data is exempt from ethics approval by the Carleton University Research Ethics Board. As part of a case-control study in Budapest, Hungary, amniocytes were isolated from amniotic fluid samples collected during amniocentesis and stored at -80°C until RNA extractions were performed14. Amniocyte transcriptome sequencing was carried out using the Affymetrix Human Genome U133 Plus 2.0 Array, and microarray deposited in GEO in November 200614,22. Sample characteristics are provided in Supplementary Table S1. All three cases had a spina bifida diagnoses, and one case also had anencephaly. Gestational age at time of amniotic fluid collection ranged from 13-19 weeks for cases and 17-19 weeks and 5 days for the control group. Each control sample was pooled from two patient amniotic fluid samples from pregnancies undergoing routine amniocentesis because of a maternal age >35 years, and thus included amniotic fluid from 10 controls total.
All data pre-processing were performed in R software (v3.6.1) using the Bioconductor (v3.12) Affy package23. Raw microarray data (.CEL files) were downloaded and normalised using the robust multi-array average method24. Expression values were visualised using box-and-whisker plots and outliers were identified using sample clustering and uniform manifold approximation and projection (UMAP)25. Where multiple probesets corresponded to a single gene, a representative probeset was chosen for each gene, based on recommended methods (Supplementary Table S2)26. Probeset identifiers were annotated with Entrez gene IDs27.
Differential gene expression analysis
DEGs between cases and controls were identified (eBayes; R [v3.6.1])28. Statistical significance was determined after false discovery rate correction at q value <0.05 and an absolute log2 fold change (FC)≥229. Functional analysis of the DEG list was performed using the Protein ANalysis THrough Evolutionary Relationships (PANTHER) classification system (v16.0)30.
Identifying nutrient cofactors of differentially expressed genes
To determine which of the DEGs coded for proteins that were micronutrient dependent (i.e., they coded for proteins with one or more micronutrient [vitamin or mineral] cofactor), we created a DEG-nutrient interaction network. Using a previously published dataset of protein/gene-nutrient (cofactor) interactions31, we identified DEGs coding for a cofactor-interacting protein and their associated cofactors. This dataset on protein-nutrient interaction networks integrates knowledge on nutrient-protein (gene) interactions from the Metal MACiE25, Uniprot23, Expasy24 and EBI CoFactor databases. Applying a systems-approach to nutrient-gene interaction analysis is key for understanding the ways in which multiple micronutrients, their metabolic products, and interacting proteins function in biological processes, and for obtaining a more complete view of nutritional contributions to health and disease31. The PANTHER classification system (version 16.0) was then used to functionally annotate DEGs that were nutrient-dependent30, and we visualised DEG-nutrient relations as a network using Cytoscape (version 3.8.2)32.
Nutrient-sensitive miRNA targetome analysis
Next, to construct nutrient-sensitive miRNA regulatory networks for DEGs in cases, we first curated a list of nutrients that have known relevance to NTDs. This included cofactors for one carbon metabolism and nucleotide synthesis (choline33, vitamins B134-36, B633,35,36, B9 [folate]35,36, and B1233,35), as well as inositol37, iron36, and vitamin D7. We next used miRWalk (version 2.0), a platform that integrates data on validated and predicted miRNA-target (gene) interactions, to identify miRNAs that are known to interact with biological processes related to these nutrients, or with activity known to be sensitive to these nutrients38. We then cross-referenced the list of miRNAs sensitive to, or associated with, these NTD-related nutrients against a curated inventory of miRNAs that have been associated with NTDs or neural tube closure in previous animal and human studies, which was also built using miRWalk and through manual searching in PubMed (search string: “neural tube*” AND “miRNA”). Targetome files with miRNA-target gene network data (miRNA targetomes), limited to experimentally validated miRNA-target interactions (miRTarBase)38,39, were then retrieved for the miRNAs that were found in both lists (i.e., miRNAs whose activity was sensitive to the NTD-related nutrients, that also had a previous link to NTDs). Our rationale to limit our focus to miRNAs with a previous NTD link when constructing nutrient-sensitive miRNA regulatory networks was driven by the high number of miRNAs associated with the nutrients of interest. For example, approximately 1000 miRNAs were identified to be sensitive to each of choline, inositol and iron. Lastly, the targetome files were used to construct nutrient-sensitive miRNA-DEG regulatory networks in Cytoscape (version 3.8.2)32.
Nutrient-sensitive transcription factor regulatory network analysis
Next, we used iRegulon (version 1.3) to identify possible transcriptional regulators (TRs) predicted to co-regulate DEGs in cases40. The iRegulon software uses a reverse engineering approach to predict the transcriptional regulatory network of a given gene list, using cis-regulatory sequence analysis, and has been shown to outperform other motif-discovery methods40. First, a whole genome ranking of 22,284 human genes (RefSeq) for each regulatory motif is generated. Specifically, TR binding motifs, or CHIP-seq peaks (derived from ChIP-seq data sets, also referred to as tracks), were identified in the regulatory sequences in the 20 kb (kilobase; 1000 base pairs of DNA) surrounding the transcription start site (TSS) of each gene (from −10 kb to +10 kb of the TSS)40. Then, the list of differentially expressed genes is input and tested against the ranked gene list using an area under the cumulative recovery curve (AUC). AUC scores are normalized to normalized enrichment scores (NES). Motifs with high NES scores represent motifs that have a large proportion of the differentially expressed genes (input genes) within its top rankings. We used a NES threshold of 3.0 to determine motif enrichment, which corresponds to an FDR of 3-9%. At the same time, the subset of genes from the input list that are predicted to be targeted by a given motif is identified using the leading edge of the recovery curve. Lastly, iRegulon uses a Motif2TR procedure to link candidate motifs (those with a NES>3.0) with TRs using a constructed database that links 1,191 human TRs to 6,031 motifs (based on motif-TR direct annotations, orthology, and motif-motif similarity)40.
Two lists of candidate TRs associated with enriched motifs were produced: One representing TRs targeting up-regulated genes in cases, and the other targeting down-regulated genes in cases. Within these lists, TRs with nutrient cofactors were identified using the same dataset of protein (gene)-nutrient interactions used above, to identify DEGs with nutrient cofactors31. Nutrient-sensitive TRs and their direct DEG targets were then exported as network files, separately for up-and down-regulated DEG lists. Nutrient-sensitive TR-DEG regulatory networks were visualised in Cytoscape (version 3.8.2)32.
Results
Data pre-processing
Expression value distribution, density, and intensity were similar across all nine samples (Supplementary Figure S1A). One case sample (GSM94601) identified as an outlier in UMAP analysis was excluded (Supplementary Figure S1A), consistent with approaches taken in previous analyses of this dataset13. The remaining eight samples were then re-clustered using UMAP, and complete separation of the case and control groups was evident (Supplementary Figure S1B). Three cases and five control samples were retained for subsequent analyses.
Amniocyte gene expression differs in cases compared to controls
In a comparison of global amniocyte gene expression, 1,155 genes (2.11% of probesets sequenced) were differentially expressed between cases and controls (q<0.05; absolute log fold change ≥2). After the removal of duplicate probesets and genes with no available Entrez ID (Supplementary Figure S2 and Table S2), 880 DEG (downregulated: n=725, upregulated: n=155) were carried forward for further analyses (Figure 1).
(A) Volcano plot and (B) heat map of genes with increased (n=155) and decreased (n=725) expression in cases compared to controls (clustering method: average linkage; distant measurement method: Euclidean; clustering applies to rows and columns). Differential expression was determined at absolute fold change > 2.0 and q < .05. GE = gene expression.
After functional annotation of the DEGs using PANTHER (Supplementary Figure S3), the top identified GO molecular functions of DEGs in cases were binding (n=274 genes), catalytic activity (n=152) and regulatory functions (n=120). The top identified biological processes of DEGs in cases were cellular processes (n=431 genes), biological regulation (n=290), and metabolic processes (n=226). The top protein classes of DEGs were protein modifying enzymes (n=71 genes), gene-specific transcriptional regulators (n=63), and metabolite interconversion enzymes (n=49). The full DEG list is provided in Supplementary Table S3.
Nutrient-gene relationships identified in genes dysregulated in cases
We next investigated whether DEGs in case amniocytes coded for proteins that had nutrient cofactors. Of the DEGs in cases, 87 (9.89%) had at least one nutrient cofactor (downregulated: n=69, upregulated: n=18; Table 2). Cofactors included calcium (for n=14 genes), chloride (n=1), iron (n=6), heme (n=4), magnesium (n=22), manganese (n=4), S-Adenosyl methionine (n=3), vitamins B2 (n=4), B3 (n=7), B5 (n=2), B6 (n=1), and B9 (folate; n=1), vitamin C (n=1), zinc (n=19), and metal cations (n=9; Figure 2A). In nearly all DEG clusters interacting with a nutrient cofactor, the majority of DEGs were downregulated, apart from DEGs interacting with calcium (9 of 14 were upregulated) and folate (Folate receptor beta [FOLR2] was upregulated and the only folate dependent DEG; Figure 2A).
(A) Genes (circle nodes) that had increased (pink) or decreased (blue) expression in cases compared to controls and have a known nutrient cofactor (grey diamonds) are represented in this network diagram. Data on nutrient-gene (protein) interactions was obtained from Scott-Boyer et al. 2016. (B) The largest proportion of DEG are targeted by miR-124-3p, miR-17 (both sensitive to choline, inositol, iron and vitamin D), and miR-20a (sensitive to choline, inositol and iron; miRWalk2.0). Red = upregulated, blue = downregulated, where number = number of DEG targeted by that miRNA. CoA = Coenzyme A. FAD = Flavin adenine dinucleotide. FMN = Flavin mononucleotide. logFC = log fold change. NAD = Nicotinamide adenine dinucleotide. PP = Pyridoxal phosphate. SAM = S-Adenosyl methionine. Vit = vitamin.
Functional annotation of nutrient-dependent DEGs in case amniocytes revealed the top GO molecular functions to be catalytic activity (n=46 genes), binding (n=20), and transporter and molecular transducer activity (n=4 each; Supplementary Figure S4). The top identified biological processes of DEGs with a nutrient cofactor were cellular process (n=51 genes), metabolic process (n=33), and biological regulation (n=33), and the top protein classes were metabolite interconversion enzymes (n=23 genes), protein modifying enzymes (n=14), and calcium-binding proteins (n=6).
The top molecular functions of dysregulated genes with a nutrient cofactor were lipid and purine nucleotide metabolic processes, post-translational protein modifications, and cellular homeostasis and motility-related processes (Table 1). Specifically, the three most downregulated, nutrient-dependent genes in cases were Ethanolamine-Phosphate Phospho-Lyase (ETNPPL; vitamin B6-dependent; FC=-5.14, q=0.002), involved in lipid metabolism, Baculoviral IAP Repeat Containing 3 (BIRC3; zinc-dependent; FC=-3.40, q=0.002), associated with apoptosis inhibition, inflammatory signalling pathways, and cell proliferation, and Polypeptide N-Acetylgalactosaminyltransferase 3 (GALNT3; manganese-dependent; FC=-3.31, q=4.57e-4), involved in O-linked oligosaccharide biosynthesis. The three most upregulated genes were Adenylosuccinate Synthase 1 (ADSSL1; magnesium-dependent; FC=3.23, q=0.009), which codes for an enzyme involved in purine nucleotide synthesis and metabolism, Matrix metallopeptidase 16 (MMP16; zinc- and calcium-dependent; FC=3.17, q=0.002), involved in extracellular matrix degradation in normal physiological and disease processes, and Ectonucleotide Pyrophosphatase/Phosphodiesterase 2 (ENPP2; zinc- and calcium-dependent; FC=2.83, q=0.006), associated with motility-related processes, including angiogenesis and neurite outgrowth.
Genes with a nutrient cofactor that were differentially expressed in cases (vs. controls).
Overall, nutrient-dependent, downregulated genes in cases were involved in immune-related processes and nutrient metabolism and homeostasis (Table 1). This included eight genes from the metal cation-binding Metallothionein (MT) gene family (MT1E, MT1F, MT1G, MT1H, MT1HL1, MT1M, MT1X and MT2A), associated with intracellular metal homeostasis, as well as 16 zinc-interacting genes involved in immune and inflammatory processes, antioxidant activity, mitochondrial morphology, the renin-angiotensin pathway, and cell signaling (Table 2). Downregulated genes coding for proteins with B vitamin cofactors included Riboflavin Kinase (RFK), which is essential for vitamin B2 utilization, ETNPPL and Glycerol-3-Phosphate Acyltransferase 3 (AGPAT9), which are involved in lipid metabolism, and Methylsterol Monooxygenase 1 (MSMO1) and Isopentenyl-Diphosphate Delta Isomerase 1 (IDI1), which are involved in the biosynthesis of cholesterol.
Transcription factors with a nutrient cofactor that are predicted to target genes that are differentially expressed in cases compared to controls.
Upregulated genes in cases included those involved in immune and inflammatory processes, such as Arachidonate 5-lipoxygenase (ALOX5; calcium- and iron-dependent), Colony Stimulating Factor 1 Receptor (CSF1R; magnesium-dependent), FOLR2 (folate-dependent), and NLR Family Apoptosis Inhibitory Protein (NAIP; zinc-dependent; Table 2). Calcium-dependent Neuropilin 1 (NRP1) and Ectonucleotide Pyrophosphatase/Phosphodiesterase 2 (ENPP2), and iron- and heme-dependent Prostaglandin-Endoperoxide Synthase 1 (PTGS1), were also upregulated in cases and are involved in angiogenesis (Table 1).
Nutrient-sensitive miRNAs target differentially regulated genes in cases
We identified 22 unique miRNAs that had a previous link to NTDs and were sensitive to the nutrients of interest identified a priori (Figure 2B). Collectively, these 22 miRNAs targeted 39% (n=344) of DEGs in cases (19% [n=155] for B vitamin-sensitive miRNAs, 30% [n=263] for choline-sensitive miRNAs, 23% [n=203] for vitamin D-sensitive miRNAs, 33% [n=289] for inositol sensitive miRNAs, and 33% [n=293] for iron-sensitive miRNAs).
The largest proportion of DEGs were targeted by MIR-124-3p (n=87 DEGs), a known suppressor of tumor proliferation and invasion, and neural inflammation41-43, with activity sensitive to choline, inositol, iron, and vitamin D. MIR-17 and MIR-20a, both belonging to the miR-17/92 cluster with known roles in cell cycle, proliferation and apoptosis, immune function, and neurodegeneration44, targeted the next largest proportions of DEGs in amniocytes of cases (n=80 DEGs for MIR-17, sensitive to choline, inositol, iron, and vitamin D; n=71 DEGs for MIR-20a; sensitive to choline, inositol and iron; Figure 2B). Two of the miRNAs (MIR142-3p and MIR-200a; both involved in epithelial-to-mesenchymal transition45,46), only targeted downregulated genes in cases, while the remaining 20 miRNAs identified targeted both up- and downregulated DEGs in cases (Figure 2B).
Network analysis revealed that seven of these 22 unique miRNAs were known to be sensitive to only one nutrient of interest. This included inositol-sensitive miRNAs: MIR-23a-3p, MIR-142-3p, MIR-144, MIR-185; iron-sensitive miRNAs: MIR-200a, MIR-301; and vitamin B6-sensitive miRNA: MIR-30e-3p (Figure 3A-B). Iron, inositol, and choline-sensitive miRNAs targeted the largest proportion of DEGs in cases (Figure 3C). miRNA-targetome networks for DEG in amniocytes from cases compared to controls revealed a high degree of overlap in DEGs targeted by the 22 unique nutrient-sensitive miRNAs (Figure 3D).
(A-B) 22 unique miRNAs known to be sensitive to the activity of nutrients of interest (identified a priori, miRWalk2.0) with a previous link to NTDs. Of the 22 miRNAs, seven were sensitive to only one nutrient of interest (inositol-sensitive: miR-23a-3p, miR-142-3p, miR-144, miR-185; iron-sensitive: miR-200a, miR-301; vitamin B6-sensitive: miR-30e-3p). (C) Iron, inositol and choline-sensitive miRNAs targeted the largest proportion of DEGs in cases, respectively. Collectively, the 22 unique miRNAs targeted 39% of DEG in cases (n=344; B vitamin-sensitive miRNAs: 19% [n=155], choline-sensitive miRNAs: 30% [n=263], vitamin D-sensitive miRNAs: 23% [n=203], inositol-sensitive miRNAs: 33% [n=289], iron-sensitive miRNAs: 33% [n=293]). (D) Nutrient-sensitive miRNA-targetome networks for DEG in cases compared to controls. Diamonds = nutrient-sensitive miRNAs. Circles (nodes) = DEG targets, where lighter shades = downregulated in cases, darker shades = upregulated in cases. DEG = differentially expressed genes. GE = gene expression. FC = fold change. Vit = vitamin.
Nutrient-dependent transcriptional regulators target differentially expressed genes
iRegulon analysis revealed 125 and 106 unique TRs predicted to target up- and down-regulated genes, respectively, in case amniocytes (Supplementary Table S4). Of these, 16 TRs had nutrient cofactors and are predicted to target down- (n=5) and up-regulated (n=10) DEGs, or both (n=1; Table 2). Nutrient cofactors of these TRs included calcium (for n=1 TR), magnesium (n=1), metal cations (n=1), vitamins A (n=1), B2 (n=1), B3 (n=2), B5 (n=1), D (n=1) and zinc (n=10; Figure 4A). Collectively, these 16 TRs are predicted to regulate the expression of 52% (n=460) of DEGs (Figure 4B). E1A Binding Protein P300 (EP300), a TR with known roles in cell growth, differentiation, and NTDs, was predicted to regulate the highest number of DEGs of the TRs identified. EP300 had 311 expected targets (upregulated: n=9, downregulated: n=302; 35.3% of total DEGs; Table 2), and zinc and vitamin B5 cofactors. The zinc-requiring P53 family proteins (TP53, TP63, and TP73), involved in cell proliferation and apoptosis, were the next-largest predicted regulators of DEGs, potentially targeting 252 downregulated genes in cases (28.6% of total DEGs).
(A) 16 transcriptional regulators with nutrient-cofactors that are predicted to regulate DNA motifs or tracks of DEG were identified. Nutrient cofactors of these TRs included calcium (for n=1 TR), magnesium (n=1), metal cations (n=1), vitamins A (n=1), B2 (n=1), B3 (n=2), B5 (n=1), D (n=1) and zinc (n=10). (B) Collectively, these 16 TRs are predicted to regulate the expression of 52% (n=460) of DEGs. Red = upregulated in cases, blue = downregulated in cases, grey = total proportion of DEG targeted by a nutrient-sensitive TR. Nutrient-sensitive TR-gene networks were visualised for TRs predicted to target (C) downregulated genes (n=5 unique TRs, left-bottom panel) and (D) upregulated genes (n=10 unique TRs, right-bottom panel) separately. One TR (E1A Binding Protein P300; EP300) is predicated to target both up- and down-regulated genes in cases. Diamonds = nutrient-sensitive TRs. Circles (nodes) = DEG targets, where colours on the left of the colour palette = downregulated in cases, right of the colour palette = upregulated in cases. DEG = differentially expressed genes. GE = gene expression. FC = fold change. TR = transcriptional regulator.
Of TRs identified as potential regulators of upregulated genes, the metal-cation reliant TROVE domain family, member 2 (TROVE2) gene, a known regulator of immune and inflammatory processes, was associated with the largest proportion (n=35 [4.0%] upregulated genes in cases). Quiescin Sulfhydryl Oxidase 1 (QSOX1; vitamin B2-dependent) which targeted the next-largest group of DEGs (n=25 upregulated genes), was also downregulated in cases (FC=-2.16, q=0.002; Table 2). One of the P53 family proteins (TP63), as well as an additional zinc-dependent TR identified (Churchill Domain Containing 1 [CHURC1]), have not been previously linked to NTDs.
Nutrient-sensitive TR-gene networks were visualised for enriched TRs predicted to target downregulated and upregulated genes (Figure 4C-D). All TRs predicted to target downregulated genes had zinc as a cofactor, and one (EP300) was also vitamin B5-dependent (Figure 4C). TRs targeting upregulated genes were dependent on calcium, magnesium, metal cations, vitamins A, B3, B5, D, and zinc (Figure 4D). In the TR-gene network figures, it is visually evident that the regulons of downregulated genes targeted by nutrient-sensitive TRs were much larger than the upregulated regulons, but were fewer in number (Figure 4C-D).
Discussion
Using a nutrient-focused gene expression analysis pipeline, we identified multiple nutrient-sensitive genes and gene regulatory networks that were dysregulated in amniocytes in a small cohort of fetuses with NTDs. We found that over half of the dysregulated genes were predicted to be co-regulated by TRs and miRNAs that are sensitive to, or dependent on, nutrients with known roles in NTD etiology, namely zinc, B vitamins, and B vitamin-like nutrients (choline and inositol). These findings generate new hypotheses on the role of nutrient-gene interactions in NTDs, and suggest that nutrient-sensitive genes and gene regulatory networks may explain an important portion of gene expression signatures associated with NTDs. Profiling nutrient-transcriptome interaction networks in the context of NTDs may help to illustrate, in part, the gene-environment interactions that characterise an intrauterine environment that is permissive of NTDs and associated comorbidities.
Several zinc-dependent genes and gene regulatory networks were dysregulated in case amniocytes compared to controls. Zinc is a key regulator of gene expression, and is essential for the rapid DNA synthesis, cell growth, and proliferation that accompany embryonic development, including neural tube closure47,48. Low maternal zinc status may be associated with increased NTD risk9, and zinc imbalance in animal studies leads to NTDs, possibly through modulation of p53 activity49. Here, three zinc-dependent P53 family proteins were identified as transcriptional regulators predicted to target over a quarter of DEGs in case amniocytes, and eight Metallothionein genes, which are essential for maintaining zinc homeostasis and controlling zinc-dependent cellular signaling50, were downregulated in cases. Altogether, our findings support a role for zinc-associated gene dysregulation in NTD genetic phenotypes, in alignment with previous animal studies, and illustrate novel zinc-dependent gene networks in an NTD-affected a human cohort.
One zinc- and vitamin B5-dependent TR, EP300, was predicted to target over one third of DEGs in case amniocytes. EP300 has been studied in mouse models of NTDs and is predicted to activate pathways that promote fusion of the neural tube51,52. In humans, rare deleterious variants in the EP300 gene have recently been associated with NTDs53, and here, EP300 was the dominant, nutrient-sensitive focal point for DEG regulation in cases. Two additional nutrient-dependent TRs identified here further support a role for both zinc and B vitamin-interacting gene dysregulation in cases: Sirtuin 6 (SIRT6; vitamin B3- and zinc-dependent) and CHURC1 (zinc-dependent). SIRT6 suppression by high glucose levels is a proposed to mechanism linking maternal diabetes to increased risk of fetal NTDs54, however, CHURC1 has not previously been associated with NTDs. In zebrafish and xenopus, churc1 is known to play a role in embryogenesis and neuroectodermal development55,56. In humans, CHURC1 has been investigated in relation to autism, where a microdeletion in a gene region containing the genes for both CHURC1 and methylenetetrahydrofolate dehydrogenase 1 (MTHFD1), a folate metabolism enzyme with known links to NTD etiology11,57, associated with increased autism risk58. While research on zinc-B vitamins interactions is limited, there is some evidence that zinc may have a role in niacin (vitamin B3) metabolism through interaction with pyridoxine (vitamin B6)59,60. Overall, the presence of dysregulated zinc- and B vitamin-dependent gene networks in our results suggest that both zinc- and B vitamin-gene interactions in NTDs should be targets of further study.
To date, folates (vitamin B9), and more recently, folate one-carbon metabolism (OCM) cofactors (namely vitamins B12 and B6), have been the central nutrient targets for NTD prevention. In agreement with the established role of folates and OCM cofactors in NTDs, we identified nine miRNAs with known associations to NTDs that were also B vitamin-sensitive, eight being sensitive to folate, and were predicted to regulate the expression of nearly one fifth of DEGs in case amniocytes. Further, the three miRNAs that targeted the largest proportion of DEGs in cases (MIR-124-3p, MIR-17, and MIR-20a) were each sensitive to both choline and inositol, water soluble B vitamin-like molecules with known roles in NTD etiology6,8, and four TRs with B vitamin cofactors (EP300 and SIRT6, as well as C-terminal-binding protein 2 [CTBP2] and QSOX1) predicted to target DEGs in cases were identified. Both QSOX1 and CTBP2 have inferred associations with NTDs via their interactions with valproic acid and folic acid61, and the role of the CTBP2 gene in the Wnt and Notch signaling pathways62. However, no studies have provided direct evidence linking QSOX1 or CTBP2 to NTD pathogenesis or phenotype in humans, and here, QSOX1 expression was also downregulated in case amniocytes compared to controls. Taken together, our findings advance beyond a folic acid (vitamin B9)-centric view of the role of B vitamins in NTDs and highlight B vitamin-sensitive or dependent gene candidates for further study in NTDs.
Additionally, we found that six miRNAs with known associations to NTDs were vitamin D-sensitive, and collectively, were predicted to target nearly one quarter of DEGs in case amniocytes. Vitamin D receptor (VDR) was also identified as a transcription factor targeting DEGs, and Methylsterol monooxygenase 1 (MSMO1), associated with vitamin D analog metabolism, was downregulated in cases. Vitamin D plays important anti-inflammatory and immunomodulatory roles63, and low maternal vitamin D levels have been associated with fetal NTDs7. In mice, vitamin D3 supplementation prevents lipopolysaccharide-induced NTDs via increased folate transport from the maternal circulation to the embryo64. Thus, our findings support vitamin D as a nutrient of interest for further research on nutrient-gene interactions in NTDs.
Maternal immune dysregulation is also known to associate with an increased risk of fetal NTDs65,66, and here, several nutrient-dependent DEGs in cases were involved in inflammatory and immune-related biological processes. While it is possible that the dysregulated amniocyte gene expression in immune or inflammatory process-related genes was shaped by maternal health factors, activation of pro-inflammatory signaling pathways at the site of improper neural tube closure has also been reported in fetuses with spina bifida67,68, and may serve as an additional ‘hit’ to the developing central nervous system. Whether signaling changes in spinal cord injury-induced secondary lesion cascades could influence gene expression in amniocytes is an area requiring further study. Given the relationships between nutrient intakes and inflammatory load69, and the importance of nutrition in supporting immune signaling and functions70, the interactions between nutrient-sensitive and immune/inflammatory processes in NTD pathogenesis and phenotype is an area requiring further investigation.
The small size of this cohort is a key limitation to our study that may limit the validity of our findings. However, the rarity of NTDs and the challenges that accompany amniotic fluid collection make these data a valuable resource for generating new hypotheses on the role of nutrient-gene interactions in NTDs. While amniocentesis was performed after the period of neural tube closure, the gene expression changes observed may still reflect a uterine environment that was permissive of altered embryonic development. Alternatively, gene expression changes in amniocytes may also be influenced by signaling changes following improper neural tube closure, which could inform on NTD phenotype and the uterine environment that is permissive of comorbidities that develop later in pregnancy. Notably, maternal cell contamination in amniotic fluid is known to occur71 and could have confounded in the amniocyte gene expression profiles reported here. To address this limitation and characterise the transcriptome of certain minor cell populations in amniotic fluid, future studies on new samples should consider single-cell type RNA sequencing (sctRNA-seq) and computational approaches applied to sctRNA-seq to detect and control for changes in transcript expression due to off-target cell contamination72. Data on cohort characteristics were also limited, and it is plausible that there are unexplored variables that could help explain the differences observed here. These variables may include maternal medical history, or clinical or demographic characteristics, as well as dietary intakes; namely of nutrients that have known associations with NTD pathogenesis. Fortification of wheat products with folic acid had also not been introduced at the time of data collection, and while the original authors did report that all study participants were taking a multivitamin supplement throughout pregnancy14, it remains difficult to speculate about the nutrient status, including folate, of the study’s participants. Maternal hyperhomocysteinemia and C677T polymorphism in the MTHFR gene are also known risk factors for fetal NTDs12,73, and may have influenced the gene expression in cases if present, even if mothers were nutritionally replete. However, maternal biomarker and genotype data were not available for this cohort, limiting our ability to assess relationships between maternal genotype or plasma markers and amniocyte gene expression. Our ability to identify genes encoding for a nutrient-interacting protein was limited by the data available on cofactor-protein interactions, which may have led to an over-representation of well-studied nutrients31. Further, as the transcriptional regulators identified through iRegulon to target DEGs are predicted, and include both validated and unvalidated regulatory relationships, causal relationships cannot be determined at this time. Our scope for miRNA-targetome analysis was narrowed to focus on only miRNAs with a previous link to NTDs. This may have prevented the identification of novel miRNAs associated with NTDs, and future research with higher computational resources should consider an expanded focus. Ultimately, these findings can inform testable hypotheses and should be next investigated in experimental models, that could evaluate causal relationships between the bioavailability of multiple micronutrients and resulting changes in gene expression. Confirming these findings in an independent cohort would also be an important next step to replicate and validate these findings.
Building on the work of authors who previously analysed these data to understand gene- and gene pathway dysregulation in isolated NTDs13,14, our nutrient-focused network analysis approach integrates knowledge on nutrient-gene interactions with gene expression network analysis to examine the functional networks underlying NTD phenotypes. This is a first step in generating testable hypotheses for better understanding relationships between the bioavailability of, and interactions between, multiple micronutrients and gene expression in NTD pathogenesis or comorbidities. Through a network medicine approach, ‘big’ biomedical data can be leveraged to identify the functional networks that drive health and disease states15, and here, we apply this approach in the context of nutrient-gene interactions in NTDs. The nutrient-focused gene expression analysis pipeline applied here allows for the characterisation of nutrient-sensitive mechanisms that underly disease processes, which may aid in the identification of novel nutrition targets for the prevention of disease and promotion of optimal health and developmental outcomes.
Data Availability
The microarray data used in this analysis were obtained from GEO (GSE4182). The code used for microarray pre-processing has been deposited in a GitHub repository (https://github.com/marina-white/ntd_af_nutrient-gene_study.git).
Data availability statement
The microarray data used in this analysis were obtained from GEO (GSE4182). The code used for microarray pre-processing has been deposited in a GitHub repository (https://github.com/marina-white/ntd_af_nutrient-gene_study.git).
Funding
This research is supported by the Canadian Institutes of Health Research (CIHR PJT-175161). MW is supported by an Ontario Graduate Scholarship and JAP is supported by a Dean’s Summer Research Internship, Faculty of Science, Carleton University.
Acknowledgments
We thank the authors who originally sequenced these data for making them publicly available in GEO.
Footnotes
Revisions made to text and Figure 4 following peer review.