ABSTRACT
Background Gestational exposure to polychlorinated biphenyls (PCBs) has been associated with elevated risk for neurodevelopmental disorders. The mechanism of risk is unclear but may involve placental epigenetics. Prior studies have associated differential placental DNA methylation with maternal PCB exposure or with increased risk of autism spectrum disorder (ASD). However, sequencing-based placental methylomes have not previously been tested for simultaneous associations with maternal PCB levels and child neurodevelopmental outcomes.
Objectives We aimed to identify placental DNA methylation patterns associated with maternal PCB levels and child neurodevelopmental outcomes in the high-risk ASD MARBLES cohort.
Methods We measured 209 PCB congeners in 104 maternal serum samples collected at delivery. We identified networks of DNA methylation from 147 placenta samples using the Comethyl R package, which performs weighted gene correlation network analysis for whole genome bisulfite sequencing data. We tested placental DNA methylation modules for association with maternal serum PCB levels, child neurodevelopment, and other participant traits.
Results PCBs 153 + 168, 170, 180 + 193, and 187 were detected in over 50% of maternal serum samples and were highly correlated with one another. Consistent with previous findings, maternal age was the strongest predictor of serum PCB levels, alongside year of sample collection, pre-pregnancy BMI, and polyunsaturated fatty acid levels. Twenty seven modules of placental DNA methylation were identified, including five which significantly correlated with one or more PCBs, and four which correlated with child neurodevelopment. Two modules associated with maternal PCB levels as well as child neurodevelopment, and mapped to CSMD1 and AUTS2, genes previously implicated in ASD and identified as differentially methylated regions in mouse brain and placenta following gestational PCB exposure.
Conclusions Placental DNA co-methylation modules were associated with maternal PCBs and child neurodevelopment. Methylation of CSMD1 and AUTS2 could potentially be mechanistically involved in ASD risk following maternal PCB exposure.
1. INTRODUCTION
Polychlorinated biphenyls (PCBs) are a class of 209 synthetic compounds that are ubiquitous in the environment and harmful to human health. Prior to their international ban in 2001 (Stockholm Convention, 2001), PCBs were commonly used in industrial processes and consumer products, such as plastics, paint, and pesticides. Over 20 years later, PCBs continue to be released into the environment from spills/improper disposal, degradation of PCB-containing materials, and inadvertent production (Grimm et al., 2015; Hu & Hornbuckle, 2010), resulting in their presence throughout our environment, from oceans (Wagner et al., 2019), soil (reviewed in (Wolska et al., 2014)) and landfills (reviewed in (Gabryszewska & Gworek, 2021), to human food products (Saktrakulkla et al., 2020) and indoor air, including air in schools (reviewed in (Herrick et al., 2016)). PCBs are detectable and persistent in many marine (reviewed in (Domingo & Bocio, 2007)) and land animal species (reviewed in (Rhind, 2002)), as well as in humans, including pregnant people (Granillo et al., 2019) and newborns (Berghuis et al., 2013; Mori et al., 2014; Yu et al., 2019). Adult exposure to PCBs by inhalation, absorption through skin, or consumption of contaminated foods such as dairy, meat and fish, increases risk for many diseases, including cancer (reviewed in (Leng et al., 2016; Wan et al., 2022)), cardiovascular disease (reviewed in (Gupta et al., 2018)) and reproductive abnormalities (reviewed in (Rattan et al., 2017)). Of significant concern, fetal exposure to PCBs via maternal blood has been associated with adverse health outcomes in the offspring, including increased risk of neurodevelopmental disorders such as autism spectrum disorder (ASD) (Panesar et al., 2020; Pessah et al., 2019).
ASD is a group of neurodevelopmental disorders characterized by impairments in social reciprocity and communication, restrictive interests, and repetitive behaviors. Risk for and protection from ASD is influenced by a wide variety of genetic and environmental factors as well as gene by environment interactions (reviewed in (Chaste & Leboyer, 2012; Cheroni et al., 2020; Lyall, Croen, Daniels, et al., 2017)). Diagnoses of ASD have been increasing over the past several decades and are currently estimated to affect approximately 1 in 44 children in the United States (Maenner, 2021). These sobering statistics are a primary premise for the hypothesis that gestational exposure to environmental chemicals influence individual risk for ASD. However, the mechanisms by which environmental factors influence risk remain controversial (Lein, 2015).
Epigenetic marks are widely posited as one mechanism mediating gene by environment interactions that confer ASD risk (Lein, 2015; Panesar et al., 2020) with the placental epigenome functioning as a potential biological substrate (Zhu et al., 2022). The placenta is a fetal tissue that invades the uterus and is responsible for providing nutrients and oxygen to the fetus as well as removing waste. While the placenta filters out many toxic chemicals from the maternal bloodstream, small lipophilic chemicals, such as PCBs, can cross the placental barrier and enter the bloodstream of the fetus (Guvenius et al., 2003; Park et al., 2008; Soechitram et al., 2004). Indeed, PCB levels in human placenta have been found to significantly correlate with those in maternal blood and umbilical cord blood (Jeong et al., 2018). This placental exposure can alter patterns of DNA methylation, a tissue-specific epigenetic mark that is influenced by both genetic and environmental factors. DNA methylation marks are dynamic during development when they are being established and then remain mostly stable, affecting gene expression throughout the lifetime (Moore et al., 2013).
Recently, differential DNA methylation has been identified in human placenta and umbilical cord blood of individuals later diagnosed with ASD (Mordaunt et al., 2020; Zhu et al., 2022). Moreover, maternal PCB levels have been associated with altered DNA methylation in human placenta (Ouidir et al., 2020) and in mouse placenta and fetal brain (Laufer et al., 2022). Both Ouidir et al. and Laufer et al. found enrichment for pathways related to neurodevelopment and neuronal/brain functions. However, no human study to date has examined associations between maternal PCB exposure, whole-epigenome placenta DNA methylation, and offspring neurodevelopmental outcomes.
In this study, we measured 209 PCB congeners from 104 samples of maternal serum collected at delivery and analyzed DNA methylation networks from 147 placental samples using weighted gene correlation network analysis (WGCNA) (Langfelder & Horvath, 2008) (Figure 1). We correlated placental DNA methylation modules with maternal serum PCB values, as well as participant traits related to child neurodevelopment, maternal health, demographic factors, and birth factors. Using this approach, we identified known ASD risk genes CSMD1 and AUTS2 and other potential biomarkers of PCB exposure relevant to neurodevelopment.
2. METHODS
2.1. Study Population
2.1.1. MARBLES and EARLI cohorts
This study included mother-child pairs that were enrolled in the Markers of Autism Risk in Babies – Learning Early Signs (MARBLES) study from 2006-2016. As previously described (Hertz-Picciotto et al., 2018), MARBLES enrolls mothers in northern California who already have a biological child with ASD and are therefore at increased risk of delivering another child who will develop ASD. Enrollment in the MARBLES study required families to meet the following criteria: 1) the mother or father already had a biological child diagnosed with ASD, and the mother was 2) at least 18 years of age; 3) already pregnant or planning to become pregnant; 4) living within a 2-hour drive of the Davis/Sacramento area; and 5) proficient in English. This study used 104 maternal serum samples from MARLBES for PCB analysis and 147 placentas (after quality control) from MARBLES for DNA methylation analysis; 95 of the serum and placenta samples came from a mother-fetus pair.
This study uses an independent replication cohort from the Early Autism Risk Longitudinal Investigation (EARLI) to validate results from DNA methylation analysis. Similarly to MARBLES and previously described elsewhere (Newschaffer et al., 2012), EARLI enrolls mothers who already have a biological child with ASD. Families were recruited from Drexel/Children’s Hospital of Philadelphia, Johns Hopkins/Kennedy Krieger Institute, Kaiser Permanente Northern California, and University of California, Davis. Enrollment in the EARLI study required families to meet the following criteria: 1) the mother or father already had a biological child diagnosed with ASD, and the mother was 2) at least 18 years of age; 3) less than 29 weeks pregnant; 4) living within a 2-hour drive of the study site; and 5) proficient in English or Spanish. This study used 47 placentas from EARLI to confirm the preservation and quality of the DNA co-methylation modules.
2.1.2. Diagnostic classification for autism spectrum disorder
Children enrolled in MARBLES and EARLI were clinically assessed for autism spectrum disorder (ASD) around 36 months of age by professional examiners. Diagnosis was based on the Autism Diagnostic Observation Schedule (ADOS) (Lord et al., 2000), Autism Diagnostic Interview – Revised (ADI-R) (Lord et al., 1994), and Mullen Scales of Early Learning (MSEL) (Mullen, 1995). ADOS is a standardized assessment tool that uses structured and semi-structured activities that allow the examiner to evaluate communication skills, social interaction, and imaginative use of materials. The ADOS module 2 scores range from 0 to 28, with scores equal to or higher than 7 meeting the cut-off for ASD. ADI-R is a structured interview with a parent or caregiver to discuss the developmental history and current behavior of the child; this interview can be useful to distinguish ASD from other developmental disorders. MSEL is a standardized evaluation whereby the child completes a set of motor, visual, and language tasks which are assessed by the examiner. MSEL has four subscale scores (fine motor, expressive language, receptive language, and visual reception) as well as an overall early learning composite score, which is referred to as Mullen Score in this manuscript. The composite score range is 49 to 155, with scores below 70 indicating neurodevelopmental delays.
Based on ADOS and MSEL scores, children in MARBLES and EARLI were categorized into three outcome groups: ASD, typical development (TD), and non-typical development (non-TD)(Chawarska et al., 2014; Ozonoff et al., 2014; Schmidt et al., 2019). Children classified in the ASD group had ADOS scores above the cut-off and met the DSM-5 criteria for ASD. Children classified in the TD group had MSEL scores that were all within 2 SD and no more than one MSEL subscale score that was 1.5 SD below the normative mean, as well as ADOS scores that were 3 or more points below the cut-off score for ASD. Children that met the criteria for neither ASD nor TD were classified as non-TD; these children had low MSEL scores (two or more MSEL subscale scores more than 1.5 SD below the normative mean or at least one MSEL subscale score more than 2 SD below the normative mean) and/or high ADOS scores (within 3 points of the cut-off score for ASD).
2.1.3. Participant traits
MARBLES participants provided demographic, nutritional, environmental, pregnancy, and medical information through telephone-assisted interviews and mailed questionnaires. We selected particular traits of interest, coded them numerically for analysis, and grouped them into four categories for ease of discussion: demographics, maternal health, birth factors, and child neurodevelopment. Information on some traits was not available for all participants, in which case the data were removed and statistical analyses performed on the data available.
Demographic traits evaluated include both parents’ birthplace (California/not California), race (white/non-white), ethnicity (Hispanic/non-Hispanic), education level (less than high school/high school diploma or GED/some college/Bachelor’s degree/graduate or professional degree), and age at time of the child’s birth, as well as home ownership (renter/owner), delivery payer (public/private insurance at time of child delivery), and sex assigned at birth of the enrolled child (male/female). Maternal health factors evaluated include diagnosis of gestational diabetes (yes/no), pre-eclampsia (yes/no), or chronic hypertension (yes/no), as well as maternal BMI pre-pregnancy (continuous; calculated from weight and height), maternal plasma polyunsaturated fatty acid (PUFA) levels (DPA + DHA + EPA measured in maternal serum), maternal omega 3s (high/low), prenatal vitamin use in month 1 of pregnancy (yes/no), and maternal smoking during pregnancy (yes/no). PUFA and omega 3 data have been previously reported (Huang et al., 2020). Birth factors evaluated include delivery method (vaginal/C-section), birth month, birth season (fall, winter, spring, summer), birth year (2006-2016), parity (number of pregnancies over 20 weeks gestation prior to this pregnancy), Apgar score at 5 minutes, Apgar score at 1 minute, birth weight (grams), birth length (cm), and gestational age at delivery (number of weeks into pregnancy when delivery occurs). Factors related to child neurodevelopment include age at which ADOS and MSEL tests were performed, MSEL composite score, ADOS comparison score, and diagnosis of ASD (ASD, TD, non-TD).
Participant traits were correlated with one another using Pearson and Spearman correlations and p-values were corrected for multiple testing by FDR using the psych R package v2.2.5 corr.test function (Revelle, 2022). Pearson measures linear correlation, whereas Spearman measures monotonic correlation, providing less restrictive but also less powerful results. Given the structural diversity of our data, we elected to use both correlation measures to capture as many potential relationships as possible. We adjusted p-values by FDR due to its power to detect true positives (Benjamini & Hochberg, 1995, 2000; Chen et al., 2010; Korthauer et al., 2019). GraphPad Prism version 9.4.1, GraphPad Software, San Diego, California USA, www.graphpad.com, chi-squared test was used to test if prenatal vitamin use differed among ASD, non-TD, and TD diagnostic groups, while one-way ANOVA and Tukey’s multiple comparisons test were used to test differences in MSEL scores across diagnostic groups.
2.2. PCB measurements and analysis
2.2.1. Serum collection and storage
Maternal serum was collected in red-top serum-separator vacutainers at time of delivery and processed with centrifugation within 4 hours if possible, and stored frozen at -80 degrees Fahrenheit. For this analysis, specimen aliquots were selected based on: having a minimum of 6 mL at delivery, child diagnosis data available, and a placenta specimen available for WGBS. Serum aliquots were de-identified and shipped to the Analytical Core of the Iowa Superfund Research Program frozen on dry ice. A pooled serum specimen from samples with ample volume was created and included in each analysis run for quality assurance purposes. In this paper, PCB congeners are referred to by their Ballschmitter and Zell number (US EPA, 2015).
2.2.2. Chemicals and materials
13C-labeled PCBs, including 13C12-PCB 3, 13C12-PCB 15, 13C12-PCB 31, 13C12-PCB 52, 13C12-PCB 118, 13C12-PCB 153, 13C12-PCB 180, 13C12-PCB 194, 13C12-PCB 206, and 13C12-PCB 209, were purchased from Cambridge Isotope Laboratories Inc. (Andover, Massachusetts, USA). Deuterated PCB 30 (d5-PCB 30) was obtained from C/D/N Isotopes (Pointe-Claire, Quebec, Canada). A calibration standard containing all 209 PCB congeners and PCB 204 (internal standard) were purchased from AccuStandard, Inc. (New Haven, Connecticut, USA). Unique chemical identifiers of all 209 PCB congeners have been reported previously (Li, Westra, et al., 2022). Pesticide grade solvents, such as hexane and dichloromethane, were purchased from Fisher Scientific (Pittsburgh, Pennsylvania, USA). The Standard Reference Material (SRM 1957) was obtained from the National Institute of Standards and Technology (NIST, Gaithersburg, Maryland, USA).
2.2.3. Extraction of PCBs from maternal serum
Serum samples were extracted and analyzed by gas chromatography-tandem mass spectrometry (GC-MS/MS) analysis using published procedures (Li, Hefti, et al., 2022; Marek et al., 2014; Milanowski et al., 2010) in the Analytical Core of the Iowa Superfund Research Program. Briefly, tissue samples were homogenized in 3 mL isopropanol. One mL of diethyl ether was added, and samples were spiked with isotope-labeled PCB (5 ng/each in hexane). The tubes were inverted for 5 minutes and centrifuged at 1,378 g for 5 minutes. The organic layer was transferred to a new tube. The residue was re-extracted with 1 mL of isopropanol and 2.5 mL of hexane and diethyl ether (9:1, vol/vol). The combined organic extracts were washed with 5 mL of phosphoric acid (0.1 M in 0.9% aqueous sodium chloride), and the aqueous phase was re-extracted with 1 mL of hexane and diethyl ether (9:1, v/v). The combined extracts were concentrated to near dryness under a gentle stream of nitrogen, reconstituted in 4 mL of hexane, and 2 mL of a potassium hydroxide solution (0.5 M in water-ethanol, 1:1, v/v) was added. After inversion for 5 min and centrifugation at 1,378 g for 3 min, the organic phase was separated, and the bottom aqueous phase was re-extracted with 3 mL of hexane. The combined organic phase contained the PCBs. The PCB fraction was concentrated to ∼ 0.5 mL under a gentle stream of nitrogen and passed through a glass cartridge filled with 2 g of acidified silica gel (silica gel and concentrated sulfuric acid, 2:1, w/w) with 0.2 g activated silica gel at the bottom and prewashed with 3 mL of hexane. PCBs were eluted from the cartridge with 14 mL of hexane. The eluent was concentrated to 4 mL and treated with 2 mL of concentrated sulfuric acid. After inversion for 5 min and centrifugation at 1,378 g for 3 min, the organic phase was collected, and the sulfuric acid layer was re-extracted with 3 mL of hexane. The combined organic extract was concentrated to about 50 µL and transferred to a glass autosampler vial with an insert. Before analysis, the sample was spiked with the internal standards (d5-PCB 30 and PCB 204; 5 ng each in hexane).
2.2.4. GC-MS/MS determination of PCBs
PCB samples were analyzed on an Agilent GC system coupled with an Agilent 7000 Triple Quad in the multiple reaction monitoring (MRM) mode on an SPB-Octyl capillary column (30 m length, 250 µm inner diameter, 0.25 µm film thickness; Sigma-Aldrich, St. Louis, Missouri, USA). The following temperature program was used: 45 °C, hold for 2 min, 100 °C/min to 75 °C, hold for 5 min, 15 °C/min to 150 °C, hold for 1 min, 2.5 °C/min to 280 °C, and hold for 5 min. The injector temperature program is as follows: start at 45 °C, hold for 0.06 min, 600 °C/min to 325 °C and hold for 5 min. The transfer line temperature was 280 °C. Helium was used as a carrier gas with a constant flow rate of 0.8 mL/min. The precursor-product ion transitions of all PCB analytes used for the MS/MS analysis have been reported previously (Li, Hefti, et al., 2022).
2.2.5. Quality assurance/quality control for the GC-MS/MS analysis
Extraction efficiency, reproducibility, and accuracy were assessed using 13C-labeled surrogate standards, method blanks, and analysis of a standard reference material (SRM 1957, NIST). Limits of quantification (LOQs) were calculated as , where indicates the mean of method blank results, k indicates the Students′ t-value appropriate for the single-tailed 99th percentile t statistic and a standard deviation estimate with n-1 degree of freedom, and sb indicates the sample standard deviation of the replicated method blank samples. The recovery rates of the surrogate standards ranged from 70±10 % to 97±9 %.
2.2.6. Statistical analysis of PCBs and PCB-participant trait relationships
PCB congeners were assessed for detection frequency, defined as the percent of the 104 materal serum samples with PCB values over the LOQ, and labeled as detected (non-censored) or non-detected (censored) for each sample. Individual or co-eluting congeners that were detected in ≥50% (n=52) of samples were selected for further analysis due to the increased robustness of statistical methods. PCBs 153+168, 170, 180+193, and 187 met this cutoff. All PCB congener values (ng) were then adjusted by maternal serum weight (g), and these serum weight-adjusted values were used in all remaining analyses. PCB analyses described in this section were completed with the NADA R Package v1.6-1.1 using R v4.1.3, unless otherwise specified (Helsel, 2005). NADA is optimized for censored environmental data (Lee, 2020)
Summary statistics (mean, median, standard deviation) were estimated for PCBs 153+168, 170, 180+193, and 187 using the censtats function, which reports statistics from three methods: reverse Kaplan-Meier (a non-parametric test that does not assume a normal distribution of the data), Robust Regression on Order Statistics (ROS; a semi-parametric model that assumes the detected data is normally distributed but makes no assumptions about the censored data), and Maximum Likelihood Estimation (MLE; a parametric model that assumes a normal distribution of all data). The data were log-transformed for ROS and MLE to attain normality, due to the right-skew of the non-transformed data. All three models avoid removal, replacement, or imputation of the nondetected values, instead estimating statistics using all data points.
PCBs 153+168, 170, 180+193, and 187 were correlated with one another using Kendall’s tau for left-censored data, a nonparametric correlation coefficient that measures the monotonic association between two variables, one or both of which may include censored data. Both detected and non-detected data were included in the correlations. MARBLES participant traits (related to demographics, maternal health, birth factors, and child neurodevelopment) were correlated with PCBs 153+168, 170, 180+193, and 187 values using Kendall’s tau for left-censored data, and p-values were corrected by FDR. To assess which participant traits most contributed to the levels of PCBs in the maternal serum, we computed fitted multiple linear regression models. First, trait-PCB congener pairs were filtered to those that were associated with a raw p-value < 0.2. If multiple trait-PCB congener pairs met this cut-off and the traits were correlated with each other, r > 0.5, the trait-PCB pair with the less significant p-value was removed from the analysis. This was done to remove multicollinearity and increase the reliability of the model. With the remaining PCB-trait pairs, a fitted multiple linear regression equation for censored data was computed using the cenmle function. We then assessed detection frequencies and congener levels for all congeners across maternal age groups (>35 vs ≤35), offspring birth year (2006-2011 vs 2012-2016), maternal BMI (>25 vs ≤25), offspring ASD diagnosis (ASD vs non-TD vs TD), and offspring sex (male vs female) using coin R package v1.4.2, VGAM R package v1.1.6, and censReg R package v0.5.34.
To further examine the relationship between maternal serum PCB levels and diagnosis of ASD in the offspring, the Peto-Prentice non-parametric test was used to test the means of PCB values between diagnostic groups (ASD/non-TD/TD). The Peto-Prentice test reports two-sided p-values which were divided in two to produce one-sided p-values. To further examine the relationship between maternal serum PCB levels and maternal pre-pregnancy BMI, we correlated PCB-BMI tau with Kow, the octanol-water partition coefficient, for each PCB: 153+168, 170, 180+193, and 187 (Hawker & Connell, 1988). For co-eluting PCB congeners (PCBs 153+168 and 180+193), we averaged the Kow of the two congeners.
2.3. Placental DNA methylation analysis
2.3.1. Whole genome bisulfite sequencing (WGBS)
The placental samples and WGBS data used in this study were previously described (Ladd-Acosta et al., 2021; Zhu et al., 2022). Briefly, 157 MARBLES and 47 EARLI placental samples were frozen within 4 hours of birth. DNA was extracted from the placental samples and treated with sodium bisulfite. Following whole genome sequencing of bisulfite-treated DNA, sequencing files were processed, aligned to the hg38 genome, and converted to CpG methylation count matrices.
2.3.2. Placental DNA co-methylation network analysis
We analyzed placental DNA methylation using Comethyl, v1.1.1 with R v 4.1.0, a published R package that performs weighted gene correlation network analysis (WGCNA) for whole-genome bisulfite sequencing (WGBS) data (Mordaunt et al., 2022). Comethyl forms modules from groups of genomic regions that have correlated methylation, meaning that the proportion of CpGs that are methylated in a given region correlate with the proportion of CpGs methylated in other regions across samples. Modules are biologically informative because regions with correlated methylation tend to be part of the same biological pathways and are less susceptible to random variation s in methylation than are individual CpG sites. In order to build modules from genomic regions that are particularly informative, we filter regions to include only those that have variable methylation across samples and therefore, may contribute to differential phenotypes. Methylation of a given module is represented by a single value for each sample, called the “eigennode”, where a more positive eigennode value reflects higher methylation levels and a more negative value reflects lower methylation.
Specifically, 157 individual placenta Bismark CpG reports were read into a single BSseq object that was used to calculate the number of CpGs at different filtering options for both sequencing coverage and number of samples. Ten samples were removed due to low sequencing coverage, resulting in 147 samples being used for analysis. CpGs with at least 4x coverage in at least 80% of the 147 samples were selected to form regions, resulting in 13,056,694 filtered CpGs (44% of CpGs). Regions, defined as 3 or more CpGs separated by no more than 150 base pairs, were filtered for those with at least 12x coverage and a methylation standard deviation of at least 5%, resulting in 199,944 filtered regions. The methylation data was then adjusted for the top 26 principal components to adjust for potential confounding variables (Parsana et al., 2019). Using the adjusted methylation data, scale-free topology with Pearson correlations was assessed to evaluate fit and connectivity of the model. A soft power threshold of 15 was selected, meaning that all correlations were raised to the power of 15 to increase the strength of strong correlations and minimize weak correlations and background noise. Finally, modules were identified using a two-phase clustering approach that reduces computational intensity: filtered regions were grouped into 6 blocks using k-means clustering, then a full network analysis was performed for each block. Modules were identified in each block through hierarchical clustering and adaptive branch pruning, then were merged if their eigennode values were highly correlated (r > 0.9). Eigennode values are a single value that summarizes the methylation of a module for each sample. Regions not assigned to a module were assigned to the grey module.
2.3.3. Co-methylation module, sample, and trait correlations
Once modules were identified, we correlated module eigennode values with one another to identify highly related pairs. Correlations were calculated by Bicor, and p-values were derived based on Fisher’s transformation.
We then correlated module eigennode values with MARBLES participant traits (related to demographics, maternal health, birth factors, and child neurodevelopment) as well as measured maternal serum PCB values. Module eigennode values were correlated with participant traits using Bicor, and with PCB values using Kendall’s tau for left-censored data (NADA R package).
2.3.4. Co-methylation module annotation
Module regions were annotated to genes using the Genomic Regions Enrichment of Annotations Tool (GREAT) and the hg38 genome (McLean et al., 2010). A region assigned to a module was annotated to a gene if its genomic coordinates fell within the gene’s regulatory domain, defined as 5 kb upstream and 1 kb downstream from its transcription start site, or up to 1 Mb from the transcription start site until the nearest gene’s regulatory domain. Regions can map to zero, one, or multiple genes. Gene information was provided by BioMart while gene context and CpG context were provided by annotatr (Cavalcante & Sartor, 2017). Genic features include promoters (<1kb upstream of a transcription start site), 5’UTRs, exons, introns, 3’UTRs, enhancers, intergenic regions, and regions 1kb-5kb upstream of a transcription start site (Cavalcante & Sartor, 2017). CpG features include CpG islands (regions of the genome with a high frequency of CpG sites as defined by UCSC Golden Path), CpG shores (the 2kb upstream and downstream of CpG island boundaries), CpG shelves (the 2kb upstream and downstream of CpG shores), and the open sea (beyond 4kb from CpG island boundaries) (Cavalcante & Sartor, 2017).
We analyzed the overlap between module regions and partially methylated domains (PMDs) using placenta PMDs previously identified by a machine learning approach on the hg18 genome (Schroeder et al., 2013). We used the UCSC liftover tool to lift the hg18 PMDs to hg38, then compared regions assigned to modules to PMDs; a region was considered within a PMD if the entire region fit in the PMD.
2.3.5. Co-methylation module preservation
To assess the consistency of modules generated by the MARBLES samples we performed module preservation using an independent dataset from the EARLI study.We called modules from the EARLI dataset using CpGs and regions restricted to those covered in the MARBLES dataset. Methylation data was corrected for the top 16 principal components, and a soft power threshold of 23 was selected to achieve a high fit for the model. Pearson correlation was then used to identify associations between regions and to call modules. We evaluated the MARBLES and EARLI modules through multiple statistics, assessing the quality, preservation, separability, and accuracy of each module (Langfelder et al., 2011). Module assignments are permuted 100 times to calculate Z-scores and P-values. Based on simulation studies by Langfelder et al., 2011, we considered Z-scores over 10 to be strong evidence for module preservation and Z-scores between 2 and 10 to be weak-moderate evidence.
3. RESULTS and DISCUSSION
3.1. Highly chlorinated PCB congeners are most frequently detected in maternal serum
We measured levels of all 209 PCB congeners in 104 MARBLES study samples of maternal serum collected at time of delivery (Figure 2A) (Supplemental Tables S1 and S2) (Hertz-Picciotto et al., 2018). While most human PCB studies use serum collected during pregnancy, delivery serum provides a unique timepoint to study lipid-soluble PCBs because maternal plasma free fatty acids have been found to double during labor (Kashyap et al., 1976). In 27 instances, two-four PCB congeners co-eluted, resulting in 173 PCB measurements per sample. Each sample had between zero and 41 PCB congeners detected, with a mean of 6.68 and a median of 4 congeners detected per sample. Eighty-one single or co-eluting PCB congeners were detected in zero samples; 76 in 1-10% of samples; 5 in 11-20% of samples; 7 in 21-50% of samples; and 4 in ≥50% of samples (Supplemental Table S1). These detection frequencies are lower than those in postmortem human brain using similar methods (Li, Hefti, et al., 2022). In both studies, highly chlorinated PCBs were more likely to be detected than lower chlorinated PCBs, despite the decreased response of the instrument as the degree of chlorination increases. This may be explained by highly chlorinated PCB congeners’ resistance to metabolization, leading to bioaccumulation in the body (Kato et al., 1980; Mathews & Anderson, 1975; Mills et al., 1985).
The four PCB congeners/congener pairs that were detected in ≥50% (n=52) of samples were selected for further analysis due to the increased robustness of statistical methods when censored data points represent less than half the data: PCBs 153s + 168, 170, 180 + 193, and 187. These PCB congeners were also detected in ≥50% of postmortem brain samples (Li, Hefti, et al., 2022) and found at high concentrations in maternal blood plasma, cord blood plasma, breast milk (Guvenius et al., 2003), and placenta (Naqvi et al., 2018). No PCB congeners were detected in fewer than 50% of all samples but in greater than 50% of one of the ASD diagnostic groups (ASD/non-TD/TD), indicating that we did not miss a congener that may discriminate between diagnostic groups.
3.2. Highly detected PCB congeners (153 + 168, 170, 180 + 193, and 187) have variable but correlated levels in maternal serum
We estimated summary statistics (mean, median, standard deviation) for PCBs 153 + 168, 170, 180 + 193, and 187 (Table 1 and Supplemental Table S3) using both the detected and non-detected data in the models, which has been shown to be more effective than substituting or replacing non-detected data points (Ahmadi et al., 2021; Canales et al., 2018). Inputted PCB values were adjusted by serum weight, which did not differ significantly across ASD diagnostic groups (ASD/non-TD/TD) (Supplemental Figure S4). PCBs 153 + 168 and 180 + 193 had higher median and mean concentrations than PCBs 170 or 187. PCB 153 and PCB 180 have previously been detected as among the 12 most abundant PCB congeners in maternal serum in the MARBLES cohort (Sethi et al., 2019), as well as highly detected in brain tissue (Chu et al., 2003; Corrigan et al., 1998; Dewailly et al., 1999; Li, Hefti, et al., 2022; Mitchell et al., 2012). Additionally, PCB 153 + 168 are the least lipophilic of this group, meaning they are more readily released from adipose tissue into the blood, potentially resulting in higher average serum levels (Louis et al., 2016).
We next tested the hypothesis that measured values of PCB congeners in serum correlate with one another. Confirming previous findings (DeVoto et al., 1997; Pauwels et al., 1999), PCBs 153 + 168, 170, 180 + 193, and 187 were all significantly correlated (tau correlation coefficients ranged from 0.48 to 0.63, with all p-values < 10−12), though correlation coefficients were lower than in previous studies, likely due to our inclusion of non-detected data points (Figure 2B) (Supplemental Table S4).
3.3. Participant traits related to demographics, maternal health, birth factors, and child neurodevelopment are correlated
Next, we took advantage of the large volume of metadata collected within the MARBLES study to explore relationships between traits related to demographics, maternal health, birth factors, and child neurodevelopment (Supplemental Tables S5 and S6). To identify linear and monotonic relationships, we correlated all numerically-coded participant traits using Pearson (linear) (Figure 3A) (Supplemental Tables S7, S9) and Spearman (monotonic) (Supplemental Figure S2) (Supplemental Tables S8, S10) correlations. Spearman detected 54 significant trait-trait correlations (FDR < 0.05), two of which were not detected by Pearson, while Pearson detected 55, three of which were not detected by Spearman, indicating the value of performing both methods. Unsurprisingly, traits within a given category were often strongly correlated, including demographic traits of parental ages, parental education levels, home ownership, and delivery payer (public or private insurance), as well as paired demographic traits of the two parents such as birthplace, race, ethnicity, age, and education. Birth factors, including birth weight and birth length, as well as maternal health factors, including omega 3 and plasma PUFA levels, were also strongly correlated (FDR < 0.05).
ASD diagnosis strongly correlated with ADOS comparison score (FDR = 0) and Mullen score (FDR = 0), which are both used to diagnose ASD, as well as birth year (FDR < 0.01), with rates of diagnosis increasing from 2006 to 2016. This aligns with previous findings that ASD is increasing in prevalence and/or diagnostic rates (CDC, 2022). ADOS comparison score correlated (FDR < 0.05) with prenatal vitamin use, which has previously been shown to be protective for ASD (Levine et al., 2018; Schmidt et al., 2011, 2019). To further examine this relationship in our cohort, we used chi-squared to test if prenatal vitamin use in month 1 of pregnancy (yes/no) was significantly different for ASD diagnostic groups (TD/non-TD/ASD). Aligning with previous findings, prenatal vitamin use was associated with ASD diagnosis in the protective direction (P = 0.0062) (Supplemental Figure S3). Additionally, Mullen Score was significantly associated with mom education (FDR = 0.02), with greater educational attainment as child Mullen Score increased. Higher Mullen scores indicate greater cognitive development and decreased likelihood of ASD diagnosis (Supplementary Figure S4). Low maternal education has been associated with increased risk and severity of ASD in previous studies (Dong et al., 2022; Leonard et al., 2005; Schoultz et al., 2010).
3.4. PCB levels in maternal serum are correlated with participant traits, particularly maternal age, but not child ASD diagnosis
Next, we explored associations between participant traits and levels of PCBs 153 + 168, 170, 180 + 193, and 187 in maternal serum collected at delivery (Figure 3B). Based on existing literature, we hypothesized that maternal serum PCB levels would be positively associated with diagnosis of ASD in the offspring (Bernardo et al., 2019; Lyall, Croen, Sjödin, et al., 2017), negatively associated with maternal BMI (Koh et al., 2016; Lan et al., 2021; Mullerova et al., 2008), and negatively associated with child birth year (year of maternal serum sample collection), with the latter prediction reflecting data indicating that environmental levels of highly chlorinated PCBs, including PCBs 153 + 168, 170, 180 + 193, and 187, are decreasing over time (Brändli et al., 2007; Jones et al., 1992).
Kendall’s tau for left-censored data was estimated for each PCB-trait pair, and p-values were adjusted by FDR (Supplemental Table S11, S12, S13). PCBs 153 + 168, 170, 180 + 193, and 187 were not significantly (FDR < 0.05) associated with ASD diagnosis, which we confirmed by the Peto-Prentice test (Supplemental Table S14). At FDR < 0.05, PCBs 153+168 and 187 were significantly associated with maternal and paternal ages, paternal education level, and home ownership (renter/owner); PCB 170 with maternal and paternal ages; and PCB 180+193 with maternal and paternal ages and education levels (Supplemental Figure S5). Because the demographic variables of maternal and paternal age and education, home ownership, and delivery payer are all highly correlated with one another (Figure 3A), it was difficult to identify which traits were the most significant predictors of PCB levels in maternal serum.
To address this question, we computed fitted multiple linear regression models for censored data. First, we identified all PCB-trait correlations with FDR < 0.2. This cut-off was purposely lenient to include all traits that may be important in affecting PCB levels. To address multicollinearity, we removed a PCB-trait pair from analysis if the trait was highly correlated with another trait (r > 0.5) that was more strongly associated (lower FDR) with the PCB congener. With all PCB-trait pairs that met these cut-offs, we performed multiple linear regression using a log scale to normalize the left-censored PCB data. Though all congeners’ multiple linear regression models shared characteristics, each was distinct despite the high correlation between all four PCB congeners/congener pairs (Figure 2B). Maternal age (MomAge) was the most significant predictor (P < 0.00001) of maternal levels of PCBs 153 + 168, 170, 180 + 193, and 187, with increasing age predicting increasing serum PCB levels. PCBs 153 + 168 (P < 0.05) and 180 + 193 (P < 0.005) were also strongly predicted by the child’s birth year (year of maternal serum sample collection), with decreasing levels of PCBs from 2006 to 2016. Higher maternal serum levels of PCB 153 + 168 was also predicted by lower maternal BMI (P < 0.05), while higher maternal serum levels of PCB 180 + 193 were predicted by higher levels of maternal plasma polyunsaturated fatty acids (P < 0.05) (Figure 3B). Standard errors of the coefficients and other regression model variables are reported in Supplemental Table S14.
Fitted multiple linear regression models:
lnPCB 153 + 168 = 211 + 0.134*MomAge - 0.0322*MaternalBMI - 0.109*BirthYeara
lnPCB 170 = -8.9024 + 0.1127*MomAge
lnPCB 180 + 193 = 219.31014 + 0.09813*MomAge + 0.00588*MaternalPlasmaPUFA – 0.11286*BirthYeara
lnPCB 187 = -10.1156 + 0.12414*MomAge
aBirth year refers to child birth year which is also the year of maternal serum sample collection
The multiple linear regression models partially confirmed our hypotheses that child diagnosis of ASD, year of sample collection, and maternal BMI would be significant predictors of maternal serum PCB levels. PCBs 153 + 168, 170, 180 + 193, and 187 were not significantly predicted by ASD diagnosis or ADOS score, which we confirmed by the Peto-Prentice test (Supplemental Table S15). However, the negative associations between PCB levels and year of sample collection as well as maternal BMI confirmed our hypotheses and aligned with previous publications. Due to the negative association between BMI and PCBs, we decided to not lipid-adjust our PCB values. Previous studies have indicated a potential causal effect of PCBs on serum lipid levels (Hennig et al., 2005; Langer et al., 2003), and have found lipid standardization to be prone to bias (Schisterman et al., 2005). Further, different groups of PCBs are differently associated with total serum lipids, making lipid adjustment difficult for all 209 PCB congeners in our study (Aminov et al., 2013).
To investigate why PCB 153 + 168 was more strongly negatively correlated with (and predicted by) maternal BMI than the other three PCB congeners/congener pairs we correlated the BMI-PCB tau value with the octanol-water partition coefficients (Kow) for each PCB congener/pair (Supplemental Figure S6) (Supplemental Table S16). Kow represents the lipophilicity of a compound, with more positive values representing increased lipophilicity and decreased water-solubility. Of the four PCB congeners/congeners pairs, PCB 153 + 168 is the least lipophilic (lowest Kow) and has the strongest negative PCB-BMI correlation tau value. This inverse relationship between BMI and serum levels of PCBs with lower lipophilicity is revealed among individuals with lower fat stores (lower BMI) that have higher amounts of PCB 153 + 168 in their bloodstream. This may occur with PCB 153 +168 more than the other congeners because its lower lipophilicity, meaning PCB 153 +168 has lower propensity to stay in the fat stores and greater solubility in the bloodstream (Bruno et al., 2021).
Lastly, the multiple linear regression models showed that PCB 180 + 193 serum levels are predicted by maternal plasma polyunsaturated fatty acid levels. PCBs have previously been associated with polyunsaturated fatty acids in human serum due to the high prevalence of both PCBs and PUFAs in fish and fish oil (Ashley et al., 2013; Bourdon et al., 2010; Dellinger et al., 2018; Ge et al., 2019; Tøttenborg et al., 2015). However, we did not include dietary information in this study, so cannot validate a hypothesis for the association between PCB 180 + 193 and dietary polyunsaturated fatty acid intake.
Based on our findings with participant traits and PCBs 153 + 168, 170, 180 + 193, and 187, we assessed detection frequencies (Figure 3C) and levels (Supplemental Figure S7) of all PCB congeners by offspring sex, offspring ASD diagnosis, maternal age, maternal BMI pre-pregnancy, and offspring birth year (year of maternal serum sample collection) (Supplemental Table S17). Overall, 18 congeners were more frequently detected in women who gave birth at over age 35 (and therefore had their serum collected at over age 35) than those who gave birth at age 35 or younger. One congener was more frequently detected in women with pre-pregnancy BMI 25 or under (PCB 169), while one congener was more frequently detected in women with TD offspring compared to ASD/non-TD (PCB 194). Seven congeners were more frequently detected in those with sample collection (child birth year) between 2006-2011 compared to 2012-2016.
3.5. Placental DNA co-methylation modules map to large blocks of adjacent regions
We next assessed DNA methylation patterns from placental samples in the MARBLES cohort to analyze the relationships between DNA methylation, maternal serum PCB levels, and participant traits including those related to child neurodevelopment. Although we did not see a significant association between maternal serum PCB levels and child ASD diagnosis at the population level, placental DNA methylation may act as a measurable intermediary at the level of the individual. To identify patterns of DNA methylation, we used the Comethyl R package (Mordaunt et al., 2022), which performs weighted gene correlation network analysis (WGCNA) with WGBS data. This method allowed us to look at modules of co-methylated regions instead of individual CpG sites, which are more susceptible to stochastic variation. Additionally, Comethyl module eigennodes can be correlated with any number of traits of interest, using a statistical method of choice.
CpGs from 157 placental samples were grouped into regions from which modules were constructed. Correlations between region methylation were all raised to the power of 15 to amplify strong connections and decrease the background noise of weak connections (Figure 4A). Regions formed 6 blocks (Supplemental Figure S15) from which 27 modules were identified, plus a grey module, to which regions not assigned to another module were assigned. From 199,945 total filtered regions, 928 were assigned to a module (0.46%) while the remaining regions (99.54%) were assigned to the grey module. Each of the 27 modules contained between 10 and 89 regions (Figure 4B), which is consistent with previous Comethyl findings using cord blood samples (Mordaunt et al., 2022). However, the modules generated from placenta were unique because most modules mapped to large blocks of adjacent regions (Figure 4C) (Supplemental Table S23). This is in contrast to other tissues, where modules typically map to short regions across several chromosomes (Mordaunt et al., 2022). This unique finding led us to investigate where the modules mapped on the genome.
3.6. Placental DNA co-methylation modules annotate to gene introns and CpG open sea contexts
We tested the hypothesis that co-methylated modules mapped to partially methylated domains (PMDs), regions of the placental genome with reduced average methylation levels (approximately 40-45% methylated) that are highly variable across individuals (Schroeder et al., 2013). Because we selected for regions with a high standard deviation in methylation levels and found that filtered regions with the highest standard deviations had 40-60% mean methylation (Supplemental Figure S13), we hypothesized that these regions would preferentially map to PMDs (Schroeder et al., 2013). Of the 928 regions assigned to modules (excluding the grey module), 407 (43.86%) mapped to PMDs (Supplemental Table S23). PMDs cover an average of 37% of the placental genome (Schroeder et al., 2013), showing that PMDs are over-represented in modules compared to the placental genome (chi-squared P < 0.0001) (Supplemental Table S24). Of the 27 modules, 11 consist of 100% regions that map onto PMDs; 9 consist of 100% regions that fall outside of PMDs; and 7 consist of a mixture. The average mean methylation across the regions that do not map to PMDs is 0.6628 (SD 0.1142), while the average mean methylation across the regions that do map to PMDs is 0.6262 (SD 0.1124). These relatively high standard deviations indicate the high variability in methylation levels across regions in both PMDs and non-PMDs.
We then annotated regions within modules to genes, and identified regions’ genic features (i.e. promoter, exon, intron, UTR, enhancer, intergenic) and CpG features (i.e. island, shore, shelf, open sea) (Supplemental Table S23). We found that 877/928 (94.5%) of regions within modules mapped to open sea CpG contexts, supporting previous findings that most DMRs do not reside at CpG islands (Visone et al., 2019). 700/928 (75.4%) of regions in modules mapped to introns of genes, while only 116,878/199,944 (58.5%) of total filtered regions mapped to introns. This indicates that introns are over-represented amongst regions that co-methylate in placenta, compared to background regions (chi-squared P < 0.0001) (Supplemental Table S24). The enrichment for introns is likely related to the finding that the entire gene body, including introns and exons, is more highly methylated over active genes in placenta (Schroeder et al, 2016), and introns cover a larger genomic territory than exons.
We also correlated module eigennodes with one another to identify highly similar modules and explore whether they annotated to the same or different genes (Figure 4D) (Supplemental Figure S16). The most highly correlated pairs of modules were DarkTurquoise with LightCyan (Bicor = 0.792, p-value = 6.08E-33) and Red with DarkRed (Bicor = 0.718, p-value = 1.44E-24). Though these modules are strongly correlated, they were not combined into a single module because they did not meet the Bicor cut-off of 0.9. The DarkTurquoise and LightCyan modules consist of large blocks of adjacent regions that map to CSMD1, a gene that codes for a complement control protein. Nearly all of CSMD1 lies within a placental PMD (Schroeder et al., 2013). Regions in both modules map primarily to CSMD1 introns with one region from each module mapping to one of CSDM1’s 69 exons. The Red and DarkRed modules both consist of large blocks of adjacent regions that map to AUTS2, activator of transcription and developmental regulator. The DarkRed and Red regions lie approximately 396-526kb and 364kb-1Mb upstream of AUTS2, respectively, showing that regions from the two modules overlap with one another on the genome. Overall, these results highlight the complexity of DNA co-methylation in the placental genome.
3.7. Placental DNA co-methylation modules are correlated with child neurodevelopment and map to genes previously associated with ASD, including GALC, AUTS2, and CSMD1
To explore potential relationships between placental DNA methylation and child neurodevelopment, we correlated module eigennodes with participant traits using Bicor (Figure 5A). We also mapped regions within modules to genes in order to relate methylation in certain genes to traits of interest (Figure 5C). Four modules significantly (P < 0.05) correlated with a child neurodevelopmental trait (ASD diagnosis, Mullen Score, or ADOS comparison score): Salmon, DarkRed, Red, and DarkTurquoise (Figure 6 and Supplemental Figures S17-S19). These modules mapped, respectively, to GALC, AUTS2, AUTS2, and CSMD1, all of which have previously been associated with ASD (Cukier et al., 2014; Hori et al., 2021, p. 2; Zhu et al., 2022).
The GALC Salmon module negatively correlated with ASD diagnosis (P < 0.01), as well as ADOS comparison score (P < 0.05), but positively correlated with Mullen Score (P < 0.05). The Salmon module also positively correlated with maternal education (P < 0.05), indicating that increased placental DNA methylation at Salmon loci (GALC gene) correlates with more typical development (and less ASD diagnosis) in the offspring and higher maternal education levels. This also fits with the positive correlation of maternal education with the Mullen Score, for which higher values reflect improved cognition and lower ASD diagnosis (Figure 5A). The Salmon module regions map to GALC, a gene that encodes the enzyme galactosylceramidase, a component of myelin, the protective coating around some nerve cells. These regions were previously identified as placenta ASD hypomethylated DMRs in the MARBLES cohort using a DMR bioinformatic approach (Zhu et al., 2022).
The AUTS2 Red and DarkRed modules correlated negatively with Mullen Score, indicating higher methylation at AUTS2, a known ASD risk gene, when Mullen Scores are lower and ASD is more likely to be diagnosed. This study may be underpowered to detect a significant correlation between these modules and ASD diagnosis (Red P = 0.219, DarkRed P = 0.243), a categorical variable (ASD, non-TD, TD), despite their significance with Mullen Score, a continuous variable. Additionally, the correlation with Mullen Score but not with ADOS Score may reflect that methylation at this loci is associated with cognitive measures, as reflected by Mullen, but not with social aspects of ASD, as reflected by ADOS. The AUTS2 DarkRed module also correlated negatively with maternal omega 3s, while the Red module correlated negatively with prenatal vitamin use, suggesting that high omega 3s and taking prenatal vitamins are associated with lower DNA methylation at this locus, in the expected protective direction for cognition.
The CSMD1 DarkTurquoise module negatively associated with both ADOS comparison score (P < 0.05) and maternal chronic hypertension (P < 0.005) (Figure 6A-B). Again, a significant correlation was not detected with ASD diagnosis (P = 0.080), potentially because it is a categorical variable, or with Mullen Score, perhaps reflecting the sensitivity of CSMD1 methylation to the social aspects of ASD that are measured by ADOS rather than the cognitive aspects that are measured by Mullen. Hypertensive disorders during pregnancy have been shown to potentially increase risk of ASD in offspring (Curran et al., 2018) and common variants in CSMD1 have previously been associated with both hypertension (Hong et al., 2010) and ASD (Cukier et al., 2014), as well as schizophrenia (Liu et al., 2017; Ripke et al., 2014). CSMD1 has high expression in brain, particularly in neurons (Baum et al., 2020), and was identified as a hypermethylated differentially methylated gene in brain tissue from individuals with Dup15q (Dunaway et al., 2016), a common CNV observed in ASD (Chaste et al., 2014; Hogart et al., 2010).
3.8. Placental DNA co-methylation modules are correlated with maternal serum PCBs
To investigate relationships between maternal serum PCBs and placenta DNA methylation, we correlated module eigennodes with PCBs 153 + 168, 170, 180 + 193, and 187 using Kendall’s tau for left-censored data (Figure 5B). The CSMD1 DarkTurquoise module negatively correlated (P < 0.05) with PCBs 153+168, 180+193, and 187 (Figure 6C-F); the AUTS2 Red (P < 0.05) and ADGRD1 Tan (P < 0.05) modules significantly negatively correlated with PCB 187; the SOX11 DarkGreen module significantly negatively correlated with PCB 170 (P < 0.05); and the RoyalBlue module significantly positively correlated with PCB 153 + 168 (P < 0.05). Of the seven significant module – PCB correlations, six were negative correlations, meaning that individuals with higher serum PCB levels had lower methylation levels at those loci. These findings align with a mouse model of gestational PCB exposure, in which placenta displayed global hypomethylation in both sexes as well as a large skew in DMRs, where 81% of female and 85% of male DMRs were hypomethylated in PCB-exposed mice compared to controls (Laufer et al., 2022).
Interestingly, the CSMD1 DarkTurquoise and AUTS2 Red modules correlated with maternal PCBs as well as measurements of child neurodevelopment. In both modules, the methylation change associated with elevated maternal serum PCB levels appears to be in the opposite direction as ASD in the child, which is the inverse of our hypothesis but accompanies our finding that ASD diagnosis was non-significantly negatively correlated with PCBs 153+168, 170, 180+193, and 187 (Figure 3B). A previous study using a mouse model of prenatal PCB exposure identified DMRs that mapped to CSMD1 as well as AUTS2 in male and female placentas and fetal brains (Figure 5C) (Laufer et al., 2022). Several DMRs mapped to each gene for both sexes and tissues and were inconsistent in the direction of differential methylation, indicating that PCB exposure may dysregulate DNA methylation at CSMD1 and AUTS2 in variable ways. Additionally, methylation levels of CSMD1 in sperm have been inversely correlated with serum DDE levels in Faroese individuals who are exposed to high levels of organic pollutants (Maggio et al., 2021). DDE and PCB exposures often occur together, and both chemicals contain phenyl groups and chlorine molecules, indicating a potential convergent mechanism.
Additionally, it is notable that the CSMD1 DarkTurquoise module correlated with maternal chronic hypertension and serum PCB levels (Figure 6 B-F). Hypertension has previously been associated with PCBs, though correlations in the positive (Raffetti et al., 2020) and the negative (Warembourg et al., 2019) direction have both been found. Increases in systolic and diastolic blood pressure have also been previously associated with high fish intake during pregnancy (Warembourg et al., 2019), which is known to increase exposure to PCBs.
3.9. Placental DNA co-methylation modules are highly preserved in an independent dataset
Lastly, we assessed the placental DNA methylation module quality and preservation in an independent WGBS ASD and TD placenta dataset from the EARLI study (Langfelder et al., 2011). EARLI networks were constructed using the same set of CpGs and regions as were used in the MARBLES dataset. The four types of features assessed were quality, preservation, accuracy, and separability.
All 27 modules had strong evidence for quality and preservation as shown by all modules having summary.qual and summary.pres Z-scores > 10. 19 modules had strong evidence for accuracy, including two of our modules of interest (CSMD1 DarkTurquoise and GALC Salmon). In contrast, only eight modules showed evidence of separability in the reference set (MARBLES) and one module in the test set (EARLI), which did not include our modules of interest. These results suggest that the identified modules are of high quality and are consistent between multiple placenta datasets (Supplemental Figure S20)(Supplemental Table S25).
4. CONCLUSIONS
We measured all PCB congeners in maternal serum collected at delivery from an ASD enriched risk cohort and detected 4 congeners/congener pairs (PCBs 153 + 168, 170, 180 + 193, 187) in more than 50% of samples. While we found maternal age, year of sample collection, pre-pregnancy BMI, and polyunsaturated fatty acid levels to be predictive of maternal serum PCB levels, we did not find an association between maternal PCBs and child ASD diagnosis. This finding may be explained by the fact that PCB congener profiles in pregnant people are changing over time and congeners that affect neuroevelopment may be less prevalent in our population than those in previous studies. Additionally, PCBs may not alter neurodevelopment themselves, but instead modify genetic risk through epigenetic marks such as DNA methylation.
To investigate this potential gene by environment interaction, we used a systems approach to identify regions with correlated DNA methylation in placenta. We identified and replicated 27 co-methylated modules, 4 of which correlated with child neurodevelopment measurements (ASD diagnosis, ADOS comparison score, or Mullen score), and 5 of which correlated with PCB congener levels, as well as other variables, including maternal hypertension and prenatal vitamin use. DNA co-methylation patterns in the placenta identified 2 gene loci (CSMD1 and AUTS2) that were correlated with one or more PCB congeners and child neurodevelopment measures. Interestingly, increased methylation at CSMD1 and AUTS2 correlated with higher rates of ASD diagnosis but lower maternal serum PCB levels, indicating complex dysregulation at these loci during development and the need for further investigation to clarify exact mechanisms. Together, these results demonstrate the utility of comparing placental methylation patterns with multiple variables to understand complex gene by environmental interactions in the etiology of ASD and other neurodevelopmental disorders.
Data Availability
Datasets supporting the conclusions are available in the Gene Expression Omnibus repository (GEO) at accession number (GSE178206). Code and scripts for this study are available on GitHub.
USE OF HUMAN SUBJECTS DECLARATION
This study was performed in accordance with the ethical principles for medical research involving human subjects in the Declaration of Helsinki. The University of California, Davis Institutional Review Board and the State of California Committee for the Protection of Human Subjects approved this study and the MARBLES Study protocols (IRB# 225645). Human Subjects Institutional Review Boards at each of the four sites in the EARLI Study approved this study and the EARLI Study protocols (IRB# 214753). Neither data nor specimens were collected until written informed consent to participate was obtained from the parents.
AUTHOR CONTRIBUTIONS
Julia S. Mouat: methodology, software, formal analysis, writing – original draft, visualization, conceptualization Xueshu Li: investigation, resources, visualization, writing – review and editing Kari Neier: methodology, software, formal analysis, writing – review and editing Yihui Zhu: resources, writing – review and editing Charles E. Mordaunt: software, writing – review and editing Michele A. La Merrill: methodology, writing – review and editing Hans-Joachim Lehmler: investigation, resources, supervision, writing – review and editing Michael P. Jones: conceptualization, writing – review and editing Pamela J. Lein: funding acquisition, supervision, writing – review and editing Rebecca J. Schmidt: funding acquisition, supervision, data curation, writing – review and editing Janine M. LaSalle: funding acquisition, supervision, conceptualization, writing – review and editing, project administration
COMPETING INTERESTS
The authors declare they have no competing interests.
FUNDING
This work was supported by National Institutes of Health NIEHS R01 ES029213 (JML, RJS, PJL), T32 ES00705 (JSM), the UC Davis Intellectual and Developmental Disabilities Research Center (P50 HD103526),and the UC Davis Perinatal Origins of Disparities Center Graduate Student Fellowship (JSM). The PCB analyses were performed in the Analytical Core of the Iowa Superfund Research Program, which is supported by P42 ES013661.
ACKNOWLEDGEMENTS
We would like to thank Anthony E. Valenzuela, University of California, Davis, for joining discussions about this paper and providing helpful background reading; Elizabeth Angel, University of California, Davis, for quickly organizing MARBLES and EARLI metadata; Drs. Rachel F. Marek and Keri C. Hornbuckle, University of Iowa, for supporting the congener specific PCB analysis. The graphical abstract, Figure 1, and Figure 2B were made with BioRender.com.
ABBREVIATIONS
- PCB
- polychlorinated biphenyl
- WGBS
- whole genome bisulfite sequencing
- ASD
- autism spectrum disorder
- TD
- typical development
- DMR
- differentially methylated region
- PMD
- partially methylated domain
- WGCNA
- weighted gene correlation network analysis