Abstract
While genetic and environmental factors have been shown as predictors of children’s reading ability, the interaction effects of identified genetic risk susceptibility and specified environmental for reading ability have rarely been investigated. The current study assessed potential gene-environmental (G×E) interactions on reading ability in 1477 school-aged children. The gene-environment interactions on character recognition were investigated by an exploration analysis between the risk single-nucleotide polymorphisms (SNPs) which were discovered by previous genome-wide association studies of developmental dyslexia (DD), and parental education (PE). The re-parameterized regression analysis suggested that this G×E interaction conformed to the strong differential-susceptibility model. Results showed that rs281238 exhibits a significant interaction with PE on character recognition. Children with “T” genotype profited from high PE, whereas they performed worse in low PE environment, but “CC” genotype children were not malleable in different PE environments.
Research Highlights
We calculated the Cumulative Genetic Score (CGS) of 9 SNPs related to developmental dyslexia, and found that the interaction between CGS and parental education on reading ability. The G×E comforted to the differential-susceptibility model that those individuals carrying more plasticity alleles were affected more than those carrying more fewer.
The interaction between rs281238 and parental education conformed to the strong differential-susceptibility model that children with “T” allele would have highest reading ability in positive environment and lowest reading ability in adversity environment, whereas children without “T” allele would not be affected by parental education.
Introduction
Developmental dyslexia (DD) is known to be a hereditary neurological disorder with unbalanced development between reading ability and age, which cannot be accounted for by low intelligence, inadequate education, visual or auditory acuity deficits or other mental disorders (Arbanas, 2015). The prevalence of DD is reported to be around 1.3-17.2% among school-age students varying in different orthographies and depending on the criteria used for diagnosis (Di Folco, Guez, Peyre, & Ramus, 2020) and its etiology is influenced by both genetic and environmental factors (Bishop, & Snowling, 2004).
Numerous studies have also suggested that environmental factors, such as birth weight, born preterm, home-literacy environment (HLE), socioeconomic status (SES) and parental education (PE), have important influences on children reading abilities’s development (Allotey et al., 2018; Aram, Korat, & Hassunah-Arafat, 2013; Fernald, Marchman, & Weisleder, 2013; Friend, DeFries, & Olson, 2008; Litt et al., 2012). The studies have highlighted the importance of family socioeconomic status in early childhood literacy development, and these findings consistently show that children from high socioeconomic status have higher reading and language skills before and after formal education than children from low socioeconomic status (Hoff, 2003;Noble, Farah, & McCandliss, 2006; Noble, McCandliss, & Farah, 2007; Rowe & Goldin-Meadow, 2009).
In recent years, a few studies have attempted to identify single nucleotide polymorphisms (SNPs) associated with reading or dyslexia by genome-wide association analysis study (GWAS) (Meaburn et al., 2008; Field et al., 2013; Gialluisi et al., 2014; 2019; 2020; Eicher et al., 2013; Truong et al., 2019). There were four variants genome-wide significantly associated with reading ability and dyslexia in the previous literature (Gialluisi et al., 2019). However, genome-wide association analysis ignored the influence of environment on gene expression, it is essential to consider the effect of environment in the variants.
At present, gene-environment interactions (G×E) on reading ability have been rarely studied, and the sample size of gene-environment interaction studies is usually four times larger than the sample size needed to discover the main effect of genes (Thomas, 2010). Therefore, we wanted to explore the feasibility of the interaction between the set of SNPs in GWAS and the environment.
Although research of G×E on reading ability is little, G×E on behavior have been well-studied in recent years (Manuck, & McCaffery, 2014; Rutter, 2006). For example, behavioral genetic studies with identical and fraternal twins have suggested that the degree of genetic influence on, or heritability of, individual differences in cognitive and reading abilities varies with SES (Turkheimer, Haley, Waldron, D’Onofrio, & Gottesman, 2003) or PE (Friend et al., 2008). The bioecological model is usually applied in twin studies that explain how does the heritability of certain phenotype vary with the environment. Friend et al. (2008) as the first twin study of G×E on reading disability found that genetic influence was higher among children whose parents had a high level of education, than those children whose parents had a lower level of education.
Consistently, molecular genetic studies further reported that dyslexic candidate genes’ influence on reading disability may depend on the environment. Mascheretti et al. (2013) explored five candidate genes’ (DYX1C1, DCDC2, ROBO1, KIAA0319, and TTRAP) interactions with a series of environmental factors. Results showed that children with DYX1C1 risk alleles would perform worse in adversity environment, such as low birth weight, low SES and history of smoking during pregnancy but would not be affected in positive environment. Su et al. (2015) investigated G×E on orthography processing in Chinese children and found that children carrying the minor allele of rs1091047 exhibited a smaller N170 effect in low home-literacy environment than those children from high hone-literacy environment. Both studies were consistent with the diathesis-stress model, which suggests that heritability for a particular behavior would be greater in poor environments while the deleterious gene would not be observed in more supportive environments and this model has been proposed to explain why certain behavioral disorders had a greater association with risk genes in environments where individuals have been exposed to a great deal of stressful life events (Caspi, et al., 2002; 2003).
Recently, however, a theoretical alternative to the diathesis-stress model has been proposed (i.e., the differential-susceptibility model, for summary, see Belsky & Pluess, 2013) and applied to the study of gene-environment interactions (Belsky & Pluess, 2009). The differential-susceptibility model framework stipulates that some individuals are not only more susceptible to negative environmental factors but also to positive ones as well. According to this model, some ‘risk genes’ might be better conceptualized as ‘plasticity genes’ (Belsky et al., 2009). In early literacy instruction area, Kegel, Bus, & IJzendoorn (2011) have found children had differential susceptibility of environment. Individuals with 7-repeat allele of the dopamine receptor D4 (DRD4-7R) profited most from the positive feedback of computer program, whereas they performed worst of early literacy skills in the absence of such feedback. And more recently, many other molecular genetic studies of gene-environment interactions have confirmed this method (Green et al., 2017; Wang, Tian, & Zhang, 2020).
So far, studies on G×E effect have mainly focused on Single Nucleotide Polymorphisms (SNPs), such as SNPs from previous genome-wide association (GWA) studies (Manuck & McCaffery, 2014). GWA study is a theoretical approache to gene discovery that is meant to find phenotype associated genetic variation without regard to biological pathway and function. Therefore, the 9 single-nucleotide polymorphisms (SNPs) from previous genome-wide association (GWA) studies in different cohorts in developmental dyslexia and reading ability were selected in the present study as genetic factors on reading ability of Chinese children. We focused on parental education (PE) as a marker of environmental variable. Parental education (PE) as an environmental factor has been shown to be a strong predictor for a variety of cognitive outcomes of children (Bradley & Corwyn, 2002; Craig, 2006). We predicted that if genetic and environmental influence were interdependent in reading ability, the effects of genetic signal should vary along the PE distribution. In confirmation analysis, two gene-environment interaction models (the diathesis-stress model vs. the differential-susceptibility model) were compared in a re-parameterized analysis to identify which model was a better fit to our data.
Methods
Participants
A total of 3217 primary school students from grade 3 to grade 6 were recruited from three provinces in northwestern part of China (Shaanxi Province, Gansu Province and Inner Mongolia Province). All these participants were school-aged children without any neurological disorder or neurotropic drugs history. In total, 2476 saliva samples of these participants were eligible for subsequently assaying genes and 2415 students’ genetic data were available at last. Finally, there were 1477 students had all the phenotype, genotype and environmental data (age = 116.34 ± 12.14 months, male/female = 737/740). Ethical approval was obtained from Shaanxi Normal University and written informed consent was obtained for all the participants’ parents.
Measures
Genetic analyses
DNA was obtained from oral epithelial cells in students saliva samples and 2476 samples were genetyped using Illumina Asian screening array (650K) by Beijing Compass Biotechnology. Quality control was performed standard quality control metrics (Anderson et al., 2010; Chang, 2015) by using PLINK v1.9 (http://pngu.mgh.harvard.edu/purcell/plink/). Eight samples were excluded as they had sex discrepancies between the records and the genetically inferred data. Fifty-three samples were removed because had unexpected duplicates or probable relatives (PI-HAT>0.20). Next, SNPs were eliminated if they showed a variant call rate <0.95, a minor allele frequency (MAF) <0.02, a missing genotype data (mind) <0.90, or a hardy-weinberg equilibrium (HWE) <10-5 with each dataset. Then, autosomal variants were aligned to the 1000G genomes phase 1v3 reference panel for imputation, which follow the standard procedure consistent with previous GWAS studies (see Gialluisi et al., 2020). Finally, significant reading-related or dyslexia-related SNPs were extracted from our data.
We collected all the SNPs which were associated with reading and recorded by the GWAS Catalog (https://www.ebi.ac.uk/gwas/). There were 4 SNPs (rs1555839, rs17663182, rs349045 and rs16928927) showed a strict significant association with reading on genome-wide level (p ≤ 5 × 10-8), however, none of them existed in the current study sample. Thus, we loosened the criteria from p ≤ 5 × 10-8 to p ≤ 5 × 10-7. Then, there were 22 SNPs recorded by GWAS Catalog (for detailed information see Supplementary material Table S1) and 9 of them existed in our samples (see Table 1).
Reading ability
Each child’s reading ability was tested by Chinese character recognition test, a reading test used in Mainland China (Lei et al., 2011; Pan & Shu, 2014). The test consisted of 150 single Chinese characters selected from China’s Elementary School Textbooks (1996). The average frequency of the characters was 182 per million (ranging from 0 to 2282), and the reliability of this test was .95 (Pan & Shu, 2014). Each child was individually tested and required to read aloud each character at a time. Each child’s reading score, namely the number of characters reading correctly, was recorded. Finally, 2269 children completed this test.
Parental education (PE)
Students’ parental education (PE) were collected with questionnaire. Scores from 1 to 8 represent parents’ highest educational qualification from primary school, junior high school, senior high school, junior college, undergraduate, master, doctor to postdoctor, respectively. Both mother’s and father’s education level were collected. PE were the average of maternal and father’s highest educational achievement. In total, 1507 students’ PE information were obtained.
Data analysis
Both exploratory and confirmatory analytic approaches consistent with Widaman et al. (2012) were used for 9 SNPs. Standard exploratory analysis was aimed to test whether there were G×E effects on character recognition. Confirmatory re-parameterized approach was employed to contrast the different hypothesis of G×E, i.e., strong and weak forms of the differential-susceptibility and diathesis-stress models to determine which provided the best, most parsimonious fit to the data.
Exploratory CGS
To examine the interaction between 9 SNPs and PE, we compute the cumulative-genetic socre (CGS), the risk allele was coded by 1 while the normal allele was coded by 0 (e.g. the two different alleles of rs281238 were C and T, and the risk allele was T, then genotype of CC, CT and TT was coded by 0, 1, 2, respectively). Parental education (PE) was the environmental variable X1, and a continuous variable CGS was the genetic variable X2. Age and sex were covariates.
The standard multiple regression model can be written as:
where Y is the dependent variable (reading ability); A0 is the intercept, A1 and A2 are regression slopes for main effects of environment (X1) and CGS (X2), respectively; A3 is the regression coefficient for the product variable (X1 × X2) and represents the difference in slope on X1 for the CGS; A4 and A5 are the regression slopes for covariants age and sex; E is a stochastic error term.
Confirmatory CGS
Following Widaman et al. (2012), we re-parameterized the regression model, allowing a testing of alternative forms of the G×E interaction, as:
here Equation 2 is the re-parameterized the regression model for PRS and single SNP, respectively; C is the point on X1 or X at witch the slopes for the different PRS or genotype cross. If the cross point of C and its confidence interval (CI) is within the range of value on environment (X1 or X), the interaction tested is disordinal, reflecting differential-susceptibility model. On the contrary, if the cross point of C or its confidence interval (CI) is greater than or equal to the most positive point on environment (X1 or X) in this study, the interaction is ordinal, consistent with diathesis-stress model.
Next, to compare the efficiency of differential-susceptibility model and diathesis-stress model, we construct Model 3a, Model 3b. In Model 3a, if the cross point of C and its confidence interval (CI) is within the range of value on environment (X1 or X), the interaction tested is disordinal, reflecting differential-susceptibility model. On the contrary, if the cross point of C or its confidence interval (CI) is greater than or equal to the most positive point on environment (X1 or X) in this study, the interaction is ordinal, consistent with diathesis-stress model.
Results
All the nine SNPs existed in our sample were conformed to Hardy Weinberg equilibrium(p > .05, see Table 1).
Standard exploratory analysis
CGS
The standard multiple regression of CGS and PE without G×E was fit to data on character recognition. It had an R2 = .2552 (p < .001), with environment effect significant, Â1 = 1.87 (SE = .43), p < .001, but not the gene main effect, Â2 = −.80 (SE = .29), p = .78. The main effects of covariates age and sex were also significant. Adding the G×E product term to the equation results in an increase in ΔR2, of .0023, as was the coefficient for the G×E product term itself, Â3 = .52 (SE = .24), p = .0307. (See Table 2 and Figure 1). After ANOVA analysis, , F ratio > 1.0 suggested that we can further to evaluate competing theoretical models (Belsky & Widaman, 2018).
Single SNP
After test the G×E of CGS, we verify the interactions of each single SNP (See Supplementary materials Section1, Table S2 and S3). Only the SNP of rs281238 was significant after Bonferroni correction as shown in Table S3. Results of the other SNPs are shown in supplementary materials Table S2. Model 1 in Table S3 has an R2 = .2531 (p < .001), with a significant main effect of PE, Â1 = 1.88 (SE = .43), p < .001, but not the SNP main effect, Â2 = .01 (SE = .67), p = .98. The G×E interaction produced a significant ΔR2 of .0035, p = .005 (see Model 2). The coefficients of rs281238, Â2 = −5.08 (SE = 1.94), p = .009, and G×E product term, Â3 = −1.56 (SE = .56), p = .005, and , by ANOVA analysis, so confirmatory re-parameterized analysis can be estimated. Figure 2 shows the difference in the impact of environment on character recognition of different-genotypes.
Confirmatory re-parameterized analysis
Different versions of Equation 2 reflecting strong and weak forms of differential-susceptibility and diathesis-stress models for G×E interaction on character recognition.
Differential-susceptibility vs. diathesis-stress model for CGS
The estimated cross-over point C fell close to the sample mean on PE and the 95% CI of C fell within the range of PE. In Model 3a, Ĉ = 3.34 (SE = .56), the 95% CI of Ĉ [2.24, 4.43], and explained a significant amount of variance in character recognition, R2 = .2555, p < .0001 (see Table 2). Choosing the highest PE value as C leads to Model 3b, the diathesis-stress model. In Model 3b, diathesis-stress model, explained a significant amount of variance in character recognition, R2 = .2536, p < .0001. Compared with Model3b, Modle3a adds a parameter Ĉ, resulting in a significant increase in R2, ΔR2 = .0019, p = .049, so Model3a were accepted.
Differential-susceptibility vs. diathesis-stress model for CGS
The estimated cross-over point C fell close to the sample mean on PE and the 95% CI of C fell within the range of PE. In Model 3a, Ĉ = 3.21 (SE = .36), the 95% CI of Ĉ [2.50, 3.91], and in Model 3b, Ĉ = 3.21 (SE = .42), the 95% CI of Ĉ [2.39, 4.03]. Model 3a represent the strong differential-susceptibility model suggests that children without the risk allele T would be unaffected by PE, but children with different number of the risk allele T would be positively influenced by PE at different degrees. Model 3a explained a significant amount of variance in character recognition, R2 = .2567, p < .0001 (See Section 2 & Table S3). The nested model of Model 3a, relaxing the constraint that A1 = 0, leads to Model 3b, the weak differential-susceptibility model. Model 3b has a modest increase on explaining character recognition variance over Model 3a, but not reach the significant level, ΔR2 = .0002, p = .52. It suggested that adding a parameter A1 in the model cannot increase the fitness of the model significantly. Thus, we accept the strong differential-susceptibility model and reject the weak differential-susceptibility model.
In Model 3c, the strong diathesis-stress model, children without risk allele T would not affect by PE, however, reading ability of children with risk allele T would increase with the improvement of PE, but never catch up with the children without allele T. Model 3c explained a significant amount of variance in character recognition, R2 = .2452, p < .0001. The nested model of Model 3c, which relaxing the constraint that A1 = 0, leads to Model 3d, the week diathesis-stress model which explained a large amount of additional variance over that explained by Model 3c, ΔR2 = .0078, p = .0001. Therefore, we found no significant basis for accepting the parsimonious Model 3c, putting support in favor of the weak diathesis-stress model as a more optimal representation of the data.
Since there are no statistical tests can evaluate the efficiency of Model 3a and Model 3d because of the same degree of freedom. Therefore, AIC and BIC values were employed to compare these two models. Since Model 3a has lower values in both AIC and BIC than Model 3d, we accepted Model 3a, the strong differential-susceptibility model as the best model for the current data. In order to provide an effective way to display the interactions in data, we plot predicted values under strong differential-susceptibility model. The plot of predicted character recognition score is shown in Figure S1, where the number of T allele determine children’s malleability. Below the cross-over point (low level of PE), children’s reading ability increase from low to high in TT group, TC group and CC group as a function of PE. However, above the cross-over point (high level of PE), TT and TC group shows consistently higher performance than CC group as a function of PE.
Discussion
In the current study, we found that CGS and rs281238 had a significant interaction with parental education on Chinese children’s reading ability and this interaction effect was consistent with the differential-susceptibility model. To our knowledge, this is the first study that implemented a molecular genetic approach to investigate the interaction of environmental factor and SNPs associated with dyslexia from GWAS in reading ability by employing a multiple regression and re-parameterized analysis to distinguish gene-environment interaction between differential-susceptibility and diathesis-stress models.
We calculated cumulative gene scores for nine SNPs and found an interaction between CGS and parental education on reading ability, suggesting the effect of multiple genes on complex traits (Plomin & Davis, 2009). The study also demonstrated cumulative effects on genes (Tyler et al ., 2009; Steiger et al ., 2016), thus, those children carrying more plasticity alleles were influenced more than those carrying fewer on reading, providing new evidence for the differential-susceptibility model for G×E effect on reading ability. This finding also supplemented evidence supporting the differential-susceptibility model for G×E effect on behavioral traits (Belsky et al., 2013; Wang et al., 2020) which offers an alternative, evolutionary-inspired view: in order to improve adaptation of different environments, some individuals are simultaneously sensitive to both negative and positive experience (Bakermans-Kranenburg & van IJzendoorn, 2006). Although GWAS found significant loci, their explanatory rate for reading was very small. This finding supports the idea that G×E interaction as a possible aid in gene discovery(Thomas, 2010) and the consideration of environment in the GWAS studies might increase the efficiency of gene. Our results provide a good insight into the influence of environment as well as the presence of multiple genes.
Moreover, the current study was performed in normal school-age students and the reading ability is normally distributed, thus, these results might put an explanation for both children with high and low reading ability. Since positive environments can facility those children with risk alleles get a better reading performance, it is instructive for intervention plans. Future gene-environment research should consider both positive and adverse environmental factors.
More importantly, we observed that the gene-environment interactions between rs281238 and parental education on reading ability were consistent with differential-susceptibility model. The SNP, rs281238, locates in SEMA6D, a member of the semaphorin gene family, which codes a transmembrane protein that might play a navigational role in the signaling process of the nervous system (He, Wang, Koprivica, Ming, & Song, 2002). Other SNPs in SEMA6D, such as rs1378214, rs281320 and rs281323, have been reported a significant association (p < 5×10-8) with cognitive ability, attention deficit hyperactivity disorder (ADHD) and education attainment and have been replicated in independent studies (Kichaev et al., 2019; Klein et al., 2019; Lam et al., 2017; Lee et al., 2018; Okbay et al., 2016). SNPs in SEMA6D, such as rs12910916, rs2573570 and rs8039398 have also been reported related with reading, mathematics, cognition, education achievement and ADHD (Davies et al., 2018; Gialluisi et al., 2019; Lam et al., 2017; Lee et al., 2018; Martin et al., 2017; Rietveld et al., 2014). All these results suggest the impact of SEMA6D on cognitive and reading related traits. Taken together, our results put new evidence for the importance of SEMA6D in the complex cognition area - reading ability.
Notably, there are several limitations in the current study. First, genome-wide significant SNPs reported in previous literature of reading and dyslexia (Giallusisi et al., 2019) were not genotyped in our data and therefore were not tested for G×E effect on reading ability in this study. Future studies with genome sequencing of this Chinese cohort might be desirable to further test G×E effect on reading ability with genome-wide significant SNPs. Second, we only tested parental education as environmental factor. Other environmental factors might be valuable to be used to test G×E effect on reading ability in future studies. Finally, to further verify the external validity of this study, it is essential to replicate these findings in independent cohorts in future.
In summary, we have provided initial evidence for how the significant SNPs in developmental dyslexia GWA studies affect children’s reading performance by interacting with environmental factor of parental education. Our results indicated the validity of the differential-susceptibility model in reading ability.
Data Availability
The beta values of the SNPs in this paper can be found in previous studies. The statistical methods and statistical data available from the corresponding author upon reasonable request. The raw data are not publicly available due to legal or ethical restrictions.
Conflict of interest
The authors have no conflicts of interest to disclose.
Data available statement
The beta values of the SNPs in this paper can be found in previous studies. The statistical methods and statistical data available from the corresponding author upon reasonable request. The raw data are not publicly available due to legal or ethical restrictions.
Acknowledgments
This work was funded by National Natural Science Foundation of China (61807023), Funds for Humanities and Social Sciences Research of the Ministry of Education (CN) (17XJC190010), Natural Science Foundation of Shaanxi Province (CN) (2018JQ8015), and Fundamental Research Funds for the Central Universities (CN) (GK201702011) to Jingjing Zhao. This study was also funded by Fund for Humanities and Social Sciences Research of the Ministry of Education (19YJC190023), and the China Postdoctoral Science Foundation funding project (2019M663924XB), Natural Science Foundation of Shaanxi Province (CN) (2021JQ-309), and Planning Subject for the 14th Five Year Plan of Shaanxi Education Sciences (SGH21Y0040) to Zhengjun Wang.