Abstract
Background Climate disasters such as wildfires, floods and droughts can introduce significant interruptions and trauma to impacted communities. Children and young people can be disproportionately affected with additional educational disruptions. However, evaluating the impact of disasters is challenging due to difficulties in establishing studies and recruitment post-disasters.
Objectives With the increasing threat of climate change, we aimed to (1) establish a new analytical framework to evaluate the impact of climate disasters on academic achievement and (2) evaluate the impact of the 2014 Hazelwood mine fire (a six-week fire event in Australia).
Methods Bayesian hierarchical meta-regression was developed to evaluate the impact of the mine fire using only aggregated school-level data from the standardised National Assessment Program-Literacy and Numeracy (NAPLAN) test. NAPLAN results and school characteristics (2008-2018) from 69 primary/secondary schools with different levels of mine fire-related smoke exposure were used to estimate the impact of the event. Using an interrupted time series design, the model estimated immediate effects and post-interruption trend differences with full Bayesian statistical inference.
Results Major academic interruptions across NAPLAN domains were evident in high exposure schools in the year post-mine fire (greatest interruption in Writing: 11.09 [95%CI: 3.16-18.93], lowest interruption in Reading: 8.34 [95%CI: 1.07-15.51]). The interruption was comparable to a four to five-month delay in educational attainment and had not fully recovered after several years.
Discussion Considerable academic delays were found as a result of a mine fire, highlighting the need to provide educational and community-based supports in response to future events. Importantly, this work provides a statistical method using readily available aggregated data to assess the educational impacts in response to other climate disasters.
1 Introduction
Each year, climate disasters, such as wildfires, floods, droughts, affect 100 to 300 million people worldwide (Guha-Sapir et al., 2016). The frequency and severity of disasters such as wildfires have increased dramatically in recent years within the context of climate change (CRED, 2019; UNDRR, 2020; Xu et al., 2020). Children and adolescents are particularly vulnerable to experiencing difficulties in the aftermath of climate disasters due to physiological and developmental factors, as well as their dependence on others for care, protection and decision-making, associated with their age (Hoffman, 2008; Norris et al., 2002). The health, social and economic disruption of disasters impose substantial impacts on children and adolescents through factors such as trauma, illness, prolonged school interruption and reduced social support (Peek et al., 2018; Saylor, 1993).
It is well established that exposure to trauma in childhood is a potential pathway to later development of a range of psychological problems, such as anxiety, depression, fear, anhedonia, attention issues, aggressive behaviour and social withdrawal (Lubit et al., 2003). Indeed, posttraumatic stress disorder (PTSD), depression, anxiety and behavioural problems are widely reported in children and adolescents impacted by climate disasters (Kar, 2009; Kar and Bastia, 2006; McLaughlin et al., 2009; Terasaka et al., 2015). Wildfires have also been found to cause traumatic distress for children and adolescents (McFarlane et al., 1987, 1997; Yelland et al., 2010).
Exposure to disaster has the potential to result in long-term neurodevelopmental consequences for children and adolescents. Childhood trauma exposure is known to causing ongoing sensitivity to stress (De Bellis and Zisk, 2014), impair sequential development of brain structures and functions (De Bellis and Zisk, 2014; Perry, 2006) and cause difficulties in attention, memory and executive function (Buckley et al., 2000; Samuelson et al., 2010). These changes can be carried into adulthood and, in turn, increase risks of psychosocial problems throughout the lifespan (Caffo et al., 2005; De Bellis and Zisk, 2014; Edwardset al., 2003; Huh et al., 2014).
The psychological and neurodevelopmental impacts from trauma are important determinants of academic attainment in childhood and adolescence. The academic performance of students who have experienced trauma is generally poorer, and they are recognised to be at greater risk of early school dropout (Perfect et al., 2016; Romano et al., 2015). Although the predominant focus of childhood trauma research has been on the impacts of maltreatment, adverse academic outcomes have been similarly observed in children and adolescents following exposure to climate disaster (Gibbs et al., 2019; Perez-Pereira et al., 2012; Siriwardhana et al., 2013).
Climate disasters also impact children and adolescents through ongoing physical, educational, social and economical disruption. Schools may be forced to close or relocate for periods of time during and after disasters, causing interruptions and sometimes discontinuation in students’ academic programs, recreational activities, and social connections (Gibbs et al., 2019; Jaycox et al., 2007; Masten and Motti-Stefanidi, 2020; Peek and Richardson, 2010). Other physical, economic and social impacts, such as damage to property, loss of job opportunities, reduced incomes, increased distress levels and illness reported impacting adults (Xu et al., 2020; Broder et al., 2020; Norris et al., 2002; North and Pfefferbaum, 2013) can be potentially magnified in children and adolescents. However, the psychological and academic needs of children and adolescents have often been overlooked after disasters (McFarlane et al., 1987; Silverman and La Greca, 2002).
Therefore, there is a need to understand the impact of climate disaster on children and young people particularly under the increasing threat of climate change. However, evaluating the impacts of climate disaster on children and adolescents at a population-level is challenging, due to high cost, low response rates in surveys and lack of pre-disaster measurements necessary for evaluating changes (Kousky, 2016; Grievink et al., 2006; Tang et al., 2017). Administrative data such as the national standardised academic testing results (with participant rates of about 95%) can be used to evaluate the educational impact of disasters (Gibbs et al., 2019). However, accessing individual students’ academic records requires parental consent and complex ethics procedures, which can be a lengthy process and particularly challenging following disaster events.
These factors can lead to a lack of timely understanding of the degree and extent of climate disasters’ impacts, as well as how resources should be distributed to facilitate students’ coping and recovery and prevent long-term deficits and inequity. Therefore, there is an urgent need for establishing advanced methods that can use readily available aggregated educational data to evaluate the impact of climate disasters on children and adolescents.
In February 2014, a bushfire ignited the Morwell coal mine adjacent to the Hazelwood Power Station in the Latrobe Valley, Victoria, Australia and burned for approximately six weeks. Whilst the flames themselves did not directly threaten homes or lives, heavy smoke concentrations throughout the six-week period resulted in increased mortality, physical ill-health and psychological distress in the local community (Broder et al., 2020; Dimitriadis et al., 2021; Gao et al., 2020; Holt et al., 2021.; Ikin et al., 2020; Johnson et al., 2019). The event also caused considerable school disruption, including temporary closures and relocations (Berger et al., 2018; Ikin et al., 2020).
The Hazelwood Health Study (HHS; www.hazelwoodhealthstudy.org.au), established to evaluate the health and wellbeing impact of the mine fire (Ikin et al., 2020; Melody et al., 2020), conducted a school survey to evaluate the psychological outcomes of the mine fire on students attending schools in affected communities. A subsequent evaluation of the National Assessment Program-Literacy and Numeracy (NAPLAN) results of survey participants suggested academic delays in highly smoke-exposed areas (Berger, Gao, et al., 2020). However, the results from the study with the low participation rate, as is frequently the case in post-disaster studies, are at the risk of bias (Pullins et al., 2005; Salloum and Overstreet, 2012) and requires further evaluation. Aggregated school-level NAPLAN data were subsequently obtained to further consolidate our findings.
In this study, we developed a Bayesian interrupted time series hierarchical meta-regression model to evaluate the impact of the Hazelwood mine fire on academic performance. Using this method, instead of individual-level data, only aggregated school-level data from standardised academic tests is required for evaluating spatial and temporal profiles of community-wide traumatic events. Here, we have presented detailed analysis procedures and results to inform the conduct of research concerning educational outcomes after climate disasters.
2 Methods
2.1 Study design
2.1.1 Selection of schools and classification of exposure risks
The modelled particulate matter exposure (PM∼2.5) during the mine fire period suggested that the town Morwell, a statistical area level 2 area (SA2, comparable to township) (ABS, 2011) closest to the mine fire experienced the highest smoke concentrations, which at times greatly exceeded national safety standards (Luhar et al., 2020). The wider Latrobe Valley (statistical area level 3 [SA3] area comparable to a local government area which includes Morwell) suffered a moderate level of exposure (Luhar et al., 2020). Due to wind direction, the nearby SA3 area of Wellington, with a similar socioeconomic profile to Latrobe Valley, had little to no exposure to the smoke. Therefore, Wellington was chosen as the comparison area for this analysis.
As the students would be exposed air pollution both at their school and at home, the smoke exposure of at individual school location was not used in the analysis. Instead, 69 primary and secondary schools in the area were classified into three mine fire exposure groups: Morwell (high exposure), the remainder of the Latrobe Valley (moderate exposure) and Wellington (no/low exposure), see Figure 1.
2.1.2 NAPLAN and school profile data
In Australia, NAPLAN tests are conducted annually in May for Grade 3, 5, 7 and 9 students and NAPLAN examines educational domains of reading, writing, numeracy, spelling and grammar/punctuation (ACARA, 2019). As NAPLAN assesses incremental learning, students’ scores are expected to increase when they are re-tested at each two-year interval, from an average score of around 420 in Grade 3, to 580 in Grade 9 across individual domains (ACARA, 2019).
Aggregated school-level NAPLAN data (mean score and SE for each domain) were requested from the Australian, Curriculum, Assessment and Reporting Authority (ACARA) for all Victorian schools between 2008 and 2018. School profiles, including total enrolments, school sector (government, non-government), school type (primary, secondary), gender proportion and Index of Community Socio-Educational Advantage (ICSEA, a scale indicating socioeducational advantage of the school based on students’ family background information), were provided by ACARA for each year.
2.2 Statistical methods
School characteristics (in 2014) and NAPLAN participation rate (2008-2018) were first compared across the three exposure groups. School-level mean NAPLAN scores pre- and post-mine fire were visualised using box plots by exposure group for each educational domain and student grade level.
2.2.1 Centring NAPLAN scores
NAPLNA tests were designed to represents growth of student’s achievement over time from Grade 3 to Grade 9 with scores increase for each test. As the test score can be interpreted in terms of academic progression, it was not standardized. However, to make results comparable across student’s grades and domains, all scores were first centred against the matching mean scores in regional Victoria (for the same year, grade and educational domain).
2.2.2 Bayesian interrupted time series hierarchical meta-regression
Hierarchical two-level meta-regression models were carried out in a Bayesian modelling framework to estimate the association between mine fire exposure and centred NAPLAN scores. This approach directly model distributions of students’ performance within schools using mean scores and standard errors, which provides unbased estimates that are comparable with the model using individual-level records. Interrupted time series model using only means with linear regression instead of meta-regression will be inefficient (significant loss of statistical power) and also bias towards smaller schools (large schools with more students should be given more weights). Although hierarchical meta-regressions can be carried out in non-Bayesian framework, the complex model structure can introduce convergence difficulties and problematic confidence intervals (van Houwelingen et al., 2002). In the Bayesian framework prior information can be also incorporated.
The modelling framework is illustrated in Figure 2. The first level random effects (random intercepts) were modelled as student cohorts, e.g., the cohort of students progressing from Grade 3 in 2014 to Grade 5 in 2016 at the same school. The nested second level random effects (random intercepts) were modelled at the school-level. The centred mean NAPLAN score at Grade g level for cohort c students in given school s was modelled as follows:
The terms θs and θs,c are the random effects for the school s and cohort c in that school, respectively. θe is the random error term. We assume and θe ∼ . τs,g,c is the standard error of the mean NAPLAN score of the given school, student cohort and grade level (input data). The matrix Xs,g for fixed effects includes potential confounding factors: school characteristics (ICSEA, total enrolments, percentage of girls, school sector), grade level, long-term trend (year) as well as mine fire exposure effect variables detailed below.
The mine fire exposure effect was evaluated using the interrupted time series design (Bernal et al., 2017) with a time-specific interruption variable to capture the immediate effect of the mine fire on that year’s NAPLAN results, and a change in trend from pre- to post-mine fire (an interaction between the time interruption variable and year), see Figure 2(A). It was hypothesised that there would be no interruption effect in the no/low exposure group. Mine fire interruption effects and post interruption trend differences were evaluated separately for the high and moderate exposure groups, summarised as follows: where Em and Eh are indicator variables for moderate and high exposure. Ipost is the indicator variable for pre or post-mine fire (0 for 2008-2013 and 1 for 2014-2018). Tpost is the post-mine fire time variable (0 for 2008-2014, 1 for 2015, 2 for 2016 etc.). Therefore, βe1 and βe2 can be interpreted as the prior-mine fire differences when comparing moderate and high exposure schools with no/low exposure schools. These two coefficients are subsequently referred to as the fixed intercepts. The coefficients βe3 and βe4 can be interpreted as the mine fire interruptions effect (relative to the students’ developmental trajectories) for moderate and high exposure schools, respectively. The coefficients βe5 and βe6 are the post-mine fire trend differences compared with the trend before the mine fire. Other than βe1, βe2…βe6, the vector of coefficients, β, also contains coefficients of other confounding variables detailed above.
2.2.3 Estimation and modelling framework
Estimation was conducted by Markov chain Monte Carlo (MCMC) sampling implemented in the Stan (Stan Development Team, 2020) programming language via RStan package (Stan Development Team, 2020). Stan is a platform for high-performance full Bayesian statistical inference using “No-U-Turn Sampler” (NUTS) (Hoffman and Gelman, 2014). Weakly informative prior distributions were used for standard deviations (SDs) of random effects, namely, σs,c ∼ TN(10, 52), σs ∼ TN(10, 52) and σe ∼ TN(10, 52), where TN is the truncated normal distribution with support over the range [0, infinity). The mean of 10 for prior distributions of the random effects was chosen based on a preliminary exploratory evaluation of all Victorian schools. Weakly informative prior distributions N(0, 502) were used for all fixed effect parameters as non-informative priors base on uniform distributions can lead to a range of model fitting issues (Gelman et al., 2017; Stan Development Team, 2020).
Separate models were estimated for each testing domain. Results were reported as the posterior mean of estimated coefficients (4 Monte Carlo chains with 2000 iterations each), 95% credible interval (CI) and the probability of estimated coefficients (β) being greater or less than 0. Predicted centred NAPLAN score for schools were calculated and visualised for each testing domain and exposure group using line plots.
2.2.4 Sensitivity analyses
A range of sensitivity analyses were undertaken to test the robustness of results, which included using different prior distributions and excluding cohort random effects. Also, two schools in Morwell were relocated during the mine fire event and remained at their relocation sites for an extended period afterwards; hence sensitivity analyses were conducted excluding the two relocated schools. Code for fitting the models using synthetic data is provided in the Supplementary Material I.
3 Results
3.1 School characteristics
The characteristics of schools across exposure groups are presented in Table 1. ICSEA socio-educational advantage scores were lower for schools in the high exposure group (Morwell) compared to schools in the moderate and no/low exposure groups. Other school characteristics, including the percentage of girls, number of students, school sector and NAPLAN test participation rates were comparable between exposure groups.
3.2 NAPLAN score pre-and post-mine fire
Distributions of NAPLAN scores pre- and post-mine fire for each domain and grade level were plotted by exposure group, which are presented in Figure 3. While there was a general trend across all exposure groups for NAPLAN performance to decline post-mine fire, this was greater in higher exposure schools.
3.3 Bayesian hierarchical meta-regression models
Results from the Bayesian hierarchical meta-regression models are presented in Table 2 and Figure 4. Compared with the Victorian regional average, there was an estimated downward trend across all domains of testing for the schools in the three exposure groups (see Figure 4 and Table S1 to S5 in Supplementary Material II). As shown in Table 2, NAPLAN scores were found to be similar between schools in the moderate and no/low exposure group pre-mine fire after controlling for other confounding factors (e.g., there was no evidence that the fixed intercept mode coefficients for moderate exposure differed from 0). However, pre-mine fire NAPLAN scores were estimated to be lower in schools in the high exposure group for most domains when compared with no/low exposure schools (fixed intercept for high exposure ranged between -3 to -14, Table 2).
Table 2 also shows that there were substantial interruption effects post-mine fire in high exposure schools across all academic domains, with estimated mean score reductions of: -10.91 (95%CI: - 18.68, -3.08) for grammar and punctuation; -10.9 (95%CI: -17.98, -3.67) for numeracy; -8.34 (95%CI: -15.51, -1.07) for reading; -10.31 (95%CI: -17.39, -3.38) for spelling and -11.09 (95%CI: -18.93, -3.16) for writing. Typically, NAPLAN scores increase about 27 points per year (ACARA, 2019), which means that the delay in educational attainment in high exposure schools was equivalent four to five months.
After the initial drop in academic performance subsequent to the mine fire, there was evidence that writing and grammar and punctuation scores began to recover (positive slope) in high exposure schools, see Figure 4. However, there was no such improvement post-mine fire for spelling in high exposure schools (estimated slope of -1.49 post mine fire; 95%CI: -4.20, 1.21). Also, the predicted centred NAPLAN scores in 2018 were found to be lower compared with the pre-mine fire period in all academic domains for high exposure schools (see Figure 4), which indicates incomplete recovery five years post-mine fire.
3.4 Sensitivity analysis
The choice of prior distributions for the SDs of random effects were found to have little impact on results; however, using a weakly informative prior distribution reduced the time taken to fit the Bayesian models compared with using a non-informative prior distribution. Sensitivity analysis excluding the cohort random effects (only the school level clustering is considered) produced very similar results except for a slightly larger mine fire interruption effect for high exposure schools (see Table S6). Models excluding the two relocated schools showed results that were consistent with all schools, but with slightly smaller interruption effects (see Table S7). This suggests that school relocation might have had an adverse impact on NAPLAN performance in addition to the mine fire exposure. It’s also possible that the adverse effects were stronger in the relocated schools which were closest to the mine fire.
4 Discussion
This study presents an innovative method to evaluate the impact of disasters on students’ academic performance using only readily accessible aggregated school-level data. Results suggest that the Hazelwood mine fire had a major impact on academic performance in schools located within areas highly exposed to the smoke during the event. The impact of the event on academic achievement was consistent across all NAPLAN testing domains in high exposure schools, which is comparable with four to five months delay in educational attainment. While there was some recovery in academic performance in high exposure schools across all academic domains except for spelling, performance levels remained below those seen prior to the mine fire some four years afterwards. Given academic underachievement can lead to unemployment, social and economic disadvantage, and ill-health later in life (Bowman et al., 2017), it is critical that impacts of climate disaster on the academic progression of students are recognised and remedied.
4.1 Links between mine-fire and academic performance
The impact of the mine fire on academic performance may be attributable to a combination of factors. The mine fire introduced disruption to day-to-day operations in nearby schools (Berger et al., 2018). Two schools were relocated causing direct educational interruptions as well as threats to student’s sense of safety and security (Berger et al., 2018). The mine fire also led to ongoing distress experienced by students, teachers and parents in the smoke affected community post the mine fire (Berger et al., 2018; E. Berger, Maybery, et al., 2020; Broder et al., 2020; Maybery et al., 2020). Hazelwood mine fire-related air pollution has also been linked to a range of adverse physical health effects such as respiratory morbidity (Gao et al., 2020; Holt et al., 2021; Johnson et al., 2019; Shao et al., 2020). More broadly, the community wellbeing was also negatively impacted with local residents reporting a loss of trust in authorities and feeling abandoned with little support provided (Yell et al., 2019a). The subsequent closure of the Morwell mine and Hazelwood power station some three years after the fire occurred introduced additional social and economical challenges in an already disadvantaged community (VCOSS, 2015; Yell et al., 2019b). These factors may have impacted students’ learning abilities both directly and indirectly.
Potential links between air pollution and cognitive development, educational and behavioural outcomes in children is a rapidly increasing area of research. Associations have been found between chronic ambient air pollution and cognitive function as well as educational attainment (Clifford et al., 2016; Forns et al., 2017; Marcotte, 2017). A recent study suggested a link between prenatal airborne polycyclic aromatic hydrocarbons (PAH) exposure, which was also found in Hazelwood mine fire smoke (Reisen et al., 2017), and academic underachievement, particularly in spelling, during early adolescent years (Margolis et al., 2021). However, we are not aware of previous research evaluating associations between medium-duration air pollution events and long-term academic outcomes. Although our study design cannot delineate the relative contributions of all possible factors such as psychological trauma, disruption to schooling, or air pollution exposure, it indicates a possible linkage that warrants further investigation.
4.2 Improvements post-mine fire
Gradual improvements in academic performance were observed across most domains post-mine fire among students from high exposure schools. This may be a result of a slow recovery both at the individual and community level. Increases in school funding from the Victorian Government to upgrade school facilities and a range of implemented government strategies focused on improving health, wellbeing and access to services in the impacted community, may have assisted the recovery process (DTF, 2015; IGEM, 2017). Student’s performance in the spelling domain continued to deteriorate post-mine fire, which may be related idiosyncrasies in English language orthography (Babayiğit and Stainthorp, 2011; Barry and Seymour, 1988), or difficulties in teaching spelling (Adoniou, 2014; Bosman and Van Orden, 1997; Treiman, 2018). As difficulties in spelling may persists and lead to arrested writing development subsequently (Graham and Santangelo, 2014), further investigation is needed.
4.3 Comparison with other studies
Although a theoretical link between disasters and educational outcomes has been well-established (Peek, 2008), most studies have been limited to evaluations of how disasters impacted school attendance or drop-out rates (Perez-Pereira et al., 2012; Pietro, 2018; Siriwardhana et al., 2013) and academic delays have rarely been investigated. One study suggested more than 75% of the African American children evacuated in response to Hurricane Katrina (which resulted in the loss of more than 1,300 lives, 800,000 homes and 110 schools) experienced a decline in grades (Peek and Richardson, 2010). To our knowledge, only Gibbs and colleagues (2019) have previously evaluated the impact of disasters using national standardised academic performance tests. Their study found that exposure to the 2009 Black Saturday bushfire in Australia (loss of 173 lives, 2,000 homes and 3 schools) was associated with delays in academic achievement from Grade 3 to 5 in reading and numeracy but not in writing, spelling, and grammar domains (Gibbs et al., 2019). However, only academic progressions of a single cohort of students (who were in Grade 3 in 2011 and were aged around 7 at the time of the event) were evaluated. Our study, on the other hand, has utilized a time series design to assess academic achievement across all grade-levels completing NAPLAN testing and incorporated more historical data, which sheds further light on academic impacts caused across age groups.
4.4 Strength and limitations
This paper also provides an easily adaptable method to evaluate the impact of different types of community-level traumatic exposures (e.g., disasters and disease outbreaks) on school-aged children’s and adolescents’ academic progression using only aggregated school-level data. The method that has been developed has a number of strengths. Firstly, it enables the evaluation of the spatial and temporal profile of the impact, which could be used to inform policy and resource allocation in academic settings after disasters.
Secondly, the method can be applied using only summary statistics at the school-level to obtain unbiased estimates without losing substantial statistical power to detect meaningful differences. Although only 69 schools were included in our analysis, the study is effectively powered with over 30,000 students who have studied in these schools from 2008 to 2018. Accessing aggregated data is a particularly useful approach to research in this area as it circumvents the challenges of attempting to recruit individuals in communities post-disaster who are already burdened and traumatized as a result of their experiences, and is also a less resource-intensive approach. Similar approach can also be used without the interrupted time series design to study the associations between student’s education performance and other environmental factors such as air pollution.
Lastly, the interrupted time series design also facilitates comparison between academic outcomes pre- and post-disaster to identify changes in trends that may be attributable to the event. This approach can be easily adopted to provide information for teachers, schools and education departments to plan and implement educational modifications and accommodate students’ additional needs post-disaster.
This study also has recognised limitations. The model implemented assumes that student NAPLAN scores within schools were normally distributed, which could be unrealistic if, for example, distributions were skewed or the number of students are small. Although the number of schools in the high exposure group was modest, most of these schools had over 150 students enrolled, so the mine fire interruption effects identified are likely to be robust. Random slopes for schools, an extension of the model applied, was not considered due to the unwarranted increase in modelling complexity with limited data. Random slope and intercept models can be used where there are more schools and time points available. Furthermore, the aggregated nature of the data prohibited evaluating the contribution of individual-level risk factors to the academic outcomes observed. More detailed region-specific data on other risk factors, such as service availability, were not available for this analysis but in principle could be included in the proposed model.
5 Conclusion
This study provides evidence indicating that an extended air pollution event resulted in a delay in academic performance across multiple educational domains, which had not fully recovered several years after the event occurred. While most research to date has focused on the educational impact of maltreatment, or of major disasters causing significant loss of property and life, the current study shows that a community-wide traumatic event posing a minimal immediate risk to life and property can also have considerable long-term educational impacts. This finding highlights the substantial vulnerability of children and adolescents and the need to respond to community-wide disaster events, providing targeted support during and following the event, in the hope of preventing or ameliorating any educational impacts.
This paper provides a novel statistical method for using readily available aggregated data to assess educational impacts of disasters. Implementing research programs post-disaster is enormously challenging. Accordingly, an approach that enables accurate and timely assessment of educational impacts without impost on the community is invaluable. This model could be applied to investigate the effect of other extended events on academic achievement, for instance, the COVID-19 pandemic, which has impacted students’ access to, and delivery of, schooling worldwide.
Data Availability
The data underlying this article were provided by Australian, Curriculum, Assessment and Reporting Authority (ACARA). Data will be shared on request to the corresponding author with the permission of ACARA. However, a synthetically generated data set based on the source data for the tutorial part of the paper is available at https://github.com/CarolineXGao/NAPLAN_ impact.
Funding
This work was funded by the Victorian Department of Health and Human Services. The paper presents the views of the authors and does not represent the views of the Department.
Ethics approval
All procedure of the study was approved by the Monash University Human Research Ethics (project number: 5834) and the Victorian Department of Education and Training.
Data availability statement
The data underlying this article were provided by Australian, Curriculum, Assessment and Reporting Authority (ACARA). Data will be shared on request to the corresponding author with the permission of ACARA. A synthetically generated data set based on the source data for the tutorial part of the paper is available at https://github.com/CarolineXGao/NAPLAN_impact.
Supplementary material I - Tutorial for Bayesian interrupted time series hierarchical meta-regression
Import library packages
Import data
The data as well as the analysis code used in this tutorial can be directly downloaded from Github repository: https://github.com/CarolineXGao/NAPLAN_impact.
The variables in the data set are:
Modelling
Prepare data for modelling
A number of variables need to be changed for Bayesian modeling with Stan. Categorical variables, including grade (Grade) and exposure group (Exposure_group), need to be re-coded as dummy variables. Binary variables should be 0 or 1 (Government vs non-Government). In order to intemperate intercept of the model here we also center the numeric variables at the mean value and year at the start of the cohort (2008). Finally, interaction effects need to be created prior to the modelling. Grade 7 and 9 were combined due to relatively smaller numbers. Also the estimated effect size were also very similar when included separately.
Stan model block
When using Rmarkdown file, stan code can be directly included as a block of code with specification of {stan output.var = “StanModel”} in the code block. In this model we use weakly informative priors, N(10, 5), for the SDs of the random school effects, random cohort effects as well as random error N(10, 5). 10 was chosen because when using two-level mixed-effects models with the mean score differences as the outcome variable, the estimated error terms are close to 10.
Run Stan model
Input data is needed to be saved in a list.
The next stage is to run the simulation using the Stan model defined as ‘StanModel’
Next we extract results
Full model diagnostics can be evaluated using an interactive shinystan package. Here we provide a few static diagnostic plots.
Markov chain Monte Carlo (MCMC) plot shows no signs of poor mixing for each coefficient. There was no warning of divergent transitions (using non-centred parameterisation centered parameterisation can help with avoiding divergent transitions), which can be diagnosed using diagnostic plots for the NUTS. All continuous variables are confounding variables (we are not interested in estimating effect sizes from these parameters), hence they were all standardised in the analysis to improve model fitting speed. If any variable of interest is a continuous variable, the original parameters can be easily recovered (see Stan manual) post standardisation.
The mcmc_pairs function is used to visualize the univariate histograms as well as bivariate scatter plots for key parameters. It is useful in identifying multicollinearity (strong correlation) and other non-identifiability issues (banana-like shapes).
There is a negative association between sampled coefficients of the intercept term (alpha) and school sector (government, beta[7]). This is possible as school sector (government vs non-government) is the most important predictor of school-level NAPLAN results. Hence the sampled intercept will be impacted by the sampled coefficient of the school sector.
Model summary
Mean and credible intervals
Plot marginal effects
Here we obtain predicted margins using the posterior distribution of coefficients using the following steps:
Obtain design matrix with confounding variables fixed at reference values
Calculate posterior distribution of NAPLAN score difference by year and exposure zone
Plot posterior mean (with error bar) of NAPLAN score difference for each year and exposure zone
Supplementary material II
Acknowledgements
We wish to thank the Latrobe Valley and Gippsland communities for their support and participation in the Hazelwood Health Study. We also like to acknowledge Prof Rob Hyndman for his generous sharing of the Rmarkdown LaTex template (https://github.com/robjhyndman/MonashEBSTemplates) for writing this paper in the Rmarkdown environment.
Footnotes
With more detailed information provided