Abstract
Background In drought-prone regions, timely and granular predictions of the burden of acute malnutrition could support decision-making. We explored whether routinely collected and/or publicly available data could be used to predict the prevalence of global and severe acute malnutrition, as well as the mean weight-for-height Z-score and middle-upper-arm circumference for age Z-score, in arid and semi-arid regions of Kenya, where drought is projected to increase in frequency and intensity.
Methods The study covered ten counties of northern Kenya and the period 2015-2019, during which a major drought occurred. To validate models, we sourced and curated so-called SMART anthropometric surveys covering one or more sub-counties for a total of 79 explicit survey strata and 44,218 individual child observations. We associated these surveys with predictors specified at the sub-county or county level, comprising climate, food security, observed malnutrition, epidemic disease incidence, health service utilisation and other social conditions. We explored both generalised linear or additive models and random forests and quantified their out-of-sample performance using cross-validation.
Results In most counties, survey-estimated nutritional indicators were worst during the October 2016-December 2018 drought period; the drought also saw peaks in insecurity and steep vaccination declines. Candidate models had moderate performance, with random forests slightly outperforming generalised linear models. The most promising performance was observed for global acute malnutrition prevalence.
Discussion The study did not identify a model that could predict malnutrition burden with high accuracy. However, analyses relying on larger datasets, with a wider range of predictors and encompassing multiple drought periods, may yield sufficient performance. Such analyses are warranted given the potential utility and efficiency of predictive models in lieu of assumptions or expensive and untimely ground data collection.
Background
Acute malnutrition or wasting is a leading underlying factor behind childhood mortality and untoward outcomes of pregnancy (1). Over the past few decades, extremely high prevalences of acute malnutrition have been observed in crisis-affected populations, particularly during periods of acute food insecurity (2–5). Drought conditions pose a major threat to food security within arid and semi-arid regions of the Horn of Africa, including northern Kenya, and are projected to occur with greater frequency and intensity in this region due to climate change (6).
Knowing the population burden (prevalence) of acute malnutrition serves multiple purposes, including the selection of appropriately scaled food security, nutritional and health interventions (e.g. cash transfers; blanket feeding distributions; management of severe and/or moderate malnutrition), resource mobilisation, logistical planning for an expected incident level of cases and monitoring of an ongoing response (7). In drought-affected settings, the mainstay of measuring acute malnutrition prevalence is so-called Standardised Monitoring and Assessment of Relief and Transitions (SMART) surveys, highly standardised data collection exercises usually conducted at the administrative level 2 scale or below and targeting children aged 6 to 59 months old (mo), the main at-risk group (7).
Over the past decade, SMART surveys have benefited from increased technical and software support (8), with evidence of improved quality (9). However, they remain somewhat expensive (indicatively, about 10,000 to 40,000 USD depending on transport and other factors; (10)) and cannot feasibly be conducted on an ongoing basis across all potentially affected areas. While sentinel-based approaches have also been attempted (11,12) and reductions to the sample size of surveys have been shown to be possible (13), no method is currently available that can efficiently provide estimates with both timeliness and geographic granularity.
Across different areas of public health, statistical approaches underpinned by analysis of multiple existing or routinely collected data sources have been used to provide estimates based solely on secondary, desk-based analysis (14–17). Such an approach could offer a complementary (and much cheaper) method to surveys and support governments and humanitarian actors in detecting worsening conditions and responding to them efficiently and on time. We previously showed that predictive small-area estimation models did not accurately predict acute malnutrition prevalence in South Sudan and Somalia (18). Here, we report a similar study focussed on the arid and semi-arid counties of Kenya and relying on a somewhat different range of data.
Methods
Study population and period
We included in the analysis the counties of Baringo, Garissa, Isiolo, Mandera, Marsabit, Samburu, Tana River, Turkana, Wajir and West Pokot, and all sub-counties within these (Figure 1; in Kenya, sub-counties have limited administrative function and have seen considerable boundary changes: we relied on a list and boundaries provided by the United Nations in 2019 (19)). The period of analysis was from January 2015 to December 2019 inclusive, which included a period of acute drought across the Horn of Africa unfolding between October 2016 and the end of 2018. The region of analysis had an estimated population of 8.17 million (see Data sources and management) at the midpoint of the period.
Figure 1. Map of the sub-counties and counties included in the study. The extent of shading indicates the number of survey observations available for this analysis. Green squares indicate the location of sentinel markets.
Data sources and management
Anthropometric data
We obtained all available raw datasets and reports of SMART anthropometric surveys carried out within the study area and timeframe. All surveys relied on a multi-stage cluster sampling design, and some had explicit strata corresponding to either single sub-counties or groupings of these. Datasets had various formats; we sought within each the following minimal variable set: cluster number, child ID, age in months or days, sex, weight to the nearest 0.1 kg, height in cm, presence of bilateral oedema and middle-upper arm circumference (MUAC) in cm or mm. To the extent possible, we identified the sub-county location of individual survey clusters using household coordinates where these were included in the dataset, or the names of localities in which each cluster fell if these were contained in the report or the data. We computed anthropometric indices of interest (weight-for-height Z-score, WHZ; MUAC-for-age Z-score, MUACZ) based on WHO 2006 growth chart standards using the R anthro package (20). We then applied the following exclusion criteria to each child observation: (i) minimal set of variables (see above) incomplete; (ii) age outside the 6 to 59mo range; (iii) sub-county location of cluster unknown; (iv) WHZ and/or MUACZ below −5SD or above +5SD, indicating implausible values as per WHO recommendations (21). We defined global (GAM) and severe (SAM) acute malnutrition as WHZ < −2SD and/or bilateral oedema, and WHZ < −3SD and/or bilateral oedema, respectively.
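The exclusion criteria and case definitions above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the analysis itself was done in R), and the `Child` record, its field names and the reading of "6 to 59mo" as completed months are assumptions rather than the authors' code. Z-scores are taken as already computed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Child:
    # Hypothetical record layout; only the fields needed for this sketch.
    age_months: Optional[float]
    whz: Optional[float]         # weight-for-height Z-score
    muacz: Optional[float]       # MUAC-for-age Z-score
    oedema: Optional[bool]       # bilateral oedema present
    sub_county: Optional[str]    # sub-county of the survey cluster


def is_eligible(c: Child) -> bool:
    """Apply exclusion criteria (i) to (iv) to one child observation."""
    # (i) minimal variable set incomplete
    if c.age_months is None or c.whz is None or c.muacz is None or c.oedema is None:
        return False
    # (ii) age outside 6 to 59 months (assumed: completed months, so < 60)
    if not (6 <= c.age_months < 60):
        return False
    # (iii) sub-county location of the cluster unknown
    if not c.sub_county:
        return False
    # (iv) implausible Z-scores: below -5 SD or above +5 SD
    if abs(c.whz) > 5 or abs(c.muacz) > 5:
        return False
    return True


def classify(c: Child) -> Dict[str, bool]:
    """GAM: WHZ < -2 and/or oedema. SAM: WHZ < -3 and/or oedema."""
    return {
        "gam": bool(c.whz < -2 or c.oedema),
        "sam": bool(c.whz < -3 or c.oedema),
    }


example = Child(age_months=24, whz=-2.5, muacz=-1.0, oedema=False, sub_county="Turkana North")
```

Because oedema alone qualifies a child as SAM, every SAM case is also a GAM case under these definitions.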
Predictors of malnutrition
We wished to explore statistical models that could accurately predict anthropometry, and which government and humanitarian actors could apply routinely without needing to collect ad hoc ground data. We therefore searched for public or non-public and routinely collected/compiled data that would directly or by proxy represent any of a range of factors that would plausibly causally affect or at least correlate with acute malnutrition. Datasets also needed to feature sufficient geographic granularity (at least by county, preferably by sub-county) and consistently cover the geography and time period analysed. Guided by two previously published causal frameworks (18,22) of acute malnutrition (see Figures S1-S2, S1 Appendix), we worked with the Ministry of Health of Kenya and UN agencies to identify suitable datasets, while also searching for public sources online.
Table 1 lists predictors that, after lengthy searches and exploratory data cleaning, met all the above criteria and were used to develop candidate models: the table also provides specific data management steps for each. For all time-varying predictors, we computed 3- and 6-monthly right-aligned rolling means, which also resolved minor missingness instances. We categorised heavily skewed predictors into quintiles, with a separate zero category if zero values were frequent and epidemiologically meaningful, e.g. in the case of cholera incidence. Summary statistics for each predictor variable are shown in Table S3, S1 Appendix.
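As an illustration of the two transformations just described, the sketch below computes a right-aligned rolling mean and a quintile categorisation with an optional zero category. It is a Python sketch under stated assumptions, not the authors' R code: the function names are hypothetical, missing months are assumed to be skipped within each window, and quintile cut points are assumed to be computed from non-zero values only.

```python
from statistics import mean, quantiles
from typing import List, Optional


def rolling_mean_right(values: List[Optional[float]], window: int) -> List[Optional[float]]:
    """Right-aligned rolling mean: month i averages months i-window+1 to i.

    Missing months (None) are skipped within a window (an assumption),
    which is one way a rolling mean can absorb minor missingness.
    """
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        observed = [v for v in values[i - window + 1 : i + 1] if v is not None]
        out.append(mean(observed) if observed else None)
    return out


def quintile_category(values: List[float], zero_category: bool = True) -> List[int]:
    """Return 0 for zeros (if requested) and 1 to 5 for quintiles of the rest."""
    non_zero = [v for v in values if not (zero_category and v == 0)]
    cuts = quantiles(non_zero, n=5)  # four cut points
    return [
        0 if (zero_category and v == 0) else 1 + sum(v > c for c in cuts)
        for v in values
    ]


# Hypothetical monthly series with one missing month.
monthly = [1.0, 2.0, None, 4.0, 5.0, 6.0]
three_month = rolling_mean_right(monthly, 3)
```

Right alignment means each value summarises only the current and preceding months, so a predictor never borrows information from after the survey date.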
Population denominators
So as to make the value of all predictors comparable across time and sub-county, we divided acute malnutrition admissions, cases of epidemic disease, vaccine doses administered and incidence of insecurity events by population. As sub-counties are not presented as a unit in existing census projections, we used annual WorldPop estimates (29,30), constrained to match census-based demographic projections but available as pixels of approximately 100 m × 100 m (https://data.worldpop.org/GIS/Population/), which we aggregated to sub-county level and interpolated to provide a monthly value.
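The denominator step can be sketched as follows. This is an illustrative Python sketch only: linear interpolation, anchoring each annual estimate at July, holding values flat outside the estimated range and expressing rates per 100,000 are all assumptions, since the text does not specify them (predictor units are given in Table S3, S1 Appendix).

```python
from typing import Dict


def monthly_population(annual: Dict[int, float], year: int, month: int) -> float:
    """Linearly interpolate annual sub-county estimates to a given month.

    Assumptions: each annual estimate refers to July of that year, and
    months outside the estimated range take the nearest estimate.
    """
    t = year * 12 + (month - 1)  # month index since year 0
    anchors = sorted((y * 12 + 6, pop) for y, pop in annual.items())
    if t <= anchors[0][0]:
        return anchors[0][1]
    if t >= anchors[-1][0]:
        return anchors[-1][1]
    for (t0, p0), (t1, p1) in zip(anchors, anchors[1:]):
        if t0 <= t <= t1:
            return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
    raise ValueError("month not covered by annual estimates")


def rate_per_100k(count: float, population: float) -> float:
    """Express a monthly count (e.g. admissions) per 100,000 population."""
    return count / population * 100_000


# Hypothetical sub-county with two annual estimates.
annual_estimates = {2016: 100_000.0, 2017: 112_000.0}
pop_jan_2017 = monthly_population(annual_estimates, 2017, 1)
admission_rate = rate_per_100k(53, pop_jan_2017)
```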
Statistical analysis
To adequately propagate survey non-systematic error, we fitted models to individual child observations, with the concurrent value of predictors specified either at the sub-county (if possible) or county level. We developed models for four outcomes: GAM, SAM (binary, assumed to be binomially distributed), WHZ and MUACZ (continuous, assumed to be Gaussian-distributed). We explored two classes of models: (i) generalised linear or additive models featuring combinations of predictors, with or without additive smoothing terms and/or a random effect for county to capture additional variability, using the R mgcv package (31); and (ii) random forest classification using the ranger package (32), with 1000 trees and a maximum of three candidate variables for splitting at each tree node.
We first screened for promising predictor variables and rolling-mean periods of these by fitting univariate generalised linear or additive models and monitoring the Akaike Information Criterion (AIC). To select the most performant generalised linear/additive model, we entered variables into models based on their lowest (best) to highest AIC rank, with or without a smoothing term for continuous time-varying variables and avoiding pairs of highly correlated variables (Figure S4, S1 Appendix). As models grew in complexity, we tracked their performance for out-of-sample prediction (the key metric by which to evaluate their likely utility) using leave-one-out cross-validation (LOOCV), with folds specified at the individual survey explicit stratum level. We also tested mixed models with county as the random effect, but these tended to predict less well out-of-sample than fixed-effects models.
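The cross-validation scheme, in which each fold is one explicit survey stratum, can be sketched as below. This is a hedged Python illustration rather than the authors' R implementation: the row layout, the function names and the one-predictor least-squares line used as a stand-in model are all assumptions, and the stand-in is not one of the generalised linear/additive or random forest models evaluated in the study.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]  # hypothetical: keys "stratum", "x" (predictor), "y" (outcome)


def fit_line(train: List[Row]) -> Tuple[float, float]:
    """Stand-in model: ordinary least squares of y on a single predictor x."""
    xs = [r["x"] for r in train]
    ys = [r["y"] for r in train]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - slope * mx, slope


def predict_line(model: Tuple[float, float], row: Row) -> float:
    intercept, slope = model
    return intercept + slope * row["x"]


def leave_one_stratum_out(
    rows: List[Row],
    fit: Callable[[List[Row]], Tuple[float, float]],
    predict: Callable[[Tuple[float, float], Row], float],
) -> Dict[float, Tuple[float, float]]:
    """For each stratum, train on all other strata and predict the held-out one.

    Returns {stratum: (observed mean, predicted mean)} over child observations.
    """
    results = {}
    for held_out in sorted({r["stratum"] for r in rows}):
        train = [r for r in rows if r["stratum"] != held_out]
        test = [r for r in rows if r["stratum"] == held_out]
        model = fit(train)
        results[held_out] = (
            mean(r["y"] for r in test),
            mean(predict(model, r) for r in test),
        )
    return results


# Toy data: three strata, outcome exactly 2x + 1.
toy = [{"stratum": s, "x": float(s), "y": 2.0 * s + 1.0} for s in (0, 1, 2) for _ in range(2)]
cv = leave_one_stratum_out(toy, fit_line, predict_line)
```

Holding out whole strata, rather than individual children, keeps every child from the same survey out of the training data when that survey is predicted, which is closer to how such a model would be used for an area with no recent survey.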
As metrics of performance, we computed models’ absolute bias relative to predictions, mean absolute error, probability of predicting the outcome within given precision bounds, and sensitivity against alternative crisis severity thresholds (i.e. the proportion of observations above the threshold that were also classified as above the threshold by the model). All analysis was done in R Statistical Software (33). Data and analysis code are available on https://github.com/francescochecchi/ken_malnut_prediction/.
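A minimal Python sketch of these metrics, computed over held-out stratum estimates, is given below. The function name and the exact definitions are assumptions made for illustration: bias is taken here as the mean of predicted minus observed, the precision bound is applied as an absolute difference, and "above the threshold" is read as greater than or equal to it. The authors' definitions in their R code may differ.

```python
from statistics import mean
from typing import Dict, List, Optional


def performance(
    observed: List[float],
    predicted: List[float],
    bound: float,
    threshold: float,
) -> Dict[str, Optional[float]]:
    """Summarise out-of-sample predictions against observed stratum estimates."""
    errors = [p - o for o, p in zip(observed, predicted)]
    # Predictions for strata whose observed value is at or above the threshold.
    flagged = [p for o, p in zip(observed, predicted) if o >= threshold]
    return {
        "bias": mean(errors),
        "mean_absolute_error": mean(abs(e) for e in errors),
        "share_within_bound": sum(abs(e) <= bound for e in errors) / len(errors),
        # Sensitivity is undefined if no observation reaches the threshold.
        "sensitivity": (
            sum(p >= threshold for p in flagged) / len(flagged) if flagged else None
        ),
    }


# Hypothetical GAM prevalences for four held-out strata.
obs = [0.10, 0.16, 0.22, 0.25]
pred = [0.12, 0.14, 0.21, 0.18]
metrics = performance(obs, pred, bound=0.03, threshold=0.20)
```

In this toy example two strata are observed at or above 20% and the model places only one of them there, so sensitivity at that threshold is 0.5 even though the mean absolute error is small.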
Ethics
Secondary analysis of the data was approved by the Amref Ethics and Scientific Review Committee, the competent committee in Kenya (ref. ESRC P723/2019) and the London School of Hygiene and Tropical Medicine Ethics Committee (ref. 15334, amendment 3).
Results
Anthropometric survey patterns
Altogether, the analysis relied on data from 79 explicit survey strata, comprising 44,536 child observations, of which 44,218 (99.3%) were eligible for analysis after exclusion criteria were applied, or a mean of 560 observations per stratum. Individual survey-stratum attrition and estimates are reported in the S1 Appendix (Table S1 and Table S2, respectively). Survey coverage was uneven across the region studied, with sub-counties in Turkana accounting for most observations and Garissa the fewest (Figure 1).
Survey-stratum point estimates are shown in Figure 2 for SAM and GAM and Figure 3 for WHZ and MUACZ, with lines connecting sequential stratum estimates. Generally, the highest SAM/GAM prevalences (lowest mean WHZ/MUACZ) were observed during 2017, but there were considerable differences among strata within the same county (Marsabit, Turkana) and in some counties (Samburu, Wajir) there was comparatively less variation.
Figure 2. Point estimates of severe and global acute malnutrition prevalence, by county, stratum and survey date. The pink-shaded band indicates the drought period (October 2016 to December 2018).
Figure 3. Point estimates of weight-for-height Z-score and MUAC-for-age Z-score, by county, stratum and survey date. The pink-shaded band indicates the drought period (October 2016 to December 2018).
Predictor patterns
County-level trends in each predictor are shown in Figure 4. The onset of drought appeared more visible when tracking NDVI and standardised NDVI than rainfall abnormalities. Cereal prices increased in all sentinel markets (see S1 Appendix, Figure S3) and continued increasing in Garissa and Mandera. During the drought period, MAM and SAM admissions increased markedly in Turkana and Marsabit, with a more delayed peak in Isiolo. The drought period was also generally concurrent with peaks in insecurity events and deaths, and with a steep fall in vaccination output. The period also saw epidemics of cholera and measles, but these occurred mostly outside the drought period, with the largest peak in Tana River.
Figure 4. Trends in each predictor variable, by county. The pink-shaded band indicates the drought period (October 2016 to December 2018). DPT = pentavalent vaccine. MAM = moderate acute malnutrition. MMR = measles, mumps, rubella vaccine. (S)NDVI = (Standardised) Normalised Difference Vegetation Index. SAM = severe acute malnutrition. Cov. = coverage. Insec. = insecurity. See S1 Appendix, Table S3 for predictor units.
Predictive model performance
Generalised linear or additive models
The goodness-of-fit (AIC) of each predictor variable on univariate analysis is summarised in Figure S5, S1 Appendix. By this metric, (S)NDVI and recent incidence of SAM or MAM treatment admissions had the strongest associations with each outcome. Figure 5 shows predictions versus observations on cross-validation for the highest-performance multivariate models identified for each outcome.
Figure 5. Generalised linear or additive model predictions versus observations for each fold in leave-one-out cross-validation, by outcome. Within each graph, dots represent individual survey strata that were left out of the training sample and on which the model was validated. The size of each dot is proportional to the number of child observations, and the colour maps to the county. The diagonal line denotes perfect fit, while shaded areas above and below it show alternative error thresholds. Finally, horizontal and vertical lines denote the threshold values of the outcome for which the model’s sensitivity was computed.
Generalised linear models were retained for GAM and SAM prevalence, but additive versions of these were marginally superior for mean WHZ and MUACZ (Table 2). All models had negligible bias, but only a minority of predictions fell within the stricter of the two error bounds suggested for each outcome. Models had reasonably high sensitivity for predicting instances of high GAM and SAM prevalence at thresholds of 15% and 2% respectively, but low sensitivity when these thresholds were raised to 20% and 5%.
Random forest models
Prediction-versus-observations graphs for random forest models of each outcome are shown in Figure 6. Random forest performance was marginally better than that of generalised linear/additive models for all outcomes except MUACZ (Table 3), though the comparison of sensitivity between the two model classes is based on few observations, at least for the more severe of the two thresholds evaluated.
Figure 6. Random forest predictions versus observations for each fold in leave-one-out cross-validation, by outcome. Within each graph, dots represent individual survey strata that were left out of the training sample and on which the model was validated. The size of each dot is proportional to the number of child observations, and the colour maps to the county. The diagonal line denotes perfect fit, while shaded areas above and below it show alternative error thresholds. Finally, horizontal and vertical lines denote the threshold values of the outcome for which the model’s sensitivity was computed.
Discussion
Key findings
While nutritional surveillance in Kenya has become more systematic in recent years (34), a dearth of ground data continues to hamper the timeliness and geographic targeting of nutritional and other public health responses to droughts. Our study explored a predictive modelling approach to complement ground surveillance. None of the models evaluated, regardless of outcome (nutritional indicator), offered compellingly high performance. While sensitivity was reasonably high for the lower of two thresholds against which it was benchmarked, this threshold included a majority of observed point estimates, i.e. may not be of great utility in the Kenyan context to detect a marked deterioration from baseline. Only the random forest model for GAM prevalence offered reasonably high sensitivity for the higher threshold of ≥20% prevalence, which, at least within this pool of survey data, would indeed identify unusually elevated malnutrition.
Findings in context
While largely disappointing, the performance of models in Kenya, albeit underpinned by different predictor variables, is higher than in Somalia and South Sudan based on a previous study (18). Other attempts at predicting childhood malnutrition at subregional resolution have had mixed success. A model of stunting based on demographic and health survey data and remotely sensed predictors performed reasonably well in Bangladesh but not Ghana (35). A landmark small-area estimation study (36) of malnutrition across Africa, relying on large-scale countrywide surveys and a wide range of static and time-varying predictors, achieved high performance for stunting and reasonable performance for wasting, although the predictors and data included in this study may not offer the kind of temporal resolution required to predict deteriorations in nutritional status within the relatively short timeframe of a developing drought. In Bangladesh, random forest models outperformed other approaches for predicting maternal undernutrition, but these are based on individual-level predictors collected during demographic and health surveys, i.e. do not circumvent the need for data collection and may be more suitable for individual screening and case identification (37).
Limitations
This analysis is mainly limited by the quantity and quality of available data that could be used to develop and evaluate models. While SMART data had good quality, the time period investigated encompassed a single drought period, and in some counties even the drought was not correlated with dramatic changes in nutritional indicators, leaving relatively limited statistical variability with which to discriminate between alternative models. The analysis also relied on a limited range of predictors, reflecting datasets found to have consistent availability and geographic specification over the region and period analysed. It is plausible that additional data on likely correlates of malnutrition, including endemic morbidity, and more specific and granular data on food insecurity, e.g. terms of trade (purchasing power) or adoption of household coping mechanisms, would have improved predictive performance. Generally, any error in both anthropometric data (e.g. due to inadequate measurement practices) and predictors would have affected model performance, most probably by ‘diluting’ regression coefficients and thus under-estimating the true predictive contribution of each predictor.
Apart from data available, the study did not investigate the performance of alternative models that may be more suitable for prediction with many co-variates (e.g. lasso regression) or machine learning approaches such as neural networks, though the latter, like random forests, would likely encounter challenges predicting out of sample, and may thus require careful fine-tuning. The relationships between predictors and the outcomes investigated may themselves not be constant in time, and it is therefore possible that a given model may perform well over the period from which data to evaluate it were sourced, but less so later. Similarly, our findings should only be considered applicable to the Kenyan context, as predictor-outcome correlations may be differently modulated by other variables in other settings.
Conclusions
We identified generalised linear and, perhaps more promisingly, machine learning (random forest) models that yielded moderate predictive performance for different indicators of acute childhood malnutrition in Kenya. The models for GAM in particular could potentially be used in lieu of educated guesses to support real-time situational awareness of and decision-making on deteriorating nutrition.
It is plausible that these models could be improved if trained on a longer time series of data (critically, these should contain several peaks and troughs in ground-measured acute malnutrition burden, i.e. more variability in the outcomes studied, as would be the case if data were updated until the present time, thereby also capturing the 2022-2023 drought); and if a wider range of consistently available predictor data were harnessed, including more indicators of morbidity and health service performance, as well as measures of food insecurity other than market prices. A predictive modelling approach remains attractive in contexts, such as northern Kenya, where drought is a known threat and the costs and feasibility of ongoing primary data collection across the affected region are a serious constraint to surveillance.
S1 Appendix. Supplementary tables and figures
Financial disclosure statement
The study was funded by UNICEF’s East and Southern Africa Regional Office (ref. ESARO/SSFA/2019-001) and by the Innovative Methods and Metrics for Agriculture and Nutrition Actions (IMMANA) programme, itself funded by the Bill and Melinda Gates Foundation and the United Kingdom Foreign Commonwealth and Development Office.
Acknowledgements
We are grateful to UNICEF and the Ministry of Health of Kenya for sharing non-public datasets used for this analysis. Séverine Frison and Claire Dooley contributed to early versions of this work.