The gut microbiome and early-life growth in a population with high prevalence of stunting
=========================================================================================

* Ruairi C. Robertson
* Thaddeus J. Edens
* Lynnea Carr
* Kuda Mutasa
* Ceri Evans
* Ethan K. Gough
* Hyun Min Geum
* Iman Baharmand
* Sandeep K. Gill
* Robert Ntozini
* Laura E Smith
* Bernard Chasekwa
* Florence D. Majo
* Naume V. Tavengwa
* Batsirai Mutasa
* Freddy Francis
* Joice Tome
* Rebecca J. Stoltzfus
* Jean H. Humphrey
* Andrew J. Prendergast
* Amee R. Manges
* the SHINE Trial Team

## ABSTRACT

Stunting affects one-in-five children globally and is associated with greater infectious morbidity, mortality and neurodevelopmental deficits. Recent evidence suggests that the early-life gut microbiome affects child growth through immune, metabolic and endocrine pathways, and microbiome perturbations may contribute to undernutrition. We examined early-life fecal microbiome composition and function in 875 stool samples collected longitudinally in 335 children from 1-18 months of age in rural Zimbabwe, from a cluster-randomized trial of improved water, sanitation, and hygiene (WASH), and improved infant and young child feeding (IYCF). Using whole metagenome shotgun sequencing, we examined the effect of the interventions, in addition to environmental or host factors including maternal HIV infection, on the succession of the early-life gut microbiome, and employed extreme gradient boosting machines (XGBoost) to model microbiome maturation and to predict child growth. WASH and IYCF interventions had little impact on the fecal microbiome, however children who were HIV-exposed but uninfected exhibited over-diversification and over-maturity of the early-life gut microbiome in addition to reduced abundance of *Bifidobacteria* species.
Taxonomic microbiome features were poorly predictive of linear and ponderal growth, however functional metagenomic features, particularly B-vitamin and nucleotide biosynthesis pathways, moderately predicted both attained linear and ponderal growth and growth velocity. We find that the succession of the gut microbiome in a population at risk of stunting is unresponsive to WASH and IYCF interventions, but is strongly associated with maternal HIV infection, which may contribute to deficits in growth. New approaches targeting the gut microbiome in early childhood may complement efforts to combat child undernutrition.

**One sentence summary** The gut microbiome of rural Zimbabwean infants undergoes programmed maturation that is unresponsive to sanitation and nutrition interventions but is comprehensively modified by maternal HIV infection and can moderately predict linear growth.

## INTRODUCTION

Stunting, or linear growth failure, is a form of chronic undernutrition that affects 22% of children under 5 years of age worldwide (1, 2). Stunting is associated with infectious morbidity, reduced childhood survival and impaired cognitive development (3). The lifelong impacts of poor growth contribute to an intergenerational cycle of stunting and impaired development, lower educational attainment, and reduced adult economic productivity (4). Nutritional interventions, however, only reduce stunting by approximately 12% (5), suggesting that other pathophysiological mechanisms contribute to chronic undernutrition, which may inform new therapeutic strategies.

The determinants of stunting and other forms of child undernutrition are complex and include a myriad of biological, environmental and social factors including breastfeeding and complementary feeding practices, household water, sanitation and hygiene (WASH) practices, birthweight, maternal HIV status, maternal anthropometry and maternal education.
However, growing evidence suggests that a subclinical disorder of the small intestine, termed environmental enteric dysfunction (EED), may play a role in impaired child growth (6). EED is characterized by blunted intestinal villi, increased gut permeability, and microbial translocation into the circulatory system resulting in both local and chronic inflammation and nutrient malabsorption (7, 8). It is hypothesized that high enteric pathogen carriage, as seen in poor-hygiene, low-resource settings, contributes to the pathophysiology of EED (9–11); however, interventions to improve WASH and reduce the pathogen burden in children have failed to demonstrate improvements in linear growth (12). Additionally, both enteric pathogen load and common biomarkers of EED are not consistently associated with linear growth in different geographical cohorts (13–16), suggesting that the pathway linking microbial exposures, impaired gut function and early-life growth remains to be fully elucidated. In addition to research investigating the influence of diarrheal pathogens on child undernutrition and EED, emerging evidence supports the role of the commensal gut microorganisms in mediating child growth. Healthy-growing children exhibit a patterned ecological assembly of the gut microbiome through the first 2 years of life, which is defined by delivery mode, breastfeeding and complementary feeding practices (17). This microbial succession impacts a number of metabolic, immune and endocrine pathways in early life that contribute to early-life growth and development (18). Disturbances to this normal microbiome maturation therefore may impair these critical growth and developmental pathways. Immaturity of the early-life gut microbiome is associated with severe acute malnutrition (19), whilst reduced microbiome diversity is associated with higher risk of future diarrheal episodes (20). 
Indeed, a ‘malnourished’ early-life gut microbiome can recapitulate phenotypes of faltering growth and EED when transplanted into germ-free mice and pigs (21, 22). Furthermore, nutritional interventions designed to specifically target the impaired gut microbiome in acute malnutrition in both animal studies and small-scale human trials have recently demonstrated a positive effect on ponderal growth (23, 24), but not on linear growth.

Microbiome differences that may contribute to stunting are likely influenced by a number of environmental factors including household WASH, infant feeding practices and maternal HIV infection. To date, little research has investigated the effect of improved WASH or infant feeding interventions on the assembly of the infant gut microbiome in low-resource settings. However, recent data show that children who are HIV-exposed but uninfected (CHEU) consume breast-milk with an altered oligosaccharide composition from their mothers (25), and may be exposed to abnormal microbiome profiles from their mothers, which have been reported in people living with HIV (26, 27). CHEU also receive prophylactic antibiotics to prevent infectious morbidity associated with HIV exposure. Each of these exposures may influence the seeding and succession of the gut microbiome in CHEU (28, 29), which may contribute to the high prevalence of stunting observed in CHEU (30). Evidence of the effect of other early-life environmental exposures on the assembly of the infant gut microbiome in low-resource settings is scarce but may provide insights into the influence of microbial and microbiota-modifying exposures on child growth in the context of undernutrition.
Previous cross-sectional data from sub-Saharan Africa hypothesized that decompartmentalization of the gastrointestinal tract occurs in stunted children, as demonstrated by the overgrowth of oropharyngeal bacterial taxa in the intestine (31), whilst a handful of other cross-sectional studies report variations in gut microbiota composition in stunted children that are inconsistent across geographical settings (32–34). We previously reported that the maternal gut microbiome can predict birthweight and neonatal growth in rural Zimbabwe (27). However, there are few studies mapping the compositional and functional maturation of the gut microbiome throughout early childhood, accounting for feeding, WASH, maternal HIV infection and other environmental exposures, in populations from low-resource settings and at high risk of stunting.

Here, we characterize the succession and maturation of the fecal microbiome from 1 to 18 months of age in 335 children from rural Zimbabwe who were enrolled in the Sanitation, Hygiene, Infant Nutrition Efficacy (SHINE) Trial (35). We explore gut microbiome maturation and examine the influence of randomized WASH and nutrition interventions and maternal HIV infection. Using compositional and functional metagenomic data as well as extensive epidemiological information, we use machine learning to test the ability of the early-life gut microbiome to predict both attained linear and ponderal growth and growth velocity through the first 18 months of life.

## RESULTS

### Sub-study population characteristics

The fecal microbiota was characterized in 875 fecal samples from 335 children from 1-18 months of age (Fig. S1). A mean (SD) of 2.6 (1.3) samples were analysed per child.
The children in the microbiome sub-study largely resembled the population of all live-born infants in the overall SHINE trial cohort (Table S1); however, the microbiome sub-study included a larger proportion of children who were born to women living with HIV (29.6%) compared to the whole SHINE cohort (15.6%), due to the deliberate over-sampling of HIV-positive mothers and their infants. In addition, the microbiome sub-study included infants with slightly older mothers and longer gestational ages. The majority of infants were born by vaginal delivery (94.5%) in an institution (89.9%) and were exclusively breastfed (91% at 3 months). Prevalence of stunting (LAZ <-2) varied from 18-34% across study time-points.

### Metagenome sequencing performance

Overall, 875 unique infant-visit whole metagenome sequencing datasets were used. On average, 12 million ± 4.2 million quality-filtered read pairs were generated per sample. Sixteen negative controls produced a mean of 655 quality-filtered reads (range = 149 to 1,425; SD = 456). The median percent of human reads detected was 0.05% but ranged widely by age group and decreased over time (Fig. S2a). The median percent un-annotatable reads detected in each sample was 58.6% and increased over time (Fig. S2b). Thirty-six samples were subject to repeated extraction and metagenome sequencing to assess technical variation. These samples originated from 4 unique children, each with 3 visit samples, where each visit sample was extracted and sequenced in 3 replicates. Principal coordinates analysis (PCoA) of Bray-Curtis distances and phylum-level relative abundances revealed little variation between replicates (Fig. S2c-d).

### Succession of gut microbiome composition in early childhood

After prevalence and relative abundance threshold filtering, 161 annotated bacterial species were identified. Seven Eukaryotic and 4 Archaeal species were detected in a small proportion of samples, but these did not meet the prevalence thresholds (Table S2).
*Bifidobacterium longum* was the predominant species at all time-points up to 12 months of age. Four other *Bifidobacteria* species (*B. breve, B. bifidum, B. pseudocatenulatum, and B. kashiwanohense*), *Escherichia coli, Bacteroides fragilis* and *Veillonella* species were consistently amongst the most abundant species at the earlier time points before being outnumbered by *Faecalibacterium prausnitzii* and *Prevotella copri* at 12 and 18 months of age. Taxonomic α-diversity metrics and gene richness tended to decline or remain stable over the first 4-6 months of life, during exclusive breast-feeding, but increased as expected with infant age from 6-18 months of age (Fig. S3a-b), with the introduction of complementary feeds. A large proportion of variation in both compositional (R2 = 0.198, P<0.001) and functional β-diversity (R2 = 0.378, P<0.001) was explained by age (Fig. 1a-b and S3 c-d).

![Fig. 1.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F1.medium.gif)

Fig. 1. Compositional and functional maturation of the gut microbiome of 335 infants from rural Zimbabwe from 1-18 months of age. PCoA of Bray-Curtis distances of species (a) and metagenomic pathways (b) coloured by age category. PERMANOVA model results are also plotted. The top 20 features and model pseudo-R2 from XGBoost models predicting age using species (c) or pathways (d) are ranked by scaled feature importance and relative abundance (0-1) plotted by age (e-f) to visualize taxonomic and functional microbiome succession from 1-18 months of age.

We employed extreme gradient boosting machines (XGBoost), a machine learning approach, to train and test a model of compositional and functional microbiome maturation, with child age as an outcome.
Children who were born to HIV-negative mothers, who were non-stunted at 18 months (LAZ > −2) and had at least 2 stool samples collected were used as a ‘training set’, which was then used to predict child age in the remaining samples (test set). Using species composition, the microbiome was highly predictive of child age (Model pseudo-R2 = 0.77, MAE = 1.4 months). This ‘microbiota age’ score was also strongly correlated with biological age in the subset of children from the test set who were also non-stunted and born to HIV-negative mothers (pseudo-R2 = 0.67). The species most strongly predictive of age included *Faecalibacterium prausnitzii, Blautia wexlerae, Prevotella copri, Staphylococcus hominis, Dorea formicigenerans, Bifidobacterium longum, Agathobaculum butyriciproducens, Bifidobacterium bifidum, Bacteroides thetaiotaomicron, Streptococcus vestibularis* and *Veillonella parvula* (Fig. 1c). Metagenomic pathways also predicted age with high accuracy (Model pseudo-R2 = 0.68; MAE = 1.5; Fig. 1d). The pathways most strongly predictive of age included methanogenesis from acetate (METH-ACETATE-PWY), multiple nucleotide and amino acid metabolic pathways, including L-tryptophan biosynthesis (TRPSYN-PWY), purine ribonucleosides degradation (PWY0-1296), pyrimidine deoxyribonucleotides de novo biosynthesis I (PWY-7184), L-histidine degradation I (HISDEG-PWY), dTDP-L-rhamnose biosynthesis I (DTDPRHAMSYN-PWY), flavin biosynthesis I (RIBOSYN2-PWY) and nitrate reduction I (DENITRIFICATION-PWY). This ‘metagenome age’ was also highly correlated with age in the subset of children from the test set who were also non-stunted and born to HIV-negative mothers (pseudo-R2 = 0.67). Using these models, we created a microbiota-for-age Z-score (MAZ) and metagenome-for-age Z-score (MetAZ), which accounted for variance of microbiota ages with respect to chronological ages at each study visit (see Methods). 
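The scores reported above (pseudo-R2, MAE and the age Z-scores) can be illustrated with a minimal Python sketch. It assumes that pseudo-R2 is computed as 1 - SSE/SST, under which a model that predicts worse than the mean of the observed values yields a negative score, and that MAZ is the deviation of a child's predicted microbiota age from the median predicted age of reference children at the same study visit, scaled by their standard deviation. The exact definitions are given in the Methods, which are not reproduced here, so the function names and formulas below are assumptions for illustration and not the authors' implementation.

```python
from statistics import median, stdev


def pseudo_r2(observed, predicted):
    """Assumed definition: 1 - SSE/SST. Negative if worse than predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def microbiota_for_age_z(predicted_age, reference_predicted_ages):
    """Assumed MAZ: (predicted age - reference median) / reference SD.

    reference_predicted_ages holds the predicted microbiota ages (months) of
    reference children sampled at the same study visit (hypothetical input).
    """
    ref_median = median(reference_predicted_ages)
    ref_sd = stdev(reference_predicted_ages)  # sample standard deviation
    return (predicted_age - ref_median) / ref_sd
```

Under these assumptions, a positive Z-score corresponds to a microbiome that looks older than expected for the visit (over-maturity) and a negative Z-score to one that looks younger (immaturity).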
The top 20 features contributing most strongly to age predictions are plotted in Fig. 1e-f.

### WASH and IYCF interventions have little influence on the infant gut microbiome

We have previously reported in the SHINE trial that the WASH intervention had no impact on infant growth, whilst the IYCF intervention increased LAZ scores by 0.16, leading to a 23% reduction in stunting by 18 months of age. We tested whether these randomized interventions impacted any metrics of gut microbiome diversity or maturity in each age group. By performing principal coordinate analysis (PCoA) on Bray-Curtis distances of taxonomic data and functional data we found no significant differences in β-diversity between IYCF and non-IYCF arms at any time-point. There was a significant difference in Bray-Curtis distances for microbiome composition between WASH and non-WASH arms at the 3-month time-point (PERMANOVA, P=0.013; R2= 0.011), but at no other time-points (Fig. 2a-d). No significant differences were observed in α-diversity metrics or gene richness between WASH versus non-WASH arms nor IYCF versus non-IYCF arms at any time-point (Fig. 2e-f). Using multivariable regression analyses (MaAsLin2), there were also no differences in the relative abundance of species or pathways between intervention arms apart from a small number of features at 3 months in the WASH arms (increased *Klebsiella pneumoniae,* reduced *Collinsella aerofaciens* and more abundant metagenomic pathways involved in biotin and folate synthesis) and at 18 months in the IYCF arms (reduced *Eubacterium siraeum*, *E. rectale* and *Agathobaculum butyriciproducens*; Table S3 and S4).

![Fig. 2.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F2.medium.gif)

Fig. 2. Impact of randomized WASH and IYCF interventions on infant gut microbiome.
PCoA of Bray-Curtis distances of species coloured by WASH vs non-WASH arms (a), including PERMANOVA model results, and IYCF vs non-IYCF arms (b) are plotted in addition to the first component (PC1) from PCoA of species (c) and pathways (d). The IYCF intervention was introduced after 6 months of age, therefore direct comparisons of IYCF vs non-IYCF arms are not shown in the 1-, 2- and 3-month age categories. No significant differences were observed in Shannon alpha diversity (e) and gene richness (f) according to trial arm.

### HIV exposure comprehensively alters infant gut microbiome composition and function

We previously reported in this cohort that maternal HIV exposure significantly impacts infant growth, whereby CHEU displayed significantly poorer linear growth compared with children who are HIV-unexposed (CHU) (30). We assessed diversity metrics and microbiome maturity in CHEU versus CHU and found that CHEU displayed significantly greater diversity (Shannon index; Wilcoxon rank-sum test P = 0.002; Fig. 3a) and species richness (P = 0.01; Fig. 3b) compared with CHU at 12 months of age, although the size of the two groups was imbalanced within this age category (n=91 CHEU, n=27 CHU). Other metrics of α-diversity were significantly lower in CHEU at 2 months of age (evenness P = 0.04, Simpson index P = 0.04). Metagenomic gene richness was also elevated in CHEU at 1, 3, 6 and 12 months (Wilcoxon rank-sum test P = 0.002, P = 0.059, P = 0.007, P < 0.001, respectively; Fig. 3c). Analysis of taxonomic β-diversity between CHEU and CHU also revealed significant differences at 1, 3, 6, 12 and 18 months suggesting that *in utero* HIV exposure was significantly associated with gut microbiome succession and development throughout the first 18 months of life (Fig. 3d-e). HIV exposure also explained a significant proportion of the variation in metagenome pathway β-diversity (PERMANOVA, P = 0.002, R2 = 0.033) at 1 month of age (Fig. 3h-i), but not at later ages.
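As a reminder of what the two families of diversity metrics used above measure, the following minimal Python sketch computes the Shannon index for a single sample (within-sample diversity) and the Bray-Curtis dissimilarity between two samples (between-sample diversity) from relative abundance vectors. These are the standard textbook formulas; the function names and toy abundance values are illustrative assumptions, and the sketch does not reproduce the authors' analysis pipeline.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i).

    Ranges from 0 (identical composition) to 1 (no shared taxa).
    """
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(a + b for a, b in zip(sample_a, sample_b))
    return numerator / denominator


# Toy relative abundances for three hypothetical species in two samples.
infant_1 = [0.7, 0.2, 0.1]
infant_2 = [0.1, 0.2, 0.7]
h_infant_1 = shannon_index(infant_1)
distance = bray_curtis(infant_1, infant_2)
```

A matrix of such pairwise Bray-Curtis values is the input to ordination methods such as PCoA and to PERMANOVA, whereas the Shannon index is compared between groups one sample at a time.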
We next tested the association between HIV exposure and gut microbiome maturity. We found that CHEU displayed greater microbiome age and MAZ, and hence microbiota over-maturity, compared with CHU at 12 months of age (median 14.7 vs 9.2 months; β = 3.18, P < 0.001), which tended to be higher in CHEU at 1 month and at 6 months also (P = 0.074 and P = 0.059, respectively; Fig. 3 f-g). However, at 18 months CHEU displayed lower microbiome age (15.3 vs 16.4 months; β = −1.1, P = 0.045). Similarly, CHEU displayed significantly greater metagenome ages and MetAZ scores compared with CHU at 1 month (β = 1.1, P = 0.039) and 12 months of age (β = 2.2, P = 0.04; Fig. 3j-k), suggesting that HIV exposure drives both compositional and functional microbiome over-maturity. The proportion of CHEU receiving prophylactic cotrimoxazole ranged from 56-83% across the 3, 6, 12 and 18-month study visits.

![Fig. 3.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F3.medium.gif)

Fig. 3. Maternal HIV infection comprehensively alters infant gut microbiome diversity and maturity. Shannon alpha diversity (a), species richness (b) and gene richness (c) show significant over-diversification in CHEU vs CHU (Wilcoxon rank-sum test; *p<0.05). PCoA of Bray-Curtis distances (d) and PC1 (e) of species composition in CHEU vs CHU show significant differences throughout 18 months of life (PERMANOVA). Microbiome age (f) and microbiome-for-age Z score (MAZ; g) show significant differences in gut microbiome maturity in CHEU vs CHU (linear regression analyses). PCoA of microbiome gene pathways shows differences in CHEU vs CHU at 1 month of age (h-i) in addition to differences in metagenomic maturity (j-k).
We next explored which species were differentially abundant between CHEU and CHU by performing multivariable regression analyses, adjusting for age at stool sample collection, exclusive breastfeeding status, delivery mode, and randomised trial arm. Between 1-3 months of age, two Bifidobacteria species, *B. longum* and *B. bifidum* (Fig. 4a-b), in addition to *Veillonella seminalis* were significantly less abundant in CHEU versus CHU (q < 0.1; Table S3). Conversely, *Flavonifractor plautii* was significantly more abundant in CHEU at 3 months (q = 0.02). At 18 months, *B. bifidum* was again significantly less abundant in CHEU (q = 0.04), whilst two other Bifidobacteria species were also weakly associated with CHEU, whereby *B. breve* was lower and *B. pseudocatenulatum* higher (q < 0.25).

Regression analyses of metagenomic pathways with CHEU status generated similar outcomes. Following adjustment for covariates, significant associations for metagenomic pathways were only present at 1 and 3 months of age. At 1 month of age, these included significant negative associations between CHEU and amino acid synthetic pathways (superpathway of L-threonine biosynthesis, superpathway of L-isoleucine biosynthesis I and L-lysine biosynthesis I; Fig. 4c-e) and positive associations with pathways involved in the degradation of sugar derivatives, including fructuronate, glucuronate and galacturonate (PWY 7242 D-fructuronate degradation, PWY 6507 4-deoxy-L-threo-hex-4-enopyranuronate degradation, GALACTUROCAT PWY D-galacturonate degradation I, GALACT GLUCUROCAT PWY superpathway of hexuronide and hexuronate degradation and GLUCUROCAT PWY superpathway of beta D-glucuronide and D-glucuronate degradation; Fig. 4f-h, Table S4). A handful of pathways involved in fatty acid oxidation and fermentation (fatty acid beta oxidation peroxisome and succinate fermentation to butanoate) were also significantly positively associated with CHEU at 3 months of age.

![Fig. 4.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F4.medium.gif)

Fig. 4. Maternal HIV infection is associated with reduced abundance of Bifidobacteria and amino acid biosynthesis genes. Relative abundance (0-1) of *Bifidobacterium longum* (a) and *B. bifidum* (b) in the gut microbiome of CHEU and CHU at each age category via multivariate regression analyses. Multivariate regression of gene pathways demonstrates reduced abundance of amino acid biosynthetic pathways (c-e) and increase in abundance of pathways involved in degradation of sugar derivatives (f-h). Multivariate regression was performed using MaAsLin2 with the default q < 0.25 cut-off. *q<0.1.

### Microbiome functionality, but not composition, predicts linear and ponderal growth

We next examined the relationship between taxonomic and functional features of the gut microbiome and attained growth (LAZ and WHZ) and growth velocity (WHZ and LAZ velocity, defined as the change in z-score per day between visits) from 1-18 months of age using XGBoost models. Models were run separately for each of the six age groups, stratified by maternal HIV status and run in two combinations of predictive features: (i) microbiome features alone (species or pathways); and (ii) microbiome features and epidemiological variables, which included maternal anthropometry, baseline WASH and infant diet variables, amongst others (Table S5). In models combining microbiome features with epidemiological features, birthweight, maternal height, maternal mid-upper arm circumference, and household wealth were all important predictors of attained infant LAZ and WHZ and growth velocity. Models including microbiome taxonomic features (species) alone performed poorly for both attained growth and growth velocity at every age category and regardless of HIV exposure status (Fig.
5a and 6a), with a majority of models resulting in pseudo-R2 values < 0. Model performance for linear growth improved when epidemiological features were included, suggesting that gut microbiota composition alone was poorly predictive of growth. Taxonomic features were weakly predictive of WHZ velocity at 1-3 months (pseudo-R2 0.09-0.12) and attained WHZ at 6 months (pseudo-R2 = 0.22), but only in children born to HIV-negative mothers (Fig. 6a). Conversely, models containing functional metagenomic pathways were moderately predictive of both attained and future growth throughout 18 months of age (pseudo-R2 = 0-0.55; Fig. 5a and 6a) albeit with relatively large mean absolute errors (MAE) for both linear (0.48-0.95 LAZ) and ponderal growth models (0.6-1.18 WHZ). The inclusion of epidemiological variables in the metagenomic models added little to performance suggesting that pathway features were independently predictive of both linear and ponderal growth. Models predicted WHZ better than LAZ and models including children born to HIV-negative mothers also tended to perform better. MAE decreased in all models as age increased (Fig. S4 a-d).

![Fig. 5.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F5.medium.gif)

Fig. 5. Prediction of attained LAZ and LAZ velocity using XGBoost models. Performance of XGBoost models as assessed by pseudo-R2 values for prediction of attained LAZ and LAZ velocity (LAZ increase per day to next study visit) using species or metagenomic pathways, stratified by age category and maternal HIV status (a). Models were run using microbiome features alone (species or metagenomic pathways; blue points) and in combination with epidemiological variables (yellow points).
The top ranked pathways predicting LAZ at each age category are plotted (b), stratified by maternal HIV status and coloured by scaled importance in the XGBoost model. Accumulated local effects (ALE) plots of representative pathways ranking highly in XGBoost model predictions display change in predicted linear growth (LAZ or LAZ velocity) by percentile of the feature abundance distribution. Tick marks on the x-axis are a rug plot of individual feature abundance percentiles. ALEs were generated using the *ALEplot* package and were plotted using *ggplot2*. Standard deviations (sd) were calculated per increment in microbiome feature and were used to calculate and plot increment-wise 95% confidence intervals as the average change in the outcome ±1.96(sd/sqrt(n)), where n is the number of observed feature values, and sd is the standard deviation of the change in the outcome variable in an interval.

![Fig. 6.](https://www.medrxiv.org/content/medrxiv/early/2022/04/19/2022.04.19.22273587/F6.medium.gif)

Fig. 6. Prediction of attained WHZ and WHZ velocity using XGBoost models. Performance of XGBoost models as assessed by pseudo-R2 values for prediction of attained WHZ and WHZ velocity (WHZ increase per day to next study visit) using species or metagenomic pathways, stratified by age category and maternal HIV status (a). Models were run using microbiome features alone (species or metagenomic pathways; blue points) and in combination with epidemiological variables (yellow points). The top ranked pathways predicting WHZ at each age category are plotted (b), stratified by maternal HIV status and coloured by scaled importance in the XGBoost model. Accumulated local effects (ALE) plots of representative pathways ranking highly in XGBoost model predictions display change in predicted ponderal growth (WHZ or WHZ velocity) by percentile of the feature abundance distribution.
Tick marks on the x-axis are a rug plot of individual feature abundance percentiles. ALEs were generated using the *ALEplot* package and were plotted using *ggplot2*. Standard deviations (sd) were calculated per increment in microbiome feature and were used to calculate and plot increment-wise 95% confidence intervals as the average change in the outcome ±1.96(sd/sqrt(n)), where n is the number of observed feature values, and sd is the standard deviation of the change in the outcome variable in an interval.

### Microbiome features associated with linear growth

In all but the 2-month age group, birthweight contributed most strongly to prediction of attained LAZ. Metagenomic pathways consistently performed better as predictors than other epidemiological features including maternal height. The most predictive pathways were largely similar between infants born to HIV-positive and HIV-negative mothers. At 1 and 2 months, metagenomic pathways encoding enzyme co-factor biosynthesis, nucleotide degradation and biosynthesis and amino acid biosynthesis were consistently predictive of both attained LAZ and LAZ velocity (Fig. 5b). Accumulated local effects (ALE) (36, 37) plots show the average effect of some of the most important features on model outcomes (Fig. 6c). At 3 and 6 months, pathways encoding fermentation and carbohydrate biosynthesis were consistently predictive of attained LAZ and LAZ velocity, whilst amino acid degradation pathways, amongst others, were predictive of growth in the oldest age groups. In particular, pathways involved in vitamin B biosynthesis (flavin, folate, biotin, thiazole and cobalamin biosynthetic pathways) were consistently predictive of attained LAZ, and included flavin biosynthesis I, 6-hydroxymethyl-dihydropterin diphosphate biosynthesis, superpathway of tetrahydrofolate biosynthesis, adenosylcobalamin salvage from cobinamide I, biotin biosynthesis II and thiazole biosynthesis I.
Increasing abundances of B vitamin biosynthesizing genes contributed to increasing predicted growth in ALE plots, apart from thiamin biosynthesis, which predicted lower LAZ at 12 months (Fig. 5c). At 12 months of age, 4-coumarate degradation (anaerobic), a pathway involved in plant polysaccharide degradation, was the most predictive pathway of LAZ in children born to HIV-positive mothers, with greater abundance associated with greater LAZ.

We also assessed pathways predicting growth velocity and found results similar to those for attained growth (Fig. S5a). At 2 months of age, folate biosynthetic pathways (folate transformations II and N10-formyl-tetrahydrofolate biosynthesis) were the top two predictive features of linear growth velocity in children born to HIV-negative mothers, whilst at 3 months, purine and pyrimidine pathways were amongst the most predictive features of linear growth velocity. As in the attained growth models, amino acid and fatty acid biosynthetic pathways were also strongly predictive. Glycogen biosynthesis pathways were also consistently predictive of linear growth velocity at all ages from 2 months onwards.

### Microbiome features associated with ponderal growth

In the few models incorporating taxonomic features that weakly predicted WHZ velocity in infants born to HIV-negative mothers, *Escherichia coli* at 2 months and *Bacteroides fragilis* and *Veillonella atypica* at 3 months were amongst the most predictive features (Fig. S6). Many of the same categories of biosynthetic microbiota pathways were predictive of both WHZ and LAZ, including amino acid and nucleotide (especially purine) biosynthetic pathways in addition to a number of lipid synthesis pathways at older age groups (Fig. 6b-c). The most predictive pathways were largely similar between infants born to HIV-positive and HIV-negative mothers.
O-antigen biosynthesis pathways (PWY-7328 UDP-glucose-derived O-antigen building blocks biosynthesis & PWY-7332 UDP-N-acetylglucosamine-derived O-antigen building blocks biosynthesis and OANTIGEN-PWY pathway), which did not appear in LAZ models, were consistently amongst the most predictive features of WHZ at 1, 6, 12 and 18 months, whereby greater abundance was associated with reduced growth. Pyrimidine and purine synthetic pathways were consistently the strongest predictors of WHZ velocity with varying directions of association, including superpathway of pyrimidine deoxyribonucleotides de novo biosynthesis (PWY0-166) which was the strongest predictive feature of WHZ velocity at 12 months in children born to HIV-mothers (Fig. S5b). Similar to the attained WHZ models, O-antigen biosynthesis pathways, amino acid synthetic pathways and glycogen biosynthesis pathways were all strongly predictive of ponderal growth velocity.

## DISCUSSION

We report the succession and maturation of the early-life gut microbiome in a cohort of 335 children from rural Zimbabwe through the first 18 months after birth. We find that taxonomic composition of the gut microbiome is poorly predictive of child growth, however functional composition moderately predicts both attained LAZ/WHZ and LAZ/WHZ velocity, with pathways including B vitamin and nucleotide biosynthesis genes amongst the most predictive of child growth. We also report that randomized WASH and IYCF interventions have little impact on early-life gut microbiome composition, whilst maternal HIV infection, which is associated with impaired infant growth, is associated with over-maturity of the gut microbiome, featuring a depletion in commensal *Bifidobacteria* species.
Collectively, these data suggest that disturbances in the functional potential of the infant gut microbiome may contribute to poor infant growth and that interventions targeting the infant gut microbiome may serve as novel solutions to combat child stunting, particularly in CHEU. Our previous data from the SHINE trial found that the WASH intervention had no impact on linear growth, whilst IYCF improved growth by 0.16 LAZ (35). Furthermore, WASH had no impact on carriage of enteropathogens or diarrheal incidence (14). Here, we show that the improved WASH and IYCF interventions also had little impact on the infant gut microbiome throughout 18 months after birth. Our results support those from high-income settings, showing a structured, programmed assembly of the gut microbiome in healthy children who are born by vaginal delivery and exclusively breastfed (17, 38). These data suggest that this programmed microbial maturation is robust to changes in WASH and complementary feeding, as delivered in this trial, and that potential microbiome-mediated pathways affecting early-life growth occur independently of these specific interventions. Improvements in growth as a result of the IYCF intervention are not driven by the microbiome, as supported by previous reports showing that the gut microbiome does not mediate the effect of lipid-based IYCF nutrient supplements on child growth (39). More intensive interventions that target WASH, microbial exposures, nutrient intake and microbiota-directed foods during the first 2 years of life may be required to modify this programmed trajectory of gut microbiome succession. The data reported here complement previous research associating the gut microbiome with infant growth. The composition and maturity of the gut microbiota has been shown to be disturbed during severe acute malnutrition (SAM) and could be used to predict growth recovery (19). 
More recently, an “ecogroup” of 15 bacterial taxa has been identified that exhibits consistent covariation, thereby representing microbiota maturation, throughout the first 2 years after birth across different geographical cohorts (40). However, little research has examined microbiome maturation in the context of child stunting. We report similar maturation of the early-life microbiome in this stunting cohort, driven by many of the same age-predictive taxa as previously reported, notably *Faecalibacterium prausnitzii* as the species most predictive of age. We extend this to report functional maturation of the early-life gut microbiota and find that, in addition to amino acid and B-vitamin biosynthetic pathways, methanogenesis from acetate (METH-ACETATE-PWY) was the pathway most predictive of age, despite the apparent lack of methanogens. The predictive strength of this pathway may reflect accumulation of acetogenic species, including *Blautia wexlerae*, that feed into reactions upstream of the METH-ACETATE-PWY. However, our results contrast with previous cross-sectional studies reporting an association between the taxonomic composition of the gut microbiome and stunting (31–33). A previous study from sub-Saharan Africa (Afribiota) reported significant differences in the fecal microbiome of stunted and non-stunted children between 2-5 years of age, hypothesizing that decompartmentalization of the gastrointestinal tract and overgrowth of oropharyngeal taxa are associated with stunting (31). We report here no association between the taxonomic composition of the gut microbiome and linear growth, however our study examined children at younger ages to that of the Afribiota cohort, suggesting that differences in the taxonomic composition of the gut microbiome mediating linear growth may only manifest later in childhood. 
We identified a range of metagenomic pathways that predicted linear and ponderal growth through 18 months suggesting that the potential influence of an altered gut microbiome on child growth is dependent upon a number of interacting metagenomic pathways. This discrepancy in the ability of functional metagenomic features versus taxonomic features to predict growth may suggest that metagenomic pathways contributing to differences in early-life growth may be harboured across a number of functionally redundant species. Intriguingly, many of the pathways that we found predictive of growth in infants, were also found to be predictive of infant birthweight and neonatal growth in analysis of maternal gut microbiomes from the same study cohort (27). Glycogen synthase was the pathway most predictive of birthweight in maternal microbiomes, whereby higher abundance predicted lower birthweight. Here we identified 2 related pathways (PWY-622 starch biosynthesis, GLYCOGENSYNTH-PWY glycogen biosynthesis I (from ADP-D-Glucose)) that were consistently ranked as highly predictive of both linear and ponderal growth velocity. Glycogen synthesis occurs as a starvation response in bacteria which facilitates transition into a biofilm state (41, 42). These data suggest that microbiome starvation responses are associated with growth as early as the first months after birth, which may have downstream implications for host metabolism and associated growth pathways. It is plausible that these pathways are a result of altered nutrient composition in breastmilk of mothers of stunted infants, thereby providing insufficient substrates for infant microbiome maturation, as has been identified previously (43). Alternatively, these signatures may be a consequence of a host-induced effect in poorly growing infants on gut microbiome function. 
Pathways encoding biosynthesis of B vitamins were consistently amongst the top predictive features in models predicting both attained LAZ/WHZ and LAZ/WHZ velocity. Previous evidence supports the importance of B vitamins in early-life growth whereby maternal folic acid supplementation increases infant birthweight (44). In infants, vitamin B12 status is predictive of both linear and ponderal growth (45), however the largest randomized trial of B12 supplementation on infant growth to date showed no effect (46, 47). The gut microbiome biosynthesizes and metabolises B vitamins, including cobalamin (B12) and folate (B9), at levels similar to dietary intake, and abundance of B vitamin-synthesizing genes in the infant gut microbiome differs by delivery mode, antibiotic exposure (48), exclusive breastfeeding practices and geographic location, where vitamin biosynthesis genes are greater in Western settings (38). Greater relative abundance of B vitamin biosynthetic pathways such as thiazole, tetrahydrofolate and flavin biosynthesis in the maternal gut microbiome predicted greater birthweight and neonatal growth in this same cohort, whilst biotin biosynthesis predicted reduced birthweight (27). The gut microbiome transferred from mother to infants may influence the metabolic capacity of the infant microbiome to biosynthesize essential nutrients and influence downstream growth pathways. Purine and pyrimidine biosynthetic pathways consistently contributed to growth predictions across all age groups. In mothers from this same cohort, purine and pyrimidine salvage pathways were associated with increasing birthweight (27). Meta-analyses have also found that dietary nucleotide supplementation in infants significantly increases head circumference and rate of weight gain (49), suggesting that microbiome-derived nucleotide metabolism may play an important role in nutritional status in early infancy.
An important observation, however, is that many of these pathways predicting growth, including B vitamin and purine/pyrimidine biosynthesis, were also predictive of age and hence microbiota maturation. There was a strong association between age and growth in the SHINE cohort, whereby LAZ declined steadily between 1-18 months of age. This highlights the difficulty in delineating the independent effect of the gut microbiome on growth during infancy, when the microbiome is concurrently undergoing age-related maturation, which is by far the strongest contributor to gut microbiome variability. Although we attempted to account for age-related effects by examining samples within specified age categories, the microbiome-growth relationship observed here may be confounded by age. Previous studies, employing MAZ as a maturation index to account for age have demonstrated microbiome maturation is disturbed in acutely malnourished states (19) but is not associated with linear growth (50). Our observations that functional microbiome characteristics moderately predict changes in linear growth add novel findings to this literature but need to be replicated in other large cohorts examining the functional maturation of the gut microbiome throughout early childhood in similar settings. We report that maternal HIV infection had a significant impact on the infant microbiome throughout the first 18 months after birth. We previously reported that CHEU have a 16% higher prevalence of stunting, 40% higher risk of infant mortality and poorer cognitive development compared with CHU (30). The results presented here raise the intriguing possibility that altered succession and assembly of the infant gut microbiome may drive some of these poorer clinical outcomes in CHEU. These findings are in line with previous reports of disturbed gut microbiome composition in CHEU (25, 29). A number of factors may explain these differences. 
Firstly, CHEU receive prophylactic cotrimoxazole from 6 weeks of age, which may impact gut microbiome succession throughout childhood (51, 52). However, we found the largest differences in gut microbiome composition and function in samples from infants < 6 weeks of age, suggesting that these findings were independent of antibiotic prophylaxis and that CHEU may acquire an altered microbiome from their mothers. We saw relatively minor differences, however, in the gut microbiome of mothers living with HIV or without HIV in this same cohort (27). Although there were significant differences in compositional beta diversity, the only species that differed in abundance was *Treponema berlinense*, which was significantly less abundant in mothers living with HIV. Exclusivity of breast-feeding is one of the most impactful factors determining infant gut microbiome composition, however there was no significant difference in EBF rates between CHEU and CHU in this cohort, suggesting that exclusivity of breast-feeding was not responsible for these differences. Previous research has shown that the HMO content of breast milk differs between mothers living with and without HIV (25). HMOs are among the primary substrates for digestion by the infant gut microbiome, thereby fundamentally determining gut microbiome composition. Indeed, we found that *Bifidobacteria* species, which are primary degraders of HMOs, were significantly less abundant in CHEU, as were genes involved in amino acid biosynthesis. Previous in-depth profiling of infant immune development has found that a lack of *Bifidobacteria* in infancy is associated with systemic inflammation and immune dysregulation (21), which are also observed in CHEU (53–55), suggesting that the lack of commensal *Bifidobacteria* may mediate some of the poor immune, growth and clinical outcomes observed in CHEU. This study is strengthened by the large birth cohort of healthy infants from two rural districts in sub-Saharan Africa.
This population is underrepresented in microbiome research to date. The use of whole metagenome shotgun sequencing strengthens the study, providing a unique dataset in which to examine both compositional and functional microbiome maturation in early childhood. Furthermore, the machine learning approach (XGBoost) and comprehensive clinical and epidemiologic data allow us to account for environmental exposures influencing gut microbiome succession and infant growth in this rural, low-resource setting. However, there are several limitations to this analysis: (i) the SHINE microbiome sub-study included more HIV-positive mothers than the main SHINE trial (30% versus 15%), and the proportion of HIV-exposed infants varied by age group. This resulted in some small sub-groups in some of our age categories by HIV exposure analyses, which likely resulted in unstable predictions in certain XGBoost models; (ii) a significant proportion of the sequencing reads included in our datasets were not annotatable (median 58.6%) using the specified bioinformatic pipelines (MetaPhlAn3 and HUMAnN3). This large abundance of unknown sequences is common in samples derived from non-Western populations (56) and leads to inferences solely being made from the assignable fraction, potentially missing important microbiota features that are predictive of infant growth but are currently not represented in databases; (iii) *Escherichia coli* was one of the most prevalent bacterial species across all age groups; however, MetaPhlAn3 cannot differentiate between *E. coli* pathotypes. Different *E. coli* pathotypes, such as enteropathogenic and enteroaggregative *E.
coli* have been associated with intestinal pathology and EED, however we previously reported in the same cohort that some of these pathotypes were not associated with growth (14); (iv) we attempted to account for the age-related confounding of the microbiome-growth relationship by predicting attained growth and growth velocity in discrete age groups, but residual confounding may still be present, influencing our ability to identify microbiome features independently associated with growth; (v) data on infant antimicrobial use in the SHINE trial was incomplete, limiting our ability to confidently assess this and other potential confounders; (vi) differences in microbiome composition and function may also be driven by differences in intestinal microbial load, motility and biogeography, which were not assessed; (vii) finally, we chose mother-infant pairs with the most complete sample collection during follow-up, in order to strengthen our inferences about development of the gut microbiome over time. Baseline characteristics of the microbiome sub-cohort were largely similar to those of the larger trial, suggesting that the microbiome sub-study cohort studied was largely representative of the larger SHINE trial. Collectively, these data suggest that HIV exposure shapes maturation of the infant gut microbiota, and that the functional composition of the infant gut microbiome is moderately predictive of infant growth in a population at high risk of stunting. Novel therapeutic approaches targeting the gut microbiome may mitigate the poor clinical outcomes that are observed in CHEU, a growing population of children in sub-Saharan Africa. By contrast, current WASH and IYCF interventions fail to impact the infant gut microbiome and therefore transformative WASH and microbiome-targeted dietary interventions may prove to be more successful approaches to target the microbial pathways mediating early-life growth. 
## MATERIALS AND METHODS

### SHINE trial design

The study design and methods for The Sanitation Hygiene Infant Nutrition Efficacy (SHINE) trial and for the corresponding microbiome analyses, have been reported previously (57, 58). Briefly, SHINE was a 2×2 cluster-randomized trial, conducted between 2012 and 2017, to determine the independent and combined effects of improved infant and young child feeding (IYCF) and WASH on child stunting and anaemia in two rural Zimbabwean districts ([NCT01824940](http://medrxiv.org/lookup/external-ref?link_type=CLINTRIALGOV&access_num=NCT01824940&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F19%2F2022.04.19.22273587.atom)). 5280 pregnant women were cluster-randomized to one of four interventions: WASH, IYCF, WASH+IYCF, and Standard of Care (SOC). The SOC interventions, included in all trial arms, comprised exclusive breastfeeding promotion for all infants up to 6 months and strengthened prevention of mother to child transmission (PMTCT) of HIV services. The household WASH intervention was initiated during pregnancy and was designed to reduce exposure to human and animal feces, including, at the household level: construction of a ventilated improved pit latrine, installation of two hand-washing stations plus monthly delivery of liquid soap and water chlorination solution, provision of a play space for the infant, and hygiene counseling. The IYCF intervention was designed to improve infant diets using a small-quantity lipid-based nutrient supplement (SQ-LNS), provided to the infant from 6-18 months, and educational interventions promoting the use of age-appropriate, locally available foods and dietary diversity. Lastly, a combined trial arm, WASH+IYCF, evaluated the effects of both improved WASH and infant nutrition. Infants were followed up at study visits at 1, 3, 6, 12 and 18 months of age. Length and weight were measured at each infant visit, as described previously (35).
Length-for-age z scores (LAZ) and weight-for-height z scores (WHZ) were calculated from length and weight measurements at each visit according to WHO Child Growth Standards. Epidemiologic data for the infants were collected from the baseline and follow-up visits using trial questionnaires that included maternal anthropometry, birth outcomes, baseline household WASH facilities, household wealth, maternal education, religion, parity, household size, dietary diversity, changes in breastfeeding and complementary feeding practices, food security, 7-day and 3-month infant health status, and antimicrobial use.

### HIV testing

HIV testing was conducted on mothers at the baseline visit using a rapid test algorithm (Alere Determine HIV1/2 test, followed by INSTI HIV-1/2 test if positive). Those testing positive for HIV had CD4 counts measured (Alere Pima Analyser) and were referred to local clinics; women were encouraged to begin co-trimoxazole prophylaxis and ART, to exclusively breastfeed, and to attend clinic at 6 weeks postpartum for early infant diagnosis and infant co-trimoxazole prophylaxis. Women testing negative for HIV were offered retesting at 32 gestational weeks and 18 months postpartum. Children of mothers living with HIV were offered testing for HIV at each of the study visits. Those who tested positive were referred to local clinics for ART. HIV was diagnosed using DNA PCR on dried blood-spot samples or RNA PCR on plasma in samples collected prior to 18 months. In samples collected after 18 months, HIV was diagnosed by PCR or rapid test algorithm, depending on samples provided. Children who were born to HIV-positive mothers and who tested negative at 18 months were classified as HIV-exposed uninfected (CHEU). Inconclusive or discordant results were re-tested; if no further samples were available or repeat testing was inconclusive, children were classified as HIV-unknown.
### Microbiome sub-study

All CHEU and a subgroup of CHU from the SHINE study were enrolled into an Environmental Enteric Dysfunction (EED) sub-study (n=1,656 mother-child pairs); these infants underwent intensive biological specimen collection at 1, 3, 6, 12 and 18 months of age (58). The EED sub-study was therefore enriched for mothers living with HIV, by design. Sample selection for inclusion into the current microbiome study was conducted to enhance longitudinal profiling of the maternal and infant gut microbiota. Of the mother-infant pairs within the EED sub-study, those with at least one maternal fecal specimen (of 2 possible) and at least 2 infant fecal specimens (of 5 possible) were included in the gut microbiome analyses. An additional 94 samples collected at the 1 and 3-month visits, which did not meet these criteria but had microbiome sequencing data available from a separate study examining rotavirus vaccine immunogenicity in the SHINE trial (59), were also included in these analyses. Infant ages varied at each study visit due to the allowable window around the visit date for the larger SHINE trial. Therefore, for this microbiome study, each stool sample was re-categorized into one of 6 age groups corresponding to important stages in infant microbiome development: “1 month” (0-6 weeks), “2 months” (7 weeks – <3 months), “3 months” (3-6 months), “6 months” (6-9 months), “12 months” (9-15 months), and “18 months” (15-20 months).

### Sample collection

Study visits were conducted by trained study nurses in participants’ homes. Sterile stool collection tubes were provided to mothers, who collected stool samples from their infants on the morning of each study visit. Samples were placed in cool boxes immediately upon collection by study nurses and transported by motorbike to field laboratories where they were aliquoted and stored at −80°C within 6 hours of collection before subsequent transport to the central laboratory in Harare for long-term storage at −80°C.
An aliquot of each stool sample was shipped on dry ice by courier to the British Columbia Centre for Disease Control in Vancouver, Canada. A strict cold chain was maintained throughout transport, ensuring no freeze-thaw cycles occurred between sample collection and processing.

### Whole metagenome library preparation and sequencing

DNA was extracted from 100-200 mg of stool samples using the Qiagen DNeasy PowerSoil Kit as per the manufacturer’s instructions. DNA quantity was assessed by fluorometry (QuBit) and quality confirmed by spectrophotometry (SimpliNano). 1 µg DNA was subsequently used as input for metagenomic sequencing library preparation using the Illumina TruSeq PCR-free library preparation protocol, using custom end-repair, adenylation and ligation enzyme premixes (New England Biolabs). The concentration and size of constructed libraries were assessed by qPCR and by TapeStation (Agilent). DNA-free negative controls and positive controls (ZymoBIOMICS) were included in all DNA extraction and library preparation steps. Libraries were pooled in random batches of 48 samples including one negative control. A set of specimens was subjected to replicate DNA extraction, library preparation, and sequencing to estimate the magnitude of technical variability among samples. Whole metagenome sequencing was performed with 125-nucleotide paired-end reads using either the Illumina HiSeq 2500 or HiSeqX platforms at Canada’s Michael Smith Genome Sciences Centre, Vancouver, Canada.

### Bioinformatics

Sequenced reads were trimmed of adapters and filtered to remove low-quality, short (<70% raw read length), and duplicate reads, as well as those of human, other animal or plant origin, using *KneadData* with default settings. Species composition was determined by identifying clade-specific markers from reads using MetaPhlAn3 with default settings (60).
Relative abundance estimates were obtained from known assigned reads, and unknown read proportions were estimated from total, assigned and unassigned, reads. Percent human DNA was estimated from *KneadData* output, using the proportion of quality-filtered reads that align to the human genome. Given the smaller viral genome sizes, sequencing depth, and limitations of MetaPhlAn3 for virus identification, we did not include viruses in our current analyses. We applied a minimum threshold of >0.1% relative abundance and ≥5% prevalence for all detected species. Metabolic pathway composition was determined using HUMAnN3 with default settings against the UniRef90 database (60). Pathway abundance estimates were normalized using reads per kilobase per million mapped reads (RPKM) and then re-normalized to relative abundance. We applied a minimum relative abundance threshold of 3×10⁻⁷% and ≥5% prevalence for all metagenomic pathways.

### Statistical analysis

All data were analysed using R (v.4.0.5). Microbiome data were handled using the phyloseq package. Alpha diversity metrics were calculated using the vegan package. Beta-diversity was estimated using the Bray-Curtis dissimilarity index and analysed by permutation analysis of variance (PERMANOVA). Differential abundance of species or functional pathways was assessed by multiple regression using the *MaAsLin2* package (61). Four covariates were chosen for adjustment in *MaAsLin2* regression models: age at stool sample collection, exclusive breastfeeding status (recorded at 3 months old), delivery mode, and randomised trial arm. These covariates were chosen based on biological plausibility and previous evidence of their influence on gut microbiome composition in large birth cohorts (17). Adjustment for multiple comparisons was performed using the Benjamini-Hochberg false discovery rate (FDR).
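
As an illustration of the multiple-comparison adjustment named above, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is written in plain Python purely for illustration (the study's analyses were run in R, where *MaAsLin2* performs this adjustment internally); the function name `benjamini_hochberg` and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, scaling each by m / rank
    # and enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four microbiome features (illustrative only).
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
```

Features whose adjusted value falls below a chosen FDR threshold would then be reported as differentially abundant; the threshold itself is an analysis choice and is not implied by this sketch.
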
The SHINE trial did not observe an interaction between the randomized WASH and IYCF interventions and growth; therefore, randomised trial arms were combined into WASH versus non-WASH arms and IYCF versus non-IYCF arms for specified analyses. We restricted the IYCF analysis to the 6, 12 and 18 month visits, corresponding to the period during which supplemental infant feeding was introduced (from 6 months of age). All children, regardless of HIV status or exposure status, were included in the growth analyses (875 total stool samples). 16 samples had missing ages and were excluded from age prediction models. Stool samples collected from children classified as HIV-unknown (24 samples) or HIV-positive (4 samples) at 18 months were excluded from direct comparisons of CHEU vs CHU infants.

### XGBoost models

Relationships between the infant microbiome and age or growth (attained LAZ/WHZ and LAZ/WHZ velocity) were evaluated using extreme gradient boosting machines (XGBoost). XGBoost builds an optimized predictive model by creating an ensemble from a series of weakly predictive models. XGBoost is also non-parametric, can capture non-linear relationships, and can accommodate high-dimensional data (62). The XGBoost models were developed using microbiome relative abundances (species or pathways) and all epidemiologic variables. XGBoost model selection was performed in 3 stages as previously described (27). In brief, in stage one, the *BayesianOptimization* function of the *rBayesianOptimization* package was used with 10-fold cross-validation to select model hyperparameters by minimizing the mean squared error (MSE). Models with the lowest MSE (in the 5th percentile) were retained, and from these models the variables that contributed to the top 95% of variable importance by proportion were retained.
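
The retention rule for variables can be read in more than one way; the sketch below shows one plausible interpretation, in which features are ranked by importance and kept until their cumulative share of total importance reaches 95%. This is an assumption rather than the authors' implementation: the function name `retain_top_importance`, the choice to include the feature that crosses the threshold, and the importance values are all hypothetical, and Python is used only for illustration.

```python
def retain_top_importance(importances, threshold=0.95):
    """Keep the highest-ranked features until their cumulative share of
    total importance reaches `threshold` (assumed reading of 'top 95%')."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("total importance must be positive")
    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for name, value in ranked:
        if cumulative >= threshold:
            break  # the threshold was already reached by earlier features
        kept.append(name)
        cumulative += value / total
    return kept


# Hypothetical importance scores for four pathways (illustrative only).
importances = {
    "pathway_A": 50.0,
    "pathway_B": 30.0,
    "pathway_C": 16.0,
    "pathway_D": 4.0,
}
kept = retain_top_importance(importances)
```

With these made-up scores the three highest-ranked pathways together account for 96% of total importance, so the lowest-ranked pathway is dropped.
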
In stage two, all epidemiologic variables were included with the microbiome variables obtained in stage one, and *BayesianOptimization* was re-run as for stage one but using leave-one-out cross-validation. Microbiome variables that contributed to the top 95% of variable importance by proportion were retained. In stage three, all epidemiologic variables, microbiome features, and hyperparameters selected in stage two were used to fit our final models, using leave-one-out cross-validation to minimize the MSE. This 3-stage hyperparameter tuning and model building was performed for two feature sets, one comprising microbiome features and a second comprising microbiome plus epidemiologic features; this was done to assess model performance and to examine the contribution of epidemiologic versus microbiome features. Separate models were built for attained LAZ and WHZ and growth velocity outcomes (13). We assessed microbiota composition and functional pathways separately. XGBoost models were fit using the H2O.ai engine and *h2o* R package interface with the *XGBoost* package. XGBoost model performance was evaluated using pseudo-R2 and mean absolute error (MAE). Pseudo-R2 values < 0 indicated that the prediction of the model was worse than the mean response. Scaled relative importance for each model feature was used to identify the twenty most informative variables for further interpretation, where the most important variable is ranked first, and the importance of each subsequent variable is relative to the first variable. The marginal relationships between the twenty most important features and each growth outcome were visualized for interpretation (36) using accumulated local effects (ALE) plots. ALE plots can be interpreted as showing a marginal effect, adjusted for all covariates retained in the final model, showing the expected change in the outcome variable per increment in a model feature.
The resulting effect sizes are plotted cumulatively and centered about the average effect size (37). ALEs were generated using the *ALEplot* package, modified to compute confidence intervals, and were plotted using *ggplot2*. Standard deviations were calculated per increment and were used to calculate and plot increment-wise 95% confidence intervals.

### Microbiome age

We investigated microbiome maturation by building an age prediction model using XGBoost and microbiome features only (species or pathways). We partitioned the infants and their corresponding datasets into three groups to train and test a model of microbiome age: (1) CHU with LAZ > −2 at 18 months of age, who contributed >1 dataset (healthy training set); (2) remaining CHU infants with LAZ > −2 at 18 months of age, who contributed a single metagenomic dataset (healthy test set); and (3) CHEU or children with LAZ ≤ −2 at 18 months of age (unhealthy test set). Age was log transformed as a response in the XGBoost model. We performed the same 3-stage tuning and model building procedure, as described above. We generated model performance metrics, including pseudo-R2, mean absolute error (MAE) and mean squared error (MSE) for the three sets. For the training set, we used the cross-validation, hold-out predictions to generate the metrics, and for the two test sets, we used the predicted values from the final model to calculate the model performance metrics. We exponentiated the predicted log transformed ages and plotted these values against the observed age. The predicted age using these models is referred to as ‘microbiota age’ for the models trained using species and ‘metagenome age’ for models trained using pathways. To account for variance of microbiota ages with respect to chronological age within the age range of each study visit, a microbiota-for-age Z-score (MAZ) and a metagenome-for-age Z-score (MetAZ) were also created using the microbiota and metagenome ages as previously described (19).
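
The sketch below illustrates how a Z-score of this kind can be computed from predicted microbiota ages. It is a hedged illustration in Python (the study's analyses were run in R): the function name `microbiota_for_age_z`, the use of the sample standard deviation, and the example ages are assumptions made here and are not taken from the study.

```python
import statistics


def microbiota_for_age_z(child_age, healthy_ages):
    """Z-score of one child's predicted microbiota age relative to the
    predicted microbiota ages of healthy children at the same study visit."""
    # Centre on the median predicted age of the healthy reference group.
    reference_median = statistics.median(healthy_ages)
    # Scale by the spread of the reference group (sample SD is an assumption).
    reference_sd = statistics.stdev(healthy_ages)
    return (child_age - reference_median) / reference_sd


# Hypothetical predicted microbiota ages in months (illustrative only).
healthy_ages_at_visit = [5.0, 6.0, 7.0]
z = microbiota_for_age_z(8.0, healthy_ages_at_visit)
```

A positive value indicates a microbiota that appears older than that of the healthy reference children at the same visit, and a negative value a microbiota that appears younger. The same calculation applies to MetAZ when metagenome ages are supplied instead.
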
A Z-score was calculated to account for variation in ages within each study visit using the following formula: (microbiota age of child − median microbiota age of ‘healthy’ children at the same study visit) / (standard deviation of microbiota age of ‘healthy’ children at the same study visit).

## Supporting information

Supplementary tables (S3-S5) [supplements/273587_file02.xlsx]

## Data Availability

Raw sequencing reads are deposited in the European Bioinformatics Database under accession number PRJEB51728. Associated metadata from the larger SHINE trial will be uploaded to http://ClinEpiDB.org following publication of all primary manuscripts from the SHINE trial. Prior to that time, the data are housed on the ClinEpiDB platform at the Zvitambo Institute for Maternal and Child Health Research and available upon request from Ms. Virginia Sauramba (vsauramba{at}zvitambo.co.zw).

## SUPPLEMENTARY MATERIALS

[Figure S1.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F7) Participants and samples included in the study. 875 stool samples collected from 335 unique infants underwent whole metagenome shotgun sequencing and were categorized into 6 age groups (a). CONSORT diagram of the participants from the SHINE trial included in the infant microbiome sub-study (b).

[Figure S2.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F8) Whole metagenome sequencing performance. A median of 0.05% of sequencing reads was assigned to the human genome in each sample (a), and this proportion varied by age at stool sample collection. The percentage of sequencing reads that could be aligned to known sequences using the MetaPhlAn3 and HUMAnN3 pipelines decreased in stool samples collected at older ages (b). PCoA (c) and phylum relative abundances (d) of sequencing sample replicates showed high reproducibility and little technical variation.

[Figure S3.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F9) Diversity metrics in the entire dataset. Shannon alpha diversity (a) and gene richness (b) across the entire dataset revealed stable diversity up to 4-5 months of age followed by rapid taxonomic and functional diversification. Bray-Curtis distances between samples within and across each age visit showed low inter-individual variability in species composition, which increased with age (c), and high inter-individual variation in metagenome pathways (d).

[Figure S4.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F10) XGBoost model performance metrics as assessed by mean absolute error. Mean absolute error (MAE) of XGBoost models predicting LAZ (a), WHZ (b), LAZ velocity (c) and WHZ velocity (d), stratified by maternal HIV status and age category. Models were run using microbiome features alone (species or metagenomic pathways) and in combination with epidemiological variables.

[Figure S5.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F11) Top ranked pathways in XGBoost models predicting growth velocity. Top ranked features in XGBoost model predictions of LAZ velocity (a) and WHZ velocity (b) stratified by maternal HIV status. Only features from XGBoost models with pseudo-R2 > 0 are plotted.

[Figure S6.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/F12) Top ranked species in XGBoost models predicting growth. Top ranked features in XGBoost model predictions of WHZ velocity (a) and of WHZ in children born to HIV-positive (b) and HIV-negative mothers (c). Only features from XGBoost models with pseudo-R2 > 0 are plotted.

[Table S1.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/T1) Baseline characteristics of infants in SHINE trial and microbiome substudy

[Table S2.](http://medrxiv.org/content/early/2022/04/19/2022.04.19.22273587/T2) Detected Eukaryota and Archaea species prior to prevalence filtering

**Table S3. (*Auxiliary supplementary file*)** Multivariate regression analysis examining the effect of maternal HIV infection, age (days), exclusive breastfeeding status, delivery mode and trial arm on taxonomic microbiome composition

**Table S4. (*Auxiliary supplementary file*)** Multivariate regression analysis examining the effect of maternal HIV infection, age (days), exclusive breastfeeding status, delivery mode and trial arm on gene pathway microbiome composition

**Table S5. (*Auxiliary supplementary file*)** Epidemiological variables included in XGBoost models

## Funding

* Bill & Melinda Gates Foundation (OPP1021542 and OPP1143707; JHH and AJP), with a subcontract to the University of British Columbia (20R25498; ARM)
* United Kingdom Department for International Development (DFID/UKAID; JHH and AJP)
* Wellcome Trust (093768/Z/10/Z, 108065/Z/15/Z; AJP)
* Swiss Agency for Development and Cooperation (JHH and AJP)
* US National Institutes of Health (2R01HD060338-06; JHH)
* UNICEF (PCA-2017-0002; JHH and AJP)

The funders had no role in the design of the study; in the collection, analysis, and interpretation of data; or in writing the manuscript.

## Author contributions

* Conceptualization and study design: ARM, LES, RJS, JHH and AJP
* Data and biospecimen collection: KM, RN, BC, FDM, NVT, JT and BM
* Biospecimen processing: HMG, IB, SKG, RCR, FF and LC
* Bioinformatics and machine learning analyses: TJE
* Data analysis and interpretation: RCR, ARM, TJE, LC, CE and EKG
* Writing - original draft: RCR, ARM, LC and TJE
* Writing - reviewing and editing: All authors
* Supervision and verification of data: ARM, AJP and JHH

## Competing interests

TJE was paid a scientific consulting fee in relation to the analysis of the data presented here by Zvitambo Institute for Maternal and Child Health Research. RCR declares remittance from Abbott Nutrition Health Institute (March 2022) and Nutricia (May 2021) for public conference talks outside the submitted work. All other authors declare that they have no competing interests.

## Ethics approvals

All SHINE mothers provided written informed consent. The Medical Research Council of Zimbabwe (MRCZ/A/1675), Johns Hopkins Bloomberg School of Public Health (JHU IRB #4205), and the University of British Columbia (H15-03074) approved the study protocol, including the microbiome analyses.
The SHINE trial is registered at [ClinicalTrials.gov](http://ClinicalTrials.gov) ([NCT01824940](http://medrxiv.org/lookup/external-ref?link_type=CLINTRIALGOV&access_num=NCT01824940&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F19%2F2022.04.19.22273587.atom)).

## Acknowledgements

We thank all the mothers, babies, and their families who participated in the SHINE trial, the leadership and staff of the Ministry of Health and Child Care in Chirumanzu and Shurugwi districts and Midlands Province (especially environmental health, nursing, and nutrition) for their roles in operationalization of the study procedures, the Ministry of Local Government officials in each district who supported and facilitated field operations, Phillipa Rambanepasi and her team for proficient management of all the finances, Virginia Sauramba for management of compliance issues, and the programme officers at the Gates Foundation and the Department for International Development, who enthusiastically worked with us over a long period to make SHINE happen.

## Footnotes

* Members of the SHINE Trial team who are not named authors are listed in [https://academic.oup.com/cid/article/61/suppl\_7/S685/358186](https://academic.oup.com/cid/article/61/suppl_7/S685/358186)
* Received April 19, 2022.
* Revision received April 19, 2022.
* Accepted April 19, 2022.
* © 2022, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## REFERENCES

1. Victora CG, Christian P, Vidaletti LP, Gatica-Domínguez G, Menon P, Black RE. Revisiting maternal and child undernutrition in low-income and middle-income countries: variable progress towards an unfinished agenda. Lancet. 2021;397(10282):1388–99.
2. Black RE, Victora CG, Walker SP, Bhutta ZA, Christian P, de Onis M, Ezzati M, Grantham-McGregor S, Katz J, Martorell R, Uauy R, Group MaCNS. Maternal and child undernutrition and overweight in low-income and middle-income countries. Lancet. 2013;382(9890):427–51.
3. Prendergast AJ, Humphrey JH. The stunting syndrome in developing countries. Paediatr Int Child Health. 2014;34(4):250–65.
4. Dewey KG, Begum K. Long-term consequences of stunting in early life. Matern Child Nutr. 2011;7 Suppl 3:5–18.
5. Dewey KG, Stewart CP, Wessells KR, Prado EL, Arnold CD. Small-quantity lipid-based nutrient supplements for the prevention of child malnutrition and promotion of healthy development: overview of individual participant data meta-analysis and programmatic implications. Am J Clin Nutr. 2021;114(Suppl 1):3S–14S.
6. Prendergast A, Kelly P. Enteropathies in the developing world: neglected effects on global health. Am J Trop Med Hyg. 2012;86(5):756–63.
7. Harper KM, Mutasa M, Prendergast AJ, Humphrey J, Manges AR. Environmental enteric dysfunction pathways and child stunting: A systematic review. PLoS Negl Trop Dis. 2018;12(1):e0006205.
8. Crane RJ, Jones KD, Berkley JA. Environmental enteric dysfunction: an overview. Food Nutr Bull. 2015;36(1 Suppl):S76–87.
9. Kosek MN, Investigators M-EN. Causal Pathways from Enteropathogens to Environmental Enteropathy: Findings from the MAL-ED Birth Cohort Study. EBioMedicine. 2017;18:109–17.
10. Prendergast AJ, Kelly P. Interactions between intestinal pathogens, enteropathy and malnutrition in developing countries. Curr Opin Infect Dis. 2016;29(3):229–36.
11. Amadi B, Zyambo K, Chandwe K, Besa E, Mulenga C, Mwakamui S, Siyumbwa S, Croft S, Banda R, Chipunza M, Chifunda K, Kazhila L, VanBuskirk K, Kelly P. Adaptation of the small intestine to microbial enteropathogens in Zambian children with stunting. Nat Microbiol. 2021;6(4):445–54.
12. Pickering AJ, Null C, Winch PJ, Mangwadu G, Arnold BF, Prendergast AJ, Njenga SM, Rahman M, Ntozini R, Benjamin-Chung J, Stewart CP, Huda TMN, Moulton LH, Colford JM, Luby SP, Humphrey JH. The WASH Benefits and SHINE trials: interpretation of WASH intervention effects on linear growth and diarrhoea. Lancet Glob Health. 2019;7(8):e1139–e46.
13. Mutasa K, Ntozini R, Mbuya MNN, Rukobo S, Govha M, Majo FD, Tavengwa N, Smith LE, Caulfield L, Swann JR, Stoltzfus RJ, Moulton LH, Humphrey JH, Gough EK, Prendergast AJ. Biomarkers of environmental enteric dysfunction are not consistently associated with linear growth velocity in rural Zimbabwean infants. Am J Clin Nutr. 2021;113(5):1185–98.
14. Rogawski McQuade ET, Platts-Mills JA, Gratz J, Zhang J, Moulton LH, Mutasa K, Majo FD, Tavengwa N, Ntozini R, Prendergast AJ, Humphrey JH, Liu J, Houpt ER. Impact of Water Quality, Sanitation, Handwashing, and Nutritional Interventions on Enteric Infections in Rural Zimbabwe: The Sanitation Hygiene Infant Nutrition Efficacy (SHINE) Trial. J Infect Dis. 2020;221(8):1379–86.
15. Rogawski ET, Liu J, Platts-Mills JA, Kabir F, Lertsethtakarn P, Siguas M, Khan SS, Praharaj I, Murei A, Nshama R, Mujaga B, Havt A, Maciel IA, Operario DJ, Taniuchi M, Gratz J, Stroup SE, Roberts JH, Kalam A, Aziz F, Qureshi S, Islam MO, Sakpaisal P, Silapong S, Yori PP, Rajendiran R, Benny B, McGrath M, Seidman JC, Lang D, Gottlieb M, Guerrant RL, Lima AAM, Leite JP, Samie A, Bessong PO, Page N, Bodhidatta L, Mason C, Shrestha S, Kiwelu I, Mduma ER, Iqbal NT, Bhutta ZA, Ahmed T, Haque R, Kang G, Kosek MN, Houpt ER, Investigators M-EN. Use of quantitative molecular diagnostic methods to investigate the effect of enteropathogen infections on linear growth in children in low-resource settings: longitudinal analysis of results from the MAL-ED cohort study. Lancet Glob Health. 2018;6(12):e1319–e28.
16. Richard SA, McCormick BJJ, Murray-Kolb LE, Lee GO, Seidman JC, Mahfuz M, Ahmed T, Guerrant RL, Petri WA, Rogawski ET, Houpt E, Kang G, Mduma E, Kosek MN, Lima AAM, Shrestha SK, Chandyo RK, Bhutta Z, Bessong P, Caulfield LE, Investigators M-EN. Enteric dysfunction and other factors associated with attained size at 5 years: MAL-ED birth cohort study findings. Am J Clin Nutr. 2019;110(1):131–8.
17. Stewart CJ, Ajami NJ, O’Brien JL, Hutchinson DS, Smith DP, Wong MC, Ross MC, Lloyd RE, Doddapaneni H, Metcalf GA, Muzny D, Gibbs RA, Vatanen T, Huttenhower C, Xavier RJ, Rewers M, Hagopian W, Toppari J, Ziegler AG, She JX, Akolkar B, Lernmark A, Hyoty H, Vehik K, Krischer JP, Petrosino JF. Temporal development of the gut microbiome in early childhood from the TEDDY study. Nature. 2018;562(7728):583–8.
18. Robertson RC, Manges AR, Finlay BB, Prendergast AJ. The Human Microbiome and Child Growth-First 1000 Days and Beyond. Trends Microbiol. 2019;27(2):131–47.
19. Subramanian S, Huq S, Yatsunenko T, Haque R, Mahfuz M, Alam MA, Benezra A, DeStefano J, Meier MF, Muegge BD, Barratt MJ, VanArendonk LG, Zhang Q, Province MA, Petri WA, Ahmed T, Gordon JI. Persistent gut microbiota immaturity in malnourished Bangladeshi children. Nature. 2014;510(7505):417–21.
20. Rouhani S, Griffin NW, Yori PP, Gehrig JL, Olortegui MP, Salas MS, Trigoso DR, Moulton LH, Houpt ER, Barratt MJ, Kosek MN, Gordon JI. Diarrhea as a Potential Cause and Consequence of Reduced Gut Microbial Diversity Among Undernourished Children in Peru. Clin Infect Dis. 2020;71(4):989–99.
21. Blanton LV, Charbonneau MR, Salih T, Barratt MJ, Venkatesh S, Ilkayeva O, Subramanian S, Manary MJ, Trehan I, Jorgensen JM, Fan YM, Henrissat B, Leyn SA, Rodionov DA, Osterman AL, Maleta KM, Newgard CB, Ashorn P, Dewey KG, Gordon JI. Gut bacteria that prevent growth impairments transmitted by microbiota from malnourished children. Science. 2016;351(6275).
22. Smith MI, Yatsunenko T, Manary MJ, Trehan I, Mkakosya R, Cheng J, Kau AL, Rich SS, Concannon P, Mychaleckyj JC, Liu J, Houpt E, Li JV, Holmes E, Nicholson J, Knights D, Ursell LK, Knight R, Gordon JI. Gut microbiomes of Malawian twin pairs discordant for kwashiorkor. Science. 2013;339(6119):548–54.
23. Chen RY, Mostafa I, Hibberd MC, Das S, Mahfuz M, Naila NN, Islam MM, Huq S, Alam MA, Zaman MU, Raman AS, Webber D, Zhou C, Sundaresan V, Ahsan K, Meier MF, Barratt MJ, Ahmed T, Gordon JI. A Microbiota-Directed Food Intervention for Undernourished Children. N Engl J Med. 2021;384(16):1517–28.
24. Gehrig JL, Venkatesh S, Chang HW, Hibberd MC, Kung VL, Cheng J, Chen RY, Subramanian S, Cowardin CA, Meier MF, O’Donnell D, Talcott M, Spears LD, Semenkovich CF, Henrissat B, Giannone RJ, Hettich RL, Ilkayeva O, Muehlbauer M, Newgard CB, Sawyer C, Head RD, Rodionov DA, Arzamasov AA, Leyn SA, Osterman AL, Hossain MI, Islam M, Choudhury N, Sarker SA, Huq S, Mahmud I, Mostafa I, Mahfuz M, Barratt MJ, Ahmed T, Gordon JI. Effects of microbiota-directed foods in gnotobiotic animals and undernourished children. Science. 2019;365(6449).
25. Bender JM, Li F, Martelly S, Byrt E, Rouzier V, Leo M, Tobin N, Pannaraj PS, Adisetiyo H, Rollie A, Santiskulvong C, Wang S, Autran C, Bode L, Fitzgerald D, Kuhn L, Aldrovandi GM. Maternal HIV infection influences the microbiome of HIV-uninfected infants. Sci Transl Med. 2016;8(349):349ra100.
26. Lozupone CA, Li M, Campbell TB, Flores SC, Linderman D, Gebert MJ, Knight R, Fontenot AP, Palmer BE. Alterations in the gut microbiota associated with HIV-1 infection. Cell Host Microbe. 2013;14(3):329–39.
27. Gough EK, Edens TJ, Geum HM, Baharmand I, Gill SK, Robertson RC, Mutasa K, Ntozini R, Smith LE, Chasekwa B, Majo FD, Tavengwa NV, Mutasa B, Francis F, Carr L, Tome J, Stoltzfus RJ, Moulton LH, Prendergast AJ, Humphrey JH, Manges AR, Team ST. Maternal fecal microbiome predicts gestational age, birth weight and neonatal growth in rural Zimbabwe. EBioMedicine. 2021;68:103421.
28. Amenyogbe N, Dimitriu P, Cho P, Ruck C, Fortuno ES, Cai B, Alimenti A, Côté HCF, Maan EJ, Slogrove AL, Esser M, Marchant A, Goetghebuer T, Shannon CP, Tebbutt SJ, Kollmann TR, Mohn WW, Smolen KK. Innate Immune Responses and Gut Microbiomes Distinguish HIV-Exposed from HIV-Unexposed Children in a Population-Specific Manner. J Immunol. 2020;205(10):2618–28.
29. Machiavelli A, Duarte RTD, Pires MMS, Zárate-Bladés CR, Pinto AR. The impact of *in utero* HIV exposure on gut microbiota, inflammation, and microbial translocation. Gut Microbes. 2019;10(5):599–614.
30. Evans C, Chasekwa B, Ntozini R, Majo FD, Mutasa K, Tavengwa N, Mutasa B, Mbuya MNN, Smith LE, Stoltzfus RJ, Moulton LH, Humphrey JH, Prendergast AJ, Team SHINEST. Mortality, Human Immunodeficiency Virus (HIV) Transmission, and Growth in Children Exposed to HIV in Rural Zimbabwe. Clin Infect Dis. 2021;72(4):586–94.
31. Vonaesch P, Morien E, Andrianonimiadana L, Sanke H, Mbecko JR, Huus KE, Naharimanananirina T, Gondje BP, Nigatoloum SN, Vondo SS, Kaleb Kandou JE, Randremanana R, Rakotondrainipiana M, Mazel F, Djorie SG, Gody JC, Finlay BB, Rubbo PA, Wegener Parfrey L, Collard JM, Sansonetti PJ, Investigators A. Stunted childhood growth is associated with decompartmentalization of the gastrointestinal tract and overgrowth of oropharyngeal taxa. Proc Natl Acad Sci U S A. 2018;115(36):E8489–E98.
32. Dinh DM, Ramadass B, Kattula D, Sarkar R, Braunstein P, Tai A, Wanke CA, Hassoun S, Kane AV, Naumova EN, Kang G, Ward HD. Longitudinal Analysis of the Intestinal Microbiota in Persistently Stunted Young Children in South India. PLoS One. 2016;11(5):e0155405.
33. Gough EK, Stephens DA, Moodie EE, Prendergast AJ, Stoltzfus RJ, Humphrey JH, Manges AR. Linear growth faltering in infants is associated with Acidaminococcus sp. and community-level changes in the gut microbiota. Microbiome. 2015;3:24.
34. Ordiz MI, Stephenson K, Agapova S, Wylie KM, Maleta K, Martin J, Trehan I, Tarr PI, Manary MJ. Environmental Enteric Dysfunction and the Fecal Microbiota in Malawian Children. Am J Trop Med Hyg. 2017;96(2):473–6.
35. Humphrey JH, Mbuya MNN, Ntozini R, Moulton LH, Stoltzfus RJ, Tavengwa NV, Mutasa K, Majo F, Mutasa B, Mangwadu G, Chasokela CM, Chigumira A, Chasekwa B, Smith LE, Tielsch JM, Jones AD, Manges AR, Maluccio JA, Prendergast AJ, Team SHINEST. Independent and combined effects of improved water, sanitation, and hygiene, and improved complementary feeding, on child stunting and anaemia in rural Zimbabwe: a cluster-randomised trial. Lancet Glob Health. 2019;7(1):e132–e47.
36. Zhao Q, Hastie T. Causal interpretations of black-box models. J Bus Econ Stat. 2019;2019.
37. Apley D, Zhu J. Visualizing the effects of predictor variables in black box supervised learning models. J R Stat Soc Ser B (Statistical Methodol). 2020;82:1059–86.
38. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, Magris M, Hidalgo G, Baldassano RN, Anokhin AP, Heath AC, Warner B, Reeder J, Kuczynski J, Caporaso JG, Lozupone CA, Lauber C, Clemente JC, Knights D, Knight R, Gordon JI. Human gut microbiome viewed across age and geography. Nature. 2012;486(7402):222–7.
39. Hughes RL, Arnold CD, Young RR, Ashorn P, Maleta K, Fan YM, Ashorn U, Chaima D, Malamba-Banda C, Kable ME, Dewey KG. Infant gut microbiota characteristics generally do not modify effects of lipid-based nutrient supplementation on growth or inflammation: secondary analysis of a randomized controlled trial in Malawi. Sci Rep. 2020;10(1):14861.
40. Raman AS, Gehrig JL, Venkatesh S, Chang HW, Hibberd MC, Subramanian S, Kang G, Bessong PO, Lima AAM, Kosek MN, Petri WA, Rodionov DA, Arzamasov AA, Leyn SA, Osterman AL, Huq S, Mostafa I, Islam M, Mahfuz M, Haque R, Ahmed T, Barratt MJ, Gordon JI. A sparse covarying unit that describes healthy and impaired human gut microbiota development. Science. 2019;365(6449).
41. Wilson WA, Roach PJ, Montero M, Baroja-Fernández E, Muñoz FJ, Eydallin G, Viale AM, Pozueta-Romero J. Regulation of glycogen metabolism in yeast and bacteria. FEMS Microbiol Rev. 2010;34(6):952–85.
42. Sekar K, Linker SM, Nguyen J, Grünhagen A, Stocker R, Sauer U. Bacterial Glycogen Provides Short-Term Benefits in Changing Environments. Appl Environ Microbiol. 2020;86(9).
43. Charbonneau MR, O’Donnell D, Blanton LV, Totten SM, Davis JC, Barratt MJ, Cheng J, Guruge J, Talcott M, Bain JR, Muehlbauer MJ, Ilkayeva O, Wu C, Struckmeyer T, Barile D, Mangani C, Jorgensen J, Fan YM, Maleta K, Dewey KG, Ashorn P, Newgard CB, Lebrilla C, Mills DA, Gordon JI. Sialylated Milk Oligosaccharides Promote Microbiota-Dependent Growth in Models of Infant Undernutrition. Cell. 2016;164(5):859–71.
[CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2016.01.024&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26898329&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F04%2F19%2F2022.04.19.22273587.atom) 44. 44.Jonker H, Capelle N, Lanes A, Wen SW, Walker M, Corsi DJ. Maternal folic acid supplementation and infant birthweight in low- and middle-income countries: A systematic review. Matern Child Nutr. 2020;16(1):e12895. 45. 45.Strand TA, Ulak M, Kvestad I, Henjum S, Ulvik A, Shrestha M, Thorne-Lyman AL, Ueland PM, Shrestha PS, Chandyo RK. Maternal and infant vitamin B12 status during infancy predict linear growth at 5 years. Pediatr Res. 2018;84(5):611–8. 46. 46.Strand TA, Taneja S, Kumar T, Manger MS, Refsum H, Yajnik CS, Bhandari N. Vitamin B-12, folic acid, and growth in 6- to 30-month-old children: a randomized controlled trial. Pediatrics. 2015;135(4):e918–26. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTA6InBlZGlhdHJpY3MiO3M6NToicmVzaWQiO3M6MTA6IjEzNS80L2U5MTgiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8wNC8xOS8yMDIyLjA0LjE5LjIyMjczNTg3LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 47. 47.Strand TA, Ulak M, Hysing M, Ranjitkar S, Kvestad I, Shrestha M, Ueland PM, McCann A, Shrestha PS, Shrestha LS, Chandyo RK. Effects of vitamin B12 supplementation on neurodevelopment and growth in Nepalese Infants: A randomized controlled trial. PLoS Med. 2020;17(12):e1003430. 48. 48.Vänni P, Tejesvi MV, Ainonen S, Renko M, Korpela K, Salo J, Paalanne N, Tapiainen T. Delivery mode and perinatal antibiotics influence the predicted metabolic pathways of the gut microbiome. Sci Rep. 2021;11(1):17483. 49. 49.Wang L, Mu S, Xu X, Shi Z, Shen L. Effects of dietary nucleotide supplementation on growth in infants: a meta-analysis of randomized controlled trials. Eur J Nutr. 
2019;58(3):1213–21. 50. 50.Kamng’ona AW, Young R, Arnold CD, Kortekangas E, Patson N, Jorgensen JM, Prado EL, Chaima D, Malamba C, Ashorn U, Fan YM, Cheung YB, Ashorn P, Maleta K, Dewey KG. The association of gut microbiota characteristics in Malawian infants with growth and inflammation. Sci Rep. 2019;9(1):12893. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41598-019-49274-y&link_type=DOI) 51. 51.Bourke CD, Gough EK, Pimundu G, Shonhai A, Berejena C, Terry L, Baumard L, Choudhry N, Karmali Y, Bwakura-Dangarembizi M, Musiime V, Lutaakome J, Kekitiinwa A, Mutasa K, Szubert AJ, Spyer MJ, Deayton JR, Glass M, Geum HM, Pardieu C, Gibb DM, Klein N, Edens TJ, Walker AS, Manges AR, Prendergast AJ. Cotrimoxazole reduces systemic inflammation in HIV infection by altering the gut microbiome and immune activation. Sci Transl Med. 2019;11(486). 52. 52.D’Souza AW, Moodley-Govender E, Berla B, Kelkar T, Wang B, Sun X, Daniels B, Coutsoudis A, Trehan I, Dantas G. Cotrimoxazole Prophylaxis Increases Resistance Gene Prevalence and a-Diversity but Decreases 1-Diversity in the Gut Microbiome of Human Immunodeficiency Virus-Exposed, Uninfected Infants. Clin Infect Dis. 2020;71(11):2858–68. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/cid/ciz1186&link_type=DOI) 53. 53.Prendergast AJ, Chasekwa B, Rukobo S, Govha M, Mutasa K, Ntozini R, Humphrey JH. Intestinal Damage and Inflammatory Biomarkers in Human Immunodeficiency Virus (HIV)-Exposed and HIV-Infected Zimbabwean Infants. J Infect Dis. 2017;216(6):651–61. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/infdis/jix367&link_type=DOI) 54. 54.Jalbert E, Williamson KM, Kroehl ME, Johnson MJ, Cutland C, Madhi SA, Nunes MC, Weinberg A. HIV-Exposed Uninfected Infants Have Increased Regulatory T Cells That Correlate With Decreased T Cell Function. Front Immunol. 2019;10:595. 55. 
55. Dirajlal-Fargo S, Mussi-Pinhata MM, Weinberg A, Yu Q, Cohen R, Harris DR, Bowman E, Gabriel J, Kulkarni M, Funderburg N, Chakhtoura N, McComsey GA. HIV-exposed-uninfected infants have increased inflammation and monocyte activation. AIDS. 2019;33(5):845–53.
56. Pasolli E, Asnicar F, Manara S, Zolfo M, Karcher N, Armanini F, Beghini F, Manghi P, Tett A, Ghensi P, Collado MC, Rice BL, DuLong C, Morgan XC, Golden CD, Quince C, Huttenhower C, Segata N. Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle. Cell. 2019;176(3):649–62.e20.
57. Gough EK, Prendergast AJ, Mutasa KE, Stoltzfus RJ, Manges AR, SHINE Trial Team. Assessing the Intestinal Microbiota in the SHINE Trial. Clin Infect Dis. 2015;61 Suppl 7:S738–44.
58. Humphrey JH, Jones AD, Manges A, Mangwadu G, Maluccio JA, Mbuya MN, Moulton LH, Ntozini R, Prendergast AJ, Stoltzfus RJ, Tielsch JM, SHINE Trial Team. The Sanitation Hygiene Infant Nutrition Efficacy (SHINE) Trial: Rationale, Design, and Methods. Clin Infect Dis. 2015;61 Suppl 7:S685–702.
59. Robertson RC, Church JA, Edens TJ, Mutasa K, Min Geum H, Baharmand I, Gill SK, Ntozini R, Chasekwa B, Carr L, Majo FD, Kirkpatrick BD, Lee B, Moulton LH, Humphrey JH, Prendergast AJ, Manges AR, SHINE Trial Team. The fecal microbiome and rotavirus vaccine immunogenicity in rural Zimbabwean infants. Vaccine. 2021;39(38):5391–400.
60. Beghini F, McIver LJ, Blanco-Míguez A, Dubois L, Asnicar F, Maharjan S, Mailyan A, Manghi P, Scholz M, Thomas AM, Valles-Colomer M, Weingart G, Zhang Y, Zolfo M, Huttenhower C, Franzosa EA, Segata N. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. Elife. 2021;10.
61. Mallick H, Rahnavard A, McIver LJ, Ma S, Zhang Y, Nguyen LH, Tickle TL, Weingart G, Ren B, Schwager EH, Chatterjee S, Thompson KN, Wilkinson JE, Subramanian A, Lu Y, Waldron L, Paulson JN, Franzosa EA, Bravo HC, Huttenhower C. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput Biol. 2021;17(11):e1009442.
62. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016; New York, NY, USA: ACM.