Gastroesophageal reflux disease is associated with differences in the allograft microbiome, microbial density and inflammation in lung transplantation ====================================================================================================================================================== * Pierre H.H. Schneeberger * Chen Yang Kevin Zhang * Jessica Santilli * Bo Chen * Wei Xu * Youngho Lee * Zonelle Wijesinha * Elaine Reguera-Nuñez * Noelle Yee * Musawir Ahmed * Kristen Boonstra * Rayoun Ramendra * Courtney W. Frankel * Scott M. Palmer * Jamie L. Todd * Tereza Martinu * Bryan Coburn ## Abstract **Rationale** Gastroesophageal reflux disease (GERD) may affect lung allograft inflammation and function through its effects on allograft microbial community composition in lung transplant recipients. **Objectives** Our objective was to compare the allograft microbiota in lung transplant recipients with or without clinically diagnosed GERD in the first post-transplant year, and assess associations between GERD, allograft microbiota, inflammation and acute and chronic lung allograft dysfunction (ALAD/CLAD). **Methods** 268 bronchoalveolar lavage samples were collected from 75 lung transplant recipients at a single transplant centre every 3 months post-transplant for 1 year. Ten transplant recipients from a separate transplant centre provided samples pre/post-anti-reflux Nissen fundoplication surgery. Microbial community composition and density were measured using 16S rRNA gene sequencing and qPCR, respectively and inflammatory markers and bile acids were quantified. **Measurements and Main Results** We observed three community composition profiles (labelled community state types, CSTs 1-3). Transplant recipients with GERD were more likely to have CST1, characterized by high bacterial density and relative abundance of the oropharyngeal colonizing genera *Prevotella* and *Veillonella*. GERD was associated with more frequent transition to CST1. CST1 was associated with lower per-bacteria inflammatory cytokine levels than the pathogen-dominated CST3. Time-dependant models revealed associations between CST3 and development of ALAD/CLAD. Nissen fundoplication decreased bacterial load and pro-inflammatory cytokines. **Conclusion** GERD was associated with a high bacterial density, *Prevotella/Veillonella* dominated CST1. CST3, but not CST1 or GERD, was associated with inflammation and early development of ALAD/CLAD. Nissen fundoplication was associated with decreases in microbial density in BALF samples, especially the CST1-specific genus, *Prevotella*. Keywords * lung microbiota * gastroesophageal reflux disease * bronchoalveolar lavage * chronic lung allograft dysfunction * longitudinal microbial data * lung allograft inflammation ## Introduction Gastroesophageal reflux disease (GERD) is common after lung transplantation and is characterized by reflux of gastric contents, including gastric acid, mucus, digestive enzymes and bile acids. This refluxate can be aspirated by some patients, leading to lung allograft injury and inflammation. Indeed, we recently reported an association between GERD, levels of taurocholic acid (TCA, a bile acid) and inflammatory markers in the bronchoalveolar lavage fluid (BALF), and acute lung allograft dysfunction (ALAD) at 3-months post-transplant (1). The association of GERD and chronic lung allograft dysfunction (CLAD) – the leading cause of death in the late post-transplant period (2) – is inconsistent, however, with some, but not all studies indicating GERD is a risk factor for CLAD (3, 4). The composition of microbial communities in the lung allograft after transplantation varies between individuals and is associated both with CLAD (5, 6) and acute inflammation (7, 8). Composition of the allograft microbial community may follow dynamics similar to those presented in the ecological concept of island biogeography (9) in which community composition is the product of immigration and extinction rates (10, 11). These rates can be affected by multiple factors including individual characteristics such as proximity of the trachea to the oropharynx, and, potentially, GERD. The composition of allograft microbial communities is associated with distinct patterns of soluble biomarkers in BALF, with communities dominated by *Proteobacteria* or *Firmicutes* being associated with inflammation and *Bacteroidetes* domination associated with markers of airway remodelling (7, 8). We reasoned that allograft microbial community composition, GERD and inflammation may be associated in the post-transplant period, and that models incorporating both GERD and community composition may predict inflammation, ALAD and CLAD better than models including only individual predictors. To address this, we compared lung allograft microbial community composition from BALF between individuals with and without GERD in the first year after transplant and assessed GERD/microbial-community/inflammation associations, including longitudinal comparisons. We assessed GERD and microbial community composition as predictors of inflammation, ALAD, and CLAD in this cohort. Finally, we measured changes in microbial density and inflammatory cytokines levels before and after Nissen fundoplication in a secondary cohort of lung transplant recipients to determine whether surgical treatment of GERD altered microbial density and inflammation. ## Results ### Cohort features Baseline patient characteristics of the GERD cohort were comparable between GERD and no-GERD patients (**Table 1**) and inclusion/exclusion criteria are reported in **Figure S1**. Transbronchial biopsy pathology results (A-grades and B-grades), BALF culture results, CLAD diagnoses, and follow-up times are reported for each patient in **Figure S2**. Baseline characteristics of Nissen cohort patients are reported in **Table 2**. ![FIGURE S1](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F7.medium.gif) [FIGURE S1](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F7) FIGURE S1 ![FIGURE S2](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F8.medium.gif) [FIGURE S2](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F8) FIGURE S2 View this table: [Table 1.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/T1) Table 1. Baseline patient characteristics of the main cohort. View this table: [Table 2.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/T2) Table 2. Characteristics of patients undergoing Nissen fundoplication to treat GERD. ### Comparison of bacterial community composition in BALF between lung transplant recipients with and without GERD We first assessed bacterial community composition by 16S rRNA gene sequencing (**Figure 1**). Using a Bray-Curtis dissimilarity matrix, we identified three distinct underlying community clusters, or community state types 1-3 (CST1-3, **Figure 1A**). GERD cases and no-GERD controls had distinguishable community composition distributions (PERMANOVA *P* < 0.001, **Figure 1B**) but GERD status explained only a small amount of compositional variance (R2 = 0.013). The proportion of GERD cases with CST1 was greater than that of controls, while the proportion of CST2 was less than controls (*P* = 0.014, OR for CST1 [95% CI] = 2.4 [1.2-4.4]) but a similar proportion of patients in both groups presented CST3 (*P* = 0.12, OR for CST1 [95% CI] = 1.7 [0.9-3.2], **Figure 1C**). ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F1) Figure 1. Composition of the lung microbiota in post-transplant patients with and without GERD. **A**. Classification of BALF samples based on Bray-Curtis dissimilarity measure. **B**. Non-metric multidimensional scaling plots based on Bray-Curtis dissimilarity measure comparing spatial ordination of GERD vs no-GERD patients, CST1-3 samples, and showing genera contributing most to this classification system. **C**. Proportion of CSTs by GERD status. GERD = gastroesophageal reflux disease; BALF = bronchoalveolar lavage fluid; CST (1-3) = community state types (1 to 3). #### Association with Bile Acids As previously reported (1), TCA was significantly associated with GERD at 3 months post-transplant in this cohort. In this present study, we assessed CST and bile acid associations at 3 months post-transplant. The CSTs were not significantly associated with bile acid concentrations, but weakly trend towards greater TCA and GCA concentrations in CST1 and CST3 (**Figure S3**). ![FIGURE S3](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F9.medium.gif) [FIGURE S3](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F9) FIGURE S3 Microbial community composition, bacterial density and within-sample (alpha) diversity differed by CST (**Figure 2**). CST1 was characterized by the high relative abundance of oropharyngeal taxa including *Prevotella* and *Veillonella*. The genera *Streptococcus* and *Tanerella*, were significantly enriched in CST2, while CST3 was characterized by an enrichment of genera with commonly pathogenic species *Pseudomonas* and *Staphylococcus* (**Figure 2A**). The BALF CSTs were distinguished most strongly by the genera *Prevotella, Veillonella, Streptococcus, Pseudomonas*, and *Staphylococcus* with mean decreases of classification accuracy of 0.102, 0.041, 0.025, 0.02, and 0.018, in a random forest model that omitted these taxa, respectively (model classification accuracy=82% and Cohen’s Kappa=73.9%). ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F2) Figure 2. CST-specific microbial features. **A**. Genera enriched in each CST, identified using the LefSe (Linear discriminant analysis (LDA) Effect Size) pipeline. A higher LDA score indicates a higher difference between groups. **B**. Comparison of total bacterial density based on 16S rRNA gene copies/mL BALF and *Prevotella*-specific quantitative polymerase chain reaction, by CST. **C**. Comparison of alpha diversity indices, by CST. Group comparison was tested using Mann-Whitney tests. CST = community state type; BALF = bronchoalveolar lavage fluid. The median bacterial density (16S rRNA gene copies/mL BALF) was ∼10-fold higher in CST1 than in CST 2 or 3, as was the absolute abundance of the CST1-associated genus *Prevotella* as measured by 16S rRNA gene and *Prevotella*-specific qPCR, respectively (**Figure 2B**). Although taxonomic richness was highest in CST1, composite (Shannon) diversity was lower than CSTs 2 and 3 due to the high relative abundance of *Prevotella*. CST2 had the greatest evenness/lowest tendency towards dominated communities, while CST3 was characterized by high variability in density, diversity and taxonomic dominance, with many samples highly dominated by a single taxon (**Figure 2C**). The most abundant genera in the highly dominated communities in CST3 were *Staphylococcus* and *Pseudomonas*. Thus, CST1 seems to represent a high-bacterial-density state dominated by oropharyngeal taxa, CST2 a low-density state and CST3 a variable-density state commonly characterized by dominance with pathogenic taxa. Although the proportion of samples in each CST differed by GERD status, compositional differences between GERD and no-GERD samples within CSTs were not observed. #### Longitudinal comparisons in the first post-transplant year We next assessed whether bacterial density, alpha diversity, CST membership and their longitudinal stability differed between GERD cases and no-GERD controls. Using a generalized estimating equation (GEE) model, we assessed stability in bacterial density over time and observed that GERD patients had significantly greater variability in microbial density than controls (coefficient of correlation for cases: ρ = 0.165, *P* = 0.332 and for controls ρ = 0.153, *P* = 0.01, a *P* < 0.05 indicating stability over time, **Figure 3A**). Shannon diversity and Berger-Parker dominance were consistent over the first year in controls (meanSDI(no-GERD) = 2.35, CI95 [2.26-2.45]; meanBP(no-GERD) = 0.33, CI95[0.31-0.36]). In GERD patients, both indices were significantly lower than controls at 3 months (MW; meanSDI(GERD) = 1.98, meanBP(GERD) = 0.42, *P* = 0.007 and 0.025, respectively*)* but not at later timepoints (meanSDI(GERD) (6-12mts) = 2.32, CI95 [2.18-2.48]; meanBP(GERD) (6-12mts) = 0.33, CI95 [0.29-0.36], **Figure 3B**), indicating that recovery of microbial diversity is delayed in patients with GERD. ![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F3.medium.gif) [Figure 3.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F3) Figure 3. Longitudinal analysis of the lung microbiota 1 year after transplantation. **A**. Comparison of total bacterial density variation by GERD status. **B**. Comparison of alpha diversity indices changes over time by GERD status. **C**. Patient-specific variation measured using the Bray-Curtis dissimilarity measure between consecutive samples, by GERD status. **D**. Transitions of CSTs between 3-6-, 6-9-, and 9-12-months post-transplant, by GERD status. **E**. Changes in total bacterial density associated with transition to the different CSTs. **F**. Changes in *Prevotella* density associated with CST transitions. Compositional variability over time was greatest in both groups at early sampling intervals (3-6 months) and stabilized over time (**Figure 3C**). While transitions between CSTs were common in both groups, transitions to CST1 were most common in GERD, and CST1 was more stable in GERD patients than controls (**Figure 3D**). Transitions to CST1 were associated with significant increases in absolute total bacterial abundance (**Figure 3E**) and absolute abundance of the CST1-associated genus *Prevotella* (**Figure 3F**). Conversely, transitions from CST1 were associated with significant decreases in absolute total bacterial and *Prevotella* abundance. Transitions between CST 2 and 3 were not associated with changes in absolute bacterial abundance or absolute abundance of *Prevotella*. Instability in bacterial abundance in GERD patients during the first year is thus driven by transitions to and from CST1. ### GERD and CST as predictors of inflammation, ALAD, CLAD, and death We have previously reported differences in inflammatory cytokine levels, lung allograft dysfunction and GERD in this cohort (1). We sought to assess whether the addition of microbial community composition and density (CST and 16s rRNA gene copy density) affected associations between allograft inflammation, ALAD and CLAD. #### Inflammation 16S density was strongly associated with individual proinflammatory cytokine levels independently of GERD status (**Figure 4A**). Inflammation was defined as having at least two out of four pro-inflammatory cytokines (IL-1α, IL-1β, IL-6, and IL-8) in the 75th percentile, based on a similar previously published approach (12). Samples with and without inflammation showed distinct distributions of proinflammatory cytokine concentrations (**Figure S4A**). Inflammation was associated with higher bacterial density at 3-, 6-, and 12-months post-transplant (**Figure S4B**). Despite bacterial density being significantly higher in patients with CST1, the proportion of patients with inflammation was significantly higher in patients with CST3 when compared to CST1 (Fisher’s Exact OR [95% CI] = 2.3 [1.2-4.7], *P* = 0.022). Although not significant, this CST3-inflammation relationship held when compared to CST2 (OR [95% CI] = 2.1 [1.0-4.4], *P* = 0.090) as well. There were no differences in the proportion of samples with inflammation from patients with and without GERD within CST1 (Fisher’s Exact OR [95% CI] = 1.6 [0.6-4.0], *P* = 0.462), CST2 (Fisher’s Exact OR [95% CI] = 1.6 [0.5-5.3], *P* = 0.510), and CST3 (Fisher’s Exact OR [95% CI] = 1.4 [0.5-3.9], *P* = 0.596) (**Figure 4B**). Comparison of individual proinflammatory cytokines showed no differences between CSTs, with the exception of CST2 which was associated with lower IL-8 concentrations compared to CST3 (**Figure S4C**). Notably, while bacterial density/inflammation correlations were independent of GERD diagnosis, they differed significantly by CST (**Figure 5**). All four proinflammatory cytokine levels (IL-1α, IL-1β, IL-6 and IL-8) were associated with consistently lower bacterial density for CST3 compared to CST1 (*P*-values for differences of regression intercepts between CSTs were 0.044, 0.014, 0.022 and <0.0001, respectively). This may in part explain why, although GERD is associated with higher and more variable bacterial density in the first transplant year, it is not associated with increased inflammation in our cohort. ![FIGURE S4](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F10.medium.gif) [FIGURE S4](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F10) FIGURE S4 ![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F4.medium.gif) [Figure 4.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F4) Figure 4. Associations between GERD status, microbial composition and density, and inflammation. **A**. Linear regression analysis of total bacterial density and individual pro-inflammatory cytokines. **B**. Proportion of samples with inflammation - defined by two or more cytokines in the top 75th percentile -, by CST and further stratified by GERD status. ![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F5.medium.gif) [Figure 5.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F5) Figure 5. Association between total bacterial density and individual pro-inflammatory cytokines, stratified by community state type (CST1 = green; CST2 = yellow; CST3 = red). #### ALAD, CLAD and Death At three months, a diagnosis of GERD was not significantly associated with concurrent ALAD (Fisher’s Exact OR [95% CI] = 1.4 [0.4-6.5], *P* = 0.692). However, ALAD was more common in patients with CST3 than with CST1 (Fisher’s Exact OR [95% CI] = Infinity [1.5-Infinity], *P* = 0.030), but not different from CST2 (Fisher’s Exact OR [95% CI] = 1.8 [0.3-9.3], *P* = 0.694). A Cox proportional hazards model was used to assess the association of GERD status and the number of CST3 events at different intervals with time to CLAD and death, as shown in **Table 3**. One patient was excluded from this analysis due to missing metadata. GERD status was not associated with CLAD. However, patients with higher cumulative number of CST3 events at 6-, 9-, and 12-months post-transplant were more likely to develop CLAD compared to CST1 or CST2, even when adjusted for sex, age at transplant, CMV serostatus mismatch, and primary disease. This relationship also held at 3 months post-transplant when adjusted for sex and CMV mismatch. GERD status and the number of CST3 events were not significantly associated with death. View this table: [Table 3.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/T3) Table 3. Cox proportional hazards associated with GERD and events of community state type 3 (CST3). GERD status and the number of CST3 events at different intervals were used to predict CLAD and death, and adjusted for potential confounders. HR = hazard ratios; G+ = GERD positive; mts = months; EV = events; tx = transplant; CMV = cytomegalovirus ### Association of Nissen Fundoplication with Inflammatory cytokines and bacterial density In a cohort of 10-patients who underwent Nissen fundoplication for symptomatic GERD, we assessed pre/post-procedure bacterial load and pro-inflammatory cytokine levels in BALF supernatants. Similarly to our previous report (1), significant decreases in pro-inflammatory cytokines were observed after fundoplication (**Figure 6A**). While no significant changes were observed in overall bacterial loads pre and post fundoplication, individuals with the greatest decreases in inflammatory cytokines also had the greatest decreases in bacterial density as measured by 16S rRNA gene qPCR, and had the highest baseline densities of the CST1-associated genus *Prevotella*. This suggests that fundoplication may have direct or indirect effects on the allograft microbiota that vary by baseline community composition (**Figure 6B**). ![Figure 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/09/10/2021.09.03.21263067/F6.medium.gif) [Figure 6.](http://medrxiv.org/content/early/2021/09/10/2021.09.03.21263067/F6) Figure 6. Pre/Post-Nissen fundoplication comparison of bacterial density and individual cytokines. **A**. Changes in total bacterial density and cytokines, by patient. **B**. Regression between individual cytokines and total bacterial density, weighted by pre-Nissen density of the CST1-associated genus, *Prevotella*. ## Discussion This study is the first to systematically compare the allograft microbiota in lung transplant recipients with and without GERD. We were able to compare the relationship between microbial community composition, inflammation, ALAD and CLAD in lung allograft recipients with and without GERD, while also assessing the effects of anti-reflux surgery on allograft bacterial density. We believe we add the following observations to our understanding of allograft microbiota-host associations in the context of GERD: 1. The strongest feature of allograft community composition is the underlying compositional clustering of the allograft microbiota. We observed three major clusters, which we referred to as CST 1-3, with no significant observed microbial community compositional differences between GERD cases and controls within these clusters; 2. The proportion of samples with these CSTs differed between GERD and control groups, with the high-density, oropharyngeal-taxa-enriched CST1 being more common in GERD than in no-GERD controls; 3. GERD was associated with a more variable community composition and bacterial abundance in the first-year post-transplant, with more frequent transitions between lower density CST2/3 and CST1 in GERD cases than controls; 4. CST-inflammation associations exceeded any association between GERD status and inflammation in this cohort; 5. The CST most common in lung allograft recipients with GERD (CST1) was associated with lower per-bacteria inflammatory cytokine levels than the pathogen-dominated CST3; 6. Models used to predict inflammation, ALAD and CLAD revealed CST – particularly CST3, but not GERD-status associations; 7. Nissen fundoplication was associated with strongly correlated decreases in bacterial load and pro-inflammatory cytokines, especially in individuals with high pre-fundoplication density of the CST1-associated genus *Prevotella*. Our observation of a clear structure underlying beta-diversity of the lung allograft confirms the results of others (5, 8). We applied the term ‘community state types’ to describe these distinct clusters as is commonly done to describe the microbiota of the female genital tract (13), which has a similarly limited number of taxa and discrete compositional clustering compared to other human sample types such as gut/stool. We note that there are taxonomic differences in the clusters we observed compared to those reported by others, which may in part be due to technical factors (e.g. sequencing methods and sequence variant annotation) or differences in methods to identify and remove taxa suspected to be reagent contaminants, which is especially consequential for low density samples in CST2 (14). Nonetheless, we noted a very consistent composition for CST1, dominated by oropharyngeal taxa and CST3 characterized by the prevalence and relative abundance of putative pathogens. Thus, we feel our findings with respect to community composition are likely to be generalizable to other cohorts. We were unable to assess whether enrichment for CST1 amongst individuals with GERD was due to actual aspiration of refluxed oropharyngeal content in this observational study. However, we found that decreases in inflammatory cytokines after Nissen fundoplication were greatest in individuals with significant decreases in bacterial density and high initial *Prevotella* absolute abundance in BALF supernatant. This is consistent with a model in which GERD is associated with aspiration of oropharyngeal taxa, leading to inflammation. This hypothesis requires both replication and experimental validation. CST was not significantly associated with bile acid concentrations, although we cannot exclude the possibility that this association exists and future datasets can reassess this in larger cohorts. Conjugated bile acids, like TCA and GCA, have been associated with poor clinical outcomes and bacterial infections (1, 15). This may explain the weak trends in CST1 and CST3, CSTs that contain aspiration-related microbes and pathogens. A similar number of patients with and without GERD had CST3, suggesting that this CST arises independently of reflux. CST3 was characterized by increased relative abundance of *Pseudomonas* and *Staphylococcus*, two taxa with known pathogenic species. Despite CST1 being enriched for GERD and having higher bacterial density, CST1 and CST2 were similarly associated with decreased rates of inflammation, CLAD, and ALAD compared to CST3. The association of pathogen-communities, inflammation and host status replicates the findings of others (5, 8). Importantly, our study highlights how the relationship between GERD and clinical outcomes may be confounded by the underlying microbial composition of the lung, since the GERD-associated CST1 correlated with less inflammation, ALAD and CLAD. Future studies assessing relationships between GERD and host status or prognosis should control for the potential confounding by underlying CST or analyse it as an interacting variable. Our study has several important limitations. The primary analysis was based on a case-control, retrospective, single-centre cohort, requiring validation. In addition, the cohort was composed of patients with clear GERD or no-GERD testing results, including individuals with numbers of reflux episodes in 24 hours of >48 or <23. Our findings, therefore, may not apply to individuals with intermittent or less frequent/severe reflux. Importantly, this limitation would be expected to exaggerate the differences between cases and controls, and thus our observation that underlying CST structure is more strongly associated with host status and CLAD than GERD is likely true, as our cohort represents an ‘extreme’ GERD phenotype. In addition, we did not comprehensively assess the relationships between CST, GERD and a wide array of soluble or other host factors in BALF, and restricted our analysis to four pro-inflammatory cytokines, ALAD and CLAD. Associations between GERD and other soluble or host-derived factors in BALF may be present. Finally, our small Nissen fundoplication cohort was assessed without sequencing, since we had only BALF supernatant available, which is compositionally different from BALF pellets or raw BALF (uncentrifuged or fractionated). In summary, increased bacterial density and taxonomic composition dominated by oropharyngeal taxa is associated with GERD. This microbial state was not associated with increased risk of CLAD or death and surgical treatment of GERD showed correlation between changes in bacterial density and levels of pro-inflammatory cytokines. Similarly to what was recently published, we showed that a pathogen-dominated community independent of GERD status is associated with increased risk of lung allograft dysfunction. Future studies should investigate the exact effect of GERD treatment and whether it promotes or protects from colonization of the lung ecological niche by CLAD-associated microbial communities. ## Methods ### Cohort design This study was approved by the Toronto University Health Network Research Ethics Board (REB#15-9698-AE) and the Duke University Internal Review Board (Pro00013378). The cohorts were derived from two previously published cohorts, leveraging the cytokine and bile acid data that was generated by this prior study (1). #### GERD Cohort (Figure S1) Patients from the previously published GERD cohort were included if they had at least one available raw (unprocessed) BALF sample obtained in the first-year post-transplant (1). In summary, this was a nested case-control cohort, drawn from all adults who underwent bilateral or single lung transplantation at Toronto General Hospital between 2010 and 2015. Patients were included if they had 24-hour pH/impedance probe testing and at least one raw and supernatant BALF sample collected and preserved in the first-year post-transplant. Patients whose 24-hour pH/impedance study showed ≥48 reflux episodes were included as GERD cases. For no-GERD controls, patients with < 23 reflux episodes were excluded if they reported symptoms of reflux during the study or had underlying disease commonly associated with GERD (e.g. scleroderma, hiatal hernia, Barrett’s esophagus, history of Nissen fundoplication or pyloroplasty), and matched 2:1 to the GERD patients by transplant type (single vs. bilateral). For the selected patients, available BALF samples closest to the 3-, 6-, 9-, and 12-month time points of surveillance bronchoscopies were obtained. #### Nissen Cohort Patients from the previously published Nissen cohort were included if they had sufficient pre- and post-Nissen BALF supernatant remaining for analysis (1). In summary, this was a retrospective cohort drawn from all adults who underwent bilateral lung transplantation at Duke University Medical Center between 2005 and 2008. 10 patients were included who underwent Nissen fundoplication within 6 months post-transplant and had available paired BALF samples obtained within 3 months before and after Nissen fundoplication. ### Clinical Standards of Care and Definitions Standard of care for lung transplant recipients was delivered as described previously by the Toronto and Duke programs (16, 17). In the first year after lung transplant, both programs performed surveillance bronchoscopies to obtain BALF and transbronchial biopsies at pre-specified time points (Toronto: 0.5, 1.5, 3, 6, 9, 12 months; Duke: 1, 3, 6, 9, 12 months) and as clinically indicated. During the study period, the Toronto BAL protocol was to instill 50 mL of normal saline twice, while the Duke protocol consisted of one 60 mL instillation, the right middle lobe being the default location at both institutions. At Toronto, patients were treated with proton pump inhibitors based on symptoms and routine pH/impedance studies performed at 3 months post-transplant. No patients in the cohort underwent Nissen fundoplication in the first post-transplant year. At Duke, patients were treated with proton pump inhibitors regardless of reflux studies. Patients underwent routine 24-hour pH monitor prior to transplantation, and repeat testing post-transplant if the pre-transplant study was negative. Any positive study, defined as <0.9% proximal acid contact time and/or <4.2% distal acid contact time, led to referral for Nissen fundoplication. Acute lung allograft dysfunction (ALAD) was defined as a ≥10% decline in measured forced expiratory volume in 1 second (FEV1) compared to the higher of the two preceding FEV1 measurements. CLAD was defined as per latest consensus report from the International Society for Heart and Lung Transplantation (2). ### BAL sample processing In Toronto, raw BALF samples were aliquoted and stored at -80 °C. The remaining BALF was centrifuged at 3184 g for 20 minutes and the supernatant was also stored at -80 °C. At Duke, BALF samples were centrifuged at 1750 g for 10 minutes at 4°C and the supernatant was stored at -80 °C. ### Analysis of BALF supernatant samples Markers of innate immune activation, including IL-1α, IL-1β, IL-6, and IL-8, were measured in the BALF supernatant using a custom-designed multiplex assay (R&D), as reported previously (1). Taurocholic acid (TCA), one of the most abundant bile acids, was measured using LC-MS/MS mass spectrometry, as reported previously (1). ### DNA isolation and quantification from BALF samples Nucleic acids were isolated from 250 µl of raw BALF samples (Toronto) or BALF supernatant (Duke) using a PowerSoil DNA isolation kit (MO-BIO; Carlsbad, CA, USA) following the manufacturer’s instructions except for the elution step which was done in 60 µl purified water. Densities were measured using a 16S quantitative polymerase chain reaction (qPCR;(18)) and a standard (*Pseudomonas aeruginosa* str. PAO1), and the number of 16S copies /ml was inferred using the URI Genomics & Sequencing Center online calculator ([http://cels.uri.edu/gsc/cndna.html](http://cels.uri.edu/gsc/cndna.html)). 16S qPCR primers and conditions are described in the online supplement. qPCR reactions were carried out in a volume of 11 µl using the TaqMan Gene Expression Master Mix (Applied Biosystems, Foster City, CA, USA) according to the manufacturer’s protocol. ### 16S rRNA gene sequencing of DNA from raw BALF samples The V4 hypervariable region of the 16S rRNA gene is amplified using a universal forward sequencing primer and a uniquely barcoded reverse sequencing primer to allow for multiplexing (19). Amplification reactions are performed using 12.5 µL of KAPA2G Robust HotStart ReadyMix (KAPA Biosystems), 1.5 µL of 10 µM forward and reverse primers, 7.5 µL of sterile water and 2 µL of DNA. The V4 region was amplified by cycling the reaction at 95°C for 3 minutes, 28x cycles of 95°C for 15 seconds, 50°C for 15 seconds and 72°C for 15 seconds, followed by a 5-minute 72°C extension. All amplification reactions were done in triplicate, checked on a 1% agarose TBE gel, and then pooled to reduce amplification bias. Pooled triplicates were quantified using PicoGreen (Thermo Fisher) and combined by even concentrations. The library was then purified using Ampure XP beads and loaded on to the Illumina MiSeq for sequencing, according to manufacturer instructions (Illumina, San Diego, CA). Sequencing is performed using the V2 (150bp x 2) chemistry. DNA from a single-species, *Pseudomonas aeruginosa*, from a mock community (Zymo Microbial Standard: [https://www.zymoresearch.de/zymobiomics-community-standard](https://www.zymoresearch.de/zymobiomics-community-standard)), and a template-free negative control were included in the sequencing run. Two individual libraries were prepared and sequenced from each sample. ### Analysis of the bacterial microbiome The UNOISE pipeline, available through USEARCH v11.0.667 and vsearch v2.10.4, was used for sequence analysis (20-22). The last base was removed from all sequences using cutadapt v.1.18. Sequences were assembled and quality trimmed using –fastq_mergepairs with a –fastq_trunctail set at 5, a –fastq_minqual set at 5, and minimum and maximum assemble lengths set at 243 and 263 (+/-10 from the mean) base pairs. Sequences were first de-replicated and sorted to remove singletons, then denoised and chimeras were removed using the unoise3 command. Assembled sequences were mapped back to the chimera-free denoised sequences at 97% identity OTUs. Taxonomy assignment was executed using SINTAX (23) available through USEARCH, and the UNOISE compatible Ribosomal Database Project (RDP) database version 16, with a minimum confidence cutoff of 0.8 (24). OTU sequences were aligned using align_seqs.py v.1.9.1 through QIIME1 (25). Sequences that did not align were removed from the dataset and a phylogenetic tree of the filtered aligned sequence data was made using FastTree (26). ### Post-profiling filtering approaches 16S rRNA gene sequences from contaminants is a recurrent issue when analysing BALF samples (27) and we controlled for sequencing contaminants as described in Schneeberger *et al*. (14). Briefly, OTUs which were negatively correlated with bacterial density measured with qPCR across all samples were considered as contaminants. In addition, OTUs which were only present in one sequencing replicate were removed. ### Statistical analysis Bray-Curtis dissimilarity indices were calculated using the “dissimilarity” function from the Vegan R package version 2.5-2 (28). The metaMDS function was used with the options “k=6” and “try=500” to generate NMDS ordination plots. Most important features for CST classification were identified using the random forest model from the randomForest package version 4.6-14 based on (29) and coordinates from the most important classification features (= genus) were extracted manually from the metaMDS output. The correlation coefficients assessing stability in bacterial density were estimated by generalized estimating equations (GEE) from the geepack R package version 1.3-2 (30). Odds ratios and independence were calculated using the Fisher’s exact test. Taxonomic differences between CSTs were identified using the LEfSe pipeline (31). Mann-Whitney tests, Wilcoxon signed rank tests, and Spearman correlations were calculated using XLSTAT 2019 (Addinsoft: Paris, France). Regression analyses were conducted in Graphpad Prism version 9.0.0 (San Diege, CA, USA). Plots were generated using OriginPro 2017 (Northampton, MA, USA) and the R packages ggplot2 version 3.0.0 (32), reshape2 version 1.4.3 (33), and ggalluvial version 0.12.1 based on (34). ## Data Availability Sequence data that support the findings of this study have been deposited in the NCBI Short Read Archive with the primary accession code PRJNA754787. ## Availability of data and material Sequence data that support the findings of this study have been deposited in the NCBI Short Read Archive with the primary accession code PRJNA754787. ## Funding This study received financial support from the Canadian Institutes for Health Research (PJT149057, to BC and TM); from the Multi-Organ Transplant Program at the University Health Network (to BC and TM); from the National Institutes of Health (NIH)/National Institute of Allergy and Infectious Diseases (NIAID), Clinical Trials in Organ Transplantation (CTOT-20) Ancillary Study Fund (to TM); from the Comprehensive Research Experience for Medical Students (CREMS) Program (to CYKZ), and from the Cystic Fibrosis Foundation (to BC and TM). ## Author contributions **PHHS:** research design, experiments (microbial culture, pre-sequencing workup, concentration, DNA isolation, 16S qPCR quantification, taxonomic profiling), statistical analyses, figures generation, writing of the initial manuscript, manuscript editing; **CYKZ**: cohort design, cohort organization, clinical data collection, patient phenotyping, sample organization, multiplex data analysis, writing and editing manuscript; **JS**: statistical analyses, figures generation, writing of the initial manuscript, manuscript editing; **BC** and **WX**: statistical analyses, manuscript editing; **YL**: experiments (pre-sequencing workup, DNA isolation, qPCR), manuscript editing; **ZW**: sample organization, experiments for Nissen cohort (DNA isolation, 16S qPCR quantification), qPCR data analysis; **ERN** and **NY**: sample processing, manuscript editing **MA**: cohort design, cohort organization, clinical data collection, patient phenotyping; **KB**: Multiplex cytokine assays; **RR**: clinical data collection and curation; **CWF**: cohort organization, clinical data collection, sample organization (for Nissen cohort); **SMP** and **JLT**: cohort design, clinical data collection, patient phenotyping; **TM**: research design, project supervision, cohort design, data analysis, manuscript writing and editing; **BC**: research design, project supervision, data analysis, manuscript writing and editing. ## Competing Interests The authors declare that they have no competing interests ## Acknowledgements The authors thank the Centre for the Analysis of Genome Evolution and Function (CAGEF) for providing support in the sequencing experiments as well as bioinformatics support for the taxonomic profiling. The authors also thank the Toronto Lung Transplant Program Biobank team for helping with sample collection and retrieval. ## Footnotes * E-mail addresses: PHHS pierre.schneeberger{at}swisstph.ch; CYKZ cykevin.zhang{at}mail.utoronto.ca; JS jessica.santilli{at}mail.utoronto.ca; BC Bo.Chen{at}uhnresearch.ca; XW Wei.Xu{at}uhnresearch.ca; YL dopa25{at}gmail.com; ZW zonelle.wijesinha{at}mail.utoronto.ca; ERN ereguera1983{at}gmail.com; NY Noelle.Yee{at}uhnresearch.ca; MA musawir.ca{at}gmail.com; KB kristenboonstra1{at}gmail.com; RR rayoun.ramendra{at}uhn.ca; CWF courtney.frankel{at}duke.edu; SMP scott.palmer{at}duke.edu; JLT jamie.todd{at}duke.edu; TM tereza.martinu{at}uhn.ca; BC bryan.coburn{at}utoronto.ca * Received September 3, 2021. * Revision received September 3, 2021. * Accepted September 10, 2021. * © 2021, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## Bibliography 1. 1.Zhang CYK, Ahmed M, Huszti E, Levy L, Hunter SE, Boonstra KM, Moshkelgosha S, Sage AT, Azad S, Zamel R. Bronchoalveolar bile acid and inflammatory markers to identify high-risk lung transplant recipients with reflux and microaspiration. The Journal of Heart and Lung Transplantation 2020; 39: 934–944. 2. 2.Verleden GM, Glanville AR, Lease ED, Fisher AJ, Calabrese F, Corris PA, Ensor CR, Gottlieb J, Hachem RR, Lama V. Chronic lung allograft dysfunction: Definition, diagnostic criteria, and approaches to treatment―A consensus report from the Pulmonary Council of the ISHLT. The Journal of Heart and Lung Transplantation 2019; 38: 493–503. 3. 3.Razia D, Mittal S, Bremner R, Bansal S, Ravichandran R, Smith M, Walia R, Mohanakumar T, Tokman S. Pretransplant GERD-Induced Immune Response Predisposes to CLAD. The Journal of Heart and Lung Transplantation 2021; 40: S59. 4. 4.King BJ, Iyer H, Leidi AA, Carby MR. Gastroesophageal reflux in bronchiolitis obliterans syndrome: a new perspective. The Journal of heart and lung transplantation 2009; 28: 870–875. 5. 5.Combs MP, Wheeler DS, Luth JE, Falkowski NR, Walker NM, Erb-Downward JR, Lama VN, Dickson RP. Lung microbiota predict chronic rejection in healthy lung transplant recipients: a prospective cohort study. The Lancet Respiratory Medicine 2021; 9: 601–612. 6. 6.Willner DL, Hugenholtz P, Yerkovich ST, Tan ME, Daly JN, Lachner N, Hopkins PM, Chambers DC. Reestablishment of recipient-associated microbiota in the lung allograft is linked to reduced risk of bronchiolitis obliterans syndrome. American journal of respiratory and critical care medicine 2013; 187: 640–647. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1164/rccm.201209-1680OC&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23328523&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000316423800014&link_type=ISI) 7. 7.Royer P-J, Olivera-Botello G, Koutsokera A, Aubert J-D, Bernasconi E, Tissot A, Pison C, Nicod L, Boissel J-P, Magnan A. Chronic lung allograft dysfunction: a systematic review of mechanisms. Transplantation 2016; 100: 1803–1814. 8. 8.Mouraux S, Bernasconi E, Pattaroni C, Koutsokera A, Aubert J-D, Claustre J, Pison C, Royer P-J, Magnan A, Kessler R. Airway microbiota signals anabolic and catabolic remodeling in the transplanted lung. Journal of Allergy and Clinical Immunology 2018; 141: 718-729. e717. 9. 9.Dickson RP, Erb-Downward JR, Huffnagle GB. Towards an ecology of the lung: new conceptual models of pulmonary microbiology and pneumonia pathogenesis. The lancet Respiratory medicine 2014; 2: 238–246. 10. 10.Dickson RP, Erb-Downward JR, Freeman CM, McCloskey L, Beck JM, Huffnagle GB, Curtis JL. Spatial variation in the healthy human lung microbiome and the adapted island model of lung biogeography. Annals of the American Thoracic Society 2015; 12: 821–830. 11. 11.Dickson RP, Erb-Downward JR, Huffnagle GB. Homeostasis and its disruption in the lung microbiome. American Journal of Physiology-Lung Cellular and Molecular Physiology 2015; 309: L1047–L1055. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1152/ajplung.00279.2015&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26432870&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) 12. 12.Masson L, Passmore J-AS, Liebenberg LJ, Werner L, Baxter C, Arnold KB, Williamson C, Little F, Mansoor LE, Naranbhai V. Genital inflammation and the risk of HIV acquisition in women. Clinical Infectious Diseases 2015; 61: 260–269. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/cid/civ298&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25900168&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) 13. 13.Ravel J, Gajer P, Abdo Z, Schneider GM, Koenig SS, McCulle SL, Karlebach S, Gorle R, Russell J, Tacket CO. Vaginal microbiome of reproductive-age women. Proceedings of the National Academy of Sciences 2011; 108: 4680–4687. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czoyMToiMTA4L1N1cHBsZW1lbnRfMS80NjgwIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjEvMDkvMTAvMjAyMS4wOS4wMy4yMTI2MzA2Ny5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 14. 14.Schneeberger PHH, Prescod J, Levy L, Hwang D, Martinu T, Coburn B. Microbiota analysis optimization for human bronchoalveolar lavage fluid. Microbiome 2019; 7: 141. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s40168-019-0755-x&link_type=DOI) 15. 15.Urso A, Leiva-Juárez MM, Briganti DF, Aramini B, Benvenuto L, Costa J, Nandakumar R, Gomez EA, Robbins HY, Shah L. Aspiration of Conjugated Bile Acids Predicts Adverse Lung Transplant Outcomes and Correlates with Airway Lipid and Cytokine Dysregulation. The Journal of Heart and Lung Transplantation 2021. 16. 16.Tinckam K, Keshavjee S, Chaparro C, Barth D, Azad S, Binnie M, Chow C, de Perrot M, Pierre A, Waddell T. Survival in sensitized lung transplant recipients with perioperative desensitization. American Journal of Transplantation 2015; 15: 417–426. 17. 17.Gray AL, Mulvihill MS, Hartwig MG. Lung transplantation at Duke. Journal of thoracic disease 2016; 8: E185. 18. 18.Nadkarni MA, Martin FE, Jacques NA, Hunter N. Determination of bacterial load by real-time PCR using a broad-range (universal) probe and primers set. Microbiology 2002; 148: 257–266. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1099/00221287-148-1-257&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11782518&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000173355100027&link_type=ISI) 19. 19.Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, Owens SM, Betley J, Fraser L, Bauer M, Gormley N, Gilbert JA, Smith G, Knight R. Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. The ISME journal 2012; 6: 1621–1624. 20. 20.Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010; 26: 2460–2461. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btq461&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20709691&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000282170000016&link_type=ISI) 21. 21.Edgar RC. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat Methods 2013; 10: 996–998. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nmeth.2604&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23955772&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000325073800023&link_type=ISI) 22. 22.Edgar RC. UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing. BioRxiv 2016: 081257. 23. 23.Edgar R. SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences. BioRxiv 2016: 074161. 24. 24.Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and environmental microbiology 2007; 73: 5261–5267. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiYWVtIjtzOjU6InJlc2lkIjtzOjEwOiI3My8xNi81MjYxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjEvMDkvMTAvMjAyMS4wOS4wMy4yMTI2MzA2Ny5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 25. 25.Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010; 7: 335–336. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nmeth.f.303&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20383131&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000277175100003&link_type=ISI) 26. 26.Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Molecular biology and evolution 2009; 26: 1641–1650. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/molbev/msp077&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19377059&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000266966200020&link_type=ISI) 27. 27.Carney SM, Clemente JC, Cox MJ, Dickson RP, Huang YJ, Kitsios GD, Kloepfer KM, Leung JM, LeVan TD, Molyneaux PL. Methods in lung microbiome research. American journal of respiratory cell and molecular biology 2020; 62: 283–299. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1165/rcmb.2019-0273TR&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) 28. 28.Dixon P. VEGAN, a package of R functions for community ecology. Journal of Vegetation Science 2003; 14: 927–930. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.1654-1103.2003.tb02228.x&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000189220100019&link_type=ISI) 29. 29.Breiman L. Bagging predictors. Machine learning 1996; 24: 123–140. 30. 30.Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. Journal of statistical software 2005; 15: 1–11. 31. 31.Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, Huttenhower C. Metagenomic biomarker discovery and explanation. Genome biology 2011; 12: 1–18. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/gb-2011-12-1-r1&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21501500&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F10%2F2021.09.03.21263067.atom) 32. 32.Wickham H. ggplot2: elegant graphics for data analysis. Springer; 2016. 33. 33.Wickham H. Reshaping data with the reshape package. Journal of Statistical Software 2007; 21: 1–20. 34. 34.Wickham H. A layered grammar of graphics. Journal of Computational and Graphical Statistics 2010; 19: 3–28.