ABSTRACT
Background Many genes associated with asthma explain only a fraction of its heritability. Most genome-wide association studies (GWASs) used a broad definition of “doctor-diagnosed asthma”, thereby diluting genetic signals by not considering asthma heterogeneity. The objective of our study was to identify genetic associates of childhood wheezing phenotypes.
Methods We conducted a novel multivariate GWAS meta-analysis of wheezing phenotypes jointly derived using unbiased analysis of data collected from birth to 18 years in 9,568 individuals from five UK birth-cohorts.
Results 44 independent SNPs were associated with early-onset persistent, 25 with preschool remitting, 33 with mid-childhood remitting and 32 with late onset wheeze. We identified a novel locus on chr9q21.13 (close to annexin 1 (ANXA1), p<6.7×10−9), associated exclusively with early-onset persistent wheeze. We identified rs75260654 as the most likely causative single nucleotide polymorphism (SNP) using Promoter Capture Hi-C loops, and then showed that the risk allele (T) confers a reduction in ANXA1 expression. Finally, in a murine model of house dust mite (HDM)-induced allergic airway disease, we demonstrated that anxa1 protein expression increased and anxa1 mRNA was significantly induced in lung tissue following HDM exposure. Using anxa1−/− deficient mice, we showed that loss of anxa1 results in heightened airway hyperreactivity and Th2 inflammation upon allergen challenge.
Conclusions We discovered a novel locus uniquely associated with early-onset persistent wheeze, identified the most likely causative variant, and showed that ANXA1 may play a role in regulating the pulmonary immune response to allergens. Targeting this pathway in persistent disease may represent an exciting therapeutic prospect.
INTRODUCTION
Asthma is a complex disorder caused by a variety of mechanisms which result in multiple clinical phenotypes(1). It has a strong genetic component, and twin studies estimate its heritability to be ∼ 60-70%(2). “Asthma genes” have been identified through a range of approaches, from candidate gene association studies(3) and family-based genome-wide linkage analyses(4) to genome-wide association studies (GWASs)(5-7). The first asthma GWAS (2007) identified multiple markers on chromosome 17q21 associated with childhood-onset asthma(5). A comprehensive review summarising the results of 42 GWASs of asthma and asthma-related traits has been published recently(8). The most widely replicated locus is 17q12-21, followed by 6p21 (HLA region), 2q12 (IL1RL1/IL18R1), 5q22 (TSLP) and 9p24 (IL33)(9). Overall, the evidence suggests that multiple genes are underlying the association peaks(9).
However, despite undeniable successes, genetic studies of asthma have produced relatively heterogeneous results, and only a small proportion of the heritability is accounted for(10). One part of the explanation for the paucity of precise replication are numerous gene-environment interactions(11). Another important consideration is asthma heterogeneity, in that asthma diagnosis comprises several conditions with distinct pathophysiology(12, 13), each potentially underpinned by different genetic associations(14). However, in order to maximise sample size, most GWASs used a definition of “doctor-diagnosed asthma”(15). Such aggregated outcome definitions are imprecise(16) and phenotypically and mechanistically heterogeneous(17), and this heterogeneity may dilute important genetic signals(14).
One way of disaggregating asthma diagnosis is to use data-driven methods to derive subtypes in a hypothesis-neutral way(18). For example, we jointly modelled data on wheezing from birth to adolescence in five UK population-based birth cohorts and identified five distinct phenotypes(19). However, although latent modelling approaches have been instrumental in elucidating the heterogenous nature of childhood asthma diagnosis(13), there has been little research into the genetic associations of phenotypes derived using data-driven methods. This is the first study to investigate the genetic architecture of wheezing phenotypes from infancy to adolescence, to identify genes specific to each phenotype and better understand the genetic heterogeneity between the disease class profiles.
MATERIALS & METHODS
Study design, setting, participants and data sources/measurement
The Study Team for Early Life Asthma Research (STELAR) consortium(20) brings together five UK population-based birth cohorts: Avon Longitudinal Study of Parents and Children (ALSPAC)(21), Ashford(22) and Isle of Wight (IOW)(23) cohorts, Manchester Asthma and Allergy Study (MAAS)(24) and the Aberdeen Study of Eczema and Asthma to Observe the Effects of Nutrition (SEATON)(25). The cohorts are described in detail in the Online Supplement (OLS). All studies were approved by research ethics committees. Informed consent was obtained from parents, and study subjects gave their assent/consent when applicable.
Validated questionnaires were completed on multiple occasions from infancy to adolescence(19). A list of variables, per cohort, is shown in Table E1, and the cohort-specific time points and sample sizes in Table E2. Data were harmonised and imported into Asthma eLab web-based knowledge management platform to facilitate joint analyses(20).
Definition of primary outcome (wheeze phenotypes from infancy to adolescence)
In the pooled analysis among 15,941 subjects with at least two observations on current wheeze, we used latent class analysis (LCA) to derive wheeze phenotypes from birth to age 18 years(19). A detailed description of the analysis is presented in(19) and in the OLS. A five-class solution was selected as the optimal model(19), and the classes (wheeze phenotypes) were labeled as: (1) never/infrequent wheeze (52.4%); (2) early-onset pre-school remitting wheeze (18.6%); (3) early-onset middle-childhood remitting wheeze (9.8%); (4) early-onset persistent wheeze (10.4%); and (5) late-onset wheeze (8.8%). These latent classes were used in the subsequent GWAS.
Genotyping, imputation and GWAS Meta-Analysis
Genotyping, quality control, imputation and exclusions are described in the OLS. Analyses were performed independently in ALSPAC, MAAS and the combined IOW-SEATON-ASHFORD (genotyped on the same platform, at the same time, and imputed together). We used SNPTEST v2.5.2(26) with a multinomial logistic regression model (-method newml), using the never/infrequent wheeze as the reference. A meta-analysis of the three GWASs was performed using METAL(27) with a total of 8,057,852 single nucleotide polymorphism (SNPs). LD pruning, pre-selection and gene annotation is described in the OLS.
Post-GWAS studies
Our GWAS identified a novel locus in chr9q21 nearby Annexin A1 (ANXA1), exclusively associated with early-onset persistent wheeze (see results section). We therefore proceeded with studies to identify causal variants and explore the biological mechanisms underlying this locus (detail in OLS). To this end, we firstly identified the most likely causative SNP using Promoter Capture Hi-C loops. We then ascertained genotype effect on gene expression and assessed the potential biological function of ANXA1 in asthma. Finally, we used a murine model of house dust mite (HDM)-induced allergic airway disease to investigate whether ANXA1 was important in regulating immune responses to a clinically relevant aeroallergen and used knock-out mice to derive further in vivo functional data to support our GWAS-finding.
RESULTS
Participants and descriptive data
We included a total of 9,568 subjects with European ancestry: ALSPAC, n=6833; MAAS, n=887; SEATON, n=548; ASHFORD, n=348; and IOW, n=952. Demographic characteristics of the study population(s) are shown in Table E3. Comparison of included vs. excluded participants across cohorts (per cohort and time point) is shown in Table E4.
GWAS Meta-Analysis
We conducted three GWASs (ALSPAC, MAAS, IOW-SEATON-Ashford) in parallel and results were meta-analyzed. QQ-plots are shown in Figure E1. A circular Manhattan plot showing an overview of the GWAS results by wheeze phenotype is shown in Figure 1. A total of 589 SNPs were associated with at least one phenotype with p<10−5. After clumping, we identified 134 independent SNPs uniquely associated with different phenotypes (p<10−5): of these, 44 were associated with early-onset persistent, 25 with early-onset preschool remitting, 33 with early-onset mid-childhood remitting and 32 with late-onset wheeze (Table E5). For some SNPs there were nominal associations with more than one phenotype. For example, chr17q21 was identified as a top locus for early-onset persistent wheeze (p=5.42×10−9), but some of the SNPs in this region were also associated with the early-onset mid-childhood remitting phenotype (p<0.0001).
To help identify functional elements located near the GWAS-associated variants (potential causal variants), we used locus zoom plots (LZP) for the 134 independent SNPs (p<10−5). Following close inspection of all plots, we short-listed 85 independent SNPs for which the LZPs potentially indicated more than one causal variant (Figures E2-E5) and followed them up for further annotation. The results of GWAS meta-analysis for these 85 SNPs with main associations across the four wheeze phenotypes are presented in Table 1. Previously associated traits for each region/gene associated with the different wheeze phenotypes are shown in Table E6. Briefly, one region (6q27) among the top hits for early-onset preschool remitting wheeze was previously associated with asthma, but in the context of obesity with a nominal association with asthma and BMI(28). Another region/gene (3q26.31/NAALADL2) identified as top hit for early-onset preschool remitting wheeze, was reported as an associate of severe asthma exacerbations, but only at nominal level (29). No regions/genes identified as top hits for early-onset mid-childhood remitting wheeze were found to have previous associations with asthma. Several genes/loci identified as top hits for late-onset wheeze were previously associated with asthma: ACOXL chr2q13 (later onset asthma and obesity(30)), PRKAA2 chr1p32.2 (lymphocyte count and asthma susceptibility(31)), CD200 3q13.2 (adult onset non-allergic asthma(32)), GIMAP family 7q36.1 (autoimmune diabetes, asthma, allergy(33)), 9p22.3 (asthma in <16 year-old (34)) and 16p12.1 (asthma and rhino-conjunctivitis at 10-15 years(35)).
We identified two GWAS-significant loci for early-onset persistent wheeze: 17q21, p<5.5×10−9, and a novel locus on 9q21.13 (ANXA1), p<6.7×10−9. The ANXA1 locus was the only GWAS-significant locus that had not previously been associated with asthma or atopic traits, with one previous study showing an association with FEV1/FVC and bronchodilator response in smokers(36). ANXA1 is strongly expressed in bronchial mast cells and has anti-inflammatory properties(37), and may be involved in epithelial airway repair(38) (Table E6). We therefore followed up top SNPs from this locus.
ANXA1 locus and persistent wheeze
Two SNPs (rs75260654, the lead SNP, and rs116849664 located downstream of ANXA1) were associated with early-onset persistent wheeze at genome-wide significance (GWS), with an additional SNP rs78320984 almost reaching GWS (Table E7). These SNPs are in LD with each other (Figure E6), but not with any other SNPs.
Promoter Capture identifies rs75260654 as the most likely causative variant
To identify the most likely causative variant, we investigated the overlap of the SNPs with Promoter Capture Hi-C interactions involving the ANXA1 promoter in CD4+ cells in MAAS cohort subjects. Of the three SNPs, only rs75260654 overlapped a region interacting with the ANXA1 promoter (Figure 2). Moreover, rs75260654 overlapped a POLR2A ChIP-seq peaks and an ATAC-seq peak and active enhancer in the type II pneumocyte derived A549 cell line. This shows that rs75260654 is located in a region directly interacting with the ANXA1 promoter and is transcriptionally active in relevant cell types.
Allele Frequencies of rs75260654 (MAF=0.02) across wheeze phenotypes are shown in Table E8. Two individuals (one in MAAS and one in ALSPAC) were homozygote for the minor allele (T), and both were in the early-onset persistent wheeze class. One subject reported current wheeze and asthma through childhood, with hospitalizations for lower respiratory tract infection in the 1st year of life confirmed in healthcare records. The second individual reported current wheezing at 1.5, 2.5 and 8-9 years and doctor-diagnosed asthma and the use of asthma medication at 8-9 years.
rs75260654: Effect on Genomic Features
VEP prediction shows the SNP rs75260654 (C changed to T) to be located downstream of three protein-coding transcripts of AXNA1 and overlapping the known regulatory region ID ENSR00000882742 on Chromosome 9: 73,173,001-73,173,200. This region is active in the GI tract, M2 macrophages, neural progenitor cells, and trophoblasts, but is repressed in T lymphocytes including CD4+ CD25+, Treg, and CD8+ cells.
rs75260654: Effect on gene expression
The effect of rs75260654 on the expression of nearby genes was investigated by browsing the eQTL GTEX data available in Ensembl. Compared to C, the T allele was found to reduce the expression of ANXA1 in naive B-cells (effect size=-2.36795, p=0.01) and to increase expression in Lymphoblasoid Cell Lines (LCL) (effect size=0.848856, pe=0.001) (Figure 3). This SNP affects expression of the neighboring gene ALDH1A1 (aldehyde dehydrogenase-1 family member A1) (effect size=-2.40446, p=0.0039 in macrophages infected with Salmonella). In the eQTL catalogue, rs75260654 is identified as an eQTL of ANXA1 in various immune cells (at nominal significance) including T cells, monocytes, fibroblasts, whole blood, Th2 memory cells, naive B cells. rs75260654 is also an eQTL of ANXA1 in monocytes that were stimulated with R848 (agonist of TLRs 7 and 8) and a human seasonal influenza A virus (39) (at nominal significance) (Table E9). In the lung rs116849664 and rs78320984 (both in LD with rs75260654) were eQTLs of ANXA1 (Table E10) as well as LINC01474 at nominal significance levels.
Additional supporting evidence regarding the significance of the T-allele on the expression of these genes was provided using eQTLGene Consortium meta-analysis of 24 Cohorts and 24331 samples(40). This method reproduced the previous modest results showing a cis-eQTL effect of rs75260654 on both the ANXA1 (p=6.02×10−23) and ALDH1A1 (p=1.11×10−19) at FDR=0. No significant trans-eQTLs were observed.
Potential biological function of ANXA1 in asthma
Protein-protein network analysis demonstrated that ANXA1 interacts directly with genes enriched for asthma (including IL4 and IL13) and inflammatory regulation (NR3C1, Glucocorticoid receptor) showing its significance in dysregulation of the immune response (see Figure E7 and Table E11).
Functional studies of anxa1 in a murine model
Pulmonary expression of anxa1 is modulated by aeroallergen exposure
We first analysed expression of anxa1 using a model of HDM-induced allergic airway disease (Figure 4A)(41). Consistently, immunohistochemistry analysis revealed anxa1 protein expression increased following HDM challenge (Figure 4B,C). Anxa1 mRNA was significantly induced in lung tissue following HDM exposure (Figure 4D). This increase suggests that the pro-resolving anxa1 may play a role in regulating the pulmonary immune response to allergen.
Anxa1 suppresses allergen-induced airway hyperresponsiveness (AHR) and type-2 inflammation
To confirm a functional role for anxa1 in allergic airway disease, we exposed anxa1−/− mice to intranasal HDM. Wildtype mice given HDM over 3 weeks developed significant AHR compared to PBS control mice. Mice deficient in anxa1 had significantly worse allergen-induced lung function (greater airway resistance) compared to WT treated mice (Figure 4E). Anxa1−/− mice exhibited significantly increased airway eosinophilia and elevated numbers of Th2 lymphocytes (Figure 4F,G). Lung tissue cytokine levels reflected the exacerbated airway Th2 inflammation, with elevation in IL-4, and significant induction of IL-5 and IL-13 (Figure 4H,J). Thus, anxa1 deficiency results in an alteration of the pulmonary immune response, with uncontrolled eosinophilia and an exacerbation of type-2 inflammation and AHR in response to allergen.
DISCUSSION
Herein, we present a comprehensive description of the genetic architecture of childhood wheezing disorders. Using a novel approach applied to a unique dataset from five UK birth cohorts, we identified subsets of SNPs differentially associated across four wheezing phenotypes: early-onset persistent (44 SNPs, 19 loci), early-onset preschool remitting (25 SNPs, 10 loci), early-onset mid-childhood remitting (33 SNPs, 9 loci) and late-onset (32 SNPs, 20 loci). We found little evidence of genetic associations spanning across different phenotypes. This suggests that genetic architecture of different wheeze phenotypes comprises a limited number of variants likely underpinning mechanisms which are shared across phenotypes, but that each phenotype is also characterized by unique phenotype-specific genetic associations. Importantly, we identified a novel locus in chr9q21 nearby ANXA1 exclusively associated with early-onset persistent wheeze (p<6.7×10−9). To identify the most likely causative variant, we investigated the overlap of the associated SNPs with Promoter Capture Hi-C interactions to demonstrate that SNP rs75260654 overlapped a region interacting with the ANXA1 promoter. Using eQTL data, we identified that the risk allele (T) o rs75260654 associated with early-onset persistent wheeze is also associated with ANXA1 expression. Further investigation of the biological function of ANXA1 revealed that it interacts with genes enriched for asthma (including IL4 and IL13) and inflammatory regulation (NR3C1, Glucocorticoid receptor). In functional mouse experiments, anxa1 protein expression increased and anxa1 mRNA was significantly induced in lung tissue following HDM exposure, suggesting that the pro-resolving anxa1 may play a role in regulating the pulmonary immune response to allergen. Concurrently, by utilizing anxa1−/− deficient mice we demonstrated that loss of anxa1 results in heightened AHR and Th2 inflammation upon allergen challenge, providing important in vivo functional data to support our GWAS-finding.
ANXA1 is a 37-kDa glycoprotein with potent anti-inflammatory and pro-resolving properties that are mediated by interaction with a specific G-protein coupled receptor FPR2(42). This axis represents an important resolution pathway in chronic inflammatory settings such as those of rheumatoid arthritis(43) and ulcerative colitis(44). ANXA1 belongs to the annexin family of Ca2+- dependent phospholipid-binding proteins, and through inhibition of phospholipase A2, it reduces eicosanoid production, which also contributes to its anti-inflammatory activities. Modulation of M2 macrophage phenotype is also promoted by ANXA1 to attenuate tissue inflammation(45). Corticosteroids (a mainstay of asthma treatment) increase the synthesis of ANXA1(46). Plasma ANXA1 levels are significantly lower in asthmatic patients with frequent exacerbations compared to those with stable disease, suggesting a link between this mediator and disease state(47). Furthermore, children with wheeze have reduced airway levels of ANXA1(48).
Previous functional studies using anxa1−/− deficient mice challenged with ovalbumin showed anxa1-deficient mice to have elevated AHR compared to WT mice(49). Ng et al. demonstrated that untreated anxa1-deficient mice have spontaneous AHR that predisposes them to exacerbated response to allergen(49). In the current study, we demonstrated in the murine lung the induction of Anxa1 in response to HDM exposure. In addition, genetic deletion of anxa1 potentiated the development of AHR and enhanced eosinophilia and markers of Th2 inflammation in mice treated with HDM, which is consistent with and extends previous reports. Of interest, in mice, anxa1 expression was recently found to be characteristic of a novel cell-type called the Hillock cell, which may be involved in squamous barrier function and immunomodulation(50). These data identify the ANXA1/FPR2 signaling axis as an important regulator of allergic disease, that could be manipulated for therapeutic benefit.
Our study has several limitations. By GWAS standards, our study is comparatively small and may be considered to be underpowered. The sample size may be an issue when using an aggregated definition (such as “doctor-diagnosed asthma”) but is less likely to be an issue when primary outcome is determined by deep phenotyping. This is indirectly confirmed in our analyses. Our primary outcome was derived through careful phenotyping over a period of more than two decades in five independent birth cohorts, and although comparatively smaller than some asthma GWASs, our study proved to be powered enough to detect previously identified key associations (e.g. chr17q21 locus). Precise phenotyping has the potential to identify new risk loci. For example, a comparatively small GWAS (1,173 cases and 2,522 controls) which used a specific subtype of early-onset childhood asthma with recurrent severe exacerbations as an outcome, identified a functional variant in a novel susceptibility gene CDHR3 (SNP rs6967330) as an associate of this disease subtype, but not of doctor-diagnosed asthma(51). This important discovery was made with a considerably smaller sample size but using a more precise asthma subtype. In contrast, the largest asthma GWAS to date had a ∼40-fold higher sample size(7), but reported no significant association between CDHR3 and aggregated asthma diagnosis. Therefore, with careful phenotyping, smaller sample sizes may be adequately powered to identify larger effect sizes than those in large GWASs with broader outcome definitions(52).
The importance of the precise outcome definition was highlighted in our previous studies in ALSPAC which explored genetic associates of wheeze phenotypes derived by LCA(53, 54). Our current findings are consistent with our earlier report suggesting that 17q21 SNPs are associated with early-onset persistent, but not with early transient or late-onset wheeze (53). Further analysis using genetic prediction scores based on 10-200,000 SNPs ranked according to their associations with physician-diagnosed asthma found that the 46 highest ranked SNPs predicted persistent wheeze more strongly than doctor-diagnosed asthma(54). Finally, a candidate gene study combining data from ALSPAC and PIAMA found different associations of IL33-IL1RL1 pathway polymorphisms with different phenotypes(55).
We are cognisant that there may be a perception of the lack of replication of our GWAS findings. We would argue that direct replication is almost certainly not possible in other cohorts, as phenotypes for replication studies should be homogenous(56). However, there is a considerable heterogeneity in LCA-derived wheeze phenotypes between studies, and although phenotypes in different studies are usually designated with the same names, they differ between studies in temporal trajectories, distributions within a population, and associated risk factors(57). This heterogeneity is in part consequent on the number and the non-uniformity of the timepoints used, and is likely one of the factors responsible for the lack of consistent associations of discovered phenotypes with risk factors reported in previous studies(58). This will also adversely impact the ability to identify phenotype-specific genetic associates. For example, we have previously shown that less distinct wheeze phenotypes in PIAMA were identified compared to those derived in ALSPAC(59). Thus, phenotypes that are homogeneous to those in our study almost certainly cannot readily be derived in available populations. This is exemplified in our attempted replication of ANXA1 findings in PIAMA cohort (see OLS, Table E12). In this analysis, the number of individuals assigned to persistent wheezing in PIAMA was small (40), associates of this phenotype differed to those in STELAR cohorts, and the SNPs’ imputation scores were low (<0.60), which meant the conditions for replication were not met.
Our study population is of European descent, and we cannot generalize the results to different ethnicities or environments. It is important to highlight the under-representation of ethnically diverse populations in most GWASs(9). To mitigate against this, large consortia have been formed, which combine the results of multiple ethnically diverse GWASs to increase the overall power to identify asthma-susceptibility loci. Examples include the GABRIEL(6), EVE(60) and TAGC(7) consortia, and the value of diverse, multi-ethnic participants in large-scale genomic studies has recently been shown(61). However, such consortia do not have the depth of longitudinal data to allow the type of analyses which we carried out to derive a multivariable primary outcome.
One strength of our approach is that we used data from five birth cohorts with detailed and lifelong phenotyping, which were harmonised in a common knowledge management platform(20), allowing joint analyses. We performed three parallel GWASs that produced estimates with remarkably consistent directions of effects.
In conclusion, using unique data from five UK birth cohorts, we identified subsets of SNPs differentially associated across four wheezing phenotypes from infancy to adolescence. We found little evidence of genetic associations spanning across different phenotypes. We discovered a novel locus in chr9q21 uniquely associated with early-onset persistent wheeze (p<6.7×10−9), identified SNP rs75260654 as the most likely causative variant, and demonstrated that the risk allele (T) confers a reduction in ANXA1 expression. In mouse experiments, ANXA1 expression increased in lung tissue following allergen exposure, suggesting that the pro-resolving ANXA1 may play a role in regulating the pulmonary immune response to allergen. Using ANXA1-deficient mice, we demonstrated that loss of ANXA1 results in heightened AHR and Th2 inflammation upon allergen challenge, providing important in vivo functional data to support our GWAS finding. Targeting these pathways to promote the clearance of chronic inflammation in persistent disease may represents an exciting therapeutic prospect.
Data Availability
The informed consent obtained from all included participants does not allow the data to be made freely available through any third party maintained public repository. However, data used for this submission can be made available on request to the corresponding cohort Executive. The ALSPAC website provides information on how to request and access its data (http://www.bristol.ac.uk/alspac/researchers/access/). For queries regarding access of data from MAAS, IoW, SEATON or Ashford please contact Philip Couch philip.couch@manchester.ac.uk). All summary data used to plot the figures in our manuscript has been deposited in Dryad.
https://datadryad.org/stash/share/NtN5Q2cplp016xWs3QAbYkR8LamBxRVIGOQ6JmzWopQ
Data availability
The informed consent obtained from all included participants does not allow the data to be made freely available through any third party maintained public repository. However, data used for this submission can be made available on request to the corresponding cohort Executive. The ALSPAC website provides information on how to request and access its data (http://www.bristol.ac.uk/alspac/researchers/access/). For queries regarding access of data from MAAS, IoW, SEATON or Ashford please contact Philip Couch philip.couch{at}manchester.ac.uk). All summary data used to plot the figures in our manuscript has been deposited in Dryad.
Code availability
Code used for this submission has been deposited in Dryad.
Author Contributions
R.G., J.C., S.H. and A.C. conceived and planned the study. R.G., A.C., J.W.H., L.L.Y. and C.M.L. wrote the manuscript. R.G. and J.C. analysed the data. J.C. and N.K. provided eQTL analysis for ANXA1, J.W.H. supervised eQTL analyses. J.H. and M.T. performed the PCHi-C experiments and analysis. J.C. and M.T. interpreted the genetic architecture results for ANXA1. G.H.K and J.M.V. performed a replication study in PIAMA. C.M.L, S.A.M. and L.G.G. conceived and designed murine experiments. S.A.M and L.G.G. conducted experiments in mouse models. M.P. provided annexin−/− mice and advice on annexin. All authors provided critical feedback and helped shape the research, analysis, and manuscript. This publication is the work of the authors and Raquel Granell and Adnan Custovic will serve as guarantors for the contents of this paper.
Competing Interests statement
Graham Roberts: MRC grant to my institution President of the British Society of Allergy and Clinical Immunology. Gerard Koppelman: Dutch Lung Foundation, Ubbo Emmius Foundation (Money to insititution) Dutch Lung Foundation, Vertex, TEVA the Netherlands, GSK, ZON-MW (VICI grant), European Union (Money to institution) Astra Zeneca, Pure IMS, GSK (Money to institution) Sanofi, Boehringer Ingelheim (Money to institution). Angela Simpson: Medical research council Research grant JP Moulton Charitable Foundation Research grant Asthma UK Research grant. Clare Murray: Asthma Uk National Institute for Health Research Moulton Charitable Foundation North West Lung Centre Charity GSK (Lecture fees) Novartis (Lecture fees). Clare Lloyd: Wellcome Trust 107059/Z/15/Z. John Holloway: Medial Research Council grant MR/S025340/1 (to institution) American Academy of Allergy Asthma and Immunology (AAAI) (Support for speaker travel to AAAAI annual congress). Adnan Custovic: MRC (research grants) EPSRC (research grant) Wellcome Trust (research grant) Worg Pharmaceoticals (Personal payment <US$5000) GSK Honorarium for lecture, personal, <US$ 5000 AstraZeneca Honorarium for lecture, personal, <US$ 5000 Sanofi Honorarium for lecture, personal, <US$ 5000 Stallergens-Greer Honorarium for lecture, personal, <US$ 5000 WAO (Board of officers, unpaid).
Funding
Supported by the UK Medical Research Council (MRC) Programme Grant MR/S025340/1, the Wellcome Trust (WT) Strategic Award (108,818/15/Z) and a WT Senior Fellowship to CML (107059/Z/15/Z). The MRC and Wellcome (Grant ref: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC. ALSPAC GWAS data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. A comprehensive list of grants funding is available on the ALSPAC website (http://www.bristol.ac.uk/alspac/external/documents/grant-acknowledgements.pdf).
PIAMA was funded by the Netherlands Lung Foundation (grant 3.4.01.26, 3.2.06.022, 3.4.09.081 and 3.2.10.085CO), the ZON-MW Netherlands Organization for Health Research and Development (grant 912-03-031), the Stichting Astmabestrijding and the Ministry of the Environment. Genome-wide genotyping was funded by the European Commission as part of GABRIEL (grant number 018996) and a grant from BBMRI-NL (CP 29). GHK is supported by a ZON-MW VICI grant. This research was funded in part by the Wellcome Trust [217065/Z/19/Z]. For the purpose of Open Access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.
SUPPLEMENTARY INFORMATION
ONLINE METHODS
Description of cohorts
The Study Team for Early Life Asthma Research (STELAR) consortium1 brings together five UK population-based birth cohorts as described below. All studies were approved by research ethics committees. Informed consent was obtained from parents, and study subjects gave their assent/consent when applicable. Data were harmonised and imported into Asthma eLab web-based knowledge management platform to facilitate joint analyses (www.asthmaelab.org)1.
The Avon Longitudinal Study of Parents and Children (ALSPAC)
ALSPAC is a birth cohort study established in 1991 in Avon, UK2, 3. Pregnant women with expected dates of delivery 1st April 1991 to 31st December 1992 were invited to take part in the study. The initial number of pregnancies enrolled is 14,541. Of these initial pregnancies, there was a total of 14,676 foetuses, resulting in 14,062 live births and 13,988 children who were alive at 1 year of age.
When the oldest children were approximately 7 years of age, an attempt was made to bolster the study with eligible cases who had failed to join originally. As a result, when considering variables collected from the age of seven onwards (and potentially abstracted from obstetric notes) there are data available for more than the 14,541 pregnancies mentioned above. The number of new pregnancies not in the initial sample (known as Phase I enrolment) that are currently represented on the built files and reflecting enrolment status at the age of 24 is 913 (456, 262 and 195 recruited during Phases II, III and IV respectively), resulting in an additional 913 children being enrolled. The phases of enrolment are described in more detail in the cohort profile paper and its update. The total sample size for analyses using any data collected after the age of seven is therefore 15,454 pregnancies, resulting in 15,589 foetuses. Of these 14,901 were alive at 1 year of age.
Ethical approval: Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time.
Data dictionary: The study website contains details of available data through a fully searchable data dictionary: http://www.bristol.ac.uk/alspac/researchers/our-data/
We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses.
The Manchester Asthma and Allergy Study (MAAS)
MAAS is an unselected birth cohort study established in 1995 in Manchester, UK4. It consists of a mixed urban-rural population within 50 square miles of South Manchester and Cheshire, United Kingdom located within the maternity catchment area of Wythenshawe and Stepping Hill Hospitals. All pregnant women were screened for eligibility at antenatal visits (8-10th week of pregnancy). Of the 1499 couples who met the inclusion criteria (≤10 weeks of pregnancy, maternal age ≥18 years, and questionnaire and skin prick data test available for both parents), 288 declined to take part in the study and 27 were lost to follow-up between recruitment and the birth of a child. A total of 1184 children were born into the study between February 1996 and April 1998. They were followed prospectively for 19 years to date and attended follow-up clinics for assessments, which included lung function measurements, skin prick testing, biological samples (serum, plasma and urine), and questionnaire data collection. The study was approved by the North West – Greater Manchester East Research Ethics Committee.
The Study of Eczema and Asthma to Observe the influence of Nutrition (SEATON)
SEATON is an unselected birth cohort study established in 1997 in Aberdeen, UK, which was designed to explore the relationship between antenatal dietary exposures and asthma outcomes in childhood5. 2000 healthy pregnant women attending an antenatal clinic, at median 12 weeks gestation, were recruited. An interviewer administered a questionnaire to the women and atopic status was ascertained by skin prick test (SPT). The cohort included 1924 children born between April 1998 and December 1999. Participants were recruited prenatally and followed up by self-completion questionnaire to 15 years of age using postal questionnaires to record the presence of asthma and allergic diseases. Lung function measurements and SPT to common allergens was performed at 5, 10 and 15 years. The study was approved by the North of Scotland Research Ethics Committee.
ASHFORD
The Ashford study is an unselected birth cohort study established in 1991 in Ashford, UK6. It included 642 children born between 1992 and 1993. Participants were recruited prenatally and followed to age 14 years. Detailed standardised questionnaires were administered at each follow-up to collect information on the natural history of asthma and other allergic diseases. Lung function measurements and SPT was carried out at 5, 8 and 14 years of age. In 2015, the study children aged 20 were sent a self-completion questionnaire, which was returned by 60% of the participants.
The Isle of Wight (IOW) cohort
IOW is an unselected birth cohort study established in 1989 on the Isle of Wight, UK7−9. After the exclusion of adoptions, perinatal deaths, and refusal for follow-up, written informed consent was obtained from parents to enrol 1,456 newborns born between 1st January 1989 and 28th February 1990. Follow-up-up assessments were conducted to 26 years of age to prospectively study the development of asthma and allergic diseases. At each follow-up, validated questionnaires were completed by the parents. Additionally, the Skin Prick Test (SPT) was performed on 980, 1036 and 853 participants at 4, 10 and 18 years of age to check allergic reactions to common allergens. At 10, 18, and 26 years, spirometry and methacholine challenge tests were performed to diagnose lung problems. Ethics approvals were obtained from the Isle of Wight Local Research Ethics Committee (now named the National Research Ethics Service, NRES Committee South Central – Southampton B) at recruitment and for the subsequent follow-ups.
Definition of variables
A list of all variables used in the current study, per cohort, is shown in Table E1.
Demographic, exposures and outcomes
Postal questionnaires were used in ALSPAC and SEATON, while interviewer-administered questionnaires were employed in other cohorts.
Parental history of asthma, eczema and hay fever was defined based on the responses given to the question “have you (and/or your partner) ever had asthma/eczema/hay fever”. Maternal and paternal smoking were defined based on the response given to the question “do you (or does your partner) smoke”, administered during pregnancy. Low birth weight was defined as birth weight less than 2500 g based on NHS birth records.
Asthma in MAAS was defined as a case if positive for two of the following criteria: doctor diagnosis of asthma in the past 12 months, current wheeze in the last 12 months, doctor prescription for asthma. Asthma in ALSPAC was defined as a mothers’ report of doctor ever diagnosis of asthma.
Current wheeze in MAAS was defined as a questionnaire report to the question “have you wheezed in the last 12 months” upon attendance at a follow up clinic. Current wheeze in ALSPAC was defined as a mothers’ report to the question “has your child had any wheezing or whistling in the last 12 months?”. Asthma medication in ALSPAC was defined as a mothers’ report to the question “has your child taken any asthma medication in the last 12 months?”. Lower respiratory hospital admissions: Data on hospital admissions in MAAS were obtained by manually inspecting the General Practice (GP) records for each individual.
Early-life risk factors were divided into four groups according to timing of exposure; maternal and child characteristics (gender, maternal smoking during pregnancy and maternal history of asthma), perinatal (low birth weight adjusted for gestational age), environmental (pet ownership, smoke exposure after birth) and allergic sensitization (defined based on positive skin prick test to cat, house dust mite or grass) variables.
Primary outcome: Joint wheeze phenotypes
We used latent class analysis (LCA) to identify longitudinal trajectories of wheeze10 based on pooled analysis among 15,941 children with at least two observations on wheezing at five time periods that were approximately shared across all cohorts: infancy (½-1 year); early childhood (2-3 years); pre- school/early school age (4-5 years); middle childhood (8-10 years); and adolescence (14-18 years). Cohort-specific definitions other variables derived from the questionnaires are provided in Table E2.
To control for cohort-specific variation, Cohort ID was included in the LCA model as an additional predictor by transforming the 5−category variable into a set of four dummy variables and including them as covariates. The largest cohort, ALSPAC, was treated as the non-coded category to which all other cohorts were compared. The expectation maximization algorithm was used to estimate relevant parameters, with 100,000 iterations and 500 replications.
To assess model fit, we used (1) the Bayesian information criterion (BIC), (2) the Akaike information criterion (AIC), (3) Lo-Mendell−Rubin likelihood ratio test (LMR), (4) Bootstrapped likelihood ratio and, (4) quality of classification certainty (model entropy). The BIC is an index used in Bayesian statistics to choose among a set of competing models; the model with the lowest BIC is preferred. Using the lowest BIC as a selection criterion, the best fitting model was chosen as the five-class solution with a nominal covariate (BIC:31340). Analyses were carried out using Mplus 8, R (http://www.r-project.org/) and Stata 14 (StataCorp, College Station, Tex).
Based on the statistical fit, a five-class solution was selected as the optimal model10, and the classes (wheeze phenotypes) were labeled as: (1) Never/Infrequent wheeze (52.4%); (2) Early-onset pre-school remitting wheeze (18.6%), with high prevalence of wheeze during infancy, decreasing to 20% around early-childhood and to less than 10% afterwards; (3) Early-onset middle-childhood remitting wheeze (9.8%), with early-onset wheeze and peak prevalence in early-childhood (∼70%), and diminishing by middle-childhood (<5%); (4) Early-onset persistent wheeze (10.4%) with 58% wheeze prevalence during infancy, and prevalence between 70- 80% thereafter; (5) Late-onset wheeze (8.8%) with very low prevalence until middle childhood, increasing rapidly to 55% in adolescence. These latent classes were used in the subsequent GWAS.
Minimising Bias and missing data effects
Extracted from reference Oskel et al. (10): “One of the advantages of our multicohort approach is that individual studies that might not provide conclusive evidence to make inference about the general population because of cohort specific effects and biases can contribute to revealing a more accurate picture when integrated together. The integration of five cohorts and their pooled analysis enhanced the credibility and generalizability of the phenotyping results to the U.K. population. A further advantage is to minimize the study-specific biases (including cohort specific effects, attrition effects, different recruitment strategies, and geographic factors) affecting the certainty of allocation of individuals to each latent class, while maximizing the benefits of individual cohort studies (e.g., potentially important risk factors and outcomes are captured in some, but not all cohorts).”
“Another strength of pooling cohort data is that a multicohort design allowed us to analyze a large sample with complete data on wheeze from birth to adolescence, thus increasing statistical power to detect less prevalent phenotypes.” However, “The optimal solution in the model using 15,941 children (allowing for missing data) remained five classes (see Table E3, Figure E1), and was very similar to that derived from a complete data set.” We used results from the larger sample, that is individuals with at least 2 observations of wheezing, to assign individuals to their most likely wheezing phenotype and used this as our primary outcome in this study.
Genotyping and imputation
ALSPAC
Participants were genotyped using the Illumina HumanHap550 quad genome-wide SNP genotyping platform (Illumina Inc., San Diego, CA, USA) by the Wellcome Trust Sanger Institute (WTSI; Cambridge, UK) and the Laboratory Corporation of America (LCA, Burlington, NC, USA), using support from 23andMe. Haplotypes were estimated using ShapeIT (v2.r644) which uses relationship information to improve phasing accuracy. The phased haplotypes were then imputed to the Haplotype Reference Consortium (HRCr1.1, 2016) panel11 of approximately 31,000 phased whole genomes. The HRC panel was phased using ShapeIt v2, and the imputation was performed using the Michigan imputation server.
MAAS
In MAAS, we used the Illumina 610 quad genome-wide SNP genotyping platform (Illumina Inc., San Diego, CA, USA). Prior to imputation samples were excluded on the basis of gender mismatches; minimal or excessive heterozygosity, genotyping call rates of <97%. SNPS were excluded if they had call rates of < 95%, minor allele frequencies of < 0.5% and HWE p<3×10-8. Prior to imputation each chromosome was pre-phased using EAGLE2 (v2.0.5)11 as recommended by the Sanger imputation server12. We then imputed with PBWT13 with the Haplotype Reference Consortium (release 1.1) of 32,470 reference genomes12 using the Sanger Imputation Server.
IOW, SEATON and ASHFORD
IOW, SEATON and ASHFORD were genotyped using the illumina Infinium Omni2.5-8 v1.3 BeadChip genotyping platform (Illumina Inc., San Diego, CA, USA). Genotype QC and imputation was carried out as described for MAAS.
Exclusions
Individuals were excluded on the basis of gender mismatches; minimal or excessive heterozygosity; disproportionate levels of individual missingness (>3%), insufficient sample replication (IBD < 0.8) or evidence of cryptic relatedness (IBD > 0.1). Following imputation, single nucleotide polymorphisms (SNPs) with a minor allele frequency of <1%, a call rate of <95%, evidence for violations of Hardy-Weinberg equilibrium (P<5E-7) or imputation quality measure (MaCH-Rsq or IMPUTE-info score) <0.40 were excluded. All individuals with non-European ancestry and siblings were removed.
GWAS Meta-analysis
GWAS of the joint wheezing phenotypes were performed independently in ALSPAC, MAAS and the combined IOW-SEATON-ASHFORD (combined as they were genotyped on the same platform, at the same time, and quality-controlled and imputed together). All genetic data were imputed to a new Haplotype Reference Consortium panel. This comprises around 31,000 sequenced individuals (mostly European), so the coverage of European haplotypes is much greater than in other panels. As a consequence, we expect to improve imputation accuracy, particularly at lower frequencies.
We used SNPTEST v2.5.214 with a multinomial regression model (-method newml, never/infrequent wheeze as the reference) to investigate the association between SNPs and wheezing phenotypes. A meta-analysis of the three GWASs including 5,887 controls and 943 cases for early-onset persistent, 1482 cases for early-onset remitting, 603 cases for mid-childhood-onset remitting and 652 cases for late-onset wheeze, was performed using METAL15 with a total of 8,057,852 SNPs present. We used the option SCHEME STDERR in METAL to implement an effect-size based method weighted by each study- specific standard error in a fixed-effects model. We performed clumping to keep only one representative SNP per Linkage disequilibrium (LD) block and used locus zoom plots to short-list independent SNPs for further annotation.
LD clumping, pre-Selection and Gene Annotation
LD clumping was performed for all SNPs with p-value<10−5 for at least one wheezing phenotype. In order to avoid redundancy between SNPs and to ensure associations are independent, we used significance thresholds of 0.05 for index and clumped SNPs (--clump-p1 0.05,--clump-p2 0.05), LD threshold of 0.80 (--clump-r2 0.80) and physical distance threshold of 250kb (--clump-kb 250). European 1000 Genome data were used to infer LD structure.
Locus Zoom plots (http://locuszoom.org/)16 were used for close inspection of all independent signals. Loci showing a peak with different colour dots (possibly indicating more than one causal variant) were short-listed for further annotation. SNPnexus database (https://www.snp-nexus.org/v4/)17 was used to annotate the overlapping, upstream and downstream genes; the GWAS catalogue (by SNP and then gene) (https://www.ebi.ac.uk/gwas/search), GeneCards (https://www.genecards.org/)18 database and phenoscanner (http://www.phenoscanner.medschl.cam.ac.uk/) were used to further explore previously associated relevant phenotypes and gene function. Lead SNPs were looked in https://www.regulomedb.org/ to assess potential functionality.
Gene expression in whole blood and lung tissues
The top independent SNPs associated with each of the wheeze phenotypes were assessed for their association with cis- and trans-acting gene expression (mRNA) in whole blood and lung tissues. We identified potential eQTL signals using Genotype-Tissue Expression database (https://gtexportal.org) using the European reference panel.
Post-GWAS: rs75260654 (ANXA1)
Annotation & distribution
Information including chromosome, strand, clinical significance was retrieved from ENSEMBL using the R package biomaRt19, 20. The effects of rs75260654 on genomic features were predicted by querying the using Ensembl Variant Effect Predictor (VEP) 21 web tool.
rs75260654 distribution in the GRCh38.p13 build of the human genome across African, Asian and European populations of the 1000 Genomes Project Phase 3 were accessed by querying the Ensembl (www.ensembl.org) web browser on 24 May 2021.
Promoter Capture
The Hi-C libraries were prepared from CD4+ T-cells isolated from 7 healthy individuals (2 libraries per individual) from the MAAS cohort using the Arima-HiC kit (Arima Genomics). Promoter Capture Hi-C (PCHi-C) libraries were generated by capturing the restriction fragments (RF) overlapping the TSS of 18775 protein coding genes using the Agilent SureSelectXT HS Target Enrichment System according to the manufacturers’ protocols. The final design included 305419 probes covering 13.476Mb and 18630 protein-coding genes. The restriction fragments (RF) overlapping the TSS (+/-1 RF, 3 RF per promoter) were captured with custom-designed biotinylated RNA baits. Libraries were sequenced to ∼300M 2×150bp reads each (∼600M reads/individual). The 3’-end of the reads was quality trimmed with Sickle. The sequencing data were processed with the HiCUP pipeline to map the sequencing reads and eliminate experimental artefacts and PCR duplicates22. The BAM files from technical replicates were merged. Promoter interactions were called using the CHiCAGO pipeline23, which calls statistically significant interactions in PCHi-C data while accounting for noise and PCHi-C specific bias. A CHiCAGO score > 5 (soft-thresholded -log weighted p-value) was considered significant. To gain information from all the available data, the BAM files from all 7 individuals were supplied as biological replicates in the analysis with CHiCAGO. Moreover, to increase power, restriction fragments were binned as follows: 10 consecutive RF that were not covered by the baits were binned together; the 3 baited fragments for each promoter were binned with 1 RF upstream and 1 downstream, totaling 5 fragments per promoter. If the bins for two consecutive promoters overlapped, these were binned together into a single larger bin. Publicly available ENCODE ATAC-seq (A549 cell line) and ChIP-seq (A549 cell line ENCFF900GVO) and POLR2A ChIP-seq (A549 cell line, ENCFF737ZKN) data and 18-state ChromHMM from the EpiMap Project (BSS00007) 24 for A549 cell line were downloaded. The PCHi-C interactions of interest and their overlap with ATAC-seq and ChIP-seq peaks, and putative enhancers from the 18-state ChromHMM model were visualised using the Sushi R package.
eQTL catalogue lookup
We queried the eQTL catalogue (https://www.ebi.ac.uk/eqtl/; accessed 6 May 2021) using tabix-0.2.6 to assess if rs75260654, rs116849664 or rs78320984 are eQTLs in studies that utilised the following cell types: lung, T cells, blood, monocytes, neutrophils, NK cells, fibroblasts, B cells, CD4+ T cells, CD8+ T cells, Th17 cells, Th1 cells, Th2 cells, Treg naive, Treg memory, CD16+ monocytes, Cultured fibroblasts, EBV-transformed lymphocytes. We defined nominal significance as p<=0.05.
Variant effect
Variant effect on tissue-specific gene expression, which is based on GTEx eQTL, was retrieved on May 24 from eQTL Ensembl database (https://www.ensembl.org/) and eQTLGene Consortium (https://www.eqtlgen.org/cis-eqtls.html).Using downloaded correlation of variant on tissue specific gene expression from Ensembl, the relative effect of T allele on the ANXA1 expression across 86 tissue types was presented in scatter plot using R version 3.6.1.25 To get information on the functional role of ANXA1, the top 30 interacting proteins and enrichment were retrieved from STRING database26 into cystoscape for visualization.27
Functional mouse experiments
Mice
In accordance with the Animals (scientific procedures) act 1986, all animal experiments were conducted under the approved UK Home Office Project License No: PPL 70/7643, reviewed by Imperial College’s Animal Welfare and Ethical Review body. Female WT BALB/c and annexin A1 knockout (KO) mice were purchased from Charles River (Bicester, UK). Animals aged 6-8 weeks of age received 25ug intranasal instillation of either HDM (Greer Laboratories, Lenoir, NC, USA; Cat: XPB70D3A25), or PBS 3x a week for 3 weeks. Mice were sacrificed 4 hours post-final HDM challenge. Mice were housed under specific pathogen-free conditions and a 12:12 light:dark cycle. Food and water were supplied ad libitum. All animal experiments were completed twice, with N=4-6 per group.
Airway hyperresponsiveness
Airway hyperreactivity was measured using Flexivent™. Lung resistance was measured in response to increasing doses of methacholine (3-100mg/ml, Sigma, Poole, UK, Cat: A2251) in tracheotomised anaesthetised mice using an EMMS system (Electro-Medical Measurement Systems, UK).
Flow Cytometry Analysis
Bronchoalveolar lavage (BAL) was collected. BAL cells were restimulated with ionomycin and phorbol 12-myristate 13-acetate in the presence of brefeldin (Sigma), as previously described28. Specific antibodies for T1/ST2 staining were purchased from Morwell Diagnostics (Zurich, Switzerland). Cells were also stained for lineage negative cocktail, Ly6G, CD45, CD11b, CD11c, SiglecF. Labelled cells were acquired on a BD Fortessa (BD Biosciences, Oxford, UK) and analysed using FlowJo software (Treestar, Ashland, Oregan, USA). Details of antibodies used can be found in the table below.
Analysis of cytokines and chemokines
Murine lung tissue homogenate supernatants were processed as previously described28. Cytokine levels were analysed by ELISA: IL-4, IL-5 (PharMingen, Oxford, UK), IL-13 Ready-Set-Go kits (eBioscience).
Real time-PCR
Total RNA was extracted from murine lung tissue using an RNeasy Mini Kit (Qiagen). Total RNA (1μg) was reverse transcribed into cDNA using a High-Capacity cDNA Reverse Transcription Kit (Life Technologies, UK). Real-time PCR reactions were performed using TaqMan Gene Expression Master Mix and TaqMan Gene Expression probes, annexin A1, and HPRT (Applied Biosystems). Values were normalised to HPRT and gene expression was analysed using the change-in-threshold 2-ΔCT method.
Annexin A1 Immunohistochemistry
Paraffin-embedded mouse lung sections were stained with annexin A1 (R&D Systems, MAB3770). Annexin A1 primary antibody was followed by a secondary detection antibody (donkey anti-goat 488, Thermofisher, A11055). Annexin A1+ cells were quantified by manual counting under microscope and numbers averaged over four fields, from five biological replicates per group.
Statistical analysis
Data are expressed as median ± IQR. Statistical differences between groups were calculated using Mann Whitney U test, unless otherwise specified. p -values are indicated in figures.
Replication of ANXA1 top hits in PIAMA cohort
PIAMA cohort description
PIAMA (Prevention and Incidence of Asthma and Mite Allergy) is an ongoing birth cohort study. Details of the study design have been published previously29, 30. In brief, pregnant women were recruited from the general population through antenatal clinics in the north, west and centre of the Netherlands in 1996-1997. The baseline study population consisted of 3963 new-borns. Questionnaires were completed by the parents during pregnancy when the child was 3 months old, and then annually from 1 up to 8 years; at ages 11, 14 and 17 years, questionnaires were completed by the parents as well as the participants themselves.
LCA wheezing phenotypes
A 6 class LCA model was identified including 3,832 individuals with at least 2 observations of wheeze between 1 and 11-12 years of age. The identified classes were labelled: never/Infrequent (2909,75.91%), pre-school onset remitting (571, 14.90%), mid-childhood school remitting (108, 2.82%), intermediate onset remitting (106, 2.77%), school-age onset persisting (74, 1.93%) and continuous wheeze (64, 1.67%).
ONLINE RESULTS
Participants and descriptive data
Demographic characteristics of the participants in STELAR cohorts included in this analysis are shown in Table E3. Cohorts contain similar proportions of males (range 48%-54%), maternal history of asthma (11%-14%), maternal smoking (14%-23%), (doctor-diagnosed) asthma ever during mid-childhood (16%-24%) and adolescence (20%-30%), current wheeze (12%-20% mid-childhood, 9%-25% adolescence) and current use of asthma medication (12%-17% mid-childhood, 11%-17% adolescence).
Comparison of participants included and excluded from this analysis
In ALSPAC, 11,176 individuals had data on wheezing phenotypes, of these 6,833 were white unrelated and had genetic data. We found more children from mothers who smoked during pregnancy in the excluded sample compared to the included sample; no difference in gender, maternal history of asthma, current wheezing at 8 or 15 years, and small evidence for more asthma ever and current medication at 8 years in the excluded sample (Table E4).
In MAAS, 1150 individuals had data on wheezing phenotypes, of these 887 were white unrelated and had genetic data. We found no difference in children from mothers who smoked during pregnancy in the excluded sample compared to the included sample; no difference in gender, maternal history of asthma or current wheezing at both 8 and 16 years. There was small evidence for more asthma ever and current medication at 8 years in the excluded sample (Table E4).
In SEATON, 1535 individuals had data on joint wheezing phenotypes, of these 548 were white unrelated and had genetic data. We found evidence for more children from mothers who smoked during pregnancy in the excluded sample compared to the included sample; and more males in the excluded sample. There was no difference in maternal history of asthma or current wheezing, asthma ever or current medication at both 10 and 15 years in the excluded sample compared to the included sample (Table E4).
In ASHFORD, 620 individuals had data on joint wheezing phenotypes, of these 348 were white unrelated and had genetic data. We found evidence for more children from mothers who smoked during pregnancy in the excluded sample compared to the included sample; no difference in gender, maternal history of asthma or asthma ever. There was small evidence for less current wheezing at 8 years, or current medication at 8 years in the excluded sample compared to the included sample (Table E4).
In IOW, 1460 individuals had data on joint wheezing phenotypes, of these 952 were white unrelated and had genetic data. We found evidence for more children from mothers who smoked during pregnancy in the excluded sample compared to the included sample; no difference in gender, maternal history of asthma, asthma ever at 10 and 18 years in the excluded sample compared to the included sample. There was small evidence for more children with current wheeze and medication at 8 years in the included sample compared to the included sample (Table E4).
GWAS Meta-Analysis: Main Results
The distribution of the minor allele frequencies was consistent across genotyped datasets (mean SD 0.01). We assessed the deviation of the observed p-values from the null hypothesis by plotting QQ- plots for each wheezing phenotype (Figure E1). Some observed p-values were clearly more significant than expected under the null hypothesis, particularly for early-onset persistent wheeze, without an early separation of the expected from the observed (low evidence of population stratification; genomic control lambda≥0.93 for all four phenotypes).
Results of GWAS meta-analysis for 85 SNPs with main associations across wheeze phenotypes
Previously associated traits for each region/gene associated with different wheezing phenotypes are presented in Table E6.
Persistent Wheeze
We identified two GWAS-significant loci: 17q21, p<5.5×10−9, and a novel locus on 9q21.13 (ANXA1), p<6.7×10−9. The remaining 17 loci (4.0×10−7≤p-values≤9.8×10−6) included regions previously associated with childhood asthma (1q43, 4p16.3, 4q31.21, 5p15.31, 7q22.3, 17q12), asthma and rhinitis (2p25.1), eosinophil count (3q21.3, 10q24.2, 11q23.3, 22q11.1), bronchial hyper-responsiveness (2q12.2), lung function (1q43, 3q21.3, 5q13.3, 15q25.2, 19p13.2), triglycerides measurement and/or glucose metabolism (11q23.3, 19p13.2 and 22q11.1), severe asthma (14q22.1) and severe asthma and insulin resistance (11p15.4).
Early-onset Preschool Remitting Wheeze
Among the regions associated with early-onset preschool remitting wheeze, we identified loci previously associated with smoking (3q26.31, 7q21.11 and 15q21.1), waist circumference and obesity (1q32.3), asthma and/or BMI (5q13.3, 6q27, 7q21.11), allergic disease and atopy (7q21.11) and airway repair (2p24.2 and 9p13.3).
Early-onset Mid-childhood Remitting Wheeze
Loci associated with this phenotype were previously associated with neutrophil counts (1q23.2, 3q29, 20p12.3-p12.2), eosinophil counts and allergic rhinitis (4q24), pollution and DNA methylation (2p22.3), atopy (3p25.3), food allergy (13q31.1) and BMI (3q29, 5q14.1).
Late-onset Wheeze
Regions associated with late-onset wheeze were previously associated with adult-onset non-allergic asthma (3q13.2), asthma/allergic disease and allergy/atopic sensitization (3q23, 7q36.1), asthma and/or allergy in adolescence (9p22.3, 16p12.1), late-onset asthma and obesity (2q13), lung function or body height (2q14.3, 3p24.2, 3q13.2, 15q25.2), lymphocyte count and asthma susceptibility (1p32.2), obesity and/or metabolic syndrome/dysfunction (1p36.13 and 22q13.32), eczema (7q36.3), insulin resistance (10q23.31), type 1 diabetes (8q21.3), alcohol drinking (15q15.3-q21.1) and sex hormone-binding globulin levels (11q23.3).
ONLINE TABLES
SUPPLEMENTARY FIGURES
Footnotes
↵13 STELAR/UNICORN investigators: Professor Graham Devereux, Dr Dimitrios Charalampopoulos
↵14 Breathing Together investigators: Prof Andrew Bush, Prof Sejal Saglani, Prof Benjamin Marsland, Prof Jonathan Grigg, Prof Jurgen Schwarze, Prof Mike Shields, Prof Peter Ghazal, Prof Ultan Power, Dr Ceyda Oksel
This article has an online appendix. This article is dedicated to the memory of our wonderful colleague and friend Professor John Henderson (1958-2019), whose contribution to the understanding of the heterogeneity of childhood asthma cannot be overstated. Rainbow-chasers and UNICORN riders forever.