Abstract
Background: The role of copy number variants (CNVs) in susceptibility to asthma is not well understood. This is, in part, due to the difficulty of accurately measuring CNVs in large enough sample sizes to detect associations. The recent availability of whole-exome sequencing (WES) in large biobank studies provides an unprecedented opportunity to study the role of CNVs in asthma.
Methods: We called common CNVs in 49,953 individuals in the first release of UK Biobank WES using ClinCNV software. CNVs were tested for association with asthma in a stage 1 analysis comprising 7,098 asthma cases and 36,578 controls from the first release of sequencing data. Nominally-associated CNVs were then meta-analysed in stage 2 with an additional 17,280 asthma cases and 115,562 controls from the second release of UK Biobank exome sequencing, followed by validation and fine-mapping.
Results: Five of 189 CNVs were associated with asthma in stage 2, including a deletion overlapping the HLA-DQA1 and HLA-DQB1 genes, a duplication of CHROMR/PRKRA, deletions within MUC22 and TAP2, and a duplication in FBRSL1. The HLA-DQA1, HLA-DQB1, MUC22 and TAP2 genes all reside within the human leukocyte antigen (HLA) region on chromosome 6. In silico analyses demonstrated that the deletion overlapping HLA-DQA1 and HLA-DQB1 is likely to be an artefact arising from under-mapping of reads from non-reference HLA haplotypes, and that the CHROMR/PRKRA and FBRSL1 duplications represent presence/absence of pseudogenes within the HLA region. Bayesian fine-mapping of the HLA region suggested that there are two independent asthma association signals. The variants with the largest posterior inclusion probability in the two credible sets were an amino acid change in HLA-DQB1 (glutamine to histidine at residue 253) and a multi-allelic amino acid change in HLA-DRB1 (presence/absence of serine, glycine or leucine at residue 11).
Conclusions: At least two independent loci characterised by amino acid changes in the HLA-DQA1, HLA-DQB1 and HLA-DRB1 genes are likely to account for association of SNPs and CNVs in this region with asthma. The high divergence of haplotypes in the HLA can give rise to spurious CNVs, providing an important, cautionary tale for future large-scale analyses of sequencing data.
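As an illustration of the two-stage design described in the Methods, the sketch below combines a stage 1 and a stage 2 effect estimate for a single CNV by inverse-variance-weighted fixed-effect meta-analysis. This is a minimal sketch under stated assumptions: the abstract does not say which meta-analysis model was used, and the function name `fixed_effect_meta` and the effect sizes and standard errors in the usage example are hypothetical, not values from the study.

```python
import math

def fixed_effect_meta(estimates):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    estimates: list of (beta, se) pairs, where beta is a log odds ratio
    and se is its standard error (one pair per stage).
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each stage is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p

# Hypothetical per-stage estimates for one CNV (not results from the study).
stage1 = (0.12, 0.05)  # assumed log odds ratio and standard error
stage2 = (0.09, 0.03)
beta, se, z, p = fixed_effect_meta([stage1, stage2])
print(f"pooled OR = {math.exp(beta):.3f}, z = {z:.2f}, p = {p:.2e}")
```

Because stage 2 contributes more cases and controls, its standard error would typically be smaller and its estimate would dominate the pooled result, which is why a CNV that is only nominally associated in stage 1 can either strengthen or disappear after meta-analysis.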
Competing Interest Statement
LVW has received research funding (outside of the submitted work) from GSK and Orion Pharma and has undertaken consultancy for Galapagos. All other authors declare that they have no competing interests.
Funding Statement
This work was supported by an Asthma UK Fellowship (AUK-CDA-2019-414) awarded to KAF. The funding body had no role in the design of the study, the collection, analysis, and interpretation of data, or the writing of the manuscript. LVW is supported by a GSK / British Lung Foundation Chair in Respiratory Research (C17-1).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study used anonymised data from UK Biobank, which comprises over 500,000 volunteer participants aged between 40 and 69 years recruited across Great Britain between 2006 and 2010. The protocol and consent were approved by the UK Biobank Research Ethics Committee. Our analysis was conducted under approved UK Biobank data application number 56607.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data (summary statistics) generated or analysed during this study are included in this published article and its supplementary information files. The ClinCNV genotyping will be available to approved UK Biobank researchers as a returned dataset.
List of abbreviations
- CNV: Copy number variant
- WES: Whole-exome sequencing
- SNP: Single nucleotide polymorphism
- IGV: Integrative Genomics Viewer
- HLA: Human Leukocyte Antigen
- T1DGC: Type 1 Diabetes Genetics Consortium