Abstract
One of the first protein polymorphisms identified in humans involves the abundant blood protein haptoglobin. Two exons of the HP gene (encoding haptoglobin) exhibit copy number variation that affects HP protein structure and multimerization. The evolutionary origins and medical relevance of this polymorphism have been uncertain. Here we show that this variation has likely arisen from many recurring deletions, more specifically, reversions of an ancient hominin-specific duplication of these exons. Although this polymorphism has been largely invisible to genome-wide genetic studies thus far, we describe a way to analyze it by imputation from SNP haplotypes and find among 22,288 individuals that these HP exonic deletions associate with reduced LDL and total cholesterol levels. We further show that these deletions, and a SNP that affects HP expression, appear to drive the strong association of cholesterol levels with SNPs near HP. Recurring exonic deletions in HP likely enhance human health by lowering cholesterol levels in the blood.
Acknowledgements
We thank C. Usher for comments on the manuscript and work on the figures. This work was supported by a grant from the National Human Genome Research Institute (R01HG006855 to S.A.M.). The Yerkes Center (grant P51OD011132) provided primate DNA samples. R.M.S. was supported by a US National Institutes of Health/National Heart, Lung, and Blood Institute K99 award (1K99HL122515-01A1) and an advanced postdoctoral fellowship award from the Juvenile Diabetes Research Foundation (JDRF 3-APF-2014-111-A-N). G.M.P. was supported by the National Heart, Lung, and Blood Institute of the US National Institutes of Health under award K01HL125751.
Author information
Contributions
L.M.B., S.A.M. and R.E.H. designed the experiments for understanding HP structural evolution. R.M.S., L.M.B. and G.M.P. performed imputation and association analyses of cholesterol cohorts. L.M.B. performed computational analyses of HapMap and 1000 Genomes Project data, constructed the imputation reference panels and performed all laboratory experiments. L.M.B. and S.A.M. wrote the manuscript. J.N.H. and S.K. provided advice on data analysis. All authors contributed to interpretations of data and to revisions of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 SNPs on opposite sides of the CNV are routinely in high linkage disequilibrium.
The x axis indicates position on chromosome 16, while the y axis indicates the r² correlation for linkage disequilibrium. The linkage disequilibrium value provided for each SNP is the maximum LD shared with a SNP on the opposite side of the CNV. The minor allele frequency for each SNP is indicated by color. (a) European populations (CEU, TSI, IBS). (b) African population (YRI).
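The per-SNP statistic described in this caption (the maximum LD shared with any SNP on the opposite side of the CNV) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes phased biallelic haplotypes coded 0/1, and the function names `r_squared` and `max_cross_cnv_r2`, the input layout and the coordinates are all hypothetical.

```python
from typing import Dict, List


def r_squared(a: List[int], b: List[int]) -> float:
    """Squared correlation (r^2) between two 0/1 allele vectors
    observed on the same set of phased haplotypes."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("allele vectors must be non-empty and of equal length")
    p_a = sum(a) / n                                  # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                  # frequency of allele 1 at SNP b
    p_ab = sum(x * y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                    # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b                              # LD coefficient D
    return d * d / denom


def max_cross_cnv_r2(snps: Dict[int, List[int]],
                     cnv_start: int, cnv_end: int) -> Dict[int, float]:
    """For each SNP (keyed by position), return the highest r^2 it shares
    with any SNP lying on the opposite side of the CNV interval."""
    left = {pos: h for pos, h in snps.items() if pos < cnv_start}
    right = {pos: h for pos, h in snps.items() if pos > cnv_end}
    result: Dict[int, float] = {}
    for pos, hap in left.items():
        result[pos] = max((r_squared(hap, o) for o in right.values()), default=0.0)
    for pos, hap in right.items():
        result[pos] = max((r_squared(hap, o) for o in left.values()), default=0.0)
    return result
```

Calling `max_cross_cnv_r2` on a dictionary of position-keyed allele vectors returns one value per flanking SNP, which corresponds to the quantity plotted on the y axis; SNPs inside the CNV interval are ignored in this sketch.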
Supplementary Figure 2 Sequence differences between human HP subtypes within the structurally variant region.
The x axis lists the base-pair position of each nucleotide variant as aligned to HP2FS-Left (hg19). The y axis indicates the HP subtype and HapMap or 1000 Genomes Project sample that was sequenced. Only polymorphic bases are shown. The lavender sequence indicates bases that result from paralogous gene conversion from HPR. The pink and green bases indicate alternate forms of the highly diverged region, while the white colored bases indicate other derived variants with respect to chimpanzee HP.
Supplementary Figure 3 Sequence differences between alleles, orthologs and paralogs in the HP structurally variant region.
An alignment of the structurally variant region for human structural haplotypes and other primates shows distinct polymorphic sections. The x axis lists the base-pair position of each nucleotide variant as aligned to HP2FS-Left (hg19). The homolog that provided the sequence is indicated on the y axis. Only the fixed differences between human HP subtypes are shown. The bases backed in white indicate mutations specific to human subtypes. The lavender region indicates bases that match human HPR. Human HP1F and HP2FS-Left have 30 derived bases (with respect to chimpanzee and bonobo HP) that match the human HPR gene. Bases in the highly diverged region are indicated in pink and green. The series of mutations that gave rise to the highly diverged region is unclear, as the great apes are also highly polymorphic in this region and multiple bases have potentially deep coalescence. Haptoglobin from the great apes was sequenced with the designed primers (Supplementary Table 14). Human HPR was provided by the hg19 reference sequence. Chimpanzee HPR and HPP are from previously released sequence (GenBank, M84462.1 and M84463.1). The following samples were provided by the Yerkes Center and Coriell Cell Repositories and were used to sequence HP in each respective great ape: PR00107 (gorilla), PR00251 (bonobo), NS03489 (chimpanzee) and NG06209 (orangutan).
Supplementary Figure 4 Assays designed to type HP variant boundaries.
(a) The coordinates for the boundaries of each sequence variant are given in hg19 coordinates. (b) Assays A–E were designed to the boundaries of HP variants. A 0 indicates that the assay sequence is absent and the haplotype will not amplify with the given assay, and a 1 indicates that the sequence is present and will produce amplicons.
Supplementary Figure 5 HP subtypes on different SNP haplotype backgrounds.
This figure is similar to Figure 1, but it includes subtype information in addition to the CNV. The vast majority of HP1F and HP1S alleles are on different SNP haplotype backgrounds. This plot displays the SNP haplotypes on either side of the CNV (10 kb on each side) segregating with HP1 and HP2. Each horizontal line represents an individual SNP haplotype. Note that the size of small clusters has been increased for visibility purposes, and the number of haplotypes contained in each cluster is indicated to the left of the plot. White represents the minor allele and gray indicates the major allele across all populations (CEU, IBS, TSI, YRI). YRI individuals are indicated with lavender bars to the left of the plot, while European populations (CEU, IBS, TSI) are indicated with dark purple bars to the left of the plot. Haplotypes were clustered with the k-means method using upstream SNP haplotypes. Similar SNP haplotypes carrying different structures are indicated with colored outlines (dark pink, light blue, green, gold).
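The k-means clustering of SNP haplotypes mentioned in this caption can be illustrated with a small sketch. The caption does not state the number of clusters, the initialization or the software used, so everything below is an assumption: haplotypes are coded as 0/1 vectors, centroids are seeded deterministically from the first k distinct haplotypes, and the function name `kmeans_haplotypes` is hypothetical.

```python
from typing import List, Sequence


def _sq_dist(h: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between a haplotype and a centroid."""
    return sum((x - y) ** 2 for x, y in zip(h, c))


def kmeans_haplotypes(haps: Sequence[Sequence[int]], k: int,
                      max_iter: int = 100) -> List[int]:
    """Cluster 0/1-coded haplotypes with Lloyd's k-means algorithm.
    Returns one cluster label per haplotype."""
    # Assumed initialization: the first k distinct haplotypes become the centroids.
    centroids: List[List[float]] = []
    for h in haps:
        hf = [float(x) for x in h]
        if hf not in centroids:
            centroids.append(hf)
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer than k distinct haplotypes")

    labels = [-1] * len(haps)
    for _ in range(max_iter):
        # Assignment step: attach each haplotype to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: _sq_dist(h, centroids[c]))
                      for h in haps]
        if new_labels == labels:
            break                                     # assignments are stable: converged
        labels = new_labels
        # Update step: each centroid becomes the per-site mean of its members.
        for c in range(k):
            members = [h for h, lab in zip(haps, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

To mirror the caption, the input would contain only the SNPs upstream of the CNV, and the returned labels would then be used to order the full haplotypes for display.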
Supplementary Figure 6 Model of HP structural evolution.
Our model of HP structural evolution is that a tandem duplication, paralogous gene conversion from HPR, and Form L and Form R of the highly diverged region were ancestral mutations that predate the deletion events that created the HP1S and HP1F structural alleles. However, it is possible that the ancestral duplication, gene conversion, and Form L and Form R arose in a different order from the one shown here. We interpret that the HP2SS structural allele arose relatively recently because it segregates solely on a homogeneous subset of the HP2FS-B SNP haplotype (Fig. 4 and Supplementary Fig. 5).
Supplementary Figure 7 HP1 alleles share derived nucleotides in the CNV region with HP2 alleles from the same SNP haplotype.
Nucleotide differences in the CNV region between HP1S alleles and HP2FS alleles on the same extended SNP haplotype background (as shown in Fig. 4). Derived variants are backed in white.
Supplementary Figure 8 A high-frequency recombinant haplotype increases LD between SNPs and HP2 structure in the YRI population.
The long-range SNP haplotype (100 kb) downstream of the CNV is shown. Minor alleles are shown in white, while major alleles are shown in gray. The minor alleles of SNPs with high LD to the CNV (r² > 0.7) in the YRI population are colored in red. YRI haplotypes are indicated with lavender bars to the right of the plot. HP1S and HP1F SNP haplotypes are highly diverged locally around the CNV in both the European and YRI populations (also see Fig. 2). However, the YRI HP1S and HP1F haplotypes become nearly identical 27 kb downstream of the CNV (these haplotypes are indicated with stars to the right of the plot). This appears to be caused by a recombination event (interpreted location shown in green) on an HP1S haplotype. This recombination event allows HP1F and HP1S to be in high LD with the same alleles. Additionally, whereas this SNP haplotype is relatively common among European HP2FS alleles, it is very rare in YRI HP2FS alleles, allowing the LD between the CNV and these SNPs to be higher in the YRI population (haplotypes indicated with a black arrow).
Supplementary Figure 9 Association results for triglyceride and HDL cholesterol levels to HP structural forms and nearby SNPs.
(a) The association of HP structural forms and regional SNPs with triglyceride levels is not genome-wide significant in our combined cohort (HP2 P value = 2.24 × 10⁻³). (b) No association is observed between HP structural alleles and HDL cholesterol levels (HP2 P value = 0.57).
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–9, Supplementary Tables 1–15 and Supplementary Note. (PDF 3625 kb)
Supplementary Data Set
Reference panels for imputation of HP structural alleles. (ZIP 209 kb)
About this article
Cite this article
Boettger, L., Salem, R., Handsaker, R. et al. Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 48, 359–366 (2016). https://doi.org/10.1038/ng.3510
DOI: https://doi.org/10.1038/ng.3510