Abstract
Identification of genes associated with nonsyndromic hearing loss is a crucial endeavor, given the substantial number of individuals who remain without a diagnosis after even the most advanced genetic testing. PKHD1L1 was established as necessary for the formation of the cochlear hair-cell stereociliary coat and causes hearing loss in mice and zebrafish when mutated. We sought to determine if biallelic variants in PKHD1L1 also cause hearing loss in humans.
Exome sequencing was performed on DNA of three families segregating autosomal recessive moderate to severe nonsyndromic sensorineural hearing loss. Compound heterozygous missense p.[(Gly129Ser)];p.[(Gly1314Val)], homozygous missense p.(His2479Gln) and nonsense p.(Arg3381Ter) variants were identified in PKHD1L1 that were predicted to be damaging using in silico pathogenicity prediction methods. In vitro functional analysis of two missense variants was performed using purified recombinant PKHD1L1 protein fragments. We then evaluated protein thermodynamic stability with and without the missense variants found in one of the families. In vitro functional assessment indicated that both engineered PKHD1L1 mutant p.(Gly129Ser) and p.(Gly1314Val) constructs significantly reduced the folding and structural stabilities of the expressed protein fragments, providing further evidence to support pathogenicity of these variants. In silico molecular modelling using AlphaFold2 and protein sequence alignment analysis were carried out to further explore potential variant effects on protein folding and stability and exposed key structural features that might suggest PKHD1L1 protein destabilization.
Multiple lines of evidence collectively associate PKHD1L1 with nonsyndromic mild-moderate to severe sensorineural hearing loss. PKHD1L1 testing in individuals with mild-moderate hearing loss may identify further affected families.
Introduction
Hearing loss-associated genes are implicated in the function of all parts of the delicate molecular machinery that permits human hearing. The inner hair cells (IHCs) and outer hair cells (OHCs) of the organ of Corti contain an apical bundle of ∼100 actin-filled protrusions called stereocilia. Upon sound stimulation, stereocilia bundles are deflected by pressure-induced waves within the fluid-filled organ of Corti. Housing the mechanotransduction apparatus at the tips of stereocilia, these bundles mediate the transformation of the mechanical stimulus into an electrical signal the brain interprets as sound. While IHCs convert sound waves into nerve signals, OHCs allow for non-linear amplification of the sound stimuli by changing their length in response to bundle deflection, a process known as electromotility (Brownell 1990). Although IHCs and OHCs have two separate and distinct functions, both sensory cell types require a properly organized, functional stereocilia bundle. Stereocilia have a transiently expressed surface coat, but little is understood about the function or molecular architecture of this surface specialization. To date, there are over 30 genes reported to be critical for stereocilia bundle morphology that are associated with sensorineural hearing loss (SNHL) in humans (Michalski and Petit 2015; Petit and Richardson 2009).
One such stereocilia protein, polycystic kidney and hepatic disease 1-like 1 (Pkhd1l1) is critical to hearing in mice (Wu et al. 2019). The PKHD1L1 gene in humans encodes a 4,243 amino acid protein, which is predicted to be composed by a large extracellular domain, a 20 amino acid transmembrane domain, and a very short cytoplasmic domain of 9 residues. In mice, PKHD1L1 is highly enriched in both IHCs and OHCs, particularly at the tips of OHC stereocilia bundles (Wu et al. 2019). It is hypothesized that this protein makes up a major component of the transient stereociliary coat. Mice lacking Pkdh1l1 displayed elevated Auditory Brainstem Response (ABR) and Distortion Product Otoacoustic Emissions (DPOAE) thresholds in response to pure tone stimuli in a progressive fashion (Wu et al. 2019). More recent data from zebrafish (Danio rerio) with a double knockout of pkhd1l1a and pkhd1l1b (orthologs of human PKHD1L1) show significant deficits in auditory startle response at the larval stage, consistent with an early-onset auditory phenotype in zebrafish (Makrogkikas et al. 2023). Based on these findings in animal models, we sought to determine whether variants in PKHD1L1 cause hearing loss in humans.
In this study, we propose PKHD1L1 as a novel cause of autosomal recessive nonsyndromic hearing loss in humans. We describe three unrelated families with biallelic variants in PKHD1L1 identified via exome sequencing. All three probands present bilateral congenital SNHL which is moderate to severe. In addition, in vitro functional evaluation of two of the missense variants in protein fragments show decreased stabilities, suggesting that they may negatively impact their structures and molecular assembly in vitro.
Methods
Recruitment and Clinical Assessment
This study was approved by the institutional review boards of Boston Children’s Hospital (IRB P-00031494), University Medical Center Göttingen (No. 3/2/16), and the School of Biological Sciences, University of Punjab, Lahore, Pakistan (IRB No. 00005281). Written informed consent was obtained from participating members of the three families or parents for their minor children.
The proband in family 1 derived from non-consanguineous parents ascertained as part of a large clinically diverse cohort of patients with SNHL in the United States. The proband in family 2 derived from consanguineous parents and was ascertained as part of a large ethnically diverse Middle Eastern population rare disease study. The proband in family 3 was born to consanguineous parents and identified from a special education school. The parents did not participate in the study.
Demographic, otolaryngologic, audiological, and relevant medical data were ascertained from the medical records of probands. Affected individuals underwent a complete otologic evaluation. Routine pure-tone audiometry was performed according to current standards and measured hearing thresholds at 0.25, 0.5, 1, 2, 4, 6 and 8 kHz. Pure tone audiometry for proband 3 was performed in ambient noise conditions due to lack of soundproof testing environment. The probands in families 1 and 2 underwent tympanometry and speech audiometry testing. Proband 2 additionally underwent otoacoustic emissions testing.
Exome Sequencing
Genomic DNA (gDNA) from individuals in families 1 (I:1, I:2 and II:1), 2 (III:1, III:2, IV:1 and IV:2) and 3 (II:1) was isolated from either buccal mucosa or whole blood using standard procedures.
Family 1
Exome sequencing was performed in a Clinical Laboratory Improvements Amendments (CLIA)-certified environment (GeneDx, Gaithersburg, MD, USA). Analysis was performed using the DRAGEN pipeline (Illumina, San Diego, CA, USA). Copy number variants were annotated using genome analysis toolkit (GATK) and a normalized segmented depth of coverage model. Variants were prioritized according to allele frequency, in silico predictors, variant segregation, prior observations, and gene-disease association, as previously described (Rockowitz et al. 2020).
Family 2
Exome sequencing for the proband was performed using the SureSelect Target Enrichment v7 (Agilent Technologies, Santa Clara, CA) kit and sequenced with a NovaSeq 6000 (Illumina, San Diego, CA, USA) sequencer following manufacturer’s protocols. Bioinformatics analysis of the proband in family 2 was performed as previously described (Vona et al. 2021).
Family 3
Exome sequencing for the proband was carried out at 3billion, Inc., Seoul, South Korea (https://3billion.io/index). Briefly, coding exon regions of human genes (∼22,000) were captured by xGen Exome Research Panel v2 (Integrated DNA Technologies, Coralville, IA, USA). The captured regions of the genome were sequenced with NovaSeq 6000 (Illumina, San Diego, CA, USA). The raw genome sequencing data analysis, including alignment to the GRCh37/hg19 human reference genome, variant calling and annotation, was conducted with open-source bioinformatics tools including Franklin (https://franklin.genoox.com/clinical-db/home) and 3billion in-house software.
Variant Assessment and Validation
Variants were prioritized based on population and in silico pathogenicity prediction software. Variant minor allele frequencies were derived from gnomAD (v2.1.1 and v3.1.2) (Chen et al. 2022; Karczewski et al. 2020). Various pathogenicity prediction tools were used including SIFT (Ng and Henikoff 2001), PolyPhen-2 (Adzhubei et al. 2010), FATHMM (Shihab et al. 2014), MutationTaster (Schwarz et al. 2014), REVEL (Ioannidis et al. 2016) and CADD (Rentzsch et al. 2019). Visualization of amino acid substitution tolerance was supported by the MetaDome web server (Wiel et al. 2019).
Variants were annotated using the PHKD1L1 NM_177531.6 accession (ENST00000378402.9). The GTEx Portal (GTEx consortium 2013) was referenced for assessing the location of variants across the annotated PHKD1L1 isoform (Supplementary Fig. S1).
PKHD1L1 variant segregation in families 1 and 2 was confirmed using Sanger sequencing. Confirmation via Sanger sequencing was not performed for the proband of family 3.
Sequence analyses and structural modeling of PKHD1L1 protein
We compared the PKHD1L1 isoform sequence among 10 different species (see Supplementary Table S1 for more details about the selected species). The sequences were obtained from the National Center for Biotechnology Information (NCBI) protein database (Supplementary Table S1). First, each individual protein sequence was used to predict their signal peptides and domains using the Simple Modular Architecture Research Tool (SMART) (Letunic et al. 2021). Signal peptides were further predicted using the SignalP-5.0 (Almagro Armenteros et al. 2019) and the Prediction of Signal Peptides (PrediSi) online servers (Hiller et al. 2004). AlphaFold2 modelling was used to predict the potential signal peptide cleavage site and accurately inform the start and end of each predicted domain before carrying out the protein sequence alignment (Mirdita et al. 2022). Since the PKHD1L1 IPT domains do not have a clear conservation pattern at their IPT protein start and end sequence and connecting linker domains, AlphaFold2 modeling results were then combined with protein sequence alignment to better predict the signal peptide, IPT domain start and end residue positions, and the location of missense mutations.
The ClustalW algorithm (Larkin et al. 2007) on Geneious (Kearse et al. 2012) was used for the sequence identity analysis using the default parameters. Alignment files from Geneious were imported and color-coded in JalView with 35% conservation threshold, as previously described (Kearse et al. 2012). AlphaFold2 simulations of PKHD1L1 fragments were carried out using the Colabfold v1.5.2-patch server using default parameters (Mirdita et al. 2022).
Cloning, expression, and purification of engineered bacterially expressed PKHD1L1 protein fragments and mutant constructs
Wild-type (WT) Mus musculus (Mm) PKHD1L1 IPT1-3 and IPT5-6 were subcloned into the NdeI and XhoI sites of the pET21a+ vector. Site-directed mutagenesis was applied to engineer the Mm PKHD1L1 IPT1-3 p.(Gly129Ser) and PKHD1L1 IPT5-6 p.(Gly1314Val) constructs using the QuickChange lightning kit (Agilent Technologies). All constructs were used for protein expression in Rosetta (DE3) competent cells (Novagen) and cultured in TB as reported previously (De-la-Torre et al. 2018). Expressed proteins were refolded at 4 °C using conditions outlined below. WT Mm PKHD1L1 IPT1-3 and IPT1-3 p.(Gly129Ser) were refolded by fast or drop-wise dilution as previously reported for other protein families (De-la-Torre et al. 2018): 20 mL of pure denatured protein (0.5-1 mg/mL) was dropped into 480 mL of refolding buffer containing 20 mM TrisHCl [pH 8.0], 150 mM KCl, 50 mM NaCl, 2 mM CaCl2, 400 mM L-arginine, and 2 mM D(+) glucose. Mm WT PKHD1L1 IPT5-6 and IPT5-6 p.(Gly1314Val) were refolded by dialysis of 40 mL of eluted denatured protein at 0.5 mg/mL into 1000 mL of refolding buffer (20 mM TrisHCl [pH 7.5], 150 mM KCl, 50 mM NaCl, 5 mM CaCl2, 400 mM L-arginine, 1 mM of glutathione oxidized). Proteins were concentrated using 10,000 Da Amicon Ultra-15 centrifugal filters and purified using size exclusion chromatography with an Akta Purifier System with the S200 16/600 pg and S200 13/300 increase GL columns (GE Healthcare) in a buffer containing 20 mM Tris-HCl pH 7.5, 150 mM KCl, 50 mM NaCl, and 5 mM CaCl2 to preserve the most abundant endolymphatic cations. Purity of the recombinant and mammalian proteins was analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
Nanoscale Differential Scanning Fluorimetry
WT Mm PKHD1L1 IPT1-3 and IPT5-6 protein fragments and their respective missense IPT1-3 p.(Gly129Ser) and IPT5-6 p.(Gly1314Val) proteins were used for functional evaluation in vitro. Thermodynamic evaluation and folding stabilities of these constructs in solution were carried out using nano differential scanning fluorimetry (NanoDSF). Pure proteins were concentrated to 0.5-1 mg/mL for NanoDSF using a Prometheus NT.48 (Nanotemper) and scanned between 20-95 °C with a pre-stabilization phase of 1 min and a temperature slope of 2 °C/min (37 min in total). Data were processed using a PR.ThermControl v2.1.2 software and plotted using GraphPad Prism. At least two biological replicates were used for each experiment. Each protein preparation was independently expressed and refolded at least two times (two biological replicates), and evaluated independently on NanoDSF. Each NanoDSF scan was run in parallel using at least four separate capillary tubes for each biological replicate. Each result per biological replicate represents average values of these measurements. Protein folding analysis results were plotted as a relationship of the normalized F350/F330 (%) ratio intensities (Tonset). The first derivative of F350/F330 (%) intensities were plotted to obtain the thermal unfolding transition midpoints (Tm).
Results
Clinical Genetics
Family 1 (Fig. 1)
The proband is a 10-15-year-old female born to healthy nonconsanguineous parents. She referred bilaterally on newborn hearing screening. Pure tone audiometry has been performed approximately every 6 months and consistently demonstrated a slowly progressive mild to moderate SNHL bilaterally. Between the ages of 0-5 years and 10-15 years, there is a decrease in pure tone average (PTA) of 5 dB for the right ear and 8 dB for the left ear. PTA was 41.25 and 38.75 in the right and left ears, respectively at age 0-5. Most recent audiometric testing showed PTA of 45 and 48.75 in the right and left ears, respectively. Speech audiometry at most recent evaluation (age 10-15 years) demonstrates a 90%-word recognition at a comfortable listening level. Speech recognition threshold (SRT) is 45 dB bilaterally. There is a history of monthly episodes of benign paroxysmal positional vertigo (BPPV) which resolved after Epley maneuver. Imaging studies of the inner ear were not performed. An electrocardiogram and ophthalmology exam were normal. Cytomegalovirus testing was negative. There were no dysmorphic facial features, neurological or developmental abnormalities, or other pertinent history. Exome sequencing revealed candidate variants only in PKHD1L1. Other variants were excluded based on a combination of in silico predictors, allele frequency, and prior clinical observation. The proband was compound heterozygous for missense variants, p.(Gly129Ser) and p.(Gly1314Val) (Table 1) that segregated within the trio in a Mendelian recessive manner (residue numbering corresponds to the NCBI human (Hs) sequence including the signal peptide, Supplementary Table S1).
The p.(Gly129Ser) substitution resides in exon 4 of 78 in PKHD1L1 and has a maximum allele frequency (MAF) of 0.001471% in gnomAD (v3.1.2). This variant is predicted to be deleterious to protein structure and function via in silico predictors (Table 1). This nonpolar glycine to polar serine substitution occurs at the tip of the PKHD1L1 IPT1 domain (Fig. 2b-d). This locus is predicted to be somewhat tolerant to missense substitution (Supplementary Fig. S2). On the other hand, the p.(Gly1314Val), located in exon 32 of 78 in PKHD1L1, has a MAF of 0.07204% (gnomAD, v3.1.2) and is also predicted to be deleterious to protein structure and function by in silico predictors (Table 1). This glycine to valine substitution occurs within the PKHD1L1 IPT6 domain region (Fig. 2b-d). This locus is predicted to be intolerant to missense substitution (Supplementary Fig. S2).
Family 2 (Fig. 1)
The proband is a 6-10-year-old male born to healthy consanguineous parents. Congenital SNHL was suspected and later clinically diagnosed. It has progressed to a bilateral moderate to severe degree. Pure tone audiometry shows moderate to severe SNHL in all frequencies. Speech audiometry understanding is 100% at a comfortable listening level, and the otoacoustic emissions test is the reference on both ears. His SRT is 60 dB and his speech discrimination score (SDS) is 100% at the intensity level of 80 dB. The proband currently uses hearing aids bilaterally. There have been no vestibular or movement abnormalities. Exome sequencing revealed that the proband was homozygous for the p.(Arg3381Ter) nonsense variant that resided in a ∼28 Mb run of homozygosity (Table 1). All other variants were excluded on the basis of segregation (Supplementary Table S2). Sanger sequencing at this locus confirmed the homozygous variant and revealed that the parents were both heterozygous carriers of the PKHD1L1 variant.
The p.(Arg3381Ter) variant is located in exon 62 of 78 and identified with a MAF of 0.01936% in population databases (gnomAD, v3.1.2). This variant occurs in a region with parallel beta-helix (Pbh1) repeats (Fig. 2b and Supplementary Fig. S3 for residue conservation), introducing a premature stop codon that is predicted to result in the loss of approximately 20% of the transcript (∼ 882 amino acids), including those encoding the transmembrane domain, and could potentially cause nonsense mediated decay. However, if expressed in a truncated form, lack of the transmembrane domain is likely to impair proper localization of PKHD1L1 in the cell membrane or might induce unconventional secretion of PKHD1L1 protein fragments.
Family 3 (Fig. 1)
The proband was a 10-15-year-old male with congenital SNHL born to healthy consanguineous parents. Audiometric testing demonstrated a bilateral severe SNHL. No further follow up was possible for the affected individual. Exome data analysis revealed two homozygous missense variants of interest; one in PKHD1L1 and one in MYO7A (NM_000260.4:c.1123C>G, p.(Leu375Val)). The homozygous variant in MYO7A was deprioritized given uncertain and neutral in silico predictions with respect to impact on protein structure and function (Supplementary Table S2).
The p.(His2479Gln) substitution is located in exon 49 of 78 and is identified at a MAF of 0.3107% in population databases (gnomAD, v3.1.2). This positively charged histidine to neutral glutamine substitution is located in a topological region with an unpredicted domain structure when using the SMART prediction tool (Fig. 2b).
Investigating the conservation of the mutated residue positions throughout evolution
All three variants do not appear to cluster in any particular region of the PKHD1L1 isoform that was used for alignment and further analysis (Supplementary Fig. S1). In comparing the longest PKHD1L1 isoform sequences among 10 different species, we uncovered an overall amino acid sequence identity of 79.2 % (Supplementary Fig. S3) across all 10 species. Notably, Mm and Hs PKHD1L1 share 81.8% of amino acid sequence identity (when comparing for identical sites excluding the signal peptides), suggesting high protein sequence conservation between the two species. Although some previous studies report protein sequence alignments and predictions of PKHD1L1 IPT domains (Hogan et al. 2003), an in-depth analysis was necessary to more accurately predict the signal peptide cleavage sites, as well as the starting and ending residues for each IPT domain. The location of the native Gly126, Gly1314, and His2479 residues, where the reported missense variants were detected, are highly conserved across a diverse set of the 10 different PKHD1L1 orthologues analyzed (Fig. 2d-e, Fig. 3a, and Supplementary Fig. S3 and Fig. S4).
AlphaFold2 modeling of PKHD1L1 substitutions
PKHD1L1 has 14 predicted IPT extracellular-domain repeats of similar fold but with non-identical protein sequence and labeled as IPT1 to IPT14 from its N-terminal to C-terminal end, and other key domain features (Fig. 2b). The AlphaFold2 model of Hs WT IPT1-2 and its mutant p.(Gly129Ser) shows no apparent differences between their predicted structures (Fig. 2f-g), likely because the small side chain carrying this residue is located on a loop region exposed to the solvent. More specifically, the p.(Gly129Ser) variant is located within the connecting loop between the β-strand 6-7 of IPT1, close to a potential disulphide bond formed by p.Cys100 and p.Cys86, also found in plexin-like domains (Fig. 2g). Changes of the polarity or the electrostatic potential of this loop by p.(Gly129Ser) might cause structural changes or altered loop dynamics in IPT1 (Krieger et al. 2005). We also generated AlphaFold2 models for Hs IPT5-6 consistent with the expected IPT plexin-like folding for this structure (Fig. 2h-i). According to this structural model, the Hs p.(Gly1314Val) mutation is also located at the connecting loop between the β-strand 6-7 of IPT6 (Fig. 2h-j). Furthermore, the p.(Gly1314Val) variant is located within the connecting area between IPT5 and IPT6, and the AlphaFold2 model suggests a structural change for this specific fragment (Fig. 2i-j).
In a previous study, authors used protein sequence analysis of PKHD1, PKHD1L1 and TMEM2 reporting that PKHD1 and PKHD1L1 share two regions of significant sequence homology with TMEM2 (Hogan et al. 2003). AlphaFold2 modelling of the third variant, p.(His2479Gln) and surrounding PKHD1L1 region, revealed a high structural homology with a Hs TMEM2 protein (Fig. 3). We identified that this region features a conserved (throughout 10 different PKHD1L1 species, and Supplementary Fig. S4) p.His2479 residue (p.His552 in Hs TMEM2, Fig. 3a; PDB: 8C6I (Niu et al. 2023)) reported to form a nickel-finger binding site, which might mediate catalytic functions in TMEM2. Disruption of this site in PKHD1L1 and TMEM2 might impair cation binding (Fig. 3a-f and Supplementary Fig. S3 and S4) and suggests a potential deleterious effect for this variant on protein structure and function. This locus is predicted neutral in terms of tolerance to missense substitution (Supplementary Fig. S2).
Functional testing of the p.(Gly129Ser) and p.(Gly1314Val) substitutions
Next, we cloned, expressed, and purified WT Mm PKHD1L1 IPT1-3 and IPT5-6 protein fragments as well as the respective IPT1-3 p.(Gly129Ser) and IPT5-6 p.(Gly1314Val) mutant constructs using size-exclusion chromatography (SEC) (Supplementary Fig. S5). These protein constructs represent key regions of the complete extracellular domain of PKHD1L1 where these mutations might locally affect the structural assembly of the protein. The thermodynamic and folding stabilities were measured using NanoDSF to assess the protein stabilities in solution for WT PKHD1L1 protein fragments and compared to fragments carrying missense mutations (Fig. 4 and Supplementary Fig. S5). For WT IPT1-3, the Tonset (melting temperature at which unfolding begins) was measured at a maximum of 58 °C, while the Tonset for IPT1-3 p.(Gly129Ser) variant was 52 °C (a 6 °C decrease, Fig. 4 and Supplementary Fig. S5). In addition, there was a decrease on the Tm (melting temperature or point at which 50% of the protein is unfolded) of ∼ 4 °C comparing different thermal transition points between WT and the IPT1-3 p.(Gly129Ser) variant (Fig. 4a). These measurements strongly suggest that the p.(Gly129Ser) variant affects PKHD1L1 protein stability within this region.
Similarly, NanoDSF measurements for WT Mm PKHD1L1 IPT5-6 and Mm IPT5-6 p.(Gly1314Val) showed a Tonset of 35.48 °C and 28.56 °C, respectively (∼ 7 °C decrease, Fig. 4 and Supplementary Fig. S5). Additionally, WT Mm PKHD1L1 IPT5-6 showed a melting temperature Tm of 45.10 °C, while the mutant IPT5-6 p.(Gly1314Val) displayed a decrease on this Tm to 36.6 °C (decreasing the temperature ∼ 8.6 °C) (Fig. 4b). These results confirm that both IPT1-3 p.(Gly129Ser) and IPT5-6 p.(Gly1314Val) variants indeed significantly decrease the thermal and folding stabilities of these PKHD1L1 protein constructs.
Discussion
A majority of congenital SNHL is attributable to a genetic etiology, and clinical genetic testing for known SNHL genes is an established standard of care in the diagnostic evaluation of bilateral SNHL in pediatric patients (Shearer and Smith 2015; Smith et al. 2005). There are over 120 known genetic causes of nonsyndromic hearing loss, and gene panel tests are recommended to facilitate accurate and timely genetic diagnosis of SNHL (https://hereditaryhearingloss.org). However, despite advances in genetic testing for SNHL, the diagnostic yield for SNHL ranges from 22.5% to 55.7% with an average of about 40% (Downie et al. 2020; Perry et al. 2023; Rouse et al. 2022; Sloan-Heggen et al. 2016). The identification of novel hearing loss genes is critical to improving diagnostic rates, thus impacting care and management for individuals with SNHL.
PKHD1L1 is predominantly expressed in the OHC stereocilia by P0 to P12 with a basal-to-apical (decreasing) expression gradient and is a major component of the stereocilia surface coat (Wu et al. 2019). Pkhd1l1-deficient mice lack the surface coat at the stereocilia tips and exhibit progressive SNHL by ABR and DPOAE measurements, starting as early as 3 weeks. The two functional hypotheses of PKHD1L1 expression at the stereocilia include that it may be required for the correct localization of other stereocilia proteins or during the development of attachment crowns at the stereocilia to secure the tectorial membrane to the bundle. This immature attachment could manifest as a persisting relaxed tectorial membrane coupling (Wu et al. 2019). Still, it is unknown whether PKHD1L1 is the only component of the stereocilia coat. Recently, pkhd1l1 was shown to play a critical role in regulating hearing in zebrafish (Makrogkikas et al. 2023). pkhdl1l has a ubiquitous expression pattern and is sustained for most of embryonic development (Makrogkikas et al. 2023). Through depletion of the two paralogous genes (pkhd1l1a and pkhd1l1b), double mutant zebrafish exhibited significant hearing loss from the larval stage (6 days post fertilization) which differs compared to progressive hearing loss in the mouse (Wu et al. 2019).
The degree of hearing impairment in the patients we present is fairly broad: The proband in family 1, with p.[(Gly129Ser)];p.[(Gly1314Val)] compound heterozygous variants, was diagnosed with congenital hearing impairment that remains mild to moderate at the age of 10-15 years; the proband in family 2, with a homozygous p.(Arg3381Ter) variant, already showed moderate to severe SNHL at the age of 6-10 years; and the proband in family 3, at the age of 10-15 years, showed severe hearing impairment attributed to the homozygous p.(His2479Gln) variant. While we have identified individuals in three families with variants in PKHD1L1, this study highlights the necessity for an extended case series with longitudinal audiological follow up and functional studies to assess variant effects of patient-specific perturbations on development, maturation, and function of the auditory system, as well as explore the potential of accelerated effects of age, noise, or trauma on progression, which remain as current major limitations. Interestingly, the PKHD1L1 gene has been suggested to be associated with adult-onset hearing loss (Lewis et al. 2023). Since the studied variants are also located in different residue positions in the PKHD1L1 protein sequence, the broad range of hearing impairment from these patients might suggest that these variants differentially impact the protein expression, folding, and/or the stability and function of PKHD1L1.
Thus, we also investigated the conservation of the mutated residue positions throughout evolution. Multiple sequence alignments of the complete PKHD1L1 amino acid sequence from 10 different species were analysed and found to be highly conserved. This suggests that these native residues might be critical to protein folding and assembly. Therefore, variants in these positions might disrupt protein function and potentially cause hearing impairment in vivo.
To further predict how these PKHD1L1 mutant variants might affect the PKHD1L1 protein at the structural level, we modelled the structures of the individual domains carrying reported variants (Fig. 2f-i and Fig. 3b-e). The p.(Gly129Ser) substitution in IPT1 was not predicted to exert an apparent structural difference. We speculate that instead changes of the polarity or the electrostatic potential of the β-strand linker loop by p.(Gly129Ser) might alter loop dynamics in IPT1. Interestingly, both glycine substitutions p.(Gly129Ser) and p.(Gly1314Val) are located within the same connecting loop between β-strand 6-7 in IPT1 and IPT6, respectively. While the AlphaFold2 model of the p.(Gly129Ser) mutant shows no apparent structural changes in the predicted structure (Fig. 2f), the AlphaFold2 model of p.(Gly1314Val) shows a conformation change, likely due to steric hindrance on the structure (Fig. 2i). Finally, we also mapped the location of the p.(His2479Gln) using structural modeling by AlphaFold2 (Fig. 3, Supplementary Fig. 4). Our results indicate that the His2479 position (from 10 different species) is 100% conserved with a Hs TMEM2 protein (PDB: 8C6I), a regulator of the hyaluronan metabolism (Fig. 3a and 3e-f) (Sato et al. 2023). Interestingly, experiments suggest that TMEM2 activity is calcium-dependent (Yamamoto et al. 2017) and TMEM2 has been previously studied for its structural similarities with the CEMIP deafness gene candidate (Yoshida et al. 2013).
Since the p.(Arg3381Ter) introduces a stop codon that would prevent the translation and expression of the transmembrane domain of PKHD1L1 (Fig. 2b), the proper localization of PKHD1L1 in the cell membrane is expected to be affected. Because the helical peptide of PKHD1L1 transmembrane domain is predicted between p.Ile4211-4231Ala (850 amino acids downstream from the Arg3381 site from the PKHD1L1 N-terminal towards the C-terminal end), this large portion of the protein will not be expressed (Fig. 2b-c, Supplementary Fig. S3). This is likely to impair the proper folding, trafficking, and insertion of PKHD1L1 in the stereocilia-plasma membrane, or even result in secretion of PKHD1L1 extracellular fragments that could progressively affect hearing function. Interestingly, secreted versions of extracellular PKHD1L1 have been found in supernatant solution from platelet cells (Maynard et al. 2007) and their soluble concentrations could be modulated by protease inhibitors (Fong et al. 2011), suggesting potential cleavage sites in Hs PKHD1L1. However, the role of these cleaved extracellular PKHD1L1 protein fragments is still unknown.
Our NanoDSF thermal folding analysis showed that both Tonset and Tm values decreased in the presence of the p.(Gly129Ser) and p.(Gly1314Val) variants. The Tm measurements using NanoDSF showed p.(Gly129Ser), located in a loop, decreases the stability of IPT1 and further showed how this variant propagates its destabilizing effects to the neighboring IPT2 and IPT3 (Fig. 2b, Fig. 4a, and Supplementary Fig. S5). Likely, the p.(Gly1314Val) variant also alters the stability of the loop and the chemical environment in the IPT5-IPT6 connection, since the measured folding stability showed an 8.6 °C decrease of unfolding temperature between WT IPT5-6 fragment and the p.(Gly1314Val) variant (Fig. 4). This is the first study showing strong evidence to support how missense variants locally affect the structural folding and stability of PKHD1L1 fragments in vitro. Given the high conservation rate of 81.8% in amino acid sequence identity between the Hs and Mm PKHD1L1, and the 100% conservation of the mutated sites across the species analyzed, we believe our findings using Mm PKHD1L1 protein fragments can be directly applied to Hs PKHD1L1. Future functional studies involving full-length protein purification would allow for better understanding of various effects such variants might have on the stability of the entire PKHD1L1 extracellular domain, its protein expression and proper localization, which might be linked to the progression and hearing loss severity. Furthermore, studies focused on uncovering the influence of mutations on the structure of the complete PKHD1L1 extracellular domain will help in understanding the role of PKHD1L1 in hearing function and beyond, since the PKHD1L1 has been suggested as a tumor suppressor (Yang et al. 2023) and a human cancer biomarker (Kafita et al. 2023; Wang et al. 2023; Zheng et al. 2019).
Additional syndromic involvement was excluded in our three patients. However, in addition to hearing impairment, disruption of PKHD1L1 has also been associated with increased susceptibility to pentylenetetrazol-induced seizures in mice indicating a possible role in maintenance of neuronal excitability in the central nervous system (Yu et al. 2023). PKHD1L1 is expressed in the hippocampus and cerebral cortex in adult WT mice. Knockdown of PHKD1L1 using PHKD1L1-shRNA or PHKD1L1-shRNA-AAV increased susceptibility of seizures as indicated by increased epileptiform bursting activity in cultured hippocampal neurons and pentylenetetrazol-induced seizures of mice following knockdown, suggesting a role for PKHD1L1 in the maintenance of normal excitation-inhibition balance (Yu et al. 2023). Although these findings have not been validated in Pkhd1l1-knockout mice, an open question remains as to whether similar pleiotropic effects paralleling, for example, the various phenotypes caused by pathogenic variants in TBC1D24 may also occur as more PKHD1L1 patients are discovered. The gene-phenotype relationship of TBC1D24 is extensive and includes both autosomal dominant (DFNA65, OMIM 616044) and recessive (DFNB86, OMIM 614617) nonsyndromic deafness, as well as familial infantile myoclonic epilepsy (FIME, OMIM 605021) with or without deafness, early infantile epileptic encephalopathy (EIEE16, OMIM 615338), progressive myoclonic epilepsy (PME, OMIM 310370), or deafness, onychodystrophy, osteodystrophy, impaired intellectual development, and seizures syndrome (DOORS, OMIM 220500) (as reviewed by Mucha et al. 1993; Rehman et al. 2017). Further reporting of variants in PKHD1L1 will be critical to discovering a potential parallel between PKHD1L1 and the complex syndromes using the TBC1D24 analogy.
Finally, we conclude with a cautionary note for diagnostic laboratories, as this work serves as an extended invitation to include PKHD1L1 screening in their future patient diagnostic screening. Determination of pathogenicity of genes causing hearing loss relies on multiple lines of supporting evidence. However, in the case of large genes with large amounts of natural variation, like PKHD1L1, the allele frequency criteria, specifically the PM2_Supporting pathogenicity criteria, becomes a point of contention for classification. The PM2_Supporting criterion as proposed by the ACMG/AMP hearing loss expert panel specification have calculated a ≤0.00007 frequency for autosomal recessive disorders to trigger application (Oza et al. 2018). With MAFs of 0.0007204 in the non-European Finnish and 0.003107 in the South Asian populations (gnomAD v3.1.2), the c.3941G>T, p.(Gly1314Val) and c.7437C>A, p.(His2479Gln) variants exceeded the ACMG/AMP criteria with hearing loss expert panel specification, respectively, and suggests the potential for missed diagnoses if the proposed frequency is applied for variant filtering (Oza et al. 2018). This merits attention because the PM2 criterion is most commonly applied among the criteria for hearing loss variant classification (Kim et al. 2022). Given the large size of PKHD1L1 and that more missense variants are observed than expected (Z = −0.72, o/e = 1.04 (1.02 – 1.08)) minor allele frequency threshold usage should be carefully assessed for this gene. This study is a call to clinical laboratories to include careful screening of PKHD1L1 biallelic variants in patients with a hearing loss ranging from mild-moderate to severe.
Conclusion
Here we provide data to support PKHD1L1 as a cause of human nonsyndromic autosomal recessive congenital, mild-moderate to severe hearing loss. We demonstrated that all reported missense variants are of residues highly conserved throughout evolution, suggesting that the native residues are key for protein folding and function, while variants in these sites affect the thermal-folding stability of PKHD1L1 fragments in solution. Inclusion as a hearing loss gene is supported by multiple families segregating plausible variants, in vitro functional data confirming the impact of variants, as well as previously published mouse and zebrafish models demonstrating hearing loss. Further research will be needed to determine the effect of age, noise, or trauma on progression of hearing loss.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
Statements and Declarations
Funding
This work was supported by NIDCD K08 DC019716 to AES, the De Garay Family Fund to MAK with support from the Boston Children’s Rare Disease Cohort Initiative, and the German Research Foundation (DFG) VO 2138/7-1 grant 469177153 to BV, and by NIH R01DC017166 (NIDCD) and R01DC020190 (NIDCD) to AAI. Funding for work in Pakistan was provided by the University of the Punjab (SN).
Author contributions
SER and PD contributed equally as shared first authors and AAI, AES and BV contributed equally as shared last authors. AES, PD, and AAI conceived the study. PD, TM, and AAI designed methodology. PD and TM contributed software. MZ, PD, and AAI performed formal analyses that were validated by MZ, MK, PD, TM, and AAI. MZ, HK, MK, SN, MAK, PD, TM, AAI, and BV conducted the investigation process. HG, GS, MAK, AES, SER, and PD secured patient and experimental resources. MZ, HG, GS, BV, SER, AES, PD, and AAI performed data curation. BV, AES, PD, and AAI wrote the original draft. MZ, HK, MK, SN, HG, BV, SER, AES, PD, and AAI contributed to reviewing and editing. BV, AES, SER, PD, and AAI prepared figures. SN, AES, PD, and AAI provided supervision. BV, AES, and AAI oversaw project administration. All authors reviewed and approved of the manuscript.
Data Availability
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
Compliance with ethical standards
Conflict of interest
Go Hun Seo is an employee of 3Billion, Inc. The other authors have no conflicts to declare.
Ethics approval
This study was approved by the institutional review boards of Boston Children’s Hospital (IRB P-00031494), University Medical Center Göttingen (No. 3/2/16), and the School of Biological Sciences, University of Punjab, Lahore, Pakistan (IRB No. 00005281).
Consent to participate
Written informed consent was obtained from all participants.
Consent to publish
Consent to publish was obtained from all participants.
Acknowledgments
We thank the families for their participation.