ABSTRACT
The acronym PHACE stands for the co-occurrence of posterior brain fossa malformations, hemangiomas, arterial anomalies, cardiac defects, and eye abnormalities. The majority of patients have a segmental hemangioma and at least one developmental structural anomaly. The etiology and pathogenesis are unknown. Here we discuss the candidate causative genes identified in a de novo analysis of whole genome sequencing of germline samples from 98 unrelated trios in which the probands had PHACE, all sequenced as part of the Gabriella Miller Kids First Pediatric Research Program. A g:Profiler pathway analysis of the genes with rare, de novo variants suggested dysregulation of the RAS/MAPK and PI3K/AKT pathways that regulate cell growth, migration, and angiogenesis. These findings, along with the developmental anomalies and the vascular birthmark, support including PHACE within the RASopathy family of syndromes.
INTRODUCTION
The RASopathies represent a collection of rare genetic conditions caused by mutations in individual genes within the RAS/MAP kinase (MAPK) pathway. RAS/MAPK signaling is upstream of many cellular pathways; therefore, depending on the individual mutation and affected tissue, single-gene perturbations can cause numerous syndromes, including Cardio-Facio-Cutaneous (CFC), Costello (CS), Legius (LS), Neurofibromatosis type 1 (NF1), Noonan (NS), Noonan-like (NS with Multiple Lentigines, NSML; NS with Loose Anagen Hair, NSLH) and Capillary Malformation-Arteriovenous Malformation (CM-AVM) (Grant et al. 2018; Stevenson et al. 2016). These syndromes share many clinical features including distinct facial features, developmental delays, cardiac defects, growth delays, and structural brain anomalies. While these individual syndromes are rare, as a group the RASopathies are among the most common genetic conditions in the world. The vast majority of mosaic neurocutaneous syndromes have been shown to be caused by somatic variants in RAS pathway genes, including the association of HRAS and KRAS variants with sebaceous nevus syndrome and NRAS variants with neurocutaneous melanosis (Chacon□macho et al. 2019; Kinsler et al. 2013).
The PHACE (OMIM 606519) acronym represents the sporadic vascular syndrome exhibiting the co-occurrence of posterior brain fossa malformations, hemangiomas, arterial anomalies, cardiac defects, and eye abnormalities (Frieden et al. 1996). The majority of patients have a segmental infantile hemangioma and at least one other feature (Garzon et al. 2016; Metry et al. 2009). Dysplasia and stenosis of the arteries of the head and neck is the second most common feature (occurring in >85%), followed by structural brain anomalies (including hypoplastic cerebellum), cardiovascular anomalies (including aortic coarctation or ventricular septal defect), and eye anomalies (Metry et al. 2009; Siegel 2017). The significant overlap in developmental anomaly phenotypes between PHACE and RASopathy patients suggests a common mechanistic underpinning of sporadic PHACE and germline RASopathy variants (Ruggieri et al. 2020).
The etiology and pathogenesis of PHACE are unknown, although consideration of the variety of body systems affected suggests that the age of onset is between 3 and 12 weeks of gestation during early vasculogenesis (Frieden et al. 1996; Metry et al. 2006; Siegel 2017). There is a 6:1 female:male preponderance, but no significant difference in the severity of the phenotype between males and females (Metry et al. 2009; Wan et al. 2017). PHACE is a sporadic disorder; there are no reported familial cases and at least four female patients with PHACE have given birth to clinically unaffected children (Martel et al. 2015; Stefanko et al. 2019). PHACE is believed to have a genetic cause because there is no evidence of a common exposure during gestation or early infancy for children diagnosed with PHACE (Wan et al. 2017). Autosomal dominant inheritance caused by de novo, germline pathogenic variants or mosaicism caused by somatic mosaic pathogenic variants have been suggested as possible modes of inheritance (Mitchell et al. 2012; Siegel 2017).
Here we present the de novo analysis of whole genome sequencing (WGS) of germline samples from 98 unrelated trios in which the probands had PHACE, all sequenced as part of the Gabriella Miller Kids First project (X01HL140519-01). Additionally, we analyzed the DECIPHER database (Firth et al. 2009) for genes affected by copy number variants (CNVs) identified in individuals with the search terms “hemangioma” and “facial hemangioma” and used this analysis to prioritize genes with de novo variants identified among our cohort of 98 patients. Among the rare, de novo variants, we prioritized variants in 10 genes in 18 patients; six of these genes were in the RAS pathway. Our results support including PHACE within the RASopathy family of syndromes.
RESULTS
Germline whole genome sequencing and de novo analysis pipeline
We identified a total of 16,107 de novo single nucleotide variants (SNVs) in our cohort of 98 patients; on average, this is 164 SNVs per patient. Our analysis found 15,752 rare (minor allele frequency [MAF] < 1%), de novo variants, including 219 coding variants (119 missense/nonsense/stoploss/splicing and 100 synonymous) and 15,533 noncoding variants. No de novo variant was found in more than two patients, and no gene had nonsynonymous, de novo coding variants in more than one patient.
Variant prioritization
Among the rare, de novo variants, we prioritized 18 candidate SNVs in 10 genes in 18 patients, and two indels in these same genes (Supplemental Table 1; see Methods for a description of prioritization metrics). Briefly, variants were prioritized if they were rare or novel (population frequency < 1% or 0), affected a gene known to cause a phenotype similar to PHACE in humans or mice, or affected genes disrupted by CNVs in a previous study of PHACE (Siegel et al. 2013).
Genes associated with vascular human phenotypes
Using OMIM annotation, we identified a missense, heterozygous STAMBP–p.Ile301Asn variant in Patient 16. Missense and frameshift homozygous or compound heterozygous variants in this gene cause autosomal recessive microcephaly-capillary malformation syndrome (OMIM 614261), which is characterized by capillary malformation, microcephaly, optic atrophy, cleft palate, tapered fingers, and seizures. STAMBP–p.Ile301Asn is not found in gnomAD and is predicted to be pathogenic by 18 in silico algorithms. Additionally, 88% of non-VUS missense variants in this gene are pathogenic. Patient 16 was heterozygous for the STAMBP–p.Ile301Asn variant and a second candidate variant in this gene was not identified despite careful analysis of the WGS data including a search for deletions or duplications.
Genes associated with vascular mouse phenotypes
We then identified coding and noncoding variants in six genes associated with lethal knockout mouse models due to vascular abnormalities. Two variants, RASA3–p.Val85Met in Patient 4 and THBS2–p.Asp859Asn in Patient 14, were exonic. The RASA3– p.Val85Met variant is predicted to be pathogenic by 11 algorithms (CADD, DEOGEN2, EIGEN, FATHMM-MKL, FATHMM-XF, LRT, Mutation assessor, MutationTaster, PrimateAI, SIFT, and SIFT4G) and to affect splicing. The THBS2–p.Asp859Asn variant is found in the C-terminal type 3 repeats and likely disrupts intraprotein interactions and calcium binding (Figure 1); this variant is predicted to be pathogenic by 19 algorithms (BayesDel-addAF, BayesDel-noAF, CADD, DEOGEN2, EIGEN, EIGEN PC, FATHMM, FATHMM-MKL, FATHMM-XF, LIST-S2, MVP, MetaLR, MetaSVM, Mutation assessor, MutationTaster, PROVEAN, REVEL, SIFT, and SIFT4G). The other variants were noncoding and included two intronic SNVs in BCAS3, three intronic SNVs in DLC1, one intronic SNV in GLRX3, one intronic SNV in PIK3CA, and one intronic SNV in THBS2 (see Supplemental Table 1). Most of the identified variants are novel. Variants in BCAS3, DLC1, GLRX3, and PIK3CA are predicted to affect splicing. Variants in BCAS3, DLC1, PIK3CA, and THBS2 are predicted to be pathogenic by at least one algorithm. The BCAS3 SNV in intron 23 is predicted to fall within a binding site for GATA1 and the THBS2 intronic SNV is predicted to fall in a binding site for IKZF1. All three variants in DLC1 and the variants in GLRX3 and PIK3CA are predicted to be in open chromatin in human vascular endothelial cells (HUVECs) by Encode and therefore accessible to proteins regulating gene expression: four are predicted to be transcribed, and one (DLC1 intron 5) is predicted to be in an enhancer.
Genes previously implicated in CNV analysis
We next looked for variants in genes affected by CNVs identified in a cohort of patients with PHACE described by Siegel et al. (2013) among our cohort. This analysis identified the aforementioned intronic variant in PIK3CA again. We also prioritized two intronic variants in AFF2, an intronic variant in EPHA3, and four intronic variants in EXOC4. All four genes were affected by a CNV in a single patient in the prior study. The EPHA3 intronic variant is predicted to be pathogenic by EIGEN (0.58), to affect splicing, and to fall within binding sites for CTCF, TCF12, and RAD21. All four EXOC4 variants are predicted to affect splicing, and the EXOC4 variant in intron 17 is predicted to be pathogenic by EIGEN (0.05) and GWAVA (0.5) and to fall within binding sites for GATA2 and GATA3.
Indels in prioritized genes
Finally, we investigated our cohort for rare, de novo indels affecting any of the 12 prioritized genes and identified two more variants: a 2-bp intronic deletion in BCAS3 in Patient 20 and an intronic 8-bp insertion in THBS2 in Patient 12, just upstream of the previously identified intronic SNV in THBS2 in the same patient.
DECIPHER analyses
Next, we analyzed published CNVs using DECIPHER to identify previously reported individuals with a phenotypic description containing the term “hemangioma” and CNVs containing our prioritized genes. This analysis identified five individuals in DECIPHER with deletions affecting THBS2; a single individual each with a deletion affecting BCAS3, GLRX3, and RASA3; and a single individual each with a duplication affecting DLC1 and PIK3CA.
We expanded our DECIPHER analysis to identify individuals with “facial hemangioma” and the genes disrupted by their CNVs. Next, we searched for rare, de novo variants in our cohort in the genes disrupted in two or more DECIPHER individuals. THBS2 was only disrupted by deletions in DECIPHER. Another two genes were each disrupted by two CNVs, one deletion and one duplication, in DECIPHER individuals with facial hemangioma. ALDH3A1 has a novel p.Val373Met variant in our Patient 8; this SNV is in the first codon of exon eight, is predicted to affect splicing by Human Splicing Finder (HSF), and is predicted to be pathogenic by CADD (23.3) and FATHMM-MKL (0.48). ATP7B has a p.Thr482Ala variant in our Patient 10; this SNV is predicted to affect splicing by HSF.
We performed an enrichment analysis for the DECIPHER CNVs disrupting ALDH3A1, ATP7B, and THBS2. Only deletions disrupting THBS2 were significantly more common among individuals with facial hemangioma (FET, p=3.7e-2, OR=7.17) or hemangioma (FET, p=3.9e-3, OR=5.24) in DECIPHER than among the individuals without these features in DECIPHER, respectively.
Pathway analysis
Genes containing rare, de novo, non-intergenic SNVs (n=4,320 genes) were analyzed through g:Profiler to look for common pathways disrupted in our cohort (Table 1). The five gene ontology (GO) terms with the most significant p-values were all related to development and morphogenesis. The most significant term, nervous system development (adjusted p=6.22e-24), included our candidate genes AFF2, DLC1, EPHA3, PIK3CA, and THBS2. We also observed an enrichment for cell adhesion (adjusted p=1.34e-10, including BCAS3, DLC1, EPHA3, PIK3CA, and THBS2), which is critical for angiogenesis. The three KEGG terms with the most significant p-values implicated RAP signaling (adjusted p=1.61e-6, including PIK3CA), focal adhesion (adjusted p=2.43e-6, including PIK3CA and THBS2), and phospholipase D signaling (adjusted p=8.13e-6, including PIK3CA).
Expression analysis
We explored the expression of the candidate genes RASA3, AFF2, DLC1, EPHA3, PIK3CA, and THBS2 across diverse cell types in the human developing heart (Figure 2) by incorporating chromatin accessibility (scATAC-seq) (Ameen & Sundaram et. al., unpublished) and gene expression (scRNA-seq) datasets (Asp et al. 2019; Miao et al. 2020; Suryawanshi et al. 2020). We observed that AFF2, EPHA3, PIK3CA, and THBS2 are co-expressed in the vasculature (pre-smooth muscle cells and vascular smooth muscle cells) in the fetal heart, whilst AFF2, EPHA3, and PIK3CA are co-expressed in the endothelium. Furthermore, RASA3, DLC1, and THBS2 are co-expressed in pericytes, the mural cells surrounding blood vessels and embedded within the basement membrane of the vasculature and adjacent to endothelial cells (Armulik et al. 2005), and RASA3 and THBS2 are co-expressed in perivascular fibroblasts, which may serve as pericyte precursors (Rajan et al. 2020). Together, these data suggest that the expression of the candidate genes is enriched in endothelial cells and mural cells (pericytes and vascular smooth muscle cells) in the blood vessel wall.
DISCUSSION
Our whole genome sequencing analyses support a disruption in RAS pathway signaling as the primary pathogenic mechanism behind PHACE. The RAS pathway genes identified in our analysis are associated with abnormal angiogenesis, including changes to cytoskeletal organization, cell migration, and adhesion, in relevant knockout mouse models. DLC1 binds RASA1 (a paralog of RASA3) at focal adhesions, and Dlc1 knockout causes embryonic lethal placenta vascular malformations (Durkin et al. 2005; Yang et al. 2009). EPHA3 regulates cytoskeletal organization, cell-cell adhesion, and cell migration in angiogenesis, and Epha3 knockout causes perinatal lethal cardiac defects (Clifford et al. 2008; Vaidya et al. 2003; Vearing et al. 2005). PIK3CA regulates cell migration and adhesion and has been associated with vascular anomalies, and Pik3ca knockout causes embryonic lethal vascular defects and hemorrhage (Bi et al. 1999; Graupera et al. 2008; Lelievre et al. 2005). RASA3 regulates cell-cell and cell-matrix adhesion and cell migration, and Rasa3 knockout causes embryonic lethal hemorrhage following vascular lumenization defects (Iwashita et al. 2007; Molina-Ortiz et al. 2018). THBS2 binds heparin in the extracellular matrix (ECM), mediates cell-matrix interactions, and inhibits cell adhesion and angiogenesis, and Thbs2 knockout causes increased vascularity and premature death (Bornstein et al. 2000; Kyriakides et al. 1998; Noh et al. 2003; Streit et al. 1999; Yang et al. 2000).
The g:Profiler pathway analysis of the genes with rare, de novo SNVs in these patients also implicated these same candidate genes. KEGG pathway analysis identified RAP signaling, focal adhesion, and phospholipase D (PLD) signaling as the most significantly affected pathways. RASA3 downregulates RAP1 GTPase signaling (Battram et al. 2017; Molina-Ortiz et al. 2018). BCAS3, DLC1, PIK3CA, and THBS2 all participate in the regulation of focal adhesion (Durkin et al. 2005; Elzie and Murphy-Ullrich 2004; Jain et al. 2012; Matsuoka et al. 2012; Yang et al. 2009). PLD is required for actin cytoskeleton remodeling in cell migration and angiogenesis, participates in RAS activation, and is activated by both RAS/MAPK/ERK and PI3K/AKT/mTOR signaling (Bruntz et al. 2014). RASA3 downregulates RAS signaling through MEK/ERK and also binds PI3K and regulates cell migration though PI3K-dependent integrin signaling (Battram et al. 2017; Castellano and Downward 2011). PIK3CA forms part of the PI3K complex and regulates endothelial cell migration during vascular development (Graupera et al. 2008). STAMBP binds GRB2 to regulate both the RAS/MAPK and PI3K/AKT/mTOR pathways (McDonell et al. 2013; Tsang et al. 2006). THBS2 signals through FAK activation of the MAPK/ERK and PI3K/AKT pathways to stimulate actin reorganization, decrease cell adhesion, and increase migration (Goicoechea et al. 2000; Greenwood et al. 1998; Orr et al. 2002). DLC1 and EPHA3 both bind RASA1, a close paralog of RASA3 (Haupaix et al. 2013; Yang et al. 2009); the EPHA3 intronic variant in particular is also in a binding site for TCF12, which downregulates ERK signaling (Yi et al. 2017). DLC1 is also phosphorylated by AKT, which abrogates its tumor suppression activity (Ko et al. 2010). The identification of variants in these genes among the probands with PHACE suggests that defects in matricellular signaling affecting endothelial cell adhesion and migration through the RAS and PI3K pathways may contribute to the pathogenesis of PHACE (Figure 3).
Our analysis of rare, de novo germline variants combined with the DECIPHER analysis of CNVs found in individuals with hemangioma identified THBS2 as a strong candidate causative gene for PHACE. THBS2 had missense and intronic variants in our cohort and was deleted in five individuals with hemangioma in DECIPHER; an enrichment analysis showed that deletions disrupting THBS2 were significantly more common among individuals with hemangioma than in individuals without this feature in DECIPHER. While no human phenotype has been associated with this gene, knockout of this gene in a mouse causes blood vessel overgrowth and premature death (Bornstein et al. 2000; Kyriakides et al. 1998; Yang et al. 2000). THBS2 is highly expressed in arterial tissues (aorta, coronary, and tibial) described in GTEx and in the developing vasculature in our scRNA analysis. THBS2 interacts with the ECM, mediates cell-matrix interactions, inhibits cell adhesion, inhibits angiogenesis, and acts as a tumor suppressor (Bornstein et al. 2000; Kyriakides et al. 1998; Noh et al. 2003; Streit et al. 1999; Yang et al. 2000). The THBS2 residue p.Asp859, which had a missense variant in our cohort, is found in the C-terminal type 3 repeats and participates in calcium binding (Figure 1). This calcium binding is necessary for correct folding of the C terminus, which itself must correctly trimerize in order to support the integrin-binding and cell motility functions of the THBS2 protein (Anilkumar et al. 2002; Carlson et al. 2008; Kvansakul et al. 2004). Therefore, a variant in this position could affect the protein function by disrupting protein folding and trimerization.
Our analysis identified RASA3 as a second strong candidate causative gene for PHACE. RASA3 had a missense variant in our cohort and was deleted in one DECIPHER individual with capillary hemangioma. Germline pathogenic variants in a related gene, RASA1, are known to cause capillary malformation–arteriovenous malformation (Eerola et al. 2003). No human phenotype has been associated with RASA3, but knockout of this gene in a mouse causes hemorrhage and embryonic lethality due to vascular abnormalities (Iwashita et al. 2007; Molina-Ortiz et al. 2018). RASA3 also regulates angiogenesis via the RAS/RAP pathway (Molina-Ortiz et al. 2018), which was implicated in our pathway analysis, and was highly expressed in pericytes in our scRNA analysis.
We also identified seven noncoding variants in four RAS pathway genes (DLC1, EPHA3, PIK3CA, and THBS2). These variants were predicted to affect splicing, functional DNA, transcription factor binding, and/or open chromatin in vascular cells (Table 2). The genes DLC1 and PIK3CA are highly intolerant to loss-of-function coding mutations (probability of loss-of-function intolerance scores > 0.9) (Lek et al. 2016) and have enhanced expression in PHACE-relevant tissues: DLC1 in neurons and retinal amacrine cells and PIK3CA in aorta and coronary artery (Genotype-Tissue Expression (GTEx) Project 2021). Single cell RNA expression analyses also showed that our candidate RAS pathway genes have high expression in neural crest, smooth muscle, and endothelial cells in the developing heart (Figure 2). The predicted disruption of expression and/or splicing of these genes by our identified noncoding variants may therefore contribute to other PHACE features. For instance, an intronic SNV, rs1905014-T, in DLC1 is significantly associated with optic disc size (Han et al. 2019). However, the effects of the noncoding variants on splicing and/or functional DNA need to be investigated further.
In conclusion, we have found strong candidate coding and noncoding variants in genes in the RAS/MAPK and PI3K/AKT pathways in individual patients with PHACE. These variants, paired with the vascular birthmark (infantile hemangioma) and the structural brain, heart, arterial, and eye anomalies in PHACE, support the inclusion of PHACE as a RASopathy syndrome. It is striking that the majority of these variants were predicted to have effects that may lead to altered gene expression during development. However, because of the complex cellular interactions between vascular cell types, further in vivo studies are necessary to confirm pathogenicity of the variants identified and to understand their effect on the pathways that are likely disrupted by these variants.
SUBJECTS AND METHODS
Subjects
Patients with PHACE and their parents were recruited and sequenced as part of the Gabriella Miller Kids First Pediatric Research Program (X01HL140519-01). The study is approved by the Medical College of Wisconsin IRB #PRO00034867 and patients gave written, informed consent. Of the 98 patients discussed here, 19 were male, and 85 were white (Table 3). The average age of the cohort was 5.8 years at the time of enrollment. Ninety patients had facial hemangioma.
Germline whole genome sequencing and de novo analysis pipeline
WGS was performed on germline DNA from 98 unrelated patients with PHACE and their parents. WGS libraries were sequenced on the Illumina NovaSeq 6000 platform using 150-bp paired ends and the NovaSeq 6000 S4 Reagent Kit (Illumina, San Diego, California). Alignment was performed using DRAGEN (v.SW: 01.011.269.3.2.8) (Ma et al. 2015); data conversion from CRAM to BAM files with samtools (v.1.9); data pre-processing with Picard RevertSam, MarkDuplicates, SortSam, and MergeBamAlignment (v2.18.2) (Broad Institute 2019) and base quality recalibration with GATK BQSR (v.4.0.3.0). Germline variant calling and single-sample gVCF generation were performed using GATK HaplotypeCaller (v.3.5-0-g36282e4). Genotype analysis of trios/families, de novo variant discovery, variant quality recalibration, joint variant calling on all samples, and recalculation of genotype call accuracy on the population level were performed with GATK tools (Van der Auwera et al. 2013): HaplotypeCaller (v.3.5-0-g36282e4), VariantAnnotator (v.3.8-0-ge9d806836), VQSR, CalculateGenotypePosteriors, and VariantFiltration (v.4.0.12.0). All outputs were released to the DRC Portal and dbGaP, where the files can be accessed and downloaded (github.com/kids-first/kf-alignment-workflow).
High confidence de novo SNVs and indels were called for variants present only in the proband with GQ scores ≥ 20 for all trio members. Resulting variants were lifted to GRCh37/hg19 using UCSC’s liftOver tool and annotated using Annovar (v2013_09_11) (Wang et al. 2010). The liftover was necessary because the TraP and GWAVA annotations (see Variant prioritization) are not available under hg38.
Variant prioritization
Variants were subset to include SNVs with gnomAD (Karczewski et al. 2020) minor allele frequency (MAF) < 1% (hereafter defined as rare). Next, we prioritized SNVs that:
1) were not present in gnomAD (novel);
2) affected a gene already known to cause a phenotype overlapping that of PHACE in OMIM (omim.org) or GWAS annotation (Buniello et al. 2019);
3) affected a gene where the mouse model phenotype (informatics.jax.org) overlapped that of PHACE;
4) affected genes disrupted by CNVs described by Siegel et al. (2013); or
5) affected genes with coding variants in two or more patients.
Prioritized coding variants were further evaluated for pathogenicity prediction scores using VarSome (Kopanos et al. 2019).
Prioritized noncoding variants were further evaluated for scores designed to predict effects on splicing [TraP (Gelfman et al. 2017), Human Splicing Finder (HSF) (Desmet et al. 2009)]; functional DNA [CADD (Rentzsch et al. 2019), DANN (Quang et al. 2015), GWAVA (Ritchie et al. 2014), EIGEN (Ionita-Laza et al. 2016), FATHMM (Shihab et al. 2015)]; transcription factor binding sites (TFBS) (Euskirchen et al. 2007); and open chromatin in vascular cells in Encode (Ernst et al. 2011).
Finally, we investigated the de novo indels identified in the genes prioritized using metrics 1-5.
DECIPHER analyses
We first selected the pathogenic or likely pathogenic deletions and duplications described in DECIPHER (Firth et al. 2009) in individuals with hemangioma. We compared the genes affected by these CNVs with the genes prioritized in our cohort (described above) to identify other individuals in DECIPHER with hemangiomas and CNVs affecting our prioritized genes.
We then selected all pathogenic or likely pathogenic CNVs described in DECIPHER in individuals with facial hemangioma. The “facial hemangioma” feature was chosen to represent PHACE because it is the most common manifestation of this syndrome. We identified 47 DECIPHER individuals with facial hemangioma, with a total of 55 CNVs (17 gains, 38 losses) disrupting 5,703 genes; 1,191 of these genes were disrupted in at least two DECIPHER individuals. This subset of 1,191 genes was compared to the list of rare, de novo variants identified in our cohort of patients with PHACE to prioritize further candidates.
An enrichment analysis was performed for each gene identified by the second DECIPHER analysis. The DECIPHER pathogenic or likely pathogenic deletions disrupting the gene of interest in individuals with facial hemangioma were classified as “cases”. The DECIPHER pathogenic or likely pathogenic deletions disrupting the same gene in individuals without facial hemangioma were classified as “controls”. For each gene we built a contingency table comparing the number of cases to controls and performed a Fisher’s exact test (FET) using a cutoff of p-value < 0.05 for statistical significance. For significant genes we also calculated an odds ratio.
Sanger sequencing
Prioritized candidate variants were validated by Sanger sequencing. DNA from the proband and each parent was amplified using Accuprime Taq polymerase (Invitrogen; 12339-016) using standard thermal cycling conditions with annealing at 60°C and variant-specific primer sequences (Supplemental Table 2). Small aliquots of each PCR product were run on a 1% agarose gel to verify amplification, and the remainder of each PCR product was cleaned up using a mixture of exonuclease 1 (Affymetrix; 70073X) and shrimp alkaline phosphatase (Affymetrix; 78390) and sequenced using the same primers.
Protein structure analysis
The experimentally derived structure of THBS2 was obtained from RCSB PDB (Berman et al. 2000) structure 1YO8 (Carlson et al. 2005). The effect of the identified p.Asp859Asn variant in THBS2 was examined using DynaMut (Rodrigues et al. 2018), which predicts the effects of missense mutations on known PDB structures.
Pathway analysis
All 4,320 genes with rare, de novo SNVs were analyzed with gProfiler (Raudvere et al. 2019) to determine if they had similar functions or affected the same pathways. Multiple testing-adjusted p-values (using the default g:SCS method in gProfiler) were reported, and 0.05 was used as the cutoff for statistical significance.
Expression analysis
The chromatin accessibility atlas and gene expression data of the human developing heart was derived from nuclei isolated from human fetal hearts (Ameen & Sundaram et al., unpublished). Briefly, the Chromium 10X platform (Satpathy et al. 2019) was used to generate scATAC-seq data from three primary human fetal heart samples at 6, 8, and 19 weeks post-gestation. The scRNA-seq data from fetal hearts were analyzed from published studies (Asp et al. 2019; Miao et al. 2020; Suryawanshi et al. 2020) using the batch correction algorithm Harmony (Korsunsky et al. 2019), and then annotated using known cell-type-specific marker genes.
Data Availability
Datasets related to this article are accessible through the Kids First Data Resource Portal (kidsfirstdrc.org) and/or dbGaP (ncbi.nlm.nih.gov/gap) accession phs001785.v1.p1.
DATA AVAILABILITY
Datasets related to this article are accessible through the Kids First Data Resource Portal (kidsfirstdrc.org) and/or dbGaP (ncbi.nlm.nih.gov/gap) accession phs001785.v1.p1.
CONFLICT OF INTEREST
Francine Blei has received honoraria and non-drug related educational grants from Pierre Fabre. Sarah Chamlin is a speaker for Sanofi Genzyme and Regeneron Pharmaceuticals. Beth Drolet is a consultant and on the clinical advisory board for Venthera. Ilona Frieden serves on the Data Safety monitoring boards for Pfizer, is a consultant and advisory board member for Novartis, and is a consultant for Venthera. Dawn Siegel has received royalties from UpToDate, is a consultant for Arqule/Merck, and received a SID SUN mid-career investigator award.
AUTHOR CONTRIBUTIONS (CREDIT)
Elizabeth S. Partan: Conceptualization, Data curation, Investigation, Visualization, Writing – original draft, Writing – review & editing
Francine Blei: Data curation, Writing – review & editing
Sarah L. Chamlin: Data curation, Writing – review & editing
Olivia M. T. Davies: Data curation, Writing – review & editing
Beth A. Drolet: Data curation, Writing – review & editing
Ilona J. Frieden: Data curation, Writing – review & editing
Ioannis Karakikes: Writing – review & editing
Chien-Wei Lin: Formal analysis, Writing – review & editing
Anthony J. Mancini: Data curation, Writing – review & editing
Denise Metry: Data curation, Writing – review & editing
Anthony Oro: Writing – review & editing
Nicole S. Stefanko: Data curation, Writing – review & editing
Laksshman Sundaram: Investigation, Visualization
Monika Tutaj: Data curation, Formal analysis, Writing – review & editing
Alexander E. Urban: Writing – review & editing
Kevin C. Wang: Writing – review & editing
Xiaowei Zhu: Writing – review & editing
Nara Sobreira: Conceptualization, Funding acquisition, Project administration, Supervision, Writing – review & editing
Dawn H. Siegel: Conceptualization, Data curation, Funding acquisition, Project administration, Supervision, Writing – review & editing
ACKNOWLEDGMENTS
We thank all the families for their participation in this study, and the PHACE Syndrome Community and Pediatric Dermatology Research Alliance (PeDRA) for support of this project. We acknowledge the support of the Children’s Research Institute at Children’s Wisconsin and the Genomic Sciences and Precision Medicine Center at Medical College of Wisconsin.
The results analyzed and shown here are based in part upon data generated by Gabriella Miller Kids First Pediatric Research Program (Kids First) projects phs001785.v1.p1. Kids First was supported by the Common Fund of the Office of the Director of the National Institutes of Health (commonfund.nih.gov/KidsFirst). The Hudson-Alpha Institute for Biotechnology was awarded a U24HD090744-01 to sequence structural birth defect cohort samples submitted by investigators through the Kids First program (1X01HL140519-01).
This study was supported by the NICHD, NHLBI, and NIAMS under award numbers 1X01HL14519, R03HD098526, and R01AR064258, respectively. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Additional support was received from a SID SUN grant, PHACE Foundation Canada, and the Children’s Research Institute.
This study makes use of data generated by the DECIPHER community. A full list of centers who contributed to the generation of the data is available from decipher.sanger.ac.uk and via email from decipher{at}sanger.ac.uk. Funding for the project was provided by Wellcome. The individuals who carried out the original analysis and collection of the data bear no responsibility for the further analysis or interpretation of it in this paper.
Footnotes
Work was performed in Baltimore, MD, USA; Milwaukee, WI, USA; and Palo Alto, CA, USA
Abbreviations used
- SNV
- single nucleotide variant
- CNV
- copy number variant
- ECM
- extracellular matrix