Abstract
Patients with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) show specific epigenetic and gene expression signatures of the disease. However, it is unknown whether these signatures in ME/CFS include abnormal levels of the human angiotensin-converting enzymes ACE and ACE2, the latter being the main receptor described for host-cell invasion by SARS-CoV-2. To investigate this question, we first reviewed published case-control genome-wide association studies based on single nucleotide polymorphism data, case-control epigenome-wide association studies based on DNA methylation data, and case-control gene expression studies based on microarray data. From these published studies, we did not find any evidence for a difference between patients with ME/CFS and healthy controls in terms of genetic variation, DNA methylation, and gene expression levels of ACE and ACE2. In line with this evidence, the analysis of a new data set on the ACE/ACE2 gene expression in peripheral blood mononuclear cells did not find any differences between a female cohort of 37 patients and 34 age-matched healthy controls. Future studies should be conducted to extend this investigation to other potential receptors used by SARS-CoV-2. These studies will help researchers and clinicians to better assess the health risk posed by this virus when infecting patients with this debilitating disease.
1. Introduction
On March 11th, 2020, the World Health Organization officially declared the world to be under a fast-spreading pandemic of the Coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). This pandemic came in the aftermath of two past outbreaks of severe acute respiratory infections caused by other human beta coronaviruses: the 2002/2003 Severe Acute Respiratory Syndrome (SARS) pandemic caused by SARS-CoV-1 and the 2012 Middle East Respiratory Syndrome (MERS) caused by MERS-CoV (De Wit et al., 2016). Since these earlier outbreaks, research efforts have been made to identify the molecular receptors by which the diverse coronaviruses are able to invade human host cells. Until now, the strongest candidate receptor is the human angiotensin-converting enzyme 2 (ACE2), whose interaction with the viral spike glycoprotein (S1) mediates viral entry into host cells (Ge et al., 2013; Hoffmann et al., 2020; Li et al., 2003). This enzyme is highly expressed in different organs, including the lungs, heart, kidneys, and skin (Hamming et al., 2004; Li et al., 2020; Radzikowska et al., 2020; To and Lo, 2004). Molecularly, ACE2 counteracts the effect of the angiotensin-converting enzyme (ACE), which results in the control of blood pressure and systemic vascular resistance (Westermeier et al., 2015). Failure to balance the expression of these genes is expected to lead to hypertension and cardiovascular diseases. Perturbation in the ACE/ACE2 ratio has also been hypothesized as key to the development of COVID-19 (Pagliaro and Penna, 2020). In line with these expectations, patients with severe symptoms of COVID-19 tend to show baseline hypertension and other chronic heart conditions (Grasselli et al., 2020; Richardson et al., 2020; Yang et al., 2020). To explain these clinical observations, it was hypothesized that these individuals could be at a higher risk of developing COVID-19 due to an upregulated ACE2 expression (Fang et al., 2020).
ACE2-deficient individuals also seem to be at a higher risk of COVID-19, because viral entry typically induces a downregulation of this enzyme, which ultimately affects its balance with ACE (Milne et al., 2020). In addition, a recent meta-analysis of drugs that raise ACE2 expression indirectly (i.e., ACE inhibitors) found no statistically significant association between these drugs and the COVID-19 mortality rate (Akhtar et al., 2020). Given these disparate lines of evidence, it is important to identify well-defined clinical populations in which ACE2 expression could be impaired. These clinical populations can then be used to investigate the role of this enzyme in SARS-CoV-2 infections.
Patients with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) represent a neglected clinical population due to poor recognition and limited knowledge of the disease among healthcare staff and society (Pheby et al., 2020; Raine et al., 2004). ME/CFS is a chronic disease characterized by unexplained but persistent fatigue, with post-exertional malaise as the hallmark symptom among other clinical manifestations (Carruthers et al., 2003; Fukuda et al., 1994), and it has even been associated with long-lasting COVID-19 symptoms (Komaroff and Bateman, 2020). The etiology of the disease remains largely unknown, but many patients report an infection at their symptoms’ onset (Blomberg et al., 2018; Chu et al., 2019). Patients often show a high prevalence of cardiovascular and endothelial dysfunctions, such as orthostatic intolerance, impaired blood pressure variability and arrhythmia (Frith et al., 2012; Nelson et al., 2019; Newton et al., 2009; Scherbakov et al., 2020; Slomko et al., 2019; Wirth and Scheibenbogen, 2020). These patients also show features of an unbalanced immune system consistent with an autoimmune origin of the disease (Sotzny et al., 2018). This immune perturbation could be a possible explanation for the frequent viral infections or the high rate of flu-like symptoms reported by ME/CFS patients (Lacerda et al., 2019). Interestingly, ACE levels were found to be elevated in about 80% of patients diagnosed with an old case definition of ME/CFS (Lieberman and Bell, 1993). This observation suggested the enzyme as a possible biomarker for the disease. As far as we know, this biomarker potential was not tested in follow-up studies. In turn, little is known about the role of ACE2 in patients with ME/CFS.
In the last two decades, there has been an explosion of high-throughput technologies that allow the identification of a full spectrum of genetic variations, epigenetic changes, and altered gene expression in patients with complex diseases. Such developments motivated the research community to investigate specific genetic, epigenetic and gene expression signatures in patients with ME/CFS (Almenar-Pérez et al., 2019; Dibble et al., 2020; Kerr, 2008). These investigations generated large amounts of data in which the role of ACE and ACE2 could be specifically assessed. The main objective of our paper is therefore to re-analyse these existing data in terms of ACE/ACE2. We focused on studies comparing patients with ME/CFS to healthy controls (case-control study design). To corroborate existing evidence, we also report the ACE/ACE2 gene expression in a new female cohort of patients with ME/CFS and healthy controls.
2. Materials and Methods
2.1 Angiotensin I converting enzymes 1 and 2 (ACE and ACE2)
Human ACE and ACE2 are two homologous enzymes sharing 41% protein identity and 61% sequence similarity (Tipnis et al., 2000). In more detail, ACE is a protein comprising a total of 1306 amino acids (isoform 1) encoded by the ACE gene located in the q23.3 region of chromosome 17 (genomic coordinates: 63,477,061-63,498,380 or 61,562,184-61,599,209 in the reference genomes hg38 and hg19, respectively). ACE2 is instead a protein of 805 amino acids in length encoded by the ACE2 gene located in the p22.2 region of the X chromosome (genomic coordinates: 15,561,033-15,602,148 or 15,579,156-15,620,271 in the reference genomes hg38 and hg19, respectively). This information was used to extract the publicly available data on these genes.
2.2 Diagnosis of ME/CFS
Since there is still no disease-specific biomarker, many case definitions of ME/CFS have been proposed over the years (Brurberg et al., 2014). These case definitions are invariably based on the symptoms reported by suspected patients. To reduce between-study heterogeneity, we only considered data from studies using either the 1994 Centers for Disease Control and Prevention criteria (Fukuda et al., 1994) or the 2003 Canadian Consensus Criteria (Carruthers et al., 2003). These criteria are hereafter denoted as 1994 CDC/Fukuda definition and 2003 CCC, respectively. Note that the choice of these two criteria is in line with the recent recommendations for research given by the European Network on ME/CFS (Pheby et al., 2020).
2.3 Analysis of case-control GWAS
Our analysis focused on three case-control GWAS specifically designed to investigate ME/CFS (Herrera et al., 2018; Schlauch et al., 2016; Smith et al., 2011) (Table 1). There were additional genetic association studies using the UK Biobank data, as reviewed elsewhere (Dibble et al., 2020). However, these studies used data from individuals who self-reported a clinical diagnosis of ME/CFS. Self-reported diagnoses of ME/CFS are not advisable for research purposes and, therefore, these studies were excluded from the present study. Finally, there was an additional GWAS, but this study did not include data from a control group (Perez et al., 2019).
Table 1. Summary of the three GWAS under analysis.
The analysis of these studies started with the identification of the manufacturer and the genotyping platform used. This basic information allowed us to obtain the annotation file of the respective technology. The annotation file typically contained the meta-data of the interrogated single nucleotide polymorphisms (SNPs), their location in the genome, and other relevant information. To determine the SNPs located in the ACE and ACE2 genes as defined above, we searched the annotation files for the following transcript identifiers: ACE - ENST00000290866 (Ensembl identifier) or NM_000789 (National Center for Biotechnology Information (NCBI) Reference Sequence); ACE2 - ENST00000252519 (Ensembl identifier) or NM_021804 (NCBI Reference Sequence).
We have recently made a qualitative assessment of these studies (Grabowska et al., 2020). In particular, we focused on the following quality control checkpoints for GWAS: (i) removal of non-informative SNPs (i.e., monomorphic or with too-low minor allele frequency), (ii) removal of problematic SNPs due to gross deviations from the Hardy-Weinberg equilibrium, (iii) removal of samples with exceptionally high or low heterozygosity rate due to possible sample contamination or potential inbreeding, respectively, and (iv) the use of X chromosome SNPs to check gender and to detect any sample swap (Marees et al., 2018). Since these studies did not make their data publicly available, our analysis was based on the reported lists of SNPs possibly associated with ME/CFS.
2.4 Analysis of case-control EWAS
We focused our analysis on four available case-control EWAS on ME/CFS (Brenu et al., 2014; De Vega et al., 2017, 2014; Trivedi et al., 2018), which were reviewed elsewhere (Almenar-Pérez et al., 2019), and two additional case-control studies published after this review (Helliwell et al., 2020; Herrera et al., 2018) (Table 2). These studies were conducted using Illumina methylation arrays with the exception of a single study which used the reduced representation bisulfite sequencing technology (Helliwell et al., 2020). We conducted a joint analysis of the four array-based studies which had their data available in the NCBI Gene Expression Omnibus (GEO) data repository (Barrett et al., 2013). For the remaining two studies, we extracted the corresponding lists of probes differentially methylated between patients and healthy controls and checked whether these lists contained any probes located in the genes of interest.
Table 2. Summary of the six EWAS under analysis.
With respect to the joint analysis of array-derived data, we first searched for the annotation files associated with the DNA methylation platform used and then identified which probes were located in ACE and ACE2. A quality assessment of the identified probes was done using a known list of putative problematic probes available elsewhere (Chen et al., 2013). Problematic probes were defined as (i) having a high p-value of detection (e.g., >0.05), (ii) being cross-reactive with other genomic regions due to high sequence homology, or (iii) being located in polymorphic sites (e.g., including common SNPs) (Dedeurwaerder et al., 2013). In the last assessment, probes were considered problematic if the polymorphic sites had genetic variants with an allele frequency higher than 0.05 in European or American populations, given that these four studies were conducted in such populations. After this stage of the analysis, all identified probes were considered non-problematic.
We then performed a joint statistical analysis of the data derived from the probes located in the genes of interest. For each data set, methylation signals were given by β-values (defined as the ratio of the methylated signal to the sum of the methylated and non-methylated signals). To obtain a good approximation of the data to the Normal distribution, β-values were converted into the corresponding M-values using the logit transformation (Du et al., 2010). To analyze the data of each CpG probe, we initially fitted a linear regression model with the M-values as the outcome variable and a study indicator variable and disease status as covariates. In this model, we considered the main effects plus the respective interaction terms. This model was then simplified using a backward stepwise procedure. Since the study indicator variable was always statistically significant in data from different probes, we reported the evidence for association of a given probe with ME/CFS by the p-value of a likelihood ratio test. In this test, we compared the model including the study indicator variable only with the best model including that covariate and disease status (i.e., either a model including main effects only or a model including the main effects plus the interaction term). To adjust for multiple testing, we applied the Benjamini-Yekutieli procedure (Benjamini and Yekutieli, 2001) ensuring a false discovery rate of 5%. The decision to use this alternative procedure for dependent tests was based on the observation that the M-values of different probes were positively correlated with each other according to Pearson’s correlation coefficient.
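As an illustration of the two computational steps described above, the following minimal Python sketch converts β-values into M-values and computes Benjamini-Yekutieli adjusted p-values. It is not the code used in this study, which was carried out in R. The function names, the clamping constant eps, and the use of a base-2 logarithm for the logit (the convention we assume from Du et al., 2010) are illustrative assumptions.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta-value in [0, 1] into an M-value (base-2 logit).

    eps is an illustrative clamp (an assumption, not from the study)
    that avoids taking the logarithm of zero at beta = 0 or beta = 1.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(pvalues)
    # Harmonic-sum correction that makes the procedure valid under dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Step from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m * c_m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for three probes; not data from the study.
    print(benjamini_yekutieli([0.01, 0.04, 0.03]))
```

Under this sketch, a probe would be declared associated with ME/CFS when its adjusted p-value falls below 0.05.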
2.5 Analysis of case-control GES
Our analysis was based on a total of eight case-control GES using microarray technology (Table 3). These studies were based on PBMCs (5 studies), whole blood (2 studies) and muscle biopsies (1 study). There were three additional case-control GES based on similar technology; however, these studies used unclear case definitions of ME/CFS (Galbraith et al., 2011; Vernon et al., 2002) or case definitions other than the 1994 CDC/Fukuda definition or the 2003 CCC (Nguyen et al., 2017). In addition, there were four case-control GES using RNA-seq technologies (Bouquet et al., 2019, 2017; Raijmakers et al., 2019; Sweetman et al., 2019). However, they were excluded from further analysis due to the lack of basic quality control checks, such as the percentage of reads that could be mapped onto the reference transcriptome, the percentage of the transcriptome covered, the average number of mapped reads per transcript, and any effect of the GC content on the mapped read distribution, as recommended elsewhere (Conesa et al., 2016).
Table 3. Summary of the 8 array-based GES under analysis.
To analyze the selected microarray-based GES, we first searched for the annotation files associated with the technologies used. Most of these annotation files could be found in the NCBI GEO data repository (Barrett et al., 2013). We then used these annotation files to determine whether the respective microarrays included probes evaluating the expression of ACE/ACE2 genes. We also checked whether each study made its data publicly available or at least reported any differential expression of ACE or ACE2 genes between ME/CFS patients and healthy controls. As a qualitative assessment of the data, we compiled the information on whether each selected study performed any data normalization before inferring any differentially expressed genes.
A re-analysis was performed for studies that conducted any data normalization, had the respective expression data available, or at least reported differential expression of ACE or ACE2 genes between patients and healthy controls. The re-analysis of individual-level data was carried out only for the studies whose expression data sets were publicly available. In studies where the respective data in linear scale did not resemble a Gaussian distribution, we transformed the data by finding the optimal Box-Cox transformation. Since the resulting data now resembled a Gaussian distribution, we calculated the classical t-based 95% confidence interval for the average difference between patients and healthy controls. This confidence interval was then converted back into a linear scale using the inverse of the optimal Box-Cox transformation used for the data. This confidence interval was finally log2-transformed in order to obtain the 95% confidence interval for the mean log2 fold change between patients and healthy controls. In studies in which the only available information was the mean log2 fold change and the p-value associated with a Student’s t-test for testing differentially expressed genes, we determined the associated standard error and then calculated the 95% confidence interval for the mean log2 fold change. Finally, we pooled the different estimates of the mean log2 fold change using the inverse-variance weighting method for meta-analysis (Viechtbauer, 2010).
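The last two steps can be sketched in Python as follows. This is a simplified illustration, not the procedure implemented in this study. It assumes a Normal approximation to the Student’s t distribution when recovering the standard error from a two-sided p-value, and it assumes a fixed-effect model for the inverse-variance pooling. The function names and the numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

# 97.5% quantile of the standard Normal distribution (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def se_from_pvalue(log2_fc, p_value):
    """Approximate standard error from an estimate and a two-sided p-value.

    Assumption: the test statistic is treated as standard Normal rather
    than Student's t. Requires 0 < p_value < 1.
    """
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(log2_fc) / z


def pool_fixed_effect(estimates, std_errors):
    """Fixed-effect inverse-variance pooled estimate with its 95% CI."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, (pooled - Z95 * pooled_se, pooled + Z95 * pooled_se)


if __name__ == "__main__":
    # Hypothetical study-level mean log2 fold changes and standard errors.
    pooled, ci = pool_fixed_effect([0.2, 0.0], [0.1, 0.1])
    print(pooled, ci)
```

Because each estimate is weighted by the inverse of its variance, studies with narrower confidence intervals dominate the pooled value.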
2.6 Analysis of new RNA data on the ACE/ACE2 gene expression in ME/CFS
2.6.1 Study participants
Thirty-seven female patients with ME/CFS were recruited from the outpatient clinic for immunodeficiencies at the Institute for Medical Immunology at the Charité-Universitätsmedizin Berlin, Germany in 2020. Patients with ME/CFS were diagnosed according to the 2003 CCC while excluding other medical or neurological diseases which may cause fatigue (Carruthers et al., 2003). Thirty-four female controls with self-reported healthy status were recruited from staff.
2.6.2 Experimental procedure for RNA isolation and expression
PBMCs from study participants were isolated from heparinized whole blood by density gradient centrifugation using Biocoll Separating Solution (Merck Millipore, Biochrom). Total RNA was extracted from PBMCs (2×10⁶ cells) using the NucleoSpin RNA Kit (Macherey-Nagel, cat. nr. 740955.50) according to the manufacturer’s instructions. Afterwards, cDNA was prepared by reverse transcription (High-Capacity cDNA Reverse Transcription Kit, Applied Biosystems, cat. nr. 4368814) and real-time PCR was performed using TaqMan® Universal PCR Master Mix (cat. nr. 4305719) and TaqMan® Gene Expression Assays (cat. nr. 4331182) for ACE (Hs00174179_m1), ACE2 (Hs01085333_m1) and the housekeeping gene HPRT1 (Hs02800695_m1) (Applied Biosystems). For amplification, 20 ng of template cDNA was used for ACE and HPRT1 and 100 ng for ACE2. All measurements were performed with the ABI7200 and the Step One Plus software as absolute quantification according to the manufacturer’s instructions. Relative gene expression was analysed using the ΔCT method. Note that the expression of ACE2 mRNA could not be quantified for 11 patients due to insufficient cDNA material. Therefore, the analysis of ACE2 gene expression was based on data from 26 patients and 34 healthy controls.
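For readers unfamiliar with the ΔCT method, the calculation can be sketched in Python as follows. This is a generic illustration, assuming a perfect doubling of the product per PCR cycle, with HPRT1 as the reference gene. The function name and the cycle threshold values are hypothetical and do not come from the study data.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene by the delta-CT method.

    Assumes 100% amplification efficiency, i.e. the amount of product
    doubles in every PCR cycle.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical cycle thresholds: target gene at 25, HPRT1 at 22.
    print(relative_expression(25.0, 22.0))  # 2 ** -3 = 0.125
```

A target gene that crosses the threshold three cycles later than the reference gene is therefore estimated at one eighth of the reference level.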
2.6.3 Statistical analysis
We first tested whether the two groups were age-matched using the Kolmogorov-Smirnov test for two independent samples. For statistical convenience, raw gene expression data were independently transformed for ACE and ACE2 using the Box-Cox transformation. The estimates for the exponent of this transformation were 0.303 and 0.225 for ACE and ACE2, respectively. For each gene, a linear regression model was then applied to the resulting transformed data using age and disease status as covariates. The estimated linear regression models were then statistically validated by testing the normality assumption of the residuals using the Shapiro-Wilk test and by visually inspecting the assumption of constant variance of the same residuals as a function of the covariates. The level of significance was set at 5% for this analysis.
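The Box-Cox transformation and its inverse can be sketched in Python as follows, using the exponents reported above. The sketch only applies a given exponent; the estimation of the optimal exponent is not shown. The function names and the example expression value are illustrative assumptions.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x with exponent lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value y back to the original (linear) scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


if __name__ == "__main__":
    LAMBDA_ACE = 0.303   # exponent reported above for ACE
    LAMBDA_ACE2 = 0.225  # exponent reported above for ACE2
    raw = 0.125          # hypothetical relative expression value
    transformed = box_cox(raw, LAMBDA_ACE)
    print(transformed, inverse_box_cox(transformed, LAMBDA_ACE))
```

The inverse function is what allows a confidence interval computed on the transformed scale to be reported on the original scale.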
2.6.4 Ethical Approval
The protocol of the German ME/CFS cohort study was approved by the Ethics Committee of Charité-Universitätsmedizin Berlin in accordance with the 1964 Declaration of Helsinki and its later amendments (reference number EA2/067/20). All patients and healthy controls recruited from staff gave written informed consent.
2.7 Statistical analysis and software
We performed our statistical analysis in the R software version 4.0.3. In this analysis, we used the following Bioconductor packages: hgu133a.db, hgu133plus2.db, IlluminaHumanMethylation450kanno.ilmn12.hg19, and IlluminaHumanMethylationEPICanno.ilm10b2.hg19 to analyze the annotation files of the GeneChip HG-U133A, GeneChip U133+2, Infinium HumanMethylation450K Array and HumanMethylationEPIC arrays, respectively.
3. Results
3.1 Evidence from GWAS
Our analysis focused on three published case-control GWAS (Table 1). These studies recruited patients diagnosed according to the 1994 CDC/Fukuda case definition or the 2003 CCC. Patients and healthy controls were at least matched for age and gender.
The oldest GWAS reported a total of 65 SNPs with a putative association with ME/CFS (Smith et al., 2011). However, this study excluded all SNPs located on the X chromosome and, therefore, did not investigate the association between ME/CFS and rs1514279, which was the only SNP located in the ACE2 gene available in the array used. With respect to the association between ME/CFS and the ACE gene, the array used did not include any SNP located in this gene.
Another study (Schlauch et al., 2016) was based on the Genome-wide SNP Array 6, in which there were 8 and 3 SNPs located in ACE and ACE2, respectively (Supplementary File 1). However, none of the 442 SNPs reported as potential candidates for association with ME/CFS was located in the genes under analysis. Finally, in the most recent GWAS (Herrera et al., 2018), none of the evaluated SNPs reached statistical significance, including 74 and 48 SNPs located in ACE and ACE2, respectively, which were available in the Human Omni 5–4 Array (Supplementary File 1).
In conclusion, there was no evidence for an association between ME/CFS and specific genetic factors in the ACE/ACE2 axis. Since the data from these studies were not made publicly available, we could not perform a re-analysis of the SNPs located in the genes of interest.
3.2 Evidence from EWAS
In the six published EWAS, patients with ME/CFS were diagnosed using the 1994 CDC/Fukuda definition, the 2003 CCC, or both (Table 2). These patients were matched with healthy controls in terms of age, gender, and body mass index (De Vega et al., 2017, 2014; Trivedi et al., 2018) with the exception of two studies where the matching was only based on the first two variables (Brenu et al., 2014; Helliwell et al., 2020). Trivedi et al. (2018) and Herrera et al. (2018) also matched for ethnicity, while the same matching could be assumed for the two other studies (De Vega et al., 2017, 2014) given that these studies only recruited white females. Samples were derived from PBMCs (De Vega et al., 2017, 2014; Trivedi et al., 2018), T lymphocytes (Herrera et al., 2018), and CD4+ T cells (Brenu et al., 2014). Four of the five array-based case-control EWAS used the Infinium HumanMethylation450K Array by Illumina (Brenu et al., 2014; De Vega et al., 2017, 2014; Herrera et al., 2018), which allowed the interrogation of the percentage of DNA methylation in 8 and 5 probes located within ACE and ACE2 coding regions, respectively. A single study was based on data generated from the Methylation EPIC Array (Trivedi et al., 2018), which evaluates the degree of methylation in 13 and 12 probes located in ACE and ACE2 genes, respectively. Finally, the most recent study (Helliwell et al., 2020) used reduced representation bisulfite sequencing. The oldest EWAS (Brenu et al., 2014) did not share its data and, therefore, our analysis was only based on the reported 120 probes whose percentage of methylation was significantly different between patients with ME/CFS and healthy controls. Although these probes were located in 70 known genes, none of them was located in either ACE or ACE2.
The remaining four array-based EWAS (De Vega et al., 2017, 2014; Herrera et al., 2018; Trivedi et al., 2018) made their data publicly available in the NCBI GEO data repository and, therefore, we conducted a joint analysis of the respective data. For this purpose, we focused our analysis on data from the 13 probes available in the Infinium HumanMethylation450K Array which are shared with the Methylation EPIC Array. These probes were not considered problematic in terms of co-hybridization with other genomic regions (Supplementary File 2). Six out of these 13 probes could be mapped onto genomic regions including SNPs within either ACE or ACE2 genes. However, the associated SNPs either were not present in the European and American populations or did not have allele frequencies above 0.05 in these populations (Figure 1A). Therefore, these probes were not considered problematic in this aspect.
Figure 1. Analysis of 13 ACE/ACE2-located CpG probes from 4 EWAS based on methylation arrays with data available in the NCBI GEO data repository. (A) Allele frequency in European and American populations of SNPs associated with 8 and 5 CpG probes located in ACE and ACE2, respectively (see the respective data in Supplementary File 2). (B) Boxplot of all possible pairwise correlations between the 8 and 5 CpG probes shown in A. (C) Adjusted p-values for the association between each probe and ME/CFS. Adjusted p-values were calculated according to the Benjamini-Yekutieli procedure ensuring a false discovery rate of 5% (dashed line). (D) Boxplots of the M-values for the statistically significant cg21881537 probe shown in C in patients with ME/CFS and healthy controls across different studies.
The joint analysis of the data from these four studies revealed that the percentages of methylation of the 13 probes of interest tended to be positively correlated with each other (Figure 1B). This observation led us to choose the Benjamini-Yekutieli procedure for controlling the overall false discovery rate, given that this procedure is particularly adequate for such data. The subsequent association analysis did not identify any differentially methylated CpG probes located in the ACE2 gene (Figure 1C). The only statistically significant result was obtained for a single CpG (cg21881537) located in the ACE gene. This association appeared to be driven by data from one of the studies (Trivedi et al., 2018), where this probe appeared to be hypomethylated in patients with ME/CFS when compared to healthy controls (Figure 1D and Supplementary File 3). This finding suggested a putative increased expression of ACE, at least in patients from this study, given that the degree of methylation and gene expression levels are usually negatively correlated with each other.
Finally, the most recent EWAS was the only study not based on an array technology (Helliwell et al., 2020) and, therefore, our analysis consisted of examining the reported lists of possibly differentially methylated probes. This study reported 76 and 394 differentially methylated probes using two distinct statistical approaches for data analysis. These probes were located in 31 and 121 genes, respectively, none of which was ACE or ACE2 (see additional file 1 from Helliwell et al., 2020).
3.3 Evidence from GES
The eight array-based GES under analysis were conducted in small cohorts of patients with ME/CFS (mean sample size=18.5; range=4-37) and healthy controls (mean sample size=18.6; range=5-50 individuals) (Table 3). In these studies, the patients and healthy controls were matched at least in terms of age and gender. Different commercial and custom microarray technologies were used for the respective gene expression quantification. There was only one study in which the microarray used did not include any probe in the genes of interest (Whistler et al., 2005). Another study used a custom array based on 9,522 genes from the RefSeq database as available in August 2002 (Kaushik et al., 2005). However, it was unclear whether the ACE/ACE2 gene expression could have been quantified, because this study did not make available the list of genes included in the respective array. In terms of data sharing, only two studies made their data available either in the NCBI GEO data repository (Gow et al., 2009) or within the respective publication (Saiki et al., 2008). Although it did not share its data, one study reported a significant association between ME/CFS and ACE2 expression (log2(fold change)=0.190; 95% confidence interval=(0.021;0.359)) (Smith et al., 2011).
We conducted a re-analysis of the two studies in which the corresponding ACE/ACE2 data were made available (Figure 2A). This analysis suggested a significant increase of ACE expression in patients with ME/CFS in data from Saiki et al. (2008) (mean log2(fold change)=0.470; 95% confidence interval=(0.282;0.709)); this study used a custom array that consisted of stress-related genes not including ACE2. In contrast, the data from Gow et al. (2009) did not lead to any statistically significant result. The estimated mean log2 fold changes were -0.01 (95% CI=(−0.09;0.07)) and 0.00 (95% CI=(−0.08;0.07)) for probes 1 and 2 in the ACE gene, respectively, and 0.04 (95% CI=(−0.12;0.19)) and 0.04 (95% CI=(−0.07;0.15)) for probes 1 and 2 in the ACE2 gene, respectively (Figure 2A).
Figure 2. Analysis of ACE/ACE2-related data from eligible microarray-based GES. (A) Boxplots of the data from studies based on microarray technology. (B) Forest plot for the study-specific and pooled estimate of the mean log2 fold change between patients with ME/CFS and healthy controls using data shown in A.
To increase the overall statistical power to detect putative differentially expressed genes, we pooled the estimate from Saiki et al. (2008) with the ones from Gow et al. (2009) for the ACE expression. The resulting pooled estimate for the mean log2 fold change was 0.115 with a 95% CI=(−0.067;0.297) (Figure 2B). The same pooling was done for ACE2 expression but using the estimates from Gow et al. (2009) and the mean log2 fold change reported by Smith et al. (2011). The pooled estimate was 0.074 with a 95% CI=(−0.015;0.163). Finally, the remaining array-based GES did not report any evidence for differentially expressed ACE/ACE2 genes between patients and healthy controls.
3.4 Analysis of ACE/ACE2 gene expression in PBMCs among German participants
To consolidate evidence from previously published studies, ACE/ACE2 gene expression levels were quantified in PBMCs from 37 female patients diagnosed with ME/CFS (mean age = 41.1 years old) and in 34 female healthy individuals (mean age = 37.4 years old) (Table 4). Patients and healthy participants were age-matched (Kolmogorov-Smirnov test, p-value = 0.38). Patients had an average disease duration of 5.4 months (range = 0–24) with four of them without information for this variable.
Table 4. Summary statistics for the ACE/ACE2 gene expression data from the German female study participants.
As expected from PBMC samples, there was a higher mRNA level of ACE than of ACE2 (Table 4, Figure 3A). Further analysis of the transformed expression did not present any significant relation between ACE and ACE2 expression levels (Spearman correlation coefficient = -0.120) (Figure 3B). Finally, linear regression models adjusted for age did not find any significant difference between patients and healthy controls (Table 5).
Figure 3. Analysis of ACE and ACE2 gene expression from the German study. (A) Violin plots of ACE (left side) and ACE2 (right side) mRNA raw data (upper row) and transformed data using the best Box-Cox transformation (lower row). Gray-filled plots represent the cohort of healthy controls and blue-filled plots represent the ME/CFS-diagnosed patients. The best values for the Box-Cox transformation parameter λ are 0.303 and 0.225 for ACE and ACE2 mRNA data, respectively. (B) Scatterplot between ACE and ACE2 gene expression using the Box-Cox-transformed data (Spearman’s correlation coefficient = -0.120).
Table 5. Analysis of the linear regression models for the Box-Cox-transformed ACE and ACE2 mRNA levels.
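The transformation and correlation steps described in this section can be sketched as follows. The code applies a Box-Cox transformation with a fixed λ (using the λ values reported for ACE and ACE2) and computes Spearman's correlation as the Pearson correlation of the ranks. The function names and the raw expression values are hypothetical; the procedure used to select the best λ and the age-adjusted regression models are not reproduced here.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical raw mRNA levels (arbitrary units), not the study data.
ace_raw = [1.2, 3.4, 0.8, 2.1, 5.0]
ace2_raw = [0.05, 0.02, 0.04, 0.09, 0.01]

ace_t = box_cox(ace_raw, 0.303)    # lambda reported for ACE
ace2_t = box_cox(ace2_raw, 0.225)  # lambda reported for ACE2
rho = spearman(ace_t, ace2_t)
print(f"Spearman correlation = {rho:.3f}")
```

Because the Box-Cox transformation is monotone for positive data, Spearman's coefficient is the same on the raw and on the transformed scale; the transformation matters for the violin plots and the linear regression models rather than for the rank correlation.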
4. Discussion
This research revealed scarce data to draw inferences about putative differences between patients with ME/CFS and healthy controls in terms of genetic variation, DNA methylation, and gene expression of ACE and ACE2. This data scarcity is embodied not only in the reduced number of published studies, but also in the respective sample size used within each study, which limited the statistical power of the subsequent data analysis. We attempted to compensate for this data limitation by analyzing a new data set from a cohort of German patients. This analysis suggested that the gene expression of ACE and ACE2 in PBMCs was similar between patients and healthy controls. Data scarcity can be explained by five main reasons. Firstly, there were only a few GWAS, EWAS, and GES available in the ME/CFS literature. This limited number of studies could be related to a poor societal recognition of ME/CFS as a disease, which ultimately limits the funding available for the respective research. Access to limited research funding could also imply an additional difficulty in assembling the multidisciplinary teams required to tackle the various challenging technical aspects of these studies.
Secondly, three published case-control GES based on microarray technology were excluded from this investigation, because they used broad or alternative case definitions of ME/CFS. Given the absence of an objective disease biomarker, the research community should aim to use consensual case definitions for research with the intention of making diagnoses comparable across studies while reducing between-study heterogeneity. In this regard, our requirement for ME/CFS diagnosis was the 1994 CDC/Fukuda definition or the 2003 CCC, according to the recommendation for research given by the European Network on ME/CFS (Pheby et al., 2020).
Thirdly, four RNA-seq studies were not included in our investigation due to unclear data quality. Issues concerning data quality are not exclusive to ME/CFS studies, as highlighted by a comprehensive survey of the analytical steps taken by current RNA-seq studies (Simoneau et al., 2019). In theory, there are several recommended steps for data processing and analysis for these studies (Conesa et al., 2016). In practice, different studies adopt distinct pipelines for data analysis with a possible impact on scientific reproducibility (Simoneau et al., 2019). Again, a way to reduce between-study heterogeneity and to improve the respective data quality is to foster a stronger collaboration between ME/CFS researchers and bioinformaticians who have the technical competence to conduct the correct processing of the data and the subsequent statistical analysis, as suggested for the analysis of GWAS (Grabowska et al., 2020). Notwithstanding the exclusion of these studies from this paper, it is worth noting that none of them reported any difference between patients and healthy controls in terms of ACE/ACE2 expression levels (Bouquet et al., 2019, 2017; Raijmakers et al., 2019; Sweetman et al., 2019). These findings are in agreement with the results obtained from our gene expression analysis for ACE/ACE2 in German ME/CFS patients and healthy controls.
Fourthly, only a few of the published studies made their data publicly available for use in our investigation. This issue was particularly limiting for GWAS, because none of these studies deposited the data in any open-access data repository. Currently, many funders and other science-related stakeholders are supporting the reuse and the long-term maintenance of scientific data generated by publicly funded research (Wilkinson et al., 2016). Above all, a wide data-sharing practice is expected to accelerate scientific knowledge and to boost confidence in findings by allowing other researchers to take a fresh look at the same data. It could also promote collaboration among researchers and make science open to everyone, specifically when it is funded by taxpayers and charities. Data sharing is also essential to cut down the costs of research by sharing resources among the research community. Reducing the costs of research by sharing limited resources is particularly important for the underfunded ME/CFS research field, as discussed above.
Fifthly, the re-analysis of publicly available data was based on small cohorts of patients and healthy controls. In general, a small sample size limits the statistical power to detect any hypothetical differences between patients with ME/CFS and healthy controls. This issue is particularly problematic for GWAS, EWAS, and GES, whose statistical analysis typically involves the execution of thousands of association tests. In the case of GWAS, the number of association tests could even reach several million, as illustrated by Herrera et al. (2018). Therefore, once correction for multiple testing is taken into account in the analysis, the most likely outcome is the identification of relatively few disease associations, as demonstrated by Smith et al. (2011). In the worst-case scenario, correcting for multiple testing in studies with small sample sizes leads to the absence of evidence for any disease association, as reported by different studies (Dibble et al., 2020; Herrera et al., 2018; Johnston et al., 2016). In the most optimistic scenario, we can hypothesize that ACE and ACE2 are both genes whose genetic variation and gene expression profiles are at best moderately associated with ME/CFS. However, this prediction is yet to be confirmed by future studies investigating the specific role of these genes in patients with ME/CFS infected with SARS-CoV-2.
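The interplay between small sample sizes and multiple-testing correction can be made concrete with a rough power calculation. The sketch below assumes a Bonferroni correction and a normal approximation to a two-sided two-sample test; the standardized effect size (0.5), the number of participants per group (35), and the number of tests (20,000) are hypothetical values chosen for illustration and do not correspond to any specific study cited here.

```python
from statistics import NormalDist

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests

def approx_power(effect_size, n_per_group, alpha):
    """Approximate power of a two-sided two-sample z-test.

    effect_size is the standardized mean difference between groups;
    both groups are assumed to have n_per_group participants.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical scenario: moderate effect, about 35 participants per group.
effect_size = 0.5
n_per_group = 35

power_single = approx_power(effect_size, n_per_group, alpha=0.05)

# Hypothetical genome-wide expression study with 20,000 association tests.
alpha_corrected = bonferroni_alpha(0.05, 20_000)
power_corrected = approx_power(effect_size, n_per_group, alpha_corrected)

print(f"power without correction: {power_single:.3f}")
print(f"power with Bonferroni correction: {power_corrected:.4f}")
```

Under these assumptions, the power to detect a moderate effect drops sharply once the significance threshold is divided by the number of tests, which is consistent with the absence of reported associations in small studies.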
Given the high frequency of cardiovascular dysfunctions in patients with ME/CFS (Frith et al., 2012; Nelson et al., 2019; Newton et al., 2009; Scherbakov et al., 2020; Slomko et al., 2019; Wirth and Scheibenbogen, 2020), it is also possible that the available DNA methylation and gene expression data could be biased towards study participants who were taking medication to restore their normal cardiovascular function. In this regard, only one study excluded putative study participants taking any beta blockers or ACE inhibitors (Trivedi et al., 2018). Other studies excluded any potential participants with previous consumption of medications with immunomodulatory effects or with putative effects on epigenetic mechanisms (De Vega et al., 2017, 2014; Herrera et al., 2018), excluded any putative participant taking any regular medication (Gow et al., 2009), or reported that the healthy controls were free from any medication at the time of data collection (Saiki et al., 2008). One study conducted a review of current medications taken by the study participants (Smith et al., 2011). However, it was unclear which medications were considered as exclusionary criteria.
Another cautionary note is that, for experimental convenience, gene expression and DNA methylation data sets were derived from PBMCs and, as such, they may not reflect what occurs in nasal and pulmonary epithelial and endothelial cells, which are the main cellular targets of SARS-CoV-2 (Sungnak et al., 2020). Interestingly, earlier studies on SARS-CoV-1 found the virus within T lymphocytes, macrophages, and monocyte-derived dendritic cells (Tay et al., 2020). In the same line of evidence is the observation of lymphopenia in the blood of patients infected by SARS-CoV-2 (Guan et al., 2020; Qin et al., 2020). It is then possible that SARS-CoV-2 also infects the different immune cell subsets present in the blood. If so, the infection of PBMCs by this virus could open the door to the spread of the infection to different organs. However, it is worth noting that, even if PBMCs are in fact infected by SARS-CoV-2, it is unclear whether the virus uses the same invasion route via interaction with ACE2.
Previously, some authors hypothesized that patients with chronic conditions, such as those with hypertension, diabetes mellitus, or chronic obstructive respiratory disease, could be more susceptible to COVID-19 due to a putative upregulation of the ACE2 gene (Fang et al., 2020). A subsequent study could not confirm this hypothesis by analyzing the expression profiles of ACE2 and other gene targets of SARS-CoV-2 in the lungs of these chronic patients (Milne et al., 2020). However, this study failed to acknowledge a possible effect of the underlying genetic variation associated with ACE2 on the respective results. In fact, a recent study showed a clear continental difference between different human populations based on ACE2 polymorphisms alone (Cao et al., 2020). Therefore, it is conceivable that different human populations could have a natural variation in the SARS-CoV-2 infectivity rate due to specific genetic variations in ACE2 that can increase the binding affinity between ACE2 and the S1 protein encoded by SARS-CoV-2. In line with this view, a bioinformatic analysis suggested that specific ACE2-related SNPs are able to stabilize the interaction between ACE2 and the S1 protein of SARS-CoV-2 (Othman et al., 2020). Given that genetic variation in ACE2 is typically associated with cardiovascular diseases and there is currently no evidence for such genetic association with ME/CFS, we hypothesize that patients with ME/CFS have the same SARS-CoV-2 infectivity rate as any healthy individual on the basis of the ACE2 data alone.
On the other hand, it is known that patients with ME/CFS tend to have perturbations of the immune system with unresponsive natural killer cells upon antigen stimulation (Klimas et al., 1990), defective B and T cell immune responses against the Epstein-Barr virus (Loebel et al., 2014), decreased CD8+ T-cell cytotoxicity and activation (Brenu et al., 2011), and an increased percentage of regulatory T cells (Curriu et al., 2013; Ramos et al., 2016). All of these clinical observations are possible reasons for the frequent and persistent infections reported by some patients with ME/CFS (Lacerda et al., 2019). Given all of these observations, a recent study suggested that the pathology of ME/CFS could be related to a hyper-regulated immune system via regulatory T cells (Sepúlveda et al., 2019). As a corollary of this hypothesis, some patients with ME/CFS could have an increased SARS-CoV-2 infectivity rate not due to any underlying imbalanced expression of ACE2, but rather due to a hypo-responsive (or hyper-regulated) immune system.
It is worth noting that the invasion of host cells by SARS-CoV-2 requires more than the simple interaction of the viral S1 protein with ACE2. Previously, it was found that SARS-CoV-1 interacts with the human transmembrane protease serine 2 (TMPRSS2) for its activation and its role of priming host cells for viral entry (Glowacka et al., 2011; Matsuyama et al., 2010). A similar interaction was hypothesized for SARS-CoV-2 infectivity (Sungnak et al., 2020). In addition, TMPRSS2 is thought to induce SARS-CoV-1 cell entry through endocytosis via a mechanism of ACE2 cleavage, as reviewed elsewhere (Xiao et al., 2020). Similar mechanisms might occur in SARS-CoV-2 infections (Hoffmann et al., 2020). Another reported human protease potentially influencing SARS-CoV-2 infectivity is the A disintegrin and metallopeptidase domain 17 protein (ADAM17), which has an important role as a stress-response signal delivered to the immune system (Düsterhöft et al., 2019). Like TMPRSS2, this protease is also able to cleave ACE2, but with a different end-product (Heurich et al., 2014). As a consequence, the viral invasion seems less efficient in host cells whose ACE2 was preferentially cleaved by this protease than by TMPRSS2 (Heurich et al., 2014). At this moment, there is limited evidence for the role of these two proteases in the pathogenesis of ME/CFS. In this regard, one of the GES was a small gene expression study on different stress-response proteins including ADAM17 (Saiki et al., 2008). These authors did not find any significant difference in the expression of this protease between patients with ME/CFS and healthy controls. In addition, one of the EWAS provided evidence for hypomethylation of one ADAM17-related probe in patients with ME/CFS when compared to healthy controls (Trivedi et al., 2018). Therefore, the analyses conducted here for ACE2 alone could serve as a guideline for future studies on these proteases related to SARS-CoV-2.
Dipeptidyl peptidase-4 (DPP4), also known as lymphocyte cell surface protein CD26, was found to be the main functional receptor for host-cell entry by MERS-CoV (van Doremalen et al., 2014; Widagdo et al., 2019). This molecule is highly expressed in PBMCs including CD4+ and CD8+ T cells (Radzikowska et al., 2020). It is then possible that SARS-CoV-2 is able to infect PBMCs via a route involving DPP4 rather than ACE2. Interestingly, a study reported an increased proportion of natural killer cells and T cells expressing DPP4/CD26 in patients with CFS when compared to healthy controls (Klimas et al., 1990). A follow-up study confirmed this finding but also showed evidence for a decreased number of CD26 molecules in T lymphocytes and natural killer cells of patients with ME/CFS (Fletcher et al., 2010). The same study suggested a decreased level of the soluble form of the molecule in the serum from patients. A similar observation was made in a recent study, but specifically for female patients whose disease was initiated after an infection (Szklarski et al., 2021). Therefore, perturbations of the normal levels of DPP4 would appear to be a hallmark of ME/CFS pathogenesis. If DPP4 is indeed an alternative receptor for immune-cell invasion by SARS-CoV-2, specific research is needed to determine the infectivity rate of PBMCs from patients with ME/CFS. This would allow the susceptibility of these patients to infections by SARS-CoV-2 to be determined.
5. Conclusions
In summary, there is limited evidence for an altered expression of ACE and ACE2 in PBMCs from patients with ME/CFS. At this stage, we cannot rule out the hypothesis that patients and healthy controls alike could have the same infectivity rate of their PBMCs and other target cells by SARS-CoV-2. To investigate this hypothesis, further data should be analyzed, namely, on different human receptors (e.g., CD26) that the virus can use to invade different host cells. In this regard, analyzing samples from the UK ME/CFS biobank (Lacerda et al., 2018, 2017) is a potential research avenue due to its large sample size, extensive clinical characterization of the respective study participants, and robust ethics.
Data Availability
Data sets from epigenome-wide and gene expression studies are publicly available at the NCBI GEO data repository. The data set from the German cohort is available from Prof Carmen Scheibenbogen upon request.
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE59489
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE93266
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE156792
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE111183
Conflict of Interest
The authors declared that they do not have any conflict of interest.
Authors' contributions
NS conceived this research. JM, AF, JCM, and NS performed the literature search. FW and JCM performed a comprehensive literature review about SARS-CoV-2/COVID-19 and their relationship with chronic conditions. FS, SB, HF, and CS were responsible for designing the study conducted in Charité, for recruiting the study participants, for collecting the blood samples, and processing them in the laboratory. JM, AF, CC, and NS performed the statistical analysis. JM, FS, AF, AG, LG, CS, EML, LN, JCM, FW, and NS interpreted and discussed the results. FW created the graphical abstract of the manuscript. NS and JM wrote the paper. All authors have read, revised, and approved the final version of the manuscript.
Funding
JM and AF were fully funded by FCT – Fundação para a Ciência e a Tecnologia, Portugal (ref. grant: SFRH/BD/149758/2019 and SFRH/BD/147629/2019, respectively). NS and CC were partially funded by FCT – Fundação para a Ciência e a Tecnologia, Portugal (ref. grant: UIDB/00006/2020). LN and EML acknowledge the funding from the National Institute of Allergy and Infectious Diseases (NIAID) of the National Institutes of Health (NIH; Award Number: R01AI103629), and from the ME Association (Award number: PF8947) for their studies on ME/CFS. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The funding agencies did not have any role in the study design, data collection, data analysis, interpretation, or writing of the present manuscript.
Footnotes
Declarations of interest: none
Abbreviations
- ACE and ACE2: Human angiotensin-converting enzymes 1 and 2, respectively
- ADAM17: A disintegrin and metallopeptidase domain 17 protein
- CCC: Canadian Consensus Criteria
- 1994 CDC/Fukuda: 1994 Centers for Disease Control and Prevention Criteria
- COVID-19: Coronavirus disease 2019
- DPP4: Dipeptidyl peptidase-4
- EWAS: epigenome-wide association study (or studies)
- GEO: Gene Expression Omnibus
- GES: gene expression study (or studies)
- GWAS: Genome-wide association study (or studies)
- ME/CFS: Myalgic encephalomyelitis/Chronic Fatigue Syndrome
- MERS: Middle East respiratory syndrome
- NCBI: National Center for Biotechnology Information
- PBMC: peripheral blood mononuclear cell
- S1: viral spike glycoprotein
- ROC: receiver operating characteristic
- SARS-CoV-1/-2: severe acute respiratory syndrome coronavirus-1/-2
- SNP: single nucleotide polymorphism
- TMPRSS2: transmembrane protease serine 2