Abstract
MTHFR is a pivotal enzyme in the folic acid cycle. Two functional SNPs, rs1801133 (677C/T) and rs1801131 (1298A/C), which affect the function of MTHFR, are associated with different cancers. In the present study, these SNPs were investigated in breast cancer patients from the Pakistani population. The pilot study included 187 participants: 124 breast cancer patients and 63 medically confirmed healthy individuals as controls. PCR-RFLP methods validated by Sanger sequencing were used for the polymorphic investigations. Here, we report significant and unique associations of these polymorphisms with breast cancers in a South-Asian population for the first time in the literature. The case-control analysis showed that, for the 1298A/C polymorphism, the homozygous C genotype had a significant protective effect in the recessive [CC vs AA+AC; OR: 0.320 (95% CI: 0.259 – 0.397)] and co-dominant multiplicative [CC vs AA; OR: 0.379 (95% CI: 0.273 – 0.527)] models. For 677C/T, no significant association was observed with the risk of breast cancers. However, the homozygous T genotype was more frequent in patients in the advanced age group (>35 years) than in the young age group (≤35 years), i.e. 6.7% vs 0%. Thus, carriers of this genotype are less likely to develop breast cancers at a younger age. The combined genotype analysis at the two loci revealed that 677CC+1298AC [OR: 2.688 (95% CI: 1.247-5.795)] and 677CT+1298AA [OR: 20.91 (95% CI: 1.156-378.2)] significantly increased the risk of breast cancers. The latter association (677T*1298A) was also observed in a semi-parametric haplotype analysis (p-value: 0.03). Interestingly, despite their proximity, these loci were in linkage equilibrium (r2 = 0.029 and 0.049 in cases and controls, respectively). The study implies translational potential of these polymorphisms for breast cancer management in the studied population.
1 INTRODUCTION
Methylene Tetrahydrofolate Reductase (MTHFR) (UniProtKB-P42898) is a 77kDa key enzyme of the folate pathway. It converts 5,10-methylene tetrahydrofolate (5,10 MTHF) to 5-methyl tetrahydrofolate (5 MTHF) (Chittiboyina, Chen, Chiorean, Kamendulis, & Hocevar, 2018). The latter is a co-substrate in the re-methylation of homocysteine to methionine. The folate cycle is involved in DNA synthesis, repair and methylation, as well as detoxification pathways (Kawakita et al., 2017). Genomic instability and aberrant DNA methylation are two critical hallmarks of cancer(s) (Flavahan, Gaskell, & Bernstein, 2017; Macheret & Halazonetis, 2015). Thus, the folate pathway provides protection against neoplastic transformation and progression (Friso, Udali, De Santis, & Choi, 2017; Stover, James, Krook, & Garza, 2018).
Decreased MTHFR activity has been associated with gastrointestinal stromal tumour, neural tube defects, folate sensitivity, MTHFR-deficiency, schizophrenia, as well as dosage and toxicity response to adriamycin and cyclophosphamide (M. J. Landrum et al., 2016; OMIM #235250). The gene for MTHFR is located on chromosome 1p36.3. It is one of the ten most investigated genes globally, due to its potential in clinical applications (Dolgin, 2017). Two functional Single Nucleotide Polymorphisms (SNPs), 677C/T (rs1801133) and 1298A/C (rs1801131), encode thermolabile isoforms of the MTHFR, affecting its enzymatic activity. The SNP 677C/T results in the substitution of alanine with valine (Ala222Val). In the 1298A/C variant, glutamic acid is replaced with alanine (Glu429Ala) (Nefic, Mackic-Djurovic, & Eminovic, 2018). These SNPs result in decreased enzymatic activity: almost 60% reduced activity as compared to the wild type (D’Angelo et al., 2011).
In cancer cells, the thermolabile isoform of the MTHFR leads to the mis-incorporation of uracil instead of thymine, with consequent DNA damage (Rai, 2014). Additionally, reduced enzymatic activity affects DNA methylation and hence dysregulates gene expression (Friso et al., 2002).
The SNPs are located within a few kilobases of each other and, due to the short distance, are expected to be in a high degree of linkage disequilibrium (LD) (Balding, 2006; Bodmer & Bodmer, 1978). However, several discrepancies in other closely-related genomic regions have been reported (Schaid, Chen, & Larson, 2018). Therefore, the quantitative value of LD among proximate genetic variations needs to be assessed separately for each population. This is essential to map the population-specific contribution of closely situated polymorphisms to the disease phenotype.
Data regarding the role of these functional SNPs in breast cancers remain inconclusive in different populations (McEwen, 2016; Naushad et al., 2016). In developing countries, breast cancer is the leading cause of cancer-related deaths among females (Torre, Siegel, Ward, & Jemal, 2015). In Pakistan, the age-specific incidence and mortality rates of breast cancer are among the highest in Asian countries and globally, respectively. Given this background, the present study is a population-focused systematic attempt to investigate the role of these functional MTHFR SNPs in breast cancers in the Pakistani population. The analysis includes the individual and combined contributions of these two loci, as well as the evaluation of LD, in the risk and pathology of breast cancers in this population.
2 METHODOLOGY
2.1 Ethics Statement
The study was conducted in accordance with the Declaration of Helsinki (WMA, 2018). The project was approved by the ethical review committees (ERCs) of the participating institutions: the independent ERC, International Center for Chemical and Biological Sciences (ICCBS), University of Karachi, Karachi, Pakistan [ICCBS/IEC-016-BS/HT-2016/Protocol/1.0], and the Atomic Energy Medical Centre (AEMC), Jinnah Postgraduate Medical Centre (JPMC), Karachi, Pakistan [Admin-3(257)/2016]. All the samples were collected after obtaining written informed consent from each participant. The ethical approval from controls has been submitted and published earlier (Ajaz S, et al. 2012).
2.2 Study Participants
The pilot study used a case-control design and included 187 participants. The cases comprised 124 diagnosed and histologically confirmed primary breast cancer patients. The patients were either first-time visitors or under regular treatment at AEMC, JPMC, Karachi, Pakistan during the period July 2016 – July 2017. The exclusion criterion for patients was the lack of a biopsy report for breast cancer. As controls, 63 samples from medically confirmed healthy individuals were included (Ajaz S et al., 2012). The controls were matched on the basis of age, gender and ethnicity. The exclusion criteria for controls were: age- or gender-based mismatch, previous diagnosis of any cancer, or co-morbidity with any other chronic disease. The participants belonged to Southern Pakistan.
2.3 Breast Cancer Data
At the time of sampling, after obtaining written informed consent, participants’ relevant information was obtained through a questionnaire, including age, ethnicity, place of residence, contact number, family history of cancers, age at menarche, obstetrics and gynaecology history, and, if applicable, the age at menopause. Clinical characteristics, including tumour histology, size, grade, stage, axillary lymph node metastasis, ER status, PR status, and Her-2 status, were collected from the patients’ medical records.
2.4 DNA Extraction and Genotyping
DNA samples were extracted using standard phenol-chloroform method with slight modifications (Sambrook & Russell, 2001). Subsequently, target DNA sequences were amplified for the identification of selected genetic polymorphisms in the MTHFR. The restriction sites for two polymorphisms are shown in Figure 1.
2.4.1 MTHFR 677C/T
For 677C/T, a DNA fragment of 198bp was amplified. The PCR mix contained 1X (NH4)2SO4 PCR buffer, 0.4mM MgCl2, 0.2mM dNTPs, 0.4U Taq polymerase, 0.35µM of each primer (Forward primer: 5’-TGAAGGAGAAGGTGTCTGCGGGA-3’; Reverse primer: 5’-AGGACGGTGCGGTGAGAGTG-3’) and 130ng of DNA template in a final reaction volume of 25µl. Cycling conditions have already been published (Ajaz et al., 2012). Amplification was carried out at an annealing temperature (Ta) of 64°C. 20µl of the amplified product was digested overnight with 10U of HinfI and 3.5µl of the recommended buffer (Thermo Scientific®, USA). The digested products were run on a 10% polyacrylamide gel, stained with ethidium bromide, and observed under UV.
2.4.2 MTHFR 1298A/C
For 1298A/C, a DNA fragment of 168bp was amplified. The PCR mix of 25µl contained 1X (NH4)2SO4-based PCR buffer with 0.8mM of MgCl2, 0.25mM of dNTPs, 2U of Taq polymerase, 1.25µM of each primer (Forward primer: 5’-CTTTGGGGAGCTGAAGGACTACTA-3’; Reverse primer: 5’-CACTTTGTGACCATTCCGGTTTG-3’) and 70ng of DNA template. Amplification was carried out at a Ta of 60°C. Cycling conditions were the same as for 677C/T. 20µl of the amplified product was digested overnight with 2.5U of MboII and 2µl of the recommended buffer (Thermo Scientific®, USA). The digested products were run on a 10% polyacrylamide gel, stained with ethidium bromide, and observed under UV. Results for both polymorphisms were validated by Sanger sequencing.
Representative gels for genotyping of the MTHFR 677C/T and 1298A/C are shown in Figures 2 and 3, respectively. The validation of each methodology by Sanger sequencing is shown in the inset.
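The RFLP read-out described in Sections 2.4.1 and 2.4.2 relies on an allele creating or abolishing a restriction site, so that genotypes differ in the fragment lengths seen on the gel. The following is a minimal sketch of that logic for HinfI, whose recognition sequence is 5'-G^ANTC-3'. The function name and the two 14-nucleotide sequences are hypothetical toy inputs, not the 198bp amplicon used in the study, and only the top strand is scanned.

```python
import re

# HinfI recognises 5'-GANTC-3' and cuts after the first G.
# The lookahead lets overlapping sites be found as well.
HINFI_SITE = re.compile(r"(?=GA[ACGT]TC)")

def hinfi_fragments(seq):
    """Return top-strand fragment lengths after a complete HinfI digest."""
    seq = seq.upper()
    cuts = [m.start() + 1 for m in HINFI_SITE.finditer(seq)]  # cut after the G
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

# Toy sequences (assumed for illustration only): one base differs.
uncut_allele = "ACGTGAGCCGATTT"   # no GANTC site, stays intact
cut_allele = "ACGTGAGTCGATTT"     # contains GAGTC, is cleaved

print(hinfi_fragments(uncut_allele))
print(hinfi_fragments(cut_allele))
```

A heterozygous sample would show the union of both banding patterns, which is how the three genotypes are told apart on the gel.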
2.5 Bioinformatics Analyses
To predict the impact of the missense substitutions on protein function, bioinformatic analyses were performed. The software tools used included MutPred2 (Balding, 2006), Provean (Gaunt, Rodríguez, & Day, 2007), and Polyphen2 (Adzhubei IA, 2010). The variant correlation with disease-related phenotypes was investigated using ClinVar (Landrum MJ et al., 2015), and the correlation with cancer was investigated using the CIViC database (Griffith M et al., 2017).
2.6 Statistical Analyses
Statistical analyses were carried out using IBM SPSS® v.21.0 software (Arbuckle, 2012). The genotype and allele frequencies were determined by the gene counting method, along with the weighted percentages. Genotype distributions among the cases and controls were tested for Hardy-Weinberg equilibrium using the Chi-squared test. The associations between qualitative variables, such as clinico-pathological characteristics including tumour stage and grade, were also assessed by Pearson’s Chi-squared test. The Cochran-Armitage trend test was used to assess association where Hardy-Weinberg disequilibrium was observed (Balding, 2006). To measure the allelic, genotypic and haplotype risks for breast cancers, odds ratios (OR) with 95% confidence intervals (95% CI) were calculated using logistic regression. For all analyses, a p-value <0.05 was considered significant.
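The Hardy-Weinberg check described above can be sketched as follows. This is a minimal illustration of the standard one-degree-of-freedom goodness-of-fit test built on gene counting, not the SPSS procedure used in the study; the function name and the genotype counts in the usage lines are hypothetical.

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 d.f.) for Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote at a biallelic locus.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # allele frequency by gene counting
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = erfc(sqrt(chi2 / 2))       # upper tail of chi-squared, 1 d.f.
    return chi2, p_value

# Hypothetical counts: a sample in exact Hardy-Weinberg proportions,
# and one with no heterozygotes at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

With small expected counts, an exact test is usually preferred over the chi-squared approximation; that refinement is omitted here for brevity.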
2.7 Haplotype Estimation and LD Statistics
Estimation of haplotype frequencies and LD analysis were carried out using the cubeX webtool (Gaunt et al., 2007). Haplotype frequencies were calculated by maximum likelihood estimation. Lewontin’s standardized disequilibrium coefficient (D’), the correlation coefficient (r2) and the χ2 test (significant at p<0.05), as calculated by the programme, were used to estimate LD between the two loci. The results were compared with the Phase 3 (version 5) 1000 Genomes Project data for different populations on LDlink, a National Cancer Institute website tool (Machiela MJ & Chanock SJ, 2015).
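The LD statistics named above follow from the four haplotype frequencies by standard definitions, which can be sketched as below. This is an illustration only: it starts from haplotype frequencies that are already estimated, whereas cubeX derives them from diploid genotype counts by maximum likelihood, a step not reproduced here. The function name and the example frequencies are hypothetical.

```python
def ld_statistics(f11, f12, f21, f22, n_chromosomes):
    """Return (D', r2, chi2) for two biallelic loci.

    f11..f22 are frequencies of haplotypes A1B1, A1B2, A2B1, A2B2;
    n_chromosomes is the number of chromosomes sampled (2 x individuals).
    """
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))
    p_a = f11 + f12                      # frequency of allele A1
    p_b = f11 + f21                      # frequency of allele B1
    d = f11 * f22 - f12 * f21            # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0   # signed Lewontin's D'
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    chi2 = r2 * n_chromosomes            # 1 d.f. test statistic for LD
    return d_prime, r2, chi2

# Hypothetical frequencies: one haplotype is absent, so D' reaches 1
# while r2 stays low because the allele frequencies differ.
print(ld_statistics(0.1, 0.0, 0.4, 0.5, n_chromosomes=200))
```

The example shows why D' and r2 are reported together: a missing haplotype can give D' = 1 even when the two loci predict each other poorly.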
3 RESULTS
The average age of cases was 44.35 ± 0.932 years, while that of the controls was 46.04 ± 0.80 years. Clinico-pathological data for the studied cohort are shown in Table 1. Briefly, the majority of the patients had invasive ductal carcinoma (79%) with tumour size >5cm (48%) and stage III (55%), grade 3 (52%) tumours.
3.1. Hardy-Weinberg Equilibrium Estimation
The distributions of genotypes and allele frequencies of the MTHFR 677C/T and 1298A/C polymorphisms are given in Tables 2 and 3, respectively. Genotype and allele distributions, for both markers, were determined in the breast cancer patients, the controls, and the combined study population. The χ2 values show that the genotype distributions for MTHFR 677C/T were in Hardy-Weinberg proportions in controls and in breast cancer patients; in the combined group, however, they were not in Hardy-Weinberg equilibrium. For MTHFR 1298A/C, the genotypes were in Hardy-Weinberg disequilibrium in all the groups, and the disequilibrium was highly significant in the breast cancer patients and the combined group, implying a role of this polymorphism in breast cancer susceptibility.
3.2. Significant Differences in the Distribution of the MTHFR 677C/T Genotypes in Breast Cancers on the Basis of Age
The distribution of the MTHFR 677C/T genotypes by age group, i.e. young (≤35 years) and advanced (>35 years), is shown in Table 4. No overall significant association was found between the polymorphism and the risk of breast cancers. However, the homozygous T genotype was associated with advanced age, while the heterozygous CT genotype was associated with the younger age group of patients.
3.3. Significant Protective Effect of the MTHFR 1298 C-variant in Breast Cancer
In the studied cohort, a significant protective effect of the CC genotype of the MTHFR 1298A/C polymorphism against breast cancer was observed. The statistical analysis of this SNP with breast cancers is shown in Table 5.
3.4. Significant Associations of Combined Genotypes of 677CC/1298AC and 677CT/1298AA with the Risk for Breast Cancers
The nine possible genotype combinations for the two MTHFR polymorphisms are shown in Table 6. Case-control association analysis was carried out for each combination. The analysis revealed that the combination of homozygous C at 677C/T with heterozygous AC at 1298A/C, when compared with the reference genotype (CC/AA), significantly increased the risk of breast cancers (χ2 = 6.558, OR: 2.688, 95% CI: 1.247-5.795). Similarly, heterozygous 677CT with homozygous 1298A (CT/AA), when compared with CC/AA, showed a significant association (p-value <0.05, OR: 20.91, 95% CI: 1.156-378.2) with increased breast cancer susceptibility (Table 6).
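Odds ratios for a genotype combination against the reference genotype, as reported above, come from a 2x2 table of cases and controls. A minimal sketch of that calculation follows, using Woolf's log-based confidence interval rather than the logistic regression used in the study. Adding 0.5 to every cell when a cell is empty (the Haldane-Anscombe correction) is an assumption made here so that the ratio stays finite; the source does not state how empty cells were handled. The function name and all counts are hypothetical and are not taken from Table 6.

```python
from math import exp, log, sqrt

def odds_ratio_ci(case_test, case_ref, control_test, control_ref, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    'test' is the genotype combination of interest and 'ref' the reference
    genotype. If any cell is zero, 0.5 is added to all four cells
    (Haldane-Anscombe correction, an assumption of this sketch).
    """
    cells = [case_test, case_ref, control_test, control_ref]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, including a table with no controls carrying the
# test genotype, which triggers the 0.5 correction.
print(odds_ratio_ci(10, 20, 5, 20))
print(odds_ratio_ci(11, 30, 0, 25))
```

An empty cell makes the interval very wide, which is worth keeping in mind when reading any odds ratio estimated from a genotype that is absent in one group.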
3.5. LD Analysis at MTHFR 677C/T and 1298A/C Loci
The disequilibrium spread in pairwise allelic combinations at the MTHFR 677C/T and 1298A/C loci was quantified by maximum likelihood calculation from the frequency of diploid genotypes. Haplotype frequencies and the LD statistics, D’, r2, and χ2 values demonstrate that the variants at two loci are not associated (Table 7).
3.6. Bioinformatics analysis
The bioinformatics analysis for the effect of SNPs on protein function is shown in Table 8. MTHFR 677C/T was predicted to be deleterious by three software tools. On the other hand, MTHFR 1298A/C was predicted to have benign effect by two programmes, while deleterious by PROVEAN analysis (Table 8).
4 DISCUSSION
Breast cancers are a heterogeneous group of disorders, which are characterized by abnormal cellular proliferation in the mammary tissue (Kumar, Abbas, Aster, & Robbins, 2013). The contributing gene effects are complex in nature and can vary in penetrance (Haines & Pericak-Vance, 2006). The present study is a component of the molecular genetic investigation of breast cancers in the Pakistani population. The aim is to decipher the molecular and genetic architecture of breast cancers in a population characterized by high consanguinity, multiparity, and low exposure to putative risk factors for breast cancers, particularly low levels of alcohol and pork consumption. Thus, the present population provides a unique model for the epidemiological and molecular investigation of breast cancers. Additionally, folate deficiency is an established severe public health concern in women of reproductive age (Soofi et al., 2017). This insufficiency, often attributed to low intake, is likely to be compounded by genetic factors, with a consequent effect on the breast cancer spectrum observed across the Pakistani population.
In the present study, we report significant associations of the MTHFR 677C/T (rs1801133) and 1298A/C (rs1801131) polymorphisms with the risk of breast cancers in the Pakistani population. To the best of our knowledge, the age-specific differences in the distribution of the MTHFR 677C/T genotypes in breast cancer patients are reported for the first time from any population. As shown in Table 8, the bioinformatic analysis of the MTHFR 677C/T polymorphism demonstrates that the Ala222Val substitution has a deleterious effect on the enzyme function. The breast cancer patients with the MTHFR 677 heterozygous genotype (CT) aggregate in the young (≤35 years) group [OR: 2.81; 95% CI: 1.08 - 7.3]. On the other hand, all the breast cancer patients with the homozygous T genotype were found in the advanced age group (>35 years) [OR: 1.281; 95% CI: 1.162 – 1.412].
The other important observation in the present study is the lack of the MTHFR 1298CC genotype in breast cancer patients. Additionally, the Mid-P chi-square test and the conditional maximum likelihood estimate (CMLE) odds ratio show that the protective effect is highly significant. A number of studies from different regions have reported a protective effect of the C-allele against cancers, including leukaemias and bladder cancer (Robien & Ulrich, 2003; Skibola et al., 1999; You et al., 2013). Mechanistically, it is proposed that double-stranded breaks lead to chromosomal instability, translocations and aberrations, with a consequent increase in cancer risk and progression. It has been shown through the comet assay that the C-allele decreases the frequency of double-stranded breaks, thereby conferring a protective effect against cancers (Fragkioudaki et al., 2017; Jackson & Bartek, 2009). Secondly, decreased activity of the MTHFR results in the accumulation of the substrate 5,10 MTHF (Figure 4). Although the effect of an increased concentration of 5,10 MTHF has not been investigated recently, earlier studies showed a subsequent increase in purine and pyrimidine synthesis. Hence, DNA replication becomes more stable, with a lower chance of mutations (Bagley & Selhub, 1998; Chen et al., 1996; Fintelman-Rodrigues, Correa, Santos, Pimentel, & Santos-Reboucas, 2009; Giovannucci et al., 1998; Ulrich et al., 1999).
In the genotype combination analysis (MTHFR 677C/T and 1298A/C), CT/AA combination genotype was not observed in 63 controls as compared to 11 cases in 124 breast cancer patients [OR: 20.91; 95% CI: 1.156 – 378.2]. A second combination i.e. CC/AC also showed significant association with breast cancers [OR: 2.68; 95% CI: 1.25 – 5.8]. The pattern in haplotype analysis also supports this result (χ2 test p-value = 0.03). Both of these heterozygous/homozygous combination genotypes (CC/AC and CT/AA) were also associated with the increased risk of renal cell carcinoma (RCC) in Pakistani population (Ajaz et al., 2012).
Three genotype combinations are absent in the studied breast cancer patients: CT/CC, CC/CC, and TT/CC. These results underscore the significant protective effect of the MTHFR 1298CC genotype against breast cancers and the comparatively young age of breast cancer patients in the present cohort (average age: 44.35 ± 0.932). Interestingly, TT/CC combination was absent in controls and RCC patients as well (Ajaz et al., 2012).
The linkage equilibrium between two loci that are in close proximity is a unique finding of the current report. In the 1000 Genomes Project (Kumar et al., 2013), among South-Asian populations, these two SNPs have been shown to be in LD in the PJL (Punjabi from Lahore), GIH (Gujarati Indians in Houston), and ITU (Indian Telugu from the UK) populations, whereas they are not associated in BEB (Bengali from Bangladesh) and STU (Sri Lankan Tamil from the UK). In the present study, in contrast to PJL, where D’ has been shown to be 1 and r2 to be 0.119, these values for breast cancer patients were −0.502 and 0.0273, respectively. For controls, the values were 0.41 and 0.0496, respectively. Thus, a lack of association between the variants at the studied loci is reported in the present study, which underscores the importance of validating data from genome-wide association studies by appropriately designed case-control studies.
Conclusions
The present study reports unique associations of the MTHFR 677C/T and MTHFR 1298A/C polymorphisms, independently and in combination, with breast cancers in the Pakistani population. A distinctive lack of association over the short genomic sequence is also a significant finding of the present molecular investigative report. The results have important implications for devising folate pathway-specific strategies for breast cancer management in distinctive sub-population groups.
Data Availability
Data available upon request
Conflict of Interest
The authors declare no conflict of interest.
Acknowledgements
The authors wish to thank the Pakistan Health Research Council for funding the study, ICCBS for core facilities, and the AEMC, JPMC staff for their co-operation. The authors are especially grateful to the participants in the study.