Abstract
MTHFR is a pivotal enzyme in the folic acid cycle. Two functional SNPs (677C/T and 1298A/C), which affect the function of MTHFR, are associated with different cancers. In the present study, these SNPs were investigated in breast cancer patients from the Pakistani population. The pilot study included 187 participants: 124 breast cancer patients and 63 medically confirmed healthy individuals as controls. PCR-RFLP methods validated by Sanger sequencing were used for genotyping. Here, we report significant and unique associations of these polymorphisms with breast cancers in a South-Asian population for the first time in the literature. For the 1298A/C polymorphism, the case-control analysis showed a significant protective effect of the homozygous C genotype in the recessive [CC vs AA+AC; OR: 0.320 (95% CI: 0.259 – 0.397)] and homozygous co-dominant [CC vs AA; OR: 0.379 (95% CI: 0.273 – 0.527)] models. For 677C/T, no significant association was observed with the risk of breast cancers. However, the homozygous T genotype was more frequent in patients in the advanced age group (>35 years) than in the young age group (≤35 years), i.e. 6.7% vs 0%. The combined genotype analysis at the two loci revealed that 677CC+1298AC [OR: 2.688 (95% CI: 1.247 – 5.795)] and 677CT+1298AA [OR: 20.91 (95% CI: 1.156 – 378.2)] significantly increased the risk of breast cancers. The latter association (677T*1298A) was also observed in a semi-parametric haplotype analysis (p-value: 0.03). The study indicates translational applications of these polymorphisms against breast cancers in the studied population.
1 INTRODUCTION
Methylene Tetrahydrofolate Reductase (MTHFR) (UniProtKB-P42898) is a 77kDa key enzyme of the folate pathway. It converts 5,10-methylenetetrahydrofolate (5,10 MTHF) to 5-methyltetrahydrofolate (5 MTHF) (Chittiboyina, Chen, Chiorean, Kamendulis, & Hocevar, 2018). The latter is a co-substrate in the re-methylation of homocysteine to methionine. The folate cycle is involved in DNA synthesis, repair and methylation, as well as detoxification pathways (Kawakita et al., 2017). Genomic instability and aberrant DNA methylation are two critical hallmarks of cancer(s) (Flavahan, Gaskell, & Bernstein, 2017; Macheret & Halazonetis, 2015). Thus, the folate pathway provides protection against neoplastic transformation and progression (Friso, Udali, De Santis, & Choi, 2017; Stover, James, Krook, & Garza, 2018).
Decreased MTHFR activity has been associated with gastrointestinal stromal tumour, neural tube defects, folate sensitivity, MTHFR deficiency, schizophrenia, and dosage and toxicity response to adriamycin and cyclophosphamide (Institute; M. J. Landrum et al., 2016; OMIM). The gene for MTHFR is located on chromosome 1p36.3. It is one of the ten most investigated genes globally (Dolgin, 2017). Two functional Single Nucleotide Polymorphisms (SNPs), 677C/T (rs1801133) and 1298A/C (rs1801131), encode thermolabile isoforms of MTHFR and thus affect its enzymatic activity. The SNP 677C/T results in the substitution of alanine with valine (Ala222Val). In the case of 1298A/C, glutamic acid is replaced with alanine (Glu429Ala) (Nefic, Mackic-Djurovic, & Eminovic, 2018). These SNPs result in decreased enzymatic activity, i.e. almost 60% reduced activity as compared to the wild type (D’Angelo et al., 2011).
In cancer cells, the thermolabile isoform of MTHFR leads to the mis-incorporation of uracil instead of thymine, with consequent DNA damage (Rai, 2014). Additionally, reduced enzymatic activity affects DNA methylation and hence gene expression (Friso et al., 2002).
The two SNPs are located within a few kilobases of each other and, owing to this short distance, are expected to be in a high degree of linkage disequilibrium (LD) (Balding, 2006; Bodmer & Bodmer, 1978). However, several discrepancies in other closely-related genomic regions have been reported (Schaid, Chen, & Larson, 2018). Therefore, the quantitative value of LD among genetic variations should be assessed separately for each population. This is essential to map the population-specific contribution of closely located polymorphisms to the disease phenotype.
Data regarding the role of these functional SNPs in breast cancers remain inconclusive in different populations (McEwen, 2016; Naushad et al., 2016). In developing countries, breast cancer is the leading cause of cancer-related deaths among females (Torre, Siegel, Ward, & Jemal, 2015). In Pakistan, the age-specific incidence and mortality rates of breast cancer(s) are among the highest in Asian countries and globally, respectively. Given this background, the present study is a population-focused systematic attempt to investigate the role of the MTHFR 677C/T and 1298A/C polymorphisms in breast cancers in the Pakistani population. The analysis includes evaluation of LD between these two loci and their individual and combined contribution to the risk and pathology of breast cancers in this population.
2 METHODOLOGY
2.1 Ethics Statement
The study was conducted in accordance with the Declaration of Helsinki (WMA, 2018). The project was approved by the ethical review committees (ERCs) of the participating institutions: the independent ERC, International Center for Chemical and Biological Sciences (ICCBS), University of Karachi, Karachi, Pakistan [ICCBS/IEC-016-BS/HT-2016/Protocol/1.0], and the Atomic Energy Medical Centre (AEMC), Jinnah Postgraduate Medical Centre (JPMC), Karachi, Pakistan [Admin-3(257)/2016]. All the samples were collected after obtaining written informed consent from each participant.
2.2 Study Participants
The pilot study used a case-control design and included 187 participants. The cases included 124 diagnosed and histologically confirmed primary breast cancer patients. The patients were either first-time visitors or under regular treatment at AEMC, JPMC, Karachi, Pakistan between July 2016 and July 2017. The exclusion criterion for patients was the lack of a biopsy report for breast cancer. As controls, 63 samples from medically confirmed healthy individuals were included. The controls were matched on the basis of age, gender and ethnicity. The exclusion criteria for controls were a previous diagnosis of any cancer or co-morbidity with any other chronic disease. The participants belonged to Southern Pakistan.
2.3 Breast Cancer Data
At the time of sampling, after obtaining written informed consent, participants’ relevant information was obtained through a questionnaire, including age, ethnicity, place of residence, contact number, family history of cancers, age at menarche, obstetrics and gynaecology history, and if applicable, the age at menopause.
Clinical characteristics, including tumour histology, size, grade, stage, axillary lymph node metastasis, ER status, PR status, and Her-2 status, were collected from the patients’ medical records.
2.4 DNA Extraction and Genotyping
DNA samples were extracted using the standard phenol-chloroform method with slight modifications (Sambrook & Russell, 2001). Subsequently, target DNA sequences were amplified for the identification of the selected genetic polymorphisms in MTHFR. The restriction sites for the two polymorphisms are shown in Figure 1.
2.4.1 MTHFR 677C/T
For 677C/T, a DNA fragment of 198bp was amplified. The PCR mix contained 1X (NH4)2SO4 PCR buffer, 0.4mM MgCl2, 0.2mM dNTPs, 0.4U Taq polymerase, 0.35µM of each primer (Forward primer: 5’-TGAAGGAGAAGGTGTCTGCGGGA-3’; Reverse primer: 5’-AGGACGGTGCGGTGAGAGTG-3’) and 130ng of DNA template in a final reaction volume of 25µl. Cycling conditions have already been published (Ajaz et al., 2012). Amplification was carried out at an annealing temperature (Ta) of 64°C. 20µl of the amplified product was digested overnight with 10U of HinfI and 3.5µl of the recommended buffer (Thermo Scientific®, USA). The digested products were run on a 10% polyacrylamide gel, stained with ethidium bromide, and observed under UV.
2.4.2 MTHFR 1298A/C
For 1298A/C, a DNA fragment of 168bp was amplified. The PCR mix of 25µl contained 1X (NH4)2SO4-based PCR buffer with 0.8mM of MgCl2, 0.25mM of dNTPs, 2U of Taq polymerase, 1.25µM of each primer (Forward primer: 5’-CTTTGGGGAGCTGAAGGACTACTA-3’; Reverse primer: 5’-CACTTTGTGACCATTCCGGTTTG-3’) and 70ng of DNA template. Amplification was carried out at a Ta of 60°C. Cycling conditions were the same as for 677C/T. 20µl of the amplified product was digested overnight with 2.5U of MboII and 2µl of the recommended buffer (Thermo Scientific®, USA). The digested products were run on a 10% polyacrylamide gel, stained with ethidium bromide, and observed under UV. Results for both polymorphisms were validated by commercial Sanger sequencing (Eurofins).
Representative gels for genotyping of the MTHFR 677C/T and 1298A/C are shown in Figures 2 and 3, respectively. The validation of each methodology by Sanger sequencing is shown in the inset.
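The PCR-RFLP logic described in Section 2.4 rests on whether an allele creates or abolishes an enzyme recognition sequence. The sketch below illustrates that idea only: it scans a sequence for the HinfI (GANTC) and MboII (GAAGA) recognition motifs. The function names and the short toy sequences are hypothetical and are not the MTHFR amplicons; the code does not model cut positions or fragment sizes.

```python
import re

# Minimal IUPAC table: only the codes needed for the two motifs used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

HINFI = "GANTC"   # HinfI recognition sequence (palindromic)
MBOII = "GAAGA"   # MboII recognition sequence (non-palindromic)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, motif):
    """Return 0-based start positions of `motif` on the forward strand."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead is used so that overlapping occurrences are all reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


def has_site(seq, motif):
    """True if `motif` occurs on either strand of `seq`."""
    return bool(find_sites(seq, motif)
                or find_sites(reverse_complement(seq), motif))


if __name__ == "__main__":
    # Hypothetical toy sequences differing by one base (C vs T).
    toy_without_site = "AAGAGCCAA"
    toy_with_site = "AAGAGTCAA"
    print(find_sites(toy_without_site, HINFI))
    print(find_sites(toy_with_site, HINFI))
```

Because MboII recognises a non-palindromic motif, `has_site` checks both strands; for a palindromic motif such as HinfI the forward-strand scan alone is sufficient.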
2.5 Bioinformatics Analyses
To predict the impact of the missense substitutions on protein function, bioinformatics analyses were performed. The software tools used were MutPred2 (Balding, 2006), Provean (Gaunt, Rodríguez, & Day, 2007), and Polyphen2 (Resource). The correlation of each variation with disease-related phenotypes was investigated using ClinVar (Melissa J Landrum et al., 2015), and its correlation with cancer was investigated using the CIViC database (Institute).
2.6 Statistical Analyses
Statistical analyses were carried out using IBM SPSS® v.21.0 software (Arbuckle, 2012). Genotype and allele frequencies were determined by the gene counting method along with the weighted percentages. Genotype distributions among the cases and controls were tested for Hardy-Weinberg equilibrium using the Chi-squared test. Associations between qualitative variables, such as clinico-pathological characteristics including tumour stage and grade, were also assessed by Pearson’s Chi-squared test; the Cochran-Armitage trend test was used where Hardy-Weinberg disequilibrium was observed (Balding, 2006). To measure the allelic, genotypic and haplotype risks for breast cancers, odds ratios (OR) with 95% confidence intervals (95% CI) were calculated using logistic regression. For all analyses, a p-value <0.05 was considered significant.
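As an illustration of the Hardy-Weinberg test described above, the following minimal sketch computes allele frequencies by gene counting and a one-degree-of-freedom Chi-squared goodness-of-fit statistic from three genotype counts. It is not the SPSS procedure used in the study; the function name is hypothetical and the counts in the usage line are invented for demonstration.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the first homozygote, the
    heterozygote and the second homozygote.
    Returns (chi2, p_value), with the p-value taken from the Chi-squared
    distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # allele frequency by gene counting
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Cells with zero expected count are skipped to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Invented counts, for demonstration only.
    chi2, p_value = hwe_chi_square(60, 30, 10)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

One degree of freedom is used because three genotype classes are compared after estimating one allele frequency from the same data.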
2.7 Haplotype Estimation and LD Statistics
Haplotype frequencies were estimated and LD was analysed using the cubeX webtool (Gaunt et al., 2007). Haplotype frequencies were calculated by maximum likelihood estimation. LD between the two loci was assessed using Lewontin’s standardized disequilibrium coefficient (D’), the correlation coefficient (r²) and the χ² test, all calculated with the same programme. The results were compared with the Phase 3 (version 5) 1000 Genomes Project data for different populations on LDlink, a National Cancer Institute web tool (LDlink; project).
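The LD measures named above follow from standard definitions once haplotype frequencies are available. The sketch below assumes the haplotype frequency has already been estimated (cubeX obtains it by maximum likelihood from diploid genotype counts, a step not reproduced here) and computes D, a signed D’ and r². The function name and the input values are hypothetical.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Pairwise LD statistics for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    Returns (D, D_prime, r2); D_prime keeps the sign of D.
    """
    d = p_ab - p_a * p_b
    # Lewontin's normalisation: the bound on |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Invented frequencies, for demonstration only.
    d, d_prime, r2 = ld_statistics(p_ab=0.10, p_a=0.30, p_b=0.40)
    print(f"D = {d:.4f}, D' = {d_prime:.3f}, r2 = {r2:.4f}")
```

A χ² statistic for LD is commonly obtained as r² multiplied by the number of chromosomes sampled; whether cubeX uses exactly this form is not stated in the text.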
3 RESULTS
Clinico-pathological data for the studied cohort are shown in Table 1. Briefly, the majority of the patients had invasive ductal carcinoma (79%) with tumour size >5cm (48%) and stage III (55%), grade 3 (52%) tumours.
3.1 Hardy-Weinberg Equilibrium Estimation
The distributions of genotypes and allele frequencies of the MTHFR 677C/T and 1298A/C polymorphisms are given in Tables 2 and 3, respectively. Genotype and allele distributions for both markers were determined in the breast cancer patients, the controls and the combined group. The χ² values show that the genotypes for MTHFR 677C/T occurred in Hardy-Weinberg proportions in the controls and in the breast cancer patients; in the combined group, however, they were not in Hardy-Weinberg equilibrium. For MTHFR 1298A/C, the genotypes were not in Hardy-Weinberg equilibrium in any of the groups, and the disequilibrium was highly significant in the breast cancer patients and the combined group, indicating a role of this polymorphism in breast cancer susceptibility.
3.2 Significant Differences in the Distribution of the MTHFR 677C/T Genotypes in Breast Cancers on the Basis of Age
The distribution of the MTHFR 677C/T genotypes in the young (≤35 years) and advanced (>35 years) age groups is shown in Table 4. The homozygous T genotype is associated with advanced age, while the heterozygous CT genotype is associated with the younger age group of patients. No significant association was found between the polymorphism and the risk of breast cancers.
3.3 Significant Protective Effect of MTHFR 1298A/C Polymorphism in Breast Cancer
In the studied cohort, a significant protective effect of the CC genotype of 1298A/C against breast cancer was observed in the Pakistani population. The statistical analysis of 1298A/C is shown in Table 5.
3.4 Significant Association of Combined Genotypes of 677CC/1298AC and 677CT/1298AA with the Risk of Breast Cancers
The nine possible genotype combinations for the two MTHFR polymorphisms are shown in Table 6. Case-control association analysis was carried out for each combination. The analysis revealed that homozygous C at 677C/T with heterozygous 1298A/C (CC/AC) significantly increased the risk of breast cancers when compared with the reference genotype (CC/AA) (Chi-sq. value = 6.558, OR: 2.688, 95% CI: 1.247-5.795) (Table 6). Similarly, heterozygous 677C/T with homozygous A at 1298A/C (CT/AA), when compared with CC/AA, showed a significant association (p-value <0.05, OR: 20.91, 95% CI: 1.156-378.2) (Table 6).
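For a single genotype combination compared against a reference genotype, the odds ratio from logistic regression equals the cross-product ratio of the 2×2 table, and the Wald interval equals the log-based (Woolf) interval. The sketch below shows that calculation. The function name and the counts are hypothetical, and the 0.5 continuity correction applied when a cell is empty (Haldane-Anscombe) is an assumption made here for illustration; the text does not state how empty cells were handled.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_test, case_ref, ctrl_test, ctrl_ref, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    case_test / ctrl_test: cases / controls carrying the tested genotype
    case_ref  / ctrl_ref:  cases / controls carrying the reference genotype
    z: normal quantile for the interval (1.96 for 95%).
    Returns (odds_ratio, lower, upper).
    """
    cells = [case_test, case_ref, ctrl_test, ctrl_ref]
    # Assumption: add 0.5 to every cell if any cell is zero.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts, for demonstration only (not taken from Table 6).
    or_, lo, hi = odds_ratio_ci(case_test=30, case_ref=40,
                                ctrl_test=10, ctrl_ref=35)
    print(f"OR = {or_:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

An empty cell among controls yields a very wide interval, so such estimates should be read as indicating direction rather than a precise effect size.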
3.5 LD Analysis at MTHFR 677C/T and 1298A/C Loci
The disequilibrium spread in pairwise allelic combinations at the MTHFR 677C/T and 1298A/C loci was quantified by maximum likelihood calculation from the frequency of diploid genotypes. The haplotype frequencies and the LD statistics (D’, r², and χ² values) demonstrate that the two sites are in linkage equilibrium.
3.6 Bioinformatics Analysis
The bioinformatics analysis of the effect of the SNPs on protein function is shown in Table 7. MTHFR 677C/T was predicted to be deleterious by all three software tools. On the other hand, MTHFR 1298A/C was predicted to have a benign effect by two programmes and a deleterious effect by Provean.
4 DISCUSSION
Breast cancers are a heterogeneous group of disorders characterized by abnormal cellular proliferation in the mammary tissue (Kumar, Abbas, Aster, & Robbins, 2013). The contributing gene effects are complex in nature and can vary in penetrance (Haines & Pericak-Vance, 2006). The present study is a component of the molecular genetic investigation of breast cancers in the Pakistani population. The aim is to decipher the molecular and genetic architecture of breast cancers in a population characterized by high consanguinity, multiparity, and low exposure to putative risk factors for breast cancers, i.e. low levels of alcohol and pork consumption. Thus, the present population provides a unique model for epidemiological and molecular investigations of breast cancers. Additionally, folate deficiency is an established, severe public health concern in women of reproductive age (Soofi et al., 2017). The insufficiency is likely to be compounded by genetic factors, affecting the breast cancer spectrum observed across the Pakistani population.
In the present study, we report significant associations of the MTHFR 677C/T and 1298A/C polymorphisms with the risk of breast cancers in the Pakistani population. To the best of our knowledge, the age-specific differences in the distribution of the MTHFR 677C/T genotypes in breast cancer patients are reported for the first time from any population. As shown in Table 8, the bioinformatics analysis of the MTHFR 677C/T polymorphism demonstrates that the Ala222Val substitution has a deleterious effect on the enzyme function. The breast cancer patients with the MTHFR 677 heterozygous (CT) genotype aggregate in the young (≤35 years) group [OR: 2.81; 95% CI: 1.08 – 7.3]. On the other hand, all the breast cancer patients with the homozygous T genotype are found in the advanced age group (>35 years) [OR: 1.281; 95% CI: 1.162 – 1.412].
The other important observation in the present study is the lack of the MTHFR 1298CC genotype in breast cancer patients. The Mid-P chi-square test and the conditional maximum likelihood estimate (CMLE) odds ratio show that the protective effect is highly significant. A number of studies from different regions have reported the protective effect of the C allele against cancers including leukaemias and bladder cancer (Robien & Ulrich, 2003; Skibola et al., 1999; You et al., 2013). Mechanistically, double-stranded breaks lead to chromosomal instability, translocations and aberrations, consequently contributing to cancer risk and progression. It was shown through the comet assay that the C allele decreases the frequency of double-stranded breaks and therefore confers a protective effect against cancers (Fragkioudaki et al., 2017; Jackson & Bartek, 2009). Secondly, decreased activity of MTHFR results in the accumulation of its substrate, 5,10 MTHF (Figure 4). Although the effect of an increased concentration of 5,10 MTHF has not been investigated recently, earlier studies showed that it leads to increased purine and pyrimidine synthesis. Consequently, DNA replication becomes more stable and less prone to mutations (Bagley & Selhub, 1998; Chen et al., 1996; Fintelman-Rodrigues, Correa, Santos, Pimentel, & Santos-Reboucas, 2009; Giovannucci et al., 1998; Ulrich et al., 1999).
In the genotype combination analysis (MTHFR 677C/T and 1298A/C), the CT/AA combination genotype was not observed in any of the 63 controls but was present in 11 of the 124 breast cancer patients [OR: 20.91; 95% CI: 1.156 – 378.2]. A second combination, i.e. CC/AC, also showed a significant association with breast cancers [OR: 2.68; 95% CI: 1.25 – 5.8]. The pattern in the haplotype analysis also corroborates this result. Both of these heterozygous/homozygous combination genotypes (CC/AC and CT/AA) were also reported to increase the risk of renal cell carcinoma (RCC) in the Pakistani population (Ajaz et al., 2012).
Three genotype combinations are absent in the studied breast cancer patients: CT/CC, CC/CC, and TT/CC. These results emphasize the significant protective effect of the MTHFR 1298CC genotype against breast cancers. Interestingly, combination of the homozygous TT with homozygous CC was absent in controls and RCC patients as well (Ajaz et al., 2012).
The linkage equilibrium between two loci in such close proximity is a unique finding of the current report. In the 1000 Genomes Project (Kumar et al., 2013), among South-Asian populations, these two SNPs have been shown to be in LD in the PJL (Punjabis from Lahore), GIH (Gujrati Indians in Houston), and ITU (Indian Telugu from the UK) populations, whereas they are known to be in linkage equilibrium in BEB (Bengali from Bangladesh) and STU (Sri Lankan Tamil from the UK). In PJL, D’ has been shown to be 1 and r² to be 0.119; in contrast, in the present study these values were −0.502 and 0.0273, respectively, for breast cancer patients, and 0.41 and 0.0496, respectively, for controls. This underscores the importance of validating data from genome-wide association studies by appropriately designed case-control studies.
Conclusions
The present study reports unique associations of MTHFR 677C/T and MTHFR 1298A/C, independently and in combination, with breast cancers in the Pakistani population. A distinctive linkage equilibrium over a short genomic sequence is also a significant finding of the present molecular investigation. The reduced genetic diversity due to the consanguineous population provides a comparatively uniform background for such investigations. The results have important implications for devising folate pathway strategies for population-specific breast cancers.
Data Availability
Data are available upon request.
Conflict of Interest
The authors declare no conflict of interest.
Acknowledgements
The authors wish to thank the Pakistan Health Research Council for funding the study, ICCBS for core facilities, and the AEMC, JPMC staff for their co-operation. The authors are especially grateful to the participants in the study.