Abstract
The electrocardiogram (ECG) is the most commonly used exam for the screening and evaluation of cardiovascular diseases. Here we propose that the age predicted by artificial intelligence (AI) from the raw ECG tracing (ECG-age) can be a measure of cardiovascular health and provide prognostic information. A deep convolutional neural network was trained to predict a patient’s age from the 12-lead ECG using data from patients that underwent an ECG from 2010 to 2017 - the CODE study cohort (n=1,558,415 patients). On the 15% hold-out CODE test split, patients with ECG-age more than 8 years greater than chronological age had a higher mortality rate (hazard ratio (HR) 1.79, p<0.001) in a mean follow-up of 3.67 years, whereas those with ECG-age more than 8 years less than chronological age had a lower mortality rate (HR 0.78, p<0.001). Similar results were obtained in the external cohorts ELSA-Brasil (n=14,236) and SaMi-Trop (n=1,631). The ability to predict mortality from the ECG predicted age remains even when we adjust the model for cardiovascular risk factors. Moreover, even for apparent normal ECGs, having a predicted ECG-age 8 or more years greater than chronological age remained a statistically significant predictor of risk (HR 1.53, p<0.001 in CODE 15% test split). These results show that AI-enabled analysis of the ECG can add prognostic information to the interpretation of the 12-lead ECGs.
Introduction
The electrocardiogram (ECG) is the most commonly used exam for the screening and evaluation of cardiovascular diseases. Computerized, rule-based, ECG interpretation was developed to facilitate medical research and clinical practice. However, the limited accuracy of these methods has limited their application1,2. In this context, deep neural networks (DNNs) are a promising machine learning approach for the automated analysis of the ECG and have achieved unprecedented performance in initial studies3,4.
DNNs present a paradigm shift from classical ECG automated analysis methods. Classical methods use signal processing techniques to extract the measurements, wavelengths and detect abnormal beats from the ECG signal and then use the extracted information as input features to a classifier5. DNN-based ECG analysis, on the other hand, is based on an “end-to-end” approach, for which the raw signal is used as an input to the classifier, which learns to extract the features by itself 3,4.
Unlike the traditional methods, features learned by end-to-end ECG automated analysis methods do not necessarily have an interpretation rooted in electrocardiographic knowledge. If this paradigm introduces new challenges regarding model interpretability6 and out-of-distribution robustness7, it also introduces new possibilities when it comes to applications. Examples that go beyond traditional electrocardiography and have been achieved using end-to-end approaches include: predicting the risk of death from the ECG 8; identifying patients who will develop atrial fibrillation from a previous ECG taken during sinus rhythm9; and screening for cardiac contractile dysfunction using only the 12-lead ECG 10. This suggests that end-to-end models might be able to identify additional markers, that, in their turn, might be a practical and useful tool in cardiovascular disease prediction.
Despite being part of the routine evaluation of many patients in both primary and specialized care, the role of ECG in cardiovascular disease prediction and, hence, prevention is not as clear. Its prognostic impact has been explored in previous publications 11,12, nonetheless, the available methods are not widely adopted as a screening tool for individuals free of cardiovascular disease13. In this context, we turn to the usage of machine learning algorithms to infer age from ECG traces14,15. Age is a risk factor for cardiac diseases that affects ECG measurements and the likelihood of having an ECG with a higher incidence of abnormalities16,17. Moreover, previous research has shown that AI-based age estimates from the 12-lead ECG (ECG-age) are reasonably accurate, and that overestimation is associated with a higher incidence of cardiovascular risk factors and comorbidities14,15.
In this paper, we demonstrate that AI-predicted ECG-age is a potentially useful tool in the assessment of the risk of death in the general population. We developed, in the CODE Study cohort18, a DNN-based age-prediction model and assessed if the difference between predicted ECG-age and chronological age is a predictor of overall mortality. The model is validated in two external cohorts, ELSA-Brasil19, of Brazilian public servants, and SaMi-Trop20, of Chagas disease patients. Furthermore, we tested if the predictive value remains significant after controlling for the presence of cardiovascular risk factors and for subjects with normal ECGs. We sought to determine whether it can be used as a prognostic marker in the general population. Finally, we also undertook an exploratory analysis to investigate mechanisms that are involved in ECG-age prediction, looking at the main components used during the classification. This is done both by analysing the model sensitivity to changes in the ECG signal and by the manual review of the ECGs and the corresponding ECG-predicted age by trained cardiologists.
Results
Deep neural network age-predictor model
We used the CODE Study cohort18 to develop a DNN capable of predicting the patient’s age from the raw ECG tracing. The dataset consists of ECG records from 1,558,415 patients of 811 counties in the state of Minas Gerais (Brazil) collected by the Telehealth Network of Minas Gerais (TNMG). Patients were divided into 85-15% splits with the 85% split being used to develop the model (see Methods).
The model is evaluated in 3 different cohorts, unseen by the DNN model during its development, the 15% hold-out split described above, which will be referred to as the CODE-15% cohort (with 218,169 participants), the ELSA-Brasil (with 14,236 participants), and the SaMi-Trop cohorts (with 1,631 participants). Table 1 summarizes the baseline characteristics for each of the cohorts including median follow-up and number of events. Compared to the CODE-15% cohort, mean age, the prevalence of cardiovascular risk factors, and previous myocardial infarction were higher in both ELSA-Brasil and SaMi-Trop cohorts. The frequency of events was the highest in the SaMi-Trop cohort, composed of Chagas disease patients, many with chronic cardiomyopathy.
We used the DNN architecture known as the residual network21 to perform the task. The architecture has been successfully used for ECG abnormality detection in previous work3,4. The DNN mean absolute error (MAE) in the age prediction task is 8.38 (with standard deviation, s.d., 7.00), 8.44 (s.d. 6.19) and 10.04 (s.d. 7.76) for the CODE-15%, ELSA-Brasil, SaMi-Trop, respectively. Figure 1 shows the relation between predicted and true age for all the patients in the cohorts.
In the following sections, we try to establish the prognostic relevance of the ECG-age. We perform regression analyses that use the ECG-age as an input variable. In these analyses we always use the CODE-15% cohort for deriving the statistics and the ELSA-Brasil and SaMi-Trop for validating them.
Electrocardiographic-age as a mortality predictor
We try to establish the relevance of ECG-age as a predictor of mortality. We divided the patients into three groups, based on differences between predicted ECG-age and chronological age: a) those with ECG-age 8 or more years greater than the chronological age; b) those with ECG-age within a range of 8 years from their chronological age; and, c) those with ECG-age 8 or more years smaller than the chronological age. The MAE in the CODE dataset is approximately 8 years, which motivates our choice for the thresholds used. That is, when the predicted ECG-age deviates from the chronological age by more than the mean deviation found in the derivation cohort we classify into group (a) if the deviation is positive, and into group (c), if it is negative. Experiments with alternative choices of threshold yield qualitatively similar results.
The risk of death for these three groups, expressed by their hazard ratios (HR), is shown in Table 2, together with the 95% confidence intervals (CI). We fit a Cox model, adjusted for age and sex, in the CODE-15% cohort. The adjusted survival curves for this model are presented in Figure 2. This model indicates that participants with an estimated ECG-age of 8 or more years greater than the chronological age had higher mortality risk (HR 1.79, 95%CI 1.69-1.90; p<0.001). On the other hand, those with an estimated ECG-age of 8 or more years less than the chronological age had a lower mortality risk (HR 0.78, 95%CI 0.74-0.83, p < 0.001). Results in the ELSA-Brasil cohort, were similar: with a higher mortality risk (HR 1.75, 95%CI 1.35-2.27; p<0.001) for those with estimated ECG-age of 8 or more years greater than chronological age; and a lower mortality rate (HR 0.74, 95%CI 0.63-0.88; p<0.001) for those with ECG-age 8 or more years less than chronological age. In the SaMi-Trop cohort, patients with an ECG-age 8 or more years greater than the chronological age had a higher mortality risk (HR 2.42, 95%CI 1.53-3.83; p<0.001); for ECG-age 8 or more years less than chronological age, however, the observed decrease in mortality risk was not statistically significant (HR 0.89, 95%CI 0.52-1.54; p=0.68)). Additional analysis also show that Cox model adjusted by sex and age presents a good performance in the prediction of 1-year mortality, with an area under the curve, AUC, of 0.80 (95%CI 0.79-0.81) for the CODE-15% cohort, 0.77 (95%CI 0.66-0.87) for the ELSA-Brasil and 0.74 (95%CI 0.68-0.80) for the SaMi-Trop.
The importance of the ECG-age in predicting mortality remains also when we adjust the model for cardiovascular risk factors. Hazard ratios for models adjusted by different selections of variables cardiovascular risk factors are given in Table 2. In this analysis, we additionally adjusted the model for hypertension, diabetes mellitus, and smoking habits, but this did not yield significant differences in the results. As in the first case, all associations (except for ECG-age 8 or more years than the chronological age in the Sami-Trop cohort) remained significant with little change in the adjusted HR. We also did additional adjustments for dyslipidemia (CODE-15% and ELSA-Brasil cohorts) and obesity (ELSA-Brasil), without changing significantly the magnitude of the observed association.
Electrocardiographic-age as a mortality predictor in apparently normal ECGs
Table 3 describes conventional ECG measurements for the participants in the three groups described above - i.e., a) patients with predicted ECG-age 8 or more years greater than their chronological age; b) 8 or more years less than their chronological age; and, c) within a range of 8 years from their chronological age. Although in the CODE-15% cohort statistically significant differences can be seen for all measurements (p<0.001 for all), these numbers do not yield a clinically significant difference. From a clinical perspective, these measurements can be considered remarkably similar to each other. In the ELSA-Brasil cohort, measurements were also numerically similar with a statistically significant difference obtained only for heart rate (p<0.001) and QTc interval (p<0.001).
To further evaluate whether the ECG-age predicted by the DNN was related to traditional electrocardiographic abnormalities, we performed an additional analysis, now restricted to normal ECGs from the CODE-15% and ELSA-Brasil cohorts. Which have, respectively, 80679 and 7691 participants with normal ECGs. We did not perform this analysis in the SaMi-Trop because most patients had ECG abnormalities related to Chagas disease. What was considered as normal ECG is defined in Methods. An analysis with a Cox model restricted to the normal ECG was performed and the obtained hazard ratios are displayed in Table 4. The same parameters of the analysis in Table 2 are used. In the model adjusted by age and sex, ECG-age 8 or more years greater than chronological age remained a statistically significant predictor of death risk in both cohorts (HR 1.53, 95% CI 1.30 – 1.80, p<0.001 in CODE-15% and HR 1.63, 95% CI 1.00 – 2.66 p=0.050 in ELSA-Brasil). On the other hand, ECG-age 8 or more years less than chronological age remained associated with reduced risk of mortality in the CODE-15% (HR 0.66, 95% CI 0.57 - 0.76 p<0.001) but was not statistically significant in the ELSA-Brasil cohort (HR 0.91, 95% CI 0.68 - 1.21 p=0.502).
The results for models additionally adjusted for cardiovascular risk factors is also displayed in Table 4. After the adjustment, ECG-age 8 or more years than chronological age was associated with increased risk of mortality in CODE-15% cohort (HR 1.52, 95% CI 1.29 - 1.79, p=0.015), but not in ELSA-Brasil (HR 1.49, 95% CI 0.91-2.43, p=0.114). This was also true for an ECG-age 8 or more years less than chronological age. Risk was significantly decreased in CODE-15% cohort (HR 0.66, 95% CI 0.57 – 0.76, p<0.001) but not in ELSA-Brasil (HR 1.00, 95% CI 0.75 – 1.33, p=0.990). Which might be justified by the lack of statistical power due to the small number of deaths in this group for the ELSA-Brasil cohort (n=19). Additional adjustments for dyslipidemia (CODE-15% and ELSA-Brasil cohorts) and obesity (ELSA-Brasil) do not qualitatively change the results.
Electrocardiographic-age and cardiovascular risk factors
Figure 3(A) represents which cardiovascular risk factors were most likely associated with a predicted ECG-age 8 or more years greater than chronological age considering all ECGs from ELSA-Brasil cohort. After logistic regression adjusted for age and sex, hypertension, diabetes, smoking and obesity remained significantly associated with an increased odds of having an ECG-age 8 or more years greater than chronological age. In Figure 3(B) the same model was applied only to participants with a normal ECG. Hypertension, diabetes and smoking were significantly associated with a predicted ECG-age 8 or more years greater than chronological age.
Interpretability and time and frequency domain saliency maps
To assess whether ECG-age captures signals that can be interpreted by cardiologists, we conducted an additional experiment. We paired ECG-ages of subjects with the same chronological age, but where one of them had an ECG-age 8 or more years greater than their chronological age and the other 8 or more years less than their chronological age. Then, three medical doctors were asked to independently determine, for each pair, which ECG tracing was associated with the subject with higher ECG-age. Analyzing doctor’s assessments of 134 pairs of traces, aggregated through majority voting, we found that they were not significantly better than random (chi square=3.0, p =0.12). We provide detailed results in Sup Table 3. Throughout the experiment, doctors were given feedback about their predictions (in Stage 2), this did not increase their accuracy in the subsequent stage. In fact, they performed worse in Stage 3 (Accuracy=45.5%), after the feedback, than in Stage 1 (Accuracy=64.4%), before the feedback, or in Stage 2 (Accuracy=62.2%), during the feedback.
Additionally, we randomly generated 50 pairs of normal ECG tracings, with saliency maps22 highlighting the regions in the ECG tracing that have the highest impact in the predicted ECG-age (see Methods). Sup Fig 1 provides some illustrative examples. We asked the same set of three medical doctors to qualitatively analyze which sections of the ECG were being frequently highlighted by the visualization algorithm. Doctors independently suggested that low-frequency components, as P and T waves, were disproportionately highlighted.
We also generate saliency maps in the frequency domain giving the relative importance of each frequency component for the final prediction (see Methods). Sup Fig 2 shows the median and interquartile range from this analysis for 100 normal exams in each cohort. The analysis suggests the frequency component between 8 and 15 Hz of the ECG spectrum are the ones that most contribute to the model prediction.
Discussion
Biological aging refers to the decline in tissue/organismal function, whereas chronological aging simply indicates the time passed since birth23. In normally aging individuals, chronological and biological ages are the same. Biological aging, however, is affected by lifestyle, environmental factors, inheritable and acquired conditions, and diseases. Accelerated biological aging points to the decline of tissue/organismal function at a faster rate than the average, and hence associated with the possibility of a premature death23. Most available biomarkers of biological age measure a specific aspect of aging, like molecular and cellular biomarkers, and functional and structural vascular parameters23.
ECG exams are low-cost and widely available, being part of the routine evaluation of many patients in both primary and specialized care. And, previous studies have shown that the age estimated from the ECG tracing (the ECG-age) is related to cardiovascular health14,15: The ECG-age, calculated using a Bayesian model in 5-minutes ECGs, tended to be close to the chronological age in healthy non-athletes, whereas most subjects with risk factors or proven heart diseases had an ECG-age that was higher than their chronological age14; on another study, patients with a DNN-predicted age that exceeded chronologic age by 7 or more years presented a higher frequency of low ejection fraction, hypertension, and coronary disease15. More recently, ECG-derived age has also been related to vascular aging, measured by peripheral endothelial dysfunction24.
In this paper, we use a data-driven approach to obtain a model that predicts age from the raw ECG tracing. We expect such a model to capture, on average, how aging affects the ECG exam. In this case, having an ECG-age higher than one’s actual age is an indication that the exam is similar to those of older people, who have a higher associated cardiovascular risk and are more likely to die from cardiovascular diseases. Indeed, we have show that classical cardiovascular risk factors are associated with having a ECG-age 8 or more years greater than the chronological age, and for some risk factors, such as hypertension, diabetes mellitus, and smoking, the association remains even when only normal ECGs were considered (cf. Figure 3). Moreover, this study shows, in three different cohorts, that the difference between the ECG-age and the chronological age can be used as a marker of the risk of death.
From a clinical perspective, ECG-age presents itself as a natural summary index of ECG changes and abnormalities accumulated during the life course of each subject. ECG tracings are affected by a large number of factors and mechanisms and, while summarizing them in a single number is a huge oversimplification, it can still be useful. It transmits the idea of cardiovascular risk in a language that does not require medical expertise and can be understood by patients and other professionals without medical training. In the literature, an AI-based model that predicts the probability of 1-year mortality have been recently proposed8 and could also play a similar role. Nonetheless, reporting the ECG-age seems more intuitive from a patient perspective and, probably, easier to be used in clinical practice.
The analyses suggest that ECG-age is capable of capturing more than traditional ECG abnormalities or underlying conditions. Over-estimation of ECG-age was significantly associated with death after controlling for age and sex, cardiovascular risk factors and, even, when calculated only for subjects with normal ECGs. In the case of normal ECGs, this association was significant in CODE-15% but not in the ELSA-Brasil cohort. This might be explained by the small number of deaths in the ELSA-Brasil cohort or by the poorer annotation of risk factors in the CODE-15% study, in which this information is self-reported and obtained during the clinical activity. Moreover, ECG measurements were also not meaningfully different in the groups with predicted ECG-age 8 or more years greater than, 8 or more years less than, and within a range of 8 years from their chronological age.
Since the maintenance of a normal ECG status over time is associated with a low risk of cardiovascular diseases in a dose-response relationship25, we hypothesize that the DNN might be able identify subtle abnormalities that are not being currently identified in traditional analysis. This could help justify the capacity of evaluating the risk even for apparently normal ECGs. The lack of capability of trained doctors to distinguish between pairs of normal ECGs of the same age but different ECG-age (see Sup Table 2) also supports this hypothesis.
The advance of interpretable machine learning algorithms26 might make it possible to leverage the features used by these models into clinical practice. Our initial insights on the mechanisms used for the estimation of ECG-age - and its prognostic value - suggest that low-frequency components of the ECG, usually associated with P and T waves, might play an important role although these considerations would deserve a specific and more detailed investigation.
Our work is perhaps best understood in the context of its limitations. The use of end-to-end DNN models is central to this work and yielded interesting findings (such as the possibility of predicting mortality even for apparent normal ECGs). Nonetheless, the complexity of these models makes it hard to fully interpret the results. Our exploratory analysis included sensitivity analysis both in the time and frequency domain and the analysis and review of more than one hundred ECGs by trained cardiologists. While it did provide some insight on what is being detected by the model, it is far from sufficient to completely explain the findings. Furthermore, while our study demonstrates the potential clinical utility of the ECG-age in individual risk prediction, further studies are desired to evaluate its incorporation in the clinical practise, including its use in addition to established risk calculators for primary prevention of cardiovascular diseases.
To conclude, ECG predicted-age may reflect biological age and it is a promising tool for risk prediction of overall mortality. It summarizes the information from the ECG in a single index with a clear interpretation for the patient. Data for training these models are also easy to obtain: while producing large datasets fully annotated with electrocardiographic abnormalities requires many hours of work by trained physicians, self-reported age is usually easy information to come by. Finally, the ability to predict mortality even for normal ECGs suggests that there might still be subtle electrocardiographic markers and abnormalities that are of interest and are not being captured in traditional analysis and the models presented here might be a useful tool in trying to find them.
Methods
The CODE cohort
Clinical Outcomes in Digital Electrocardiography (CODE) study18 was developed with the database of digital ECG exams of the TeleHealth Network of Minas Gerais (TNMG)27,28, Brazil, linked to the public databases of the Mortality and Hospitalization Information Systems. It was expected that the consolidated database would be useful for multiple purposes, including the evaluation of the epidemiological and prognostic significance of ECG findings29 and the development of new methods of automatic classification of ECG abnormalities3, using both conventional statistical methods and new machine learning techniques.
Patients over 16 years old with a valid ECG performed from 2010 to 2017 were included. Clinical data were self-reported. A hierarchical free-text machine learning algorithm recognized specific ECG diagnoses from cardiologist reports. The Glasgow ECG Analysis Program provided Minnesota Codes and automatic diagnostic statements. For the CODE database, the presence of a specific electrocardiographic diagnosis was considered automatically when there was an agreement between the diagnosis extracted from the cardiologist report and the automatic report from Glasgow Diagnostic Statements or Minnesota code. In cases where there were discordances between medical reports and one of the automatic programs, a manual revision was done by trained cardiologists18.
The electronic cohort was obtained linking data from the ECG exams (name, sex, date of birth, city of residence) and those from the national mortality information system, using standard probabilistic linkage methods (FRIL: Fine-grained record linkage software, v.2.1.5, Atlanta, GA). After the linkage, the data was anonymized for storage18.
From a dataset of 2,470,424 ECGs, 1,773,689 patients were identified. After excluding the ECGs with technical problems and patients under 16 years old, a total of 1,558,415 patients were included for analyses. The mean age was 51.6 [s.d.17.6] years with 40.2% male. The overall mortality rate was 3.34% in a mean follow-up of 3.7 years18.
The model was also evaluated in two established cohorts, the São Paulo-Minas Gerais Tropical Medicine Research Center (SaMi-Trop)20 of Chagas disease patients and the Longitudinal Study of Adult Health (ELSA-Brasil)19, of Brazilian public servants, in which raw ECG tracings from the baseline and follow-up with total mortality as the end-point are available. These cohorts are described next.
The CODE-15% cohort
The CODE-15% is a subset of the CODE cohort. The CODE cohort were divided into 85-15% splits, with the 85% split being used for developing the model and 15% hold-out being the one used in subsequent analyses and referred to as CODE-15%. This hold-out set is obtained by stratifying the patients in age groups and random sampling from these groups. The result is an, approximately, uniform age distribution from years 16 to 85 years by randomly picking, approximately the same number of exams (3300) in equally-spaced one-year intervals. Only the first patient exam is considered in all the analysis with this cohort.
The ELSA-Brasil cohort
ELSA-Brasil is a cohort study that aims to investigate the development of chronic diseases, primarily diabetes and cardiovascular diseases, over a long-term follow-up.30,31 All active or retired employees of the six institutions (and, in a few instances, also of related educational or health institutions) from six Brazilian capitals, of both sexes, and with ages between 35 and 74 years, were eligible for the study. Exclusion criteria were severe cognitive or communication impairment, intention to quit work at the institution in the near future for reasons not related to retirement, and, if retired, residence outside the corresponding metropolitan area. Women with current or recent pregnancy were rescheduled so that the first interview could take place ≥4 months after delivery. A total of 15,105 participants were enrolled, 6887 men and 8218 women, thus giving reasonably large numbers for sex-specific analyses. Baseline assessment (2008-10) included detailed interviews and measurements to assess social and biological determinants of health, as well as various clinical and subclinical conditions related to diabetes, cardiovascular diseases, and mental health. A second and third visit of interviews and examinations were done (2012-14 and 2017-2019) to enrich the assessment of cohort exposures and to detect initial incident events. Annual surveillance has been conducted since 2009 for the ascertainment of incident events. Biological samples (sera, plasma, urine, and DNA) obtained at both visits have been placed in long-term storage. In a mean of 9.36 years of follow-up, 14,263 (94,5%) participants were followed, until 01/01/2020, 617 (4.3%) died and 842 (5.6%) were lost to follow-up.
The SaMi-Trop cohort
SaMi-Trop is an NIH-funded prospective cohort of 1959 patients with chronic Chagas cardiomyopathy to evaluate whether a clinical prediction rule based on ECG, brain natriuretic peptide (BNP) levels, and other biomarkers can be useful in clinical practice.20,32 The study is being conducted in 21 municipalities of the northern part of Minas Gerais State in Brazil with at least 2 years of follow-up, including one visit at baseline and another at 24 months. Eligible patients were selected based on the ECG results performed in 2011–2012 by the Telehealth Network of Minas Gerais, which from now on will be called index ECG. Only patients who fulfilled all of the following inclusion criteria were selected: (1) self-reported Chagas disease; (2) an index ECG reported as abnormal and (3) aged 19 years or more. The exclusion criteria included pregnancy or breastfeeding, and any life-threatening disease with an ominous prognosis that suggested a life expectancy of <2 years. The baseline evaluation included a collection of sociodemographic information, social determinants of health, health-related behaviors, comorbidities, medicines in use, history of previous treatment for Chagas disease, functional class, quality of life, blood sample collection, and ECG. Patients were mostly female, aged 50–74 years, with low family income and educational level, with known Chagas disease for >10 years; 46% presented with functional class >II. Previous use of benznidazole was reported by 25.2% and permanent use of pacemaker by 6.2%. Almost half of the patients presented with high blood cholesterol and hypertension and one-third of them had diabetes mellitus. N-terminal of the prohormone BNP (NT- ProBNP) level was >300 pg/mL in 30% of the sample. Clinical and laboratory markers predictive of severe and progressive Chagas disease were identified as high NT-ProBNP levels, as well as symptoms of advanced heart failure 32. During a mean follow-up of 2.09 years, 1631 patients were being followed until the 2nd visit. 104 (6.4%) died and 328 (16.7%) were lost to follow-up.
Electrocardiographic and clinical definitions in CODE and ELSA-Brasil
An ECG was considered “normal” in the CODE cohort according to conventional clinical reporting and by having automatic measurements by the Glasgow software within the normal range. In the ELSA-Brasil and Sami-Trop cohorts, ECGs were codified by the Minnesota code, as described elsewhere,19,33 with manual review of a trained cardiologist. Those with no major or minor abnormalities according to the criteria were considered normal.
All clinical risk factors included in the CODE cohort were self-reported in a clinical standardized questionnaire. Hypertension, diabetes and dyslipidemia were also considered if informed use of antihypertensives, oral hypoglycemic agents or insulin, statins or fibrates; respectively. In the Sami-Trop cohort, the risk factors were also self-reported in a baseline interview. In the ELSA-Brasil study, hypertension was defined as systolic blood pressure ≥ 140 mmHg or diastolic blood pressure ≥ 90 mmHg, or verified treatment with anti-hypertensive medication during the past 2 weeks; diabetes mellitus as a report of a previous diagnosis of diabetes, or the use of medication for diabetes, or meeting a diagnostic value for diabetes according to one of the following tests: fasting or 2-h plasma glucose obtained during a 75-g oral glucose tolerance test or HbA1C; dyslipidemia as either a total cholesterol ≥ 240mg/dl, LDL cholesterol ≥ 160 mg/dl, HDL cholesterol < 40 mg/dl or triglycerides ≥ 150 mg/dl; obesity as BMI ≥30 kg/m2 and smoking by participants self-report.
The model
All the exams from patients in the 85% split of the CODE cohort were used to develop a convolutional DNN to predict age. Thesplit was further divided into 80-5% splits: being the first used to learn the neural network weights, and the 5% remaining samples used for comparing design choices and adjusting optimization parameters. As we did in the CODE-15% cohort, in the 5% validation split we guarantee an, approximately, uniform age distribution by picking the same number of patient exams for equally-spaced one-year intervals. The remaining 80% training dataset is unbalanced and, to correct it, we weight the exam records inversely proportional to the frequency of patients with the given age during the training procedure.
The architecture and the set of hyperparameters are described next and are similar to a previous study3, for which the DNN was trained to detect 6 types of ECG abnormalities (considered representative of both rhythmic and morphologic ECG abnormalities) on the same dataset. The results with this choice of hyperparameters were considered satisfactory and no further hyperparameter search was performed.
We used a convolutional neural network similar to the residual network proposed for image classification21, but adapted to unidimensional signals. This architecture allows deep neural networks to be efficiently trained by including skip connections. We have adopted the modification in the residual block proposed by He et. al.34.
All ECG recordings, which have between 7 and 10 seconds of duration and are sampled at frequencies ranging from 300 to 1000Hz, are re-sampled to 400 Hz and zero-padded, resulting in signals of fixed length (4096 samples), which are fed to the neural network. The output is the age-predicted for that given exam.
The network consists of a convolutional layer followed by 5 residual blocks with two convolutional layers per block. The output of each convolutional layer is rescaled using batch normalization35 and fed into a rectified linear activation unit ReLU. Dropout36 is applied after the nonlinearity. The convolutional layers have filter length 17, starting with 4096 samples and 64 filters for the first layer and residual block and increasing the number of filters by 64 and subsampling by a factor of 4 every residual block. Max Pooling37 and convolutional layers with filter length 1 are included in the skip connections to make the dimensions match those from the signals in the main branch.
The mean square error is minimized using Adam optimizer38 with default parameters and a learning rate of 0.001. The learning rate is reduced by a factor of 10 whenever the validation loss does not present any improvement for 7 consecutive epochs. The neural network weights were initialized sampling from a normal random variable scaled as in He et. al.39 and the bias was initialized with zeros. The training runs for 70 epochs with the final model being the one with the best validation results during the optimization process.
Cardiologist assessment of ECG-age from the tracings
To assess whether ECG-age was capturing ECG changes that are recognizable to medical doctors, we conducted an additional experiment asking three experienced medical doctors to identify, in paired ECGs, ECG tracings associated with having higher ECG-age. All ECGs considered were normal ECGs from the CODE cohort. Within each pair of equal chronological age and sex, one individual had an ECG-age 8 or more years greater than their chronological age and the other had an ECG-age 8 or more years less than their chronological age. We included one pair of male and one pair of female patients for each age between 16 and 85 (whenever possible), totaling 134 pairs. At the edges of our age-range, it was not always possible to have an ECG tracing with ECG-age 8 or more years smaller than the chronological age paired 8 years or more years less than the chronological age, and, in these situations, we use tracings associated with ECG-ages within the 8 years range of the patient’s chronological age. The experiment was divided into three stages where doctors annotated 44, 45, and 45 pairs of ECGs tracings respectively. In stages 1 and 3, doctors were not given the answer after accomplishing the task, and in stage 2 they were. The idea behind this distinction is to see whether doctors would fare any better after a round with explicit feedback on their performance.
Saliency maps in the time and frequency domain
We performed an analysis to assess the relative importance of different segments of the ECG trace in the age prediction. The saliency maps intensity at each point is proportional to the norm of the partial derivatives of the predicted age in relation to the input. These values are computed through a single back-propagation pass through the DNN22. Similar approaches have been pursued in the interpretation of other DNN-based ECG predictors8,40. We then generated ECGs superposed with the saliency maps as in Sup Fig 1. The colored dot sizes are proportional to the magnitude of the partial derivative of the output with regard to the magnitude of the signal at the given point, and might be interpreted as the relative importance of the given point to the model prediction (at least in terms of the linearized local analysis).
In Sup Fig 2, we show a similar analysis, but now in the frequency domain. We take discrete Fourier transform of the gradients computed as described above. We do that for 100 ECG exams, sampled at random, from the ECG exams classified as normal in each of the three different cohorts (CODE-15%, ELSA-Brasil and SaMi-Trop) and show the median and interquartile range in the Figure.
Statistical analysis
To assess the performance of the DNN model in the CODE-15%, ELSA-Brasil and SaMi-Trop cohorts, we computed the R square metric using linear regression and calculated the mean absolute error (MAE) using the chronological age. For further analysis, we divided the samples in three groups, based in differences between predicted ECG-age and chronological age: those with ECG-age 8 or more years less than the chronological age, those with ECG-age within a range of 8 years from their chronological age, and those ECG-age 8 or more years greater than the chronological age.
For mortality analysis, we used Cox proportional regression model, reporting hazard ratios (HR) and 95% confidence intervals (95%CI). The analysis was performed in all ECGs of the three cohorts, with two levels of adjustments: age and sex; age, sex, and other cardiac risk factors (hypertension, diabetes mellitus, smoking). Other two models in the second level of adjustment including dyslipidemia, for CODE-15% and ELSA-Brasil, and obesity, only for ELSA-Brasil, were fitted. A second mortality analysis with the same parameters, was performed considering only normal ECGs from CODE-15% (n=80679) and ELSA-Brasil (n=7691) cohorts. The proportional hazard assumption was verified using a log (-log (survival)) plot and Schoenfeld residuals. The area under the receiver operating characteristic curve (AUC) was used to evaluate the Cox model performance for 1-year mortality risk prediction.
To explore the association of risk factors with the ECG-age being 8 or more years greater than chronological age we performed a logistic regression analysis for the ELSA-Brasil cohort including all ECG and only subjects with normal ECG. In this analysis we fitted a model for each risk factor adjusted by age and sex and reported the ORs and 95% confidence intervals.
Data Availability
Upon publication, some of the cohorts used in the model evaluation will be made available. Information about mortality, age, sex, the ECG tracings and the flag indicating whether the ECG tracing is normal will be made available with no restriction for the Sami-Trop cohort and for the CODE-15% cohorts. The DNN model parameters that give the results presented in this paper will also be made available without restrictions. This should allow the reader to partially reproduce the results presented in the paper. Restrictions apply to additional clinical information on these two cohorts, to the full CODE cohort and to the ELSA-Brasil cohort, for which requests will be considered on an individual basis by the Telehealth Network of Minas Gerais and by ELSA-Brasil Steering Committee. Any data use will be restricted to non-commercial research purposes, and the data will only be made available on the execution of appropriate data use agreements.
Contribution statement
E.M.L., G.M.M.P., A.H.R., T.B.S. and A.L.R. were responsible for the study design. A.L.R. conceived the project and acted as the project leader. A.H.R. choose the architecture, implemented and tuned the deep neural network. E.M.L did the survival analysis and all the statistical tests. G.M.M.P. and A.L.R. interpreted the results and provided clinical interpretation. A.H.R., D.M.O., P.R.G. were responsible for preprocessing the training data. P.R.G was responsible for maintaining and extracting the CODE database. M.M.P.F, E.C.S., S.M.B., L.G., B.B.D. were responsible for cohort design and management, data acquisition, follow-up and ECG exams in ELSA-Brasil and Sami-Trop cohorts. G.M.M.P., A.H.R., E.M.L., W.M.Jr., T.B.S. and A.L.R. contributed to the writing and all authors revised it critically for important intellectual content. All authors read and approved the submitted manuscript.
Competing interests
None of the authors have financial and non-financial competing interests.
Ethics declarations
This study complies with all relevant ethical regulations. CODE Study was approved by the Research Ethics Committee of the Universidade Federal de Minas Gerais, protocol 49368496317.7.0000.5149. Since this is a secondary analysis of anonymized data stored in the TNMG, informed consent was not required by the Research Ethics Committee for the present study. ELSA-Brasil was approved by the Research Ethics Committees of the participating institutions and by the National Committee for Research Ethics (CONEP 976/2006) of the Ministry of Health. Sami-Trop study was approved by the Brazilian National Institutional Review Board (CONEP), No. 179.685/2012. In both investigations, all human subjects were adults who gave written informed consent. All researchers who deal with datasets signed terms of confidentiality and data utilization.
Competing interests
The authors declare no competing interests.
Data Availability
Upon publication, some of the cohorts used in the model evaluation will be made available. Information about mortality, age, sex, the ECG tracings and the flag indicating whether the ECG tracing is normal will be made available with no restriction for the Sami-Trop cohort and for the CODE-15% cohorts. The DNN model parameters that give the results presented in this paper will also be made available without restrictions. This should allow the reader to partially reproduce the results presented in the paper. Restrictions apply to additional clinical information on these two cohorts, to the full CODE cohort and to the ELSA-Brasil cohort, for which requests will be considered on an individual basis by the Telehealth Network of Minas Gerais and by ELSA-Brasil Steering Committee. Any data use will be restricted to non-commercial research purposes, and the data will only be made available on the execution of appropriate data use agreements.
Code Availability
The code for the model training, evaluation and statistical analysis is available at https://github.com/antonior92/ecg-age-prediction.
TABLES AND FIGURES
Acknowledgments
This research was partly supported by the Brazilian Agencies CNPq, CAPES, and FAPEMIG, by projects IATS, INCT-Cyber and Atmosphere, and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by Knut and Alice Wallenberg Foundation. The ELSA-Brasil-Brasil study was supported by the Brazilian Ministries of Health and of Science and Technology (grants 01060010.00RS, 01060212.00BA, 01060300.00ES, 01060278.00MG, 01060115.00SP, and 01060071.00RJ). 5U19AI098461-07). The SaMi-Trop cohort study is supported by the National Institutes of Health (P50 AI098461-02 and U19AI098461-06). AHR, BBD, PAL, SMB, LG, WMJr, and ALR are recipients of unrestricted research scholarships from CNPq; EMS and AHR received scholarships from CAPES and CNPq; and DMO, WMJr and ALPR received a Google Latin America Research Award scholarship. None of the funding agencies had any role in the design, analysis or interpretation of the study. We also thank NVIDIA for awarding our project with a Titan V GPU.