Machine Learning Approaches for Electronic Health Records Phenotyping: A Methodical Review

Siyue Yang; Paul Varghese; Ellen Stephenson; Karen Tu; Jessica Gronsbell

doi:10.1101/2022.04.23.22274218

ABSTRACT

Objective Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records (EHRs) for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (i) the data sources used, (ii) the phenotypes considered, (iii) the methods applied, and (iv) the reporting and evaluation methods used.

Materials and Methods We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

Results Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly-supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered marginal improvement over traditional ML for many conditions.

Discussion Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

Conclusion Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

BACKGROUND AND SIGNIFICANCE

Electronic health records (EHRs) are a central data source for biomedical research.[1] In recent years, EHR data has been used to support discovery in disease genomics, to enable rapid and more inclusive clinical trial recruitment, and to facilitate epidemiological studies of understudied and emerging diseases.[2–6] EHRs are also positioned to be a key source of data for the development of personalized treatment strategies and generation of real-world evidence.[7,8] A critical aspect of any secondary use of EHR data is phenotyping, the process of identifying patients with a specific phenotype (e.g. the presence or onset time of a clinical condition or characteristic) based on information in their EHR.[9–11] Phenotyping is one of the first steps of an EHR-based application as it is used to both identify and characterize the population of interest.

Generally, phenotyping consists of 4 steps: (i) data preparation, (ii) algorithm development, (iii) algorithm evaluation, and (iv) application of the algorithm (Figure 1). The focus of our article is on the use of machine learning (ML) for algorithm development. Traditionally, phenotypes have been inferred from rule-based algorithms consisting of inclusion and exclusion criteria handcrafted by clinical and informatics experts.[12] However, given the complexity and variation in documentation across phenotypes, providers, and institutions, developing a sufficient set of rules is prohibitively resource-intensive and difficult to scale across conditions and healthcare settings.[13,14] For example, the Electronic Medical Records and Genomics (eMERGE) Network was an early leader in phenotyping, creating a public phenotype library called PheKB. A key finding from this effort was the time intensiveness of rule-based phenotyping, which sometimes required 6-10 months of manual effort depending on the complexity of the condition.[14] Similar findings have been reported by other large research networks such as OHDSI (Observational Health Data Sciences and Informatics).[10]

Figure 1.

Overview of the phenotyping process. Step 1 involves data preparation which includes (i) extraction and processing of relevant data from records of candidate patients from the data warehouse and (ii) manual review of a subset of charts to obtain gold-standard phenotype labels. Step 2 is the algorithm development phase in which researchers use the data from Step 1, often referred to as the data mart, to develop the phenotyping algorithm with a rule-based or machine learning (ML) method. Step 3 evaluates the accuracy of an algorithm by comparing the assigned phenotype from the algorithm to the gold-standard label, often with estimates of the positive predictive value (PPV), sensitivity, and other accuracy metrics. Step 4 applies the algorithm from Step 2 to obtain the cohort of patients with the phenotype for downstream analysis. The identified cohort can then be used in a variety of downstream applications.

To address this barrier to EHR-based research, there has been increasing interest in phenotyping algorithms derived from ML models.[15,16] In contrast to rule-based approaches, ML methods aggregate multiple sources of information available in patient records in a more automated and generalizable fashion to improve phenotype characterization.[17] While there has been substantial progress in ML approaches designed to make phenotyping more efficient, accurate, and portable in recent years, these advances have yet to be formally synthesized.[18] To the best of our knowledge, 5 articles surveyed EHR-based phenotyping methods through 2018.[11,15–17,21] These articles provide conceptual summaries of rule-based methods and early ML approaches and do not capture advances in semi-supervised, weakly-supervised, and deep learning that were popularized after publication (Table S1). Moreover, in light of the wave of EHR-based studies prompted by the pandemic and the increased complexity of ML approaches relative to their rule-based counterparts, there is a pressing need to survey the landscape of phenotyping given its ubiquity in EHR-based applications.[19,20]

OBJECTIVE

Our work fills this gap in current literature through a methodical review of ML-based phenotyping with respect to (i) the data sources used, (ii) the phenotypes considered, (iii) the methods applied, and (iv) the reporting and evaluation methods used. Based on our analysis of 37 items recorded across 100 selected articles, we also identify areas of future research.

MATERIALS AND METHODS

Working definitions

To situate our discussion, key terminology related to EHR data and ML is provided in Table 1. We broadly classified an ML method as either (i) supervised, (ii) semi-supervised, (iii) weakly-supervised, or (iv) unsupervised according to the model used and the data available for training.[22,23] We further classified each method as deep learning if it is neural network-based and as a traditional ML approach otherwise. Consistent with recent literature,[18] we used an inclusive definition of phenotyping as a procedure that uses EHR data to “assert characterizations about patients.” Our study therefore includes binary phenotypes such as the presence of disease and nuanced phenotypes such as disease severity, disease progression, and social determinants of health (SDOHs). We focused solely on literature using EHRs, defined as longitudinal records of a patient’s interactions with a healthcare institution or system primarily authored by health professionals. We regard our work as a “methodical review” as it does not qualify as a Cochrane-style review, but closely adheres to the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) guidelines.[24]

Table 1.

Descriptions of (a) terms used to describe EHR data and (b) ML methods in the context of phenotyping.

Search strategy

Due to the broad and evolving definition of phenotyping, early systematic reviews employed a manual review of all full-text articles published in a small number of informatics venues.[12,17] This manual approach was later expanded to a PubMed query [15] using an overly inclusive search designed to capture all articles that (i) used EHRs as the primary data source and (ii) utilized ML or natural language processing (NLP) or considered phenotyping. The PubMed query was similarly restricted to a subset of informatics venues in order to target articles focused on phenotyping rather than clinical applications. We followed an analogous strategy, but increased the scope of our search by including Web of Science as we found that some articles were missed by PubMed. We also added search strings related to ML.[25]

Specifically, our search of PubMed and Web of Science identified full-text articles that employed ML or NLP or considered phenotyping with EHR data published between January 1, 2018, and April 14, 2022. The range of publication years was specified to not overlap with existing reviews, and the search focused on the same major informatics venues: (1) Journal of the American Medical Informatics Association (JAMIA), (2) Journal of Biomedical Informatics (JBI), (3) PLoS One, (4) Proceedings of the American Medical Informatics Association’s Annual Symposium (AMIA), and (5) JAMIA Open.[12,15,16,26,27] The complete search queries are provided in Table S2.

Study selection

Our overall search strategy is depicted in a PRISMA diagram (Figure 2).

Figure 2.

PRISMA diagram for article selection. Only one exclusion reason was chosen for each record during the screening process, although the reasons are not mutually exclusive.

Title and abstract screening

After removing duplicates, articles were retrieved and underwent title and abstract screening by two authors (S.Y. and J.G.). A third author (P.V.) resolved disagreements. Articles were excluded if they (i) were reviews, perspectives, or editorials, (ii) did not use EHRs as a primary data source, (iii) did not use ML methods, or (iv) did not consider phenotyping. Table S3 provides a list of article exclusions.

Full-text review

One author (S.Y.) reviewed the full-text articles and another author (J.G.) verified the information from the full-text review when necessary. After excluding papers that did not focus on ML approaches for EHR phenotyping, 100 papers were selected (Table S4). During the full-text review, we extracted information on: (i) the data sources used, (ii) the phenotypes considered, (iii) the methods applied, and (iv) the reporting and evaluation methods used. A list of the 37 recorded variables is included in Table S5.

RESULTS

In reviewing the literature, we found that all but two deep learning approaches were supervised (Figure 3). We therefore summarize contributions in traditional supervised, deep supervised, semi-supervised, weakly-supervised, and unsupervised learning in the subsequent sections.

Figure 3.

Number of articles that used the various machine learning paradigms.

Data Sources

63 of the 100 articles relied on EHR data from a single institution, while 8 articles used data from multiple institutions, including research networks such as the OHDSI [28] and eMERGE.[29] The remaining articles leveraged publicly available data from the Medical Information Mart for Intensive Care (MIMIC-III) database and NLP competitions (Table S6). A small number of studies utilized additional data sources, including administrative claims [30–36] and registry databases.[37–40] 94 studies were conducted in the US.

With respect to the data types used for developing phenotyping algorithms, 70 of the 100 articles utilized unstructured free-text data and half of these articles also incorporated information from structured data. Unsurprisingly, diagnoses were the most common structured data element and were typically derived from International Classification of Diseases, 9th or 10th Revision (ICD-9/10) billing codes (Figure 4(a)). Clinical note types (e.g. progress notes, discharge summaries) used for algorithm development were rarely specified (Figure 4(b)). However, most articles reported on the NLP software that was used to process free-text. The clinical Text Analysis and Knowledge Extraction System (cTAKES) was the most popular. Frequently used terminologies and NLP software are detailed in Tables S7 and S8, respectively.

Figure 4.

Types of structured data and clinical notes used to develop phenotyping algorithms in the selected articles (excluding articles using competition data). A data type is presented if it is used in more than one article. Encounters include encounter metadata, while medical history notes include both social history and cardiac surgical history.

Phenotypes

The articles in our study considered 157 phenotypes, with 40 articles focusing on more than one phenotype. Studies using data from NLP competitions focused on adverse drug events [41] and clinical trial eligibility,[42] while studies using MIMIC-III characterized 25 phenotypes seen in the intensive care unit.[43] Outside of the articles using public data, chronic conditions with a large burden on the healthcare system, such as heart diseases and type II diabetes mellitus, were most frequently considered overall. 69 of the 100 articles aimed to identify binary phenotypes (e.g. case/control disease status), while few focused on severity or temporal phenotypes (4 and 11 articles, respectively). Although this finding coincides with previous reviews, there were considerable differences in the top phenotypes across the 5 ML paradigms (Figure 5). The phenotypes considered in articles utilizing traditional supervised learning were not identified in previous reviews.[12,15] These include phenotypes primarily documented in free-text such as suicidal behavior [44,45] and SDOHs.[30,46–49] Deep supervised learning papers similarly considered SDOHs [50–57] as well as episodic conditions [58–61] and COVID-19.[62,63] The phenotypes considered by articles using semi- or weakly-supervised methods aiming to expedite algorithm development included common, chronic conditions [64–66] that had been previously identified with a rule-based or traditional supervised learning method.[13,67] Most unsupervised methods considered progressive conditions associated with multiple comorbidities or phenotypic heterogeneity such as dementia and chronic kidney disease.[68,69]

Figure 5.

Top phenotypes considered within each machine learning category and the number of articles of each phenotype (excluding articles using competition data sources). Phenotypes are colored if they appear in more than one category.

ML Methods

Traditional supervised learning

60 articles employed supervised learning methods, with 27 articles using traditional models. In contrast to rule-based algorithms, phenotyping algorithms derived from supervised learning are less burdensome to develop as they are learned from the data.[15] Traditional supervised learning is also more amenable to incorporating a greater number of features predictive of the phenotype into the algorithm, such as information in clinical notes.[17] Among the articles using traditional supervised learning, half of them mapped terms in free-text to clinical concepts in the Unified Medical Language System (UMLS) [70] for use in algorithm development. Similar to features derived from structured data elements, the extracted concepts were typically engineered into patient-level features (e.g. total number of positive mentions of a concept in the record) based on the consensus of domain experts.[71] Gold-standard labels for model training were predominantly annotated through manual review of patient records.[72] In some instances, labels were also derived from registry data,[37] laboratory results,[35,36,73] or rule-based algorithms.[47]
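As an illustration of the patient-level feature engineering described above, the following minimal sketch counts positive (non-negated) mentions of each concept per patient. The input format, the concept identifiers, and the function name `positive_mention_counts` are hypothetical and do not correspond to the output of any particular NLP tool or to the pipeline of any reviewed article.

```python
from collections import Counter, defaultdict

# Hypothetical NLP output: one tuple per concept mention found in a note,
# given as (patient_id, concept_id, is_negated). Concept IDs are placeholders.
mentions = [
    ("p1", "C_DIABETES", False),
    ("p1", "C_DIABETES", False),
    ("p1", "C_DIABETES", True),    # e.g. "denies diabetes"
    ("p2", "C_DIABETES", True),
    ("p2", "C_HYPERTENSION", False),
]

def positive_mention_counts(mentions):
    """Count non-negated mentions of each concept for each patient."""
    counts = defaultdict(Counter)
    for patient_id, concept_id, is_negated in mentions:
        if not is_negated:
            counts[patient_id][concept_id] += 1
    return counts

features = positive_mention_counts(mentions)
print(features["p1"]["C_DIABETES"])  # 2 positive mentions
print(features["p2"]["C_DIABETES"])  # 0: the only mention was negated
```

Counts of this kind could then be used as columns of a feature matrix alongside features derived from structured data.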

The most commonly used methods were random forest, logistic regression, and support vector machine (SVM) (Table 2). A common trend among selected articles was the use of a selective sampling method, such as undersampling or the Synthetic Minority Oversampling Technique (SMOTE), to address class imbalance for rare phenotypes such as surgical site infections and rhabdomyolysis.[31,33,35,37,48,74,75] Several models, including SVM, single-layer perceptron, and logistic regression, were also extended to accommodate federated analysis of distributed EHR data held locally at multiple institutions to identify adverse drug reactions.[33]
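To make the idea of selective sampling concrete, the sketch below shows random undersampling of the majority class, the simpler of the two techniques named above. It is an illustrative example on made-up labels, not the procedure of any specific reviewed article; the function name `undersample` and the 5% case prevalence are assumptions.

```python
import random

def undersample(records, labels, seed=0):
    """Randomly drop majority-class records until both classes have equal size."""
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted([cases, controls], key=len)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [records[i] for i in kept], [labels[i] for i in kept]

# Toy data: a rare phenotype with 5 cases among 100 patients.
labels = [1] * 5 + [0] * 95
records = [f"patient_{i}" for i in range(100)]

X_bal, y_bal = undersample(records, labels)
print(len(y_bal), sum(y_bal))  # 10 records, 5 of them cases
```

Resampling of this kind is normally applied to the training data only, since accuracy metrics such as PPV depend on the phenotype prevalence in the evaluation set.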

Table 2.

Common methods in each machine learning category. A method is presented if it appeared in more than one article.

Deep supervised learning

While traditional supervised learning methods enable the use of free-text in algorithm development, they are limited by their inability to handle raw input data. Deep learning models consist of many processing layers that discover intrinsic patterns within data to alleviate the burden of feature engineering.[76,77] This is particularly valuable in the context of EHR data as models can learn rich representations of the clinical language in free-text.[78] All but 2 articles employing deep supervised learning leveraged clinical notes. The articles utilized word embeddings to represent words or clinical concepts as real-valued vectors based on their context.[79] Word embeddings are typically learned from a large corpus in an unsupervised fashion and used as the input layer to a neural network. Common corpora within the selected articles included clinical notes [53,57,63,80–84] as well as external sources such as biomedical publications [56,61,62,85,86] and Wikipedia articles [51,58,87–90] (Table S9). Word2vec,[91] Global Vectors (GloVe),[92] and Bidirectional Encoder Representations from Transformers (BERT)[93–96] were the most frequently used methods for training embeddings (Table S10).

Among neural network architectures, feed-forward networks were only used in 3 studies (Table S11)[97] while BERT and variants were frequently used for phenotypes documented in clinical notes such as SDOHs (e.g. education [50,57]) and symptoms (e.g. chest pain,[90] bleeding [58]).

Recurrent neural networks (RNNs), convolutional neural networks (CNNs), and their variants were the most prevalent architectures as they accommodate sequential data in longitudinal patient records and clinical text.[24,76] For instance, the bidirectional long short-term memory (Bi-LSTM), an RNN variant that captures previous and future information in a sequence, was used to characterize phenotypes evolving over time such as dementia [34] and substance abuse.[54] In terms of text-based phenotyping, the Bi-LSTM with a conditional random field layer (Bi-LSTM-CRF) was used to improve identification of adverse drug events.[80,81,88] Similarly, Gehrmann et al. improved text-based phenotyping with a CNN designed to identify phrases relevant to substance abuse, depression, and other chronic conditions with the MIMIC-III phenotype dataset.[55]

Semi-supervised learning

Despite its widespread use, supervised learning is difficult to scale due to the time and resources required to obtain gold-standard labeled data.[98] Semi-supervised methods are trained with a large amount of unlabeled data (i.e. unreviewed medical records) and a small amount of labeled data to minimize the burden of chart review.[99] Three types of methods were used in 6 articles utilizing semi-supervised learning (Table 3). The first type performed feature selection using “silver-standard labels” that can be automatically extracted from patient records, such as the frequency of phenotype-specific diagnostic codes, prior to supervised training.[100,101] For instance, PheCAP processed openly available knowledge sources such as Wikipedia articles to generate a candidate list of related UMLS concepts. An ensemble sparse regression approach using silver-standard labels was then used to identify relevant concepts for supervised learning. PheCAP was used to phenotype over twenty conditions using EHR data from 4 institutions.[100,102] The second type of semi-supervised learning applied self-learning, in which a generative model trained with labeled data creates pseudo-labels for the unlabeled dataset, which are then used to train a traditional supervised model. Self-learning performed on par with supervised learning for 18 phenotypes.[64,65] In contrast, the third type directly incorporated unlabeled data into the algorithm through modification of the loss function.[66,103] For example, a semi-supervised tensor factorization (PSST) approach used the information in unlabeled data to incorporate cannot-link constraints into tensor factorization for classification of hypertension and type 2 diabetes.[66] PSST performed similarly to supervised tensor factorization, but with fewer labeled examples.
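The self-learning idea described above can be sketched on a single numeric feature as follows. This is a simplified illustration and not the method of the cited articles: the generative model is one Gaussian per class, only confidently pseudo-labeled records are kept (the 0.9 cutoff is an assumption), and for brevity the same Gaussian classifier is refit in the final step where a separate traditional supervised model would normally be trained. All names and data values are hypothetical.

```python
import math
from statistics import mean, pstdev

def fit_gaussians(xs, ys):
    """Fit one Gaussian per class plus the class prior (a simple generative model)."""
    params = {}
    for c in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == c]
        params[c] = (mean(vals), max(pstdev(vals), 1e-3), len(vals) / len(xs))
    return params

def prob_case(x, params):
    """Posterior probability of class 1 under the fitted Gaussians."""
    def weighted_density(c):
        mu, sd, prior = params[c]
        return prior * math.exp(-0.5 * ((x - mu) / sd) ** 2) / sd
    d0, d1 = weighted_density(0), weighted_density(1)
    return d1 / (d0 + d1)

# Small chart-reviewed set and a larger pool of unreviewed records (toy values).
x_lab, y_lab = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0], [0, 0, 0, 1, 1, 1]
x_unlab = [0.5, 1.5, 4.1, 6.5, 7.5, 3.9]

# Step 1: train the generative model on the labeled data.
initial = fit_gaussians(x_lab, y_lab)

# Step 2: pseudo-label the unlabeled records, keeping only confident ones.
xs_aug, ys_aug = list(x_lab), list(y_lab)
for x in x_unlab:
    p = prob_case(x, initial)
    if p >= 0.9 or p <= 0.1:
        xs_aug.append(x)
        ys_aug.append(1 if p >= 0.9 else 0)

# Step 3: retrain on labeled plus pseudo-labeled data.
final = fit_gaussians(xs_aug, ys_aug)
print(len(xs_aug), round(prob_case(7.0, final), 3))
```

In this toy example the two ambiguous records near the class boundary (4.1 and 3.9) are left unlabeled, so the retrained model uses 10 of the 12 available records.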

Table 3.

Semi-supervised methods used in the selected articles, as well as the phenotypes considered and the size of the labeled and unlabeled datasets.

Weakly-supervised learning

Like semi-supervised learning, weakly-supervised learning aims to expedite the phenotyping process, in this case by eliminating the need for gold-standard labeled data. Weakly-supervised methods rely on a “silver-standard” label that can be easily extracted from patient records in place of gold-standard labels.[104] The silver-standard label is selected based on clinical expertise as a proxy for the phenotype.[104–107] Common silver-standard labels included specific diagnosis codes, lab results, and free-text mentions of the phenotype.[108–110]

Two types of weakly-supervised learning approaches were used in 15 articles (Table 4). The first type assumed the silver-standard label follows a mixture model representing phenotype cases and controls.[108–114] For example, PheNorm utilized Gaussian mixture models with denoising self-regression for phenotyping 4 chronic conditions.[108] MAP later improved upon PheNorm with an ensemble of mixture models and was validated across 16 phenotypes and two phenome-wide association studies.[40,109] PheVis extended the resolution of PheNorm from patient-level to visit-level by incorporating past medical history information into estimation.[112] The second type of weakly-supervised methods used silver standards to directly train supervised models.[51,105,106,115–119] For instance, APHRODITE employs “noisy label” learning with an anchor feature that has a near-perfect positive predictive value (PPV), but potentially imperfect sensitivity, to train L1-penalized logistic regression models.[115] APHRODITE is openly available as software for users of the OMOP common data model. Similar approaches have been used to identify phenotypes poorly documented in structured data such as systemic lupus erythematosus.[51,116] In general, weakly-supervised models exhibit performance similar to or better than that of their rule-based and supervised counterparts (Figures S1 and S2).
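The mixture-model assumption behind the first type of approach can be illustrated with a two-component Gaussian mixture fit by expectation-maximization (EM) to a single silver-standard label. This is only a sketch of the core assumption: it omits the normalization and denoising steps of PheNorm and the ensembling of MAP, and the choice of log(1 + count) of phenotype-specific diagnosis codes as the silver-standard label, the code counts themselves, and the function names are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_component_mixture(xs, n_iter=50):
    """EM for a 1-D two-component Gaussian mixture; returns P(case) for each value."""
    lo, hi = min(xs), max(xs)
    mu = [lo, hi]                          # start the components at the extremes
    sd = [max((hi - lo) / 4, 1e-3)] * 2
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to the upper (case) component
        resp = []
        for x in xs:
            d0 = weight[0] * normal_pdf(x, mu[0], sd[0])
            d1 = weight[1] * normal_pdf(x, mu[1], sd[1])
            resp.append(d1 / (d0 + d1))
        # M-step: update means, standard deviations, and mixing weights
        for k in (0, 1):
            r = [p if k == 1 else 1 - p for p in resp]
            total = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / total
            sd[k] = max(math.sqrt(var), 1e-3)
            weight[k] = total / len(xs)
    return resp

# Hypothetical counts of a phenotype-specific diagnosis code for 12 patients.
code_counts = [0, 0, 0, 1, 0, 1, 0, 0, 8, 12, 10, 9]
silver = [math.log1p(c) for c in code_counts]

probs = fit_two_component_mixture(silver)
predicted_cases = [i for i, p in enumerate(probs) if p > 0.5]
print(predicted_cases)
```

No chart-reviewed labels are used at any point; the probability of being a case is read directly from the fitted mixture, and the 0.5 cutoff is an arbitrary choice for the example.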

Table 4.

Weakly-supervised methods used in the selected articles, as well as the phenotypes considered and the silver-standard label used.

Unsupervised learning

In contrast to the previously discussed ML approaches, unsupervised learning is used for phenotype discovery, including identification of subphenotypes,[39,74,120–128] co-occurring conditions,[69,129] and disease progression patterns.[68,130–134] Among the 19 articles utilizing unsupervised learning, Latent Dirichlet Allocation (LDA)[69,124,125,127,133] and K-means were the most frequently used methods.[120,121,123,125] LDA was applied to identify the co-occurrence of allergic rhinitis and osteoporosis among patients with kidney disease [69] as well as to capture trends in mental health and end of life care among dementia patients.[133] K-means was used to discover subphenotypes such as patients with different symptoms of acute kidney injury.[120] Model-derived subpopulations were commonly used in downstream prediction tasks.[39,68,121,122,125,131] For example, an SVM was used to identify sepsis using features of subpopulations with distinct dysfunction patterns discovered from a self-organizing map.[128] Only 1 article utilized a deep learning approach, specifically a deep autoencoder to discover subtypes of depression.[132]
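As a small illustration of how K-means groups patients into candidate subphenotypes, the sketch below runs Lloyd's algorithm on two made-up, standardized features per patient. The feature values, the choice of two clusters, and the farthest-first initialization (used here only to make the example deterministic) are assumptions and are not taken from the reviewed articles.

```python
import math

def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # next centroid: the point farthest from its nearest existing centroid
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        # assignment step: attach each point to its nearest centroid
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical standardized features for six patients (two per patient).
patients = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1),
            (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]

labels, centroids = kmeans(patients, k=2)
print(labels)
```

The resulting cluster labels are the kind of model-derived subpopulation that could then be passed to a downstream prediction model.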

Reporting and Evaluation Methods

As the articles primarily focused on identifying disease cases (excluding unsupervised learning articles), most evaluated algorithm performance with PPV, sensitivity, and/or F-score (70/81 articles reported at least one of these metrics; Table S12). The area under the ROC curve (AUROC) was also reported as an overall summary of discriminative performance (42/81 articles), while calibration was rarely assessed (5/81 articles). Additionally, several studies linked EHR data to administrative claims [30–36] or registry databases [37–40] to validate algorithm accuracy. Biorepositories were also used to demonstrate the validity of a derived phenotype in replicating a genetic association study.[109–111,135] Only 5 studies performed external validation or evaluated algorithmic fairness.[36,40,52,61,136] We also found limited reporting of the data descriptors necessary to assess the feasibility of implementing an algorithm in a new setting. Patient demographics were only reported in 38 of 71 papers using private data sources and only 20 articles released their analytic code. A majority of these articles used complex deep learning models (9 articles) and/or free-text data (9 articles).
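For reference, the most commonly reported metrics can be computed from algorithm-assigned phenotypes and gold-standard labels as in the following sketch. The toy labels and the function name `accuracy_metrics` are illustrative; the F-score shown is the F1 variant, i.e. the harmonic mean of PPV and sensitivity.

```python
def accuracy_metrics(predicted, gold):
    """PPV, sensitivity, and F1 for binary phenotype labels (1 = case)."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    nan = float("nan")
    ppv = tp / (tp + fp) if tp + fp else nan           # precision
    sensitivity = tp / (tp + fn) if tp + fn else nan   # recall
    if ppv + sensitivity > 0:
        f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    else:
        f1 = nan
    return {"ppv": ppv, "sensitivity": sensitivity, "f1": f1}

# Toy example: algorithm output versus chart-review labels for 8 patients.
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
gold      = [1, 1, 0, 1, 0, 0, 1, 0]

metrics = accuracy_metrics(predicted, gold)
print(metrics)  # 3 true positives, 1 false positive, 1 false negative
```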

With respect to performance comparisons, 21 articles compared an ML approach to a rule-based method (Table S13). Traditional ML was used in 10 of these articles and outperformed rule-based algorithms in 8 articles with respect to PPV, sensitivity, or both (Figure S3). 2 supervised deep learning models were compared to rules, with a Bi-LSTM performing similarly to a rule-based approach for substance abuse [54] and a bidirectional gated recurrent unit model significantly decreasing performance in identifying insulin rejection.[137] 20 articles also provided comparisons between deep learning and traditional baselines (Table S14). Deep learning outperformed traditional ML across all reported accuracy metrics for 18 of 33 phenotypes considered (Figure S4(a)). Deep learning improved sensitivity with a corresponding decrease in PPV or vice-versa (Figure S4(b-c)) for the remaining phenotypes, with the exception of one study demonstrating that elastic net logistic regression outperformed an RNN for phenotyping fall risk (Figure S4(d)).[61] It is important to note that a meaningful gain in accuracy must be interpreted in the context of the use case of the algorithm and the target metric of performance. Moreover, improvements in accuracy must be weighed against additional challenges brought on by deep learning, including data demands, decreased interpretability, and limited generalizability over time and across healthcare settings.[72,138–140]

DISCUSSION

This review highlights the substantial ongoing work in ML-based phenotyping. A broad range of phenotypes have been considered and the use of unstructured information in clinical notes is widespread. While ML approaches did not uniformly outperform rule-based methods, deep learning provided marginal improvement over traditional baselines. Moreover, semi-supervised and weakly-supervised learning have expedited the phenotyping process while unsupervised learning has been effective for phenotype discovery. This progress notwithstanding, most articles focused on binary phenotypes and few studies evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. Future work is warranted in “deep phenotyping”, reporting and evaluation standards, and methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Deep Phenotyping

“Deep phenotyping” moves beyond binary identification to characterization of nuanced phenotypes, such as the timing or severity of a condition, using advanced methods leveraging interoperable and multimodal data types.[19,122,141,142] From a methodological viewpoint, studies of nuanced phenotypes will face similar, but more substantial challenges in obtaining gold-standard labeled data. Further work in semi- and weakly-supervised deep learning methods is necessary.[143,144] Moreover, given the privacy constraints associated with EHRs and other health data sources, leveraging interoperable and multimodal data calls for advancements in federated learning methods that can accommodate distributed data sources stored locally across institutions.[145]

Reporting & Evaluation Standards

Research networks, such as eMERGE, have long advocated for transparent and reusable phenotype definitions. Most recently, in response to the wave of COVID-19 studies, Kohane et al. proposed a checklist for evaluating the quality of EHR-based studies, emphasizing phenotypic transparency as a key concern.[146] However, we found most articles did not release necessary details for complete evaluation of an approach or implementation in other settings. As a step towards reporting standards that increase transparency and reproducibility, OHDSI proposed Findable, Accessible, Interoperable, and Reusable (FAIR) phenotype definitions based on APHRODITE. All of the necessary tooling, data models, software and vocabularies are publicly available and released with open-source licenses.[147] As noted by Kashyap et al. in evaluating the APHRODITE framework, effective reporting of phenotyping models should include a detailed recipe for data preparation and model training, rather than the pre-trained models themselves, given substantial differences in EHR data across institutions.[115]

Additionally, we observed a lack of rigorous evaluation of phenotyping algorithms, with most studies using standard metrics to evaluate internal validity. We stress further model interrogation for phenotyping, including external validation as well as evaluation of fairness. However, reliable performance evaluation requires a substantial amount of gold-standard labeled data. There is very little work on semi-supervised and weakly-supervised model performance evaluation, and further research is warranted.[148–150]

Accounting for Misclassified Phenotypes due to Algorithm Errors

As ML phenotyping expands the scope of EHR research, care must be taken when using derived phenotypes for downstream tasks as they are inevitably misclassified due to algorithm errors. In the context of association studies, it is well known in the statistical community that misclassification can lead to diminished statistical power and biased estimation.[151–153] However, statistical methods are often siloed from the informatics community. We advocate for dissemination of existing methods and for methodological developments in “post-phenotyping” inferential and predictive modeling studies.
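A short worked derivation illustrates the bias mentioned above in the simplest setting. The notation is introduced here for illustration and is not taken from the cited references: $Y$ is the true binary phenotype, $Y^*$ the algorithm-derived phenotype, $X$ a binary exposure, and the sensitivity $\mathrm{Se}$ and specificity $\mathrm{Sp}$ of the algorithm are assumed not to depend on $X$ (nondifferential misclassification).

```latex
% True and observed phenotype prevalence in exposure group x = 0, 1
p_x = P(Y = 1 \mid X = x), \qquad
p_x^* = P(Y^* = 1 \mid X = x) = \mathrm{Se}\, p_x + (1 - \mathrm{Sp})(1 - p_x)

% Observed risk difference in terms of the true risk difference
p_1^* - p_0^* = (\mathrm{Se} + \mathrm{Sp} - 1)\,(p_1 - p_0)

% Example: Se = 0.80, Sp = 0.95, true risk difference p_1 - p_0 = 0.10
p_1^* - p_0^* = (0.80 + 0.95 - 1) \times 0.10 = 0.075
```

Under these assumptions, and provided $\mathrm{Se} + \mathrm{Sp} > 1$, the observed association is attenuated toward the null by the factor $\mathrm{Se} + \mathrm{Sp} - 1$, which is one way to see why derived phenotypes can reduce power and bias estimates if the misclassification is ignored.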

Limitations

As the definition of phenotyping is variable within the literature,[12] we used a broad search capturing articles focusing on ML or NLP or phenotyping using EHRs. Following prior work, we limited our scope to select informatics venues.[12,15] Although we may have missed articles outside of these journals, our aim is to rigorously characterize the general landscape of ML-based phenotyping, which we believe is captured in the venues considered and in our detailed analyses.

CONCLUSION

This review summarizes the landscape of ML-based phenotyping between 2018 and 2022. Current literature has laid the groundwork for “deep phenotyping”, but developing standards and methodology for reliable use of a diverse range of phenotypes derived from ML models is necessary for continued EHR-based research.

Data Availability

All data produced in the present study are available upon reasonable request to the authors

DATA & CODE AVAILABILITY

The underlying data and R code to replicate our analyses can be found at: https://github.com/jlgrons/ML-EHR-Phenotyping-Review

COMPETING INTEREST

J.G. received scientific consulting fees from Alphabet’s Verily Life Sciences.

FUNDING

The project described was supported by an NSERC Discovery Grant (RGPIN-2021-03734), a CANSSI-ICES Data Access Grant, and a Connaught New Researcher Award.

CONTRIBUTIONS

J.G. conceived and designed the study. S.Y. performed the full-text review. J.G. and S.Y. analyzed and interpreted the data. J.G., P.V., and S.Y. drafted and revised the manuscript. J.G., P.V., S.Y., E.S., and K.T. approved the final manuscript.

ACKNOWLEDGEMENTS

The authors would like to thank Prof. Lei Sun for her useful comments.

Footnotes

Small corrections to figures in supplement and main text. Small edits to the main text.

REFERENCES

Institute of Medicine, Roundtable on Value and Science Driven Health Care. Clinical Data as the Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary. National Academies Press 2011.
Mc Cord KA, Hemkens LG. Using electronic health records for clinical trials: Where do we stand and where can we go? CMAJ 2019;191:E128–33.
Li R, Chen Y, Ritchie MD, et al. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet 2020;21:493–502.
Beesley LJ, Salvatore M, Fritsche LG, et al. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Stat Med 2020;39:773–800.
Liu R, Rizzo S, Whipple S, et al. Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 2021;592:629–33.
Geva A, Abman SH, Manzi SF, et al. Adverse drug event rates in pediatric pulmonary hypertension: a comparison of real-world data sources. J Am Med Inform Assoc 2020;27:294–300.
Rogers JR, Lee J, Zhou Z, et al. Contemporary use of real-world data for clinical trial conduct in the United States: a scoping review. J Am Med Inform Assoc 2021;28:144–54.
Boland MR, Hripcsak G, Shen Y, et al. Defining a comprehensive verotype using electronic health records for personalized medicine. J Am Med Inform Assoc 2013;20:e232–8.
Liao KP, Cai T, Savova GK, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ 2015;350:h1885–h1885.
Wei W-Q, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Medicine. 2015;7.
Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. Curr Protoc Hum Genet 2019;100:e80.
Shivade C, Raghavan P, Fosler-Lussier E, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2014;21:221–30.
Denaxas S, Gonzalez-Izquierdo A, Direk K, et al. UK phenomics platform for developing and validating EHR phenotypes: CALIBER. bioRxiv. 2019;:539403. doi:10.1101/539403
Newton KM, Peissig PL, Kho AN, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc 2013;20:e147–54.
Banda JM, Seneviratne M, Hernandez-Boussard T, et al. Advances in Electronic Phenotyping: From Rule-Based Definitions to Machine Learning Models. Annu Rev Biomed Data Sci 2018;1:53–68.
Alzoubi H, Alzubi R, Ramzan N, et al. A Review of Automatic Phenotyping Approaches using Electronic Health Records. Electronics 2019;8:1235.
Robinson JR, Wei W-Q, Roden DM, et al. Defining Phenotypes from Clinical Data to Drive Genomic Research. Annu Rev Biomed Data Sci 2018;1:69–92.
Hripcsak G, Albers DJ. High-fidelity phenotyping: richness and freedom from bias. J Am Med Inform Assoc 2018;25:289–94.
Weng C, Shah NH, Hripcsak G. Deep phenotyping: Embracing complexity and temporality-Towards scalability, portability, and interoperability. J Biomed Inform 2020;105:103433.
Leslie D, Mazumder A, Peppin A, et al. Does ‘AI’ stand for augmenting inequality in the era of covid-19 healthcare? BMJ 2021;372. doi:10.1136/bmj.n304
Zeng Z, Deng Y, Li X, et al. Natural Language Processing for EHR-Based Computational Phenotyping. IEEE/ACM Trans Comput Biol Bioinform 2019;16:139–53.
Mohammed M, Khan MB, Bashier EBM. Machine learning: algorithms and applications. CRC Press 2016.
Zhou Z-H. A brief introduction to weakly supervised learning. Natl Sci Rev 2017;5:44–53.
Wu S, Roberts K, Datta S, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc 2020;27:457–70.
Irwin AN, Rackham D. Comparison of the time-to-indexing in PubMed between biomedical journals according to impact factor, discipline, and focus. Res Social Adm Pharm 2017;13:389–93.
McBrien KA, Souri S, Symonds NE, et al. Identification of validated case definitions for medical conditions used in primary care electronic medical record databases: a systematic review. J Am Med Inform Assoc 2018;25:1567–78.
Ford E, Carroll JA, Smith HE, et al. Extracting information from the text of electronic medical records to improve case detection: a systematic review. J Am Med Inform Assoc 2016;23:1007–15.
Hripcsak G, Duke JD, Shah NH, et al. Observational Health Data Sciences and Informatics (OHDSI): Opportunities for Observational Researchers. Stud Health Technol Inform 2015;216:574–8.
McCarty CA, Chisholm RL, Chute CG, et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics 2011;4:13.
Erickson J, Abbott K, Susienka L. Automatic address validation and health record review to identify homeless Social Security disability applicants. J Biomed Inform 2018;82:41–6.
Fialoke S, Malarstig A, Miller MR, et al. Application of Machine Learning Methods to Predict Non-Alcoholic Steatohepatitis (NASH) in Non-Alcoholic Fatty Liver (NAFL) Patients. AMIA Annu Symp Proc 2018;2018:430–9.
Prenovost KM, Fihn SD, Maciejewski ML, et al. Using item response theory with health system data to identify latent groups of patients with multiple health conditions. PLoS One 2018;13:e0206915.
Choudhury O, Park Y, Salonidis T, et al. Predicting Adverse Drug Reactions on Distributed Health Data using Federated Learning. AMIA Annu Symp Proc 2019;2019:313–22.
Nori VS, Hane CA, Sun Y, et al. Deep neural network models for identifying incident dementia using claims and EHR datasets. PLoS One 2020;15:e0236400.
Gibson TB, Nguyen MD, Burrell T, et al. Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project. J Am Med Inform Assoc 2021;28:1507–17.
Mahesri M, Chin K, Kumar A, et al. External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure. PLoS One 2021;16:e0252903.
Seneviratne MG, Banda JM, Brooks JD, et al. Identifying Cases of Metastatic Prostate Cancer Using Machine Learning on Electronic Health Records. AMIA Annu Symp Proc 2018;2018:1498–504.
Ling AY, Kurian AW, Caswell-Jin JL, et al. Using natural language processing to construct a metastatic breast cancer cohort from linked cancer registry and electronic medical records data. JAMIA Open 2019;2:528–37.
Lyudovyk O, Shen Y, Tatonetti NP, et al. Pathway analysis of genomic pathology tests for prognostic cancer subtyping. J Biomed Inform 2019;98:103286.
Geva A, Liu M, Panickan VA, et al. A high-throughput phenotyping algorithm is portable from adult to pediatric populations. J Am Med Inform Assoc 2021;28:1265–9.
Henry S, Buchan K, Filannino M, et al. 2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records. J Am Med Inform Assoc 2020;27:3–12.
Stubbs A, Filannino M, Soysal E, et al. Cohort selection for clinical trials: n2c2 2018 shared task track 1. J Am Med Inform Assoc 2019;26:1163–71.
Harutyunyan H, Khachatrian H, Kale DC, et al. Multitask learning and benchmarking with clinical time series data. Sci Data 2019;6:96.
Buckland RS, Hogan JW, Chen ES. Selection of Clinical Text Features for Classifying Suicide Attempts. AMIA Annu Symp Proc 2020;2020:273–82.
Carson NJ, Mullin B, Sanchez MJ, et al. Identification of suicidal behavior among psychiatrically hospitalized adolescents using natural language processing and machine learning of electronic health records. PLoS One 2019;14:e0211116.
Afshar M, Phillips A, Karnik N, et al. Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation. J Am Med Inform Assoc 2019;26:254–61.
To D, Joyce C, Kulshrestha S, et al. The addition of United States census-tract data does not improve the prediction of substance misuse. AMIA Annu Symp Proc 2021;2021:1149–58.
Badger J, LaRose E, Mayer J, et al. Machine learning for phenotyping opioid overdose events. J Biomed Inform 2019;94:103185.
Feller DJ, Zucker J, Don’t Walk OB, et al. Towards the Inference of Social and Behavioral Determinants of Sexual Health: Development of a Gold-Standard Corpus with Semi-Supervised Learning. AMIA Annu Symp Proc 2018;2018:422–9.
Han S, Zhang RF, Shi L, et al. Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing. J Biomed Inform 2022;127:103984.
Annapragada AV, Donaruma-Kwoh MM, Annapragada AV, et al. A natural language processing and deep learning approach to identify child abuse from pediatric electronic medical records. PLoS One 2021;16:e0247404.
Thompson HM, Sharma B, Bhalla S, et al. Bias and fairness assessment of a natural language processing opioid misuse classifier: detection and mitigation of electronic health record data disadvantages across racial subgroups. Journal of the American Medical Informatics Association. 2021;28:2393–403. doi:10.1093/jamia/ocab148
Lybarger K, Yetisgen M, Ostendorf M. Using Neural Multi-task Learning to Extract Substance Abuse Information from Clinical Notes. AMIA Annu Symp Proc 2018;2018:1395–404.
Ni Y, Bachtel A, Nause K, et al. Automated detection of substance use information from electronic health records for a pediatric population. J Am Med Inform Assoc 2021;28:2116–27.
Gehrmann S, Dernoncourt F, Li Y, et al. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS One 2018;13:e0192360.
Stemerman R, Arguello J, Brice J, et al. Identification of social determinants of health using multi-label classification of electronic health record clinical notes. JAMIA Open 2021;4:ooaa069.
Yu Z, Yang X, Dang C, et al. A Study of Social and Behavioral Determinants of Health in Lung Cancer Patients Using Transformers-based Natural Language Processing Models. AMIA Annu Symp Proc 2021;2021:1225–33.
Mitra A, Rawat BPS, McManus D, et al. Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems. AMIA Annu Symp Proc 2020;2020:860–9.
Chen T, Dredze M, Weiner JP, et al. Identifying vulnerable older adult populations by contextualizing geriatric syndrome information in clinical notes of electronic health records. J Am Med Inform Assoc 2019;26:787–95.
Gao J, Xiao C, Glass LM, et al. Dr. Agent: Clinical predictive model via mimicked second opinions. J Am Med Inform Assoc 2020;27:1084–91.
Martin JA, Crane-Droesch A, Lapite FC, et al. Development and validation of a prediction model for actionable aspects of frailty in the text of clinicians’ encounter notes. J Am Med Inform Assoc 2021;29:109–19.
Obeid JS, Davis M, Turner M, et al. An artificial intelligence approach to COVID-19 infection risk assessment in virtual visits: A case report. J Am Med Inform Assoc 2020;27:1321–5.
Lybarger K, Ostendorf M, Thompson M, et al. Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework. J Biomed Inform 2021;117:103761.
Estiri H, Vasey S, Murphy SN. Generative transfer learning for measuring plausibility of EHR diagnosis records. J Am Med Inform Assoc 2021;28:559–68.
Estiri H, Strasser ZH, Murphy SN. High-throughput phenotyping with temporal sequences. J Am Med Inform Assoc 2020;28:772–81.
Henderson J, He H, Malin BA, et al. Phenotyping through Semi-Supervised Tensor Factorization (PSST). AMIA Annu Symp Proc 2018;2018:564–73.
Kirby JC, Speltz P, Rasmussen LV, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc 2016;23:1046–52.
Zhou F, Gillespie A, Gligorijevic D, et al. Use of disease embedding technique to predict the risk of progression to end-stage renal disease. J Biomed Inform 2020;105:103409.
Bhattacharya M, Jurkovitz C, Shatkay H. Co-occurrence of medical conditions: Exposing patterns through probabilistic topic modeling of snomed codes. J Biomed Inform 2018;82:31–40.
Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res 2004;32:D267–70.
Yu S, Liao KP, Shaw SY, et al. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources. J Am Med Inform Assoc 2015;22:993–1000.
Ghassemi M, Naumann T, Schulam P, et al. A Review of Challenges and Opportunities in Machine Learning for Health. AMIA Jt Summits Transl Sci Proc 2020;2020:191–200.
Lu S, Chen R, Wei W, et al. Understanding Heart Failure Patients EHR Clinical Features via SHAP Interpretation of Tree-Based Machine Learning Model Predictions. AMIA Annu Symp Proc 2021;2021:813–22.
Ni Y, Alwell K, Moomaw CJ, et al. Towards phenotyping stroke: Leveraging data from a large-scale epidemiological study to detect stroke diagnosis. PLoS One 2018;13:e0192586.
Shi J, Liu S, Pruitt LCC, et al. Using Natural Language Processing to improve EHR Structured Data-based Surgical Site Infection Surveillance. AMIA Annu Symp Proc 2019;2019:794–803.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44.
Khalid S, Khalil T, Nasreen S. A survey of feature selection and feature extraction techniques in machine learning. 2014 Science and Information Conference. 2014. doi:10.1109/sai.2014.6918213
Khattak FK, Jeblee S, Pou-Prom C, et al. A survey of word embeddings for clinical text. Journal of Biomedical Informatics: X 2019;4:100057.
Teller V. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. 2000. https://direct.mit.edu/coli/article-abstract/26/4/638/1680
Wei Q, Ji Z, Li Z, et al. A study of deep learning approaches for medication and adverse drug event extraction from clinical text. J Am Med Inform Assoc 2020;27:13–21.
Ju M, Nguyen NTH, Miwa M, et al. An ensemble of neural models for nested adverse drug events and medication extraction with subwords. J Am Med Inform Assoc 2020;27:22–30.
Xiong Y, Shi X, Chen S, et al. Cohort selection for clinical trials using hierarchical neural network. J Am Med Inform Assoc 2019;26:1203–8.
Chen L, Gu Y, Ji X, et al. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc 2020;27:56–64.
Yang X, Bian J, Fang R, et al. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J Am Med Inform Assoc 2020;27:65–72.
Xie K, Gallagher RS, Conrad EC, et al. Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing. J Am Med Inform Assoc 2022;29:873–81.
Soni S, Roberts K. Patient Cohort Retrieval using Transformer Language Models. AMIA Annu Symp Proc 2020;2020:1150–9.
Kim Y, Meystre SM. Ensemble method-based extraction of medication and related information from clinical texts. J Am Med Inform Assoc 2020;27:31–8.
Dai H-J, Su C-H, Wu C-S. Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings. Journal of the American Medical Informatics Association. 2020;27:47–55. doi:10.1093/jamia/ocz120
Zhou S, Wang N, Wang L, et al. CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records. J Am Med Inform Assoc Published Online First: 25 March 2022. doi:10.1093/jamia/ocac040
Eisman AS, Shah NR, Eickhoff C, et al. Extracting Angina Symptoms from Clinical Notes Using Pre-Trained Transformer Architectures. AMIA Annu Symp Proc 2020;2020:412–21.
Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and their Compositionality. In: Burges CJC, Bottou L, Welling M, et al., eds. Advances in Neural Information Processing Systems. Curran Associates, Inc. 2013. https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf
Pennington J, Socher R, Manning CD. Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014. 1532–43.
Devlin J, Chang M-W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv [cs.CL]. 2018. http://arxiv.org/abs/1810.04805
Lee J, Yoon W, Kim S, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40.
Alsentzer E, Murphy JR, Boag W, et al. Publicly Available Clinical BERT Embeddings. arXiv [cs.CL]. 2019. http://arxiv.org/abs/1904.03323
Liu Z, Lin W, Shi Y, et al. A Robustly Optimized BERT Pre-training Approach with Post-training. Lecture Notes in Computer Science. 2021;:471–84. doi:10.1007/978-3-030-84186-7_31
Ogunyemi OI, Gandhi M, Lee M, et al. Detecting diabetic retinopathy through machine learning on electronic health record data from an urban, safety net healthcare system. JAMIA Open 2021;4:ooab066.
Cai T, Cai F, Dahal KP, et al. Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening. ACR Open Rheumatol 2021;3:593–600.
Zhu X. Semi-supervised learning literature survey. Published Online First: 2008. https://minds.wisconsin.edu/handle/1793/60444 (accessed 19 Apr 2022).
Cade BE, Hassan SM, Dashti HS, et al. Sleep apnea phenotyping and relationship to disease in a large clinical biobank. JAMIA Open 2022;5:ooab117.
Cohen AM, Chamberlin S, Deloughery T, et al. Detecting rare diseases in electronic health records using machine learning and knowledge engineering: Case study of acute hepatic porphyria. PLoS One 2020;15:e0235574.
Zhang Y, Cai T, Yu S, et al. High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP). Nat Protoc 2019;14:3426–44.
Zhang L, Ding X, Ma Y, et al. A maximum likelihood approach to electronic health record phenotyping using positive and unlabeled patients. J Am Med Inform Assoc 2020;27:119–26.
Yu S, Chakrabortty A, Liao KP, et al. Surrogate-assisted feature extraction for high-throughput phenotyping. J Am Med Inform Assoc 2017;24:e143–9.
Halpern Y, Horng S, Choi Y, et al. Electronic medical record phenotyping using the anchor and learn framework. J Am Med Inform Assoc 2016;23:731–40.
Agarwal V, Podchiyska T, Banda JM, et al. Learning statistical models of phenotypes using noisy labeled training data. J Am Med Inform Assoc 2016;23:1166–73.
Banda JM, Halpern Y, Sontag D, et al. Electronic phenotyping with APHRODITE and the Observational Health Sciences and Informatics (OHDSI) data network. AMIA Jt Summits Transl Sci Proc 2017;2017:48–57.
Yu S, Ma Y, Gronsbell J, et al. Enabling phenotypic big data with PheNorm. J Am Med Inform Assoc 2018;25:54–60.
Liao KP, Sun J, Cai TA, et al. High-throughput multimodal automated phenotyping (MAP) with application to PheWAS. J Am Med Inform Assoc 2019;26:1255–62.
Zheng NS, Feng Q, Kerchberger VE, et al. PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records. J Am Med Inform Assoc 2020;27:1675–87.
Sinnott JA, Cai F, Yu S, et al. PheProb: probabilistic phenotyping using diagnosis codes to improve power for genetic association studies. J Am Med Inform Assoc 2018;25:1359–65.
Ferté T, Cossin S, Schaeverbeke T, et al. Automatic phenotyping of electronical health record: PheVis algorithm. J Biomed Inform 2021;117:103746.
Ahuja Y, Zhou D, He Z, et al. sureLDA: A multidisease automated phenotyping method for the electronic health record. J Am Med Inform Assoc 2020;27:1235–43.
Ning W, Chan S, Beam A, et al. Feature extraction for phenotyping from semantic and knowledge resources. J Biomed Inform 2019;91:103122.
Kashyap M, Seneviratne M, Banda JM, et al. Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network. Journal of the American Medical Informatics Association. 2020;27:877–83. doi:10.1093/jamia/ocaa032
Murray SG, Avati A, Schmajuk G, et al. Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling. Journal of the American Medical Informatics Association. 2019;26:61–5. doi:10.1093/jamia/ocy154
Sanyal J, Rubin D, Banerjee I. A weakly supervised model for the automated detection of adverse events using clinical notes. J Biomed Inform 2022;126:103969.
Topaz M, Murga L, Gaddis KM, et al. Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform 2019;90:103103.
Banerjee I, Li K, Seneviratne M, et al. Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. JAMIA Open. 2019;2:150–9. doi:10.1093/jamiaopen/ooy057
Xu Z, Chou J, Zhang XS, et al. Identifying sub-phenotypes of acute kidney injury using structured and unstructured electronic health record data with memory networks. J Biomed Inform 2020;102:103361.
Apostolova E, Uppal A, Galarraga JE, et al. Towards Reliable ARDS Clinical Decision Support: ARDS Patient Analytics with Free-text and Structured EMR Data. AMIA Annu Symp Proc 2019;2019:228–37.
Zhao J, Zhang Y, Schlueter DJ, et al. Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study. J Biomed Inform 2019;98:103270.
Mullin S, Zola J, Lee R, et al. Longitudinal K-means approaches to clustering and analyzing EHR opioid use trajectories for clinical subtypes. J Biomed Inform 2021;122:103889.
Afshar M, Joyce C, Dligach D, et al. Subtypes in patients with opioid misuse: A prognostic enrichment strategy using electronic health record data in hospitalized patients. PLoS One 2019;14:e0219717.
Wang Y, Zhao Y, Therneau TM, et al. Unsupervised machine learning for the discovery of latent disease clusters and patient subgroups using electronic health records. Journal of Biomedical Informatics. 2020;102:103364. doi:10.1016/j.jbi.2019.103364
Maurits MP, Korsunsky I, Raychaudhuri S, et al. A framework for employing longitudinally collected multicenter electronic health records to stratify heterogeneous patient populations on disease history. J Am Med Inform Assoc 2022;29:761–9.
Liu Q, Woo M, Zou X, et al. Symptom-based patient stratification in mental illness using clinical notes. J Biomed Inform 2019;98:103274.
Ibrahim ZM, Wu H, Hamoud A, et al. On classifying sepsis heterogeneity in the ICU: insight using machine learning. J Am Med Inform Assoc 2020;27:437–43.
Shen F, Peng S, Fan Y, et al. HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology. J Biomed Inform 2019;96:103246.
Hubbard RA, Xu J, Siegel R, et al. Studying pediatric health outcomes with electronic health records using Bayesian clustering and trajectory analysis. J Biomed Inform 2021;113:103654.
Ben-Assuli O, Jacobi A, Goldman O, et al. Stratifying individuals into non-alcoholic fatty liver disease risk levels using time series machine learning models. J Biomed Inform 2022;126:103986.
Gong J, Simon GE, Liu S. Machine learning discovery of longitudinal patterns of depression and suicidal ideation. PLoS One 2019;14:e0222665.
Wang L, Lakin J, Riley C, et al. Disease Trajectories and End-of-Life Care for Dementias: Latent Topic Modeling and Trend Analysis Using Clinical Notes. AMIA Annu Symp Proc 2018;2018:1056–65.
Meaney C, Escobar M, Moineddin R, et al. Non-negative matrix factorization temporal topic models and clinical text data identify COVID-19 pandemic effects on primary healthcare and community health in Toronto, Canada. Journal of Biomedical Informatics. 2022;128:104034. doi:10.1016/j.jbi.2022.104034
Li R, Chen Y, Moore JH. Integration of genetic and clinical information to improve imputation of data missing from electronic health records. J Am Med Inform Assoc 2019;26:1056–63.
Klann JG, Estiri H, Weber GM, et al. Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. J Am Med Inform Assoc 2021;28:1411–20.
Malmasi S, Ge W, Hosomura N, et al. Comparing information extraction techniques for low-prevalence concepts: The case of insulin rejection by patients. J Biomed Inform 2019;99:103306.
Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health 2021;3:e745–50.
Rajpurkar P, Chen E, Banerjee O, et al. AI in health and medicine. Nat Med 2022;28:31–8.
Nestor B, McDermott MBA, Boag W, et al. Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks. In: Doshi-Velez F, Fackler J, Jung K, et al., eds. Proceedings of the 4th Machine Learning for Healthcare Conference. PMLR 09--10 Aug 2019. 381–405.
Mate S, Bürkle T, Kapsner LA, et al. A method for the graphical modeling of relative temporal constraints. J Biomed Inform 2019;100:103314.
Meng W, Ou W, Chandwani S, et al. Temporal phenotyping by mining healthcare data to derive lines of therapy for cancer. J Biomed Inform 2019;100:103335.
Liang L, Hou J, Uno H, et al. Semi-supervised Approach to Event Time Annotation Using Longitudinal Electronic Health Records. arXiv [stat.ME]. 2021. http://arxiv.org/abs/2110.09612
Ahuja Y, Wen J, Hong C, et al. SAMGEP: A novel method for prediction of phenotype event times using the electronic health record. Research Square. 2021. https://www.researchsquare.com/article/rs-1119858/latest.pdf
Tong J, Luo C, Islam MN, et al. Distributed learning for heterogeneous clinical data with application to integrating COVID-19 data across 230 sites. NPJ Digit Med 2022;5:76.
Kohane IS, Aronow BJ, Avillach P, et al. What Every Reader Should Know About Studies Using Electronic Health Record Data but May Be Afraid to Ask. J Med Internet Res 2021;23:e22219.
Weaver J, Potvien A, Swerdel J, et al. Best practices for creating the standardized content of an entry in the OHDSI Phenotype Library. In: 5th OHDSI Annual Symposium. 2019. https://www.ohdsi.org/wp-content/uploads/2019/09/james-weaver_a_book_in_the_phenotype_library_2019symposium.pdf
Swerdel JN, Hripcsak G, Ryan PB. PheValuator: Development and evaluation of a phenotype algorithm evaluator. J Biomed Inform 2019;97:103258.
Gronsbell JL, Cai T. Semi-supervised approaches to efficient evaluation of model prediction performance. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2018;80:579–94. doi:10.1111/rssb.12264
Gronsbell J, Liu M, Tian L, et al. Efficient evaluation of prediction rules in semi-supervised settings under stratified sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2022. doi:10.1111/rssb.12502
Manuel DG, Rosella LC, Stukel TA. Importance of accurately identifying disease in studies using electronic health records. BMJ 2010;341:c4226.
Sinnott JA, Dai W, Liao KP, et al. Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records. Hum Genet 2014;133:1369–82.
Hubbard RA, Tong J, Duan R, et al. Reducing Bias Due to Outcome Misclassification for Epidemiologic Studies Using EHR-derived Probabilistic Phenotypes. Epidemiology 2020;31:542–50.
Koola JD, Davis SE, Al-Nimri O, et al. Development of an automated phenotyping algorithm for hepatorenal syndrome. J Biomed Inform 2018;80:87–95.
Afshar M, Joyce C, Oakey A, et al. A Computable Phenotype for Acute Respiratory Distress Syndrome Using Natural Language Processing and Machine Learning. AMIA Annu Symp Proc 2018;2018:157–65.
Hong N, Wen A, Stone DJ, et al. Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries. J Biomed Inform 2019;99:103310.
Bucher BT, Shi J, Pettit RJ, et al. Determination of Marital Status of Patients from Structured and Unstructured Electronic Healthcare Data. AMIA Annu Symp Proc 2019;2019:267–74.
Dai H-J, Wang F-D, Chen C-W, et al. Cohort selection for clinical trials using multiple instance learning. J Biomed Inform 2020;107:103438.
Hassanzadeh H, Karimi S, Nguyen A. Matching patients to clinical trials using semantically enriched document representation. J Biomed Inform 2020;105:103406.
Kulshrestha S, Dligach D, Joyce C, et al. Comparison and interpretability of machine learning models to predict severity of chest injury. JAMIA Open 2021;4:ooab015.
Chu J, Dong W, He K, et al. Using neural attention networks to detect adverse medical events from electronic health records. J Biomed Inform 2018;87:118–30.
Chen C-J, Warikoo N, Chang Y-C, et al. Medical knowledge infused convolutional neural networks for cohort selection in clinical trials. J Am Med Inform Assoc 2019;26:1227–36.
Segura-Bedmar I, Colon-Ruiz C, Tejedor-Alonso MA, et al. Predicting of anaphylaxis in big data EMR by exploring machine learning approaches. J Biomed Inform 2018;87:50–9.

Posted October 27, 2022.

Subject Area

Health Informatics

[1] ↵
Institute of Medicine, Roundtable on Value and Science Driven Health Care. Clinical Data asthe Basic Staple of Health Learning: Creating and Protecting a Public Good: Workshop Summary. National Academies Press 2011.

[2] ↵
Mc Cord KA, Hemkens LG. Using electronic health records for clinical trials: Where do we stand and where can we go? CMAJ 2019;191:E128–33.
OpenUrl FREE Full Text

[3] Li R, Chen Y, Ritchie MD, et al. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet 08/2020;21:493–502.
OpenUrl

[4] Beesley LJ, Salvatore M, Fritsche LG, et al. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Stat Med 2020;39:773–800.
OpenUrl

[5] Liu R, Rizzo S, Whipple S, et al. Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 2021;592:629–33.
OpenUrl CrossRef

[6] ↵
Geva A, Abman SH, Manzi SF, et al. Adverse drug event rates in pediatric pulmonary hypertension: a comparison of real-world data sources. J Am Med Inform Assoc 2020;27:294–300.
OpenUrl

[7] ↵
Rogers JR, Lee J, Zhou Z, et al. Contemporary use of real-world data for clinical trial conduct in the United States: a scoping review. J Am Med Inform Assoc 2021;28:144–54.
OpenUrl

[8] ↵
Boland MR, Hripcsak G, Shen Y, et al. Defining a comprehensive verotype using electronic health records for personalized medicine. J Am Med Inform Assoc 2013;20:e232–8.
OpenUrl CrossRef PubMed

[9] ↵
Liao KP, Cai T, Savova GK, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ 2015;350:h1885–h1885.
OpenUrl FREE Full Text

[10] ↵
Wei W-Q, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Medicine. 2015;7.

[11] ↵
Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. Curr Protoc Hum Genet 2019;100:e80.
OpenUrl CrossRef PubMed

[12] ↵
Shivade C, Raghavan P, Fosler-Lussier E, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2014;21:221–30.
OpenUrl CrossRef PubMed

[13] ↵
Denaxas S, Gonzalez-Izquierdo A, Direk K, et al. UK phenomics platform for developing and validating EHR phenotypes: CALIBER. bioRxiv. 2019;:539403. doi:10.1101/539403
OpenUrl Abstract/FREE Full Text

[14] ↵
Newton KM, Peissig PL, Kho AN, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc 2013;20:e147–54.
OpenUrl CrossRef PubMed

[15] ↵
Banda JM, Seneviratne M, Hernandez-Boussard T, et al. Advances in Electronic Phenotyping: From Rule-Based Definitions to Machine Learning Models. Annu Rev Biomed Data Sci 2018;1:53–68.
OpenUrl PubMed

[16] ↵
Alzoubi H, Alzubi R, Ramzan N, et al. A Review of Automatic Phenotyping Approaches using Electronic Health Records. Electronics 2019;8:1235.
OpenUrl

[17] ↵
Robinson JR, Wei W-Q, Roden DM, et al. Defining Phenotypes from Clinical Data to Drive Genomic Research. Annu Rev Biomed Data Sci 2018;1:69–92.
OpenUrl CrossRef

[18] ↵
Hripcsak G, Albers DJ. High-fidelity phenotyping: richness and freedom from bias. J Am Med Inform Assoc 2018;25:289–94.
OpenUrl PubMed

[19] ↵
Weng C, Shah NH, Hripcsak G. Deep phenotyping: Embracing complexity and temporality-Towards scalability, portability, and interoperability. J Biomed Inform 2020;105:103433.
OpenUrl CrossRef PubMed

[20] ↵
Leslie D, Mazumder A, Peppin A, et al. Does ‘AI’ stand for augmenting inequality in the era of covid-19 healthcare? BMJ 2021;372. doi:10.1136/bmj.n304
OpenUrl FREE Full Text

[21] ↵
Zeng Z, Deng Y, Li X, et al. Natural Language Processing for EHR-Based Computational Phenotyping. IEEE/ACM Trans Comput Biol Bioinform 2019;16:139–53.
OpenUrl

[22] ↵
Mohammed M, Khan MB, Bashier EBM. Machine learning: algorithms and applications. Crc Press 2016.

[23] ↵
Zhou Z-H. A brief introduction to weakly supervised learning. Natl Sci Rev 2017;5:44–53.
OpenUrl

[24] ↵
Wu S, Roberts K, Datta S, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc 2020;27:457–70.
OpenUrl CrossRef PubMed

[25] ↵
Irwin AN, Rackham D. Comparison of the time-to-indexing in PubMed between biomedical journals according to impact factor, discipline, and focus. Res Social Adm Pharm 2017;13:389–93.
OpenUrl PubMed

[26] ↵
McBrien KA, Souri S, Symonds NE, et al. Identification of validated case definitions for medical conditions used in primary care electronic medical record databases: a systematic review. J Am Med Inform Assoc 2018;25:1567–78.
OpenUrl CrossRef PubMed

[27] ↵
Ford E, Carroll JA, Smith HE, et al. Extracting information from the text of electronic medical records to improve case detection: a systematic review. J Am Med Inform Assoc 2016;23:1007–15.
OpenUrl CrossRef PubMed

[28] ↵
Hripcsak G, Duke JD, Shah NH, et al. Observational Health Data Sciences and Informatics (OHDSI): Opportunities for Observational Researchers. Stud Health Technol Inform 2015;216:574–8.
OpenUrl PubMed

[29] ↵
McCarty CA, Chisholm RL, Chute CG, et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics 2011;4:13.
OpenUrl CrossRef PubMed

[30] ↵
Erickson J, Abbott K, Susienka L. Automatic address validation and health record review to identify homeless Social Security disability applicants. J Biomed Inform 2018;82:41–6.
OpenUrl CrossRef

[31] ↵
Fialoke S, Malarstig A, Miller MR, et al. Application of Machine Learning Methods to Predict Non-Alcoholic Steatohepatitis (NASH) in Non-Alcoholic Fatty Liver (NAFL) Patients. AMIA Annu Symp Proc 2018;2018:430–9.
OpenUrl PubMed

[32] Prenovost KM, Fihn SD, Maciejewski ML, et al. Using item response theory with health system data to identify latent groups of patients with multiple health conditions. PLoS One 2018;13:e0206915.
OpenUrl

[33] ↵
Choudhury O, Park Y, Salonidis T, et al. Predicting Adverse Drug Reactions on Distributed Health Data using Federated Learning. AMIA Annu Symp Proc 2019;2019:313–22.
OpenUrl PubMed

[34] ↵
Nori VS, Hane CA, Sun Y, et al. Deep neural network models for identifying incident dementia using claims and EHR datasets. PLoS One 2020;15:e0236400.
OpenUrl

[35] ↵
Gibson TB, Nguyen MD, Burrell T, et al. Electronic phenotyping of health outcomes of interest using a linked claims-electronic health record database: Findings from a machine learning pilot project. J Am Med Inform Assoc 2021;28:1507–17.
OpenUrl

[36] ↵
Mahesri M, Chin K, Kumar A, et al. External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure. PLoS One 2021;16:e0252903.
OpenUrl

[37] ↵
Seneviratne MG, Banda JM, Brooks JD, et al. Identifying Cases of Metastatic Prostate Cancer Using Machine Learning on Electronic Health Records. AMIA Annu Symp Proc 2018;2018:1498–504.
OpenUrl

[38] Ling AY, Kurian AW, Caswell-Jin JL, et al. Using natural language processing to construct a metastatic breast cancer cohort from linked cancer registry and electronic medical records data. JAMIA Open 2019;2:528–37.
OpenUrl

[39] ↵
Lyudovyk O, Shen Y, Tatonetti NP, et al. Pathway analysis of genomic pathology tests for prognostic cancer subtyping. J Biomed Inform 2019;98:103286.
OpenUrl

[40] ↵
Geva A, Liu M, Panickan VA, et al. A high-throughput phenotyping algorithm is portable from adult to pediatric populations. J Am Med Inform Assoc 2021;28:1265–9.
OpenUrl

[41] ↵
Henry S, Buchan K, Filannino M, et al. 2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records. J Am Med Inform Assoc 2020;27:3–12.
OpenUrl CrossRef

[42] ↵
Stubbs A, Filannino M, Soysal E, et al. Cohort selection for clinical trials: n2c2 2018 shared task track 1. J Am Med Inform Assoc 2019;26:1163–71.
OpenUrl

[43] ↵
Harutyunyan H, Khachatrian H, Kale DC, et al. Multitask learning and benchmarking with clinical time series data. Sci Data 2019;6:96.
OpenUrl

[44] ↵
Buckland RS, Hogan JW, Chen ES. Selection of Clinical Text Features for Classifying Suicide Attempts. AMIA Annu Symp Proc 2020;2020:273–82.
OpenUrl

[45] ↵
Carson NJ, Mullin B, Sanchez MJ, et al. Identification of suicidal behavior among psychiatrically hospitalized adolescents using natural language processing and machine learning of electronic health records. PLoS One 2019;14:e0211116.
OpenUrl PubMed

[46] ↵
Afshar M, Phillips A, Karnik N, et al. Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation. J Am Med Inform Assoc 2019;26:254–61.
OpenUrl

[47] ↵
To D, Joyce C, Kulshrestha S, et al. The addition of United States census-tract data does not improve the prediction of substance misuse. AMIA Annu Symp Proc 2021;2021:1149–58.
OpenUrl

[48] ↵
Badger J, LaRose E, Mayer J, et al. Machine learning for phenotyping opioid overdose events. J Biomed Inform 2019;94:103185.
OpenUrl

[49] ↵
Feller DJ, Zucker J, Don’t Walk OB, et al. Towards the Inference of Social and Behavioral Determinants of Sexual Health: Development of a Gold-Standard Corpus with Semi-Supervised Learning. AMIA Annu Symp Proc 2018;2018:422–9.
OpenUrl

[50] ↵
Han S, Zhang RF, Shi L, et al. Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing. J Biomed Inform 2022;127:103984.
OpenUrl

[51] ↵
Annapragada AV, Donaruma-Kwoh MM, Annapragada AV, et al. A natural language processing and deep learning approach to identify child abuse from pediatric electronic medical records. PLoS One 2021;16:e0247404.
OpenUrl

[52] ↵
Thompson HM, Sharma B, Bhalla S, et al. Bias and fairness assessment of a natural language processing opioid misuse classifier: detection and mitigation of electronic health record data disadvantages across racial subgroups. Journal of the American Medical Informatics Association. 2021;28:2393–403. doi:10.1093/jamia/ocab148
OpenUrl CrossRef

[53] ↵
Lybarger K, Yetisgen M, Ostendorf M. Using Neural Multi-task Learning to Extract Substance Abuse Information from Clinical Notes. AMIA Annu Symp Proc 2018;2018:1395–404.
OpenUrl

[54] ↵
Ni Y, Bachtel A, Nause K, et al. Automated detection of substance use information from electronic health records for a pediatric population. J Am Med Inform Assoc 2021;28:2116–27.
OpenUrl

[55] ↵
Gehrmann S, Dernoncourt F, Li Y, et al. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS One 2018;13:e0192360.
OpenUrl CrossRef PubMed

[56] ↵
Stemerman R, Arguello J, Brice J, et al. Identification of social determinants of health using multi-label classification of electronic health record clinical notes. JAMIA Open 2021;4:ooaa069.
OpenUrl

[57] ↵
Yu Z, Yang X, Dang C, et al. A Study of Social and Behavioral Determinants of Health in Lung Cancer Patients Using Transformers-based Natural Language Processing Models. AMIA Annu Symp Proc 2021;2021:1225–33.
OpenUrl

[58] ↵
Mitra A, Rawat BPS, McManus D, et al. Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems. AMIA Annu Symp Proc 2020;2020:860–9.
OpenUrl

[59] Chen T, Dredze M, Weiner JP, et al. Identifying vulnerable older adult populations by contextualizing geriatric syndrome information in clinical notes of electronic health records. J Am Med Inform Assoc 2019;26:787–95.
OpenUrl

[60] Gao J, Xiao C, Glass LM, et al. Dr. Agent: Clinical predictive model via mimicked second opinions. J Am Med Inform Assoc 2020;27:1084–91.
OpenUrl

[61] ↵
Martin JA, Crane-Droesch A, Lapite FC, et al. Development and validation of a prediction model for actionable aspects of frailty in the text of clinicians’ encounter notes. J Am Med Inform Assoc 2021;29:109–19.
OpenUrl

[62] ↵
Obeid JS, Davis M, Turner M, et al. An artificial intelligence approach to COVID-19 infection risk assessment in virtual visits: A case report. J Am Med Inform Assoc 2020;27:1321–5.
OpenUrl

[63] ↵
Lybarger K, Ostendorf M, Thompson M, et al. Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework. J Biomed Inform 2021;117:103761.
OpenUrl CrossRef PubMed

[64] ↵
Estiri H, Vasey S, Murphy SN. Generative transfer learning for measuring plausibility of EHR diagnosis records. J Am Med Inform Assoc 2021;28:559–68.
OpenUrl

[65] ↵
Estiri H, Strasser ZH, Murphy SN. High-throughput phenotyping with temporal sequences. J Am Med Inform Assoc 2020;28:772–81.
OpenUrl

[66] ↵
Henderson J, He H, Malin BA, et al. Phenotyping through Semi-Supervised Tensor Factorization (PSST). AMIA Annu Symp Proc 2018;2018:564–73.
OpenUrl

[67] ↵
Kirby JC, Speltz P, Rasmussen LV, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc 2016;23:1046–52.
OpenUrl CrossRef PubMed

[68] ↵
Zhou F, Gillespie A, Gligorijevic D, et al. Use of disease embedding technique to predict the risk of progression to end-stage renal disease. J Biomed Inform 2020;105:103409.
OpenUrl

[69] ↵
Bhattacharya M, Jurkovitz C, Shatkay H. Co-occurrence of medical conditions: Exposing patterns through probabilistic topic modeling of snomed codes. J Biomed Inform 2018;82:31–40.
OpenUrl

[70] ↵
Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res 2004;32:D267–70.
OpenUrl CrossRef PubMed Web of Science

[71] ↵
Yu S, Liao KP, Shaw SY, et al. Toward high-throughput phenotyping: unbiased automated feature extraction and selection from knowledge sources. J Am Med Inform Assoc 2015;22:993–1000.
OpenUrl CrossRef PubMed

[72] ↵
Ghassemi M, Naumann T, Schulam P, et al. A Review of Challenges and Opportunities in Machine Learning for Health. AMIA Jt Summits Transl Sci Proc 2020;2020:191–200.
OpenUrl

[73] ↵
Lu S, Chen R, Wei W, et al. Understanding Heart Failure Patients EHR Clinical Features via SHAP Interpretation of Tree-Based Machine Learning Model Predictions. AMIA Annu Symp Proc 2021;2021:813–22.
OpenUrl

[74] ↵
Ni Y, Alwell K, Moomaw CJ, et al. Towards phenotyping stroke: Leveraging data from a large-scale epidemiological study to detect stroke diagnosis. PLoS One 2018;13:e0192586.
OpenUrl

[75] ↵
Shi J, Liu S, Pruitt LCC, et al. Using Natural Language Processing to improve EHR Structured Data-based Surgical Site Infection Surveillance. AMIA Annu Symp Proc 2019;2019:794–803.
OpenUrl

[76] ↵
Yan LC, Yoshua B, Geoffrey H. Deep learning. Nature 2015;521:436–44.
OpenUrl CrossRef PubMed

[77] ↵
Khalid S, Khalil T, Nasreen S. A survey of feature selection and feature extraction techniques in machine learning. 2014 Science and Information Conference. 2014. doi:10.1109/sai.2014.6918213
OpenUrl CrossRef

[78] ↵
Khattak FK, Jeblee S, Pou-Prom C, et al. A survey of word embeddings for clinical text. Journal of Biomedical Informatics: X 2019;4:100057.
OpenUrl

[79] ↵
Teller V. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. 2000.https://direct.mit.edu/coli/article-abstract/26/4/638/1680

[80] ↵
Wei Q, Ji Z, Li Z, et al. A study of deep learning approaches for medication and adverse drug event extraction from clinical text. J Am Med Inform Assoc 2020;27:13–21.
OpenUrl

[81] ↵
Ju M, Nguyen NTH, Miwa M, et al. An ensemble of neural models for nested adverse drug events and medication extraction with subwords. J Am Med Inform Assoc 2020;27:22–30.
OpenUrl CrossRef PubMed

[82] Xiong Y, Shi X, Chen S, et al. Cohort selection for clinical trials using hierarchical neural network. J Am Med Inform Assoc 2019;26:1203–8.
OpenUrl

[83] Chen L, Gu Y, Ji X, et al. Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning. J Am Med Inform Assoc 2020;27:56–64.
OpenUrl CrossRef PubMed

[84] ↵
Yang X, Bian J, Fang R, et al. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J Am Med Inform Assoc 2020;27:65–72.
OpenUrl CrossRef

[85] ↵
Xie K, Gallagher RS, Conrad EC, et al. Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing. J Am Med Inform Assoc 2022;29:873–81.
OpenUrl

[86] ↵
Soni S, Roberts K. Patient Cohort Retrieval using Transformer Language Models. AMIA Annu Symp Proc 2020;2020:1150–9.
OpenUrl

[87] ↵
Kim Y, Meystre SM. Ensemble method-based extraction of medication and related information from clinical texts. J Am Med Inform Assoc 2020;27:31–8.
OpenUrl

[88] ↵
Dai H-J, Su C-H, Wu C-S. Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings. Journal of the American Medical Informatics Association. 2020;27:47–55. doi:10.1093/jamia/ocz120
OpenUrl CrossRef

[89] Zhou S, Wang N, Wang L, et al. CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records. J Am Med Inform Assoc Published Online First: 25 March 2022. doi:10.1093/jamia/ocac040
OpenUrl CrossRef

[90] ↵
Eisman AS, Shah NR, Eickhoff C, et al. Extracting Angina Symptoms from Clinical Notes Using Pre-Trained Transformer Architectures. AMIA Annu Symp Proc 2020;2020:412–21.
OpenUrl

[91] ↵
Burges Cjc,
Bottou L,
Welling M, et al.
Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and their Compositionality. In: Burges Cjc, Bottou L, Welling M, et al., eds. Advances in Neural Information Processing Systems. Curran Associates, Inc. 2013. https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf

[92] Burges Cjc,

[93] Bottou L,

[94] Welling M, et al.

[95] ↵
Pennington J, Socher R, Manning CD. Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014. 1532–43.

[96] ↵
Devlin J, Chang M-W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv [cs.CL]. 2018.http://arxiv.org/abs/1810.04805

[97] Lee J, Yoon W, Kim S, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40.
OpenUrl CrossRef PubMed

[98] Alsentzer E, Murphy JR, Boag W, et al. Publicly Available Clinical BERT Embeddings. arXiv [cs.CL]. 2019.http://arxiv.org/abs/1904.03323

[99] ↵
Liu Z, Lin W, Shi Y, et al. A Robustly Optimized BERT Pre-training Approach with Post-training. Lecture Notes in Computer Science. 2021;:471–84. doi:10.1007/978-3-030-84186-7_31
OpenUrl CrossRef

[100] ↵
Ogunyemi OI, Gandhi M, Lee M, et al. Detecting diabetic retinopathy through machine learning on electronic health record data from an urban, safety net healthcare system. JAMIA Open 2021;4:ooab066.
OpenUrl

[101] ↵
Cai T, Cai F, Dahal KP, et al. Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening. ACR Open Rheumatol 2021;3:593–600.
OpenUrl

[102] ↵
Zhu X (jerry). Semi-supervised learning literature survey. Published Online First: 2008.https://minds.wisconsin.edu/handle/1793/60444 (accessed 19 Apr 2022).

[103] ↵
Cade BE, Hassan SM, Dashti HS, et al. Sleep apnea phenotyping and relationship to disease in a large clinical biobank. JAMIA Open 2022;5:ooab117.
OpenUrl

[104] ↵
Cohen AM, Chamberlin S, Deloughery T, et al. Detecting rare diseases in electronic health records using machine learning and knowledge engineering: Case study of acute hepatic porphyria. PLoS One 2020;15:e0235574.
OpenUrl

[105] ↵
Zhang Y, Cai T, Yu S, et al. High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP). Nat Protoc 2019;14:3426–44.
OpenUrl CrossRef

[106] ↵
Zhang L, Ding X, Ma Y, et al. A maximum likelihood approach to electronic health record phenotyping using positive and unlabeled patients. J Am Med Inform Assoc 2020;27:119–26.
OpenUrl

[107] ↵
Yu S, Chakrabortty A, Liao KP, et al. Surrogate-assisted feature extraction for high-throughput phenotyping. J Am Med Inform Assoc 2017;24:e143–9.
OpenUrl CrossRef PubMed

[108] ↵
Halpern Y, Horng S, Choi Y, et al. Electronic medical record phenotyping using the anchor and learn framework. J Am Med Inform Assoc 2016;23:731–40.
OpenUrl CrossRef PubMed

[109] ↵
Agarwal V, Podchiyska T, Banda JM, et al. Learning statistical models of phenotypes using noisy labeled training data. J Am Med Inform Assoc 2016;23:1166–73.
OpenUrl CrossRef PubMed

[110] ↵
Banda JM, Halpern Y, Sontag D, et al. Electronic phenotyping with APHRODITE and the Observational Health Sciences and Informatics (OHDSI) data network. AMIA Jt Summits Transl Sci Proc 2017;2017:48–57.
OpenUrl

[111] ↵
Yu S, Ma Y, Gronsbell J, et al. Enabling phenotypic big data with PheNorm. J Am Med Inform Assoc 2018;25:54–60.
OpenUrl CrossRef PubMed

[112] ↵
Liao KP, Sun J, Cai TA, et al. High-throughput multimodal automated phenotyping (MAP) with application to PheWAS. J Am Med Inform Assoc 2019;26:1255–62.
OpenUrl CrossRef

[113] ↵
Zheng NS, Feng Q, Kerchberger VE, et al. PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records. J Am Med Inform Assoc 2020;27:1675–87.
OpenUrl

[114] ↵
Sinnott JA, Cai F, Yu S, et al. PheProb: probabilistic phenotyping using diagnosis codes to improve power for genetic association studies. J Am Med Inform Assoc 2018;25:1359–65.
OpenUrl CrossRef PubMed

[115] ↵
Ferté T, Cossin S, Schaeverbeke T, et al. Automatic phenotyping of electronical health record: PheVis algorithm. J Biomed Inform 2021;117:103746.
OpenUrl

[116] Ahuja Y, Zhou D, He Z, et al. sureLDA: A multidisease automated phenotyping method for the electronic health record. J Am Med Inform Assoc 2020;27:1235–43.
OpenUrl

[117] ↵
Ning W, Chan S, Beam A, et al. Feature extraction for phenotyping from semantic and knowledge resources. J Biomed Inform 2019;91:103122.
OpenUrl

[118] ↵
Kashyap M, Seneviratne M, Banda JM, et al. Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network. Journal of the American Medical Informatics Association. 2020; 27:877–83. doi:10.1093/jamia/ocaa032
OpenUrl CrossRef

[119] ↵
Murray SG, Avati A, Schmajuk G, et al. Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling. Journal of the American Medical Informatics Association. 2019; 26:61–5. doi:10.1093/jamia/ocy154
OpenUrl CrossRef

[120] Sanyal J, Rubin D, Banerjee I. A weakly supervised model for the automated detection of adverse events using clinical notes. J Biomed Inform 2022;126:103969.
OpenUrl

[121] Topaz M, Murga L, Gaddis KM, et al. Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform 2019;90:103103.
OpenUrl

[122] ↵
Banerjee I, Li K, Seneviratne M, et al. Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. JAMIA Open. 2019;2:150–9. doi:10.1093/jamiaopen/ooy057
OpenUrl CrossRef

[123] ↵
Xu Z, Chou J, Zhang XS, et al. Identifying sub-phenotypes of acute kidney injury using structured and unstructured electronic health record data with memory networks. J Biomed Inform 2020;102:103361.
OpenUrl

[124] ↵
Apostolova E, Uppal A, Galarraga JE, et al. Towards Reliable ARDS Clinical Decision Support: ARDS Patient Analytics with Free-text and Structured EMR Data. AMIA Annu Symp Proc 2019;2019:228–37.
OpenUrl

[125] ↵
Zhao J, Zhang Y, Schlueter DJ, et al. Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study. J Biomed Inform 2019;98:103270.
OpenUrl

[126] ↵
Mullin S, Zola J, Lee R, et al. Longitudinal K-means approaches to clustering and analyzing EHR opioid use trajectories for clinical subtypes. J Biomed Inform 10/2021;122:103889.
OpenUrl

[127] ↵
Afshar M, Joyce C, Dligach D, et al. Subtypes in patients with opioid misuse: A prognostic enrichment strategy using electronic health record data in hospitalized patients. PLoS One 2019;14:e0219717.
OpenUrl

[128] ↵
Wang Y, Zhao Y, Therneau TM, et al. Unsupervised machine learning for the discovery of latent disease clusters and patient subgroups using electronic health records. Journal of Biomedical Informatics. 2020;102:103364. doi:10.1016/j.jbi.2019.103364
OpenUrl CrossRef PubMed

[129] Maurits MP, Korsunsky I, Raychaudhuri S, et al. A framework for employing longitudinally collected multicenter electronic health records to stratify heterogeneous patient populations on disease history. J Am Med Inform Assoc 2022;29:761–9.
OpenUrl PubMed

[130] ↵
Liu Q, Woo M, Zou X, et al. Symptom-based patient stratification in mental illness using clinical notes. J Biomed Inform 2019;98:103274.
OpenUrl CrossRef

[131] ↵
Ibrahim ZM, Wu H, Hamoud A, et al. On classifying sepsis heterogeneity in the ICU: insight using machine learning. J Am Med Inform Assoc 2020;27:437–43.
OpenUrl

[132] ↵
Shen F, Peng S, Fan Y, et al. HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology. J Biomed Inform 08/2019;96:103246.
OpenUrl

[133] ↵
Hubbard RA, Xu J, Siegel R, et al. Studying pediatric health outcomes with electronic health records using Bayesian clustering and trajectory analysis. J Biomed Inform 2021;113:103654.
OpenUrl

[134] ↵
Ben-Assuli O, Jacobi A, Goldman O, et al. Stratifying individuals into non-alcoholic fatty liver disease risk levels using time series machine learning models. J Biomed Inform 2022;126:103986.
OpenUrl

[135] ↵
Gong J, Simon GE, Liu S. Machine learning discovery of longitudinal patterns of depression and suicidal ideation. PLoS One 2019;14:e0222665.
OpenUrl

[136] ↵
Wang L, Lakin J, Riley C, et al. Disease Trajectories and End-of-Life Care for Dementias: Latent Topic Modeling and Trend Analysis Using Clinical Notes. AMIA Annu Symp Proc 2018;2018:1056–65.
OpenUrl

[137] ↵
Meaney C, Escobar M, Moineddin R, et al. Non-negative matrix factorization temporal topic models and clinical text data identify COVID-19 pandemic effects on primary healthcare and community health in Toronto, Canada. Journal of Biomedical Informatics. 2022;128:104034. doi:10.1016/j.jbi.2022.104034
OpenUrl CrossRef

[138] ↵
Li R, Chen Y, Moore JH. Integration of genetic and clinical information to improve imputation of data missing from electronic health records. J Am Med Inform Assoc 2019;26:1056–63.
OpenUrl

[139] ↵
Klann JG, Estiri H, Weber GM, et al. Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. J Am Med Inform Assoc 2021;28:1411–20.
OpenUrl

[140] ↵
Malmasi S, Ge W, Hosomura N, et al. Comparing information extraction techniques for low-prevalence concepts: The case of insulin rejection by patients. J Biomed Inform 2019;99:103306.
OpenUrl

[141] ↵
Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health 2021;3:e745–50.
OpenUrl PubMed

[142] Rajpurkar P, Chen E, Banerjee O, et al. AI in health and medicine. Nat Med 01/2022;28:31–8.
OpenUrl CrossRef PubMed

[143] ↵
Doshi-Velez F,
Fackler J,
Jung K, et al.
Nestor B, McDermott MBA, Boag W, et al. Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks. In: Doshi-Velez F, Fackler J, Jung K, et al., eds. Proceedings of the 4th Machine Learning for Healthcare Conference. PMLR 09--10 Aug 2019. 381–405.

[144] Doshi-Velez F,

[145] Fackler J,

[146] Jung K, et al.

[147] ↵
Mate S, Bürkle T, Kapsner LA, et al. A method for the graphical modeling of relative temporal constraints. J Biomed Inform 2019;100:103314.
OpenUrl

[148] ↵
Meng W, Ou W, Chandwani S, et al. Temporal phenotyping by mining healthcare data to derive lines of therapy for cancer. J Biomed Inform 2019;100:103335.
OpenUrl

[149] ↵
Liang L, Hou J, Uno H, et al. Semi-supervised Approach to Event Time Annotation Using Longitudinal Electronic Health Records. arXiv [stat.ME]. 2021.http://arxiv.org/abs/2110.09612

[150] ↵
Ahuja Y, Wen J, Hong C, et al. SAMGEP: A novel method for prediction of phenotype event times using the electronic health record. Research Square. 2021.https://www.researchsquare.com/article/rs-1119858/latest.pdf

[151] ↵
Tong J, Luo C, Islam MN, et al. Distributed learning for heterogeneous clinical data with application to integrating COVID-19 data across 230 sites. NPJ Digit Med 2022;5:76.
OpenUrl

[152] ↵
Kohane IS, Aronow BJ, Avillach P, et al. What Every Reader Should Know About Studies Using Electronic Health Record Data but May Be Afraid to Ask. J Med Internet Res 2021;23:e22219.
OpenUrl PubMed

[153] ↵
Weaver J, Potvien A, Swerdel J, et al. Best practices for creating the standardized content of an entry in the OHDSI Phenotype Library. In: 5th OHDSI Annual Symposium. 2019. https://www.ohdsi.org/wp-content/uploads/2019/09/james-weaver_a_book_in_the_phenotype_library_2019symposium.pdf

[154] ↵
Swerdel JN, Hripcsak G, Ryan PB. PheValuator: Development and evaluation of a phenotype algorithm evaluator. J Biomed Inform 2019;97:103258.
OpenUrl PubMed

[155] Gronsbell JL, Cai T. Semi-supervised approaches to efficient evaluation of model prediction performance. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2018;80:579–94. doi:10.1111/rssb.12264

[156] Gronsbell J, Liu M, Tian L, et al. Efficient evaluation of prediction rules in semi-supervised settings under stratified sampling. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2022. doi:10.1111/rssb.12502

[157] Manuel DG, Rosella LC, Stukel TA. Importance of accurately identifying disease in studies using electronic health records. BMJ 2010;341:c4226.

[158] Sinnott JA, Dai W, Liao KP, et al. Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records. Hum Genet 2014;133:1369–82.

[159] Hubbard RA, Tong J, Duan R, et al. Reducing Bias Due to Outcome Misclassification for Epidemiologic Studies Using EHR-derived Probabilistic Phenotypes. Epidemiology 2020;31:542–50.

[160] Koola JD, Davis SE, Al-Nimri O, et al. Development of an automated phenotyping algorithm for hepatorenal syndrome. J Biomed Inform 2018;80:87–95.

[161] Afshar M, Joyce C, Oakey A, et al. A Computable Phenotype for Acute Respiratory Distress Syndrome Using Natural Language Processing and Machine Learning. AMIA Annu Symp Proc 2018;2018:157–65.

[162] Hong N, Wen A, Stone DJ, et al. Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries. J Biomed Inform 2019;99:103310.

[163] Bucher BT, Shi J, Pettit RJ, et al. Determination of Marital Status of Patients from Structured and Unstructured Electronic Healthcare Data. AMIA Annu Symp Proc 2019;2019:267–74.

[164] Dai H-J, Wang F-D, Chen C-W, et al. Cohort selection for clinical trials using multiple instance learning. J Biomed Inform 2020;107:103438.

[165] Hassanzadeh H, Karimi S, Nguyen A. Matching patients to clinical trials using semantically enriched document representation. J Biomed Inform 2020;105:103406.

[166] Kulshrestha S, Dligach D, Joyce C, et al. Comparison and interpretability of machine learning models to predict severity of chest injury. JAMIA Open 2021;4:ooab015.

[167] Chu J, Dong W, He K, et al. Using neural attention networks to detect adverse medical events from electronic health records. J Biomed Inform 2018;87:118–30.

[168] Chen C-J, Warikoo N, Chang Y-C, et al. Medical knowledge infused convolutional neural networks for cohort selection in clinical trials. J Am Med Inform Assoc 2019;26:1227–36.

[169] Segura-Bedmar I, Colon-Ruiz C, Tejedor-Alonso MA, et al. Predicting of anaphylaxis in big data EMR by exploring machine learning approaches. J Biomed Inform 2018;87:50–59.
