Graph-Based Clinical Recommender: Predicting Specialists Procedure Orders using Graph Representation Learning

Sajjad Fouladvand; Federico Reyes Gomez; Hamed Nilforoshan; Morteza Noshad; Jiaxuan You; Rok Sosic; Jure Leskovec; Jonathan Chen

doi:10.1101/2022.11.21.22282571

Abstract

An automated medical procedure order recommender can facilitate patient referrals and consultations from primary care providers to specialty care sites. Here, we propose to solve this task using a novel graph representation learning approach. We develop a heterogeneous graph neural network to model structured electronic health records and formulate the procedure recommender task as a link prediction problem. Our experimental results show that our model achieves a 14% improvement in personalized recommendations over state-of-the-art neural network models and existing clinical tools including referral guidelines and checklists.

1. Introduction

Access to medical specialty care is often delayed due to growing limitations in clinicians' time and resources, leading to higher mortality rates (Prentice and Pizer, 2007). Early prediction of the procedures to be ordered during the initial outpatient specialty consultation can facilitate specialist consultations as well as decision making (Chiang et al., 2020; Kim-Hwang et al., 2010). Leveraging artificial intelligence (AI) to solve this task is largely unexplored, even though AI has been applied successfully to real-world problems in healthcare (Yu et al., 2018).

To this end, Noshad et al. (2021) have proposed an endocrinology procedure recommender using an ensemble of multi-layer perceptron neural networks and collaborative filtering. However, the heterogeneity and structured nature of electronic health records (EHR) can be captured more effectively using graphical models (Park et al., 2022; Choi et al., 2018, 2017).

Graph Convolutional Transformer (GCT) (Choi et al., 2020) maps encounters into a fully connected graph and infers the underlying structure by computing self-attention on the graph connections. Liu et al. (2020) addressed the high visibility (Li et al., 2018) of hub nodes such as demographic nodes and showed the effectiveness of modeling EHR data as heterogeneous graphs. Further, heterogeneous graph neural networks (GNN) have been utilized in side effect prediction for drug pairs (Zitnik et al., 2018), medical diagnosis prediction (Liu et al., 2021), and medical concept representations (Wu et al., 2021; Vretinaris et al., 2021).

Motivated by Hamilton et al. (2017), Zitnik et al. (2018), and Veličković et al. (2018), we propose a novel GNN-based framework to provide personalized procedure order recommendations prior to or during patients' initial specialty care visits. Note that here we use the term ‘order’ to refer to the surgical/follow-up procedures ordered by physicians. This work is part of a larger body of work to develop digital specialty consultation systems that expand access to quality medical expertise. In this paper, we focus on referrals to endocrinology, one of the highest-demand specialties, and use patients' historical structured EHR data.

2. Materials and Methods

2.1. Data

Our data includes all outpatients referred by Stanford primary care providers to the Stanford endocrinology clinic between January 2008 and December 2018. Use of this data for this study was approved as an exempt protocol by the Stanford Institutional Review Board. We only included patients’ first visit with their endocrinologist within four months of their referral dates. Our final data set includes 6,821 referrals.

We denote the list of patient referrals as P = {p₁, …, p_n}, in which n is the number of patient referrals. Each patient referral p_i constitutes a tuple (t_i, Dⁱ, Oⁱ, Lⁱ, Yⁱ), where t_i is the referral date and Dⁱ ∈ ℝ¹⁰, Oⁱ ∈ ℝ⁶⁰, and Lⁱ ∈ ℝ^300×3 are multi-hot encoded vectors representing diagnosis codes, procedure orders, and lab results for p_i prior to t_i. We used a two-month look-back window for lab results and procedures. Each lab result was converted to a vector with three elements indicating (a) whether p_i has had the lab result, (b) whether the result was high, and (c) whether the result was low. Yⁱ is a multi-hot encoded vector representing the procedures ordered by the specialist during the patient’s specialty care visit.
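The three-element lab encoding described above can be illustrated with a minimal sketch. The function names and the flag strings ("high", "low", "normal") are assumptions made for illustration and are not taken from the original pipeline; the lab names are placeholders.

```python
from typing import Dict, List, Optional


def encode_lab(flag: Optional[str]) -> List[int]:
    """Encode one lab as [has_result, is_high, is_low].

    `flag` is None when the patient has no result in the look-back window;
    otherwise it is "high", "low", or "normal" (hypothetical flag values).
    """
    if flag is None:
        return [0, 0, 0]
    return [1, int(flag == "high"), int(flag == "low")]


def encode_labs(lab_names: List[str], results: Dict[str, str]) -> List[int]:
    """Concatenate the three-element encodings in a fixed lab order."""
    vector: List[int] = []
    for name in lab_names:
        vector.extend(encode_lab(results.get(name)))
    return vector


# Placeholder lab names; the real feature list is given in Appendix A.
labs = ["TSH", "HbA1c", "Calcium"]
patient_results = {"TSH": "high", "Calcium": "normal"}
print(encode_labs(labs, patient_results))  # [1, 1, 0, 0, 0, 0, 1, 0, 0]
```

A missing lab yields three zeros, so "no result" stays distinguishable from "result present and normal".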

Our final feature set includes 370 features: the 10 most common endocrinology-related diagnosis codes, 300 lab result features, and 60 procedures. The target set includes 60 procedure orders (see variable names in Appendix A).

2.2. Proposed Method

2.2.1. Graph Structure

We modeled the patients' EHR data set as a heterogeneous graph G = (V, E) (see Figure A1 in Appendix A). V contains two node types: patient referral nodes and procedure order nodes. Each patient node is assigned a 310-dimensional feature vector consisting of the concatenation of Dⁱ and Lⁱ, and each procedure order node is associated with a one-hot encoding of its entity ID.
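As a minimal sketch of the node features just described, the snippet below concatenates a diagnosis vector and a lab vector into a 310-dimensional patient feature and one-hot encodes a procedure order. The dimensions come from Section 2.1; the function names and the integer procedure ID are illustrative assumptions.

```python
from typing import List

N_DIAG, N_LAB, N_PROC = 10, 300, 60  # sizes stated in Section 2.1


def patient_features(diag: List[int], labs: List[int]) -> List[int]:
    """Concatenate diagnosis and lab vectors into one 310-dim patient feature."""
    assert len(diag) == N_DIAG and len(labs) == N_LAB
    return diag + labs


def procedure_features(proc_id: int) -> List[int]:
    """One-hot encode a procedure order by its integer ID (0..59)."""
    vec = [0] * N_PROC
    vec[proc_id] = 1
    return vec


x_patient = patient_features([0] * N_DIAG, [0] * N_LAB)
x_proc = procedure_features(7)  # hypothetical procedure ID
print(len(x_patient), len(x_proc), x_proc.index(1))  # 310 60 7
```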

Edge set E contains two kinds of ‘ordered-with’ edges. Edges with label 0 connect patient nodes to the procedures they had before t_i, and edges with label 1 connect patients to the procedures that their specialist ordered during the specialty care visit after t_i. Note that the label-1 edges, which represent the specialist’s orders after the referral date, were not used during training and were only used in the prediction phase, because we aim to predict procedure orders after t_i. We formulate this task as binary link prediction of the existence of ‘ordered-with’ edges between a patient and an order. Further, node degree, node clustering coefficient, and centrality transformations were applied to add synthetic features to each node feature vector. While the model can learn those features on its own, we added them to help the model focus on learning other features. We apply separate graph convolutional layers with independent parameters to each message type (head, relation, tail) and aggregate embeddings across all node types. The same graph attention mechanism was applied to all node types.
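The edge construction can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' code: the record identifiers are made up, and computing the degree feature over label-0 edges only is our assumption (it avoids leaking the prediction targets into the node features).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (patient node, procedure node, edge label)


def build_edges(prior: Dict[str, List[str]],
                ordered: Dict[str, List[str]]) -> List[Edge]:
    """Label 0: procedures before referral; label 1: specialist orders."""
    edges: List[Edge] = []
    for patient, procs in prior.items():
        edges += [(patient, p, 0) for p in procs]
    for patient, procs in ordered.items():
        edges += [(patient, p, 1) for p in procs]
    return edges


def degrees(edges: List[Edge], label: int = 0) -> Dict[str, int]:
    """Node degree computed over edges with the given label only."""
    deg: Dict[str, int] = defaultdict(int)
    for patient, proc, edge_label in edges:
        if edge_label == label:
            deg[patient] += 1
            deg[proc] += 1
    return dict(deg)


# Hypothetical toy records (all identifiers are made up).
prior = {"p1": ["tsh_panel", "hba1c"], "p2": ["hba1c"]}
ordered = {"p1": ["thyroid_us"], "p2": ["tsh_panel"]}

edges = build_edges(prior, ordered)
history_edges = [e for e in edges if e[2] == 0]  # available as model input
target_edges = [e for e in edges if e[2] == 1]   # to be predicted
print(len(history_edges), len(target_edges))     # 3 2
print(degrees(edges)["hba1c"])                   # 2
```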

2.2.2. Message Passing and Graph Attention

Figure 1 shows our proposed architecture. A fully connected layer with a hidden size of 128 was used to map each node feature vector to a pre-embedding vector. Distinct fully connected layers were used for each node type. Two message passing layers were used, each consisting of a dropout layer, a PReLU activation function, and a graph convolutional layer.
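To make the layer ordering concrete, here is a minimal sketch of one message passing layer in plain Python. The text does not specify the convolution operator, so mean aggregation over neighbors is an assumption, as are the function names, the PReLU slope of 0.25, and the tiny 2-dimensional embeddings. Dropout is omitted because it acts as the identity at inference time.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def prelu(x: Vector, slope: float = 0.25) -> Vector:
    """PReLU: identity for positive inputs, `slope` times the input otherwise."""
    return [v if v > 0 else slope * v for v in x]


def linear(x: Vector, weight: Matrix) -> Vector:
    """Dense projection without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def mean_conv(h: Dict[str, Vector], neighbors: Dict[str, List[str]],
              weight: Matrix) -> Dict[str, Vector]:
    """Assumed graph convolution: average neighbor vectors, then project."""
    out = {}
    for node, nbrs in neighbors.items():
        dim = len(h[node])
        if nbrs:
            mean = [sum(h[n][k] for n in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim  # isolated node: nothing to aggregate
        out[node] = linear(mean, weight)
    return out


def message_passing_layer(h, neighbors, weight, slope=0.25):
    """Dropout (identity at inference), PReLU, then graph convolution."""
    activated = {node: prelu(vec, slope) for node, vec in h.items()}
    return mean_conv(activated, neighbors, weight)


# Toy pre-embeddings for one patient node and two order nodes.
h = {"p1": [1.0, -2.0], "o1": [0.0, 4.0], "o2": [2.0, 0.0]}
neighbors = {"p1": ["o1", "o2"], "o1": ["p1"], "o2": ["p1"]}
identity = [[1.0, 0.0], [0.0, 1.0]]
print(message_passing_layer(h, neighbors, identity))
```

In a heterogeneous setting, each message type would use its own `weight` matrix; the sketch uses a single matrix for brevity.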

Figure 1:

General architecture of our data, graph, and our proposed model.

We used a custom heterogeneous graph attention layer with 1-head attention that mostly follows the structure of the original graph attention networks (Veličković et al., 2018), with the following modifications: 1) we applied fully connected layers with batch normalization to the node embeddings and the neighbor embeddings, and 2) we aggregated neighbor embeddings using the attention mechanism and concatenated the aggregated embedding to the current node’s embedding. This is then passed into a fully connected layer that reduces it to a single output embedding, followed by a batch normalization operation. Equation (1) shows our message passing function and Equation (2) shows the GATConv update function.

Here, the attention coefficient is the 1-head GAT attention score calculated for υ_o, 𝒩(υ) is the set of neighbors of υ, and the feature term represents the node features of node υ.
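The aggregation step can be sketched as follows. This is not a reproduction of Equations (1) and (2): it uses the scoring form of the original GAT (a LeakyReLU applied to a dot product between an attention vector and the concatenated embeddings, followed by a softmax over neighbors), and it omits the fully connected and batch normalization layers described above. The attention vector `a` and all names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_weights(h_self: Vector, h_nbrs: List[Vector],
                      a: Vector) -> List[float]:
    """Softmax over LeakyReLU(a . [h_self || h_nbr]) for each neighbor."""
    scores = [leaky_relu(sum(w * v for w, v in zip(a, h_self + h_n)))
              for h_n in h_nbrs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_and_concat(h_self: Vector, h_nbrs: List[Vector],
                         a: Vector) -> Vector:
    """Attention-weighted neighbor sum, concatenated to the node's own vector."""
    alpha = attention_weights(h_self, h_nbrs, a)
    dim = len(h_nbrs[0])
    agg = [sum(al * h_n[k] for al, h_n in zip(alpha, h_nbrs))
           for k in range(dim)]
    return h_self + agg


h_v = [1.0, 0.0]
nbrs = [[0.0, 1.0], [0.0, 1.0]]  # two identical neighbors get equal weight
a = [0.5, 0.5, 0.5, 0.5]         # hypothetical attention vector
out = aggregate_and_concat(h_v, nbrs, a)
print(out)  # [1.0, 0.0, 0.0, 1.0]
```

In the described layer, the concatenated vector would then pass through a fully connected layer and batch normalization to produce the output embedding.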

The final prediction on the existence of an ‘ordered-with’ edge e_ij between a patient node and a procedure order node is inferred by concatenating their node embeddings and passing the result through a fully connected two-layer perceptron, a batch normalization, a ReLU activation, and a final fully connected layer that outputs a 2-dimensional logit vector, which is converted to the final binary prediction using a softmax function.

In the formula for the link prediction head, BN refers to batch normalization; the first output value corresponds to the probability that the edge exists and the second to the probability that it does not.
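A simplified sketch of such a link prediction head is given below, under stated assumptions: a single hidden fully connected layer stands in for the perceptron, batch normalization is applied in inference mode with stored statistics and no learned scale or shift, and all weights are toy values chosen for illustration.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(x: Vector, w: Matrix, b: Vector) -> Vector:
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]


def batch_norm_eval(x: Vector, mean: Vector, var: Vector,
                    eps: float = 1e-5) -> Vector:
    """Inference-mode batch norm using stored mean and variance."""
    return [(xi - m) / math.sqrt(v + eps) for xi, m, v in zip(x, mean, var)]


def relu(x: Vector) -> Vector:
    return [max(0.0, xi) for xi in x]


def softmax(x: Vector) -> Vector:
    top = max(x)
    exps = [math.exp(xi - top) for xi in x]
    total = sum(exps)
    return [e / total for e in exps]


def link_head(h_patient, h_order, w1, b1, bn_mean, bn_var, w2, b2):
    z = h_patient + h_order       # concatenate the two node embeddings
    z = linear(z, w1, b1)         # hidden fully connected layer
    z = batch_norm_eval(z, bn_mean, bn_var)
    z = relu(z)
    logits = linear(z, w2, b2)    # 2-dimensional logits
    return softmax(logits)        # [P(edge exists), P(edge absent)]


# Toy parameters (hypothetical values, 2-dim embeddings).
w1 = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
w2 = [[1.0, 0.0], [0.0, 1.0]]
probs = link_head([1.0, 0.0], [0.0, 1.0], w1, [0.0, 0.0],
                  [0.0, 0.0], [1.0, 1.0], w2, [0.0, 0.0])
predicted_edge = probs[0] > probs[1]
print(probs, predicted_edge)
```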

3. Experimental Results

We used transductive disjoint training with 1:4 positive:negative sampling using PyG (Fey and Lenssen, 2019; PyG). We trained the model for 400 epochs using the Adam optimizer with a learning rate of 1e-3 and a weight decay of 5e-4. Further, we used a dropout of 0.2 and set the pre-embedding, hidden, and final embedding sizes to 128. Our GNN model was tested on predictions made on all ‘ordered-with’ edges between a patient and an order placed during the specialty visit.
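The 1:4 positive:negative sampling can be illustrated with a small standard-library sketch. It is not PyG's sampling utility: uniform sampling without replacement from the non-existent (patient, order) pairs is an assumption, and the dictionary keys and node identifiers are our own naming. The hyperparameter values are the ones reported above.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[str, str]

# Hyperparameters reported in the text; the key names are hypothetical.
CONFIG = {"lr": 1e-3, "weight_decay": 5e-4, "epochs": 400,
          "dropout": 0.2, "hidden_size": 128, "neg_per_pos": 4}


def sample_negatives(positives: List[Edge], patients: List[str],
                     orders: List[str], ratio: int,
                     seed: int = 0) -> List[Edge]:
    """Draw up to `ratio` non-existent (patient, order) pairs per positive."""
    rng = random.Random(seed)
    existing: Set[Edge] = set(positives)
    candidates = [(p, o) for p in patients for o in orders
                  if (p, o) not in existing]
    k = min(len(candidates), ratio * len(positives))
    return rng.sample(candidates, k)


patients = ["p1", "p2", "p3"]
orders = ["o1", "o2", "o3", "o4", "o5"]
positives = [("p1", "o1"), ("p2", "o3")]
negatives = sample_negatives(positives, patients, orders,
                             CONFIG["neg_per_pos"])
print(len(positives), len(negatives))  # 2 8
```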

Table 1 compares the prediction results of our proposed model (CR-GNN) with the baselines presented by Noshad et al. (2021), including a fully connected multi-layer neural network (Diagnostic Model), a collaborative filtering auto-encoder (AE), singular value decomposition (SVD), probabilistic matrix factorization (PMF), an aggregate neural network (Aggregated ANN), and an ensemble model (Ensemble Model) that uses a multi-layer neural network to combine the outputs of the diagnostic model, the collaborative filtering auto-encoder, and the specialists' identifiers as a separate input signal.

Table 1:

Performance of endocrinologist procedure order prediction models.

Our proposed model can predict endocrinology specialty procedures more effectively (ROC-AUC=0.91) compared to all models proposed by Noshad et al. (2021) (best ROC-AUC=0.80). Further, our model showed significantly higher precision at recalls 0.5, 0.4 and 0.3 compared to all baseline models. Note, we used the same data as the data that were used in Noshad et al. (2021) except we removed features related to the specialists that patients were referred to, because specialists’ information can add bias to the model. Additionally, we trained and tested our proposed model using all features used in Noshad et al. (2021) including specialists’ information as well. This didn’t significantly affect our model’s performance. The model’s ROC-AUC using all features including specialists’ information was 0.912 and precisions at recalls 0.5, 0.4 and 0.3 were 0.60, 0.65, and 0.70, respectively.
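For readers less familiar with the metric, ROC-AUC equals the probability that a randomly chosen positive edge receives a higher score than a randomly chosen negative edge, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, not results from this study.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 2 positive and 3 negative candidate edges.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(round(roc_auc(toy_labels, toy_scores), 3))  # 0.833
```

The pairwise form is quadratic in the number of edges, so it is suited to illustration rather than to large evaluation sets.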

Figure 2 compares the precision-recall curve of our proposed method with those of the baselines. The precision of the baseline models is higher than that of our proposed model toward the tail of the curves. However, our proposed model has higher precision than all baselines over a wide range of recalls, including recalls of 0.3, 0.4, and 0.5. This allows clinicians to adjust our model based on their preferred precision-recall trade-off. Further, our proposed model has significantly higher precision than all baselines at recalls close to the recall of existing clinical guidelines and checklists.
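Precision at a fixed recall can be computed by sweeping a score threshold from high to low. The sketch below uses one common convention, the best precision among operating points whose recall is at least the target, which may differ from the convention used for the reported numbers. It assumes distinct scores, and the data are toy values.

```python
from typing import List


def precision_at_recall(labels: List[int], scores: List[float],
                        target_recall: float) -> float:
    """Best precision among thresholds with recall >= target_recall."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    true_pos, best = 0, 0.0
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        recall = true_pos / total_pos
        if recall >= target_recall:
            best = max(best, true_pos / rank)
    return best


pr_labels = [1, 0, 1, 0, 0]
pr_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(precision_at_recall(pr_labels, pr_scores, 0.5))  # 1.0
```

Raising the target recall to 1.0 on the same toy data forces the threshold below the second positive, and the precision drops to 2/3.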

Figure 2:

Comparison of the precision-recall curves of our proposed model and the baselines.

4. Conclusion

In conclusion, embedding graph neural network models into clinical care can improve digital specialty consultation systems and expand the access to quality medical expertise.

There are some limitations in this work that should be considered before using our proposed model. The proposed model is limited to transductive learning and to patients' structured data.

Data Availability

The data include Stanford healthcare patient data and cannot be made publicly available.

APPENDIX A. Features

Table A1:

Diagnosis features.

Table A2:

Procedure features.

Figure A1:

Graph construction.

Table A3:

Lab result features.

Footnotes

SAJJADF{at}STANFORD.EDU, FRG100{at}ALUMNI.STANFORD.EDU, HAMEDN{at}CS.STANFORD.EDU, NOSHAD{at}STANFORD.EDU, JIAXUAN{at}CS.STANFORD.EDU, ROK{at}CS.STANFORD.EDU, JURE{at}CS.STANFORD.EDU, JONC101{at}STANFORD.EDU

References

PyG. https://www.pyg.org/. Accessed: 2022-09-31.

Jonathan Chiang, Andre Kumar, David Morales, Divya Saini, Jason Hom, Lisa Shieh, Mark Musen, Mary K Goldstein, and Jonathan H Chen. Physician usage and acceptance of a machine learning recommender system for simulated clinical order entry. AMIA Summits on Translational Science Proceedings, 2020:89, 2020.

Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F Stewart, and Jimeng Sun. Gram: graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pages 787–795, 2017.

Edward Choi, Cao Xiao, Walter Stewart, and Jimeng Sun. Mime: Multilevel medical embedding of electronic health records for predictive healthcare. Advances in neural information processing systems, 31, 2018.

Edward Choi, Zhen Xu, Yujia Li, Michael Dusenberry, Gerardo Flores, Emily Xue, and Andrew Dai. Learning the graphical structure of electronic health records with graph convolutional transformer. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 606–613, 2020.

Matthias Fey and Jan Eric Lenssen. Fast graph representation learning with pytorch geometric. arXiv preprint arXiv:1903.02428, 2019.

Will Hamilton, Zhitao Ying, and Jure Leskovec. Deepsnap. Advances in neural information processing systems, 30, 2017.

Judy E Kim-Hwang, Alice Hm Chen, Douglas S Bell, David Guzman, Hal F Yee, and Margot B Kushel. Evaluating electronic referrals for specialty care at a public hospital. Journal of general internal medicine, 25(10):1123–1128, 2010.

Qimai Li, Zhichao Han, and Xiao-Ming Wu. Deeper insights into graph convolutional networks for semi-supervised learning. In Thirty-Second AAAI conference on artificial intelligence, 2018.

Zheng Liu, Xiaohan Li, Hao Peng, Lifang He, and Philip S Yu. Heterogeneous similarity graph neural network on electronic health records. In 2020 IEEE International Conference on Big Data (Big Data), pages 1196–1205. IEEE, 2020.

Zheng Liu, Xiaohan Li, Zeyu You, Tao Yang, Wei Fan, and Philip Yu. Medical triage chatbot diagnosis improvement via multi-relational hyperbolic graph neural network. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1965–1969, 2021.

Morteza Noshad, Ivana Jankovic, and Jonathan H Chen. Clinical recommender algorithms to simulate digital specialty consultations. In Pacific Symposium on Biocomputing 2022, pages 290–300. World Scientific, 2021.

Sungjin Park, Seongsu Bae, Jiho Kim, Tackeun Kim, and Edward Choi. Graph-text multi-modal pre-training for medical representation learning. In Conference on Health, Inference, and Learning, pages 261–281. PMLR, 2022.

Julia C Prentice and Steven D Pizer. Delayed access to health care and mortality. Health services research, 42(2):644–662, 2007.

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph Attention Networks. International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rJXMpikCZ.

Alina Vretinaris, Chuan Lei, Vasilis Efthymiou, Xiao Qin, and Fatma Özcan. Medical entity disambiguation using graph neural networks. In Proceedings of the 2021 International Conference on Management of Data, pages 2310–2318, 2021.

Tong Wu, Yunlong Wang, Yue Wang, Emily Zhao, and Yilian Yuan. Leveraging graph-based hierarchical medical entity embedding for healthcare applications. Scientific reports, 11(1):1–13, 2021.

Kun-Hsing Yu, Andrew L Beam, and Isaac S Kohane. Artificial intelligence in healthcare. Nature biomedical engineering, 2(10):719–731, 2018.

Marinka Zitnik, Monica Agrawal, and Jure Leskovec. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, 34(13):i457–i466, 2018.