Abstract
An automated medical procedure order recommender can facilitate patient referrals and consultations from primary care providers to specialty care sites. Here, we propose to solve this task using a novel graph representation learning approach. We develop a heterogeneous graph neural network to model structured electronic health records and formulate the procedure recommender task as a link prediction problem. Our experimental results show that our model achieves a 14% improvement in personalized recommendations over state-of-the-art neural network models and existing clinical tools including referral guidelines and checklists.
1. Introduction
Access to medical specialty care is often delayed due to growing limitations in clinicians' time and resources, leading to higher mortality rates (Prentice and Pizer, 2007). Early prediction of the procedures to be ordered during the initial outpatient specialty consultation can facilitate specialist consultations as well as decision making (Chiang et al., 2020; Kim-Hwang et al., 2010). Leveraging artificial intelligence (AI) to solve this task is largely unexplored, even though AI has been applied successfully to real-world problems in healthcare (Yu et al., 2018).
To this end, Noshad et al. (2021) have proposed an endocrinology procedure recommender using an ensemble of multi-layer perceptron neural networks and collaborative filtering. However, the heterogeneity and structured nature of electronic health records (EHR) can be captured more effectively using graphical models (Park et al., 2022; Choi et al., 2018, 2017).
Graph Convolutional Transformer (GCT) (Choi et al., 2020) maps encounters into a fully connected graph and infers the underlying structure by computing self-attention over the graph connections. Liu et al. (2020) addressed the high visibility (Li et al., 2018) of hub nodes such as demographic nodes and showed the effectiveness of modeling EHR data as heterogeneous graphs. Further, heterogeneous graph neural networks (GNN) have been utilized in side effect prediction for drug pairs (Zitnik et al., 2018), medical diagnosis prediction (Liu et al., 2021) and medical concept representations (Wu et al., 2021; Vretinaris et al., 2021).
Motivated by Hamilton et al. (2017), Zitnik et al. (2018), and Veličković et al. (2018), we propose a novel GNN-based framework to provide personalized procedure order recommendations prior to or during patients' initial specialty care visits. Note that here we use the term ‘order’ to refer to the surgical/follow-up procedures ordered by physicians. This work is part of a larger body of work to develop digital specialty consultation systems that expand access to quality medical expertise. In this paper, we focus on referrals to endocrinology, one of the highest-demand specialties, and use patients' historical structured EHR data.
2. Materials and Methods
2.1. Data
Our data include all outpatients referred by Stanford primary care providers to the Stanford endocrinology clinic between January 2008 and December 2018. Use of these data for this study was approved as an exempt protocol by the Stanford Institutional Review Board. We included only patients' first visit with their endocrinologist within four months of their referral dates. Our final data set includes 6,821 referrals.
We denote the list of patient referrals as P = {p1, …, pn}, in which n is the number of patient referrals. Each patient referral pi constitutes a tuple (ti, Di, Oi, Li, Yi), where ti is the referral date and Di ∈ ℝ^10, Oi ∈ ℝ^60, and Li ∈ ℝ^(300×3) are multi-hot encoded vectors representing diagnosis codes, procedure orders, and lab results for pi prior to ti. We used a two-month look-back window for lab results and procedures. Each lab result was converted to a vector with three elements indicating (a) whether pi has had the lab result, (b) whether the result was high, and (c) whether the result was low. Yi is a multi-hot encoded vector representing the procedures ordered by the specialist during the patient's specialty care visit.
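The three-element lab encoding described above can be illustrated with a minimal sketch. The lab names, the "high"/"low"/"normal" flags, and the function name encode_labs are hypothetical and chosen only for illustration; the study's actual lab list is the one referenced in Appendix A.

```python
from typing import Dict, List

# Hypothetical lab catalog, for illustration only.
LAB_NAMES: List[str] = ["tsh", "hba1c", "calcium"]


def encode_labs(results: Dict[str, str]) -> List[List[int]]:
    """Return one [has_result, is_high, is_low] triple per lab in LAB_NAMES.

    `results` maps a lab name to "high", "low", or "normal". Labs that were
    not measured in the look-back window are absent from the dict.
    """
    encoded = []
    for name in LAB_NAMES:
        flag = results.get(name)          # None when the lab was not measured
        has_result = int(name in results)
        encoded.append([has_result, int(flag == "high"), int(flag == "low")])
    return encoded


# A patient with a high TSH, no HbA1c measurement, and a normal calcium.
example = encode_labs({"tsh": "high", "calcium": "normal"})
```

Under this sketch an unmeasured lab and a normal lab differ only in the first element, which is what lets a model distinguish "not tested" from "tested and unremarkable".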
Our final feature set includes 370 features: the 10 most common endocrinology-related diagnosis codes, 300 lab result features, and 60 procedures. The target set includes 60 procedure orders (see variable names in Appendix A).
2.2. Proposed Method
2.2.1. Graph Structure
We modeled the patients' EHR data set as a heterogeneous graph G = (V, E) (see Figure A1 in Appendix A). V contains two node types: patient referral nodes and procedure order nodes. Each patient node is assigned a 310-dimensional feature vector consisting of the concatenation of Di and Li, and each procedure order node is associated with a one-hot encoding of its entity ID.
The edge set E contains two edge types: ‘ordered-with’ edges with edge label 0, which connect patient nodes to the procedures they had before ti, and ‘ordered-with’ edges with edge label 1, which connect patients to the procedures that their specialist ordered during the specialty care visit after ti. Note that the ‘ordered-with’ edges with edge label 1, which represent the specialist's orders after the referral date, were not used during training and were used only in the prediction phase, because we aim to predict procedure orders after ti. We formulate this task as binary link prediction of the existence of ‘ordered-with’ edges between a patient and an order. Further, node degree, node clustering coefficient, and centrality transformations were applied to add synthetic features to each node feature vector. While the model can learn those features on its own, we added them to help the model focus on learning other features. We apply a separate graph convolutional layer with independent parameters to each message type (head, relation, tail) and aggregate embeddings across all node types. The same graph attention mechanism was applied to all node types.
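As a minimal sketch of the edge construction described above, the multi-hot rows Oi and Yi can be turned into labeled ‘ordered-with’ edges, and a node degree feature can be derived from the label-0 edges. The function names build_edges and patient_degrees and the toy values are hypothetical; this is not the authors' implementation.

```python
from collections import Counter
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (patient index, procedure index, edge label)


def build_edges(prior: List[List[int]], target: List[List[int]]) -> List[Edge]:
    """Turn multi-hot rows into labeled 'ordered-with' edges.

    prior[i][j] == 1  -> label-0 edge (procedure j done before referral i)
    target[i][j] == 1 -> label-1 edge (procedure j ordered at the specialty visit)
    """
    edges: List[Edge] = []
    for label, matrix in ((0, prior), (1, target)):
        for i, row in enumerate(matrix):
            edges.extend((i, j, label) for j, bit in enumerate(row) if bit)
    return edges


def patient_degrees(edges: List[Edge], label: int = 0) -> Counter:
    """Degree of each patient node, counting only edges with the given label."""
    return Counter(i for i, _, lab in edges if lab == label)


# Toy example with 2 referrals and 3 procedures (hypothetical values).
O = [[1, 0, 1], [0, 0, 0]]
Y = [[0, 1, 0], [1, 0, 0]]
edges = build_edges(O, Y)
message_edges = [e for e in edges if e[2] == 0]   # available during training
held_out_edges = [e for e in edges if e[2] == 1]  # used only at prediction time
degrees = patient_degrees(edges)
```

Keeping the label-1 edges in a separate list mirrors the constraint that the specialist's orders must not leak into message passing during training.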
2.2.2. Message Passing and Graph Attention
Figure 1 shows our proposed architecture. A fully connected layer with a hidden size of 128 was used to map each node feature vector to a pre-embedding vector. Distinct fully connected layers were used for each node type. Two message passing layers were used, each consisting of a dropout layer, a PReLU activation function, and a graph convolutional layer.
We used a custom heterogeneous graph attention layer with single-head attention, mostly following the structure of the original graph attention networks (Veličković et al., 2018), with the following modifications: 1) we applied fully connected layers with batch normalization to the node embeddings and the neighbor embeddings, and 2) we aggregated neighbor embeddings using the attention mechanism and concatenated the aggregated embedding to the current node's embedding. The concatenated vector is then passed into a fully connected layer that reduces it to a single output embedding, followed by a batch normalization operation. Equation (1) shows our message passing function and Equation (2) shows the GATConv update function.
Here, the attention term is the single-head GAT attention score calculated for υo, 𝒩(υ) is the set of neighbors of υ, and the feature term represents the node features of node υ.
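The aggregate-then-concatenate step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' layer: the attention score is a plain dot product standing in for the learned GAT scoring function, and the fully connected layers and batch normalization applied before and after this step are omitted. The name attend_and_concat is hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_and_concat(node: Vector, neighbors: List[Vector]) -> Vector:
    """Attention-weighted sum of neighbor embeddings, concatenated to the node.

    Assumption: dot-product scores replace the learned single-head GAT scores.
    """
    scores = [sum(a * b for a, b in zip(node, nb)) for nb in neighbors]
    weights = softmax(scores)  # one weight per neighbor, summing to 1
    aggregated = [
        sum(w * nb[k] for w, nb in zip(weights, neighbors))
        for k in range(len(node))
    ]
    return node + aggregated  # list concatenation: [node ; aggregated]


# Two identical neighbors receive equal attention weights of 0.5 each.
out = attend_and_concat([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]])
```

The output has twice the embedding dimension, which is why a further fully connected layer is needed to reduce it back to a single output embedding.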
The final prediction on the existence of an ‘ordered-with’ edge eij between two nodes is inferred by concatenating their node embeddings and passing the result through a fully connected two-layer perceptron, a batch normalization, a ReLU activation, and a final fully connected layer that outputs a 2-dimensional logit vector, which is converted to the final binary prediction using a softmax function.
In the formula for the link prediction head, BN refers to batch normalization; the first output value corresponds to the probability that the edge exists and the second to the probability that it does not.
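The last stage of the link prediction head can be sketched as below. This is a simplified illustration: it keeps only the concatenation, a single linear layer, and the softmax, and leaves out the two-layer perceptron, batch normalization, and ReLU. The weights, bias, and the function names edge_probabilities and predict_edge are hypothetical.

```python
import math
from typing import List, Tuple


def edge_probabilities(logits: List[float]) -> List[float]:
    """Softmax over the two logits: [p(edge exists), p(edge does not exist)]."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_edge(
    patient_emb: List[float],
    order_emb: List[float],
    weights: List[List[float]],  # shape 2 x (len(patient_emb) + len(order_emb))
    bias: List[float],           # shape 2
) -> Tuple[bool, float]:
    """Concatenate the embeddings, apply one linear layer, then softmax."""
    pair = patient_emb + order_emb
    logits = [sum(w * x for w, x in zip(row, pair)) + b
              for row, b in zip(weights, bias)]
    p_exists, p_absent = edge_probabilities(logits)
    return p_exists > p_absent, p_exists


# Hypothetical weights that only read the first patient coordinate.
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
exists, p = predict_edge([2.0, 0.0], [0.0, 0.0], W, [0.0, 0.0])
```

Because the two outputs sum to one, thresholding the first value at 0.5 is equivalent to comparing the two logits directly; other thresholds can be used to trade precision against recall.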
3. Experimental Results
We used transductive disjoint training with 1:4 positive:negative sampling using PyG (Fey and Lenssen, 2019; pyg). The model was trained for 400 epochs using the Adam optimizer with a learning rate of 1e-3 and a weight decay of 5e-4. Further, a dropout of 0.2 was used, and the pre-embedding, hidden, and final embedding sizes were all set to 128. Our GNN model was tested on predictions made on all ‘ordered-with’ edges between a patient and an order placed during the specialty visit.
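The 1:4 positive:negative sampling can be illustrated with a small stand-alone sketch. It is not PyG's sampler: it enumerates every patient-order pair, which is only practical for small graphs, and the function name sample_negatives and the toy sizes are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[int, int]  # (patient index, order index)


def sample_negatives(
    positives: List[Pair],
    num_patients: int,
    num_orders: int,
    ratio: int = 4,
    seed: int = 0,
) -> List[Pair]:
    """Draw `ratio` non-edges per positive edge, without replacement."""
    rng = random.Random(seed)
    positive_set = set(positives)
    candidates = [
        (p, o)
        for p in range(num_patients)
        for o in range(num_orders)
        if (p, o) not in positive_set
    ]
    # Cap the request when the graph has too few non-edges.
    k = min(ratio * len(positives), len(candidates))
    return rng.sample(candidates, k)


positives = [(0, 0), (1, 2)]
negatives = sample_negatives(positives, num_patients=5, num_orders=4)
```

Fixing the seed makes the sampled negatives reproducible across runs, which helps when comparing models on the same training pairs.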
Table 1 compares the prediction results of our proposed model (CR-GNN) with the baselines presented by Noshad et al. (2021), including a fully connected multi-layer neural network (Diagnostic Model), a collaborative filtering auto-encoder (AE), singular value decomposition (SVD), probabilistic matrix factorization (PMF), an aggregated neural network (Aggregated ANN), and an ensemble model (Ensemble Model) that uses a multi-layer neural network to combine the outputs of the diagnostic model, the collaborative filtering auto-encoder, and the specialists' identifiers as a separate input signal.
Our proposed model predicts endocrinology specialty procedures more effectively (ROC-AUC=0.91) than all models proposed by Noshad et al. (2021) (best ROC-AUC=0.80). Further, our model showed significantly higher precision at recalls of 0.5, 0.4 and 0.3 compared to all baseline models. Note that we used the same data as Noshad et al. (2021), except that we removed features related to the specialists that patients were referred to, because specialists' information can add bias to the model. Additionally, we trained and tested our proposed model using all features used in Noshad et al. (2021), including specialists' information. This did not significantly affect our model's performance. The model's ROC-AUC using all features including specialists' information was 0.912, and the precisions at recalls of 0.5, 0.4 and 0.3 were 0.60, 0.65, and 0.70, respectively.
Figure 2 compares the precision-recall curves of our proposed method with those of the baselines. The precision of the baseline models is higher than that of our proposed model toward the tail of the curves. However, our proposed model has higher precision than all baselines over a wide range of recalls, including recalls of 0.3, 0.4 and 0.5. This gives clinicians the ability to adjust our model based on their preferred precision-recall trade-off. Further, our proposed model has significantly higher precision than all baselines at recalls close to the recall of existing clinical guidelines and checklists.
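Precision at a fixed recall, the metric reported above, can be computed from ranked edge scores as in the following minimal sketch. It assumes binary labels and treats each rank position as a threshold, so tied scores are not grouped; the function name precision_at_recall and the toy scores are hypothetical.

```python
from typing import List


def precision_at_recall(
    scores: List[float], labels: List[int], target_recall: float
) -> float:
    """Best precision over rank cutoffs whose recall is at least target_recall."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    best = 0.0
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        # Only cutoffs that reach the required recall are eligible.
        if total_pos and true_pos / total_pos >= target_recall:
            best = max(best, true_pos / k)
    return best


scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
p_at_half = precision_at_recall(scores, labels, 0.5)
p_at_full = precision_at_recall(scores, labels, 1.0)
```

In the toy example the top-ranked edge is a true order, so half of the positives are recovered with no false positives, while recovering both positives requires accepting one false positive.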
4. Conclusion
In conclusion, embedding graph neural network models into clinical care can improve digital specialty consultation systems and expand the access to quality medical expertise.
There are some limitations in this work that should be considered before using our proposed model. The proposed model is limited to transductive learning and to patients' structured data.
Data Availability
The data include Stanford healthcare patient information and cannot be made publicly available.
APPENDIX A. Features
Footnotes
SAJJADF{at}STANFORD.EDU, FRG100{at}ALUMNI.STANFORD.EDU, HAMEDN{at}CS.STANFORD.EDU, NOSHAD{at}STANFORD.EDU, JIAXUAN{at}CS.STANFORD.EDU, ROK{at}CS.STANFORD.EDU, JURE{at}CS.STANFORD.EDU, JONC101{at}STANFORD.EDU