Patient-Specific Explanations for Predictions of Clinical Outcomes

Mohammadamin Tajgardoon; Malarkodi J. Samayamuthu; Luca Calzoni; Shyam Visweswaran

doi:10.1055/s-0039-1697907

CC BY 4.0 · ACI open 2019; 03(02): e88-e97
DOI: 10.1055/s-0039-1697907

Original Article

Georg Thieme Verlag KG Stuttgart · New York

Mohammadamin Tajgardoon¹, Malarkodi J. Samayamuthu², Luca Calzoni², Shyam Visweswaran¹,²

¹Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

²Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

Funding The research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under award number R01LM012095. The content of the paper is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the University of Pittsburgh.

Publication History

31 March 2018

07 August 2019

Publication Date:
10 November 2019 (online)

Abstract

Background Machine learning models that are used for predicting clinical outcomes can be made more useful by augmenting predictions with simple and reliable patient-specific explanations for each prediction.

Objectives This article evaluates the quality of explanations of predictions using physician reviewers. The predictions are obtained from a machine learning model that is developed to predict dire outcomes (severe complications including death) in patients with community-acquired pneumonia (CAP).

Methods Using a dataset of patients diagnosed with CAP, we developed a predictive model to predict dire outcomes. On a set of 40 patients, who were predicted to be either at very high risk or at very low risk of developing a dire outcome, we applied an explanation method to generate patient-specific explanations. Three physician reviewers independently evaluated each explanatory feature in the context of the patient's data and were instructed to disagree with a feature if they did not agree with the magnitude of support, the direction of support (supportive versus contradictory), or both.
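The review protocol described in Methods can be illustrated with a minimal sketch of how a majority-based agreement rate might be computed from three reviewers' votes. The function name, the data layout, and the example votes are hypothetical and are not taken from the study.

```python
from typing import Sequence


def majority_agreement_rate(votes: Sequence[Sequence[bool]]) -> float:
    """Proportion of explanatory features that a majority of reviewers agreed with.

    votes[i][j] is True if reviewer j agreed with explanatory feature i
    (assumed layout; a reviewer disagrees if the magnitude or the
    direction of support seems wrong).
    """
    if not votes:
        raise ValueError("no features to score")
    # A feature counts as agreed when more than half of its reviewers said yes.
    agreed = sum(1 for row in votes if sum(row) > len(row) / 2)
    return agreed / len(votes)


# Hypothetical votes from three reviewers on four explanatory features.
example_votes = [
    (True, True, True),
    (True, True, False),
    (False, True, False),
    (False, False, False),
]
rate = majority_agreement_rate(example_votes)
print(f"agreement rate = {rate:.2f}")
```

With three reviewers, a feature is counted as agreed when at least two of them accept it, so ties cannot occur.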

Results The model used for generating predictions achieved an F1 score of 0.43 and an area under the receiver operating characteristic curve (AUROC) of 0.84 (95% confidence interval [CI]: 0.81–0.87). Interreviewer agreement between two reviewers was strong (Cohen's kappa coefficient = 0.87) and fair to moderate between the third reviewer and others (Cohen's kappa coefficient = 0.49 and 0.33). Agreement rates between reviewers and generated explanations, defined as the proportion of explanatory features with which a majority of reviewers agreed, were 0.78 for actual explanations and 0.52 for fabricated explanations, and the difference between the two agreement rates was statistically significant (Chi-square = 19.76, p-value < 0.01).
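The two statistics reported in Results, Cohen's kappa for a pair of reviewers and a chi-square test comparing two agreement rates, can be sketched as follows. This is an illustration only: the ratings and the 2×2 counts are hypothetical, and the abstract does not state whether a continuity correction was applied, so the sketch assumes the uncorrected Pearson statistic with one degree of freedom.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(r1: Sequence[Hashable], r2: Sequence[Hashable]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters on the same items."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(r1)
    # Observed agreement: fraction of items rated identically.
    p_o = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from each rater's marginal label frequencies.
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Uncorrected Pearson chi-square and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the survival function is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical agree (1) / disagree (0) ratings from two reviewers.
reviewer_a = [1, 1, 0, 0, 1, 0, 1, 1]
reviewer_b = [1, 1, 0, 1, 1, 0, 0, 1]
kappa = cohen_kappa(reviewer_a, reviewer_b)

# Hypothetical counts: rows = actual / fabricated explanations,
# columns = features agreed / not agreed by the majority.
chi2, p_value = chi_square_2x2(30, 10, 20, 20)
print(f"kappa = {kappa:.3f}, chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be much lower than the raw proportion of matching ratings when one label dominates.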

Conclusion There was good agreement among physician reviewers on patient-specific explanations that were generated to augment predictions of clinical outcomes. Such explanations can be useful in interpreting predictions of clinical outcomes.

Keywords

predictive model - patient-specific explanation - machine learning - clinical decision support system

Protection of Human and Animal Subjects

All research activities reported in this publication were reviewed and approved by the University of Pittsburgh’s Institutional Review Board.
