Network graph representation of COVID-19 scientific publications to aid knowledge discovery

BMJ Health Care Inform. 2021 Jan;28(1):e100254. doi: 10.1136/bmjhci-2020-100254.

Abstract

Introduction: Numerous scientific journal articles related to COVID-19 have been published rapidly, making it difficult to navigate the literature and understand the relationships within it.

Methods: A graph network was constructed from the publicly available COVID-19 Open Research Dataset (CORD-19) of COVID-19-related publications. An engine leveraging medical knowledge bases was used to identify discrete medical concepts, and an open-source tool (Gephi) was used to visualise the network.
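
The construction step can be illustrated with a minimal Python sketch. It assumes that two concepts are connected when they appear in the same publication's title or abstract; the abstract does not state how edges were defined, so this co-occurrence rule is an assumption. The function name `build_cooccurrence` and the sample concept sets are hypothetical, and the concept-extraction engine itself is not modelled.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(papers):
    """Count concepts and concept pairs across publications.

    `papers` is an iterable of collections of concept strings, one per
    publication (hypothetical output of a concept-extraction step).
    Assumption: an edge links two concepts that occur in the same paper.
    """
    node_counts = Counter()  # concept -> number of publications containing it
    edge_counts = Counter()  # (concept_a, concept_b) -> number of shared publications
    for concepts in papers:
        unique = sorted(set(concepts))  # de-duplicate and fix pair ordering
        node_counts.update(unique)
        edge_counts.update(combinations(unique, 2))
    return node_counts, edge_counts


# Illustrative input only; not taken from CORD-19.
papers = [
    {"COVID-19", "pneumonia", "remdesivir"},
    {"COVID-19", "pneumonia"},
    {"COVID-19", "intubation"},
]
node_counts, edge_counts = build_cooccurrence(papers)
```

Under this assumption, `node_counts` gives the per-term publication counts and `edge_counts` gives the edge weights, which could then be exported as node and edge tables for a visualisation tool.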

Results: The network shows connections between diseases, medications and procedures identified from the titles and abstracts of 195 958 COVID-19-related publications (CORD-19 Dataset). Connections between terms with few publications, those unconnected to the main network and those deemed irrelevant were not displayed. Nodes were coloured by knowledge base and sized according to the number of publications containing the term. The data set and visualisations were made publicly accessible via a webtool.
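
The filtering described in the results can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the threshold `min_publications` is a hypothetical parameter (the abstract gives no cut-off), "main network" is interpreted as the largest connected component, and the removal of irrelevant terms is not modelled.

```python
from collections import defaultdict, deque


def filter_network(node_counts, edges, min_publications):
    """Return the terms that would remain displayed.

    Assumptions: terms with fewer than `min_publications` publications are
    dropped, and the "main network" is the largest connected component.
    """
    kept = {term for term, count in node_counts.items() if count >= min_publications}

    # Build an undirected adjacency map restricted to the kept terms.
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in kept and b in kept:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Breadth-first search to find the largest connected component.
    seen = set()
    largest = set()
    for start in kept:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(largest):
            largest = component
    return largest


# Illustrative counts and edges only; not taken from CORD-19.
node_counts = {
    "COVID-19": 5, "pneumonia": 3, "remdesivir": 2,
    "rare term": 1, "asthma": 2, "inhaler": 2,
}
edges = [
    ("COVID-19", "pneumonia"),
    ("COVID-19", "remdesivir"),
    ("pneumonia", "rare term"),
    ("asthma", "inhaler"),
]
main_network = filter_network(node_counts, edges, min_publications=2)
```

In this toy example, "rare term" is dropped for having too few publications, and the "asthma"/"inhaler" pair is dropped because it is disconnected from the largest component. The publication counts of the remaining terms could then drive node size in the visualisation.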

Conclusion: Knowledge management approaches (text mining and graph networks) can enable rapid navigation and exploration of entity inter-relationships to improve understanding of diseases such as COVID-19.

Keywords: BMJ health informatics; health care; information science; medical informatics.

MeSH terms

  • Artificial Intelligence*
  • COVID-19 / epidemiology*
  • Humans
  • Knowledge Discovery / methods*
  • Natural Language Processing
  • Periodicals as Topic / statistics & numerical data*
  • SARS-CoV-2