Quantitative and Qualitative evaluation of the recent Artificial Intelligence in Healthcare publications using Deep-Learning

Raghav Awasthi; Shreya Mishra; Jacek B Cywinski; Kamal Maheshwari; Francis A. Papay; Piyush Mathur

doi:10.1101/2022.12.31.22284092

Abstract

Background An ever-increasing number of artificial intelligence (AI) models targeting healthcare applications are developed and published every day, but their use in real-world decision-making is limited. Beyond a quantitative assessment, it is important to have a qualitative evaluation of the maturity of these publications, with additional detail on trends in the types of data used and the types of models developed across the healthcare spectrum.

Methods We assessed the maturity of selected peer-reviewed AI publications pertinent to healthcare for the years 2019–2021. Data were collected through a PubMed search combining the terms “machine learning” OR “artificial intelligence” with the years “2021”, “2020”, or “2019”, restricted to English-language, human-subject research as of December 31 of each year. Publications from all three years were manually classified into 34 distinct medical specialties. We used a Bidirectional Encoder Representations from Transformers (BERT) neural network model to identify the maturity level of research publications based on their abstracts. We further classified mature publications by healthcare specialty and by the geographical location of the article’s senior author. Finally, we manually annotated specific details from mature publications, such as model type, data type, and disease type.
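As an illustration of the search strategy described in the Methods, the following minimal sketch assembles one query string per year. The function name `build_pubmed_query` and the field tags `[dp]`, `[la]`, and `[mh]` are assumptions made for this example; the abstract does not state the exact query syntax that was submitted to PubMed.

```python
def build_pubmed_query(year: int) -> str:
    """Assemble a PubMed search string for one publication year (illustrative only)."""
    # Topic terms quoted in the Methods, grouped so that OR binds before AND.
    topic = '("machine learning" OR "artificial intelligence")'
    # Assumed rendering of the stated filters: publication year,
    # English language, and human-subject research.
    filters = [f"{year}[dp]", "english[la]", "humans[mh]"]
    return " AND ".join([topic, *filters])


# One query per year covered by the study.
queries = {year: build_pubmed_query(year) for year in (2019, 2020, 2021)}

for year, query in queries.items():
    print(year, query)
```

Grouping the two topic terms in parentheses makes the intended precedence explicit, so the year and filter terms apply to both of them.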

Results Of the 7062 publications relevant to AI in healthcare from 2019–2021, 385 were classified as mature. The share of mature publications was 6.01 percent in 2019, 7.7 percent in 2020, and 1.81 percent in 2021. Radiology had the most mature publications of any specialty in each of the three years, followed by pathology in 2019, ophthalmology in 2020, and gastroenterology in 2021. Geographical analysis revealed a non-uniform distribution. In 2019 and 2020, the United States ranked first with 22 and 50 mature articles, respectively, followed by China with 20 and 47. In 2021, China ranked first with 17 mature articles, followed by the United States with 11. Imaging-based data was the primary data source, and deep learning was the most frequently used modeling technique in mature publications.
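The counts reported in the Results can be combined into two summary figures that the abstract does not state directly: the overall share of mature publications and the three-year totals for the two leading countries. The sketch below uses only numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the Results.
total_publications = 7062
mature_publications = 385

# Overall share of mature publications across 2019-2021, in percent.
overall_share = 100 * mature_publications / total_publications

# Mature articles per year for the two leading countries.
mature_by_country = {
    "United States": {2019: 22, 2020: 50, 2021: 11},
    "China": {2019: 20, 2020: 47, 2021: 17},
}

# Three-year totals per country.
totals = {
    country: sum(per_year.values())
    for country, per_year in mature_by_country.items()
}

print(f"Overall mature share: {overall_share:.2f}%")
print(totals)
```

Under these figures the overall share is about 5.45 percent, and the three-year totals for the two countries differ by a single article (83 for the United States, 84 for China).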

Interpretation Despite the growing number of publications of AI models in healthcare, only a few were found to be mature, with a potentially positive impact on healthcare. Globally, there is an opportunity to leverage diverse datasets and models across the health spectrum to develop more mature models and related publications, which can fully realize the potential of AI to transform healthcare.

Evidence Before Study The number of publications related to AI in healthcare is increasing across specialties, but there has been limited assessment of the maturity of these publications and little methodological analysis of their key characteristics. We performed a PubMed search using combinations of the keywords “maturity” or “evaluation” AND “AI in healthcare”, restricted to the English language and the past ten years of publications, and found 15 relevant publications. Six focused on proposing a qualitative framework for evaluating AI models, including one article proposing an evaluation framework for prediction models and one article focusing on health economic evaluations of AI in healthcare models. The remaining publications were related to the usability of AI models. Few studies assess the maturity of AI in healthcare publications while also providing detailed insights into key compositional factors such as data types, model types, and geographical trends across healthcare specialties.

The added value of this Study Against an exponentially increasing number of publications, this is, to our knowledge, the first study to provide a method for, and a comprehensive quantitative and qualitative evaluation of, recent mature “AI in Healthcare” publications. This study builds on a semi-automated approach that combines deep learning with a unique in-house collection of “AI in Healthcare” publications from the most recent three years to highlight the current state of AI in healthcare. The full spectrum of data types, model types, geographical trends, and disease types represented in the mature publications is presented empirically, which provides unique insights.

Implications of all the available evidence This thorough, comparative evaluation of mature publications across healthcare specialties provides evidence that can guide future research and resource utilization. Results from this study show that the percentage of mature publications in all other healthcare specialties is much lower than in radiology. Text and tabular data are also underrepresented compared to image data in mature publications. Geographical trends in these publications also show gaps in inclusivity and the need to provide resources to support AI in healthcare research globally. Publications using deep learning models account for the highest frequency of mature articles. Our detailed analysis of mature AI in healthcare publications demonstrates an opportunity to leverage heterogeneous datasets and models across the health spectrum to increase the yield of mature AI in healthcare publications.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

We used open-source data for our study.

https://www.researchgate.net/publication/358897338_Artificial_Intelligence_in_Healthcare_2021_Year_in_Review

https://www.researchgate.net/publication/349570341_Artificial_Intelligence_in_Healthcare_2020_Year_in_Review

https://www.researchgate.net/publication/340926403_2019_YEAR_IN_REVIEW_MACHINE_LEARNING_IN_HEALTHCARE

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.