II. Abstract
Brain MRI scans and chest X-ray imaging are pivotal in diagnosing and managing neurological and respiratory diseases, respectively. Despite their importance in diagnosis, datasets for training artificial intelligence (AI) models for automated diagnosis remain scarce. Annotated chest X-ray datasets are a case in point, especially those containing rare or abnormal cases such as bacterial pneumonia. Conventional dataset collection methods are labor-intensive and costly, which exacerbates the scarcity. To overcome these challenges, we propose a specialized Generative Adversarial Network (GAN) architecture for generating synthetic chest X-ray data representing healthy lungs and various pneumonia conditions, including viral and bacterial pneumonia. We also extended our experiments to brain MRI scans by simply swapping the training dataset, demonstrating that our GAN approach applies across different medical imaging contexts. Our method aims to streamline data collection and labeling while addressing privacy concerns associated with patient data. We demonstrate the effectiveness of synthetic data in facilitating the development and evaluation of machine learning algorithms, in particular with an EfficientNet v2 model. Through comprehensive experimentation, we evaluate our approach on both real and synthetic datasets, showcasing the potential of synthetic data augmentation to improve disease classification accuracy across diverse pathological conditions. In particular, on the brain MRI classification task, the classifier trained on a combination of synthetic and real data achieves the highest accuracy, 85.9%. Our findings underscore the promising role of synthetic data in advancing automated diagnosis and treatment planning for pneumonia, other respiratory conditions, and brain pathologies.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
https://www.kaggle.com/datasets/praneet0327/brain-tumor-dataset
https://www.news-medical.net/health/Viral-vs-Bacterial-Pneumonia.aspx