RT Journal Article
SR Electronic
T1 Optimizing Ocular Pathology Classification with CNNs and OCT Imaging: A Systematic and Performance Review
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.06.18.24309070
DO 10.1101/2024.06.18.24309070
A1 Hauri-Rosales, Walter
A1 Pérez, Oswaldo
A1 Garcia-Roa, Marlon
A1 López-Star, Ellery
A1 Olivares-Pinto, Ulises
YR 2024
UL http://medrxiv.org/content/early/2024/06/19/2024.06.18.24309070.abstract
AB Vision loss due to chronic-degenerative diseases is a primary cause of blindness worldwide. Deep learning architectures utilizing optical coherence tomography images have proven effective for the early diagnosis of ocular pathologies. Nevertheless, most studies have emphasized the best outcomes using optimal hyperparameter combinations and extensive data availability. This focus has eclipsed the exploration of how model learning capacity varies with different data volumes. The current study evaluates the learning capabilities of efficient deep-learning classification models across various data amounts, aiming to determine the necessary data portion for effective clinical trial classifications of ocular pathologies. A comprehensive review was conducted, which included 295 papers that employed OCT images to classify one or more of the following retinal pathologies: Drusen, Diabetic Macular Edema, and Choroidal Neovascularization. Performance metrics and dataset details were extracted from these studies. Four Convolutional Neural Networks were selected and trained using three strategies: initializing with random weights, fine-tuning, and retraining only the classification layers. The resultant performance was compared based on training size and strategy to identify the optimal combination of model size, dataset size, and training approach. The findings revealed that, among the models trained with various strategies and data volumes, three achieved 99.9% accuracy, precision, recall, and F1 score.
Two of these models were fine-tuned, and one used random weight initialization. Remarkably, two models reached 99% accuracy using only 10% of the original training dataset. Additionally, a model that was less than 10% of the size of the others achieved 98.7% accuracy and F1 score on the test set while requiring 100 times less computing time. This study is the first to assess the impact of training data size and model complexity on performance metrics across three scenarios: random weight initialization, fine-tuning, and retraining classification layers only, specifically utilizing optical coherence tomography images. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: Yes. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: N/A. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. https://github.com/HpcDataLab/DL_ENESJ_IMO/