PT - JOURNAL ARTICLE
AU - Ahmed, Syed Rakin
AU - Befano, Brian
AU - Lemay, Andreanne
AU - Egemen, Didem
AU - Rodriguez, Ana Cecilia
AU - Angara, Sandeep
AU - Desai, Kanan
AU - Jeronimo, Jose
AU - Antani, Sameer
AU - Campos, Nicole
AU - Inturrisi, Federica
AU - Perkins, Rebecca
AU - Kreimer, Aimee
AU - Wentzensen, Nicolas
AU - Herrero, Rolando
AU - Pino, Marta del
AU - Quint, Wim
AU - de Sanjose, Silvia
AU - Schiffman, Mark
AU - Kalpathy-Cramer, Jayashree
TI - Reproducible And Clinically Translatable Deep Neural Networks For Cervical Screening
AID - 10.1101/2022.12.17.22282984
DP - 2022 Jan 01
TA - medRxiv
PG - 2022.12.17.22282984
4099 - http://medrxiv.org/content/early/2022/12/20/2022.12.17.22282984.short
4100 - http://medrxiv.org/content/early/2022/12/20/2022.12.17.22282984.full
AB - Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. Published AI reports have exhibited overfitting, lack of portability, and unrealistic, near-perfect performance estimates. To surmount recognized issues, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-institutional dataset of 9,462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance.
The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study did not receive any external funding.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics committee/IRB of the National Cancer Institute (NCI), National Institutes of Health (NIH) and within each institution/country where data/images were collected gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
All data produced in the present work are contained in the manuscript.