RT Journal Article
SR Electronic
T1 Species distribution modeling for disease ecology: a multi-scale case study for schistosomiasis host snails in Brazil
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.07.10.23292488
DO 10.1101/2023.07.10.23292488
A1 Singleton, Alyson L.
A1 Glidden, Caroline K.
A1 Chamberlin, Andrew J.
A1 Tuan, Roseli
A1 Palasio, Raquel G. S.
A1 Pinter, Adriano
A1 Caldeira, Roberta L.
A1 Mendonça, Cristiane L. F.
A1 Carvalho, Omar S.
A1 Monteiro, Miguel V.
A1 Athni, Tejas S.
A1 Sokolow, Susanne H.
A1 Mordecai, Erin A.
A1 De Leo, Giulio A.
YR 2023
UL http://medrxiv.org/content/early/2023/07/12/2023.07.10.23292488.abstract
AB Species distribution models (SDMs) are increasingly popular tools for profiling disease risk in ecology, particularly for infectious diseases of public health importance that include an obligate non-human host in their transmission cycle. SDMs can create high-resolution maps of host distribution across geographical scales, reflecting baseline risk of disease. However, as SDM computational methods have rapidly expanded, many methodological questions remain outstanding. Here we address key questions about SDM application, using schistosomiasis risk in Brazil as a case study. Schistosomiasis, a debilitating parasitic disease of poverty affecting over 200 million people across Africa, Asia, and South America, is transmitted to humans through contact with the free-living infectious stage of Schistosoma spp. parasites released from freshwater snails, the parasite’s obligate intermediate hosts. In this study, we compared snail SDM performance across machine learning (ML) approaches (MaxEnt, Random Forest, and Boosted Regression Trees), geographic extents (national, regional, and state), types of presence data (expert-collected and publicly available), and snail species (Biomphalaria glabrata, B. tenagophila, and B. straminea).
We used high-resolution (1 km) climate, hydrology, land-use/land-cover (LULC), and soil property data to describe the snails’ ecological niche and evaluated models on multiple criteria. Although all ML approaches produced comparable spatially cross-validated performance metrics, their suitability maps showed major qualitative differences that required validation based on local expert knowledge. Additionally, our findings revealed that the importance of LULC and bioclimatic variables varied across snail species and spatial scales. Finally, we found that models using publicly available data predicted snail distribution with AUC values comparable to those of models using expert-collected data. This work serves as an instructional guide to SDM methods that can be applied to a range of vector-borne and zoonotic diseases. In addition, it advances our understanding of the relevant environmental and bioclimatic determinants of schistosomiasis risk in Brazil.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by the Belmont collaborative Forum on Climate, Environment, and Health (NSF-ICER 2024383) and by FAPESP (2019/23593-3) in a collaborative research action with Belmont Forum. GDL and EM are also supported by the Stanford Center for Innovation in Global Health. GDL is also partially supported by NSF-DEB DEB-2011179 (EEID). EM is also supported by grants from the National Science Foundation (DEB-2011147, with Fogarty International Center), the National Institutes of Health (R35GM133439, R01AI168097, R01AI102918), and the Woods Institute for the Environment. TA is supported by the National Institute of General Medical Sciences (T32GM144273). RP is supported by FAPESP (2021/10212-1).
The authors received no specific funding for this work.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: N/A
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
All data and code used to construct the models described in this manuscript will be made publicly available upon publication.