RT Journal Article
SR Electronic
T1 A satellite-based spatio-temporal machine learning model to reconstruct daily PM2.5 concentrations across Great Britain
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.07.19.20157396
DO 10.1101/2020.07.19.20157396
A1 Santos, Rochelle Schneider dos
A1 Vicedo-Cabrera, Ana M.
A1 Sera, Francesco
A1 Masselot, Pierre
A1 Stafoggia, Massimo
A1 de Hoogh, Kees
A1 Kloog, Itai
A1 Reis, Stefan
A1 Vieno, Massimo
A1 Gasparrini, Antonio
YR 2020
UL http://medrxiv.org/content/early/2020/09/17/2020.07.19.20157396.abstract
AB Epidemiological studies on health effects of air pollution usually rely on measurements from fixed ground monitors, which provide limited spatio-temporal coverage. Data from satellites, reanalysis and chemical transport models offer additional information used to reconstruct pollution concentrations at high spatio-temporal resolution. The aim of this study is to develop a multi-stage satellite-based machine learning model to estimate daily fine particulate matter (PM2.5) levels across Great Britain during 2008-2018. This high-resolution model consists of random forest (RF) algorithms applied in four stages. Stage-1 augments monitor-PM2.5 series using co-located PM10 measures. Stage-2 imputes missing satellite aerosol optical depth observations using atmospheric reanalysis models. Stage-3 integrates the output from previous stages with spatial and spatiotemporal variables to build a prediction model for PM2.5. Stage-4 applies Stage-3 models to estimate daily PM2.5 concentrations over a 1 km grid. The RF architecture performed well in all stages, with results from Stage-3 showing an average cross-validated R2 of 0.767 and minimal bias. The model performed better over the temporal scale when compared to the spatial component, but both presented good accuracy with an R2 of 0.795 and 0.658, respectively.
The high spatio-temporal resolution and relatively high precision allow this dataset (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposures to PM2.5.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This study was supported by the Medical Research Council-UK (Grant ID: MR/M022625/1), the Natural Environment Research Council UK (Grant ID: NE/R009384/1), and the European Union's Horizon 2020 Project Exhaustion (Grant ID: 820655). EMEP4UK Model results and contributions by S.R. and M.V. were supported by award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study is based on publicly available data and does not make use of sensitive data.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
All data used to perform the analysis are in the public domain.
References and sources are provided in the text.