RT Journal Article SR Electronic T1 Using early detection data to estimate the date of emergence of an epidemic outbreak JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2023.01.09.23284284 DO 10.1101/2023.01.09.23284284 A1 Jijón, S. A1 Czuppon, P. A1 Blanquart, F. A1 Débarre, F. YR 2023 UL http://medrxiv.org/content/early/2023/11/15/2023.01.09.23284284.abstract AB While the first infection of an emerging disease is often unknown, information on early cases can be used to date it. In the context of the COVID-19 pandemic, previous studies have estimated dates of emergence (e.g., first human SARS-CoV-2 infection, emergence of the Alpha SARS-CoV-2 variant) using mainly genomic data. Another dating attempt used a stochastic population dynamics approach and the date of the first reported case. Here, we extend this approach to use a larger set of early reported cases to estimate the delay from first infection to the Nth case. We first validate our model using data on Alpha variant infections in the UK, dating the first Alpha infection at (median) August 21, 2020 (95% interquantile range across retained simulations, IqR: July 23–September 5, 2020). Next, we apply our model to data on COVID-19 cases with symptom onset before mid-January 2020. We date the first SARS-CoV-2 infection in Wuhan at (median) November 28, 2019 (95% IqR: November 2–December 9, 2019). Our results fall within ranges previously estimated by studies relying on genomic data. Our population dynamics-based modelling framework is generic and flexible, and thus can be applied to estimate the starting time of outbreaks in contexts other than COVID-19. Author summary: While the first infection of an emerging disease is often unknown, information on early cases can be used to date it. In the context of the COVID-19 pandemic, previous studies have estimated dates of emergence of epidemic outbreaks (e.g., first human SARS-CoV-2 infection, emergence of the Alpha SARS-CoV-2 variant) using mainly genomic data. 
Another dating attempt used a population-level stochastic approach and the date of the first reported case. Here, we extend this generic and flexible approach to use a larger set of early reported cases to estimate the time elapsed between the first infection and the Nth case. Our model dates the first Alpha infection at around August 21, 2020, and the first SARS-CoV-2 infection in Wuhan at around November 28, 2019. Our findings fall within ranges previously estimated by studies relying on genomic data. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: SJ's postdoctoral fellowship was funded by a grant from the MODCOV19 platform of the National Institute of Mathematical Sciences and their Interactions (Insmi, CNRS) to FD. FD was funded by ANR-19-CE45-0009 (TheoGeneDrive). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study used ONLY openly available human data. Data on early Alpha cases were retrieved from the Global Initiative on Sharing Avian Influenza Data (GISAID), available at doi.org/10.55876/gis8.230104xg. For comparison to our results, V. Hill and J. 
Pekar shared their previously published results, available at doi.org/10.1093/ve/veac080 and doi.org/10.1126/science.abp8337, respectively. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data and code needed to reproduce our results and the corresponding figures are available in a public GitHub repository: https://github.com/sjijon/estimate-emergence-from-data