Abstract
Background Mounting evidence suggests that there is an undetected pool of asymptomatic but infectious COVID-19 cases. Estimating the number of asymptomatic infections is crucial for understanding the virus and containing its spread, but these infections are hard to count accurately.
Methods We propose a machine learning based fine-grained simulator (MLSim) that integrates multiple practical factors, including disease progression during the incubation period, cross-region population movement, undetected asymptomatic patients, and the strength of prevention and containment measures. The interactions among these factors are modeled by virtual transmission dynamics with several undetermined parameters, which are learned from epidemic data using machine learning techniques. Once MLSim matches the real data closely, it also yields an estimate of the number of asymptomatic patients. MLSim is learned from open Chinese and global epidemic data.
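The idea of learning simulator parameters from confirmed-case data, and then reading off the hidden asymptomatic count, can be illustrated with a minimal sketch. This is not the authors' MLSim: the two-compartment toy model, the random-search fitting, and every name and constant (`simulate`, `fit`, `detect`, `recover`, the parameter ranges) are assumptions made for illustration, and the "observed" series is synthetic.

```python
import random

def simulate(beta, asym_frac, days, contain_day, contain_strength,
             i0=10.0, recover=0.1, detect=0.2):
    """Toy discrete-time model (hypothetical, not MLSim).

    Returns two lists: cumulative confirmed cases and cumulative
    asymptomatic infections, one value per day.
    """
    sym, asym = i0, 0.0          # active, not-yet-removed infections
    confirmed_total, asym_total = 0.0, 0.0
    confirmed_series, asym_series = [], []
    for day in range(days):
        # Containment reduces the transmission rate from contain_day onward.
        rate = beta * (1.0 - contain_strength) if day >= contain_day else beta
        new_inf = rate * (sym + asym)
        new_asym = asym_frac * new_inf
        new_sym = new_inf - new_asym
        detected = detect * sym   # symptomatic cases get confirmed
        healed = recover * asym   # asymptomatic cases self-heal undetected
        sym += new_sym - detected
        asym += new_asym - healed
        confirmed_total += detected
        asym_total += new_asym
        confirmed_series.append(confirmed_total)
        asym_series.append(asym_total)
    return confirmed_series, asym_series

def loss(params, observed, contain_day):
    """Squared error between simulated and observed confirmed counts."""
    beta, asym_frac, strength = params
    sim, _ = simulate(beta, asym_frac, len(observed), contain_day, strength)
    return sum((s - o) ** 2 for s, o in zip(sim, observed))

def fit(observed, contain_day, trials=2000, seed=0):
    """Random search over (beta, asym_frac, contain_strength)."""
    rng = random.Random(seed)
    best, best_loss = None, float("inf")
    for _ in range(trials):
        cand = (rng.uniform(0.05, 0.6), rng.uniform(0.0, 0.9), rng.uniform(0.0, 1.0))
        cand_loss = loss(cand, observed, contain_day)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best, best_loss

# Synthetic "observed" data generated from known parameters (illustration only).
observed, _ = simulate(0.3, 0.5, 40, 15, 0.8)
best, best_loss = fit(observed, contain_day=15)
# The fitted simulator also produces an asymptomatic estimate as a by-product.
_, asym_est = simulate(best[0], best[1], 40, 15, best[2])
```

In this toy setting the asymptomatic fraction is only weakly constrained by confirmed counts alone, which is one reason a real simulator would need the additional factors listed above (incubation dynamics, population movement) to narrow the estimate.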
Findings MLSim showed better forecast accuracy than the SEIR and LSTM-based prediction models. The MLSim learned from the data of China’s mainland suggests that there could have been 150,408 (142,178 – 157,417) asymptomatic patients who had self-healed, which is 65% (64% – 65%) of the inferred total infections, including undetected ones. The numbers of asymptomatic but infectious patients on April 15, 2020, were inferred as follows: Italy, 41,387 (29,037 – 57,151); Germany, 21,118 (11,484 – 41,646); USA, 354,657 (277,641 – 495,128); France, 40,379 (10,807 – 186,878); and UK, 144,424 (127,215 – 171,930). The containment measures taken by the government were crucial to controlling virus transmission. The learned MLSim also indicates that if the containment measures in China’s mainland had been postponed by 1, 3, 5, or 7 days beyond Jan. 23, there would have been 109,039 (129%), 183,930 (218%), 313,342 (371%), or 537,555 (637%) confirmed cases, respectively, on June 12.
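As a quick consistency check on the delayed-containment figures, the sketch below back-computes the baseline case count implied by each scenario. It assumes that each percentage is the ratio of the counterfactual count to a single common baseline (the confirmed cases on June 12 with containment starting on Jan. 23); the abstract does not state this explicitly, and the variable names are illustrative.

```python
# Delay in days -> (counterfactual confirmed cases on June 12, percent of baseline),
# copied from the Findings paragraph.
scenarios = {
    1: (109_039, 129),
    3: (183_930, 218),
    5: (313_342, 371),
    7: (537_555, 637),
}

# Assumption: percent = 100 * counterfactual / baseline, with one shared baseline.
implied_baseline = {
    delay: cases / (pct / 100.0) for delay, (cases, pct) in scenarios.items()
}

for delay, baseline in implied_baseline.items():
    print(f"delay {delay} d: implied baseline ~ {baseline:,.0f} cases")

# Spread between the largest and smallest implied baseline.
spread = max(implied_baseline.values()) - min(implied_baseline.values())
```

Under this assumption, all four scenarios imply a baseline between 84,000 and 85,000 confirmed cases, so the reported counts and percentages are mutually consistent up to rounding of the percentages.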
Conclusions Machine learning based fine-grained simulators can better model the complex real-world disease transmission process and can thus support decisions on balanced containment measures. The simulator also revealed a potentially large number of undetected asymptomatic infections, which poses a great risk to containing the virus.
Funding National Natural Science Foundation of China.
Competing Interest Statement
The authors have declared no competing interest.
Author Declarations
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data used in this paper are publicly available.