ABSTRACT
The U.S. needs early warning systems to help it contain the spread of infectious diseases. Conventional early warning systems use lab-test results or dynamic records to signal early warning signs. New early warning systems can supplement these data with indicators of public awareness like news articles and search queries. This study aims to explore the potential of utilizing social media data to enhance early warning of the COVID-19 outbreak. To demonstrate the feasibility, this study conducts a retrospective analysis and investigates more than 14 million related Twitter postings in the date range from January 20 to March 10, 2020. With the aid of natural language processing tools and machine learning classifiers, this study classifies each of these tweets into either a signal or a non-signal. In this study, a “signal” tweet implies that the user recognized the COVID-19 outbreak risk in the U.S. This study then proposes a parameter “signal ratio” to signal warning signs of the COVID-19 pandemic over periods. Results reveal that social media data and the signal ratio can detect the hazards ahead of the COVID-19 outbreak. This claim has been validated with a leading time of 16 days through the comparison to other referenced methods based on Google trends or media news.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
There is no funding for this research.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
There is no need for exemption
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Paper in collection COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv
The Chan Zuckerberg Initiative, Cold Spring Harbor Laboratory, the Sergey Brin Family Foundation, California Institute of Technology, Centre National de la Recherche Scientifique, Fred Hutchinson Cancer Center, Imperial College London, Massachusetts Institute of Technology, Stanford University, University of Washington, and Vrije Universiteit Amsterdam.