Abstract
The current development of vaccines for SARS-CoV-2 is unprecedented. Little is known, however, about the nuanced public opinions on the coming vaccines. We adopt a human-guided machine learning framework (using more than 40,000 rigorously selected tweets from more than 20,000 distinct Twitter users) to capture public opinions on the potential vaccines for SARS-CoV-2, classifying them into three groups: pro-vaccine, vaccine-hesitant, and anti-vaccine. We aggregate opinions at the state and country levels, and find that the major changes in the percentages of the different opinion groups roughly correspond to major pandemic-related events. Interestingly, the percentage of the pro-vaccine group is lower in the Southeastern United States. Using multinomial logistic regression, we compare the three groups on demographics, social capital, income, religious status, political affiliations, geo-locations, sentiment of personal pandemic experience and non-pandemic experience, and county-level pandemic severity perception to investigate the scope and causes of public opinions on vaccines. We find that socioeconomically disadvantaged groups are more likely to hold polarized opinions on potential COVID-19 vaccines. Anti-vaccine opinion is strongest among people who have the worst personal pandemic experience. Next, by conducting counterfactual analyses, we find that the U.S. public is most concerned about the safety, effectiveness, and political issues regarding potential vaccines for COVID-19, and that improving personal pandemic experience increases the vaccine acceptance level. We believe this is the first large-scale social media-based study to analyze public opinions on potential COVID-19 vaccines, and that its findings can inform more effective vaccine distribution policies and strategies.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
No external funding was received.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The number of COVID-19 cases in the US and the Twitter data are publicly available to anyone with internet access.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
* hlyu5{at}ur.rochester.edu, jluo{at}cs.rochester.edu
9 The capitalization of non-hashtag keywords does not matter in the Tweepy query.
12 Joe Biden was the presidential candidate when the data were collected.
13 Due to limitations of the Twitter API, only about half of Donald Trump’s follower IDs were crawled.
Data Availability
Data sharing is not applicable.