ABSTRACT
We use functional data analysis techniques to investigate patterns of COVID-19 positivity and mortality in the US and their associations with Google search trends for COVID-19-related symptoms. Specifically, we represent state-level time series of COVID-19 outcomes and of Google symptom search trends as smoothed functional curves. Given these functional data, we explore their modes of variation using functional principal component analysis (FPCA). We also apply functional clustering analysis to identify patterns of COVID-19 confirmed case and death trajectories across the US. Moreover, we quantify the associations between Google COVID-19 symptom search trends and COVID-19 confirmed case and death trajectories using dynamic correlation. Finally, we examine the dynamics of these correlations for the top nine Google symptom search trends associated with both COVID-19 confirmed case and death trajectories. Our results reveal and characterize distinct patterns of COVID-19 spread and mortality across the US. The dynamics of the correlations suggest that it is feasible to use Google queries to forecast COVID-19 cases and mortality up to three weeks in advance. Our results and analysis framework set the stage for developing predictive models that forecast COVID-19 confirmed cases and deaths from historical data and Google search trends for the nine symptoms associated with both outcomes.
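As a rough illustration of the steps summarized above, the following minimal sketch smooths discretized curves, extracts modes of variation, and measures a lead-lag association. It is not the authors' implementation and rests on several assumptions: the curves are synthetic, a moving average stands in for basis-function smoothing, FPCA is approximated by an SVD of curves sampled on a common daily grid, and a lagged Pearson correlation is used as a simpler stand-in for dynamic correlation. The function names `smooth`, `fpca`, and `lagged_correlation` are hypothetical.

```python
import numpy as np


def smooth(curves, window=7):
    """Moving-average smoothing of each row (assumed stand-in for basis smoothing)."""
    kernel = np.ones(window) / window
    return np.array([np.convolve(row, kernel, mode="valid") for row in curves])


def fpca(curves, n_components=2):
    """Approximate FPCA for curves on a common grid via SVD of the centered data."""
    mean = curves.mean(axis=0)
    centered = curves - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # share of variance per component
    components = vt[:n_components]           # discretized eigenfunctions
    scores = u[:, :n_components] * s[:n_components]  # per-curve scores
    return mean, components, scores, explained[:n_components]


def lagged_correlation(search, outcome, lag):
    """Pearson correlation between search[t] and outcome[t + lag], lag >= 0 days."""
    if lag > 0:
        search, outcome = search[:-lag], outcome[lag:]
    return np.corrcoef(search, outcome)[0, 1]


# Synthetic data (assumption): 50 "states" observed over 120 days.
rng = np.random.default_rng(0)
days = np.arange(120)
raw = np.sin(days / 15.0) * rng.uniform(0.5, 2.0, size=(50, 1))
raw = raw + rng.normal(scale=0.2, size=(50, 120))

smoothed = smooth(raw)
mean_curve, components, scores, explained = fpca(smoothed)

# A search signal that leads the outcome by 14 days (assumption).
search_signal = np.sin(days / 10.0)
outcome_signal = np.roll(search_signal, 14)
lead_corr = lagged_correlation(search_signal, outcome_signal, lag=14)
```

Scanning `lag` over a range such as 0 to 21 days and looking for where the correlation peaks is one simple way to gauge how far ahead a search trend might carry information about an outcome.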
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
YE is supported by startup funding from Geisinger Health System. The funder had no role in the design of the study, collection, analysis, or interpretation of data or the writing of the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
N/A
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The numbers of daily COVID-19 confirmed cases and deaths were obtained from the Centers for Disease Control and Prevention (CDC) at https://data.cdc.gov/Case-Surveillance/United-States-COVID-19-Cases-and-Deaths-by-State-o/9mfq-cb36. The Google COVID-19 Search Trends Symptoms dataset is publicly available at https://github.com/google-research/open-covid-19-data/. The 2019 US Census data are available at https://www.census.gov/data.html.