Abstract
To understand and model public health emergencies, epidemiologists need data that describes how humans are moving and interacting across physical space. Such data has traditionally been difficult for researchers to obtain with the temporal resolution and geographic breadth that are needed to study, for example, a global pandemic. This paper describes Colocation Maps, which are spatial network datasets that have been developed within Facebook’s Data for Good program. These Maps estimate how often people from different regions are colocated: in particular, for a pair of geographic regions x and y, these Maps estimate the probability that a randomly chosen person from x and a randomly chosen person from y are simultaneously located in the same place during a randomly chosen minute in a given week. These datasets are well suited to parametrize metapopulation models of disease spread or to measure temporal changes in interactions between people from different regions; indeed, they have already been used for both of these purposes during the COVID-19 pandemic. In this paper, we show how Colocation Maps differ from existing data sources, describe how the datasets are built, provide examples of their use in compartmental modeling, and summarize ideas for further development of these and related datasets. We also conduct the first large-scale analysis of human colocation patterns across the world. Among the findings of this study, we observe that a pair of regions can exhibit high colocation despite few people moving between them. We also find that although few pairs of people are colocated for many days over the course of a week, these pairs can contribute significant fractions of the total colocation time within a region or between pairs of regions.
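The colocation probability defined in the abstract can be illustrated with a minimal sketch. This is a toy calculation and not the pipeline used to build Colocation Maps: the function name `colocation_probability`, the input layout (one list of place identifiers per person, one entry per minute), and the example traces are all assumptions introduced here for illustration, and the sketch covers only a pair of distinct regions x and y.

```python
def colocation_probability(traces_x, traces_y):
    """Toy estimate of the colocation probability between two distinct regions.

    traces_x, traces_y: lists of per-person traces (hypothetical input layout).
    Each trace is a list of place identifiers, one per minute, and all traces
    are assumed to cover the same minutes.
    """
    n_minutes = len(traces_x[0])
    colocated_minutes = 0
    # Count, over every cross-region pair of people, the minutes they share a place.
    for trace_x in traces_x:
        for trace_y in traces_y:
            colocated_minutes += sum(1 for a, b in zip(trace_x, trace_y) if a == b)
    # Dividing by (pairs x minutes) gives the chance that a random person from x,
    # a random person from y, and a random minute yield a colocation.
    return colocated_minutes / (len(traces_x) * len(traces_y) * n_minutes)


# Made-up traces: two people per region, four minutes each.
region_x = [[1, 1, 2, 2], [3, 3, 3, 3]]
region_y = [[1, 2, 2, 2], [3, 1, 1, 3]]
print(colocation_probability(region_x, region_y))  # 6 colocated pair-minutes out of 16 -> 0.375
```

Averaging the indicator "same place at the same minute" uniformly over pairs and minutes is what makes the result interpretable as the probability described above.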
Competing Interest Statement
All authors were paid employees, contractors, or interns of Facebook, Inc. when performing the work reported in this paper.
Funding Statement
All authors were paid employees, contractors, or interns of Facebook, Inc. when performing the work reported in this paper and received compensation for this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This paper and the datasets that this paper describes were examined and approved by Facebook's internal review processes. The dataset sharing that is described in this paper is managed through Facebook's Data for Good program. All data analysis reported in this paper was performed on datasets that were deidentified and analyzed in aggregate.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
This revision adds an author who contributed significantly to this project.
1 For more details on the Location History setting, please see: https://www.facebook.com/help/278928889350358
2 Those interested in requesting access to Colocation Maps, or any other Facebook Data for Good maps, can do so via the instructions provided on the program’s website (https://dataforgood.fb.com/). The program website also maintains a collection of research publications which have made use of Colocation Maps and other resources to aid those seeking to make use of the data for their crisis-response work.
3 In practice, we recommend inserting another unknown parameter $\kappa$ such that $m_{rs,k} \approx n_r n_s (1 + \delta_{rs}\kappa_r)\, p_{rs}$ to rescale the diagonal values of $p_{rs}$ (where the Kronecker delta $\delta_{rs} = 1$ if $r = s$ and zero otherwise). We suggest this modification because within-region colocation tends to occur for different reasons than between-region colocation. Within-region colocation is generally much larger than between-region colocation, and can be driven by nighttime contributions. This additional unknown parameter can help account for this difference in scale. See Sections 4.2 and 4.4 for more details.
4 http://maths.adelaide.edu.au/lewis.mitchell/socialdistancing Last accessed July 27, 2020.
5 https://chapman.maps.arcgis.com/apps/opsdashboard/index.html/d5d4047f03b742e9bddb2af75e5b9ba8 Last accessed July 27, 2020.
6 https://chapman.maps.arcgis.com/apps/opsdashboard/index.html/3eb4582a297d42d498066309668427bf Last accessed July 27, 2020.
7 https://cmmid.github.io/colocation_dashboard_cmmid/ Last accessed July 27, 2020.
8 https://population.un.org/wpp/DataQuery/ Last accessed August 6, 2020.
9 Génois and Barrat compute cosine similarities between adjacency matrices by simply concatenating the rows of the matrices into vectors [32].
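The comparison described in footnote 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the paper or from [32]: the helper names `flatten` and `matrix_cosine_similarity` and the small example matrices are hypothetical, and matrices are represented as plain nested lists.

```python
import math


def flatten(matrix):
    """Concatenate the rows of a matrix (list of lists) into one vector."""
    return [value for row in matrix for value in row]


def matrix_cosine_similarity(a, b):
    """Cosine similarity between two equally sized matrices, treated as vectors."""
    u, v = flatten(a), flatten(b)
    if len(u) != len(v):
        raise ValueError("matrices must contain the same number of entries")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for an all-zero matrix")
    return dot / (norm_u * norm_v)


# Made-up 2x2 adjacency matrices: the second is a rescaled copy of the first,
# and the third has no nonzero entries in common with it.
first = [[0, 1], [1, 0]]
second = [[0, 2], [2, 0]]
third = [[1, 0], [0, 1]]
print(matrix_cosine_similarity(first, second))  # close to 1: same pattern, different scale
print(matrix_cosine_similarity(first, third))   # 0.0: no overlapping nonzero entries
```

Because the measure is invariant to overall scale, it compares the pattern of weights in two matrices rather than their magnitudes.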
Data Availability
The data used in this paper is made available through Facebook's Data for Good program under the terms of a Data License Agreement for research purposes. Any interested party can request access to this data by going to dataforgood.fb.com or by emailing diseaseprevmaps@fb.com.