DOI: 10.1145/3442188.3445881
research-article
Open access

Leveraging Administrative Data for Bias Audits: Assessing Disparate Coverage with Mobility Data for COVID-19 Policy

Published: 01 March 2021

Abstract

Anonymized smartphone-based mobility data has been widely adopted in devising and evaluating COVID-19 response strategies such as the targeting of public health resources. Yet little attention has been paid to measurement validity and demographic bias, due in part to the lack of documentation about which users are represented as well as the challenge of obtaining ground truth data on unique visits and demographics. We illustrate how linking large-scale administrative data can enable auditing mobility data for bias in the absence of demographic information and ground truth labels. More precisely, we show that linking voter roll data (containing individual-level voter turnout for specific voting locations along with race and age) can facilitate the construction of rigorous bias and reliability tests. Using data from North Carolina's 2018 general election, these tests illuminate a sampling bias that is particularly noteworthy in the pandemic context: older and non-white voters are less likely to be captured by mobility data. We show that allocating public health resources based on such mobility data could disproportionately harm high-risk elderly and minority groups.
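As a rough illustration of the kind of audit the abstract describes, the sketch below computes, for each polling place, the ratio of devices seen in mobility data to voters recorded in the voter roll, and then checks how that coverage ratio varies with the share of older voters. It is a minimal sketch under stated assumptions, not the authors' actual tests: the field names (`voters`, `devices`, `share_over_65`), the helper functions, and all numbers are hypothetical toy values chosen only to show the mechanics.

```python
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def coverage_audit(places):
    """Return per-place coverage ratios and their correlation with age share.

    Each place is a dict with hypothetical keys:
      voters        - turnout at the polling place (from the voter roll)
      devices       - device visits attributed to that place (mobility data)
      share_over_65 - fraction of those voters aged over 65 (voter roll)
    """
    coverage = [p["devices"] / p["voters"] for p in places]
    shares = [p["share_over_65"] for p in places]
    return coverage, pearson(shares, coverage)


# Toy data, invented for illustration only.
polling_places = [
    {"name": "A", "voters": 1000, "devices": 50, "share_over_65": 0.10},
    {"name": "B", "voters": 1000, "devices": 40, "share_over_65": 0.20},
    {"name": "C", "voters": 1000, "devices": 30, "share_over_65": 0.30},
    {"name": "D", "voters": 1000, "devices": 20, "share_over_65": 0.40},
]

coverage, corr = coverage_audit(polling_places)
print("coverage ratios:", coverage)
print("correlation with share over 65:", corr)
```

A negative correlation in real data would be consistent with the sampling bias the abstract reports, although a simple correlation like this ignores confounders that a rigorous test would need to address.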



Published In

FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency
March 2021
899 pages
ISBN:9781450383097
DOI:10.1145/3442188
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Science Foundation Graduate Research Fellowship Program
  • Stanford RISE initiative
  • K&L Gates Presidential Fellowship
  • Stanford's Institute for Human-Centered Artificial Intelligence


Cited By

  • (2024) Inferring dynamic networks from marginals with iterative proportional fitting. Proceedings of the 41st International Conference on Machine Learning, 6202-6252. DOI: 10.5555/3692070.3692310. Online publication date: 21-Jul-2024
  • (2024) Understanding the bias of mobile location data across spatial scales and over time: A comprehensive analysis of SafeGraph data in the United States. PLOS ONE 19:1 (e0294430). DOI: 10.1371/journal.pone.0294430. Online publication date: 19-Jan-2024
  • (2024) BiasBuster: a Neural Approach for Accurate Estimation of Population Statistics using Biased Location Data. 2024 25th IEEE International Conference on Mobile Data Management (MDM), 1-10. DOI: 10.1109/MDM61037.2024.00022. Online publication date: 24-Jun-2024
  • (2024) Detecting synthetic population bias using a spatially-oriented framework and independent validation data. International Journal of Geographical Information Science 38:9 (1912-1938). DOI: 10.1080/13658816.2024.2358399. Online publication date: 24-May-2024
  • (2024) Gaps in gender and socioeconomic mobility disparity studies. Nature Computational Science 4:9 (633-635). DOI: 10.1038/s43588-024-00667-8. Online publication date: 24-Sep-2024
  • (2024) Enhancing human mobility research with open and standardized datasets. Nature Computational Science 4:7 (469-472). DOI: 10.1038/s43588-024-00650-3. Online publication date: 2-Jul-2024
  • (2024) The exciting potential and daunting challenge of using GPS human-mobility data for epidemic modeling. Nature Computational Science 4:6 (398-411). DOI: 10.1038/s43588-024-00637-0. Online publication date: 19-Jun-2024
  • (2024) Twitter social mobility data reveal demographic variations in social distancing practices during the COVID-19 pandemic. Scientific Reports 14:1. DOI: 10.1038/s41598-024-51555-0. Online publication date: 12-Jan-2024
  • (2024) An experienced racial-ethnic diversity dataset in the United States using human mobility data. Scientific Data 11:1. DOI: 10.1038/s41597-024-03490-y. Online publication date: 17-Jun-2024
  • (2024) A systematic review of using population-level human mobility data to understand SARS-CoV-2 transmission. Nature Communications 15:1. DOI: 10.1038/s41467-024-54895-7. Online publication date: 3-Dec-2024
