RT Journal Article SR Electronic T1 Characterization of long-term patient-reported symptoms of COVID-19: an analysis of social media data JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2021.07.13.21260449 DO 10.1101/2021.07.13.21260449 A1 Banda, Juan M. A1 Adderley, Nicola A1 Ahmed, Waheed-Ul-Rahman A1 AlGhoul, Heba A1 Alser, Osaid A1 Alser, Muath A1 Areia, Carlos A1 Cogenur, Mikail A1 Fišter, Krisitina A1 Gombar, Saurabh A1 Huser, Vojtech A1 Jonnagaddala, Jitendra A1 Lai, Lana YH A1 Leis, Angela A1 Mateu, Lourdes A1 Mayer, Miguel Angel A1 Minty, Evan A1 Morales, Daniel A1 Natarajan, Karthik A1 Paredes, Roger A1 Periyakoil, Vyjeyanthi S. A1 Prats-Uribe, Albert A1 Ross, Elsie G. A1 Singh, Gurdas A1 Subbian, Vignesh A1 Vivekanantham, Arani A1 Prieto-Alhambra, Daniel YR 2021 UL http://medrxiv.org/content/early/2021/07/15/2021.07.13.21260449.abstract AB As the SARS-CoV-2 virus (COVID-19) continues to affect people across the globe, there is limited understanding of the long-term implications for infected patients [1–3]. While some of these patients have documented follow-ups in clinical records, or participate in longitudinal surveys, these datasets are usually designed by clinicians and are not granular enough to capture the natural history or patient experiences of ‘long COVID’. To get a complete picture, there is a need to use patient-generated data to track the long-term impact of COVID-19 on recovered patients in real time. There is a growing need to meticulously characterize these patients’ experiences, from infection to months post-infection, with highly granular patient-generated data rather than clinician narratives. In this work, we present a longitudinal characterization of post-COVID-19 symptoms using social media data from Twitter.
Using a combination of machine learning, natural language processing techniques, and clinician reviews, we mined 296,154 tweets to characterize the post-acute infection course of the disease, creating detailed timelines of symptoms and conditions, and analyzing their symptomatology over a period of more than 150 days. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: JMB was funded by a grant from the National Institute on Aging (3P30AG059307-02S1). DPA is funded through an NIHR Senior Research Fellowship (Grant number SRF-2018-11-ST2-004). VH's contribution to this work was carried out with support from the National Library of Medicine, National Institutes of Health. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained: Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: IRB not needed. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. Data will be made available after formal publication.