RT Journal Article SR Electronic T1 Identification, analysis and prediction of valid and false information related to vaccines from Romanian tweets JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2023.08.19.23294319 DO 10.1101/2023.08.19.23294319 A1 Valeanu, Andrei A1 Mihai, Dragos Paul A1 Andrei, Corina A1 Puscasu, Ciprian A1 Ionica, Alexandra Mihaela A1 Hinoveanu, Miruna Ioana A1 Predoi, Valentina Patricia A1 Bulancea, Ema A1 Chirita, Cornel A1 Negres, Simona A1 Marineci, Cristian Daniel YR 2023 UL http://medrxiv.org/content/early/2023/08/25/2023.08.19.23294319.abstract AB Online misinformation may undermine vaccination efforts. Because no previous study has specifically analyzed online vaccine-related content written in Romanian, the main objective of this study was to detect and evaluate vaccine-related tweets written in Romanian. A total of 1400 Romanian vaccine-related tweets were manually classified as true, neutral or fake information and analyzed using wordcloud representations, a correlation analysis between the three classes and specific tweet characteristics, and the validation of several predictive machine learning algorithms. The tweets annotated as misinformation showed specific word patterns and were liked and reshared more often than the true and neutral ones. In validation, the Support Vector Classifier yielded enhanced results in terms of Area Under the Receiver Operating Characteristic Curve score (0.744-0.843). The predictive model provides well-calibrated estimates of the probability that a specific Twitter post is true, neutral or fake. The current study offers important insights into vaccine-related online content written in an Eastern European language. 
Future studies must aim at building an online platform for the rapid identification of vaccine misinformation and at raising awareness among the general population. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: Publication of this paper was supported by Carol Davila University of Medicine and Pharmacy, through the institutional program "Publish not Perish". Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Not Applicable. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: N/A. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Not Applicable. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Not Applicable. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Not Applicable. The Python code for data preprocessing, wordcloud representation, correlation analysis and the development and validation of the machine learning predictive models, as well as the TF-IDF vectorized dataset and the final SVC algorithm, are publicly available at https://github.com/valeanuandrei/vaccine-tweets-ro-research