Abstract
This paper presents our system for the Social Media Mining for Health 2023 Shared Task 4: binary classification of English Reddit posts self-reporting a social anxiety disorder diagnosis. We systematically investigate and compare the efficacy of hybrid and ensemble models that combine specialized, medical domain-adapted transformers with BiLSTM neural networks. Our best-performing model obtained an F1 score of 89.31% on the validation set and 83.76% on the test set.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This paper has been peer-reviewed. The dataset used, for binary classification of English Reddit posts self-reporting a social anxiety disorder diagnosis, is provided by the organizers of the Social Media Mining for Health 2023 shared tasks. More information can be found at https://healthlanguageprocessing.org/smm4h-2023/
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
sourabh.zanwar{at}rwth-aachen.de
d.wiechmann{at}uva.nl
yu.qiao{at}rwth-aachen.de
elma.kerz{at}ifaar.rwth-aachen.de
Data Availability
This paper has been peer-reviewed and accepted for presentation at the #SMM4H 2023 Workshop. The dataset used in this paper is part of the #SMM4H 2023 shared task. More information can be found at: https://healthlanguageprocessing.org/smm4h-2023/