MANTIS at #SMM4H 2023: Leveraging Hybrid and Ensemble Models for Detection of Social Anxiety Disorder on Reddit
===============================================================================================================

* Sourabh Zanwar
* Daniel Wiechmann
* Yu Qiao
* Elma Kerz

## Abstract

This paper presents our system for the Social Media Mining for Health 2023 Shared Task 4: binary classification of English Reddit posts self-reporting a social anxiety disorder diagnosis. We systematically investigate and contrast the efficacy of hybrid and ensemble models that combine specialized medical domain-adapted transformers with BiLSTM neural networks. Our best-performing model obtained an F1 score of 89.31% on the validation set and 83.76% on the test set.

## 1 Introduction

According to the Anxiety & Depression Association of America<sup>1</sup>, anxiety disorders rank as the most prevalent mental illnesses in the United States. An estimated 40 million adults, constituting 19.1% of the population aged 18 and above, grapple with these conditions annually. This challenge is compounded by the scarcity of accessible mental health care services and the frequent occurrence of misdiagnoses, often causing individuals to unknowingly endure these disorders (Kasper, 2006).

Natural Language Processing in combination with Machine Learning is increasingly recognized as having transformative potential to support healthcare professionals and stakeholders in the early detection, treatment and prevention of mental disorders (Zhang et al., 2022). In this paper, we report on our participation in the Social Media Mining for Health Applications (#SMM4H) 2023 workshop, which aims to promote automated methods for mining social media data for health informatics. We participated in Shared Task 4, which targets the detection of social anxiety in Reddit posts (Klein et al., 2023). We approached this task by developing hybrid and ensemble models that combine domain-matched transformers with Bidirectional Long Short-Term Memory (BiLSTM) networks trained on a comprehensive set of engineered linguistic features. This set encompasses measures of morpho-syntactic complexity, lexical sophistication and diversity, readability, stylistic measures (register-specific n-gram frequencies) and sentiment/emotion lexicons.

## 2 Data

The data for Task 4 consisted of 8117 Reddit posts written by users aged between 12 and 25 years. The data were split into training (75%), validation (8.4%), and test (16.6%) sets.

In preparation for model training, all texts were preprocessed by removing HTML, URLs, excess whitespace, and emojis, and by normalizing inconsistent punctuation.
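The cleaning steps listed above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the preprocessing code used for the submission: the regular expressions, the emoji code-point ranges, the punctuation rules, and the function name `clean_post` are all assumptions made for the example.

```python
import html
import re

# All patterns below are illustrative assumptions, not the authors' rules.
TAG_RE = re.compile(r"<[^>]+>")                       # HTML tags
URL_RE = re.compile(r"(?:https?://|www\.)\S+")        # URLs
EMOJI_RE = re.compile(                                 # common emoji ranges
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF"
    "\U0001F1E6-\U0001F1FF\uFE0F\u200D]"
)
REPEAT_PUNCT_RE = re.compile(r"([!?.,])\1+")          # "!!!" -> "!"
SPACE_BEFORE_PUNCT_RE = re.compile(r"\s+([!?.,;:])")  # "ok ," -> "ok,"
WHITESPACE_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Remove HTML, URLs and emojis, then normalize punctuation and spacing."""
    text = html.unescape(text)            # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    text = SPACE_BEFORE_PUNCT_RE.sub(r"\1", text)
    return WHITESPACE_RE.sub(" ", text).strip()


example = "<p>I   feel anxious!!! \U0001F61F see https://example.com/x now</p>"
cleaned = clean_post(example)
```

Note that collapsing repeated punctuation also turns an ellipsis into a single period, which may or may not be desirable depending on the downstream features.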

## 3 System Description

Our systems leveraged three domain-adapted Transformer-based pretrained language models (PLMs), a BiLSTM trained on engineered features, and combinations of these in the form of hybrid and ensemble models. The domain-adapted PLMs are: (1) PsychBERT (Vajre et al., 2021), (2) Mental RoBERTa (Ji et al., 2022), and (3) Clinical BERT (Alsentzer et al., 2019). All PLMs were obtained from Hugging Face (Wolf et al., 2020) in their base versions, uncased where applicable. We constructed a BiLSTM trained on 168 features that fall into six categories. All measurements of these features were obtained using a system that employs a sliding window technique to compute sentence-level measurements. The BiLSTM model is formulated as: ![Formula][1] where $\mathrm{fc}_i(x) = \mathrm{ReLU}(W_i x + b_i)$ and ![Graphic][2] is an $L$-layer BiLSTM with hidden size $H$. ![Graphic][3], where $\mathrm{CM}_i$ represents the linguistic features for the $i$th sentence of a post consisting of $N$ sentences. The last hidden representations of the last layer in the forward and backward directions are denoted by ![Graphic][4], and ![Graphic][5] denotes the concatenation operator.
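As an illustration of the architecture described above, the following PyTorch sketch runs a BiLSTM over a sequence of per-sentence feature vectors, concatenates the final forward and backward hidden states of the last layer, and passes the result through ReLU-activated fully connected layers. It is a sketch under stated assumptions rather than the submitted model: the hidden size, the number of LSTM layers, the number of fully connected layers, the final linear output layer, and the class name `FeatureBiLSTM` are hypothetical; only the input dimensionality of 168 comes from the text.

```python
import torch
import torch.nn as nn


class FeatureBiLSTM(nn.Module):
    """BiLSTM over sentence-level feature vectors (hypothetical sizes)."""

    def __init__(self, n_features=168, hidden_size=64, num_layers=2, n_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(
            n_features, hidden_size, num_layers=num_layers,
            batch_first=True, bidirectional=True,
        )
        # fc(x) = ReLU(W x + b); one such layer is assumed here.
        self.fc1 = nn.Linear(2 * hidden_size, hidden_size)
        # Assumed linear output layer producing class logits.
        self.out = nn.Linear(hidden_size, n_classes)

    def encode(self, x):
        # x: (batch, N sentences, n_features)
        _, (h_n, _) = self.bilstm(x)
        # h_n: (num_layers * 2, batch, hidden). The last two entries are the
        # final forward and backward states of the last layer.
        return torch.cat([h_n[-2], h_n[-1]], dim=-1)

    def forward(self, x):
        h = torch.relu(self.fc1(self.encode(x)))
        return self.out(h)


model = FeatureBiLSTM()
posts = torch.randn(4, 10, 168)   # 4 posts, 10 sentences each, 168 features
logits = model(posts)             # shape: (4, 2)
```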

The hybrid model combines a Mental RoBERTa model with the BiLSTM described above. ![Formula][6] where ![Graphic][7] is the sequence of tokens from a post.
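One way such a hybrid can be wired up is sketched below in PyTorch. The fusion by concatenation, the layer sizes, and the class names are assumptions for illustration. To keep the example self-contained and runnable without downloading weights, the pretrained transformer is replaced by a hypothetical stand-in encoder (`MeanPoolEmbedding`) that maps token ids to one vector per post; in practice this slot would hold the pooled output of the fine-tuned PLM.

```python
import torch
import torch.nn as nn


class MeanPoolEmbedding(nn.Module):
    """Stand-in for a pretrained transformer: token ids -> one vector per post."""

    def __init__(self, vocab_size=1000, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids):
        return self.emb(token_ids).mean(dim=1)


class HybridClassifier(nn.Module):
    """Concatenates a text representation with a BiLSTM feature representation."""

    def __init__(self, text_encoder, text_dim, n_features=168,
                 hidden_size=64, n_classes=2):
        super().__init__()
        self.text_encoder = text_encoder
        self.bilstm = nn.LSTM(n_features, hidden_size,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + 2 * hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_classes),
        )

    def forward(self, token_ids, sentence_features):
        text_vec = self.text_encoder(token_ids)            # (batch, text_dim)
        _, (h_n, _) = self.bilstm(sentence_features)
        feat_vec = torch.cat([h_n[-2], h_n[-1]], dim=-1)   # (batch, 2 * hidden)
        return self.classifier(torch.cat([text_vec, feat_vec], dim=-1))


hybrid = HybridClassifier(MeanPoolEmbedding(), text_dim=32)
token_ids = torch.randint(0, 1000, (4, 12))   # 4 posts, 12 tokens each
features = torch.randn(4, 7, 168)             # 4 posts, 7 sentences each
hybrid_logits = hybrid(token_ids, features)   # shape: (4, 2)
```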

We constructed three ensemble models using stacking: (1) an ensemble of instances of the hybrid model, which emerged as the most accurate base model (M6); (2) an ensemble combining hybrid models with fine-tuned PsychBERT models (M7); and (3) an ensemble of Mental RoBERTa models, PsychBERT models, and BiLSTM models (M8). The resulting models represent homogeneous ensemble (HOE), intermediate, and heterogeneous ensemble (HEE) approaches, respectively (Ganaie et al., 2022). As meta-learners, we used a Support Vector Classifier, Logistic Regression, Gradient Boosting, Ridge Regression, and XGBoost.
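The stacking procedure can be illustrated with scikit-learn's `StackingClassifier`, which trains a meta-learner on cross-validated predictions of the base models. The sketch below is not the authors' pipeline: the synthetic data and the three lightweight base classifiers are stand-ins for the neural base models, and only two of the meta-learners named above (logistic regression and ridge) are compared.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption: stands in for post features).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Stand-in base models; in the paper these are transformer, hybrid and BiLSTM models.
base_models = [
    ("knn", KNeighborsClassifier()),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
]


def fit_stack(meta_learner):
    """Fit a stacked ensemble whose meta-learner sees out-of-fold predictions."""
    stack = StackingClassifier(
        estimators=base_models, final_estimator=meta_learner, cv=5
    )
    return stack.fit(X_train, y_train)


# Compare meta-learners by validation F1, as one would when selecting a variant.
meta_learners = {
    "logistic": LogisticRegression(max_iter=1000),
    "ridge": RidgeClassifier(),
}
scores = {
    name: f1_score(y_val, fit_stack(meta).predict(X_val))
    for name, meta in meta_learners.items()
}
```

A homogeneous ensemble in this setup would list several instances of the same model type as base models, while a heterogeneous one mixes model types as shown here.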

Further details are provided in the supplementary material ([https://shorturl.at/epuF3](https://shorturl.at/epuF3)).

## 4 Results and Evaluation

The results of our models on the validation set and test set are presented in Table 1. The Mental RoBERTa model achieved the highest performance (F1 = 86.59%) among the PLMs, outperforming the PsychBERT and Clinical BERT models by 4% and 14.73%, respectively. This finding indicates that the detection of anxiety on Reddit benefits markedly from pretraining the PLM on mental health-related subreddits, as opposed to pretraining on clinical text. The hybrid models consistently outperformed the standalone PLM across all model iterations, yielding an average increase in F1 score of 0.3%. Model stacking further improved classification, with gains in F1 score ranging between 1.86% and 2.44%. The highest balanced classification score was achieved by the HEE model (M8). A variant of this model using ridge regression as a meta-learner (M12) achieved the best performance on the test set (F1 = 83.76%; mean across all teams = 79.3%; median across all teams = 82.4%). The HOE model (M6) achieved the second-highest performance and the best precision among all models examined. This suggests that both ensemble approaches can produce beneficial, albeit distinct, impacts on the detection of social anxiety disorder.

Table 1: Results on the validation set (top) and test set (bottom). For each ensemble model, we report results of the best performing meta-learner.
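For reference, the precision, recall and F1 figures discussed in this section follow the standard definitions, which the short sketch below spells out. It assumes the scores are computed for the positive class (posts self-reporting a diagnosis); the function name and the toy labels are illustrative and are not taken from the shared-task data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because F1 is a harmonic mean, a model with high precision but lower recall (or the reverse) scores below a model that balances the two at the same average.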

## Data Availability

This paper has been peer-reviewed and accepted for presentation at the #SMM4H 2023 Workshop. The dataset used in this paper is part of the #SMM4H 2023 shared task. More information can be found at: [https://healthlanguageprocessing.org/smm4h-2023/](https://healthlanguageprocessing.org/smm4h-2023/)

## Footnotes

*   sourabh.zanwar{at}rwth-aachen.de

*   d.wiechmann{at}uva.nl

*   yu.qiao{at}rwth-aachen.de

*   elma.kerz{at}ifaar.rwth-aachen.de

*   1 [https://adaa.org/understanding-anxiety/facts-statistics](https://adaa.org/understanding-anxiety/facts-statistics)

*   Received December 5, 2023.
*   Revision received December 5, 2023.
*   Accepted December 5, 2023.


*   © 2023, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/)

## References

1.   Emily Alsentzer, John Murphy, William Boag, Wei-Hung Weng, Di Jindi, Tristan Naumann, and Matthew McDermott. 2019. Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pages 72–78. Association for Computational Linguistics.

2.   Mudasir A Ganaie, Minghui Hu, AK Malik, M Tanveer, and PN Suganthan. 2022. Ensemble deep learning: A review. Engineering Applications of Artificial Intelligence, 115:105151. doi:10.1016/j.engappai.2022.105151

3.   Shaoxiong Ji, Tianlin Zhang, Luna Ansari, Jie Fu, Prayag Tiwari, and Erik Cambria. 2022. Mental-BERT: Publicly available pretrained language models for mental healthcare. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7184–7190, Marseille, France. European Language Resources Association.

4.   Siegfried Kasper. 2006. Anxiety disorders: underdiagnosed and insufficiently treated. International Journal of Psychiatry in Clinical Practice, 10(sup1):3–9. doi:10.1080/13651500600940492

5.   AZ Klein, JM Banda, Y Guo, JI Flores Amaro, R Rodriguez-Esteban, A Sarker, AL Schmidt, D Xu, and G Gonzalez-Hernandez. 2023. Overview of the eighth Social Media Mining for Health Applications (#SMM4H) shared tasks at the AMIA 2023 annual symposium. Proceedings of the Eighth Social Media Mining for Health Applications (#SMM4H) Workshop and Shared Task.

6.   Vedant Vajre, Mitch Naylor, Uday Kamath, and Amarda Shehu. 2021. PsychBERT: a mental health language model for social media mental health behavioral analysis. In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 1077–1082. IEEE.

7.   Thomas Wolf, Julien Chaumond, Lysandre Debut, Victor Sanh, Clement Delangue, Anthony Moi, Pierric Cistac, Morgan Funtowicz, Joe Davison, Sam Shleifer, et al. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45.

8.   T Zhang, A Schoene, and S Ananiadou. 2022. Natural language processing applied to mental illness detection: A narrative review. NPJ Digital Medicine, 5:46.

 [1]: /embed/graphic-1.gif
 [2]: /embed/inline-graphic-1.gif
 [3]: /embed/inline-graphic-2.gif
 [4]: /embed/inline-graphic-3.gif
 [5]: /embed/inline-graphic-4.gif
 [6]: /embed/graphic-2.gif
 [7]: /embed/inline-graphic-5.gif