Using GenAI to train mental health professionals in suicide risk assessment: Preliminary findings
=================================================================================================

* Zohar Elyoseph
* Inbar Levkovitch
* Yuval Haber
* Yossi Levi-Belz

## Abstract

**Background** Suicide risk assessment is a critical skill for mental health professionals (MHPs), yet traditional training in this area is often limited. This study examined the potential of generative artificial intelligence (GenAI)-based simulator to enhance self- efficacy in suicide risk assessment among MHPs.

**Method** A quasi-experimental study was conducted with 43 MHPs from Israel. Participants attended an online seminar and interacted with a GenAI-powered suicide risk assessment simulator. They completed pre- and post-intervention questionnaires measuring suicide risk assessment self-efficacy and willingness to treat suicidal patients. Qualitative data on user experience were collected.

**Results** We found a significant increase in self-efficacy scores following the intervention. Willingness to treat patients presenting suicide risk increased slightly but did not reach significance. Qualitative feedback indicated that participants found the simulator engaging and valuable for professional development. However, participants raised concerns about over-reliance on AI and the need for human supervision during training.

**Conclusion** This preliminary study suggests that GenAI-based simulators hold promise as a tool to enhance MHPs’ competence in suicide risk assessment. However, further research with larger samples and control groups is needed to confirm these findings and address ethical considerations surrounding AI use in suicide risk assessment training. AI-powered simulation tools have the potential to democratize access to high-quality training in mental health, potentially contributing to global suicide prevention efforts. However, their implementation should be carefully considered to ensure they complement rather than replace human expertise.

Keywords
*   Suicide risk
*   Artificial intelligence
*   Assessment
*   Training

## Introduction

Suicide prevention is one of the most pressing and complex issues in the field of mental health. Its critical importance is emphasized as suicide poses a serious public health concern, with annual global figures indicating approximately 817,000 individuals dying by suicide and 20 million attempted suicides.1 These staggering figures underscore the urgent necessity for comprehensive investment in a broad array of suicide prevention strategies aimed at curtailing local and global suicide prevalence. Thus, training professionals and gatekeepers to evaluate, react, and perform early interventions becomes crucial, among other important targets. Critical actions in this context include suicide risk assessment, which is recognized as a critical step toward suicide prevention.2 This study aimed to examine the ability of an artificial intelligence (AI)-based simulator to enhance competence in suicide risk assessment among mental health professionals (MHPs).

Suicide risk assessment is a process in which MHPs seek to assess the likelihood of a patient engaging in future suicidal behavior.3 Currently, this assessment primarily includes administering the Columbia-Suicide Severity Rating Scale (C-SSRS4), which poses direct questions about the frequency, intensity, and content of suicidal thoughts, detailed questions about plans for future suicidal acts or descriptions of previous suicide attempts, and specific questions about the desire to die or continue living.4 Suicide risk assessment also includes the assessment of risk and protective factors relating to depressive symptoms and recent mental crises.5Despite the recent positive global trend of making knowledge about suicide risk assessment more accessible in the community, such programs remain limited. There is a dearth of experts to deliver appropriate training, and existing programs are often taught at a theoretical level without allowing intensive training with patients and frequent professional feedback. At best, MHPs usually watch a video demonstrating how to conduct an assessment but do not perform multiple assessments with diverse types of patients with various risk factors, communication styles, psychopathologies, and other difficulties. These shortcomings reflect a significant training gap for one of the most challenging clinical tasks in mental health, a task that is already considered very difficult to predict,6 partly due to the impulsive nature of suicidal behavior and the large variability among patients.7

Importantly, in several studies, MHPs were found to be highly reluctant to treat suicidal individuals,8 partly due to their perception of low knowledge and competence regarding suicide risk assessment.9 Improving the knowledge and competence of MPHs in performing real-time suicide risk assessments would be a crucial step toward accurately identifying those at risk and providing the appropriate psychological help. Thus, improving these skills would require systematic training in controlled conditions for real-time risk assessment. However, how competence can be enhanced among MHPs remains an open question. Following this, in this study, we aimed to shed light on the prospect of employing an AI-based simulator to enhance MHPs’ competence in suicide risk assessment. More generally, the current research will focus on how generative artificial intelligence (GenAI) technology can assist in global efforts to prevent suicide.

### AI-based technology in mental health

GenAI––large language models (LLMs)––is an advanced technology that entered widespread public use by late 2022. Within two months, 100 million users adopted applications such as ChatGPT,10 considered the fastest technology adoption rate in history. Since then, numerous studies have demonstrated the significant potential of this technology in various domains, including education, medicine, law, art, programming, and psychology.11–14

LLMs are a cornerstone technology within the broader GenAI field, designed to understand, generate, and manipulate human-like texts based on vast amounts of training data. While GenAI encompasses various AI systems capable of creating new content (including images, audio, and code), LLMs specifically focus on language tasks, serving as the foundation for advanced natural language processing applications and driving significant improvements in AI’s ability to interact with humans in more natural and contextually appropriate ways.

Recent research has shown that LLMs can accurately identify emotions and mental disorders, such as schizophrenia, depression, and anxiety, and provide treatment recommendations and prognoses comparable to mental health professionals.15–27 Despite their potential to democratize clinical knowledge and encourage ideological pluralism,21,28,29 ethical concerns persist. These include data privacy, algorithmic opacity, threats to patient autonomy, risks of anthropomorphism, technology access disparities, corporate concentration, deep fakes, fake news, reduced reliance on professionals, and amplification of biases.12,21,28,29 At a more philosophical level, questions arise regarding GenAI-human relationships, the prospect of generating intimacy, and how this may affect human-to-human relationships.12,18,30 One promising domain in which GenAI has the potential to make a significant contribution to mental health is a global endeavor to prevent suicide. 22,24,31–35

### GenAI and suicide prevention

Recent research has demonstrated GenAI’s potential in suicide prevention and mental health support. Studies have shown that GenAI, such as ChatGPT-4, can assess suicide risk with accuracy comparable to mental health professionals24,31 and adapt its assessments to different cultural contexts.35 GenAI has also proved proficient in using the World Health Organization’s guidelines to evaluate media reports on suicide.22 Beyond risk assessment, recent studies have highlighted the significant potential of GenAI in facilitating role-playing scenarios for educational and therapeutic purposes.36,37 This AI application offers a versatile and interactive platform for training and intervention. In therapeutic settings, GenAI can assist children and adolescents in sharing their experiences, organizing them coherently, identifying strengths and resilience resources, and creating visual representations of their internal psyche.38 Moreover, GenAI can also be utilized to externalize inner voices, allowing patients to engage in guided dialogue with different aspects of their internal psyche, all within a controlled therapeutic environment.39

Combining GenAI capabilities in suicide risk assessment with its application in role- playing scenarios offers new possibilities for mental health care and professional training. This integration may enable immediate and wide-scale access to clinical knowledge about suicide, be adaptable to various users and situations, and create interactive learning experiences. The current study examined this potential through a GenAI-based simulator designed to train MHPs to assess and mitigate suicide risk by utilizing role-play scenarios.

### The Current Study

This study aimed to make significant theoretical and practical contributions to the field of suicide prevention by harnessing the capabilities of GenAI to develop an innovative simulation tool for training MHPs. By addressing critical gaps in current training programs, such as the lack of opportunities for intensive practice and personalized feedback, this study aimed to democratize access to essential clinical knowledge and skills, ultimately contributing to global suicide prevention efforts. Specifically, We have developed an AI-powered simulator for suicide risk assessment training. This tool enables mental health professionals (MHPs) to practice and improve their skills by interacting with AI-generated patient characters. The simulator plays the role of an individual coming in for assessment, allowing MHPs to conduct mock interviews, evaluate suicide risk levels, and receive feedback on their performance. We examined possible changes in the MHPs’ perceived competence to conduct suicide risk assessments before and after training in the simulator.

We posited two primary research questions: 

1.  To what extent can the use of an AI-based suicide risk assessment simulator improve the self-efficacy and willingness of MHPs to handle suicide-related situations?

2.  How can the design and feedback delivery within an AI-powered simulation tool be optimized to enhance learning outcomes and user experience?

## Method

### **Participants and Study Design**

#### Participants

The study included 43 mental health professionals (e.g., psychologists, social workers, and psychiatrists) at different professional and seniority levels in Israel. Participants were recruited through a social media. These professionals were invited to attend a special webinar on 6 June 2024 (via the Zoom application platform), during which the simulator was presented, followed by an active online experience with the simulator. This experience included completing a questionnaire tapping the main issues of this study at two times: once before engaging with the simulator (pre-measure) and once after engaging with the simulator (post-measure).

#### Study Design

This study integrated quantitative and qualitative approaches to evaluate the effectiveness and ethical considerations of using an AI-powered chatbot for suicide risk assessment training. As seen in Figure 1, the experimental design included pre- and post-intervention measurements.

![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/17/2024.07.17.24310579/F1.medium.gif)

[Figure 1:](http://medrxiv.org/content/early/2024/07/17/2024.07.17.24310579/F1)

Figure 1: 
The study flow from pre- to post-intervention

#### Intervention

The Generative AI Suicide Risk Assessment bot is an innovative tool designed to enhance the training of mental health professionals in assessing suicide risk. The AI-Bot was built by the authors between March and June 2024. Interaction with the bot comprises three key stages. First, the user was introduced to the objective and structure of the simulation; the introduction provided a clear framework for the interactive learning experience. Second, the user was invited to engage in conversation (in chat interface) with one out of two unique characters, each of whom presented a distinct scenario (man or woman, a unique case story, communication style, and risk factor profile). Finally, the bot presented detailed, constructive writen feedback on the central aspects of the risk assessment procedure. This feedback helps users identify areas to improve and refine their skills.

The feedback provided in this simulation covers several critical aspects of suicide risk assessment. It evaluates the interviewer’s ability to establish rapport and trust, explores risk factors across various domains (demographic, psychological, interpersonal, and situational), identifies protective factors, and assesses the level of distress and psychopathology. The feedback also evaluates the thoroughness of suicidal thought exploration, including the nature, frequency, intensity, and duration of such thoughts, as well as any specific plans or preparatory behaviors for suicide. Additionally, the feedback offers an assessment of the interviewer’s skill in reflecting on the risk level while engaging the patient in hope and reasons for living. This comprehensive approach aligns with established suicide risk assessment protocols, such as the Columbia-Suicide Severity Rating Scale (C-SSRS)40. The C-SSRS provides a structured method for evaluating and improving clinical interviewing skills in high-stake mental health scenarios.

### Measures

#### Pre-Intervention Questionnaire

Participants completed a brief questionnaire assessing age, gender, profession, years of experience, and prior experience with suicide risk assessment. Participants completed a pre-intervention questionnaire designed to assess their baseline perceptions and attitudes toward conducting suicide risk assessments. The questionnaire included the following four items presented on an 11-point Likert-type scale, ranging from 0 (*not at all*) to 10 (*completely*):

Self-efficacy: 

1.  To what extent do you feel capable of conducting a suicide risk assessment?

2.  To what extent do you feel you have the tools to conduct a suicide risk assessment?

These questions were averaged to create a composite variable called “ suicide risk assessment self-efficacy” This self-efficacy measure yielded good internal reliability (Cronbach’s alpha = .97), indicating that the items consistently measured the same underlying construct of the participants’ belief in their ability to conduct suicide risk assessments.

Willingness to treat: 

1.  3. If given the opportunity, how willing would you be to take patients with suicidal risk for assessment?

2.  4. If a patient dealing with active suicidality were referred to you, how willing would you be to accept them?

These two questions were averaged to create a composite variable called “Willingness to Treat.” This measure yielded good internal reliability (Cronbach’s alpha = 0.89), indicating that the items consistently measured the same underlying construct of the participants’ willingness to treat.

#### Post-simulation questionnaire

A post-intervention questionnaire was designed to assess the participants’ experiences and attitudes following the intervention. This questionnaire included 4 items identical to those in the pre-intervention questionnaire, allowing for a direct comparison of suicide risk assessment self-efficacy (Cronbach’s alpha = .86) and willingness to conduct suicide risk assessments (Cronbach’s alpha = .90). The post-intervention questionnaire included four new items to assess the participants’ perceptions of the simulator experience: 

1.  To what extent did the simulator experience help you conduct future suicide risk assessments?

2.  To what extent do you feel you have learned from your experience?

3.  To what extent do you feel that the feedback you received will help you in future assessments?

4.  Would you recommend using this AI-based simulator with other practitioners before they conduct suicide risk assessments?

Moreover, the questionnaire featured four open qualitative questions aimed at gathering deeper insights: How was your learning experience with an AI-based simulator? What do you suggest improving in the AI simulator? What advantages do you find in this training? What risks or concerns do you have about training this way?

### Procedure

The study was conducted during a live webinar (online conference) initiated by the authors, in which the participants were invited to participate in the experiment voluntarily. All questionnaires, including both pre- and post-intervention measures, as well as the demographic questions and open qualitative items, were administered through Google Forms. Participants were recruited from an MHP professional community called “The Artificial Third - Artificial Intelligence in Mental Health” and from other online communities. During the webinar, an explanation of the bot was provided. Interested participants received links via Zoom chat to pre and post intervention questionnaires and the AI-based simulator. Subsequently, they were asked to choose to interact with one of the two selected bots (male or female). Following this interaction, participants were given a post-intervention questionnaire. The interaction with the AI-based simulator took approximately 15-25 minutes. Each participant communicated with the bot from their home. Participants were instructed to engage in the interaction using a computer rather than a mobile device.

This study was approved by the University Ethics Committee (Institutional Review Board Approval Number: 2024-67 YVC EMEK). Participants were fully informed of the study’s aims, procedures, and the right to withdraw at any stage without any repercussions. Participants were assured of the confidentiality and anonymity of their data. They were instructed not to enter any personal information during interactions with the bot.

All conversations were recorded for research purposes with the participants’ informed consent. A full demonstration of one conversation is presented in Appendix A. The data were securely stored and managed by the current research team to ensure the participants’ privacy and confidentiality. The documentation of these interactions provides valuable insights into the assessment process. It enables an in-depth analysis of the MHPs’ interviewing techniques, risk evaluation strategies, and the overall effectiveness of AI-powered simulation for training MHPs.

The simulator is powered by GPT-4o (OpenAI LTD), an advanced LLM that demonstrates sophisticated and multimodal capabilities in “understanding” and generating text, recognizing, and producing speech tones, and offering improved speed for enhanced user experience. Access to this simulation system is facilitated through API technology, enabling seamless integration with external platforms. The AI bot was deployed using an innovative application––PMFM AI––which specializes in making AI-powered chatbots accessible to the public via API technology.

### Data analysis

Descriptive statistics were used to describe the participants’ demographic baseline data, followed by a series of paired *t*-tests to compare the MHPs’ responses before and after the intervention on the study measures. Pearson correlations were calculated to examine the relationships between the background data and the dependent variables. The criterion for determining statistical significance throughout the study was set at *p* < .05. Qualitative data were analyzed using content analysis.

## Results

### **Information on the study sample**

Participants included 43 mental health professionals with diverse backgrounds and experience levels. The participants’ ages ranged from 26 to 72 (*M*age = 45.44, *SD* = 10.43). The sample’s gender distribution was mostly female, with 35 women and eight men. Regarding professional backgrounds, most participants (*n* = 27, 62.8%) were psychologists at various stages of their careers, ranging from those pursuing graduate studies to experienced supervisors. The sample also included nine social workers, one psychiatrist, and several other mental health professionals, such as a clinical criminologist, a mental health nurse, and emotional therapists. The professional status of the participants varied: 18 were experts in their field, 12 were supervisors, nine were interns, two were practicum students, and two had relevant experience (e.g., volunteering in a crisis intervention center). The participants had a wide range of professional experience in mental health, averaging 13.81 years (*SD* = 10.06), with some having as little as two years of experience and others having up to 45 years in the field. Most participants (*n* = 41, 95.34%) reported having conducted suicide risk assessments in the past. All participants indicated that they had worked with patients exhibiting suicidal characteristics, including suicidal ideation, self- injury, suicidal behavior, or past suicide attempts. Regarding prior training in suicide prevention, most participants (*n* = 29, 67.44%) had undergone formal training in suicide prevention (more than 10 hours). Seven participants reported having attended a lecture on the subject, while three had not received any specific training. Some participants mentioned receiving training through academic courses, military service, or as part of their professional development.

### Impact of the intervention on suicide risk assessment self-efficacy

A paired-sample *t*-test was conducted to evaluate the impact of the intervention on participants’ self-efficacy in conducting suicide risk assessments. As shown in Figure 2, self-efficacy scores on the 11-point scale increased significantly from pre- intervention (*M* = 6.0, *SD* = 2.4) to post-intervention (*M* = 6.4, *SD* = 2.1), *t*(42) = - 1.96, *p* = .027 (one-tailed). This finding indicates that the participants believed they were more capable of conducting suicide risk assessments following the intervention.

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/17/2024.07.17.24310579/F2.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2024/07/17/2024.07.17.24310579/F2)

Figure 2: 
Self-efficacy in performing suicide risk assessment pre- and post-AI simulator intervention (*N* = 43) *Note.* This graph illustrates the self-efficacy of the participants in performing suicide risk assessments before and after using the AI simulator. The pink bar represents the mean pre-intervention self-efficacy score (Pre), whereas the yellow bar represents the mean post-intervention self-efficacy score (Post).*.05

### Impact of the intervention on willingness to treat

A paired-sample *t-*test was conducted to evaluate the impact of the intervention on participants’ willingness to treat patients at risk for suicide. Willingness-to-treat scores increased slightly from pre-intervention (4.76±2.64) to post-intervention (5.00±2.50), but the difference did not achieve significance, *t*(42) = -1.148, *p* = .257. The mean increase in willingness-to-treat scores was 0.24, with a 95% confidence interval ranging from .67 to .19.

### Correlation Analysis

Pre-intervention suicide risk assessment self-efficacy and willingness to treat were positively correlated (*r* = .71, *p* < .001). MHPs’ experience (in years) was positively correlated with pre-intervention self-efficacy (*r* = 0.331, *p* =.03) but not with pre- intervention willingness to treat.

### Simulator Usage Experience

As illustrated in Figure 3, most participants reported positive learning outcomes from using the simulator. For the question regarding to what extent the simulator helped in conducting future assessments, the mean score was 6.19 ± 2.26. When asked how much they felt they learned from the experience, the mean score was 6.40 ± 2.11. Regarding how helpful the feedback was for future assessments, participants reported a mean score of 6.74 ± 1.94. Lastly, for the recommendation to other practitioners, the mean score was 7.63 ± 2.27. These results suggest that participants generally found the simulator beneficial and were likely to recommend its use to others.

![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/17/2024.07.17.24310579/F3.medium.gif)

[Figure 3:](http://medrxiv.org/content/early/2024/07/17/2024.07.17.24310579/F3)

Figure 3: 
shows the distribution of participants’ responses to four questions regarding their experiences with the AI simulator: How much did the simulator experience help you conduct future suicide risk assessments (blue)? How much did you feel that you learned from your experience (green)? How much feedback do you feel would help you in future assessments (red)? (Would you recommend this AI-based simulation experience to other practitioners before they conduct suicide risk assessments (purple)?

Each histogram illustrates the frequency of the scores given by the participants, with red lines marking the means and standard deviations.

### Qualitative Analysis of Feedback on the Simulator

The participants’ qualitative feedback provided rich insights into their experiences with the AI-based simulator. This section summarizes their responses to four key questions, highlighting both the positive aspects and areas for improvement, along with representative quotes.

1. How was your experience of learning using an AI-based simulator?

Participants generally found the experience of learning with an AI-based simulator engaging and insightful. The novelty and interactivity of the tool were particularly noted, with many participants expressing a sense of curiosity and appreciation for the technology.

*“I felt like I was having a real conversation with a real person.”*

*“This was my first experience with AI, so it felt strange, especially with the Webinar in the background. The tool is good, and if used when I’m more focused, it could be very beneficial.”*

*“Very interesting! Thank you… If possible, I would like to receive my feedback via email.”*

2. “It was very interesting and challenging, but much more user-friendly than I imagined.”What would you suggest improving?

The participants offered several constructive suggestions for improving the AI simulator. A common theme was the desire for more natural and interactive communication as well as more detailed and relevant feedback.

“It would be more successful if it were a conversation with a character instead of writing.”

“Turn the writing into a conversation with a character.”

“Provide detailed examples of important questions to ask if they were not covered.”

“Define the session setting before starting the simulation.”

“Add more time for the interaction so that we can explore all the necessary questions.”

3. What advantages do you see in such training?

The AI simulator was recognized by the participants for its significant advantages, particularly in providing a safe environment for practice and learning. The participants appreciated the opportunity to develop their skills without the risk of working with actual patients.

“It allows you to practice without real patients, so when the time comes, we will be better prepared and able to assess more effectively.”

“Huge advantages… This is an interesting and gamified way to get professionals to experience something that usually intimidates them.”

“It allows for practice and builds confidence without the fear of making mistakes on real patients.”

“The immediate feedback helps to reinforce learning and correct mistakes promptly.”

4. What are the risks and concerns associated with such training?

Despite the overall positive reception, the participants expressed concerns regarding the reliability and completeness of AI-based training. There were apprehensions about over-reliance on technology and the potential impact on therapist confidence.

“It’s still hard for me to trust AI, so there’s a concern about training that is done this way and not supervised.”

“The score could actually harm therapists’ confidence and willingness to ‘jump into the water’ even at the cost of less successful initial assessments.”

“It cannot replace current training. The tool is good, but it needs to be emphasized that it cannot replace real training.”

“There is a risk of missing nonverbal cues, which are crucial in real-life assessments.”

## Discussion

In this study, we investigated the suicide risk assessment self-efficacy of mental health professionals (MHPs) and their willingness to treat patients with suicidal tendencies before and after using an AI-based simulator designed for suicide risk assessment practice. The preliminary result of the study provides promising insights into the potential of AI-based simulators for training mental health professionals in suicide risk assessment. Critically, whereas the results indicate significant benefits, they should be interpreted cautiously due to the study’s limited scope, small sample size, and lack of a control group.

Our findings indicate a significant increase in participants’ ‘suicide risk assessment self-efficacy following their interaction with the AI-based simulator. Additionally, a slight but insignificant increase in the participants’ willingness to treat suicide-risk patients was observed. The MHPs reported positive experiences with the simulator, learning value, and feedback usefulness. A further indication of the perceived value of the tool was the participants’ high likelihood of recommending the simulator to colleagues.

The qualitative findings of this study are noteworthy. MHPs described the experience as realistic, and many reported that interacting with the AI character felt real. Furthermore, they appreciated the opportunity to converse with the character and expressed a desire to interact through voice rather than text. This feedback is particularly noteworthy, as psychologists are sometimes hesitant to adopt modern technologies. Thus, the positive response of this innovative form of training highlights its potential. However, the participants raised some concerns, particularly regarding privacy during the interaction, as well as the prospect of the simulator giving negative feedback that could negatively impact the trainee’s confidence.

These results align with and extend recent research demonstrating AI’s capabilities in mental health contexts. Studies have shown that large language models can accurately assess suicide risk,24 adapt assessments to different cultural contexts,35 and evaluate responsible reporting of suicide-related content.22 Our study took this line of research further by exploring the potential of AI in professional training, addressing a critical need in mental health education, specifically in the area of suicide prevention.

The potential impact of AI-based simulators on mental health training is significant. Traditional training methods often struggle with accessibility, lack of practical hands-on experience, and limited opportunities for personalized feedback.41 Thus, even after such training, MHPs remain with a low willingness to treat suicidal patients and with low self-efficacy in suicide risk assessment (e.g.8). AI simulators can address some of these challenges in skill development by offering a safe, accessible, and interactive environment. This, in turn, can help boost self-efficacy and the willingness to treat patients at risk of suicide. This path aligns with calls for innovative approaches to mental health training, particularly suicide prevention.42

The introduction of AI-based simulators is reflective of a change in basic assumptions in psychotherapy. It moves training away from a primarily theoretical approach toward a more experiential, practice-oriented model. This shift aligns with modern educational theories that highlight active learning and situated cognition. AI simulators can better prepare professionals for the complexities and unpredictability of suicidal behavior by exposing trainees to a wide range of scenarios and patient profiles.6,7

The study has several limitations that should be acknowledged. The small sample size, employing only two scenarios, the lack of a control group, and especially, only a single post-intervention measurement point limits the generalizability of our findings. The study’s execution during an online seminar may have affected participants’ engagement. Furthermore, while AI simulations offer many benefits, they cannot fully replicate the nuances of human interaction, particularly nonverbal cues, and complex emotional dynamics. Most of this study focused on only a single text-based interaction with a character. Future studies or training should encompass multiple interactions with various characters to better reflect the diversity of real-life situations and to provide a more comprehensive training experience.

Future research should address these limitations through more robust study designs and larger and more diverse samples. Longitudinal studies examining the long-term impact of AI-based training on clinical skills and patient outcomes would provide valuable insight. Additionally, exploring how AI simulations can best complement existing training methods and adapt them to diverse cultural contexts are important goals for future research. Experts could validate the vignettes used in the simulations so that continuous practice with a variety of characteristics could gradually improve the accuracy of risk assessment against expert criteria.

Ethical considerations must be addressed when considering the potential of AI in mental health training. These include ensuring privacy and data security, mitigating potential biases in AI systems, and guarding against overreliance on technology at the expense of human judgment and empathy. While the current article highlights the potential of AI-based feedback, it also underscores the risks and challenges that will emerge as training and supervision rely increasingly on AI systems. Contrary to common perception, AI does not offer an objective mirror of ourselves but rather a reflection imbued with gender, cultural, and economic biases.43 Recognizing these biases is crucial as we integrate AI into mental health training. Failure to do so can lead to skewed training outcomes and perpetuate existing disparities in mental healthcare. By being aware of these biases, we can better integrate AI to support and enrich the human aspects of mental health training, thus ensuring more equitable and effective outcomes. In conclusion, this pilot study represents an initial step toward exploring the potential of AI-based simulators in mental health training, specifically for suicide risk assessment. These promising findings notwithstanding, they should be viewed as a foundation for future research rather than as definitive evidence. The integration of AI simulations into mental health training raises complex questions regarding the nature of professional education and the future of psychotherapy training.

## Conclusions and Implications

The scalability of AI-based training offers the potential to address the global challenge of suicide prevention more effectively. With approximately 817,000 individuals dying by suicide annually,1 there is an urgent need to train professionals capable of conducting effective risk assessments. The results of this study may suggest that AI simulators can democratize access to high-quality training, reaching professionals in areas where traditional resources are scarce.

The potential of AI to revolutionize mental health training is significant, but realizing this potential will require ongoing collaboration between mental health professionals, educators, AI researchers, and ethicists. By integrating AI-based simulations into comprehensive training programs, we may be able to better prepare mental health professionals to meet the complex challenges of suicide risk assessment and intervention, contributing to improved patient care and outcomes globally.

As we move forward, it is vital to maintain a balance between optimism regarding the potential of this technology and a critical, evidence-based approach to its development and implementation. AI-based simulators should be seen as a powerful tool to enhance, rather than replace, human expertise in mental health care. With careful development and application, these tools could play a crucial role in addressing the pressing global need for effective suicide prevention strategies.

## Data Availability

All data produced in the present study are available upon reasonable request to the authors 

## Appendix A: Documentation of Sarah’s Risk Assessment Simulator by a User (Ilan, pseudonym)

View this table:
[Table1](http://medrxiv.org/content/early/2024/07/17/2024.07.17.24310579/T1)

## Footnotes

*   Financial and Material Support This research was supported by

*   Declaration of Relevant Financial Interests The authors declare the following potential

*   conflicts of interest: ’None’.

*   Received July 17, 2024.
*   Revision received July 17, 2024.
*   Accepted July 17, 2024.


*   © 2024, Posted by Cold Spring Harbor Laboratory

The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1.  1.Rogers JP, Chesney E, Oliver D, et al. Suicide, self-harm and thoughts of suicide or self-harm in infectious disease epidemics: a systematic review and meta-analysis. Epidemiol Psychiatr Sci. 2021;30:e32. doi:10.1017/S2045796021000214
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1017/S2045796021000214&link_type=DOI) 

2.  2.Ryan EP, Oquendo MA. Suicide Risk Assessment and Prevention: Challenges and Opportunities. 101176/appi.focus20200011. 2020;18(2):88-99. doi:10.1176/APPI.FOCUS.20200011
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1176/APPI.FOCUS.20200011&link_type=DOI) 

3.  3.Fowler JC. Suicide risk assessment in clinical practice: Pragmatic guidelines for imperfect assessments. Psychotherapy. 2012;49(1):81–90. doi:10.1037/a0026148
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1037/a0026148&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22369082&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F07%2F17%2F2024.07.17.24310579.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000301163500013&link_type=ISI) 

4.  4.Posner K, Brown GK, Stanley B, et al. The Columbia-Suicide Severity Rating Scale: initial validity and internal consistency findings from three multisite studies with adolescents and adults. Am J Psychiatry. 2011;168(12):1266–1277. oi:10.1176/appi.ajp.2011.10111704
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1176/appi.ajp.2011.10111704&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22193671&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F07%2F17%2F2024.07.17.24310579.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000297947400009&link_type=ISI) 

5.  5.Turecki G, Brent D. Suicide and suicidal behaviour. Lancet. 2016;387(10024):1227-1239. doi:10.1016/S0140-6736(15)00234-2
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0140-6736(15)00234-2&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26385066&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F07%2F17%2F2024.07.17.24310579.atom) 

6.  6.Franklin JC, Ribeiro JD, Fox KR, et al. Risk factors for suicidal thoughts and behaviors: a meta-analysis of 50 years of research. Psychol Bull. 2017;143(2). doi:10.1037/bul0000084
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1037/bul0000084&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F07%2F17%2F2024.07.17.24310579.atom) 

7.  7.McHugh CM, Chun Lee RS, Hermens DF, et al. Impulsivity in the self-harm and suicidal behavior of young people: A systematic review and meta-analysis. J Psychiatr Res. 2019;116. doi:10.1016/j.jpsychires.2019.05.012
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jpsychires.2019.05.012&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F07%2F17%2F2024.07.17.24310579.atom) 

8.  8.Levi-Belz Y, Barzilay S, Levy D, et al. To treat or not to treat: the effect of hypothetical patients’ suicidal severity on therapists’ willingness to treat. Arch Suicide Res. 2020;24(3):355–366. doi:10.1080/13811118.2019.1632233
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/13811118.2019.1632233&link_type=DOI) 

9.  9.Aharon ON, Aisenberg-Shafran D, Levi-Belz Y. The effect of suicide severity and patient’s age on mental health professionals’ willingness to treat: the moderating effect of ageism. Death Stud. 2024;48(7):652–662. doi:10.1080/07481187.2023.2255854
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/07481187.2023.2255854&link_type=DOI) 

10. 10.Wu T, He S, Liu J, et al. A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA Journal of Automatica Sinica. 2023;10(5):1122–1136. doi:10.1109/JAS.2023.123618
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1109/JAS.2023.123618&link_type=DOI) 

11. 11.Biswas S. Role of ChatGPT in Computer Programming. Mesopotamian Journal of Computer Science. Published online 2023. doi:10.58496/mjcsc/2023/002
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.58496/mjcsc/2023/002&link_type=DOI) 

12. 12.Haber Y, Levkovich I, Hadar-Shoval D, et al. The artificial third: a broad view of the effects of introducing generative artificial intelligence on psychotherapy (preprint). JMIR Ment Health. Published online November 22, 2023. doi:10.2196/54781
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2196/54781&link_type=DOI) 

13. 13.Grossman MR, Grimm PW, Brown DG, et al. The GPTJudge: justice in a Generative AI world. ACM Comput Surv. 2023;55(12).
    
    
14. 14.Macey-Dare R. How ChatGPT and Generative AI systems will revolutionize legal services and the legal profession. SSRN Electronic Journal. Published online 2023. doi:10.2139/ssrn.4366749
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2139/ssrn.4366749&link_type=DOI) 

15. 15.Elyoseph Z, Hadar-Shoval D, Asraf K, Lvovsky M. ChatGPT outperforms humans in emotional awareness evaluations. Front Psychol. 2023;14:1199058. doi:10.3389/fpsyg.2023.1199058
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3389/fpsyg.2023.1199058&link_type=DOI) 

16. 16.Hadar-Shoval D, Elyoseph Z, Lvovsky M. The plasticity of ChatGPT’s mentalizing abilities: personalization for personality structures. Front Psychiatry. 2023;14:1234397. doi:10.3389/fpsyt.2023.1234397
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3389/fpsyt.2023.1234397&link_type=DOI) 

17. 17.Elyoseph Z, Levkovich I, Shinan-Altman S. Assessing prognosis in depression: comparing perspectives of AI models, mental health professionals and the general public. Fam Med Community Health. 2024;12(Suppl 1). doi:10.1136/fmch-2023-002583
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiZm1jaCI7czo1OiJyZXNpZCI7czoxODoiMTIvU3VwcGxfMS9lMDAyNTgzIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMDcvMTcvMjAyNC4wNy4xNy4yNDMxMDU3OS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

18. 18.Elyoseph Z, Hadar Shoval D, Levkovich I. Beyond personhood: ethical paradigms in the generative artificial intelligence era. American Journal of Bioethics. 2024;24(1):57–59. doi:10.1080/15265161.2023.2278546
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/15265161.2023.2278546&link_type=DOI) 

19. 19.Hadar-Shoval D, Asraf K, Mizrachi Y, et al. Assessing the alignment of large language models with human values for mental health integration: cross- sectional study using Schwartz’s theory of basic values. JMIR Ment Health. 2024;11(1):e55988. doi:10.2196/55988
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2196/55988&link_type=DOI) 

20. 20.Elyoseph Z, Refoua E, Asraf K, et al. Capacity of Generative AI to interpret human emotions from visual and textual data: pilot evaluation study. JMIR Ment Health. 2024;11(1). doi:10.2196/54369
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2196/54369&link_type=DOI) 

21. 21.Elyoseph Z, Gur T, Haber Y, et al. An ethical perspective on the democratization of mental health with generative artificial intelligence. JMIR Ment Health. Published online 2024. Accessed May 18, 2024. [https://www.researchgate.net/publication/378737076\_An\_Ethical\_Perspective\_on\_The\_Democratization\_of\_Mental\_Health\_with\_Generative\_Artificial\_Intelligence](https://www.researchgate.net/publication/378737076\_An\_Ethical\_Perspective\_on\_The\_Democratization\_of\_Mental\_Health\_with_Generative_Artificial_Intelligence)
    
    
22. 22.Elyoseph Z, Levkovich I, Rabin E, et al. Applying language models for suicide prevention: evaluating news article adherence to WHO reporting guidelines. Mental Health Research. Published online April 2, 2024. doi:10.21203/RS.3.RS-4180591/V1
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.21203/RS.3.RS-4180591/V1&link_type=DOI) 

23. 23.Levkovich I, Elyoseph Z. Identifying depression and its determinants upon initiating treatment: ChatGPT versus primary care physicians. Fam Med Community Health. 2023;11(4):2391. doi:10.1136/FMCH-2023-002391
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1136/FMCH-2023-002391&link_type=DOI) 

24. 24.Levkovich I, Elyoseph Z. Suicide risk assessments through the eyes of ChatGPT-3.5 Versus ChatGPT-4: Vignette Study. JMIR Ment Health. 2023;10(1):e51232. doi:10.2196/51232
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2196/51232&link_type=DOI) 

25. 25.Elyoseph Z, Levkovich I. Comparing the perspectives of Generative AI, mental health experts, and the general public on schizophrenia recovery: case vignette study. JMIR Ment Health. 2024;11(1). doi:10.2196/53043
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2196/53043&link_type=DOI) 

26. 26.Sorin V, Brin D, Barash Y, et al. Large language models (LLMs) and empathy – a systematic review. medRxiv. Published online 2023. doi: 10.1101/2023.08.07.23293769
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoibWVkcnhpdiI7czo1OiJyZXNpZCI7czoyMToiMjAyMy4wOC4wNy4yMzI5Mzc2OXYxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMDcvMTcvMjAyNC4wNy4xNy4yNDMxMDU3OS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

27. 27.Omar M, Levkovich I. Exploring the efficacy and potential of large language models for depression: a systematic review. medRxiv. Published online May 7, 2024:2024.05.07.24306897. doi:10.1101/2024.05.07.24306897
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoibWVkcnhpdiI7czo1OiJyZXNpZCI7czoyMToiMjAyNC4wNS4wNy4yNDMwNjg5N3YxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMDcvMTcvMjAyNC4wNy4xNy4yNDMxMDU3OS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

28. 28.Tal A, Elyoseph Z, Haber Y, et al. The artificial third: utilizing ChatGPT in mental health. Am J Bioeth. 2023;23(10):74–77. doi:10.1080/15265161.2023.2250297
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/15265161.2023.2250297&link_type=DOI) 

29. 29.Cohen IG. What should ChatGPT mean for bioethics? Am J Bioeth. 2023;23(10). doi:10.1080/15265161.2023.2233357
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/15265161.2023.2233357&link_type=DOI) 

30. 30.Alabed A, Javornik A, Gregory-Smith D, et al. More than just a chat: a taxonomy of consumers’ relationships with conversational AI agents and their well-being implications. Eur J Mark. 2024;58(2). doi:10.1108/EJM-01-2023-0037
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1108/EJM-01-2023-0037&link_type=DOI) 

31. 31.Elyoseph Z, Levkovich I. Beyond human expertise: the promise and limitations of ChatGPT in suicide risk assessment. Front Psychiatry. 2023;14. doi:10.3389/fpsyt.2023.1213141
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3389/fpsyt.2023.1213141&link_type=DOI) 

32. 32.Arendt F, Till B, Voracek M, et al. ChatGPT, artificial intelligence, and suicide prevention. Crisis. 2023;44(5). doi:10.1027/0227-5910/a000915
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1027/0227-5910/a000915&link_type=DOI) 

33. 33.Arendt F, Till B, Voracek M, et al. ChatGPT, artificial intelligence, and suicide prevention: a call for a targeted and concerted research effort. Crisis. 2023;44(5). doi:10.1027/0227-5910/a000915
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1027/0227-5910/a000915&link_type=DOI) 

34. 34.Ghanadian H, Nejadgholi I, Al Osman H. ChatGPT for suicide risk assessment on social media: quantitative evaluation of model performance, potentials and limitations. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics. ; 2023. doi:10.18653/v1/2023.wassa-1.16
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.18653/v1/2023.wassa-1.16&link_type=DOI) 

35. 35.Levkovich I, Shinan-Altman S, Elyoseph Z. Can large language models be sensitive to culture suicide risk assessment? Published online March 13, 2024. doi:10.21203/RS.3.RS-4066705/V1
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.21203/RS.3.RS-4066705/V1&link_type=DOI) 

36. 36.Maurya RK. A qualitative content analysis of ChatGPT’s client simulation role-play for practising counselling skills. Couns Psychother Res. 2024;24(2):614–630. doi:10.1002/CAPR.12699
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/CAPR.12699&link_type=DOI) 

37. 37.Maurya RK. Using AI based Chatbot ChatGPT for practicing counseling skills through role-play. J Creat Ment Health. Published online December 31, 2023. doi:10.1080/15401383.2023.2297857
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/15401383.2023.2297857&link_type=DOI) 

38. 38.Haber Y, Levkovich I, Tzafrir I, et al. CanvasHero: The role of artificial intelligence in cultivating resilience among children and youth using the 6-part story method in mass war trauma. Under review.
    
    
39. 39.Haber Y, Hadar Shoval D, Levkovich I, et al. The Externalization of Internal Experiences in Psychotherapy through Generative Artiûcial Intelligence: A Theoretical, Clinical, and Ethical Analysis.; 2024. Under review.
    
    
40. 40.Interian A, Chesin M, Kline A, et al. Use of the Columbia-Suicide Severity Rating Scale (C-SSRS) to classify suicidal behaviors. Archives of Suicide Research. 2018;22(2). doi:10.1080/13811118.2017.1334610
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/13811118.2017.1334610&link_type=DOI) 

41. 41.Holmes W, Porayska-Pomsta K, Holstein K, et al. Ethics of AI in education: towards a community-wide framework. Int J Artif Intell Educ. 2022;32(3):504–526. doi:10.1007/S40593-021-00239-1/FIGURES/1
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/S40593-021-00239-1/FIGURES/1&link_type=DOI) 

42. 42. McKoin Owens M, Zickafoose A, Wingenbach G, et al. Selected Texan K-12 educators’ perceptions of youth suicide prevention training. Int J Environ Res Public Health. 2022;19(19). doi:10.3390/ijerph191912625
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3390/ijerph191912625&link_type=DOI) 

43. 43.Vallor S. The AI Mirror: How to Reclaim our Humanity in an Age of Machine Thinking. Oxford University Press; 2022. doi:10.1145/3514094.3539567
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1145/3514094.3539567&link_type=DOI)