Uncovering the linguistic characteristics of psychotherapy: a computational approach to measure therapist language timing, responsiveness, and consistency

Adam S Miner; Scott L Fleming; Albert Haque; Jason A Fries; Tim Althoff; Denise E Wilfley; W. Stewart Agras; Arnold Milstein; Jeff Hancock; Steven M Ash; Shannon Wiltsey Stirman; Bruce A. Arnow; Nigam H. Shah

doi:10.1101/2022.04.24.22274227

Abstract

Although individual psychotherapy is generally effective for a range of mental health conditions, little is known about the moment-to-moment language use of effective therapists. Increased access to computational power, coupled with a rise in computer-mediated communication (telehealth), makes feasible the large-scale analyses of language use during psychotherapy. Transparent methodological approaches are lacking, however. Here we present novel methods to increase the efficiency of efforts to examine language use in psychotherapy. We evaluate three important aspects of therapist language use - timing, responsiveness, and consistency - across five clinically relevant language domains: pronouns, time orientation, emotional polarity, therapist tactics, and paralinguistic style. We find therapist language is dynamic within sessions, responds to patient language, and relates to patient symptom diagnosis but not symptom severity. Our results demonstrate that analyzing therapist language at scale is feasible and may help answer longstanding questions about specific behaviors of effective therapists.

Introduction

Individual psychotherapy is an effective treatment for a wide range of mental health conditions^1,2.Two problems that have emerged in research on outcomes from psychotherapy are that: 1) for some disorders, there is little evidence that one specific form of psychotherapy is superior to another even when hypothesized change mechanisms differ³, and 2) some therapists consistently achieve better outcomes than others (i.e., therapist effects), but it is unclear what individual therapists may be doing that accounts for these effects^4–7. Studies of the psychotherapy process attempt to understand what happens during the sessions that may explain patient improvement⁸. The chief method used since the 1950s to evaluate therapist behavior in therapy sessions is to have trained humans identify clinically meaningful therapist utterances in transcripts, and draw conclusions based on observed patterns^9–12. Although useful, relying solely on human inspection of transcripts is not likely to meet demands for improved reproducibility and scalability in psychotherapy process research^11,13–17.

Computational approaches using natural language processing offer the potential to move past human limits of attention and reproducibility^11,18–22. Improvements in computational power, the growing ease of recording and transcribing therapy sessions, and a shift to computer-mediated communication in healthcare (i.e., telehealth) make this feasible^11,14,23,24. Early work is promising, but does not yet translate to best practices for improved patient outcomes or provide a clear direction for therapist training^{11,19,25–28}. Methodological improvements are needed to bridge divisions between theoretical schools of thought (e.g., Cognitive Behavioral, Interpersonal, Psychodynamic, Counseling) as to which therapist language patterns correlate with favorable therapy outcomes^{4,6,13,15,20,24,29,30}. If known, the linguistic behavior of successful therapists may inform targeted clinical trials to test causality and implementability, subsequently improving clinician training.

A fundamental tenet of psychotherapy is that therapists expose patients to language that may be helpful (e.g., emotional validation) and avoid language that may be harmful (e.g., shaming). Therapist language should be well timed and appropriate for the specific moment. Nevertheless, the specific timing, frequency, and reactivity of therapist utterances is difficult to scrutinize systematically without human inspection¹³. Difficulties are multifaceted, with key limitations being theoretical (i.e., disagreement about mechanisms of change), technical (i.e., lack of validated tools for language measurement), and practical (i.e., lack of clinically meaningful datasets). This work primarily addresses the technical limitations of language analysis in psychotherapy. Here we present a three-phase approach that measures therapist language by building on prior theoretical, methodological, and clinical insights: Phase 1 - To identify a priori language features of interest, we generate a non-exhaustive list of clinically relevant language features. Phase 2 - To observe the natural occurrence of language features identified in Phase 1, we describe the underlying structure of therapy focusing on timing, responsiveness, and consistency. Phase 3 - To demonstrate the potential for clinical utility, we evaluate the relationship between therapist language and patient symptom severity and diagnosis.

Many forms of therapy exist, along with an abundance of theoretically and practically motivated therapist approaches. Thus, we suggest a reasonable but non-exhaustive list of domain-focused concepts that balance face-validity and technical implementability using modern linguistic and statistical approaches. We posit, based on prior research and clinical judgment, that five clusters of language features may be clinically important across theoretical orientations, meriting close inspection. We limit our focus to characterizations of human language most amenable to machine learning, and that may correlate with favorable patient improvement. We acknowledge that other modern sensing technologies will allow for more rich characterization of human interaction such as facial expressions, body movement, and voice tone that may also be related to therapy outcomes³¹.

The five feature clusters we seek to describe are: pronouns, time orientation, emotional polarity, therapist tactics, and paralinguistic style. Pronouns (e.g., I, me, you, them) reflect internal psychological attention^25,32,33. Measuring the relative frequency of self-focused pronouns (i.e., I, me, my) and other-focused pronouns (e.g., you, your, they) has demonstrated theoretical and practical value in psychological research^32,34,35. Time orientation is a longstanding focus of psychotherapy. Some theoretical orientations advise therapists to focus on past experiences (e.g., early childhood), while some encourage focus on the present^36–39. Emotions are important in most clinical psychology theoretical orientations^{36,37,40–42}. There is strong disagreement, however, on how to represent and measure polarity and emotionality in clinical contexts^43–45. Therapist tactics are used to help develop a therapeutic relationship and engender patient change, including statements that demonstrate understanding^4,11. Paralinguistics refers to the way words are said, not the words themselves, for example, rate of speech^46,47. Based on prior work, these language-focused constructs are theoretically important, but poorly measured in psychotherapy. Although a full review of the theoretical importance and practical application of these clusters is beyond the scope of this work, we briefly summarize each feature in our Methods (Phase 1: Feature generation).

Uncovering modifiable, therapist-focused interventions that are associated with patient improvement is a key objective of therapy process research^4,8,13,15. Our approach presents a systematic way to generate or evaluate hypotheses about psychotherapy process at scale. This study identifies potentially modifiable features of interest in psychotherapy (Phase 1), measures feature timing, responsiveness, and consistency (Phase 2), tests clinical usefulness (Phase 3), and shares methods to encourage critical peer review and collaboration.

Results

Overall, our results surface linguistic nuance in psychotherapy that previously has not been directly measured. Therapist language timing is dynamic (Figure 1) and does not mirror the frequency of patient language consistently (Figure 2). Therapist language appears to be responsive to patient language for a number of clinically relevant language features (Figures 3, 4). For example, Figures 3 and 4 show that therapists decreased their rate of speech, as measured by words per second, in response to increases in the patient’s rate of speech, or vice versa (i.e., therapists significantly slowed their speech as patients increased theirs). Therapist language appears consistent across sessions: on average, within-therapist language patterns were significantly more similar than between-therapist language patterns. In relation to patient-focused characteristics, therapist language appears to be related to patient diagnosis: logistic regression models trained to classify diagnosis based on therapist language patterns performed significantly better than chance.

Figure 1. Therapist speech phase-dependence

The dynamic nature of therapist speech, grouped by language feature category. Figure 1 represents trends in therapist language over time after aggregating across therapists. LIWC = Linguistic Inquiry and Word Count, a dictionary-based lexicon that maps words and word stems to psychologically relevant categories. EmoLex = Word-Emotion Association Lexicon, a list of English words mapped to crowdsourced sentiment annotations. We performed smoothing/interpolation between discrete points at the level of temporal quintiles using a natural cubic spline. See Figure 2 for per-feature examples of these trends viewed without smoothing.

Figure 2. Therapist and patient language within-session changes

Quantitative assessment of changes in therapist language features over time, as well as within-quintile differences between patient and therapist language. All differences annotated asterisks (*) are significant at level α=0.05 after controlling for multiple hypothesis tests via the Benjamini-Hochberg procedure.

p-value annotation:

Non-significant (ns): 0.01 < p ≤ 1.0; *: 0.01 < p ≤ 0.05; **: 0.001 < p ≤ 0.01; ***: 0.0001 < p ≤ 0.001; ****: p ≤ 0.0001

Figure 3. Therapist responsiveness patterns at the level of individual sessions

Illustration of significant directional associations between patient language and therapist language in four sessions, each representing a unique patient-therapist dyad. Language features are colored by feature group (see Table 2). Edges are colored according to the average partial correlation coefficient. Figure 3a illustrates an example of one patient-therapist dyad in which there was just one significant association: increases in patient rate of speech, as measured in words per second, were associated with decreases in therapist rate of speech, and vice versa. Figure 3d highlights a patient-therapist dyad with varied significant associations: increased patient use of third-person plural pronouns (‘“They” Pronouns’) drove increased therapist use of third-person plural pronouns (‘“They” Pronouns’), increased use of positive language by the patient (“Positive”) was associated with increased use of checking for understanding phrases by the therapist (“Checking for Understanding”), etc. These are four of the 73 network diagrams produced, one for each session/patient-therapist dyad.

Figure 4. Therapist responsiveness patterns aggregated over all sessions

The number of times a particular type of association between patient language features and subsequent/accommodating therapist language features was found, across all sessions. Patient language features are on the left, therapist language features on the right. For the purposes of illustration, only associations that were found in at least 4 patient-therapist dyads are displayed (see Supplementary Figure 2 for a similar plot containing all significant associations). There were 72 such associations from 43 unique patient-therapist dyads, of which 24 involved changes in the patient’s rate of speech (“Words per Second”). Language features are colored by feature group (see Table 2). Edges are colored according to the average partial correlation coefficient amongst all patient-therapist dyads in which that association was found. For example, 12 patient-therapist dyads exhibited a significant negative association between patient rate of speech and therapist rate of speech, such that increases in the patient’s words per second (“Words per Second”) were associated with subsequent decreases in the therapist’s words per second (“Words per Second”) and/or vice versa (i.e., decreases in the patient’s words per second were associated with subsequent increases in the therapist’s words per second).

Study population

Therapy transcripts were created per protocol as part of a secondary analysis of a previously completed randomized controlled trial, conducted in the United States across 24 college counseling clinics from April 2013 and December 2016^14,48. See [Miner et al., 2020]¹⁴ for details on transcription and sample selection. The demographic information of a subset of therapist-patient dyads, and their clinical information (diagnosis and symptom severity) is presented in Table 1. Patients were predominantly female (87%) and in their early 20s (median age, 21 years). Therapists were predominantly female (78%), and in their early 40s (median age, 41 years). Patient depressive symptom severity was mostly minimal to mild.

View this table:

Table 1.

Clinician and patient demographic information

View this table:

Table 2.

Summary of language features

Therapist timing is dynamic

Here we evaluate therapist language timing. Therapists appear to use distinct types of language at specific points in the session (early vs. late feature frequency). Figure 1 presents normalized frequency over time of therapist language features. Supplementary Figure 1 shows individual therapists as examples. Figure 2 presents differences between therapist and patient language features over time for a subset of features.

Therapist speech changed significantly between the start and the end of the session. As illustrated in Figure 1 and Supplementary Table 1, relative to the first quintile of the session, therapists in the last quintile of the session used a smaller proportion of words with negative emotionality (0.0136 vs. 0.0227, p=3.97 × 10⁻⁷); a greater proportion of present-focused words (0.1697 vs. 0.1271, p=1.30 × 10⁻¹⁵) and future-focused words (0.2084 vs. 0.01314, p=2.46 × 10⁻⁷), but a smaller proportion of past-focused words (0.0231 vs. 0.0416, p=6.87^-11); and a greater proportion of personal pronouns (0.1500 vs. 0.1182, p=3.86 × 10⁻¹⁰), including first-person singular pronouns (0.0415 vs. 0.0238, p=1.93 × 10⁻⁸), first-person plural pronouns (0.0150 vs. 0.0072, p=8.25 × 10⁻⁸), and second-person pronouns (0.0808 vs. 0.0748, p=1.88 × 10⁻²). Additionally, relative to the first quintile of the session, therapists in the final quintile tended to speak for longer durations, measured both in terms of raw seconds per talk turn (7.1615 seconds vs. 4.8952 seconds, p=7.35 × 10⁻⁴) as well as the ratio of therapist-to-patient seconds per talk turn (1.879 vs. 0.938, p=4.95 × 10⁻⁶). While therapists tended to speak longer in each talk turn near the end of the session, they also tended to speak faster relative to the patient, such that the ratio of therapist words per second to patient words per second was higher in the last quintile relative to the first (1.1715 vs. 1.040, p=9.62 × 10⁻³). These results were all significant after controlling the False Discovery Rate at level α=0.05 via the Benjamini-Hochberg procedure.

The aggregate trends in therapist language highlighted above were in some cases also present in patient language, but the starting point and relative alignment (i.e., parallel, convergent, divergent) varied significantly depending on the language feature under consideration. See Figure 2 and Supplementary Table 1 for additional details.

Although therapist language appears dynamic within sessions, patient language does not always follow the same trends. Figure 2 presents therapist-patient within-session language changes organized by quintile. Therapists’ use of negative and past-oriented language decreased significantly over the course of the session (Figures 2a and 2c), while their use of future-oriented language and first-person plural pronouns increased significantly (Figures 2b and 2d). In some cases, patient and therapist language features converged over time (e.g., Figures 2b and 2c: therapists used significantly less future-oriented language and significantly more negative language early in the session relative to patients, but these differences disappeared later in the session). In other cases, patient and therapist language diverged (e.g., Figure 2d: there were no significant differences between patient and therapist use of first-person plural (“We”) pronouns early in the session, but near the end of the session therapists used significantly more first-person plural pronouns than patients). In yet other cases, therapist and patient language differed significantly but seemed neither to converge nor diverge (e.g., Figure 2a: use of past-oriented language).

Therapist speech is responsive

Here we evaluate therapist language responsiveness, specifically the degree to which changes in patient speech patterns are associated with subsequent changes in therapist speech patterns after controlling for potential confounding factors. Out of the 78 sessions we considered, two were excluded because they exhibited non-stationarity after differencing (differencing is a common technique in time series analysis for removing macro-level trends from time series whereby differences between consecutive observations are computed and treated as the primary subject of analysis; see Supplementary Methods for additional details).⁴⁹ Another three were excluded because the patient and/or therapist had one or more language features with zero variance. Across the remaining 73 sessions analyzed, of the 18,688 possible dyad-specific associations between patient and therapist language features (16 language features each for patient and therapist, for 73 dyads) that were tested, 303 (1.6%) were significant after controlling the false discovery rate at level α=0.05. The mean (median) number of significant associations per therapist-patient dyad was 4.2 (3.0), with the minimum number of links in a session being 0, the maximum being 16, and the interquartile range (25th percentile, 75th percentile) being (2, 5). See Supplementary Figure 3 for the distribution of the number of significant links per session. Figure 3 shows directed acyclic graphs illustrating the set of associations for a subset of the therapy sessions. As illustrated in Figure 3, while the exact combinations of significant associations describing each therapist’s accommodation patterns were almost all unique, some forms of accommodation (i.e., the therapist modulating their speech patterns in response to changes in patient speech patterns) were more common than others. The top three most frequent accommodation patterns were as follows: of 78 therapists in the sample, (1) 12 therapists significantly decreased their rate of speech (as measured by words per second) in response to increases in the patient’s rate of speech, or vice versa (mean [SD] partial correlation: -0.24 [0.069]); (2) seven therapists significantly decreased their use of personal pronouns in response to increases in the patient’s rate of speech, or vice versa (mean [SD] partial correlation: -0.27 [0.064]); (3) six therapists significantly altered the frequency with which they used phrases that demonstrate understanding in response to increases/decreases in their patients’ use of third-person plural pronouns, though we note that four therapists increased their use of such phrases in response to increased patient third-person plural pronoun use (or vice versa) while two therapists’ use of such phrases moved in the opposite direction (mean [SD] partial correlation: 0.10 [0.34]). Figure 4 presents the frequency with which certain associations between patient language features and subsequent/accommodating therapist language features appeared, across all sessions (for the sake of readability, only associations represented by at least three dyads are presented - see Supplementary Figure 2 for all associations).

Therapists are consistent between sessions

Here we evaluate therapist language consistency across sessions. The average pairwise correlation of language patterns between therapists in our primary sample, averaged across 3,003 (78 choose 2) distinct pairs of therapists, was -0.012 (95% CI: [-0.0218, -0.0024]), while the average pairwise correlation within therapists (comparing language patterns from two sessions with the same therapist but different patients) was 0.253 (95% CI: [0.1299, 0.3794]) across 20 samples. A t test comparing the two distributions revealed that this difference was significant at level α = 0.05 (t = 4.39, p = 1.15 × 10⁻⁵), suggesting that on average, within-therapist language patterns were significantly more similar than between-therapist language patterns.

Clinical relevance: Diagnoses and symptom severity

Here we evaluate therapist language as it relates to patient diagnosis and symptom severity. Logistic regression models trained to classify diagnosis based on therapist language patterns performed significantly better than chance in terms of accuracy on a held-out evaluation set (72.04% vs. 55.26%), with an average [95% CI] model accuracy improvement over chance (i.e., always guessing the majority class) of 16.78% [5.13%, 28.21%] (p = 0.008). Logistic regression models trained to classify symptom severity also performed better than chance in terms of accuracy (81.97% vs. 74.45%), though the improvement of model accuracy over chance accuracy (7.52%, 95% CI: [-2.56%, 17.95%], p = 0.094) was not significant at level α=0.05.

Discussion

In this work we provide researchers a transparent computational approach for representing, measuring, and analyzing therapist language in psychotherapy without time-consuming human inspection. We apply our approach to directly measure and analyze therapist language - both individually and in aggregate, and at multiple time scales (at the level of entire sessions, session quintiles, and utterances). We examine three clinically relevant but computationally neglected aspects of therapeutic discourse analysis: therapist language timing, responsiveness, and consistency across five clinically relevant domains: pronouns, time orientation, emotional polarity, therapist tactics, and paralinguistic style. We demonstrate the potential clinical utility of this approach by evaluating the association between therapist language and two aspects of patient treatment: diagnosis and symptom severity. We conclude that increased use of computational language analysis of therapy will allow researchers and clinicians to transition from simply knowing what was said, to understanding what is most therapeutic⁵⁰.

Timing

Although therapists need to decide what to say and when to say it, the temporal sequencing of therapist language has been poorly measured^13,15. Moreover, clinical features of interest are typically analyzed in isolation, leaving potential sequencing or interactions unexplored. Our approach puts multiple clinically relevant features in context across an entire session (Figure 1), substantiating claims from discourse analysis and linguistics that words and phrases have layered and hierarchical interpretations^50,51. We find that prospectively identified language features (i.e., pronouns, therapist tactics, etc.) display a layered and temporally nuanced pattern that may be clinically relevant, meriting further inspection in observational or controlled studies.

Therapist-patient dyads actively adjust their speech based on emergent characteristics of the conversation⁵¹. Yet the specific language used by a therapist may be deployed in non-obvious ways in response to their conversation partner⁴⁶. Our findings suggest that clinically relevant language features from each speaker appear to follow both similar and different trends between language features (Figure 2). We see evidence of multiple alignments and directions of change when therapist and patient language are directly compared. Therapist and patient language are sometimes misaligned (Figure 2a), convergent (i.e., start apart and become similar) (Figures 2b and 2c), or divergent (i.e., start similar and diverge) (Figure 2d). This finding is consistent with dyadic communication research outside of therapy, which uses related concepts such as language accommodation, adjustment, style matching, and affordances^21,50–54. Despite a lack of harmony in concept terminology, our findings align with prior work suggesting that complex linguistic interactions are likely playing out during therapy. For example, in a study of romantic couples’ texting patterns, couples’ language converged over time towards a plateau, suggesting some normative or optimal level of linguistic alignment in romantic relationships⁵⁴. Of note in our work, some language features converged, while others diverged, suggesting an opportunity for hypothesis generation and testing of language accommodation in psychotherapy^51,55. For example, is emotional language convergence or divergence related to patient symptom improvement? Well-powered clinical trials or naturalistic data repositories would help discern which patterns are most associated with clinical effectiveness.

Responsiveness

Therapist responsiveness to a patient’s personal experience is a crucial difference between in-person therapy and more easily accessible mental health treatments such as bibliotherapy or internet-delivered treatment⁵⁶. Despite the importance of patient language in therapy discourse analysis, the moment-to-moment association of therapist and patient language has been difficult to operationalize. Our findings suggest a non-obvious and complex relationship between therapists’ and patients’ language features (Figures 3 and 4). For example, it does not appear that therapists are following simple rules such as mirroring patient language and speaking style exactly, which would be relatively easy to observe and teach future clinicians to do. To understand what successful therapists are doing, more nuanced evaluation is called for. We build on prior work which often focuses on patient or therapist language in isolation, specific therapeutic approaches (e.g. motivational interviewing), or language convergence (e.g. linguistic alignment)^{11,53,54,57–59}. Our findings suggest that many-to-one and one-to-many associations are playing out between therapist and patient language features.

We do not claim originality for the idea that therapist language is responsive. In early work in discourse analysis of psychotherapy, Pittenger and colleagues (1960) wrote “the details of how [language] adjustment takes place in any given instance are worth looking for… indeed, we should venture to assert that the sequential pattern of adjustment lies at the very heart of psychotherapy process⁶⁰.” Our contribution here is a method to inspect the adjustments across features of interest in psychotherapy. Future work is needed to establish whether specific language adjustments are helpful, inert, or harmful to patients in psychotherapy.

Consistency

If best practices are to be developed to improve therapist training and create useful markers of therapy quality, comparisons are needed across clinicians, patients, treatment settings, and time⁶¹. Our findings suggest some degree of linguistic stability in therapists’ use of within-session language. We refer to this as a therapist’s ‘signature’, consistent with prior work finding linguistic ‘signatures’ of emotion regulation in laboratory-based emotion regulation tasks⁴¹. Therapists appear to be both idiosyncratic and consistent in their use of language. Some language patterns are similar across sessions (i.e. therapist signature), while some language patterns adjust to patient or other situational factors. Therapist signatures may reflect their lived experience, preferences, or clinical training. Whether certain signatures are more clinically effective, and whether they are modifiable, is an important direction of future research. For example, some clinicians may regularly use more empathic language, a learnable skill, which may improve patient outcomes^28,62.

Limitations

Our study has several limitations in how features were selected; these potentially may confound variables and generalizability. Phase 1 - feature selection. A small group of clinicians identified clinically relevant language features based on their training and personal experience. Other reasonable people almost certainly would have made different selections. Also, our selected features do not address multilingualism or cultural variation in language use^63–66. Phase 2 - language evaluation. We caution against an overly reductionist view of therapy as primarily or exclusively language based. Visual, auditory, biological, demographic, cultural, and other contextual factors may enhance, mitigate, or contradict interpretations made from language alone. We do not evaluate, nor do we claim, that therapist language always directly causes patient language or symptom improvement. It may be that patient improvement is caused by unmeasured covariates, or that therapist language is responsive to patient improvement or decompensation. Other approaches exist for feature implementation and should be evaluated, especially in the context of accuracy and appropriateness across demographic and clinical patient characteristics^43,67–71. Phase 3 - clinical relevance. Clinical symptom severity measures were gathered in a college counseling setting, and thus our findings may not be generalizable to other clinicians, patients, or treatment settings. In college counseling sites, symptom severity often ranges from mild-moderate, as is true in our sample. It is unknown whether results would differ in more patients with more severe symptoms. Additionally, the sample of 98 sessions is small relative to other AI and machine learning-based studies, reflecting a well-documented limitation in psychotherapy process research²⁴.

Conclusion

If successful, computational language analysis of entire psychotherapy sessions may address long-standing criticisms of methodological rigor in psychotherapy evaluation^15,72. If deployed ethically and fairly, this approach could assist evaluations of treatment adherence and quality^15,73–77. To appreciate the full diversity of expression in therapy, computationally-conducted, theoretically informed evaluation may be a practical necessity^14,78. Our goal is not to reduce opportunities for clinical spontaneity and improvisation but to develop methods to learn from skilled therapists. Our results suggest that therapist language timing, responsiveness, and consistency demonstrate patterns that merit more rigorous inspection across populations and contexts.

Methods

Study design

This is a retrospective cohort study of patient-therapist dyads that uses psychotherapy transcripts gathered from a completed clinical trial. The original study objectives, methods, and results have been published previously^48,79. Written informed consent was obtained per protocol in the original trial from both patients and therapists. The study presented here was designed and conducted independently of the original clinical trial’s primary objectives. Our study had three phases: feature generation, feature measurement, and clinical relevance. In Phase 1 (feature generation), our team used a modified Delphi approach to generate a list of clinically relevant language features related to therapist skill (authors ASM, BA, SA, NS)⁸⁰. This feature list was refined based on its ability to be implemented by an expanded team of clinicians, informaticists, and computer scientists (authors ASM, SF, JF, TA, JH, AH, NS). Each feature was then implemented based on prior research and researcher judgment (authors ASM, SF). Features were selected that maximized reproducibility and transparency⁸¹. In Phase 2 (feature measurement), features were measured and standardized for therapists and patients in 98 professionally transcribed psychotherapy transcripts. Each transcript represents a unique patient-therapist dyad. We quantitatively assessed the structure of therapist and patient language. To evaluate timing, we measured the occurrence and frequency of the clinically relevant language features noted above (grouped into pronouns, time orientation, emotional polarity, therapist tactics, and paralinguistic style) in full therapy sessions. To evaluate responsiveness, we evaluated whether changes in therapist language were associated with immediately preceding utterance-level changes in patient language. To measure consistency, we tested whether or therapists have a consistent linguistic signature across sessions with different patients. In Phase 3 (clinical relevance), the relationship between therapists’ language and patients’ clinical presentation (i.e., diagnosis and symptom severity) was evaluated. The study protocol was approved by the Institutional Review Board at Stanford University.

Dataset

Audio recordings of psychotherapy were collected per protocol during a randomized controlled trial⁷⁹. The sessions took place between April 2013 and December 2016 at 24 college counseling sites across the United States. Non-directed counseling was offered to participants presenting with symptoms of depression or eating disorders. Transcripts were created using professional human transcriptionists; details are provided in prior work¹⁴. For the current study, a convenience sub-sample of unique therapist-patient dyads was selected, yielding 78 session transcripts. For therapists with more than one patient or session in our sample, a single session was randomly selected. Thus, our primary sample had 78 sessions, across 78 unique therapists and 78 unique patients. A secondary sample added an additional 20 sessions, each of which represented a second session from a therapist in the primary sample but with a different patient relative to the first. Unless otherwise explicitly stated, any analyses are with respect to the primary sample of 78 unique therapist/unique patient sessions.

Diagnosis was made by the treating clinician during the original clinical trial using the DSM-IV diagnostic criteria. Depression symptom severity was measured at the start of each session using the Patient Health Questionnaire-9 (PHQ-9), a common and validated measure of depression severity^82–84.

Phase 1: Feature generation

Due to a lack of validated clinical ontologies for psychotherapy, we first identified clinically relevant features using a modified Delphi approach⁸⁰. Features reflect either clinically important constructs (e.g. emotions) or paralinguistics (e.g. rate of speech). Features were manually clustered into five domains based on prior research and clinical judgment: pronouns, time orientation, emotional polarity, specific tactics, and paralinguistics.

Pronouns

The Linguistic Inquiry and Word Count (LIWC) program is a validated lexicon containing psychologically meaningful categories of words and word stems, including categories for various kinds of personal pronouns³². Our “Pronouns” features represent the number of matches between spoken words and terms in the relevant pronoun-specific LIWC category.

Time Orientation

Time orientation of patient and therapist language is a key focus of research in mental health^38,39. Each “Time Orientation” feature represents the number of times a word/word stem from a relevant time orientation lexicon in LIWC appears in speech³².

Emotional polarity

Emotions are important in most clinical psychology theoretical orientations^34,36,37,40. Nevertheless, there is strong disagreement on how to measure emotionality^43,85. We chose to use the NRC Emotion Lexicon (EmoLex) to measure whether a word conveyed positive or negative sentiment because of its expansive coverage (14,182 unigrams/words) and inspectable approach, rooted in a crowdsourced layman’s understanding of each word. The “Positive emotionality” feature represents the number of words considered to have positive polarity, and similarly for the “Negative emotionality” feature.

Therapist tactics

We used small, non-exhaustive lexicons to detect two clinically important but rarely measured therapist tactics: active listening and non-judgmental stance, adapted from prior work²¹. Active listening entails speech acts that seek to validate the patient, clarify meaning, or direct the patient towards useful experiences⁸⁶. A non-judgmental stance is created and maintained in many ways, but one approach is to avoid absolutist language (e.g. “always”, “never”)⁸⁷. See Supplementary Methods for additional details.

Paralinguistic style

The meanings of words are influenced by how the words are said^88,89. We focus on paralinguistic aspects of speech that can be measured using only transcripts. We measured the seconds taken by each therapist per talk turn, with talk turn boundaries delineated by a change in speaker in the transcript. We additionally measured therapists’ rate of speech by dividing the number of therapist-spoken words by the amount of time that the therapist spoke, as indicated by the time stamps in the transcripts. We also measured the therapist-to-patient ratio of both seconds taken per talk turn and words spoken per second. Including these ratios provides insight into whether the therapist was speaking faster or slower than the patient, as well as taking more time in each talk turn compared to the patient.

Phase 2: Feature implementation

Temporal aggregation and granularity

In addition to analyzing therapist language at the level of talk turns/utterances, we aggregated features (1) at the level of session quintiles (e.g., the first 20% of the session, by time), and (2) at the entire session level. We indicate which level of aggregation was used in each subsection of the methods. There is no standard approach, and prior work has used both quintiles and deciles to segment discourse analysis^21,33. We analyzed sessions at the level of quintiles to reduce the variance of aggregate language feature statistics within each time window while nevertheless providing sufficient temporal granularity so as to make meaningful deductions about changes in language use over time.

Evaluating whether therapist speech dynamically changes throughout the session

To represent therapist language, we calculated the average value of each language feature within each quintile of therapist speech. For count-based lexicon-matching features, we calculated the proportion of total words that matched a term appearing in the associated lexicon for each quintile.

To qualitatively analyze the dynamic nature of therapist language over time, we fit a natural cubic spline to the data represented by ordinally indexed session quintiles (independent variable) and quintile-aggregated language features, averaged across therapists (dependent variable)⁹⁰. This procedure was also performed for individual therapist language features to additionally highlight heterogeneity in the way therapist language changes over time. See Figure 1 and Supplementary Figure 1.

We also quantitatively assessed patterns in therapist language features over time. For each language feature, we compared the distribution of that therapist language feature in the first and last quintile of therapy sessions, using a nonparametric Mann-Whitney U test to test for significant differences in distribution between the two quintiles⁹¹. Within the first and last quintiles, we also analyzed differences between patient and therapist language features using the Mann-Whitney U test. We used the Benjamini-Hochberg procedure to control the False Discovery Rate (FDR) at level α = 0.05. See Figure 2.

Evaluating whether therapist speech is responsive

Here we describe how our therapist language representations were used to analyze individual therapist’s accommodation patterns at the level of utterances. To better answer the question of how therapists adapt their language to patient language, we leveraged recent methodological advances in time series causal discovery for dynamical systems to identify temporal dependencies between patient and therapist language features⁹². The algorithm we employed, PCMCI, applies momentary conditional independence (MCI) tests to identify temporal links between variables, accounting for potential observed confounding. PCMCI has been shown to identify such links in observational data with good statistical power and low Type I error. For each therapist, we used PCMCI with partial correlation to identify significant links between patient language and therapist language.

Patient-to-therapist associations were recorded as significant if the associated MCI test was significant at level α = 0.05, after controlling the FDR with the Benjamini-Hochberg procedure. We additionally calculated and reported the frequency of each type of association across all sessions.

Phase 3: Measuring clinical relevance

We next describe our approach to differentiating between therapy sessions. By aggregating therapists’ language features over the entire time course of the session, we obtain a 16-dimensional vector for each therapist (i.e., the therapist’s linguistic “signature”). We sought to examine: (1) whether a therapist’s “signature” is consistent across patients, over and above chance; and (2) whether these “signatures” are associated with clinically relevant patient variables, namely symptom severity and psychiatric diagnosis.

To answer (1), we calculated the cross-therapist “signature” correlations between all pairs of therapists in our primary sample, then compared that distribution to the distribution of “signature” correlations within therapists but across different patients. We used a t-test to test whether there were any differences in the distribution of correlations between the two groups.

To answer (2), we performed two predictive analyses via logistic regression, treating the therapist “signatures” as independent variables and the patient symptom severity classification (admitting PHQ-9 < 10 vs. PHQ-9 ≥ 10) and admitting diagnosis (depression vs. eating disorder) as the dependent variables, respectively. We randomly divided our dataset into two equally sized halves, trained a logistic regression model on one half, and evaluated the model’s accuracy on the second half. The test accuracy on the second half was compared to chance, which in this case we defined as always predicting the majority label of the dependent variable in the subsampled evaluation dataset. The difference between our model’s accuracy and chance accuracy was recorded, and this process was repeated 1000 times, using random splits of the data each time. We used the resulting distribution of accuracy differences to estimate the probability that our logistic regression model would perform no better than chance, defining a significant result as p < 0.05.

Data Availability

The dataset is not publicly available due to patient privacy restrictions but may be available from the corresponding author on reasonable request.

Data availability

The dataset is not publicly available due to patient privacy restrictions but may be available from the corresponding author on reasonable request.

Code availability

The code used in this study can be found at: (link to publicly available repository be updated prior to publication)

Author contributions

A.S.M. and S.L.F contributed equally as co-first authors. A.S.M., S.L.F, A.H., B.A.A., W.S.A., N.H.S., J.A.F. conceptualized and designed the study. A.S.M., S.L.F., A.H., J.A.F., B.A.A., J.H., and N.H.S. acquired, analyzed or interpreted the data. A.S.M., S.L.F., A.H., J.A.F., B.A.A., and N.H.S. drafted the manuscript. All authors performed critical revision of the manuscript for important intellectual content. S.L.F., A.H., and J.A.F. performed statistical analysis. B.A.A. and N.H.S. provided administrative, technical, and material support. B.A.A., W.S.A., and N.H.S. supervised the study. A.S.M, S.L.F, and N.H.S. had full access to all the data. A.S.M. and S.L.F. take responsibility for the integrity of the data and the accuracy of the data analysis.

Ethics declarations

All authors declare no competing interests.

Supplemental information

Additional tables and figures can be found in the supplementary material.

Acknowledgements

A.S.M. was supported by grants from the National Institutes of Health, National Center for Advancing Translational Science, Clinical and Translational Science Award (KL2TR001083 and UL1TR001085), the Stanford Department of Psychiatry Innovator Grant Program, and the Stanford Human-Centered AI Institute. S.L.F. was supported by a Big Data to Knowledge (BD2K) grant from the National Institutes of Health (T32 LM012409) and a National Defense Science and Engineering Graduate Fellowship from the Department of Defense. N.H.S acknowledges support from the Mark and Debra Lesli Endowment for the Program for AI in Healthcare at Stanford Medicine. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We thank Fei-Fei Li and G. Terence Wilson for project guidance. We thank members of 2018 Spring AI Bootcamp, Pranav Rajpurkar, Andrew Ng, Suvadip Paul, Ben Cohen-Wang, Matthew Sun for collaboration. We thank all of the participating counseling centers, directors, therapists, and student patients.

Footnotes

↵† Co-first authors

References

↵
Holmes EA, Ghaderi A, Harmer CJ, Ramchandani PG, Cuijpers P, Morrison AP et al. The Lancet Psychiatry Commission on psychological treatments research in tomorrow’s science. Lancet Psychiatry 2018; 5: 237–286.
OpenUrl PubMed
↵
Lambert MJ. Outcome in psychotherapy: the past and important advances. Psychotherapy. 2013; 50: 42–51.
OpenUrl CrossRef PubMed
↵
Zilcha-Mano S. Major developments in methods addressing for whom psychotherapy may work and why. Psychother Res 2019; 29: 693–708.
OpenUrl
↵
Castonguay LG, Hill CE (eds.). How and why are some therapists better than others?: Understanding therapist effects. American Psychological Association: Washington, DC, USA, 2017.
Kramer U, Stiles WB. The responsiveness problem in psychotherapy: A review of proposed solutions. Clinical Psychology: Science and Practice 2015; 22: 277–295.
OpenUrl
↵
Roth A, Fonagy P. What Works for Whom?: A Critical Review of Psychotherapy Research. Guilford Press: New York, NY, USA, 2006.
↵
Anderson T, Ogles BM, Patterson CL, Lambert MJ, Vermeersch DA. Therapist effects: facilitative interpersonal skills as a predictor of therapist success. J Clin Psychol 2009; 65: 755–768.
OpenUrl CrossRef PubMed
↵
Barkham M, Lutz W, Castonguay LG. ergin and Garfield’s Handbook of Psychotherapy and Behavior Change. John Wiley & Sons: Hoboken, NJ, USA, 2021.
↵
Stiles WB. Verbal response modes and psychotherapeutic technique. Psychiatry 1979; 42: 49–62.
OpenUrl PubMed Web of Science
Benjamin LS. Structural analysis of social behavior. Psychol Rev 1974; 81: 392–425.
OpenUrl CrossRef Web of Science
↵
Flemotomos N, Martinez VR, Chen Z, Singla K, Ardulov V, Peri R et al. Automated evaluation of psychotherapy skills using speech and language technologies. Behav Res Methods 2021. doi:10.3758/s13428-021-01623-4.
OpenUrl CrossRef
↵
Rogers CR. The use of electrically recorded interviews in improving psychotherapeutic techniques. Am J Orthopsychiatry 1942; 12: 429–434.
OpenUrl
↵
Hofmann SG, Hayes SC. The Future of Intervention Science: Process-Based Therapy. Clin Psychol Sci 2019; 7: 37–50.
OpenUrl CrossRef PubMed
↵
Miner AS, Haque A, Fries JA, Fleming SL, Wilfley DE, Terence Wilson G et al. Assessing the accuracy of automatic speech recognition for psychotherapy. NPJ Digit Med 2020; 3: 82.
OpenUrl
↵
Goldfried MR. Obtaining consensus in psychotherapy: What holds us back? Am Psychol 2019; 74: 484–496.
OpenUrl PubMed
Kazdin AE. Addressing the treatment gap: A key challenge for extending evidence-based psychosocial interventions. Behav Res Ther 2017; 88: 7–18.
OpenUrl CrossRef PubMed
↵
Hill CE, Lent RW. A narrative and meta-analytic review of helping skills training: Time to revive a dormant area of inquiry. Psychotherapy 2006; 43: 154–172.
OpenUrl PubMed
↵
Lee M, Martin JL. Coding, counting and cultural cartography. American Journal of Cultural Sociology 2015; 3: 1–33.
OpenUrl CrossRef
↵
Goldberg SB, Flemotomos N, Martinez VR, Tanana MJ, Kuo PB, Pace BT et al. Machine learning and natural language processing in psychotherapy research: Alliance as example use case. J Couns Psychol 2020; 67: 438–448.
OpenUrl PubMed
↵
Gaines AN, Goldfried MR, Constantino MJ. Revived call for consensus in the future of psychotherapy. Evid Based Ment Health 2021; 24: 2–4.
OpenUrl Abstract/FREE Full Text
↵
Althoff T, Clark K, Leskovec J. Large-scale analysis of counseling conversations: An application of natural language processing to mental health. In: Transactions of the Association for Computational Linguistics, Volume 4. 2016, pp 463–476.
OpenUrl
↵
Eichstaedt JC, Kern ML, Yaden DB, Schwartz HA, Giorgi S, Park G et al. Closed- and open-vocabulary approaches to text analysis: A review, quantitative comparison, and recommendations. Psychol Methods 2021; 26: 398–427.
OpenUrl
↵
Malhotra G, Waheed A, Srivastava A, Akhtar MS, Chakraborty T. Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations. In: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. 2021, pp 735–745.
↵
Doorn KA, Kamsteeg C, Bate J, Aafjes M. A scoping review of machine learning in psychotherapy research. Psychother Res 2020. doi:10.1080/10503307.2020.1808729.
OpenUrl CrossRef
↵
Nook EC, Vidal Bustamante CM, Cho HY, Somerville LH. Use of linguistic distancing and cognitive reappraisal strategies during emotion regulation in children, adolescents, and young adults. Emotion 2020; 20: 525–540.
OpenUrl PubMed
Lee F-T, Hull D, Levine J, Ray B, McKeown K. Identifying therapist conversational actions across diverse psychotherapeutic approaches. In: Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology. Association for Computational Linguistics: Minneapolis, Minnesota, 2019, pp 12–23.
Ewbank MP, Cummins R, Tablan V, Bateup S, Catarino A, Martin AJ et al. Quantifying the Association Between Psychotherapy Content and Clinical Outcomes Using Deep Learning. JAMA Psychiatry 2020; 77: 35–43.
OpenUrl
↵
Sharma A, Lin IW, Miner AS, Atkins DC, Althoff T. Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach. In: Proceedings of the Web Conference 2021. Association for Computing Machinery: \plNew York, NY, USA, 2021, pp 194–205.
↵
Imel ZE, Steyvers M, Atkins DC. Computational psychotherapy research: Scaling up the evaluation of patient-provider interactions. Psychotherapy 2015; 52: 19–30.
OpenUrl
↵
Lattie EG, Stiles-Shields C, Graham AK. An overview of and recommendations for more accessible digital mental health services. Nature Reviews Psychology 2022; : 1–14.
↵
Haque A, Milstein A, Fei-Fei L. Illuminating the dark spaces of healthcare with ambient intelligence. Nature 2020; 585: 193–202.
OpenUrl
↵
Tausczik YR, Pennebaker JW. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. J Lang Soc Psychol 2010; 29: 24–54.
OpenUrl CrossRef
↵
Zimmermann J, Brockmeyer T, Hunn M, Schauenburg H, Wolf M. First-person Pronoun Use in Spoken Language as a Predictor of Future Depressive Symptoms: Preliminary Evidence from a Clinical Sample of Depressed Patients. Clin Psychol Psychother 2017; 24: 384–391.
OpenUrl
↵
Vine V, Boyd RL, Pennebaker JW. Natural emotion vocabularies as windows on distress and well-being. Nat Commun 2020; 11: 4525.
OpenUrl
↵
Wadden D, August T, Li Q, Althoff T. The Effect of Moderation on Online Mental Health Conversations. In: Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM 2021). 2021, pp 751–763.
↵
Beck JS. Cognitive Behavior Therapy, Second Edition: Basics and Beyond. Guilford Press: New York, NY, USA, 2011.
↵
Weissman MM, Markowitz JC, Klerman GL. Clinician’s Quick Guide to Interpersonal Psychotherapy. 2008.
↵
Baird HM, Webb TL, Sirois FM, Gibson-Miller J. Understanding the effects of time perspective: A meta-analysis testing a self-regulatory framework. Psychol Bull 2021; 147: 233–267.
OpenUrl
↵
Park G, Schwartz HA, Sap M, Kern ML, Weingarten E, Eichstaedt JC et al. Living in the Past, Present, and Future: Measuring Temporal Orientation With Language. J Pers 2017; 85: 270–280.
OpenUrl
↵
Keefe JR, Huque ZM, DeRubeis RJ, Barber JP, Milrod BL, Chambless DL. In-session emotional expression predicts symptomatic and panic-specific reflective functioning improvements in panic-focused psychodynamic psychotherapy. Psychotherapy 2019; 56: 514–525.
OpenUrl
↵
Nook EC, Schleider JL, Somerville LH. A linguistic signature of psychological distancing in emotion regulation. J Exp Psychol Gen 2017; 146: 337–346.
OpenUrl
↵
Greenberg LS. Emotions, the great captains of our lives: their role in the process of change in psychotherapy. Am Psychol 2012; 67: 697–707.
OpenUrl PubMed
↵
Fiske AP. The lexical fallacy in emotion research: Mistaking vernacular words for psychological entities. Psychol Rev 2020; 127: 95–113.
OpenUrl
Gross JJ, Jazaieri H. Emotion, Emotion Regulation, and Psychopathology: An Affective Science Perspective. Clin Psychol Sci 2014; 2: 387–401.
OpenUrl
↵
Greenberg LS, Safran JD. Emotion in psychotherapy. Am Psychol 1989; 44: 19–29.
OpenUrl CrossRef PubMed
↵
Rocco D, Pastore M, Gennaro A, Salvatore S, Cozzolino M, Scorza M. Beyond Verbal Behavior: An Empirical Analysis of Speech Rates in Psychotherapy Sessions. Front Psychol 2018; 9: 978.
OpenUrl
↵
Tonti M, Gelo OCG. Rate of speech and emotional-cognitive regulation in the psychotherapeutic process: a pilot study. RES PSYCHOTHER-PSYCH 2016; 19. doi:10.4081/ripppo.2016.232.
OpenUrl CrossRef
↵
Wilfley DE, Agras WS, Fitzsimmons-Craft EE, Bohon C, Eichen DM, Welch RR et al. Training models for implementing evidence-based psychological treatment: A cluster-randomized trial in college counseling centers. JAMA Psychiatry 2020; 77: 139–147.
OpenUrl
↵
Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. OTexts, 2018.
↵
Labov W, Fanshel D. Therapeutic discourse: Psychotherapy as conversation. Academic Press, 1977.
↵
Clark HH. Using Language. Cambridge University Press, 1996.
Borelli JL, Sohn L, Wang BA, Hong K, DeCoste C, Suchman NE. Therapist-Client Language Matching: Initial Promise as a Measure of Therapist-Client Relationship Quality. Psychoanal Psychol 2019; 36: 9–18.
OpenUrl
↵
Ireland ME, Slatcher RB, Eastwick PW, Scissors LE, Finkel EJ, Pennebaker JW. Language style matching predicts relationship initiation and stability. Psychol Sci 2011; 22: 39–44.
OpenUrl CrossRef PubMed
↵
Brinberg M, Ram N. Do New Romantic Couples Use More Similar Language Over Time? Evidence from Intensive Longitudinal Text Messages. J Commun 2021; 71: 454–477.
OpenUrl
↵
Giles H, Mulac A, Bradac JJ, Johnson P. Speech Accommodation Theory: The First Decade and Beyond. Annals of the International Communication Association 1987; 10: 13–48.
OpenUrl
↵
Hatcher RL. Interpersonal competencies: Responsiveness, technique, and training in psychotherapy. Am Psychol 2015; 70: 747–757.
OpenUrl
↵
Duran ND, Paxton A, Fusaroli R. ALIGN: Analyzing linguistic interactions with generalizable techNiques—A Python library. Psychol Methods 2019; 24: 419–438.
OpenUrl
Burkhardt HA, Alexopoulos GS, Pullmann MD, Hull TD, Areán PA, Cohen T. Behavioral Activation and Depression Symptomatology: Longitudinal Assessment of Linguistic Indicators in Text-Based Therapy Sessions. J Med Internet Res 2021; 23: e28244.
OpenUrl
↵
Nook EC, Hull TD, Nock M, Somerville L. Linguistic measures of psychological distance track symptom levels and treatment outcomes in a large set of psychotherapy transcripts. doi:10.31234/osf.io/hqxaz.
OpenUrl CrossRef
↵
Pittenger RE, Hockett CF, Danehy JJ. The first five minutes: A sample of microscopic interview analysis. Paul Martineau: Ithaca, NY, 1960.
↵
Horn RL, Weisz JR. Can Artificial Intelligence Improve Psychotherapy Research and Practice? Adm Policy Ment Health 2020; 47: 852–855.
OpenUrl
↵
Sharma A, Miner A, Atkins D, Althoff T. A computational approach to understanding empathy expressed in text-based mental health support. In: The 2020 Conference on Empirical Methods in Natural Language Processing. 2020, pp 5263–5276.
↵
Costa B, Dewaele J-M. Psychotherapy across languages: beliefs, attitudes and practices of monolingual and multilingual therapists with their multilingual patients. Language and Psychoanalysis 2012; : 18–40.
Benish SG, Quintana S, Wampold BE. Culturally adapted psychotherapy and the legitimacy of myth: a direct-comparison meta-analysis. J Couns Psychol 2011; 58: 279–289.
OpenUrl CrossRef PubMed
Whaley AL, Davis KE. Cultural competence and evidence-based practice in mental health services: a complementary perspective. Am Psychol 2007; 62: 563–574.
OpenUrl CrossRef PubMed Web of Science
↵
Hook JN, Farrell JE, Davis DE, DeBlaere C, Van Tongeren DR, Utsey SO. Cultural humility and racial microaggressions in counseling. J Couns Psychol 2016; 63: 269–277.
OpenUrl
↵
Koenecke A, Nam A, Lake E, Nudell J, Quartey M, Mengesha Z et al. Racial disparities in automated speech recognition. Proc Natl Acad Sci U S A 2020; 117: 7684–7689.
OpenUrl Abstract/FREE Full Text
Corcoran CM, Benavides C, Cecchi G. Natural Language Processing: Opportunities and Challenges for Patients, Providers, and Hospital Systems. Psychiatric Annals. 2019; 49: 202–208.
OpenUrl
Mitchell M, Baker D, Moorosi N, Denton E, Hutchinson B, Hanna A et al. Diversity and Inclusion Metrics in Subset Selection. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. Association for Computing Machinery: New York, NY, USA, 2020, pp 117–123.
Rubin E. Striving for Diversity in Research Studies. N Engl J Med 2021. doi:10.1056/NEJMe2114651.
OpenUrl CrossRef
↵
Brown S, Davidovic J, Hasan A. The algorithm audit: Scoring the algorithms that score us. Big Data & Society 2021; 8: 2053951720983865.
↵
Krause KR, Chung S, Sousa Fialho M da L, Szatmari P, Wolpert M. The challenge of ensuring affordability, sustainability, consistency, and adaptability in the common metrics agenda. Lancet Psychiatry 2021; 8: 1094–1102.
OpenUrl
↵
Hernandez-Boussard T, Bozkurt S, Ioannidis JPA, Shah NH. MINIMAR (MINimum Information for Medical AI Reporting): Developing reporting standards for artificial intelligence in health care. J Am Med Inform Assoc 2020; 27: 2011–2015.
OpenUrl CrossRef PubMed
Xiao B, Imel ZE, Georgiou PG, Atkins DC, Narayanan SS. ‘ Rate My Therapist’: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing. PLoS One 2015; 10: e0143055.
OpenUrl
Bone C, Simmonds-Buckley M, Thwaites R, Sandford D, Merzhvynska M, Rubel J et al. Dynamic prediction of psychological treatment outcomes: development and validation of a prediction model using routinely collected symptom data. The Lancet Digital Health 2021; 3: e231–e240.
OpenUrl
Huckvale K, Venkatesh S, Christensen H. Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety. NPJ Digit Med 2019; 2: 88.
OpenUrl PubMed
↵
Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines. Nature Machine Intelligence 2019; 1: 389–399.
OpenUrl
↵
Krieger N. Epidemiology and the People’s Health: Theory and Context. Oxford University Press: New York, NY, USA, 2011.
↵
Wilfley DE, Fitzsimmons-Craft EE, Eichen DM, Van Buren DJ, Welch RR, Robinson AH et al. Training models for implementing evidence-based psychological treatment for college mental health: A cluster randomized trial study protocol. Contemp Clin Trials 2018; 72: 117–125.
OpenUrl
↵
Linstone HA, Turoff M. The Delphi Method: Techniques and Applications. First Edition edition. Addison-Wesley Educational Publishers Inc: Reading, MA, USA, 1975.
↵
Wallach JD, Boyack KW, Ioannidis JPA. Reproducible research practices, transparency, and open access data in the biomedical literature, 2015-2017. PLoS Biol 2018; 16: e2006930.
OpenUrl CrossRef PubMed
↵
Stochl J, Fried EI, Fritz J, Croudace TJ, Russo DA, Knight C et al. On Dimensionality, Measurement Invariance, and Suitability of Sum Scores for the PHQ-9 and the GAD-7. Assessment 2020; : 1073191120976863.
Zimmerman M. Using the 9-Item Patient Health Questionnaire to Screen for and Monitor Depression. JAMA 2019. doi:10.1001/jama.2019.15883.
OpenUrl CrossRef
↵
Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16: 606–613.
OpenUrl CrossRef PubMed Web of Science
↵
Aafjes-van Doorn K, Barber JP. Systematic Review of In-Session Affect Experience in Cognitive Behavioral Therapy for Depression. Cognit Ther Res 2017; 41: 807–828.
OpenUrl
↵
Fitzgerald P, Leudar I. On active listening in person-centred, solution-focused psychotherapy. J Pragmat 2010; 42: 3188–3198.
OpenUrl
↵
Al-Mosaiwi M, Johnstone T. In an Absolute State: Elevated Use of Absolutist Words Is a Marker Specific to Anxiety, Depression, and Suicidal Ideation. Clin Psychol Sci 2018; 6: 529–542.
OpenUrl
↵
Mehrabian A, Ferris SR. Inference of attitudes from nonverbal communication in two channels. J Consult Psychol 1967; 31: 248–252.
OpenUrl CrossRef PubMed Web of Science
↵
Mehrabian A, Wiener M. Decoding of inconsistent communications. Journal of Personality and Social Psychology. 1967; 6: 109–114.
OpenUrl CrossRef PubMed Web of Science
↵
Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Science & Business Media, 2009.
↵
Mann HB, Whitney DR. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann Math Stat 1947; 18: 50–60.
OpenUrl
↵
Runge J, Nowack P, Kretschmer M, Flaxman S, Sejdinovic D. Detecting and quantifying causal associations in large nonlinear time series datasets. Sci Adv 2019; 5: eaau4996.
OpenUrl FREE Full Text

View the discussion thread.

Posted April 27, 2022.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Psychiatry and Clinical Psychology

Subject Areas

All Articles

Addiction Medicine (386)
Allergy and Immunology (701)
Anesthesia (193)
Cardiovascular Medicine (2859)
Dentistry and Oral Medicine (326)
Dermatology (244)
Emergency Medicine (431)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (1011)
Epidemiology (12569)
Forensic Medicine (10)
Gastroenterology (807)
Genetic and Genomic Medicine (4447)
Geriatric Medicine (402)
Health Economics (716)
Health Informatics (2856)
Health Policy (1050)
Health Systems and Quality Improvement (1050)
Hematology (376)
HIV/AIDS (893)
Infectious Diseases (except HIV/AIDS) (13986)
Intensive Care and Critical Care Medicine (831)
Medical Education (415)
Medical Ethics (114)
Nephrology (464)
Neurology (4201)
Nursing (223)
Nutrition (617)
Obstetrics and Gynecology (788)
Occupational and Environmental Health (723)
Oncology (2205)
Ophthalmology (626)
Orthopedics (254)
Otolaryngology (319)
Pain Medicine (269)
Palliative Medicine (83)
Pathology (488)
Pediatrics (1172)
Pharmacology and Therapeutics (489)
Primary Care Research (483)
Psychiatry and Clinical Psychology (3658)
Public and Global Health (6787)
Radiology and Imaging (1494)
Rehabilitation Medicine and Physical Therapy (869)
Respiratory Medicine (902)
Rheumatology (430)
Sexual and Reproductive Health (433)
Sports Medicine (369)
Surgery (473)
Toxicology (57)
Transplantation (202)
Urology (174)

[1] ↵
Holmes EA, Ghaderi A, Harmer CJ, Ramchandani PG, Cuijpers P, Morrison AP et al. The Lancet Psychiatry Commission on psychological treatments research in tomorrow’s science. Lancet Psychiatry 2018; 5: 237–286.
OpenUrl PubMed

[2] ↵
Lambert MJ. Outcome in psychotherapy: the past and important advances. Psychotherapy. 2013; 50: 42–51.
OpenUrl CrossRef PubMed

[3] ↵
Zilcha-Mano S. Major developments in methods addressing for whom psychotherapy may work and why. Psychother Res 2019; 29: 693–708.
OpenUrl

[4] ↵
Castonguay LG, Hill CE (eds.). How and why are some therapists better than others?: Understanding therapist effects. American Psychological Association: Washington, DC, USA, 2017.

[5] Kramer U, Stiles WB. The responsiveness problem in psychotherapy: A review of proposed solutions. Clinical Psychology: Science and Practice 2015; 22: 277–295.
OpenUrl

[6] ↵
Roth A, Fonagy P. What Works for Whom?: A Critical Review of Psychotherapy Research. Guilford Press: New York, NY, USA, 2006.

[7] ↵
Anderson T, Ogles BM, Patterson CL, Lambert MJ, Vermeersch DA. Therapist effects: facilitative interpersonal skills as a predictor of therapist success. J Clin Psychol 2009; 65: 755–768.
OpenUrl CrossRef PubMed

[8] ↵
Barkham M, Lutz W, Castonguay LG. ergin and Garfield’s Handbook of Psychotherapy and Behavior Change. John Wiley & Sons: Hoboken, NJ, USA, 2021.

[9] ↵
Stiles WB. Verbal response modes and psychotherapeutic technique. Psychiatry 1979; 42: 49–62.
OpenUrl PubMed Web of Science

[10] Benjamin LS. Structural analysis of social behavior. Psychol Rev 1974; 81: 392–425.
OpenUrl CrossRef Web of Science

[11] ↵
Flemotomos N, Martinez VR, Chen Z, Singla K, Ardulov V, Peri R et al. Automated evaluation of psychotherapy skills using speech and language technologies. Behav Res Methods 2021. doi:10.3758/s13428-021-01623-4.
OpenUrl CrossRef

[12] ↵
Rogers CR. The use of electrically recorded interviews in improving psychotherapeutic techniques. Am J Orthopsychiatry 1942; 12: 429–434.
OpenUrl

[13] ↵
Hofmann SG, Hayes SC. The Future of Intervention Science: Process-Based Therapy. Clin Psychol Sci 2019; 7: 37–50.
OpenUrl CrossRef PubMed

[14] ↵
Miner AS, Haque A, Fries JA, Fleming SL, Wilfley DE, Terence Wilson G et al. Assessing the accuracy of automatic speech recognition for psychotherapy. NPJ Digit Med 2020; 3: 82.
OpenUrl

[15] ↵
Goldfried MR. Obtaining consensus in psychotherapy: What holds us back? Am Psychol 2019; 74: 484–496.
OpenUrl PubMed

[16] Kazdin AE. Addressing the treatment gap: A key challenge for extending evidence-based psychosocial interventions. Behav Res Ther 2017; 88: 7–18.
OpenUrl CrossRef PubMed

[17] ↵
Hill CE, Lent RW. A narrative and meta-analytic review of helping skills training: Time to revive a dormant area of inquiry. Psychotherapy 2006; 43: 154–172.
OpenUrl PubMed

[18] ↵
Lee M, Martin JL. Coding, counting and cultural cartography. American Journal of Cultural Sociology 2015; 3: 1–33.
OpenUrl CrossRef

[19] ↵
Goldberg SB, Flemotomos N, Martinez VR, Tanana MJ, Kuo PB, Pace BT et al. Machine learning and natural language processing in psychotherapy research: Alliance as example use case. J Couns Psychol 2020; 67: 438–448.
OpenUrl PubMed

[20] ↵
Gaines AN, Goldfried MR, Constantino MJ. Revived call for consensus in the future of psychotherapy. Evid Based Ment Health 2021; 24: 2–4.
OpenUrl Abstract/FREE Full Text

[21] ↵
Althoff T, Clark K, Leskovec J. Large-scale analysis of counseling conversations: An application of natural language processing to mental health. In: Transactions of the Association for Computational Linguistics, Volume 4. 2016, pp 463–476.
OpenUrl

[22] ↵
Eichstaedt JC, Kern ML, Yaden DB, Schwartz HA, Giorgi S, Park G et al. Closed- and open-vocabulary approaches to text analysis: A review, quantitative comparison, and recommendations. Psychol Methods 2021; 26: 398–427.
OpenUrl

[23] ↵
Malhotra G, Waheed A, Srivastava A, Akhtar MS, Chakraborty T. Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations. In: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. 2021, pp 735–745.

[24] ↵
Doorn KA, Kamsteeg C, Bate J, Aafjes M. A scoping review of machine learning in psychotherapy research. Psychother Res 2020. doi:10.1080/10503307.2020.1808729.
OpenUrl CrossRef

[25] ↵
Nook EC, Vidal Bustamante CM, Cho HY, Somerville LH. Use of linguistic distancing and cognitive reappraisal strategies during emotion regulation in children, adolescents, and young adults. Emotion 2020; 20: 525–540.
OpenUrl PubMed

[26] Lee F-T, Hull D, Levine J, Ray B, McKeown K. Identifying therapist conversational actions across diverse psychotherapeutic approaches. In: Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology. Association for Computational Linguistics: Minneapolis, Minnesota, 2019, pp 12–23.

[27] Ewbank MP, Cummins R, Tablan V, Bateup S, Catarino A, Martin AJ et al. Quantifying the Association Between Psychotherapy Content and Clinical Outcomes Using Deep Learning. JAMA Psychiatry 2020; 77: 35–43.
OpenUrl

[28] ↵
Sharma A, Lin IW, Miner AS, Atkins DC, Althoff T. Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach. In: Proceedings of the Web Conference 2021. Association for Computing Machinery: \plNew York, NY, USA, 2021, pp 194–205.

[29] ↵
Imel ZE, Steyvers M, Atkins DC. Computational psychotherapy research: Scaling up the evaluation of patient-provider interactions. Psychotherapy 2015; 52: 19–30.
OpenUrl

[30] ↵
Lattie EG, Stiles-Shields C, Graham AK. An overview of and recommendations for more accessible digital mental health services. Nature Reviews Psychology 2022; : 1–14.

[31] ↵
Haque A, Milstein A, Fei-Fei L. Illuminating the dark spaces of healthcare with ambient intelligence. Nature 2020; 585: 193–202.
OpenUrl

[32] ↵
Tausczik YR, Pennebaker JW. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. J Lang Soc Psychol 2010; 29: 24–54.
OpenUrl CrossRef

[33] ↵
Zimmermann J, Brockmeyer T, Hunn M, Schauenburg H, Wolf M. First-person Pronoun Use in Spoken Language as a Predictor of Future Depressive Symptoms: Preliminary Evidence from a Clinical Sample of Depressed Patients. Clin Psychol Psychother 2017; 24: 384–391.
OpenUrl

[34] ↵
Vine V, Boyd RL, Pennebaker JW. Natural emotion vocabularies as windows on distress and well-being. Nat Commun 2020; 11: 4525.
OpenUrl

[35] ↵
Wadden D, August T, Li Q, Althoff T. The Effect of Moderation on Online Mental Health Conversations. In: Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM 2021). 2021, pp 751–763.

[36] ↵
Beck JS. Cognitive Behavior Therapy, Second Edition: Basics and Beyond. Guilford Press: New York, NY, USA, 2011.

[37] ↵
Weissman MM, Markowitz JC, Klerman GL. Clinician’s Quick Guide to Interpersonal Psychotherapy. 2008.

[38] ↵
Baird HM, Webb TL, Sirois FM, Gibson-Miller J. Understanding the effects of time perspective: A meta-analysis testing a self-regulatory framework. Psychol Bull 2021; 147: 233–267.
OpenUrl

[39] ↵
Park G, Schwartz HA, Sap M, Kern ML, Weingarten E, Eichstaedt JC et al. Living in the Past, Present, and Future: Measuring Temporal Orientation With Language. J Pers 2017; 85: 270–280.
OpenUrl

[40] ↵
Keefe JR, Huque ZM, DeRubeis RJ, Barber JP, Milrod BL, Chambless DL. In-session emotional expression predicts symptomatic and panic-specific reflective functioning improvements in panic-focused psychodynamic psychotherapy. Psychotherapy 2019; 56: 514–525.
OpenUrl

[41] ↵
Nook EC, Schleider JL, Somerville LH. A linguistic signature of psychological distancing in emotion regulation. J Exp Psychol Gen 2017; 146: 337–346.
OpenUrl

[42] ↵
Greenberg LS. Emotions, the great captains of our lives: their role in the process of change in psychotherapy. Am Psychol 2012; 67: 697–707.
OpenUrl PubMed

[43] ↵
Fiske AP. The lexical fallacy in emotion research: Mistaking vernacular words for psychological entities. Psychol Rev 2020; 127: 95–113.
OpenUrl

[44] Gross JJ, Jazaieri H. Emotion, Emotion Regulation, and Psychopathology: An Affective Science Perspective. Clin Psychol Sci 2014; 2: 387–401.
OpenUrl

[45] ↵
Greenberg LS, Safran JD. Emotion in psychotherapy. Am Psychol 1989; 44: 19–29.
OpenUrl CrossRef PubMed

[46] ↵
Rocco D, Pastore M, Gennaro A, Salvatore S, Cozzolino M, Scorza M. Beyond Verbal Behavior: An Empirical Analysis of Speech Rates in Psychotherapy Sessions. Front Psychol 2018; 9: 978.
OpenUrl

[47] ↵
Tonti M, Gelo OCG. Rate of speech and emotional-cognitive regulation in the psychotherapeutic process: a pilot study. RES PSYCHOTHER-PSYCH 2016; 19. doi:10.4081/ripppo.2016.232.
OpenUrl CrossRef

[48] ↵
Wilfley DE, Agras WS, Fitzsimmons-Craft EE, Bohon C, Eichen DM, Welch RR et al. Training models for implementing evidence-based psychological treatment: A cluster-randomized trial in college counseling centers. JAMA Psychiatry 2020; 77: 139–147.
OpenUrl

[49] ↵
Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. OTexts, 2018.

[50] ↵
Labov W, Fanshel D. Therapeutic discourse: Psychotherapy as conversation. Academic Press, 1977.

[51] ↵
Clark HH. Using Language. Cambridge University Press, 1996.

[52] Borelli JL, Sohn L, Wang BA, Hong K, DeCoste C, Suchman NE. Therapist-Client Language Matching: Initial Promise as a Measure of Therapist-Client Relationship Quality. Psychoanal Psychol 2019; 36: 9–18.
OpenUrl

[53] ↵
Ireland ME, Slatcher RB, Eastwick PW, Scissors LE, Finkel EJ, Pennebaker JW. Language style matching predicts relationship initiation and stability. Psychol Sci 2011; 22: 39–44.
OpenUrl CrossRef PubMed

[54] ↵
Brinberg M, Ram N. Do New Romantic Couples Use More Similar Language Over Time? Evidence from Intensive Longitudinal Text Messages. J Commun 2021; 71: 454–477.
OpenUrl

[55] ↵
Giles H, Mulac A, Bradac JJ, Johnson P. Speech Accommodation Theory: The First Decade and Beyond. Annals of the International Communication Association 1987; 10: 13–48.
OpenUrl

[56] ↵
Hatcher RL. Interpersonal competencies: Responsiveness, technique, and training in psychotherapy. Am Psychol 2015; 70: 747–757.
OpenUrl

[57] ↵
Duran ND, Paxton A, Fusaroli R. ALIGN: Analyzing linguistic interactions with generalizable techNiques—A Python library. Psychol Methods 2019; 24: 419–438.
OpenUrl

[58] Burkhardt HA, Alexopoulos GS, Pullmann MD, Hull TD, Areán PA, Cohen T. Behavioral Activation and Depression Symptomatology: Longitudinal Assessment of Linguistic Indicators in Text-Based Therapy Sessions. J Med Internet Res 2021; 23: e28244.
OpenUrl

[59] ↵
Nook EC, Hull TD, Nock M, Somerville L. Linguistic measures of psychological distance track symptom levels and treatment outcomes in a large set of psychotherapy transcripts. doi:10.31234/osf.io/hqxaz.
OpenUrl CrossRef

[60] ↵
Pittenger RE, Hockett CF, Danehy JJ. The first five minutes: A sample of microscopic interview analysis. Paul Martineau: Ithaca, NY, 1960.

[61] ↵
Horn RL, Weisz JR. Can Artificial Intelligence Improve Psychotherapy Research and Practice? Adm Policy Ment Health 2020; 47: 852–855.
OpenUrl

[62] ↵
Sharma A, Miner A, Atkins D, Althoff T. A computational approach to understanding empathy expressed in text-based mental health support. In: The 2020 Conference on Empirical Methods in Natural Language Processing. 2020, pp 5263–5276.

[63] ↵
Costa B, Dewaele J-M. Psychotherapy across languages: beliefs, attitudes and practices of monolingual and multilingual therapists with their multilingual patients. Language and Psychoanalysis 2012; : 18–40.

[64] Benish SG, Quintana S, Wampold BE. Culturally adapted psychotherapy and the legitimacy of myth: a direct-comparison meta-analysis. J Couns Psychol 2011; 58: 279–289.
OpenUrl CrossRef PubMed

[65] Whaley AL, Davis KE. Cultural competence and evidence-based practice in mental health services: a complementary perspective. Am Psychol 2007; 62: 563–574.
OpenUrl CrossRef PubMed Web of Science

[66] ↵
Hook JN, Farrell JE, Davis DE, DeBlaere C, Van Tongeren DR, Utsey SO. Cultural humility and racial microaggressions in counseling. J Couns Psychol 2016; 63: 269–277.
OpenUrl

[67] ↵
Koenecke A, Nam A, Lake E, Nudell J, Quartey M, Mengesha Z et al. Racial disparities in automated speech recognition. Proc Natl Acad Sci U S A 2020; 117: 7684–7689.
OpenUrl Abstract/FREE Full Text

[68] Corcoran CM, Benavides C, Cecchi G. Natural Language Processing: Opportunities and Challenges for Patients, Providers, and Hospital Systems. Psychiatric Annals. 2019; 49: 202–208.
OpenUrl

[69] Mitchell M, Baker D, Moorosi N, Denton E, Hutchinson B, Hanna A et al. Diversity and Inclusion Metrics in Subset Selection. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. Association for Computing Machinery: New York, NY, USA, 2020, pp 117–123.

[70] Rubin E. Striving for Diversity in Research Studies. N Engl J Med 2021. doi:10.1056/NEJMe2114651.
OpenUrl CrossRef

[71] ↵
Brown S, Davidovic J, Hasan A. The algorithm audit: Scoring the algorithms that score us. Big Data & Society 2021; 8: 2053951720983865.

[72] ↵
Krause KR, Chung S, Sousa Fialho M da L, Szatmari P, Wolpert M. The challenge of ensuring affordability, sustainability, consistency, and adaptability in the common metrics agenda. Lancet Psychiatry 2021; 8: 1094–1102.
OpenUrl

[73] ↵
Hernandez-Boussard T, Bozkurt S, Ioannidis JPA, Shah NH. MINIMAR (MINimum Information for Medical AI Reporting): Developing reporting standards for artificial intelligence in health care. J Am Med Inform Assoc 2020; 27: 2011–2015.
OpenUrl CrossRef PubMed

[74] Xiao B, Imel ZE, Georgiou PG, Atkins DC, Narayanan SS. ‘ Rate My Therapist’: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing. PLoS One 2015; 10: e0143055.
OpenUrl

[75] Bone C, Simmonds-Buckley M, Thwaites R, Sandford D, Merzhvynska M, Rubel J et al. Dynamic prediction of psychological treatment outcomes: development and validation of a prediction model using routinely collected symptom data. The Lancet Digital Health 2021; 3: e231–e240.
OpenUrl

[76] Huckvale K, Venkatesh S, Christensen H. Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety. NPJ Digit Med 2019; 2: 88.
OpenUrl PubMed

[77] ↵
Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines. Nature Machine Intelligence 2019; 1: 389–399.
OpenUrl

[78] ↵
Krieger N. Epidemiology and the People’s Health: Theory and Context. Oxford University Press: New York, NY, USA, 2011.

[79] ↵
Wilfley DE, Fitzsimmons-Craft EE, Eichen DM, Van Buren DJ, Welch RR, Robinson AH et al. Training models for implementing evidence-based psychological treatment for college mental health: A cluster randomized trial study protocol. Contemp Clin Trials 2018; 72: 117–125.
OpenUrl

[80] ↵
Linstone HA, Turoff M. The Delphi Method: Techniques and Applications. First Edition edition. Addison-Wesley Educational Publishers Inc: Reading, MA, USA, 1975.

[81] ↵
Wallach JD, Boyack KW, Ioannidis JPA. Reproducible research practices, transparency, and open access data in the biomedical literature, 2015-2017. PLoS Biol 2018; 16: e2006930.
OpenUrl CrossRef PubMed

[82] ↵
Stochl J, Fried EI, Fritz J, Croudace TJ, Russo DA, Knight C et al. On Dimensionality, Measurement Invariance, and Suitability of Sum Scores for the PHQ-9 and the GAD-7. Assessment 2020; : 1073191120976863.

[83] Zimmerman M. Using the 9-Item Patient Health Questionnaire to Screen for and Monitor Depression. JAMA 2019. doi:10.1001/jama.2019.15883.
OpenUrl CrossRef

[84] ↵
Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16: 606–613.
OpenUrl CrossRef PubMed Web of Science

[85] ↵
Aafjes-van Doorn K, Barber JP. Systematic Review of In-Session Affect Experience in Cognitive Behavioral Therapy for Depression. Cognit Ther Res 2017; 41: 807–828.
OpenUrl

[86] ↵
Fitzgerald P, Leudar I. On active listening in person-centred, solution-focused psychotherapy. J Pragmat 2010; 42: 3188–3198.
OpenUrl

[87] ↵
Al-Mosaiwi M, Johnstone T. In an Absolute State: Elevated Use of Absolutist Words Is a Marker Specific to Anxiety, Depression, and Suicidal Ideation. Clin Psychol Sci 2018; 6: 529–542.
OpenUrl

[88] ↵
Mehrabian A, Ferris SR. Inference of attitudes from nonverbal communication in two channels. J Consult Psychol 1967; 31: 248–252.
OpenUrl CrossRef PubMed Web of Science

[89] ↵
Mehrabian A, Wiener M. Decoding of inconsistent communications. Journal of Personality and Social Psychology. 1967; 6: 109–114.
OpenUrl CrossRef PubMed Web of Science

[90] ↵
Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Science & Business Media, 2009.

[91] ↵
Mann HB, Whitney DR. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann Math Stat 1947; 18: 50–60.
OpenUrl

[92] ↵
Runge J, Nowack P, Kretschmer M, Flaxman S, Sejdinovic D. Detecting and quantifying causal associations in large nonlinear time series datasets. Sci Adv 2019; 5: eaau4996.
OpenUrl FREE Full Text