Abstract
Introduction Mood instability in bipolar disorder (BD) is poorly understood. Here we examined cognitive and neural mechanisms related to these fluctuations and how they are changed with the mood stabilizer lithium.
Methods We recruited volunteers with low (n=37) or high (n=40) risk of BD (using the Mood Disorder Questionnaire, MDQ). We also recruited patients with BD who were assigned (randomized, double-blind) to six weeks of lithium (n=19) or placebo (n=16) after a two-week baseline period. Participants completed mood ratings daily over 50 (healthy) or 42 (BD) days, as well as a risky decision-making task and one functional magnetic resonance imaging session. The task measured adaptation of risk taking to past outcomes (increased risk aversion after a previous win, ‘outcome history’).
Results While the low MDQ group was risk averse after a win, this was less evident in the high MDQ group and least so in the patients with BD. Neurally, ‘outcome history’ was linked to medial frontal pole activation at the time of the decision. Corresponding to the behavioural effect, this activation was reduced in the high MDQ vs. the low MDQ group. While lithium did not reverse the pattern of BD in the task, it changed reward processing in the dorsolateral prefrontal cortex.
Discussion Healthy participants’ modulation of risk-taking in response to reward outcomes was reduced by risk of BD and BD. These results provide a model for how reward may prime escalation of risk-related behaviours in bipolar disorder and how mood stabilising treatments may work.
Introduction
Bipolar disorder (BD) is typically characterized by episodes of depression or mania, lasting weeks or months. Lithium is the most effective mood stabiliser for management of BD, reducing the frequency of both manic and depressive episodes (1). While mood episodes have traditionally been seen as lasting weeks or months, more recent work has shown that patients with BD show large day-to-day fluctuations in mood even when symptoms are in the non-clinical range (2). A specific aspect of this mood instability, volatility, is affected by lithium treatment (3). Understanding the processes underpinning these fluctuations may help us develop and assess more effective treatment approaches.
To understand mood fluctuations in BD, recent work has used a computational-psychiatry approach drawing from the field of reinforcement learning. This work has suggested that destabilizing positive feedback cycles between mood and perceptions of rewards may contribute to BD (4–6): In people with subclinical symptoms of BD, positive or negative surprises were found to affect the neural and behavioural responses to reward and punishments. In particular, symptoms were associated with an increase in reward value after a positive surprise. This kind of reward sensitivity has been linked to later changes in mood, suggesting a route by which escalation of reward responses may translate into clinical symptoms (7). In patients, however, work so far has focused on lab-based measures of mood and its impact on behaviour on the scale of minutes.
Beyond learning, research on decision-making has also revealed that behaviour shows temporal interdependencies. For example, people show ‘biases’ such as ‘loss chasing’ (8) (taking more risks to recover losses). While these behaviours have often been interpreted as biases, more recent conceptual work from an ecological perspective (9–12) suggests that they could be useful for achieving homeostasis between needs for different rewards. Momentary ecological monitoring has revealed homeostatic, mood-stabilizing behaviour. When mood fluctuates, healthy controls self-report using strategies to re-establish mood homeostasis, such as engaging in aversive activities when they are in a good mood (13). This strategy is reduced in people with depression or low mood (14). However, it is as yet unclear whether homeostatic behaviour is also reduced in BD.
Studies in healthy participants and patients show that there is a wide network of frontal and striatal areas in the brain that process rewards and are crucial for motivation (15,16) and potentially important for understanding mood disorders (17,18) and homeostasis (9,10,12). However, how BD and mood instability change signals in these networks related to homeostasis is still unclear.
Here, we built on these findings to test whether a gradient of mood elevation was linked to behavioural measures of reduced active homeostatic behaviour in a decision-making task, measured as changes in decision strategies (risk taking) from trial to trial in response to reward/loss outcomes. For this, we recruited 40 healthy volunteers with risk of BD, i.e. a history of mood elevation, assessed using the Mood Disorder Questionnaire (MDQ) (19), 37 volunteers without mood elevation, and 35 patients with BD. To assess whether behaviour and naturally occurring daily-life mood fluctuations were related, participants completed up to 50 longitudinal testing sessions at home. To understand the neural mechanisms of homeostatic stabilizing behaviour, we measured brain activity with fMRI. To test the causal effect of a commonly prescribed mood-stabilizing drug, lithium, 19 patients were randomly assigned for six weeks to lithium and 16 to placebo in a double-blind design.
METHODS
Participants
Participants were recruited in two separate studies (see below) approved by the local ethics committees: Oxford University Ethics Committee (MSD-IDREC-C2-2014-023) and South Central – Oxford A Research Ethics Committee (15/SC/0109). Participants gave informed consent and were reimbursed for taking part in the study. The intervention study was registered: https://doi.org/10.1186/ISRCTN91624955.
Volunteers with history of high or low mood elevation
Participants were recruited through local advertisement and from pools of previous participants who had consented to be re-contacted. Seventy-seven were included in the study. Participants were recruited into two age- and gender-matched groups depending on their scores from the Mood Disorders Questionnaire, MDQ (19): ‘low MDQ’ group: <5 points (n=37); ‘high MDQ’ group: ≥ 7 points and additionally indicating that several of these symptoms happened during the same period of time (n=40). Detailed exclusion criteria appear in supplementary method 1A. A higher score on the MDQ indicates a history of mood elevation and risk of BD.
Patients with BD
Participants were recruited through the BD Research Clinic (Oxford). All participants met criteria for BD-I (n=7), BD-II (n=27) or BD not otherwise specified (BD-NOS, n=1), based on structured clinical interview. Full exclusion criteria are provided in the supplementary materials [1B]. Participants were assigned to placebo (n=16) or lithium (n=19), see below.
Study design
Volunteers
We measured participants’ mood and behaviour in a cognitive task longitudinally five times a week over ten weeks. Brain activity during the same task was measured during an MRI scan. The data here were part of a larger study (supplementary method 1B).
Patients with BD
This study was a randomised, 6-week, double-blind, placebo-controlled trial (20). See supplementary method [1B] for full information. All participants underwent a two-week pre-randomization phase during which they completed the cognitive task and mood ratings daily at home. Due to logistic challenges, for some participants this phase lasted longer than two weeks. For the next phase, which lasted 6 weeks, participants were randomly assigned to receive either lithium or placebo in a double-blind design. The first 10 participants were fully randomly assigned to avoid predictability, while for subsequent participants, an algorithm was used to minimize differences in age (<25 or >25 years) and gender between the two groups. In the lithium group, participants were titrated to doses producing plasma levels of 0.6-1 mmol/L (see supplementary methods 1C for dosing details). Data from one MRI session are reported here (n=23 participants were fMRI compatible).
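The allocation procedure described above (pure randomisation for the first 10 participants, then minimisation on age band and gender) can be illustrated with a minimal sketch. The scoring rule below, which counts already-allocated participants in each arm who share the new participant's factor levels, is an assumption in the style of standard minimisation schemes. It is not the trial's actual algorithm, and the function and field names are hypothetical.

```python
import random


def minimisation_assign(new, allocated, rng, n_random=10,
                        arms=("lithium", "placebo")):
    """Pick an arm for `new`, a dict with 'age_band' and 'gender'.

    `allocated` is a list of earlier participants, each a dict with
    'age_band', 'gender' and 'arm'. ASSUMPTION: imbalance is scored as the
    number of matching factor levels already present in each arm.
    """
    # Early participants are assigned purely at random.
    if len(allocated) < n_random:
        return rng.choice(arms)

    scores = {}
    for arm in arms:
        scores[arm] = sum(
            1
            for p in allocated
            if p["arm"] == arm
            for factor in ("age_band", "gender")
            if p[factor] == new[factor]
        )

    # Choose the arm where the new participant adds least imbalance;
    # ties are broken at random.
    best = min(scores.values())
    candidates = [arm for arm in arms if scores[arm] == best]
    return rng.choice(candidates)
```

For example, if ten earlier participants with the same age band and gender had all been allocated to lithium, the next matching participant would be assigned to placebo.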
Throughout, we performed two types of group comparisons. First, we compared across history of mood elevation severity/risk of BD (i.e. group as ordered factor (21) in regressions, Low MDQ < High MDQ < patients with BD), subsequently referred to as ‘mood elevation gradient’. Second, we tested for the effects of lithium treatment as drug (lithium/placebo) x time point (baseline/post) interactions.
‘Wheel of fortune’ task
Trial structure
On each trial of the task, participants were given two options shown side-by-side. In the at-home version, these were wheels of fortune (Figure 1A). In the fMRI version, they were instead presented as bars. Each option had three attributes: probability of winning vs. losing (size of green vs. red area), magnitude of possible gain (number on green area, 10 to 200), and magnitude of possible loss (number on red area, also 10 to 200). After participants chose one option, the wheel of fortune started spinning and then randomly landed on either win or loss. Finally, participants were shown their updated total score. The experiment was designed so that most choices were difficult, i.e., the options were very similar in value (90% of choices were not more than 20 points apart; 76% not more than 5 points apart), Figure 1B.
A) On each trial, participants chose between two gambles (‘wheel of fortune’) that differed in their probability of winning or losing points and in the number of points that could be won or lost. Once participants had chosen an option, the alternative was hidden, and the chosen wheel started spinning until finally landing on the win or loss. B) Participants’ choices (left vs. right option) were guided by the relative utilities (reward utility - i.e., probability * magnitude – minus loss utility): the higher the utility of the left option, the more it was chosen. The computational model (lines) captured behaviour (dots with error bars) well. Data were combined across all testing sessions (up to 50) per participant (20 trials per session). Error bars show the standard error of the mean, and the size of the dots indicates the number of data points available.
Timings and number of trials
Each day, participants were asked to rate their positive and negative mood using the Positive and Negative Affect Schedule – Short Form, PANAS-SF (22). They were also asked to give an overall rating of their mood (‘How are you feeling’, referred to here as ‘Happiness VAS’) using a slider ranging from ‘very unhappy’ (red sad face drawing) to ‘very happy’ (green smiley face). They then played 20 trials of the task. After the task, they repeated the happiness VAS.
In the fMRI scanner, participants played 100 trials. All timings were jittered. From the onset of options until participants could make a choice: 1-2 sec; delay between participants’ response and outcomes shown: 2.7 to 7.7 sec; duration of outcome shown: 1-3 sec; duration of total score shown: 1-9 sec; ITI: 1-9 sec.
Behaviour
Behavioural data were analysed in R (23) (version 4.0.2) and Matlab. R-packages: Stan (24), BRMS (25,26), dplyr (27), ggpubr (28), sjPlot (29), compareGroups (30), emmeans (31), ggsci (32).
Model-free analyses of behaviour
We analysed participants’ choices without and with computational models. First, to test that participants could perform the task, i.e., that their choices were sensitive to value, we binned their choices (% left vs. right option) according to the overall utility difference between the two options (i.e., left vs. right reward utility minus loss utility).
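The binning step can be sketched as follows. This is an illustration only: the bin width, the data layout, and the function names are hypothetical, and the loss probability is taken to be one minus the win probability because each wheel has only a green and a red area.

```python
import math
from collections import defaultdict


def utility(p_win, gain, loss):
    """Reward utility minus loss utility (probability * magnitude each)."""
    return p_win * gain - (1.0 - p_win) * loss


def binned_choice_rate(trials, bin_width=10.0):
    """Proportion of left choices per bin of utility difference (left - right).

    trials: iterable of dicts with 'left' and 'right' options, each a
    (p_win, gain, loss) tuple, and 'chose_left' (bool).
    Returns {lower bin edge: proportion of left choices}.
    """
    counts = defaultdict(lambda: [0, 0])  # edge -> [n_left, n_trials]
    for t in trials:
        diff = utility(*t["left"]) - utility(*t["right"])
        lower_edge = math.floor(diff / bin_width) * bin_width
        counts[lower_edge][0] += int(t["chose_left"])
        counts[lower_edge][1] += 1
    return {edge: n_left / n for edge, (n_left, n) in sorted(counts.items())}
```

If choices are sensitive to value, the proportion of left choices should rise across bins as the utility difference increasingly favours the left option.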
To test sensitivity to risk of losses, as has been previously reported to be affected in BD (33,34), we refined the binning of choices (as above) by further splitting the data according to win and loss utility (i.e. probability * magnitude).
We next analysed behaviour for adaptations of risk taking to past outcomes. One way this can be measured is by considering how participants change their behaviour – here risk-taking – based on win/loss outcomes on previous trials (‘outcome history’ effect), as we and others have previously done in a learning context (16,35). One might think that rational behaviour could mean that the outcome of one trial should not affect the choice in the next trial. While this might be rational in a simple computer task, it is plausible that participants approach the task with mechanisms adapted to decision making in the real world in which there are sequential dependencies between decisions or extended temporal contexts that might affect which outcomes will follow choices. For example, in an ecological framework where participants try to achieve homeostasis (9,12), they might be willing to accept higher risks of losing to make up for losses just encountered, or vice versa, be more avoidant of future losses after having received a reward outcome. To measure this in our experiment, we considered in the binning of choices (as above) whether choices were influenced by the previous trial’s win/loss outcome.
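A simplified, model-free version of this split is sketched below. It is an assumption-laden illustration rather than the exact binning used here: each trial is reduced to whether the participant chose the option with the lower loss utility (`chose_safer`, a hypothetical field), and the effect is the difference in that rate after a win versus after a loss. The first trial of each session has no previous outcome and is skipped.

```python
def outcome_history_effect(sessions):
    """Rate of loss-avoiding choices after a win minus after a loss.

    sessions: list of sessions; each session is an ordered list of trial
    dicts with 'chose_safer' (bool) and 'won' (bool).
    Positive values indicate more loss avoidance after a previous win.
    """
    tallies = {True: [0, 0], False: [0, 0]}  # prev_won -> [n_safer, n_trials]
    for session in sessions:
        # Pair every trial with the one before it, within a session only.
        for previous, current in zip(session, session[1:]):
            tally = tallies[previous["won"]]
            tally[0] += int(current["chose_safer"])
            tally[1] += 1
    rates = {
        prev_won: (n_safer / n if n else float("nan"))
        for prev_won, (n_safer, n) in tallies.items()
    }
    return rates[True] - rates[False]
```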
To compare groups, instead of a standard ANOVA procedure which tests for any differences between groups, we tested for a systematic effect, i.e. a gradient of mood elevation (group as ordered factor (21), Low MDQ < High MDQ < BD patients) in linear regressions, also controlling for age and gender. Models used the BRMS toolbox interface for Stan (supplementary methods 2). For this and all subsequent analyses, we used Bayesian Credible Intervals (36) to establish significance by the 95% CI not including zero.
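The significance criterion can be illustrated with a short sketch that operates on posterior draws of a regression coefficient. An equal-tailed interval is assumed here, and the function names are hypothetical.

```python
def credible_interval(samples, mass=0.95):
    """Equal-tailed credible interval from posterior draws (nearest rank)."""
    ordered = sorted(samples)
    n = len(ordered)
    lower_index = int(round((1.0 - mass) / 2.0 * (n - 1)))
    upper_index = int(round((1.0 + mass) / 2.0 * (n - 1)))
    return ordered[lower_index], ordered[upper_index]


def excludes_zero(interval):
    """True if the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0
```

An effect is then treated as significant when `excludes_zero(credible_interval(draws))` is true for the draws of the corresponding coefficient.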
Computational models
Decision making
We used a computational model to capture participants’ choices. The model first computed the overall value (‘utility’) of each option, then made a choice (left or right option) depending on which option had the better utility, but also allowing for some random choice behaviour (17,37).
First, the model compared the options’ utilities as displayed at the time of choice on the current trial, i.e., probability (prob) x magnitude (mag). We allowed for individual differences in sensitivity to the loss vs. reward utility (λ). Participants’ ‘outcome history’ was captured by a parameter (γ) that changed the weighting of the loss utility on the current trial depending on whether the previous trial’s outcome was a win or a loss (i.e., γ>0 means increased sensitivity to losses after a win on the previous trial).
To decide which option to choose, the model compared the utilities of the left and right options taking into account each participant’s ‘randomness’ (inverse temperature (β), with higher values indicating choice consistency and lower choice randomness):
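One way to write a model consistent with this description is given below. The additive way in which γ enters the loss weight and the ±1 coding of the previous outcome are assumptions made for illustration; only the parameters λ, γ and β and their qualitative roles are taken from the text.

```latex
\begin{align}
  \lambda_t &= \lambda + \gamma \, o_{t-1},
    \qquad o_{t-1} = \begin{cases} +1 & \text{previous trial won} \\ -1 & \text{previous trial lost} \end{cases} \\
  U_t^{s} &= \mathrm{prob}^{s}_{\mathrm{win}} \cdot \mathrm{mag}^{s}_{\mathrm{win}}
    \;-\; \lambda_t \cdot \mathrm{prob}^{s}_{\mathrm{loss}} \cdot \mathrm{mag}^{s}_{\mathrm{loss}},
    \qquad s \in \{\mathrm{left}, \mathrm{right}\} \\
  P_t(\mathrm{left}) &= \frac{1}{1 + \exp\!\left(-\beta \left(U_t^{\mathrm{left}} - U_t^{\mathrm{right}}\right)\right)}
\end{align}
```

Under this form, γ > 0 raises the weight on potential losses after a win, and larger β makes choices follow the utility difference more deterministically.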
Models were validated using simulations (Table S1 and supplementary methods 2A).
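The logic of such a simulation check can be sketched as follows: simulate choices from known parameters, then evaluate how well candidate parameters explain them. The functional form (γ added to the loss weight with ±1 coding of the previous outcome), the option generator, and all function names are assumptions for illustration. The actual validation used the Bayesian fitting procedure and is summarised in Table S1.

```python
import math
import random


def p_choose_left(left, right, prev_won, lam, gamma, beta):
    """Probability of choosing left; options are (p_win, gain, loss) tuples.

    prev_won is True, False, or None (first trial, no history term).
    """
    # ASSUMPTION: previous outcome coded +1 (win) / -1 (loss), added to lambda.
    history = 0.0 if prev_won is None else (1.0 if prev_won else -1.0)
    loss_weight = lam + gamma * history

    def utility(option):
        p_win, gain, loss = option
        return p_win * gain - loss_weight * (1.0 - p_win) * loss

    x = beta * (utility(left) - utility(right))
    x = max(-50.0, min(50.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def random_option(rng):
    """Hypothetical option generator with magnitudes from 10 to 200 points."""
    return (rng.uniform(0.1, 0.9), rng.randint(10, 200), rng.randint(10, 200))


def simulate_session(n_trials, lam, gamma, beta, rng):
    """Simulate one session of choices and win/loss outcomes."""
    trials, prev_won = [], None
    for _ in range(n_trials):
        left, right = random_option(rng), random_option(rng)
        chose_left = rng.random() < p_choose_left(
            left, right, prev_won, lam, gamma, beta)
        chosen = left if chose_left else right
        won = rng.random() < chosen[0]  # the wheel lands on win with p_win
        trials.append({"left": left, "right": right, "prev_won": prev_won,
                       "chose_left": chose_left, "won": won})
        prev_won = won
    return trials


def neg_log_likelihood(trials, lam, gamma, beta):
    """Negative log-likelihood of observed choices under given parameters."""
    total = 0.0
    for t in trials:
        p = p_choose_left(t["left"], t["right"], t["prev_won"],
                          lam, gamma, beta)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p if t["chose_left"] else 1.0 - p)
    return total
```

A recovery check would simulate many sessions with known λ, γ and β, refit each session, and compare fitted with generating values.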
Model Fitting
To allow fitting of individual sessions (20 trials), a Bayesian approach was implemented that allowed specifying priors for each parameter (supplementary methods 2B).
To assess group differences, we then entered these session-wise parameters into hierarchical regressions (using BRMS). This enabled consideration of parameters that might change over the days of testing, as well as individual differences in the means and variability (standard deviation) across sessions of parameters. For example:
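A regression of this kind could take the following general form for a session-wise parameter θ (e.g. λ or γ) of participant i on day d. This is a hedged sketch: the exact set of predictors and random effects is an assumption, with group entering as the ordered mood elevation factor and age and gender as control variables.

```latex
\begin{align}
  \theta_{i,d} &\sim \mathcal{N}\!\left(\mu_{i,d},\, \sigma_i\right) \\
  \mu_{i,d} &= b_0 + b_1\,\mathrm{group}_i + b_2\,\mathrm{day}_d
     + b_3\,\mathrm{age}_i + b_4\,\mathrm{gender}_i + u_i \\
  \log \sigma_i &= c_0 + c_1\,\mathrm{group}_i + c_2\,\mathrm{age}_i
     + c_3\,\mathrm{gender}_i + v_i
\end{align}
```

Here u_i and v_i are participant-level random intercepts, b_1 captures group differences in the mean of the parameter, and c_1 captures group differences in its across-session variability.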
The effect of lithium (vs. placebo) was tested analogously:
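For the patients with BD, an analogous sketch replaces the group term with a drug by time point interaction. As before, the exact specification is an assumption; θ is a session-wise model parameter of participant i on day d, and u_i is a participant-level random intercept.

```latex
\begin{align}
  \theta_{i,d} &\sim \mathcal{N}\!\left(\mu_{i,d},\, \sigma_i\right) \\
  \mu_{i,d} &= b_0 + b_1\,\mathrm{drug}_i + b_2\,\mathrm{time}_{i,d}
     + b_3\,\left(\mathrm{drug}_i \times \mathrm{time}_{i,d}\right)
     + b_4\,\mathrm{age}_i + b_5\,\mathrm{gender}_i + u_i
\end{align}
```

Here drug codes lithium vs. placebo and time codes baseline vs. post-randomisation sessions, so the treatment effect of interest is the interaction coefficient b_3.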
These models were used for group comparisons of mean parameters (supplementary methods 2B+C). Variabilities were not compared as model validation (table S1) suggested poor recovery. Mood data (positive and negative PANAS, happiness VAS) were analysed using similar regressions (supplementary methods 2C) to assess group differences in mood (mean or variability) or the relationship between task outcomes and changes in happiness VAS.
MRI acquisition
Data from all 77 healthy volunteers and 13 patients with BD were collected on a 3T Siemens Magnetom Trio. Data from 9 patients with BD were collected at a different site using a Siemens Magnetom Prisma. Group comparisons include scanner as a control regressor. Scan protocols followed (16); see supplementary methods 3A.
FMRI analysis – whole-brain
General approach
Data were pre-processed using FSL ((38), supplementary methods 3B). Data were pre-whitened before analysis (39). Statistical analysis was performed at two levels. At the first level, we used an event-related GLM approach for each participant. At the second level, data were analysed using a mixed-effects model using FSL’s FLAME 1 (40,41) with outlier de-weighting. The main-effect images are all cluster-corrected (p<0.05 two-tailed) with an inclusion threshold of z > 2.3.
Regression designs
First, at the time of the decision, we looked for neural activity correlating with the utility (predicted reward, loss) of the choice. Second, at the time of the outcome of the gamble, we looked for neural activity related to the processing of the outcome (win/loss as continuous regressor). Decision and outcome-related activity could be dissociated due to the jitter used in the experimental timing (see (16)). Third, as a key measure of interest, we looked at whether there was a history effect at the time of the choice (i.e., previous trial’s gamble win/loss outcome (16,42), analogous to the behavioural findings). Full design information is included in the supplementary methods [3C] and Figure S2.
Group-level comparisons
We compared the low vs high MDQ groups (n=77) in a second-level analysis (statistical thresholds as described above). For the patients with BD, only 23 participants were available. Therefore, group comparisons were first performed in the regions of interest (ROIs) derived from comparisons of the healthy volunteers (see below). As exploratory analyses, groups were also compared at the whole-brain level. Scanner assignment was included as a control variable.
ROI analyses
Mean brain activations (z-stats) were extracted for each participant. These were used to illustrate group differences and also to perform independent statistical tests (e.g., ROIs of clusters defined based on group differences of high vs. low MDQ could be used to test group differences between lithium and placebo). For this, non-hierarchical Bayesian regressions were used, also controlling for age and gender. Brain activations were also correlated with behavioural measures. For this, effects of age, gender and group (and for the patients with BD: number of days in the baseline phase) were first removed using regressions from both neural and behavioural measures. As correlations of primary interest, activity from brain areas of significant differences between the groups was correlated with significant behavioural measures.
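The residualise-then-correlate procedure amounts to a partial correlation and can be sketched as follows. The sketch uses ordinary least squares via Gram-Schmidt orthogonalisation as a stand-in for the regressions used here, and the function names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualise(y, covariates):
    """Residuals of y after regressing out an intercept plus covariates."""
    n = len(y)
    basis = []
    # Build an orthonormal basis for the intercept and covariate columns.
    for column in [[1.0] * n] + [list(map(float, c)) for c in covariates]:
        scale = math.sqrt(_dot(column, column))
        v = list(column)
        for b in basis:
            coef = _dot(v, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(_dot(v, v))
        if scale > 0 and norm > 1e-10 * scale:  # skip collinear columns
            basis.append([vi / norm for vi in v])
    # Remove the projection of y onto that basis.
    residual = [float(yi) for yi in y]
    for b in basis:
        coef = _dot(residual, b)
        residual = [ri - coef * bi for ri, bi in zip(residual, b)]
    return residual


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(neural, behaviour, covariates):
    """Correlate neural and behavioural measures after removing covariates."""
    return pearson(residualise(neural, covariates),
                   residualise(behaviour, covariates))
```

In this setting, `covariates` would hold columns such as age, gender and group coded numerically, `neural` the extracted ROI z-stats, and `behaviour` the outcome history parameter.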
RESULTS
We recruited four groups of participants in two separate studies (Table 1).
Participant demographics. Statistical tests are two-tailed p-values and refer to comparisons between the two groups of healthy participants with low or high mood elevation (‘Low vs. high MDQ’) and between the two groups of patients with BD randomized to lithium or placebo (‘Lith vs. pla’). Values are the mean and standard error of the mean. Abbreviations: ‘# Behav. Days’ – number of days of behavioural data available (20 trials per day), ‘# Behav. days (pre)’ – number of days in the baseline phase for the patients with BD, ‘# PANAS days’ – number of days with mood scores (PANAS, positive affect negative affect scale, short form) available. ‘MDQ’ – Mood disorder questionnaire. ‘Has longitudinal data: Yes’ – percentage of participants from whom longitudinal data (i.e., sessions at home) were available. Diagnoses: ‘BD-I’ – bipolar I disorder; ‘BD-II’ – bipolar II disorder; ‘BDNOS’ – bipolar disorder not otherwise specified; ‘PTSD’ – post traumatic stress disorder. For the patients with BD, comorbid disorders were not measured. Note that in the low and high MDQ groups, diagnoses were only based on SCID, not on a full clinical examination.
General performance
Participants completed longitudinal daily behavioural test sessions at home, consisting of 20 trials of a gambling task and mood self-reports. In the task (Figure 1A), participants needed to choose repeatedly between two gambles (wheels of fortune), considering the probabilities of winning or losing points and the number of points that could be won or lost. Participants in all groups performed the task well (Figure 1B), selecting options with higher values more frequently.
Sensitivity to potential losses
To test whether sensitivity to potential losses vs. wins when gambling was reduced with a mood elevation gradient (low MDQ < high MDQ < patients with BD), we examined how much participants’ choices took the value differences between options into account. We found that indeed, the mood elevation gradient was related to reduced sensitivity to losses compared to wins (Figure 2Ai, group * win/loss dimension * utility bin: mean = 0.33, 95% CI = [0.06; 0.61]), driven by both an increased sensitivity to wins (group * utility bin: mean = 0.24, 95% CI = [0.08; 3.99]) and a decreased sensitivity to losses (group * utility bin: mean = -0.15, 95% CI = [-0.30; -0.01]). To quantify the behavioural effect precisely while controlling for potential confounds, we used computational modelling. We built a stochastic decision-making model that described participants’ choices as being based on the reward and loss utilities of the two options while allowing for individual differences in how people made decisions. The model captured participants’ sensitivity to losses (vs. wins) as a parameter (λ). We found again that the higher the mood elevation, the lower the sensitivity to losses (Figure 2Aii, Table S2, mean = -0.27, 95% CI = [-0.49; -0.05], driven mainly by a decrease in the group of patients with BD compared to the low/high MDQ groups). Lithium vs. placebo did not affect this (Figure 2Aiii).
A) Loss sensitivity. Ai) Illustration of sensitivity of choices to loss/reward utility – as utility increases for the left compared to the right option, participants are more likely to choose the left option. For low/high MDQ participants, this increase in choice probability is similar for the reward or loss dimension. In contrast, patients with BD show decreased sensitivity to losses vs. rewards (the loss curve is shallower). Aii) The loss sensitivity model parameter reflects the group difference observed in the binned data (Ai). Aiii) Lithium (vs. placebo) does not affect loss sensitivity (group [lithium/placebo] * time [pre/post] interaction). B) Outcome history. Bi) Illustration of participants’ increased risk-taking after a loss (compared to a win) on the previous trial (‘outcome history’), indicated by the distance between the % choice values split by previous trial’s outcome. This difference is smaller with mood elevation gradient. Bii) The outcome-history model parameter (γ) differed between the groups, capturing the effects observed in the binned behaviour (Bi). Low MDQ participants showed the most and the patients with BD showed the least outcome history effects. Biii) Lithium (vs. placebo) does not affect outcome history effects. A full list of comparisons of parameters for the groups is shown in Table S2 (longitudinal data) and Table S4 (fMRI session data). Relationships between parameters measured longitudinally over weeks or in the lab during the fMRI session are shown in Table S5. ii) and iii) show conditional effects from regression models, roughly equivalent to means, controlling for regressors of no interest.
Outcome history effects
We next analysed how participants adapted their risk taking across trials based on win or loss outcomes in the previous trial (‘outcome history effect’). Low MDQ participants had the strongest outcome history effects, i.e., reduced risk taking (avoidance of potential losses) after a previous win. This was reduced with the mood elevation gradient (Figure 2Bi, mean = -0.4, 95% CI = [-0.74; -0.06]).
In the computational model, outcome history effects were captured as a parameter (γ) that described to what extent participants were more sensitive to potential losses after a win on the previous trial. We found again that the mood elevation gradient reduced outcome history effects (Figure 2Bii, Table S2, mean = -0.05, 95% CI = [-0.11; -0.0001]).
Mood
Finally, an advantage of the behavioural data being collected at home was that we could relate daily moods to task-based behaviour. As reported previously (3,43) and similar to other studies (2,44,45), groups differed in their instability (standard deviation) of mood. The low MDQ group showed the lowest instability and the patients with BD the highest (positive PANAS: mean = 0.22, 95% CI = [0.11; 0.33]; negative PANAS: mean = 0.64, 95% CI = [0.45; 0.83], Table S2A, Figure S1A). Lithium did not affect instability when using our measure of standard deviation here (Table S2B), though note that using a measure of Bayesian volatility, lithium has been found to increase volatility of positive mood (3). Looking at the relationship between task outcomes and mood (happiness VAS), across all groups, happiness was increased by reward and decreased by loss (mean = 0.42, 95% CI = [0.31; 0.52]), but this did not differ across the mood elevation gradient (mean = -0.06, 95% CI = [-0.15; 0.03]).
Neural results
Neural data were available for 77 volunteers and 23 patients. Across volunteers, brain activations to reward and loss utility during decisions (Figure 3A) and at the receipt of outcomes (Figure 3C) activated brain evaluation networks, including ventromedial prefrontal cortex (vmPFC), ventral striatum, dorsal anterior cingulate cortex (dACC), insula (Table S6). Next, we tested whether, related to the outcome history effect, there was brain activity when participants made a choice related to what had happened previously. Indeed, we found that activity in a network including the ventral striatum, vmPFC and medial frontal pole (FPm) related to the outcome of the previous trial (Figure 3B, Table S6).
A) At the time of the decision, a wide network of areas activated with relative (chosen minus unchosen) reward utility (orange), while relative loss utility activated the anterior cingulate cortex (blue). B) At the time of the decision, the last trial’s outcome (points won or lost) activated areas including vmPFC and ventral striatum (orange). C) At the time of the outcome (win or loss received), the outcome (points won or lost) activated areas including vmPFC, FPm, and ventral striatum (red/orange) and deactivated the pre-supplementary motor area. All results are cluster-corrected at p<0.05, two-tailed, with inclusion cut-off z>2.3. See Table S6 for the full list of results. Data were combined across both volunteer groups (low and high MDQ).
Next, we compared the low and high MDQ groups. Activity for the last trial’s outcome was higher for the low MDQ vs high MDQ group in FPm (Figure 4A-B, Table 2A, p=0.038, whole-brain cluster corrected). In other words, while all participants showed activity in vmPFC/FPm, in low MDQ participants the cluster extended further into FPm. Moreover, the stronger the activity for the last trial’s outcome in this area, the stronger the behavioural outcome history effect (Figure 4C, r=0.24, p=0.017, partial correlation after correction for control variables and group; without correction: r=0.28, p=0.005). Lithium vs. placebo participants’ activity did not differ in this area (mean=0.64, 95% CI = [-0.23; 1.44]).
Statistics for group comparisons in Figures 4-5. A) Comparisons of the low vs high mood elevation volunteers (Figure 4). B) Comparisons of the patients with BD assigned to placebo or lithium (Figure 5). All cluster-based thresholded, inclusion threshold: z=2.3, significance p<0.05 two-tailed. The maximum z-value of the cluster, the p-value and number of voxels are given for each cluster. Anatomical labels are based on: [1] (46), [2] (47), [3] (48), [4] (49), [5] (50).
A) Activation with last trial’s outcome at the time of the current trial’s decision differed between the low and high MDQ groups in the medial frontal pole (FPm; x=-10, y=56, z=16; p=0.038, n=77, cluster-corrected, Table 2A). In the low MDQ group, the activation with the last trial’s outcome that is found across both groups (Figure 3B) extends further dorsally. B) This group difference was driven by the low MDQ group showing stronger activation than the high MDQ group in FPm (Figure shows conditional effects from regression model, roughly equivalent to means, controlling for regressors of no interest). There was no significant difference between activations comparing lithium and placebo groups. C) This FPm activity correlated with the longitudinally measured outcome history parameter. Colours match those of groups in B.
As exploratory analyses (due to low sample sizes in the clinical fMRI groups), we next compared lithium vs. placebo treatment at the whole-brain level. We found that patients receiving placebo had stronger activity related to the outcome of gambles in an area spanning dorsolateral prefrontal cortex (dlPFC, area 46) and lateral frontal pole (Figure 5A-B, Table 2B, p=0.009).
A) Outcome related activity differed between the placebo and the lithium participants in an area including dorsolateral prefrontal cortex and lateral frontal pole (whole-brain cluster-corrected, Table 2B). This effect is illustrated in B).
DISCUSSION
We designed a study to test the computational and neural correlates of adaptations of risk-taking to gains and losses in bipolar disorder (BD) and treatment with lithium. We included participants along a gradient of history of mood elevation ranging from volunteers with low risk of BD or mood instability (low MDQ group), to volunteers at high risk of BD, to patients with BD. In the patients, we tested the effect of lithium treatment in a placebo-controlled double-blind design. We measured how much participants adapted their risk-taking following reward outcomes in a risky decision-making task (‘outcome history effects’). We measured behaviour both longitudinally over up to 50 days and during a brain imaging (fMRI) session. We found that the low MDQ group showed ‘outcome history effects’. Specifically, after a win on a trial, they were more risk averse. This was reduced across the mood elevation gradient (lowest in patients with BD). Neurally, outcome history was related to the representation of past information in a large swath including ventromedial prefrontal cortex (vmPFC) and medial frontal pole (FPm). In low MDQ volunteers, this brain signal extended further dorsally into FPm.
First, we replicated previous findings (33,34) that mood elevation decreased sensitivity to potential losses (vs. wins). Then, we went further, looking at adaptation of risk taking to past outcomes. We found that healthy volunteers without mood elevation showed sequential dependencies between their choices and previous trials’ outcomes. This was not strictly rational in our task since outcomes for gambles across trials were independent. Therefore, these sequential effects could be understood as a bias, e.g., ‘chasing’ as has been described in the gambling literature (8). However, an alternative view is that ‘biases’ observed in the lab are actually functionally appropriate in more naturalistic environments (17,51–53). For example, in natural environments, which are experienced continually rather than in discrete trials and in which different types of rewards (e.g. food, water) need to be accumulated or a homeostatic setpoint needs to be reached, it would make sense to adapt behaviour according to previous outcomes (9,10,12,54,55). Here we find that in the absence of mood elevation, volunteers show a kind of ‘active homeostatic behaviour’, i.e., they modulate their risk-taking depending on the previous trial’s outcome. This tendency was lower in the high MDQ group and lowest in patients with BD. Reduced homeostatic behaviour could lead to unstable moods, as in the healthy population mood has been found to be regulated through behaviour (13). Relatedly, in patients with BD, purposefully regulating behaviour during the prodromal periods has been shown to reduce the risk of relapse (56).
We focused on whole-brain analyses for the low/high MDQ volunteer sample due to the larger sample size. Decision-making and the processing of outcomes produced a typical pattern of activation (15,57–59) in areas including dorsal anterior cingulate cortex, striatum and vmPFC. However, there were no group differences in any of these signals, matching our behavioural results. We next looked for brain activity related to the modulation of risk taking with ‘outcome history’. We found that at the time when people made decisions, there was activity representing the last trial’s outcome in an area spanning vmPFC to FPm. This is similar to previous findings of between-trial activity in a learning context (16,35,42). This signal extended more dorsally into FPm in low MDQ volunteers. Furthermore, the stronger this signal, the stronger the modulation of risk taking by outcome history. In this region, lithium did not affect brain activity. In an exploratory analysis, we also compared the brain activity of patients with BD receiving lithium or placebo. Patients given placebo showed larger outcome-related activity in dorsolateral prefrontal cortex, while under lithium this activity was similar to that of the non-clinical groups. Previously, a dampening of reward responses with lithium has been reported in the ventral striatum in healthy volunteers (60).
It has been proposed that changes to reward processing, particularly reward hypersensitivity, are central to BD (61,62). While self-reports strongly support this (63), behavioural findings are more mixed: though some have found differences in reward-based decision-making or risk-taking (64,65), the interpretation has been hampered by concerns about the tasks (64), non-computational analysis strategies focusing on ‘correct’ answers (65), or similar effects being seen in other disorders. Neurally, while there have been suggestions that BD is accompanied by increased reward-related brain signals (66), opposite findings have also been observed (67). One possible explanation could be that these neural effects are sensitive to patients’ mood states or medication (67). Recently, computational work has suggested how differences in the interplay between reward processing and mood could lead to feedback loops and fluctuations in mood, including spiralling out into manic and depressive episodes (4,6,7). Our findings here complement these results by revealing the process underlying changes in active homeostatic behaviour. In the future, it would be interesting to see how these different processes interact or whether there are distinct clusters of patients as has been proposed for other facets of decision-making (68).
The current study also revealed an effect of lithium treatment on the neural response to rewarded outcomes, making these measures more similar to those of healthy controls. Analysis of subjective ratings from the same patient cohort revealed that lithium increased the volatility of positive affective experiences compared to placebo (3). This effect was proposed to reduce the persistence of positive affect in patients with bipolar disorder. A previous study also revealed effects of lithium on reward-related prediction errors in a healthy volunteer model (60). While the efficacy of lithium in the treatment of both manic and depressive episodes is well established, its mechanisms of action remain a subject of debate. Together, our results highlight the importance of considering rewarded decision-making and learning perspectives to understand mood instability and the effects of lithium.
Contributions
JS: Conceptualization, software, formal analysis, writing, original draft, reviewing and editing
PP: Investigation, writing, reviewing and editing
NN: Conceptualization, investigation, reviewing
NK: Conceptualization, reviewing and editing
LZA: Investigation, writing, reviewing and editing
MFSR: Conceptualization, design, reviewing and editing
CAN: Funding acquisition, conceptualization, supervision, design, reviewing and editing
PJH: Funding acquisition, conceptualisation, recruitment, reviewing and editing
JG: Funding acquisition, conceptualisation, design, supervision, reviewing and editing
KS: Investigation, project administration, reviewing and editing
CJH: Funding acquisition, conceptualization, design, supervision, reviewing and editing
Financial disclosure
The study was funded by a Wellcome Trust Strategic Award (CONBRIO: Collaborative Oxford Network for Bipolar Research to Improve Outcomes, reference No. 102616/Z). JRG, CJH, PJH and KEAS are supported by the Oxford Health NIHR Biomedical Research Centre. MFSR is funded by the Wellcome Trust (221794/Z/20/Z). The Wellcome Centre for Integrative Neuroimaging is supported by core funding from the Wellcome Trust (203139/Z/16/Z). JS has been funded by the Institut National de la Santé et de la Recherche Médicale, the Biotechnology and Biological Sciences Research Council (BB/V004999/1, Discovery Fellowship) and the Medical Research Council (MR/N014448/1, Skills Development Fellowship). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health.
JS, PP, NN, LZA, NK, JG, MFSR report no biomedical financial interests or potential conflicts of interest. CJH has received consultancy payments from P1vital, Lundbeck, Compass Pathways, IESO, Zogenix (now UCB). PJH reports receiving an honorarium for editorial work for Biological Psychiatry and Biological Psychiatry Global Open Science. ACN is non-executive director at the Oxford Health Foundation Trust. KEAS has received consultancy payment from Yale University.