Predicting Quality Adjusted Life Years in young people attending primary mental health services

Matthew P Hamilton; Caroline X Gao; Kate M Filia; Jana M Menssink; Sonia Sharmin; Nic Telford; Helen Herrman; Ian B Hickie; Cathrine Mihalopoulos; Debra J Rickwood; Patrick D McGorry; Sue M Cotton

doi:10.1101/2021.07.07.21260129

Abstract

Background Quality Adjusted Life Years (QALYs) are often used in economic evaluations, yet utility weights for deriving them are rarely directly measured in mental health services.

Objectives We aimed to: (i) identify the best Transfer To Utility (TTU) algorithms and predictors for adolescent weighted Assessment of Quality of Life - six dimensions (AQoL-6D) health utility and (ii) assess ability of TTU algorithms to predict longitudinal change.

Methods We recruited 1107 young people attending Australian primary mental health services, collecting data at two time points, three months apart. Five linear and three generalised linear models were explored to identify the best TTU algorithm. Random forest models were used to assess the predictive ability of six candidate measures of psychological distress, depression and anxiety, and linear / generalised linear mixed effect models were used to construct longitudinal predictive models for AQoL-6D change.

Results A depression measure (Patient Health Questionnaire-9) was the strongest independent predictor of health utility. Linear regression models with complementary log-log transformation of utility score were the best performing models. Between-person associations were slightly larger than within-person associations for most of the predictors.

Conclusions Adolescent AQoL-6D utility can be derived from a range of psychological distress, depression and anxiety measures. TTU algorithms estimated from cross-sectional data can approximate longitudinal change but may slightly bias QALY predictions.

Toolkits The TTU models produced by this study can be searched, retrieved and applied to new data to generate QALY predictions with the Youth Outcomes to Health Utility (youthu) R package - https://ready4-dev.github.io/youthu.

1 Introduction

To efficiently allocate scarce public resources between competing mental health programs, it is useful to have a common measure of benefit. Quality adjusted life years (QALYs) are generic indices of outcome that inform public health policy in many countries [1] and are frequently used in health economic evaluations, including in mental health. The “quality” in QALYs is often measured via the use of multi-attribute utility instruments (MAUIs), where domains of quality of life measured by a questionnaire are weighted using the preferences of people [2]. This approach produces a single health utility weight for each individual for each measured health state, anchored on a scale where 0 represents death and 1 represents perfect health. Health utility weights can be converted to QALYs by weighting the duration (the “years” part of QALYs) each individual spends in each health state.
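The conversion from utility weights to QALYs can be illustrated with a minimal sketch. It assumes utility changes linearly between assessments (the trapezoidal rule), which is a common convention but an assumption on our part here; the function name `qalys_from_utilities` is hypothetical, and the two utility values are used purely for illustration.

```python
def qalys_from_utilities(times_in_years, utilities):
    """Area under the utility curve, assuming linear change between assessments."""
    if len(times_in_years) != len(utilities) or len(utilities) < 2:
        raise ValueError("need at least two matched time points and utilities")
    total = 0.0
    for k in range(1, len(utilities)):
        duration = times_in_years[k] - times_in_years[k - 1]
        if duration < 0:
            raise ValueError("time points must be in ascending order")
        # Average utility over the interval, weighted by the interval length.
        total += duration * (utilities[k - 1] + utilities[k]) / 2.0
    return total


# Illustrative only: utility of 0.59 at baseline and 0.68 three months
# (0.25 years) later.
example_qalys = qalys_from_utilities([0.0, 0.25], [0.59, 0.68])
```

Under these assumptions the example accrues (0.59 + 0.68) / 2 × 0.25 QALYs over the three-month interval.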

MAUIs are regularly used in research studies such as clinical trials and epidemiological surveys, but rarely feature in routine data collection by mental health services. In the absence of direct measurement, Transfer to Utility (TTU) analysis has been developed to map utility weights from standard health status measurements [3]. In mental health settings, TTU algorithms have been developed to map psychological distress (measured using Kessler Psychological Distress Scale – 10 items, K10) and depression and anxiety symptoms (measured using Depression, Anxiety, and Stress Scale – 21 items, DASS-21 [4]) to a range of health utility measures including the Assessment of Quality of Life – 8 dimensions (AQoL-8D [5]). Published mental health TTU algorithms have been developed for adult [5] or child [6] general populations; however, they have questionable appropriateness for predicting health utility in clinical mental health samples of young people. Other difficulties with currently available TTU algorithms include over-reliance on cross-sectional data (not capturing the longitudinal dimension of QALYs), and a limited range of predictors.

With a sample of help-seeking young people attending primary mental health care services, we aimed to: (i) identify the best TTU regression models to predict adolescent weighted AQoL-6D utility and evaluate the predictive ability of six candidate measures of psychological distress, depression and anxiety; and (ii) assess ability of the TTU algorithms to predict longitudinal (three-month) change.

2 Methods

2.1 Sample and setting

This study forms part of a research program to develop better outcome measures for young people seeking mental health support, and the study sample has previously been described [7]. Briefly, young people aged 12 to 25 years who presented for a first appointment for mental health or substance use related issues were recruited from three metropolitan and two regional Australian youth-focused primary mental health clinics (headspace centres) between September 2016 and April 2018. Sample characteristics are similar to previous descriptions of headspace clients, with slight differences in age (fewer aged 12-14, more aged 18-20), cultural background (more Culturally and Linguistically Diverse and fewer Aboriginal and Torres Strait Islander young people), sexuality (fewer heterosexual clients) and housing (more in unstable accommodation) [7].

2.2 Measures

We collected data on utility weights, six candidate predictors of utility weights including psychological distress, depression and anxiety measures as well as demographic, clinical and functional population information.

2.2.1 Utility weights

We assessed utility weights using the Assessment of Quality of Life – Six Dimension scale (AQoL-6D; [8]) MAUI. It was selected due to the relevance of its domains for a clinical mental health sample [9] and its acceptable participant time-burden. The AQoL-6D instrument contains 20 items across the six dimensions of independent living, social and family relationships, mental health, coping, pain and senses. Health utility scores were calculated using a published algorithm for adolescents (available at https://www.aqol.com.au/index.php/aqolinstruments?id=92), using Australian population preference weights.

2.2.2 Candidate predictors

Data from six measures of psychological distress (one measure), depression (two measures) and anxiety (three measures) symptoms were used as candidate predictors to construct TTU models. These measures were selected as they are widely used in clinical mental health services or clinically relevant to the profiles of young people seeking mental health care.

The Kessler Psychological Distress Scale (K6; [10]) was used to measure psychological distress over the last 30 days. It includes six items (nervousness, hopelessness, restlessness, sadness, effort, and worthlessness) of the 10 item version of this measure, K10. Individual items use a five-point frequency scale that spans from 0 (“none of the time”) to 4 (“all of the time”).

The Patient Health Questionnaire-9 (PHQ-9; [11]) and Behavioural Activation for Depression Scale (BADS; [12]) were used to measure degree of depressive symptomatology. PHQ-9 includes nine questions measuring the frequency of depressive thoughts (including self-harm/suicidal thoughts) as well as associated somatic symptoms (e.g., sleep disturbance, fatigue, anhedonia, appetite, psychomotor changes) in the past two weeks. PHQ-9 uses a four-point frequency scale ranging from 0 (“Not at all”) to 3 (“Nearly every day”). For the PHQ-9 a total score is derived (0-27) with higher scores depicting greater symptom severity. BADS measures a range of behaviours (activation, avoidance/rumination, work/school impairment as well as social impairment) reflecting severity of depression. BADS includes 25 questions on behaviours over the past week, scored on a seven-point scale ranging from 0 (“Not at all”) to 6 (“Completely”). A total score is derived for the BADS (0-150) as well as subscale scores, with higher scores indicating greater activation.

The Generalised Anxiety Disorder Scale (GAD-7; [13]), Screen for Child Anxiety Related Disorders (SCARED; [14]) and Overall Anxiety Severity and Impairment Scale (OASIS; [15]) were used to measure anxiety symptoms. GAD-7 measures symptoms such as nervousness, worrying and restlessness over the past two weeks using seven questions, with a four-point frequency scale ranging from 0 (“Not at all”) to 3 (“Nearly every day”). A total score is calculated with scores ranging from 0 to 21 and higher scores indicating more severe symptomatology. SCARED is an anxiety screening tool designed for children and adolescents which can be mapped directly onto specific Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) anxiety disorders including generalised anxiety disorder, panic disorder, separation anxiety disorder and social phobia. It includes 41 questions on a three-point scale of 0 (“Not true or hardly ever true”), 1 (“Somewhat true or sometimes true”) and 2 (“Very true or often true”) to measure symptoms over the last three months. A total score is derived with scores ranging from 0-82, with higher scores indicative of the presence of an anxiety disorder. The OASIS was developed as a brief questionnaire to measure severity of anxiety and impairment in clinical populations. The OASIS includes five questions about frequency and intensity of anxiety as well as related impairments such as avoidance, restricted activities and problems with social functioning over the past week. Total scores range from 0-20 with higher scores depicting more severe symptomatology.
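The total scores described above are simple sums of item responses. The sketch below shows one way such scoring with range checks might look; the `INSTRUMENTS` table and `total_score` function are hypothetical names, only instruments whose item counts and response ranges are stated above are included, and any reverse-scoring or subscale rules that an instrument's manual may require are deliberately not handled.

```python
# Item counts and maximum item responses as described in the text.
INSTRUMENTS = {
    "K6": {"n_items": 6, "max_item_score": 4},       # total 0-24
    "PHQ-9": {"n_items": 9, "max_item_score": 3},    # total 0-27
    "GAD-7": {"n_items": 7, "max_item_score": 3},    # total 0-21
    "SCARED": {"n_items": 41, "max_item_score": 2},  # total 0-82
}


def total_score(item_responses, n_items, max_item_score):
    """Sum item responses after checking their number and range."""
    if len(item_responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_responses)}")
    for response in item_responses:
        if not 0 <= response <= max_item_score:
            raise ValueError(f"item response {response} is out of range")
    return sum(item_responses)


# Made-up responses for one hypothetical participant.
phq9_total = total_score([2, 1, 3, 0, 2, 1, 1, 0, 2], **INSTRUMENTS["PHQ-9"])
```

Validating the range before summing guards against data-entry errors propagating into predicted utility.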

2.2.3 Population characteristics

We collected self-reported measures of demographics (age, gender, sex at birth, education and employment status, languages spoken at home and country of birth). We also collected clinician or research interviewer assessed measures of mental health including primary diagnosis, clinical stage [16] and functioning (measured by the Social and Occupational Functioning Assessment Scale (SOFAS) [17]).

2.3 Procedures

Eligible participants were recruited by trained research assistants and written consent was obtained from the young person and a parent/guardian if the participant was aged <18 years.

Participants responded to the questionnaire via a tablet device and participants’ clinical characteristics were obtained from clinical records and research interview. At three-months post-baseline, participants were contacted in person or by telephone, to complete a 3-month follow-up assessment.

2.4 Statistical analysis

Basic descriptive statistics were used to characterise the cohort in terms of baseline demographics and clinical variables. Pearson’s Product Moment Correlations (r) were used to determine the relationships between candidate predictors and the AQoL-6D utility score.

2.4.1 TTU regression models

As AQoL-6D utility score is normally left skewed and constrained between 0 and 1, ordinary least squares (OLS) models with different types of outcome transformations (such as log and logit) have been previously used in TTU regression [3]. Similarly, generalised linear models (GLMs) address this issue via modelling the distribution of the outcome variable and applying a link function between the outcome and linear combination of predictors [18].

We compared the predictive performance of a range of models predicting AQoL-6D utility scores using the candidate predictor that had the highest Pearson correlation coefficient with utility scores. The models compared were OLS regression with log, logit, log-log (f(y) = -log(-log(y))) and clog-log (f(y) = log(-log(1-y))) transformations; GLM using Gaussian distribution with log link; and GLM using Beta distribution with logit and clog-log links. Ten-fold cross-validation was used to compare model fit on training datasets and predictive ability on testing datasets, using three indicators: R², root mean square error (RMSE) and mean absolute error (MAE) [19,20].
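As an illustration of this procedure, the following sketch fits a single-predictor OLS model to complementary log-log transformed utility, f(y) = log(-log(1-y)), back-transforms the predictions and averages R², RMSE and MAE over ten test folds. It is not the study code (which was written in R): the data are synthetic, the data-generating coefficients are invented, and all function names are hypothetical.

```python
import math
import random


def cloglog(y):
    """Complementary log-log transformation of a utility in (0, 1)."""
    return math.log(-math.log(1.0 - y))


def inv_cloglog(x):
    """Back-transformation; the result always lies in (0, 1)."""
    return 1.0 - math.exp(-math.exp(x))


def fit_ols(xs, ys):
    """Intercept and slope of a single-predictor least squares fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def metrics(observed, predicted):
    """R squared, RMSE and MAE on the original utility scale."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_o = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return {"r2": 1.0 - ss_res / ss_tot,
            "rmse": math.sqrt(ss_res / n),
            "mae": sum(abs(e) for e in errors) / n}


def cross_validate(xs, ys, k=10, seed=1):
    """Average test-fold metrics from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        b0, b1 = fit_ols([xs[i] for i in train_idx],
                         [cloglog(ys[i]) for i in train_idx])
        predicted = [inv_cloglog(b0 + b1 * xs[i]) for i in test_idx]
        results.append(metrics([ys[i] for i in test_idx], predicted))
    return {m: sum(r[m] for r in results) / k for m in ("r2", "rmse", "mae")}


# Synthetic data: scores on a 0-27 scale and utilities generated from
# invented coefficients (0.9, -0.08) with noise on the transformed scale.
rng = random.Random(42)
scores = [rng.randint(0, 27) for _ in range(300)]
utilities = [inv_cloglog(0.9 - 0.08 * s + rng.gauss(0, 0.3)) for s in scores]
cv_summary = cross_validate(scores, utilities)
```

Because the back-transformation maps any real number into (0, 1), predictions from this type of model cannot exceed 1.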

To evaluate whether candidate predictors could independently predict utility scores, we established multi-variate prediction models using baseline data with the candidate predictor and a range of other risk factors including participants’ age, sex at birth, clinical stage, cultural and linguistic diversity, education and employment status, primary diagnosis, region of residence (whether metropolitan - based on location of attending service) and sexual orientation. Functioning (as measured by SOFAS), was also included in each model to evaluate whether it can jointly predict utility with clinical symptom measurements.

2.4.2 Candidate predictor comparison

Two steps were used to compare the usefulness of the candidate predictors. First, we used a random forest model including all six candidate predictors. Anxiety and depression measurements are highly collinear, making it difficult to compare these candidate predictors using one regression model. Random forest models provide flexible methods for comparing correlated predictors’ relative ‘importance’ (loss of accuracy from random permutation of the predictor) for the overall prediction model [21]. Second, the predictive performance of each candidate predictor under the selected TTU regression models was compared using 10-fold cross-validation. This procedure helped us to directly evaluate the independent predictive ability of different candidate predictors.
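The idea of permutation ‘importance’ can be sketched in a model-agnostic way. The example below is a simplification, not the random forest procedure itself (which typically evaluates permutations on out-of-bag samples): it shuffles one predictor at a time and records the increase in mean squared error for a fixed, hypothetical prediction function `toy_predict` applied to synthetic data.

```python
import random


def mean_squared_error(observed, predicted):
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def permutation_importance(predict, rows, outcomes, n_predictors, seed=1):
    """Increase in MSE after randomly permuting each predictor column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(outcomes, [predict(r) for r in rows])
    importances = []
    for col in range(n_predictors):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only this column replaced by its shuffled value.
        permuted_rows = [r[:col] + (v,) + r[col + 1:]
                         for r, v in zip(rows, shuffled)]
        permuted = mean_squared_error(outcomes,
                                      [predict(r) for r in permuted_rows])
        importances.append(permuted - baseline)
    return importances


def toy_predict(row):
    """Hypothetical model: predictor 0 matters far more than predictor 1."""
    return 0.9 - 0.02 * row[0] - 0.001 * row[1]


# Synthetic rows of two predictors on 0-27 and 0-82 scales.
rng = random.Random(7)
rows = [(rng.randint(0, 27), rng.randint(0, 82)) for _ in range(200)]
outcomes = [toy_predict(r) + rng.gauss(0, 0.05) for r in rows]
importances = permutation_importance(toy_predict, rows, outcomes, n_predictors=2)
```

A predictor whose permutation causes a larger loss of accuracy is ranked as more important, which allows correlated predictors to be compared within one model.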

2.4.3 Methods to evaluate the ability of measures to predict longitudinal change in health utility

After identifying the best TTU regression model(s), we established longitudinal models to evaluate the ability to predict change. This was achieved using generalised linear mixed-effect models (GLMMs) including both the baseline and follow-up data. The detailed model is specified in Equation 1:

$$g(U_{i,j}) = \beta_0 + b_i + \beta_{baseline} S_{i,baseline} + \beta_{change} \Delta S_{i,j} + \epsilon_{i,j} \qquad (1)$$

where g() is the link function of the model; U_i,j is the AQoL-6D utility score of individual i at observation j; S_i,baseline is the baseline distress/depression/anxiety score for individual i; and ΔS_i,j is the score change from baseline for individual i at observation j. We used β₀ to represent the fixed intercept, b_i to represent the random intercept for individual i (controlling for clustering at the individual level) and ϵ_i,j to represent the random error. Hence for baseline observations ΔS_i,j = 0, and at follow-up ΔS_i,j = S_i,follow-up − S_i,baseline. With this parameterisation, β_baseline can be interpreted as the between-person association and β_change as the within-person association. When β_baseline = β_change = β, Equation 1 can be generalised to:

$$g(U_{i,j}) = \beta_0 + b_i + \beta S_{i,j} + \epsilon_{i,j} \qquad (2)$$

for both baseline and follow-up observations, where S_i,j = S_i,baseline + ΔS_i,j is the score of individual i at observation j. The discrepancy between β_baseline and β_change can be interpreted as the bias from estimating longitudinal score changes within individuals using cross-sectional score differences between individuals.
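The parameterisation described above can be sketched as a data-preparation step followed by a population-level prediction. This is a minimal illustration, assuming a complementary log-log transformed outcome and setting the random intercept b_i and the error term to zero; the coefficient values are invented for the example and are not study estimates, and the function names are hypothetical.

```python
import math


def to_baseline_and_change(records):
    """Convert (person_id, timepoint, score) records into baseline score and
    change-from-baseline columns; change is 0 for baseline observations."""
    baseline_scores = {pid: score for pid, timepoint, score in records
                       if timepoint == "baseline"}
    rows = []
    for pid, timepoint, score in records:
        if pid not in baseline_scores:
            continue  # no baseline score, so change cannot be defined
        s_baseline = baseline_scores[pid]
        rows.append({"id": pid, "timepoint": timepoint,
                     "s_baseline": s_baseline, "s_change": score - s_baseline})
    return rows


def predict_utility(s_baseline, s_change, beta_0, beta_baseline, beta_change):
    """Population-level prediction with random intercept and error set to 0."""
    linear_predictor = (beta_0 + beta_baseline * s_baseline
                        + beta_change * s_change)
    # Inverse of the complementary log-log transformation.
    return 1.0 - math.exp(-math.exp(linear_predictor))


# Made-up data: one person whose score falls from 15 to 10.
rows = to_baseline_and_change([("a", "baseline", 15), ("a", "follow_up", 10)])

# HYPOTHETICAL coefficients, chosen only to illustrate the calculation.
coefficients = {"beta_0": 0.9, "beta_baseline": -0.08, "beta_change": -0.065}
predicted = [predict_utility(r["s_baseline"], r["s_change"], **coefficients)
             for r in rows]
```

Separating the baseline score from the change score is what allows the between-person and within-person associations to take different values.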

Bayesian linear mixed models were used to avoid common convergence problems in frequentist tools [22]. Linear mixed effect models (LMMs) can be fitted in the same framework with Gaussian distribution and identity link function. Clustering at the individual level is controlled by including random intercepts. Model fit was evaluated using Bayesian R² [23].

2.4.4 Secondary analyses

We repeated the previous steps to develop additional TTUs - a set of models that used SOFAS as an independent predictor (Secondary Analysis A) and a set of models that combined anxiety and depression predictors (Secondary Analysis B).

2.4.5 Software

We undertook all our analyses using R 4.0.2 [24]. We used a wide range of third-party code libraries in the analysis and reporting (see Supplementary Information, Table A.5). We wrote our analysis and reporting algorithms as R packages so that they can be used by others as tools for predicting QALYs, replicating this study and developing TTUs with different utility measures and predictors. Where it is not feasible to publicly release study data, synthetic replication datasets can be useful [25]. We created such a dataset and included it in one of our R packages.

3 Results

3.1 Cohort characteristics

Participant characteristics at baseline and follow-up are displayed in Table 1. This study included the 1068 of 1107 participants who had complete AQoL-6D data. This cohort predominantly comprised individuals with anxiety/depression (76.7%) at early (prior to first episode of a serious mental disorder) clinical stages (91.7%). Participant ages ranged from 12 to 25 years, with a mean age of 18.13 (SD = 3.26).

Table 1:

Participant characteristics

There were 643 participants (60.2%) who completed AQoL-6D questions at the follow-up survey three months after baseline assessment.

3.2 AQoL-6D and candidate predictors

Distributions of AQoL-6D total utility score and sub-domain scores are displayed in Figure 1. The mean utility score was 0.59 (SD = 0.24) at baseline and 0.68 (SD = 0.24) at follow-up. Distributions of the candidate predictors (BADS, GAD-7, K6, OASIS, PHQ-9 and SCARED) are summarised in Table 2. PHQ-9 had the highest correlation with utility score at both baseline and follow-up, followed by OASIS and BADS. SCARED had the lowest correlation coefficients with utility score at both baseline and follow-up, although all correlation coefficients can be characterised as strong.

Table 2:

Candidate predictors distribution parameters and correlations with AQoL-6D utility

Figure 1:

Distribution of AQoL-6D domains

3.3 TTU regression model performance

The 10-fold cross-validated model fit indices from TTU models using PHQ-9 are reported in Table A.1 in the Supplementary Material. Training and testing R², RMSE and MAE were all comparable between GLM model types. The best performing OLS models were those with no transformation, log transformation or clog-log transformation. Model diagnostics (such as heteroscedasticity and residual normality) suggested better fit for the clog-log transformed model, as the distribution of clog-log transformed utility was closest to a normal distribution among all transformation methods. Another benefit of the clog-log model is that predicted utility scores are constrained to an upper bound of 1, preventing out-of-range predictions. Therefore, both GLM with Gaussian distribution and log link and OLS with clog-log transformation were selected for further evaluation. The predictive ability of each candidate predictor using baseline data was also compared using 10-fold cross-validation.

As shown in Table A.2, PHQ-9 had the highest predictive ability followed by OASIS, BADS, GAD-7 and K6. SCARED had the least predictive capability. This is consistent with the random forest model, in which PHQ-9 was found to be the most ‘important’ predictor (see Figure A.1). The confounding effects of other participant characteristics were also evaluated when using the candidate predictors to predict utility score. Using the baseline data, SOFAS was found to independently predict utility scores in models for all six candidate predictors (p<0.005). No other confounding factor was identified for the either predictor prediction model; sex at birth was found to be a confounder for the K6 model (p<0.01). A few other confounders, including primary diagnosis, clinical staging and age, were identified as weakly associated with utility in TTU models using anxiety and depression measurements other than PHQ-9. Considering that many of these factors are unlikely to change over three months, they were not evaluated in the mixed effect models.

3.4 Longitudinal TTU regression models

Regression coefficients of the baseline score and score changes (from baseline to follow-up) estimated in individual GLMM and LMM models are summarised in Table 3. Bayesian R² from each model is reported. Modelled residual standard deviations (SDs) are also provided to support simulation studies that need to capture individual-level variation. In GLMM and LMM models, the prediction models using OASIS and PHQ-9 respectively had the highest R² (0.68 and 0.76) and lowest estimated residual SD. R² values were above 0.7 for all LMM models and above 0.6 for all GLMM models except for the K6 model. Variance of the random intercept was comparable with the residual variance.

Table 3:

Estimated coefficients from longitudinal TTU models for candidate predictors

The coefficients of score change from baseline were generally estimated to be lower than the coefficients of baseline score (except for SCARED). The mean ratio between the two coefficients (β_change/β_baseline) is 0.82 for K6, between 0.8 and 0.85 for depression measurements and between 0.9 and 1.09 for anxiety measurements.
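To see what these ratios imply, consider a worked calculation. It assumes, as a simplification on our part, that a TTU model fitted to cross-sectional data recovers the between-person coefficient β_baseline and that this coefficient is then applied to a within-person score change; the resulting over- or under-estimate is expressed on the scale of the linear predictor, not directly on the utility scale. The function name is hypothetical.

```python
def relative_overestimate(ratio_change_to_baseline):
    """Proportional error in predicted change on the linear predictor scale
    when beta_baseline is used in place of beta_change.

    Predicted change is beta_baseline * dS and the within-person change is
    beta_change * dS, so their ratio is 1 / (beta_change / beta_baseline).
    """
    return 1.0 / ratio_change_to_baseline - 1.0


# Ratios quoted in the text for the candidate predictors.
for ratio in (0.8, 0.82, 0.85, 0.9, 1.09):
    print(f"ratio {ratio:.2f}: {relative_overestimate(ratio):+.1%}")
```

A ratio below 1 therefore corresponds to over-estimated change, and a ratio above 1 (as at the top of the range for anxiety measurements) to under-estimated change.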

Distributions of observed and predicted utility scores and their association from GLMM (Gaussian distribution and log link) and LMM (complementary log-log transformation) using PHQ-9 are plotted in Figure 2. Compared with GLMM, the predicted utility scores from the LMM model converge better to the observed distribution and provide better estimates at the tails of the distribution. When observed utility scores were low, the predicted utility scores were too high in the GLMM model; see Figure 2 (B). The observed and predicted distributions of utility scores for other anxiety and depression measurements were similar from LMM models. However, GLMM models had low coverage of utility scores below 0.3 and also made out-of-range predictions (over 1).

Figure 2:

Comparison of observed and predicted AQoL-6D utility score from longitudinal TTU of PHQ-9. (A) Density plots of observed and predicted utility scores (GLMM with Gaussian distribution and log link); (B) scatter plots of observed and predicted utility scores by timepoint (GLMM with Gaussian distribution and log link); (C) density plots of observed and predicted utility scores (LMM with clog-log transformation); (D) scatter plots of observed and predicted utility scores by timepoint (LMM with clog-log transformation).

We also evaluated models with SOFAS at baseline and SOFAS change from baseline added to psychological distress, depression and anxiety predictors (see Tables A.3 and A.4). SOFAS scores were generally found to be associated with utility scores when controlling for anxiety and depression symptom measurements in longitudinal models.

The secondary analysis where SOFAS is the sole predictor resulted in models with slightly lower R² than all primary analysis models. Adding the PHQ-9 depression measure to each anxiety measure predictor did not notably improve the performance of these models.

Detailed summaries of all models from the primary and secondary analyses are available in the online data repository (see “Availability of data and materials”).

3.5 Toolkits for predicting QALYs and modelling additional TTUs

We created an online results data-repository and three R packages to facilitate easy access to and application of study outputs and replication of study methods. See “Availability of data and materials” for details of where these resources (and supporting documentation) can be accessed.

4 Discussion

MAUIs are largely absent in routine data collection in clinical mental health services. This gap means that it can be difficult for researchers, service planners and service commissioners to derive much economic insight from the often-rich outcome data that is collected in administrative and treatment evaluation datasets. Existing TTU algorithms may not appropriately predict longitudinal change in utility weights especially in help-seeking young people. Our study addresses this important gap and is the first to evaluate longitudinal mapping ability between affective symptom measurements and health utility in a cohort of help seeking young people.

Although there is encouraging evidence about the quality, effectiveness and cost-effectiveness of youth mental health service innovations worldwide [26][27], the public health and economic returns from systemic reforms to support better mental health in young people still need to be better understood [28]. Our study contributes to this goal by developing tools that can extract additional economic insights from existing mental health datasets by facilitating prediction of QALYs with our TTU algorithms and supporting the development of additional TTU algorithms by other researchers.

By helping to translate measures commonly collected in youth mental health services to QALYs, our TTU algorithms enable greater use of cost-utility analyses (CUAs). Unlike alternative economic evaluation types (e.g., Cost Consequence Analysis and Cost-Effectiveness Analysis using measures other than health utility) CUAs have commonly understood willingness to pay benchmarks for outcomes and facilitate comparison of the value for money claims of interventions from different illness groups. In practical terms, CUAs can help a decision-maker assess the competing economic claims of an intervention for depression compared to an intervention in anxiety or determine whether it may be efficient to fund expanded access to specified mental health services by redirecting parts of the general health budget.

As many youth mental health services routinely collect data on at least one of our six candidate predictors and the measure of functioning (SOFAS) included in our models, the TTU algorithms we developed in this study may have widespread applicability. Importantly, our TTUs were developed in a clinical sample of 12-25 year olds, using adolescent AQoL-6D weights. We were able to independently predict adolescent AQoL-6D from each of the six candidate measures we assessed, with PHQ-9 having the best predictive performance. Predictive performance was improved when adding SOFAS as an additional predictor or confounder to each model; SOFAS also performed well as an independent predictor. These results may be useful for service system planners in helping to prioritise which measures should be included in routine data collection. Although direct measurement of health utility with measures such as the ReQoL [29] may be feasible in some mental health services, relying on clinical measures that can also map to health utility may be an attractive alternative.

A key feature of QALYs is their longitudinal dimension: health utilities are weighted and aggregated based on the time spent in varying health states. Our results suggest that psychological distress, depression and anxiety measurements explain the variations of health utility, and that cross-sectional variations can be used to approximate longitudinal change in this cohort. However, a finding of our study is that, for psychological distress and depression measures at least, TTU algorithms developed from cross-sectional data may slightly over-estimate these changes, introducing bias into QALY predictions (overestimating QALYs for populations whose health utility improves over time, underestimating QALYs for those with deteriorating mental health).

Key strengths of our study include the novelty of our clinical youth mental health study sample, the use of clinically relevant and frequently collected outcome measures as predictors, the appropriateness and range of statistical methods deployed, the comparison of within-person and between-person differences in health utility weight predictions and highly replicable, publicly disseminated study algorithms. We acknowledge limitations that our data pertained to a single country, and we explored only one MAUI-derived utility weight. We did not examine some potential predictors that may be more common in some mental health services (for example we explored K6, as opposed to the expanded, and commonly used measure, the K10).

However, using utility weight input data derived from the same country as that to which an analysis pertains may be relatively unimportant [30], particularly when the MAUI is well suited to the relevant health condition (as is the case with AQoL and mental health [9]). Furthermore, our R packages should help make it relatively straightforward for others to replicate our study algorithm in different samples (non-Australian, non-clinical and/or non-youth populations) and generalise our methods to developing TTU algorithms that use different predictors (other clinical, functioning and demographic measures) and other utility measures (e.g., EQ-5D). Clinical trial datasets, which now usually collect MAUIs, could provide rich opportunities for applying our algorithm to develop and test new TTU algorithms.

By distributing study outputs as freely available open science resources we hope to make it easier to access and appropriately and consistently apply study findings. Open science resources also provide a valuable opportunity for other researchers to contribute refinements and extensions so that the usefulness of our study algorithm improves with time.

5 Conclusions

We have found that it is possible to predict both within-person and between-person differences in adolescent AQoL-6D utility weights from measures routinely collected in youth mental health services. TTU algorithms developed from cross-sectional data can approximate longitudinal changes in health utility, but may slightly over-estimate these changes. The TTU algorithms we have developed can help inform resource allocation decisions relating to the mental health of young people. Our toolkits also provide a basis for future research that extends our work with additional TTU algorithms.

Availability of data and materials

Detailed results in the form of catalogues of the TTU models produced by this study and other supporting information are available in the results repository https://doi.org/10.7910/DVN/DKDIB0. Tools for finding and using the TTU models appropriate for use with new prediction datasets are available as part of the youthu R package (https://ready4-dev.github.io/youthu). The youthvars R package (https://ready4-dev.github.io/youthvars/) provides a number of tools helpful for replicating this study (including a synthetic dataset) while TTU (https://ready4-dev.github.io/TTU/) has tools for both replicating the study and generalising our algorithms to develop TTU algorithms with other utility measures and predictors.

Ethics approval

The study was reviewed and granted approval by the University of Melbourne’s Human Research Ethics Committee, and the local Human Ethics and Advisory Group (1645367.1).

Funding

This study was funded by the National Health and Medical Research Council (NHMRC, APP1076940), Orygen and headspace.

Conflict of Interest

None declared.

A Appendix

A.1 Additional tables

Table A.1:

10-fold cross-validated model fit indices for different OLS and GLM models using PHQ-9 total score as the predictor, with the baseline data

Table A.2:

10-fold cross-validated model fitting index for different candidate predictors estimated using GLM with Gaussian distribution and log link with the baseline data

Table A.3:

Estimated coefficients from longitudinal TTU models based on candidate predictors and SOFAS score using LMM (with clog-log transformation)

Table A.4:

Estimated coefficients from longitudinal TTU models based on individual candidate predictors and SOFAS score using GLM (Gaussian distribution with log link)

Table A.5:

R Packages used in data analysis and reporting

A.2 Additional figures

Figure A.1:

Variable importance estimated using random forest

Footnotes

Addresses inconsistencies between written summary and tables caused by error in program to render manuscript.

References

1. MacKillop E, Sheard S. Quantifying life: Understanding the history of quality-adjusted life-years (QALYs). Social Science & Medicine. 2018;211:359–366. doi:10.1016/j.socscimed.2018.07.004
2. Neumann PJ, Goldie SJ, Weinstein MC. Preference-based measures in economic evaluation in health care. Annual Review of Public Health. 2000;21:587–611. doi:10.1146/annurev.publhealth.21.1.587
3. Mortimer D, Segal L. Comparing the incomparable? A systematic review of competing techniques for converting descriptive measures of health status into QALY-weights. Medical Decision Making. 2008;28:66–89.
4. Henry JD, Crawford JR. The short-form version of the Depression Anxiety Stress Scales (DASS-21): Construct validity and normative data in a large non-clinical sample. British Journal of Clinical Psychology. 2005;44:227–239. doi:10.1348/014466505X29657
5. Mihalopoulos C, Chen G, Iezzi A, Khan MA, Richardson J. Assessing outcomes for cost-utility analysis in depression: Comparison of five multi-attribute utility instruments with two depression-specific outcome measures. The British Journal of Psychiatry. 2014;205:390–397.
6. Furber G, Segal L, Leach M, Cocks J. Mapping scores from the Strengths and Difficulties Questionnaire (SDQ) to preference-based utility values. Qual Life Res. 2014;23:403–411.
7. Filia K, Rickwood D, Menssink J, Gao CX, Hetrick S, Parker A, et al. Clinical and functional characteristics of a subsample of young people presenting for primary mental healthcare at headspace services across Australia. Soc Psychiatry Psychiatr Epidemiol. 2021; doi:10.1007/s00127-020-02020-6
8. Richardson JR, Peacock SJ, Hawthorne G, Iezzi A, Elsworth G, Day NA. Construction of the descriptive system for the Assessment of Quality of Life AQoL-6D utility instrument. Health and Quality of Life Outcomes. 2012;10:38. Available: https://hqlo.biomedcentral.com/track/pdf/10.1186/1477-7525-10-38
9. Engel L, Chen G, Richardson J, Mihalopoulos C. The impact of depression on health-related quality of life and wellbeing: Identifying important dimensions and assessing their inclusion in multi-attribute utility instruments. Qual Life Res. 2018;27:2873–2884. doi:10.1007/s11136-018-1936-y
10. Kessler RC, Andrews G, Colpe LJ, Hiripi E, Mroczek DK, Normand SLT, et al. Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychological Medicine. 2002;32:959–976. doi:10.1017/s0033291702006074
11. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine. 2001;16:606–613. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1495268/pdf/jgi_01114.pdf
12. Kanter JW, Mulick PS, Busch AM, Berlin KS, Martell CR. The Behavioral Activation for Depression Scale (BADS): Psychometric properties and factor structure. Journal of Psychopathology and Behavioral Assessment. 2006;29:191–202. doi:10.1007/s10862-006-9038-5
13. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalised anxiety disorder: The GAD-7. Archives of Internal Medicine. 2006;166:1092–1097.
14. Birmaher B, Brent DA, Chiappetta L, Bridge J, Monga S, Baugher M. Psychometric properties of the Screen for Child Anxiety Related Emotional Disorders (SCARED): A replication study. Journal of the American Academy of Child & Adolescent Psychiatry. 1999;38:1230–1236.
15. Norman SB, Cissell SH, Means-Christensen AJ, Stein MB. Development and validation of an Overall Anxiety Severity and Impairment Scale (OASIS). Depress Anxiety. 2006;23:245–9. doi:10.1002/da.20182
16. McGorry PD, Hickie IB, Yung AR, Pantelis C, Jackson HJ. Clinical staging of psychiatric disorders: A heuristic framework for choosing earlier, safer and more effective interventions. Aust N Z J Psychiatry. 2006;40:616–22. doi:10.1111/j.1440-1614.2006.01860.x
17. Goldman HH, Skodol AE, Lave TR. Revising Axis V for DSM-IV: A review of measures of social functioning. Am J Psychiatry. 1992;149:9.
18. Dobson AJ, Barnett AG. An introduction to generalized linear models. CRC Press; 2018.
19. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media; 2009.
20. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI. Montreal, Canada; pp. 1137–1145.
21. Kursa MB, Rudnicki WR. Feature selection with the Boruta package. Journal of Statistical Software, Articles. 2010;36:1–13. doi:10.18637/jss.v036.i11
22. Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, et al. Generalized linear mixed models: A practical guide for ecology and evolution. Trends in Ecology & Evolution. 2009;24:127–135.
23. Gelman A, Goodrich B, Gabry J, Vehtari A. R-squared for Bayesian regression models. The American Statistician. 2019;73:307–309. doi:10.1080/00031305.2018.1549100
24. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2020. Available: https://www.R-project.org/
25. Nowok B, Raab GM, Dibben C. synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software. 2016;74:1–26. doi:10.18637/jss.v074.i11
26. Hetrick SE, Bailey AP, Smith KE, Malla A, Mathias S, Singh SP, et al. Integrated (one-stop shop) youth health care: Best available evidence and future directions. Med J Aust. 2017;207:S5–S18. doi:10.5694/mja17.00694
27. Hamilton MP, Hetrick SE, Mihalopoulos C, Baker D, Browne V, Chanen AM, et al. Identifying attributes of care that may improve cost-effectiveness in the youth mental health service system. Med J Aust. 2017;207:S27–S37. doi:10.5694/mja17.00972
28. Alegría M, NeMoyer A, Falgàs Bagué I, Wang Y, Alvarez K. Social determinants of mental health: Where we are and where we need to go. Current Psychiatry Reports. 2018;20:95–95. doi:10.1007/s11920-018-0969-9
29. Keetharuth AD, Rowen D, Bjorner JB, Brazier J. Estimating a preference-based index for mental health from the recovering quality of life measure: Valuation of recovering quality of life utility index. Value in Health. 2021;24:281–290. doi:10.1016/j.jval.2020.10.012

Posted July 12, 2021.

[1] 1.↵
MacKillop E, Sheard S. Quantifying life: Understanding the history of quality-adjusted life-years (qalys). Social Science & Medicine. 2018;211:359–366. doi:https://doi.org/10.1016/j.socscimed.2018.07.004
OpenUrl Google Scholar

[2] 2.↵
Neumann PJ, Goldie SJ, Weinstein MC. Preference-based measures in economic evaluation in health care. Annual Review of Public Health. 2000;21:587–611. doi:10.1146/annurev.publhealth.21.1.587
OpenUrl CrossRef PubMed Web of Science Google Scholar

[3] 3.↵
Mortimer D, Segal L. Comparing the incomparable? A systematic review of competing techniques for converting descriptive measures of health status into qaly-weights. Medical decision making. 2008;28:66–89.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[4] 4.↵
Henry JD, Crawford JR. The short-form version of the depression anxiety stress scales (dass-21): Construct validity and normative data in a large non-clinical sample. British journal of clinical psychology. Wiley Online Library; 2005;44:227–239. doi:https://doi.org/10.1348/014466505X29657
OpenUrl Google Scholar

[5] 5.↵
Mihalopoulos C, Chen G, Iezzi A, Khan MA, Richardson J. Assessing outcomes for cost-utility analysis in depression: Comparison of five multi-attribute utility instruments with two depression-specific outcome measures. The British Journal of Psychiatry. 2014;205:390–397.
OpenUrl Abstract/FREE Full Text Google Scholar

[6] 6.↵
Furber G, Segal L, Leach M, Cocks J. Mapping scores from the Strengths and Difficulties Questionnaire (SDQ) to preference-based utility values. Qual Life Res. 2014;23:403–411.
OpenUrl Google Scholar

[7] 7.↵
Filia K, Rickwood D, Menssink J, Gao CX, Hetrick S, Parker A, et al. Clinical and functional characteristics of a subsample of young people presenting for primary mental healthcare at headspace services across australia. Soc Psychiatry Psychiatr Epidemiol. 2021; doi:10.1007/s00127-020-02020-6
OpenUrl CrossRef Google Scholar

[8] 8.↵
Richardson JR, Peacock SJ, Hawthorne G, Iezzi A, Elsworth G, Day NA. Construction of the descriptive system for the assessment of quality of life aqol-6D utility instrument. Health and quality of life outcomes. 2012;10:38. Available: https://hqlo.biomedcentral.com/track/pdf/10.1186/1477-7525-10-38
OpenUrl Google Scholar

[9] 9.↵
Engel L, Chen G, Richardson J, Mihalopoulos C. The impact of depression on health-related quality of life and wellbeing: Identifying important dimensions and assessing their inclusion in multi-attribute utility instruments. Qual Life Res. 2018;27:2873–2884. doi:10.1007/s11136-018-1936-y
OpenUrl CrossRef Google Scholar

[10] 10.↵
Kessler RC, Andrews G, Colpe LJ, Hiripi E, Mroczek DK, Normand SLT, et al. Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychological Medicine. 2002;32:959–976. doi:10.1017/s0033291702006074
OpenUrl CrossRef PubMed Web of Science Google Scholar

[11] 11.↵
Kroenke K, Spitzer RL, Williams JB. The phq: Validity of a brief depression severity measure. Journal of general internal medicine. 2001;16:606–613. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1495268/pdf/jgi_01114.pdf
OpenUrl CrossRef PubMed Web of Science Google Scholar

[12] 12.↵
Kanter JW, Mulick PS, Busch AM, Berlin KS, Martell CR. The behavioral activation for depression scale (bads): Psychometric properties and factor structure. Journal of Psychopathology and Behavioral Assessment. 2006;29:191–202. doi:10.1007/s10862-006-9038-5
OpenUrl CrossRef Google Scholar

[13] 13.↵
Spitzer RL, Kroenke K, Williams JB, Lowe B. A brief measure for assessing generalised anxiety disorder: The gad-7. Archives of Internal Medicine. 2006;166:1092–1097.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[14] 14.↵
Birmaher B, Brent DA, Chiappetta L, Bridge J, Monga S, Baugher M. Psychometric properties of the screen for child anxiety related emotional disorders (scared): A replication study. Journal of the American Academy of Child & Adolescent Psychiatry. 1999;38:1230–1236.
OpenUrl Google Scholar

[15] 15.↵
Norman SB, Cissell SH, Means-Christensen AJ, Stein MB. Development and validation of an overall anxiety severity and impairment scale (oasis). Depress Anxiety. 2006;23:245–9. doi:10.1002/da.20182
OpenUrl CrossRef PubMed Web of Science Google Scholar

[16] 16.↵
McGorry PD, Hickie IB, Yung AR, Pantelis C, Jackson HJ. Clinical staging of psychiatric disorders: A heuristic framework for choosing earlier, safer and more effective interventions. Aust N Z J Psychiatry. 2006;40:616–22. doi:10.1111/j.1440-1614.2006.01860.x
OpenUrl CrossRef PubMed Web of Science Google Scholar

[17] 17.↵
Goldman HH, Skodol AE, Lave TR. Revising axis v for dsm-iv: A review of measures of social functioning. Am J Psychiatry. 1992;149:9.
OpenUrl CrossRef PubMed Web of Science Google Scholar

[18] 18.↵
Dobson AJ, Barnett AG. An introduction to generalized linear models. CRC press; 2018.
Google Scholar

[19] 19.↵
Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media; 2009.
Google Scholar

[20] 20.↵
Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Ijcai. Montreal, Canada; pp. 1137–1145.
Google Scholar

[21] 21.↵
Kursa MB, Rudnicki WR. Feature selection with the boruta package. Journal of Statistical Software, Articles. 2010;36:1–13. doi:10.18637/jss.v036.i11
OpenUrl CrossRef PubMed Google Scholar

[22] 22.↵
Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, et al. Generalized linear mixed models: A practical guide for ecology and evolution. Trends in ecology & evolution. Elsevier; 2009;24:127–135.
OpenUrl Google Scholar

[23] 23.↵
Gelman A, Goodrich B, Gabry J, Vehtari A. R-squared for bayesian regression models. The American Statistician. 2019;73:307–309. doi:10.1080/00031305.2018.1549100
OpenUrl CrossRef Google Scholar

[24] 24.↵
R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2020. Available: https://www.R-project.org/
Google Scholar

[25] 25.↵
Nowok B, Raab GM, Dibben C. synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software. 2016;74:1–26. doi:10.18637/jss.v074.i11
OpenUrl CrossRef Google Scholar

[26] 26.↵
Hetrick SE, Bailey AP, Smith KE, Malla A, Mathias S, Singh SP, et al. Integrated (one-stop shop) youth health care: Best available evidence and future directions. Med J Aust. 2017;207:S5–S18. doi:10.5694/mja17.00694
OpenUrl CrossRef Google Scholar

[27] 27.↵
Hamilton MP, Hetrick SE, Mihalopoulos C, Baker D, Browne V, Chanen AM, et al. Identifying attributes of care that may improve cost-effectiveness in the youth mental health service system. Med J Aust. 2017;207:S27–S37. doi:10.5694/mja17.00972
OpenUrl CrossRef Google Scholar

[28] 28.↵
Alegría M, NeMoyer A, Falgàs Bagué I, Wang Y, Alvarez K. Social determinants of mental health: Where we are and where we need to go. Current Psychiatry Reports. 2018;20:95–95. doi:10.1007/s11920-018-0969-9
OpenUrl CrossRef Google Scholar

[29] 29.↵
Keetharuth AD, Rowen D, Bjorner JB, Brazier J. Estimating a preference-based index for mental health from the recovering quality of life measure: Valuation of recovering quality of life utility index. Value in Health. 2021;24:281–290. doi:https://doi.org/10.1016/j.jval.2020.10.012
OpenUrl Google Scholar
