Investigating associations between physical multimorbidity clusters and subsequent depression: cluster and survival analysis of UK Biobank data
===============================================================================================================================================

* Lauren Nicole DeLong
* Kelly Fleetwood
* Regina Prigge
* Paola Galdi
* Bruce Guthrie
* Jacques D. Fleuriot

## Abstract

**Background** Multimorbidity, the co-occurrence of two or more conditions within an individual, is a growing challenge for health and care delivery as well as for research. Combinations of physical and mental health conditions are highlighted as particularly important. The aim of this study was to investigate associations between physical multimorbidity and subsequent depression.

**Methods and Findings** We performed a clustering analysis on physical morbidity data for UK Biobank participants aged 37-73 years at baseline data collection between 2006 and 2010. Of 502,353 participants, 142,005 had linked general practice data with at least one physical condition at baseline. Following stratification by sex (77,785 women; 64,220 men), we used four clustering methods (agglomerative hierarchical clustering, latent class analysis, *k*-medoids and *k*-modes) and selected the best-performing method based on clustering metrics. We used Fisher’s Exact test to determine significant over-/under-representation of conditions within each cluster. Amongst people with no prior depression, we used survival analysis to estimate associations between cluster-membership and time to subsequent depression diagnosis. The *k*-modes models consistently performed best, and the over-/under-represented conditions in the resultant clusters reflected known associations. For example, clusters containing an overrepresentation of cardiometabolic conditions were amongst the largest clusters in the whole cohort (15.5% of participants, 19.7% of women, 24.2% of men).
Cluster associations with depression varied from hazard ratio (HR) 1.29 (95% confidence interval (CI) 0.85-1.98) to HR 2.67 (95% CI 2.24-3.17), but almost all clusters were more strongly associated with depression than the group without physical conditions.

**Conclusions** We found that certain groups of physical multimorbidity may be associated with a higher risk of subsequent depression. However, our findings invite further investigation into other factors, such as social factors, which may link physical multimorbidity with depression.

## Introduction

Multimorbidity, the simultaneous occurrence of two or more long-term conditions in an individual, is increasingly common as populations age, and it challenges existing health systems 1,2. Multimorbidity is more common with increasing age, in women and in the less affluent 3,4. Studying the co-occurrence of multiple long-term conditions in the same individual has the potential to inform understanding of disease causation and support planning of current and future health and care services 5–7.

Depression affects millions of people worldwide 8,9 and is ranked by the World Health Organization as one of the most burdensome diseases 9,10. There is strong evidence that depression co-occurs with other mental health disorders 11,12, and several ongoing studies aim to identify potential shared mechanisms 11,13. However, previous studies have also found depression to be more common in people with particular chronic physical illnesses, such as cardiovascular disease 14, multiple sclerosis 15, and inflammatory bowel disease 16. Physical ill-health might cause depression because it creates psychological disturbance through ‘biographical disruption’ that threatens a sense of identity, or because of its impact on physical or social function.
Alternatively, physical conditions may cause depression through intermediate biological processes, like inflammation 15,17, in which case we might expect that some combinations or patterns of physical conditions would be more strongly associated with depression than others.

Several studies have used cluster analyses to identify common patterns of physical conditions 18–21, typically using one method, such as agglomerative hierarchical clustering 18,19,22, *k*-medoids 20,23, Latent Class Analysis 21,24–26, or *k*-means approaches 27–30. Additionally, since morbidity data is binary (a person has a condition or does not), some common clustering methods are inappropriate since they use similarity measures incompatible with categorical data 18,31. Therefore, the aim of this study was to explore and compare the use of four independent clustering methods appropriate for binary data and to examine whether certain groups of physical conditions are associated with the subsequent diagnosis of depression.

## Materials and methods

### Data selection and pre-processing

We used data from UK Biobank 32. Participants aged 37-73 years attended a baseline assessment during 2006-2010 which collected data on demography, lifestyle habits, health conditions, and a range of physical and laboratory measurements. Participants provided written informed consent for linkage to national datasets including general practice (GP) (primary care), hospital, cancer registry and death records 32. The UK Biobank has ethical approval from the NHS North West Research Ethics Committee (reference: 21/NW/0157).

To robustly ascertain a broad range of long-term conditions, our study population included participants with a continuous GP record from at least a year before to at least one day beyond their baseline assessment. We excluded records from the UKB extract of the Vision practice management system in England because the extraction process excluded participants who died prior to data extraction.
We also excluded participants who withdrew from the study (Supplementary Fig. 1).

We ascertained the presence of depression and 69 long-term physical health conditions at baseline using data from the baseline visit and from linked GP, hospital, and cancer registry records based on previously published lists 33,34 (Supplementary Table 1). The UK National Health Service limits registration to one practice at a time, and GP records transfer between practices so should capture an individual’s entire medical history. However, available hospital and cancer registry records began at different times for England, Wales and Scotland. Therefore, we used all GP records up to baseline assessment date, and to be consistent across countries, we used hospital and cancer registry records within eight years before baseline assessment date. We used published codelists to identify diagnoses from GP records using Read V2 and CTV3 diagnosis codes, hospital records using ICD-10 diagnosis codes and OPCS-4 procedure codes, and cancer registry records using ICD-10 codes 34.

We similarly ascertained depression during follow-up using information from GP, hospital and death records. Eligible participants with no history of depression prior to baseline were followed up to the earliest of depression diagnosis, death or the end of their available GP or hospital records.

### Models and metrics

We explored the suitability of four methods (*k*-modes 35,36, *k*-medoids 23, Latent Class Analysis (LCA) 24, and agglomerative hierarchical clustering (AHC) 22 (Supplementary Appendix 1)) to cluster all participants based on binary features denoting the absence/presence of the 69 baseline physical conditions. Participants with no physical conditions at baseline were excluded from the clustering analysis.
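To make the idea of clustering binary condition profiles concrete, the following is a minimal, illustrative sketch of a *k*-modes style procedure in plain Python. It is not the code used in this study: the toy `participants` data, the function names, and the random choice of initial modes are all assumptions made for illustration (the study reports using the frequency-based *Huang* initialization, which is not reproduced here).

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value per feature; ties go to the smaller value."""
    modes = []
    for column in zip(*rows):
        counts = Counter(column)
        top = max(counts.values())
        modes.append(min(v for v, c in counts.items() if c == top))
    return tuple(modes)


def k_modes(rows, k, n_iter=20, seed=0):
    """Toy k-modes: assign rows to the nearest mode, then recompute modes."""
    rng = random.Random(seed)
    # Assumption: initial modes are k distinct rows drawn at random
    # (a simplification, not the Huang initialization).
    modes = rng.sample(sorted(set(rows)), k)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        for j in range(k):
            members = [row for row, label in zip(rows, labels) if label == j]
            if members:  # keep the previous mode if a cluster empties
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical data: each tuple is one participant, each position one
# condition (1 = present, 0 = absent).
participants = [
    (1, 1, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 1),
]
labels, modes = k_modes(participants, k=2)
```

Because each mode is the per-condition majority value within a cluster, it can be read directly as a "typical" condition profile, which is one reason this family of methods suits binary data.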
We additionally clustered separately for men (all 69 conditions) and women (67 conditions, since erectile dysfunction and hyperplasia of the prostate are only found in men), as well as in the whole population, because of known sex differences in patterns of individual morbidities and multimorbidity 37,38. To select the number of clusters for each method, we used various heuristics, including the elbow method on a scree plot 39 for both *k*-modes and *k*-medoids, the minimal Bayesian Information Criterion 40 for LCA, and Hamming distance 41 for AHC.

To assess suitability and performance of these clustering methods, we used three performance metrics (Calinski and Harabasz score 42, Davies-Bouldin score 43, and Silhouette score 44 (Supplementary Appendix 1)), which are appropriate for unsupervised clustering. Since *k*-modes and LCA are sensitive to differences in initialization 24,45, we repeated them five times and compared with other models using the mean and standard deviation across the five experiments.

Thereafter, we used two metrics to analyze over- and under-representation of conditions in each cluster. We designed one metric, the *adjusted relative frequency (ARF)*, to measure the magnitude of over- or under-represented conditions within a cluster, relative to prevalence in the whole cohort. For each condition, ARF is calculated as:

![Formula][1]

An ARF of exactly one indicates that the condition occurs at the same relative frequency as it does in the entire cohort, and values greater or less than one indicate over- and under-representation, respectively. We used Fisher’s Exact Test (two-sided, α=0.05) to evaluate whether over- or under-representation of a condition in each cluster was statistically significant, using a Bonferroni correction to account for multiple testing 46–48. Finally, we visualised and compared statistically significant results on a *bubble heatmap* (Supplementary Appendix 2).
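The over-/under-representation analysis described above can be sketched as follows. This is a hedged illustration rather than the study's implementation: because the ARF formula is not recoverable from the text here, the sketch *assumes* ARF is the prevalence of a condition within a cluster divided by its prevalence in the whole cohort, which is consistent with the stated interpretation that a value of one means equal relative frequency. The counts, function names, and the number of tests are hypothetical.

```python
from math import comb


def arf(n_cond_in_cluster, cluster_size, n_cond_in_cohort, cohort_size):
    """Assumed ARF: within-cluster prevalence over whole-cohort prevalence."""
    return (n_cond_in_cluster / cluster_size) / (n_cond_in_cohort / cohort_size)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: in cluster with condition      b: in cluster without condition
    c: outside cluster with condition d: outside cluster without condition
    The p-value sums the hypergeometric probabilities of all tables that are
    no more likely than the observed one (margins held fixed).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties with the observed table count.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at one."""
    return min(1.0, p * n_tests)


# Hypothetical counts: a cohort of 100 with 30 cases of a condition,
# 12 of which fall in a cluster of 20 participants.
arf_value = arf(12, 20, 30, 100)
p_value = fisher_exact_two_sided(12, 8, 18, 62)
n_tests = 5  # hypothetical number of condition-by-cluster tests
p_adjusted = bonferroni(p_value, n_tests)
```

In this toy example the condition is twice as prevalent in the cluster as in the cohort; whether that counts as significant then depends on the adjusted p-value and the chosen α.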
To allow others to conduct similar cluster analyses, we made the code available as a software package ([https://github.com/laurendelong21/clusterMed](https://github.com/laurendelong21/clusterMed)).

### Survival analysis to predict depression diagnosis

Using participants without a record of depression at baseline, we applied Cox regression models 49 to evaluate time to depression diagnosis by condition cluster, accounting for death as a competing risk 50. Participants with no physical conditions at baseline were included as the reference group. We ran separate models for the whole cohort and for men and women separately, examining associations between cluster membership and subsequent depression. All models were adjusted for baseline age, ethnicity, country of residence and deprivation. The model for the whole cohort was additionally adjusted for sex. Baseline age was included in the models as a continuous variable; all other variables were categorical. Ethnicity was self-reported at baseline, and we categorized it into five groups (Black, Mixed, South Asian, White, and any other ethnic group 51). Country of residence (England, Wales or Scotland) and area-based deprivation, measured by the Townsend Deprivation Index 52, were derived from participants’ home addresses at baseline. We divided the Townsend Deprivation Index into deciles within the entire UK Biobank cohort. A small number of participants (368 women and 435 men) who were missing data on ethnicity, country or deprivation were excluded from the survival analyses.

## Results

### Performance metrics across various clustering methods

There were 142,005 participants (77,785 women and 64,220 men) with at least one physical condition at baseline who were included in the clustering analysis (Supplementary Fig. 1). Performance metrics for each of the four methods explored are reported in Table 1.

[Table 1.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T1) Table 1.
Performance metrics, number of clusters, and cluster sizes across four clustering methods.

Models based on AHC consistently achieved the poorest Calinski and Harabasz and Davies-Bouldin scores in all three cohorts. Models based on LCA had better metrics than AHC-based models, with particularly high Calinski and Harabasz scores, but the Davies-Bouldin scores were consistently worse in comparison to *k*-modes or *k*-medoids based models. In contrast, the best Davies-Bouldin scores were achieved by the *k*-modes and *k*-medoids based models. However, since the Davies-Bouldin score assesses similarity between the most similar clusters, scores may be optimistic when several clusters only contain a single (or very few) participant(s). This was the case for the *k*-medoids models for the whole and men-only cohorts. The presence of singleton clusters also coincides with a larger total number of clusters. Specifically, all three models based on *k*-modes discovered eight clusters, all models using AHC discovered ten clusters, and all models using LCA discovered five or six clusters. In contrast, the *k*-medoids models discovered 25 clusters for the whole population (17 of which had only one participant), six for women-only (no singletons), and 13 for men-only (seven singletons) (Table 1). Therefore, while *k*-medoids models had comparable Davies-Bouldin scores to *k*-modes models, the results were less informative and consistent. For each cohort, we therefore selected the best performing *k*-modes model with the highest Calinski and Harabasz score among the five independent runs.

### Differential representation of physical conditions within *k*-modes clusters

Many of the significantly over-represented conditions within several clusters aligned with body systems (Fig. 1) and we therefore used clinical judgement to name the clusters according to the systems or conditions which were most prominent (Supplementary Tables 2-5).
The four largest clusters in the whole cohort are *Mixed including cancer* (27.9% of participants in the cohort), *Healthy + Rhinitis* (22.2%), *Cardiovascular disease (CVD) + diabetes* (15.5%), and *Very extensive morbidity* (12.5%). For women, the four largest clusters are *Mixed including cancer* (29.3%), *CVD + diabetes* (19.7%), *Musculoskeletal (MSK)* (16.4%), and *Healthy + Rhinitis* (15.9%). Finally, for men, the four largest clusters are *CVD + diabetes* (24.2%), *Mixed including cancer* (20.8%), *MSK + others* (19.1%), and *Healthy + Rhinitis* (17.2%) (Fig. 2, Table 2).

![Fig. 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F1.medium.gif)

[Fig. 1.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F1) Fig. 1. Bubble heatmap showing under- and over-represented conditions in each cluster from the *k*-modes derived models.

![Fig. 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F2.medium.gif)

[Fig. 2.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F2) Fig. 2. Cluster sizes and condition counts.

[Table 2.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T2) Table 2. Hazard ratios per cluster for the development of subsequent depression.

Of the remaining clusters, there were some similarities across all three cohorts (e.g. *Respiratory* clusters). Generally, clusters with more participants also tended to have fewer conditions per participant (Fig. 2). For example, the *Mixed including cancer* clusters had the lowest mean conditions per participant (whole: 1.77; women: 1.75; men: 1.62). Such clusters may serve as "miscellaneous" categories for participants with condition profiles that are not easily grouped and/or people with one dominant condition. However, there were also differences.
For example, there were clusters which only appeared in the whole population (*Migraine*) and clusters which only appeared in the sex-stratified cohorts (*Digestive* and *MSK* clusters).

### Subsequent incident depression per identified cluster

Analysis of time to incident depression diagnosis included 140,956 participants (73,036 women and 67,920 men), after excluding 30,770 participants with a history of depression at baseline (20,592 women and 10,178 men). In addition to participants included in the clustering analysis, this analysis also included 30,551 participants with no physical conditions at baseline (16,238 women and 14,313 men) (Supplementary Fig. 1). During an average follow-up of 6.8 years, 5,904 (4.2%) participants, including 3,574 (4.9%) women and 2,330 (3.4%) men, had a new depression diagnosis.

Generally, participants with physical conditions at baseline had a higher rate of subsequent depression than participants with no physical conditions at baseline (Table 2, Supplementary Fig. 3). There were several consistencies across cohorts. The *Very extensive morbidity* clusters were the most strongly associated with depression in all three cohorts (whole: HR 2.42, 95% CI 2.17-2.69; women: HR 2.67, 95% CI 2.24-3.17; men: HR 2.65, 95% CI 2.22-3.18). Additionally, the *Healthy + Rhinitis* (whole: HR 1.59, 95% CI 1.46-1.75; women: HR 1.48, 95% CI 1.30-1.67; men: HR 1.50, 95% CI 1.29-1.75) and *Mixed including cancer* (whole: HR 1.62, 95% CI 1.48-1.77; women: HR 1.63, 95% CI 1.46-1.82; men: HR 1.60, 95% CI 1.38-1.86) clusters were generally the most weakly associated. Finally, association with depression also appeared to increase with the number of conditions per participant (Fig. 3). However, there were some exceptions; for example, the whole cohort’s *Macular degeneration + diabetes* cluster had the fourth highest mean number of conditions (3.09), but was only weakly associated with depression (HR 1.29, 95% CI 0.85-1.98).

![Fig. 
3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F3.medium.gif)

[Fig. 3.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F3) Fig. 3. Risk of subsequent depression by mean number of physical conditions.

## Discussion

### Summary of findings

This study systematically explored clustering of physical health conditions using four methods appropriate for binary data (*k*-modes 35,36, *k*-medoids 23, Latent Class Analysis 24, and agglomerative hierarchical clustering (AHC) 22). *K*-modes performed best, and the clusters identified were reasonably interpretable and often aligned with known associations between conditions. People with any physical condition at baseline were generally more likely to develop depression than people without any physical condition. There was some variation in this association by cluster which may be at least partly driven by differences in the mean number of physical conditions in each cluster.

### Comparison with other studies

Existing studies of morbidity clustering typically apply a single method. One study compared LCA to a Bayesian, network-based approach, but used age and admission type, rather than conditions alone, to drive cluster formation 53. Two other studies explored AHC and *k*-means in the same dataset, but chose *k*-means on the basis of AHC being too computationally intensive rather than based on performance 27,28. Additionally, despite the use of *k*-means 54 by several multimorbidity studies 27–30, it typically relies upon Euclidean distance as its similarity measure 31, which is unsuitable for binary data 18. Other multimorbidity studies have used *k*-means *after* a Multiple Correspondence Analysis 55,56, which represents categorical features as a low-dimensional Euclidean space 29,30.
While this transforms the data features into an appropriate format for *k*-means, it also manipulates the data based on their pairwise co-occurrences, which may not be appropriate for every dataset.

This study finds that almost all physical morbidity clusters are associated with higher risk of subsequent depression than the group with no physical conditions at baseline. Although the strength of association varied by cluster, this seemed to be partly explained by the mean number of conditions in the cluster. This is consistent with a similar study 29 which found associations between severe mental illness and a higher number of physical conditions. Another similar study, which aimed to identify groups of physical conditions associated with incident depression within a Taiwanese cohort, also found that social factors played a role in the risk of subsequent depression diagnosis 21. Specifically, they found that amongst four clusters (*Cardiometabolic*, *Arthritis-cataract*, *Multimorbidity*, and *Relatively healthy*), those within the *Arthritis-cataract* and *Multimorbidity* clusters had significantly higher risk of depression than healthy individuals. However, this association was attenuated for participants who engaged in social activities, including a job, volunteer experience, or community activities 21.

### Strengths and limitations

Strengths of this study include the analysis of a large dataset which records morbidities in both baseline research data and linked routine data, as well as the inclusion of a wide set of morbidities recommended by a recent consensus study 57. Notably, this study is unique in implementing and comparing four clustering methods appropriate for binary data. A limitation is that the data are collected from volunteers who are generally more affluent than the UK average, and people from ethnic minorities are somewhat under-represented 58.
Additionally, there is no standard way to evaluate the validity of identified clusters, although the observed clusters do include several known clinical associations. Consequently, further validation studies in other datasets are warranted to explore the reproducibility of cluster solutions.

### Implications for research

Many previous studies of morbidity clustering do not provide much information about which conditions are over- or under-represented in clusters, which leaves readers relying solely on author-chosen cluster labels for interpretation 19,21. For example, it is common for other studies to identify a ‘cardiometabolic’ cluster 21 and the *CVD + diabetes* cluster in our study was amongst the three largest clusters in all three cohorts. However, it is not straightforward to compare clusters across studies because of considerable variation in the conditions included in analysis, and because many clustering studies do not provide detailed information about the nature of identified clusters. Key implications are that clustering studies should be more consistent in the choice of conditions to include (and, at a minimum, follow consensus recommendations 57). Additionally, they should report the nature of clusters to help understand them beyond their high-level labels (by, for example, visualizing the prevalence of individual conditions in each cluster alongside over-/under-representation). We believe that our Adjusted Relative Frequency (ARF) measure with visualization in a bubble heatmap demonstrates one way to do this. However, there is a need for multimorbidity researchers to develop improved and consistent cluster visualizations and explanations to facilitate interpretation and to enhance clinical utility.

Morbidity clustering studies also typically use one clustering method, but there is no single clustering method which is likely to be optimal for every dataset.
We therefore believe that clustering studies should more systematically explore different methods and make explicit how they choose the best method for their datasets and purposes. To encourage similar systematic comparison of different cluster methods, we have provided access to our code ([https://github.com/laurendelong21/clusterMed](https://github.com/laurendelong21/clusterMed)).

Many studies also cluster the entire population, which is unlikely to be sensible given the very different incidence and prevalence of disease with age and, to a lesser extent, sex and ethnicity. In this analysis, study participants were mostly middle-aged (so we did not further stratify by age) and overwhelmingly white, but we found some differences in the clusters identified in the whole population versus women or men separately. Although whole population clustering may be appropriate in some circumstances, reporting of clusters stratified by age and sex (and ethnicity if the data permit) would be valuable to explore how clustering varies by demographic characteristics.

Finally, further research to better understand why physical multimorbidity is associated with subsequent depression is needed. The general trend between increased risk of subsequent depression and mean number of conditions suggests a social explanation: suffering more conditions may more strongly interrupt one’s life or sense of self. However, the relationship between depression and physical conditions is very likely bidirectional, and longitudinal research which better examines how the two interact over a lifetime would be valuable.

## Conclusions

Using the best performing of four different clustering methods, this study identified several multimorbidity clusters which align with known clinical associations. Association with depression varied between clusters, but this may be partly driven by differences in the number of conditions. More research is needed to better understand the mechanisms underlying such associations.
## Data Availability

The UK Biobank data is not openly available to protect the rights of participants. Researchers can register for access here: [https://www.ukbiobank.ac.uk/enable-your-research/register](https://www.ukbiobank.ac.uk/enable-your-research/register).

## Code Availability

Corresponding code is available at [https://github.com/laurendelong21/clusterMed](https://github.com/laurendelong21/clusterMed).

## Competing Interests

The authors declare no competing interests.

## Author Contributions

All authors contributed to writing the manuscript. L.N.D. and K.F. conducted the analyses and made the figures. L.N.D. and P.G. wrote the software. K.F. and R.P. processed and prepared the data. J.D.F. and B.G. designed and supervised the study.

## Supporting information

[Supplementary Table 1.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T3) Supplementary Table 1. 69 physical conditions with corresponding bodily systems.
![Supplementary Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F4.medium.gif)

[Supplementary Figure 1.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F4) Supplementary Figure 1. Flow diagram explaining data filtration steps.

**Supplementary Appendix 1.** Methodological Descriptions.

To identify a clustering method best suited for our data, we explored several combinations of metrics and methods which were, in theory, capable of handling the binary nature of the morbidity data. Specifically, we used the following metrics within this study:

* **Hamming distance** 1: This is a dissimilarity metric denoting the number of mismatching categories between two objects. For two objects $X = (x_1, \dots, x_m)$ and $Y = (y_1, \dots, y_m)$ described by $m$ categorical features, it is formally defined as 2:

  $$d(X, Y) = \sum_{j=1}^{m} \delta(x_j, y_j)$$

  where:

  $$\delta(x_j, y_j) = \begin{cases} 0, & x_j = y_j \\ 1, & x_j \neq y_j \end{cases}$$

* **cosine similarity** 3: This measure was originally used to indicate how closely two vectors point in the same direction. For binary data, it can be understood as the number of true, or positive, features shared between two objects, divided by the square root of the product of the number of true features in each object. Formally, it is defined as 4:

  $$\cos(X, Y) = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert}$$

With these metrics, we explored the following four clustering methods:

* ***k*-modes** 5,6: This uses the same methodology as *k*-means clustering 7, but the centroid of each cluster is defined on the number of matching categories between data points, computed via the Hamming distance 1. In other words, the centroid represents the mode of the cluster, rather than the mean 5,6. Centroid initialization was performed via the frequency-based *Huang* metric 5.
* ***k*-medoids** 8: As above, this uses the same methodology as *k*-means clustering 7, but the centroid is an actual data point acting as the "median", computed via cosine similarity.
* **Latent Class Analysis (LCA)** 9: This aims to find groups or subtypes of cases (latent classes) in multivariate categorical data.
It gives probabilities of class membership, rather than unique, concrete class assignments, so the user can see the likelihood that a data point truly belongs to its assigned class 9.
* **agglomerative hierarchical clustering (AHC)** 10: This is best understood as a "bottom-up" approach in which samples start out alone, then merge to form larger and larger clusters 11. We used a *complete* linkage (the maximum distance between points in two clusters 12), computed via Hamming distance 1.

Finally, we assessed cluster performance, including separation and overlap, with the following three performance metrics:

* **Calinski and Harabasz score** 13: This is the ratio of between-cluster dispersion to within-cluster dispersion. A higher Calinski and Harabasz score indicates better performance.
* **Davies-Bouldin score** 14: This is a measure of the similarity of each cluster to its most similar cluster. A Davies-Bouldin score closer to zero indicates better performance.
* **Silhouette score** 15: This is a measure of cluster fit which accounts for the mean distance between points in each individual cluster as well as the mean distance to points in the closest neighboring cluster. A silhouette score closer to one indicates better performance. Hamming distance was utilized as the distance metric.

The best similarity or dissimilarity metrics were selected for *k*-modes, *k*-medoids, and AHC by testing each of them upon a random selection of 1,417 participants. The *k*-modes method used with an alternative initialization technique, called the *Cao* metric 6, resulted in some clusters containing fewer than ten participants, while others contained hundreds. Such imbalance is uninformative for our purposes. Similarly, *k*-medoids with Hamming distance and *Jaccard similarity* 16, another similarity metric, resulted in several empty clusters.
Finally, AHC with other metrics and linkage types resulted in poor separation between clusters, with high overlap between branches. These issues were not present with the specified metrics.

**Supplementary Appendix 2.** Bubble Heatmap.

The *bubble heatmap* places ARF values on a grid in which the *y*-axis contains conditions, the *x*-axis contains clusters, and data points are colored blue (under-representation) or red (over-representation) at each intersection. The magnitude of under- or over-representation is indicated by the size of the data point, or *bubble*. Points that were not statistically significant, as determined by Fisher’s Exact test (REF), were omitted. Therefore, conditions with no significant values are omitted entirely from the *y*-axis. Notably, for visualisation purposes, the ARF values were adjusted so that values denoting under-representation (between zero and one) were mapped to a similar scale as those denoting over-representation (values greater than one). Specifically, we used the following function, in which *x* denotes the original ARF value:

![Formula][5]

**Supplementary Table 2. ARF values and adjusted p-values per condition and cluster.** Statistically significant values (p < 0.05) are denoted in bold. See corresponding Excel file (S2Table.xlsx).

[Supplementary Table 2.a.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T4) Supplementary Table 2.a. ARF values for *whole*-cohort clusters.

[Supplementary Table 2.b.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T5) Supplementary Table 2.b. ARF values for *women-only* clusters.

[Supplementary Table 2.c.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T6) Supplementary Table 2.c. ARF values for *men-only* clusters.

[Supplementary Table 2.d.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T7) Supplementary Table 2.d.
Adjusted *p*-values values for *whole*-cohort clusters. View this table: [Supplementary Table 2.e.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T8) Supplementary Table 2.e. Adjusted *p*-values values for *women-only* clusters. View this table: [Supplementary Table 2.f.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T9) Supplementary Table 2.f. Adjusted *p*-values values for *men-only* clusters. View this table: [Supplementary Table 3.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T10) Supplementary Table 3. Cluster labels for the *whole* cohort. View this table: [Supplementary Table 4.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T11) Supplementary Table 4. Cluster labels for the *women-only* cohort. View this table: [Supplementary Table 5.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/T12) Supplementary Table 5. Cluster labels for the *men-only* cohort. ![Supplementary Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F5.medium.gif) [Supplementary Figure 2.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F5) Supplementary Figure 2. Prevalence values per cluster and condition. ![Supplementary Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/07/2024.07.05.24310004/F6.medium.gif) [Supplementary Figure 3.](http://medrxiv.org/content/early/2024/07/07/2024.07.05.24310004/F6) Supplementary Figure 3. Time-to-depression diagnosis for each cluster in each of the *k*-modes models. ## Acknowledgments This work was co-funded by the Medical Research Council and the National Institute for Health Research (grant number MC/S028013), and the NIHR AIM-CISC programme (grant number NIHR202639). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. 
The study was conducted using the UK Biobank Resource under application number 57213. LND is individually funded by a Global Informatics Scholarship from the School of Informatics at the University of Edinburgh. The School of Informatics had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors would like to thank the UK Biobank participants and the UK Biobank staff for their contributions to this study. The authors would like to thank the public members of our advisory board, Dr Paul Kelly and Pat Watson, for providing thoughtful feedback throughout our project. This work has made use of the resources provided by the Edinburgh Compute and Data Facility (ECDF) ([http://www.ecdf.ed.ac.uk/](http://www.ecdf.ed.ac.uk/)).

* Received July 5, 2024.
* Revision received July 5, 2024.
* Accepted July 7, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/)

## References

1. Skou, S. T. et al. Multimorbidity. Nat Rev Dis Primers 8, 1–22 (2022).
2. Harrison, C. et al. Comorbidity versus multimorbidity: Why it matters. Journal of Multimorbidity and Comorbidity 11, 2633556521993993 (2021).
3. Fortin, M., Stewart, M., Poitras, M.-E., Almirall, J. & Maddocks, H. A Systematic Review of Prevalence Studies on Multimorbidity: Toward a More Uniform Methodology. The Annals of Family Medicine 10, 142–151 (2012).
4. Barnett, K. et al. Epidemiology of multimorbidity and implications for health care, research, and medical education: a cross-sectional study. The Lancet 380, 37–43 (2012).
5. Swain, S., Sarmanova, A., Coupland, C., Doherty, M. & Zhang, W. Comorbidities in Osteoarthritis: A systematic review and meta-analysis of observational studies. Arthritis Care Res (Hoboken) 72, 991–1000 (2020).
6. Alexander, K. P. et al. Outcomes of apixaban versus warfarin in patients with atrial fibrillation and multi-morbidity: Insights from the ARISTOTLE trial. Am Heart J 208, 123–131 (2019).
7. Marrie, R. A. Comorbidity in multiple sclerosis: Past, present and future. Clinical and Investigative Medicine 42, E5–E12 (2019).
8. Pitsillou, E. et al. The cellular and molecular basis of major depressive disorder: towards a unified model for understanding clinical depression. Mol Biol Rep 47, 753–770 (2020).
9. Kraus, C., Kadriu, B., Lanzenberger, R., Zarate Jr, C. A. & Kasper, S. Prognosis and improved outcomes in major depression: a review. Transl Psychiatry 9, 1–17 (2019).
10. Malhi, G. S. & Mann, J. J. Depression. The Lancet 392, 2299–2312 (2018).
11. Goodwin, G. M. The overlap between anxiety, depression, and obsessive-compulsive disorder. Dialogues Clin Neurosci (2022).
12. Rao, S. & Broadbear, J. Borderline personality disorder and depressive disorder. Australasian Psychiatry 27, 573–577 (2019).
13. Gold, S. M. et al. Comorbid depression in medical diseases. Nat Rev Dis Primers 6, 1–22 (2020).
14. Shao, M. et al. Depression and cardiovascular disease: Shared molecular mechanisms and clinical implications. Psychiatry Res 285, 112802 (2020).
15. Riemer, F. et al. Microstructural changes precede depression in patients with relapsing-remitting Multiple Sclerosis. Communications Medicine 3, 90 (2023).
16. Marrie, R. A., Graff, L. A., Fisk, J. D., Patten, S. B. & Bernstein, C. N. The relationship between symptoms of depression and anxiety and disease activity in IBD over time. Inflamm Bowel Dis 27, 1285–1293 (2021).
17. Davyson, E. et al. Metabolomic Investigation of Major Depressive Disorder Identifies a Potentially Causal Association With Polyunsaturated Fatty Acids. Biol Psychiatry 94, 630–639 (2023).
18. Cornell, J. E. et al. Multimorbidity Clusters: Clustering Binary Data From a Large Administrative Medical Database. Applied Multivariate Research 12, 163 (2009).
19. Bisquera, A. et al. Identifying longitudinal clusters of multimorbidity in an urban setting: A population-based cross-sectional study. The Lancet Regional Health-Europe 3, 100047 (2021).
20. Robertson, L. et al. Identifying multimorbidity clusters in an unselected population of hospitalised patients. Sci Rep 12, 5134 (2022).
21. Ho, H.-E., Yeh, C.-J., Cheng-Chung Wei, J., Chu, W.-M. & Lee, M.-C. Association between multimorbidity patterns and incident depression among older adults in Taiwan: the role of social participation. BMC Geriatr 23, 177 (2023).
22. Sasirekha, K. & Baby, P. Agglomerative hierarchical clustering algorithm-a. International Journal of Scientific and Research Publications 83, 83 (2013).
23. Jin, X. & Han, J. K-Medoids Clustering. Encyclopedia of Machine Learning 564–565 (2010) doi:10.1007/978-0-387-30164-8_426.
24. Weller, B. E., Bowen, N. K. & Faubert, S. J. Latent class analysis: a guide to best practice. Journal of Black Psychology 46, 287–311 (2020).
25. Hall, M. et al. Multimorbidity and survival for patients with acute myocardial infarction in England and Wales: Latent class analysis of a nationwide population-based cohort. PLoS Med 15, e1002501 (2018).
26. Eto, F. et al. Ethnic differences in early onset multimorbidity and associations with health service use, long-term prescribing, years of life lost, and mortality: A cross-sectional study using clustering in the UK Clinical Practice Research Datalink. PLoS Med 20, e1004300 (2023).
27. Ioakeim-Skoufa, I. et al. Multimorbidity Clusters in the Oldest Old: Results from the EpiChron Cohort. Int J Environ Res Public Health 19, 10180 (2022).
28. Carmona-Pírez, J. et al. Multimorbidity clusters in patients with chronic obstructive airway diseases in the EpiChron Cohort. Sci Rep 11, 4784 (2021).
29. Launders, N., Hayes, J. F., Price, G. & Osborn, D. P. Clustering of physical health multimorbidity in people with severe mental illness: An accumulated prevalence analysis of United Kingdom primary care data. PLoS Med 19, e1003976 (2022).
30. Guisado-Clavero, M. et al. Multimorbidity patterns in the elderly: a prospective cohort study with cluster analysis. BMC Geriatr 18, 16 (2018).
31. Sinaga, K. P. & Yang, M.-S. Unsupervised K-means clustering algorithm. IEEE Access 8, 80716–80727 (2020).
32. Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 12, e1001779 (2015).
33. Ho, I. S. S. et al. Measuring multimorbidity in research: Delphi consensus study. BMJ Medicine 1, e000247 (2022).
34. Prigge, R. et al. Robustly Measuring Multiple Long-Term Health Conditions Using Disparate Linked Datasets in UK Biobank. Preprints with The Lancet (2024).
35. Huang, Z. Clustering large data sets with mixed numeric and categorical values. in *Proceedings of the 1st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)* 21–34 (1997).
36. Cao, F., Liang, J. & Bai, L. A new initialization method for categorical data clustering. Expert Syst Appl 36, 10223–10228 (2009).
37. Abad-Díez, J. M. et al. Age and gender differences in the prevalence and patterns of multimorbidity in the older population. BMC Geriatr 14, 1–8 (2014).
38. Agur, K., McLean, G., Hunt, K., Guthrie, B. & Mercer, S. W. How does sex influence multimorbidity? Secondary analysis of a large nationally representative dataset. Int J Environ Res Public Health 13, 391 (2016).
39. Thorndike, R. L. Who belongs in the family? Psychometrika 18, 267–276 (1953).
40. Schwarz, G. Estimating the dimension of a model. The Annals of Statistics 461–464 (1978).
41. Hamming, R. W. Entropy and Shannon’s First Theorem. Coding and Information Theory (Prentice-Hall Inc., Englewood Cliffs, New Jersey) 107 (1980).
42. Calinski, T. & Harabasz, J. A dendrite method for cluster analysis. Communications in Statistics-theory and Methods 3, 1–27 (1974).
43. Davies, D. L. & Bouldin, D. W. A cluster separation measure. IEEE Trans Pattern Anal Mach Intell 2, 224–227 (1979).
44. Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20, 53–65 (1987).
45. Huang, Z. A fast clustering algorithm to cluster very large categorical data sets in data mining. Data Min Knowl Discov 3, 34–39 (1997).
46. Dunn, O. J. Multiple comparisons among means. J Am Stat Assoc 56, 52–64 (1961).
47. Bland, J. M. & Altman, D. G. Multiple significance tests: the Bonferroni method. BMJ 310, 170 (1995).
48. Pathway Commons Guide. Fisher’s Exact Test. Preprint at [https://www.pathwaycommons.org/guide/primers/statistics/fishers\_exact\_test/](https://www.pathwaycommons.org/guide/primers/statistics/fishers_exact_test/).
49. Cox, D. R. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological) 34, 187–202 (1972).
50. Satagopan, J. M. et al. A note on competing risks in survival data analysis. Br J Cancer 91, 1229–1235 (2004).
51. Khunti, K., Routen, A., Banerjee, A. & Pareek, M. The need for improved collection and coding of ethnicity in health research. J Public Health (Bangkok) 43, e270–e272 (2021).
52. Townsend, P. Deprivation. J Soc Policy 16, 125–146 (1987).
53. Restocchi, V., Villegas, J. G. & Fleuriot, J. D. Multimorbidity profiles and stochastic block modeling improve ICU patient clustering. in 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid) 925–932 (IEEE, 2022). doi:10.1109/CCGrid54584.2022.00112.
54. MacQueen, J. Classification and analysis of multivariate observations. in *5th Berkeley Symp. Math. Statist. Probability* 281–297 (1967).
55. Abdi, H. & Valentin, D. Multiple correspondence analysis. Encyclopedia of measurement and statistics 2, 651–657 (2007).
56. Beaney, T. et al. Identifying multi-resolution clusters of diseases in ten million patients with multimorbidity in primary care in England. Communications Medicine 4, 102 (2024).
57. Ho, I. S. S. et al. Measuring multimorbidity in research: Delphi consensus study. BMJ Medicine 1, e000247 (2022).
58. Fry, A. et al. Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am J Epidemiol 186, 1026–1034 (2017).

[1]: /embed/graphic-1.gif
[2]: /embed/graphic-10.gif
[3]: /embed/graphic-11.gif
[4]: /embed/graphic-12.gif
[5]: /embed/graphic-13.gif