Distinct cognitive and functional connectivity features from healthy cohorts inform clinical obsessive-compulsive disorder
==========================================================================================================================

* Luke J. Hearne
* B.T. Thomas Yeo
* Lachlan Webb
* Andrew Zalesky
* Paul B. Fitzgerald
* Oscar W. Murphy
* Ye Tian
* Michael Breakspear
* Caitlin V. Hall
* Sunah Choi
* Minah Kim
* Jun Soo Kwon
* Luca Cocchi

## Abstract

Improving the diagnostic accuracy of obsessive-compulsive disorder (OCD) using models of brain imaging data is a key goal of the field, but this objective is challenging due to the limited size and phenotypic depth of clinical datasets. Leveraging the phenotypic diversity in large non-clinical datasets such as the UK Biobank (UKBB) offers a potential solution to this problem. Nevertheless, it remains unclear whether classification models trained on non-clinical populations will generalise to individuals with clinical OCD. This question is also relevant for the conceptualisation of OCD; specifically, whether the symptomatology of OCD exists on a continuum from normal to pathological. Here, we examined a recently published “meta-matching” model trained on functional connectivity data from five large normative datasets (N=45,507) to predict cognitive, health and demographic variables. Specifically, we tested whether this model could classify OCD status in three independent clinical datasets (N=345). We found that the model could identify out-of-sample OCD individuals. Notably, the most predictive functional connectivity features mapped onto known cortico-striatal abnormalities in OCD and correlated with genetic brain expression maps previously implicated in OCD. Further, the meta-matching model relied upon estimates of disordered cognition, such as cognitive flexibility and inhibition, to predict OCD successfully.
Overall, our findings suggest that core subclinical features associated with obsessive-compulsive variability can discriminate clinical OCD status. These results support a dimensional and transdiagnostic conceptualisation of the brain and behavioural basis of OCD, with implications for research approaches and treatment targets.

Keywords

* OCD
* networks
* meta matching
* big data
* transfer learning
* fMRI

## Main

Obsessive-compulsive disorder (OCD) is a disabling mental condition characterised by the presence of intrusive thoughts (obsessions) and/or excessive ritualistic behaviours (compulsions) 1. Neurobiological models of OCD supported by genetic, preclinical and neuroimaging studies implicate dysfunction within cortico-striatal-thalamic circuitry 2–7. Accurate individual classification, rather than group-level observations, is an important next step in understanding OCD. However, this objective is challenged by the difficulty of acquiring sufficiently large datasets 8, which has likely led to over-inflated classification accuracies and poor model generalisability 9.

Large population-based cohorts like the UK Biobank (UKBB) have been touted as a viable way forward to address the lack of sample size and richness within clinical datasets. A major challenge in applying “big data” brain models to mental health conditions is the lack of individuals with clinical disorders in such databases. Specifically, in the case of OCD, large publicly available datasets tend not to include individuals with severe or extreme levels of obsessions or compulsions. Nonetheless, contrary to the dominant categorical view of OCD whereby an individual either has or does not have the disorder, several studies support a *dimensional* conceptualisation, with relevant behavioural, neurophysiological, and genetic phenotypes representing a continuum 10.
This framework suggests that individual brain and behavioural variability associated with subclinical OCD symptoms and dimensional traits (e.g., compulsivity) in the general population may be leveraged to guide research, improve diagnostic tools, and develop targeted personalised treatments for clinical OCD 11–13. However, it is uncertain whether brain features linked to the expression of subclinical OCD symptoms and traits align with established diagnostic (e.g., DSM-5) and neurophysiological (e.g., cortico-striatal-thalamic loops) models of the disorder. Accordingly, the success of models based on “big data” relies on whether obsessive-compulsive features do indeed form a continuum from healthy individuals to clinical OCD with overlapping brain characteristics.

A recent approach, termed “meta-matching”, demonstrated successful resting-state functional connectivity (RSFC) brain-behaviour model generalisation from large to small datasets in healthy cohorts 14,15. Meta-matching leverages the observation that many cognitive, health and demographic variables are correlated. Thus, brain connectivity patterns useful for predicting a known variable (e.g., age) are also likely to be useful in predicting an unseen, *correlated* variable in a different dataset (e.g., memory performance). This approach demonstrated impressive predictive performance, even in small to moderate sample sizes 15, suggesting it may be useful to inform clinical neuroimaging studies.

In the current work, we started by assessing whether a meta-matching model, trained on five large healthy datasets, was useful in identifying persons with an OCD diagnosis. We then investigated the brain features generated by the model and, in line with the cortico-striatal-thalamic loop model of OCD, hypothesised a large contribution of RSFC features associated with cortico-striatal circuits 3,16.
We also explored the behavioural and physiological phenotypes derived from the large healthy datasets contributing to OCD prediction. Finally, to further evaluate the biological validity of the meta-matching model, we assessed the relationship between its connectivity features and genetic expression maps previously associated with compulsivity in the general population 17.

## Results

First, we established whether a brain-based classifier trained on healthy individuals could accurately identify persons with an OCD diagnosis. To achieve this, we used a meta-matching model 14 that had been trained to predict 458 variables reflecting different aspects of physical and mental health, cognition, and behaviour. This model used whole-brain RSFC calculated across five source datasets, encompassing a total of 45,507 participants (**Methods**). Our local, to-be-classified dataset was a clinical OCD sample and matched healthy controls from three independent sites (N=345, nOCD = 199, nHC = 146). We refer to the data used in the meta-matching model as “normative” to distinguish it from the local healthy control data. Model accuracy was assessed in held-out test samples in a standard five-fold cross-validation scheme (**Methods**).

### Normative meta-matching RSFC brain-behaviour models can be used to classify OCD

We found the meta-matching model could classify unseen RSFC data as OCD or healthy control at above-chance levels (median balanced accuracy = 61.1%, *p* = 0.003, **Figure 1A**). Brain features utilised in the meta-matching model are shown in **Figure 1B** (**Methods**). Positive brain feature weights indicate that, for a given brain region, RSFC was higher in the OCD group than in the local healthy controls; negative weights indicate the reverse. For example, visual cortex regions demonstrated high feature weights and, therefore, higher RSFC in OCD.

![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/09/04/2024.09.02.24312960/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2024/09/04/2024.09.02.24312960/F1) Meta-matching model performance and brain features. **A.** Balanced accuracy (y-axis) classification performance in out-of-sample data for the meta-matching model (blue) and shuffled permutations (grey). Individual data points represent the variability across the cross-validation folds (200 iterations of 5 folds). **B.** Visualisation of the averaged brain features used in the model. **C.** Brain features averaged into canonical functional networks. Brain networks: FPN=control/frontoparietal, DMN=default-mode, DAt=dorsal attention, Lim=limbic, VAt=salience/ventral attention, SM=somatomotor, SC=subcortical, Vis=visual.

Next, we averaged the regional weights into eight canonical functional brain networks 18. All networks except the subcortical and limbic networks showed significant contributions to the successful predictions (**Figure 1C**, **Supplementary Table 1**). Visual and dorsal attention networks were associated with increased functional connectivity in OCD, whereas all other significant networks were associated with decreased connectivity in OCD (**Figure 1C**).

### Evaluating the predictive contribution of cortico-striatal circuits

OCD has been consistently associated with abnormal cortico-striatal circuit function 2,6,16,19. Thus, we tested whether such circuits were more predictive of OCD status than other equally sized sets of subcortical-cortical functional brain connections. An independent dataset (N=250) was used to map the cortico-striatal circuits of interest 20 (**Methods**). As expected, connectivity patterns involving the striatal seed regions (NAcc, nucleus accumbens; dCaud, dorsal caudate; dPut, dorsal putamen; vPut, ventral putamen; **Supplementary Figure 1**) highlighted a largely striatal-frontal pattern of connectivity (**Figure 2A**).
Using these data, “circuits of interest” were defined as the most highly connected cortical regions with positive RSFC (10 regions per hemisphere, per seed; cortical areas within black contours in **Figure 2A**).

![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/09/04/2024.09.02.24312960/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2024/09/04/2024.09.02.24312960/F2) Predictive weights within cortico-striatal circuits. **A.** Group- and hemisphere-averaged functional connectivity maps from 250 unrelated HCP participants when seeding each subcortical region of interest. Black borders indicate the cortico-striatal circuits of interest based on cortical regions with the highest positive connectivity values. NAcc, nucleus accumbens; dCaud, dorsal caudate; dPut, dorsal putamen; vPut, ventral putamen (Supplementary Figure 1). **B.** Comparison between a null model of random subcortical-cortical brain feature weights (grey, far left) and the circuits of interest for the meta-matching model (blue). Circuit-specific shuffled permutations are also shown in grey. Individual data points represent the variability across the cross-validation folds (200 iterations of 5 folds). **C.** Top ten meta-matching phenotypes with the largest feature weights. Positive values indicate the OCD cohort had higher weights than the local cohort healthy controls and vice versa (e.g., lower functional connectivity in the dCaud pathway in panel **B**, or increased harm avoidance in panel **C**).
TCI, Temperament and Character Inventory; SDQ, Strengths and Difficulties Questionnaire; CWI, Colour Word Interference; D-KEFS, Delis-Kaplan Executive Function System; WIAT, Wechsler Individual Achievement Test; BIS-BAS, Behavioural Inhibition and Behavioural Activation Systems; NIH refers to the NIH Toolbox.

We observed consistent outlier brain features in the dorsal caudate, dorsal putamen, and ventral putamen circuits compared to shuffled permutations (dCaud; *p* = 0.01, dPut; *p* = 0.01, vPut; *p* = 0.01; **Figure 2B**). The dorsal caudate and dorsal putamen circuits also had larger negative feature weights than would be expected from any random set of subcortical-cortical brain connections (dCaud; *p* = 0.02, dPut; *p* = 0.04, **Figure 2B** and **Supplementary Table 2**). Note that, as before, the sign indicates the direction of the effect. For example, connectivity is lower in OCD in the dorsal putamen pathway compared to the local dataset healthy controls (**Figure 2B**).

### Trained phenotypes contributing to OCD predictions

We next sought to establish which phenotypes from the normative population data used in the meta-matching model were useful for classifying OCD. Specifically, the meta-matching approach transforms brain connectivity data into 458 variables representing different health, cognitive and behavioural phenotypes derived from several datasets. We investigated which of these variables were useful in predicting OCD status. Of those 458 variables, only two were significantly related to OCD predictions: (i) the Dimensional Change Card Sort Test, a measure of cognitive flexibility; and (ii) the Flanker Inhibitory Control and Attention test (*p*FDR < 0.05, **Supplementary Table 3**).
Other variables that were highly weighted (e.g., *p*FDR < 0.06) included cognitive measures (e.g., performance on the Colour Word Interference test, a test of cognitive flexibility), mental health measures (e.g., excessive worrying indexed by harm avoidance), and behavioural variables (e.g., increased sleep per day) (**Figure 2C**).

### Relationship between meta-matching model brain features and genes implicated in OCD

Next, we sought to understand the link between the brain connectivity features used in the meta-matching model and the known role of genetics in the development of OCD 21,22. A recent GWAS meta-analysis identified that variability within the KIT, GRID2, ADCK1, and WDR7 genes is associated with an increased likelihood of compulsive symptoms 17. Broadly speaking, the KIT gene is involved in cell proliferation and survival, while the GRID2 gene supports glutamate signalling in cerebellar Purkinje cells 23,24. The ADCK1 gene maintains mitochondrial function, and the WDR7 gene aids proteins involved in neurotransmission 25,26. To investigate the expression of these genes in the brain, we utilised preprocessed regional microarray expression data provided by the Allen Human Brain Atlas via the Abagen toolbox 27,28. Brain-wide regional gene expression was correlated with connectivity features generated by the meta-matching model (cortical features shown in **Figure 1B**).

Three of the regional gene expression maps could be reliably linked to the brain connectivity features in the meta-matching model (GRID2, *r* = -0.31, *p* < 0.001; WDR7, *r* = -0.30, *p* < 0.001; ADCK1, *r* = -0.14, *p* = 0.03) (**Figure 3**). When considering the subcortex, only WDR7 demonstrated a significant correlation (WDR7, *r* = -0.64, *p* = 0.01, see **Supplementary Table 4**).
For each significant relationship, higher regional gene expression was associated with negative connectivity features (i.e., regions with lower RSFC in OCD compared to the local dataset healthy control cohort). Note that statistical significance was declared using the spin test 29, followed by FDR correction for multiple comparisons.

![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/09/04/2024.09.02.24312960/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2024/09/04/2024.09.02.24312960/F3) Regional gene expression associated with meta-matching predictive feature weights. The top panel shows the normalised regional gene expression (nRGE) for each of the four genes of interest. The bottom panel displays the linear relationship between each nRGE cortical map and the meta-matching model connectivity feature weights.

### Control analyses

We contrasted the current results obtained using multilayer meta-matching 14 with the original meta-matching model 15. The key difference between these models is the increased number of training datasets and phenotypes in the multilayer meta-matching (the original model only used data from the UKBB for training). The multilayer meta-matching performance was not statistically different from the original model (*p* = 0.44, **Supplementary Figure 2**).

We also performed a control experiment using a logistic regression model to classify OCD patients (**Methods**). Classification performance for this model was not statistically different from that of the meta-matching models (*p* = 0.53, **Supplementary Figure 2**). Likewise, the brain features extracted from the models were significantly correlated (cortex: *r* = 0.92, *p* < 0.001, subcortex: *r* = 0.82, *p* < 0.001).

Finally, when developing brain-behaviour prediction models, it is common practice to attempt to remove (*deconfound*) covariates that may bias the results (e.g., age, sex, site, head motion).
Accordingly, we have presented results adjusted for site, age and sex (see **Methods**). Classification performance was not statistically different between the deconfounded and non-deconfounded models (*p* = 0.93), suggesting that variables such as age, sex, and head motion had a limited impact on the results (**Supplementary Figure 2**).

## Discussion

In this study, we tested whether a brain-behaviour model trained on large, relatively “healthy” (i.e., non-clinical) datasets could classify individuals with clinical OCD. Despite the lack of individuals with a clinical diagnosis of OCD and relevant measurements (e.g., symptom scales) in the training data, the model demonstrated above-chance performance. This finding supports the conceptualisation of OCD as a continuum comprising core behavioural, neurophysiological, and genetic aspects 30. Specifically, the predictive power of cognitive phenotypes developed in healthy training data (cognitive flexibility, attention, and inhibitory control) highlights the value of these non-specific transdiagnostic measures for understanding OCD 31. Likewise, the results confirm the key role of dysregulated frontostriatal functional connectivity in OCD 3,7 and its relevance to the expression of subclinical symptoms. Finally, we were able to establish a link between OCD-related brain connectivity features and gene expression associated with compulsiveness in the general population 10,17.

The ubiquitous nature of the brain-behaviour mappings used by the meta-matching model is consistent with the idea that OCD pathology is supported by mental processes and behaviours that cut across traditional diagnostic categories and exist on a spectrum between healthy and diseased states 12,32,33. Accordingly, cognitive deficits like those indexed by the Dimensional Change Card Sort Test or the Flanker task are believed to be markers of general psychopathology detectable across various neurological and psychiatric conditions 34,35.
In addition to the above cognitive processes, our findings highlight the strong predictive weight of harm avoidance or escape “coping”, which are more specific to OCD. Excessive harm avoidance and its underpinning processes are indeed thought to be a core mechanism of compulsive behaviour 36,37. These findings suggest that the subclinical “healthy population” variability in core cognitive processes is useful for classifying clinical OCD and may contribute to the emergence of OCD-specific behaviours.

Our analysis of the meta-matching model’s brain connectivity feature weights supports the hypothesis that abnormal RSFC within cortico-striatal circuits represents a hallmark of OCD pathophysiology 16,19. Specifically, resting-state connectivity of the dorsal caudate and dorsal putamen circuits was among the largest predictors across all possible subcortical-cortical brain connections. These circuits have been previously associated with cognitive deficits and sensorimotor symptoms observed in OCD, such as impaired planning and decision-making, and repetitive behaviours 38–42. Therefore, our findings provide additional motivation to study the neural mechanisms underpinning abnormalities in frontostriatal activity in people with clinical OCD. Investigating the brain basis of changes in frontostriatal circuit activity is key to improving the efficacy of invasive neurosurgical treatments 43 and the development of mechanism-sensitive non-invasive therapies like transcranial magnetic stimulation 44.

Similar to prior work attempting to use neuroimaging data to classify OCD individuals, we also observed that, compared to other macroscale networks, the default-mode and sensory networks contained the largest predictive features 45,46. The default mode network, which has been linked to internal monitoring 47, may align with ruminations and excessive self-monitoring observed in OCD 48.
Likewise, sensorimotor systems have been associated with inhibition problems at the core of several OCD symptoms 49.

Supporting the importance of functional brain networks in OCD, the expression patterns of genes previously associated with clinical OCD and subclinical compulsivity 17 correlated with brain-wide predictive feature patterns. Specifically, brain regions with lower RSFC in OCD, relative to controls, tended to exhibit higher gene expression. This suggests that differences in the regional brain expression of genes broadly associated with neural function and communication (e.g., GRID2 cerebellar glutamate signalling, ADCK1 mitochondrial function, WDR7 neurotransmission) can be tied to brain connectivity in OCD. Collectively, our findings underscore the importance of the neurobiological features leveraged by the meta-matching model to explain and predict clinical OCD.

Our results also shed light on the broader feasibility of transfer learning from general population datasets to clinical populations 50,51. Unlike prior studies, we did not observe a boost in classification performance due to meta-matching when compared to a simple regression model 14,15,52. There are likely multiple reasons for this discrepancy, the most obvious being that some critical aspects of variability associated with clinical OCD are not observable in brain-behaviour mappings trained in a healthy population. Thus, a key future research direction is improving meta-matching performance to a clinically useful level, including for diagnosing OCD from brain connectivity measures.

In conclusion, we have demonstrated that large healthy normative imaging datasets can be used to classify and advance knowledge of the brain and behavioural bases of clinical OCD. Our investigation into cortico-striatal circuits, functional networks, and OCD-specific brain-wide gene expression patterns provides strong evidence for the biological relevance of the adopted meta-matching model.
These results support the study of OCD via a transdiagnostic framework defined by core neuro-behavioural constructs 32. The findings also motivate further investigations into using meta-matching and transfer learning to improve the biologically grounded classification of psychiatric conditions beyond OCD.

## Methods

### Datasets

Data used in the current study were pooled across three independent datasets collected in Brisbane 44, Melbourne (clinical trial registration ACTRN12619000008123), and Seoul 53. The specifics of each dataset are described in the following section. The final sample size after the exclusion of people with excessive head motion (below) was N = 345 (NOCD = 199, NHC = 146). Demographics and clinical characteristics for each of these datasets are described in **Table 1**. In addition, 250 unrelated participants from the Human Connectome Project 54 were used to map cortico-striatal circuits of interest. The relevant local ethics committees approved each study. Likewise, all studies complied with the ethical standards of the relevant national and institutional committees on human experimentation.

[Table 1.](http://medrxiv.org/content/early/2024/09/04/2024.09.02.24312960/T1) Sample demographics

#### Brisbane

Fifty-eight adult participants with a clinical diagnosis of OCD and 45 controls were recruited across Australia as part of a registered randomised-controlled clinical trial (ACTRN12616001687482). In total, seven participants from the OCD cohort were excluded due to excessive head motion, anatomical abnormalities and missing and/or corrupted data (see brain imaging preprocessing section below). Details regarding participant recruitment and inclusion criteria have been reported elsewhere 44,55.

#### Melbourne

Forty-six adult participants with a clinical diagnosis of OCD were recruited across Australia as part of a registered randomised-controlled clinical trial (ACTRN12619000008123).
The trial aimed to recruit a total of 75 participants, but it was prematurely terminated because of the COVID-19 pandemic. The study was approved by the Alfred Health Research Ethics Committee (Melbourne, Australia). Written informed consent was obtained from all participants. Details regarding participant recruitment and inclusion criteria are reported in the supplementary material.

#### Seoul

One hundred and two medication-free adult participants with a clinical diagnosis of OCD and 101 controls were recruited as part of a previous study 53. The Institutional Review Board of Seoul National University Hospital approved the study. Written informed consent was obtained from all participants (any minors who participated in the study required consent from the individual and their caretakers). Details regarding participant recruitment and inclusion criteria have been reported elsewhere 53.

### Brain imaging data acquisition

#### Brisbane

Brain imaging data were acquired on a 3T Siemens Prisma MR scanner equipped with a 64-channel head coil at the Herston Imaging Research Facility, Brisbane, Australia. Whole brain echo-planar images were acquired with the following parameters: voxel size = 2 mm³, TR = 810 ms, multiband acceleration factor = 8, TE = 30 ms, flip angle = 53°, field of view = 212 mm, 72 slices. The resting state acquisition was about 12 minutes in length (880 volumes). Structural brain images were acquired with the following parameters: voxel size = 1 mm³, TR = 1900 ms, TE = 2.98 ms, 256 slices, flip angle = 9°. Anterior-to-posterior and posterior-to-anterior spin echo fieldmaps were also acquired.

#### Melbourne

Brain imaging data were acquired on a 3T Siemens Prisma MR scanner equipped with a 64-channel head coil at The Royal Melbourne Hospital, Melbourne, Australia. The functional and structural brain imaging sequences were identical to the Brisbane site.
#### Seoul

Brain imaging data were acquired on a 3T Siemens Trio MR scanner equipped with a 12-channel head coil at the Seoul National University Hospital. Whole brain echo-planar images were acquired with the following parameters: voxel size = 1.9 mm x 1.9 mm x 3.5 mm, TR = 3500 ms, TE = 30 ms, flip angle = 90°, field of view = 240 mm, 35 slices. The resting state acquisition was about 7 minutes in length (116 volumes). Structural brain images used in the preprocessing pipeline were acquired with the following parameters: voxel size = 1 mm x 0.98 mm x 0.98 mm, TR = 1670 ms, TE = 1.89 ms, 208 slices, flip angle = 9°.

### Brain imaging data processing

All brain imaging data were preprocessed using fMRIprep (version 23.2.0) 56. Briefly, the data were skull stripped, corrected for susceptibility distortions (multiband data only), coregistered to the anatomical image and slice time corrected. The data were then resampled to a standard template space (HCP CIFTI surface and volume; see the supplementary material for full details). The data were then downsampled into region-specific timeseries using the Schaefer 400 brain parcellation 57, with an additional 19 subcortical and cerebellar regions 58. This specific atlas was required to use the previously published meta-matching model 15.

The resulting timeseries were denoised via Nilearn using a standard, previously benchmarked pipeline 59. For the main results, we employed a denoising strategy that regressed 24 motion parameters (six motion parameters, their temporal derivatives and quadratics of all regressors), average white matter signal, cerebrospinal fluid signal, global signal, as well as cosine transformation basis regressors. Framewise displacement was used to identify participants with large amounts of head motion. Specifically, any participant with less than 5 minutes of data with a framewise displacement of less than 0.5 mm was excluded from further analysis (N = 2 from the OCD cohort at the Brisbane site).
Finally, consistent with the original meta-matching model, RSFC was estimated as the Pearson correlation between the denoised regional timeseries. Thus, for each participant, there was a single 419 x 419 symmetric RSFC matrix. The unique lower-triangle values of these matrices (419 x 418 / 2 = 87,571 edges) were indexed, resulting in a final array of RSFC values by participants (87,571 x N), which were used as features in the classification models.

### Meta-matching model

To perform meta-matching, we used the openly available multilayer meta-matching model published by Chen et al. (2022) (V2.0; [https://github.com/ThomasYeoLab/Meta_matching_models](https://github.com/ThomasYeoLab/Meta_matching_models)). The model was trained on five source datasets: the UKBB (N = 36,834) 60, the Adolescent Brain Cognitive Development study (ABCD, N = 5,985) 61, the Healthy Brain Network project (HBN, N = 930) 62, the enhanced Nathan Kline Institute-Rockland sample (eNKI-RS, N = 896) 63, and the Genomics Superstruct Project (GSP, N = 862) 64. Crucially, the samples used in these studies reflect individuals who are healthier than the general population 65. The UKBB did include some individuals who self-reported a lifetime OCD diagnosis, regardless of whether they were exhibiting symptoms at the time of imaging. However, given their small number (approximately 0.06%) 66, it is unlikely that this sub-sample could drive our results. Age varied across these datasets, spanning children, adolescents, younger adults, and older adults (see Chen et al., 2023 for further details).

This meta-matching model uses a combination of a fully connected feedforward deep neural network and multiple kernel ridge regression models to predict 458 *non-brain imaging phenotypes* from RSFC (87,571 edges). The 458 phenotypes were derived from the source datasets, representing different aspects of physical and mental health, cognition, and behaviour. Comprehensive details regarding the model and its training data are available elsewhere 14,15.
In brief, the 458 predictions are made by two parallel approaches. The first uses kernel ridge regression models to generate predictions from RSFC for each dataset, resulting in 229 phenotypic predictions (UKBB=67, ABCD=36, HBN=42, eNKI-RS=61, GSP=23). The second uses RSFC to predict UKBB phenotypes (67) via a fully connected deep neural network and then uses the outcomes of this first layer as input to the remaining kernel ridge regression models (ABCD=36, HBN=42, eNKI-RS=61, GSP=23), yielding a further 229 predictions.

The 458 phenotypes predicted by the meta-matching model are not specific to our research problem: predicting OCD diagnosis. Thus, we employed a procedure called “stacking”, whereby a final regression model is used to predict the phenotype of interest. Specifically, the 458 predicted phenotypes were used in a logistic regression to predict OCD status. This was implemented using a regularised logistic regression in sklearn (*LogisticRegressionCV*) with the *liblinear* solver, where the best *C* hyperparameter was selected via a nested five-fold cross-validation 67.

### Cross-validation and model performance

A five-fold cross-validation scheme was employed to assess the accuracy of the predictions. The cross-validation was repeated 200 times to generate a distribution of classification performance measures. Within the training set, site harmonisation was performed using ComBat 68,69, a validated method for correcting multi-site data, with OCD status as a covariate of interest. Likewise, linear regression was used to remove variability associated with age and sex. These deconfounding models were applied separately to the test data within each cross-validation fold. A statistical analysis of the success of this deconfounding procedure is described below (**Control Analyses**).

Model accuracy for each cross-validation iteration was assessed using balanced accuracy. As in prior work 70, we evaluated the performance compared to chance via permutation testing.
Specifically, each model was rerun after shuffling the RSFC data, effectively severing the link between RSFC and OCD status. This was repeated 1000 times, generating a null distribution of classification performance for each model. Statistical significance was assessed by calculating the percentile of the average empirical classification value within this null distribution.

### Examination of brain connectivity features

To assess the importance of each functional connectivity edge in predicting OCD status, the Haufe transform 71 was used to calculate feature weights. We used these weights to assess the importance of functional brain networks (**Figure 1**) and cortico-striatal circuits (**Figure 2**), as well as the relationship between model features and gene expression (**Figure 3**).

#### Cortico-striatal brain circuits

We mapped four key cortico-striatal circuits by seeding the nucleus accumbens (NAcc), dorsal caudate (dCaud), dorsal putamen (dPut), and ventral putamen (vPut) in an independent dataset (N=250 from the Human Connectome Project, HCP; see **Supplementary Material**). The brain connectivity feature weights of these distinct brain circuits were examined and compared to two null models. The first compared the brain connectivity weights within the circuits to the weights of the same connections in the permuted models (described above). The second compared the circuit weights to similarly sized sets of connections drawn from a distribution of random subcortical-cortical empirical weights. In each case, the empirical result was compared to the null distribution by calculating the percentile of the mean empirical value within that distribution.

#### Region-wise brain connectivity features, functional brain networks, and gene expression

Brain connectivity feature weights were averaged across cross-validation folds for each brain region in the adopted brain parcellation. We then performed two analyses on these region-wise values.
First, these values were further averaged into eight functional brain networks (control/frontoparietal, default-mode, dorsal attention, limbic, salience/ventral attention, somatomotor, subcortical and visual) 18 and compared to null permutations. Second, we tested whether brain-wide expression of four genes highlighted by a recent GWAS meta-analysis of subclinical compulsive symptoms (KIT, GRID2, ADCK1, WDR7) 17 was associated with brain connectivity features. Specifically, regional microarray expression data were obtained from six post-mortem brains (one female, ages 24 to 57, average age 42.5 ± 13.38 years) provided by the Allen Human Brain Atlas. The data were processed in MNI space using the Abagen toolbox 27,28. When comparing cortical maps, we employed the spin test 29. For subcortical maps, we utilised a standard permutation shuffling approach (10,000 permutations in both scenarios) 72.

#### Meta-matching phenotype feature weights

An analysis analogous to the Haufe transform between predictions and RSFC inputs described above was conducted to examine the predictive value of the 458 training dataset phenotypes in the meta-matching model. In this context, a positive weight indicates that the meta-matching model predicted a higher score on a given phenotypic variable for the OCD patients in the local dataset than for the healthy controls, and vice versa. To give a concrete example, the phenotype neuroticism was measured in the UKBB (one of the normative training datasets) but not in the local OCD and healthy control datasets. If this phenotype feature has a high weight, the meta-matching model has predicted that OCD patients would report higher scores on the neuroticism scale than the local healthy controls. Thus, in this scenario, we would conclude that the RSFC of OCD patients in the local data is similar to the RSFC of UKBB participants who reported high scores on the neuroticism scale.
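For readers unfamiliar with the Haufe transform, the idea can be sketched as follows: for a linear model, the interpretable weight of each input feature is taken as the covariance between that feature and the model's prediction, which equals the feature covariance matrix multiplied by the model coefficients. The sketch below is illustrative only. The function name `haufe_pattern` is hypothetical, the data are synthetic, the covariance is left unnormalised (one of several scaling conventions), and for a logistic model the continuous decision value would be an assumed choice of "prediction".

```python
import numpy as np


def haufe_pattern(X: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """Covariance between each feature (column of X) and the model prediction.

    X has shape (n_samples, n_features); y_pred has shape (n_samples,).
    The name is illustrative and is not taken from the original code.
    """
    Xc = X - X.mean(axis=0)
    yc = y_pred - y_pred.mean()
    # Unbiased sample covariance of each column of X with y_pred.
    return Xc.T @ yc / (X.shape[0] - 1)


# Synthetic example: 100 samples, 5 features, and a known linear model.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
w = np.array([1.0, 0.0, -0.5, 0.0, 2.0])
y_pred = X @ w

pattern = haufe_pattern(X, y_pred)
# For a linear model the pattern equals Cov(X) @ w.
print(np.allclose(pattern, np.cov(X, rowvar=False) @ w))
```

A feature can have a zero coefficient and still receive a non-zero pattern value if it covaries with the prediction, which is why the transformed weights, rather than the raw coefficients, are interpreted.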
As with the brain connectivity feature weights, the average empirical value was compared to shuffled null permutations and subsequently FDR corrected (Ncomponents = 458).

### Control analyses

#### Alternative model comparisons

The meta-matching model was compared to a logistic regression model in which all RSFC values were used to predict OCD status. Aside from the shape of the input data, this model was identical to the “stacking” logistic regression described above. This baseline represents a standard comparison model that might be used to predict OCD diagnosis. We also contrasted the meta-matching model with a prior version (V1.1), which included only a single training dataset (the UKBB; He et al., 2022). Models were compared to each other via the corrected resampled *t*-test, as a standard *t*-test would not be valid given the non-independence of overlapping cross-validation resamples 73,74.

#### Deconfounding

Predictive models can capture confounding effects correlated with the outcome rather than the brain features of interest (e.g., age, site, head motion) 75–77. To explore this, we compared model performance before and after deconfounding using the corrected resampled *t*-test.

#### Impact of global signal regression

Removing non-neuronal noise from fMRI data is a crucial preprocessing step in RSFC analyses 59. The training data used in the meta-matching model utilised a variety of denoising approaches with and without global signal regression. For completeness, we compared model performance with and without global signal regression in the local dataset using the corrected resampled *t*-test (*p* = 0.41, see **Supplementary Figure 2**).

### Data and code availability

Code used to generate the results will be uploaded to GitHub upon publication. The version of the meta-matching model used in the current work is available online (V2.0; [https://github.com/ThomasYeoLab/Meta\_matching_models](https://github.com/ThomasYeoLab/Meta_matching_models)).
De-identified participant data for research purposes are available on request for data collected at the Brisbane 44 and Melbourne sites. De-identified participant data collected at the Seoul site are available on request from the original authors 53.

## Conflict of Interest

L.C., L.J.H., and A.Z. are involved in a not-for-profit clinical neuromodulation centre (Qld. Neurostimulation Centre) that offers neuroimaging-guided neurotherapeutics. In the last 3 years, PBF has received equipment for research from Neurosoft and Nexstim. He has served on a scientific advisory board for Magstim and received speaker fees from Otsuka. He has also acted as a founder and board member for TMS Clinics Australia and Resonance Therapeutics.

## Acknowledgements

This work was supported by the Australian NHMRC (GN2001283 and GNT2027597, L.C.). A.Z. and L.J.H. were supported by research fellowships from the NHMRC (APP1118153, APP1194070, respectively). PBF is supported by a National Health and Medical Research Council of Australia Investigator grant (1193596).
BTTY is supported by the NUS Yong Loo Lin School of Medicine (NUHSRO/2020/124/TMR/LOA), the Singapore National Medical Research Council (NMRC) LCG (OFLCG19May-0035), NMRC CTG-IIT (CTGIIT23jan-0001), NMRC STaR (STaR20nov-0003), Singapore Ministry of Health (MOH) Centre Grant (CG21APR1009), the Temasek Foundation (TF2223-IMH-01), and the United States National Institutes of Health (R01MH120080 & R01MH133334).

## Footnotes

* Formatting on title page corrected
* Received September 2, 2024.
* Revision received September 4, 2024.
* Accepted September 4, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-5. Arlington, VA (2013).
2. Ahmari, S. E. et al. Repeated cortico-striatal stimulation generates persistent OCD-like behavior. Science 340, 1234–1239 (2013).
3. Harrison, B. J. et al. Altered Corticostriatal Functional Connectivity in Obsessive-compulsive Disorder. Arch. Gen. Psychiatry 66, 1189–1200 (2009).
4. Menzies, L. et al. Integrating evidence from neuroimaging and neuropsychological studies of obsessive-compulsive disorder: The orbitofronto-striatal model revisited. Neurosci. Biobehav. Rev. 32, 525–549 (2008).
5. Piantadosi, S. C., Chamberlain, B. L., Glausier, J. R., Lewis, D. A. & Ahmari, S. E. Lower excitatory synaptic gene expression in orbitofrontal cortex and striatum in an initial study of subjects with obsessive compulsive disorder. Mol. Psychiatry 26, 986–998 (2021).
6. Shephard, E. et al. Toward a neurocircuit-based taxonomy to guide treatment of obsessive–compulsive disorder. Mol. Psychiatry 1–22 (2021) doi:10.1038/s41380-020-01007-8.
7. Stein, D. J. et al. Obsessive–compulsive disorder. Nat. Rev. Dis. Primers 5, 1–21 (2019).
8. Marek, S. et al. Reproducible brain-wide association studies require thousands of individuals. Nature 603, 654–660 (2022).
9. Poldrack, R. A., Huckins, G. & Varoquaux, G. Establishment of Best Practices for Evidence for Prediction: A Review. JAMA Psychiatry (2019) doi:10.1001/jamapsychiatry.2019.3671.
10. Strom, N. I., Soda, T., Mathews, C. A. & Davis, L. K. A dimensional perspective on the genetics of obsessive-compulsive disorder. Transl. Psychiatry 11, 1–11 (2021).
11. Fullana, M. A. et al. Obsessive–compulsive symptom dimensions in the general population: Results from an epidemiological study in six European countries. J. Affect. Disord. 124, 291–299 (2010).
12. Robbins, T. W., Banca, P. & Belin, D. From compulsivity to compulsion: the neural basis of compulsive disorders. Nat. Rev. Neurosci. 25, 313–333 (2024).
13. Ruscio, A. M., Stein, D. J., Chiu, W. T. & Kessler, R. C. The epidemiology of obsessive-compulsive disorder in the National Comorbidity Survey Replication. Mol. Psychiatry 15, 53–63 (2010).
14. Chen, P. et al. Multilayer meta-matching: translating phenotypic prediction models from multiple datasets to small data. bioRxiv (2023).
15. He, T. et al. Meta-matching as a simple framework to translate phenotypic predictive models from big to small data. Nat. Neurosci. 25, 795–804 (2022).
16. Naze, S. et al. Mechanisms of imbalanced frontostriatal functional connectivity in obsessive-compulsive disorder. Brain 146, 1322–1327 (2023).
17. Smit, D. J. A. et al. Genetic meta-analysis of obsessive–compulsive disorder and self-report compulsive symptoms. Am. J. Med. Genet. B Neuropsychiatr. Genet. 183, 208–216 (2020).
18. Yeo, B. T. et al. The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J. Neurophysiol. (2011).
19. Robbins, T. W., Vaghi, M. M. & Banca, P. Obsessive-Compulsive Disorder: Puzzles and Prospects. Neuron 102, 27–47 (2019).
20. Glasser, M. F. et al. A multi-modal parcellation of human cerebral cortex. Nature 536, 171–178 (2016).
21. Mahjani, B., Bey, K., Boberg, J. & Burton, C. Genetics of obsessive-compulsive disorder. Psychol. Med. 51, 2247–2259 (2021).
22. Posthuma, D. Revealing the complex genetic architecture of obsessive-compulsive disorder using meta-analysis. Mol. Psychiatry 23, 1181–1188 (2018).
23. Pathania, S., Pentikäinen, O. T. & Singh, P. K. A holistic view on c-Kit in cancer: Structure, signaling, pathophysiology and its inhibitors. Biochim. Biophys. Acta BBA-Rev. Cancer 1876, 188631 (2021).
24. Raghavan, N. S. et al. Whole-exome sequencing in 20,197 persons for rare variants in Alzheimer's disease. Ann. Clin. Transl. Neurol. 5, 832–842 (2018).
25. Kawabe, H. et al. A novel rabconnectin-3-binding protein that directly binds a GDP/GTP exchange protein for Rab3A small G protein implicated in Ca2+-dependent exocytosis of neurotransmitter. Genes Cells 8, 537–546 (2003).
26. Zhuo, B. et al. ADCK1 is a potential therapeutic target of osteosarcoma. Cell Death Dis. 13, 954 (2022).
27. Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).
28. Markello, R. D. et al. Standardizing workflows in imaging transcriptomics with the abagen toolbox. eLife 10, e72129 (2021).
29. Alexander-Bloch, A. F. et al. On testing for spatial correspondence between maps of human brain structure and function. Neuroimage 178, 540–551 (2018).
30. Mataix-Cols, D., do Rosario-Campos, M. C. & Leckman, J. F. A Multidimensional Model of Obsessive-Compulsive Disorder. Am. J. Psychiatry 162, 228–238 (2005).
31. Gillan, C. M., Fineberg, N. A. & Robbins, T. W. A trans-diagnostic perspective on obsessive-compulsive disorder. Psychol. Med. 47, 1528–1548 (2017).
32. Insel, T. et al. Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. Am. J. Psychiatry 167, 748–751 (2010).
33. Kebets, V. et al. Somatosensory-Motor Dysconnectivity Spans Multiple Transdiagnostic Dimensions of Psychopathology. Biol. Psychiatry 86, 779–791 (2019).
34. Abramovitch, A., Short, T. & Schweiger, A. The C Factor: Cognitive dysfunction as a transdiagnostic dimension in psychopathology. Clin. Psychol. Rev. 86, 102007 (2021).
35. Wright, L., Lipszyc, J., Dupuis, A., Thayapararajah, S. W. & Schachar, R. Response inhibition and psychopathology: A meta-analysis of go/no-go task performance. J. Abnorm. Psychol. 123, 429–439 (2014).
36. Gillan, C. M. et al. Enhanced avoidance habits in obsessive-compulsive disorder. Biol. Psychiatry 75, 631–638 (2014).
37. Rigoux, L., Stephan, K. E. & Petzschner, F. H. Beliefs, compulsive behavior and reduced confidence in control. PLOS Comput. Biol. 20, e1012207 (2024).
38. Abramovitch, A., Abramowitz, J. S. & Mittelman, A. The neuropsychology of adult obsessive–compulsive disorder: A meta-analysis. Clin. Psychol. Rev. 33, 1163–1171 (2013).
39. Banca, P. et al. Action sequence learning, habits, and automaticity in obsessive-compulsive disorder. eLife 12, RP87346 (2024).
40. Marzuki, A. A. et al. Association of Environmental Uncertainty With Altered Decision-making and Learning Mechanisms in Youths With Obsessive-Compulsive Disorder. JAMA Netw. Open 4, e2136195 (2021).
41. Snyder, H. R., Kaiser, R. H., Warren, S. L. & Heller, W. Obsessive-compulsive disorder is associated with broad impairments in executive function: A meta-analysis. Clin. Psychol. Sci. 3, 301–330 (2015).
42. van den Heuvel, O. A. et al. Brain circuitry of compulsivity. Eur. Neuropsychopharmacol. 26, 810–827 (2016).
43. Figee, M. et al. Deep brain stimulation restores frontostriatal network activity in obsessive-compulsive disorder. Nat. Neurosci. 16, 386–387 (2013).
44. Cocchi, L. et al. Effects of transcranial magnetic stimulation of the rostromedial prefrontal cortex in obsessive–compulsive disorder: a randomized clinical trial. Nat. Ment. Health 1, 555–563 (2023).
45. Bruin, W. B. et al. The functional connectome in obsessive-compulsive disorder: resting-state mega-analysis and machine learning classification for the ENIGMA-OCD consortium. Mol. Psychiatry 1–13 (2023).
46. Bu, X. et al. Investigating the predictive value of different resting-state functional MRI parameters in obsessive-compulsive disorder. Transl. Psychiatry 9, 17 (2019).
47. Buckner, R. L., Andrews-Hanna, J. R. & Schacter, D. L. The Brain's Default Network. Ann. N. Y. Acad. Sci. 1124, 1–38 (2008).
48. Coles, M. E., Heimberg, R. G., Frost, R. O. & Steketee, G. Not just right experiences and obsessive–compulsive features: Experimental and self-monitoring perspectives. Behav. Res. Ther. 43, 153–167 (2005).
49. Ahmari, S. E., Risbrough, V. B., Geyer, M. A. & Simpson, H. B. Impaired Sensorimotor Gating in Unmedicated Adults with Obsessive–Compulsive Disorder. Neuropsychopharmacology 37, 1216–1223 (2012).
50. Koppe, G., Meyer-Lindenberg, A. & Durstewitz, D. Deep learning for small and big data in psychiatry. Neuropsychopharmacology 46, 176–190 (2021).
51. Weiss, K., Khoshgoftaar, T. M. & Wang, D. A survey of transfer learning. J. Big Data 3, 1–40 (2016).
52. Chopra, S. et al. Reliable and generalizable brain-based predictions of cognitive functioning across common psychiatric illness. medRxiv 2022–12 (2022).
53. Kim, M. et al. Functional connectivity of the raphe nucleus as a predictor of the response to selective serotonin reuptake inhibitors in obsessive-compulsive disorder. Neuropsychopharmacology 44, 2073–2081 (2019).
54. Van Essen, D. C. et al. The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79 (2013).
55. Hearne, L. J. et al. Revisiting deficits in threat and safety appraisal in obsessive-compulsive disorder. Hum. Brain Mapp. 44, 6418–6428 (2023).
56. Esteban, O. et al. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods 1 (2018) doi:10.1038/s41592-018-0235-4.
57. Schaefer, A. et al. Local-Global Parcellation of the Human Cerebral Cortex from Intrinsic Functional Connectivity MRI. Cereb. Cortex 28, 3095–3114 (2018).
58. Fischl, B. et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 33, 341–355 (2002).
59. Wang, H.-T. et al. Continuous evaluation of denoising strategies in resting-state fMRI connectivity using fMRIPrep and Nilearn. PLOS Comput. Biol. 20, e1011942 (2024).
60. Sudlow, C. et al. UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age. PLOS Med. 12, e1001779 (2015).
61. Volkow, N. D. et al. The conception of the ABCD study: From substance use to a broad NIH collaboration. Dev. Cogn. Neurosci. 32, 4–7 (2018).
62. Alexander, L. M. et al. An open resource for transdiagnostic research in pediatric mental health and learning disorders. Sci. Data 4, 170181 (2017).
63. Nooner, K. B. et al. The NKI-Rockland Sample: A Model for Accelerating the Pace of Discovery Science in Psychiatry. Front. Neurosci. 6 (2012).
64. Holmes, A. J. et al. Brain Genomics Superstruct Project initial data release with structural, functional, and behavioral measures. Sci. Data 2, 150031 (2015).
65. Fry, A. et al. Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am. J. Epidemiol. 186, 1026–1034 (2017).
66. Davis, K. A. S. et al. Mental health in UK Biobank – development, implementation and results from an online questionnaire completed by 157 366 participants: a reanalysis. BJPsych Open 6, e18 (2020).
67. Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
68. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
69. Pomponio, R. et al. Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan. NeuroImage 208, 116450 (2020).
70. Ooi, L. Q. R. et al. Comparison of individualized behavioral predictions across anatomical, diffusion and functional connectivity MRI. NeuroImage 263, 119636 (2022).
71. Haufe, S. et al. On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage 87, 96–110 (2014).
72. Larivière, S. et al. The ENIGMA Toolbox: multiscale neural contextualization of multisite neuroimaging datasets. Nat. Methods 18, 698–700 (2021).
73. Nadeau, C. & Bengio, Y. Inference for the Generalization Error. Mach. Learn. 52, 239–281 (2003).
74. Bouckaert, R. R. & Frank, E. Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. in Advances in Knowledge Discovery and Data Mining (eds. Dai, H., Srikant, R. & Zhang, C.) 3–12 (Springer, Berlin, Heidelberg, 2004). doi:10.1007/978-3-540-24775-3_3.
75. Chyzhyk, D., Varoquaux, G., Milham, M. & Thirion, B. How to remove or control confounds in predictive models, with applications to brain biomarkers. GigaScience 11, giac014 (2022).
76. Smith, S. M. & Nichols, T. E. Statistical Challenges in “Big Data” Human Neuroimaging. Neuron 97, 263–268 (2018).
77. Spisak, T. Statistical quantification of confounding bias in machine learning models. GigaScience 11, giac082 (2022).