A Lightweight, End-to-End Explainable, and Generalized attention-based graph neural network to Classify Autism Spectrum Disorder using Meta-Connectivity
========================================================================================================================================================

* Km Bhavna
* Niniva Ghosh
* Romi Banerjee
* Dipanjan Roy

## 1 Abstract

Recent advances in Graph Neural Networks (GNNs) have been widely used to diagnose brain disorders such as autism spectrum disorder (ASD), which is associated with deficits in social communication and interaction and with restricted/repetitive behaviors. However, existing machine-learning/deep-learning (ML/DL) models suffer from low accuracy and limited explainability owing to their internal architecture and feature extraction techniques, which focus predominantly on node-centric features. As a result, performance on unseen data is moderate because edge-centric features are ignored. Here, we argue that meaningful features can be extracted by focusing on meta-connectivity between large-scale brain networks, which is an edge-centric, higher-order dynamic correlation in time. In the current study, we propose a novel explainable and generalized node-edge connectivity-based graph attention neural network (Ex-NEGAT) model that, for the first time, uses a node- and edge-centric feature set to classify ASD subjects from typically developing (TD) subjects on unseen data, and we predict their symptom severity scores. We used the ABIDE (I and II) datasets with a large sample size (total no. of samples = 1500). The framework employs meta-connectivity derived from the Theory-of-Mind (ToM), Default Mode Network (DMN), Central Executive Network (CEN), and Salience Network (SN), which measures dynamic functional connectivity (dFC) as a flow across morphing connectivity configurations. To generalize the Ex-NEGAT model, we trained it on ABIDE I (no. of samples = 840) and tested it on the ABIDE II (no.
of samples = 660) dataset, achieving 88% accuracy with an F1-score of 0.89. Additionally, we predicted symptom severity scores for each individual subject by passing the meta-connectivity links between the relevant brain networks to a Connectome-based Prediction Modelling (CPM) pipeline, which identifies the specific large-scale brain networks whose edge connectivity contributed positively and negatively to the prediction. Our approach accurately predicted ADOS-Total, ADOS-Social, ADOS-Communication, ADOS-Module, ADOS-STEREO, and FIQ scores.

Keywords

* Metaconnectivity
* Dynamic Functional Connectivity
* Autism
* Typical development
* Graph-attention neural network

## Introduction

Autism spectrum disorder (ASD) is a pervasive neurodevelopmental disorder that impairs an individual’s social behaviour and communication abilities [1, 2]. Individuals with ASD face several challenges in their day-to-day lives and often develop co-occurring conditions such as depression, anxiety, and ADHD, which may further complicate clinical diagnosis, particularly in younger children [3]. While no definitive treatment for autism is currently available, early detection may help alleviate the primary symptoms of autism by enabling appropriate interventions [3, 2]. Functional magnetic resonance imaging (fMRI) is an effective method that has provided a deeper understanding of the pathophysiology of ASD [4]. Previous fMRI studies [5, 6, 7] have found that autism primarily affects social cognition, interaction, and communication abilities, which are associated with alterations in specific functional brain networks such as the Theory of Mind (ToM), Default Mode Network (DMN), Central Executive Network (CEN), and Salience Network (SN).
Previous research on ASD classification and prediction has mainly employed statistical methods and models, specifically voxel-, ROI-, and network-level analyses, applying statistical tests such as the group-level independent t-test to assess statistical differences between autistic and typically developing (TD) groups [8, 9, 10]. These statistical techniques have also been used for feature selection to enhance ML/DL classification performance [11, 12, 13]. The current era, marked by the availability of big data and Artificial Intelligence (AI) frameworks, provides an unprecedented opportunity to leverage connectome-based features associated with ASD. Further, the recent availability of large neuroimaging datasets opens up excellent opportunities to develop data-driven DNN models for hypothesis-driven and exploratory studies [14]. Numerous previous studies [15, 16, 17] have also incorporated ML-based techniques for feature selection, such as the least absolute shrinkage and selection operator (LASSO) and its variations. ML/DL algorithms such as SVM, variational autoencoders, recurrent neural networks, and convolutional neural networks have achieved considerable performance in the computer-aided diagnosis of ASD [18, 19, 20, 21]. Beyond classical ML models, deep learning models such as Convolutional Neural Networks (CNNs) are an effective and scalable approach for distinguishing between ASD and TD without requiring manual feature engineering [22, 23]. Although CNNs operate well on grid-like input in Euclidean space, such as naturalistic images, they are probably unsuitable for non-Euclidean data such as brain imaging [24]. Neuronal activity in many distinct brain regions is known to be closely linked and to exhibit higher-order correlations involving large-scale functional brain networks during rest and task [23].
Consequently, the geometric distance between distinct brain areas in Euclidean space may not effectively reflect their functional distance. Geometric deep learning approaches, such as graph neural networks (GNNs), may circumvent this constraint and cope better with non-Euclidean data types like graphs [25, 26, 27]. Previous studies used GNNs and attention-based GNNs for ASD classification in two ways: either using node-centric features [28, 29], in which functional connectivity together with phenotypic features (e.g., age, gender, and ASD scores) was used to train the model at the group level, or using a graph-based feature set [30, 31], in which node-centralized local information (in other words, each sample was treated as a graph) was used to train the model. These GNN architectures have performed modestly owing to the following limitations. First, in many previous approaches the models focus mainly on static or node-centric features in spatial space, so the importance of edge-centric features depicting relationships between nodes is largely ignored [28, 32, 33]. Consequently, the contribution of such features to the heterogeneity of ASD has not been fully uncovered by GNN architectures. Second, owing to the high dimensionality of GNN features, significant symptomatic variation, and heterogeneity of data across sites, previous GNN models showed low reliability, robustness, and generalizability in identifying ASD. Lastly, the internal architecture of GNN models takes time to propagate information from one node to another, which increases the execution time needed to process data or, in other words, increases resource consumption. These limitations collectively contribute to overfitting of the classification model and lead to unsatisfactory identification performance on independent datasets [33].
Hence, there is a genuine need for an explainable GNN architecture that can utilize both edge- and node-centric features while remaining sufficiently generalizable. In this work, we designed an attention-based GNN framework to distinguish ASD from TD and to detect potential biomarkers related to ASD. Our work focuses on two fundamental research questions that can potentially improve the classification of ASD samples on unseen datasets. First, how can we identify a unique spatio-temporal feature set for resting-state fMRI data that can improve the classification of ASD? Previously [18, 21], time-averaged functional connectivity matrices were widely used to classify ASD, but these at times fell short of fully capturing the complexity of brain dynamics on unseen datasets; alternatively, feature engineering was used in which the model identified features from time-series data, which increased the time and space complexity of the model. This motivated us to identify an improved feature set that can retain the topological information of brain images by constructing individual graph data based on both node- and edge-centric features. Second, how can we propose a more explainable and generalized GNN model architecture that can achieve ideal classification? GNN models [25, 26, 27] generally suffer from the time required to propagate information from one node to another. This opens the way to a modified GNN architecture that could reduce execution time without losing any information about node-edge connectivity, while also capturing the topology of brain networks, which could improve the performance of the model by detecting potential biomarkers for ASD.
The contribution of the current study is two-fold: a) We hypothesized that higher-order correlations capturing node- and edge-centric temporal features between the ToM, DMN, CEN, and SN brain networks could be potential imaging biomarkers for classifying autistic from typically developing individuals without ad hoc feature engineering. To this end, we calculated meta-connectivity matrices for the ToM, DMN, CEN, and SN functional brain networks based on resting-state fMRI data for each individual. b) Next, we hypothesized that the higher-order correlation could faithfully track ongoing resting brain dynamics in large-scale brain networks but would also require significant computational power. Hence, we proposed a generalized Ex-NEGAT model that could reduce execution time as well as capture complex brain dynamics patterns by dividing the graphs into sub-graphs using the depth-first search (DFS) approach, with meta-connectivity matrices provided as input to train the model. We trained the proposed model on the ABIDE I dataset and, to make the model generalizable, tested it on the ABIDE II dataset. To validate our results, we performed leave-one-out and five-fold cross-validation. Additionally, to test whether meta-connectivity could be a reliable predictive biomarker, we predicted symptom severity scores for each individual participant using a connectome-based prediction modeling (CPM) approach. To the best of our knowledge, our approach based on the generalized Ex-NEGAT model surpasses previous attempts at using GNN models to classify ASD from TD, owing to the novel meta-connectivity-based higher-order feature. The proposed framework reduces execution time, which can lead to a reduction in computational resources; this helps overcome the challenges mentioned earlier and improves on existing approaches by using attention maps derived from the framework to search for and explain potential imaging biomarkers for ASD (refer to Figure 1).
![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2024.07.17.24310610/F1.medium.gif)

[Figure 1:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F1)

Figure 1: Illustrative overview of the light-weight, generalized proposed Ex-NEGAT framework for classifying ASD from TD using meta-connectivity as an edge-centric feature set. Step 1 involves BOLD time-series signal extraction and calculation of meta-connectivity matrices; Step 2 shows the details of the proposed NEGAT architecture used to perform classification.

## 2 Materials and Methods

### 2.1 fMRI Dataset and Preprocessing

In this study, we used anatomical and resting-state functional neuroimaging data from the Autism Brain Imaging Data Exchange (ABIDE I and ABIDE II) datasets [14]. The functional scan parameters listed per site were used for further preprocessing. ABIDE I contains preprocessed images, which were used for the study. The data acquired from ABIDE II were further preprocessed in line with the pipeline used to preprocess the ABIDE I dataset. The data were preprocessed using the DPARSF (Data Processing Assistant for Resting-State fMRI) toolbox, which is compatible with MATLAB. The first 4 time points were removed to account for the time taken by the subject to settle down. The reference slice for slice timing correction was set to the slice acquired in the middle of the slice order. The Friston-24 model was employed for head motion correction, and timeframes with head motion above 0.2 mm, along with one previous and two following timeframes, were added as regressors. Nuisance regression was done using SPM a priori masks, and global signal regression was not employed as it is a contentious preprocessing measure. A bandpass filter of 0.01 to 0.1 Hz was applied, and the data were normalized using DARTEL. Smoothing was performed with a FWHM kernel size of 6 mm.
Scrubbing was done for all timeframes with head motion above 0.5 mm, together with one previous and two following timeframes (after preprocessing: total no. of subjects = 1500, comprising 711 ASD and 789 TD individuals) (refer to Tables 1 and 2). The configuration for computational processing used in this study was as follows: GPUs: 16X NVIDIA Tesla V100, NVIDIA CUDA Cores: 81920, NVIDIA Tensor Cores: 10240.

[Table 1:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T1)

Table 1: Demographic information of ASD and TD groups from ABIDE dataset

[Table 2:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T2)

Table 2: Brain regions with MNI coordinates that are associated with the Theory-of-Mind, Default-mode-Network, Central Executive, and Salience Networks

### 2.2 Calculation of Meta-Connectivity Feature-set

#### 2.2.1 Calculation of Functional Connectivity and Dynamic Functional Connectivity Stream

We extracted time-series signals from the ToM, DMN, CEN, and SN brain networks to calculate meta-connectivity (refer to Table 2). Based on previous literature, we considered 25 regions of interest (ROIs) [34, 35, 36, 37]. Using MNI coordinates, we created a spherical binary mask with a 10 mm radius for each selected ROI and extracted the time-series signals. We computed functional connectivity for each individual using eq (1) [38]:

![Formula][1]

where $TS_i(t)$ and $TS_j(t)$ are the time-series signals from the $i$th and $j$th ROIs at time $t$, ![Graphic][2] indicates the average functional connectivity between ROI $i$ and ROI $j$, and $t$ represents a specific time point within the total time period $T$. The pairwise Pearson correlation FC matrices were useful for establishing group-level differences.
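As a minimal sketch of eq (1), assuming it is the standard Pearson correlation between ROI time series, the FC matrix can be computed as shown below. The function name `functional_connectivity`, the array layout (time points × ROIs), and the synthetic random signals are illustrative assumptions and are not taken from the study's code.

```python
import numpy as np

def functional_connectivity(ts):
    """Pairwise Pearson correlation between ROI time series.

    ts: array of shape (T, N) holding T time points for N ROIs.
    Returns an (N, N) symmetric matrix with ones on the diagonal.
    """
    ts = np.asarray(ts, dtype=float)
    # z-score each ROI signal over time (population standard deviation)
    z = (ts - ts.mean(axis=0)) / ts.std(axis=0)
    # the time-averaged product of z-scores is the Pearson correlation
    return (z.T @ z) / ts.shape[0]

# Synthetic stand-in for the 25 ROI signals of one subject (assumption)
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 25))
fc = functional_connectivity(ts)
```

For 25 ROIs this yields a 25 × 25 matrix; in practice `ts` would hold the BOLD signals extracted from the spherical ROI masks.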
To track changes in functional connectivity patterns over time, we calculated dynamic functional connectivity (dFC) using a sliding-window approach that computes FC matrices averaged over successive time windows. Following the previous study [38], we denote this the dFC stream. We calculated each temporal frame of the dFC stream for each individual [38]:

![Formula][3]

where $t_k$ indicates the start time of the $k$th temporal frame and $w$ represents the fixed window length. We chose $w$ = 20 to 30 s, according to the literature [39]. The sliding step $\Delta\tau$ separates the frame start times, such that $t_k = t_{k-1} + \Delta\tau = k\Delta\tau$.

#### 2.2.2 Meta-Connectivity Analysis

Each dFC stream was considered as a collection of time series that define the time dependency of the individual pairwise FC couplings. An $M \times M$ matrix of correlations between the time-dependent strengths of $M = N(N-1)$ FC links ($N^2$ pairs of regions, minus the self-loops) was produced by extending the calculation of the FC matrix from regional nodes to inter-regional links [38]. The resulting Meta-Connectivity (MC) matrix expresses the inter-link covariance, analogous to how the FC matrix describes the inter-node covariance (refer to Figure 2).

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2024.07.17.24310610/F2.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F2)

Figure 2: Calculation of static functional connectivity, dynamic functional connectivity, and meta-connectivity. In the traditional approach **(A)**, neural activity is averaged over time to create a functional connectivity (FC) matrix. We calculated a dynamic FC stream using shorter sliding windows **(B)**, leading to a dynamic FC matrix. In **(C)**, we considered each FC link as a dynamic variable, giving rise to a Meta-Connectivity (MC) matrix by capturing covariances between these dynamic links.
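The dFC stream and the inter-link correlation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the names `dfc_stream` and `meta_connectivity` are hypothetical, the window and step are expressed in samples rather than seconds (the conversion depends on each site's TR), the input is synthetic, and links are enumerated as directed pairs so that the count matches $M = N(N-1)$.

```python
import numpy as np

def dfc_stream(ts, window, step):
    """Sliding-window FC stream.

    ts: (T, N) array of ROI time series.
    window, step: window length and sliding step, in samples (assumption).
    Returns an (F, M) array with M = N * (N - 1); each row holds the
    off-diagonal FC values of one temporal frame.
    """
    T, N = ts.shape
    off_diag = ~np.eye(N, dtype=bool)          # drop self-loops (i, i)
    starts = range(0, T - window + 1, step)    # frame start indices t_k
    frames = [np.corrcoef(ts[s:s + window].T)[off_diag] for s in starts]
    return np.array(frames)

def meta_connectivity(stream):
    """Correlation between the time courses of FC links.

    stream: (F, M) dFC stream. Returns an (M, M) matrix.
    """
    return np.corrcoef(stream.T)

rng = np.random.default_rng(0)
ts = rng.standard_normal((150, 25))            # synthetic 25-ROI signals
stream = dfc_stream(ts, window=15, step=1)
mc = meta_connectivity(stream)
```

With 25 ROIs this gives $M = 25 \times 24 = 600$ links and hence a 600 × 600 matrix. Because FC is symmetric, the links $(i, j)$ and $(j, i)$ carry identical time courses in this sketch.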
![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2024.07.17.24310610/F3.medium.gif)

[Figure 3:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F3)

Figure 3: **A) and B)** depict classification results using meta-connectivity with window size = 30 sec. **C) and D)** depict classification results using meta-connectivity with window size = 20 sec. **E) and F)** depict classification results using functional connectivity. **G) and H)** show classification results using Dynamic Functional Connectivity.

To calculate the MC matrices, we used the dFC stream and extracted $n = N(N-1)/2$ time series of pairwise FC couplings, taking as input $FC_{ij}(t)$ for all pairs of regions $i$ and $j$ with $i < j \le N$, using equation (3) [38]:

![Formula][4]

These $MC_{ij,kl}$ entries are arranged in matrix format, making it possible to identify directly the pair of links that contributes to each meta-link. Consequently, the MC matrices had dimension $M \times M$, where the rows correspond to different directed pairings of regions (that is, both the pair $(i, j)$ and the pair $(j, i)$ were included), and only links corresponding to self-loops, i.e., of the type $(i, i)$, were excluded [38]. The resulting representation of the MC matrices was large since:

![Formula][5]

Compared with FC matrices, MC matrices captured higher-order correlations between triplets or quadruplets of ROIs.

### 2.3 Node-Edge Connectivity Based Graph Attention Network for Classification

MC matrices are a higher-order correlation feature set that describes the edge connectivity between nodes. We propose a generalized Ex-NEGAT model that uses an attention matrix together with a sub-graph creation approach, which helps reduce the time needed to propagate information between nodes while capturing the representation of the MC matrices. We provided the MC matrices as input to the proposed model.
We created a custom plugin that can parse the structure files of these MC matrices. The plugin extracted important characteristics, such as the computational graph topology and certain node-edge features, by partitioning the matrices into multiple subgraphs. This subgraph-splitting approach reduced computational complexity while providing a rich context for the model. The graph representation showed that a node's activity might strongly affect its nearest neighbors because of their intrinsic interconnection. However, including all neighbors for each node, particularly those at greater distances, was computationally expensive and generally not required. We therefore propose a methodology that uses a Depth-First Search (DFS) approach to optimize the subgraph depth parameter $d$. We used DFS rather than manually selecting the value of $d$ because it explores the graph by traversing one branch as far as possible before backtracking, and because it allows dynamic adaptation to the local structure of the graph.

#### 2.3.1 Optimization of Subgraph Depth using DFS

To optimize the subgraph depth parameter $d$, we employ a Depth-First Search (DFS) algorithm to systematically explore the graph $G(V, E)$ and select a subgraph $G'(V', E')$ for each node $n$ based on a maximum depth of $d$. The subgraph $G'$ is constructed to include node $n$ and its neighboring nodes up to a maximum depth of $d$. Here, we selected a maximum depth of $d = 5$. Formally, let $G(V, E)$ represent the input graph, where $V$ is the set of vertices and $E$ is the set of edges. For each vertex $n \in V$, we define $G'(V', E')$ as the subgraph containing $n$ and its neighbors within a depth limit of $d$. The vertex set $V'$ of $G'$ consists of all nodes within this specified distance, while the edge set $E'$ comprises all edges connecting these nodes. The process begins by iterating over each vertex $n \in V$ in the graph $G(V, E)$.
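The subgraph definition above can be illustrated with a depth-limited DFS. This is a sketch under assumptions, not the authors' plugin: the function name `depth_limited_subgraph`, the adjacency-list input, and the toy graph are hypothetical, and how the dense MC matrix is turned into a graph is not specified here. The sketch re-expands a node when it is reached again at a smaller depth, so that $V'$ contains every node whose distance from $n$ is at most $d$.

```python
def depth_limited_subgraph(adj, root, d):
    """Return (V', E') for the subgraph around `root` up to depth `d`.

    adj: dict mapping each node to an iterable of its neighbors
         (undirected graph, integer node ids; an assumption of this sketch).
    """
    best = {root: 0}                 # smallest depth at which a node was reached
    stack = [(root, 0)]              # explicit stack gives depth-first order
    while stack:
        node, depth = stack.pop()
        if depth > best[node] or depth == d:
            continue                 # stale entry, or depth limit reached
        for nb in adj[node]:
            if nb not in best or depth + 1 < best[nb]:
                best[nb] = depth + 1
                stack.append((nb, depth + 1))
    nodes = set(best)
    # keep every edge whose two endpoints both lie inside V'
    edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
    return nodes, edges

# Toy graph (assumption): a path 0-1-2-3-4-5 with a shortcut edge 0-3
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0, 4], 4: [3, 5], 5: [4]}
G_set = {n: depth_limited_subgraph(adj, n, d=2) for n in adj}
```

Here `G_set[0]` contains nodes 0 to 4 but not node 5, which lies three hops from node 0.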
The DFS algorithm explores neighboring nodes up to a maximum depth of $d$ at each iteration. If the distance from node $n$ to a neighboring node $v$ is less than or equal to $d$, $v$ is added to the vertex set $V'$ of the subgraph $G'$. Similarly, all edges connecting $n$ and its neighbors within the specified distance are added to the edge set $E'$ of $G'$. By systematically traversing the graph using DFS and constructing a subgraph $G'(V', E')$ for each node $n$, we create a series of subgraphs $G_{set}$ that capture the local structure and connectivity patterns around each node within the specified depth limit $d$. Within each subgraph, the attention matrix is applied to assign weights to the edges based on their relevance, which is learned during the training phase. This allows the Ex-NEGAT model to focus on the most significant connections, enhancing the interpretability and performance of the network. The attention mechanism ensures that the most informative relationships are given priority, thus reducing noise from less relevant nodes and edges.

#### 2.3.2 Model-Architecture

Traditional graph neural networks consider only node features. In the current work, we propose an attention mechanism for each edge that can account for its multidimensionality and complexity. MC matrices are a higher-order correlation feature set that gives a correlation value for each edge. We hypothesized that the correlation reflected by the edge weight might not represent the attention required for the connected nodes. To test our hypothesis, we applied a linear transformation to both the node feature set $h_i$ and the edge feature set $e_{ij}$, extracting a high-level associated feature set by generating a new feature representation:

![Formula][6]

![Formula][7]

where $W_{node}$ and $W_{edge}$ denote weight matrices.
The transformed node and edge features were then used to calculate separate attention coefficients. The attention coefficient for node $i$ was computed using equation 7:

![Formula][8]

where ![Graphic][9] and ![Graphic][10] represent learnable parameters for the source and destination nodes, and $N(i)$ denotes the neighbours of node $i$. Similarly, we calculated the attention coefficient for edge $e_{ij}$ using equation 10:

![Formula][11]

where ![Graphic][12] and ![Graphic][13] represent learnable parameters for the edge features. We then updated the node and edge features according to the attention coefficients:

![Formula][14]

![Formula][15]

Lastly, we concatenated the node and edge features and applied a linear layer to reduce dimensionality:

![Formula][16]

where $W_{final}$ is a weight matrix and ![Graphic][17] are the updated node and edge features. Finally, we used a multi-layer perceptron (MLP) to make predictions for the individual graphs of ASD and TD subjects based on the final feature map ![Graphic][18]. The MLP consisted of two linear layers, with dropout = 0.2 and a ReLU function following the first linear layer, to improve the robustness of the proposed model.

#### 2.3.3 Training and Testing Process

We trained our model in different ways, in each of which the learning rate was set to 0.0001, the batch size to 64, and the number of epochs to 100. The L2 regularization parameter was set to 0.0001 for all the linear layers to prevent overfitting. We used the Adam optimizer with cross-entropy as the loss function.

Algorithm 1 Pseudo-code to implement NEGAT

![Figure4](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2024.07.17.24310610/F4.medium.gif)

[Figure4](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F4)

In addition, to further guard against overfitting, we used label smoothing, as illustrated in equation (12) [40]:

![Formula][19]

where $K = 2$ indicates the no.
of classes, $y_k \in (0, 1)$ is the true label, ![Graphic][20] is the corresponding smoothed label, and $\alpha = 0.1$. The loss function was calculated using equation (13):

![Formula][21]

where $W$ is the weight of the linear layer, $y$ is the real label (after label smoothing), $p$ is the predicted label, and $p_k$ is the predicted probability of class $k$.

#### 2.3.4 Classification Evaluation

The performance of the proposed model was tested in 3 different ways to make it more robust, explainable, and generalizable: a) we trained the model on the ABIDE I dataset and tested it on the ABIDE II dataset; b) we performed 5-fold cross-validation across all subjects; c) we used a leave-one-out approach in which we trained the model on the complete ABIDE I+II data except one site and tested on that site's data as an independent dataset. We repeated this process for all sites in the ABIDE I and II datasets. We also trained different types of GNN models, namely graph convolutional network (GCN) [41], Chebyshev Convolution [42], Residual GCN (ResGCN) [43], and GraphSAGE [44], to cross-check the proposed model’s performance using MC matrices. To validate the novelty of the proposed meta-connectivity feature set, we also performed classification using FC and dFC matrices.

### 2.4 Symptom-Severity score prediction for ASD Samples

We examined the association between neurobiological features, i.e., meta-connectivity matrices, and symptom severity scores. We investigated whether resting-state meta-connectivity could predict ASD symptom severity scores for each individual. For this purpose, we used Connectome-based Prediction Modelling (CPM), in which each participant’s meta-connectivity matrix and behavioral scores were passed to the CPM model [45]. Previously, authors have used the CPM approach to predict behavioral scores from node-centric features such as functional connectivity [45, 7]. To our knowledge, no existing framework implements the CPM approach using an edge-centric feature, i.e., meta-connectivity matrices.
The initial step involves splitting the input data into training and testing sets. In this study, we trained the model on ABIDE I ASD samples and tested it on ABIDE II ASD samples. Subsequently, across all participants in the training set, each connection in the connectivity matrices is related to the behavioral measure using a measure of association such as Pearson’s correlation, Spearman’s correlation, or robust regression. After this step, the most significant connections were singled out for further analysis. Prominent connections were selected through statistical significance testing, i.e., we chose connections with correlation values exceeding a predefined threshold (0.4 with p < 0.05). For each participant, the selected connections are consolidated into a single summary value, commonly by summing the strengths of the connections. Next, a predictive model is constructed, assuming a linear relationship between the summary value of the connectivity data (independent variable) and the behavioral variable (dependent variable). Finally, summary values are computed for each participant in the testing set, and these values are input into the predictive model.

## 3 Results

### 3.1 Computation of Meta-Connectivity Matrices

In the current study, we extracted time-series signals from the ToM, DMN, CEN, and SN brain networks and calculated functional connectivity matrices for each individual. The FC matrices are symmetric, so despite their $N \times N$ size they have only $L = N(N-1)/2$ unique entries, corresponding to the lower triangular part of the matrix. As output, the FC matrix appears as an $N \times N$ matrix. We also calculated the dFC stream, which is output as a 3D tensor of size $N \times N \times F$, where $F$ is the no. of frames. We chose window sizes of 20 and 30 sec as per the literature [39].
However, each FC matrix was stored in the ‘vector’ format, which outputs a vector of size $L \times 1$, and the dFC stream yielded a 2D matrix of size $L \times F$, with each frame in vector format. The ‘vector’ format is advantageous because it produces more memory-efficient output, which is essential for large datasets. It also naturally represents FCs as points in a vector space, making it easier to create dimensionally reduced representations of the dFC stream using methods like t-distributed stochastic neighbor embedding (t-SNE). After that, we calculated MC matrices of size 600 × 600 for each individual. To validate the results, we performed multiple one-sample t-tests with a p-value < 0.01 and applied FDR correction.

### 3.2 Classification of ASD and TD Individuals using Proposed Ex-NEGAT Model

#### 3.2.1 Performance of Proposed Method using Meta-connectivity Matrices

In the current study, we proposed an attention-based graph neural network in which we divided the graph into sub-graphs up to a maximum depth $d$ using the DFS approach, which helped to identify a suitable value for the depth $d$. Initially, we chose a maximum value of $d = 5$, but we observed that $d = 3$ gave the best results, i.e., high accuracy with less execution time, as reported in Table 4. We therefore report all results using $d = 3$. To classify ASD samples from neurotypical samples, we passed resting-state MC matrices of size 600 × 600 to the proposed Ex-NEGAT model. We performed the complete analysis using window sizes of 20 and 30 sec. First, we performed standard training and testing on the ABIDE I dataset and achieved an average accuracy of 92% with an F1-score of 0.94 for a window size of 20 sec and an average accuracy of 91% with an F1-score of 0.90 for a window size of 30 sec. To make the model generalizable, we trained the model on the ABIDE I dataset and tested it on the ABIDE II dataset.
We achieved 88% average accuracy with an F1-score of 0.89 for a window size of 20 sec and 83% average accuracy with an F1-score of 0.85 for a window size of 30 sec. We thus obtained better results using a window size of 20 sec (refer to Figure 4 and Table 4).

![Figure 4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/07/18/2024.07.17.24310610/F5.medium.gif)

[Figure 4:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F5)

Figure 4: Performance of the proposed model using the Leave-one-out approach.

The results suggest that the proposed model performs better when the proposed meta-connectivity is used as the feature set.

#### 3.2.2 Comparison of Performance using Meta Connectivity and Traditional Feature sets like Static and Dynamic Functional Connectivity Matrices

An extensive comparative analysis was conducted to assess the proposed model’s performance with the novel meta-connectivity feature set against conventional feature sets. This entailed running the classification tasks using both static and dynamic feature sets. The Ex-NEGAT model was trained using functional connectivity (FC) matrices derived from ToM, DMN, CEN, and SN on the ABIDE I dataset, with subsequent testing on the ABIDE II dataset. We achieved an average accuracy of 63% with an F1-score of 0.62 using FC matrices on unseen data. We then trained the Ex-NEGAT model using the dynamic functional connectivity (dFC) of each individual in the ABIDE I dataset and tested it on the dFC of the ABIDE II dataset. We achieved an average accuracy of 70% with an F1-score of 0.71 for a window size of 30 sec and an average accuracy of 74% with an F1-score of 0.73 for a window size of 20 sec. Again, the window size of 20 sec gave better results. Performance improved when dFC matrices rather than FC matrices were used as the feature set (refer to Table 4).
Taken together, the FC and dFC results underscore the limitations of conventional feature sets in classifying ASD samples on unseen data. In contrast, the MC feature set, which emphasizes higher-order correlations capturing edge strength variations over time for each individual, performed markedly better. It successfully distinguished the ASD population in the context of specific brain networks when applied to unseen data. This supports the efficacy of the proposed framework in leveraging MC matrices as a feature set for specific brain networks and points to a promising avenue for further exploration.

#### 3.2.3 Cross-validation using Five-fold Cross-validation and Leave-one-out Approaches

We evaluated the classification framework with two distinct approaches to mitigate overfitting and to obtain a robust and generalized assessment of its classification performance. The first approach was k-fold cross-validation across all subjects, with k = 5 in this study. The second approach was the more stringent Leave-one-out approach, in which one site’s data is designated as an independent testing dataset while the data from all other sites collectively constitute the training dataset. Using five-fold cross-validation, we achieved an average accuracy of 81% with an F1-score of 0.78 for a window size of 30 sec and an average accuracy of 82% with an F1-score of 0.83 for a window size of 20 sec. Using the Leave-one-out method, we achieved an average accuracy of 78% with an F1-score of 0.79 for a window size of 30 sec and an average accuracy of 85% with an F1-score of 0.84 for a window size of 20 sec. We observed that the leave-one-out approach, in which each site’s data was treated as unseen data for testing, gave better-generalized results (refer to Table 6 and Figure 4).
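The two splitting schemes described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the study: the helper names `leave_one_site_out` and `k_fold`, the shuffling seed, and the site labels are hypothetical placeholders.

```python
import numpy as np

def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx); each site is the test set once."""
    sites = np.asarray(sites)
    for site in np.unique(sites):
        test_mask = sites == site
        yield site, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

def k_fold(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) for shuffled k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

# Hypothetical site label for each of 8 subjects (placeholders, not real ABIDE sites).
sites = ["site_A", "site_A", "site_B", "site_B", "site_B",
         "site_C", "site_C", "site_D"]
site_splits = list(leave_one_site_out(sites))   # one split per site
fold_splits = list(k_fold(len(sites), k=5))     # five folds over all subjects
```

In the leave-one-site-out scheme no subject from the held-out site appears in training, which is why it is the stricter test of cross-site generalization.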
### 3.3 Prediction of Symptom Severity Score

To assess the robustness of MC matrices as a higher-order correlation feature set, we predicted symptom severity scores for each ASD sample using the CPM approach. We used the MC matrices of the ABIDE I dataset to train the model and tested it on unseen data, i.e., the MC matrices of the ABIDE II dataset. Our proposed model gave an average accuracy of 85% with an F1-score of 0.86. We also performed validation using a 5-fold cross-validation approach. Our model predicted ADOS-MODULE, ADOS-STEREO, ADOS-COMMUNICATION, FIQ, and ADOS-TOTAL scores more accurately, whereas it did not give accurate results for ADOS-SOCIAL scores. We also predicted symptom severity scores using FC matrices, which gave an average accuracy of 75% with an F1-score of 0.76. Our approach is novel in that no framework in the literature has identified higher-order correlation (MC matrices) as a predictive measure for symptom severity scores (refer to Figure 5). Even on large testing data, the proposed framework predicted symptom severity scores to a considerable extent.

[Figure 5:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/F6) Prediction of symptom severity scores (ADOS-Total, ADOS-Social, ADOS-Module, ADOS-Communication, ADOS-Stereo, and FIQ scores) using Connectome-based prediction modeling in which meta-connectivity was used as the feature set. **A)** shows results for brain regions whose edge connectivity contributed positively to the prediction, whereas **B)** depicts results for brain regions whose edge connectivity contributed negatively to the prediction.
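To make the CPM step more concrete, the sketch below follows the general recipe of connectome-based predictive modeling [45]: correlate each edge with the behavioural score across training subjects, keep the positively and negatively related edges, sum their strengths per subject, and fit a linear model on that summary. It is a simplified illustration and not the pipeline used in the study. In particular, edges are selected with a threshold on the correlation coefficient (`r_thresh`, an assumed value) instead of a p-value, and the data are synthetic.

```python
import numpy as np

def edge_behaviour_corr(edges, scores):
    """Pearson r between each edge (column) and the score across subjects."""
    e = edges - edges.mean(axis=0)
    s = scores - scores.mean()
    denom = np.sqrt((e ** 2).sum(axis=0) * (s ** 2).sum())
    return (e * s[:, None]).sum(axis=0) / denom

def fit_cpm(edges, scores, r_thresh=0.3):
    """Select positive/negative edge sets and fit one linear model per set."""
    r = edge_behaviour_corr(edges, scores)
    model = {"pos_mask": r > r_thresh, "neg_mask": r < -r_thresh}
    for name in ("pos", "neg"):
        strength = edges[:, model[f"{name}_mask"]].sum(axis=1)
        model[name] = np.polyfit(strength, scores, deg=1)  # (slope, intercept)
    return model

def predict_cpm(model, edges, network="pos"):
    """Predict scores from the summed strength of the chosen edge set."""
    strength = edges[:, model[f"{network}_mask"]].sum(axis=1)
    slope, intercept = model[network]
    return slope * strength + intercept

# Synthetic data (assumption): 80 subjects, 40 edges; edges 0-4 rise with the
# score, edges 5-9 fall with it, and the remaining edges are pure noise.
rng = np.random.default_rng(0)
scores = rng.standard_normal(80)
edges = rng.standard_normal((80, 40))
edges[:, :5] += scores[:, None]
edges[:, 5:10] -= scores[:, None]

train, test = slice(0, 60), slice(60, 80)
model = fit_cpm(edges[train], scores[train])
pred_pos = predict_cpm(model, edges[test], network="pos")
pred_neg = predict_cpm(model, edges[test], network="neg")
test_scores = scores[test]
```

The two masks play the role of the positively and negatively contributing edge sets shown in panels A and B of Figure 5; in this sketch, prediction quality would be judged by the correlation between predicted and observed scores on the held-out subjects.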
## 4 Discussion

We developed a novel explainable and generalized Node-edge connectivity-based graph attention neural network (Ex-NEGAT) model that captures the complex patterns exhibited by dynamic functional connectivity in the resting state during atypical neurodevelopment, patterns that are considered a correlate of cognitive processing. In this work, we gain an important insight into the time-varying nature of brain networks during early developmental changes by conceptualizing dFC as a flow across morphing connectivity configurations. Our notion of dFC speed quantifies the rate at which FC networks evolve in time, which serves as an important imaging biomarker for accurate classification between ASD and TD and for predicting symptom severity scores at the individual subject level. Here, we probe the hypothesis that variations of resting-state dFC flow characterized by meta-connectivity are selectively interrelated within specific functional subnetworks (ToM, DMN, CEN, and SN) and associated with the deficits in social cognition, communication, and interaction abilities frequently reported in the extant literature [1, 2].

### 4.1 Advantages of using meta-connectivity feature to discover new ASD biomarkers

A growing body of recent work, including previous work from our group, indicates that aberrations in Autism Spectrum Disorder (ASD) are frequently associated with functional connectivity (FC) and dynamic functional connectivity (dFC) between salient brain regions [46, 47, 39, 38]. Many similar studies using feature selection based on graph data have been carried out for other mental disorders such as schizophrenia and Attention Deficit Hyperactivity Disorder (ADHD) [48, 49, 50].
According to the existing literature, large-scale brain networks anchored in the ToM, DMN, CEN, and SN, together with their connectivity profiles based on structural and functional neuroimaging, index atypical development in autistic individuals and serve as crucial biomarkers of brain graphs [51, 52, 53, 54]. However, the brain’s functional network constitutes a complex spatio-temporal network structure that entails both connectivity and dynamics. The dynamics of connectivity switching, driven by various internal states, contribute importantly to neural flexibility and to the mutual influence observed over time between graph edges, i.e., time-dependent functional links [55, 56, 57]. Hence, it is highly probable that whole-brain meta-connectivity between these subnetworks may also exhibit mutual influences, a notion that has very rarely been addressed in research. Identifying new disease-related biomarkers is of significant importance for diagnosis and treatment. Meta-connectivity describes resting-state dFC as a smooth flow across continually morphing connectivity configurations by characterizing dFC streams. Analyzing meta-connectivity interactions and their relationship with mental disorders may therefore offer a novel perspective for discovering biomarkers (refer to Table 2).

### 4.2 Advantages of using MC matrix and attention-based GNN

fMRI data possess both temporal and spatial properties, and many recent works analyze and process the data based on their temporal and spatial features separately. One frequently employed way to apply traditional machine learning algorithms or artificial neural networks (ANN) to graph data is to ignore the structure of the graph and treat the input edge weights as a vector of features [58].
However, the caveat with many of these approaches is that they ignore the crucial topological and statistical relationships among functional brain modules and large-scale brain networks, which are important for the emergent properties of brain function and for constraining dynamics [59]. Another recent model introduced a Spatial-Temporal Attention Graph Convolutional Network (STAGCN) for the classification of functional connectivity [60]. In the spatial domain, this model utilizes attention-enhanced graph convolutional networks to process the topological features of brain regions. In the temporal domain, it employs a multi-head self-attention method to capture the temporal relationships between different dynamic functional connectivity (dFC) frames [60, 23]. Although many recent attempts leverage the temporal and spatial characteristics of fMRI data, the accuracy of ASD diagnosis still falls well short of the expected classification accuracy [60, 61]. An alternative approach is to treat the adjacency matrix of the graph as an image and extract features using traditional CNN models. However, it has been shown that spatial proximity between adjacency matrix elements does not necessarily correspond to topological locality in the graph data [59, 62]. In this work, we departed from the previous approaches and used a node-edge centric spatio-temporal meta-connectivity feature to train a novel generalized Ex-NEGAT model. This MC meta-module may at certain times be coordinated (when the co-fluctuating links are in a large-strength transient) and thus form an *FC*(*t*) module in the conventional sense. The link sets that form different meta-modules fluctuate toward large or weak strengths at different, independent times.
Importantly, this transient property of the links and their co-fluctuations at different time windows seems to be a key feature that plays an important role in the diagnosis of brain diseases, as demonstrated in this work (Figure 4 and Table 4). Compared with previously employed methods, we create an attention-based network by applying a sub-graph creation approach, which reduces computational time and improves the interpretability of our model based on meta-links between relevant brain networks. Our attention-based GNN makes the classification model more transparent and enhances the possibility of detecting potential spatio-temporal imaging biomarkers of ASD. Moreover, instead of the convolution-based method used earlier on FC [61], the subgraph creation approach of the GNN can aggregate spatio-temporal information and extract node-edge centric features from non-Euclidean graph data according to graph topological properties [63]. It can generate more subgraph-specific information based on meta-modules, which is beneficial to the autism classification task. In the next subsection, we discuss in detail how this approach also significantly reduces computational time.

### 4.3 Advantages of using MC subgraphs to reduce computational time

To classify ASD and TD samples, we employed meta-connectivity matrices as higher-order correlation features. Additionally, static and dynamic FC measures were calculated for each individual subject to compare the performance of all three connectivity features in accurately classifying ASD and TD. Our findings revealed hypo-connectivity within edges for the ASD group and, in contrast, hyper-connectivity within edges for the TD group (see Table 7). Meta-connectivity matrices from the ABIDE I dataset were used to train the Ex-NEGAT model, with testing conducted on the ABIDE II dataset.
Further, our proposed generalized Ex-NEGAT model converted MC matrices into graphs and then divided them into subgraphs with maximum depth *d* using the DFS approach. To determine the optimal depth parameter *d*, we initially explored up to *d* = 5, corresponding to traversing sub-graphs up to the fifth-order neighbor. However, increasing *d* beyond 5 escalated execution time without substantial improvement. Traversing only to the first- and second-order neighbors (*d* = 1 or 2) reduced execution time but yielded less accurate predictions. Optimal prediction performance was observed at *d* = 3, with reasonable execution time. Further exploration of the fourth- and fifth-order neighbors showed minimal improvement in prediction accuracy at the cost of increased execution time. Our analysis using Depth-First Search (DFS) therefore determined that traversing up to the third-order neighbor (*d* = 3) offered the best model performance (see Table 4). The proposed framework is unique for the following reasons: a) it uses a novel feature set that has not previously been employed in classification, b) it has a unique attention-based GNN architecture based on subgraph creation, and c) it reduces the time needed to pass information from one node to another, which helps reduce computational resources and time complexity. To make our model generalizable, we trained it on the ABIDE I dataset and tested it on the ABIDE II dataset (refer to Table 3).

[Table 3:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T3) Performance of proposed Ex-NEGAT model using FC, dFC, and MC matrices as feature set with *d* = 3, where W.s represents window size.

[Table 4:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T4) Performance of proposed model using different values of depth *d*. The table reports Accuracy, F1-Score, Precision, Recall, and Execution time in seconds.
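The depth-limited subgraph extraction behind the comparison in Table 4 can be illustrated with a short sketch. It is only an assumption about how such a step might look, since the exact subgraph construction is not spelled out here: the helpers `adjacency_from_weights` and `depth_limited_dfs`, the 0.5 threshold, and the toy weight matrix are all hypothetical.

```python
def adjacency_from_weights(weights, threshold):
    """Link nodes i and j when |weights[i][j]| exceeds the threshold."""
    n = len(weights)
    return {
        i: [j for j in range(n) if j != i and abs(weights[i][j]) > threshold]
        for i in range(n)
    }

def depth_limited_dfs(adjacency, root, max_depth):
    """Return {node: depth} for every node within max_depth hops of root."""
    depth_of = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node in depth_of and depth_of[node] <= depth:
            continue  # already reached by a path that is at least as short
        depth_of[node] = depth
        if depth < max_depth:
            for neighbour in adjacency.get(node, ()):
                stack.append((neighbour, depth + 1))
    return depth_of

# Toy symmetric weight matrix (assumption) standing in for a small MC matrix.
weights = [
    [1.0, 0.8, 0.1, 0.7, 0.1],
    [0.8, 1.0, 0.9, 0.1, 0.1],
    [0.1, 0.9, 1.0, 0.6, 0.1],
    [0.7, 0.1, 0.6, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 1.0],
]
graph = adjacency_from_weights(weights, threshold=0.5)
subgraph = depth_limited_dfs(graph, root=0, max_depth=2)
```

A node is revisited when the search reaches it again by a shorter path, so the result does not depend on the order in which neighbours are explored. Larger values of *d* pull more nodes into each subgraph, which fits the trade-off between accuracy and execution time described above.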
[Table 5:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T5) Performance comparison of multiple models, including the proposed model, in the classification of ASD from neuro-typical samples using MC matrices as feature set.

[Table 6:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T6) Validation of performance of the proposed Ex-NEGAT model using the Leave-one-out approach.

[Table 7:](http://medrxiv.org/content/early/2024/07/18/2024.07.17.24310610/T7) Top 10 pairs of nodes whose edge connectivity contributed most to the prediction, with edge connectivity strength for the ASD and TD groups.

### 4.4 Predicting symptom severity in ASD using MC matrices and identifying relevant ROIs

To check the robustness of MC matrices as a higher-order correlation feature set, we predicted symptom severity scores using the CPM approach with meta-connectivity matrices as input. Our approach demonstrated that resting-state meta-connectivity between the ToM, DMN, CEN, and SN sub-networks could accurately classify ASD samples from TD samples on unseen datasets, and that the same feature set could also predict symptom severity scores without requiring any additional feature engineering (refer to Figure 1). In previous works [64, 65], the authors tried to predict symptom severity using functional connectivity between ROIs, which relies on spatial features of fMRI data, but no proposed framework based on an attention-based GNN has predicted symptom severity and diagnostic scores using meta-connectivity spatio-temporal features in functional neuroimaging. We trained a CPM model on MC matrices derived from ABIDE I ASD samples and tested it on ABIDE II samples. Our model accurately predicted ADOS-Total, ADOS-Social, ADOS-Module, ADOS-Communication, ADOS-Stereo, and FIQ scores.
We also identified ROIs whose meta-connectivity edges contributed positively and negatively to the symptom severity score prediction. We found that edges between the Precuneus, Medial Prefrontal Cortex (MPFC), Left Superior Temporal Sulcus (LSTS), Right Lateral Prefrontal Cortex (RLPC), Right Superior Temporal Sulcus (RSTS), and Right Posterior Parietal Cortex (RPPC) contributed negatively to the prediction, whereas all edges between other ROIs contributed positively (refer to Figure 4). Extant literature suggests that a critical social and cognitive dysfunction in ASD is the impaired ability to decode mental states such as the beliefs, emotions, and intentions of oneself and others, and that an altered DMN may be an important neurobiological feature of these deficits [66, 67]. Previous neuroimaging work has consistently suggested that the DMN is one of the most aberrant functional networks in ASD [68, 69]. Studies of intrinsic FC in ASD reported disrupted within-network connectivity between core DMN nodes [70, 71, 39, 72, 38]. To validate the result, we also tried to predict symptom severity scores at the individual level using DMN and ToM region-specific functional connectivity, but this resulted in considerably lower accuracy on unseen data (3 and 4). The proposed explainable and generalized attention-based GNN framework and feature set not only distinguished ASD from TD but also predicted symptom severity scores on large unseen data. We observed that the SN contributed the most to the classification and prediction of ASD.
This is consistent with existing literature suggesting that the social-brain circuit predominantly includes classic limbic areas, ventral and medial aspects of the prefrontal cortex, the anterior temporal lobes, the posterior cingulate cortex (PCC), the precuneus (PCUN), the posterior temporal regions, the temporo-parietal junction (TPJ), the left inferior frontal gyrus (IFG) involved in social communication, and the somatosensory and anterior insular cortex [73, 4, 74, 75, 76, 60, 23], a circuit that has been implicated in the neurological mechanisms of ASD. The proposed framework is also lightweight compared with a whole-brain approach: by focusing on a selective set of brain networks and node-edge centric features, it reduces computational time and requires minimal computational resources to run.

## 5 Conclusion

In summary, our discovery of robust, individualized functional brain meta-connectivity links within brain regions relevant to social cognition offers a promising biomarker for ASD diagnosis and for tracking the transient dynamics of meta-connectivity in pervasive neurodevelopmental disorders. Beyond that, our approach offers an explainable, generalized attention-based GNN framework based on subgraphs for exploring reliable and interpretable neurobiological features from medical imaging data, providing crucial insights into clinical symptoms and advancing precision neuroimaging in brain disorders.

## Data Availability

[https://fcon\_1000.projects.nitrc.org/indi/abide/](https://fcon_1000.projects.nitrc.org/indi/abide/)

* Received July 17, 2024.
* Revision received July 17, 2024.
* Accepted July 18, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/)

## References

1. L. Shao, C. Fu, Y. You, and D. Fu, “Classification of asd based on fmri data with deep learning,” Cognitive Neurodynamics, vol. 15, no. 6, pp. 961–974, 2021.
2. American Psychiatric Association, Diagnostic and statistical manual of mental disorders: DSM-5. Washington, DC: American Psychiatric Association, 2013.
3. F. Almuqhim and F. Saeed, “Asd-saenet: a sparse autoencoder, and deep-neural network model for detecting autism spectrum disorder (asd) using fmri data,” Frontiers in Computational Neuroscience, vol. 15, p. 654315, 2021.
4. D. P. Kennedy and E. Courchesne, “The intrinsic functional organization of the brain is altered in autism,” Neuroimage, vol. 39, no. 4, pp. 1877–1885, 2008.
5. S. Baron-Cohen, A. M. Leslie, and U. Frith, “Does the autistic child have a “theory of mind”?” Cognition, vol. 21, no. 1, pp. 37–46, 1985.
6. S. R. Leekam and J.
Perner, “Does the autistic child have a metarepresentational deficit?” Cognition, vol. 40, no. 3, pp. 203–218, 1991.
7. K. Bhavna, R. Banerjee, and D. Roy, “End-to-end explainable ai: Derived theory-of-mind fingerprints to distinguish between autistic and typically developing and social symptom severity,” bioRxiv, pp. 2023–01, 2023.
8. S.-J. Hong, R. Vos de Wael, R. A. Bethlehem, S. Lariviere, C. Paquola, S. L. Valk, M. P. Milham, A. Di Martino, D. S. Margulies, J. Smallwood et al., “Atypical functional connectome hierarchy in autism,” Nature communications, vol. 10, no. 1, p. 1022, 2019.
9. J. Ciarrusta, R. Dimitrova, D. Batalle, J. O’Muircheartaigh, L. Cordero-Grande, A. Price, E. Hughes, J. Kangas, E. Perry, A. Javed et al., “Emerging functional connectivity differences in newborn infants vulnerable to autism spectrum disorders,” Translational psychiatry, vol. 10, no. 1, p. 131, 2020.
10. A. Nair and M. Jolliffe, “A review of default mode network connectivity and its association with social cognition in adolescents with autism spectrum disorder and early-onset psychosis,” Frontiers in psychiatry, vol. 11, p. 548922, 2020.
11. C.-M. Chen, P. Yang, M.-T. Wu, T.-C. Chuang, and T.-Y. Huang, “Deriving and validating biomarkers associated with autism spectrum disorders from a large-scale resting-state database,” Scientific reports, vol. 9, no. 1, p. 9043, 2019.
12. X. Xu, W. Li, J. Mei, M. Tao, X. Wang, Q. Zhao, X. Liang, W. Wu, D. Ding, and P.
Wang, “Feature selection and combination of information in the functional brain connectome for discrimination of mild cognitive impairment and analyses of altered brain patterns,” Frontiers in aging neuroscience, vol. 12, p. 28, 2020.
13. H. S. Nogay and H. Adeli, “Machine learning (ml) for the diagnosis of autism spectrum disorder (asd) using brain imaging,” Reviews in the Neurosciences, vol. 31, no. 8, pp. 825–841, 2020.
14. A. Di Martino, C.-G. Yan, Q. Li, E. Denio, F. X. Castellanos, K. Alaerts, J. S. Anderson, M. Assaf, S. Y. Bookheimer, M. Dapretto et al., “The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism,” Molecular psychiatry, vol. 19, no. 6, pp. 659–667, 2014.
15. P. Zille, V. D. Calhoun, J. M. Stephen, T. W. Wilson, and Y.-P. Wang, “Fused estimation of sparse connectivity patterns from rest fmri: application to comparison of children and adult brains,” IEEE transactions on medical imaging, vol. 37, no. 10, pp. 2165–2175, 2017.
16. Y. Li, J. Liu, X. Gao, B. Jie, M. Kim, P.-T. Yap, C.-Y. Wee, and D. Shen, “Multimodal hyper-connectivity of functional networks using functionally-weighted lasso for mci classification,” Medical image analysis, vol. 52, pp. 80–96, 2019.
17. Y. Li, J. Liu, Z. Tang, and B. Lei, “Deep spatial-temporal feature fusion from adaptive dynamic functional connectivity for mci identification,” IEEE Transactions on Medical Imaging, vol. 39, no. 9, pp. 2818–2830, 2020.
18. B. Devi, S.
Kumar Anuradha, and V. G. Shankar, “Anadata: A novel approach for data analytics using random forest tree and svm,” in Computing, Communication and Signal Processing: Proceedings of ICCASP 2018. Springer, 2019, pp. 511–521.
19. M. Khosla, K. Jamison, A. Kuceyeski, and M. R. Sabuncu, “3d convolutional neural networks for classification of functional connectomes,” in International Workshop on Deep Learning in Medical Image Analysis. Springer, 2018, pp. 137–145.
20. T. Eslami, V. Mirjalili, A. Fong, A. R. Laird, and F. Saeed, “Asd-diagnet: a hybrid learning approach for detection of autism spectrum disorder using fmri data,” Frontiers in neuroinformatics, vol. 13, p. 70, 2019.
21. Y. Liu, L. Xu, J. Li, J. Yu, and X. Yu, “Attentional connectivity-based prediction of autism using heterogeneous rs-fmri data from cc200 atlas,” Experimental neurobiology, vol. 29, no. 1, p. 27, 2020.
22. S. Ryali, Y. Zhang, C. de Los Angeles, K. Supekar, and V. Menon, “Deep learning models reveal replicable, generalizable, and behaviorally relevant sex differences in human functional brain organization,” Proceedings of the National Academy of Sciences, vol. 121, no. 9, p. e2310012121, 2024.
23. K. Supekar, S. Ryali, R. Yuan, D. Kumar, C. de los Angeles, and V. Menon, “Robust, generalizable, and interpretable ai-derived brain fingerprints of autism and social-communication symptom severity,” Biological psychiatry, vol. 92, no. 8, p. 643, 2022.
24. R. Rosenbaum, M. A. Smith, A. Kohn, J. E. Rubin, and B. Doiron, “The spatial structure of correlated neuronal variability,” Nature neuroscience, vol. 20, no. 1, pp. 107–114, 2017.
25. F. Xia, K. Sun, S. Yu, A. Aziz, L. Wan, S. Pan, and H. Liu, “Graph learning: A survey,” IEEE Transactions on Artificial Intelligence, vol. 2, no. 2, pp. 109–127, 2021.
26. J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” AI open, vol. 1, pp. 57–81, 2020.
27. K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph neural networks?” arXiv preprint arXiv:1810.00826, 2018.
28. S. Parisot, S. I. Ktena, E. Ferrante, M. Lee, R. Guerrero, B. Glocker, and D. Rueckert, “Disease prediction using graph convolutional networks: application to autism spectrum disorder and alzheimer’s disease,” Medical image analysis, vol. 48, pp. 117–130, 2018.
29. D. Bandara and K. Riccardi, “Graph node classification to predict autism risk in genes,” Genes, vol. 15, no. 4, p. 447, 2024.
30. D. Yao, J. Sui, M. Wang, E. Yang, Y. Jiaerken, N. Luo, P.-T. Yap, M. Liu, and D. Shen, “A mutual multi-scale triplet graph convolutional network for classification of brain disorders using functional or structural connectivity,” IEEE transactions on medical imaging, vol. 40, no. 4, pp. 1279–1289, 2021.
31. J. Gao, M. Chen, D. Xiao, Y. Li, S. Zhu, Y. Li, X. Dai, F. Lu, Z. Wang, S. Cai et al., “Classification of major depressive disorder using an attention-guided unified deep convolutional neural network and individual structural covariance network,” Cerebral Cortex, vol. 33, no. 6, pp. 2415–2425, 2023.
32. N. Dehmamy, A.-L. Barabási, and R.
Yu, “Understanding the representation power of graph neural networks in learning graph topology,” Advances in Neural Information Processing Systems, vol. 32, 2019.
33. Y. Chen, J. Yan, M. Jiang, T. Zhang, Z. Zhao, W. Zhao, J. Zheng, D. Yao, R. Zhang, K. M. Kendrick et al., “Adversarial learning based node-edge graph attention networks for autism spectrum disorder identification,” IEEE Transactions on Neural Networks and Learning Systems, 2022.
34. K. Baetens, N. Ma, J. Steen, and F. Van Overwalle, “Involvement of the mentalizing network in social and non-social high construal,” Social cognitive and affective neuroscience, vol. 9, no. 6, pp. 817–824, 2014.
35. J. C. Mazziotta, A. W. Toga, A. Evans, P. Fox, J. Lancaster et al., “A probabilistic atlas of the human brain: theory and rationale for its development,” Neuroimage, vol. 2, no. 2, pp. 89–101, 1995.
36. R. K. Kana, J. O. Maximo, D. L. Williams, T. A. Keller, S. E. Schipul, V. L. Cherkassky, N. J. Minshew, and M. A. Just, “Aberrant functioning of the theory-of-mind network in children and adolescents with autism,” Molecular autism, vol. 6, no. 1, pp. 1–12, 2015.
37. G. B. Chand and M. Dhamala, “Interactions among the brain default-mode, salience, and central-executive networks during perceptual decision-making of moving dots,” Brain connectivity, vol. 6, no. 3, pp. 249–254, 2016.
38. L. M.
Arbabyazd, D. Lombardo, O. Blin, M. Didic, D. Battaglia, and V. Jirsa, “Dynamic functional connectivity as a complex random walk: definitions and the dfcwalk toolbox,” MethodsX, vol. 7, p. 101168, 2020.
39. V. Harlalka, R. S. Bapi, P. Vinod, and D. Roy, “Atypical flexibility in dynamic functional connectivity quantifies the severity in autism spectrum disorder,” Frontiers in human neuroscience, vol. 13, p. 6, 2019.
40. R. Müller, S. Kornblith, and G. E. Hinton, “When does label smoothing help?” Advances in neural information processing systems, vol. 32, 2019.
41. J. You, Z. Ying, and J. Leskovec, “Design space for graph neural networks,” Advances in Neural Information Processing Systems, vol. 33, pp. 17009–17021, 2020.
42. M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” Advances in neural information processing systems, vol. 29, 2016.
43. X. Bresson and T. Laurent, “Residual gated graph convnets,” arXiv preprint arXiv:1711.07553, 2017.
44. W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” Advances in neural information processing systems, vol. 30, 2017.
45. X. Shen, E. S. Finn, D. Scheinost, M. D. Rosenberg, M. M. Chun, X. Papademetris, and R. T. Constable, “Using connectome-based predictive modeling to predict individual behavior from brain connectivity,” nature protocols, vol. 12, no. 3, pp. 506–518, 2017.
46. R. M. Hutchison, T. Womelsdorf, E. A. Allen, P. A. Bandettini, V. D. Calhoun, M. Corbetta, S. Della Penna, J. H. Duyn, G. H. Glover, J. Gonzalez-Castillo et al., “Dynamic functional connectivity: promise, issues, and interpretations,” Neuroimage, vol. 80, pp. 360–378, 2013.
47. R.-A. Müller and I. Fishman, “Brain connectivity and neuroimaging of social networks in autism,” Trends in cognitive sciences, vol. 22, no. 12, pp. 1103–1116, 2018.
48. O. Demirci, V. P. Clark, V. A. Magnotta, N. C. Andreasen, J. Lauriello, K. A. Kiehl, G. D. Pearlson, and V. D. Calhoun, “A review of challenges in the use of fmri for disease classification/characterization and a projection pursuit application from a multi-site fmri schizophrenia study,” Brain imaging and behavior, vol. 2, pp. 207–226, 2008.
49. B. Sen, N. C. Borle, R. Greiner, and M. R. Brown, “A general prediction model for the detection of adhd and autism using structural and functional mri,” PloS one, vol. 13, no. 4, p. e0194856, 2018.
50. L. Tian, T. Jiang, M. Liang, Y. Zang, Y. He, M. Sui, and Y. Wang, “Enhanced resting-state brain activities in adhd patients: a fmri study,” Brain and Development, vol. 30, no. 5, pp. 342–348, 2008.
51. S. Mouga, I. C. Duarte, C. Café, D. Sousa, F. Duque, G. Oliveira, and M.
Castelo-Branco, “Parahippocampal deactivation and hyperactivation of central executive, saliency and social cognition networks in autism spectrum disorder,” Journal of Neurodevelopmental Disorders, vol. 14, no. 1, pp. 1–12, 2022.
52. [52]. E. Marshall, J. S. Nomi, B. Dirks, C. Romero, L. Kupis, C. Chang, and L. Q. Uddin, “Coactivation pattern analysis reveals altered salience network dynamics in children with autism spectrum disorder,” Network Neuroscience, vol. 4, no. 4, pp. 1219–1234, 2020.
53. [53]. K. E. Lawrence, L. M. Hernandez, H. C. Bowman, N. T. Padgaonkar, E. Fuster, A. Jack, E. Aylward, N. Gaab, J. D. Van Horn, R. A. Bernier et al., “Sex differences in functional connectivity of the salience, default mode, and central executive networks in youth with ASD,” Cerebral Cortex, vol. 30, no. 9, pp. 5107–5120, 2020.
54. [54]. G. B. Chand, J. Wu, I. Hajjar, and D. Qiu, “Interactions of the salience network and its subsystems with the default-mode and the central-executive networks in normal aging and mild cognitive impairment,” Brain Connectivity, vol. 7, no. 7, pp. 401–412, 2017.
55. [55]. D. S. Bassett, N. F. Wymbs, M. A. Porter, P. J. Mucha, and S. T. Grafton, “Cross-linked structure of network evolution,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 24, no. 1, 2014.
56. [56]. A. Brovelli, J.-M. Badier, F. Bonini, F. Bartolomei, O. Coulon, and G. Auzias, “Dynamic reconfiguration of visuomotor-related functional connectivity networks,” Journal of Neuroscience, vol. 37, no. 4, pp. 839–853, 2017.
57. [57]. J. Faskowitz, F. Z. Esfahlani, Y. Jo, O. Sporns, and R. F. Betzel, “Edge-centric functional network representations of human cerebral cortex reveal overlapping system-level architecture,” Nature Neuroscience, vol. 23, no. 12, pp. 1644–1654, 2020.
58. [58]. B. C. Munsell, C.-Y. Wee, S. S. Keller, B. Weber, C. Elger, L. A. T. da Silva, T. Nesland, M. Styner, D. Shen, and L. Bonilha, “Evaluation of machine learning algorithms for treatment outcome prediction in patients with epilepsy based on structural connectome data,” NeuroImage, vol. 118, pp. 219–230, 2015.
59. [59]. J. Kawahara, C. J. Brown, S. P. Miller, B. G. Booth, V. Chau, R. E. Grunau, J. G. Zwicker, and G. Hamarneh, “BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment,” NeuroImage, vol. 146, pp. 1038–1049, 2017.
60. [60]. W. Wang, Y. Kong, Z. Hou, C. Yang, and Y.
Yuan, “Spatio-temporal attention graph convolution network for functional connectome classification,” in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022, pp. 1486–1490.
61. [61]. Z. Wang, Y. Xu, D. Peng, J. Gao, and F. Lu, “Brain functional activity-based classification of autism spectrum disorder using an attention-based graph neural network combined with gene expression,” Cerebral Cortex, vol. 33, no. 10, pp. 6407–6419, 2023.
62. [62]. C. Sun, C. Li, X. Lin, T. Zheng, F. Meng, X. Rui, and Z. Wang, “Attention-based graph neural networks: a survey,” Artificial Intelligence Review, vol. 56, no. Suppl 2, pp. 2263–2310, 2023.
63. [63]. F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “Computational capabilities of graph neural networks,” IEEE Transactions on Neural Networks, vol. 20, no. 1, pp. 81–102, 2008.
64. [64]. E. Moradi, B. Khundrakpam, J. D. Lewis, A. C. Evans, and J. Tohka, “Predicting symptom severity in autism spectrum disorder based on cortical thickness measures in agglomerative data,” NeuroImage, vol. 144, pp. 128–141, 2017.
65. [65]. M. Plitt, K. A. Barnes, G. L. Wallace, L. Kenworthy, and A. Martin, “Resting-state functional connectivity predicts longitudinal change in autistic traits and adaptive functioning in autism,” Proceedings of the National Academy of Sciences, vol. 112, no. 48, pp. E6699–E6706, 2015.
66. [66]. M. V. Lombardo, B. Chakrabarti, E. T. Bullmore, S. A. Sadek, G. Pasco, S. J. Wheelwright, J. Suckling, MRC AIMS Consortium, and S.
Baron-Cohen, “Atypical neural self-representation in autism,” Brain, vol. 133, no. 2, pp. 611–624, 2010.
67. [67]. R. B. Mars, F.-X. Neubert, M. P. Noonan, J. Sallet, I. Toni, and M. F. Rushworth, “On the relationship between the ‘default mode network’ and the ‘social brain’,” Frontiers in Human Neuroscience, vol. 6, p. 189, 2012.
68. [68]. R. Moseley, R. Ypma, R. Holt, D. Floris, L. Chura, M. Spencer, S. Baron-Cohen, J. Suckling, E. Bullmore, and M. Rubinov, “Whole-brain functional hypoconnectivity as an endophenotype of autism in adolescents,” NeuroImage: Clinical, vol. 9, pp. 140–152, 2015.
69. [69]. E. Glerean, R. K. Pan, J. Salmi, R. Kujala, J. M. Lahnakoski, U. Roine, L. Nummenmaa, S. Leppämäki, T. Nieminen-von Wendt, P. Tani et al., “Reorganization of functionally connected brain subnetworks in high-functioning autism,” Human Brain Mapping, vol. 37, no. 3, pp. 1066–1079, 2016.
70. [70]. C. S. Monk, S. J. Peltier, J. L. Wiggins, S.-J. Weng, M. Carrasco, S. Risi, and C. Lord, “Abnormalities of intrinsic functional connectivity in autism spectrum disorders,” NeuroImage, vol. 47, no. 2, pp. 764–772, 2009.
71. [71]. C. J. Lynch, L. Q. Uddin, K. Supekar, A. Khouzam, J. Phillips, and V. Menon, “Default mode network in childhood autism: posteromedial cortex heterogeneity and relationship with social deficits,” Biological Psychiatry, vol. 74, no. 3, pp. 212–219, 2013.
72. [72]. X. Guo, X. Duan, J. Suckling, H. Chen, W. Liao, Q. Cui, and H. Chen, “Partially impaired functional connectivity states between right anterior insula and default mode network in autism spectrum disorder,” Human Brain Mapping, vol. 40, no. 4, pp. 1264–1275, 2019.
73. [73]. S.-J. Blakemore, “The social brain in adolescence,” Nature Reviews Neuroscience, vol. 9, no. 4, pp. 267–277, 2008.
74. [74]. R. Adolphs, “The social brain: neural basis of social knowledge,” Annual Review of Psychology, vol. 60, no. 1, pp. 693–716, 2009.
75. [75]. S. J. Gotts, W. K. Simmons, L. A. Milbury, G. L. Wallace, R. W. Cox, and A. Martin, “Fractionation of social brain circuits in autism spectrum disorders,” Brain, vol. 135, no. 9, pp. 2711–2725, 2012.
76. [76]. D. Roy and L. Q. Uddin, “Atypical core-periphery brain dynamics in autism,” Network Neuroscience, vol. 5, no. 2, pp. 295–321, 2021.