Abstract
Background As large analyses merge data across sites, a deeper understanding of variance in statistical assessment across the sources of data becomes critical for valid analyses. Diffusion tensor imaging (DTI) exhibits spatially varying and correlated noise, so care must be taken with distributional assumptions.
Purpose We characterize the role of physiology, subject compliance, and the interaction of subject with the scanner in the understanding of DTI variability, as modeled in spatial variance of derived metrics in homogeneous regions.
Methods We analyze DTI data from 1035 subjects in the Baltimore Longitudinal Study of Aging (BLSA), with ages ranging from 22.4 to 103 years old. For each subject, up to 12 longitudinal sessions were conducted. We assess variance of DTI scalars within regions of interest (ROIs) defined by four segmentation methods and investigate the relationships between the variance and covariates, including baseline age, time from the baseline (referred to as “interval”), motion, sex, and whether it is the first scan or the second scan in the session.
Results Covariate effects are heterogeneous and bilaterally symmetric across ROIs. Inter-session interval is positively related (p ≪ 0.001) to FA variance in the cuneus and occipital gyrus, but negatively (p ≪ 0.001) in the caudate nucleus. Males show significantly (p ≪ 0.001) higher FA variance in the right putamen, thalamus, body of the corpus callosum, and cingulate gyrus. In 62 out of 176 ROIs defined by the Eve type-1 atlas, an increase in motion is associated (p < 0.05) with a decrease in FA variance. Head motion increases during the rescan of DTI (Δμ = 0.045 millimeters per volume).
Conclusions The effects of each covariate on DTI variance, and their relationships across ROIs are complex. Ultimately, we encourage researchers to include estimates of variance when sharing data and consider models of heteroscedasticity in analysis. This work provides a foundation for study planning to account for regional variations in metric variance.
1 Introduction
Large datasets enable exploration of questions that would be impractical with smaller- or moderate-sized datasets while giving rise to the development and application of deep learning models which can assimilate complex data. One prevalent challenge is that large datasets often comprise samples aggregated from distinct sources at different time points using diverse technologies, causing data heterogeneity, experimental variations, and statistical biases if the analysis is not executed appropriately.1 In such scenarios, understanding the variance and variability of data is of great importance. The general linear model2, a structured and widely used framework for relationship modeling, allows us to illustrate the importance. The general linear model is assessed through regression. A linear regression can be expressed by Y = Xβ + ε, where the response variable Y, the covariate matrix X, and the regression coefficients β are conventionally represented in matrix forms given by: where we use M to denote the number of samples, and N to denote the number of independent variables. The error term ε is given by: where Σ represents the covariance matrix. If we assume the errors are uncorrelated, Σ is simplified to a diagonal matrix. We can simplify estimation of Eq. (1) with a whitening matrix3, W: Note Wε∼(0, IM), where I& denotes the identity matrix of dimension M × M.
We illustrate the practical importance of understanding the variance structure for reducing statistical errors (Fig. 1).
Diffusion tensor imaging (DTI)4–6 is a modeling approach used in diffusion-weighted imaging 7–9, a variant of conventional magnetic resonance imaging (MRI) based on the tissue water diffusion rate.10 DTI allows for visualization and measurement of the degree of anisotropy and structural orientation of fibers in the brain and has been widely used in studies.11–13 DTI is inherently subject to low signal-to-noise ratio, and the noise structure exhibits spatial variability and correlation, primarily attributed to fast imaging and noise suppression techniques.14,15 Understanding the statistical nature of DTI variance or noise has been proven to be beneficial for diffusion tensor estimation15,16, outlier detection17, reproducibility assessment18–20.
Methods have been proposed for DTI variance (or noise) estimation, among which we recognize four types. The first type requires multiple repeated acquisitions (therefore, we refer to it as the “multiple acquisition method”). After the repeated acquisitions, we could take the standard deviation of the measurements, or perform data resampling, such as bootstrap or jackknife,21 to quantify uncertainties of DTI parameters. The second type involves two repeated acquisitions, which we call the “scan-rescan method”. We compute the difference between the images from each acquisition and then calculate the standard deviation of the difference across the space.22,23
Note that the standard deviation must be divided by the square root of 2 to account for the combination of two random variables (noise in each image). The third and fourth types are used when we have only one acquisition. For the third type, we select a homogeneous ROI and compute the standard deviation of the measurements within this ROI (“ROI-based method”).24 The fourth type, often referred to as “model-based resampling”, involves fitting a model (e.g., diffusion tensor) locally to the observed data. The residuals from the fitted model, along with the original observed data, are then used by data resampling techniques such as wild bootstrap to generate random subsets.21,25 From each subset, we obtain an estimate of a specific parameter. Across all subsets, we get the distribution of the estimates and thus quantify the uncertainty of the DTI parameter. The first type (multiple acquisition method) makes no assumptions about the noise properties at the cost of requiring multiple acquisitions (for each diffusion gradient in the DTI scenario).21 The second type (scan-rescan method) assumes that the noise is constant across space, or across the region from which the standard deviation is computed.23 The third method (ROI- based method) assumes that both the noise and the signal are constant across the ROI.24 The fourth method (model-based resampling) assumes that the non-constant variance of measured signals can be captured by the chosen model.21 In this study, we choose the ROI-based method for estimating noise across brain regions in DTI. This is because it does not require repeated acquisitions, and one advantage is that we can compute noise from each individual scan within an imaging session, enabling inter-scan comparisons. Secondly, by using an individual scan for noise estimation, we can mitigate errors caused by motion and inter-scan misalignment of the brain, which could be problematic when using the multiple acquisition method or the scan-rescan method. Thirdly, we want to avoid using the assumption of the fourth type (model-based resampling).21
Up to this point, we have been using the terms “variance” and “noise” interchangeably. In the following text, we use “variance” when referring to the measure of the dispersion of data, and “noise” when referring to the imaging noise such as MRI noise—primarily caused by thermal fluctuations and electrical noise—to avoid misinterpretation.
To gain a better understanding of DTI variance or noise, it is important to characterize the role of physiology, subject compliance, and the interaction between the subject and the scanner. Our approach is driven by two fundamental questions (Fig. 2): Which factors are associated with DTI variance? Where and how does this association manifest? We assess variance of DTI scalars, including fractional anisotropy (FA), axial diffusivity (AD), mean diffusivity (MD), and radial diffusivity (RD), within ROIs, and investigate the associations between the variance and covariates, including baseline age, time from the baseline (referred to as “interval”), motion, sex, and whether it is the first or the second scan within the session, using linear mixed effects models26.
2 Methods
We use the PreQual27 pipeline for preprocessing and quality assurance of the DTI data in the Baltimore Longitudinal Study of Aging (BLSA)28,29 dataset. We consider all subjects with at least one session comprising both T1-weighted (T1w) MRI data and DTI data. We exclude 49 DTI images exhibiting one or more of the following characteristics according to their potential impact on subsequent analyses:
The presence of extreme susceptibility-induced distortion, motion artifacts, or eddy currents that resists correction.
The failure of the preprocessed data to be fitted by the tensor model.
An exceptionally low signal-to-noise ratio in the FA and MD images.
The exclusion of these cases results in the dataset depicted in Fig. 3. We identify 1035 subjects (562 F/ 473 M, 22.4 to 94.4 y/o at baseline) with 2751 sessions (1497 F/ 1254 M). During subsequent sessions, 4 female subjects and 10 male subjects were diagnosed with Alzheimer’s disease, while the remaining subjects remained cognitively normal throughout all sessions.
2.1 ROI-Based DTI Variance Estimation
We use a registration-based approach for brain segmentation in the b0 (minimally weighted) volume (Fig. 4). We initiate the process with brain segmentations for the T1w images obtained through manual parcellations provided by the JHU-MNI-ss atlas (“Eve atlas”)30,31 and automated whole-brain segmentation by SLANT32. For the Eve atlas, there are three types of parcellations available, each with different regional focus.30 For SLANT segmentation, labels for 132 regions covering the whole brain are provided.32 We use the method by Hansen et al.33 to transfer these labels from T1w to b0 space. After label transferring, we manually review the segmentation to see if the labels align with the anatomical regions.
2.2 Linear Mixed-Effects Model
We use linear mixed-effects models26 to analyze the association between DTI scalar standard deviation and covariates. (R program, version 4.2.2 34; Ubuntu 20.04.5 LTS; R package lme4, version 1.1.31 35; R package lmerTest, version 3.1.3 36.)
We study linear mixed-effects models of the form: where σ,-./ represents the standard deviation of a DTI scalar (FA, AD, MD, or RD) in a specific brain region of subject i at session j via scanner k in acquisition l, Agebaseline,, (hereafter referred to as “baseline”) is the age of subject i at baseline session, Ageinterval,,- (hereafter referred to as “interval”) is the time between the current session, j, and the baseline session. Motion,-/ is a scalar value reflecting the degree of head movement of subject i at session j during acquisition l (calculated based on eddy movement, in millimeters)37, Sex, is the gender of subject i (0 for female and 1 for male), and Rescan,-/ is a binary variable indicating if the acquisition l is the first scan (coded 0) or the rescan (coded 1) of session j. We consider subject and scanner as two random intercepts, respectively denoted by r1,i and r2,k. Prior to fitting the models, we standardize the dependent variable σ. To allow an artificially “amplified” effect of aging in the result, facilitating comparison with other covariates, we convert the units of baseline and interval from years to decades.
We have a total of 2224 models, derived from the four DTI scalars (FA, AD, MD, or RD), across varying ROIs defined by Eve Type 1 (176 ROIs), Eve Type 2 (130 ROIs), Eve Type 3 (118 ROIs),30,31 and SLANT (132 ROIs)32. Each model starts with a full model, with all fixed effects and random effects, followed by an implementation of backward model selection.36 The p-values for the fixed-effect terms are calculated based on the associated F tests.36 To account for multiple comparisons, we adjust the p-values across the pairs of DTI scalar and ROI for a false discovery rate (FDR) of 0.05 using the Benjamini-Hochberg method.38
3 Results
The magnitude and direction of the effects of each covariate on DTI variance exhibit heterogeneous patterns across ROIs (Fig. 5). Specifically, interval is positively related to FA variance in ROIs such as the cuneus, middle occipital gyrus, superior occipital gyrus, medulla, precuneus white matter, but negatively related in ROIs such as the caudate nucleus, posterior thalamic radiation, and superior fronto-occipital fasciculus (Table 1). Males have higher FA variance in the right putamen, thalamus, body of corpus callosum, and cingulum (cingulate gyrus), but lower FA variance in the middle frontal gyrus (Table 2). In the right inferior temporal gyrus, an increase of 1 millimeter in motion is associated with an increase of 2.211 standard deviations in the z-scored standard deviation (σ) of FA values (β = 2.211, p ≪ 0.001). Interestingly and counterintuitively, in several ROIs, including the medulla, middle occipital white matter, cingulum (cingulate gyrus), an increase in motion is linked with a decrease in FA variance (Table 1).
In the lateral fronto-orbital gyrus, left insular, gyrus rectus, and inferior occipital gyrus, both motion and interval exhibit a positive association with FA variance (Table 1). In the left caudate nucleus, and right posterior thalamic radiation, they both show a negative association with FA variance (Table 1). In many other ROIs, such as the cuneus, lingual white matter, and middle occipital white matter, motion is negatively related to FA variance while interval is positively related (Table 1). Results from the left ROI closely align with those from the corresponding right ROI (Table 1). There are some ROIs where interval is significantly (p ≪ 0.001) associated with FA variance, while motion either gets removed during the model selection or shows weak associations (p ≥ 0.05) (Fig. 5).
On data extracted from ROIs defined by SLANT segmentation, which has different regional focus and delineation than Eve type-1 segmentation, the aforementioned patterns of effects can also be observed (Fig. 6, Fig. 7). For instance, in the left cerebellum exterior, both motion and interval are positively associated with FA variance, with motion’s coefficient (β = 0.960, p ≪ 0.001) higher than that of interval (β = 0.485, p ≪ 0.001). This parallels the relationship observed between the motion and interval coefficients in the left cerebellum defined by the Eve type-1 segmentation (β = 0.993, p ≪ 0.001 for motion; β = 0.471, p ≪ 0.001 for interval). Similarly, in ROIs such as the right cuneus, left precuneus, right superior occipital gyrus, and right middle occipital gyrus, motion shows a negative association with FA variance, while interval shows a positive association. (See Supplementary Materials for coefficients and p-values.)
In the model selection process, we observe that the rescan and the motion terms appear mutually exclusive, with only one preserved post-selection in most models. This pattern is echoed in the heatmaps of coefficients (Fig. 5, Fig. 6), where the cell in either the motion or the rescan column is colored grey. This hints at a correlation between rescan and motion. Supporting this observation, we detect an increase in head motion in the rescan of DTI acquired right after the first scan of DTI in the same session (mean shift Δμ = 0.045 millimeters per volume, relative mean shift Δμ/μ = 17.0%, coefficient of determination R2 = 0.065).
We provide the results from other pairs of DTI scalar and segmentation method in the Supplementary Materials.
4 Discussion
While many studies have estimated and shown the spatial variability of DTI variance (or noise),14,15,39 we characterize how DTI variance is associated with physiological and behavioral factors across brain regions. We answer the questions: Which factors are associated with DTI variance? Where and how does this association manifest? We found region-specific and bidirectional effects of covariates—including interval (which captures the within-individual longitudinal change over time), motion, and sex—on FA variance across brain regions. For instance, FA variance is positively associated with interval in cuneus, but negatively associated in caudate nucleus. Long-standing research has demonstrated that there is a decline in white matter microstructure with aging,40–45 with the consensus being that frontal and parietal areas are particularly vulnerable and the occipital and motor areas are mostly preserved. The frontal lobe exhibits the most pronounced decline, with FA declining by approximately 3% per decade starting at ∼35 years of age.46 Although our study focuses on the standard deviation of FA, our results converge with these prior research studies as we have shown high sensitivity to aging in the frontal, parietal, and temporal areas. While it is unclear what mechanisms are driving the changes in these areas, potential culprits include the change of uniformity of fiber orientations and fiber density.47–49
Previous studies50–52 have shown differences in FA between genders across brain regions. Oh et al. found that males have significantly higher FA values in global corpus callosum structure areas, while they exhibit lower FA values than females in the partial areas of the rostrum, genu, and splenium.50 Menzler et al. found that males show higher FA values in the thalamus, corpus callosum and cingulum.52 Most of these regions previously identified in the literature also show significant (p ≪ 0.001) associations between FA variance and sex in our study. While previous studies have reported changes in mean FA values, we offer a different perspective by depicting the variance of FA values.
The negative association observed between motion and FA variance in multiple regions, while counterintuitive, is not unreasonable. One might naturally expect that as motion increases, the uncertainty (reflected as variance) in the image should increase, given that motion leads to lower image quality, signal-to-noise ratio, and artifacts that can mislead image interpretation.53,54 However, the images we use for analysis have undergone motion correction during preprocessing. Although in practice, motion artifacts cannot be fully eliminated from the image, the recorded motion value doesn’t reflect the motion’s impact in the image after preprocessing. Instead, it reflects the subject’s motion during image acquisition. A higher motion value does not necessarily correspond to a noisier image post-preprocessing. Furthermore, Zeng et al. found that head motion during brain imaging is not merely a technical artifact but a reflection of a neurobiological trait. Specifically, individuals with stronger distant connectivity in the default network could consistently refrain from moving and such “head motion tendency” remains consistent within individuals.55 These points, taken together, provide explanations from image processing and biological perspectives, respectively, for why FA variance can decrease as motion increases.
4.1 Limitations of Current Study
First, this study relies on a registration-based method for brain segmentation in the b0 space. Despite rigorous quality assurance, the labels for each brain region may not correspond flawlessly with the true anatomical regions. Consequently, the standard deviation of DTI scalars extracted from each region combines both voxel-wise modeling factors and image analysis factors from neighboring regions. Second, we used backward model selection for the fixed-effect terms of the linear mixed-effects models. Such method can be unstable according to Breiman et. al.56 Third, we use the variance of DTI scalar values in the ROI as a proxy for measuring noise. This is not an ideal proxy, as the ROI-based variance captures not only the image noise but also the spatial variability in voxel intensity due to microstructural variations. This makes it suboptimal to reflect DTI noise. Fourth, our study only includes the BLSA dataset. Incorporating data from additional sources could make the findings more convincing. Fifth, the motion value used in this study is based on movement calculated by FSL’s eddy37, which approximates true head motion.
5 Conclusion
The notion of harnessing variance to enhance the reliability of analysis is universally applicable. Having a better understanding of variance is pivotal in mega-analyses, where heteroscedasticity is often an inherent challenge. Our study illuminates the complex and heterogeneous effects of covariates including baseline age, interval, motion, sex, and rescan on DTI variance across ROIs. More comprehensive efforts are required to fully characterize the variance. In the meantime, we encourage researchers to consider models of heteroscedasticity in their analyses and to include their estimates of variance when sharing data. As highlighted in the introduction, the application of the whitening matrix, constructed using the variance of the data, significantly reduces statistical errors. We anticipate that more sophisticated methods can further unlock the potential benefits derived from a nuanced understanding of variance, thereby bolstering the accuracy and reliability of future research.
Data Availability
All data used in the present work are available upon request to the Baltimore Longitudinal Study of Aging.
Declaration of Competing Interest
The authors declare that they have no conflict of interest.
Code, Data, and Materials Availability
The dataset used in this study can be accessed upon approval from the Baltimore Longitudinal Study of Aging team at https://www.blsa.nih.gov/. The code for the experiments can be found on GitHub at https://github.com/MASILab/Variance-Aging-Diffusion.
Ethics approval
IRB of Vanderbilt University waived ethical approval for de-identified access of the human subject data. (IRB #172072, PI: Bennett A. Landman)
CRediT Authorship Contribution Statement
Chenyu Gao: Conceptualization, Methodology, Software, Validation, Formal analysis, Data Curation, Writing - Original Draft, Visualization. Qi Yang: Software, Validation, Data Curation. Michael E. Kim: Software, Validation, Data Curation. Nazirah Mohd Khairi: Software, Validation, Data Curation. Leon Y. Cai: Software, Validation, Data Curation. Nancy R. Newlin: Validation. Praitayini Kanakaraj: Validation. Lucas W. Remedios: Visualization. Aravind R. Krishnan: Validation. Xin Yu: Data Curation. Tianyuan Yao: Validation. Panpan Zhang: Writing - Review & Editing. Kurt G. Schilling: Data Curation, Writing - Review & Editing. Daniel Moyer: Writing - Review & Editing. Derek B. Archer: Writing - Review & Editing. Susan M. Resnick: Resources, Writing - Review & Editing. Bennett A. Landman: Conceptualization, Methodology, Validation, Resources, Data Curation, Writing - Review & Editing, Supervision, Funding acquisition.
Declaration of Generative AI and AI-assisted technologies in the writing process
During the preparation of this work the author Chenyu Gao used ChatGPT to check grammar and improve readability of the original draft. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Acknowledgement
This work was supported in part by the National Institute of Health through grants 1R01EB017230-01A1 (Bennett A. Landman), 5-K01-EB032898-02 (Kurt Schilling), 1U24AG074855-01 (Timothy J. Hohman), and ViSE/VICTR VR3029 and by the Intramural Research Program of the National Institute on Aging, NIH. This work was conducted in part using the resources of the Advanced Computing Center for Research and Education at Vanderbilt University, Nashville, TN. We appreciate the National Institute of Health S10 Shared Instrumentation grant 1S10OD020154-01 (Smith), and grant 1S10OD023680-01 (Vanderbilt’s High-Performance Computer Cluster for Biomedical Research). The Vanderbilt Institute for Clinical and Translational Research (VICTR) is funded by the National Center for Advancing Translational Sciences (NCATS) Clinical Translational Science Award (CTSA) Program, Award Number 5UL1TR002243-03. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Footnotes
Introduction section updated to clarify why we choose the ROI-based estimation method; Results section updated to move values to tables; Redundant sentences in experimental design are removed; Supplemental files updated.