Publication 8703

Title:	On the stability of canonical correlation analysis and partial least squares with application to brain-behavior associations
Journal:	Communications Biology
Published:	21 Feb 2024
Pubmed:	https://pubmed.ncbi.nlm.nih.gov/38383808/
DOI:	https://doi.org/10.1038/s42003-024-05869-4
URL:	https://www.nature.com/articles/s42003-024-05869-4.pdf
Citations:	10 (10 in last 2 years) as of 8 Aug 2024

Abstract

Associations between datasets can be discovered through multivariate methods like Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). A requisite property for interpretability and generalizability of CCA/PLS associations is stability of their feature patterns. However, stability of CCA/PLS in high-dimensional datasets is questionable, as found in empirical characterizations. To study these issues systematically, we developed a generative modeling framework to simulate synthetic datasets. We found that when sample size is relatively small, but comparable to typical studies, CCA/PLS associations are highly unstable and inaccurate; both in their magnitude and importantly in the feature pattern underlying the association. We confirmed these trends across two neuroimaging modalities and in independent datasets with n ≈ 1000 and n = 20,000, and found that only the latter comprised sufficient observations for stable mappings between imaging-derived and behavioral features. We further developed a power calculator to provide sample sizes required for stability and reliability of multivariate analyses. Collectively, we characterize how to limit detrimental effects of overfitting on CCA/PLS stability, and provide recommendations for future studies.
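
The overfitting effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' generative modeling framework or their power calculator; it is a minimal NumPy example under assumed settings (10 features per dataset, a true canonical correlation of 0.3, and sample sizes of 50, 1000 and 20,000 chosen only for illustration). The function names `first_canonical_correlation` and `simulate_estimates` are hypothetical. The sketch estimates the first in-sample canonical correlation from the singular values of the product of orthonormal bases of the two centered data matrices, which requires more samples than features.

```python
import numpy as np


def first_canonical_correlation(X, Y):
    """In-sample first canonical correlation (assumes n > number of features)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the sample canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))


def simulate_estimates(r_true=0.3, n_features=10,
                       sample_sizes=(50, 1000, 20000), seed=0):
    """Simulate two datasets with one shared mode and estimate its strength."""
    rng = np.random.default_rng(seed)
    estimates = {}
    for n in sample_sizes:
        X = rng.standard_normal((n, n_features))
        Y = rng.standard_normal((n, n_features))
        # Only the first feature pair shares signal, so the population
        # first canonical correlation equals r_true (an assumed value).
        Y[:, 0] = r_true * X[:, 0] + np.sqrt(1 - r_true**2) * Y[:, 0]
        estimates[n] = first_canonical_correlation(X, Y)
    return estimates


estimates = simulate_estimates()
for n, r_hat in estimates.items():
    print(f"n = {n:>6}: in-sample canonical correlation = {r_hat:.2f} "
          f"(true value 0.30)")
```

Under these assumptions, the smallest sample should give an in-sample estimate well above the true value, and the estimate should approach the true value as the sample size grows. This mirrors the direction of the effect reported in the abstract, not its reported magnitudes.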

5 Keywords

Algorithms
Brain
Canonical Correlation Analysis
Least-Squares Analysis
Reproducibility of Results

9 Authors

Markus Helmer
Shaun Warrington
Ali-Reza Mohammadi-Nejad
Jie Lisa Ji
Amber Howell
Benjamin Rosand
Alan Anticevic
Stamatios N. Sotiropoulos
John D. Murray

1 Application

Application ID	Title
43822	Multi-Modal Analysis of the UK Biobank Neuroimaging Data