Volume 45, Issue 8 e26682
RESEARCH ARTICLE
Open Access

Principal component analysis as an efficient method for capturing multivariate brain signatures of complex disorders—ENIGMA study in people with bipolar disorders and obesity

Sean R. McWhinney

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

Jaroslav Hlinka

Department of Complex Systems, Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic

National Institute of Mental Health, Klecany, Czech Republic

Eduard Bakstein

National Institute of Mental Health, Klecany, Czech Republic

Department of Cybernetics, Czech Technical University, Prague, Czech Republic

Lorielle M. F. Dietze

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

Department of Medical Neuroscience, Dalhousie University, Halifax, Nova Scotia, Canada

Emily L. V. Corkum

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

Christoph Abé

Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden

Martin Alda

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

National Institute of Mental Health, Klecany, Czech Republic

Nina Alexander

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Francesco Benedetti

Vita-Salute San Raffaele University, Milan, Italy

Division of Neuroscience, Psychiatry and Psychobiology Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy

Michael Berk

Institute for Mental and Physical Health and Clinical Translation, School of Medicine, Barwon Health, Deakin University, Geelong, Victoria, Australia

Erlend Bøen

Unit for Psychosomatics/CL Outpatient Clinic for Adults, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

Linda M. Bonnekoh

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Department of Child Adolescent Psychiatry and Psychotherapy, University of Münster, Münster, Germany

Birgitte Boye

Unit for Psychosomatics and C-L Psychiatry for Adults, Oslo University Hospital, Oslo, Norway

Department of Behavioural Medicine, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway

Katharina Brosch

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, New York, USA

Erick J. Canales-Rodríguez

FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain

CIBERSAM, Instituto de Salud Carlos III, Barcelona, Spain

Dara M. Cannon

Clinical Neuroimaging Laboratory, Galway Neuroscience Centre, College of Medicine Nursing and Health Sciences, University of Galway, Galway, Ireland

Udo Dannlowski

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Caroline Demro

Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota, USA

Ana Diaz-Zuluaga

Research Group in Psychiatry GIPSI, Department of Psychiatry, Faculty of Medicine, Universidad de Antioquia, Medellin, Colombia

Torbjørn Elvsåshagen

Department of Behavioural Medicine, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway

Institute of Clinical Medicine, Norwegian Centre for Mental Disorders Research (NORMENT), University of Oslo and Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

Department of Neurology, Division of Clinical Neuroscience, Oslo University Hospital, Oslo, Norway

Lisa T. Eyler

Department of Psychiatry, University of California San Diego, La Jolla, California, USA

Desert-Pacific MIRECC, VA San Diego Healthcare, San Diego, California, USA

Lydia Fortea

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), CIBERSAM, Instituto de Salud Carlos III, University of Barcelona, Barcelona, Spain

Janice M. Fullerton

Neuroscience Research Australia, Randwick, New South Wales, Australia

School of Biomedical Sciences, Faculty of Medicine & Health, University of New South Wales, Sydney, New South Wales, Australia

Janik Goltermann

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Ian H. Gotlib

Department of Psychology, Stanford University, Stanford, California, USA

Dominik Grotegerd

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Bartholomeus Haarman

Department of Psychiatry, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands

Tim Hahn

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Fleur M. Howells

Neuroscience Institute, University of Cape Town, Cape Town, South Africa

Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa

Hamidreza Jamalabadi

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Andreas Jansen

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Core-Facility Brainimaging, Faculty of Medicine, University of Marburg, Germany

Tilo Kircher

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Anna Luisa Klahn

Institute of Neuroscience and Physiology, Sahlgrenska Academy at Gothenburg University, Gothenburg, Sweden

Rayus Kuplicki

Laureate Institute for Brain Research, Tulsa, Oklahoma, USA

Elijah Lahud

Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota, USA

Mikael Landén

Institute of Neuroscience and Physiology, Sahlgrenska Academy at Gothenburg University, Gothenburg, Sweden

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

Elisabeth J. Leehr

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Carlos Lopez-Jaramillo

Research Group in Psychiatry GIPSI, Department of Psychiatry, Faculty of Medicine, Universidad de Antioquia, Medellin, Colombia

Scott Mackey

Department of Psychiatry, University of Vermont College of Medicine, Burlington, Vermont, USA

Ulrik Malt

Unit for Psychosomatics/CL Outpatient Clinic for Adults, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

Institute of Clinical Medicine, Department of Neurology, University of Oslo, Oslo, Norway

Fiona Martyn

Clinical Neuroimaging Laboratory, Galway Neuroscience Centre, College of Medicine Nursing and Health Sciences, University of Galway, Galway, Ireland

Elena Mazza

Vita-Salute San Raffaele University, Milan, Italy

Division of Neuroscience, Psychiatry and Psychobiology Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy

Colm McDonald

Clinical Neuroimaging Laboratory, Galway Neuroscience Centre, College of Medicine Nursing and Health Sciences, University of Galway, Galway, Ireland

Genevieve McPhilemy

Clinical Neuroimaging Laboratory, Galway Neuroscience Centre, College of Medicine Nursing and Health Sciences, University of Galway, Galway, Ireland

Sandra Meier

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

Susanne Meinert

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Institute for Translational Neuroscience, University of Münster, Münster, Germany

Elisa Melloni

Vita-Salute San Raffaele University, Milan, Italy

Division of Neuroscience, Psychiatry and Psychobiology Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy

Philip B. Mitchell

Discipline of Psychiatry and Mental Health, School of Clinical Medicine, Faculty of Medicine & Health, University of New South Wales, Sydney, New South Wales, Australia

Leila Nabulsi

Clinical Neuroimaging Laboratory, Galway Neuroscience Centre, College of Medicine Nursing and Health Sciences, University of Galway, Galway, Ireland

Igor Nenadić

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Robert Nitsch

Institute for Translational Neuroscience, University of Münster, Münster, Germany

Nils Opel

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Department of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany

German Center for Mental Health (DZPG), Site Jena-Magdeburg-Halle, Germany

Roel A. Ophoff

UCLA Center for Neurobehavioral Genetics, Los Angeles, California, USA

Maria Ortuño

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain

Bronwyn J. Overs

Neuroscience Research Australia, Randwick, New South Wales, Australia

Julian Pineda-Zapata

Research Group, Instituto de Alta Tecnología Médica, Ayudas diagnósticas SURA, Medellin, Colombia

Edith Pomarol-Clotet

FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain

CIBERSAM, Instituto de Salud Carlos III, Barcelona, Spain

Joaquim Radua

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), CIBERSAM, Instituto de Salud Carlos III, University of Barcelona, Barcelona, Spain

Jonathan Repple

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, Goethe University Frankfurt, University Hospital, Frankfurt, Germany

Gloria Roberts

Discipline of Psychiatry and Mental Health, School of Clinical Medicine, Faculty of Medicine & Health, University of New South Wales, Sydney, New South Wales, Australia

Elena Rodriguez-Cano

FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain

CIBERSAM, Instituto de Salud Carlos III, Barcelona, Spain

Matthew D. Sacchet

Meditation Research Program, Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA

Raymond Salvador

FIDMAG Germanes Hospitalàries Research Foundation, Barcelona, Spain

CIBERSAM, Instituto de Salud Carlos III, Barcelona, Spain

Jonathan Savitz

Laureate Institute for Brain Research, Tulsa, Oklahoma, USA

Oxley College of Health Sciences, The University of Tulsa, Tulsa, Oklahoma, USA

Freda Scheffler

Neuroscience Institute, University of Cape Town, Cape Town, South Africa

Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa

Peter R. Schofield

Neuroscience Research Australia, Randwick, New South Wales, Australia

School of Biomedical Sciences, Faculty of Medicine & Health, University of New South Wales, Sydney, New South Wales, Australia

Navid Schürmeyer

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Chen Shen

Department of Psychology, University of Minnesota, Minneapolis, Minnesota, USA

Kang Sim

West Region, Institute of Mental Health, Singapore, Singapore

Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore

Scott R. Sponheim

Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota, USA

Minneapolis VA Health Care System, Minneapolis, Minnesota, USA

Dan J. Stein

Neuroscience Institute, University of Cape Town, Cape Town, South Africa

Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa

South African MRC Unit on Risk & Resilience in Mental Disorders, University of Cape Town, Cape Town, South Africa

Frederike Stein

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Benjamin Straube

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Chao Suo

Turner Institute for Brain and Mental Health, School of Psychological Sciences and Monash Biomedical Imaging, Monash University, Melbourne, Victoria, Australia

Henk Temmingh

Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa

Lea Teutenberg

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Florian Thomas-Odenthal

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Sophia I. Thomopoulos

Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA

Snezana Urosevic

Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota, USA

Minneapolis VA Health Care System, Minneapolis, Minnesota, USA

Paula Usemann

Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany

Neeltje E. M. van Haren

Department of Child and Adolescent Psychiatry/Psychology, Erasmus University Medical Center, Rotterdam, The Netherlands

Department of Psychiatry, University Medical Center Utrecht, Utrecht, The Netherlands

Cristian Vargas

Research Group in Psychiatry GIPSI, Department of Psychiatry, Faculty of Medicine, Universidad de Antioquia, Medellin, Colombia

Eduard Vieta

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), CIBERSAM, Instituto de Salud Carlos III, Institute of Neuroscience, University of Barcelona, Hospital Clínic, Barcelona, Spain

Enric Vilajosana

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain

Annabel Vreeker

Department of Child and Adolescent Psychiatry/Psychology, Erasmus University Medical Center, Rotterdam, The Netherlands

Department of Psychology, Education and Child Studies, Erasmus University Rotterdam, Rotterdam, The Netherlands

Nils R. Winter

Institute for Translational Psychiatry, University of Münster, Münster, Germany

Lakshmi N. Yatham

University of British Columbia, Vancouver, British Columbia, Canada

Paul M. Thompson

Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA

Ole A. Andreassen

Institute of Clinical Medicine, Norwegian Centre for Mental Disorders Research (NORMENT), University of Oslo and Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

Christopher R. K. Ching

Imaging Genetics Center, Mark and Mary Stevens Neuroimaging and Informatics Institute, Keck School of Medicine, University of Southern California, Marina del Rey, California, USA

Corresponding Author

Tomas Hajek

Department of Psychiatry, Dalhousie University, Halifax, Nova Scotia, Canada

National Institute of Mental Health, Klecany, Czech Republic

Correspondence

Tomas Hajek, Department of Psychiatry, Dalhousie University, QEII HSC, A.J.Lane Bldg., Room 3095, 5909 Veteran's Memorial Lane, Halifax, NS B3H 2E2, Canada.

Email: [email protected]

First published: 02 June 2024
Citations: 4

The complete author details for the ENIGMA bipolar disorder working group are available at https://enigma.ini.usc.edu/ongoing/enigma-bipolar-working-group/

The complete author details for the ENIGMA BMI-X working group are available at https://enigma.ini.usc.edu/ongoing/enigma-bmix/

Abstract

Multivariate techniques better fit the anatomy of complex neuropsychiatric disorders, which are characterized not by alterations in a single region but by variations across distributed brain networks. Here, we used principal component analysis (PCA) to identify patterns of covariance across brain regions and relate them to clinical and demographic variables in a large, generalizable dataset of individuals with bipolar disorders (BD) and controls. We then compared the performance of PCA and clustering on an identical sample to identify which methodology was better at capturing links between brain and clinical measures. Using data from the ENIGMA-BD working group, we investigated T1-weighted structural MRI data from 2436 participants with BD and healthy controls, and applied PCA to cortical thickness and surface area measures. We then studied the association of principal components with clinical and demographic variables using mixed regression models. We compared the PCA model with our prior clustering analyses of the same data and also tested it in a replication sample of 327 participants with BD or schizophrenia and healthy controls. The first principal component, which indexed greater cortical thickness across all 68 cortical regions, was negatively associated with BD, BMI, antipsychotic medications, and age, and was positively associated with lithium (Li) treatment. PCA demonstrated superior goodness of fit to clustering when predicting diagnosis and BMI. Moreover, applying the PCA model to the replication sample yielded significant differences in cortical thickness between healthy controls and individuals with BD or schizophrenia. Cortical thickness in the same widespread regional network as determined by PCA was negatively associated with different clinical and demographic variables, including diagnosis, age, BMI, and treatment with antipsychotic medications or lithium.
PCA outperformed clustering and provided an easy-to-use and easy-to-interpret method to study multivariate associations between brain structure and system-level variables.

Practitioner Points

  1. In this study of 2770 individuals, we confirmed that cortical thickness in widespread regional networks as determined by principal component analysis (PCA) was negatively associated with relevant clinical and demographic variables, including diagnosis, age, BMI, and treatment with antipsychotic medications or lithium.
  2. Significant associations of many different system-level variables with the same brain network suggest a lack of one-to-one mapping of individual clinical and demographic factors to specific patterns of brain changes.
  3. PCA outperformed clustering analysis in the same data set when predicting group or BMI, providing a superior method for studying multivariate associations between brain structure and system-level variables.

1 INTRODUCTION

Large-scale, multisite brain imaging datasets are becoming more common through initiatives such as ENIGMA (McWhinney et al., 2023), ADNI (Cruciani et al., 2024), ABCD (Dahl et al., 2024), the Human Connectome Project (Cohen et al., 2023), and others. Large datasets allow us to apply multivariate analysis techniques, which model the interplay between regions (Woo et al., 2017) but require larger, more ecologically valid samples to provide replicable results (Marek et al., 2022). These techniques better fit the anatomy of complex neuropsychiatric disorders, which are characterized not by alterations in a single region but by variations across distributed brain networks (Hibar et al., 2018; Segal et al., 2023). However, there is little methodological clarity on which of the many available multivariate methods is best suited to relating brain structure to system-level variables. While the development of new methods is one key aspect of the field, uncovering the benefits and best-use scenarios of established methods is equally important.

Analyzing brain imaging changes in bipolar disorders (BD) is a suitable way to test multivariate techniques. Individuals with BD vary markedly in their clinical presentation and in the impact of the illness on their functioning. This clinical heterogeneity may reflect neurobiological heterogeneity, which can be studied by brain imaging. It is increasingly clear that brain alterations in severe mental illnesses (SMI) are multifactorial. Aside from the diagnosis, they also reflect the effects of additional clinical factors, including medications (Hajek et al., 2012; McWhinney, Abé, et al., 2022; Van Gestel et al., 2019), and comorbid psychiatric or physical conditions, such as obesity (McWhinney, Abé, et al., 2021; McWhinney, Brosch, et al., 2022; McWhinney, Kolenic, et al., 2021) and diabetes (Hajek et al., 2014, 2016). Understanding the brain changes in SMI and translating these findings into clinical settings requires sensitive and replicable methods that link patterns of brain alterations to system-level variables.

Broadly speaking, some methods, such as clustering, categorize participants into groups based on their brain structure, while others, such as principal component analysis, represent brain imaging data as a linear combination of features. While clustering has become a popular method for multivariate analyses of neuroimaging data (McWhinney, Abé, et al., 2022), we do not expect groups of individuals to fall neatly into distinct clusters (e.g., healthy vs. unhealthy). Also, external variables may not exhibit a binary effect on the brain, but rather a nuanced, continuous one. Indeed, our previous study using clustering exemplified these issues. We found that there were no strictly separate clusters in brain imaging data and that the boundaries between BD and controls were not clear; that is, many controls fell into the same cluster as individuals with BD, while some individuals with BD clustered with controls (McWhinney, Abé, et al., 2022). The cluster assignment of individuals depended in part on continuous variables, including age and BMI, and effectively resulted in the categorization of these continuous variables, which is not optimal.

Neuroimaging data are often strongly correlated, both naturally (e.g., within brain networks) and as a result of preprocessing (e.g., coregistration and smoothing), which makes them well suited to linear projection methods. Instead of categorizing participants, such methods quantify degrees of variation and may be better suited to identifying sources of heterogeneity in brain imaging data, as many of these sources may themselves lie on a continuum. While machine learning (ML) techniques can overcome these challenges, such techniques require large training sets and out-of-sample validation, and their results can be difficult to interpret and translate into practice. Principal component analysis (PCA) represents a potentially optimal middle ground between these approaches, as it can perform well with modest sample sizes while reliably reducing dimensionality across many variables (i.e., brain regions) and deriving robust low-dimensional data representations (Comrey & Lee, 2013). By identifying covariance across individuals in numerous regions simultaneously, PCA can identify patterns of distributed brain network changes that can subsequently be linked with clinical correlates, while maintaining interpretability (Behdinan et al., 2015; Maralakunte et al., 2023; Rehák Bučková et al., 2023; Yeh et al., 2010).

Our main goal here is to test whether methods that represent brain imaging data as a linear combination of features are better at capturing associations with clinical variables than methods that categorize brain imaging data into clusters. To do so, we selected the most established and representative example of each approach, i.e., PCA and K-means clustering. Specifically, we used PCA to identify patterns of covariance across regions of interest and related them to clinical and demographic variables. We then compared the performance of this approach to our prior clustering study on an identical sample. We expected that, compared to clustering, patterns of covariance in brain imaging data, as identified by PCA, would show stronger associations with clinical and demographic variables.
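The reasoning behind this expectation can be illustrated with a toy simulation. The sketch below is not the analysis used in this study: it generates purely synthetic data in which 20 hypothetical "regions" share one continuous latent factor, then compares how much variance in a continuous outcome is explained by the first principal component score versus a two-cluster K-means label. All names, sample sizes, and noise levels are illustrative assumptions, and the K-means routine is a bare-bones Lloyd's algorithm written for this example.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def two_means_labels(Z, n_iter=50):
    """Plain Lloyd's algorithm with k = 2, initialized at the two most extreme rows."""
    row_mean = Z.mean(axis=1)
    centers = Z[[row_mean.argmin(), row_mean.argmax()]]
    for _ in range(n_iter):
        # Squared Euclidean distance of every row to both centers
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        centers = np.stack([Z[labels == k].mean(axis=0) for k in (0, 1)])
    return labels

# Synthetic data (assumption): one continuous latent factor drives all regions
rng = np.random.default_rng(42)
n, p = 500, 20
latent = rng.normal(size=n)
X = 0.8 * latent[:, None] + 0.6 * rng.normal(size=(n, p))
outcome = latent + 0.5 * rng.normal(size=n)  # continuous, BMI-like variable

# Standardize regions, then take the first principal component score
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
pc1_score = Z @ Vt[0]

# Two-cluster assignment of the same standardized data
cluster = two_means_labels(Z)

r2_pc1 = r_squared(pc1_score, outcome)
r2_cluster = r_squared(cluster, outcome)
print(f"R^2 with PC1 score: {r2_pc1:.2f}; with cluster label: {r2_cluster:.2f}")
```

In this setup the cluster label is effectively a dichotomized version of the continuous score, so it is expected to explain less variance in the outcome. Real data need not follow a single-factor structure, so the sketch only illustrates the argument and does not stand in for the empirical comparison.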

2 MATERIALS AND METHODS

2.1 Participating sites

The ENIGMA-BD working group brings together researchers with brain imaging and clinical data from people with BD (Hibar et al., 2016, 2018; McWhinney, Abé, et al., 2021; McWhinney et al., 2023). Nineteen site members of this group from 13 countries on six continents contributed individual subject structural MRI data, medication information, and body mass index (BMI) values for a total of 2770 participants. We split this sample into primary and replication samples. The primary sample (N = 2436) was identical to the one in our previous study (McWhinney, Abé, et al., 2022), which allowed us to directly compare the results of clustering and PCA on the same sample. An additional five sites contributed data to our replication sample for out-of-sample validation (N = 327). Two of the new sites also recruited individuals with schizophrenia (N = 107), whom we included to test the diagnostic specificity of the findings. Table 1 and Supplementary Tables S1 and S2 list the demographic and clinical details for each cohort. Supplementary Table S3 provides the diagnostic instruments used to obtain diagnosis and clinical information. Supplementary Table S4 lists exclusion criteria for study enrollment. Briefly, all studies used standard diagnostic instruments, including SCID (N = 12 studies), MINI (N = 1), and DIGS (N = 1). Most studies (N = 8) included both bipolar I (BDI) and bipolar II (BDII) disorders, five studies included only BDI, and one study included only BDII participants. At the time of scanning, most individuals with BD were euthymic (81%), with some depressed (15%), manic (2%), hypomanic (1%), or mixed (<1%). Substance abuse was an exclusion criterion in seven studies. Most studies did not exclude comorbidities other than substance abuse. Consequently, the sample is one of the broadest, most ecologically valid, and most generalizable representations of real-world BD studied to date. To test how well a method captures relevant clinical links, we need a broad representation of the diagnosis that is not restricted to one subtype and ideally also includes other diagnoses.

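TABLE 1. Demographic, diagnostic, and treatment characteristics of the sample.

| Characteristic | Controls | Cases | Difference |
|---|---|---|---|
| Primary sample | | | |
| Sample size (N) | 1600 | 836 | |
| Sex, N (%) male | 684 (42.8) | 353 (42.2) | χ2 = 0.25, p = .617 |
| Age, mean (SD) | 35.47 (12.63) | 40.57 (12.81) | F(1,2433) = 49.64, p < .001* |
| BMI, mean (SD) | 24.43 (4.12) | 27.10 (5.30) | F(1,2378) = 135.46, p < .001* |
| BMI category, N (%): Normal | 1014 (63.4) | 331 (39.6) | χ2 = 158.55, p < .001* |
| BMI category, N (%): Overweight | 437 (27.3) | 298 (35.6) | |
| BMI category, N (%): Obese | 149 (9.3) | 207 (24.8) | |
| Diagnosis in patients, N (%): BD-I | | 572 (70.5) | |
| Diagnosis in patients, N (%): BD-II | | 234 (28.9) | |
| Diagnosis in patients, N (%): BD-NOS | | 5 (0.6) | |
| Treatment at time of scan in patients, N (%): None | | 226 (27.0) | |
| Treatment at time of scan in patients, N (%): Lithium | | 373 (49.7) | |
| Treatment at time of scan in patients, N (%): Anticonvulsant | | 244 (35.4) | |
| Treatment at time of scan in patients, N (%): 1st gen. antipsychotic | | 37 (5.4) | |
| Treatment at time of scan in patients, N (%): 2nd gen. antipsychotic | | 262 (37.4) | |
| Treatment at time of scan in patients, N (%): Antidepressant | | 248 (35.4) | |
| Replication sample | | | |
| Sample size (N) | 136 | 191 | |
| Sex, N (%) male | 59 (43.4) | 115 (60.2) | χ2 = 2.85, p = .091 |
| Age, mean (SD) | 38.54 (13.51) | 40.25 (12.87) | F(1,246) = 0.01, p = .986 |
| BMI, mean (SD) | 24.60 (4.77) | 29.35 (6.63) | F(1,82) = 21.86, p < .001* |
| BMI category, N (%): Normal | 89 (65.4) | 54 (27.7) | χ2 = 51.03, p < .001* |
| BMI category, N (%): Overweight | 30 (22.1) | 56 (29.3) | |
| BMI category, N (%): Obese | 17 (12.5) | 82 (42.9) | |
| Diagnosis in patients, N (%): BD-I | | 15 (7.8) | |
| Diagnosis in patients, N (%): BD-II | | 9 (4.7) | |
| Diagnosis in patients, N (%): BD-NOS | | 1 (0.1) | |
| Diagnosis in patients, N (%): Schizophrenia | | 107 (56.0) | |
| Treatment at time of scan in patients, N (%): None | | 57 (29.8) | |
| Treatment at time of scan in patients, N (%): Lithium | | 24 (12.6) | |
| Treatment at time of scan in patients, N (%): Anticonvulsant | | 38 (19.9) | |
| Treatment at time of scan in patients, N (%): 1st gen. antipsychotic | | 35 (18.3) | |
| Treatment at time of scan in patients, N (%): 2nd gen. antipsychotic | | 112 (58.6) | |
| Treatment at time of scan in patients, N (%): Antidepressant | | 60 (31.4) | |

  • Note: Asterisks indicate significant group differences (*p < .05).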

All participating sites received approval from local ethics committees, and all participants provided written informed consent. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

2.2 MRI acquisition & processing

High-resolution T1-weighted brain anatomical MRI scans were acquired at each site, see Table S5. All groups used the same ENIGMA-standardized FreeSurfer protocol to derive region of interest (ROI) estimates of cortical thickness and surface area and performed standard visual and statistical quality assessment, as detailed at: http://enigma.ini.usc.edu/protocols/imaging-protocols/. These open-source protocols are standardized across the ENIGMA consortium, and available online to foster open science, replication, and better reproducibility. They were applied in prior publications by our group (Hibar et al., 2018; McWhinney et al., 2023), and more broadly in large-scale ENIGMA studies of major depression, schizophrenia, ADHD, OCD, PTSD, epilepsy, and autism (Thompson et al., 2020).

Briefly, FreeSurfer provides segmentations of 34 cortical regions per hemisphere, based on the Desikan–Killiany atlas, with estimates of cortical thickness and surface area for each region. Visual quality control was performed at the ROI level, aided by a visual inspection guide that included pass/fail segmentation examples. We also generated diagnostic histogram plots for each site, and outliers that deviated from the site mean for each structure by more than three standard deviations were flagged for further review. All observations failing quality inspection were withheld from subsequent analyses (see Table S6). Measurements were removed in 1.4% of participants per region on average, and missing values were imputed using the missForest algorithm (Stekhoven & Bühlmann, 2012). Prior analyses from the ENIGMA-BD working group showed that scanner field strength, voxel volume, and the version of FreeSurfer used for segmentation did not significantly influence the effect size estimates (Hibar et al., 2016).
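The three-standard-deviation flagging rule described above can be illustrated with a minimal sketch. The function name `flag_outliers`, the site labels, and the thickness values are hypothetical and are not taken from the ENIGMA protocol scripts; the sketch only shows the rule of comparing each value with its own site's mean and standard deviation.

```python
from statistics import mean, stdev

def flag_outliers(values_by_site, threshold=3.0):
    """Return {site: [indices]} of values more than `threshold` SDs from that site's mean."""
    flagged = {}
    for site, values in values_by_site.items():
        # A site with fewer than two values, or no spread, cannot yield outliers.
        if len(values) < 2 or stdev(values) == 0:
            flagged[site] = []
            continue
        site_mean = mean(values)
        site_sd = stdev(values)
        flagged[site] = [i for i, v in enumerate(values)
                         if abs(v - site_mean) > threshold * site_sd]
    return flagged

# Hypothetical thickness values (mm) for one region at two sites.
example = {
    "site_A": [2.50, 2.48, 2.52, 2.51, 2.49, 2.50, 2.47, 2.53,
               2.50, 2.51, 2.49, 2.52, 2.48, 2.50, 1.20],
    "site_B": [2.60, 2.58, 2.62, 2.61, 2.59],
}
print(flag_outliers(example))
```

In this sketch, flagged values are only marked for review, mirroring the text; the decision to withhold a measurement is made afterwards during quality inspection.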

2.3 Principal component analysis

We scaled each region to be zero-centered with a standard deviation of 1.0 and used PCA to obtain the loadings and scores of each principal component (PC) separately for cortical thickness and surface area, with each including all 68 cortical regions. Loadings indicated the contribution of each region in reduced-dimensional space for each component, and scores reflected the position of individuals in that component's space based on their cortical thickness or surface area weighted by each component's loadings. For each, we calculated the proportion of variance explained by each component and took the first component as the indicator of whole-brain structural covariance across individuals (Alexander-Bloch et al., 2013). The principal components are essentially anatomical patterns composed of highly correlated brain regions (Alexander-Bloch et al., 2013). We further focused on PCs that explained more than 10% of variance in either cortical measure.
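The steps in this paragraph (standardize each region, extract the first component, compute the proportion of variance explained, and derive one score per participant) can be sketched as follows. This is an illustration on simulated data, not the authors' code: the analyses were run in R, whereas this sketch uses plain Python and power iteration, and all names (`standardize`, `first_component`, the simulated `regions`) are assumptions. The weights here are the unit-length eigenvector of the correlation matrix, which some software reports under the name loadings.

```python
import math
import random

def standardize(columns):
    """Scale each column (one per region) to mean 0 and sample SD 1."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled

def correlation_matrix(z_cols):
    """Correlation matrix of already standardized columns."""
    n = len(z_cols[0])
    p = len(z_cols)
    return [[sum(a * b for a, b in zip(z_cols[i], z_cols[j])) / (n - 1)
             for j in range(p)] for i in range(p)]

def first_component(corr, iterations=500):
    """Power iteration for the leading eigenvector and its eigenvalue."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return v, eigenvalue

# Simulated data (not study data): 200 participants, 6 regions that share
# one global thickness factor plus region-specific noise.
random.seed(0)
n_subjects, n_regions = 200, 6
shared = [random.gauss(0.0, 1.0) for _ in range(n_subjects)]
regions = [[2.5 + 0.15 * g + random.gauss(0.0, 0.1) for g in shared]
           for _ in range(n_regions)]

z = standardize(regions)
weights, eigenvalue = first_component(correlation_matrix(z))
# Standardized data have total variance equal to the number of regions.
variance_explained = eigenvalue / n_regions
# One score per participant: standardized values weighted by the component.
scores = [sum(weights[j] * z[j][i] for j in range(n_regions))
          for i in range(n_subjects)]
print(round(variance_explained, 3), [round(w, 3) for w in weights])
```

Because every simulated region shares the same global factor, all weights have the same sign, which parallels the pattern of uniformly signed loadings reported for the first component in the Results.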

The scores for these components provided each participant with a single number indicating their position on a continuous range of alterations across 68 ROIs throughout the whole cortical mantle. We plotted these scores to better visualize them. We tested for associations between the component score and clinical or demographic factors, as described below. First, this PCA model was fitted in the primary sample, which was identical to the sample we used for clustering in our previous work, so that the two methods could be compared directly (McWhinney, Abé, et al., 2022). Second, it was performed in our replication sample, which included only newly contributing sites.

2.4 Statistical modeling

The list of all models we tested is included in the supplement. For each of cortical thickness and surface area, we used mixed linear regression modeling to test for associations between the individual component's score (and by proxy their associated patterns of brain structure) with group (BD or control), BMI, age, and sex. In previous studies, BMI has proven to be robustly related to cortical thickness (McWhinney et al., 2023; McWhinney, Brosch, et al., 2022). We tested for nonlinear effects of age (age squared), as well as for interactions between age, sex, and BMI. We additionally tested the inclusion of an interaction between group and BMI, including it if significant. BMI and age were each scaled to a range of 4.0 so that model estimates would equate to quartiles of their distributions.
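The rescaling of BMI and age is described only briefly, so the following is one plausible reading rather than the authors' procedure: a min-max rescaling to the interval from 0 to 4, under which a one-unit change in the predictor corresponds to one quarter of its observed range. The function name `scale_to_range` and the BMI values are hypothetical.

```python
def scale_to_range(values, target_range=4.0):
    """Min-max rescale so the smallest value maps to 0 and the largest to `target_range`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [(v - lo) / (hi - lo) * target_range for v in values]

# Hypothetical BMI values, for illustration only.
bmi = [18.5, 22.0, 24.4, 27.1, 31.0, 38.5]
scaled_bmi = scale_to_range(bmi)
print([round(v, 2) for v in scaled_bmi])
```

Under this reading, a regression estimate for the rescaled predictor is the expected change in the component score per quarter of the predictor's range.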

This same procedure was performed among participants with BD using predictors of BMI, age, sex, diagnosis subtype (BD-I or BD-II), age of illness onset, history of psychosis (Y/N), and prescribed medications at the time of scanning (antidepressant, 1st or 2nd generation antipsychotic, anticonvulsant, and/or lithium), coded as yes or no for each medication class as separate predictors, as in prior ENIGMA BD analyses (McWhinney, Abé, et al., 2022). We tested for interactions between BMI and medications and included them if significant. All models included research site as a random effect. We checked for normality of residuals using QQ plots, and for multicollinearity using the variance inflation factor (VIF). All modeling was completed using the packages lme4 (v1.1-21) and lmerTest (v3.1-3) in R version 4.1.1.

We compared the PCA with clustering analyses used in our previous paper (McWhinney, Abé, et al., 2022). We obtained the cluster number for each participant using the exact sample and procedure described in our previous study (McWhinney, Abé, et al., 2022). Both outcome measures (first component score for cortical thickness and cluster number) were tested as predictors of our two variables of interest, group (control or BD) and BMI, to determine which measure was a stronger predictor of each variable. We used mixed logistic regression modeling to test for associations between group and (1) covariates alone; (2) covariates with cluster number; (3) covariates with the first PC's score; and (4) all of the above. Covariates included BMI, age, sex as fixed effects, and research site as a random effect. We calculated the Bayesian Information Criterion (BIC), area under the curve (AUC) of the ROC curve, as well as predictor and model significance to compare the predictive power of each model in association with the participant group. We performed the same procedure using mixed linear regression modeling with BMI as the dependent variable but using the diagnostic group as a covariate instead of BMI and estimating model fit using R2 instead of AUC.
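The two fit metrics used for the group models can be made concrete with a small sketch. The AUC is computed here as the probability that a randomly chosen case receives a higher predicted value than a randomly chosen control, and the BIC uses the generic definition k ln(n) − 2 ln(L). The function names and example values are hypothetical, and for mixed models the parameter and observation counts entering the BIC depend on the software, so this is an illustration rather than a reproduction of the reported values.

```python
import math

def auc(predicted, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    cases = [p for p, y in zip(predicted, labels) if y == 1]
    controls = [p for p, y in zip(predicted, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better trade-off."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical predicted probabilities of being a case, and observed group.
predicted = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
observed = [1, 1, 1, 0, 0, 0]
print(auc(predicted, observed))
print(bic(log_likelihood=-5.0, n_params=2, n_obs=100))
```

The BIC penalizes each added predictor by ln(n), which is why a model that adds both cluster number and the PCA score can have a worse BIC than the PCA-only model even when its AUC is no lower.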

Lastly, we tested the fit of either average cortical thickness or the first component's score as dependent variables to BMI and group as predictors. We compared fit between these two models (R2, AIC, BIC). We additionally tested the Pearson correlation coefficient between average cortical thickness and the first component's score, and we further tested the significance of their association while adjusting for a random effect of research site.

2.5 Harmonization of between-site differences

The methods described above control for differences between sites using random effects in mixed regression modeling, identically to the preferred approach in most previous comparable studies (Hibar et al., 2016, 2018; McWhinney, Abé, et al., 2021; McWhinney, Abé, et al., 2022; McWhinney, Brosch, et al., 2022; McWhinney et al., 2023). As a sensitivity analysis, we additionally pre-processed the raw data using ComBat to mitigate between-site variability (Johnson et al., 2007; Radua et al., 2020). We repeated the PCA on cortical thickness in our primary sample, calculated scores for the first component, and tested for associations with group, age, sex, and BMI as specified above. Estimates and significance of these effects were compared with and without the ComBat transformation to test whether the combination of PCA and random effects adequately controlled for between-site variability.

2.6 Application in replication sample

The PCA model derived from cortical thickness of the primary sample was applied in the replication sample by applying the first PC's projections to the cortical thickness data for these new individuals. The first PC's score in the new data was tested for associations with group, BMI, age, sex, and a random effect of research site using linear mixed regression modeling. Group was first tested using two levels (healthy controls or patients), and second using three levels (healthy controls, individuals with BD, or schizophrenia). We included individuals with schizophrenia, as we wanted to test whether the method revealed something specific to BD or whether the results represented a general pattern across diagnoses. For comparison, a new PCA model was additionally run in the replication sample, resulting in each individual receiving a first component score from each of the two PCA models. These two scores were compared using the Pearson correlation coefficient.
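Applying a fitted PCA to new individuals amounts to standardizing their data and taking a weighted sum with the first component's weights. The sketch below assumes that the replication data are centered and scaled with the primary sample's means and standard deviations, a detail the text does not specify; all parameter values and names (`project`, `train_weights`, `refit_weights`) are hypothetical. For brevity, the comparison scores reuse the same centering and scaling and differ only in the weights, whereas a PCA refitted in the replication sample would also use that sample's own means and standard deviations.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def project(rows, means, sds, weights):
    """Score each individual: standardize per region, then apply component weights."""
    return [sum(w * (v - m) / s for v, m, s, w in zip(row, means, sds, weights))
            for row in rows]

# Hypothetical parameters for three regions, as if estimated in a training sample.
train_means = [2.5, 2.6, 2.4]
train_sds = [0.1, 0.2, 0.1]
train_weights = [1 / math.sqrt(3)] * 3
# Hypothetical weights standing in for a PCA refitted in the new sample.
refit_weights = [0.5, 0.5, math.sqrt(0.5)]

# Hypothetical new individuals (one row each, thickness in mm per region).
new_rows = [[2.6, 2.8, 2.5], [2.4, 2.4, 2.3], [2.5, 2.6, 2.4], [2.6, 2.6, 2.3]]
transferred = project(new_rows, train_means, train_sds, train_weights)
refit = project(new_rows, train_means, train_sds, refit_weights)
print(round(pearson(transferred, refit), 3))
```

A high correlation between the two sets of scores indicates that the transferred component and the refitted component order individuals in nearly the same way.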

3 RESULTS

3.1 Sample

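Both the primary and replication samples are outlined in Table 1. In the primary sample, individuals with BD were significantly older than controls, whereas age did not differ significantly between groups in the replication sample. In both samples, patients had significantly higher BMI relative to controls. All models that included both groups adjusted for both age and BMI.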

3.2 Principal component analysis

In the primary sample, the first PC accounted for 42.7% of variance in cortical thickness and 46.2% in surface area (see Figure 1). With the exception of the second component for cortical thickness (11.5%, see Figure S1), all other components in both measures explained <5% of variance each. The first principal component scores were associated with higher cortical thickness and surface area in all studied regions, with some regional variations in the strength of association (see Figure 2). Consequently, if a clinical variable was associated with a lower first component score, then it would be associated with lower cortical thickness across regions in a pattern reflecting the component's loadings across the regions.

FIGURE 1. PCA results with variance explained by each component (left) and distribution of the first two components for cortical thickness and surface area (right). Cortical thickness distributions are broken out by research site.
FIGURE 2. Factor loadings of the first principal component for cortical thickness (top) and surface area (bottom).

For cortical thickness, the first component's score was negatively associated with diagnosis of BD, BMI, and age, indicating that BD, BMI, and age were independently associated with a diffuse pattern of thinner cortex, see Table 2. Among participants with BD, lithium and antipsychotic medications showed opposing associations with the first principal component, such that antipsychotics were negatively while lithium was positively associated with cortical thickness, see Table 2. The second component for cortical thickness was significantly associated with BMI, age, and sex, see Table S7.

TABLE 2. Associations between demographic, clinical, and treatment characteristics and the first principal component of cortical thickness and surface area.
All participants Participants with BD
Estimate (SE) Significance Estimate (SE) Significance
Cortical thickness Group (BD) −1.38 (0.17) F(1,2421) = 67.80, p < .001* n/a n/a
BMI −0.39 (0.13) F(1,2419) = 8.79, p = .003* 0.25 (0.29) F(1,436) = 0.75, p = .387
Sex (F) 0.16 (0.14) F(1,2418) = 1.30, p = .254 −0.01 (0.34) F(1,435) = 0.00, p = .970
Age −3.10 (0.10) F(1,2422) = 1003.34, p < .001* −3.37 (0.28) F(1,439) = 143.67, p < .001*
Diagnosis BDII n/a n/a −0.74 (0.57) F(1,438) = 1.65, p = .200
Lithium n/a n/a 0.80 (0.38) F(1,437) = 4.50, p = .034*
Antipsychotic n/a n/a −0.90 (0.38) F(1,438) = 5.61, p = .018*
Anticonvulsant n/a n/a −0.79 (0.40) F(1,438) = 3.84, p = .051
Antidepressant n/a n/a 0.20 (0.37) F(1,436) = 0.28, p = .597
Age of onset n/a n/a 0.03 (0.02) F(1,437) = 2.29, p = .131
Psychosis n/a n/a 0.44 (0.46) F(1,439) = 0.92, p = .338
Cortical surface area Group (BD) −0.03 (0.22) F(1,2431) = 0.02, p = .904 n/a n/a
BMI −0.10 (0.18) F(1,2424) = 0.35, p = .554 0.41 (0.38) F(1,439) = 1.20, p = .274
Sex (F) −5.77 (0.18) F(1,2421) = 980.07, p < .001* −6.40 (0.44) F(1,436) = 210.13, p < .001*
Age −1.53 (0.13) F(1,2430) = 139.82, p < .001* −2.49 (0.36) F(1,440) = 47.03, p < .001*
Diagnosis BDII n/a n/a −0.81 (0.74) F(1,441) = 1.19, p = .276
Lithium n/a n/a 1.55 (0.48) F(1,440) = 10.23, p = .001*
Antipsychotic n/a n/a −0.36 (0.49) F(1,441) = 0.54, p = .463
Anticonvulsant n/a n/a 0.56 (0.52) F(1,441) = 1.16, p = .282
Antidepressant n/a n/a 0.33 (0.48) F(1,437) = 0.47, p = .495
Age of onset n/a n/a 0.03 (0.03) F(1,441) = 1.01, p = .315
Psychosis n/a n/a 0.34 (0.59) F(1,440) = 0.33, p = .565
  • Note: Asterisks indicate significant associations (*p < .05).

For surface area, only sex, age, and lithium treatment were associated with the first component scores (see Table 2). Females, relative to males, and older participants showed significantly smaller surface area, while those prescribed lithium showed larger surface area.

3.3 Comparing goodness of fit

Rankings of component loadings closely overlapped with the ranking based on clustering (Cohen et al., 2023) (Spearman ρ = 0.922, p < .001). We tested goodness of fit measures when using the covariates alone, relative to the addition of cluster number, the first PC score for cortical thickness, or both combined. Each of these sets of predictors was included in separate models for the prediction of group (control or BD) or BMI. Multicollinearity among the cluster number and the first component was acceptable in the combined model (VIF = 2.4). Results are shown in Table 3.

TABLE 3. Model fit metrics for predicting group (control or BD) or BMI using (1) covariates only (Group in BMI model, BMI in group model, age, sex, research site), (2) covariates with cluster number, (3) covariates with first component of PCA, and (4) all of the above.
Covariates Covariates, cluster Covariates, PCA Covariates, cluster, PCA
Predicting group Model fit BIC 2512 2501 2458 2465
AUC 0.811 0.815 0.823 0.823
Predictor significance Cluster number χ2 = 18.52, p < .001* χ2 = 0.19, p = .659
PCA first component χ2 = 57.99, p < .001* χ2 = 41.04, p < .001*
Model significance VS Covariates χ2 = 18.75, p < .001* χ2 = 61.67, p < .001* χ2 = 61.86, p < .001*
VS Cluster model χ2 = 43.11, p < .001*
VS PCA Model χ2 = 0.19, p = .660
Predicting BMI Model fit BIC 3793 3803 3803 3816
R2 .159 .162 .167 .167
Predictor significance Cluster number χ2 = 3.95, p = .047* χ2 = 0.16, p = .687
PCA first component χ2 = 7.76, p = .005* χ2 = 3.95, p = .047*
Model significance VS Covariates χ2 = 3.94, p = .047* χ2 = 7.65, p = .006* χ2 = 7.81, p = .020*
VS Cluster model χ2 = 3.86, p = .049*
VS PCA Model χ2 = 0.16, p = .686
  • Note: Asterisks indicate significant associations (*p < .05).

The BIC indicated that PCA offered the most accurate and parsimonious model when predicting group (BD or controls). The clustering and PCA models had a similar BIC when predicting BMI. However, goodness of fit (AUC for group, or R2 for BMI) was highest when using PCA to predict either group or BMI. While cluster number and the first component score were each significant predictors of both group and BMI when entered alone, when both were included as predictors in a single model, only the first PC remained a significant predictor of either outcome. Also, both the cluster and PCA models provided a significantly better fit than the covariate-only model. Critically, while the combined model performed significantly better than the cluster-based model, it was not a significant improvement over the PCA model (Table 3).

Lastly, when testing the association between the first component's score for cortical thickness with BMI and group, model fit (R2 = .065) was 28.6% higher than when using average cortical thickness (R2 = .050), with corresponding improvements in AIC and BIC, see Table 4. The first component score and average cortical thickness were highly correlated (r = .983, p < .001, see Figure S2), and average cortical thickness was significantly associated with the first component score of cortical thickness with adjustment for research site (t(2433) = 290.30, p < .001).

TABLE 4. Comparison of fit when using either the first component for CT or average cortical thickness as outcomes predicted by participant group and BMI.
Outcome Predictors R2 AUC AIC BIC Percent fit improvement
First component BMI, Diagnosis .065 6756 6780 28.6%
Avg. thickness BMI, Diagnosis .050 6793 6817
  • Note: Fit is shown using R2 for linear models. Percent improvement in fit for using the first component relative to average thickness is shown in each model.
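The percent improvement in Table 4 is a relative difference in R2. A minimal sketch of that calculation follows, with a hypothetical function name. Using the rounded R2 values shown in the table gives about 30%; the reported 28.6% presumably reflects unrounded R2 values, which are not shown here.

```python
def percent_improvement(new_fit, reference_fit):
    """Relative improvement of `new_fit` over `reference_fit`, in percent."""
    return (new_fit - reference_fit) / reference_fit * 100.0

# Rounded R2 values from Table 4: first component vs. average thickness.
improvement = percent_improvement(0.065, 0.050)
print(round(improvement, 1))
```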

3.4 PCA predictions in replication sample

When applying the PCA model to the cortical thickness estimates in the replication sample, on which the model was not trained, the first PC (i.e. thickness overall) was significantly smaller in patients (BD and schizophrenia) relative to controls (Difference estimate = 1.48, SE = 0.49, F(1,320) = 8.84, p = .003), and in older participants (F(1,321) = 193, p < .001), while there was no significant association with BMI or sex. These results are consistent with those for the original sample (see Table 2), except for the missing association with BMI, which may be due to lower statistical power in the smaller sample, which was 13.4% of the size of the primary sample. When categorized using three diagnostic groups, significant group differences remained (F(1,318) = 7.21, p = .001), with the thickest cortex in controls, intermediate in BD (Est = −0.96, SE = 0.54), and the thinnest in schizophrenia (Est = −2.31, SE = 0.61).

Within the replication sample, the first PC derived from the training sample loadings was strongly correlated with the first PC from the new PCA, completed in this replication sample (r = .998, t(325) = 308.32, p < .001). These findings suggest that the PCA model is sensitive not only to generalizable differences seen in BD from other samples but also to similar variations seen in other SMIs.

3.5 Exploration of components

The distribution of scores on the first component for cortical thickness was normal (W = 0.99, p = .422), whereas the second component showed a non-normal, bimodal distribution (Figure 1, W = 0.89, p < .001). The distinct cluster of lower scores consisted of data from a single research site; no other clinical or demographic data distinguished these clusters. The variance accounted for by each component is shown in Table S8.

3.6 Comparison with ComBat

Similar to the analyses without ComBat, the first PC of data preprocessed with ComBat remained significantly associated with age, BMI, and diagnosis of BD. Specifically, we found significantly thinner cortex in older participants (Est = −3.94, 95% CI [−4.05, −3.62]), those with higher BMI (Est = −0.50, 95% CI [−0.81, −0.18]), and those with BD (Est = −1.66, 95% CI [−2.03, −1.30]). In addition, the rankings of regional loadings for the first PC with and without ComBat were highly similar, as measured by the Spearman rank-order correlation coefficient (ρ = 0.921, p < .001); see Table S9 for more details.

4 DISCUSSION

In this study, the first of 68 total PCs accounted for 42.7% of variance in cortical thickness. The first PC, which indexed greater cortical thickness across all 68 cortical regions, was negatively associated with BD, BMI, antipsychotic medications, and age, and positively associated with Li treatment. These associations closely mirrored those found when applying clustering to the same sample, where the cluster with lower cortical thickness was also associated with diagnosis of BD, higher BMI, and older age, and the cluster with higher cortical thickness was associated with Li treatment (McWhinney, Abé, et al., 2022). Only PCA, and not clustering, detected links with antipsychotic medications. Also, when directly compared, the PCA score outperformed cluster membership as a predictor of clinical and demographic variables. When we applied the PCA to the previously unseen replication sample collected from additional ENIGMA-BD working group sites, on which the model was not trained, we found the same patterns of associations with diagnosis and age, even though the sample was almost 90% smaller. The same pattern of brain changes detected in BD was also associated with the diagnosis of schizophrenia. Similar to previous large-scale studies, surface area was not associated with diagnosis of BD or BMI (Hibar et al., 2018; McWhinney, Abé, et al., 2022; McWhinney, Brosch, et al., 2022; McWhinney et al., 2023). The different system-level correlates of CT and SA in our and previous studies further support the practice of keeping these measures separate.

We directly compared PCA and clustering in the same dataset. Both methods detected similar associations with system-level variables, including diagnosis, BMI, age, and Li exposure. However, PCA outperformed clustering in terms of model fit and sensitivity. We suspect there are systematic reasons for this. Even if there are no clearly defined discontinuities or clusters in the data, clustering will segment a continuous distribution of findings into several parts. Such categorization of a continuous range of values necessarily results in a loss of statistical power, as very similar individuals are treated as distinct when they fall on opposite sides of a clustering threshold. That is, clustering does not encode the distance between individuals in the multidimensional space, where those assigned to the same cluster still differ from one another. Similarly, as there are no strict boundaries, individuals assigned to different clusters may be very similar to one another. In contrast, PCA does not need arbitrary criteria to delineate patterns and find orthogonal effects in the dataset. Within the same component, we were able to maintain the strength of the associations and the distance between individuals. For both reasons, PCA should systematically outperform clustering when there are no clearly defined groups of individuals, and indeed that was the case in our study.
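The loss of power from categorizing a continuous measure can be illustrated with a small simulation. This sketch uses a median split as a simple stand-in for a two-cluster solution, which is an assumption and not the clustering procedure used in the earlier study; the data, effect size, and variable names are all simulated or hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n = 5000
# Continuous brain score with no natural clusters (simulated).
score = [random.gauss(0.0, 1.0) for _ in range(n)]
# Outcome that depends linearly on the score, plus noise.
outcome = [0.5 * s + random.gauss(0.0, 1.0) for s in score]

# Median split: individuals just above and just below the threshold are
# treated as maximally different, and distances within a group are discarded.
median = sorted(score)[n // 2]
group = [1.0 if s > median else 0.0 for s in score]

r_continuous = pearson(score, outcome)
r_split = pearson(group, outcome)
print(round(r_continuous, 3), round(r_split, 3))
```

For normally distributed scores, a median split is expected to attenuate the correlation to roughly 80% of its continuous value, which is the kind of sensitivity loss described above.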

While our study replicates previous findings of negative associations between cortical thickness and diagnosis of BD in mass univariate analyses (Hibar et al., 2018; McWhinney, Abé, et al., 2021; McWhinney, Brosch, et al., 2022; van Erp et al., 2018), the effect size for the association between the first PC and diagnosis of BD (d = 0.33) was larger than those for associations between individual ROIs and the diagnosis of BD in previous ENIGMA studies (d = 0.015–0.29) (Hibar et al., 2018). Moreover, instead of running one model per region, we captured covariance across all regions with a single number. Even in mass univariate analyses, associations between clinical variables and brain structure are evident across many regions and are not isolated to a single ROI (Hibar et al., 2018; McWhinney, Abé, et al., 2021; McWhinney, Brosch, et al., 2022; van Erp et al., 2018). The lesion model, where changes in a single region are necessary and sufficient to cause an illness, clearly does not apply to a complex disorder such as BD (Hibar et al., 2018; Reddan et al., 2017). Consequently, looking at distributed effects across groups of regions should be more informative than looking at individual regions. All in all, it is encouraging that our findings converge with these theoretical expectations and suggest that multivariate analyses are better suited to studying complex neuropsychiatric disorders than mass univariate ones.

While the second component for cortical thickness explained approximately 12% of variance, it was associated predominantly with research site; with the outlying site removed, the second component explained only 5% of the variance (see supplement). It is interesting that the first and second components, which are necessarily orthogonal to one another, were predominantly associated with clinical/anthropometric variables and research site, respectively. These differences need to be replicated in other studies, but they may represent a more generalizable pattern. After all, correlates of demographic and clinical variables are consistent in the same direction (i.e. negatively associated with cortical thickness). In contrast, the variations related to research sites may be less consistent and less predictable. Consequently, they may fall into separate components. Interestingly, when we applied a different method of removing the site effects, i.e. ComBat, the associations between the first PC and clinical/demographic variables remained consistent with the results obtained without ComBat. Even without removal of site effects from the raw data, by identifying orthogonal components, PCA may implicitly separate the site effects from the more predictable/systematic biological effects. If confirmed in future studies, this would be a major advantage of PCA.

The same distributed pattern of brain regions was associated with each of the system-level variables. When we applied the PCA projections to the replication sample, we found associations with similar system-level variables, consistent with observations by others (Cao et al., 2023). In fact, the same patterns were also associated with the diagnosis of schizophrenia. This is in keeping with other large-scale studies or meta-analyses, which have also demonstrated that there is a common, non-specific pattern of case-control differences across major psychiatric disorders (Goodkind et al., 2015; Hettwer et al., 2022; Matsumoto et al., 2023), with highly correlated neurostructural abnormalities between BD and schizophrenia (Opel et al., 2020) and with PCA detecting a profile of shared cortical thickness differences across 6 major psychiatric disorders, which explained 48% of variance (Writing Committee for the Attention-Deficit/Hyperactivity Disorder et al., 2021). Some have argued that there may be more cause-specific alterations after removing this non-specific pattern of the first component (Cao et al., 2023), though with the exception of the second component for cortical thickness, subsequent components explained only a small fraction of variance (i.e. typically less than 5% each). All in all, this study contributes to the growing body of evidence for a lack of specificity of associations between some key clinical and demographic factors and patterns of brain alterations, the so-called neural P factor (Sprooten et al., 2022). Many different variables, including age, obesity, and psychiatric diagnoses, may be negatively associated with cortical thickness across a wide range of cortical regions.

When testing associations with BMI and group, the first PC explained 28.6% more variance than average cortical thickness. Principal component analysis, which accounts for the regional distribution of effects, performed better than simple average cortical thickness, which collapses information across all regions. At the same time, the first PC was highly correlated with average cortical thickness. While there are regional effects and accounting for these improves the fit of the model, the system-level variables are associated with some level of cortical thinning across most regions. This may further contribute to the lack of specificity of the system-level to brain associations and is consistent with findings from another study indicating that the accuracy of an ML classifier comparing controls versus BD versus schizophrenia was strongly dependent on global grey matter measures (Schwarz et al., 2019).

Our findings confirmed that BD is characterized by diffuse regional structural brain alterations, specifically lower cortical thickness. The fact that these alterations are so diffuse as to resemble global atrophy is interesting. We can only speculate about why this would occur. First, most pathologies will result in atrophy, i.e. thinning of the cortex. Second, changes in one region are likely to propagate through the network, thus eventually involving more related regions. Third, perturbations within one network are likely to propagate to other networks, thus involving even more regions, eventually to the point of resembling global atrophy. These mechanisms could explain why so many regions are correlated across individuals, why the association with the first component is uniformly in one direction, and even why different predictors (age, BMI, BD, schizophrenia, medications) are all associated with the same global pattern.

The advantages of this study include the large sample size, the validation of the most prominent findings in a replication sample, and the multivariate approach which improved statistical sensitivity. Importantly, we were able to directly compare results between two unique multivariate analyses in the same sample: clustering and PCA.

The multi-site nature of the study is a limitation that complicates the data analyses and interpretation of the results. At the same time, our findings suggest that PCA may code site in separate PCs from the effects of clinical/anthropometric variables. Along similar lines, the sample contained a broad representation of BD, including individuals with BDII and we also included a sample of people with schizophrenia. In order to test how well a method captures relevant clinical links, we need a representative/generalizable snapshot of the disorder, which is not restricted to one subtype and ideally also includes representation of other diagnoses. More detailed clinical or biological markers beyond those analyzed were not broadly available throughout the ENIGMA-BD working group. While it is possible that associations with other clinical variables would show different patterns, the consistent and replicated nature of the direction of associations and spatial distribution of networks suggest this is unlikely. As these were independently collected datasets, not a centralized single study, we did not have access to raw, whole-brain data.

We performed these analyses on derived estimates of cortical thickness and surface area, and we cannot generalize our findings to other measures, which may show different patterns of associations. There are other methods within the broad category of linear representations of data, which may be of specific use, for example, when attempting multimodal data fusion, e.g. FLICA. However, comparing different methods within this group was not our goal and would likely only lead to incremental gains, if any. We did not test other multivariate techniques, including ML methods, which are difficult to interpret. Our interest was in applying traditional methods which may also be applied in smaller samples and in a wide range of situations, including clinical settings. Due to the simplicity and straightforward application of linear methods, they may be a method of choice, especially where the size or structure of the dataset would not allow for proper ML.

5 CONCLUSION

In this study, we confirmed that cortical thickness in widespread regional networks, as determined by PCA, is associated with relevant clinical and demographic variables: negatively with diagnosis, age, BMI, and treatment with antipsychotic medications, and positively with treatment with lithium. The action of these factors on a widespread network suggests that conceptualizing or studying their effects on individual regions may be misleading. In addition, significant associations of many different system-level variables with the same network suggest a lack of specificity to individual clinical and demographic factors. While there may be general agreement among multivariate techniques on these associations, PCA outperformed clustering and may better fit the nature of brain imaging data in SMI. More broadly, the results have demonstrated that representing data as a linear combination of features is superior to clustering when investigating links between brain and system measures in SMI. This work could help researchers make informed decisions about which methods to use and could save them from applying an ill-fitting method when a simpler, more reproducible one is available.

ACKNOWLEDGEMENTS

We gratefully acknowledge the following contributions and research funding sources that made this study possible: PMT, CRKC, and SIT were supported by NIH grants U54 EB020403 from the Big Data to Knowledge (BD2K) Program, R56 AG058854 (The ENIGMA World Aging Center), R01 MH116147 (The ENIGMA Sex Differences Initiative), and R01 MH129742-01 (The ENIGMA Bipolar Initiative), and by the Baszucki Brain Research Fund and Milken Institute's Center for Strategic Philanthropy; CRKC also acknowledges NIA T32AG058507. The St. Göran study was supported by grants from the Swedish Research Council (2022-01643), the Swedish Foundation for Strategic Research (KF10-0039), the Swedish Brain Foundation (FO2022-0217), and the Swedish Federal Government under the LUA/ALF agreement (ALF 20200036, ALFGBG-965444).

    FUNDING INFORMATION

    This work is also part of the German multicenter consortium “Neurobiology of Affective Disorders. A translational perspective on brain structure and function,” funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG; Forschungsgruppe/Research Unit FOR2107). Principal investigators (PIs) with respective areas of responsibility in the FOR2107 consortium are as follows: Work Package WP1, FOR2107 cohort and brain imaging: TK (speaker FOR2107; DFG grant numbers KI588/14-1, KI588/14-2, KI588/20-1, KI588/22-1), UD (co-speaker FOR2107; DA 1151/5-1, DA 1151/5-2), AK (KR 3822/5-1, KR 3822/7-2), IN (NE 2254/1-1, NE 2254/2-1, NE 2254/3-1, and NE 2254/4-1), BS (STR 1146/18-1), CK (KO 4291/3-1). Further support from the German sites was provided by MNC and FOR2107-Muenster: This work was funded by the German Research Foundation (SFB-TRR58, Project C09 to UD) and the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Münster (grant Dan3/012/17 to UD and grant SEED11/18 to NO); FOR2107-Muenster: This work was supported by grants from the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Münster (grant MzH 3/020/20 to TH) and the German Research Foundation (DFG grants HA7070/2-2, HA7070/3, HA7070/4 to TH, CRC1457/A7 to RN). The Medellin studies (GIPSI) were supported by the PRISMA UNION TEMPORAL (UNIVERSIDAD DE ANTIOQUIA/HOSPITAL SAN VICENTE FUNDACIÓN), Colciencias-INVITACIÓN 990 de 3 de agosto de 2017, Codigo 99059634. The San Raffaele site was supported by the Italian Ministry of Health RF-2018-12367789 project. The University of Galway research was supported by the Irish Research Council (IRC) Postgraduate Scholarship, Ireland, awarded to LN and to GM, and by the Health Research Board (HRA-POR-324) awarded to DMC and (HRA_POR/2011/100) awarded to CMcD. We thank the participants and acknowledge the support of the Wellcome Trust-HRB Clinical Research Facility and the Centre for Advanced Medical Imaging, St. James Hospital, Dublin, Ireland. JS and RTK received support from the William K. Warren Foundation and the National Institute of Mental Health (R21MH113871); JS also acknowledges the National Institute of General Medical Sciences (P20GM121312). MB is supported by an NHMRC Senior Principal Research Fellowship and Leadership 3 Investigator grant (1156072 and 2017131). JH and EB were supported by the Czech Health Research Council Project No. NU21-08-00432 and by ERDF-Project Brain dynamics, No. CZ.02.01.01/00/22_008/0004643. IHG received support from the National Institute of Mental Health (R37MH101495). This study was also funded by EU-FP7-HEALTH-222963 “MOODIN-FLAME” and EU-FP7-PEOPLE-286334 “PSYCHAID.” The Barcelona group would like to thank CIBERSAM (EPC) and the Instituto de Salud Carlos III (PI18/00877 and PI19/00394) for their support. This work was supported by the Singapore Bioimaging Consortium (RP C009/2006) research grant awarded to K.S. The CIAM group (FMH, PI) was supported by the University Research Committee, University of Cape Town, and the South African funding bodies National Research Foundation and Medical Research Council; DJS from CIAM was supported by the SAMRC. FS from CIAM was supported by a Brain-Behavior Unit Postdoctoral Research Fellowship. The Sydney studies were supported by the Australian National Health and Medical Research Council (NHMRC) Program Grant 1037196, Investigator Grants 1176716 (PRS) and 1177991 (PBM), Project Grants 1063960 and 1066177, the Lansdowne Foundation, Good Talk, and the Keith Pettigrew Family, as well as the Janette Mary O'Neil Research Fellowship to JMF. The study was also supported by NIMH grant R01 MH090553 (to RAO). Funding for the Oslo-Malt cohort was provided by the South Eastern Norway Regional Health Authority (2015-078), the Ebbe Frøland Foundation, and a research grant from Mrs. Throne-Holst.
EV acknowledges the support of the Spanish Ministry of Science and Innovation (PI15/00283, PI18/00805) integrated into the Plan Nacional de I + D + I and co-financed by the ISCIII-Subdirección General de Evaluación and the Fondo Europeo de Desarrollo Regional (FEDER); the Instituto de Salud Carlos III; the CIBER of Mental Health (CIBERSAM); the Secretaria d'Universitats i Recerca del Departament d'Economia i Coneixement (2017 SGR 1365), the CERCA Programme, and the Departament de Salut de la Generalitat de Catalunya for the PERIS grant SLT006/17/00357. Lastly, this study was supported by the Canadian Institutes of Health Research (103703, 106469, 142255, 180449, and 186254), the Nova Scotia Health Research Foundation, a Dalhousie Clinical Research Scholarship to TH, and the Brain & Behavior Research Foundation (formerly NARSAD) 2007 Young Investigator and 2015 Independent Investigator Awards to TH. The Minnesota sites were supported by the National Institutes of Health (U01MH108150 to SS; K01MH093621 to SU), the Center for Magnetic Resonance Research (P41 EB027061; 1S10OD017974), a Merit Review Award (#I01CX000227 to SS) from the United States (U.S.) Department of Veterans Affairs Clinical Science Research and Development Service, and Minneapolis VA Health Care System resources to SU and SS. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs or the United States government. LTE was supported by NIH MH083968 and the Desert-Pacific Mental Illness Research Education and Clinical Center.

    EV acknowledges the support of CIBER-Consorcio Centro de Investigación Biomédica en Red (CB07/09/0004), Instituto de Salud Carlos III, Spanish Ministry of Science and Innovation and grants PI18/00805 and PI21/00787, integrated into the Plan Nacional de I + D + I and co-financed by the ISCIII-Subdirección General de Evaluación and the Fondo Europeo de Desarrollo Regional (FEDER); the Instituto de Salud Carlos III; the Secretaria d'Universitats i Recerca del Departament d'Economia i Coneixement (2021 SGR 01358), the CERCA Programme, and the Departament de Salut de la Generalitat de Catalunya for the PERIS grant SLT006/17/00357. EV also acknowledges the support of the European Union Horizon 2020 research and innovation program (EU.3.1.1. Understanding health, wellbeing, and disease: Grant No 754907 and EU.3.1.3. Treating and managing disease: Grant No 945151).

    JR acknowledges the support of the Spanish Ministry of Science and Innovation, Instituto de Salud Carlos III (PI19/00394 and PI22/00261), integrated into the Plan Nacional de I + D + I and co-financed by ERDF Funds from the European Commission (“A Way of Making Europe”), CIBERSAM, and the Secretaria d'Universitats i Recerca, Departament d'Economia i Coneixement and Departament de Salut (2021 SGR 01128).

    CONFLICT OF INTEREST STATEMENT

    PMT and CRKC received a grant from Biogen, Inc., for research unrelated to this manuscript. DJS has received research grants and/or consultancy honoraria from Lundbeck and Sun. LNY has received speaking/consulting fees and/or research grants from Abbvie, Alkermes, Allergan, AstraZeneca, CANMAT, CIHR, Dainippon Sumitomo Pharma, Janssen, Lundbeck, Otsuka, Sunovion, and Teva. TE received speaker's honoraria from Lundbeck and Janssen Cilag and is a consultant to Sumitomo Pharma America. EV has received grants and served as consultant, advisor, or CME speaker for the following entities (unrelated to the present work): AB-Biotics, Abbott, Allergan, Angelini, Dainippon Sumitomo Pharma, Ferrer, Gedeon Richter, Janssen, Lundbeck, Otsuka, Sage, Sanofi-Aventis, and Takeda. PMT and CRKC have received partial research support from Biogen, Inc. (Boston, USA) for work unrelated to the topic of this manuscript. EV has received grants and served as consultant, advisor, or CME speaker for the following entities: AB-Biotics, AbbVie, Adamed, Angelini, Biogen, Biohaven, Boehringer-Ingelheim, Celon Pharma, Compass, Dainippon Sumitomo Pharma, Ethypharm, Ferrer, Gedeon Richter, GH Research, Glaxo-Smith Kline, HMNC, Idorsia, Johnson & Johnson, Lundbeck, Medincell, Merck, Novartis, Orion Corporation, Organon, Otsuka, Roche, Rovi, Sage, Sanofi-Aventis, Sunovion, Takeda, and Viatris, outside the submitted work. Yatham reports grants from Abbvie and Dainippon Sumitomo, and served as an advisor or consultant or speaker to JAMA Pharma, Intracellular Therapies, Merck, Allergan, GSK, Gedeon Richter, Sanofi, Sunovion, and Alkermes, outside the submitted work.

    DATA AVAILABILITY STATEMENT

    The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
