Proteome-Wide Association Study for Finding Druggable Targets in Progression and Onset of Parkinson's Disease
Funding: This work was supported by the National Natural Science Foundation of China (Grants 81671068, 81873727, 82171196, and 82304990) and the China Postdoctoral Science Foundation (Grant 2023M732380).
Chenhao Gao, Haobin Zhou, Weixuan Liang, Zhuofeng Wen contributed equally to this work as co-first authors.
ABSTRACT
Objective
To identify and validate causal protein targets for therapeutic intervention in both the onset and progression of Parkinson's disease (PD) through integrative proteomic and genetic analyses.
Method
We utilized large-scale plasma and brain protein quantitative trait loci (pQTL) datasets from the deCODE Health study and the Religious Orders Study/Rush Memory and Aging Project (ROS/MAP), respectively. Proteome-wide association studies (PWAS) were conducted using the OTTERS framework for plasma proteins and the FUSION tool for brain proteins, examining associations with PD onset and three progression phenotypes: composite, motor, and cognitive. Significant protein associations (FDR-corrected p < 0.05) from PWAS were further validated using summary-based Mendelian randomization (SMR), colocalization analyses, and reverse Mendelian randomization (MR) to establish causality. Phenome-wide Mendelian randomization (PheW-MR) was performed to assess potential side effects across 679 disease traits when targeting these proteins to reduce PD-related phenotype risk by 20%. Additionally, we conducted cellular distribution-based clustering using gene expression data from the Allen Brain Atlas (ABA) to explore the distribution of key proteins across brain regions, constructed protein–protein interaction (PPI) networks via the STRING database to explore interactions among proteins, and evaluated the druggability of identified targets using the DrugBank database to identify opportunities for drug repurposing.
Result
Our analyses identified 25 candidate proteins associated with PD phenotypes, including 16 plasma proteins linked to PD progression (10 cognitive, 4 motor, and 3 composite) and 9 plasma proteins associated with PD onset. Notably, GPNMB was implicated in both plasma and brain tissues for PD onset. PheW-MR revealed predominantly beneficial side effects for the identified targets, with 83.7% of associations indicating positive outcomes and 16.3% indicating adverse effects. Cellular clustering categorized candidate targets into three distinct expression profiles across brain cell types using ABA. PPI network analysis highlighted one key interaction cluster among the proteins for PD cognitive progression and PD onset. Druggability assessment revealed 15 out of 25 proteins had repurposing opportunities for PD treatment.
Conclusion
We have identified 25 causal protein targets associated with the onset and progression of PD, providing new insights into the research and development of treatment strategies for PD.
Abbreviations
- ABA: Allen Brain Atlas
- ACAT-O: Aggregated Cauchy Association Test
- BH: Benjamini–Hochberg
- FDR: false discovery rate
- FUSION: Functional Summary-based Imputation
- GReX: genetically regulated expression
- GWAS: genome-wide association studies
- HEIDI: Heterogeneity in Dependent Instruments
- IVs: instrumental variables
- Lassosum: a frequentist LASSO-based approach
- LD: linkage disequilibrium
- NSAIDs: nonsteroidal anti-inflammatory drugs
- OTTERS: Omnibus Transcriptome Test using Expression Reference Summary data
- PD: Parkinson's Disease
- PheW-MR: phenome-wide MR
- PP: posterior probability
- PPI: protein–protein interaction
- pQTL: protein quantitative trait loci
- PRS-CS: a Bayesian multivariable regression model utilizing continuous shrinkage priors
- P+T: p-value thresholding with LD clumping
- PWAS: proteome-wide association studies
- ROS/MAP: Religious Orders Study/Rush Memory and Aging Project
- SDPR: a nonparametric Bayesian Dirichlet Process Regression model
- SMR: summary-based Mendelian randomization
- SNPs: single nucleotide polymorphisms
- STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
- UPGMA: Unweighted Pair Group Method with Arithmetic Mean
1 Introduction
Parkinson's disease (PD) is a neurodegenerative disorder characterized by the progressive loss of dopaminergic neurons in the substantia nigra pars compacta and the accumulation of α-synuclein aggregates, known as Lewy bodies. It is the second most prevalent neurodegenerative disease after Alzheimer's disease [1, 2]. Epidemiological studies indicate a global increase in PD cases, rising from 2.5 million to 6.1 million over the past three decades [3]. With the aging global population, the incidence of PD is projected to escalate significantly, imposing substantial socioeconomic burdens on patients and healthcare systems [4].
PD manifests through a spectrum of motor symptoms, including tremors, rigidity, bradykinesia, and postural instability, resulting from the degeneration of dopaminergic neurons [1, 2, 5, 6]. In addition to these motor deficits, PD encompasses a range of non-motor symptoms such as cognitive decline, mood disorders, and autonomic dysfunction, which contribute to the disease's complexity and severely impact the quality of life of affected individuals [7]. The heterogeneity in disease progression, characterized by varying rates of motor and cognitive deterioration among patients, presents significant challenges for effective treatment and management strategies. Current therapeutic approaches for PD primarily aim at symptomatic relief, employing medications like levodopa and dopamine agonists to replenish dopamine levels and alleviate motor symptoms [8]. While these treatments can provide temporary improvement, they do not halt the underlying neurodegenerative processes driving the disease [8]. The absence of disease-modifying therapies underscores the urgent need for interventions that can influence both the onset and progression of PD.
Advancements in proteomics and genomics have opened new avenues for identifying biomarkers and therapeutic targets in complex diseases such as PD. Proteome-wide association studies (PWAS), leveraging protein quantitative trait loci (pQTL) data, facilitate the identification of protein-level associations with disease phenotypes [9]. Specifically, plasma and brain proteomics offer valuable insights into systemic and central nervous system-specific protein alterations linked to PD [10, 11]. Integrative genomic analyses that combine genome-wide association studies (GWAS) with proteomic data enable the elucidation of causal relationships between genetic variants, protein expression, and disease traits [10, 11]. Methodologies such as PWAS, summary-based Mendelian randomization (SMR) [12], colocalization analyses [13], and phenome-wide MR (PheW-MR) [14] are instrumental in dissecting the genetic architecture of PD and identifying proteins that may serve as potential therapeutic targets. Therefore, by applying these robust and complementary approaches in a logical sequence, our study aims to identify reliable candidate drug targets for PD.
Few studies of PD drug targets have addressed how the disease develops over time. The primary objective of our study is therefore to identify and validate potential therapeutic targets for both the onset and progression of PD through integrative proteomic and genetic analyses, providing novel perspectives on the dynamic changes associated with PD. By harnessing large-scale plasma and brain pQTL datasets from the deCODE Health study and the Religious Orders Study/Rush Memory and Aging Project (ROS/MAP), respectively, we conducted comprehensive PWAS to uncover proteins associated with various PD phenotypes, including PD onset and three distinct progression phenotypes: composite, motor, and cognitive. These PD phenotypes were selected to comprehensively capture the entire disease trajectory. Subsequent sensitivity analyses, including SMR and colocalization, were employed to confirm the causal relevance of these proteins to the PD phenotypes. Additionally, a reverse MR analysis was performed to explore potential bidirectional causal relationships between proteins and PD. The identification of causal proteins may highlight candidate drug targets for PD treatment. Furthermore, PheW-MR analyses were conducted to assess potential side effects of targeting these candidate proteins, thereby informing the safety and efficacy of prospective therapeutic interventions for PD. Given that PD primarily affects the brain, it is essential to understand the cellular distribution of the genes encoding candidate drug target proteins across various brain regions to develop effective therapies. To achieve this, we utilized gene expression data from the Allen Brain Atlas (ABA) to perform cluster analysis, refining the distribution of candidate targets within the brain and identifying co-expression patterns in specific cell populations.
Moreover, we employed protein–protein interaction (PPI) networks to investigate interactions between candidate proteins across multiple PD phenotypes, thereby elucidating functional relationships and exploring the potential for multi-target drug development. Then, we integrated drug target information from the DrugBank database to explore opportunities for drug repurposing of candidate targets [15]. Figure 1 shows the research workflow of this study. Taken together, the identification of causal protein targets not only enhances our understanding of PD pathogenesis but also paves the way for the development of disease-modifying therapies and personalized medicine approaches aimed at improving patient outcomes.

2 Method
2.1 Data Sources
We obtained plasma pQTL data from the deCODE Health study, which performed comprehensive proteomic profiling in plasma samples from 35,559 Icelandic participants using the SomaScan platform, ultimately quantifying 4907 distinct plasma proteins [16]. For brain-derived protein data, we used pQTL information on 1097 proteins measured in the dorsolateral prefrontal cortex from participants in the ROS/MAP using mass spectrometry [17]. We also incorporated GWAS summary statistics for three PD progression phenotypes, including composite (2755 patients), motor (2848 patients), and cognitive (2788 patients), as reported by Tan MMX et al. [18] For PD onset, the discovery cohort consisted of GWAS summary statistics derived from Nalls MA et al. [19] (15,056 cases and 12,637 controls), and the replication cohort employed data from the FinnGen consortium [20] (4235 cases and 373,042 controls). Details of these datasets are provided in Table S1.
2.2 PWAS
We conducted PWAS on both brain and whole blood tissues to identify protein-level associations with PD phenotypes. For brain tissue, we utilized the Functional Summary-based Imputation (FUSION) framework, which employs existing pQTL weights specifically tailored to brain proteomes [21]. FUSION is a well-established computational tool that imputes genetically regulated gene expression and assesses gene-level associations with complex traits and diseases. By leveraging pretrained pQTL weights for brain tissue, we integrated PD-related phenotypes and performed PWAS using FUSION on a Linux platform [22].
In contrast, appropriate pretrained PWAS weights for whole blood were unavailable. To overcome this limitation, we employed the Omnibus Transcriptome Test using Expression Reference Summary data (OTTERS), a specialized framework designed to generate and utilize pQTL weights from summary-level data [23]. OTTERS operates in two primary stages. In Stage I, we constructed genetically regulated expression (GReX) imputation models by deriving cis-pQTL weights from summary-level cis-pQTL data and external European linkage disequilibrium (LD) reference panels from the 1000 Genomes Project, with the cis region defined as 1 Mb upstream and downstream of each protein-coding gene. Multiple methodologies were employed for weight derivation, including P+T (p-value thresholding with LD clumping) [24], lassosum (a frequentist LASSO-based approach) [25], SDPR (a nonparametric Bayesian Dirichlet Process Regression model) [26, 27], and PRS-CS (a Bayesian multivariable regression model utilizing continuous shrinkage priors) [28]. In Stage II, these cis-pQTL weights were used to estimate GReX for each gene, enabling gene-level association tests within the GWAS dataset. PWAS p-values derived from each modeling approach were subsequently integrated into a single composite metric using the Aggregated Cauchy Association Test (ACAT-O) [29]. We refer to the resultant p-values from this integrated test as OTTERS p-values.
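The final combination step can be illustrated with a short sketch. It assumes the standard equal-weight Cauchy combination form of ACAT; the function name `acat` and the four p-values are hypothetical, and the real OTTERS implementation may differ in details such as its handling of extremely small p-values.

```python
import math

def acat(p_values, weights=None):
    """Combine p-values with the Cauchy combination test (equal weights by default).

    Illustrative sketch only: p-values must lie strictly between 0 and 1.
    """
    if weights is None:
        weights = [1.0] * len(p_values)
    total = sum(weights)
    # Map each p-value to a standard Cauchy variate; small p gives a large value.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values)) / total
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi

# Hypothetical per-method PWAS p-values for one protein.
method_p = {"P+T": 0.004, "lassosum": 0.020, "SDPR": 0.300, "PRS-CS": 0.010}
otters_p = acat(list(method_p.values()))
print(f"combined p-value: {otters_p:.4g}")
```

A single strong signal from one model is largely preserved in the combined value, which is why this kind of omnibus test suits a setting where the best-performing weight model is not known in advance.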
For our analyses, we incorporated plasma pQTL data from the deCODE Health Study and brain pQTL data from the Religious Orders Study and the Rush Memory and Aging Project (ROS/MAP). We applied the Benjamini–Hochberg (BH) method to correct p-values and control the false discovery rate (FDR), thereby minimizing false positives without excessively inflating false negatives. In the PWAS, proteins with FDR-corrected p-values below 0.05 were considered significantly associated with the corresponding PD phenotype. Specifically, for proteins associated with PD onset, those that reached significance in the discovery cohort and maintained p < 0.05 in the replication cohort were deemed successfully replicated and selected for subsequent analyses.
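The BH adjustment used for the FDR threshold can be sketched as follows. This is a generic textbook implementation rather than the authors' code, and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical PWAS p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr_p]
```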
2.3 Sensitivity Analyses
2.3.1 SMR Analysis
To rigorously validate our PWAS findings, we employed SMR to test whether the brain and plasma proteins identified by PWAS were causally associated with PD-related phenotypes. SMR integrates pQTL and GWAS summary statistics within the MR framework, which utilizes instrumental variables (IVs), genetic variants that serve as proxies for protein levels, to enable the assessment of the causal impact of protein levels on PD-related traits [12]. SMR is an extension of MR, and MR adheres to three core assumptions: (i) the relevance assumption, which requires a strong association between IVs and the exposure; (ii) the independence assumption, which requires that IVs are not associated with confounders of the exposure–outcome relationship; and (iii) the exclusion restriction assumption, which requires that IVs influence the outcome solely through the exposure, with no direct effect on the outcome. Unlike conventional two-sample MR, where two independent GWAS datasets are required to estimate the causal effect between traits, SMR combines pQTL and GWAS data and utilizes the Heterogeneity in Dependent Instruments (HEIDI) test [30]. This approach offers more robust discrimination between pleiotropic and linkage effects, reduces potential biases due to LD, and lowers the large sample size requirements often seen in standard MR methods [30].
We adopted the SMR-derived estimates as our primary measures of each protein's influence on PD-related phenotypes. Given the inherent stringency of the SMR method, we applied the Benjamini–Hochberg procedure to control the FDR, thereby minimizing false positives without excessively inflating false negatives. Any protein that met the criteria of an FDR-adjusted SMR p < 0.05 and a HEIDI p > 0.01 was considered to have a causal relationship with the respective PD-related phenotype [31].
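For a single top cis-pQTL, the SMR estimate and its test statistic can be sketched from summary statistics alone. The snippet follows the commonly cited SMR formulation (ratio estimate with an approximate one-degree-of-freedom chi-square statistic); the function names and input numbers are hypothetical, and the HEIDI test itself is not implemented here, only the decision rule that uses its p-value.

```python
import math

def smr_estimate(b_zx, se_zx, b_zy, se_zy):
    """SMR effect of exposure (protein) on outcome from one instrument.

    b_zx, se_zx: SNP effect on the protein (pQTL) and its standard error.
    b_zy, se_zy: SNP effect on the outcome (GWAS) and its standard error.
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx
    # Delta-method standard error of the ratio estimate.
    se_xy = math.sqrt(se_zy**2 * b_zx**2 + se_zx**2 * b_zy**2) / b_zx**2
    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_zx**2 * z_zy**2) / (z_zx**2 + z_zy**2)
    p_smr = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, se_xy, p_smr

def passes_smr_heidi(fdr_p_smr, p_heidi, fdr_cut=0.05, heidi_cut=0.01):
    """Decision rule described in the text: FDR-adjusted SMR p < 0.05 and HEIDI p > 0.01."""
    return fdr_p_smr < fdr_cut and p_heidi > heidi_cut

# Hypothetical summary statistics for one protein.
b_xy, se_xy, p_smr = smr_estimate(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
```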
2.3.2 Colocalization Analysis
To determine whether the observed associations between proteins and PD-related phenotypes stemmed from a shared causal variant rather than LD, we conducted Bayesian colocalization analyses using the coloc R package [13]. This methodology integrated both brain and plasma pQTL data with GWAS summary statistics for PD-related traits. We evaluated five distinct hypotheses: (i) H0: no causal variant influences either the protein or PD-related phenotypes; (ii) H1: a causal variant affects only the protein; (iii) H2: a causal variant affects only the PD phenotype; (iv) H3: distinct causal variants influence the protein and PD phenotypes independently; and (v) H4: a single causal variant affects both. For each protein, we included single nucleotide polymorphisms (SNPs) within a ± 500 kb window surrounding its pQTL region. In instances where a protein was associated with multiple pQTLs, each pQTL was analyzed separately, prioritizing those with the strongest evidence of association. A posterior probability (PP) greater than 0.8 for hypothesis H4 was considered strong evidence supporting the existence of a shared causal variant. Overall, the prioritized proteins, which were significantly identified in PWAS and had successfully passed SMR, HEIDI, and colocalization assessments, might have the potential to be the candidate targets for PD treatment.
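How per-SNP evidence is turned into posterior probabilities for H0 to H4 can be sketched as below. The combination rule mirrors the approximate-Bayes-factor approach used by coloc; the prior values (1e-4, 1e-4, 1e-5) are that package's commonly used defaults and are an assumption here, since the priors used in this study are not stated. The per-SNP log Bayes factors are hypothetical.

```python
import math

def logsum(xs):
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiff(x, y):
    """log(exp(x) - exp(y)); returns -inf when the difference is not positive."""
    ratio = math.exp(y - x)
    if ratio >= 1.0:
        return float("-inf")
    return x + math.log1p(-ratio)

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log approximate Bayes factors."""
    lsum = [a + b for a, b in zip(labf1, labf2)]
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + logsum(labf1),
        "H2": math.log(p2) + logsum(labf2),
        "H3": math.log(p1) + math.log(p2)
              + logdiff(logsum(labf1) + logsum(labf2), logsum(lsum)),
        "H4": math.log(p12) + logsum(lsum),
    }
    denom = logsum(list(log_h.values()))
    return {h: math.exp(v - denom) for h, v in log_h.items()}

# Hypothetical region of three SNPs: same lead SNP for both traits...
shared = coloc_posteriors([8.0, 0.0, 0.0], [9.0, 0.0, 0.0])
# ...versus different lead SNPs for the protein and the PD phenotype.
distinct = coloc_posteriors([8.0, 0.0, 0.0], [0.0, 9.0, 0.0])
```

In the first case most of the posterior mass falls on H4, which would clear the PP.H4 > 0.8 threshold; in the second it does not.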
2.3.3 Reverse MR Analysis
Complementing our primary SMR and colocalization analyses, we implemented a reverse MR approach to investigate potential bidirectional causal relationships among candidate targets [33, 34]. In this analysis, GWAS data for PD-related phenotypes were designated as exposures, while proteins that satisfied the PWAS, SMR, HEIDI, and colocalization thresholds from both whole-blood and brain pQTL datasets were treated as outcomes. To ensure an adequate number of IVs for each PD-related phenotype, we adopted a relaxed significance threshold of p < 5e-06 and performed LD clumping to maintain LD independence (r2 < 0.001, window size = 10,000 kb) among the SNPs.
Subsequently, we calculated F-statistics for each IV to assess their strength, excluding those with F-values < 10 to mitigate weak instrument bias. Following this, the Steiger test was conducted to verify the directionality of the associations, retaining only SNPs that explained a larger proportion of variance in the exposure compared to the outcome (Table S3). This step ensured that each IV primarily influenced the outcome through its effect on the exposure. Finally, we applied a Bonferroni-corrected significance threshold of p < (0.05/n), where n denotes the total number of the tested associations, with associations surpassing this threshold considered statistically significant, thereby revealing potential bidirectional causalities between PD and the candidates.
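The instrument-strength filter and the multiple-testing threshold can be sketched as follows. The per-SNP approximation F = (beta/se)^2 is an assumption, since the exact F-statistic formula is not given in the text, and the SNP records are hypothetical. The Steiger filtering step is not shown.

```python
def filter_instruments(snps, min_f=10.0):
    """Keep SNPs whose approximate F-statistic, (beta/se)^2, is at least min_f."""
    kept = []
    for snp in snps:
        f_stat = (snp["beta"] / snp["se"]) ** 2
        if f_stat >= min_f:
            kept.append({**snp, "F": f_stat})
    return kept

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for n_tests associations."""
    return alpha / n_tests

# Hypothetical exposure SNPs (effects on a PD-related phenotype).
candidate_snps = [
    {"rsid": "rs_example_1", "beta": 0.10, "se": 0.02},  # F = 25, kept
    {"rsid": "rs_example_2", "beta": 0.05, "se": 0.02},  # F = 6.25, dropped
]
strong_ivs = filter_instruments(candidate_snps)
threshold = bonferroni_threshold(114)  # e.g. 114 tested associations
```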
2.4 PheW-MR Analysis of 679 Disease Traits
To evaluate the potential unintended consequences of targeting proteins implicated in PD-related phenotypes, we conducted a PheW-MR analysis encompassing 679 distinct disease traits. Initially, we established causal associations between our prioritized proteins and these 679 common disease traits using PheW-MR. Subsequently, we integrated these results with the SMR findings for PD-related phenotypes to ensure that potential side effects were not confounded by directional biases. This comprehensive approach enabled the identification of unintended consequences associated with targeting specific proteins as therapeutic interventions.
For this analysis, protein–disease associations were derived from PheW-MR evaluations across a broad spectrum of 679 diseases, each comprising more than 500 cases, as previously described by Zhou et al. [14] These phenotypes were sourced from the UK Biobank (N ≤ 408,961) and categorized using PheCodes. To determine the effect sizes of proteins on the 679 diseases, we performed MR. In this process, IVs were selected using a stringent significance threshold of p < 5e-08, followed by LD clumping (r2 < 0.1, window size = 10,000 kb) to ensure LD independence among the selected SNPs.
The side-effect estimate combines two quantities. The first is the effect of the candidate proteins on the 679 diseases, with only associations having p-values below 0.05 included. The second is the proteins' effect on PD-related phenotypes, derived from the PheW-MR and SMR analyses. Protein–disease associations with OR values greater than 1 were considered potentially adverse side effects, whereas those with OR values less than 1 were considered beneficial side effects. p-values were estimated using a bootstrap method with 1 million iterations (n = 1,000,000) and corrected using the Bonferroni method. A side effect was considered statistically significant if its p-value was below (0.05/k), where k is the total number of protein–disease associations with p < 0.05 in the PheW-MR analysis [36].
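The rescaling of protein–disease effects to a 20% reduction in PD-related risk can be written out explicitly. The form below is an assumption reconstructed from the description in this section and the stated 20% target, not a formula quoted from the study; the symbol names are illustrative.

```latex
% Assumed form: effect on a disease when the protein is shifted by the
% amount needed to lower PD-related risk by 20%.
\beta_{\text{side effect}}
  = \beta_{\text{protein}\rightarrow\text{disease}}
    \times \frac{\ln(0.8)}{\beta_{\text{protein}\rightarrow\text{PD}}},
\qquad
\mathrm{OR}_{\text{side effect}} = \exp\!\left(\beta_{\text{side effect}}\right)
```

Under this form, the ratio ln(0.8)/β(protein→PD) is the change in protein level that would lower PD-related risk by 20%, and multiplying by the protein's effect on another disease gives the expected change in that disease on the log-odds scale.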
2.5 Cellular Distribution-Based Clustering of Candidate Targets Using ABA Data
Given that PD predominantly affects the brain, it is imperative to elucidate the cellular distribution of genes encoding candidate target proteins across various brain regions to develop effective therapies. To refine the spatial distribution of these targets and identify co-expression patterns within specific cell populations, we conducted cluster analysis using gene expression data from the ABA [37]. Specifically, we utilized the Whole Human Brain 10x RNA-seq dataset (data updated on March 30, 2024) and extracted log₂-normalized expression matrices corresponding to our prioritized protein-coding genes. Cluster analysis was performed based on the similarity of gene expression levels across different cell types. Hierarchical clustering was executed using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) to identify patterns of co-expression and potential functional relationships among the genes.
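The clustering step can be illustrated with a small, dependency-free UPGMA sketch. The gene names and expression values are hypothetical, and Euclidean distance is an assumption, as the distance metric is not specified in the text. In practice a library routine such as SciPy's average-linkage clustering would normally be used on the full expression matrix.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def upgma(profiles):
    """Average-linkage (UPGMA) clustering.

    profiles: dict mapping gene name -> list of expression values per cell type.
    Returns merge steps as (cluster_a, cluster_b, height) tuples.
    """
    names = list(profiles)
    pair_dist = {frozenset(p): euclidean(profiles[p[0]], profiles[p[1]])
                 for p in combinations(names, 2)}

    def average_distance(c1, c2):
        # UPGMA distance: unweighted mean over all cross-cluster gene pairs.
        return sum(pair_dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical log2-normalized expression across three cell types.
expression = {
    "GENE_A": [5.0, 1.0, 0.5],
    "GENE_B": [4.8, 1.2, 0.4],
    "GENE_C": [0.3, 6.0, 5.5],
    "GENE_D": [0.5, 5.8, 5.9],
}
merges = upgma(expression)
```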
2.6 PPI Network and Druggability Assessment
To identify synergistic interactions among targets across multiple phenotypes and facilitate the development of multi-target therapeutics, we constructed a PPI network encompassing candidate targets associated with various PD phenotypes. We utilized the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (version 12.0; http://string-db.org) to identify interactions among proteins implicated in both the onset and progression of PD, as determined in our preceding analyses. An interaction score threshold of ≥ 0.4 was applied to ensure a moderate level of confidence in the identified interactions. The resulting PPI network was subsequently visualized using Cytoscape (version 3.6.1; https://cytoscape.org). For clarity, any nodes not connected to the main PPI network were excluded from the final visualization.
To evaluate the feasibility of drug repurposing, we conducted a druggability assessment using real-world drug target data from databases such as DrugBank. This assessment enabled us to identify overlaps between our identified proteins and established drug targets, as well as to explore their associated therapeutic indications. By leveraging preprocessed data from Ruiz et al. [38], we facilitated the evaluation of potential repurposing opportunities, thereby enhancing the clinical relevance of our candidate proteins for the treatment of Parkinson's disease.
3 Result
3.1 PWAS for PD Progression and Onset
3.1.1 Identification of Plasma Proteins Associated With PD Progression
We integrated plasma pQTL data from the deCODE study with GWAS summary statistics for three PD statuses (cognitive, motor, and composite progression) and conducted a PWAS using the OTTERS framework, encompassing 4732 proteins. Proteins were deemed significantly associated with PD progression if they met the FDR-corrected significance threshold (p < 0.05). Our analysis identified 42 plasma proteins associated with cognitive progression, 30 with motor progression, and 39 with composite progression (see Table S4 and Figure 2A–C). Notably, APOE exhibited the most significant association with cognitive progression (p = 3.12e-14), while NSF was most significantly associated with both motor (p = 3.90e-21) and composite (p = 5.42e-10) progressions (Table S4).

To validate the causal associations between plasma proteins and the PD progression phenotypes, we performed SMR and HEIDI analyses. Applying stringent criteria—SMR p (FDR-corrected) < 0.05 and HEIDI p > 0.05—we identified 12 plasma proteins causally associated with cognitive progression, 5 with motor progression, and 6 with composite progression (Table S5, Figure 3). A subsequent colocalization analysis (PPH4 > 0.8) confirmed that, of the 12 proteins linked to cognitive progression, 10 (ALKBH3, GLO1, IDO1, SERPINA3, SORD, TPST1, GM2A, MICB, SH3BGRL3, and TGFBI) shared causal variants with PD cognitive progression loci (Table S6, Figure 3). Among these 10 proteins, the abundance of ALKBH3 (β = 0.482, p = 2.58e-03), GLO1 (β = 1.094, p = 3.64e-03), IDO1 (β = 1.899, p = 7.34e-03), SERPINA3 (β = 0.381, p = 9.87e-03), SORD (β = 0.521, p = 2.17e-02), and TPST1 (β = 0.342, p = 1.10e-02) exhibited significant positive causal correlations with cognitive progression, whereas GM2A (β = −0.414, p = 6.48e-03), MICB (β = −0.105, p = 4.79e-02), SH3BGRL3 (β = −0.402, p = 1.36e-02), and TGFBI (β = −0.247, p = 1.76e-03) demonstrated negative correlations. Of the five proteins associated with motor progression, four (NUDT2, PLA2G12B, EVA1C, and MATN3) passed the colocalization test. Specifically, the abundance of NUDT2 (β = 0.378, p = 1.11e-02) and PLA2G12B (β = 0.968, p = 2.90e-02) correlated positively with motor progression, whereas EVA1C (β = −1.446, p = 9.73e-03) and MATN3 (β = −0.160, p = 1.69e-02) correlated negatively. Additionally, among the six proteins implicated in composite progression, three (SH3BGRL3, NANS, and RSPO3) were validated by colocalization analysis. SH3BGRL3 (β = 0.392, p = 3.84e-02) was positively correlated with composite progression, while NANS (β = −1.153, p = 1.78e-02) and RSPO3 (β = −0.503, p = 6.19e-03) exhibited negative correlations. 
Notably, SH3BGRL3 emerged as a causal factor in both cognitive and composite progressions, despite manifesting a negative correlation with the former and a positive correlation with the latter. Finally, reverse MR analyses revealed no significant bidirectional associations (p < 0.05/114) between these proteins and the three PD progression phenotypes (Table S7).

3.1.2 Identified Related Plasma Proteins for the Onset of PD
Using the OTTERS framework, we conducted a PWAS on 4693 proteins in both the discovery and replication datasets to identify plasma proteins linked to PD onset. Proteins meeting the FDR-corrected significance threshold (p < 0.05) were considered significantly associated with PD onset. In the discovery dataset, we identified 317 proteins correlated with PD onset, while 230 proteins were implicated in the replication dataset. Notably, 54 proteins reached significance in both datasets, with NSF exhibiting the most robust association (discovery p = 2.06e-56, replication p = 1.87e-224) (Table S4, Figure 4A). To further evaluate their potential causal roles in PD onset, the 54 proteins were subjected to subsequent sensitivity analyses.

We applied SMR and the HEIDI test to these 54 proteins to explore their causal associations with PD onset. Overall, 22 plasma proteins demonstrated significant causal evidence for PD onset (FDR-corrected SMR p < 0.05 and HEIDI p > 0.01) (Table S5). Colocalization analysis confirmed that nine of these proteins (ARSA, EHBP1, FCGR2A, GGH, GPNMB, HDHD2, DNAJB4, HAVCR2, and PDCD1LG2) shared a common causal variant with PD onset (PP.H4 > 0.8; Table S6, Figure 4B). Among these nine proteins, the abundance of six proteins, ARSA (OR = 2.177, p = 7.53e-04), EHBP1 (OR = 3.739, p = 1.27e-02), FCGR2A (OR = 1.059, p = 4.37e-05), GGH (OR = 1.167, p = 2.81e-02), GPNMB (OR = 1.503, p = 1.79e-07), and HDHD2 (OR = 1.230, p = 1.64e-02), was significantly associated with an elevated risk of PD onset, whereas the abundance of DNAJB4 (OR = 0.701, p = 2.54e-03), HAVCR2 (OR = 0.905, p = 5.55e-03), and PDCD1LG2 (OR = 0.852, p = 2.44e-02) was associated with a reduced risk. To examine potential bidirectional causal relationships, we also performed reverse MR analyses with no significant association detected, reinforcing the robustness of the observed causal links between the identified proteins and PD onset (Table S7).
3.1.3 Identified Related Proteins With Human Brain Proteomes for the Progression of PD
We utilized the FUSION framework to perform PWAS analyses on brain pQTLs, evaluating the associations between 1097 proteins and the progression of PD. Our analysis identified 57 proteins nominally associated with cognitive progression, 45 with motor progression, and 55 with composite progression (uncorrected p < 0.05). Among these, MICAL1 exhibited the strongest associations with both cognitive progression (p = 1.38e-03) and composite progression (p = 1.36e-03), while C14orf159 was most strongly associated with motor progression (p = 2.54e-03). However, no proteins reached the FDR-corrected significance threshold (p < 0.05) for any of the PD progression phenotypes. Consequently, no brain proteins were identified as candidate targets for further sensitivity analyses (Figures S1–S3 and Table S8).
3.1.4 Identified Related Proteins With Human Brain Proteomes for the PD Onset
We employed the FUSION framework for PWAS analysis to leverage brain pQTL data in assessing the association between 1067 proteins and the onset of PD (Figure 5, Table S8). In the discovery cohort, 99 proteins demonstrated nominal associations with PD onset (uncorrected p < 0.05). After applying the FDR correction, four proteins remained significantly associated and were subsequently validated in the replication cohort (p < 0.05). These proteins include CD38 (discovery p = 8.27e-09, replication p = 0.004), GPNMB (discovery p = 1.21e-08, replication p = 0.034), VKORC1 (discovery p = 1.65e-05, replication p = 0.015), and GAK (discovery p = 3.69e-07, replication p = 0.003). Additionally, CTSB (discovery p = 9.47e-05, replication p = 0.477) and ARSA (discovery p = 7.95e-05, replication p = 0.153) were found to be significantly associated with PD onset in the discovery cohort but did not reach the p < 0.05 threshold in the replication cohort.

For the four proteins validated through PWAS, we initially performed SMR and the HEIDI test to elucidate their causal relationships with PD onset (Table S5). Among these four proteins, only GPNMB and CD38 had valid IVs extracted from brain pQTL data. Consequently, we conducted SMR and HEIDI tests exclusively for these two proteins. The results revealed that both GPNMB and CD38 exhibited significant causal associations with PD onset. Specifically, the abundance of GPNMB (OR = 1.394, p = 7.73e-07) was associated with an increased risk of PD onset, whereas the abundance of CD38 (OR = 0.415, p = 3.32e-08) was associated with a decreased risk. Colocalization analysis using the coloc method was then applied to these two proteins (Table S6). Only GPNMB had a PP.H4 exceeding 0.8, indicating a shared causal variant between GPNMB and PD onset. As a final sensitivity analysis, we attempted to perform a reverse MR analysis to investigate the association between GPNMB and PD onset. However, because only SNPs within the GPNMB cis region were available and none overlapped with the IVs for PD onset, no valid IVs could be obtained, which precluded the reverse MR analysis for this protein. This limitation prevents us from fully assessing a bidirectional causal relationship for this protein, which requires further investigation in future studies.
3.1.5 Summary of Candidate Plasma and Brain Targets Related to PD Phenotypes
Our analyses identified 25 candidate targets associated with PD-related phenotypes. Among these, 16 plasma proteins were linked to PD progression. Specifically, 10 plasma proteins (ALKBH3, GLO1, IDO1, SERPINA3, SORD, TPST1, GM2A, MICB, SH3BGRL3, and TGFBI) exhibited causal relationships with cognitive progression, four proteins (NUDT2, PLA2G12B, EVA1C, and MATN3) were associated with motor progression, and three proteins (SH3BGRL3, NANS, and RSPO3) were linked to composite progression. Notably, SH3BGRL3 emerged as a causal factor for both cognitive and composite progressions. Additionally, nine plasma proteins (ARSA, EHBP1, FCGR2A, GGH, GPNMB, HDHD2, DNAJB4, HAVCR2, and PDCD1LG2) demonstrated causal relationships with PD onset.
When applying the same analytical pipeline to brain proteins, we did not identify any brain-specific candidate targets causally linked to PD progression. However, we identified one protein in brain tissue, GPNMB, as a candidate target showing a clear causal association with PD onset. Intriguingly, GPNMB was implicated in PD onset in both plasma and brain tissues. These results are summarized in Figure 6 and Table 1.

Source | Target | PD-related phenotype | PPWAS | PSMR | PP.H4 |
---|---|---|---|---|---|
Plasma | ALKBH3 | Cognitive progression | 1.17e-05 | 0.003 | 0.959 |
Plasma | GLO1 | Cognitive progression | 2.02e-04 | 0.004 | 0.975 |
Plasma | GM2A | Cognitive progression | 2.70e-05 | 0.006 | 0.950 |
Plasma | IDO1 | Cognitive progression | 2.75e-04 | 0.007 | 0.953 |
Plasma | MICB | Cognitive progression | 2.49e-11 | 0.048 | 0.907 |
Plasma | SERPINA3 | Cognitive progression | 5.53e-05 | 0.010 | 0.926 |
Plasma | SH3BGRL3 | Cognitive progression | 2.73e-11 | 0.014 | 0.954 |
Plasma | SORD | Cognitive progression | 3.83e-09 | 0.022 | 0.918 |
Plasma | TGFBI | Cognitive progression | 3.90e-07 | 0.002 | 0.975 |
Plasma | TPST1 | Cognitive progression | 7.91e-06 | 0.011 | 0.890 |
Plasma | EVA1C | Motor progression | 7.63e-04 | 0.010 | 0.926 |
Plasma | MATN3 | Motor progression | 5.69e-08 | 0.017 | 0.885 |
Plasma | NUDT2 | Motor progression | 6.13e-05 | 0.011 | 0.865 |
Plasma | PLA2G12B | Motor progression | 1.75e-06 | 0.029 | 0.948 |
Plasma | NANS | Composite progression | 1.90e-04 | 0.018 | 0.899 |
Plasma | RSPO3 | Composite progression | 5.87e-05 | 0.006 | 0.966 |
Plasma | SH3BGRL3 | Composite progression | 4.27e-07 | 0.038 | 0.908 |
Plasma | ARSA | Onset | 1.58e-06 | < 0.001 | 0.989 |
Plasma | DNAJB4 | Onset | 3.99e-07 | 0.003 | 0.969 |
Plasma | EHBP1 | Onset | 1.34e-07 | 0.013 | 0.918 |
Plasma | FCGR2A | Onset | 2.19e-12 | 0.013 | 0.918 |
Plasma | GGH | Onset | 1.63e-12 | < 0.001 | 0.812 |
Plasma | GPNMB | Onset | 1.04e-30 | 0.028 | 0.984 |
Plasma | HAVCR2 | Onset | 4.86e-10 | < 0.001 | 0.868 |
Plasma | HDHD2 | Onset | 1.40e-06 | 0.006 | 0.811 |
Plasma | PDCD1LG2 | Onset | 1.80e-08 | 0.015 | 0.920 |
Brain | GPNMB | Onset | 1.21e-08 | < 0.001 | 0.903 |
- Note: This table reports the source of each candidate drug target, the PD-related phenotype, PPWAS, PSMR, and PP.H4 for each protein.
- Abbreviations: ALKBH3, alpha-ketoglutarate-dependent dioxygenase AlkB homolog 3; ARSA, arylsulfatase A; DNAJB4, DnaJ homolog subfamily B member 4; EHBP1, EH domain binding protein 1; EVA1C, protein Eva-1 homolog C; FCGR2A, low affinity immunoglobulin gamma Fc region receptor II-a; GGH, gamma-glutamyl hydrolase; GLO1, lactoylglutathione lyase; GM2A, ganglioside GM2 activator; GPNMB, transmembrane glycoprotein NMB; HAVCR2, hepatitis A virus cellular receptor 2; HDHD2, haloacid dehalogenase-like hydrolase domain-containing protein 2; IDO1, indoleamine 2,3-dioxygenase 1; MATN3, matrilin-3; MICB, MHC class I polypeptide-related sequence B; NANS, sialic acid synthase; NUDT2, Bis(5′-nucleosyl)-tetraphosphatase [asymmetrical]; PDCD1LG2, programmed cell death 1 ligand 2; PLA2G12B, group XIIB secretory phospholipase A2-like protein; PP.H4, posterior probability of colocalization hypothesis 4 (a shared causal variant); PPWAS, p-value of PWAS; PSMR, p-value of SMR; RSPO3, R-spondin-3; SERPINA3, alpha-1-antichymotrypsin; SH3BGRL3, SH3 domain-binding glutamic acid-rich-like protein 3; SORD, sorbitol dehydrogenase; TGFBI, transforming growth factor-beta-induced protein ig-h3; TPST1, protein-tyrosine sulfotransferase 1.
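The counts reported for Table 1 can be cross-checked with a short script. The sketch below is illustrative only: it re-enters the target names from Table 1 by hand and tallies unique proteins per phenotype group. The variable names are hypothetical and are not part of the original analysis pipeline.

```python
# Targets transcribed by hand from Table 1, grouped by PD-related phenotype.
plasma = {
    "Cognitive progression": ["ALKBH3", "GLO1", "GM2A", "IDO1", "MICB",
                              "SERPINA3", "SH3BGRL3", "SORD", "TGFBI", "TPST1"],
    "Motor progression": ["EVA1C", "MATN3", "NUDT2", "PLA2G12B"],
    "Composite progression": ["NANS", "RSPO3", "SH3BGRL3"],
    "Onset": ["ARSA", "DNAJB4", "EHBP1", "FCGR2A", "GGH", "GPNMB",
              "HAVCR2", "HDHD2", "PDCD1LG2"],
}
brain = {"Onset": ["GPNMB"]}

# Unique plasma proteins linked to any progression phenotype.
progression = set()
for phenotype, targets in plasma.items():
    if phenotype != "Onset":
        progression.update(targets)

# Unique proteins linked to onset in either tissue.
onset = set(plasma["Onset"]) | set(brain["Onset"])
all_targets = progression | onset

# Proteins listed under more than one progression phenotype.
seen, repeated = set(), set()
for phenotype, targets in plasma.items():
    for target in targets:
        if target in seen:
            repeated.add(target)
        seen.add(target)

# Proteins implicated in both plasma and brain.
plasma_all = {t for targets in plasma.values() for t in targets}
brain_all = {t for targets in brain.values() for t in targets}
both_tissues = plasma_all & brain_all

print(len(progression), len(onset), len(all_targets))  # 16 9 25
print(sorted(repeated), sorted(both_tissues))          # ['SH3BGRL3'] ['GPNMB']
```

The 27 table rows therefore collapse to 25 unique candidates, because SH3BGRL3 appears under two progression phenotypes and GPNMB appears in both tissues.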
3.2 PheW-MR
Following the identification of candidate targets associated with PD-related phenotypes, we conducted a comprehensive analysis across 679 common disease traits to characterize the side effect profiles of each prioritized protein as a potential therapeutic target. Unlike the previous SMR approach, PheW-MR results were standardized to reflect a 20% reduction in the risk of PD-related phenotypes mediated by each protein. Consequently, the observed associations can be interpreted as potential side effects that may arise from therapeutically targeting these proteins.
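One generic way to express MR estimates per fixed reduction in disease risk is to rescale them on the log-odds scale. The sketch below assumes a log-linear model, approximates a 20% risk reduction by an odds ratio of 0.8, and uses hypothetical effect sizes. The function name and inputs are illustrative, and the exact scaling used in this study may differ.

```python
import math

def scale_to_risk_reduction(beta_protein_pd, beta_protein_trait, reduction=0.20):
    """Rescale a protein-trait log-odds estimate to the protein change that
    lowers the PD odds ratio to (1 - reduction). Assumes log-linear effects."""
    if beta_protein_pd == 0:
        raise ValueError("protein has no effect on PD; scaling is undefined")
    # Change in protein level (in the units of the MR estimate) that yields
    # the target PD odds ratio, e.g. 0.8 for a 20% reduction.
    delta = math.log(1.0 - reduction) / beta_protein_pd
    # Implied odds ratio for the secondary trait under the same protein change.
    trait_or = math.exp(beta_protein_trait * delta)
    return delta, trait_or

# Hypothetical inputs: per-unit OR of 1.40 for PD and 1.10 for another trait.
beta_pd = math.log(1.40)
beta_trait = math.log(1.10)
delta, trait_or = scale_to_risk_reduction(beta_pd, beta_trait)

# Lowering the protein (negative delta) reduces PD odds to 0.8 and, because the
# two effects share a sign here, also lowers the odds of the secondary trait.
print(round(delta, 3), round(trait_or, 3))
```

Under this convention, a scaled odds ratio below 1 for a disease trait corresponds to a beneficial side effect and a value above 1 to an adverse one.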
Under this 20% risk-reduction assumption, PheW-MR analyses (p < 0.05/3126) identified 1529 significant beneficial side effects (83.7%) and 297 adverse side effects (16.3%) across 25 candidate targets. A paired t-test confirmed that beneficial side effects significantly outnumbered adverse side effects (p = 7.91e-05), suggesting that the majority of identified side effects were beneficial (Table S9, Figure 7). Of these 25 candidate targets, 17 exhibited exclusively beneficial side effects, while the remaining eight displayed both beneficial and adverse side effects.
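The significance threshold and the reported proportions follow directly from the numbers above. The short sketch below simply reproduces that arithmetic; the variable names are illustrative.

```python
# Bonferroni-style threshold as reported in the text: 0.05 / 3126.
alpha = 0.05
n_tests = 3126
threshold = alpha / n_tests

# Counts of significant side effects reported across the 25 candidate targets.
beneficial, adverse = 1529, 297
total = beneficial + adverse
pct_beneficial = round(100 * beneficial / total, 1)
pct_adverse = round(100 * adverse / total, 1)

print(f"{threshold:.2e}")                   # 1.60e-05
print(total, pct_beneficial, pct_adverse)   # 1826 83.7 16.3
```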

Among the 17 candidate targets with exclusively beneficial side effects, we highlight, for each of the four PD-related phenotypes, the target with the largest number of beneficial outcomes. For targets mitigating PD cognitive progression, MICB exhibited the most pronounced beneficial profile, with 227 beneficial side effects primarily concentrated in the circulatory system (31 distinct traits). Regarding PD motor progression, NUDT2 was associated with 93 beneficial side effects, predominantly within the circulatory system (15 traits). For PD composite progression, SH3BGRL3 conferred 67 beneficial side effects, the highest in this category, primarily related to digestive disorders. Finally, among candidate targets for PD onset, GGH showed 72 beneficial side effects, mainly affecting musculoskeletal conditions.
In contrast, the remaining eight candidate targets displayed both beneficial and adverse side effects. Notably, among targets for PD cognitive progression, GM2A was linked to 155 total side effects, comprising 56 beneficial and 99 adverse effects. For PD motor progression, EVA1C yielded 51 side effects, including 40 beneficial and 11 adverse effects. Lastly, for PD onset, FCGR2A was associated with 106 side effects, including 42 beneficial and 64 adverse effects. No candidate target for PD composite progression showed this mixed profile.
3.3 Cellular Distribution-Based Clustering of Genes Corresponding to Candidate Targets
To elucidate the cellular distribution of genes encoding candidate target proteins across various brain regions for the development of effective PD therapies, we retrieved gene expression matrices from the ABA covering 31 distinct brain cell types. Of the 25 proteins identified, we successfully obtained corresponding gene expression data for 24, excluding EHBP1, which lacked expression information. We then performed hierarchical clustering using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) on these 24 genes based on their expression patterns across the 31 cell types.
The clustering analysis resulted in three distinct clusters. Cluster 1, comprising solely TPST1, exhibited elevated expression primarily in deep-layer intratelencephalic and near-projecting neurons, as well as in the mammillary body and the lower rhombic lip. Cluster 2 included GPNMB, SORD, GM2A, PDCD1LG2, MATN3, TGFBI, FCGR2A, DNAJB4, MICB, SERPINA3, IDO1, and PLA2G12B, none of which showed particularly high expression in any of the examined cell types. Cluster 3 consisted of EVA1C, GLO1, SH3BGRL3, RSPO3, GGH, NANS, ARSA, NUDT2, HAVCR2, ALKBH3, and HDHD2, all demonstrating elevated expression in metabolic and homeostatic cell populations, notably within hippocampal regions (CA1–CA3, CA4, and the dentate gyrus), deep-layer corticothalamic neurons, and vascular cells. Detailed results of the clustering analysis are provided in Table S10.
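UPGMA is agglomerative clustering with average linkage: at each step, the two clusters with the smallest mean pairwise distance are merged. The minimal pure-Python sketch below illustrates the procedure on a small toy matrix. The gene labels and expression values are hypothetical, the use of Euclidean distance is an assumption, and the function `upgma` is illustrative rather than the code used in this study (in practice, `scipy.cluster.hierarchy.linkage` with `method="average"` is a common choice).

```python
import math
from itertools import combinations

def upgma(labels, vectors, n_clusters):
    """Average-linkage (UPGMA) clustering, cut at `n_clusters` clusters."""
    clusters = [[i] for i in range(len(labels))]

    def avg_linkage(a, b):
        # Mean of all pairwise Euclidean distances between the two clusters.
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (indices a < b) and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: avg_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(labels[i] for i in members) for members in clusters]

# Hypothetical expression values for five placeholder genes in three cell types.
toy = {
    "gene_1": [9.0, 8.5, 0.2],
    "gene_2": [0.1, 0.3, 0.2],
    "gene_3": [0.2, 0.1, 0.4],
    "gene_4": [3.0, 0.5, 6.0],
    "gene_5": [2.8, 0.7, 6.5],
}
result = upgma(list(toy), list(toy.values()), n_clusters=3)
print(result)  # [['gene_1'], ['gene_2', 'gene_3'], ['gene_4', 'gene_5']]
```

As in the analysis above, a gene with a distinctive profile (here `gene_1`) can end up as a single-member cluster, while genes with uniformly low expression group together.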
3.4 PPI Network
To elucidate synergistic relationships among targets across diverse PD phenotypes, we examined interactions among the 25 candidate proteins using a PPI network constructed via the STRING database (Figure 8). The PPI network analysis identified a primary interaction cluster comprising FCGR2A, HAVCR2, PDCD1LG2, and IDO1, which were interconnected with MICB. Specifically, FCGR2A, HAVCR2, and PDCD1LG2 were associated with PD onset, whereas MICB and IDO1 were linked to cognitive progression in PD. Additionally, multiple pairwise interactions were observed. GLO1 and SORD, both associated with cognitive progression, formed a direct interaction pair. GM2A, also related to cognitive progression, interacted with ARSA, a candidate target for PD onset. Furthermore, our PPI analysis revealed an interaction between TGFBI, associated with cognitive progression, and GPNMB, a candidate target for PD onset.
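To see which of these interaction groups bridge different PD phenotypes, the group memberships can be combined with the phenotype assignments in Table 1. The sketch below is illustrative: group names are hypothetical labels, only group membership is encoded (not individual STRING edges), and phenotypes are transcribed by hand from Table 1.

```python
# Phenotype assignments transcribed from Table 1 for proteins in the PPI results.
phenotype = {
    "FCGR2A": "Onset", "HAVCR2": "Onset", "PDCD1LG2": "Onset",
    "ARSA": "Onset", "GPNMB": "Onset",
    "MICB": "Cognitive progression", "IDO1": "Cognitive progression",
    "GLO1": "Cognitive progression", "SORD": "Cognitive progression",
    "GM2A": "Cognitive progression", "TGFBI": "Cognitive progression",
}

# Interaction groups as described in the text (membership only).
groups = {
    "primary cluster": ["FCGR2A", "HAVCR2", "PDCD1LG2", "IDO1", "MICB"],
    "GLO1-SORD": ["GLO1", "SORD"],
    "GM2A-ARSA": ["GM2A", "ARSA"],
    "TGFBI-GPNMB": ["TGFBI", "GPNMB"],
}

# Distinct phenotypes covered by each group.
spans = {
    name: sorted({phenotype[protein] for protein in members})
    for name, members in groups.items()
}
# Groups that connect targets from more than one phenotype.
cross_phenotype = sorted(name for name, covered in spans.items() if len(covered) > 1)

for name, covered in spans.items():
    print(name, covered)
print(cross_phenotype)
```

Groups spanning both onset and cognitive progression are the natural starting points when considering targets that might act on more than one stage of the disease.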

3.5 Druggability Assessment
To explore the potential for repurposing existing medications targeting the candidate proteins, we consulted the DrugBank database to identify drugs known to modulate these targets. Of the 25 proteins identified in this study, 15 correspond to established drug targets, indicating significant overlaps with treatments for various neurological and psychiatric disorders (Table S11). Notably, EHBP1, SERPINA3, FCGR2A, GPNMB, MICB, RSPO3, NUDT2, and GLO1 are primarily associated with antipsychotic agents such as chlorpromazine, risperidone, and olanzapine. Additionally, IDO1 and GLO1 are targets for a range of antidepressants, including citalopram, fluoxetine, and venlafaxine. Furthermore, FCGR2A, NANS, and MATN3 are linked to corticosteroids and nonsteroidal anti-inflammatory drugs (NSAIDs) like prednisone, ibuprofen, and naproxen. GGH and MATN3 are involved in pathways targeted by antiepileptic drugs, such as phenytoin and topiramate. GM2A is associated with both antiepileptic and sedative medications, while ARSA is implicated in treatments for dystonia and epilepsy.
4 Discussion
Our study identified and validated candidate plasma and brain targets for the onset and progression of PD using an integrative PWAS approach. By combining extensive pQTL data from plasma and brain tissues with comprehensive GWAS summary statistics, we identified 25 protein targets associated with the PD trajectory, including its onset and progression. Furthermore, we provided comprehensive insights into the therapeutic potential and safety profiles of the prioritized targets through PheW-MR analysis, cellular distribution-based clustering, PPI networks, and druggability assessments.
Moreover, we reviewed the prioritized protein targets against the literature (including PubMed, Embase, and Google Scholar). ALKBH3, GLO1, GM2A, IDO1, SERPINA3, TGFBI, PLA2G12B, ARSA, FCGR2A, and GPNMB have been previously reported with reliable evidence [39-47], whereas MICB, SH3BGRL3, SORD, TPST1, EVA1C, MATN3, NUDT2, NANS, RSPO3, DNAJB4, EHBP1, GGH, HAVCR2, HDHD2, and PDCD1LG2 were newly identified, with little direct evidence from prior studies. Although the concrete mechanisms of most novel targets in PD onset or progression cannot yet be predicted reliably, we focus on their novelty and investigability, moving from proteins to pathways and then to phenotypes and subtypes.
Concerning the progression-related targets, the plasma proteins ALKBH3, GLO1, IDO1, SERPINA3, SORD, TPST1, GM2A, MICB, SH3BGRL3, and TGFBI were significantly associated with cognitive progression in PD. They are involved in diverse biological processes. Expression and glycation damage of GLO1 were shown to be induced by alpha-synuclein ablation, contributing to the development of PD [40]. IDO1 inhibition improves motor dysfunction and provides neuroprotection in PD mice [42]. MICB and TPST1, novel targets identified here for PD, may respectively shape neuroinflammatory processes by modulating microglial activation and mediate sulfation of key neuronal proteins that modulate intracellular signaling pathways, thereby influencing dopaminergic neuron survival and accelerating progression [48, 49]. Additionally, NUDT2, PLA2G12B, EVA1C, and MATN3 were associated with motor progression, highlighting potential targets for mitigating motor dysfunction in PD. For instance, NUDT2, which is involved in nucleotide metabolism [50], and PLA2G12B, a member of the phospholipase A2 (PLA2) superfamily, have been suggestively associated with PD, influencing neuronal membrane integrity and signal transduction pathways essential for motor function [45]. The identification of these targets underscores the complex interplay between metabolic and inflammatory pathways in PD motor symptoms. Furthermore, SH3BGRL3 was identified as an intriguing target, causally linked to both cognitive and composite progression phenotypes, albeit with contrasting directions of effect. Although there is no direct evidence, this duality suggests that SH3BGRL3 may regulate multiple pathways that differentially affect various aspects of disease progression, for example by stabilizing synaptic architecture and acting as a redox sensor, thereby protecting against α-synuclein-induced synaptic deficits and adjusting dysregulated intracellular signaling cascades [51].
In addition, in line with a previous MR study, GPNMB stood out as a pivotal target showing a causal relationship with increased risk of PD onset in both plasma and brain tissues [47]. The consistent association of GPNMB across different tissues highlights its potential as a robust biomarker for early PD detection and as a promising therapeutic target to delay disease onset.
The PheW-MR analysis offered a comprehensive evaluation of the potential side effect profiles associated with the candidate targets. Impressively, 83.7% of the identified side effects were beneficial, while 16.3% were adverse. This predominance of beneficial side effects suggests that targeting these proteins may confer therapeutic advantages beyond PD, thereby enhancing the overall safety and efficacy of potential interventions. For instance, MICB's association with numerous beneficial traits within the circulatory system underscores its potential role in vascular health, which could be advantageous given the emerging evidence of vascular contributions to PD pathology [52]. Additionally, targets such as NUDT2 and SH3BGRL3 exhibited substantial beneficial effects across various disease traits, highlighting their multifaceted therapeutic potential. Conversely, targets like GM2A and FCGR2A, which demonstrated both beneficial and adverse side effects, emphasize the necessity for cautious therapeutic modulation to balance efficacy with safety.
We also conducted cellular distribution-based clustering to elucidate the cellular distribution of genes encoding candidate target proteins across various brain regions, thereby informing the development of effective PD therapies. This analysis identified three distinct clusters, of which Cluster 1 and Cluster 3 are of particular interest. Cluster 1, comprising solely TPST1, exhibited elevated expression in deep-layer intratelencephalic and near-projecting neurons, as well as in the mammillary body and lower rhombic lip. This specific expression profile suggests that TPST1 may play a critical role in neuronal connectivity and signaling pathways pertinent to PD cognitive progression, presenting a targeted opportunity for therapeutic intervention. Cluster 3, consisting of EVA1C, GLO1, SH3BGRL3, RSPO3, GGH, NANS, ARSA, NUDT2, HAVCR2, ALKBH3, and HDHD2, demonstrated elevated expression in metabolic and homeostatic cell populations, particularly within hippocampal regions, deep-layer corticothalamic neurons, and vascular cells. The metabolic and homeostatic functions highlighted by Cluster 3 underscore the importance of maintaining cellular energy balance and vascular integrity in mitigating PD-related neurodegeneration. These findings suggest that targeting metabolic pathways and supporting vascular health could be pivotal in slowing disease progression and enhancing neuronal survival.
Furthermore, we conducted PPI analysis to explore synergistic relationships among targets across diverse PD phenotypes. Utilizing the STRING database for PPI network analysis, we identified a primary interaction cluster comprising FCGR2A, HAVCR2, PDCD1LG2, and IDO1, interconnected with MICB. Notably, MICB and IDO1 emerged as candidate targets associated with PD cognitive progression, while the remaining proteins were linked to PD onset. IDO1 plays a crucial role in regulating immune responses and inflammation, potentially contributing to cognitive deterioration in PD patients [53], whereas MICB modulates natural killer and T cell activity, suggesting a complex immune regulatory mechanism underlying cognitive impairments [54]. Conversely, FCGR2A, HAVCR2, and PDCD1LG2 are primarily associated with PD onset, involving immune regulation and sustained inflammatory responses that may drive neurodegenerative changes [55-57]. This cluster highlights candidate targets associated with PD onset and cognitive progression, suggesting that targeting these interconnected proteins could modulate both the initiation and advancement of the disease. The intricate interactions among these candidate targets reveal potential nodes for multi-target drug development, where simultaneous modulation of interconnected proteins may enhance therapeutic efficacy and more effectively mitigate disease progression compared to single-target approaches.
Utilizing the DrugBank database, we assessed the druggability and therapeutic potential of the 25 identified candidate targets [15]. Notably, 15 of these candidates were recognized as existing drug targets, highlighting significant opportunities for drug repurposing. Proteins such as EHBP1, SERPINA3, and GLO1 are currently targeted by antipsychotic and antidepressant medications, whereas FCGR2A and MATN3 are associated with corticosteroids and NSAIDs. This overlap indicates that existing pharmacological agents could be repurposed to modulate these proteins, thereby potentially accelerating the development of disease-modifying therapies for PD.
Our study has several notable strengths. First, to our knowledge, this is the first PWAS to use the OTTERS method with large-scale summary-level pQTL data from deCODE to investigate both the onset and progression of PD. In contrast to previous PWAS studies that employed small-sample pQTL data, this approach increases statistical power, enabling the identification of a greater number of candidate proteins, particularly those previously undiscovered. Consequently, it not only deepens our understanding of the mechanisms driving PD progression but also provides additional potential targets for developing disease-modifying therapeutic strategies. Furthermore, the robustness of our findings is reinforced through rigorous validation, including SMR, colocalization analyses, and reverse MR. Together, these approaches increase confidence in the causal relationships between proteins and PD phenotypes. Additionally, our comprehensive assessment of potential side effects via PheW-MR offers critical insights into the safety profiles of candidate targets, thereby informing the development of safer therapeutic interventions. The identification of existing drug targets among the candidates also facilitates drug repurposing, potentially accelerating the translation of our findings into clinical applications and enhancing the feasibility of novel therapeutic strategies.
Despite the comprehensive nature of this PWAS, several limitations warrant consideration. First, our analyses do not encompass the entirety of the human proteome. Some proteins remain unmeasured and may also play pivotal roles in PD onset and progression, introducing the possibility of horizontal pleiotropy. Second, although our investigation incorporated both plasma and brain pQTL datasets, the absence of OTTERS training sets for PWAS among brain proteins may reflect limitations in assay sensitivity and in the completeness of the study pipeline, which partially weakens the strength of the evidence. Third, our study primarily includes individuals of Icelandic and European ancestries, and this population homogeneity might reduce the generalizability of our findings to more diverse ethnic backgrounds, underscoring the need for replication in multiethnic cohorts. Fourth, different proteomic platforms were employed (SOMAscan for plasma vs. mass spectrometry for brain), which may partially explain the minimal overlap of candidate targets across tissues. Harmonizing platform technologies in future studies could help identify additional shared targets. Fifth, the exclusion of EHBP1 from our downstream analyses due to missing expression data may have obscured its potential relevance in PD pathophysiology. Finally, the prioritized protein targets are computational hypotheses based on limited direct evidence; they are preliminary and call for experimental validation in the future. Addressing these limitations through broader proteomic profiling, larger and more diverse cohorts, and uniform assay platforms will be vital to refining our understanding of causal protein targets and their translational potential in PD.
Author Contributions
All authors made significant contributions to this work and have approved the final manuscript. Concept and design: Chenhao Gao, Haobin Zhou, Weixuan Liang, Zhuofeng Wen, Jiewen Zhang, Chuiguo Huang, and Naijun Yuan. Data curation: Chenhao Gao, Haobin Zhou, Weixuan Liang, Zhuofeng Wen, and Chuiguo Huang. Analysis and interpretation of data: Chenhao Gao, Haobin Zhou, Weixuan Liang, Wanzhe Liao, Zhuofeng Wen, and Chuiguo Huang. Computational resources and support: Haobin Zhou, Chuiguo Huang, Jiewen Zhang, and Naijun Yuan. Writing of the original draft and reviews: Chuiguo Huang, Chenhao Gao, Wanzhe Liao, Zhixin Xie, Cailing Liao, Limin He, Jingzhang Sun, and Zhilin Chen. Editing draft and reviews: Haobin Zhou, Weixuan Liang, Zhuofeng Wen, Jiewen Zhang, Chuiguo Huang, and Naijun Yuan.
Ethics Statement
This study used only publicly available summary statistics from published studies and consortia. All original studies received ethical approval, and their participants agreed to take part; only summary-level data were provided for analysis.
Consent
The authors have nothing to report.
Conflicts of Interest
The authors declare no conflicts of interest.
Open Research
Data Availability Statement
Publicly available datasets were analyzed in this study. These data can be found at the deCODE Health study (https://www.decode.com/summarydata/), ROS/MAP study (https://www.synapse.org/#!Synapse:syn23627957), GWAS Catalog (https://www.ebi.ac.uk/gwas/home), FinnGen consortium (https://www.finngen.fi/fi), and Allen Brain Cell Atlas Public Dataset (https://allen-brain-cell-atlas.s3.us-west-2.amazonaws.com/index.html).