Assessing the Causal Effect of Circulating Protein-To-Protein Ratio on the Risk of Morbidity of Hepatocellular Carcinoma
Funding: We gratefully acknowledge the financial support from the Natural Science Foundation of Fujian Province (No. 2023J01179) and the Foundation Sponsored by Clinical Research Funding Project Initiated by Researchers of Fujian Cancer Hospital (Grant No. 202412009).
Qiuyuan Yue, Xiaoxia Li, and Xiaoye Wan have contributed equally to this work and share first authorship.
ABSTRACT
Objective
Several observational studies have identified an association between plasma proteins and hepatocellular carcinoma (HCC). This study aimed to explore the potential causal relationship between the circulating protein-to-protein ratio and the morbidity risk of HCC.
Methods
Genetic association data for circulating plasma proteins and 2821 protein-to-protein ratios were sourced from the UKB PPP and Suhre's study. Genetic association data for HCC were sourced from the FinnGen cohort (finngen-R11-HCC) and the IEU OpenGWAS project (ieu-b-4953). Two-sample Mendelian randomization (MR) and drug-target MR were then used to evaluate causal associations. To assess the robustness of our findings, we conducted a series of sensitivity analyses.
Results
Eight protein–protein pairs were identified as causal factors for HCC in the two independent cohorts. In the FinnGen cohort, the odds ratio (OR) for HCC per standard deviation increase in the protein-to-protein ratio ranged from 0.4974 (95% confidence interval [CI]: 0.2506–0.9871) for the LAT2/SPRY2 protein pair to 1.9763 (95% CI: 1.3009–3.0026) for the ERBIN/LAT2 protein pair. However, among the individual proteins in the significant pairs, only one circulating protein, TDRKH (OR: 0.5964, 95% CI: 0.4196–0.8476), was causally associated with HCC.
Conclusion
Using multiple datasets and methods, eight protein–protein pairs were identified as having causal associations with HCC. Protein–protein interactions can provide meaningful findings beyond simple pQTL analysis.
1 Introduction
Hepatocellular carcinoma (HCC) is one of the most common malignancies globally. The morbidity and mortality rates of HCC are particularly high in certain regions [1] and among patients with chronic hepatitis, including hepatitis B and C [2]. Early detection of HCC is challenging because of its typically asymptomatic and rapidly progressing nature [3]. Primary treatments for HCC include surgical resection, liver transplantation, radiotherapy, chemotherapy, and targeted therapy; however, their effectiveness in enhancing patient survival rates is limited [2]. Among these treatment options, liver transplantation is considered the best option for patients with cirrhosis combined with HCC because of its ability to completely remove the lesions and replace the functionally impaired liver [4]. However, the shortage of liver donors and the high cost limit its widespread application. Moreover, the high recurrence rate of HCC remains a major challenge in current treatments, severely affecting patients' long-term survival [5, 6]. The mechanisms underlying HCC morbidity are complex and varied. Recent research has shed more light on the roles of genetic mutations, abnormal signaling pathways, and the tumor microenvironment in the development of HCC [7]. Despite advances in targeted drug therapy, its efficacy still falls below expected levels [8], indicating that exploring new therapeutic targets is crucial for treating HCC.
Research has indicated that circulating proteins, whose genetic determinants are protein quantitative trait loci (pQTLs), are potential biomarkers for assessing HCC risk [9]. Proteins such as alpha-fetoprotein (AFP) and des-γ-carboxy prothrombin have been used for clinically diagnosing and monitoring HCC. However, their effectiveness is limited by low sensitivity and specificity [10]. Additionally, circulating tumor DNA (ctDNA), as an emerging biomarker, has garnered significant attention in cancer research in recent years. ctDNA refers to DNA fragments released by tumor cells into the bloodstream, and its presence and changes can reflect the genomic characteristics of the tumor [11]. Because it is noninvasive, allows real-time monitoring, and is highly specific, ctDNA shows great potential in the early diagnosis of cancer [12], predicting recurrence [13], and guiding treatment [14, 15]. However, despite these advantages, the detection of ctDNA still faces challenges in terms of sensitivity and specificity. In contrast, proteomics, as a method for studying protein expression and function, offers another potential source of biomarkers. Combining ctDNA with proteomics can exploit the complementary strengths of both, providing more comprehensive and accurate tumor information [16]. This multilevel biomarker analysis is expected to offer clinicians more precise diagnostic and therapeutic guidance, thereby improving patient prognosis. Recent studies on the role of circulating proteins and their ratios in predicting the risk of HCC have revealed complex interactions between various biomarkers and disease progression [17]. Furthermore, the boundaries between diagnostic serum biomarkers and therapeutic targets are becoming increasingly blurred, and circulating proteins, such as AFP, GPC3, and SALL4, have become targets for biomarker-guided precision medicine in HCC [18].
However, these studies were observational or cross-sectional. Because of confounding, they cannot establish a causal relationship between individual circulating proteins and HCC or reveal the biological links between them, which limits the quality of the evidence.
Mendelian randomization (MR) is a powerful alternative that uses genetic variants as instrumental variables to infer a causal relationship between an exposure and an outcome, thus providing a higher level of evidence [19, 20]. Traditional statistical methods used to study the relationship between biomarkers and disease risk are often affected by confounding factors and reverse causality, making it challenging to establish causal relationships. MR exploits the random allocation of alleles from parents to offspring, which mimics the randomization process of a controlled trial and minimizes confounding effects. Additionally, a study published in Cell Genomics suggests that analyses based on a single protein, which ignore the relationships between proteins, can lead to false negatives and reduce the level of evidence [21]. To address these deficiencies, this study presents an MR analysis based on protein–protein pairs (rQTL), which, unlike pQTL-based MR, captures the causal association between the biological links among proteins and the pathogenesis of HCC. To our knowledge, no previous MR study has assessed the causal association between the circulating protein-to-protein ratio and HCC.
This study used large HCC cohorts and multi-omics data to explore the causal relationship between the circulating protein-to-protein ratio and the risk of HCC through MR from the perspective of pQTL and rQTL, with the aim of elucidating the molecular mechanism of HCC. We expect our findings to yield candidate biomarkers for early detection and risk stratification and to suggest new directions for targeted therapy of HCC.
2 Methods
In this study, we explored the potential causal relationship between the circulating protein-to-protein ratio and morbidity risk of HCC using two-sample MR analysis, and strengthened the validation of the causal association using the Steiger directionality test and colocalization analysis. We further analyzed the causal association between the expression abundance of individual proteins in positive protein–protein ratios and the morbidity risk of HCC. The study design is illustrated in Figure 1.

2.1 Data Acquisition
Genetic association data for protein–protein ratios were sourced from Karsten Suhre's identification of 2821 protein-to-protein ratio-associated single nucleotide polymorphisms (SNPs) (rQTL) [21]. Genetic association data for circulating plasma proteins (pQTLs) were obtained from the UKB PPP reported by Suhre [22]. Genetic association data for HCC were sourced from the FinnGen cohort (finngen-R11-HCC, with 609 patients and 345,118 controls; https://www.finngen.fi/en/) [23] and the IEU OpenGWAS project (ieu-b-4953, with 168 patients and 372,016 controls; https://gwas.mrcieu.ac.uk/datasets/) (Table 1).
Type | Data source | Sample size | Cases | Population |
---|---|---|---|---|
rQTLs | UKB-PPP | 54,000 | — | European |
pQTLs | UKB-PPP | 54,000 | — | European |
HCC | FinnGen | 345,727 | 609 | European |
HCC | IEU OpenGWAS | 372,184 | 168 | European |
2.2 Endpoint Definition
2.2.1 finngen-R11-HCC
The endpoint is defined by a set of registry filters: ICD-10 code C22.0 for the cause of death, ICD-O-3 topography code C22, morphology codes 81700 to 81705, and behavior code 3. Together, these filters specify the criteria for identifying HCC cases within the registry. Controls for the endpoint excluded individuals who were defined as C3_CANCER or C3_CANCER_WIDE (https://r11.risteys.finregistry.fi/endpoints/C3_HEPATOCELLU_CARC_EXALLC#).
2.2.2 ieu-b-4953
Liver cell carcinoma cases were defined using the following parameters. First, individuals with a site-specific cancer code (ICD10: C22.0 and ICD9: 1550) were selected. Second, site-specific cancer morphology (behavior) was handled using the following rules: cancer behaviors including “Malignant, primary site,” “Malignant, microinvasive,” “Malignant, metastatic site,” and “Malignant, uncertain whether primary or metastatic site” were included in the dataset, whereas cancer behaviors including “Benign,” “Uncertain whether benign or malignant,” and “Carcinoma in situ” were excluded. Third, individuals with an ICD10 D code but no C code were not included as cases. Controls were defined as individuals who had no cancer code (ICD9 or ICD10, C or D codes) and no self-report of cancer [24].
2.3 Instrumental Variable Screening
The selection criteria for instrumental variables for plasma rQTLs in the conventional two-sample MR analysis [25] were as follows [26]: (1) SNPs were associated with the exposure at p < 5 × 10⁻⁸; to ensure the independence of each SNP and remove the influence of linkage disequilibrium on the results, the linkage disequilibrium threshold was set at R² = 0.001 [27] within a 10,000 kb window, and the minor allele frequency (MAF) was > 0.01; (2) the variables showed no pleiotropy; (3) the instrumental variables were independent of the confounders. Additionally, instrumental variables with an F-statistic > 10 were preferentially chosen to reduce weak instrumental variable bias [28].
The screening criteria for instrumental variables of plasma pQTLs in the drug-target MR analysis were as follows: (1) SNPs were associated with the exposure at p < 1 × 10⁻⁵; to ensure the independence of SNPs and mitigate the influence of linkage disequilibrium on the results, the linkage disequilibrium threshold was set at R² = 0.3 within a 100 kb window, and the MAF was > 0.01; (2) the variables showed no pleiotropy; (3) each instrumental variable was independent of confounding factors; (4) the instrumental variable was located in the cis-acting region of the drug target gene (± 1 Mb).
The screened relevant SNPs were extracted from genome-wide association studies (GWAS) summary data of the outcome variable, HCC.
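The thresholds above can be illustrated with a minimal sketch. It assumes the F-statistic is approximated per SNP as (beta/SE)², a common choice that the text does not confirm, and it treats F > 10 as a hard filter for simplicity. The `Snp` class, the `screen` function, and the demo values are hypothetical; linkage disequilibrium clumping is not implemented here.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str      # hypothetical identifier, for illustration only
    beta: float    # SNP-exposure effect estimate
    se: float      # standard error of beta
    pval: float    # SNP-exposure association p value
    maf: float     # minor allele frequency


def f_statistic(beta: float, se: float) -> float:
    """Approximate per-SNP F-statistic (assumed formula: (beta / se)^2)."""
    return (beta / se) ** 2


def screen(snps, p_threshold=5e-8, min_maf=0.01, min_f=10.0):
    """Keep SNPs that pass the p value, MAF, and F-statistic thresholds."""
    return [
        s for s in snps
        if s.pval < p_threshold
        and s.maf > min_maf
        and f_statistic(s.beta, s.se) > min_f
    ]


# Made-up demo values, not taken from the study data.
candidates = [
    Snp("rs_demo_1", 0.20, 0.02, 1e-20, 0.25),   # passes all thresholds
    Snp("rs_demo_2", 0.05, 0.02, 1e-2, 0.30),    # fails the p value threshold
    Snp("rs_demo_3", 0.30, 0.03, 1e-15, 0.005),  # fails the MAF threshold
]
kept = screen(candidates)
print([s.rsid for s in kept])
```

For the drug-target setting, the same function could be called with `p_threshold=1e-5`; the cis-region restriction (± 1 Mb) would need gene coordinates that this sketch does not model.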
2.4 Co-Localization Analysis
Co-localization analysis of positive protein-to-protein ratios and individual proteins with HCC was performed using the “coloc” package of R [29]. In a given region, co-localization analysis assumes that each of the two traits has at most one true causal variant, resulting in five mutually exclusive hypotheses (H0–H4). H0: neither phenotype 1 (GWAS) nor phenotype 2 (QTL or GWAS) is associated with any SNP in the genomic region; H1/H2: only phenotype 1 (H1) or only phenotype 2 (H2) is associated with SNPs in the region; H3: phenotypes 1 and 2 are both associated with SNPs in the region, but are driven by different causal variants; and H4: phenotypes 1 and 2 are both associated with SNPs in the region and are driven by the same causal variant. The colocalization regions for pQTL–GWAS and rQTL–GWAS were set at ±500 kb from the location of the top-associated SNP. Co-localization was considered strong when PH4 > 0.8, moderate when 0.5 < PH4 < 0.8, and weak when 0.25 < PH4 < 0.5 [30].
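The PH4 categories can be written as a small helper. This is only a sketch: the function name is hypothetical, and because the text leaves the exact boundary values (0.8 and 0.5) unassigned, the code places them in the lower category as an assumption. Values of 0.25 or below are labelled "none", which is also an assumption.

```python
def classify_ph4(ph4: float) -> str:
    """Map a posterior probability for H4 to an evidence category."""
    if not 0.0 <= ph4 <= 1.0:
        raise ValueError("PH4 must be a probability between 0 and 1")
    if ph4 > 0.8:
        return "strong"
    if ph4 > 0.5:      # assumption: exactly 0.8 falls here
        return "moderate"
    if ph4 > 0.25:     # assumption: exactly 0.5 falls here
        return "weak"
    return "none"      # assumption: label for PH4 <= 0.25


for value in (0.9, 0.6, 0.3, 0.1):
    print(value, classify_ph4(value))
```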
2.5 Steiger Direction Test
Steiger's directionality test [31] was performed to verify the direction of the causal association between the eight positive protein–protein pairs and HCC. The variance in the protein-to-protein ratio and the variance in HCC explained by the instrumental variable SNPs were calculated, and the test assessed whether the SNPs explained less variance in HCC than in the protein-to-protein ratio. If the variance explained in the outcome is less than the variance explained in the protein-to-protein ratio, the result is “TRUE,” indicating that the causal relationship is consistent with the expected direction, while a “FALSE” result indicates that the causal relationship is the opposite of the expected direction.
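The direction check can be sketched as follows. This is a simplified illustration, not the procedure of the “TwoSampleMR” package: it approximates the variance explained by each SNP as t² / (t² + N − 2) with t = beta/SE, sums over SNPs, and returns only the TRUE/FALSE flag without a p value. For a binary outcome such as HCC, a liability-scale conversion is usually applied, which this sketch omits. All function names and demo values are hypothetical.

```python
def variance_explained(beta: float, se: float, n: int) -> float:
    """Approximate r^2 for one SNP from its summary statistics (assumed formula)."""
    t2 = (beta / se) ** 2
    return t2 / (t2 + n - 2)


def steiger_direction(exposure_stats, outcome_stats):
    """Each argument is a list of (beta, se, n) tuples, one per instrument SNP.

    Returns (flag, r2_exposure, r2_outcome); flag is True when the SNPs
    explain more variance in the exposure than in the outcome.
    """
    r2_exposure = sum(variance_explained(b, s, n) for b, s, n in exposure_stats)
    r2_outcome = sum(variance_explained(b, s, n) for b, s, n in outcome_stats)
    return r2_exposure > r2_outcome, r2_exposure, r2_outcome


# Made-up demo values; only the sample sizes echo Table 1.
exposure_demo = [(0.20, 0.02, 54000), (0.15, 0.02, 54000)]
outcome_demo = [(0.05, 0.05, 345727), (0.02, 0.05, 345727)]
flag, r2_x, r2_y = steiger_direction(exposure_demo, outcome_demo)
print(flag, r2_x, r2_y)
```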
2.6 Statistical Analysis
Statistical analyses were conducted using R version 4.4.1 (R Foundation for Statistical Computing, Vienna, Austria; www.r-project.org). The “TwoSampleMR” package, version 0.6.6, was employed for general two-sample MR and drug–target MR analyses. The Wald ratio method and the inverse-variance weighted approach [32] were used to assess the causal associations of the protein-to-protein ratio and protein abundance with HCC, using QTLs as instrumental variables. The intercept of the MR-Egger method was used for pleiotropy analysis. SNP heterogeneity was assessed using Cochran's Q test [28]. I² was also used to represent the proportion of total variation due to heterogeneity. p < 0.05 was considered statistically significant.
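The estimators named above can be sketched in a few lines. This sketch assumes a fixed-effect inverse-variance weighted (IVW) model with first-order weights, which may differ from the defaults of the “TwoSampleMR” package; the function names and demo values are hypothetical and the numbers are not study data.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP estimate (log odds per unit of exposure) and first-order SE."""
    return beta_outcome / beta_exposure, abs(se_outcome / beta_exposure)


def ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate with Cochran's Q and I^2 across SNPs."""
    ratios, weights = [], []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        estimate, se_ratio = wald_ratio(bx, by, se)
        ratios.append(estimate)
        weights.append(1.0 / se_ratio ** 2)
    total = sum(weights)
    pooled = sum(w * r for w, r in zip(weights, ratios)) / total
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"estimate": pooled, "se": math.sqrt(1.0 / total), "q": q, "i2": i2}


def odds_ratio_ci(estimate, se, z=1.96):
    """Convert a log-odds estimate to an OR with a 95% confidence interval."""
    return tuple(math.exp(v) for v in (estimate, estimate - z * se, estimate + z * se))


# Made-up demo: three SNPs that all imply the same ratio of 0.5.
result = ivw([0.10, 0.20, 0.15], [0.05, 0.10, 0.075], [0.01, 0.01, 0.01])
or_point, or_low, or_high = odds_ratio_ci(result["estimate"], result["se"])
print(result, or_point, or_low, or_high)
```

When only one instrument is available, `wald_ratio` alone gives the estimate; with two or more, `ivw` pools the per-SNP ratios.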
3 Results
3.1 Causal Association Between Circulating Protein-To-Protein Ratio and HCC
3.1.1 Two-Sample MR
This study included 2821 protein-to-protein ratios. In the FinnGen database, 40,028 SNPs were screened for MR analysis of HCC after excluding SNPs associated with confounders such as hepatitis, cirrhosis, fatty liver, and alcohol consumption (Table S1). The screening results are detailed in Data S1. The main results of the analysis of the causal association between the protein-to-protein ratio and HCC morbidity risk are shown in Figure 2. A total of 93 protein pairs were associated with the risk of HCC, including LAT2/SPRY2 (odds ratio [OR]: 0.4974, 95% confidence interval [CI]: 0.2506–0.9871; p = 0.046), FAP/THBS4 (OR: 0.6537, 95% CI: 0.4919–0.8687; p = 0.003), ICA1/IRAK1 (OR: 1.3014, 95% CI: 1.0345–1.637; p = 0.024), ERBIN/LAT2 (OR: 1.9763, 95% CI: 1.3009–3.0026; p < 0.005), and ITGB1BP2/LAT2 (OR: 1.5055, 95% CI: 1.0418–2.1755; p = 0.029). The results of the heterogeneity and horizontal pleiotropy analyses of rQTL associated with HCC (finngen-R11-HCC) are shown in Data S2.

In the IEU OpenGWAS database, 34,268 SNPs were screened for MR analysis of HCC after excluding SNPs associated with confounders such as hepatitis, liver cirrhosis, fatty liver, and alcohol consumption (Table S1). The screening results are detailed in Data S3. The main results of the analysis of the causal association between the protein-to-protein ratio and HCC morbidity risk are shown in Figure 3. A total of 51 protein pairs were associated with the risk of HCC, including LAT2/SPRY2 (OR: 0.9992, 95% CI: 0.9986–0.9999; p = 0.018), FAP/THBS4 (OR: 0.9997, 95% CI: 0.9994–0.9999; p = 0.012), ICA1/IRAK1 (OR: 1.0002, 95% CI: 1–1.0004; p = 0.034), ERBIN/LAT2 (OR: 1.0006, 95% CI: 1.0002–1.0009; p < 0.005), and ITGB1BP2/LAT2 (OR: 1.0004, 95% CI: 1.0001–1.0007; p < 0.005). The results of the heterogeneity and horizontal pleiotropy analyses of rQTL associated with HCC (ieu-b-4953) are shown in Data S4.

We considered the intersection of positive results from the two databases. A total of eight protein–protein pairs were causally associated with the risk of HCC morbidity in both the FinnGen and IEU OpenGWAS databases (Table 2). The effects of ERBIN/LAT2, ICA1/IRAK1, ICA1/YTHDF3, ITGB1BP2/LAT2, LAT2/SPRY2, and FAP/THBS4 showed consistent directionality across the FinnGen and IEU databases, whereas the effects of PPP1R9B/TDRKH and STK24/TDRKH pointed in opposite directions in the two databases.
Ratio | Protein1 | Protein2 | Outcome | Nsnp | Methods | OR (95% CI) | p |
---|---|---|---|---|---|---|---|
ERBIN.LAT2 | ERBIN | LAT2 | finngen_R11_HCC | 20 | Inverse variance weighted | 1.9763 (1.3009–3.0026) | 0.00141 |
ERBIN.LAT2 | ERBIN | LAT2 | ieu-b-4953 | 16 | Inverse variance weighted | 1.0006 (1.0002–1.0009) | 0.004876 |
FAP.THBS4 | FAP | THBS4 | finngen_R11_HCC | 21 | Inverse variance weighted | 0.6537 (0.4919–0.8687) | 0.003384 |
FAP.THBS4 | FAP | THBS4 | ieu-b-4953 | 19 | Inverse variance weighted | 0.9997 (0.9994–0.9999) | 0.012371 |
ICA1.IRAK1 | ICA1 | IRAK1 | finngen_R11_HCC | 5 | Inverse variance weighted | 1.3014 (1.0345–1.637) | 0.024458 |
ICA1.IRAK1 | ICA1 | IRAK1 | ieu-b-4953 | 3 | Inverse variance weighted | 1.0002 (1–1.0004) | 0.03397 |
ICA1.YTHDF3 | ICA1 | YTHDF3 | finngen_R11_HCC | 2 | Inverse variance weighted | 1.4362 (1.0052–2.0518) | 0.046741 |
ICA1.YTHDF3 | ICA1 | YTHDF3 | ieu-b-4953 | 2 | Inverse variance weighted | 1.0003 (1–1.0007) | 0.029107 |
ITGB1BP2.LAT2 | ITGB1BP2 | LAT2 | finngen_R11_HCC | 33 | Inverse variance weighted | 1.5055 (1.0418–2.1755) | 0.029411 |
ITGB1BP2.LAT2 | ITGB1BP2 | LAT2 | ieu-b-4953 | 28 | Inverse variance weighted | 1.0004 (1.0001–1.0007) | 0.004713 |
LAT2.SPRY2 | LAT2 | SPRY2 | finngen_R11_HCC | 11 | Inverse variance weighted | 0.4974 (0.2506–0.9871) | 0.045805 |
LAT2.SPRY2 | LAT2 | SPRY2 | ieu-b-4953 | 7 | Inverse variance weighted | 0.9992 (0.9986–0.9999) | 0.018179 |
PPP1R9B.TDRKH | PPP1R9B | TDRKH | finngen_R11_HCC | 8 | Inverse variance weighted | 1.265 (1.0043–1.5933) | 0.045856 |
PPP1R9B.TDRKH | PPP1R9B | TDRKH | ieu-b-4953 | 5 | Inverse variance weighted | 0.9998 (0.9996–1) | 0.034891 |
STK24.TDRKH | STK24 | TDRKH | finngen_R11_HCC | 4 | Inverse variance weighted | 1.2839 (1.0031–1.6434) | 0.047222 |
STK24.TDRKH | STK24 | TDRKH | ieu-b-4953 | 2 | Inverse variance weighted | 0.9998 (0.9996–1) | 0.047475 |
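The directional consistency across the two datasets can be checked directly from the odds ratios in Table 2. The sketch below compares the sign of log(OR) in FinnGen and IEU OpenGWAS for each pair; the dictionary and function names are hypothetical, and the values are copied from Table 2.

```python
import math

# (FinnGen OR, IEU OpenGWAS OR) for each pair, taken from Table 2.
TABLE2_OR = {
    "ERBIN/LAT2": (1.9763, 1.0006),
    "FAP/THBS4": (0.6537, 0.9997),
    "ICA1/IRAK1": (1.3014, 1.0002),
    "ICA1/YTHDF3": (1.4362, 1.0003),
    "ITGB1BP2/LAT2": (1.5055, 1.0004),
    "LAT2/SPRY2": (0.4974, 0.9992),
    "PPP1R9B/TDRKH": (1.265, 0.9998),
    "STK24/TDRKH": (1.2839, 0.9998),
}


def same_direction(or_a: float, or_b: float) -> bool:
    """True when both odds ratios lie on the same side of 1."""
    return math.log(or_a) * math.log(or_b) > 0


consistent = [pair for pair, (a, b) in TABLE2_OR.items() if same_direction(a, b)]
discordant = [pair for pair in TABLE2_OR if pair not in consistent]
print("consistent:", consistent)
print("discordant:", discordant)
```

The magnitudes of the two sets of odds ratios differ substantially, so this check speaks only to the sign of the effect, not to its size.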
3.2 Steiger Direction Test
The Steiger test returned “TRUE” for all eight positive protein–protein pairs in both HCC datasets, indicating that the causal relationship between these protein-to-protein ratios and HCC was consistent with the expected direction (Table 3).
Outcome | Exposure | Correct_causal_direction | p |
---|---|---|---|
Hepatocellular carcinoma (finngen-R11-HCC) | ERBIN/LAT2 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | FAP/THBS4 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | ICA1/IRAK1 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | ICA1/YTHDF3 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | ITGB1BP2/LAT2 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | LAT2/SPRY2 | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | PPP1R9B/TDRKH | TRUE | < 0.001 |
Hepatocellular carcinoma (finngen-R11-HCC) | STK24/TDRKH | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | ERBIN/LAT2 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | FAP/THBS4 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | ICA1/IRAK1 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | ICA1/YTHDF3 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | ITGB1BP2/LAT2 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | LAT2/SPRY2 | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | PPP1R9B/TDRKH | TRUE | < 0.001 |
Liver cell carcinoma (ieu-b-4953) | STK24/TDRKH | TRUE | < 0.001 |
3.3 Co-Localization Analysis
The results of the co-localization analysis revealed weak evidence of co-localization of positive protein–protein pairs with HCC (Table S2).
3.4 Causal Association Between Circulating Proteins and HCC
For the eight positive protein pairs, the causal effect of a single circulating protein on the risk of HCC was explored based on two-sample MR, drug–target MR, and co-localization analysis.
3.5 Two-Sample MR
The results of the two-sample MR analysis showed that three proteins, TDRKH (OR: 0.6866, 95% CI: 0.4721–0.9987; p = 0.049), THBS4 (OR: 1.6071, 95% CI: 1.1199–2.3063; p = 0.01), and ICA1 (OR: 1.8042, 95% CI: 1.0962–2.9697; p = 0.02), were causally associated with HCC (Figure 4). Sensitivity analysis results for pQTLs associated with HCC (finngen-R11-HCC) are detailed in Table S3.

3.6 Co-Localization Analysis of Common Two-Sample MR
The results of the co-localization analysis suggested weak evidence of co-localization between pQTLs in positive protein pairs and HCC (Table S4).
3.7 Drug-Target MR
The results of the drug-target MR showed that only TDRKH (OR: 0.5964, 95% CI: 0.4196–0.8476; p < 0.005) was causally associated with HCC (Figure 5). The results of the sensitivity analysis of the pQTLs associated with HCC (finngen-R11-HCC) are shown in Table S3.

4 Discussion
The mechanism of HCC morbidity is complex and involves cell cycle deregulation, chromosomal instability, immune regulation, epithelial-mesenchymal transition, and microRNA deregulation [33]. Combination therapy with immune checkpoint inhibitors (ICIs) has significantly improved the treatment outcomes for advanced HCC. This therapeutic approach is not only used for the treatment of unresectable liver cancer but also serves as a bridge and downstaging treatment before liver transplantation, as well as a reference for postoperative tumor recurrence management [34, 35]. The main challenges regarding HCC are early and accurate diagnosis and medical therapy after diagnosis [36, 37]. Therefore, there is an urgent need to identify new biomarkers and therapeutic targets for the treatment of HCC. To our knowledge, this study is the first to assess the association between 2821 plasma protein-to-protein ratios (rQTLs) and the risk of HCC morbidity. After screening and intersecting the results from two independent cohorts, we identified eight plasma protein–protein pairs (ERBIN/LAT2, ICA1/IRAK1, ICA1/YTHDF3, ITGB1BP2/LAT2, LAT2/SPRY2, FAP/THBS4, PPP1R9B/TDRKH, and STK24/TDRKH) that were associated with HCC. Among the proteins in these eight positive protein–protein pairs, MR analysis of individual circulating plasma proteins (pQTL) showed that one protein (TDRKH) was significantly associated with the pathogenesis of HCC in both the two-sample and drug-target analyses.
Interleukin-1 receptor-associated kinase 1 (IRAK1) reportedly plays an important role in a series of malignant tumors and is overexpressed in HCC tissues and cell lines. IRAK1/4 inhibitors cause G1/S cell cycle arrest and apoptosis, supporting IRAK1 as a new target for drug therapy for HCC [38]. THBS4 is part of the extracellular matrix, and its regulation of HCC progression is related to its interaction with integrin β1, which involves the FAK/PI3K/AKT signaling pathway [39]; it may be a prognostic marker or potential therapeutic target for HCC [40]. SPRY2 expression is downregulated in various cancers, including prostate, liver, lung, and breast cancers [41]. One study found that the inhibition of SPRY2 expression via the MAPK/ERK signaling pathway promotes HCC development [42]. It has been reported that activating the STAT3 pathway promotes the expression of fibroblast activation protein (FAP) [43], thereby mediating proliferation and migration of HCC cells while inhibiting their apoptosis. FAP can be used as a marker of poor prognosis in patients with HCC [44, 45]. ERBIN promotes HCC by inactivating ERα-mediated tumor suppressor signaling [46]. The roles of these molecules in HCC are consistent with our results. Additionally, we found a strong causal association between other circulating proteins and HCC risk. Among them, high expression of L-type amino acid transporter 2 (LAT2) leads to abnormal amino acid uptake and metabolism, causing rapid proliferation in pancreatic cancer [47], osteosarcoma [48], and other tumor cells. Small-molecule inhibitors of LAT2 reportedly inhibit CD47-mediated tumor immune escape [49]. YTHDF3 is a reader of N6-methyladenosine (m6A), and recent studies have revealed that YTHDF3 plays a role in tumorigenesis by recognizing the m6A modification in MYC mRNA. This highlights the therapeutic potential of targeting the YTHDF3-MYC signaling axis in pancreatic cancer [50].
Results from a proximity extension assay study suggested that ITGB1BP2 may be a candidate biomarker for diagnosing invasive cervical cancer [51]. PPP1R9B (SPN or NEURABIN-2) is a tumor suppressor that affects tumor initiation and progression [52, 53] when the SPN domain that interacts with protein phosphatase 1 (PP1) is mutated. Recent studies have found that the expression of serine/threonine kinase 24 (STK24), which regulates tumor immune escape by promoting AKT phosphorylation and PD-L1 expression, is upregulated in tumor specimens [54]. The role of ICA1 in gliomas requires further study [55]. Because the above mechanisms were described in other cancers, the specific roles of these circulating proteins in HCC development remain to be verified and explored.
MR analysis of single circulating proteins based on the eight positive protein–protein pairs yielded one protein (TDRKH). Few reports exist on TDRKH; however, some studies have shown that TDRKH-AS1 (an antisense RNA of TDRKH) is highly expressed in HCC cell lines in vitro. TDRKH-AS1 knockdown may affect cell proliferation by inducing apoptosis in HCC cells [56]. Additionally, TDRKH-AS1 is upregulated in patients with colorectal cancer and can target β-catenin in the Wnt signaling pathway to exert oncogenic activity [57]. TDRKH-AS1 can be used as a prognostic marker for HCC [56], which is consistent with our findings. In line with a previous paper published in Cell Genomics [21], only one protein (TDRKH) was found to have a causal association with HCC based on single pQTLs in this study, whereas eight protein–protein pairs were found to have a causal association with HCC after considering protein–protein interactions. Therefore, considering protein–protein interactions may help identify new targets for diagnosis and intervention.
This study has several significant strengths. First, genetic data on the plasma protein ratio and HCC were derived from GWAS chosen to minimize sample overlap, thereby reducing the risk of bias due to shared samples between datasets. Second, the inclusion of different HCC cohorts from the FinnGen and UKB datasets helped mitigate potential confounding effects due to population differences. Third, the F-statistics of all selected genetic instrumental variables exceeded 10, indicating that the instruments were strong. This could effectively reduce the bias introduced by weak instrumental variables, thus improving the credibility of the instrumental variables used in this analysis. Finally, integrating multiple sensitivity analysis techniques enhanced the reliability of the results.
Although the analysis in this study was relatively comprehensive, several limitations that may affect the interpretation of the results must be acknowledged. First, the limitations inherent in MR must be recognized, including issues such as heterogeneity in phenotypic traits and developmental compensation, which may have undermined the accuracy and applicability of our findings [58]. Second, we relied only on summary-level data, which limited our ability to perform stratified analyses or explore individual-level data. Future studies could conduct more in-depth analyses based on individual-level data and perform stratified analyses for HCC caused by different etiologies. Third, given that the study population was composed primarily of individuals of European ancestry, caution is needed when generalizing the results to other populations, such as Asians. Follow-up studies should be conducted in different ethnic groups to verify the relevance of our findings. Fourth, no correction for multiple testing was performed in this study; however, we used two large-sample datasets and sensitivity analyses to maximize the robustness of the results. Finally, although we revealed a causal relationship between the eight protein pairs and HCC, our understanding of the underlying mechanisms is incomplete. Further studies are needed to elucidate these complex pathways.
In conclusion, based on the rQTL and pQTL data, we identified causal relationships between HCC and eight protein–protein pairs and one individual protein (TDRKH), providing new insights into the diagnosis and treatment of HCC.
Author Contributions
Xu Shaohua, Zhang Mingwei, and Li Yueming supervised the entire project and designed the work. Yue Qiuyuan contributed to the data analysis, data interpretation, manuscript writing, and revision. Li Xiaoxia contributed to the data analysis. Wan Xiaoye contributed to manuscript writing. Lin Xi contributed to data curation. All authors reviewed or revised the manuscript and approved the final draft for submission.
Acknowledgments
We gratefully acknowledge the financial support from the Natural Science Foundation of Fujian Province (No. 2023J01179) and the Foundation Sponsored by Clinical Research Funding Project Initiated by Researchers of Fujian Cancer Hospital (Grant No. 202412009).
Ethics Statement
This study was performed in line with the principles of the Declaration of Helsinki.
Consent
Written informed consent for participation was not required for this study in accordance with national legislation and institutional requirements. No animal studies are presented in this manuscript.
Conflicts of Interest
The authors declare no conflicts of interest.
Open Research
Data Availability Statement
The summary statistics for the protein–protein ratios are available in the study by Karsten Suhre (https://doi.org/10.1016/j.xgen.2024.100506). The HCC summary statistics for the FinnGen GWAS are available at https://www.finngen.fi/en/access_results. The HCC summary statistics for the IEU OpenGWAS project are available at https://gwas.mrcieu.ac.uk/datasets/.