Triple-negative breast cancer (TNBC) is a malignant neoplasm and a leading cause of cancer-related death among Indian women. Recent studies estimate that TNBC accounts for more than 40 percent of all breast cancer (BC) cases in women worldwide. TNBC is characterized by the absence of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) expression.

Objectives

Due to the lack of specific targets, standard treatment options for TNBC are limited. This integrative study aims to identify key genes and provide insights into the underlying molecular mechanisms of TNBC, which can potentially lead to the development of more effective therapeutic strategies.

Material and Methodology

This study integrates protein–protein interaction (PPI) network analysis and weighted gene co-expression network analysis (WGCNA) of two TNBC-related datasets (GSE52194 and GSE58135) to identify key genes. Downstream analyses were then conducted to explore potential therapeutic targets for TNBC.

Results

The study identifies 13 potential key genes (PLCG2, CXCL10, CDK1, STAT1, IL6, PLK1, CCNB1, AURKA, NDC80, EGFR, IL1B, FN1, and BUB1B), along with 6 associated transcription factors (TFs) and 20 miRNAs, as reporter biomolecules around which the most significant changes occur. Nine of these miRNAs (hsa-mir-449b-5p, hsa-let-7b-5p, hsa-mir-26a-5p, hsa-mir-155-5p, hsa-mir-24-3p, hsa-mir-212-3p, hsa-mir-21-5p, hsa-mir-210-3p, and hsa-mir-20a-5p) have been reported in other cancers and other BC subtypes, but their association with TNBC remains to be explored. Enrichment and cumulative survival analyses further support the association of the identified key genes with TNBC.

Conclusion

This integrative analysis merits experimental follow-up, as it provides a platform for future research on drug design and biomarker discovery for TNBC diagnosis and treatment.

1 Introduction

Breast cancer (BC), which accounts for 12% of all cancers worldwide, is a serious global public health concern [1]. Owing to its heterogeneous nature, BC is categorized into three main groups based on cellular receptor markers that reflect the available targeted therapies: (a) estrogen receptor (ER) or progesterone receptor (PR) positive; (b) human epidermal growth factor receptor 2 (HER2) positive (amplification of erbB2) with or without ER and PR positivity; and (c) triple-negative breast cancer (TNBC), defined by the absence of ER, PR, and HER2 expression [2]. Advances in genomics technologies and management by government authorities have helped identify the major factors relevant to TNBC surveillance and prevention. Still, no standard treatment options are available for TNBC, which accounts for 10%–20% of all invasive BC cases, because it does not respond to drugs that target receptors such as ER, PR, and HER2 [2, 3]. TNBC is more likely to metastasize to the liver, bones, and lungs; it is usually diagnosed late, and the survival period is short once it spreads [4]. Hence, there is an urgent need for new biomarkers, and for robust techniques and pipelines to find them, that facilitate early-stage detection of the disease. Biomarkers are generally classified into four main categories: diagnostic, prognostic, predictive, and therapeutic, each with distinct importance [5]. Diagnostic biomarkers can noninvasively identify the presence of disease, prognostic biomarkers provide information on patient survival with or without treatment, predictive biomarkers help to determine which treatment is most likely to improve a patient's survival, and therapeutic biomarkers, often proteins, serve as targets in treatment therapies [5].

High-throughput profiling techniques, including microarrays and RNA-Seq, are cutting-edge genomic/transcriptomic methods. The microarray methodology is based on hybridization, whereas RNA-Seq is based on sequencing by synthesis and uses DNA polymerase to incorporate nucleotides [6]. Unlike arrays, RNA-Seq does not require species- or transcript-specific probes and can discover novel transcripts, gene fusions, single nucleotide variations, and indels (small insertions and deletions) [7, 8]. Recent studies have applied bioinformatics approaches to array and RNA-Seq data to identify key/hub genes for TNBC [9-12], and have revealed that most of the identified key/hub genes are kinases. Cross-platform data integration in RNA-Seq analysis combines data from multiple sources, technologies, or studies to improve the robustness and depth of biological insights [13]. This approach can overcome limitations of individual datasets, such as small sample sizes or platform-specific biases, by leveraging diverse data to enhance statistical power and uncover broader patterns in expression profiling. In bioinformatics today, collecting data is not the main challenge; normalizing the data is the significant hurdle [14]. Considering these factors, we devised an integrated protein–protein interaction (PPI) and weighted gene co-expression network analysis (WGCNA) study to identify key genes linked to TNBC, along with their associated TFs and miRNAs, as reporter biomolecules [15, 16] around which the most significant changes occur and which could serve as potential biomarkers for the disease. We first retrieved TNBC-associated RNA-Seq datasets from the GEO database. Before integrating the data, we preprocessed and normalized them following established methods from the literature that were suitable for our study.
In this integrated analysis aimed at identifying key genes, the pipeline was divided into two main parts. The first part involved identifying differentially expressed genes (DEGs) and then reconstructing a PPI network to retrieve significant hub genes. This was followed by enrichment analysis, which supported the involvement of DEGs in cancer-related pathways and biological ontologies. The second part focused on WGCNA, in which a co-expression network was constructed to elucidate correlations between gene clusters and phenotypic attributes and to identify phenotypically significant clusters. Key hub genes were then retrieved, and downstream analyses, such as the exploration of associated regulatory biomolecules, cross-validation, and a novel cumulative survival analysis, were conducted to establish them as potential biomarkers for TNBC. Traditional survival analysis typically evaluates the prognostic power of individual genes over time and often fails to fully capture the underlying mechanisms of disease progression. The cumulative approach instead evaluates groups of genes together and can identify multiple gene targets within disrupted pathways, which may facilitate the development of more effective drugs to improve survival rates in the future.

2 Materials and Methodology

2.1 Data Retrieval

To achieve a comprehensive analysis, a thorough literature review was conducted to identify all publicly available RNA-seq datasets containing both normal and cancerous samples associated with TNBC. The keywords “TNBC,” “Homo sapiens,” and “expression profiling by high throughput sequencing” were used for the search. Four unprocessed RNA-seq transcriptome datasets (GSE52194, GSE58135, GSE142258, and GSE142731) were identified in the GEO database [17]. Of these, two datasets (GSE58135, GSE52194) were selected for evaluating integrative gene expression profiling in TNBC, as they included samples from both healthy and diseased individuals. The other two datasets (GSE142258 and GSE142731) were excluded because they lacked normal samples, to maintain the homogeneity of the analysis.

2.2 Quality-Check and Data Integration

The quality of the unprocessed sequence data was examined using the FastQC v0.11.5 toolkit [18]. After the quality check, adapters and poor-quality reads were trimmed using Trimmomatic v0.36 [19]. The processed reads were then aligned to the reference using STAR 2.7.10a [20]. The human reference genome GRCh38 (Ensembl release 107) was used, and STAR's default alignment parameters were employed because these settings are optimized for mammalian genomes. Following this, featureCounts v1.6.2 [21] was used to count the reads assigned to each gene, using the annotation file from the same Ensembl release (107). Before data integration, each dataset was pre-filtered to remove low-count genes, retaining only rows with at least five reads. Normalization and GC-bias correction were then performed using the cqn R package [22], and the datasets were integrated using the merge() R function. Finally, the integrated dataset was pre-filtered again to remove low-count genes, defined as those with fewer than 10 reads in 75% of samples, to make the data more consistent and robust for further analysis.
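The two pre-filtering passes and the merge step can be illustrated with a minimal Python sketch. It is not the R code used in the study: the toy gene names and counts are hypothetical, "at least five reads" is assumed to mean the row sum across samples, the merge is assumed to be an inner join on gene ID, and the cqn normalization step is omitted.

```python
# Toy sketch of the two pre-filtering passes and the merge step.
# Thresholds follow the text; how they are applied is an assumption.

def prefilter_dataset(counts, min_total=5):
    """Keep genes whose summed reads across samples reach min_total (assumed reading)."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}

def merge_datasets(first, second):
    """Inner-join two gene -> counts tables on gene ID, concatenating the samples."""
    return {gene: first[gene] + second[gene] for gene in first if gene in second}

def filter_merged(counts, min_count=10, low_fraction=0.75):
    """Drop genes with fewer than min_count reads in at least low_fraction of samples."""
    kept = {}
    for gene, row in counts.items():
        n_low = sum(1 for c in row if c < min_count)
        if n_low / len(row) < low_fraction:
            kept[gene] = row
    return kept

# Hypothetical count tables: gene -> reads per sample.
dataset_1 = {"GENE_A": [120, 98, 143], "GENE_B": [0, 1, 2], "GENE_C": [15, 0, 2]}
dataset_2 = {"GENE_A": [88, 101], "GENE_B": [3, 4], "GENE_C": [4, 3], "GENE_D": [50, 60]}

merged = merge_datasets(prefilter_dataset(dataset_1), prefilter_dataset(dataset_2))
filtered = filter_merged(merged)
print(sorted(merged))    # genes surviving per-dataset filtering and the join
print(sorted(filtered))  # genes surviving the filter on the merged table
```

In this toy case GENE_B fails the per-dataset filter in the first dataset, GENE_D is absent from the first dataset, and GENE_C is removed from the merged table because four of its five counts are below 10.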

2.3 Dataset Analysis

After integrating the data, the pipeline was divided into two branches, as shown in Figure 1. The first branch was dedicated to differential gene expression analysis using the edgeR package [23]. The second branch, run in parallel, performed WGCNA using the WGCNA R package [24].

FIGURE 1. Overview of the study.

2.4 DEG Identification, PPI Reconstruction and Module Identification

Using the edgeR package, genes with log2FoldChange > 1.0 and adjusted p-value (p adj) < 0.01, corrected by the Benjamini-Hochberg method, were considered upregulated (overexpressed), while genes with log2FoldChange < −1.0 and p adj < 0.01 were considered downregulated (under-expressed); together, these are referred to as DEGs. The STRING database [25], with a confidence score of 0.90, was used to reconstruct the PPI network of the DEGs, and the network was visualized using Cytoscape [26]. The PPI network was modeled as an undirected graph in which ‘V’ denotes the set of vertices (nodes, representing proteins) and ‘E’ denotes the set of edges (connections between proteins). To identify significant modules of DEGs, the Cytoscape plugin MCODE was used to find the most densely connected clusters within the network.
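The thresholding rule above can be sketched in a few lines of Python. This is only an illustration of the Benjamini-Hochberg adjustment and the cutoffs stated in the text, not the edgeR workflow itself; the raw p-values and fold changes below are hypothetical, and the function names are ours.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def classify(log2fc, padj, lfc_cut=1.0, p_cut=0.01):
    """Label a gene as 'up', 'down', or 'ns' using the cutoffs from the text."""
    if padj < p_cut and log2fc > lfc_cut:
        return "up"
    if padj < p_cut and log2fc < -lfc_cut:
        return "down"
    return "ns"

# Hypothetical per-gene statistics.
log2fc = [2.3, -1.8, 3.0, 0.4, -2.2]
raw_p = [0.0001, 0.002, 0.03, 0.0005, 0.5]

padj = benjamini_hochberg(raw_p)
labels = [classify(fc, p) for fc, p in zip(log2fc, padj)]
print(labels)
```

The third gene has a large fold change but an adjusted p-value above 0.01, and the fourth is significant but changes too little, so neither is counted as a DEG.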

2.5 WGCNA Analysis

To construct a weighted co-expression network from the integrated count matrix of 16,384 genes, we used the WGCNA R package. First, we created an adjacency matrix describing the correlation strength between the nodes.

The adjacency matrix was derived from an intermediate co-expression similarity matrix. The following equations were used to derive the similarity and adjacency matrices:

S_{ij}=\left|\operatorname{cor}\left(x_i,x_j\right)\right|,\qquad a_{ij}=S_{ij}^{\beta}

(1)

In these equations, i and j represent two distinct genes, and x_i and x_j denote their respective expression values. S_ij is the co-expression similarity, based on Pearson's correlation coefficient between x_i and x_j, whereas a_ij denotes the strength of the connection between the two genes. For this study, we selected a soft-threshold power of β = 15 (scale-free R² = 0.90); this power determines the specificity and sensitivity of the pairwise connection strengths used to construct the adjacency matrix. We then transformed the adjacency matrix into a topological overlap matrix (TOM), which quantifies the similarity of two nodes by combining their direct connection strength with the connections they share with other nodes. Finally, hierarchical clustering was conducted to identify significant modules.
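The similarity, adjacency, and TOM steps can be sketched in plain Python as follows. This is a toy illustration rather than the WGCNA package code: it assumes an unsigned network (absolute Pearson correlation) and the standard unsigned topological overlap formula, and the three expression profiles are hypothetical. Only β = 15 is taken from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency_matrix(expr, beta):
    """a_ij = |cor(x_i, x_j)| ** beta, with a zero diagonal (unsigned network assumed)."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = min(1.0, abs(pearson(expr[i], expr[j])))  # clamp rounding error
            adj[i][j] = adj[j][i] = s ** beta
    return adj

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is zero
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

# Hypothetical expression profiles for three genes across four samples.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with the first gene
    [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with the first gene
]
adj = adjacency_matrix(expr, beta=15)
tom = topological_overlap(adj)
dissimilarity = [[1.0 - v for v in row] for row in tom]  # input to hierarchical clustering
```

Raising the similarity to the power 15 shrinks a correlation of 0.8 to about 0.035 while leaving a correlation of 1 unchanged, which is how the soft threshold suppresses weak connections without a hard cutoff.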

2.6 Enrichment Analysis of DEGs

Enrichment analyses of the DEGs for Gene Ontology (GO) molecular function, biological process, and cellular component terms were performed using the ShinyGO 0.80 [27] tool, and the Kyoto Encyclopedia of Genes and Genomes (KEGG) [28] was used to explore the biological pathways involved. In this analysis, p-values were derived from the hypergeometric distribution:

p=\sum_{k=n}^{a}\frac{\binom{b}{k}\binom{N-b}{a-k}}{\binom{N}{a}}

(2)

Here, n represents the number of DE genes in the gene set, N denotes the total number of genes included in the analysis, b represents the number of genes within the gene set, and a signifies the total number of DE genes.

The p-value in Equation (2) corresponds to a one-sided Fisher's exact test and was corrected for multiple testing with an enhanced Benjamini-Hochberg method. Gene-set enrichment outcomes with an adjusted p < 0.05 were deemed statistically significant.
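A small worked example may help in reading Equation (2). The sketch below evaluates the sum directly with Python's standard library. It takes a to be the total number of DE genes, the reading that the form of the equation implies, and the numbers (N = 20, b = 5, a = 4, n = 2) are hypothetical.

```python
from math import comb

def enrichment_p(N, b, a, n):
    """Upper-tail hypergeometric p-value as written in Equation (2).

    N: genes in the analysis, b: genes in the gene set,
    a: DE genes overall (assumed reading), n: DE genes inside the gene set.
    """
    # comb(b, k) is zero for k > b, so summing up to a is safe.
    numerator = sum(comb(b, k) * comb(N - b, a - k) for k in range(n, a + 1))
    return numerator / comb(N, a)

# Hypothetical numbers: 20 genes, a 5-gene set, 4 DE genes, 2 of them in the set.
p = enrichment_p(N=20, b=5, a=4, n=2)
print(round(p, 4))
```

Here the three terms of the sum are 10·105, 10·15, and 5·1, giving 1205/4845 ≈ 0.2487, so an overlap of 2 genes would not be considered enriched in this toy setting.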

2.7 Key Gene and Regulatory Biomolecule Identification

To identify the key genes in this integrative analysis, we focused on the overlapping and integrative module genes derived from the two branches of our pipeline. We constructed a PPI network of the module genes and then performed topological analyses. To identify significant regulatory biomolecules, namely transcription factors (TFs) and miRNAs, that collectively control key genes at the transcriptional and translational levels, we used the TRRUST v2 [29] and miRTarBase [30] databases through the miRNet 2.0 [31] platform. Only biomolecules with an adjusted p < 0.05 were considered significant.
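The idea of selecting hub genes by consensus across topological measures can be sketched as follows. This is a toy illustration under stated assumptions, not the tool used in the study: the graph and node names are hypothetical, only degree and maximum neighborhood component (MNC) are computed (betweenness is omitted for brevity), and MNC is assumed to be the size of the largest connected component in the subgraph induced by a node's neighbors.

```python
def build_graph(edges):
    """Undirected graph as a dict: node -> set of neighbours."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degree(graph, node):
    return len(graph[node])

def mnc(graph, node):
    """Size of the largest connected component among the neighbours of node."""
    neighbours = graph[node]
    seen, best = set(), 0
    for start in neighbours:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            current = stack.pop()
            size += 1
            # Only follow edges that stay inside the neighbourhood.
            for nxt in graph[current] & neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

def top_k(scores, k):
    """Names of the k highest-scoring nodes (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {name for name, _ in ranked[:k]}

# Hypothetical toy network with one obvious hub, H.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B"), ("B", "C")]
graph = build_graph(edges)

degree_scores = {v: degree(graph, v) for v in graph}
mnc_scores = {v: mnc(graph, v) for v in graph}
consensus = top_k(degree_scores, 2) & top_k(mnc_scores, 2)
print(sorted(consensus))
```

Taking the intersection of the top-ranked nodes under each measure keeps only genes that are central by every criterion, which is the sense of "consensus" used here.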

2.8 Cumulative Survival Analysis

To gain insight into the prognostic value of the identified key genes, an independent array dataset for breast cancer (study ID: brca_metabric), which includes clinical information for 1981 patients, was employed. First, we clustered the key genes based on their expression values using the K-means clustering algorithm in R, grouping them into five clusters (k = 5). For each cluster, patients were divided into low- and high-expression groups based on their mRNA expression levels of the genes in that cluster. Kaplan–Meier (KM) plots were generated for each cluster to visually compare survival outcomes, and clusters with log-rank p < 0.05 were considered statistically significant. Cox proportional hazards regression analysis was also performed to assess the association between the survival time of patients and predictor variables. By combining Kaplan–Meier plots with Cox regression analysis, we ensured a comprehensive evaluation of the survival impact, where the former offered a visual representation of survival differences and the latter provided a quantitative hazard ratio.
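The core of the cumulative approach, scoring each patient on a whole gene cluster and then estimating survival per group, can be sketched in plain Python. This is a simplified illustration, not the R analysis used in the study: the per-patient score is assumed to be the mean expression of the cluster genes, the high/low split is assumed to be at the median, the data are hypothetical, and the log-rank test and Cox regression are not reproduced.

```python
from statistics import median

def cluster_score(expr_by_gene, cluster_genes):
    """Per-patient mean expression across the genes of one cluster (assumed score)."""
    n_patients = len(expr_by_gene[cluster_genes[0]])
    return [
        sum(expr_by_gene[g][p] for g in cluster_genes) / len(cluster_genes)
        for p in range(n_patients)
    ]

def split_high_low(scores):
    """Label each patient 'high' or 'low' relative to the median score (assumed cutoff)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each event time.

    events: 1 = event observed, 0 = censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve

# Hypothetical expression of a two-gene cluster in four patients.
expr_by_gene = {"G1": [1.0, 2.0, 3.0, 4.0], "G2": [3.0, 4.0, 5.0, 6.0]}
groups = split_high_low(cluster_score(expr_by_gene, ["G1", "G2"]))

# Hypothetical follow-up times (months) and event indicators for one group.
curve = kaplan_meier([5, 8, 8, 12, 15, 20], [1, 1, 0, 1, 0, 1])
print(groups)
print(curve)
```

In practice one curve would be estimated for the high group and one for the low group of each cluster, and the two curves compared with a log-rank test.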

3 Results

3.1 Identification of DEGs

TNBC is heterogeneous and is distinguished by the absence of receptor biomarkers. To investigate the involvement of DEGs in TNBC, we selected two publicly available gene expression datasets associated with TNBC that contained both cancerous and normal samples (Table 1). The edgeR pipeline was used to identify the DEGs of the integrated datasets. We found 2595 upregulated and 2001 downregulated genes (Table 2) that were statistically significant with adj p ≤ 0.01.

TABLE 1. Datasets (TNBC).

GEO ID	Platform ID	Number of cancerous samples	Number of normal samples	Library layout	References
GSE58135	GPL11154	35	18	Paired	Varley et al. 2014 [32]
GSE52194	GPL11154	6	3	Paired	Eswaran et al. 2015 [33]

TABLE 2. DEGs analysis.

Merged count matrix	Upregulated genes (LogFC ≥ 1 and adj p ≤ 0.01)	Downregulated genes (LogFC ≤ −1 and adj p ≤ 0.01)
16,384 counts, 61 samples	2595	2001

3.2 Enrichment Analysis

Pathway analysis revealed that the DEGs influence pathways in cancer, including the PI3K-Akt signaling pathway, focal adhesion, cell cycle, MAPK signaling pathway, calcium signaling pathway, and other cancer pathways (Figure 2A). GO biological process analysis revealed that the DEGs influence localization of cells, regulation of cell population proliferation, cell adhesion, cell migration, circulatory system development, and other processes associated with cancer development (Figure 2B). GO molecular function analysis uncovered enzyme regulator activity, signaling receptor binding, cytoskeletal protein binding, protein kinase binding, kinase binding, and calcium ion binding as the molecular activities in which DEG involvement was most statistically significant (Figure 2C). GO cellular component analysis revealed that the intrinsic component of the plasma membrane, integral component of the plasma membrane, plasma membrane region, extracellular matrix, external encapsulating structure, cell surface, and others were the cellular components affected by the DEGs of TNBC (Figure 2D).

3.3 PPI Reconstruction of DEGs

PPI networks for the up- and down-regulated genes were then reconstructed together. The resulting network consists of 1961 nodes (proteins) and 6161 edges (interactions between them) and follows a scale-free topology, in which a few nodes have a higher degree of interaction with other nodes (Figure 3). The interconnections among the cluster genes within the entire network were identified using the Cytoscape plugin MCODE. Seven clusters were identified by MCODE based on its scoring system (cutoff: k-score ≥ 7): Cluster 1 (34 nodes, 489 edges), Cluster 2 (13 nodes, 78 edges), Cluster 3 (21 nodes, 125 edges), Cluster 4 (12 nodes, 61 edges), Cluster 5 (32 nodes, 139 edges), Cluster 6 (9 nodes, 35 edges), and Cluster 7 (57 nodes, 210 edges) (Table 3).

TABLE 3. PPI module genes.

Module name	Score	No of nodes
Cluster1	29.636	34
Cluster2	13	13
Cluster3	12.5	21
Cluster4	11.091	12
Cluster5	8.968	32
Cluster6	8.75	9
Cluster7	7.5	57

3.4 WGCNA Phenotypic Significant Modules Identification

To create a co-expression network from the integrated count matrix of 16,384 genes with cancer as the clinical trait, the samples were first clustered to identify outliers among the selected TNBC samples (Figure 4A,B). A soft-thresholding power of β = 15 (scale-free R² = 0.90) was chosen to ensure a scale-free network (Figure 4C). A dendrogram was created by clustering all the DEGs using the TOM-based dissimilarity measure (1 − TOM). Through hierarchical clustering, 27 modules were identified (Figure 5A), among which 7 had the highest association with cancer and were statistically significant (correlation ≥ 0.90) (Figure 5B, Table 4).

TABLE 4. Phenotypic significant modules.

Module name	Phenotype	Correlation score	No. of genes
Dark green	Cancer	0.94	74
Dark orange	Cancer	0.90	55
Dark red	Cancer	0.93	80
Light yellow	Cancer	0.90	83
Midnight blue	Cancer	0.94	151
Orange	Cancer	0.96	63
Tan	Cancer	0.90	190

3.5 Key Gene Identification

Two genes, PLCG2 and CXCL10, were common to the module genes identified through PPI and WGCNA (Figure 6A). An integrative PPI analysis of the module genes followed by topological analysis identified a further 11 genes (Figure 6B, Table 5) that showed consensus across the degree, betweenness, and maximum neighborhood component (MNC) network topological properties. Together, these 13 genes are referred to as the key genes of our TNBC integrative analysis (Table 6).

TABLE 5. Topological analysis of PPI retrieved key genes.

Gene name	Degree	MNC	Betweenness
CDK1	60	60	18,044
STAT1	41	40	12,817
IL6	50	50
PLK1	59	56	6782
CCNB1	52	52	9414
AURKA	44	44	8720
NDC80	46	46	3002
EGFR	72	65	56,082
IL1B	39	39	3744
FN1	40	39	14,678
BUB1B	50	50	2758

TABLE 6. Retrieved key genes.

Gene symbol	Description	Function
PLCG2	Phospholipase C gamma 2	Hydrolase, transducer
CXCL10	C-X-C motif chemokine ligand 10	Cytokine
CDK1	Cyclin dependent kinase 1	Control eukaryotic cell cycle (G2-M, G1, and G1-S)
STAT1	Signal transducer and activator of transcription 1	Activator, DNA-binding
IL6	Interleukin 6	Cytokine, growth factor
PLK1	Polo like kinase 1	Kinase, transferase
CCNB1	Cyclin B1
AURKA	Aurora kinase A	Cytokinesis, cell cycle progression
NDC80	NDC80 kinetochore complex component	Chromosome segregation, spindle checkpoint activity
EGFR	Epidermal growth factor receptor	Transferase, receptor, kinase
IL1B	Interleukin 1 beta	Cytokine, mitogen, pyrogen
FN1	Fibronectin 1	Heparin-binding
BUB1B	BUB1 mitotic checkpoint serine/threonine kinase B	Kinase, transferase

3.6 Regulatory Biomolecules of Breast Cancer

We studied transcriptional and translational regulatory networks and identified 6 TFs (E2F3, E2F1, TP53, STAT1, NFKB1, and RELA) (Table 7) and 20 reporter miRNAs (Table 8) that were significantly associated with the key genes of TNBC (Figure 7).

TABLE 7. Identified TFs and their association with human disease.

TFs	Description	Associated with human disease
E2F3	E2F transcription factor 3	Dysregulated E2F3 has been associated with breast and other gynecological cancers [34]
E2F1	E2F transcription factor 1	Overexpressed E2F1, implicated in the cell cycle, has been associated with gynecological and other cancers [35]
TP53	Tumor protein P53	Mutations in the TP53 gene have been associated with early-onset breast cancer and other cancers [36]
NFKB1	Nuclear factor kappa B subunit 1	The NF-kappaB pathway appears to play a major role in inflammatory BC [37]
STAT1	Signal transducer and activator of transcription 1	STAT1-associated immune system alterations have been found to contribute to adult glioma [38]
RELA	RELA proto-oncogene, NF-kB subunit	Upregulation of RELA has been identified as a key promoter of progression in oral cancer and other types of cancer [39]

TABLE 8. Twenty miRNAs and their association with human disease.

miRNAs	Associated with human disease
hsa-mir-34a-5p, hsa-mir-16-5p, hsa-mir-1-3p	Identified as key regulators in all BC subtypes [40]
hsa-mir-130a-3p	Identified as a potential post-transcriptional regulator in TNBC [41]
hsa-mir-449b-5p	Identified as a potential biomarker for pancreatic and other types of cancer [42]
hsa-let-7b-5p, hsa-mir-26a-5p, hsa-mir-155-5p	Implicated in breast tumor formation and progression [43-45]
hsa-mir-7-5p, hsa-mir-449a	Identified as crucial regulators in various cancer subtypes, including lung cancer and other breast cancer subtypes [46]
hsa-mir-24-3p, hsa-mir-212-3p, hsa-mir-21-5p, hsa-mir-210-3p, hsa-mir-20a-5p	Associated with liver diseases, epilepsy, and other subtypes of breast cancer [47-51]
hsa-mir-335-5p, hsa-mir-27a-3p, hsa-mir-429	Associated with colorectal cancer and other cancers [52]
hsa-let-7e-5p, hsa-mir-214-3p	Identified as potential biomarkers for rectal carcinoma and thyroid [53, 54]

3.7 Cross-Validation With TCGA

The UALCAN integrative cancer data analysis portal (http://ualcan.path.uab.edu) was used to analyze the expression levels of the key genes in normal and cancerous samples from patients with TNBC, using data from the TCGA breast invasive carcinoma (BRCA) cohort. In total, there were 1211 samples, of which 114 were normal and 116 were categorized as TNBC based on the immunohistochemical status of ER, PR, and HER2. This comparison showed that 10 of the 13 key genes were differentially expressed in the direction consistent with our analysis (Figure 8A–M). TPM (transcripts per million) values for each gene in every sample were derived by multiplying the scaled estimate value by 1,000,000.

3.8 Cumulative Survival Analysis

To evaluate the collective impact of the key genes on disease progression, K-means clustering was first applied to group the key genes into five clusters based on their expression (Figure S2). To assess the strength of the relationships between genes in each cluster, Pearson's correlation analysis was conducted, and correlation plots were generated for each cluster containing at least two genes (Figure S3). The survival analysis for Cluster 1, which includes all the kinases, showed a p-value of 0.0008 and a hazard ratio (HR) of 1.262. Cluster 2, consisting of growth factor genes, had a p-value of 0.0004 and an HR of 1.383. Clusters 3 and 4, each containing a single gene (EGFR and PLCG2, respectively), had p-values of 0.001 and 0.01, with HRs of 0.8266 and 0.863. Cluster 5, consisting of interleukins, had a p-value of 0.08 and an HR of 0.881 (Figure 9A–E). This implies that high expression of the kinases is associated with a lower probability of survival over time, with a significant hazard ratio. A similar trend was observed in Cluster 2, where a notable difference in survival was seen between low and high gene expression. However, Clusters 3 and 4, which contain individual genes, and Cluster 5, which contains two genes, did not exhibit such marked differences in survival based on their expression. We observed some statistically significant values when we performed survival analysis for the individual genes in Cluster 1; however, the resulting survival curves were not as significant as those from the cumulative analysis (Figure S4). Thus, although implemented here on small-scale datasets, this novel cumulative survival analysis could be a useful analytical approach for large-scale data to elucidate the impact of genetic complexity on patient survival.

4 Discussion

In recent years, a large number of important studies have addressed TNBC prevention, diagnosis, and treatment [2]. However, the molecular mechanisms regulating TNBC remain complex and poorly understood, and biomarkers for early-stage diagnosis of the disease are still lacking [4]. The advancement of next-generation sequencing platforms enables the identification of various molecular features of genes, including alternatively spliced transcripts, post-transcriptional modifications, gene fusions, mutations/single-nucleotide polymorphisms (SNPs), and changes in the transcriptome [6-8]. Integrating RNA-seq data from multiple platforms allows researchers to capture a broader spectrum of gene expression patterns, enhancing the robustness and accuracy of biomarker discovery [13]. WGCNA has emerged as a powerful tool for identifying modules of co-expressed genes associated with clinical phenotypes [24]. By correlating module eigengenes or individual gene expression profiles with clinical phenotypes such as disease status, sex, age, condition, or treatment response, researchers can identify modules or individual genes that are significantly associated with the phenotype of interest [55]. PPI network analysis helps identify highly connected proteins (hubs) and densely interconnected protein clusters (modules) within the network [56, 57]. DEGs that are part of these hubs or modules may play critical roles in disease progression or other biological processes [58]. Therefore, through the integration of WGCNA and PPI analysis, we aimed to identify key genes that could serve as potential diagnostic and prognostic biomarkers for early-stage disease diagnosis (Figure 1). In the DEG analysis, we identified 2595 upregulated and 2001 downregulated genes (Table 2). Subsequently, the PPI network was reconstructed using these dysregulated genes, and topological properties were calculated to confirm that the network is scale-free (Figure 3 and Figure S1).
Understanding the specific interactions between proteins and the formation of protein complexes is essential for advancing our knowledge of biological processes and developing new therapies for diseases [56, 57]. PPI networks are also an essential tool for identifying new drug targets and developing new therapeutic strategies [59]. Within this network, modules of co-expressed genes were identified, revealing potential functional clusters and key regulatory pathways associated with the observed gene expression changes. In the second branch of the pipeline, WGCNA, we identified seven significant modules (Table 4) with correlation scores of at least 0.90, each strongly associated with tumor phenotypic properties. A total of 13 key genes (Table 6) associated with TNBC were identified through consensus and integrated analysis, followed by PPI network reconstruction and hub gene identification (Table 5) for the module genes retrieved through the two branches of the pipeline. During the overlapping analysis, we identified two genes, PLCG2 and CXCL10 (Figure 6A), that were consistently found by both module identification methods. When we conducted an integrated analysis of the module genes identified from both branches, followed by PPI and hub gene identification, we found a total of 11 genes, CDK1, STAT1, IL6, PLK1, CCNB1, AURKA, NDC80, EGFR, IL1B, FN1, and BUB1B (Figure 6B, Table 5), associated with TNBC. Hub proteins are those with the most connections and are required for the PPI network to function [58]. Topological parameters such as degree, betweenness, and MNC provide valuable insights into the associations among edges and nodes within a network, elucidating the network's structure and identifying critical nodes with high centrality [38]. PLCG2, a crucial enzyme in transmembrane signaling [60], has been implicated in breast and other cancers [61], yet its specific association with TNBC remains unclear.
CXCL10, involved in processes such as regulation of cell growth, differentiation, chemotaxis, and activation of peripheral immune cells, plays crucial roles in cancer-specific pathways [62]. It has also been identified as a potential predictive biomarker for TNBC in other studies [34]. CCNB1, PLK1, CDK1, BUB1B, AURKA, and NDC80 are genes involved in various stages of the cell cycle process, which is crucial from a cancer perspective [35, 60]. These genes play significant roles in regulating cell division and are often dysregulated in cancer, making them potential targets for therapeutic intervention or biomarkers for diagnosis and prognosis. In other similar integrative bioinformatics studies, CCNB1, PLK1, CDK1, BUB1B, and AURKA have been recognized as potential hub genes [10-12]. IL-6 and IL-1β play diverse roles in biological functions, including immunity, tissue regeneration, and acting as potential pro-inflammatory cytokines [36, 37]. The EGFR gene encodes a cell surface receptor involved in regulating cell growth, proliferation, and survival [39]. Activation by ligands triggers signaling pathways influencing cell division, migration, differentiation, and apoptosis [63]. Dysregulation or mutations in EGFR contribute to cancer development, making it a target for cancer therapies, including EGFR inhibitors [39, 63]. STAT1 regulates immune responses by activating genes involved in defense against pathogens and anti-tumor immunity [64]. It also contributes to cellular differentiation, development, and homeostasis, but dysregulation can lead to autoimmune diseases, immunodeficiency disorders, and cancer [40, 41]. The FN1 gene encodes a glycoprotein crucial for cell adhesion, migration, tissue remodeling, and wound healing [42]. Its dysregulation is implicated in various pathological conditions, including cancer [43] and fibrosis [44]. 
Further, TFs and miRNAs were identified that control the key genes associated with TNBC at the transcriptional and translational levels (Figure 7). The TFs E2F3, E2F1, TP53, STAT1, NFKB1, and RELA (Table 7) have been reported as dysregulated in nearly all gynecological cancers and several other malignancies [41, 45-49]. miRNAs, a family of small non-coding RNAs that regulate a wide array of biological processes, including carcinogenesis, are heavily dysregulated in cancer cells [50]. They can regulate breast cancer initiation and progression in different BC subtypes; therefore, they can be used as potential biomarkers [51]. In the current study, 20 reporter miRNAs (Table 8) were identified as significantly associated with BC. The hsa-mir-34a-5p, hsa-mir-16-5p, and hsa-mir-1-3p miRNAs have been found to be key regulators in all BC subtypes [52]. The hsa-mir-130a-3p miRNA has been identified as a post-transcriptional regulator in TNBC [11]. The hsa-mir-449b-5p miRNA has been shown to be a potential biomarker for pancreatic and other types of cancer [53]. However, its relationship with TNBC and other subtypes of BC remains unexplored. The miRNAs hsa-let-7b-5p, hsa-mir-26a-5p, and hsa-mir-155-5p have been implicated in breast tumor formation and progression; however, their specific role in TNBC remains to be investigated [54, 65, 66]. The hsa-mir-7-5p and hsa-mir-449a miRNAs have been identified as crucial regulators in various cancer subtypes, including lung cancer and other BC subtypes [67]. The miRNAs hsa-mir-24-3p, hsa-mir-212-3p, hsa-mir-21-5p, hsa-mir-210-3p, and hsa-mir-20a-5p have been linked to liver diseases, epilepsy, and other subtypes of breast cancer [68-72]. However, their potential association with TNBC needs to be explored. The hsa-mir-335-5p, hsa-mir-27a-3p, and hsa-mir-429 miRNAs have been associated with colorectal cancer and other cancers [73]. The hsa-let-7e-5p and hsa-mir-214-3p miRNAs have been implicated as potential biomarkers for rectal carcinoma and thyroid [74, 75].

According to the cumulative survival analysis (Figure 9A–E), the retrieved key genes have a high potential to serve as prognostic biomarkers in TNBC. Survival analysis, also known as time-to-event analysis, estimates the time until a particular event occurs and provides tools to estimate the survival probability of patients over time. With advances in high-throughput sequencing, gene expression data have become an invaluable resource in this field. Cumulative survival analysis assesses survival outcomes associated with varying expression of gene sets involved in critical biological pathways, which broadens its scope for industrial applications. Traditional survival analysis, by contrast, typically evaluates the prognostic power of single genes over time and often falls short of explaining the mechanisms behind disease progression. Because the cumulative approach can identify multiple gene targets within disrupted pathways, it may facilitate the development of more effective drugs to improve survival rates. Consequently, this study offers an opportunity to explore significant TNBC biomarkers in future research, which can then be validated with bench-top experiments.
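The contrast between single-gene and gene-set survival analysis can be illustrated with a minimal sketch. It assumes, purely for illustration, that a gene set is summarized per patient by the mean expression of its member genes and that patients are split at the median score; neither choice is claimed to be the procedure used in this study. The function names (`kaplan_meier`, `gene_set_score`, `split_by_median`) are hypothetical, and the expression values and follow-up times are made-up toy numbers, not study data.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival probability) at each event time.

    times  -- follow-up duration per patient
    events -- 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def gene_set_score(expression, gene_set):
    """Assumed scoring: mean expression over the genes in the set."""
    return sum(expression[g] for g in gene_set) / len(gene_set)


def split_by_median(patients, gene_set):
    """Split patients into high and low groups at the median gene-set score."""
    scores = [gene_set_score(p["expr"], gene_set) for p in patients]
    cutoff = median(scores)
    high = [p for p, s in zip(patients, scores) if s > cutoff]
    low = [p for p, s in zip(patients, scores) if s <= cutoff]
    return high, low


# Toy cohort (invented values): expression of three cell-cycle key genes,
# follow-up time in months, and event indicator.
patients = [
    {"expr": {"CDK1": 8.1, "PLK1": 7.9, "CCNB1": 8.4}, "time": 12, "event": 1},
    {"expr": {"CDK1": 7.6, "PLK1": 8.2, "CCNB1": 7.8}, "time": 20, "event": 1},
    {"expr": {"CDK1": 5.2, "PLK1": 4.9, "CCNB1": 5.5}, "time": 48, "event": 0},
    {"expr": {"CDK1": 4.8, "PLK1": 5.1, "CCNB1": 5.0}, "time": 60, "event": 0},
    {"expr": {"CDK1": 6.9, "PLK1": 7.2, "CCNB1": 7.0}, "time": 30, "event": 1},
    {"expr": {"CDK1": 5.9, "PLK1": 5.6, "CCNB1": 6.1}, "time": 55, "event": 1},
]
gene_set = ["CDK1", "PLK1", "CCNB1"]

high, low = split_by_median(patients, gene_set)
for label, group in (("high", high), ("low", low)):
    curve = kaplan_meier([p["time"] for p in group], [p["event"] for p in group])
    print(label, curve)
```

In practice, the two curves would be compared with a log-rank test on a real cohort; a dedicated survival library would normally replace the hand-written estimator shown here.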

5 Conclusion

Despite significant advancements in research to identify key genes and biomarkers for early detection, TNBC remains a challenging disease. Through this RNA-seq integrative analysis, we have identified key genes significantly associated with TNBC, highlighting their relevance not only to women-specific cancers but also to other cancer types. Further downstream analyses, including the identification of regulatory biomolecules such as TFs and miRNAs, collectively referred to as reporter biomolecules, as well as gene-set enrichment, novel cumulative survival, and validation analyses, provide valuable diagnostic and prognostic insights. These findings suggest the potential therapeutic utility of these genes and their associated biomolecules. Thus, developing these biomolecules further for experimental research could result in a novel treatment for TNBC.

Author Contributions

Pooja Singh and Pallavi Somvanshi acquired data, analyzed and interpreted data, and drafted the manuscript. Pooja Singh contributed to study concepts and design, data acquisition, analysis, and interpretation of data. Pallavi Somvanshi and Rupesh Chaturvedi contributed to data interpretation and manuscript drafting. All the authors reviewed and approved the manuscript.

Acknowledgments

Pallavi Somvanshi is grateful to SC&IS, Jawaharlal Nehru University, for the facility and all the requisite support.

Ethics Statement

The authors have nothing to report.

Consent

The authors have nothing to report.

Conflicts of Interest

The authors declare no conflicts of interest.

Open Research

Data Availability Statement

The authors have nothing to report.

References

1. E. T. Sedeta, B. Jobre, and B. Avezbakiyev, “Breast Cancer: Global Patterns of Incidence, Mortality, and Trends,” Journal of Clinical Oncology 41, no. 16_suppl (2023): 10528, https://doi.org/10.1200/JCO.2023.41.16_suppl.10528.
2. P. Kumar and R. Aggarwal, “An Overview of Triple-Negative Breast Cancer,” Archives of Gynecology and Obstetrics 293, no. 2 (2016): 247–269, https://doi.org/10.1007/s00404-015-3859-y.
3. O. Engebraaten, H. K. M. Vollan, and A. L. Børresen-Dale, “Triple-Negative Breast Cancer and the Need for New Therapeutic Targets,” American Journal of Pathology 183, no. 4 (2013): 1064–1074, https://doi.org/10.1016/j.ajpath.2013.05.033.
4. O. Obidiro, G. Battogtokh, and E. O. Akala, “Triple Negative Breast Cancer Treatment Options and Limitations: Future Outlook,” Pharmaceutics 15, no. 7 (2023): 1796, https://doi.org/10.3390/pharmaceutics15071796.
5. N. Carlomagno, P. Incollingo, V. Tammaro, et al., “Diagnostic, Predictive, Prognostic, and Therapeutic Molecular Biomarkers in Third Millennium: A Breakthrough in Gastric Cancer,” BioMed Research International 2017 (2017): 1–11, https://doi.org/10.1155/2017/7869802.
6. C. Edeki, “Comparative Study of Microarray and Next Generation Sequencing Technologies,” International Journal of Computer Science and Mobile Computing 1, no. 1 (2012): 15–20.
7. B. T. Wilhelm and J. R. Landry, “RNA-Seq—Quantitative Measurement of Expression Through Massively Parallel RNA-Sequencing,” Methods 48, no. 3 (2009): 249–257, https://doi.org/10.1016/j.ymeth.2009.03.016.
8. S. Serratì, S. de Summa, B. Pilato, et al., “Next-Generation Sequencing: Advances and Applications in Cancer Diagnosis,” Oncotargets and Therapy 9 (2016): 7355–7365, https://doi.org/10.2147/OTT.S99807.
9. D. L. Chen, J. H. Cai, and C. C. N. Wang, “Identification of Key Prognostic Genes of Triple Negative Breast Cancer by LASSO-Based Machine Learning and Bioinformatics Analysis,” Genes 13, no. 5 (2022): 902, https://doi.org/10.3390/genes13050902.
10. L. M. Wei, X. Y. Li, Z. M. Wang, et al., “Identification of Hub Genes in Triple-Negative Breast Cancer by Integrated Bioinformatics Analysis,” Gland Surgery 10, no. 2 (2021): 799–806, https://doi.org/10.21037/gs-21-17.
11. M. S. Alam, A. Sultana, G. Wang, and M. N. Haque Mollah, “Gene Expression Profile Analysis to Discover Molecular Signatures for Early Diagnosis and Therapies of Triple-Negative Breast Cancer,” Frontiers in Molecular Biosciences 9 (2022): 1049741, https://doi.org/10.3389/fmolb.2022.1049741.
12. D. Tierno, G. Grassi, S. Scomersi, et al., “Next-Generation Sequencing and Triple-Negative Breast Cancer: Insights and Applications,” International Journal of Molecular Sciences 24, no. 11 (2023): 24119688, https://doi.org/10.3390/ijms24119688.
13. D. Castillo, J. M. Gálvez, L. J. Herrera, B. S. Román, F. Rojas, and I. Rojas, “Integration of RNA-Seq Data With Heterogeneous Microarray Data for Breast Cancer Profiling,” BMC Bioinformatics 18, no. 1 (2017): 506, https://doi.org/10.1186/s12859-017-1925-0.
14. A. S. Nangraj, G. Selvaraj, S. Kaliamurthi, A. C. Kaushik, W. C. Cho, and D. Q. Wei, “Integrated PPI- and WGCNA-Retrieval of Hub Gene Signatures Shared Between Barrett's Esophagus and Esophageal Adenocarcinoma,” Frontiers in Pharmacology 11 (2020): 881, https://doi.org/10.3389/fphar.2020.00881.
15. M. R. Rahman, T. Islam, E. Gov, et al., “Identification of Prognostic Biomarker Signatures and Candidate Drugs in Colorectal Cancer: Insights From Systems Biology Analysis,” Medicina 55, no. 1 (2019): 20, https://doi.org/10.3390/medicina55010020.
16. M. Z. Malik, K. Chirom, S. Ali, R. Ishrat, P. Somvanshi, and R. K. B. Singh, “Methodology of Predicting Novel Key Regulators in Ovarian Cancer Network: A Network Theoretical Approach,” BMC Cancer 19, no. 1 (2019): 1129, https://doi.org/10.1186/s12885-019-6309-6.
17. E. Clough and T. Barrett, The Gene Expression Omnibus Database, (2016), 93–110, https://doi.org/10.1007/978-1-4939-3578-9_5.
18. “S.A. FastQC: A Quality Control Tool for High Throughput Sequence Data,” (2010), http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
19. A. M. Bolger, M. Lohse, and B. Usadel, “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data,” Bioinformatics 30, no. 15 (2014): 2114–2120, https://doi.org/10.1093/bioinformatics/btu170.
20. A. Dobin, C. A. Davis, F. Schlesinger, et al., “STAR: Ultrafast Universal RNA-seq Aligner,” Bioinformatics 29, no. 1 (2013): 15–21, https://doi.org/10.1093/bioinformatics/bts635.
21. Y. Liao, G. K. Smyth, and W. Shi, “featureCounts: An Efficient General Purpose Program for Assigning Sequence Reads to Genomic Features,” Bioinformatics 30, no. 7 (2014): 923–930, https://doi.org/10.1093/bioinformatics/btt656.
22. K. D. Hansen, R. A. Irizarry, and Z. Wu, “Removing Technical Variability in RNA-Seq Data Using Conditional Quantile Normalization,” Biostatistics 13, no. 2 (2012): 204, https://doi.org/10.1093/biostatistics/kxr054.
23. M. D. Robinson, D. J. McCarthy, and G. K. Smyth, “edgeR: A Bioconductor Package for Differential Expression Analysis of Digital Gene Expression Data,” Bioinformatics 26, no. 1 (2010): 139–140, https://doi.org/10.1093/bioinformatics/btp616.
24. P. Langfelder and S. Horvath, “WGCNA: An R Package for Weighted Correlation Network Analysis,” BMC Bioinformatics 9, no. 1 (2008): 559, https://doi.org/10.1186/1471-2105-9-559.
25. D. Szklarczyk, A. L. Gable, K. C. Nastou, et al., “The STRING Database in 2021: Customizable Protein–Protein Networks, and Functional Characterization of User-Uploaded Gene/Measurement Sets,” Nucleic Acids Research 49, no. D1 (2021): D605–D612, https://doi.org/10.1093/nar/gkaa1074.
26. P. Shannon, A. Markiel, O. Ozier, et al., “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks,” Genome Research 13, no. 11 (2003): 2498–2504, https://doi.org/10.1101/gr.1239303.
27. S. X. Ge, D. Jung, and R. Yao, “ShinyGO: A Graphical Gene-Set Enrichment Tool for Animals and Plants,” Bioinformatics 36, no. 8 (2020): 2628–2629, https://doi.org/10.1093/bioinformatics/btz931.
28. M. Kanehisa, “KEGG: Kyoto Encyclopedia of Genes and Genomes,” Nucleic Acids Research 28, no. 1 (2000): 27–30, https://doi.org/10.1093/nar/28.1.27.
29. H. Han, J. W. Cho, S. Lee, et al., “TRRUST v2: An Expanded Reference Database of Human and Mouse Transcriptional Regulatory Interactions,” Nucleic Acids Research 46, no. D1 (2018): D380–D386, https://doi.org/10.1093/nar/gkx1013.
30. H. Y. Huang, Y. C. D. Lin, S. Cui, et al., “miRTarBase Update 2022: An Informative Resource for Experimentally Validated miRNA–Target Interactions,” Nucleic Acids Research 50, no. D1 (2022): D222–D230, https://doi.org/10.1093/nar/gkab1079.
31. L. Chang, G. Zhou, O. Soufan, and J. Xia, “miRNet 2.0: Network-Based Visual Analytics for miRNA Functional Analysis and Systems Biology,” Nucleic Acids Research 48, no. W1 (2020): W244–W251, https://doi.org/10.1093/nar/gkaa467.
32. K. E. Varley, J. Gertz, J. Varley, B. S. Roberts, et al., “Recurrent Read-Through Fusion Transcripts in Breast Cancer,” Breast Cancer Research and Treatment 146, no. 2 (2014): 287–297, https://doi.org/10.1007/s10549-014-3019-2.
33. J. Eswaran, D. Cyanam, P. Mudvari, et al., “Transcriptomic Landscape of Breast Cancers Through mRNA Sequencing,” Scientific Reports 2 (2012): 264, https://doi.org/10.1038/srep00264.
34. T. Chuan, T. Li, and C. Yi, “Identification of CXCR4 and CXCL10 as Potential Predictive Biomarkers in Triple Negative Breast Cancer (TNBC),” Medical Science Monitor 26 (2020): e918281, https://doi.org/10.12659/MSM.918281.
35. H. A. Lane and E. A. Nigg, “Antibody Microinjection Reveals an Essential Role for Human Polo-Like Kinase 1 (Plk1) in the Functional Maturation of Mitotic Centrosomes,” Journal of Cell Biology 135, no. 6 Pt 2 (1996): 1701–1713, https://doi.org/10.1083/jcb.135.6.1701.
36. T. Tanaka, M. Narazaki, and T. Kishimoto, “IL-6 in Inflammation, Immunity, and Disease,” Cold Spring Harbor Perspectives in Biology 6, no. 10 (2014): a016295, https://doi.org/10.1101/cshperspect.a016295.
37. D. Briukhovetska, J. Dörr, S. Endres, P. Libby, C. A. Dinarello, and S. Kobold, “Interleukins in Cancer: From Biology to Therapy,” Nature Reviews. Cancer 21, no. 8 (2021): 481–499, https://doi.org/10.1038/s41568-021-00363-z.
38. W. Winterbach, P. V. Mieghem, M. Reinders, H. Wang, and D. d. Ridder, “Topology of Molecular Interaction Networks,” BMC Systems Biology 7, no. 1 (2013): 90, https://doi.org/10.1186/1752-0509-7-90.
39. B. Rude Voldborg, L. Damstrup, M. Spang-Thomsen, and H. Skovgaard Poulsen, “Epidermal Growth Factor Receptor (EGFR) and EGFR Mutations, Function and Possible Role in Clinical Trials,” Annals of Oncology 8, no. 12 (1997): 1197–1206, https://doi.org/10.1023/A:1008209720526.
40. N. Sharfe, A. Nahum, A. Newell, et al., “Fatal Combined Immunodeficiency Associated With Heterozygous Mutation in STAT1,” Journal of Allergy and Clinical Immunology 133, no. 3 (2014): 807–817, https://doi.org/10.1016/j.jaci.2013.09.032.
41. X. Li, F. Wang, X. Xu, J. Zhang, and G. Xu, “The Dual Role of STAT1 in Ovarian Cancer: Insight Into Molecular Mechanisms and Application Potentials,” Frontiers in Cell and Developmental Biology 9 (2021): 636595, https://doi.org/10.3389/fcell.2021.636595.
42. R. J. Owens and F. E. Baralle, “Mapping the Collagen-Binding Site of Human Fibronectin by Expression in Escherichia Coli,” EMBO Journal 5, no. 11 (1986): 2825–2830, https://doi.org/10.1002/j.1460-2075.1986.tb04575.x.
43. S. L. Schor and A. M. Schor, “Phenotypic and Genetic Alterations in Mammary Stroma: Implications for Tumour Progression,” Breast Cancer Research 3, no. 6 (2001): 373–379, https://doi.org/10.1186/bcr325.
44. Y. Zhang, T. Gu, S. Xu, J. Wang, and X. Zhu, “Anti-Liver Fibrosis Role of miRNA-96-5p via Targeting FN1 and Inhibiting ECM-Receptor Interaction Pathway,” Applied Biochemistry and Biotechnology 195, no. 11 (2023): 6840–6855, https://doi.org/10.1007/s12010-023-04385-1.
45. L. Wei, Y. Bai, L. Na, Y. Sun, C. Zhao, and W. Wang, “E2F3 Induces DNA Damage Repair, Stem-Like Properties and Therapy Resistance in Breast Cancer,” Biochimica et Biophysica Acta - Molecular Basis of Disease 1869, no. 8 (2023): 166816, https://doi.org/10.1016/j.bbadis.2023.166816.
46. J. M. Cunningham, R. A. Vierkant, T. A. Sellers, et al., “Cell Cycle Genes and Ovarian Cancer Susceptibility: A tagSNP Analysis,” British Journal of Cancer 101, no. 8 (2009): 1461–1468, https://doi.org/10.1038/sj.bjc.6605284.
47. F. Lalloo, J. Varley, A. Moran, et al., “BRCA1, BRCA2 and TP53 Mutations in Very Early-Onset Breast Cancer With Associated Risks to Relatives,” European Journal of Cancer 42, no. 8 (2006): 1143–1150, https://doi.org/10.1016/j.ejca.2005.11.032.
48. F. Lerebours, S. Vacher, C. Andrieu, et al., “NF-Kappa B Genes Have a Major Role in Inflammatory Breast Cancer,” BMC Cancer 8 (2008): 41, https://doi.org/10.1186/1471-2407-8-41.
49. K. Yang, J. Zhao, S. Liu, and S. Man, “RELA Promotes the Progression of Oral Squamous Cell Carcinoma via TFAP2A-Wnt/β-Catenin Signaling,” Molecular Carcinogenesis 62, no. 5 (2023): 641–651, https://doi.org/10.1002/mc.23512.
50. Y. Peng and C. M. Croce, “The Role of MicroRNAs in Human Cancer,” Signal Transduction and Targeted Therapy 1, no. 1 (2016): 15004, https://doi.org/10.1038/sigtrans.2015.4.
51. H. Y. Loh, B. P. Norman, K. S. Lai, N. M. A. N. A. Rahman, N. B. M. Alitheen, and M. A. Osman, “The Regulatory Role of MicroRNAs in Breast Cancer,” International Journal of Molecular Sciences 20, no. 19 (2019): 4940, https://doi.org/10.3390/ijms20194940.
52. M. S. Alam, A. Sultana, H. Sun, et al., “Bioinformatics and Network-Based Screening and Discovery of Potential Molecular Targets and Small Molecular Drugs for Breast Cancer,” Frontiers in Pharmacology 13 (2022): 942126, https://doi.org/10.3389/fphar.2022.942126.
53. C. Wang, H. Cai, Q. Cai, et al., “Circulating microRNAs in Association With Pancreatic Cancer Risk Within 5 Years,” International Journal of Cancer 155, no. 3 (2024): 519–531, https://doi.org/10.1002/ijc.34956.
54. X. Yang, Y. Tao, Y. Xu, W. Cai, and Q. Shao, “SLC35A2 Expression Drives Breast Cancer Progression via ERK Pathway Activation,” FEBS Journal 291, no. 7 (2024): 1483–1505, https://doi.org/10.1111/febs.17044.
55. G. Zheng, C. Zhang, and C. Zhong, “Identification of Potential Prognostic Biomarkers for Breast Cancer Using WGCNA and PPI Integrated Techniques,” Annals of Diagnostic Pathology 50 (2021): 151675, https://doi.org/10.1016/j.anndiagpath.2020.151675.
56. A. L. Barabási, N. Gulbahce, and J. Loscalzo, “Network Medicine: A Network-Based Approach to Human Disease,” Nature Reviews. Genetics 12, no. 1 (2011): 56–68, https://doi.org/10.1038/nrg2918.
57. G. Fiscon, F. Conte, L. Farina, and P. Paci, “Network-Based Approaches to Explore Complex Biological Systems Towards Network Medicine,” Genes 9, no. 9 (2018): 437, https://doi.org/10.3390/genes9090437.
58. X. He and J. Zhang, “Why Do Hubs Tend to Be Essential in Protein Networks?,” PLoS Genetics 2, no. 6 (2006): e88, https://doi.org/10.1371/journal.pgen.0020088.
59. Z. Ding and D. Kihara, “Computational Identification of Protein-Protein Interactions in Model Plant Proteomes,” Scientific Reports 9, no. 1 (2019): 8740, https://doi.org/10.1038/s41598-019-45072-8.
60. Q. Zhou, G. S. Lee, J. Brady, et al., “A Hypermorphic Missense Mutation in PLCG2, Encoding Phospholipase Cγ2, Causes a Dominantly Inherited Autoinflammatory Disease With Immunodeficiency,” American Journal of Human Genetics 91, no. 4 (2012): 713–720, https://doi.org/10.1016/j.ajhg.2012.08.006.
61. L. C. Walker, N. Waddell, A. Ten Haaf, kConFab Investigators, S. Grimmond, and A. B. Spurdle, “Use of Expression Data and the CGEMS Genome-Wide Breast Cancer Association Study to Identify Genes That May Modify Risk in BRCA1/2 Mutation Carriers,” Breast Cancer Research and Treatment 112, no. 2 (2008): 229–236, https://doi.org/10.1007/s10549-007-9848-5.
62. A. L. Angiolillo, C. Sgadari, D. D. Taub, et al., “Human Interferon-Inducible Protein 10 Is a Potent Inhibitor of Angiogenesis In Vivo,” Journal of Experimental Medicine 182, no. 1 (1995): 155–162, https://doi.org/10.1084/jem.182.1.155.
63. Z. Du and C. M. Lovly, “Mechanisms of Receptor Tyrosine Kinase Activation in Cancer,” Molecular Cancer 17, no. 1 (2018): 58, https://doi.org/10.1186/s12943-018-0782-4.
64. D. Ungureanu, S. Vanhatupa, N. Kotaja, et al., “PIAS Proteins Promote SUMO-1 Conjugation to STAT1,” Blood 102, no. 9 (2003): 3311–3313, https://doi.org/10.1182/blood-2002-12-3816.
65. B. Vastrad, C. Vastrad, A. Tengli, and S. Iliger, “Identification of Differentially Expressed Genes Regulated by Molecular Signature in Breast Cancer-Associated Fibroblasts by Bioinformatics Analysis,” Archives of Gynecology and Obstetrics 297, no. 1 (2018): 161–183, https://doi.org/10.1007/s00404-017-4562-y.
66. B. Pasculli, R. Barbano, A. Fontana, et al., “Hsa-miR-155-5p Up-Regulation in Breast Cancer and Its Relevance for Treatment With Poly[ADP-Ribose] Polymerase 1 (PARP-1) Inhibitors,” Frontiers in Oncology 10 (2020): 1415, https://doi.org/10.3389/fonc.2020.01415.
67. A. Sultana, M. S. Alam, X. Liu, et al., “Single-Cell RNA-Seq Analysis to Identify Potential Biomarkers for Diagnosis, and Prognosis of Non-Small Cell Lung Cancer by Using Comprehensive Bioinformatics Approaches,” Translational Oncology 27 (2023): 101571, https://doi.org/10.1016/j.tranon.2022.101571.
68. Q. L. Chen, C. F. Xie, K. L. Feng, et al., “microRNAs Carried by Exosomes Promote Epithelial-Mesenchymal Transition and Metastasis of Liver Cancer Cells,” American Journal of Translational Research 12, no. 10 (2020): 6811–6826, http://www.ncbi.nlm.nih.gov/pubmed/33194074.
69. S. Haenisch, Y. Zhao, A. Chhibber, et al., “SOX11 Identified by Target Gene Evaluation of miRNAs Differentially Expressed in Focal and Non-Focal Brain Tissue of Therapy-Resistant Epilepsy Patients,” Neurobiology of Disease 77 (2015): 127–140, https://doi.org/10.1016/j.nbd.2015.02.025.
70. M. Liu, F. Mo, X. Song, et al., “Exosomal Hsa-miR-21-5p Is a Biomarker for Breast Cancer Diagnosis,” PeerJ 9 (2021): e12147, https://doi.org/10.7717/peerj.12147.
71. B. Pasculli, R. Barbano, M. Rendina, et al., “Hsa-miR-210-3p Expression in Breast Cancer and Its Putative Association With Worse Outcome in Patients Treated With Docetaxel,” Scientific Reports 9, no. 1 (2019): 14913, https://doi.org/10.1038/s41598-019-51581-3.
72. B. Raju, G. Narendra, H. Verma, and O. Silakari, “Identification of Chemoresistance Associated Key Genes-miRNAs-TFs in Docetaxel Resistant Breast Cancer by Bioinformatics Analysis,” 3 Biotech 14, no. 5 (2024): 128, https://doi.org/10.1007/s13205-024-03971-2.
73. M. A. Horaira, M. A. Islam, M. K. Kibria, M. J. Alam, S. R. Kabir, and M. N. H. Mollah, “Bioinformatics Screening of Colorectal-Cancer Causing Molecular Signatures Through Gene Expression Profiles to Discover Therapeutic Targets and Candidate Agents,” BMC Medical Genomics 16, no. 1 (2023): 64, https://doi.org/10.1186/s12920-023-01488-w.
74. W. Chen, G. Lin, Y. Yao, et al., “MicroRNA Hsa-Let-7e-5p as a Potential Prognosis Marker for Rectal Carcinoma With Liver Metastases,” Oncology Letters 15, no. 5 (2018): 6913–6924, https://doi.org/10.3892/ol.2018.8181.
75. F. Yang, J. Zhang, B. Li, et al., “Identification of Potential lncRNAs and miRNAs as Diagnostic Biomarkers for Papillary Thyroid Carcinoma Based on Machine Learning,” International Journal of Endocrinology 2021 (2021): 3984463, https://doi.org/10.1155/2021/3984463.

Volume 14, Issue 9

May 2025

e70674

Network-Based Integrative Analysis to Identify Key Genes and Corresponding Reporter Biomolecules for Triple-Negative Breast Cancer
