Volume 121, Issue 1 pp. 755-767
RESEARCH ARTICLE

Comprehensive analysis of lncRNA-TF crosstalks and identification of prognostic regulatory feedback loops of glioblastoma using lncRNA/TF-mediated ceRNA network

Yang Ji

Department of Medical Technology, Jiangsu Vocational College of Medicine, 283 Jiefangnan Road, Yangcheng, 224005 China

Yaqin Gu

Department of Medical Technology, Jiangsu Vocational College of Medicine, 283 Jiefangnan Road, Yangcheng, 224005 China

Shuai Hong

Key Laboratory of Intelligent Information Processing, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Bo Yu

Key Laboratory of Intelligent Information Processing, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Jian-Hua Zhang (Corresponding Author)

Department of Blood Transfusion, Peking University People's Hospital, Beijing, China

Jin-Na Liu (Corresponding Author)

Key Laboratory of Intelligent Information Processing, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Correspondence

Jian-Hua Zhang, Department of Blood Transfusion, Peking University People's Hospital, Beijing 100000, China.

Email: [email protected] and [email protected]

Jin-Na Liu, Key Laboratory of Intelligent Information Processing, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China.

Email: [email protected]
First published: 03 September 2019
Citations: 12

Yang Ji and Yaqin Gu contributed equally to this work.

Abstract

Glioblastoma (GBM) is the most aggressive primary brain tumor. Patients with GBM usually have a poor prognosis, and their median survival time remains less than 2 years. Thus, it is urgent to investigate the molecular mechanisms of GBM. Recent studies have demonstrated that transcription factors (TFs) participate in cancer pathology by regulating long noncoding RNAs (lncRNAs). However, the functional and regulatory roles of TF-lncRNA crosstalks are still unclear. In this study, we constructed a global lncRNA-TF network (GLTN) based on the competing endogenous RNA (ceRNA) mechanism and characterized its topological features. A known GBM-associated lncRNA, MCM3AP-AS1, showed multiple central topological features in GLTN. Furthermore, we identified hub genes and extracted the hub-hub pairs from GLTN to form a hub associated lncRNA-TF network (HALTN). A risk model combining multiple hubs had a significant effect on prognosis. Additionally, module searching identified two functional modules in HALTN, both of which were confirmed as risk factors for GBM. More importantly, we identified core lncRNA-TF crosstalks that might form feedback loops to control biological processes in GBM. Our results demonstrate that synergistic, competitive lncRNA-TF crosstalks play an important role in the pathological processes of GBM and have a strong effect on prognosis. These results can help uncover the molecular mechanisms of GBM and suggest new therapeutic targets.

1 INTRODUCTION

Glioblastoma (GBM) is the most common and aggressive malignant primary brain tumor in adults.1 Although standard clinical treatment over the past decades has comprised surgery, radiation and chemotherapy, GBM still cannot be completely removed because of its invasive nature. The median survival time for patients with GBM has remained less than 2 years.2 Although morphology provides valuable information for histological diagnosis and treatment, it is insufficient for predicting clinical outcomes.3 There is therefore an urgent need to find molecular biomarkers for the diagnosis and prognosis of GBM.

Noncoding RNAs (ncRNAs) have been suggested to play crucial roles in cancer pathological processes. For example, microRNAs (miRNAs), a well-known class of small RNA molecules, regulate gene expression at the posttranscriptional level and have good diagnostic and prognostic value.4 More recently, long noncoding RNAs (lncRNAs), a type of ncRNA more than 200 nucleotides in length, have received wide attention because of their significant effects on complex biological regulation. For example, lncRNAs can exert their function by regulating splicing, translation and gene expression.5, 6 lncRNAs have been demonstrated to participate in multiple biological processes of cancers.7 Yao et al8 found that the expression of lncRNA XIST was upregulated in human GBM stem cells. When XIST was knocked down, microRNA-152 (miR-152) mediated tumor-suppressive effects, including reduced cell proliferation, migration and invasion as well as induced apoptosis. The lncRNA taurine-upregulated gene 1 (TUG1) has been demonstrated to enhance tumor-induced angiogenesis and vascular endothelial growth factor (VEGF) expression by inhibiting miR-299.9 Thus, interactions between miRNAs and lncRNAs are a crucial mechanism in cancer pathology. The competing endogenous RNA (ceRNA) theory has been proposed to elucidate the complex interactions among lncRNAs, miRNAs, and messenger RNAs (mRNAs).10 Furthermore, some studies have demonstrated the biological functions of ceRNAs in cancers. Zhang et al11 constructed a miRNA-lncRNA-mRNA interaction network based on ceRNA and identified functional lncRNAs involved in the malignant progression of GBM. Similarly, Zhou et al12 constructed a dysregulated lncRNA-associated ceRNA network and identified several novel lncRNA biomarkers for early diagnosis of human pancreatic cancer.

In addition, transcription factors (TFs), an important class of protein-coding genes, play key roles at the transcriptional level. For instance, Jen et al revealed a novel mechanism by which Oct4 transcriptionally activates NEAT1 and MALAT1 to promote cell proliferation and metastasis, leading to lung tumorigenesis and poor prognosis.13 However, to date there has been no systematic analysis of lncRNA-TF interactions based on ceRNA in GBM. We therefore hypothesized that lncRNAs and TFs can form ceRNA interactions at the posttranscriptional level and exert key biological functions in the pathology of GBM.

In this study, to investigate the ceRNA crosstalks of lncRNAs and TFs, we applied an integrative pipeline to identify significant lncRNA-TF pairs and merged these pairs into a biological network named the global lncRNA-TF network (GLTN). First, we identified several important topological features of GLTN, which motivated a comprehensive analysis of the network. A known GBM-associated lncRNA, MCM3AP-AS1, was found to have multiple central topological features in GLTN. Because hub genes in a network play important roles in biological processes, we identified the hub genes and extracted the hub-hub pairs from GLTN to form a hub associated lncRNA-TF network (HALTN). A risk model composed of multiple hubs had a significant effect on prognosis. Because lncRNAs often exert their functions in modules, we performed module searching and identified two functional modules in HALTN, both of which are risk factors for GBM. Importantly, some core lncRNA-TF interactions were identified that may form feedback loops to control biological processes in GBM. Our results demonstrate that synergistic, competitive lncRNA-TF pairs play an important role in the pathological processes of GBM and have a strong effect on prognosis. These results can help uncover the molecular mechanisms of GBM and suggest new therapeutic targets.

2 MATERIALS AND METHODS

2.1 Data set of lncRNAs and TFs in GBM

We downloaded the genome-wide lncRNA and TF expression data of GBM from TCGA (https://portal.gdc.cancer.gov/). In brief, the TCGA transcript-level RNA-seq data set was converted to the lncRNA/gene level. Genes with reads per kilobase of transcript per million mapped reads (RPKM) = 0 in all samples were removed. For genes with RPKM = 0 in only some of the samples, those zero values were set to 0.1 to allow log transformation. To allow survival analysis, only samples with clinical information were retained. As a result, GBM-related RNA-seq data for 155 samples were obtained.
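The filtering and log transformation described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess_rpkm`, the dictionary layout, and the choice of base-2 logarithm are assumptions, since the text does not state the log base.

```python
import math

def preprocess_rpkm(matrix, pseudo=0.1):
    """Filter and log-transform an RPKM matrix.

    matrix: dict mapping gene name -> list of RPKM values (one per sample).
    Genes that are zero in every sample are dropped; remaining zeros are
    replaced by `pseudo` (0.1 in the text) before taking the logarithm.
    The base-2 logarithm is an assumption made for this sketch.
    """
    processed = {}
    for gene, values in matrix.items():
        if all(v == 0 for v in values):
            continue  # RPKM = 0 in all samples: remove the gene
        processed[gene] = [math.log2(v if v > 0 else pseudo) for v in values]
    return processed

# Hypothetical toy input: GENE_A is silent everywhere, GENE_B only in sample 1.
toy = {"GENE_A": [0, 0, 0], "GENE_B": [0, 4.0, 8.0]}
processed = preprocess_rpkm(toy)
```

Replacing zeros with a small constant keeps partially expressed genes in the matrix while avoiding an undefined logarithm.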

2.2 miRNA-TF and miRNA-lncRNA interactions

Previous studies have identified endogenous interactions between miRNAs and their targets by using crosslinking and Argonaute (Ago) immunoprecipitation coupled with high-throughput sequencing (CLIP-Seq).14 starBase is a database of curated interaction networks among lncRNAs, miRNAs, and mRNAs derived from CLIP-Seq (HITS-CLIP, PAR-CLIP, iCLIP, CLASH) data.15 In total, we downloaded 423 975 miRNA-mRNA interactions from starBase, covering 13 861 mRNAs and 386 miRNAs. We also obtained the list of human TFs from previous studies and mapped them to the miRNA-mRNA interactions to extract the miRNA-TF interactions. Furthermore, we downloaded all lncRNA sequences from the GENCODE database and the sequences of the TF-targeting miRNAs from the miRBase database. The miRanda algorithm was run with default parameters to identify significant miRNA-lncRNA interactions.
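Extracting miRNA-TF interactions from the miRNA-mRNA table amounts to a set-membership filter. The sketch below assumes the interactions are available as (miRNA, target gene) tuples; the function name and the example identifiers are hypothetical and do not reflect the starBase file format.

```python
def extract_mirna_tf(interactions, tf_names):
    """Keep only the miRNA-target pairs whose target is a known TF.

    interactions: iterable of (mirna, target_gene) tuples.
    tf_names: iterable of TF gene symbols.
    Returns a dict mapping each TF to the set of miRNAs that target it.
    """
    tf_set = set(tf_names)
    mirnas_by_tf = {}
    for mirna, target in interactions:
        if target in tf_set:
            mirnas_by_tf.setdefault(target, set()).add(mirna)
    return mirnas_by_tf

# Hypothetical toy data for illustration only.
pairs = [("miR-a", "SIX1"), ("miR-b", "SIX1"), ("miR-a", "GENE_X")]
mirna_tf = extract_mirna_tf(pairs, ["SIX1", "ZBTB20"])
```

Storing the miRNAs of each TF as a set makes the later count of miRNAs shared with an lncRNA a simple set intersection.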

2.3 Construction of a GLTN

Multiple studies have demonstrated that the activity of ceRNA crosstalk between mRNAs and lncRNAs is governed by the number of shared miRNAs and by strong co-expression. First, for each candidate lncRNA-TF interaction, we calculated the number of shared miRNAs. A hypergeometric test was performed to identify candidate lncRNA-TF pairs at a threshold of P value less than .05 as follows:

$$P = 1 - \sum_{i=0}^{r-1} \frac{\binom{t}{i}\binom{m-t}{n-i}}{\binom{m}{n}}$$

where m is the number of miRNAs in the background, t is the number of miRNAs that interact with the TF, n is the number of miRNAs that interact with the lncRNA, and r is the number of common miRNAs that are shared between the TF and the lncRNA.
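The hypergeometric test for miRNA sharing can be computed directly from binomial coefficients. The sketch below uses only the Python standard library and the symbols m, t, n and r defined in the text; the function name and the toy numbers are illustrative assumptions.

```python
from math import comb

def hypergeom_pvalue(m, t, n, r):
    """Probability of sharing at least r miRNAs by chance.

    m: miRNAs in the background
    t: miRNAs interacting with the TF
    n: miRNAs interacting with the lncRNA
    r: miRNAs shared by the TF and the lncRNA
    Computes P(X >= r), which equals 1 - sum_{i=0}^{r-1} P(X = i).
    """
    upper = min(t, n)
    total = comb(m, n)
    tail = sum(comb(t, i) * comb(m - t, n - i) for i in range(r, upper + 1))
    return tail / total

# Hypothetical example: 386 background miRNAs, 20 target the TF,
# 30 target the lncRNA, and 6 are shared.
p_shared = hypergeom_pvalue(386, 20, 30, 6)
is_candidate = p_shared < 0.05
```

Summing the upper tail directly avoids the loss of precision that can occur when a value very close to 1 is subtracted from 1.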

Second, Pearson correlation coefficients (PCCs) of all the candidate lncRNA-TF pairs were computed. Only candidate lncRNA-TF pairs with PCC greater than 0.6 and P less than .05 were defined as significant lncRNA-TF pairs. GLTN was then constructed by combining all significant lncRNA-TF pairs. Cytoscape (http://www.cytoscape.org/) was used for network visualization.
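A minimal sketch of the co-expression filter is shown below. It computes the PCC in pure Python and keeps pairs above the 0.6 threshold; the correlation P value filter is deliberately omitted here because it requires a t-distribution (in practice a routine such as `scipy.stats.pearsonr` returns both the coefficient and its P value). The function names and toy expression vectors are assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def build_network(candidates, expr, min_pcc=0.6):
    """Return the candidate (lncRNA, TF) pairs with PCC above min_pcc.

    candidates: list of (lncRNA, TF) pairs that passed the hypergeometric test.
    expr: dict mapping gene name -> list of expression values per sample.
    NOTE: the P value filter for the correlation is not applied in this sketch.
    """
    return [(lnc, tf) for lnc, tf in candidates
            if pearson(expr[lnc], expr[tf]) > min_pcc]

# Hypothetical expression profiles across four samples.
expr = {"L1": [1, 2, 3, 4], "T1": [2, 4, 6, 8.5], "T2": [4, 3, 2, 1]}
edges = build_network([("L1", "T1"), ("L1", "T2")], expr)
```

Only positively co-expressed pairs survive, which matches the expectation that ceRNA partners rise and fall together.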

2.4 Construction of HALTN

In network topology, the degree of a node is defined as the number of edges linked to it. Hub genes, which have high degrees in biological networks, tend to be more important than other nodes.16 Thus, in this study, we selected the top 10% of nodes (including lncRNAs and TFs) with the highest degrees in the GLTN as hubs. HALTN was then constructed by mapping these hub nodes to the GLTN and extracting all edges connecting them.
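The hub selection and hub-hub edge extraction can be sketched as below. The tie-breaking rule (alphabetical order among nodes of equal degree) and the rounding of the 10% cut-off are assumptions of this sketch, as the text does not specify them; `hub_subnetwork` is a hypothetical name.

```python
from collections import Counter

def hub_subnetwork(edges, top_fraction=0.10):
    """Select the top-degree nodes and the edges that connect them.

    edges: list of (lncRNA, TF) pairs.
    Returns (hubs, hub_edges), where hubs is a set of node names and
    hub_edges contains only the edges whose two ends are both hubs.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by decreasing degree; ties are broken by name (assumption).
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    n_hubs = max(1, round(len(ranked) * top_fraction))
    hubs = set(ranked[:n_hubs])
    hub_edges = [(u, v) for u, v in edges if u in hubs and v in hubs]
    return hubs, hub_edges

# Toy network: L1 links to T1..T9 and T1 links to L1..L10 (19 nodes in total).
toy_edges = ([("L1", f"T{i}") for i in range(1, 10)]
             + [(f"L{i}", "T1") for i in range(2, 11)])
hubs, hub_edges = hub_subnetwork(toy_edges)
```

A hub with no edge to any other hub contributes no edge to the extracted network, so such a subnetwork can contain fewer nodes than the number of selected hubs.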

2.5 Topological analysis of networks

We calculated multiple topological characteristics of the GLTN and HALTN using the “igraph” package in R. For the average path length, 1000 random degree-conserved networks were generated as controls, and the average path length of each random network was computed. P values were calculated as the fraction of random networks whose average path length was larger than that of the real network.
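The comparison against degree-conserved random networks can be illustrated with a small pure-Python sketch. It is not the igraph procedure used by the authors: here the randomization is done by repeated edge swaps, the average path length is taken over connected node pairs only, and ties are counted as "at least as large"; all three choices, and all function names, are assumptions.

```python
import random
from collections import deque

def average_path_length(nodes, edges):
    """Mean shortest-path length over all connected node pairs (BFS)."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total, pairs = 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nb in adj[cur]:
                if nb not in dist:
                    dist[nb] = dist[cur] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: (a,b),(c,d) -> (a,d),(c,b)."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        if new1 in present or new2 in present:
            continue  # swap would duplicate an existing edge
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

def path_length_pvalue(nodes, edges, n_random=1000, seed=0):
    """Fraction of random networks with average path length >= the real one."""
    rng = random.Random(seed)
    real = average_path_length(nodes, edges)
    hits = 0
    for _ in range(n_random):
        rand_edges = rewire(edges, 10 * len(edges), rng)
        if average_path_length(nodes, rand_edges) >= real:
            hits += 1
    return real, hits / n_random

# Toy (lncRNA, TF) network forming the path L1 - T1 - L2 - T2.
toy_nodes = ["L1", "L2", "T1", "T2"]
toy_net = [("L1", "T1"), ("L2", "T1"), ("L2", "T2")]
real_apl, p_apl = path_length_pvalue(toy_nodes, toy_net, n_random=100)
```

Because edges are stored as (lncRNA, TF) pairs and swaps exchange only the TF ends, the randomized networks stay bipartite and every node keeps its original degree.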

2.6 Identification of core lncRNA-TF modules

lncRNAs have been shown to function in modules.17 To identify core lncRNA-TF modules, we imported the HALTN into Cytoscape and used the “MCODE” plug-in to identify functional modules (parameters: “Haircut”, “Fluff”, Node Score Cutoff: 0.2). DAVID (https://david.ncifcrf.gov/) and PATHWAX (http://pathwax.sbc.su.se/) were used for Gene Ontology (GO) and pathway enrichment analysis.

2.7 Construction of the risk score model

To identify prognostic lncRNA-related signatures, survival analysis was performed based on the clinical information. Univariate Cox regression was applied to all genes. The risk score of each patient was calculated by weighting the expression values with the univariate Cox regression coefficients:

$$\text{Risk score} = \sum_{i=1}^{n} r_i \times \mathrm{Exp}(i)$$

where r_i represents the Cox regression coefficient of gene i in the gene set, n represents the number of genes in the gene set, and Exp(i) represents the expression value of gene i in the corresponding patient. The mean risk score was used as the cut-off to classify patients into high- and low-risk groups.
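The risk score is a coefficient-weighted sum of expression values followed by a split at the mean. The sketch below assumes the univariate Cox coefficients have already been estimated elsewhere; the function names, gene names and numbers are hypothetical.

```python
def risk_scores(expr, coefs):
    """Risk score per patient: sum over genes of coefficient * expression.

    expr: dict mapping gene -> list of expression values (one per patient).
    coefs: dict mapping gene -> univariate Cox regression coefficient
           (taken as given; fitting the Cox model is not shown here).
    """
    n_patients = len(next(iter(expr.values())))
    return [sum(coefs[g] * expr[g][p] for g in coefs) for p in range(n_patients)]

def split_by_mean(scores):
    """Label each patient 'high' or 'low' relative to the mean risk score."""
    cutoff = sum(scores) / len(scores)
    return ["high" if s > cutoff else "low" for s in scores]

# Hypothetical two-gene signature across three patients.
coefs = {"G1": 0.5, "G2": -1.0}
expr = {"G1": [2.0, 4.0, 6.0], "G2": [1.0, 1.0, 0.0]}
scores = risk_scores(expr, coefs)
groups = split_by_mean(scores)
```

A gene with a negative coefficient lowers the score when it is highly expressed, so protective and harmful genes can be combined in one signature.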

2.8 Survival analysis

Kaplan-Meier survival analysis was performed for the high- and low-risk groups of patients. The log-rank test was used to assess statistical significance (P less than .05). All analyses were performed using R 3.3.0.
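For two groups, the log-rank statistic compares the observed and expected numbers of events at each event time. The following is a minimal standard-library sketch of the two-group test, not the R routine used in the study; the function name and toy data are assumptions.

```python
from math import erfc, sqrt

def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times: follow-up time per patient.
    events: 1 if the event (death) was observed, 0 if censored.
    groups: 1 for the high-risk group, 0 for the low-risk group.
    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed = expected = variance = 0.0
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == 1)
        d = sum(1 for ti, e, _ in at_risk if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in at_risk if ti == t and e == 1 and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Toy cohort: all events observed; the high-risk group dies first.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [1] * 8
chi2_sep, p_sep = logrank_test(toy_times, toy_events, [1, 1, 1, 1, 0, 0, 0, 0])
chi2_mix, p_mix = logrank_test(toy_times, toy_events, [1, 0, 1, 0, 1, 0, 1, 0])
```

In the toy cohort, clearly separated groups give a small P value, whereas interleaved groups do not.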

2.9 DNA regulatory elements and motif analysis

To identify active enhancers in GBM, we downloaded H3K27ac ChIP-seq data (GSE61852) from the GEO database. TopHat was used to align reads to the human genome (hg19), and MACS was used to identify ChIP-seq peaks. Enhancers were defined as ChIP-seq peaks located more than 2 kb from a transcription start site (TSS). Promoters were defined as the regions within ±2 kb of a TSS. For the motif analysis, we used FIMO with a P value threshold of less than 1e−4 to scan enhancer and promoter regions.18
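The distance-based definitions of promoters and enhancers can be sketched as below. Measuring distance from the peak midpoint and working on a single chromosome are simplifying assumptions of this sketch, and the function names and coordinates are hypothetical.

```python
def promoter_region(tss, window=2000):
    """Promoter interval: the region within +/- window bp of the TSS."""
    return (max(0, tss - window), tss + window)

def is_enhancer(peak_start, peak_end, tss_positions, window=2000):
    """True if the peak lies more than `window` bp from every TSS.

    Distance is measured from the peak midpoint (an assumption), and all
    coordinates are assumed to be on the same chromosome.
    """
    midpoint = (peak_start + peak_end) // 2
    return all(abs(midpoint - tss) > window for tss in tss_positions)

# Hypothetical coordinates for illustration.
tss_list = [10_000, 50_000]
distal_peak = is_enhancer(20_000, 20_400, tss_list)
proximal_peak = is_enhancer(11_000, 11_400, tss_list)
```

The resulting enhancer peaks and promoter intervals are the regions that would then be scanned for TF motifs.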

3 RESULTS

3.1 Construction and analysis of GLTN

Recent studies have found that lncRNAs play crucial roles in the pathological processes of cancers. TFs can bind promoters or enhancers to regulate the expression of lncRNAs, and TFs and lncRNAs can also form miRNA-mediated ceRNA crosstalks at the posttranscriptional level. Given this complex regulatory structure, to evaluate the functional roles of lncRNAs in GBM, we downloaded the GBM expression matrix with 155 cancer samples from the TCGA data portal. We then constructed a network named GLTN via a multistep pipeline (Figure 1, details in Materials and Methods). In brief, we first mapped all human TFs to the mRNAs in the expression matrix and retained only the TF-related expression matrix. We then downloaded and extracted all AGO-CLIP-supported miRNA-TF interactions from the starBase database. All miRNA-lncRNA interactions were predicted with the miRanda software. A candidate lncRNA-TF pair was generated if the two members competed for at least one common miRNA, and we counted the number of miRNAs shared by every candidate lncRNA-TF pair. Finally, lncRNA-TF ceRNA pairs were identified by a hypergeometric test at a threshold of P value less than .05 and a Pearson correlation test at thresholds of PCC greater than 0.6 and P value less than .05; these pairs composed a binary regulatory network named GLTN. As a result, GLTN contained 507 lncRNA nodes, 437 TF nodes and 2759 edges (Figure 2A).

Figure 1. The pipeline for identification of significant lncRNA-TF interactions. First, we identified significant miRNA-lncRNA interactions by running miRanda and downloaded all miRNA-mRNA interactions from starBase V2.0. Second, we applied a hypergeometric test and a Pearson correlation test to all candidate lncRNA-TF pairs. Third, lncRNA-TF pairs with hypergeometric P value less than .05, PCC greater than 0.6 and correlation P value less than .05 were defined as significant lncRNA-TF interactions. Finally, all significant lncRNA-TF pairs were merged into the GLTN. ceRNA, competing endogenous RNA; GLTN, global lncRNA-TF network; lncRNA, long noncoding RNA; miRNA, microRNA; mRNA, messenger RNA; PCC, Pearson correlation coefficient; TF, transcription factor

Figure 2. Topological features of GLTN. A, The visualization of GLTN. Orange nodes represent lncRNAs, green nodes represent TFs and node size represents degree. B, Degree distribution of the network. The degrees followed a power-law distribution. C, Comparison of the degrees of lncRNAs and TFs. lncRNA nodes had higher degrees than TF nodes. D, Average path length distributions of the real network and 1000 random networks. The average path length of the real network was larger than that of the random networks. E, The distributions of the top 10 nodes by degree, betweenness and closeness. F, GO enrichment results of MCM3AP-AS1-related TFs. GLTN, global lncRNA-TF network; GO, Gene Ontology; lncRNA, long noncoding RNA; TF, transcription factor

3.2 Topological features of GLTN

To evaluate the biological functions of lncRNAs and TFs, we performed topological analysis of GLTN. The degree distribution of the network followed a power law (Figure 2B, R² = .92), indicating that GLTN is scale free, with a small subset of high-degree genes (defined as hub genes) connecting the other genes. We also found that lncRNA nodes usually had higher degrees than TF nodes (Figure 2C), implying that a TF could bind to more than one lncRNA to exert its regulatory function. Next, we calculated the average path length of GLTN and found that it was substantially larger than that of random networks (Figure 2D, 1000 random networks, P less than .001), suggesting that GLTN had reduced global efficiency. These results indicated that lncRNA-TF ceRNA interactions tend to form regional modules and that hub genes might mediate these modules. Furthermore, studies have found that nodes with central topological features in biological networks play key roles in biological processes. Thus, degree, betweenness and closeness were calculated for each node in the network. We then selected the top 10 nodes for each topological feature and found little overlap among the three features. Even so, an lncRNA named MCM3AP-AS1 ranked near the top for each topological feature (Figure 2E), indicating that this lncRNA might play important roles in the pathological processes of GBM. Interestingly, some studies have found that MCM3AP-AS1 participates in the regulatory processes of GBM. For instance, Yang et al found that MCM3AP-AS1 was associated with cell viability and migration. Knockdown of MCM3AP-AS1 repressed tube formation of glioma-associated endothelial cells, leading to inhibition of GBM angiogenesis in vitro. Furthermore, knockdown of MCM3AP-AS1 activated miR-211. KLF5 is a downstream target of miR-211 and is associated with the promoter region of AGGF1. Knockdown of KLF5 repressed AGGF1 expression, which in turn inhibited the activation of the PI3K/AKT and ERK1/2 signaling pathways.19 We performed GO and pathway enrichment analysis of the MCM3AP-AS1-related TFs in GLTN (Figures 2F and S1). These genes were associated with basal transcriptional functions, such as “regulation of transcription” and “histone acetylation.” Notably, MCM3AP-AS1 might control P53-related signal transduction. These results suggested that crucial “hubs” might be located at the center of GLTN and regulate multiple processes in GBM.

3.3 Identification and analysis of hub nodes in GLTN

Based on previous studies, a small subset of genes with high degrees was defined as hub genes. In our study, we selected the top 10% of nodes (including lncRNAs and TFs) with the highest degrees in the GLTN as hubs. As a result, 42 lncRNAs and 52 TFs were selected as hubs. Next, to investigate whether single hub genes were prognostic factors in GBM, we performed Cox regression analysis for all ceRNAs and obtained 28 lncRNAs and 26 TFs as prognostic factors (Tables S1 and S2). Unexpectedly, none of the lncRNA hubs had a significant effect on survival, and among the TF hubs only two (zinc finger and BTB domain containing 20 [ZBTB20] and sine oculis homeobox homolog 1 [SIX1]) were prognostic factors in GBM (Figure 3A and 3B). Furthermore, to investigate the topological features of the prognostic factors, we compared their degrees with those of other nodes. The degrees of most prognostic factors were relatively small, indicating that these genes were located at the periphery of GLTN. This phenomenon has also been reported in other studies. Interestingly, we found that the TF SIX1 was connected with multiple hub lncRNAs in GLTN. Many studies have demonstrated that SIX1 is a crucial regulatory factor in GBM. Tian et al demonstrated that silencing of SIX1 by shRNA inhibited cell proliferation and invasion in GBM. Moreover, overexpression of SIX1 induced CTGF upregulation in GBM and significantly enhanced the activity of the CTGF promoter. Meanwhile, repression of CTGF expression blocked SIX1-induced cell proliferation and invasion, revealing that SIX1 participates in GBM growth and metastasis together with CTGF.20 We then performed a comprehensive analysis of the triple network composed of SIX1, its connected hub lncRNAs and the mediating miRNAs (Figure 3C). In the triple network, some of the miRNAs that mediate the ceRNA pairs have been demonstrated to play complex roles in the pathological processes of GBM. For instance, previous studies found that reduction of Dicer levels in human GBM cell lines caused activation of p27 (Kip1) and repression of cell proliferation. p27 (Kip1) is a downstream target of miRNA 221/222, indicating that these microRNAs play a crucial role in promoting the aggressive growth of human GBM.21 Xia et al22 found that overexpression of miR-204-5p suppressed glioma cell growth, migration and invasion, and further suppressed tumorigenesis and increased overall survival in glioma. Sun et al demonstrated that miR-320c could suppress cell proliferation and metastasis by targeting E2F1, and that suppressing miR-320c in multiple glioma cell lines enhanced cell proliferation and migration.23 All these results indicated that hub gene-related ceRNA crosstalks participate in the regulatory processes of GBM and that dysfunction of ceRNA crosstalks might lead to pathological changes.

Figure 3. Identification of hub nodes in GLTN. A, The hub TF ZBTB20 has a significant prognostic effect in GBM. Patients with high risk scores were assigned to the high-risk group (associated with reduced survival times). B, The hub TF SIX1 has a significant prognostic effect in GBM. C, The triple network of SIX1, hub lncRNAs and mediating miRNAs. GBM, glioblastoma; GLTN, global lncRNA-TF network; lncRNA, long noncoding RNA; SIX1, sine oculis homeobox homolog 1; TF, transcription factor; ZBTB20, zinc finger and BTB domain containing 20

3.4 Construction and analysis of HALTN

The results of the previous section suggested that the regulatory roles of ceRNA pairs between hub genes are important. Thus, we extracted all the hub-hub ceRNA pairs and constructed HALTN (Figure 4A). HALTN was composed of 86 hubs (42 lncRNAs and 44 TFs) and 588 edges, and two components were visually apparent. The average path length of HALTN was substantially larger than that of random networks (Figure 4B, 1000 random networks, P less than .001). This result implied that the network had reduced global efficiency and that multiple modules were mediated by genes with high betweenness centrality. To investigate the biological functions of HALTN in GBM, we performed pathway enrichment analysis of the hub TFs in the network (Figure 4C). Some basal pathways, such as “Basal transcription factors” and “Transcriptional misregulation in cancer,” were enriched, indicating that these hubs control the fate of other genes in GBM. The “Notch signaling pathway,” which has been validated as a key regulatory pathway in GBM, was also enriched in HALTN.24 A single hub gene has only a weak effect on the prognosis of GBM patients, whereas some studies have suggested that multiple combined genes have a strong effect on survival. Therefore, to determine whether these hubs were prognostic factors for GBM, we constructed a risk model. First, we assessed the prognostic effects of all lncRNA hubs combined and of all TF hubs combined. The combined lncRNA hubs, but not the combined TF hubs, had a significant effect on the prognosis of GBM (Figures 4D and S2). However, when we combined all lncRNA and TF hubs into one risk factor, the prognostic effect was stronger than that of the lncRNA hubs alone, indicating that ceRNA crosstalks could improve the prognostic effect.

Figure 4. A, The visualization of HALTN. Orange nodes represent lncRNAs and green nodes represent TFs. B, Average path length distributions of the real network and 1000 random networks. C, Results of pathway enrichment analysis for all hub TFs. D, Hub lncRNAs have a significant prognostic effect in GBM. Patients with high risk scores were assigned to the high-risk group (associated with reduced survival times). GBM, glioblastoma; GLTN, global lncRNA-TF network; HALTN, hub associated lncRNA-TF network; lncRNA, long noncoding RNA; SIX1, sine oculis homeobox homolog 1; TF, transcription factor; ZBTB20, zinc finger and BTB domain containing 20

3.5 Identification of functional modules in HALTN

Previous studies have found that lncRNAs exert their functions in tightly connected modules. After importing HALTN into Cytoscape, we used the “MCODE” plug-in to identify functional modules. Coincidentally, one module was identified in each of the two components of HALTN. Module 1 was composed of 28 nodes (13 lncRNAs and 15 TFs) and 164 edges (Figure 5A). Subpathway enrichment analysis of the TFs in module 1 (Figure 5C) showed that only 4 subpathways were enriched, likely because of the small number of genes. Even so, this module encompassed some crucial genes, such as high mobility group AT-hook 2 (HMGA2), GATA binding protein 6 (GATA6) and forkhead box F2 (FOXF2). Kaur et al found that HMGA2 expression was elevated in primary human GBM tumors and cell lines, and that knockdown of HMGA2 decreased the stemness, invasion and tumorigenicity of GBM.25 Some studies have investigated the methylation levels of GATA6 DNA regulatory elements, indicating the importance of GATA6 methylation for GBM patient outcome.26 Additionally, FOXF2, together with FoxF1, has been demonstrated to suppress transcription of the CDK inhibitor p21Cip1, synergistically promoting rhabdomyosarcoma carcinogenesis.27 Module 2 encompassed 44 nodes (22 lncRNAs and 22 TFs) and 120 edges (Figure 5D). Subpathway enrichment results showed that this module was related to basal functions, the “TGF-beta signaling pathway” and the “Jak-STAT signaling pathway” (Figure 5F). Previous studies have found that the transforming growth factor β (TGF-β) pathway functions as a pan-cancer oncogenic factor and is regarded as a therapeutic target.28 TGF-β increased the self-renewal capacity of glioma-initiating cells by activating the Smad/LIF/JAK-STAT axis, and the effect of TGF-β and LIF on glioma-initiating cells promotes oncogenesis.29 Furthermore, Senft et al30 found that suppression of the JAK-2/STAT3 pathway decreased the migratory and invasive potential of GBM cells. Module 2 also contained a number of known GBM-related TFs, such as SIX1 and HOXA11.

Figure 5. Module analysis of HALTN. A, The view of module 1. B, Module 1 could be used to divide GBM patients of the test group into two different risk groups (P = .013). C, Subpathway enrichment analysis for TFs in module 1. D, The view of module 2. E, Module 2 could be used to divide GBM patients of the test group into two different risk groups (P = .035). F, Subpathway enrichment analysis of TFs in module 2. GBM, glioblastoma; HALTN, hub associated lncRNA-TF network; TF, transcription factor

Functional modules often show a stronger prognostic effect than single genes. To determine whether the two modules had prognostic effects in GBM, we integrated the expression profiles of the module genes into a risk model and performed survival analysis. As a result, each module classified the training GBM patients into high- and low-risk groups with different clinical outcomes (Figure 5B and 5E). These results suggested that synergistic, competing modules composed of lncRNAs and TFs have significant prognostic effects and could be used as biomarkers in GBM.

3.6 Identification of core lncRNA-TF crosstalks

Recent studies have found that TFs can bind DNA regulatory elements to control the expression of lncRNAs. Thus, to investigate the binding motifs between TFs and lncRNAs, we performed motif searching in the DNA regulatory elements of all lncRNAs. We found numerous TF binding sites in both the enhancers and the promoters of lncRNAs (Figure 6A and 6B). Because hub genes often play important roles in biological processes, we extracted the motif searching results of 12 hub TFs. In the enhancer regions, the TF PLAG1 had the most binding sites and target lncRNAs (Figure 6A). PLAG1 has been reported as an oncogene involved in the Wnt pathway that regulates cancer pathology.31 HOXA13 and POU2F1 also had large numbers of binding sites and target lncRNAs. Previous studies found that increased HOXA13 expression can lead to enhanced cell proliferation and invasion. HOXA13 suppressed the expression of β-catenin and phospho-smad2/3 in the nucleus and activated phospho-β-catenin in the cytoplasm. Downregulation of HOXA13 resulted in decreased tumor growth.32 In the promoter regions, PLAG1, HOXA13 and POU2F1 also had the most binding sites and bound the majority of lncRNAs (Figure 6B). All these results implied that hub TFs might function as core TFs that bind the enhancers and promoters of many lncRNAs and form “feedback loops” that participate in cancer biology (Figure 6C). Next, we extracted from the motif searching results all hub TF-hub lncRNA regulatory pairs that occurred in enhancers and promoters. In total, 167 pairs encompassing 12 TFs and 39 lncRNAs were identified (Figure 6E). These core lncRNA-TF crosstalks had a significant prognostic effect in GBM (Figure 6D). These results suggested that the synergy of core lncRNA-TF crosstalks not only maintains key pathological functions but also has an impact on patient survival.

Figure 6. Identification of core lncRNA-TF pairs from motif analysis. A, Motif searching of enhancer regions of lncRNAs. The bar and overlaid line plot shows the motifs identified in enhancers (bars, main y-axis) and the number of lncRNAs regulated by each TF (line, alternate y-axis, green). B, Motif searching of promoter regions of lncRNAs. C, Schematic diagram of feedback loops composed of lncRNAs and TFs. D, Core lncRNA-TF pairs have a significant prognostic effect in GBM (P = .005). E, The view of core lncRNA-TF pairs. DLX2, distal-less homeobox 2; FOXF2, forkhead box F2; HIC2, hypermethylated in cancer 2 protein; HMGA2, high mobility group AT-hook 2; HOXA11, homeobox A11; HOXA13, homeobox A13; HOXB9, homeobox B9; LHX6, LIM homeobox 6; miRNA, microRNA; mRNA, messenger RNA; PAX9, paired box 9; PLAG1, pleomorphic adenoma gene 1; POU2F1, POU class 2 homeobox 1; TBX18, T-box 18; lncRNA, long noncoding RNA; TF, transcription factor

4 DISCUSSION

GBM is a lethal type of brain cancer worldwide, and the median survival of patients with GBM has remained less than 2 years. Thus, it is urgent to investigate the molecular mechanisms of GBM and to find significant risk factors for its diagnosis and prognosis. Many studies have found that ncRNAs, such as miRNAs, play crucial roles in the pathology of GBM and act as strong prognostic factors.4, 33 Moreover, lncRNAs, a more recently described type of ncRNA, can be targeted by miRNAs and act as miRNA sponges. lncRNAs and mRNAs can compete for shared miRNAs through the ceRNA mechanism, which has been demonstrated to play crucial roles in the pathology of diseases. In GBM, some studies have found that lncRNAs can sponge multiple miRNAs and thereby maintain crucial biological processes. For example, Cai et al9 found that TUG1 can target miR-299, leading to enhanced tumor-induced angiogenesis and VEGF expression in GBM. Knockdown of XIST exerted tumor-suppressive effects in human GBM stem cells through miR-152.8 Additionally, TFs are crucial because of their functional roles in regulating transcription. Some studies have used bioinformatics analysis to investigate the mechanisms of GBM comprehensively and found functional lncRNAs in GBM.34 Previous studies also found that ceRNA crosstalks among TFs, lncRNAs and mRNAs participate in the pathological processes of GBM and have prognostic effects.35

In this study, we focused only on the ceRNA crosstalks between TFs and lncRNAs. Our hypothesis is that lncRNAs and TFs can form core ceRNA interactions at the posttranscriptional level and exert key biological functions in the pathology of GBM (Figure 6C). To investigate the ceRNA crosstalks of lncRNAs and TFs, we developed an integrative pipeline to identify significant lncRNA-TF pairs and merged all pairs into the GLTN (Figure 1). Network topology analysis identified MCM3AP-AS1, a known GBM-associated lncRNA, as having multiple central topological features in the GLTN, indicating that crucial genes occupy central positions in the network. Hub genes in a network also play important roles in biological processes. We therefore identified hub genes and extracted the hub-hub pairs from the GLTN to form the HALTN for further analysis. The results showed that most hub genes individually had poor prognostic ability, but multiple hubs combined into a risk model had a significant effect on prognosis. Because lncRNAs often exert their functions in modules, we identified two functional modules from the HALTN. Importantly, both modules are risk factors for GBM. Furthermore, we identified a subset of core lncRNA-TF crosstalks that might form feedback loops to control biological processes in GBM. These core lncRNA-TF crosstalks showed a significant effect on prognosis.
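The hub-extraction step described above can be illustrated with a minimal Python sketch. It assumes that hubs are the top fraction of nodes ranked by degree; this cutoff, the function name `hub_subnetwork`, and the toy node names are hypothetical and are not taken from the study, whose actual hub criterion is not restated in this section.

```python
from collections import defaultdict


def hub_subnetwork(edges, top_fraction=0.1):
    """Return (hubs, hub_edges) for an undirected edge list.

    ASSUMPTION: hubs are the top `top_fraction` of nodes by degree.
    The study's real hub definition may differ.
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Rank nodes by decreasing degree; break ties by name for reproducibility.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    n_hubs = max(1, int(len(ranked) * top_fraction))
    hubs = set(ranked[:n_hubs])
    # Keep only edges whose two endpoints are both hubs (hub-hub pairs).
    hub_edges = [(a, b) for a, b in edges if a in hubs and b in hubs]
    return hubs, hub_edges


# Toy lncRNA-TF network with hypothetical node names.
toy_edges = [
    ("lncA", "TF1"), ("lncA", "TF2"), ("lncA", "TF3"),
    ("lncB", "TF1"), ("lncB", "TF2"),
    ("lncC", "TF1"),
]
hubs, hub_edges = hub_subnetwork(toy_edges, top_fraction=0.4)
```

In this toy example the two highest-degree nodes are retained as hubs, and the only surviving edge is the one connecting them. Other centrality measures (for example betweenness) could replace degree without changing the overall structure.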

Here, we focused on identifying core lncRNA-TF crosstalks. Recent studies have found that TFs and lncRNAs can form transcriptional feedback loops that exert key biological functions. In our study, two modules with crucial functions and a significant prognostic effect were identified. Furthermore, some studies have found that core TFs can bind the enhancer and promoter regions of genes across the genome. We performed motif searching on the enhancers and promoters of lncRNAs and extracted the hub TF-hub lncRNA interactions. Notably, 12 core TFs bound 39 hub lncRNAs in both enhancer and promoter regions; these pairs might form tissue-specific feedback loops and exert specialized functions. In a future study, we will focus on the diagnostic value of these core lncRNA-TF pairs. In addition, we found that crosstalks between lncRNAs and TFs have strong prognostic effects.

However, the present study has some limitations. First, gene expression and a hypergeometric test were integrated to identify significant lncRNA-TF interactions; a more precise algorithm would improve the stability of our results. Second, to investigate the regulatory loops between TFs and lncRNA enhancers, ChIP-seq data for H3K27ac were used. Because matched data were lacking, enhancers were identified from public data in the GEO database. If multi-omics data from the same samples become available from TCGA, the core lncRNA-TF feedback loops could be identified more accurately.
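As an illustration of the first point, the combination of a hypergeometric test with expression data might be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the hypergeometric test is assumed to score the number of miRNAs shared by a lncRNA and a TF, expression similarity is assumed to be a Pearson correlation, and the thresholds (`p_cutoff`, `r_cutoff`) and all function names are hypothetical.

```python
from math import comb, sqrt


def hypergeom_pvalue(total_mirnas, n_lnc, n_tf, n_shared):
    """Upper-tail probability P(X >= n_shared) of sharing miRNAs by chance.

    total_mirnas: size of the miRNA universe
    n_lnc, n_tf:  miRNAs targeting the lncRNA and the TF, respectively
    n_shared:     miRNAs targeting both
    """
    upper = min(n_lnc, n_tf)
    tail = sum(
        comb(n_lnc, k) * comb(total_mirnas - n_lnc, n_tf - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / comb(total_mirnas, n_tf)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def is_candidate_pair(p_value, r, p_cutoff=0.05, r_cutoff=0.0):
    """ASSUMED rule: significant miRNA sharing and positive co-expression."""
    return p_value < p_cutoff and r > r_cutoff


# Toy numbers: 10 miRNAs in total, 4 target the lncRNA, 3 target the TF,
# and 2 target both.
p = hypergeom_pvalue(total_mirnas=10, n_lnc=4, n_tf=3, n_shared=2)
r = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

With a realistic miRNA universe the p-values would normally also be corrected for multiple testing across all lncRNA-TF pairs; that step is omitted here for brevity.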

In summary, we performed a comprehensive analysis of lncRNA-TF ceRNA crosstalks in 155 GBM samples. Our results demonstrate that synergistic, competitive lncRNA-TF pairs play an important role in the pathological processes of GBM and have a strong effect on prognosis. These results help to uncover the molecular mechanisms of GBM and may provide new therapeutic targets.

ACKNOWLEDGMENT

We would like to thank the researchers and study participants for their contributions.

CONFLICT OF INTERESTS

The authors declare that there is no conflict of interest.

AUTHOR CONTRIBUTIONS

JHZ and JNL designed this study. YQG, SH, and BY collected and processed data. YJ and JNL wrote the manuscript.

DATA ACCESSIBILITY

The data sets used and/or analyzed in the present study are available from the manuscript.
