Enhancing Interpretability: A Hierarchical Belief Rule-Based (HBRB) Method for Assessing Multimodal Social Media Credibility
Abstract
User- and artificial intelligence-generated content, coupled with the multimodal nature of information, has made the identification of false news an arduous task. While models can assist users in improving their cognitive abilities, commonly used black-box models lack transparency, posing a significant challenge for interpretability. This study proposes a novel credibility assessment method for social media content that leverages multimodal features by optimizing the hierarchical belief rule-based (HBRB) inference method. Compared to other popular feature engineering and deep learning models, our method integrates, analyzes, and filters relevant features and improves the HBRB structure to make the model layered, independent, and interconnected, enhancing interpretability and controllability and thereby addressing the rule combination explosion problem. The results highlight the potential of our method to improve the integrity of the online information ecosystem, offering a promising solution for more transparent and reliable credibility assessment in social media.
1. Introduction
The growth of social media platforms has fueled the production of user-generated content and, subsequently, of false information. The easy access to and the fast generation and diffusion of this undesirable information is a critical issue that challenges the stability of the social order. Assessing the credibility of the vast amount of information available on social media is an arduous task, as its mere existence is often enough for users to consider it true [1]. Moreover, language in social media has evolved very quickly, transitioning from purely verbal (unimodal) to verbal and nonverbal (multimodal) communication, resulting in new challenges for the credibility assessment of online content. Previous research in this area has explored techniques such as feature engineering and deep learning to address this issue [1–5]. On the one hand, feature engineering methods focused on the analysis of characteristics associated with the user, content, and topic, extracting key metrics to identify false information [1]. On the other hand, deep learning methods employed recurrent neural networks and attention mechanisms to detect rumors and false information [3, 6, 7]. These approaches leveraged various features, such as emotional tendency [8], user reputation [9], content complexity [10], dissemination patterns [11], and image-text fusion [3], among others, to effectively assess credibility.
However, these approaches generally face several challenges when assessing credibility in multimodal social media content [1, 2, 5, 11–24]: (1) the complex input-output relationships of deep learning and other "black-box" models are usually difficult to explain, while the cross-modal correlation of feature engineering methods is low [11]; (2) data limitations may result in low robustness for complex detection tasks; and (3) discrepancies between different information modalities generate inconsistent data; for example, a text might be accurate while the content of the accompanying images is misleading. Moreover, the interpretability of these models is subjective, and thus their effectiveness remains an open question [17].
Model interpretability can be categorized into two groups: (1) post-hoc interpretability, which aims to explain models with weak interpretability through designed interpretation methods; for example, to explain the working mechanism of random forests and deep neural networks, a series of methods including rule extraction [25], activation maximization [26], feature inversion [27], and class-activation mapping [28] have been suggested; (2) ante-hoc interpretability, which refers to the built-in interpretability of the model, allowing its working mechanism to be understood without external aids. Rule-based models, known for their ante-hoc interpretability, can have their mechanisms visualized through rules [29, 30]. Post-hoc interpretability is flexible and suitable for complex models, but the generated explanations may be complex, not fully accurate, and may require additional computation. In contrast, ante-hoc interpretability generally outperforms post-hoc interpretability in terms of transparency and user acceptance: it contains built-in explanations, provides transparent and immediate feedback, is more trustworthy to users, and is easy to debug. However, ante-hoc interpretability presents limitations when dealing with complex nonlinear problems [31].
Given the challenges in readability of previous methods, the belief rule-base (BRB) inference method, with ante-hoc interpretability, has been widely used in the field of assessment and prediction because of its transparent inference pathway and high prediction accuracy [11, 28]. BRB is a generic rule-based model whose interpretability can be characterized by the readability and consistency of its rules. However, BRB encounters a significant challenge known as "rule combination explosion": the number of rules grows exponentially with the number of attributes, potentially diminishing the readability, computational efficiency, and scalability of the model [32]. The optimization process of BRB aims to improve model accuracy, yet it may inadvertently introduce incorrect rules, which are usually incomprehensible to users. To cope with this problem, the hierarchical belief rule-based (HBRB) model has recently been proposed [30]. Compared to the BRB method, the HBRB approach significantly reduces the rule size by constructing an assessment system that accounts for the independent effect of each modality (i.e., features of text, image, and users) and for their relationships, allowing credibility to be directly expressed by quantifying multimodal features and performing a correlation analysis.
Here, we introduce the hierarchical belief rule-base inference model for multimodal information credibility assessment (HBRB-MICA) with ante-hoc interpretability. Compared to post-hoc interpretability models, the ante-hoc interpretability model allowed us to accurately identify the unreliability of each feature and to adjust its weight accordingly to meet the needs of credibility evaluation in different scenarios. We validated the model's accuracy through training on an open-source annotated dataset of false information and benchmarked its performance against baseline models.
The main contribution of our work is the development of an innovative model (HBRB-MICA with ante-hoc interpretability) to assess the credibility of multimodal information published by users on social media. The model presents a hierarchical, independent structure based on sub-models, providing a controllable and transparent assessment process. It also copes with the rule combination explosion problem, reducing the computational cost and increasing the efficiency of the process; the presented model is therefore potentially scalable.
The rest of this paper is organized as follows: Section 2 provides a review of existing research on credibility assessment methods. Section 3 presents the proposed method (HBRB-MICA) and its components. Section 4 presents the experimental procedure and the results. Section 5 provides the conclusions and discussion of this research.
2. Relevant Research
Herewith, we present the relevant research performed around the selection and extraction of multimodal features for credibility assessment and around different credibility assessment models.
2.1. Feature Selection and Extraction for Credibility Assessment
False information is usually characterized by shorter content, more aggressive wording, and semantic inconsistencies [33, 34]. To assess the credibility of online information based on these semantic features, Machackova and Smahel focused on the clarity and understandability of the posted content and on the users' feedback, using principal component analysis (PCA) and repeated-measures ANOVA for feature selection [35]. However, social media information is becoming increasingly content-rich, so focusing on a single textual modality is no longer effective. Nowadays, images are also included in the main content, and image inpainting algorithms such as DNNAM [36] and MICU [37] can improve image quality, making the evaluation of authenticity very difficult. Comprehensive reasoning based on multimodality can address this problem.
Modality refers to a finer-grained concept than multimedia data, as each modality within multimodal data contributes a unique value to the whole, offering insights not deducible from other modalities alone [38]. In addition to text and images, the modality of user information is also related to information credibility. Choi et al. conducted a study of the information on social Q&A platforms, considering the professional certification of the author and the relevance, richness, and accuracy of the content as criteria to assess credibility [39]. Multimodal research on assessing the credibility of information in the past 5 years has primarily focused on three modalities: text, image, and user characteristics (Table 1). Table 1 shows that, even when the dataset is imbalanced, the combination of rich modal features achieves better accuracy (F1 values) in credibility assessment [22]. Meel and Vishwakarma [5] demonstrated that combining advanced text and image processing models (BERT, ALBERT, and Inception-ResNet-v2) achieved highly accurate multimodal assessment on different datasets. Qureshi and Malick [20] highlighted the effectiveness of traditional machine learning methods (e.g., logistic regression, decision trees (DTs), and LightGBM) in processing text and user features, achieving an F1 value of 90.0%. Wang et al. [22] demonstrated the benefits of multimodal feature fusion by integrating text, image, and user features using complex neural network models (ERNIE, VGG-19, and BP), obtaining an F1 value of 95.90% on the Weibo dataset. Qi et al. [24] focused on text, image quality, and publishing frequency, combining two neural network models to detect fake news and achieving high accuracy values. Evans et al. performed a correlation analysis with a DT classifier to identify the most important features based on root mean square (RMS) screening of the feature values [23]. This method had the disadvantage of being less robust than Spearman's rank correlation in detecting monotonic relationships and handling outliers.
Reference | Year | Data source (# samples) | Algorithm(s)² | Analyzed features | Results³
---|---|---|---|---|---
Meel and Vishwakarma [5] | 2023 | | BERT + ALBERT + Inception-ResNet-v2 | Text, image | |
Qureshi and Malick [20] | 2023 | Platform X (5246) | LR + DT + LightGBM | 11, text, user | F1 = 90.0%
Wang, Wang, and Han [22] | 2022 | Weibo (5840 (T)¹, 3963 (F)) | ERNIE + VGG-19 + BP | 16, text, image, user | F1 = 95.90%
Evans et al. [23] | 2021 | Platform X (208,209) | RF + kNN + LR + NB + DT | 34 + 21, content, context, user | F1 = 95.80%
Qi et al. [24] | 2019 | Weibo (4779 (T), 4749 (F))¹ | MVNN + CNN-RNN | Text, frequency, and pixels of images | F1 = 90.60%
- ¹T refers to true information, and F refers to false information.
- ²RF = random forest, kNN = k-nearest neighbors, LR = logistic regression, NB = Naïve Bayes, SVM = support vector machine, DT = decision tree.
- ³Results shown are based on the top-performing classifiers.
Despite the high accuracy scores obtained, these methods focused on the extraction and fusion of features; because the features were easily distorted after fusion, it was very difficult to detect which of them actually affected the authenticity of the information.
2.2. Credibility Assessment Models
Most studies in credibility assessment on social media were performed by supervised learning [40]. In 2011, Castillo et al. [1] pioneered an automated method to assess the credibility of tweets, selecting 15 key metrics to identify false information based on a DT model. Later, credibility assessment methods based on different machine learning models, such as support vector machines [11], Naïve Bayes, k-nearest neighbors, DTs, logistic regression, and random forest [23], were proposed, achieving good results (Table 1) in assessing the credibility of social media information through feature engineering and machine learning algorithms.
With the emergence of neural networks, deep learning models have been widely explored for rumor detection. Deep learning presents a robust learning capacity by using feature vectors, enabling the extraction of higher quality and more representative data features. However, its "black box" nature often obscures the inference process [41]. Unimodal approaches typically capture text features through deep learning models such as Bi-LSTM, graph attention networks (GATs), and pre-trained models (BERT, GPT, etc.). In contrast, multimodal approaches integrate multiple features through feature fusion and feature interaction. Previous research focused on two aspects: (1) the fusion of different types of features and the learning of associations between multimodal features [12, 13, 19] and (2) the construction of features through cross-attention [42], semantic alignment [3], similarity matching [43], and consistency learning [44] mechanisms to discover patterns between associated features, aligned entities, similar semantics, and coherent images and texts during feature interaction. However, the complex interactions between multiple modalities and the intricate neural network architectures made the readability of these methods, i.e., the interpretation of how the model arrives at its decision, very difficult. To address this problem, some studies have attempted to link textual content with comment features (or external knowledge) [45] or to investigate the semantic inconsistency between images and text as an indicator of false information [46]. Nevertheless, the performance of these methods was compromised, as they tended to focus on single correlations or inconsistencies and ignored more complex interactions and underlying semantic relationships in multimodal information. They also faced challenges in explaining the model's decision-making process, especially when dealing with large-scale and diverse social media data.
Because of the inexplicability of deep learning methods, BRB inference methods have been widely used in the fields of assessment and prediction in recent years [29, 30]. BRB inference methods present clear inference paths as well as high prediction accuracy [47]. They are developed on the basis of the traditional IF-THEN rule-based expert system and the evidential reasoning (ER) method, making them suitable for constructing complex nonlinear causal relationships between premise and result attributes under uncertain conditions [47]. Most scholars have used the rule-base inference methodology using evidential reasoning (RIMER) [48], as it fits both deterministic and stochastic systems with sufficient accuracy. RIMER is particularly effective in handling complex uncertain problems that cannot be easily expressed through deterministic or stochastic mathematical models [49]. However, when too many features are present, RIMER produces an excessively large rule base, facing the rule combination explosion problem [50]. To solve this, scholars proposed the HBRB inference method [51], which can significantly reduce the rule-base size of the BRB approach and can construct an assessment system according to the actual structure of the system. Building on this, Cao et al. proposed a new optimization method to ensure the consistency of the HBRB model, making it more suitable for reliable fault diagnosis of inertial platforms [30]. This method optimized the model by calculating the deviation between the assessment results of the sub-models and the real values, adjusting the parameters and the weights of the rules. To date, it has never been used in information credibility assessment.
3. Method
This section describes the proposed credibility assessment method with ante-hoc interpretability (Figure 1). The method was divided into two steps. Firstly, three types of features, i.e., text (content and comments), image, and user information, were selected, extracted, and processed; secondly, the HBRB information credibility inference model was constructed to analyze the multimodal features.
[Figure 1: Overview of the proposed HBRB-MICA credibility assessment method.]
3.1. Multimodal Feature Selection, Extraction and Processing
3.1.1. Text Features
Relevant text features were selected to identify false information, as shown in Table 2. These features were reported to be characteristic of false information spreading to a wider audience [1, 9, 53, 54, 56].
Dimension | Feature | Description | Reference
---|---|---|---
Text | Text_Exc_count | Percentage of "!" | [1, 14]
 | Text_Que_count | Percentage of "?" | [1, 33]
 | Text_At_count | Number of "@" | [2, 52]
 | Text_emotion | Emotional polarity in the original text | [9, 16, 53, 54]
 | Text_abnormality | Text word abnormality in content | [10, 45]
 | Comment_emotion | Emotional polarity in comments | [10, 45]
 | Comment_abnormality | Text word abnormality in comments | [10, 42]
 | Difference | Divergence in emotional comments | [55]
The use of exclamation and question marks as well as user mentions (Text_Exc_count, Text_Que_count, and Text_At_count, respectively) are common techniques used to spread undesirable information [1, 2, 14, 52].
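As a minimal sketch of how these counts can be extracted (the normalization by text length is our assumption, since the paper does not define the denominator of the "percentage"; counting full-width marks is likewise an assumption for Chinese text):

```python
import re

def punctuation_features(text: str) -> dict:
    """Punctuation-based text features from Table 2 (half- and full-width
    marks are both counted; normalizing by text length is an assumption)."""
    n = max(len(text), 1)
    return {
        "Text_Exc_count": (text.count("!") + text.count("！")) / n,
        "Text_Que_count": (text.count("?") + text.count("？")) / n,
        "Text_At_count": len(re.findall(r"[@＠]\S+", text)),
    }

# Example: punctuation_features("太可怕了！！快转发@官方 求证？")
```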
Text_emotion referred to the emotional polarity (positive or negative) of the original text, based on sentiment analysis performed on the text.
Comment_emotion referred to the overall emotion score per comment. For the emotion tendency analysis of the original text and the comments (Figure 2), the scores from the emotion dictionaries provided by BosonNLP and HowNet were used. As presented in Figure 2, the emotion scores of emotion-associated words were first obtained by word splitting and emotion lexicon retrieval; the associated scores were then calculated; finally, the text emotion score was obtained by weighted summation.
[Figure 2: Process of the emotion tendency analysis for the original text and comments.]
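The following sketch illustrates the weighted-summation scoring with the jieba segmenter and a toy lexicon; the actual study used the BosonNLP and HowNet emotion dictionaries, and all lexicon scores and degree weights below are hypothetical:

```python
import jieba  # Chinese word segmentation

# Hypothetical lexicon entries; the study used the BosonNLP and HowNet
# emotion dictionaries instead of this toy data.
TOY_LEXICON = {"可怕": -2.0, "谣言": -1.5, "感动": 1.8, "平安": 1.2}
DEGREE_WORDS = {"非常": 1.5, "有点": 0.7}   # degree adverbs scale the next hit

def emotion_score(text: str) -> float:
    """Weighted summation of lexicon scores over segmented words."""
    score, weight = 0.0, 1.0
    for w in jieba.lcut(text):                # word splitting
        if w in DEGREE_WORDS:
            weight = DEGREE_WORDS[w]          # remember the degree weight
        elif w in TOY_LEXICON:
            score += weight * TOY_LEXICON[w]  # weighted emotion score
            weight = 1.0
    return score                              # > 0 positive, < 0 negative polarity

# emotion_score("这个消息非常可怕") -> a negative score
```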
3.1.2. Image Feature
False information often contains both text and image features whose content usually differs [60]. For example, when a piece of false information creates a fictional scenario or tells a fake story, it can be difficult to find matching images from real life [24, 44]. In addition, the resolution of images used in fake news is generally lower than that of real news as they have been manipulated and repeatedly shared. Repeated sharing also results in a lower number of images used in false information when compared to the number of images available for reliable information [24, 50].
In this work, we selected semantic consistency features of images and text, and other related image features as measurement indicators, as shown in Table 3.
[Table 3: Selected image features: Similarity_ocr (semantic consistency of image and text), Img_definition/Size (image resolution), and Img_num (number of images).]
The complete process, supported by an example of the extraction of the semantic consistency features from images and texts, was included in Figure 3.
Img_definition referred to the image resolution. We used the OpenCV tool to obtain the average resolution of the images attached to a post [61]. Then, we calculated the variance of the image grayscale data to measure the clarity of the image in order to help verify its authenticity. Finally, Img_num referred to the number of images attached to the text.
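A minimal OpenCV sketch of the clarity measurement described above, assuming a list of local image file paths for one post (the paths and aggregation are placeholders):

```python
import cv2
import numpy as np

def image_clarity(paths):
    """Average grayscale variance over a post's images (clarity measure)."""
    variances = []
    for p in paths:
        img = cv2.imread(p)                   # BGR image, None if unreadable
        if img is None:
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        variances.append(float(gray.var()))   # variance of grayscale values
    return float(np.mean(variances)) if variances else 0.0

# Img_num is simply len(paths); per-image resolution is available as img.shape.
```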
3.1.3. User Features
Two user features, certification and influence, were included to assess the veracity of social media information (Table 4).
Is_certification referred to whether a user had authenticated their account on the platform. Influence was calculated as the ratio of a user's number of fans to the number of accounts the user follows. Generally, influential users are more careful when posting information on social media [9, 62]. In contrast, robots, which typically hold less influential accounts, are often used to spread false information [9, 52].
3.2. Credibility Assessment Model
We applied, for the first time, the HBRB model with ante-hoc interpretability to assess the credibility of multimodal information in social media. The model presents a hierarchical inference structure for evaluating the credibility of multimodal information, addressing the problems of poor readability and of excessive rule generation (rule combination explosion) (Figure 4).
[Figure 4: Hierarchical structure of the HBRB credibility assessment model.]
The model was divided into three parts: (1) The hierarchical rule activation for inference, (2) the parameter optimization algorithm, and (3) the comprehensive hierarchical inference to generate the final credibility scores. The algorithmic pseudocode of the model was described in Algorithm 1, and its notations were summarized in Table 5.
Algorithm 1: HBRB credibility assessment algorithm with parameter optimization.

```
 1  function HBRB_Credibility_Assessment(X)
    Input:  features X = {x1, x2, …, xM}
    Output: credibility score ŷ
 2      Initialize sub-models BRB-1, BRB-2, BRB-3, BRB-4
 3      result1 ← Process_SubModel(BRB-1, text features of X)
 4      result2 ← Process_SubModel(BRB-2, image features of X)
 5      result3 ← Process_SubModel(BRB-3, user features of X)
 6      result4 ← Process_SubModel(BRB-4, [result1, result2, result3])
 7      ŷ ← result4
 8      return ŷ

 9  function Process_SubModel(BRB, X)
10      for each rule Rk in BRB do
11          Calculate the activation weight (Equation (7)):
                ωk = θk ∏_{i=1}^{M} (αi,k)^{δi} / Σ_{l=1}^{L} [ θl ∏_{i=1}^{M} (αi,l)^{δi} ]
12      end
13      Calculate the normalization factor μ (Equation (8)):
            μ = [ Σ_{j=1}^{N} ∏_{k=1}^{L} ( ωk βj,k + 1 − ωk Σ_{i=1}^{N} βi,k )
                  − (N − 1) ∏_{k=1}^{L} ( 1 − ωk Σ_{i=1}^{N} βi,k ) ]^{−1}
14      for j ← 1 to N do
15          Calculate the combined belief degree βj (Equation (8)):
                βj = μ [ ∏_{k=1}^{L} ( ωk βj,k + 1 − ωk Σ_{i=1}^{N} βi,k )
                         − ∏_{k=1}^{L} ( 1 − ωk Σ_{i=1}^{N} βi,k ) ]
                     / [ 1 − μ ∏_{k=1}^{L} ( 1 − ωk ) ]
16      end
17      Calculate the sub-model result (Equation (10)): result = Σ_{j=1}^{N} u(Dj) · βj
18      return result

19  function Optimize_Parameter(training_data)
20      Define the objective function: ξ(P) = (1/T) Σ_{m=1}^{T} ( ym − ŷm )²
21      Constraints: 0 ≤ βj,k ≤ 1, 0 ≤ θk ≤ 1, 0 ≤ δi ≤ 1
22      Use fmincon to minimize ξ(P) subject to the constraints
23      return optimized parameters P = {βj,k, θk, δi}
```
Symbol | Description
---|---
xi | Input feature
Ai | Set of reference values for xi
M | Number of input features
Rk | k-th rule in the belief rule base
Dj | j-th possible result in the belief distribution
βj,k | Belief degree to Dj for the k-th rule
δi | Weight of the i-th feature indicator
θk | Weight of the k-th rule in the belief rule base
ωk | Activation weight of the k-th rule
αi,k | Matching degree of the i-th attribute in the k-th rule
L | Total number of rules
N | Total number of possible results
μ | Normalization factor in the ER algorithm
βj | Combined belief degree to Dj
ŷ | Final credibility score
ξ(P) | Objective function for parameter optimization
ym | Actual credibility score for the m-th training sample
ŷm | Predicted credibility score for the m-th training sample
P | Set of parameters to be optimized (δi, θk, βj,k)
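To make the inference in Algorithm 1 concrete, the following minimal NumPy sketch implements one sub-model (Process_SubModel): matching a crisp input to its reference values, computing activation weights, and fusing the activated rules with the analytic ER algorithm. All reference values, weights, belief degrees, and grade utilities here are illustrative assumptions, not the trained parameters reported in Section 4.

```python
import numpy as np

GRADES = ["F", "P", "U"]              # fully credible, partly credible, unbelievable
UTIL = np.array([1.0, 0.5, 0.0])      # assumed utilities u(Dj) of the grades

def matching_degrees(x, refs):
    """Distribute a crisp input x over its ordered reference values Ai."""
    a = np.zeros(len(refs))
    if x <= refs[0]:
        a[0] = 1.0
    elif x >= refs[-1]:
        a[-1] = 1.0
    else:
        j = np.searchsorted(refs, x, side="right") - 1
        a[j] = (refs[j + 1] - x) / (refs[j + 1] - refs[j])
        a[j + 1] = 1.0 - a[j]
    return a

def activation_weights(alpha, theta, delta):
    """Equation (7): omega_k proportional to theta_k * prod_i alpha_{i,k}^delta_i."""
    s = theta * np.prod(alpha ** delta[None, :], axis=1)
    return s / s.sum()

def er_fuse(w, beta):
    """Equation (8): analytic ER combination; beta has shape (N grades, L rules)."""
    N = beta.shape[0]
    total = beta.sum(axis=0)                        # sum_j beta_{j,k} per rule
    term = w[None, :] * beta + 1.0 - (w * total)[None, :]
    prod_term = term.prod(axis=1)
    d = (1.0 - w * total).prod()
    mu = 1.0 / (prod_term.sum() - (N - 1) * d)      # normalization factor
    return mu * (prod_term - d) / (1.0 - mu * np.prod(1.0 - w))

# Toy sub-model: 2 attributes with 3 reference levels each -> 9 rules.
refs = [np.array([0.0, 0.5, 1.0]), np.array([0.0, 0.5, 1.0])]
delta = np.array([1.0, 0.8])                        # attribute weights
theta = np.ones(9)                                  # rule weights
rng = np.random.default_rng(0)
beta = rng.dirichlet(np.ones(3), size=9).T          # belief degrees (3 x 9)

x = [0.7, 0.2]                                      # one input sample
levels = [matching_degrees(xi, r) for xi, r in zip(x, refs)]
# Matching degrees of the antecedents of each rule (9 rules x 2 attributes).
alpha = np.array([[levels[0][i], levels[1][j]]
                  for i in range(3) for j in range(3)])

w = activation_weights(alpha, theta, delta)
b = er_fuse(w, beta)
score = float(UTIL @ b)   # Equation (10): expected-utility credibility score
print(dict(zip(GRADES, b.round(3))), round(score, 3))
```

In the full hierarchy, the scalar score produced by each lower-level sub-model (B1, B2, B3) would then be fed as an input attribute to BRB-4.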
3.2.1. Framework
The HBRB model (Figure 4) adopted a top-down hierarchical inference process to finally produce the results. The hierarchical structure allowed for a nuanced assessment of credibility. Lower-level sub-models (BRB-1, BRB-2, BRB-3) processed individual modality features, producing intermediate credibility scores for each modality. These scores then flowed into the higher-level sub-model (BRB-4), which integrated the modality-specific assessments to produce a final credibility score. This structure enabled the model to capture both modality-specific and cross-modal credibility indicators.
3.2.2. Inference Process
The rule activation process began by matching the input data against the reference values in each rule. The degree of activation of each rule was then calculated from these matching degrees and the attribute weights: δi represented the weight of each feature indicator in a rule, and θk the weight of that rule in the belief rule base. Rules with higher degrees of activation had a greater influence on the final credibility assessment. The results obtained by rule activation and fusion were sent to the next sub-model as input, forming the layers of inference, as shown in Figure 5.
[Figure 5: Layered rule activation and fusion in the inference process.]
3.2.3. Parameter Optimization
During the inference process, various parameters of the BRB model, such as the attribute weights δi, the rule weights θk, and the belief levels βj,k, played a crucial role in the accuracy of the assessment results. The attribute weight δi influenced the importance of each input feature. A higher δi value indicated that the corresponding feature had a greater impact on the rule activation and subsequent credibility assessment. The rule weight θk determined the significance of each rule in the BRB approach. Rules with higher θk values had a stronger influence on the final credibility assessment. The belief level βj,k distributed the credibility results over all possible outcomes. It represented the degree of belief that the k-th rule pointed to the j-th assessment grade, allowing for a nuanced representation of uncertainty in the assessment. Inaccuracies in the parameters directly affected the inference performance of the model. To improve the performance of the model, we built an optimization model (Figure 6) to find the optimal δi, θk, and βj,k.
[Figure 6: The parameter optimization model.]
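The study minimizes ξ(P) with MATLAB's fmincon; as a hedged illustration of the same setup in Python, the sketch below optimizes a toy sub-model's parameters with scipy.optimize.minimize under the box constraints of Algorithm 1. The matching degrees, labels, and the per-rule renormalization of βj,k are placeholder assumptions.

```python
import numpy as np
from scipy.optimize import minimize

L_RULES, N_GRADES, M_ATTRS = 6, 3, 2       # toy sub-model dimensions
UTIL = np.array([1.0, 0.5, 0.0])           # assumed utilities of grades F, P, U

def er_fuse(w, beta):
    """Analytic ER combination of L activated rules (beta: N x L)."""
    total = beta.sum(axis=0)
    term = w[None, :] * beta + 1.0 - (w * total)[None, :]
    prod_term = term.prod(axis=1)
    d = (1.0 - w * total).prod()
    mu = 1.0 / (prod_term.sum() - (N_GRADES - 1) * d)
    return mu * (prod_term - d) / (1.0 - mu * np.prod(1.0 - w))

def predict(p, alpha):
    """Unpack P = {delta, theta, beta}, activate the rules, fuse, and score."""
    delta = p[:M_ATTRS]
    theta = p[M_ATTRS:M_ATTRS + L_RULES]
    beta = p[M_ATTRS + L_RULES:].reshape(N_GRADES, L_RULES)
    beta = beta / beta.sum(axis=0, keepdims=True)  # simplification: renormalize beliefs
    s = theta * np.prod(alpha ** delta[None, :], axis=1)
    w = s / s.sum()
    return UTIL @ er_fuse(w, beta)

def xi(p, alphas, y):
    """Objective xi(P): mean squared deviation between outputs and labels."""
    preds = np.array([predict(p, a) for a in alphas])
    return float(np.mean((y - preds) ** 2))

rng = np.random.default_rng(1)
alphas = rng.random((30, L_RULES, M_ATTRS))    # placeholder matching degrees
y = rng.random(30)                             # placeholder credibility labels
p0 = rng.uniform(0.2, 0.8, M_ATTRS + L_RULES + N_GRADES * L_RULES)

res = minimize(xi, p0, args=(alphas, y), method="SLSQP",
               bounds=[(1e-3, 1.0)] * p0.size)  # 0 <= delta, theta, beta <= 1
print(res.fun)                                  # optimized xi(P)
```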
4. Experiments
4.1. Dataset
The experimental data used in this research came from the Chinese Weibo information dataset provided by the Beijing Municipal Bureau of Economy and Informatization. The dataset contained multimodal information collected by using the Weibo ID. The final filtered dataset contained 1102 true entries and 1327 false ones.
4.2. Selection of Features
To select the most suitable features for assessing the credibility of information on the Weibo platform, and to reduce the number of rules and thereby mitigate the rule combination explosion problem, we calculated the correlation between the 14 previously extracted features (described in Tables 2, 3, and 4) and credibility using Spearman's rank correlation coefficient. As a result, nine highly relevant features were selected: (1) as text features, F1: Comment_emotion, F2: Text_abnormality, F3: Text_Exc_count, and F4: Comment_abnormality; (2) as image features, F5: Similarity_ocr, F6: Img_num, and F7: Size; and (3) as user features, F8: Is_certification and F9: Influence. The results of the analysis are shown in the heat map of Figure 7.
[Figure 7: Heat map of the Spearman correlation analysis between features and credibility.]
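As a minimal sketch of this screening step (the DataFrame, its column names, and the synthetic data below are placeholders for the real 14-feature table), Spearman's rank correlation can be computed per feature and the nine strongest features retained:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def rank_features(df: pd.DataFrame, label: str = "credibility"):
    """Rank candidate features by |Spearman's rho| against the label."""
    rows = []
    for col in df.columns.drop(label):
        rho, p = spearmanr(df[col], df[label])
        rows.append((col, rho, p))
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)

# Synthetic stand-in for the feature table; the real names are in Tables 2-4.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 3)), columns=["Influence", "Img_num", "Size"])
df["credibility"] = rng.integers(0, 2, 100)

selected = [name for name, rho, p in rank_features(df)[:9]]  # study kept nine
```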
Two user features, “Influence” and “Is_certification,” showed a strong correlation with credibility. Our results confirmed that authenticated users or users with a large number of followers tended to be more concerned about their reputation and, subsequently, were more careful when posting information.
The three image features “Img_num,” “Size”, and “Similarity_ocr” also presented a positive correlation with credibility. We found that most of the real information contained multiple images for a piece of news which reflected the authenticity of the content from multiple perspectives. Because these images had only been published once, they presented high resolution (high pixel value images) as they had not been compressed by copying and forwarding. In addition, users wrote the relevant text in the image to form an infographic and also repeated the relevant text in the content to make true information easier to share, resulting in similar OCR values from the extracted text and the actual content.
There was also a positive correlation between the “Comment_emotion” and credibility. We detected that truthful information tended to be positive while false information, on the other hand, was inflammatory causing negative emotions among netizens.
The features "Text_Exc_count," "Text_abnormality," and "Comment_abnormality" each showed a negative correlation with credibility. False information was often accompanied by "!" to aggravate the tone, resulting in some keywords appearing very often. In addition, users had a certain ability to identify dubious information, questioning it or using critical words such as "rumour," "fake," and "disbelief."
4.3. Baselines and Evaluation Metrics
The output of the HBRB-MICA model was a credibility score, a value ranging from 0 to 1. The information extracted from the public dataset used was labelled as true or false (1 or 0), so we could classify the news with scores larger than 0.5 as true information and samples with scores lower than 0.5 as false information.
We compared HBRB-MICA against the following baseline models:

- The support vector machine (SVM) [52] is an effective classification model. Its core idea is to achieve segmentation of the data by constructing a separating hyperplane. We used an SVM with a Gaussian (RBF) kernel.
- The back propagation neural network (BPNN) [65] achieves an efficient mapping from the input space to the output space by building a hierarchical structure that extracts and learns key features in the data layer by layer. The BPNN in this experiment included an input layer, one hidden layer, and an output layer.
- The decision tree (DT) [23] is an intuitive and easy-to-explain machine learning model whose core idea is to recursively partition the data by constructing a series of simple decision rules to predict the target variable.
- Fake news detection by semantic correlations between text content and images (FND-SCTI) [66] learns image and text features through VGG-19 and hierarchical attention mechanisms, respectively, and uses variational autoencoders to learn shared latent representations of text and images.
- Cross-modal attention residual and multichannel convolutional neural networks (CARMN) [67] fuses a cross-modal attention residual network (CARN) and a multichannel convolutional neural network (MCN) to selectively extract the information related to a target modality from another source modality.
- CAFE [4] is an ambiguity-aware multimodal method for fake news detection that uses the Kullback–Leibler (KL) divergence to quantify the degree of ambiguity between different modalities. CAFE adaptively aggregates unimodal features and cross-modal correlations to improve detection accuracy.
Three of these models are machine learning models (SVM, BPNN, and DT) and the other three are deep learning models (FND-SCTI, CARMN, and CAFE). We used accuracy, precision, recall, and F1 score as evaluation metrics [20].
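A minimal sketch of the evaluation step, assuming an array of HBRB-MICA credibility scores and 0/1 ground-truth labels and using scikit-learn's standard metric functions with the 0.5 threshold described above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(scores, labels, threshold=0.5):
    """Binarize credibility scores at the threshold and compute the four metrics."""
    preds = (np.asarray(scores) > threshold).astype(int)
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision_score(labels, preds),
        "recall": recall_score(labels, preds),
        "f1": f1_score(labels, preds),
    }

# Example: evaluate([0.94, 0.03, 0.61], [1, 0, 1])
```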
4.4. Results of the HBRB-MICA Optimization Process and of the Comparison Experiments
In this paper, 60% of the experimental data were sent to the sub-models BRB-1, BRB-2, and BRB-3 for training, and the remaining 40% were used for parameter optimization, of which 80% were used for training and 20% for testing. The optimized model achieved a better credibility score: the RMSE of the output of the initial model was 0.5463 and that of the optimized model was 0.2523, a 53.8% reduction achieved through optimization (Figure 8).
[Figure 8: Output results of the initial and optimized models.]
Tables 6, 7, 8, and 9 presented selected rules and weights for the optimized sub-models BRB-1, BRB-2, BRB-3, and BRB-4, respectively; the complete tables were included in the appendix. In these tables, the levels of credibility were classified as fully credible (F), partly credible (P), and unbelievable (U).
BRB-1 rule | Weight | F1 | F2 | F3 | F4 | Fully credible (F) | Partly credible (P) | Unbelievable (U)
---|---|---|---|---|---|---|---|---
1 | 0.5243 | Pos | Low | Low | Low | 0.4740 | 0.3073 | 0.2187
2 | 0.5474 | Pos | Low | Low | Med | 0.2203 | 0.3028 | 0.4769
3 | 0.4219 | Pos | Low | Low | High | 0.3262 | 0.3311 | 0.3427
4 | 0.5017 | Pos | Low | Med | Low | 0.2790 | 0.3317 | 0.3893
… | … | … | … | … | … | … | … | …
41 | 0.8821 | Neg | High | Low | Med | 0.0266 | 0.0670 | 0.9063
… | … | … | … | … | … | … | … | …
54 | 0.4623 | Neg | High | High | High | 0.3296 | 0.3323 | 0.3381
BRB-2 rule | Weight | F5 | F6 | F7 | Fully credible (F) | Partly credible (P) | Unbelievable (U)
---|---|---|---|---|---|---|---
1 | 0.0227 | Low | Low | Low | 0.3551 | 0.3413 | 0.3036
2 | 0.0002 | Low | Low | Med | 0.3459 | 0.3500 | 0.3041
3 | 0.5729 | Low | Low | High | 0.4481 | 0.2942 | 0.2577
4 | 0.4112 | Low | Med | Low | 0.3277 | 0.3374 | 0.3349
… | … | … | … | … | … | …
27 | 0.5123 | High | High | High | 0.3384 | 0.3384 | 0.3384
BRB-3 rule | Weight | F8 | F9 | Fully credible (F) | Partly credible (P) | Unbelievable (U)
---|---|---|---|---|---|---
1 | 0.6883 | False | Low | 0.0159 | 0.0313 | 0.9528
2 | 0.3635 | False | Med | 0.1926 | 0.0578 | 0.7496
3 | 0.1702 | False | High | 0.3825 | 0.3519 | 0.2656
4 | 0.4795 | True | Low | 0.0430 | 0.0304 | 0.9266
5 | 0.4652 | True | Med | 0.4709 | 0.5190 | 0.0101
6 | 0.3646 | True | High | 0.9803 | 0.0136 | 0.0061
BRB-4 rule | Weight | B1 | B2 | B3 | Fully credible (F) | Partly credible (P) | Unbelievable (U)
---|---|---|---|---|---|---|---
1 | 0.3605 | F | F | F | 0.8555 | 0.1026 | 0.0419
2 | 0.7610 | F | F | P | 0.4526 | 0.3067 | 0.2407
3 | 0.4223 | F | F | U | 0.4272 | 0.3206 | 0.2522
4 | 0.3116 | F | P | F | 0.5541 | 0.2692 | 0.1767
… | … | … | … | … | … | … | …
26 | 0.0239 | U | U | F | 0.3323 | 0.3326 | 0.3351
27 | 0.0061 | U | U | U | 0.3319 | 0.3328 | 0.3352
Table 6 presented four features: F1 (Comment_emotion), F2 (Text_abnormality), F3 (Text_Exc_count), and F4 (Comment_abnormality). F1 contained two sentiment levels (positive and negative), and the others included three levels (low, medium, and high). Fifty-four rules were obtained by combining the different states of the four features. These rules were learned and optimized through model training to obtain the rule weights and the probabilities of the three credibility levels (fully credible, partly credible, and unbelievable) when each rule was triggered (Supporting Table 1 in the appendix).
Similarly, Table 7 indicated that the BRB-2 sub-model contained three image features, F5 (Similarity_ocr), F6 (Img_num), and F7 (Size), each with three levels (low, medium, and high), yielding a total of 27 rules (Supporting Table 2 in the appendix).
Table 8 indicated that the BRB-3 sub-model contained two user features and a total of six rules: F8 (Is_certification) included two states (true and false), and F9 (Influence) comprised three levels (low, medium, and high).
Finally, Table 9 showed the rules of sub-model BRB-4, whose inputs were the evaluation results of the three lower-level sub-models (B1, B2, B3), with the corresponding result types (fully credible, partly credible, and unbelievable) and 27 rules (Supporting Table 3 in the appendix).
To validate our model, we selected both real and fake information samples from the test set (Table 10); their extracted feature values are shown in Table 11.
Information | Publisher | Description | Original text |
---|---|---|---|
Real | @Tonight newspaper | The rumor monger was caught | “Shouguang public security captured two suspects spreading plague rumors in violation of the law at 11:05 on August 25…” |
Fake | @Mnekane-wang | Rumor about missing persons | “Missing person notice. There are clues and a reward of 100,000 yuan to help spread…” |
News | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | B |
---|---|---|---|---|---|---|---|---|---|---|
Real | Negative | 1.28 | 0 | 0.0534 | 0.90 | 4 | 276,683 | True | 3844 | 0.94 |
Fake | Positive | 4.77 | 0 | 0.0348 | 0.16 | 1 | 25,600 | False | 9 | 0.03 |
When we input the feature parameters X = {Fi, i = 1, 2, 3, 4} of the fake information into the BRB-1 sub-model, it activated the rules w33 = 0.0082, w34 = 0.0732, w39 = 0.0368, and w41 = 0.8818 through Equation (7). Then, based on Equation (8), the model performed rule fusion and calculated the result B1 = 0.2440 (U) using Equation (10). Notably, activated rule 41, which carried the highest weight, indicated negative sentiment in the comments, a high degree of text abnormality, a low frequency of exclamation marks "!," and a medium degree of comment abnormality (Table 6). This suggested that it was difficult to determine the authenticity of the information from the text features alone.
When we input the feature parameter X = {Fi, i = 5, 6, 7} into the BRB-2 sub-model, it activated the rules w1 = 0.0082, w2 = 0.00002, w10 = 0.0014, and w11 = 0.0005, finally obtaining B2 = 0.0628 (U). Here, the activated rule 1, with the highest weight, pointed to the poor graphic semantic consistency in the image features, a low number of images, and a low image quality, indicating potential false information in the image.
Inputting the feature parameter X = {Fi, i = 8, 9} into the sub-model BRB-3 activated the rules w1 = 0.9848, w2 = 0.0152, obtaining B3 = 0.0600 (U). The activated rule 1 presented the highest weight, indicating that the publisher was unverified and less influential, suggesting lower credibility of the published content. Combining the assessment results of all modalities into the BRB-4 sub-model gave a result of 0.0290, indicating extremely low credibility for this news (Table 11).
When assessing real information, the activation rules and weights obtained by BRB-1 through Equation (7) were w2 = 0.3432, w3 = 0.0010, w12 = 0.6542, w13 = 0.0017 (Supporting Table 1 in appendix). The activation rules and weights obtained by BRB-2 were w15 = 0.0039, w16 = 0.0006, w17 = 0.0850, w18 = 0.0043, w24 = 0.5942, w25 = 0.0036, w26 = 0.3067, and w27 = 0.0018 (Supporting Table 2 in appendix). The activation rules and weights obtained by BRB-3 were w4 = 0.3805 and w5 = 0.6195 (Table 8). The activation rules were fused by obtaining activation rules for each model based on Equation (8). The results obtained by the sub-models BRB-1, BRB-2, and BRB-3 based on Equation (10) were 0.5724 (P), 0.8301 (F), and 0.7587 (F), respectively. The rules activated by each sub-model collectively indicated that the real information was characterized by positive sentiment in comments, a low degree of text anomaly, low proportion of “!,” low comment disagreement, large number of images, high image quality, and that the publisher was verified and influential. When processed through the BRB-4 sub-model, a very high final credibility level of 0.9434 was achieved (Supporting Table 3 in appendix).
In the inference process, the model learned the rules and weights activated by the features of the different modalities, so that information could be classified as fully credible, partly credible, or unbelievable. HBRB hierarchically explained the low credibility of certain information through rule activation, providing detailed interpretability and increasing the transparency of the assessment process. This may facilitate its integration into automated information assessment systems, allowing users to comprehend and trust the system's judgments. Such a system could also benefit from timely human-machine feedback, enhancing overall performance.
The comparison results (shown in the table below) indicated the following:

1. The HBRB-MICA model obtained the highest accuracy, precision, and F1 value of all compared models; only its recall was slightly lower than that of the SVM model. This demonstrated the effectiveness of the HBRB-MICA model in assessing information credibility.
2. Although the SVM and BPNN models obtained good accuracy results, their inference process was inexplicable, and thus their modeling process was not interpretable. In contrast, the HBRB model presented a transparent inference engine that could reasonably explain the output results, enabling verification of the assessment results. This made the HBRB-MICA model more explanatory and reliable.
3. FND-SCTI, CARMN, and CAFE performed effectively on fake news detection (F1 = 0.757, 0.756, and 0.842, respectively), but their results were slightly lower than those of HBRB-MICA. This was because the process of acquiring image features and fusing them with text features in deep learning models led to distortion, which reduced their ability to combine the two modalities to determine the credibility of a message. In addition, all of these models ignored users' account features. Our method effectively addressed these problems by identifying the textual semantic information in images, calculating its ambiguity with respect to the text, and assessing credibility through the integration of multiple features, which effectively overcame the semantic feature distortion problem.
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
SVM | 0.903 | 0.808 | 0.920 | 0.861 |
BPNN | 0.904 | 0.842 | 0.891 | 0.866 |
Decision tree | 0.895 | 0.828 | 0.870 | 0.848 |
FND-SCTI | 0.758 | 0.757 | 0.757 | 0.757 |
CARMN | 0.741 | 0.762 | 0.750 | 0.756 |
CAFE | 0.840 | 0.855 | 0.830 | 0.842 |
HBRB-MICA | 0.917 | 0.885 | 0.911 | 0.898 |
- Note: Bold values represent the best values.
5. Discussion and Conclusions
In this study, we presented a model for assessing the credibility of social networking content leveraging multimodal features. Through a thorough review of the related literature, we identified and selected key feature indicators that influence the credibility of information from three distinct modalities: text, images, and user profiles. We employed the HBRB inference method for credibility assessment. This method effectively circumvented the "rule combination explosion" problem, as features with low relevance were removed, diminishing the impact of irrelevant features on the results. To date, this is the first time that HBRB has been used for assessing multimodal social media credibility.
The hierarchical structure of the HBRB-MICA approach enables the presentation of the internal assessment process, providing a more transparent and controllable method. In online information governance, this method also allows for integration with expert knowledge or experience, as it can be refined by adjusting the knowledge rules, adding significant practical value and enhancing the interpretability of its results.
While this study presents a novel approach to assessing multimodal credibility in social media using the HBRB-MICA model, some limitations remain to be addressed. The current model relies heavily on the quality and comprehensiveness of the initial feature selection, which may not capture all nuances of credibility across different social media platforms or cultures. Future work should expand the feature set (including video, speech, author tags, image similarity, and image modification features, among others) to enrich the validity and interpretive power of the model. Multimodal feature fusion using a multilevel feature fusion network (MFFN) should also be explored to optimize the hierarchical structure of HBRB [68]. Finally, this method could be integrated with LLM technology in real-time social media monitoring systems to further improve the interpretability of the model.
Conflicts of Interest
The authors declare no conflicts of interest.
Funding
This paper was supported by the National Natural Science Foundation of China (project numbers 72274096, 71774084, 72301136, and 72174087) and by the Foreign Cultural and Educational Expert Program of the Ministry of Science and Technology of China (G2022182009L).
Supporting Information
The supporting information file contains three tables with the rules and weights of the trained sub-models of the HBRB model. Supporting Table 1 lists the rules and weights of sub-model BRB-1 and supports Table 6. Supporting Table 2 lists the rules and weights of sub-model BRB-2 and supports Table 7. Supporting Table 3 lists the rules and weights of sub-model BRB-4 and supports Table 9.
Open Research
Data Availability Statement
The data used to support the findings of this study are available on request from the corresponding author.