Medical Image Fusion Using Unified Image Fusion Convolutional Neural Network
Abstract
Medical image fusion (IF) is the process of registering and combining multiple images from single- or multiple-imaging modalities to enhance image quality and reduce randomness and redundancy, thereby increasing the clinical applicability of medical images for diagnosing and evaluating clinical issues. The additional information obtained from fused images can be effectively employed for highly accurate localization of abnormalities. Since different kinds of images convey different information, generating fused images is complicated for conventional methods. Here, a unified image fusion convolutional neural network (UIFCNN) is designed for IF using medical images. To execute the IF process, two input images, namely, native T1 and T2 fluid-attenuated inversion recovery (T2-FLAIR), are taken from a dataset. The input T1 image is preprocessed employing a bilateral filter (BF) and segmented by a recurrent prototypical network (RP-Net) to obtain segmented output-1. Simultaneously, the input T2-FLAIR image is also preprocessed by BF and then segmented using RP-Net to acquire segmented output-2. The two segmented outputs are fused utilizing the UIFCNN, which is introduced by assimilating the unified and unsupervised end-to-end IF network (U2Fusion) with the IF framework based on CNN (IFCNN). In addition, the UIFCNN obtained a maximal Dice coefficient of 0.928 and Jaccard coefficient of 0.920, as well as a minimal mean square error (MSE) of 0.221.
1. Introduction
With the growth of medical imaging technologies, clinical images are becoming increasingly significant in the medical diagnostic process. Medical images of diverse modalities have both merits and demerits [1]. As an essential and effective supplementary tool, medical images play an increasingly crucial part in advanced medical diagnosis and treatment. Nevertheless, medical images from a single modality are generally not capable of providing adequate data to meet the requirements of complicated diagnosis owing to limitations of the imaging process [2, 3]. Thus, a single form of medical image cannot reflect pathological tissue data accurately and completely [1]. Multimodal medical image fusion (MMIF) can integrate complementary data from the source images and offer the necessary data in a single fused image. Nowadays, it is medically imperative, as general practitioners can access more comprehensive data regarding disease-related variations, thereby providing patients with more accurate clinical treatment and support. Multimodal medical images provide data from diverse views, thus supporting doctors in confirming diagnoses and making decisions concerning future treatments [4]. Currently, IF provides an efficacious solution for various medical image processing research studies. It automatically identifies the data in diverse images and incorporates them to produce a single combined image in which every key element is obvious [5].
IF is a popular method in image processing that can be employed for incorporating two or more sources of data into one frame [6], which can support the performance of succeeding computer vision tasks [6, 7]. Analytical and visual image quality is enhanced by incorporating various images [8]. An effective IF method can protect vital data by extracting every part of the significant information from the images and avoiding inconsistencies in the final image. After fusion, the fused image is highly appropriate for machine and human perception [8]. MMIF is a robust technique in medical diagnosis and treatment planning. Multimodal fusion can incorporate images acquired from several modalities into a single image. In medical applications, fusion schemes are normally utilized for mapping a functional image onto a structural image [9]. Presently, deep learning (DL) has achieved major breakthroughs in several computer vision and image processing problems such as segmentation, super-resolution, and classification. In the domain of IF, DL-based research has also become the most active theme in the past decades. Various DL-enabled IF techniques have been designed for digital photography [10]. These techniques are capable of extracting features from input images and producing a fused image with the required information [4].
The primary goal is to introduce the UIFCNN for IF utilizing medical images. IF is specified as the combination of salient data from various modalities employing several mathematical operations to generate a single composite image. In this research, two input images, namely, T1 and T2 fluid-attenuated inversion recovery (T2-FLAIR), are considered. Firstly, the input T1 image is obtained and preprocessed utilizing BF. After that, segmentation of the filtered image is done by the RP-Net, and thus segmented output-1 is attained. Simultaneously, the input T2-FLAIR image is preprocessed utilizing BF. Then, image segmentation is accomplished by the RP-Net, and therefore segmented output-2 is obtained. The two outputs, namely, segmented output-1 and segmented output-2, are fed to the IF stage. IF is conducted to fuse both images utilizing the UIFCNN, which is designed by merging U2Fusion with the IFCNN.
- •
Proposed UIFCNN for IF using medical images: This study introduces the UIFCNN, a novel deep-learning architecture specifically designed for medical IF
- •
The UIFCNN combines the strengths of the U2Fusion and the IFCNN
- •
The few-shot segmentation model (RP-Net) is used for the segmentation of images
The remainder of the paper is organized as follows: Section 2 presents the literature overview along with its shortcomings, Section 3 describes the UIFCNN methodology, Section 4 interprets the UIFCNN outcomes, and Section 5 presents the conclusion of the UIFCNN.
2. Motivation
Owing to various challenges with IF, suitable, reliable, and accurate fusion methods are needed for diverse sorts of images in various areas to obtain the best outcomes. This motivated the development of a new technique for IF. In this part, the methods taken for review and their shortcomings are discussed.
2.1. Literature Survey
Duan et al. [3] designed a genetic algorithm (GA) for the fusion of medical images. It attained excellent performance in objective estimation and visual effects. Moreover, segmentation was highly precise, and comprehensive information was maintained in the fusion outcomes, but this method was less effective as it took more running time. Zhou, Xu, and Zhang [4] presented a dilated residual attention network (DILRAN) for medical image fusion tasks. It had a rapid convergence speed and could extract multiscale deep semantic features, even though it failed to perform well in a few corner cases. Shehanaz et al. [11] introduced optimum weighted average fusion-based particle swarm optimization (OWAF-PSO) for fusing medical images to enhance the performance of multimodal mapping. This model lessened computational time, though it did not design registration approaches for multicentral applications. Rajalingam et al. [10] devised non-subsampled contourlet transform and guided filtering (NSCT-GIF) for the fusion of medical images. It offered information supportive for understanding and visualizing diseases. However, this method failed to employ the high potential of DL approaches for IF. Maseleno et al. [12] established a deep convolutional neural network (DCNN)-based ensemble model for IF. Here, a number of DCNNs extracted high-level features, which were then combined with the help of the fusion model based on the weighted averaging concept. It exhibited enhanced visual and feature representation quality; however, the evaluation was difficult for large-scale datasets. Ghosh and Jayanthi [13] established a Dense-ResNet framework for medical IF. Here, the different modality images were preprocessed by a median filter, and segmentation was performed by an edge-attention guidance network (ET-Net). The performance was superior, but the complexity was high.
2.2. Challenges
- •
The GA presented by the authors in [3] was utilized for obtaining superior fusion outcomes. However, it did not consider fast postprocessing techniques for improving multimodal clinical IF performance.
- •
In [4], the DILRAN introduced for IF provided a consistent reference for diagnosing disease in real-world medical routine, but it failed to concentrate on decreasing noise while maintaining complete fusion quality.
- •
At present, IF is an effective solution for detecting information from images in an automated manner, even though it suffers from decomposition grade selection, the choice of decomposition types in real-world applications, space and time consumption, as well as loss of contrast.
3. Proposed UIFCNN for IF Using Medical Images
IF is utilized to recover crucial information from a group of images and provides a single image that is more informative and supportive than any of the input images. IF enhances the applicability and quality of the data. Here, the UIFCNN is devised for IF using medical images, namely, T1 and T2-FLAIR. Firstly, the input T1 image is preprocessed by BF and segmented employing the RP-Net; thus, segmented output-1 is achieved. On the other side, the input T2-FLAIR image is preprocessed utilizing BF, and the filtered image is segmented by the RP-Net; therefore, segmented output-2 is obtained. Then, the two outputs are fused in the IF phase, which is accomplished by the UIFCNN. The UIFCNN is devised by integrating U2Fusion with the IFCNN, as sketched below. Figure 1 reveals the pictorial depiction of the UIFCNN for IF using medical images.
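The overall flow can be summarized in the following minimal sketch, in which every callable is a hypothetical placeholder for the corresponding stage of Figure 1 rather than released code.

```python
# Illustrative pipeline of the UIFCNN workflow. The four callables passed in
# are hypothetical placeholders for the stages of Figure 1 (BF preprocessing,
# RP-Net segmentation, IFCNN fusion, and U2Fusion refinement).

def fuse_t1_t2flair(t1_image, t2flair_image,
                    bilateral_filter, rp_net_segment,
                    ifcnn_fuse, u2fusion_fuse):
    # Stage 1: preprocessing with the bilateral filter (BF)
    t1_filtered = bilateral_filter(t1_image)         # filtered T1 (Af)
    fl_filtered = bilateral_filter(t2flair_image)    # filtered T2-FLAIR (Ef)

    # Stage 2: few-shot segmentation with RP-Net
    seg_out_1 = rp_net_segment(t1_filtered)          # segmented output-1 (Rf)
    seg_out_2 = rp_net_segment(fl_filtered)          # segmented output-2 (Pf)

    # Stage 3: fusion with the UIFCNN (IFCNN followed by U2Fusion)
    ifcnn_out = ifcnn_fuse(seg_out_1, seg_out_2)            # If
    return u2fusion_fuse(seg_out_1, seg_out_2, ifcnn_out)   # fused image Ff
```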

3.1. Acquisition of Input Images
The input images considered for fusion, namely, the native T1 and T2-FLAIR scans, are acquired from the BraTS2020 dataset [14] described in Section 4.2 and are signified as Tf and Lf.
3.2. Preprocessing of Input Images Utilizing BF
The preprocessing phase has a direct impact on the quality of the resultant images. Here, BF is employed to preprocess the input T1 and T2-FLAIR images, signified as Tf and Lf, since it smooths noise while preserving edges.
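As a concrete example, the BF step can be realized with OpenCV; the diameter and sigma values in the sketch below are illustrative choices, not parameters reported in this work.

```python
import cv2
import numpy as np

def preprocess_with_bf(image: np.ndarray) -> np.ndarray:
    """Apply a bilateral filter to a single MRI slice.

    The filter smooths homogeneous regions while preserving edges,
    which keeps tumour boundaries intact for the later segmentation.
    """
    # cv2.bilateralFilter expects 8-bit or 32-bit input, so rescale first.
    img = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # d=9 (neighbourhood diameter), sigmaColor=75, sigmaSpace=75 are common
    # illustrative settings, not values taken from this paper.
    return cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)
```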
3.3. Image Segmentation Utilizing RP-Net
Image segmentation is a vital task in various computer vision and image processing applications. The intention of image segmentation is to partition an image into diverse regions according to certain conditions for further processing. To conduct image segmentation, the filtered T1 and T2-FLAIR images, indicated by Af and Ef, are given as input.
3.3.1. Structure of RP-Net
- (i)
Feature encoder: This component extracts features from the query and support images. The input to the network is a group of M support images zn ∈ R^(T×E×1) and a query image zp ∈ R^(T×E×1), padded to the same height T and width E. The query and support images are initially aligned utilizing an affine transformation, which is a common step in various medical imaging tasks.
- (ii)
CRE: CRE utilizes correlation for enhancing context association features and forces the network to concentrate on the shape and context of the region of interest (ROI). In this component, correlation evaluation is applied for acquiring context association features between the background and foreground feature vectors at each spatial position (u, v) of Cβ and the position (u − ε, v − τ) with an offset (ε, τ).
- (iii)
Recurrent mask refinement module: This module iteratively employs the CRE and a prototypical network to recapture the variation of local contextual features and to refine the mask. The prototypical network computes a distance between the query feature vectors and the estimated prototypes Κ = {κη|η ∈ N}. Moreover, this module takes the support features Cn, query features Cp, and mask μb−1, utilizes the CRE to enhance the query features, and employs the prototypical network to produce the segmentation mask output μb, as given below.
μb = μsoft(−d(fς, Κ))

where fς indicates the augmented query feature, d(·, ·) denotes the distance between the query feature vectors and the prototypes in Κ, and μsoft denotes the softmax function applied to generate the output mask μb. The segmented T1 and T2-FLAIR images are denoted as Rf and Pf.
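A minimal sketch of the prototype-and-distance step of this refinement module is given below; it assumes masked average pooling for prototype construction and a squared Euclidean distance, which are common choices in prototypical segmentation networks rather than details reported above.

```python
import torch
import torch.nn.functional as F

def prototype_refine(query_feat, support_feat, support_mask):
    """One refinement step: build prototypes from the support features
    and classify each query pixel by its distance to the prototypes.

    query_feat:   (C, H, W) enhanced query features
    support_feat: (C, H, W) support features
    support_mask: (H, W) binary foreground mask
    """
    fg = support_mask.float()
    bg = 1.0 - fg
    # Masked average pooling -> one prototype per class (background/foreground)
    proto_fg = (support_feat * fg).sum(dim=(1, 2)) / fg.sum().clamp(min=1)
    proto_bg = (support_feat * bg).sum(dim=(1, 2)) / bg.sum().clamp(min=1)
    protos = torch.stack([proto_bg, proto_fg])           # (2, C)

    # Negative squared Euclidean distance at every query position
    q = query_feat.permute(1, 2, 0)                      # (H, W, C)
    dist = ((q.unsqueeze(2) - protos) ** 2).sum(-1)      # (H, W, 2)
    # Softmax over the two prototypes yields the refined foreground mask
    return F.softmax(-dist, dim=-1)[..., 1]
```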
3.4. IF Using UIFCNN
In the IF process, the segmented images from the different modalities are fused to obtain a superior visual appearance and richer information. Here, IF is conducted by the UIFCNN, which is introduced by merging U2Fusion with the IFCNN. The general outline of IF employing the UIFCNN is delineated in Figure 2. Initially, the segmented T1 and T2-FLAIR images Rf and Pf are fed to the IFCNN network. Then, both segmented outputs along with the output from the IFCNN are subjected to the U2Fusion network, wherein the fused image is acquired [17].

3.4.1. Structure of IFCNN
- (i)
FE module: Initially, two convolutional (conv) layers are adopted for extracting low-level features from the input images. Feature extraction (FE) is a critical process in transform-domain IF algorithms, and it is executed by processing the images with several filters. Conv1 is utilized for extracting effective image features, and therefore, the conv1 parameters are kept fixed during model training. Conv2 is added for tuning the conv features of conv1 so that they are suitable for feature fusion (FF).
- (ii)
FF module: The FF module fuses several kinds of input images, and it can also fuse an arbitrary number of input images. Here, element-wise fusion rules are employed for fusing the conv features of the multiple inputs, which can be computed by
C^Υ_F = fuse(C^Υ_1, C^Υ_2, …, C^Υ_X)

where C^Υ_X implies the Υth feature map of the Xth input image extracted by conv2, C^Υ_F denotes the Υth channel of the fused feature map produced by the FF module, and fuse symbolizes the element-wise fusion rule. An illustrative implementation of such a rule is sketched after this list.
- (iii)
IR module: As the FE module only comprises two conv layers, the abstraction level of the extracted conv features is not high. Thus, two further conv layers, conv3 and conv4, are utilized in the last phase of the network to reconstruct the fused image from the fused conv features C_F. The architecture of the IFCNN is illustrated in Figure 3, and the outcome of the IFCNN is If.
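As a concrete illustration of such an element-wise rule, the sketch below fuses the conv2 feature maps of an arbitrary number of inputs; the element-wise maximum shown here is only one example of a valid rule, and the tensor shapes are assumptions rather than the authors' exact configuration.

```python
import torch

def elementwise_fuse(features, rule="max"):
    """Fuse conv2 feature maps from an arbitrary number of inputs.

    features: list of tensors, each of shape (B, C, H, W), one per input image.
    rule:     element-wise fusion rule ('max', 'mean', or 'sum').
    """
    stacked = torch.stack(features, dim=0)   # (N_inputs, B, C, H, W)
    if rule == "max":
        return stacked.max(dim=0).values
    if rule == "mean":
        return stacked.mean(dim=0)
    return stacked.sum(dim=0)
```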
3.4.2. Structure of U2Fusion
- (i)
FE: The inputs are combined into a single-channel image Zf, which is duplicated into three channels. Thereafter, it is fed into the Visual Geometry Group-16 (VGG-16) network. The outputs of the conv layers prior to the max-pooling layers, along with their respective shapes, are taken as the feature maps for the succeeding information measurement.
- (ii)
Information measurement: For measuring the information contained in the extracted feature maps, gradients are employed for the computation. This can be mathematically modeled as
k_Zf = (1/5) Σ_(r=1)^(5) (1/L_r) Σ_(h=1)^(L_r) ‖∇D_r^h‖²_Q

where D_r^h indicates the feature map in the hth of the L_r channels produced by the conv layer prior to the rth max-pooling layer, ‖·‖_Q denotes the Frobenius norm, and ∇ refers to the Laplacian operator.
- (iii)
Information preservation degree: Two adaptive weights are allocated as information preservation degrees for preserving the information of the source images; they define the similarity weights between the source and fused images. To embody and enhance the variation in the weights, a predefined positive constant ξ is employed for scaling the values so that the weights are assigned more appropriately. Therefore, ϖ1 and ϖ2 can be represented by
[ϖ1, ϖ2] = softmax([kZf(1)/ξ, kZf(2)/ξ])

The softmax operation maps kZf(1)/ξ and kZf(2)/ξ to real numbers between 0 and 1, so that ϖ1 + ϖ2 = 1. The fused image acquired from U2Fusion is specified as Ff.
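The information measurement and preservation degrees described above can be sketched as follows; the Laplacian kernel, the handling of the feature maps, and the value of ξ are illustrative assumptions consistent with the description, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

# 3x3 Laplacian kernel used as the gradient operator.
LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def information_measure(feature_maps):
    """k_Zf: mean squared Frobenius norm of the Laplacian response of the
    VGG-16 feature maps taken before each max-pooling layer.

    feature_maps: list of tensors of shape (B, L_r, H_r, W_r).
    """
    total = 0.0
    for fmap in feature_maps:
        b, c, h, w = fmap.shape
        kernel = LAPLACIAN.to(fmap.device).repeat(c, 1, 1, 1)
        grad = F.conv2d(fmap, kernel, padding=1, groups=c)   # channel-wise Laplacian
        total = total + (grad ** 2).sum(dim=(1, 2, 3)) / (c * h * w)
    return total / len(feature_maps)                         # shape (B,)

def preservation_degrees(k1, k2, xi=2.0):
    """Adaptive weights [w1, w2] = softmax([k1/xi, k2/xi]); the scaling
    constant xi = 2.0 is illustrative, not a reported value."""
    return torch.softmax(torch.stack([k1, k2]) / xi, dim=0)
```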


4. Results and Discussion
The outcomes achieved by the UIFCNN for IF are explicated in this section.
4.1. Experimentation Setup
The UIFCNN devised for IF using medical images is experimentally implemented in Python version 3.9.11 on a Windows 10 OS with a 1.7 GHz CPU, 8 GB RAM, and more than 100 GB of storage. The parameter details of the UIFCNN are presented in Table 1.
Parameter | Value |
---|---|
Learning rate | 0.01 |
Batch size | 32 |
Epoch | 100 |
Loss function | MSE |
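A sketch of how the settings in Table 1 might be wired into a training loop is given below; the optimizer choice and the model and dataset objects are assumptions, since only the hyperparameters in the table are reported.

```python
import torch
from torch.utils.data import DataLoader

# Hyperparameters from Table 1; `model` and `train_set` are placeholders.
LEARNING_RATE = 0.01
BATCH_SIZE = 32
EPOCHS = 100

def train(model, train_set):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
    # Adam is an assumption; only the learning rate is reported in Table 1.
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    criterion = torch.nn.MSELoss()           # MSE loss as listed in Table 1
    model.to(device).train()
    for epoch in range(EPOCHS):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
```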
4.2. Dataset Description
The brain tumor segmentation (BraTS2020) dataset [14] contains multimodal scans available as NIfTI files (.nii.gz). This dataset comprises native (T1), postcontrast T1-weighted (T1Gd), T2-FLAIR, and T2-weighted (T2) volumes. The BraTS2020 dataset size is about 7 GB.
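For illustration, a BraTS2020 volume pair can be read and sliced as below; nibabel is assumed as the NIfTI reader, and the file names follow the dataset's typical naming pattern, given here only as hypothetical examples.

```python
import nibabel as nib
import numpy as np

# Hypothetical file names following the BraTS2020 naming pattern.
t1_vol = nib.load("BraTS20_Training_001_t1.nii.gz").get_fdata()
flair_vol = nib.load("BraTS20_Training_001_flair.nii.gz").get_fdata()

# Each volume is a 3-D array (e.g., 240 x 240 x 155 voxels); one axial
# slice per modality is taken as an input pair for the fusion pipeline.
mid = t1_vol.shape[2] // 2
t1_slice = t1_vol[:, :, mid].astype(np.float32)
flair_slice = flair_vol[:, :, mid].astype(np.float32)
```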
4.3. Experimental Outcomes
The acquired outcomes of the experimentation on image-1 are delineated in Figure 5. Figures 5(a), 5(b), and 5(c) denote the input T1 image, filtered T1 image, and segmented T1 image. The input T2-FLAIR image, filtered T2-FLAIR image, and segmented T2-FLAIR image are indicated in Figures 5(d), 5(e), and 5(f), whereas Figure 5(g) shows the fused image and Figure 5(h) denotes the ground truth image.

The obtained experimental outcome of image-2 is deliberated in Figure 6. Figures 6(a), 6(b), and 6(c) denote the input T1 image, filtered T1 image, and segmented T1 image. The input T2-FLAIR image, filtered T2-FLAIR image, and segmented T2-FLAIR image are indicated in Figures 6(d), 6(e), and 6(f), whereas Figure 6(g) shows the fused image and Figure 6(h) denotes the ground truth image.

4.4. Performance Measures
MSE, Dice coefficient, and Jaccard coefficient are the metrics taken for evaluating the UIFCNN [20–22].
4.4.1. MSE
4.4.2. Dice Coefficient
4.4.3. Jaccard Coefficient
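Since the formulas are not reproduced in the subsections above, the following sketch gives the standard definitions of the three measures computed on binary masks.

```python
import numpy as np

def mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean square error between the output and the ground truth."""
    return float(np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2))

def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks A and B."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float(2.0 * inter / (pred.sum() + target.sum() + 1e-8))

def jaccard_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Jaccard = |A ∩ B| / |A ∪ B| for binary masks A and B."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter / (union + 1e-8))
```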
4.5. Comparative Techniques
GA [3], DILRAN [4], OWAF-PSO [11], and NSCT-GIF [10] are the existing methods considered for a comparative estimation to prove efficacy of the UIFCNN.
4.6. Comparative Analysis
The evaluation of the UIFCNN is carried out by altering the training data percentage and the k-value with respect to the metrics.
4.6.1. Analysis Regarding Training Data
Figure 7 shows the estimation of the UIFCNN on the measures by altering the training data. The best values acquired by each method taken for assessment when the training data is 90% are interpreted. Figure 7(a) reveals an assessment of the UIFCNN by means of the Dice coefficient. The Dice coefficient attained by the UIFCNN is 0.907, whereas GA, DILRAN, OWAF-PSO, and NSCT-GIF obtained 0.881, 0.886, 0.897, and 0.900. The evaluation of the UIFCNN with respect to the Jaccard coefficient is illustrated in Figure 7(b). The UIFCNN achieved a Jaccard coefficient of 0.904, while the Jaccard coefficient acquired by GA is 0.832, DILRAN is 0.860, OWAF-PSO is 0.864, and NSCT-GIF is 0.886. Figure 7(c) exposes the analysis of the UIFCNN in relation to MSE. GA, DILRAN, OWAF-PSO, and NSCT-GIF achieved MSE of 0.298, 0.256, 0.248, and 0.240, whereas the MSE attained by the UIFCNN is 0.231.



4.6.2. Analysis Regarding K Value
The assessment of the UIFCNN on the evaluation measures by varying the k-value is explicated in Figure 8. For k-value = 9, the values achieved by the existing and devised approaches are described. The evaluation of the UIFCNN considering the Dice coefficient is mentioned in Figure 8(a). The UIFCNN attained a Dice coefficient of 0.928, while the Dice coefficient achieved by GA is 0.887, DILRAN is 0.889, OWAF-PSO is 0.889, and NSCT-GIF is 0.904. Figure 8(b) reveals the analysis of the UIFCNN based upon the Jaccard coefficient. GA, DILRAN, OWAF-PSO, and NSCT-GIF acquired Jaccard coefficients of 0.882, 0.892, 0.899, and 0.909, whereas the UIFCNN obtained 0.920. Figure 8(c) displays the assessment of the UIFCNN with regard to MSE. The MSE achieved by the UIFCNN is 0.221, while GA, DILRAN, OWAF-PSO, and NSCT-GIF acquired 0.330, 0.324, 0.273, and 0.238.



4.7. Comparative Discussion
Table 2 presents the values obtained by the UIFCNN and the other approaches, namely, GA, DILRAN, OWAF-PSO, and NSCT-GIF, for the conducted assessments. It is observed that the UIFCNN achieved maximum values of the Dice coefficient of about 0.928 and the Jaccard coefficient of about 0.920, as well as a minimum MSE of about 0.221, for k-value = 9. This proves that the UIFCNN is an excellent model for IF using medical images.
Setups | Metrics/methods | GA | DILRAN | OWAF-PSO | NSCT-GIF | Proposed UIFCNN |
---|---|---|---|---|---|---|
Training data = 90% | Dice coefficient | 0.881 | 0.886 | 0.897 | 0.900 | 0.907 |
 | Jaccard coefficient | 0.832 | 0.860 | 0.864 | 0.886 | 0.904 |
 | MSE | 0.298 | 0.256 | 0.248 | 0.240 | 0.231 |
K value = 9 | Dice coefficient | 0.887 | 0.889 | 0.889 | 0.904 | 0.928 |
 | Jaccard coefficient | 0.882 | 0.892 | 0.899 | 0.909 | 0.920 |
 | MSE | 0.330 | 0.324 | 0.273 | 0.238 | 0.221 |
- Note: Bold values represent the proposed work results.
5. Conclusion
The medical IF process has attracted increasing attention in advanced diagnosis and healthcare over several decades. The primary objective of medical IF is fusing complementary information from multimodal clinical images to acquire higher quality images. It is extensively employed in biomedical research studies and in medical diagnosis by doctors. The procedure of fusing clinical images into one image without introducing loss or distortion of information remains a challenging research problem. In this research, the UIFCNN is presented for IF using medical images. The medical images taken for this IF process are T1 and T2-FLAIR. At first, preprocessing of the input T1 image is conducted by BF, and segmentation of the T1 image is accomplished by the RP-Net; thus, segmented output-1 is attained. On the other side, the input T2-FLAIR image is preprocessed utilizing BF. The filtered image is segmented employing the RP-Net, and therefore, segmented output-2 is achieved. Both segmented outputs are fused in the IF phase utilizing the UIFCNN, which is devised by incorporating U2Fusion with the IFCNN. Furthermore, the UIFCNN achieved high values of the Dice coefficient and Jaccard coefficient of about 0.928 and 0.920, as well as a low MSE of about 0.221, when considering the k-value as 9. The dataset used in this work only includes two types of input images. This may limit the generalization of the model to other imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET) scans. Incorporating a more diverse dataset would help in testing and refining the capability of the model across various imaging techniques, which will be considered as future work.
Ethics Statement
The authors have nothing to report.
Conflicts of Interest
The authors declare no conflicts of interest.
Author Contributions
Balasubramaniam S.: conceptualization, investigation, data curation, formal analysis, and writing–original draft; Vanajaroselin Chirchi: revision support; Sivakumar T. A.: writing support; Gururama Senthilvel P.: review and editing; Duraimutharasan N.: revision support.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Open Research
Data Availability Statement
The data used to support the findings of this study are available from the corresponding authors upon reasonable request.