Volume 90, Issue 4, pp. 1431-1445
RESEARCH ARTICLE
Open Access

Deep learning-assisted model-based off-resonance correction for non-Cartesian SWI

Guillaume Daval-Frérot
Siemens Healthineers, Saint-Denis, France; CEA, NeuroSpin, CNRS, Paris-Saclay University, Gif-sur-Yvette, France; Inria, MIND, Palaiseau, France

Aurélien Massire
Siemens Healthineers, Saint-Denis, France

Boris Mailhé
Siemens Healthineers, Digital Technology & Innovation, Princeton, New Jersey, USA

Mariappan Nadar
Siemens Healthineers, Digital Technology & Innovation, Princeton, New Jersey, USA

Blanche Bapst
Department of Neuroradiology, AP-HP, Henri Mondor University Hospital, Créteil, France; EA 4391, Paris-Est-Créteil University, Créteil, France

Alain Luciani
Department of Medical Imaging, AP-HP, Henri Mondor University Hospital, Créteil, France; Faculty of Public Health, Paris-Est-Créteil University, Créteil, France; Inserm IMRB, U955, Equipe 18, Créteil, France

Alexandre Vignaud
CEA, NeuroSpin, CNRS, Paris-Saclay University, Gif-sur-Yvette, France

Philippe Ciuciu (corresponding author)
CEA, NeuroSpin, CNRS, Paris-Saclay University, Gif-sur-Yvette, France; Inria, MIND, Palaiseau, France

Correspondence: Philippe Ciuciu, NeuroSpin, CEA, Gif-sur-Yvette, 91191, France. Email: [email protected]

First published: 22 June 2023

Abstract

Purpose

Patient-induced inhomogeneities in the static magnetic field cause distortions and blurring (off-resonance artifacts) during acquisitions with long readouts such as in SWI. Conventional versatile correction methods based on extended Fourier models are too slow for clinical practice in computationally demanding cases such as 3D high-resolution non-Cartesian multi-coil acquisitions.

Theory

Most reconstruction methods can be accelerated when performing off-resonance correction by reducing the number of iterations, compressed coils, and correction components. Recent state-of-the-art unrolled deep learning architectures could help but are generally not adapted to corrupted measurements as they rely on the standard Fourier operator in the data consistency term. The combination of correction models and neural networks is therefore necessary to reduce reconstruction times.

Methods

Hybrid pipelines using UNets were trained stack-by-stack over 99 SWI 3D SPARKLING 20-fold accelerated acquisitions at 0.6 mm isotropic resolution using different off-resonance correction methods. Target images were obtained using slow model-based corrections based on self-estimated $\Delta B_0$ field maps. The proposed strategies, tested over 11 volumes, are compared to model-only and network-only pipelines.

Results

The proposed hybrid pipelines achieved scores competitive with baseline methods that are two to three times slower, and the neural networks were observed to contribute both as a pre-conditioner and through inter-iteration memory, allowing more degrees of freedom in the model design.

Conclusion

A combination of model-based and network-based off-resonance correction was proposed to significantly accelerate conventional methods. Different promising synergies were observed between the acceleration factors (iterations, coils, correction components) and the model/network combination, which could be further exploited in the future.

1 INTRODUCTION

Many parallel imaging and compressed-sensing (CS) methods1-7 have been proposed over the last two decades to accelerate MRI acquisitions. Non-Cartesian sampling patterns8, 9 have recently gained popularity through their capability to better exploit longer but fewer readouts. In particular, the Spreading Projection Algorithm for Rapid K-space sampLING (SPARKLING), proposed for 2D9 and 3D10, 11 imaging, exploits all degrees of freedom offered by modern MR scanners11 to fully explore k-space and match optimized target sampling densities. SWI,12 commonly used in high-resolution brain venography or for traumatic brain injuries,13 has recently been studied with SPARKLING11, 14 to reach acceleration factors (AF) higher than 15 in scan time compared to fully sampled Cartesian imaging at high (0.6 mm) isotropic resolution. However, non-Cartesian sampling patterns tend to be more sensitive to off-resonance artifacts causing geometric distortions and image blurring,15 notably with long readouts (e.g., 20 ms), thereby inducing k-space inconsistencies over the different gradient directions.16, 17 These artifacts emerge mostly from patient-induced static $B_0$ field inhomogeneities, most pronounced near air-tissue interfaces, for instance in the vicinity of the nasal cavity and ear canals.

Diverse methods have been proposed in the literature to correct these artifacts during acquisition or image reconstruction. The spherical harmonic shimming technique is the current standard on all systems15, 18 but is generally limited to second- or third-order harmonics, which already provide critical improvements. More advanced shim coil designs have been proposed recently19, 20 but still face technical and theoretical limitations.21 Post-processing methods can therefore be necessary as a complement in more demanding cases, such as Cartesian EPI,22, 23 where alternating the gradient direction at every time frame can be used to deduce and revert off-resonance-induced geometric distortions. However, this technique is not applicable to non-Cartesian readouts (e.g., spirals,24-26 rosette,27 SPARKLING9-11) because multiple spatially-encoding gradients are played simultaneously. Another less constraining and well-established method28-32 consists of compensating the undesired $\Delta B_0$ spatial variations by modifying the Fourier operator involved in image reconstruction to integrate prior knowledge of a $\Delta B_0$ field map. This technique can be applied to any imaging setup but considerably slows down (e.g., 15-fold) the image reconstruction process. The required $\Delta B_0$ field map is directly available for multi-echo acquisitions,15 but otherwise must either be externally collected at the cost of extended scan time, or estimated.14, 33-38

Non-Cartesian acquisition strategies enable shorter scan times at the cost of increased image reconstruction duration. Taking off-resonance correction into account within extended forward and adjoint operators has a multiplicative effect that makes the processing excessively long. In recent years, deep learning (DL) has emerged for MRI reconstruction as a means to improve image quality and processing speed, by similarly pushing the computation cost to offline training sessions. However, state-of-the-art network architectures mostly focus on undersampling artifacts39-43 as they enforce data consistency with standard Fourier operators, which is inaccurate when dealing with off-resonance effects. More targeted literature invests considerable effort into estimating the $\Delta B_0$ field map,44, 45 which is already available in the context of SWI acquisitions (see Ref. 14 for details).

In this work, we study different approaches29-31 to model compressed representations of the non-Fourier operator involved in the data consistency term, and compensate for the compression using neural networks. The proposed extended non-Cartesian Primal-Dual network (NC-PDNet) architectures43 are trained to reproduce self-corrected14 high-resolution SWI volumes based on highly accelerated multi-coil 3D SPARKLING11 trajectories (AF > 17) at 3T, each obtained through 8-h-long reconstructions. This approximation allows us to significantly accelerate model inversion with respect to three key elements, namely the number of unrolled iterations, compressed coils, and correction components (i.e., interpolators involved in the non-Fourier operator), all of which contribute multiplicatively to the reconstruction time. The results are then compared to both model-only (CS reconstruction with non-Fourier operator) and network-only (original NC-PDNet) pipelines over 11 dedicated volumes, and further decomposed to analyze the contributions of both neural networks and partially correcting models with respect to the three sources of approximation using tailored off-resonance metrics. The various benefits of hybrid architectures are demonstrated, with observed synergies paving the way to further improvements in image quality.

2 THEORY

2.1 Image reconstruction

For convenience we define $M = N_c \times N_s$ the total number of samples (with $N_c$ the number of spokes and $N_s$ the number of samples per spoke) measured over the k-space $\Omega$, and $N = N_x \times N_y \times N_z$ the total number of voxels (with $N_x$, $N_y$, and $N_z$ the image dimensions in voxels).

In the absence of $B_0$ inhomogeneities, the reconstructed image $\hat{x}$ can be obtained from the multi-channel k-space measurements $y = (y_q)_{q=1}^{Q}$ by solving:
$$ \hat{x} = \underset{x \in \mathbb{C}^N}{\mathrm{argmin}} \sum_{q=1}^{Q} \frac{1}{2} \| y_q - F_{\Omega} S_q x \|_2^2 + \mathcal{R}(x) \qquad (1) $$
where $Q$ is the number of channels, $S_q$ is the sensitivity map of the qth channel, and $\mathcal{R}$ a regularization function. The operator $F_{\Omega}$ is the non-uniform fast Fourier transform (NUFFT) defined through the ideal signal equation:
$$ f(r) = \int_{T_{\mathrm{obs}}} s(t)\, e^{i k(t) \cdot r}\, \mathrm{d}t \qquad (2) $$
with $T_{\mathrm{obs}}$ the observation window in seconds, $s(t)$ the measured k-space sample at time $t$, $f(r)$ the object magnetization at position $r$, and $k(t)$ the k-space position at time $t$. The sensitivity maps $S_q$ can be externally acquired or estimated from the central $\theta$% of the k-space,46 with the low-frequency $F_{\Omega_{\theta\%}}$ operator:
$$ S_q = \frac{F_{\Omega_{\theta\%}}^H y_q}{\sqrt{\sum_{p=1}^{Q} \| F_{\Omega_{\theta\%}}^H y_p \|_2^2}} \qquad (3) $$
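A minimal sketch of this estimation is given below, assuming a low-frequency NUFFT object exposing an `adj_op` method; the names and the voxel-wise root-sum-of-squares reading of the normalization in Eq. (3) are assumptions:

```python
import numpy as np

def estimate_smaps(y_low, nufft_low, eps=1e-12):
    """Sketch of Eq. (3): sensitivity maps from the central theta% of
    k-space. `y_low` has shape (Q, M_low) and `nufft_low` is assumed
    to be a NUFFT restricted to the low-frequency samples."""
    coils = np.stack([nufft_low.adj_op(yq) for yq in y_low])
    # voxel-wise root-sum-of-squares normalization (our reading of Eq. (3))
    rss = np.sqrt(np.sum(np.abs(coils) ** 2, axis=0)) + eps
    return coils / rss
```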
An efficient way to solve Eq. (1) is through proximal gradient descent:
$$ w_{k+1} = x_k - \alpha_k \sum_{q=1}^{Q} S_q^H F_{\Omega}^H D \left( F_{\Omega} S_q x_k - y_q \right) \qquad (4) $$
$$ x_{k+1} = \mathrm{prox}_{\mathcal{R}}(w_{k+1}) \qquad (5) $$
where $\alpha_k$ is the step size at iteration $k$, $D$ is a density compensation matrix (cf. Section 3.1), and $\mathrm{prox}_{\mathcal{R}}$ is a proximal operator associated with $\mathcal{R}$. The step described in Eq. (4) is often called data consistency. This basic approach can be extended to both CS and DL methods, using different algorithms47, 48 and architectures40, 42, 43 to reach improved image quality in fewer iterations. Hereafter, we replace $\mathrm{prox}_{\mathcal{R}}$ with a neural network similarly to Ref. 43 and set $\alpha_k = \frac{1}{\beta}$ where $\beta$ is the Lipschitz constant of the data consistency term, following Ref. 47.
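As an illustration, a minimal sketch of this iteration is given below, assuming a NUFFT object exposing `op` (image to k-space) and `adj_op` (k-space to image) methods, and a `prox` callable standing in for $\mathrm{prox}_{\mathcal{R}}$ (or, later, a trained network); all names are illustrative:

```python
import numpy as np

def proximal_gradient(y, nufft, smaps, prox, dcomp=None, n_iter=20, alpha=1.0):
    """Sketch of Eqs. (4)-(5): proximal gradient descent with multi-coil
    data consistency. `y` has shape (Q, M), `smaps` (Q, Nx, Ny, Nz)."""
    x = np.zeros(smaps.shape[1:], dtype=complex)
    for _ in range(n_iter):
        grad = np.zeros_like(x)
        for q in range(smaps.shape[0]):
            residual = nufft.op(smaps[q] * x) - y[q]   # F_Omega S_q x - y_q
            if dcomp is not None:                      # D in Eq. (4)
                residual = dcomp * residual
            grad += np.conj(smaps[q]) * nufft.adj_op(residual)
        x = prox(x - alpha * grad)                     # Eq. (5)
    return x
```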

2.2 Signal correction

In order to apply $\Delta B_0$ corrections, we need to extend the basic Fourier model from Eq. (2) to include the off-resonance effects49:
$$ f(r) = \int_{T_{\mathrm{obs}}} s(t)\, e^{i\left(k(t)\cdot r + \Delta\omega_0(r)\, t\right)}\, \mathrm{d}t \qquad (6) $$
with $\Delta\omega_0(r) = \gamma \Delta B_0(r)$ the off-resonance frequency in radians at position $r$, $\gamma$ the hydrogen gyromagnetic ratio, and $\Delta B_0$ the actual magnetic field deviation. This new signal formula is discretized over times $(t_m)_{m=1}^{M}$ and voxel positions $(r_n)_{n=1}^{N}$ as follows:
$$ f(r_n) = \sum_{m=1}^{M} s(t_m)\, e^{i\Delta\omega_0(r_n)\, t_m}\, e^{i k(t_m)\cdot r_n}. \qquad (7) $$
The term $\Delta\omega_0(r_n)\, t_m$ depends on both the k-space and image domains, which is not compatible with a regular Fourier transform. The approach initially proposed by Noll et al.28 and later extended in29-31 amounts to splitting this exponential term into a sum of variables that each depend on a single domain:
$$ e^{i\Delta\omega_0(r_n)\, t_m} = \sum_{\ell=1}^{L} b_{m,\ell}\, c_{\ell,n}. \qquad (8) $$
This way, by combining Eqs. (7) and (8), we can factorize out the term dependent on the image domain and obtain a weighted sum of $L$ regular Fourier transforms, with $L \ll M, N$:
$$ f(r_n) = \sum_{\ell=1}^{L} c_{\ell,n} \sum_{m=1}^{M} s(t_m)\, b_{m,\ell}\, e^{i k(t_m)\cdot r_n}. \qquad (9) $$
The coefficients $B = (b_{m,\ell}) \in \mathbb{C}^{M,L}$ and $C = (c_{\ell,n}) \in \mathbb{C}^{L,N}$ can be optimally estimated using the method proposed by Fessler et al.31 by considering the following matrix factorization problem:
$$ \hat{B}, \hat{C} = \underset{B \in \mathbb{C}^{(M,L)},\, C \in \mathbb{C}^{(L,N)}}{\mathrm{argmin}} \| E - BC \|_{\mathrm{Fro}}^2 \qquad (10) $$
with $E$ the $M \times N$ matrix defined by $E_{m,n} = e^{i\Delta\omega_0(r_n)\, t_m}$. The optimal solution is obtained by decomposing $E$ through singular value decomposition (SVD) along either axis to obtain $B$ or $C$, and then finding the other one as the least squares solution. We refer hereafter to this solution as singular vector interpolation (SVI) coefficients.
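A minimal numpy sketch of this factorization is given below (names are illustrative; in practice $E$ is reduced beforehand through the histogram trick described at the end of this section):

```python
import numpy as np

def svi_coefficients(dw0, t, L=5):
    """Sketch of Eq. (10): rank-L factorization of E[m, n] =
    exp(i * dw0[n] * t[m]) via truncated SVD, in the spirit of Ref. 31.
    `dw0` holds off-resonance values (rad/s) over N voxels (or N_b
    histogram bins), `t` the M sample times (s)."""
    E = np.exp(1j * np.outer(t, dw0))            # (M, N) exact phase matrix
    U, s, Vh = np.linalg.svd(E, full_matrices=False)
    B = U[:, :L] * s[:L]                         # k-space interpolators (M, L)
    C = Vh[:L]                                   # image components (L, N)
    return B, C
```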
Prior to this formalism, different solutions had been explored,1, 28-30, 49, 50 notably two that were respectively introduced by Man et al.29 and Sutton et al.,30 and are still in use today.51, 52 Both methods were also tested in this work, as optimality with regard to Eq. (10) does not imply optimality with regard to Eq. (1). Similarly, the solutions are given through least squares after fixing either $B$ or $C$ as follows:
$$ e^{i\Delta\omega_0(r_n)\, t_m} = \sum_{\ell=1}^{L} e^{i\Delta\omega_{0,\ell}\, t_m}\, c_{\ell,n} \qquad (11) $$
$$ e^{i\Delta\omega_0(r_n)\, t_m} = \sum_{\ell=1}^{L} b_{m,\ell}\, e^{i\Delta\omega_0(r_n)\, t_{\ell}} \qquad (12) $$
with $\Delta\omega_{0,\ell}$ and $t_{\ell}$ obtained by segmenting the off-resonance frequency range and the time window into $L$ values, respectively. In what follows, the solution obtained from Eq. (11) is referred to as MFI for Multi-Frequency Interpolation,29 and by analogy the solution from Eq. (12) following the method from30 is referred to as MTI for Multi-Temporal Interpolation.

For all coefficients, the computational load can be considerably reduced by taking advantage of the spoke redundancy (i.e., using the same decomposition over the $N_c$ spokes) and by using histograms of the $\Delta B_0$ field map to solve a weighted version of Eq. (10), typically reducing the $N = 384 \times 384 \times 208$ voxels of the image to $N_b = 1000$ bins (see details in31). This way, the matrix $E$ is reduced from $M \times N$ to $N_s \times N_b$, and the correction coefficients can therefore be obtained in a few seconds for high-resolution 3D volumes.

We obtain from Eq. (9) a pseudo-Fourier operator $F_{\Omega,\Sigma}$ (with $\Sigma$ representing the interpolation processing) that can be implemented as a wrapper for any regular Fourier operator $F_{\Omega}$ and directly integrated into Eqs. (1) and (3). The same remark also holds for the adjoint Fourier operator $F_{\Omega}^H$.
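Such a wrapper could look like the following sketch, where the wrapped NUFFT is again assumed to expose `op`/`adj_op` methods (class and attribute names are illustrative):

```python
import numpy as np

class OffResonanceNUFFT:
    """Sketch of the pseudo-Fourier operator F_{Omega,Sigma} from Eq. (9),
    wrapping a regular NUFFT with MFI/MTI/SVI coefficients B (M, L) and
    C (L, Nx, Ny, Nz)."""
    def __init__(self, nufft, B, C):
        self.nufft, self.B, self.C = nufft, B, C

    def adj_op(self, y):
        """Corrected k-space-to-image transform, as written in Eq. (9)."""
        return sum(self.C[l] * self.nufft.adj_op(self.B[:, l] * y)
                   for l in range(self.B.shape[1]))

    def op(self, x):
        """Matching forward (image to corrupted k-space) transform,
        obtained as the adjoint of `adj_op`."""
        return sum(np.conj(self.B[:, l]) * self.nufft.op(np.conj(self.C[l]) * x)
                   for l in range(self.B.shape[1]))
```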

2.3 Accelerated reconstruction and correction

The above-mentioned correction technique is convenient but still multiplies the computation cost of an already time-consuming reconstruction in the case of 3D high-resolution non-Cartesian imaging. Starting from the algorithm presented in Eqs. (4) and (5) combined with the correction operator in Eq. (9), different ways to reduce the computational burden can be explored by decreasing the number of proximal gradient iterations $I$, the number of channels $Q$, or the number of correction components $L$. All possibilities were considered, and the last two are further explained in this subsection.

Different coil compression methods exist to decrease $Q$,53-55 but the most efficient ones, such as geometric coil compression,55 often rely on constraining k-space trajectory properties. More recent learning-based techniques were not explored here but may be considered in the future. Meanwhile, we used the trajectory-independent method by Buehrer et al.,53 also based on SVD, as it allows an efficient reduction of $Q$ while ordering the compressed channels by explained variance, as represented in Figure 1 and detailed afterward. Some relations are observed between $\Delta B_0$ field maps and compressed channel sensitivities (Figure S7), but without strong demarcation; therefore, only the first $Q$ components are kept.
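For illustration, a minimal sketch of such an SVD-based compression is given below (a simplified stand-in for the actual method of Ref. 53; names are illustrative):

```python
import numpy as np

def compress_coils(y, n_virtual=5):
    """Sketch of trajectory-independent SVD coil compression: virtual
    coils are ordered by explained variance and only the first
    `n_virtual` are kept. `y` has shape (Q, M), i.e., Q physical coils
    over M k-space samples."""
    U, s, _ = np.linalg.svd(y @ y.conj().T)        # (Q, Q) coil covariance
    A = U[:, :n_virtual].conj().T                  # compression matrix
    explained = s[:n_virtual].sum() / s.sum()      # retained variance
    return A @ y, explained
```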

FIGURE 1. Expected model-based contributions from coil compression and partial correction. The contributions from coil compression and partial correction components are each considered prior to reconstruction, on reversed logarithmic scales, for calibration. For coil compression, the total explained variance known from singular value decomposition is given for varying $Q$. For off-resonance correction, the normalized root-mean-square error from Eq. (10) is given for MFI, MTI, and SVI coefficients with varying $L$. In both cases, the values are obtained from the 99 training acquisitions, and the retained number of five components corresponds approximately to less than 10% remaining error.

In contrast, the SVI correction coefficients cover specific regions of the off-resonance spectrum (Figure S6). Simply using the first $L$ components, also studied hereafter, would mostly shift the data consistency focus toward low off-resonance areas, which cover a much broader part of the brain images. A possibility to extend the spectrum coverage is to change the components used for data consistency over consecutive iterations. The SVI coefficients are more convenient for this, as two sets of $L_1$ and $L_2$ components with $L_2 > L_1$ will share the same first $L_1$ components because of the orthogonality of the decomposition, which is not true for the MFI and MTI methods, as observed in Figures S4 and S5. This ensures that, while the first $L_1$ components carry the maximal amount of information, they are not redundant with the other $L_2 - L_1$ components. Diverse strategies have been considered. The best performing solution, called $\mathrm{SVI}_{1/2}$ hereafter, alternates data consistency between the first $L$ components and the following $L+1$ to $2L$ components over iterations, to enforce fidelity toward either low or high off-resonance areas, respectively.
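The component schedule itself reduces to a few lines; the sketch below (illustrative name, and the even/odd ordering is an assumption) alternates between the two sets over iterations:

```python
def svi_half_schedule(iteration, L=5):
    """Sketch of the SVI_{1/2} schedule: even iterations use the first L
    singular components (low off-resonance focus), odd iterations the
    components L+1 to 2L (high off-resonance focus)."""
    return slice(0, L) if iteration % 2 == 0 else slice(L, 2 * L)

# e.g., with L = 5: components 1-5, then 6-10, then 1-5 again, ...
```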

The goal is to provide improved reconstruction and correction while minimizing the processing time. $I$, $Q$, and $L$ multiply the reconstruction time linearly while their quality contributions decrease exponentially, as shown in Figure 1 and in the supplementary materials (Table S4). Our approach is therefore to reduce the reconstruction load and recover the image quality using neural networks over compressed information.

3 METHODS

3.1 Proposed pipelines

Following the recommendations from Ramzi et al.,42 the end-to-end NC-PDNet architecture43 was used, as we consider 3D non-Cartesian SWI data. In particular, we investigated the primal-only version, where only the image-domain processing is learned with an arbitrary neural network whereas data consistency is applied in k-space. However, in the context of 3D high-resolution non-Cartesian multi-coil imaging, the amount of graphics processing unit (GPU) memory required for a complete training is excessive. All or part of that memory can be moved to the central processing unit (CPU), but this would result in a considerably longer training duration.

To address this issue, a solution often called greedy learning56 consists of breaking down the learning process into stacks trained sequentially, to avoid any restriction on the number of iterations or the network size. An ideal case would be to have stacks consisting of at least one network pass followed by a data consistency block, in order to still learn complementary features; however, the memory requirement increases drastically at the transition between single-channel image-domain and multi-coil k-space data. Therefore, the simplest proposition is to exclude data consistency from the gradient computation by training an image-to-image network as represented in Figure 2, then apply data consistency only afterwards, and repeat for each stack.

FIGURE 2. Illustration of the proposed cross-domain pipeline with stacked training. The proposed pipeline is implemented as follows: from the $Q$ compressed coils, measurements $y_q$ are processed in k-space (yellow boxes), firstly as initialization. Then the multi-channel signal is transformed into images using the partially-correcting operator $F_{\Omega,\Sigma}$ (based on the $L$ correction coefficients $B_{\ell}$ and $C_{\ell}$), and combined into a single volume using the sensitivity maps $S_q$ (green boxes). The volume is then processed through a UNet in the image domain (blue boxes) that takes as input a concatenation of the volumes processed at each previous iteration (up to a given buffer size). The image is then re-converted to a multi-channel k-space signal (green box) and the overall process is repeated over $I$ iterations. During the training stage, all UNets are trained one-by-one (purple boxes) to reconstruct the reference volume $\hat{x}_{\mathrm{target}}$ from the dataset processed up to the corresponding image domain (i.e., after the green box and prior to the blue box to train).

This modification requires adapting the feature originally named "memory"40 or more recently "buffer",43 where networks could carry additional channels of information between iterations, as the additional network output channels are no longer all included in the backpropagation graph. Instead, the network output is a single complex-valued volume that is still concatenated with a buffer of $N_B$ volumes from previous stacks to be used as input for the next stack. That way, the overall pipeline keeps the expressiveness required to learn CS-like acceleration schemes as originally suggested.40 The main drawback compared to end-to-end training is enforcing that each stack should yield the target results, rather than letting each of the independent networks build intermediate states. However, this modification allows us to train architectures with an arbitrarily high number of coils and iterations.
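As an illustration, a minimal PyTorch sketch of one greedy training stage is given below (the L1 loss stands in for the full L1 + multiscale-SSIM objective used in Section 3.1, and all names are illustrative):

```python
import torch

def train_stack(unet, inputs, targets, data_consistency, n_epochs=300):
    """Sketch of one stack of the greedy training described above: the
    UNet is trained image-to-image on volumes already propagated through
    the previous (frozen) stacks, while data consistency stays outside
    the gradient graph."""
    opt = torch.optim.RAdam(unet.parameters(), lr=5e-4)
    for _ in range(n_epochs):
        loss = torch.nn.functional.l1_loss(unet(inputs), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():               # data consistency outside the graph
        return data_consistency(unet(inputs))   # input of the next stack
```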

The different partial correction strategies based on $F_{\Omega,\Sigma}$ discussed in Section 2.3 are studied within an unrolled architecture in combination with UNets in the image domain. Indeed, conventional data consistency would either enforce the off-resonance artifacts in spite of the networks, or at best would not contribute to their correction. For all of the following studies, the values $I = 5$, $Q = 5$, and $L = 5$ were retained to allow for 3D 0.6 mm isotropic corrections within 7-8 min of computing time at inference. The MFI, MTI, and SVI coefficients are compared, along with the $\mathrm{SVI}_{1/2}$ strategy, with $N_B = 3$ buffers and pre-computed density compensations57 for the NUFFT operator from the gpuNUFFT58 and pysap-mri59 packages. The residual UNets are composed of three scales, each made of convolutional blocks of three layers with kernel size $3 \times 3 \times 3$, each followed by a ReLU activation (except for the final layer). The number of filters is doubled at each 3D $2 \times 2 \times 2$ downscale, starting from 16 filters at the first scale, and halved correspondingly for upscales. The complex nature of the input and output volumes is handled by considering real and imaginary parts as two separate real-valued channels. All deep-learning pipelines are composed of five independent UNets, each with 390 066 parameters with buffers and 388 338 without. They are trained for 300 epochs, resulting in overall 100-h-long trainings for each pipeline, with the RAdam optimizer and a learning rate of $5 \times 10^{-4}$. The minimized cost function is the sum of an L1 loss applied to the complex-valued images and a multiscale SSIM applied to the magnitude images. All training experiments were run on the Jean-Zay supercomputer over a single NVIDIA Tesla V100 GPU with 32 GB of VRAM.

3.2 Dataset

A total of 123 SWI volumes were acquired on patients with non-Cartesian 3D GRE sequences at 3T (Magnetom Prisma, Siemens Healthcare, Erlangen, Germany) with a 64-channel head/neck coil array. The protocol was approved by local and national ethical committees (IRB: CRM-2111-207). Patient demographics, a study flow diagram and an illustration of SPARKLING trajectories are provided in the supplementary materials (Section S1). The dataset covers a wide range of pathologies (aneurysm, sickle cell anemia, multiple sclerosis) and off-resonance related artifact levels.

Four different variations of the recently proposed full 3D SPARKLING11 sampling pattern were used, but the vast majority (n = 102) were acquired using the following parameters: a 0.6 mm isotropic resolution, a field-of-view of 24 cm in-plane ($N$ = 384) over 12.5 cm ($N_z$ = 208), a readout of $T_{\mathrm{obs}} = 20.48$ ms centered around an echo time $TE = 20$ ms, and a repetition time $TR = 37$ ms. A dwell time of $\delta t = 2\ \mu\mathrm{s}$ was used to balance the small number of spokes $N_c$ = 4900, resulting in $N_s$ = 10 240 samples per spoke. The other closely related acquisition variations are similarly described in the supplementary materials (Section S1). An acquisition time of 3 min corresponds to an acceleration factor (AF) of 17, defined as follows:
$$ \mathrm{AF} = \frac{N \times N_z}{N_c}. \qquad (13) $$

The $\Delta B_0$ field maps were not acquired, in order to avoid prolonging the exams. Instead, self-estimated field maps were computed a posteriori using a recently published technique.14 To generate the ground truths, the field maps were used for model-based correction using the method described in31 with the SVI coefficients, resulting in approximately 8-h-long reconstructions/corrections with $I = 20$, $Q = 20$, and $L = 20$ over a single NVIDIA Tesla V100 GPU.

The dataset was then split into training (n = 99), validation (n = 11), and testing (n = 11) sets according to balanced distributions of age, gender, weight, pre- and post-correction off-resonance visibility, and pathology type and visibility. Two acquisitions were excluded due to strong motion (n = 1) and insufficient off-resonance correction (n = 1) caused by braces; in both cases, the self-estimated off-resonance correction still considerably improved the images. Minor quality concerns were raised regarding skin fat artifacts (n = 6) and partially deactivated readout coils (n = 2) without causing exclusion.

Although the trainings were carried out from complex-valued to complex-valued volumes, the SWI-specific processing was applied afterwards for visualization and scoring, as described in Ref. 12. The low frequencies were extracted by applying a Hanning window over the central third of k-space, before being removed from the phase image to obtain a high-frequency map, subsequently normalized to produce a continuous mask. The magnitude image was multiplied five times by the mask, and a minimum-intensity projection (mIP) was computed using a thickness of 8 mm. All post-processing steps were again run on the Jean-Zay supercomputer.
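A minimal sketch of this post-processing is given below; the mask sign convention and the 13-slice mIP thickness (approximately 8 mm at 0.6 mm resolution) are assumptions:

```python
import numpy as np

def swi_processing(volume, mask_power=5, mip_thickness=13):
    """Sketch of the SWI processing described above: low-frequency phase
    is estimated with a Hanning window over the central third of k-space,
    removed from the phase image, normalized into a continuous mask,
    applied `mask_power` times to the magnitude, and followed by a
    minimum-intensity projection (mIP) over the slice axis."""
    k = np.fft.fftshift(np.fft.fftn(volume))
    window = np.ones(1)
    for s in volume.shape:                  # separable 3D Hanning window
        w = np.zeros(s)
        w[s // 3:s - s // 3] = np.hanning(s - 2 * (s // 3))
        window = np.multiply.outer(window, w)
    low = np.fft.ifftn(np.fft.ifftshift(k * window.squeeze()))
    phase = np.angle(volume * np.conj(low))     # high-frequency phase map
    mask = np.clip(1 - phase / np.pi, 0, 1)     # assumed sign convention
    swi = np.abs(volume) * mask ** mask_power
    mip = np.stack([swi[..., i:i + mip_thickness].min(axis=-1)
                    for i in range(swi.shape[-1] - mip_thickness + 1)],
                   axis=-1)
    return swi, mip
```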

3.3 Baselines

Diverse baseline models are proposed to assess the contributions of the different features involved in our pipeline. A first baseline, detailed in the supplementary materials (Section S3), is obtained by replacing the neural networks with a conventional CS reconstruction using sparsity-promoting regularization in the wavelet domain (Symlet-8 basis decomposed over three scales). The $L_1$-norm was used as the $\mathcal{R}$ function, with $\lambda = 10^{-7}$ for thresholding the wavelet coefficients. To compensate for the absence of buffers to learn acceleration schemes over iterations, the FISTA47 algorithm available in the pysap-mri package was used. The data consistency is implemented with $F_{\Omega,\Sigma}$ with SVI coefficients reduced to $L = 5$. This baseline, noted $\mathrm{BASE}_{\mathcal{R}}$, is used to determine to what extent the contribution of neural networks is critical for improved image quality.

The second baseline, $\mathrm{BASE}_{\mathcal{C}}$, similarly replaces the correcting operator $F_{\Omega,\Sigma}$ with the regular NUFFT $F_{\Omega}$ while applying the same network-based regularization as the proposed pipeline.

Finally, the SVI and $\mathrm{SVI}_{1/2}$ pipelines were tested without buffers, along with the $\mathrm{BASE}_{\mathcal{C}}$ baseline. The goal was to verify that the buffer feature, modified to fit the stacked-training setup, remains relevant even without end-to-end training.

3.4 Evaluation

The results are assessed according to several criteria: image quality, off-resonance correction, and speed. The first one is commonly assessed60 using the structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR). However, parameters such as the number of coils $Q$ barely contribute to off-resonance correction, as shown in Figure 3, and we therefore also provide $\Delta B_0$-weighted versions of SSIM and PSNR. In both cases, the voxel-wise score at position $r$ was weighted by:
$$ w_{\Delta B_0}(r) = \frac{|\Delta B_0(r)|}{\overline{|\Delta B_0|}} \qquad (14) $$
with $\overline{|\Delta B_0|}$ the average value of the absolute $\Delta B_0$ field map. In the case of PSNR, this corresponds to using a weighted mean-squared error (MSE) in the expression. Among these four metrics, the $\Delta B_0$-weighted SSIM was chosen as the primary score for this study as it relates more to the visual impact of off-resonance artifacts, followed by classic SSIM for image quality in general. Since these criteria are not defined for complex-valued data, all metrics are applied after SWI processing, to also account for magnetic susceptibility information. The individual scores were then compared between pipelines for both SSIM and $\Delta B_0$-weighted SSIM in order to obtain $p$-values through two-sided Wilcoxon signed-rank tests, before applying a Benjamini-Hochberg correction to control the false discovery rate at $p = 0.05$.
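A minimal sketch of the weighted PSNR is given below (function names are illustrative; the weighted SSIM follows analogously by weighting the local SSIM map):

```python
import numpy as np

def db0_weighted_psnr(ref, test, db0_map):
    """Sketch of the dB0-weighted PSNR: the voxel-wise squared error is
    weighted by Eq. (14) so that regions with strong off-resonance
    dominate the score. `ref` and `test` are magnitude SWI images."""
    w = np.abs(db0_map) / np.abs(db0_map).mean()    # Eq. (14)
    wmse = np.mean(w * (ref - test) ** 2)           # weighted MSE
    return 10 * np.log10(ref.max() ** 2 / wmse)
```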
FIGURE 3. Contributions of each parameter based on $\mathrm{BASE}_{\mathcal{R}}$. SWI images with 4 mm minimum-intensity projection and error maps are obtained using $\mathrm{BASE}_{\mathcal{R}}$ with different parameters: with $I = Q = L = 5$ (A), and with either $I = 10$, $Q = 10$, or $L = 10$ over column (C), from top to bottom. The difference between (A) and (C) is shown in (B), while the errors in (C) compared to the references (with $I = Q = L = 20$) are presented in (D). Note in particular that increasing the number of coils $Q$ does not contribute to the $\Delta B_0$ regions close to the bucco-nasal area.
The final aspect we focus on in this study is the computation speed of the different pipelines. The main bottleneck is the NUFFT, which has therefore been the focus of most code optimizations. Other elements, such as the wavelet transform used for sparsity-promoting regularization or the computation of the correction coefficients $(\hat{B}, \hat{C})$, are already fast in comparison but not negligible, and did not benefit from similar attention; these computations could therefore skew duration measurements irrelevantly. Additionally, as the proposed pipeline optimizations solely relate to the number of individual calls to the NUFFT operator, we decided to track this number as $N_F$ to account for the numerical complexity rather than relying on misleading time measurements:
$$ N_F = N_{F,S} + N_{F,R} \qquad (15) $$
$$ N_{F,S} = L_S \times Q \qquad (16) $$
$$ N_{F,R} = L \times Q \times (2 \times I - 1) \qquad (17) $$
with $N_{F,S}$ the number of calls related to the sensitivity map computation, and $N_{F,R}$ the number of calls related to image reconstruction. The sensitivity maps are systematically corrected with $L_S = 10$ components when $F_{\Omega,\Sigma}$ is used in Eq. (3), but not corrected at all when using $F_{\Omega}$, which is equivalent to $L_S = 1$. The reconstruction consists of one iteration for initialization and $I - 1$ iterations using data consistency. Note that forward and adjoint calls are counted identically for simplicity, although they differ over non-Cartesian grids.
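As a quick illustration of the resulting savings, the counts below follow directly from Eqs. (15)-(17); the two configurations are those reported in Sections 3.1 and 3.2:

```python
def nufft_calls(I, Q, L, L_S=10):
    """Number of NUFFT calls per reconstruction, following Eqs. (15)-(17).
    Use L_S=1 when the uncorrected F_Omega is used for Eq. (3)."""
    n_sense = L_S * Q                   # Eq. (16)
    n_recon = L * Q * (2 * I - 1)       # Eq. (17)
    return n_sense + n_recon            # Eq. (15)

print(nufft_calls(5, 5, 5))      # proposed pipelines: 275 calls
print(nufft_calls(20, 20, 20))   # reference reconstruction: 15800 calls
```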

4 RESULTS

The different pipelines are compared to each other hereafter, along with an ablation study of the modified buffer feature. Additional resources are provided in the supplementary materials, notably more exhaustive CS baselines to quantify and understand the specific contributions of the varying parameters $I$, $Q$, and $L$ (Section S3). The number of coils $Q$ in particular, as in Figure 3, is shown to have little influence on off-resonance correction, and the coil sensitivity profiles normalized in Eq. (3) seem to have more impact on all metrics than the information carried by additional coils. In contrast, increasing $L$ contributes solely to off-resonance artifact correction, while augmenting $I$ positively impacts both off-resonance and high-frequency content.

4.1 Correction pipelines

The different scores are reported in Table 1 and Figure 4, with corresponding pairwise $p$-values in Table 2, and illustrated in Figure 5. Firstly, the baselines $\mathrm{BASE}_{\mathcal{C}}$ and $\mathrm{BASE}_{\mathcal{R}}$ in Table 1 demonstrate the expected positive contribution of a partially corrected Fourier operator $F_{\Omega,\Sigma}$ against a basic model ($\mathrm{BASE}_{\mathcal{C}}$), and that of stacked UNets against wavelet-based regularization ($\mathrm{BASE}_{\mathcal{R}}$). The only statistically significant difference holds for $\Delta B_0$-weighted SSIM, with a large improvement for $\mathrm{BASE}_{\mathcal{R}}$. Therefore, the proposed UNet architecture, when combined with the conventional operator $F_{\Omega}$, is not capable of correcting off-resonance effects as much as a partially correcting operator $F_{\Omega,\Sigma}$ does.

TABLE 1. Scores for different correction coefficients and pipelines over the test dataset

| Pipeline | Data consistency | Regularization | Classic SSIM | Classic PSNR | $\Delta B_0$-weighted SSIM | $\Delta B_0$-weighted PSNR |
|---|---|---|---|---|---|---|
| $\mathrm{BASE}_{\mathcal{C}}$ | $F_{\Omega}$ | UNet | 0.9421 | 33.32 | 0.8947 | 23.50 |
| $\mathrm{BASE}_{\mathcal{R}}$ | $F_{\Omega,\Sigma}$ (SVI) | Wavelet | 0.9392 | 31.32 | 0.9249 | 23.33 |
| MFI | $F_{\Omega,\Sigma}$ (MFI) | UNet | 0.9596 | 34.96 | 0.9475 | 26.42 |
| MTI | $F_{\Omega,\Sigma}$ (MTI) | UNet | 0.9601 | 35.13 | 0.9452 | 26.11 |
| SVI | $F_{\Omega,\Sigma}$ (SVI) | UNet | 0.9613 | **35.24** | 0.9498 | 26.57 |
| $\mathrm{SVI}_{1/2}$ | $F_{\Omega,\Sigma}$ (SVI) | UNet | **0.9616** | 35.19 | **0.9541** | **27.31** |

Note: Various pipelines with $I = 5$, $Q = 5$, and $L = 5$ are summarized with their data consistency and regularization terms and evaluated using classic and $\Delta B_0$-weighted SSIM and PSNR scores. Bold signifies the best (i.e., highest) SSIM and PSNR scores.
FIGURE 4. Detailed structural similarity index measure (SSIM) score distributions for different correction coefficients and pipelines over the testing dataset. The SSIM and $\Delta B_0$-weighted SSIM distributions averaged over the testing dataset in Table 1 are detailed for all pipelines, over reversed logarithmic scales. The first quartile $Q_1$, median, and third quartile $Q_3$ are shown as notched boxes. Maximum and minimum scores are delimited by the whiskers, except for outliers, defined as values farther from the nearest quartile than $1.5 \times (Q_3 - Q_1)$ and shown as points. For both scores, the four propositions perform well above the baselines, with the $\mathrm{SVI}_{1/2}$ pipeline on top, followed by the SVI pipeline.
TABLE 2. Pairwise comparison p-values for all coefficients using two-sided Wilcoxon signed-rank tests on classic and $\Delta B_0$-weighted SSIM scores with Benjamini-Hochberg correction

(a) Classic SSIM

|  | $\mathrm{BASE}_{\mathcal{C}}$ | $\mathrm{BASE}_{\mathcal{R}}$ | MFI | MTI | SVI | $\mathrm{SVI}_{1/2}$ |
|---|---|---|---|---|---|---|
| $\mathrm{BASE}_{\mathcal{C}}$ | - | n.s. | ** | ** | ** | ** |
| $\mathrm{BASE}_{\mathcal{R}}$ | n.s. | - | ** | ** | ** | ** |
| MFI | ** | ** | - | n.s. | ** | ** |
| MTI | ** | ** | n.s. | - | ** | ** |
| SVI | ** | ** | ** | ** | - | n.s. |
| $\mathrm{SVI}_{1/2}$ | ** | ** | ** | ** | n.s. | - |

(b) $\Delta B_0$-weighted SSIM

|  | $\mathrm{BASE}_{\mathcal{C}}$ | $\mathrm{BASE}_{\mathcal{R}}$ | MFI | MTI | SVI | $\mathrm{SVI}_{1/2}$ |
|---|---|---|---|---|---|---|
| $\mathrm{BASE}_{\mathcal{C}}$ | - | ** | ** | ** | ** | ** |
| $\mathrm{BASE}_{\mathcal{R}}$ | ** | - | ** | ** | ** | ** |
| MFI | ** | ** | - | * | ** | ** |
| MTI | ** | ** | * | - | ** | ** |
| SVI | ** | ** | ** | ** | - | ** |
| $\mathrm{SVI}_{1/2}$ | ** | ** | ** | ** | ** | - |

Note: Various pipelines with $I = 5$, $Q = 5$, and $L = 5$ are compared pairwise on classic and $\Delta B_0$-weighted SSIM scores. n.s.: p > 0.05; *: p < 0.05; **: p < 0.005.
FIGURE 5. Reconstructed SWI images over all pipelines. The reconstructed SWI images with three 4 mm in-plane (axial, sagittal, coronal) minimum-intensity projections are provided for all color-coded pipelines (following the color convention adopted in Figure 4) for acquisition 68 from the testing dataset. Various details are pointed out with blue arrows: both baselines provide blurred and incorrect details over the axial slices, up to no detail at all over the sagittal slices, while both details and textures are recovered by the four proposed pipelines.

Secondly, the different correction approaches (MFI, MTI, SVI) are combined with UNets through the data consistency term. All of them show significantly higher scores than both baselines. Their $\Delta B_0$-weighted SSIM scores follow the same ranking as previously observed in Figure 1B prior to image reconstruction when solving Eq. (10): the SVI coefficients reach a better score than the MFI (second) and MTI (third) coefficients. This suggests that none of the diverse correction approaches carried by the different coefficients is better suited than another to help UNets compensate for the missing correction.

Finally, the $\mathrm{SVI}_{1/2}$ pipeline improves over the SVI approach with regard to $\Delta B_0$-weighted SSIM, while not significantly deviating from it for classic SSIM. Exploring more components appears to guide the correction in a way that networks alone could not achieve. This can be observed in Figure 5 near the bucco-nasal region visible in the sagittal views, where only $\mathrm{SVI}_{1/2}$ recovers high-frequency details consistent with the target. The additional baselines from the supplementary materials (Section S3) show that the $\mathrm{SVI}_{1/2}$ pipeline competes with reconstructions two to three times slower, depending on the evaluation criteria.

4.2 Network and model contributions

The reconstruction steps are further decomposed between data consistency and regularization in Figure 6 and illustrated in Figure 7. For both metrics, the two baselines shown in Figure 6 with brown ($\mathrm{BASE}_{\mathcal{R}}$) and gray ($\mathrm{BASE}_{\mathcal{C}}$) curves are again significantly different. The $\mathrm{BASE}_{\mathcal{R}}$ curves show a slow start, with oscillations between improving data consistency and degrading regularization, but a regular progression afterwards. The oscillations are explained by the constant wavelet threshold, too large during early iterations when the overall magnitude is still too low. In contrast, the networks used in $\mathrm{BASE}_{\mathcal{C}}$ considerably improve the initialization, followed by oscillations caused by the data consistency. Those oscillations can be explained by the off-resonance artifacts enforced through the wrong Fourier model $F_{\Omega}$, but also by the sensitivity profiles altered by the normalization from Eq. (3) when reducing the number of channels $Q$. Both are illustrated in Figure 7 with green and blue arrows, respectively.

FIGURE 6. Evolution of testing scores over reconstruction. The main pipelines are decomposed between data consistency ($\mathcal{C}_i$) and regularization ($\mathcal{R}_i$) steps to provide the classic and $\Delta B_0$-weighted structural similarity index measure (SSIM) scores over the testing dataset, on reversed logarithmic scales. The MFI and MTI pipelines are omitted for readability as both are redundant with the singular vector interpolation (SVI) pipeline. Note the large difference over the first regularization between wavelet-based and network-based approaches, but also the later oscillations, notably for the $\mathrm{BASE}_{\mathcal{C}}$ approach.
FIGURE 7. Evolution of susceptibility-weighted imaging (SWI) images over reconstruction. The intermediate SWI images with 4 mm minimum-intensity projection are provided over the reconstruction decomposed into data consistency ($\mathcal{C}_i$) and regularization ($\mathcal{R}_i$) steps for acquisition 106 from the testing dataset. Various details are pointed out with arrows: coil sensitivity degraded by data consistency (green), off-resonance degraded by data consistency (blue), and off-resonance degraded by regularization (red).

The proposed SVI and $\mathrm{SVI}_{1/2}$ pipelines are observed to combine a steeper initialization and a better progression over iterations, but still with sensitivity-related oscillations. The $\mathrm{SVI}_{1/2}$ pipeline is however more robust to these oscillations, as it alternates between large regions with heterogeneous sensitivities and smaller $\Delta B_0$-specific regions. Overall, the UNets appear to contribute to all three studied factors, but the striking difference between the $\mathrm{BASE}_{\mathcal{C}}$, SVI, and $\mathrm{SVI}_{1/2}$ pipelines suggests that networks only recover low image frequencies when correcting off-resonance. The partial correction model is therefore required in those regions, while the networks serve as an effective image pre-conditioner.

4.3 Buffer feature ablation

The main pipelines, namely SVI and $\mathrm{SVI}_{1/2}$ along with $\mathrm{BASE}_{\mathcal{C}}$, have been evaluated with and without the proposed buffer feature modified to account for stacked training. The different results are shown in Table 3.

TABLE 3. Ablation study of the buffer feature over different architectures on the testing dataset

| Pipeline | Buffers | p | Classic SSIM | Classic PSNR | p | $\Delta B_0$-weighted SSIM | $\Delta B_0$-weighted PSNR |
|---|---|---|---|---|---|---|---|
| $\mathrm{BASE}_{\mathcal{C}}$ | - | n.s. | 0.9427 | 33.40 | ** | 0.8979 | 23.69 |
| $\mathrm{BASE}_{\mathcal{C}}$ | 3 |  | 0.9421 | 33.32 |  | 0.8947 | 23.50 |
| SVI | - | * | 0.9603 | 35.11 | * | 0.9490 | 26.57 |
| SVI | 3 |  | 0.9613 | 35.24 |  | 0.9498 | 26.57 |
| $\mathrm{SVI}_{1/2}$ | - | ** | 0.9575 | 34.71 | ** | 0.9493 | 26.98 |
| $\mathrm{SVI}_{1/2}$ | 3 |  | 0.9616 | 35.19 |  | 0.9541 | 27.31 |

Note: The best performing pipelines (SVI and $\mathrm{SVI}_{1/2}$) are evaluated with and without buffers, along with the baseline $\mathrm{BASE}_{\mathcal{C}}$ based on the non-correcting Fourier operator $F_{\Omega}$. The classic and $\Delta B_0$-weighted scores are averaged over the testing dataset, with statistical significance assessed through pairwise two-sided Wilcoxon signed-rank tests on the SSIM scores with a global Benjamini-Hochberg correction. n.s.: p > 0.05; *: p < 0.05; **: p < 0.005.

The baseline $\mathrm{BASE}_{\mathcal{C}}$, which combines UNets with conventional Fourier data consistency, shows significantly worse $\Delta B_0$ scores when using the buffer feature. One interpretation could be that, as consistency with corrupted measurements brings back artifacts (as observed in Figure 7), adding memory of previous iterations only leads to a noisier learning process without additional information.

On the other hand, the $\mathrm{SVI}_{1/2}$ approach shows significant and large improvements with buffers, more so than the SVI pipeline. However, it should be noted that the $\mathrm{SVI}_{1/2}$ classic scores without buffers are much lower than those of any other proposed pipeline, though still better than the baselines. This suggests that the exploratory correction strategy inherently degrades the overall reconstruction to favor the off-resonance areas, but that the buffer feature allows for a compensatory effect within the neural networks.

5 DISCUSSION

The diverse contributions of this article are discussed hereafter. First, we compared conventional methods29-31 to perform, to the best of our knowledge, the first partial off-resonance correction study. Second, we analyzed the deep learning contributions in a multi-parametric acceleration pipeline. Finally, we conducted a deep learning study over a fairly large in vivo dataset consisting of model-based, self-estimated, off-resonance-corrected volumes.14

5.1 Network and model improvements

The network and model contributions when facing a reduced number of iterations, compressed coils, and correction components have been studied in Section 4.

Overall, the number of coils $Q$ was observed to impact the metrics by changing the sensitivity distribution superficially rather than by improving the signal. Other normalization61 or coil compression62 methods could be explored in the future to at least balance this scoring issue. The proposed UNets appeared to mostly improve the initialization overall, and to help progression through the buffer feature when applying the otherwise less efficient $\mathrm{SVI}_{1/2}$ strategy. The latter shows how deep learning allows for more flexibility when designing hybrid algorithms. However, high-frequency details are still only recovered through model-based correction.

More elaborate strategies could be developed to efficiently distribute the correction components over iterations. In particular, for in-out SPARKLING trajectories the MTI coefficients were shown to correlate with high/low image frequencies. A better balance between low frequencies covered by neural networks and high frequencies enforced by specifically selected correction components might produce more reliable results.

5.2 Dataset generation

The $\Delta B_0$ map estimation technique developed in a previous publication14 assumes that the relationship between $B_0$ inhomogeneities and image phase is dominant over other phase sources for large echo times (e.g., 20 ms and higher) at 3T. It allows for a simple retrospective estimate of the $\Delta B_0$ field map, with minimal error and no motion-related mismatch, for SWI acquisitions and the like. It also avoids any assumption on the trajectory and can therefore be used on any dataset matching the above-mentioned acquisition setup and providing either the raw k-space data or complex-valued MR images.

The self-estimation method was carefully assessed during the early stages of the study and met our expectations. However, two other competing methods were also considered: the first by Lee et al.,38 through simulation from a binary mask, and the second by Patzig et al.,35 through non-convex optimization. The simulation method in particular was implemented in Python, but the required mask obtained from the artifacted images was suboptimal. The main advantage of these three self-estimation methods for the purpose of our proposed acceleration pipeline is that they require strictly the same data. This implies that the pipeline could be efficiently applied within the clinical context, and additionally improve over sessions using the same acquisitions processed through longer, dedicated procedures.

6 CONCLUSIONS

MR protocols based on non-Cartesian acquisitions tend to exploit longer but fewer complex readouts to reduce overall scan times, but suffer more from $B_0$ inhomogeneities. Diverse methods exist to compensate for these artifacts, but the few non-constraining ones slow down reconstructions by an unacceptably large factor (10-20 at 3T). Deep learning techniques have been developed in recent years, but mostly either to estimate the $\Delta B_0$ field map, which can already be self-estimated using efficient models for SWI, or to compensate for undersampling artifacts in the CS setting. The proposed approach combines partial models and deep learning to ensure data fidelity at a low computation cost when addressing off-resonance artifacts, and to gain in image quality; it outlines both their individual and joint contributions. The MR volumes reconstructed at inference in only 7-8 min compete with baselines obtained in 30 min that approximate 8-h-long computations, and could be further accelerated by developing new hybrid strategies. Future work might also explore the parallel aspects of the correcting models to better fit the memory constraints of GPUs, or simply include undersampling compensation through self-supervised learning methods.

ACKNOWLEDGMENTS

The concepts and information presented in this article are based on research results that are not commercially available. Future availability cannot be guaranteed. This work was granted access to the HPC resources of IDRIS under the allocation 2021-AD011011153 made by GENCI. We thank Dr. Zaccharie Ramzi and Chaithya G R for their previous works and open-source contributions. Additionally, we thank Cecilia Garrec for the scientific editing in English. Philippe Ciuciu received funding from Siemens Healthineers (France) to support this research.
