Volume 3, Issue 6 e1207
SPECIAL ISSUE PAPER

A block preconditioner for two-phase flow in porous media by mixed hybrid finite elements

Stefano Nardean

Division of Sustainable Development, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar

Massimiliano Ferronato

Department of Civil, Environmental and Architectural Engineering, University of Padova, Padova, Italy

Ahmad S. Abushaikha

Corresponding Author

Division of Sustainable Development, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar

Correspondence: Ahmad S. Abushaikha, Division of Sustainable Development, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar.

Email: [email protected]

First published: 16 October 2021

Funding information: Qatar National Research Fund, NPRP10-0208-170407

Abstract

In this work, we present an original block preconditioner to improve the convergence of Krylov solvers for the simulation of two-phase flow in porous media. In our modeling approach, the set of coupled governing equations is addressed in a fully implicit fashion, where Darcy's law and mass conservation are discretized in an original way by combining the mixed hybrid finite element (MHFE) and the finite volume (FV) methods. Solving the sequence of large nonsymmetric linearized systems of equations that arise during a full-transient simulation is the most time- and resource-consuming task, which motivates the need for efficient preconditioned Krylov solvers. The proposed preconditioner exploits the block structure of the Jacobian matrix while coping with the nonsymmetric nature of the individual blocks. Both academic and realistic applications have been used to challenge the preconditioner, highlighting its robustness, stability and overall computational efficiency.

1 INTRODUCTION

Solving series of large sparse systems of equations is a key task in modeling transient physical processes, and it often entails a high computational cost. The target of this paper is the efficient solution of the linear(ized) systems of equations originating from the simulation of two-phase flow in porous media, which is a classical fundamental application for the energy industry. Despite the continuous development of computer hardware, with ever more powerful processing units and larger memory capacities being released, high runtime and storage demands are frequent drawbacks in reservoir modeling, because model sizes keep growing to obtain more accurate simulations. The solution procedure, and hence the linear solver, is usually the main cause. In typical reservoir simulations, around 60% to 80% of the total CPU time is spent in this task.1, 2

The flow of two fluid phases in porous media is governed by the mass balance equations coupled with Darcy's law. In our modeling approach, the weak formulation of this set of PDEs is obtained by applying the Finite Volume (FV) method with a low-order time-marching scheme to the continuity equations and the Mixed Hybrid Finite Element (MHFE)3 method to Darcy's law. The FV method is a well-established choice in reservoir simulations, since it guarantees mass conservation at both the local and global level, which is a primary requirement in transport-based problems. On the other hand, a number of schemes can be found in the literature for the discretization of Darcy's law. The most popular one is perhaps the simple two-point flux approximation (TPFA),4 which, however, suffers from well-known limitations in dealing with full tensor rock-fluid properties and unstructured grids. Its extension, the multi-point flux approximation (MPFA) variant,5-7 is another popular strategy along with the mimetic finite difference (MFD) scheme, which has been recently used in a two-phase flow model in fractured media.8 Nevertheless, the MHFE method has become a robust alternative thanks to its accuracy in approximating the fluid pressure and velocity, even on unstructured meshes with full tensor medium properties, and to the continuity of the normal component of the fluxes through the grid interfaces, which is obtained while honoring mass conservation at the element level.9, 10 Despite a few drawbacks, such as the enlargement of the stencil or the breach of the discrete maximum principle,11 MHFE and the pioneering nonhybridized version, the mixed finite element (MFE) method, have gained popularity in the last decade. Recent significant applications include coupled flow-poromechanics,12-16 deformation models,17, 18 multiphase19-23 and coupled Stokes–Darcy24 flow in porous media.
An MHFE discretization of the two-phase flow model in porous media, which is similar to the one introduced in this work, was proposed by Fučík and Mikyška,25 but it relied on a sequential splitting technique to handle the inherent coupling of the governing equations. In our modeling approach, instead, the coupled solution is addressed by an unconditionally stable fully implicit approach, whereas the inherent nonlinearity is dealt with by a Newton scheme, which requires at each time step the solution of a sequence of linearized systems of equations with the Jacobian matrix. Since such systems are usually very large (i.e., $\mathcal{O}(10^6)$ unknowns and more) and ill-conditioned, iterative techniques, and specifically preconditioned Krylov subspace methods,26 are necessary for their solution.

In order to attain fast convergence, Krylov subspace solvers need to be equipped with robust and efficient preconditioners that take advantage of the Jacobian block structure. In this regard, multistage preconditioners, such as the constrained pressure residual (CPR),27, 28 are the standard for industry reservoir simulators. In its basic structure, the two-stage multiplicative CPR algorithm couples an AMG preconditioner for the usually elliptic pressure block and an incomplete factorization for the full matrix. The popularity of this scheme is mostly related to the high scalability of AMG, which is tailored for elliptic problems. The classical version of the CPR preconditioner, however, does not prove highly effective for the MHFE-FV-discretized two-phase flow model, since it is unable to capture the composite structure of the nonsymmetric pressure block, resulting from the introduction of pressure unknowns on both elements and faces. Such a submatrix exhibits, in fact, an inner 2 × 2 block structure that can be well exploited by means of a block preconditioning strategy. Although block preconditioners have found some application in reservoir simulators only in recent years,29 they have long been developed for other problems,30 such as the solution of the Navier–Stokes equations,31, 32 coupled poromechanics,33-35 electromagnetism,36, 37 and contact mechanics,38-40 just to cite some.

Specific preconditioners have also been developed in the past for the two-phase flow problem in porous media. Bergamaschi et al.41 developed a preconditioning technique upon a finite element discretization of the governing equations. The preconditioner consists of an incomplete factorization of the whole Jacobian matrix which is repeatedly updated during the simulation. More recently, Skogestad et al.42 followed a different approach by designing an ASPIN-like43 nonlinear preconditioner based on a TPFA approximation of Darcy's equation. Nonlinear preconditioners operate at the level of the nonlinear problem, rather than on the linearized systems. To the best of our knowledge, this work is one of the first attempts to design a full block preconditioner for the two-phase flow problem in porous media discretized by a combined MHFE-FV strategy.

The rest of the paper is organized as follows. In Section 2 the set of equations governing the two-phase flow in porous media is presented along with the numerical formulation. Our block preconditioner is then introduced and tested in Sections 3 and 4, respectively, where a set of numerical applications is used to assess the efficiency and robustness of the proposed approach in both academic and realistic settings. The discussion and conclusions section finally closes the paper.

2 MODELING THE TWO-PHASE FLOW IN POROUS MEDIA

The mass conservation equations and Darcy's law, as rearranged by Muskat and Meres,44 provide a mathematical description of the simultaneous flow of multiple fluids in porous media. We consider specifically an isothermal two-phase flow setting, where the fluids consist of oil (o) and water (w). These are assumed to be incompressible and immiscible, resulting in constant densities and dynamic viscosities, whereas the porous matrix is allowed to deform due to the pore pressure change. Capillary forces are also neglected in the model. Following the natural variables formulation,45 the model unknowns are: (i) the pressure of the nonwetting phase, that is, oil, $p_o$, (ii) the top well pressure, $p_b$, and (iii) the water saturation, $S_w$. The mass conservation and Darcy's law, written for both phases, form a set of coupled nonlinear equations, whose solution is obtained through a fully implicit monolithic approach, that is, all model unknowns are computed simultaneously.

2.1 Mathematical model

Let us consider a finite three-dimensional (3-D) porous domain, $\Omega$, with its boundary $\Gamma$ and the closure $\overline{\Omega} = \Omega \cup \Gamma$. The boundary is subdivided into two portions, $\Gamma_p$ and $\Gamma_v$, such that $\Gamma_p \cup \Gamma_v = \Gamma$ and $\Gamma_p \cap \Gamma_v = \emptyset$, where Dirichlet and Neumann conditions, respectively, are prescribed as necessary. We denote time and the simulated open temporal interval of size $T$ with $t$ and $\mathbb{T}$, respectively. Indicating also with $f_\alpha : \Omega \times \mathbb{T} \to \mathbb{R}$ the source or sink term for phase $\alpha$ due to the action of wells, $p_\alpha : \overline{\Omega} \times [0, T] \to \mathbb{R}$ the phase fluid pressure, $S_\alpha : \overline{\Omega} \times [0, T] \to [0, 1]$ the relevant saturation, $v_\alpha : \overline{\Omega} \times [0, T] \to \mathbb{R}^3$ the velocity vector and $\phi : \overline{\Omega} \to\, ]0, 1[$ the medium porosity, the set of governing PDEs is given by:
$$\frac{\partial (\phi S_\alpha)}{\partial t} + \nabla \cdot v_\alpha = f_\alpha, \quad \alpha = o, w \quad \text{on } \Omega \times \mathbb{T} \quad \text{(mass conservation)}, \qquad (1a)$$
$$v_\alpha = -\lambda_\alpha K \nabla \left( p_\alpha - \gamma_\alpha z \right), \quad \alpha = o, w \quad \text{on } \Omega \times \mathbb{T} \quad \text{(Darcy's law)}, \qquad (1b)$$
where the symbols $\nabla$ and $\nabla\cdot$ denote the gradient and divergence operators, respectively. Here, $K$ is the permeability tensor, assumed to be symmetric and positive definite (SPD), $z$ is depth (positive downward), $\lambda_\alpha = k_{r\alpha} / \mu_\alpha$ is the $\alpha$-phase mobility factor, with $k_{r\alpha}$ and $\mu_\alpha$ the relative permeability and the phase dynamic viscosity, respectively, and $\gamma_\alpha$ is the fluid specific weight. Notice, in Equation (1b), that $p_w = p_o$ since capillarity is not considered in the model; therefore, in the rest of the paper, the subscript will be dropped and pressure will be simply referred to as $p$.
The set of PDEs (1a) and (1b) is incomplete and cannot be addressed unless additional conditions are supplied. Following the classical Brooks–Corey relationship,46 the phase relative permeability can be mathematically expressed as a function of the saturation:
$$k_{r\alpha} = k_{r\alpha}^{1} \, \overline{S}_\alpha^{\,n_\alpha}, \quad \alpha = o, w, \qquad (2)$$
where $k_{r\alpha}^{1}$ is the relative permeability at full saturation, $n_\alpha$ is Corey's parameter and
$$\overline{S}_\alpha = \frac{S_\alpha - S_{\alpha r}}{1 - S_{\alpha r} - S_{\beta r}}$$
is the normalized phase saturation. Denoting with $\alpha$ and $\beta$ the two phases, $S_{\alpha r}$ and $S_{\beta r}$ are the relevant residual saturations. Porosity is allowed to vary with the pore pressure according to the well-known relationship:
$$\phi = \phi_0 \left[ 1 + c_r \left( p - p_0 \right) \right], \qquad (3)$$
where $c_r$ is the lumped factor expressing the soil compressibility, along with $p_0 : \Omega \to \mathbb{R}$ and $\phi_0 : \Omega \to\, ]0, 1[$ the initial pressure and porosity, respectively. Regarding the well equations, we adopt the classical Peaceman model,47, 48 which is briefly recalled in Appendix A.1. Finally, the local constraint on saturation
$$\sum_{\alpha = o, w} S_\alpha = 1 \qquad (4)$$
closes the problem.
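As a minimal sketch of the constitutive relationships (2) and (3), the following Python functions evaluate the Brooks–Corey relative permeability and the pressure-dependent porosity. The function and argument names are illustrative assumptions and are not taken from the authors' implementation; clipping the normalized saturation to [0, 1] is an added safeguard that the text does not mention.

```python
def normalized_saturation(s_alpha, s_alpha_res, s_beta_res):
    """Normalized saturation of phase alpha (hypothetical helper).

    s_alpha_res and s_beta_res are the residual saturations of the two phases.
    The result is clipped to [0, 1]; the clipping is an assumption of this sketch.
    """
    s_bar = (s_alpha - s_alpha_res) / (1.0 - s_alpha_res - s_beta_res)
    return min(max(s_bar, 0.0), 1.0)


def relative_permeability(s_alpha, s_alpha_res, s_beta_res, kr_full, n_corey):
    """Brooks-Corey law, Eq. (2): kr = kr_full * s_bar ** n_corey."""
    s_bar = normalized_saturation(s_alpha, s_alpha_res, s_beta_res)
    return kr_full * s_bar ** n_corey


def porosity(p, p0, phi0, c_r):
    """Pressure-dependent porosity, Eq. (3): phi = phi0 * (1 + c_r * (p - p0))."""
    return phi0 * (1.0 + c_r * (p - p0))
```

For instance, with zero residual saturations, `relative_permeability(0.5, 0.0, 0.0, 1.0, 2.0)` evaluates a quadratic Corey curve at half saturation, and `porosity(p0, p0, phi0, c_r)` returns the initial porosity `phi0`.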
The mass conservation Equation (1a), written for both oil and water, describes a transport problem and exhibits a hyperbolic character. With the aim of improving the algebraic properties of the discretized systems, it is possible to replace one of the two equations with a parabolic equation obtained by summing them and introducing relationship (4). This is a rather classical trick, which removes the explicit dependency on $S_\alpha$ from the accumulation term. The resulting equation is often referred to as the pressure equation. The surviving phase mass balance equation is usually written for water and is also denoted as the saturation equation. In conclusion, the final equivalent set of PDEs consists of Equation (1b) in addition to:
$$\frac{\partial \phi}{\partial t} + \nabla \cdot v = f \quad \text{on } \Omega \times \mathbb{T} \quad \text{(pressure equation)}, \qquad (5a)$$
$$\frac{\partial (\phi S_w)}{\partial t} + \nabla \cdot v_w = f_w \quad \text{on } \Omega \times \mathbb{T} \quad \text{(saturation equation)}, \qquad (5b)$$
where $v = v_o + v_w$ and $f = f_o + f_w$. The set of PDEs (1b), (5a), and (5b), along with the ancillary relationships (2)-(4), defines a well-posed problem provided that proper initial and boundary conditions are supplied:
$$p|_{t=0} = p_0 \quad \text{in } \Omega \quad \text{(initial fluid pressure)}, \qquad (6a)$$
$$S_w|_{t=0} = S_w^0 \quad \text{in } \Omega \quad \text{(initial water saturation)}, \qquad (6b)$$
$$p = \overline{p} \quad \text{on } \Gamma_p \times \mathbb{T} \quad \text{(prescribed fluid pressure)}, \qquad (6c)$$
$$-\lambda_\alpha K \nabla p \cdot n = \overline{v}_n^{\alpha} \quad \text{on } \Gamma_v \times \mathbb{T} \quad \text{(prescribed Darcy's flux)}, \qquad (6d)$$
for given functions $S_w^0 : \Omega \to [0, 1]$, $\overline{p} : \Gamma_p \times \mathbb{T} \to \mathbb{R}$, and $\overline{v}_n^{\alpha} : \Gamma_v \times \mathbb{T} \to \mathbb{R}$. In Equation (6d), $n$ denotes the outer unit normal vector to $\Gamma_v$.
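For reference, the step leading from Equation (1a) to the pressure equation (5a) can be written out explicitly. The lines below are only a sketch of the standard argument; they use relationships (3) and (4) and additionally assume that $\phi_0$, $p_0$, and $c_r$ do not depend on time:

```latex
\begin{align*}
\sum_{\alpha=o,w}\left[\frac{\partial(\phi S_\alpha)}{\partial t}
  + \nabla\cdot v_\alpha\right] &= \sum_{\alpha=o,w} f_\alpha, \\
\frac{\partial\left[\phi\,(S_o + S_w)\right]}{\partial t}
  + \nabla\cdot(v_o + v_w) &= f_o + f_w, \\
\frac{\partial\phi}{\partial t} + \nabla\cdot v &= f
  && \text{since } S_o + S_w = 1, \\
\phi_0\, c_r\,\frac{\partial p}{\partial t} + \nabla\cdot v &= f
  && \text{since } \phi = \phi_0\left[1 + c_r\,(p - p_0)\right].
\end{align*}
```

In the last line the time derivative acts on the pressure only, while $v$ depends on $\nabla p$ through Darcy's law (1b), which is consistent with the parabolic character attributed to the pressure equation.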

2.2 Numerical model

Given a porous domain $\Omega$ partitioned into a set of nonoverlapping hexahedra, let $\mathcal{E}_h$ and $\mathcal{F}_h$ denote the collections of the grid elements and faces, of size $N_e$ and $N_f$, respectively. With reference to the characteristic hexahedral element of the grid depicted in Figure 1, the pressure unknowns are located on the element centroid, $p^E$, and on each face barycenter, $\pi_p$. The water saturation, $S_w^E$, instead, is computed only on the element centroid. Let also $z^E$ and $\pi_z$ be the depth (positive downward) of the element centroid and faces, respectively. In the following, we describe the discretization of the PDEs in Equations (1b), (5a), and (5b).

FIGURE 1. Location of the unknowns within the reference element

Equation (1b) is integrated by using the $\mathbb{P}_0$ approximation space for pressures ($p^h$ and $\pi_p^h$) and water saturation ($S_w^h$), while the MHFE method in the lowest-order Raviart–Thomas $\mathbb{RT}_0$ space49 is used to discretize the velocity ($v_\alpha^h$, $\alpha = o, w$). In a 3-D setting, the latter space consists of local piecewise trilinear vector functions, $\eta_i^E(x, y, z)$, defined for each face belonging to element $E$. These basis functions satisfy the two properties recalled in Nardean et al.50 The unknowns $p^E$ and $S_w^E$ express the average value of the relevant physical quantities within the element, whereas $\pi_p$ plays a similar role for the faces. The face pressures in the MHFE method play the role of Lagrange multipliers, which are introduced to enforce the continuity of the normal component of the fluxes across the inter-element faces.

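Following a similar mathematical derivation as in Abushaikha et al.,21 the discretized version of Equation (1b) reads:
$$\mathbf{q}_\alpha^E = \boldsymbol{\lambda}_\alpha \left( B^E \right)^{-1} \left[ \left( p^E \mathbf{1} - \boldsymbol{\pi}_p^E \right) - \gamma_\alpha \left( z^E \mathbf{1} - \boldsymbol{\pi}_z^E \right) \right], \quad \alpha = o, w, \qquad (7)$$
where $\mathbf{q}_\alpha^E$, $\boldsymbol{\pi}_p^E$, and $\boldsymbol{\pi}_z^E$ are the vectors gathering the fluxes of phase $\alpha$, interface pressures and depths of element $E$, respectively, and $\mathbf{1} \in \mathbb{R}^{N_f^E}$ is the vector of unitary components, with $N_f^E$ the number of faces per element. Moreover, $\boldsymbol{\lambda}_\alpha$ is the diagonal matrix containing, for each face, the fluid mobility of the upstream element, and $(B^E)^{-1} \in \mathbb{R}^{N_f^E \times N_f^E}$ is the inverse of the elementary matrix, whose components are:
$$B_{ij}^E = \int_{\Omega^E} \left( \eta_i^E \right)^T \left( K^E \right)^{-1} \eta_j^E \, d\Omega, \quad i, j = 1, \ldots, N_f^E, \qquad (8)$$
with $\Omega^E$ the element volume. Notice that $(B^E)^{-1}$ is SPD if $K^E$ is so.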
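To make the structure of Equation (7) concrete, the sketch below evaluates the face fluxes of one phase for a single element. It is only an illustration under stated assumptions: the inverse elementary matrix `B_inv` and the upstream mobilities `mobility_up` (the diagonal of the mobility matrix) are taken as given inputs, all names are hypothetical, and neither the assembly of Equation (8) nor the upstream selection is shown.

```python
import numpy as np


def element_fluxes(B_inv, mobility_up, p_elem, pi_p, z_elem, pi_z, gamma):
    """Face fluxes of one phase for one element, following Eq. (7).

    B_inv       : (n_faces, n_faces) inverse of the elementary matrix B^E
    mobility_up : (n_faces,) upstream mobility per face (diagonal of lambda)
    p_elem      : element-centered pressure (scalar)
    pi_p        : (n_faces,) face pressures
    z_elem      : depth of the element centroid (scalar, positive downward)
    pi_z        : (n_faces,) depths of the face barycenters
    gamma       : specific weight of the phase
    """
    pi_p = np.asarray(pi_p, dtype=float)
    pi_z = np.asarray(pi_z, dtype=float)
    # Potential difference between the element centroid and each face.
    potential_drop = (p_elem - pi_p) - gamma * (z_elem - pi_z)
    # Multiplying by a diagonal matrix equals an elementwise scaling.
    return np.asarray(mobility_up, dtype=float) * (B_inv @ potential_drop)
```

With a hydrostatic state, that is, `p_elem = gamma * z_elem` and `pi_p = gamma * pi_z`, the potential difference vanishes on every face and the function returns zero fluxes, as expected from Equation (1b).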
The pressure Equation (5a) is discretized in space with an FV scheme where the grid elements serve as control volumes, yielding:
$$\int_{\Omega^E} \frac{\partial \phi}{\partial t} \, d\Omega + \int_{\Omega^E} \nabla \cdot v \, d\Omega - \int_{\Omega^E} f \, d\Omega = 0, \quad \forall E \in \mathcal{E}_h.$$
Introducing the unconditionally stable backward Euler scheme for the time discretization and recalling that the second term at the left-hand side is equivalent to the sum of the total fluxes across the element faces, we have:
$$|\Omega^E| \, \frac{\phi\left(p_{n+1}^E\right) - \phi\left(p_n^E\right)}{\Delta t_n} + \sum_{i=1}^{N_f^E} q_i^E - |\Omega^E| \, f^E = 0, \quad \forall E \in \mathcal{E}_h, \qquad (9)$$
where $q_i^E = q_{o,i}^E + q_{w,i}^E$. Furthermore, the subscript $n$ denotes the previous time step, $n+1$ the current one, and $\Delta t_n = t_{n+1} - t_n$ is the time increment. Similarly, the discretized version of the saturation equation reads:
$$|\Omega^E| \, \frac{\phi\left(p_{n+1}^E\right) S_{w,n+1}^E - \phi\left(p_n^E\right) S_{w,n}^E}{\Delta t_n} + \sum_{i=1}^{N_f^E} q_{w,i}^E - |\Omega^E| \, f_w^E = 0, \quad \forall E \in \mathcal{E}_h. \qquad (10)$$
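The element-wise residuals in Equations (9) and (10) can be sketched as follows. This is an illustrative fragment only: the face fluxes are assumed to be already available (for example from Equation (7)), the porosity law is Equation (3), and all function and argument names are hypothetical.

```python
def porosity(p, p0, phi0, c_r):
    """Pressure-dependent porosity, Eq. (3)."""
    return phi0 * (1.0 + c_r * (p - p0))


def pressure_residual(vol, p_new, p_old, dt, total_face_fluxes, f, p0, phi0, c_r):
    """Residual of the discrete pressure equation, Eq. (9), for one element.

    total_face_fluxes holds q_i = q_{o,i} + q_{w,i} for each face of the element.
    """
    accumulation = vol * (porosity(p_new, p0, phi0, c_r)
                          - porosity(p_old, p0, phi0, c_r)) / dt
    return accumulation + sum(total_face_fluxes) - vol * f


def saturation_residual(vol, p_new, p_old, sw_new, sw_old, dt,
                        water_face_fluxes, f_w, p0, phi0, c_r):
    """Residual of the discrete saturation equation, Eq. (10), for one element."""
    accumulation = vol * (porosity(p_new, p0, phi0, c_r) * sw_new
                          - porosity(p_old, p0, phi0, c_r) * sw_old) / dt
    return accumulation + sum(water_face_fluxes) - vol * f_w
```

Both residuals vanish when the accumulation over the time step balances the net outflow through the faces and the well contribution, which is the local mass conservation statement enforced by the FV scheme.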

2.2.1 The MHFE-FV system of equations

Equations (1b), (5a), and (5b), along with the additional relationships (2)–(4), give rise, at each time step, to a nonlinear and frequently ill-conditioned fully implicit problem, where a single system with multiple types of unknowns, consisting of the sets of element, $p$, top well, $p_b$, and face, $\pi_p$, pressures, along with water saturation, $S_w$, is assembled and tackled at once. Such a system consists of four groups of equations, reflecting the number and size of the unknown types. However, since the number of wells in the model is expected to be much smaller than that of grid elements, the well pressures are incorporated into the set of element pressures, and the same holds for the relevant equations. Therefore, the final system of nonlinear equations, $R = 0$, consists of three sets, where the first one, $R_p = 0$, is given by (9) and (A5), written for each element and well, respectively. Equation (10), enforced for all grid elements, gives rise to the second set of system equations, $R_s = 0$. The continuity of the total fluxes across the grid faces forms the third set, $R_\pi = 0$:
$$\sum_{\alpha = o, w} \left( q_{\alpha,i}^{E'} + q_{\alpha,i}^{E''} \right) = 0, \quad \forall i \in \mathcal{F}_h, \qquad (11)$$
where $E'$ and $E''$ denote the elements sharing the $i$th grid face and $q_{\alpha,i}^{E'}$ and $q_{\alpha,i}^{E''}$ are the relevant fluxes as expressed in Equation (7).
The nonlinear problem $R = 0$ is addressed by applying a classical Newton technique. This involves solving iteratively a sequence of linearized systems of equations
$$\mathcal{J}^{(k)} \Delta x = -R^{(k)} \quad \Rightarrow \quad \begin{bmatrix} J_{pp} & J_{ps} & J_{p\pi} \\ J_{sp} & J_{ss} & J_{s\pi} \\ J_{\pi p} & J_{\pi s} & J_{\pi\pi} \end{bmatrix}^{(k)} \begin{bmatrix} \Delta x_p \\ \Delta x_s \\ \Delta x_\pi \end{bmatrix} = -\begin{bmatrix} R_p \\ R_s \\ R_\pi \end{bmatrix}^{(k)} \qquad (12)$$
to obtain the increment $\Delta x$ used to update the current solution:
$$x^{(k+1)} = x^{(k)} + \Delta x. \qquad (13)$$
In Equations (12) and (13), $\mathcal{J}^{(k)}$ is the Jacobian matrix and $R^{(k)}$ is the residual vector, both evaluated at the previous iteration $k$, while $x^{(k)}$ and $x^{(k+1)}$ denote the solution at the previous and current iteration, respectively. Also, $J_{pp} \in \mathbb{R}^{(N_e + N_w) \times (N_e + N_w)}$, $J_{ss} \in \mathbb{R}^{N_e \times N_e}$, and $J_{\pi\pi} \in \mathbb{R}^{N_f \times N_f}$, with $N_w$ the number of wells. The analytical expression of the Jacobian blocks is detailed in Appendix B.1. Notice that when the gravitational effects are neglected, $J_{\pi s} = 0$, otherwise all blocks are nonzero. The overall Jacobian matrix is nonsymmetric, as are the diagonal blocks, with the exception of $J_{\pi\pi}$, which is SPD when gravitational forces are neglected.
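The Newton iteration of Equations (12) and (13) can be sketched on a generic nonlinear system. The fragment below is an illustration under stated assumptions: it uses a small dense toy problem with a direct linear solve in place of the preconditioned Krylov solver, a single 2-norm check on the whole residual, and hypothetical names (`newton`, `toy_residual`, `toy_jacobian`).

```python
import numpy as np


def newton(residual, jacobian, x0, tol=1e-6, max_iter=20):
    """Newton loop: solve J(x_k) dx = -R(x_k), then x_{k+1} = x_k + dx."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(max_iter):
        R = residual(x)
        if np.linalg.norm(R) < tol:          # absolute residual check
            return x, k
        dx = np.linalg.solve(jacobian(x), -R)  # linearized system, Eq. (12)
        x = x + dx                             # update, Eq. (13)
    if np.linalg.norm(residual(x)) < tol:
        return x, max_iter
    raise RuntimeError("Newton iteration did not converge")


def toy_residual(x):
    """Toy 2x2 nonlinear system with a root at (1, 2); not the flow model."""
    return np.array([x[0] ** 2 + x[1] - 3.0,
                     x[0] + x[1] ** 2 - 5.0])


def toy_jacobian(x):
    """Analytical Jacobian of toy_residual."""
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])


solution, iterations = newton(toy_residual, toy_jacobian, [1.5, 1.5])
```

In the actual simulator the call to `np.linalg.solve` would be replaced by a preconditioned Krylov iteration on the block system (12), which is where the preconditioner of Section 3 enters.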

3 FULL BLOCK PRECONDITIONER

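The design of a preconditioner for a certain numerical problem relies on the fundamental requirement that its application (to a vector) should mimic as closely as possible that of the inverse of the system matrix. Block preconditioners, in particular, achieve this goal by exploiting the inherent block structure of the system matrix, that is, $\mathcal{J}$ in Equation (12). The superscript $(k)$ is dropped here to simplify the notation. From an algebraic viewpoint, the starting point is the block $\mathcal{L}\mathcal{D}\mathcal{U}$ decomposition of the Jacobian:
$$\mathcal{J} = \mathcal{L}\mathcal{D}\mathcal{U} \quad \Rightarrow \quad \begin{bmatrix} J_{pp} & J_{ps} & J_{p\pi} \\ J_{sp} & J_{ss} & J_{s\pi} \\ J_{\pi p} & J_{\pi s} & J_{\pi\pi} \end{bmatrix} = \begin{bmatrix} I_p & 0 & 0 \\ L_{sp} & I_s & 0 \\ L_{\pi p} & L_{\pi s} & I_\pi \end{bmatrix} \begin{bmatrix} J_{pp} & 0 & 0 \\ 0 & S_1 & 0 \\ 0 & 0 & S_2 \end{bmatrix} \begin{bmatrix} I_p & U_{ps} & U_{p\pi} \\ 0 & I_s & U_{s\pi} \\ 0 & 0 & I_\pi \end{bmatrix}, \qquad (14)$$
where $I_p$, $I_s$, and $I_\pi$ are the identity matrices of order $N_e + N_w$, $N_e$, and $N_f$, respectively. $S_1$ and $S_2$ are the first- and second-level Schur complements:
$$S_1 = J_{ss} - J_{sp} J_{pp}^{-1} J_{ps}, \qquad S_2 = J_{\pi\pi} - J_{\pi p} J_{pp}^{-1} J_{p\pi} - \left( J_{\pi s} - J_{\pi p} J_{pp}^{-1} J_{ps} \right) S_1^{-1} \left( J_{s\pi} - J_{sp} J_{pp}^{-1} J_{p\pi} \right), \qquad (15)$$
while the blocks in the lower and upper triangular matrices read:
$$L_{sp} = J_{sp} J_{pp}^{-1}, \qquad L_{\pi p} = J_{\pi p} J_{pp}^{-1}, \qquad L_{\pi s} = \left( J_{\pi s} - J_{\pi p} J_{pp}^{-1} J_{ps} \right) S_1^{-1}, \qquad (16)$$
$$U_{ps} = J_{pp}^{-1} J_{ps}, \qquad U_{p\pi} = J_{pp}^{-1} J_{p\pi}, \qquad U_{s\pi} = S_1^{-1} \left( J_{s\pi} - J_{sp} J_{pp}^{-1} J_{p\pi} \right). \qquad (17)$$
Inverting matrix $\mathcal{J}$ exactly by exploiting the $\mathcal{L}\mathcal{D}\mathcal{U}$ factorization (14) yields:
$$\mathcal{J}^{-1} = \mathcal{U}^{-1} \mathcal{D}^{-1} \mathcal{L}^{-1} = \begin{bmatrix} I_p & -U_{ps} & U_{ps} U_{s\pi} - U_{p\pi} \\ 0 & I_s & -U_{s\pi} \\ 0 & 0 & I_\pi \end{bmatrix} \begin{bmatrix} J_{pp}^{-1} & 0 & 0 \\ 0 & S_1^{-1} & 0 \\ 0 & 0 & S_2^{-1} \end{bmatrix} \begin{bmatrix} I_p & 0 & 0 \\ -L_{sp} & I_s & 0 \\ L_{\pi s} L_{sp} - L_{\pi p} & -L_{\pi s} & I_\pi \end{bmatrix}.$$
The explicit inversion of $J_{pp}$, $S_1$, and $S_2$ is usually not available. Using sparse and cheap approximations of these blocks produces an incomplete inverse of $\mathcal{J}$ that is used as block preconditioner for the overall problem:
$$\mathcal{P}^{-1} = \begin{bmatrix} I_p & -\tilde{U}_{ps} & \tilde{U}_{ps} \tilde{U}_{s\pi} - \tilde{U}_{p\pi} \\ 0 & I_s & -\tilde{U}_{s\pi} \\ 0 & 0 & I_\pi \end{bmatrix} \begin{bmatrix} \tilde{J}_{pp}^{-1} & 0 & 0 \\ 0 & \tilde{S}_1^{-1} & 0 \\ 0 & 0 & \tilde{S}_2^{-1} \end{bmatrix} \begin{bmatrix} I_p & 0 & 0 \\ -\tilde{L}_{sp} & I_s & 0 \\ \tilde{L}_{\pi s} \tilde{L}_{sp} - \tilde{L}_{\pi p} & -\tilde{L}_{\pi s} & I_\pi \end{bmatrix}, \qquad (18)$$
where the blocks marked with a tilde are sparse approximations of the corresponding terms in Equations (15)–(17). Notice from Equations (B2a) that $J_{pp}$ has a near-diagonal structure, hence the Jacobi preconditioner already provides a good approximation $\tilde{J}_{pp}$ and allows $\tilde{S}_1$ to be computed almost exactly. With the same diagonal approximation $\tilde{J}_{pp}^{-1}$, the inexact versions $\tilde{L}_{sp}$, $\tilde{L}_{\pi p}$, $\tilde{U}_{ps}$, and $\tilde{U}_{p\pi}$ of the decoupling factors in Equations (16) and (17) are explicitly computed. The terms $\tilde{L}_{\pi s}$ and $\tilde{U}_{s\pi}$ are not explicitly stored, but whenever needed we use the approximations $\overline{L}_{\pi s} = J_{\pi s} - J_{\pi p} \tilde{J}_{pp}^{-1} J_{ps}$ and $\overline{U}_{s\pi} = J_{s\pi} - J_{sp} \tilde{J}_{pp}^{-1} J_{p\pi}$ and apply $\tilde{S}_1^{-1}$ implicitly. Finally, $\tilde{S}_2$ is computed explicitly by replacing $S_1^{-1}$ with the inverse of the diagonal of $\tilde{S}_1$.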
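The construction described in the paragraph above can be sketched as follows. This is an illustrative dense-matrix version under stated assumptions: the Jacobian blocks are passed as NumPy arrays rather than sparse matrices, the function name and the dictionary keys are hypothetical, and the suffix `pi` stands for the face block $\pi$.

```python
import numpy as np


def build_preconditioner_blocks(J_pp, J_ps, J_ppi, J_sp, J_ss, J_spi,
                                J_pip, J_pis, J_pipi):
    """Approximate blocks of Eqs. (15)-(17) with a Jacobi approximation of J_pp."""
    # Jacobi (diagonal) approximation of J_pp^{-1}.
    Jpp_inv = np.diag(1.0 / np.diag(J_pp))
    # Explicitly computed decoupling factors.
    L_sp = J_sp @ Jpp_inv
    L_pip = J_pip @ Jpp_inv
    U_ps = Jpp_inv @ J_ps
    U_ppi = Jpp_inv @ J_ppi
    # First-level Schur complement.
    S1 = J_ss - J_sp @ U_ps
    # Factors used in place of L_pis and U_spi; S1^{-1} is applied implicitly.
    L_pis_bar = J_pis - J_pip @ U_ps
    U_spi_bar = J_spi - J_sp @ U_ppi
    # Second-level Schur complement with S1^{-1} replaced by diag(S1)^{-1}.
    S1_diag_inv = np.diag(1.0 / np.diag(S1))
    S2 = J_pipi - J_pip @ U_ppi - L_pis_bar @ S1_diag_inv @ U_spi_bar
    return {"Jpp_inv": Jpp_inv, "L_sp": L_sp, "L_pip": L_pip,
            "U_ps": U_ps, "U_ppi": U_ppi, "S1": S1, "S2": S2,
            "L_pis_bar": L_pis_bar, "U_spi_bar": U_spi_bar}
```

When $J_{pp}$ and $S_1$ happen to be exactly diagonal, both approximations are exact and the returned `S2` coincides with the second-level Schur complement of Equation (15); in general the result is only an approximation, whose quality is examined in Test 0.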
The application $v = [v_p, v_s, v_\pi]^T$ of $\mathcal{P}^{-1}$ in Equation (18) to a vector $w = [w_p, w_s, w_\pi]^T$, as required at each iteration of Krylov subspace solvers, is performed in a block fashion and consists of three main stages:
$$v_\pi = \tilde{S}_2^{-1} \left[ \left( \tilde{L}_{\pi s} \tilde{L}_{sp} - \tilde{L}_{\pi p} \right) w_p - \tilde{L}_{\pi s} w_s + w_\pi \right], \qquad (19)$$
$$v_s = \tilde{S}_1^{-1} \left( w_s - \tilde{L}_{sp} w_p \right) - \tilde{U}_{s\pi} v_\pi, \qquad (20)$$
$$v_p = \tilde{J}_{pp}^{-1} w_p - \tilde{U}_{ps} v_s - \tilde{U}_{p\pi} v_\pi. \qquad (21)$$
Algorithm 1 details the sequence of operations for implementing Equations (19)–(21). As a result of the nonsymmetry of both $\tilde{S}_1$ and $\tilde{S}_2$, we will consider only diagonal scaling and incomplete factorizations26 as local preconditioners for the Schur complement inverse applications. Leaving aside the products with $\tilde{S}_1^{-1}$ and $\tilde{S}_2^{-1}$, whose efficiency largely depends on the specific type of preconditioner being used, only sparse matrix-vector products and vector updates are required, which are highly efficient kernels on parallel platforms.

Algorithm 1. $v = \texttt{apply\_prec}\left( \tilde{J}_{pp}^{-1}, \tilde{S}_1^{-1}, \tilde{S}_2^{-1}, \tilde{L}_{sp}, \tilde{L}_{\pi p}, \overline{L}_{\pi s}, \tilde{U}_{ps}, \tilde{U}_{p\pi}, \overline{U}_{s\pi}, w \right)$
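A possible realization of the three stages in Equations (19)–(21) is sketched below. It is not the authors' Algorithm 1, whose body is not reproduced here, but a minimal Python version under stated assumptions: the three inverse applications are passed as callables acting on vectors, the remaining blocks are matrices supporting the `@` product, and the argument order and names are hypothetical. The factors `L_pis_bar` and `U_spi_bar` are the ones to which the first-level Schur complement inverse is applied implicitly.

```python
import numpy as np


def apply_prec(Jpp_inv, S1_inv, S2_inv, L_sp, L_pip, L_pis_bar,
               U_ps, U_ppi, U_spi_bar, w_p, w_s, w_pi):
    """Apply the block preconditioner of Eq. (18) to w = [w_p, w_s, w_pi].

    Jpp_inv, S1_inv, S2_inv : callables returning the (approximate) inverse
                              applied to a vector.
    """
    # Forward elimination: action of the inverse lower block-triangular factor.
    y_s = w_s - L_sp @ w_p
    y_pi = w_pi - L_pip @ w_p - L_pis_bar @ S1_inv(y_s)
    # Stage 1, Eq. (19): face-pressure block.
    v_pi = S2_inv(y_pi)
    # Stage 2, Eq. (20): saturation block, with S1^{-1} applied implicitly.
    v_s = S1_inv(y_s - U_spi_bar @ v_pi)
    # Stage 3, Eq. (21): element/well pressure block.
    v_p = Jpp_inv(w_p) - U_ps @ v_s - U_ppi @ v_pi
    return v_p, v_s, v_pi
```

If all blocks are computed exactly and the three callables apply exact inverses, the result coincides with $\mathcal{J}^{-1} w$; with the sparse approximations of Section 3 it returns the preconditioned vector $\mathcal{P}^{-1} w$ instead. Each call requires two applications of the first-level Schur complement inverse and one of the second-level one, plus matrix-vector products and vector updates.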

4 NUMERICAL TESTS

The computational performance of the proposed block preconditioner has been assessed in five numerical applications, including both academic and realistic settings. In the first four tests (denoted as Test 0, 1, 2, and 3), we considered a domain reproducing a planar reservoir discretized with two uniform Cartesian grids, referred to as coarse and fine. The coarse grid is used only in Test 0, while the fine one, displayed in Figure 2A, is considered in Tests 1 to 3. By contrast, in Test 4, the planar reservoir has been deformed into a dome formation, resulting in a non-Cartesian grid built upon the fine discretization (Figure 2B). The production scenario is common to all tests and simulates a waterflooding process by means of one injection and one production well located at two opposite corners. All grids have been created with the Gmsh software.51

FIGURE 2. View of the reservoir models. The blue and red arrows indicate the location of the injection and production wells, respectively. The size is in meters

The reservoir in Test 0, discretized by the coarse grid, is characterized by homogeneous and isotropic permeability and the gravitational forces are taken into account. The main purpose of this application is to analyze the structure of $\tilde{S}_1$ and $\tilde{S}_2$ and how it typically evolves during an unsteady simulation. We also compare the eigenspectrum of the exact and inexact versions of the Schur complements to evaluate the spectral effect of the approximations.

Tests 1 to 4 are characterized by different settings for the grid type, porosity and permeability fields, with gravity possibly taken into account. One of the objectives of the study is to analyze the robustness and effectiveness of the preconditioner in tackling the ill-conditioning of the systems of equations originating from models with highly heterogeneous and anisotropic permeability distributions. Gravity, on the other hand, modifies the overall problem structure by introducing an additional nonzero coupling block in the Jacobian. A summary of the settings of Tests 0–4 is provided in Table 1.

TABLE 1. Set-up of the test cases

| Test | 0/2 | 1 | 3 | 4 |
|---|---|---|---|---|
| Reservoir type | Planar | Planar | Planar | Dome |
| Cond. tensor properties | Homogeneous, isotropic | Homogeneous, isotropic | Heterogeneous, anisotropic | Heterogeneous, anisotropic |
| Cond. tensor type | Diagonal | Diagonal | Diagonal | Full |
| Horiz. conductivity $k_{x,y}$ [m/d] | 1.E-12 | 1.E-12 | [3.53E-15, 1.97E-8] | [3.53E-15, 1.97E-8] |
| Vert. conductivity $k_z$ [m/d] | 1.E-12 | 1.E-12 | [5.92E-19, 5.92E-9] | [1.E-13, 5.92E-9] |
| Porosity $\phi_0$ [-] | 2.5E-1 | 2.5E-1 | [3.80E-5, 4.37E-1] | [3.80E-5, 4.37E-1] |
| Oil spec. gravity $\gamma_o$ [kPa/m] | 8.00 | - | - | - |
| Water spec. gravity $\gamma_w$ [kPa/m] | 9.81 | - | - | - |

Note: The values of the permeability and porosity in brackets are the minimum and maximum of the SPE10 model portion used herein. The distribution of the relevant values in the domain follows that of the SPE10 data set. Common settings: (oil dynamic viscosity) $\mu_o = 2.3148\text{E-}11$ kPa d, (water dynamic viscosity) $\mu_w = 1.1574\text{E-}11$ kPa d, (rock compressibility) $c_r = 5.\text{E-}7$ kPa$^{-1}$, (residual oil saturation) $S_{or} = 0$, (irreducible water saturation) $S_{wr} = 0$, (initial water saturation) $S_w^0 = 0$.

Test 1 is the basic benchmark with homogeneous isotropic permeability, uniform porosity and no gravity. Such a simplified setting introduces a baseline for the preconditioner and assesses the effect of the inexact application of blocks $J_{pp}^{-1}$, $S_1^{-1}$, and $S_2^{-1}$ on the solver performance. Test 1 is also benchmarked against global incomplete LU factorizations with variable fill-in. In Test 2, gravity is added, whereas in Test 3, a highly heterogeneous and anisotropic permeability field with abrupt spatial leaps, as retrieved from the upper layers (Tarbert formation) of the 10th comparative SPE data set,52 is introduced. The porosity distribution follows that of the SPE10 model as well. This test is aimed at evaluating the robustness and efficiency of the proposed block preconditioning framework in a real-world setting. Finally, in Test 4, the performance analysis is further stressed by considering a non-Cartesian discretization with full tensor heterogeneous and anisotropic permeability. The wells are vertical and extend through the whole reservoir thickness. The injection well operates at a constant rate, while the production well operates at a fixed bottom hole pressure (BHP). The fine discretization grids (both Cartesian and non-Cartesian) have 12,500 elements and 40,625 faces, for a total of 65,627 system unknowns.

GMRES,53 restarted every 200 iterations, is chosen as the linear solver to address the series of linearized systems (12) arising during the simulation. The exit criterion for the iteration count is based on the reduction of the 2-norm of the Jacobian system residual below a defined tolerance, that is, $\| r_i^{(k)} \|_2 / \| r_0^{(k)} \|_2 < \tau_l$, where $i$ is the iteration number and $\tau_l = 1.\text{E-}6$. For the nonlinear solver, instead, the termination test relies on the absolute residual check. In particular, given the different nature of the unknowns, that is, pressures and saturation, the residual is broken down accordingly and the 2-norm of the single blocks is evaluated separately: $\max \left\{ \| R_p \|_2, \| R_s \|_2, \| R_\pi \|_2 \right\}^{(k)} < \tau_{nl}$, where $\tau_{nl} = 1.\text{E-}6$.
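The two termination tests can be written compactly as below. This is a minimal sketch with hypothetical function names; it assumes the residual vectors are available as plain sequences of floats and that the initial linear residual is nonzero.

```python
import math


def norm2(v):
    """Euclidean 2-norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def linear_converged(r_i, r_0, tol=1e-6):
    """Relative residual test for the linear solver: ||r_i|| / ||r_0|| < tol.

    Assumes r_0 is not the zero vector.
    """
    return norm2(r_i) / norm2(r_0) < tol


def nonlinear_converged(R_p, R_s, R_pi, tol=1e-6):
    """Absolute blockwise test for the Newton loop.

    Every residual block must satisfy the tolerance on its own, so the
    largest of the three 2-norms is compared against tol.
    """
    return max(norm2(R_p), norm2(R_s), norm2(R_pi)) < tol
```

Checking the blocks separately prevents a small residual in one group of equations from masking a large one in another group.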

The preconditioner performance is monitored by means of a number of local and global indicators. The local quantities measure the average performance of the preconditioned solver addressing the first linearized system at each nonlinear step after the maximum time step size is reached, while the global quantities evaluate the overall behavior during a full-transient simulation. Local indicators include: (i) $\overline{it}_l$, the average number of linear iterations, (ii) $\overline{t}_p$ and (iii) $\overline{t}_s$, the average CPU times to build the preconditioner and solve the system, respectively, with (iv) $\overline{t}_t = \overline{t}_p + \overline{t}_s$ the average total solving time, and (v) the mean preconditioner density, $\overline{\mu}$, which is taken as a measure of the memory footprint. The quantity $\overline{\mu}$ is obtained by averaging the preconditioner density at each step
$$\hat{\mu} = \frac{\mathrm{nnz}(\tilde{J}_{pp}) + \mathrm{nnz}(\tilde{S}_1) + \mathrm{nnz}(\tilde{S}_2) + \mathrm{nnz}(\tilde{L}_{sp}) + \mathrm{nnz}(\tilde{L}_{\pi p}) + \mathrm{nnz}(\overline{L}_{\pi s}) + \mathrm{nnz}(\tilde{U}_{ps}) + \mathrm{nnz}(\tilde{U}_{p\pi}) + \mathrm{nnz}(\overline{U}_{s\pi})}{\mathrm{nnz}(\mathcal{J})}$$
as specified above. Here, the function $\mathrm{nnz}(A)$ outputs the number of nonzeros of the sparse matrix $A$. As to the global indicators, (I) $it_l$, (II) $t_p$, (III) $t_s$, and (IV) $t_t$ are the cumulative linear iterations and CPU times, respectively, including (V) $it_{nl}$, the total number of nonlinear iterations, and (VI) $\mu$, the average preconditioner density over all the time steps. The tests are performed on a workstation equipped with an AMD Ryzen 9 3950X 16-Core processor at 3.49 GHz and 64 GB of RAM.
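The density indicator can be illustrated with a few lines of Python. This sketch assumes, for simplicity, that matrices are stored as dense nested lists; with a sparse storage format the number of stored nonzeros would be read directly from the data structure. The function names are hypothetical.

```python
def nnz(A):
    """Number of nonzero entries of a matrix given as a list of rows."""
    return sum(1 for row in A for entry in row if entry != 0)


def preconditioner_density(prec_blocks, jacobian):
    """Ratio between the nonzeros stored by the preconditioner and nnz(J).

    prec_blocks is an iterable with the nine stored blocks: the approximations
    of J_pp, S_1, S_2 and the six lower/upper decoupling factors.
    """
    return sum(nnz(B) for B in prec_blocks) / nnz(jacobian)
```

A density close to 1 means that the preconditioner requires roughly as much storage as the Jacobian itself; averaging the value over the selected steps gives the local or global density indicator.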

4.1 Test 0

The objective of this preliminary test is to analyze the structure of $\tilde{S}_1$ and $\tilde{S}_2$ and its expected evolution during a typical unsteady simulation. In a waterflooding process, in fact, the number of nonzeros of $\mathcal{J}$ evolves as a result of the water propagation in the reservoir. It is indeed the change in the water saturation that causes an increase in the density of $\mathcal{J}$, with a limited role played by pressure. Assuming that the reservoir is initially fully saturated with oil and that water is present at its irreducible saturation, the most significant stencil enlargement is noted in blocks $J_{sp}$, $J_{ss}$, and $J_{s\pi}$ (see also Appendix B.1), which are associated with the elemental water mass conservation equations. Knowing how the Schur complement structure evolves because of these variations is important to drive the selection of the local preconditioner.

We consider the planar reservoir in Figure 2A discretized with 300 cells, with homogeneous material properties and gravity taken into account. The simulation runs for 220 days and, at the end, half of the domain is water saturated. Figure 3A,B shows the pattern of $\tilde{S}_1$ at the beginning and end of the simulation, respectively. From an initially diagonal structure, the stencil of $\tilde{S}_1$ is progressively enriched by a limited number of fill entries, which remain clustered around the main diagonal. Such entries generate a banded, nearly lower triangular structure resembling the one typically expected in purely advective problems. This inspection suggests that a simple diagonal scaling or an incomplete factorization with no fill-in are good candidates for the application of $\tilde{S}_1^{-1}$, the former especially during the first stages of a transient simulation. Moreover, it also suggests that using the diagonal of $\tilde{S}_1$ in the computation of $\tilde{S}_2$ should provide a good approximation of $S_2$.

FIGURE 3. Test 0: stencil of $\tilde{S}_1$ at the beginning (A) and end ($t = 220$ d) (B) of the simulation with the number of nonzero entries

The structure of $\tilde{S}_2$ is more complicated. We can split this matrix into three contributions (see Equation (15)): $B_1 = J_{\pi\pi} - J_{\pi p} J_{pp}^{-1} J_{p\pi}$, $B_2 = J_{\pi s} - J_{\pi p} J_{pp}^{-1} J_{ps}$, and $B_3 = J_{s\pi} - J_{sp} J_{pp}^{-1} J_{p\pi}$, with $\tilde{S}_2 = B_1 - B_2 \tilde{S}_1^{-1} B_3$. The pattern of $B_1$ (Figure 4A) does not change over time and that of $B_2$ grows slightly (Figure 4B,C). By contrast, $B_3$ reflects the waterflooding evolution. It is almost empty at the beginning, with nonzeros only in the rows linked to the injection wells (Figure 4D), then it is progressively filled as the cells are flooded (Figure 4E). The evolution of block $B_3$ modifies the weight of the product $P = B_2 \tilde{S}_1^{-1} B_3$ relative to $B_1$. At the beginning of the simulation, $P \approx 0$ and $\tilde{S}_2 \approx B_1$. Then, the $\tilde{S}_2$ stencil enlarges as $B_3$ is filled and $P$ becomes more influential. Looking at Figure 4F,G, neither a Jacobi nor a zero fill-in factorization is expected to be effective in approximating the application of $\tilde{S}_2^{-1}$. In this case, an incomplete factorization with fill-in, for instance, would be preferable.

Test 0: nonzero pattern of block $B_1$ (A) along with $B_2$ (B,C), $B_3$ (D,E), and $\tilde{S}_2$ (F,G) at the beginning (B,D,F) and end (C,E,G) of the simulation
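The construction of the two nested Schur complements discussed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the blocks are small dense NumPy arrays stored in a dictionary with hypothetical keys, $\tilde{S}_1$ is assumed to be built with the inverse of the diagonal of $J_{pp}$, and the diagonal of $\tilde{S}_1$ replaces $\tilde{S}_1^{-1}$ when forming $\tilde{S}_2$.

```python
import numpy as np

def schur_complements(J, exact=False):
    """Return (S1, S2) for a 3x3 block Jacobian ordered as (p, s, pi).

    J maps hypothetical keys such as ("p", "s") to dense blocks.
    exact=False uses diagonal approximations of J_pp and S1 (an assumption
    of this sketch); exact=True uses the true inverses.
    """
    def inv(a):
        return np.linalg.inv(a) if exact else np.diag(1.0 / np.diag(a))

    Jpp_inv = inv(J["p", "p"])
    S1 = J["s", "s"] - J["s", "p"] @ Jpp_inv @ J["p", "s"]
    B1 = J["pi", "pi"] - J["pi", "p"] @ Jpp_inv @ J["p", "pi"]
    B2 = J["pi", "s"] - J["pi", "p"] @ Jpp_inv @ J["p", "s"]
    B3 = J["s", "pi"] - J["s", "p"] @ Jpp_inv @ J["p", "pi"]
    S2 = B1 - B2 @ inv(S1) @ B3
    return S1, S2

def random_blocks(sizes, seed=0):
    """Synthetic, diagonally dominant blocks used only for illustration."""
    rng = np.random.default_rng(seed)
    J = {}
    for a, na in sizes.items():
        for b, nb in sizes.items():
            block = 0.1 * rng.standard_normal((na, nb))
            if a == b:
                block += 5.0 * np.eye(na)
            J[a, b] = block
    return J

sizes = {"p": 4, "s": 4, "pi": 6}
J = random_blocks(sizes)
S1_approx, S2_approx = schur_complements(J)
S1_exact, S2_exact = schur_complements(J, exact=True)
```

With `exact=True` the second-level matrix coincides with the Schur complement of the full Jacobian onto the $\pi$ unknowns, which is a convenient sanity check for the block formulas.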

Finally, we compare the eigenspectra of $\tilde{S}_1$ and $\tilde{S}_2$ with those of the exact Schur complements $S_1$ and $S_2$ at the end of the simulation (Figure 5). The eigenvalue distributions largely overlap for both Schur complements, indicating that the inexact computations are good approximations with no detrimental effect on the original conditioning. Notice also that the eigenvalues of $\tilde{S}_1$ are real and clustered away from zero.

Test 0: comparison of $S_1$ and $\tilde{S}_1$ (A), along with $S_2$ and $\tilde{S}_2$ (B) eigenvalue distributions
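A comparison of this kind can be reproduced on a toy problem. The sketch below is not taken from the paper: the blocks are synthetic, the diagonal approximation of $J_{pp}^{-1}$ is an assumption, and the one-sided distance between the two spectra is just one possible measure.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# Hypothetical blocks: J_pp is strongly diagonally dominant.
J_pp = np.diag(rng.uniform(2.0, 4.0, n)) + 0.05 * rng.standard_normal((n, n))
J_ps = rng.standard_normal((n, n))
J_sp = rng.standard_normal((n, n))
J_ss = 10.0 * np.eye(n) + rng.standard_normal((n, n))

def first_schur(J_pp_inverse):
    """First-level Schur complement for a given (approximate) inverse of J_pp."""
    return J_ss - J_sp @ J_pp_inverse @ J_ps

S1 = first_schur(np.linalg.inv(J_pp))
S1_tilde = first_schur(np.diag(1.0 / np.diag(J_pp)))

eig_exact = np.linalg.eigvals(S1)
eig_approx = np.linalg.eigvals(S1_tilde)

# For every exact eigenvalue, distance to the closest approximate one;
# the maximum is scaled by the largest exact eigenvalue modulus.
distances = np.abs(eig_exact[:, None] - eig_approx[None, :])
spectral_distance = distances.min(axis=1).max() / np.abs(eig_exact).max()
print(f"relative spectral distance: {spectral_distance:.3e}")
```

A small value of `spectral_distance` plays the same role as the visual overlap of the two eigenvalue clouds in the figure.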

4.2 Test 1

The simulated time interval spans 600 days of production covered in 196 time steps, whose size is dynamically adjusted during the simulation following the approach in Abushaikha et al. [21]. The maximum time step size is four days, with the Courant–Friedrichs–Lewy (CFL) number [54, 55] up to 8.56. The main model results are displayed in Figure 6 in terms of pressure, water saturation and fluid velocity. Moreover, Figure 7 shows the water front advancement from the injector toward the producer captured at some representative times along the vertical section connecting the wells. The shape of the profiles recalls that of the classical Buckley–Leverett benchmark [56], since the simulated physical process is the same and the model setting is similar.

Test 1 model outcome: pressure (A), and water saturation (B) at the end of the simulation ( t = 600 d), oil velocity (C,D) at t = 0 and t = 600 d, and water velocity (E,F) at t = 30 d and t = 600 d
Test 1: water saturation along the cross section connecting the two wells at four times

Figure 8A shows the number of iterations needed to converge for the first Jacobian system of each time step. The performance stabilizes after the maximum time step size is reached, which, in this case, occurs at the 57th step ($t = 46.1$ d). After this point, the average performance appears to be a meaningful indicator. The profiles are obtained with different approximations for $\tilde{J}_{pp}^{-1}$, $\tilde{S}_1^{-1}$, and $\tilde{S}_2^{-1}$, as reported in Tables 2 and 3.

Number of linear iterations to converge of the first linearized system versus time step for Test 1 (A) and 3 (B)
TABLE 2. Test 1: approximations for $\tilde{J}_{pp}^{-1}$, $\tilde{S}_1^{-1}$, and $\tilde{S}_2^{-1}$
| # | $\tilde{J}_{pp}^{-1}$ | $\tilde{S}_1^{-1}$ | $\tilde{S}_2^{-1}$ | $it_l$ (cumulative) | $it_l$ (average) |
|---|---|---|---|---|---|
| 1 | Diag | Ex | Ex | 3306 | 5.99 |
| 2 | Diag | ILU(0) | Ex | 3396 | 6.25 |
| 3 | Diag | Diag | Ex | 7490 | 15.08 |
  • Note: Ex stands for exact application, which is carried out through Matlab's backslash operator. The cumulative number of nonlinear iterations, $it_{nl}$, is 576.
TABLE 3. Test 1: global and local computational performance of the block preconditioner with different approximations for $\tilde{S}_1^{-1}$ and $\tilde{S}_2^{-1}$
| # | $\tilde{S}_1^{-1}$ | $\tilde{S}_2^{-1}$ | Global $it_l$ | Global $t_p$ [s] | Global $t_s$ [s] | Global $t_t$ [s] | Global $\mu$ | Local $it_l$ | Local $t_p$ [s] | Local $t_s$ [s] | Local $t_t$ [s] | Local $\mu$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Diag | ILU(0) | 100599 | 21.09 | 1213.30 | 1234.39 | 2.617 | 180.31 | 0.04 | 2.24 | 2.28 | 2.811 |
| 2 | ILU(0) | ILU(0) | 98380 | 21.07 | 1182.86 | 1203.93 | 2.678 | 175.01 | 0.04 | 2.16 | 2.20 | 2.879 |
| 3 | Diag | ILU($\tau$)^a | 52839 | 1877.24 | 391.81 | 2269.05 | 2.469 | 97.54 | 3.27 | 0.75 | 4.02 | 2.391 |
| 4 | ILU(0) | ILU($\tau$)^a | 48283 | 1879.81 | 348.62 | 2228.43 | 2.533 | 85.44 | 3.26 | 0.62 | 3.89 | 2.466 |
| 5 | Diag | ILU($\tau$)^b | 45408 | 1936.41 | 317.19 | 2253.60 | 2.932 | 83.84 | 3.37 | 0.61 | 4.98 | 2.836 |
| 6 | ILU(0) | ILU($\tau$)^b | 41048 | 1937.03 | 280.24 | 2217.27 | 2.996 | 72.36 | 3.37 | 0.50 | 3.87 | 2.909 |
| 7 | Diag | ILU($\tau$)^c | 29248 | 2046.91 | 210.54 | 2257.45 | 4.948 | 53.99 | 3.56 | 0.40 | 3.96 | 4.762 |
| 8 | ILU(0) | ILU($\tau$)^c | 26220 | 2050.34 | 191.28 | 2241.62 | 5.011 | 46.09 | 3.57 | 0.34 | 3.91 | 4.834 |
  • Note: For all the runs, $it_{nl} = 576$.
  • a $\tau = 0.01$.
  • b $\tau = 0.005$.
  • c $\tau = 0.001$.

As to the block preconditioner, we first analyze the effect of the inexact application of $J_{pp}^{-1}$, $S_1^{-1}$, and $S_2^{-1}$ on the overall performance. This investigation shows where most of the effort is needed to optimize the solver effectiveness, by identifying the blocks that require a higher-quality local approximation. We used diagonal scaling, incomplete factorization, or exact application of the inverse in different combinations. In Table 2, we mainly considered the first two terms, that is, $J_{pp}^{-1}$ and $S_1^{-1}$, whereas, in Table 3, the appropriate setting for the second block is evaluated again and we focused on the approximation of $S_2^{-1}$. The analysis is carried out considering both the global and local (average) performance of the preconditioned linear solver.
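To make the role of the three inexact inverses concrete, the following sketch applies a block-factorization preconditioner built on two nested Schur complements. It is a simplified illustration under several assumptions: dense NumPy blocks with hypothetical dictionary keys, unknowns ordered as (p, s, π), approximate inverses formed explicitly as matrices, and the same approximation of $\tilde{S}_1^{-1}$ used both inside $\tilde{S}_2$ and in the application. The exact form adopted in the paper is the one given in its Section 3, which is not reproduced here.

```python
import numpy as np

def diag_inverse(a):
    """Inverse of the diagonal part of a (Jacobi approximation)."""
    return np.diag(1.0 / np.diag(a))

def make_block_preconditioner(J, inv_pp=diag_inverse,
                              inv_s1=np.linalg.inv, inv_s2=np.linalg.inv):
    """Return a function applying the approximate inverse of the block Jacobian."""
    Jpp_inv = inv_pp(J["p", "p"])
    S1 = J["s", "s"] - J["s", "p"] @ Jpp_inv @ J["p", "s"]
    B1 = J["pi", "pi"] - J["pi", "p"] @ Jpp_inv @ J["p", "pi"]
    B2 = J["pi", "s"] - J["pi", "p"] @ Jpp_inv @ J["p", "s"]
    B3 = J["s", "pi"] - J["s", "p"] @ Jpp_inv @ J["p", "pi"]
    S1_inv = inv_s1(S1)
    S2_inv = inv_s2(B1 - B2 @ S1_inv @ B3)

    def apply(r_p, r_s, r_pi):
        # Forward sweep: eliminate p, then s.
        y_p = Jpp_inv @ r_p
        y_s = S1_inv @ (r_s - J["s", "p"] @ y_p)
        # Second-level Schur complement solve for the pi unknowns.
        x_pi = S2_inv @ (r_pi - J["pi", "p"] @ y_p - B2 @ y_s)
        # Backward sweep: recover s, then p.
        x_s = y_s - S1_inv @ (B3 @ x_pi)
        x_p = y_p - Jpp_inv @ (J["p", "s"] @ x_s + J["p", "pi"] @ x_pi)
        return x_p, x_s, x_pi

    return apply

def demo_blocks(n_p=4, n_s=4, n_pi=6, seed=0):
    """Synthetic, diagonally dominant blocks used only for illustration."""
    rng = np.random.default_rng(seed)
    sizes = {"p": n_p, "s": n_s, "pi": n_pi}
    return {(a, b): 0.1 * rng.standard_normal((na, nb))
                    + (5.0 * np.eye(na) if a == b else 0.0)
            for a, na in sizes.items() for b, nb in sizes.items()}
```

When all three inverses are exact, `apply` returns the exact solution of the block system; swapping in `diag_inverse` or an incomplete factorization for any of them gives the kind of inexact variants compared in Tables 2 and 3.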

Starting from Table 2, the outcome of run 1 shows that the diagonal approximation for $J_{pp}^{-1}$ is appropriate for this block. Replacing the exact application of $\tilde{S}_1^{-1}$ with an ILU(0), as in run 2, leads to a marginal increase in the total number of linear iterations (2.7%). By contrast, a diagonal approximation of $\tilde{S}_1^{-1}$ (run 3) causes a 120.6% increase in the global iteration count with respect to run 2. Therefore, a more detailed comparison is needed that takes into account the computational effort to compute and apply the preconditioner. In any case, the outcome of this analysis suggests that the overall solver performance mainly relies on the quality of the approximation for $\tilde{S}_2^{-1}$, as expected from the preliminary analysis carried out on Test 0. This is investigated in Table 3. Here, two incomplete factorization variants, that is, ILU(0) and threshold-based ILU($\tau$), have been tested for $\tilde{S}_2^{-1}$. The tests have been carried out for both the diagonal scaling and the ILU(0) factorization of $\tilde{S}_1^{-1}$. The results show that, although ILU($\tau$) can significantly reduce the cumulative linear iteration count, $it_l$, and solving time, $t_s$, this strategy typically does not pay off because of the higher set-up cost, $t_p$. We also observe that the difference between diagonal scaling and ILU(0) factorization for $\tilde{S}_1^{-1}$ is not as significant in terms of total time. Nevertheless, the latter appears to be more effective, hence it is chosen as the default option in the next tests. Furthermore, the effect on the preconditioner density, in both its global and local values of $\mu$, is almost negligible.
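The percentages quoted for Table 2 follow directly from the cumulative iteration counts. The short check below uses only the values reported in the table; the helper name is hypothetical.

```python
# Cumulative linear iteration counts from Table 2, indexed by run number.
it_l = {1: 3306, 2: 3396, 3: 7490}

def percent_increase(new, reference):
    """Relative increase of new with respect to reference, in percent."""
    return 100.0 * (new - reference) / reference

ilu0_vs_exact = percent_increase(it_l[2], it_l[1])   # run 2 against run 1
diag_vs_ilu0 = percent_increase(it_l[3], it_l[2])    # run 3 against run 2
diag_vs_exact = percent_increase(it_l[3], it_l[1])   # run 3 against run 1

print(f"{ilu0_vs_exact:.1f}% {diag_vs_ilu0:.1f}% {diag_vs_exact:.1f}%")
```

The 2.7% figure is measured against run 1, while the 120.6% figure corresponds to run 3 measured against run 2; against run 1 the increase of run 3 is slightly larger.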

The analysis in Section 4.1 highlighted that $\tilde{S}_2$ tends to become denser as the simulation proceeds. This result is confirmed by the outcome of Table 3, which shows that the preconditioner density is likely to increase slowly but steadily during the simulation when ILU(0) is used for $\tilde{S}_2$, whereas the opposite trend is noted with ILU($\tau$). Specifically, compare the two density columns (global and local $\mu$) for runs 1 and 2 against runs 3–8. In any case, filtration techniques can be applied to sparsify $\tilde{S}_2$ and stabilize the memory occupation, if necessary. Of course, the application of such techniques can affect the set-up cost.
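As an illustration of what such a filtration step might look like, the sketch below drops the entries that are small relative to the largest entry of their row and applies the filter only when a nonzero budget is exceeded. The dropping criterion, the budget trigger, and all names are assumptions made for this example; the text does not specify which filtration technique would be used.

```python
import numpy as np

def filter_small_entries(a, rel_tol):
    """Zero the entries of a that are small relative to the largest entry
    (in absolute value) of their row; diagonal entries are always kept."""
    a = np.asarray(a, dtype=float)
    row_scale = np.abs(a).max(axis=1, keepdims=True)
    keep = np.abs(a) >= rel_tol * row_scale
    np.fill_diagonal(keep, True)
    return np.where(keep, a, 0.0)

def maybe_filter(a, rel_tol, max_nonzeros):
    """Apply the filter only once the nonzero count exceeds a budget."""
    if np.count_nonzero(a) <= max_nonzeros:
        return a
    return filter_small_entries(a, rel_tol)

# Toy stand-in for the second-level Schur complement approximation.
S2 = np.array([[4.0, 0.001, 1.0],
               [0.5, 2.0, 0.0001],
               [0.0, 0.0, 3.0]])
S2_filtered = maybe_filter(S2, rel_tol=0.01, max_nonzeros=6)
```

Triggering the filter on a budget keeps the extra set-up cost out of the time steps in which the matrix is still sparse enough.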

The performance of the block preconditioner is also benchmarked against that of global incomplete LU factorizations with no or variable fill-in. The ILU decomposition is built upon the Jacobian matrix, preliminarily reordered by means of the reverse Cuthill–McKee algorithm [26]. The outcome is provided in Table 4. While with ILU(0) GMRES does not converge within 1000 iterations in most of the linearized systems, threshold-based ILU delivers better results but at a significantly higher computational time with respect to our block preconditioner, where the greatest share is due to the set-up cost (compare, in this regard, the global and local $t_p$ and $t_s$ columns in Tables 3 and 4). Notice also that reducing the threshold value, $\tau$, below 0.01 results in a degradation of the preconditioned solver performance, most likely caused by the introduction of detrimental near-zero entries in the triangular factors.

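TABLE 4. Test 1: benchmarking the performance of the block preconditioner against global incomplete LU factorization of the Jacobian
| Type | Global $it_l$ | Global $t_p$ [s] | Global $t_s$ [s] | Global $t_t$ [s] | Global $\mu$ | Local $it_l$ | Local $t_p$ [s] | Local $t_s$ [s] | Local $t_t$ [s] | Local $\mu$ |
|---|---|---|---|---|---|---|---|---|---|---|
| ILU(0) | NC | - | - | - | 1.173 | - | - | - | - | 1.165 |
| ILU($\tau$)^a | 107022 | 4716.76 | 1400.90 | 6117.65 | 1.608 | 199.4 | 8.15 | 2.73 | 10.88 | 1.568 |
| ILU($\tau$)^b | 68425 | 4675.91 | 682.88 | 5358.79 | 2.494 | 134.65 | 8.09 | 1.45 | 9.54 | 2.512 |
| ILU($\tau$)^c | 78911 | 4796.63 | 1028.53 | 5825.16 | 3.872 | 158.19 | 8.33 | 2.27 | 10.60 | 4.330 |
  • a $\tau = 0.025$.
  • b $\tau = 0.01$.
  • c $\tau = 0.0075$.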

4.3 Test 2

In this test, gravity is taken into account to evaluate its impact on the solver performance, whereas the production scenario and the other model settings are the same as in Test 1. The simulated time interval spans 250 days, with a maximum time step size of 0.5 days. This value is lower than in the previous test and it was set to keep the number of nonlinear iterations below 6–7. The simulation takes 526 time steps to conclude, with the maximum size achieved at the 37th step ($t = 5.79$ d) and the CFL number up to 2.26. Figure 9A shows the water saturation field at the end of the simulation on the whole domain, while Figures 9B–D report the solution along a vertical section through the wells. The injected water propagates through the reservoir toward the producer, and simultaneously tends to accumulate at the bottom due to the action of gravity.

Test 2: panel (A) depicts the water saturation on the entire domain at the end of the simulation ($t = 250$ d). The dash-dotted line shows the trace of the vertical section connecting the wells. Along this section, the water saturation is shown at $t = 0.1$ d (B), $t = 120$ d (C), and $t = 250$ d (D)

Table 5 reports the main results of the preconditioner performance analysis, which can be compared with those in Table 3. Based on the cumulative number of nonlinear iterations and the shorter simulated time interval than in Test 1, we can say that the problem is more challenging for the outer nonlinear solver, but the effect of gravity on the performance of the preconditioned linear solver appears to be negligible. The local number of linear iterations, $it_l$, in fact, along with the density, $\mu$, are similar to those in Test 1. This result is consistent with the observations in Section 4.1. The extra nonzero block appearing with gravity, $J_{\pi s}$, is required only in the computation of the product $P$ for $\tilde{S}_2$ (specifically in $B_2$). We also observed that the importance of the $P$ term is not constant during a simulation, but evolves depending on the water distribution within the reservoir. Even in this application, the threshold-based incomplete factorization, though succeeding in deflating the iteration count significantly (runs 2–4), does not outperform ILU(0) in terms of CPU time. A similar behavior of the preconditioner density, which grows during the simulation when an ILU(0) approximation is used, can be noted as well.

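TABLE 5. Test 2: global and local computational performance of the block preconditioner with different approximations for $\tilde{S}_2^{-1}$
| # | $\tilde{S}_2^{-1}$ | Global $it_l$ | Global $t_p$ [s] | Global $t_s$ [s] | Global $t_t$ [s] | Global $\mu$ | Local $it_l$ | Local $t_p$ [s] | Local $t_s$ [s] | Local $t_t$ [s] | Local $\mu$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ILU(0) | 528006 | 118.12 | 6215.87 | 6333.99 | 2.531 | 153.24 | 0.04 | 1.74 | 1.78 | 2.569 |
| 2 | ILU($\tau$)^a | 257633 | 10723.27 | 1813.43 | 12536.71 | 2.395 | 74.89 | 3.27 | 0.51 | 3.78 | 2.385 |
| 3 | ILU($\tau$)^b | 214446 | 11041.27 | 1428.21 | 12469.48 | 2.808 | 62.23 | 3.37 | 0.40 | 3.77 | 2.794 |
| 4 | ILU($\tau$)^c | 132703 | 11626.79 | 939.03 | 12565.82 | 4.618 | 38.74 | 3.54 | 0.27 | 3.81 | 4.590 |
  • Note: For all the runs, $it_{nl} = 3280$.
  • a $\tau = 0.01$.
  • b $\tau = 0.005$.
  • c $\tau = 0.001$.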

4.4 Test 3

The homogeneous isotropic permeability field of Tests 1 and 2 is here replaced by a highly heterogeneous and anisotropic distribution retrieved from the SPE10 data set (see Figure 10A,B). Notice the wide range of values, spanning approximately ten orders of magnitude, and the abrupt discontinuities, especially in the vertical permeability (Figure 10B), which make this test numerically challenging. The porosity distribution, displayed in Figure 10C, follows that of the SPE10 model as well. Given the similar performance of the linear solver with and without gravity and the greater computational burden for the outer nonlinear solver observed in Test 2, in this application we decided to neglect the gravitational effects. The simulation runs for 250 days with a maximum time step size of 1.5 days, achieved at the 42nd step ($t = 16.95$ d), and a total number of 198 steps. The CFL value is up to 30.27. At the end of the simulation, large portions of the oil in the reservoir have been displaced by water and breakthrough occurs at the producer. Figure 11 provides some model insights in terms of water saturation at $t = 80$ d and $t = 250$ d.

Test 3: horizontal, $k_{x,y}$, (A) and vertical, $k_z$, (B) permeability and porosity (C) distributions taken from the SPE10 data set
Test 3: water saturation at t = 80 d (A), and t = 250 d (B)

Table 6 summarizes the results obtained for different approximations of $\tilde{S}_2^{-1}$. For two of them, the number of iterations to solve the first linearized system of each time step is displayed in Figure 8B. Although the permeability distribution is highly heterogeneous and anisotropic, the number of iterations approximately stabilizes after the maximum time step size has been reached, as in Test 1 (see also Figure 8A). By comparing the global and local iteration counts, $it_l$, of run 1 with their counterparts in Table 3, we can appreciate the effect of the ill-conditioning caused by the extremely wide range of permeability and, to a lesser extent, porosity values. We carried out a sensitivity analysis on the threshold value for ILU($\tau$) (runs 2–4), which now appears to pay off with respect to the zero-fill incomplete factorization (run 1). Despite the higher set-up cost, the cumulative total solving time, $t_t$, reduces as the threshold $\tau$ is decreased, while the density growth is firmly under control. Moreover, the cumulative number of linear iterations is reduced by 2.6 to 6.5 times. The sensitivity analysis shows that a drop tolerance in the range between 0.001 and 0.005 appears to be a good trade-off between fast convergence and the need to restrain the memory occupation.
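The reduction factors and the set-up/solve trade-off quoted above can be recomputed from the global columns of Table 6. The snippet below uses only the tabulated values; the dictionary layout and key names are hypothetical.

```python
# Global columns of Table 6: iterations, set-up time, solve time, total time [s].
runs = {
    "ILU(0)":         {"it_l": 227024, "t_p": 25.10,   "t_s": 3010.35, "t_t": 3035.45},
    "ILU(tau=0.01)":  {"it_l": 87747,  "t_p": 2097.04, "t_s": 925.42,  "t_t": 3022.46},
    "ILU(tau=0.005)": {"it_l": 68016,  "t_p": 2089.94, "t_s": 612.96,  "t_t": 2702.90},
    "ILU(tau=0.001)": {"it_l": 35107,  "t_p": 2179.04, "t_s": 240.48,  "t_t": 2419.53},
}

baseline = runs["ILU(0)"]
summary = {}
for name, run in runs.items():
    summary[name] = {
        # How many times fewer linear iterations than the ILU(0) run.
        "iteration_reduction": baseline["it_l"] / run["it_l"],
        # Total time rebuilt as set-up plus solve time.
        "total_time": run["t_p"] + run["t_s"],
        # Share of the total time spent in the preconditioner set-up.
        "setup_share": run["t_p"] / run["t_t"],
    }

for name, row in summary.items():
    print(f"{name:15s} x{row['iteration_reduction']:.1f} "
          f"{row['total_time']:8.2f} s  set-up {100 * row['setup_share']:.0f}%")
```

The set-up share makes the trade-off explicit: with ILU($\tau$) most of the total time moves from the Krylov iterations to the computation of the factors.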

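TABLE 6. Test 3: global and local computational performance of the block preconditioner with different approximations for $\tilde{S}_2^{-1}$
| # | $\tilde{S}_2^{-1}$ | Global $it_l$ | Global $t_p$ [s] | Global $t_s$ [s] | Global $t_t$ [s] | Global $\mu$ | Local $it_l$ | Local $t_p$ [s] | Local $t_s$ [s] | Local $t_t$ [s] | Local $\mu$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ILU(0) | 227024 | 25.10 | 3010.35 | 3035.45 | 2.789 | 395.61 | 0.05 | 5.35 | 5.40 | 2.952 |
| 2 | ILU($\tau$)^a | 87747 | 2097.04 | 925.42 | 3022.46 | 1.994 | 148.68 | 3.40 | 1.61 | 5.01 | 1.962 |
| 3 | ILU($\tau$)^b | 68016 | 2089.94 | 612.96 | 2702.90 | 2.224 | 115.11 | 3.39 | 1.07 | 4.56 | 2.186 |
| 4 | ILU($\tau$)^c | 35107 | 2179.04 | 240.48 | 2419.53 | 3.156 | 58.78 | 3.53 | 0.41 | 3.94 | 3.095 |
  • Note: For all the runs, $it_{nl} = 617$.
  • a $\tau = 0.01$.
  • b $\tau = 0.005$.
  • c $\tau = 0.001$.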

4.5 Test 4

Modeling the fluid flow with high accuracy in non-Cartesian grids using full tensor permeability properties is an important requirement for modern simulators. This is also a challenge for most discretization schemes, which is effectively addressed by the MHFE method [21]. In the last test case, a dome reservoir is taken into consideration with this challenging setting. Porosity and permeability are again taken from the SPE10 data set, with the horizontal/vertical anisotropy ratio up to $10^5$. The full permeability tensor is obtained from the original diagonal tensor by rotating the local element axes so as to follow the curvature of the dome formation, similarly to Nardean et al. [50]. The main effect of the non-Cartesian grid and full tensor permeability on the model can be seen in the stencil of block $J_{\pi\pi}$, which is enlarged with respect to the previous setting. This is because the local matrices $(B^E)^{-1}$, which are block diagonal with a diagonal permeability tensor and a Cartesian grid, become full (see Equation (8)). Therefore, while in the former setting there are up to three nonzero entries in each row of $J_{\pi\pi}$, this limit increases to eleven in the latter. By contrast, the stencil of the other blocks in the Jacobian is not affected by the change in the grid or permeability tensor type. Figure 12 shows some model insight in terms of water saturation distribution, while the computational performance of the linear solver is summarized in Table 7. As in Test 3, the threshold-based incomplete factorization performs better than the no-fill variant in terms of CPU time. The sensitivity analysis on the threshold $\tau$ shows that a value between 0.0005 and 0.001 can be a good compromise in terms of memory footprint and computational efficiency, which is a little lower than in Test 3. Notice also that the density of the preconditioner in this test is lower than in Test 3 when the same value for $\tau$ is used in the incomplete factorization.

Test 4: water saturation at t = 80 d (A), and t = 250 d (B)
TABLE 7. Test 4: global and local computational performance of the block preconditioner with different approximations for $\tilde{S}_2^{-1}$
| # | $\tilde{S}_2^{-1}$ | Global $it_l$ | Global $t_p$ [s] | Global $t_s$ [s] | Global $t_t$ [s] | Global $\mu$ | Local $it_l$ | Local $t_p$ [s] | Local $t_s$ [s] | Local $t_t$ [s] | Local $\mu$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ILU(0) | 278513 | 28.61 | 4395.75 | 4424.36 | 1.647 | 489.99 | 0.05 | 7.88 | 7.93 | 1.777 |
| 2 | ILU($\tau$)^a | 234091 | 2162.13 | 3672.03 | 5834.16 | 1.411 | 433.84 | 3.53 | 6.92 | 10.45 | 1.405 |
| 3 | ILU($\tau$)^b | 70266 | 2220.04 | 854.30 | 3074.34 | 2.235 | 119.86 | 3.63 | 1.50 | 5.13 | 2.217 |
| 4 | ILU($\tau$)^c | 54840 | 2271.59 | 640.49 | 2912.08 | 2.802 | 93.55 | 3.71 | 1.12 | 4.83 | 2.778 |
| 5 | ILU($\tau$)^d | 28902 | 2390.80 | 370.26 | 2761.06 | 4.807 | 49.48 | 3.91 | 0.64 | 4.55 | 4.762 |
  • Note: For all the runs, $it_{nl} = 613$.
  • a $\tau = 0.005$.
  • b $\tau = 0.001$.
  • c $\tau = 0.0005$.
  • d $\tau = 0.0001$.

5 DISCUSSION AND CONCLUSIONS

The accurate and efficient numerical modeling of two-phase flow in porous media is a key tool for petroleum engineering applications. In our formulation of the mathematical problem, the demand for higher accuracy of the solution is met by means of a MHFE discretization of Darcy's law, whereas the coupled nonlinear system is addressed in a fully implicit unconditionally stable way by a Krylov solver. Such an approach requires the solution to a sequence of nonsymmetric large-size and often ill-conditioned linearized systems of equations with the Jacobian matrix. This is the most time and resource consuming task in a simulation. Supplying a proper preconditioner to the Krylov subspace solver is decisive to damp such cost and make the fully implicit approach attractive with respect to sequential techniques.

In this paper, we propose a full block preconditioner for the two-phase flow model with an MHFE-FV discretization. Previous research focused on TPFA-based discretizations, for which the CPR method usually provides good results, but very few works are devoted to the MHFE discretization and, to the best of the authors' knowledge, this is the first application of a full block preconditioner to this kind of model. The design of the preconditioner takes advantage of the near-diagonal structure of $J_{pp}$ to obtain a high-quality approximation of the first-level Schur complement. Given the nonsymmetric structure of $J_{pp}$, $S_1$ and $S_2$, we employed diagonal scaling and incomplete factorizations for the inexact application of their inverses. Specifically:
  • a diagonal preconditioner for $J_{pp}^{-1}$ and an ILU(0) factorization for $\tilde{S}_1^{-1}$ proved to be cheap and appropriate choices. Finding an approximation to apply $\tilde{S}_2^{-1}$, conversely, requires a little more care, since it is the factor that mostly governs the effectiveness of the block preconditioner;
  • depending on the problem features, different strategies can be followed for $\tilde{S}_2^{-1}$. In general, for simple settings with homogeneous isotropic permeability, ILU(0) is already a good approximation, whereas in the case of heterogeneous materials it is necessary to move to higher-quality variants, such as the threshold-based ILU($\tau$);
  • accounting for gravity in the simulation does not seem to affect the linear solver performance much, while it can increase significantly the computational burden of the nonlinear solver;
  • when fill-in is allowed in the incomplete factorization of the second-level Schur complement, the preconditioner density is under control during the simulation and it remains approximately constant. Conversely, a slight, but steady, growth is observed with the ILU(0) version, consistently with the progressive increase of number of water flooded cells;
  • in all cases, the proposed solver is stable and robust, being able to provide the solution under any condition achieved during the full-transient simulation and with strongly heterogeneous and anisotropic material properties, with no increase of the average computational cost required by the linear solver.

Ongoing research focuses on applying efficient sparsification techniques to $\tilde{S}_2$ so as to gain full control of the memory occupation. Since the set-up cost increases as a result of this additional operation, filtration can be performed only once a certain memory limit has been exceeded. Additional tests are underway to extend the preconditioning framework to a fully parallel environment, by carrying out appropriate scalability tests and experimenting with multigrid and/or multiscale approaches as local approximations of $\tilde{S}_1^{-1}$ and $\tilde{S}_2^{-1}$.

ACKNOWLEDGMENT

This publication was supported by the National Priorities Research Program grant NPRP10-0208-170407 from Qatar National Research Fund.

Author Contributions

Stefano Nardean: Conceptualization, software, writing–original draft. Massimiliano Ferronato: Methodology, supervision, writing–review & editing. Ahmad S. Abushaikha: Funding acquisition, supervision, writing–review & editing.

Conflict of Interest

The authors declare no potential conflict of interests.

    APPENDIX A.

    A.1 Peaceman well model

    Hydrocarbon reservoirs are exploited through injection and production wells operating under complex regimes. Therefore, in order to numerically reproduce the hydrocarbon flow over time with good accuracy and match the real production data, it is necessary to introduce a specific model for the wells in the system of Equations (1b), (5a), and (5b). A classical choice is to resort to the Peaceman model. In his seminal work [47], Peaceman devised a numerical formulation of the Thiem equation [57], which describes the water flow at steady state around pumping wells in confined aquifers, and incorporated it in the general flow model in porous media. Specifically, a well segment, or the entire well in case of a point source, is hosted in a grid cell. The original formulation had a somewhat limited application, since it relied on the assumption that the rock permeability is isotropic, the fluid is single-phase, the grid size is regular along the planar axes, and gravity has no effects. It was later extended to account for the medium anisotropy, gravity and multiphase flow in irregular grids. In this version, the Peaceman model reads [58]:
    $$f_\alpha^E = \lambda_{\alpha,E}\, WI \left[ p_b - p_E - \gamma \left( z_w - z_E \right) \right], \tag{A1}$$
    where $f_\alpha^E$ is the flux of phase $\alpha$ exchanged with the hosting cell $E$, that is, the sink/source term in Equations (5a) and (5b), $\lambda_{\alpha,E}$ is the fluid mobility of the upstreaming entity (the grid cell or the well itself for production and injection wells, respectively), $p_b$ is the pressure in the well, $p_E$ is the cell pressure, $\gamma$ is the specific weight of the fluid mixture in the well, and $z_w$ is the well datum. In Equation (A1), $WI$ is the so-called well index:
    $$WI = \frac{2 \pi \Delta z \sqrt{k_{xx} k_{yy}}}{\ln \left( r_e / r_w \right) + s}, \tag{A2}$$
    where $k_{xx}$ and $k_{yy}$ denote the permeability values along the axes $x$ and $y$, $r_w$ the well radius, and $s$ the skin factor. In Equation (A2), $r_e$ is the so-called equivalent radius, which is expressed as [58]:
    $$r_e = \frac{0.14 \left[ \left( \dfrac{k_{yy}}{k_{xx}} \right)^{1/2} \Delta x^2 + \left( \dfrac{k_{xx}}{k_{yy}} \right)^{1/2} \Delta y^2 \right]^{1/2}}{0.5 \left[ \left( \dfrac{k_{yy}}{k_{xx}} \right)^{1/4} + \left( \dfrac{k_{xx}}{k_{yy}} \right)^{1/4} \right]}. \tag{A3}$$
    If the well intercepts more than one cell, the total flux of phase $\alpha$ is given by:
    $$f_\alpha = \lambda_{\alpha,1}\, WI_1 \left( p_b - p_1 \right) + \sum_{m=2}^{n_p} \lambda_{\alpha,m}\, WI_m \left[ p_b - p_m + \sum_{p=2}^{m} \gamma_p \left( z_p - z_{p-1} \right) \right], \tag{A4}$$
    where the elements intercepted by the well form an ordered set, with $m$ denoting the $m$th element from the well top, $n_p$ is the number of perforations, that is, the cardinality of that set, and $p_b$ is reinterpreted as the top well pressure. Typically, wells are operated under a constant bottomhole pressure (BHP), $\bar{p}_b$, or rate control for a certain phase $\alpha$, $\bar{f}_\alpha$. Therefore, the well equations to be added to the discretized system read:
    $$p_b - \bar{p}_b = 0 \quad \text{(BHP control)}, \qquad f_\alpha - \bar{f}_\alpha = 0 \quad \text{(rate control)}. \tag{A5}$$
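For a single perforated cell, the well index and the exchanged flux of Equations (A1)–(A3) can be evaluated as in the following sketch. It is a plain transcription of the formulas under the sign convention assumed here for Equation (A1); function and argument names are hypothetical and units are left to the caller.

```python
import math

def equivalent_radius(dx, dy, kxx, kyy):
    """Equivalent radius r_e for an anisotropic medium on a dx-by-dy cell."""
    ratio = kyy / kxx
    numerator = 0.14 * math.sqrt(math.sqrt(ratio) * dx**2
                                 + math.sqrt(1.0 / ratio) * dy**2)
    denominator = 0.5 * (ratio**0.25 + (1.0 / ratio)**0.25)
    return numerator / denominator

def well_index(dx, dy, dz, kxx, kyy, r_w, skin=0.0):
    """Well index WI for a vertical well segment of height dz."""
    r_e = equivalent_radius(dx, dy, kxx, kyy)
    return (2.0 * math.pi * dz * math.sqrt(kxx * kyy)
            / (math.log(r_e / r_w) + skin))

def well_flux(mobility, wi, p_b, p_cell, gamma, z_w, z_cell):
    """Phase flux exchanged between the well and its hosting cell."""
    return mobility * wi * (p_b - p_cell - gamma * (z_w - z_cell))

# Isotropic permeability on a square cell: r_e reduces to 0.14 * sqrt(2) * dx.
r_e = equivalent_radius(dx=10.0, dy=10.0, kxx=1.0e-13, kyy=1.0e-13)
wi = well_index(dx=10.0, dy=10.0, dz=2.0, kxx=1.0e-13, kyy=1.0e-13, r_w=0.1)
```

In the isotropic case on a square grid the equivalent radius is about 0.2 times the cell size, which is the classical Peaceman result.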

    APPENDIX B.

    B.1 Jacobian blocks' structure

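    The Jacobian blocks are computed analytically by explicitly differentiating Equations (2), (3), (9), (A5), (10), and (11) and assembled from the element/well contributions. Let us define the following terms:
    $$\Lambda_{\alpha,i}^{E} = L_{B_i}^{E} \left( p_E - \gamma_\alpha z_E \right) - \sum_{j=1}^{N_f^E} (B^E)^{-1}_{ij} \left( \pi_{p,j}^{E} - \gamma_\alpha \pi_{z,j}^{E} \right), \qquad i = 1, \dots, N_f^E, \tag{B1}$$
    $$\Upsilon_\alpha^{m} = \begin{cases} WI_1 \left( p_b - p_1 \right) & m = 1, \\ WI_m \left[ p_b - p_m + \sum_{p=2}^{m} \gamma_p \left( z_p - z_{p-1} \right) \right] & m > 1, \end{cases} \tag{B2}$$
    where $L_{B_i}^{E} = \sum_{j=1}^{N_f^E} (B^E)^{-1}_{ij}$. We denote by $u_\alpha^i$ the upstreaming element of face $i$ for phase $\alpha$ and by $\lambda_\alpha(S_w^{u_\alpha^i})$ the relevant value of the mobility. The set of wells is given by $\mathcal{W}$, the elements sharing an internal face of the domain by $\{E, E'\}$, and $\mathcal{E}_h^r$ is the collection of elements intercepted by well $r$. Moreover, let $g_Q(i)$ be the function that converts the local index of face $i$ in element $Q$ into its global index, whereas $l(i)$ performs the opposite action. In the following equations, the subscript $n+1$ and the superscript $(k)$ are omitted for brevity and $[A]_{ij}$ denotes the $(i,j)$ entry of block $A$ linked to element/face/well $i$ and $j$. The block expressions are as follows:
    $$J_{pp}: \quad \begin{aligned} {[J_{pp}]}_{QQ} &= \frac{\Omega_Q}{\Delta t} \frac{d\phi}{dp}(p_Q) + \sum_{i=1}^{N_f^E} \left[ \lambda_o(S_w^{u_o^i}) + \lambda_w(S_w^{u_w^i}) \right] L_{B_i}^{Q} && Q \in \mathcal{E}_h, \\ {[J_{pp}]}_{Qr} &= -\left[ \lambda_o(S_w^{u^Q}) + \lambda_w(S_w^{u^Q}) \right] WI_Q && (r, Q) \in \mathcal{W} \times \mathcal{E}_h^r, \\ {[J_{pp}]}_{rQ} &= \begin{cases} 0 & \text{(BHP control)}, \\ -\lambda_\alpha(S_w^{u^Q})\, WI_Q, \quad Q \in \mathcal{E}_h^r & \text{(rate control for phase } \alpha), \end{cases} && r \in \mathcal{W}, \\ {[J_{pp}]}_{rr} &= \begin{cases} 1 & \text{(BHP control)}, \\ \sum_{Q \in \mathcal{E}_h^r} \lambda_\alpha(S_w^{u^Q})\, WI_Q & \text{(rate control for phase } \alpha), \end{cases} && r \in \mathcal{W}, \end{aligned} \tag{B3}$$
    $$J_{ps}: \quad \begin{aligned} {[J_{ps}]}_{Q u_\alpha^i} &= \frac{d\lambda_\alpha}{dS_w}(S_w^{u_\alpha^i})\, \Lambda_{\alpha,i}^{Q} && (Q, i, \alpha) \in \mathcal{E}_h \times \{1, \dots, N_f^E\} \times \{o, w\}, \\ {[J_{ps}]}_{QQ} &= -\frac{d\lambda_\alpha}{dS_w}(S_w^{Q})\, \Upsilon_\alpha^{Q} && (r, Q, \alpha) \in \mathcal{W} \times \mathcal{E}_h^r \times \{o, w\}, \end{aligned} \tag{B4}$$
    $$J_{p\pi}: \quad {[J_{p\pi}]}_{Q g_Q(j)} = -\sum_{i=1}^{N_f^E} \left[ \lambda_o(S_w^{u_o^i}) + \lambda_w(S_w^{u_w^i}) \right] (B^Q)^{-1}_{ij} \qquad (Q, j) \in \mathcal{E}_h \times \{1, \dots, N_f^E\}, \tag{B5}$$
    $$J_{sp}: \quad \begin{aligned} {[J_{sp}]}_{QQ} &= \frac{\Omega_Q}{\Delta t} S_w^{Q} \frac{d\phi}{dp}(p_Q) + \sum_{i=1}^{N_f^E} \lambda_w(S_w^{u_w^i})\, L_{B_i}^{Q} && Q \in \mathcal{E}_h, \\ {[J_{sp}]}_{Qr} &= -\lambda_w(S_w^{u^Q})\, WI_Q && (r, Q) \in \mathcal{W} \times \mathcal{E}_h^r, \end{aligned} \tag{B6}$$
    $$J_{ss}: \quad \begin{aligned} {[J_{ss}]}_{QQ} &= \frac{\Omega_Q}{\Delta t} \phi(p_Q) && Q \in \mathcal{E}_h, \\ {[J_{ss}]}_{Q u_w^i} &= \frac{d\lambda_w}{dS_w}(S_w^{u_w^i})\, \Lambda_{w,i}^{Q} && (Q, i) \in \mathcal{E}_h \times \{1, \dots, N_f^E\}, \\ {[J_{ss}]}_{QQ} &= -\frac{d\lambda_w}{dS_w}(S_w^{Q})\, \Upsilon_w^{Q} && (r, Q) \in \mathcal{W} \times \mathcal{E}_h^r, \end{aligned} \tag{B7}$$
    $$J_{s\pi}: \quad {[J_{s\pi}]}_{Q g_Q(j)} = -\sum_{i=1}^{N_f^E} \lambda_w(S_w^{u_w^j})\, (B^Q)^{-1}_{ij} \qquad (Q, j) \in \mathcal{E}_h \times \{1, \dots, N_f^E\}, \tag{B8}$$
    $$J_{\pi p}: \quad {[J_{\pi p}]}_{iQ} = \left[ \lambda_o(S_w^{u_o^i}) + \lambda_w(S_w^{u_w^i}) \right] L_{B_{l(i)}}^{Q} \qquad (i, Q) \in \mathcal{F}_h \times \{E, E'\}, \tag{B9}$$
    $$J_{\pi s}: \quad {[J_{\pi s}]}_{i u_\alpha^i} = \frac{d\lambda_\alpha}{dS_w}(S_w^{u_\alpha^i}) \left( \Lambda_{\alpha,l(i)}^{E} + \Lambda_{\alpha,l(i)}^{E'} \right) \qquad (i, \alpha) \in \mathcal{F}_h \times \{o, w\}, \tag{B10}$$
    $$J_{\pi\pi}: \quad {[J_{\pi\pi}]}_{i g_Q(j)} = -\left[ \lambda_o(S_w^{u_o^i}) + \lambda_w(S_w^{u_w^i}) \right] (B^Q)^{-1}_{l(i) j} \qquad (i, j, Q) \in \mathcal{F}_h \times \{1, \dots, N_f^E\} \times \{E, E'\}. \tag{B11}$$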

    Biographies

    • Stefano Nardean is a PhD candidate in Sustainable Energy at Hamad Bin Khalifa University. The focus of his research is the development of efficient linear solvers for reservoir simulation applications. Other scientific interests, within the subject of applied numerical analysis, concern the modeling of deformation and flow processes in porous media, with specific applications in the field of subsurface hydrology and petroleum engineering.

    • Massimiliano Ferronato is Professor of Numerical Analysis at the University of Padova, Italy. He has authored and co-authored about 170 papers in international journals and conference proceedings. The main scientific interests concern the design of numerical models for the solution of PDEs governing coupled processes in porous media along with the analysis, development and implementation of block preconditioners for saddle-point problems arising in geoscience applications.

    • Ahmad Abushaikha is Associate Professor at the Division of Sustainable Development at Hamad Bin Khalifa University. He has authored and co-authored about 45 research publications in international journals and conference proceedings. His research interests are in reservoir simulation, fluid flow in porous media, discretization schemes, high performance computing, petroleum engineering and enhanced recovery mechanisms.
