Volume 2012, Issue 1, Article ID 835319
Research Article
Open Access

Delayed Stochastic Linear-Quadratic Control Problem and Related Applications

Li Chen
Department of Mathematics, China University of Mining and Technology, Beijing 100083, China cumt.edu.cn

Zhen Wu
School of Mathematics, Shandong University, Jinan 250100, China sdu.edu.cn

Zhiyong Yu (Corresponding Author)
School of Economics, Shandong University, Jinan 250100, China sdu.edu.cn
First published: 26 September 2012
Academic Editor: Ying Hu

Abstract

We discuss an optimal control problem under a quadratic criterion for a stochastic linear system with delay in both the state and control variables. This problem leads to a kind of generalized forward-backward stochastic differential equation (FBSDE), with Itô stochastic delay equations as the forward equations and anticipated backward stochastic differential equations as the backward equations. In particular, we present the optimal feedback regulator for the time-delay system via a new type of Riccati equation, and we also apply the result to a population optimal control problem.

1. Introduction

The problem of optimal control for delayed stochastic systems has received a lot of attention recently. One reason is that many phenomena have the nature of past dependence; that is, their behavior at time t depends not only on the current situation but also on their past history. Such mathematical models, described by stochastic delay differential equations (SDDEs), are ubiquitous and have a wide range of applications in physics, biology, engineering, economics, and finance (see Arriojas et al. [1], Mohammed [2, 3], and the references therein).

In control problems, a delay term may arise when there is a time lag between observation and regulation, or from an aftereffect of the control; that is, there may be delays in the state or control variables. Although many papers have discussed delayed control problems, the analysis of systems with delay is fraught with difficulties, owing not only to the infinite-dimensional nature of the problem but also to the lack of an Itô formula for the delayed part of the trajectory. To surmount these difficulties, one can consider specific classes of systems with aftereffect, as in Øksendal and Sulem [4].

This paper is concerned with the optimal control of a linear system with delay under a quadratic cost criterion, namely, the quadratic problem for a stochastic linear control system with delay. It is well known that linear-quadratic (LQ) control is one of the most important classes of optimal control, and the solution of this problem has many real-world applications. The deterministic LQ control problem with delay has been discussed in Alekal et al. [5], Basin et al. [6], and so forth. However, there are few results on the stochastic LQ problem with delay, since the difficulties in finding the optimal feedback regulator, for example via a Riccati equation, are very different from those in the case without delay.

In Peng and Yang [7], a new type of backward stochastic differential equation (BSDE) was introduced, called the anticipated BSDE. Anticipated BSDEs provide a new method to deal with optimal control problems with delay (see Chen and Wu [8]). In this paper, we study the delayed stochastic LQ problem by means of anticipated BSDEs combined with SDDEs.

In the next section, we introduce a kind of generalized FBSDE and give an existence and uniqueness result for its solution. With the help of this FBSDE, we find the optimal control of the stochastic LQ problem with delay in both the state and control variables, where the quadratic cost also involves delay terms.

In practice, it is very important to design an optimal feedback regulator for the LQ problem. Traditionally, a fundamental tool for obtaining the state feedback is the Riccati equation. In Section 3, we introduce a new type of generalized Riccati equation and give the feedback regulator of the delayed stochastic LQ problem. To the best of our knowledge, this is the first result on the optimal feedback control for the delayed stochastic LQ problem in which the delayed state and control variables appear not only in the system but also in the cost functional. In the last section, we apply our theoretical result to a population optimal control problem. Some technical proofs are given in the Appendix.

2. Stochastic LQ Problem with Delay

We first introduce some notation. For any Euclidean space H, we denote by 〈·, ·〉 (resp., |·|) the scalar product (resp., norm) of H. Let ℝn×m be the Hilbert space consisting of all n × m matrices with the inner product:
()
Here, the superscript denotes the transpose of vectors or matrices. Particularly, we denote by 𝒮n the set of all n × n symmetric matrices.
Let W(·) be a standard 1-dimensional Brownian motion on a complete probability space (Ω, ℱ, P). The information structure is given by a filtration 𝔽 = {ℱt} t≥0, which is generated by W(·) and augmented by all the P-null sets. We assume that the dimension of the Brownian motion is d = 1 just for notational simplicity; in fact, all the conclusions in this paper remain true when d > 1. For any K among the Euclidean spaces or sets of matrices above, the following notation will be used throughout the paper:
()
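Since the display above (presumably (2.2)) is not reproduced in this version, the following LaTeX sketch records the standard spaces normally used in this literature (square-integrable ℱT-measurable random variables and square-integrable 𝔽-adapted processes); it is a plausible reconstruction rather than a quotation of the paper's display.

```latex
% Standard choices for the notation (a plausible reconstruction, not a verbatim copy):
\begin{aligned}
L^2(\Omega,\mathcal{F}_T,P;K) &= \big\{ \xi:\Omega\to K \;\big|\; \xi \text{ is } \mathcal{F}_T\text{-measurable},\ \mathbb{E}|\xi|^2<\infty \big\},\\[2pt]
L^2_{\mathbb{F}}(0,T;K) &= \Big\{ \varphi:[0,T]\times\Omega\to K \;\Big|\; \varphi \text{ is } \mathbb{F}\text{-adapted},\ \mathbb{E}\!\int_0^T |\varphi(t)|^2\,dt<\infty \Big\}.
\end{aligned}
```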
In this section, we consider the following linear controlled system involving delays in both state variable and control variable:
()
where δ > 0 is a constant time delay, the coefficients A1(·), A2(·), B1(·), B2(·), C1(·), C2(·), D1(·), D2(·) are deterministic matrix-valued functions of appropriate dimensions, and {v(t), t ∈ [0, T]} is an 𝔽-adapted square-integrable process taking values in ℝk. Let 𝒰ad denote the set of stochastic processes v(·) of the form:
()
An element of 𝒰ad is called an admissible control. By the classical theory of SDDEs, the system (2.3) has a unique solution for any admissible control v(·). The solution xv(·) of the SDDE (2.3) is called the state trajectory corresponding to the control v(·) ∈ 𝒰ad, and (xv(·), v(·)) is called an admissible pair.
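Since the display of the controlled system (2.3) is not reproduced in this version, the following Python sketch only illustrates how such a linear SDDE with delay in both state and control can be simulated with an Euler-Maruyama scheme. The drift/diffusion structure with coefficients A1, A2, B1, B2, C1, C2, D1, D2 is an assumption consistent with the coefficient names used later (e.g., in the proof of Theorem 3.2); all numerical values, the zero initial history, and the placeholder control path are illustrative only.

```python
import numpy as np

# Euler-Maruyama simulation of a scalar linear SDDE with delay in state and control:
#   dx(t) = [A1*x(t) + A2*x(t-delta) + B1*v(t) + B2*v(t-delta)] dt
#         + [C1*x(t) + C2*x(t-delta) + D1*v(t) + D2*v(t-delta)] dW(t)
# This structure is an assumption consistent with the coefficient names used in
# Section 3; the paper's display (2.3) is not reproduced here.

def simulate_sdde(T=1.0, delta=0.1, dt=1e-3,
                  A1=-0.5, A2=0.2, B1=1.0, B2=0.3,
                  C1=0.1, C2=0.05, D1=0.2, D2=0.0,
                  control=lambda t: 0.0, x0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    lag = int(round(delta / dt))              # number of grid steps in one delay length
    x = np.zeros(n_steps + 1)
    x[0] = x0                                 # zero history on [-delta, 0), x(0) = x0 (placeholder)
    v = np.array([control(i * dt) for i in range(n_steps + 1)])

    def delayed(arr, i):
        # value at time i*dt - delta; zero before time 0 (placeholder initial history)
        j = i - lag
        return arr[j] if j >= 0 else 0.0

    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        drift = A1 * x[i] + A2 * delayed(x, i) + B1 * v[i] + B2 * delayed(v, i)
        diff  = C1 * x[i] + C2 * delayed(x, i) + D1 * v[i] + D2 * delayed(v, i)
        x[i + 1] = x[i] + drift * dt + diff * dW
    return x

if __name__ == "__main__":
    path = simulate_sdde(control=lambda t: np.sin(2 * np.pi * t))
    print("x(T) =", path[-1])
```

Replacing the placeholder open-loop control by the feedback law of Theorem 3.2 would turn this into a closed-loop simulation.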
The cost functional is given by
()
Here, R1(·), R2(·), N1(·), and N2(·) are bounded matrix-valued processes, and Q is an ℱT-measurable nonnegative symmetric bounded matrix. Moreover, we assume that, for any (ω, t) ∈ Ω × [0, T], R1(t) + R2(t + δ) and Q are nonnegative definite, N1(t) + N2(t + δ) is positive definite, and its inverse is also bounded.
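The display of the cost functional (presumably (2.5)) is not reproduced above. As a hedged reconstruction consistent with the weights named here, a delayed quadratic cost of the following form would explain the shifted combinations R1(t) + R2(t + δ) and N1(t) + N2(t + δ) used below (they arise after re-indexing the delayed integrands by t ↦ t + δ); the paper's exact functional may differ.

```latex
% A plausible form of the delayed quadratic cost (hedged reconstruction, not a quotation):
J(v(\cdot)) = \tfrac{1}{2}\,\mathbb{E}\bigg[ \int_0^T \Big( \langle R_1(t)x^v(t),\,x^v(t)\rangle
   + \langle R_2(t)x^v(t-\delta),\,x^v(t-\delta)\rangle
   + \langle N_1(t)v(t),\,v(t)\rangle
   + \langle N_2(t)v(t-\delta),\,v(t-\delta)\rangle \Big)\,dt
   + \langle Q\,x^v(T),\,x^v(T)\rangle \bigg].
```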
The controller hopes to minimize the above cost functional J by selecting an appropriate admissible control v(·); that is, the problem is to find u(·) ∈ 𝒰ad such that
()
We call the problem above the linear-quadratic optimal control problem of delayed system, and we denote it by Problem (LQD). An admissible pair (xu(·), u(·)) is called optimal for Problem (LQD) if u(·) achieves the infimum of J.

Problem (LQD) introduced above is a general type of LQ problem for stochastic systems with delay: not only do the state and control variables involve delays, but the cost functional also contains delay terms.

By introducing a new type of BSDE, the anticipated BSDE, we aim to solve Problem (LQD).

Theorem 2.1. The control

()
is the unique optimal control of Problem (LQD). Here, (x(·), y(·), z(·)) is the solution of the following generalized FBSDE:
()

Remark 2.2. If we set

()
then the second equation in (2.8) can be rewritten as
()

Proof. Suppose (xv(·), v(·)) is an arbitrary admissible pair of system (2.3), then

()
Since R2(t) ≡ 0, N2(t) ≡ 0,   t ∈ (T, T + δ], by the time-shifting transformation and the initial conditions in (2.8), we derive that
()
Applying Itô′s formula to 〈xv(t) − x(t), y(t)〉, we have
()
Moreover, because v(t) = u(t),   t ∈ [−δ, 0) and y(t) = z(t) = 0,   t ∈ (T, T + δ], we can rewrite (2.13) as follows:
()
Consequently, by the fact that R1(·) + R2(·+δ) and Q are nonnegative and N1(·) + N2(·+δ) is positive, we have
()
So letting
()
we have
()
that is, u(·) defined in (2.7) is the optimal control of Problem (LQD).

We will use the parallelogram rule to prove the uniqueness of the optimal control; this method can also be found in Wu [9]. Assume that u1(·) and u2(·) are both optimal controls, with corresponding trajectories x1(·) and x2(·). It is easy to see that the trajectory corresponding to (u1(·) + u2(·))/2 is (x1(·) + x2(·))/2. Since N1(·) + N2(·+δ) is positive and R1(·) + R2(·+δ) and Q are nonnegative, we know that J(u1(·)) = J(u2(·)) = λ ≥ 0, and

()
Since N1(·) + N2(·+δ) is positive, we conclude that u1(·) = u2(·). This completes the proof.
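For the reader's convenience, the elementary identity behind the parallelogram rule is recorded below in schematic form; here q denotes the purely quadratic part of J viewed as a functional of the control (a decomposition that holds in this linear-quadratic setting because the state depends affinely on the control).

```latex
% Parallelogram rule for J(v) = q(v) + \ell(v) + c_0, with q quadratic and \ell linear:
J(u_1) + J(u_2) \;=\; 2\,J\!\Big(\tfrac{u_1+u_2}{2}\Big) \;+\; 2\,q\!\Big(\tfrac{u_1-u_2}{2}\Big).
% With u_1, u_2 both optimal (value \lambda) and J\big(\tfrac{u_1+u_2}{2}\big) \ge \lambda,
% this forces q\big(\tfrac{u_1-u_2}{2}\big) = 0; the positivity of N_1(\cdot)+N_2(\cdot+\delta)
% then gives u_1 = u_2, \ dt \times dP\text{-a.e.}
```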

Remark 2.3. The existence of the optimal control is equivalent to the existence of a solution of (2.8), which is a kind of complex generalized FBSDE. The proof of the existence and uniqueness of the solution of this kind of FBSDE is given in the Appendix.

3. Feedback Regulator of Delayed System

It is well known that the feedback representation of an optimal control is very useful in applications. In the classical case, the optimal feedback control can be represented via the Riccati equation. However, for stochastic systems with delay, it is not easy to find the feedback control because of the dependence on the history. What, then, is an appropriate “Riccati equation” for our LQ problem with delay?

In this section, we focus on the feedback regulator of the general delayed LQ problem discussed in Section 2 and derive the appropriate Riccati equation associated with Problem (LQD). We remark that all given coefficients of the problem are assumed to be deterministic from now on.

Let us start with the following results about anticipated BSDE.

Assume that, for all s ∈ [0, T], f(s, ω, y, z, ξ, η): [0, T] × Ω × ℝn × ℝn×d × L2(Ω, ℱr, P; ℝn) × L2(Ω, ℱr, P; ℝn×d) → L2(Ω, ℱr, P; ℝn), where r ∈ [s, T + δ], and that f satisfies the following conditions.
  • (H3.1)

    There exists a constant C > 0 such that, for all s ∈ [0, T], y, y′ ∈ ℝn, z, z′ ∈ ℝn×d, and ξ(·), ξ′(·), η(·), η′(·) in the corresponding L2 spaces, we have

    ()

  • (H3.2)

    .

Let (yi(·), zi(·)), i = 1,2, be, respectively, solutions of the following two anticipated BSDEs:
()
We have the following lemma, which is a direct corollary of Theorem 2.2 in Yu [10].

Lemma 3.1. Assume that f1, f2 satisfy (H3.1) and (H3.2), , , and

()
then
()

In order to get the feedback regulator, we introduce the following generalized n × n matrix-valued Riccati equation system:
()
()
()
()
with the boundary conditions
()
Here, λ is some constant.

Then we have the following main result.

Theorem 3.2. Suppose that there exist matrix-valued deterministic processes (K1(·), K2(·), H1(·), H2(·)) satisfying the generalized Riccati equation system (3.5)–(3.8) with the corresponding boundary conditions, and that, for system (2.3), A2(t) = B2(t) = C2(t) = D2(t) = 0 for t ∈ [0, δ). Then the optimal feedback regulator for the delayed linear-quadratic optimal control Problem (LQD) is

()
with
()
()
And the optimal value function is
()

Proof. Let (K1(·), K2(·), H1(·), H2(·)) be the solution of the Riccati equation system (3.5)–(3.8), and set

()
Applying Itô′s formula to y1(·), we have
()
with
()
Substituting (3.16) into (3.15), we have
()
where z2(s) = K1(s)[C1(s)x(s) + C2(s)x(sδ) + D1(s)u(s) + D2(s)u(sδ)] and
()
The following analysis shows, with the help of (3.7) and (3.8), that z1(·) = z2(·) a.e., a.s. In fact, substituting (3.16) into the expression for z2(t), we have
()
By the conditions K1(t) = H1(t) = 0, t ∈ (T, T + δ], K2(t) = H2(t) = 0, t ∈ [T, T + δ] and B2(t) = C2(t) = D2(t) = 0, t ∈ [0, δ), we derive
()
We can see that (3.7) and (3.8) ensure that
()
In other words, z1(·) = z2(·)  a.e., a.s.

So we use the following equation instead of (3.17):

()
That is, the pair (y1(·), z1(·)) introduced in (3.14) satisfies the above anticipated BSDE. From now on, we focus on the anticipated BSDEs (2.8) and (3.22).

Applying the same method as above, we find that (3.5) and (3.6) lead to the following:

()
Thus, by Lemma 3.1, we immediately get y(t) = y1(t) and z(t) = z1(t), and the feedback representation of the optimal control in Theorem 3.2 is proved.

Finally, applying Itô′s formula to 〈x(t), y(t)〉 and substituting into J(u(·)), it is easy to obtain (3.13).

Remark 3.3. (i) The condition A2(t) = B2(t) = C2(t) = D2(t) = 0 for t ∈ [0, δ) is reasonable in practice: it means that there is no delay effect on the time interval [0, δ), that is, the system has no history on [−δ, 0), so that x(t) = 0 for t ∈ [−δ, 0).

(ii) The matrix-valued differential equations (3.5)–(3.8) form a generalized type of Riccati equation which is different from the classical one. When there are no delays in the system, that is, A2(t) = B2(t) = C2(t) = D2(t) = N2(t) = R2(t) = 0 and K2(t) = H2(t) = 0, our generalized anticipated Riccati equation degenerates to the Riccati equation in Wu [9].
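For orientation, the classical no-delay stochastic LQ Riccati equation to which the system is said to degenerate has the following well-known form, written here with the coefficient names of the present paper (A1, B1, C1, D1, R1, N1, Q); this is a standard reference formula rather than a quotation of (3.5).

```latex
% Classical stochastic LQ Riccati equation (no delay), for orientation only:
\dot K(t) + A_1(t)^{\top}K(t) + K(t)A_1(t) + C_1(t)^{\top}K(t)C_1(t) + R_1(t)
 - \big(K(t)B_1(t) + C_1(t)^{\top}K(t)D_1(t)\big)
   \big(N_1(t) + D_1(t)^{\top}K(t)D_1(t)\big)^{-1}
   \big(B_1(t)^{\top}K(t) + D_1(t)^{\top}K(t)C_1(t)\big) = 0, \qquad K(T) = Q,
% with associated feedback
%   u(t) = -\big(N_1(t) + D_1(t)^{\top}K(t)D_1(t)\big)^{-1}
%          \big(B_1(t)^{\top}K(t) + D_1(t)^{\top}K(t)C_1(t)\big)\, x(t).
```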

4. Application

The solvability of the Riccati equations (3.5)–(3.8) is not easy to establish in general. In this section, we derive the unique solvability of the problem for a special case. As an application of the above results, we consider a kind of stochastic LQ problem with delay arising from a population control model. The optimal feedback control is given by the new type of Riccati equation, and we can obtain its existence and uniqueness.

The following population growth model comes from Mohammed [3]:
()
where x(t) is the population at time t, and a1 > 0 and a2 > 0 are the constant per capita death and birth rates, respectively. δ > 0 denotes the development period of each individual, and the diffusion part describes the migration of the population; that is, there is migration whose overall rate is distributed like .
Based on the population growth model, we consider the following population control problem:
()
Here, the constant death and birth rates of the original model are replaced by time-varying coefficients: a1(·), a2(·), b1(·), and σ(·) are deterministic bounded functions on [0, T] with a2(·) = b1(·) = 0 on [0, δ). The control v(·) models the intensity of the spending on controlling the population. We take into account that there is a time lag between the control expenditure and its effect on the population level, so the system involves a delay in the control. W(·) is a 1-dimensional Brownian motion.
The objective is to minimize the following cost functional:
()
over the admissible control set . Here, r1(·) and n1(·) are nonnegative- and positive-valued bounded functions on [0, T], respectively, and q ≥ 0 is a given constant; all the coefficients in our model are equal to zero outside the interval [0, T]. As an interpretation of the cost functional, one may think of controlling the quantity of some pest: we wish to keep the pest population at a low level while spending as little as possible.

The above population control problem is a special case of the Problem (LQD) we discussed in Sections 2 and 3. From Theorems 2.1 and 3.2, we have the following results.

Proposition 4.1. The control

()
is the unique optimal control. Here, (x(·), y(·), z(·)) is the solution of the following generalized FBSDE:
()

Proposition 4.2. The feedback control regulator of the population control problem is given by

()
where (k1(·), k2(·)) satisfies
()
Moreover, the above Riccati equation (4.7) admits a unique solution, so the representation (4.6) exists and is unique. Here, λ is a parameter satisfying λ > a1(t) + a2(t) for all t ∈ [0, T].

Proof. Clearly, the existence and uniqueness of the representation (4.6) are equivalent to the unique solvability of the Riccati equation (4.7), so we focus on (4.7).

Let us start with the second equation in (4.7). It is a classical Riccati-type differential equation with bounded coefficients; hence, by Theorem 7.7 of Chapter 6 in Yong and Zhou [11], it admits a unique solution k2(t) on [0, T], and moreover 0 ≤ k2(t) ≤ M for a sufficiently large positive constant M.

Now, let us turn to the first equation in (4.7), which is an anticipated equation. In order to derive its solvability, we solve it backward step by step on the time intervals [T − δ, T), [T − 2δ, T − δ), ….

For example, when t ∈ [Tδ, T), the equation is equivalent to the following:

()

Because k2(t) has already been solved, equation (4.8) admits a unique solution with k1(t) ≥ 0 on [T − δ, T), again by Theorem 7.7 of Chapter 6 in Yong and Zhou [11].

The discussions on the other time intervals are similar, so we omit them. All in all, the Riccati equation (4.7) is solved backward in time, starting from the data on [T, T + δ], and the solution is unique. This immediately yields the existence and uniqueness of our feedback control. The proof is completed.
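Since the display of the Riccati system (4.7) is not reproduced in this version, the following Python sketch only illustrates the solution strategy described in this proof: first solve the classical (non-anticipated) equation for k2 backward on [0, T], then solve the anticipated equation for k1 by the method of steps, interval by interval, using the already-known values of k1 on [t, t + δ]. The right-hand sides F and G and the terminal data q1, q2 are placeholders rather than the paper's coefficients.

```python
import numpy as np

# Method-of-steps backward solver for a coupled pair of scalar equations of the type
# discussed in the proof of Proposition 4.2 (schematic; F, G, q1, q2 are placeholders):
#   -dk2/dt = G(t, k2(t)),                      k2(T) = q2          (classical Riccati type)
#   -dk1/dt = F(t, k1(t), k1(t+delta), k2(t)),  k1(T) = q1, k1 = 0 on (T, T+delta]

def solve_riccati_pair(T=1.0, delta=0.2, dt=1e-3, q1=0.0, q2=1.0,
                       G=lambda t, k2: 1.0 - 2.0 * k2 - k2 ** 2,
                       F=lambda t, k1, k1_fut, k2: -k1 + 0.5 * k1_fut + k2):
    n = int(round(T / dt))
    lag = int(round(delta / dt))
    t_grid = np.linspace(0.0, T, n + 1)

    # Step 1: solve k2 backward on [0, T] with an explicit Euler step in reverse time.
    k2 = np.empty(n + 1)
    k2[n] = q2
    for i in range(n, 0, -1):
        k2[i - 1] = k2[i] + dt * G(t_grid[i], k2[i])

    # Step 2: solve k1 backward by the method of steps.  Because k1 vanishes on
    # (T, T+delta], the anticipated value k1(t + delta) is always already known
    # when the interval containing t is treated.
    k1 = np.zeros(n + 1 + lag)            # indices n+1 .. n+lag cover (T, T+delta]
    k1[n] = q1                            # placeholder terminal value at t = T
    for i in range(n, 0, -1):
        k1_future = k1[i + lag]           # k1(t_i + delta), known from a later interval
        k1[i - 1] = k1[i] + dt * F(t_grid[i], k1[i], k1_future, k2[i])

    return t_grid, k1[: n + 1], k2

if __name__ == "__main__":
    t, k1, k2 = solve_riccati_pair()
    print("k1(0) =", k1[0], " k2(0) =", k2[0])
```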

Acknowledgments

This work is supported by the National Natural Science Foundation of China (11101242, 11126214, 10921101, and 61174092), the National Science Fund for Distinguished Young Scholars of China (11125102), the Natural Science Foundation of Shandong Province, China, (ZR2010AQ004), the Independent Innovation Foundation of Shandong University (IIFSDU, no. 2010TS060), and Doctoral Fund of Education Ministry of China.

    Appendix

    Let us turn our attention to the generalized FBSDE (2.8). For classical FBSDEs, Hu and Peng [12] and Peng and Wu [13] obtained existence and uniqueness results under some monotonicity conditions. Yong [14] systematized the method of [12, 13] and called it the “continuation method.” Wu [9, 15] then discussed applications of FBSDEs to LQ problems and the maximum principle for optimal control problems of FBSDE systems. If we substitute the optimal control (2.7) into (2.8), we see that our FBSDE (2.8) differs from the classical ones: its coefficients contain additional delayed and anticipated terms. We now analyze this generalized kind of FBSDE.

    For a more general case, we consider the following FBSDE. (Just for simplicity, in this appendix we change notation and write the time variable t as a subscript.)

    Consider the following:
    ()
    Here B, B1, D, D1 are k × n matrices, (x, y, z) takes values in ℝn × ℝn × ℝn, and b, f, σ are 𝔽-adapted with appropriate dimensions.
    We will use the notations:
    ()
    ()
    where σ = (σ1, …, σd). We also impose the following monotonicity conditions:
    for all λ = (x, y, z) and λ′ = (x′, y′, z′), where ν1 and ν2 are given constants with ν1 ≥ 0 and ν2 > 0.

    Then we have the following result.

    Theorem A.1. Let (HA.1) and (HA.2) hold. Then there exists a unique solution satisfying the generalized FBSDE (A.1).

    Proof. Uniqueness

    Let λs = (xs, ys, zs) and λ′s = (x′s, y′s, z′s) be two solutions of (A.1). We set λ̂s = λs − λ′s = (x̂s, ŷs, ẑs), and we apply Itô′s formula to 〈x̂s, ŷs〉 on [0, T]:

    ()
    Integrating and taking expectations on both sides of the above equation, we have
    ()
    Combining this with (HA.2), we obtain
    ()
    This implies . Following (HA.1)-(iv) and the uniqueness of solution for SDDE (see [2, 3]), we have . Consequently, . By the uniqueness of solution for BSDE, we have .

    Existence. In order to prove the existence of the solution, we first consider the following family of equations parameterized by α ∈ [0, 1]:

    ()
    where ϕ, ψ, and ς are given processes in , and ξ ∈ L2(Ω, ℱT, P; ℝn). Obviously, the existence of a solution of (A.7) with α = 1 implies our desired conclusion. When α = 0, we notice that (A.7) is in a decoupled form, and we obtain the existence and uniqueness result from the classical SDE and BSDE theory. More precisely, in this case, the second equation together with the terminal condition constitutes a standard BSDE, which admits a unique solution (y0, z0). Substituting (y0, z0) into the first equation, we get a standard SDE, which admits a unique solution x0. Hence, it remains to consider the following case:
    ()
    Here, Λs = (Xs, Ys, Zs). We want to prove that the mapping defined by
    ()
    is a contraction.

    For the difference , we have the following result:
    ()
    Here, C1 depends on the Lipschitz constants of b, σ, f, Φ and on B, B1, D, D1. Since ν2α0 + (1 − α0) ≥ L1, where L1 = min(1, ν2) > 0, we obtain
    ()
    with C2 = C1/L1. For the difference of the forward components, using the standard estimate for SDDEs, we derive
    ()
    Similarly, for the difference of the backward components, we have
    ()
    by the anticipated BSDE estimates (see [7]). Here, the constant C3 depends on the Lipschitz constants, K, B, B1, D, D1 and T. Combining the estimates (A.11)–(A.13), we have
    ()
    where the constant C depends on C1, C2, C3, and T. We choose ρ0 = 1/(2C); then, for each ρ ∈ [0, ρ0], the mapping is a contraction. That is, whenever (A.7) is uniquely solvable for some α = α0, it is also uniquely solvable for α = α0 + ρ for every ρ ∈ [0, ρ0]. We repeat this process N times with 1 ≤ Nρ0 ≤ 1 + ρ0. It then follows that (A.7) with α = 1 has a unique solution.
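Schematically, the continuation argument combines the Banach fixed-point theorem with a finite induction in α. The following LaTeX note summarizes that step; the constant C refers to the estimate just displayed, and the symbol I_{α0+ρ} is introduced here (not the paper's own notation) for the mapping of (A.9) associated with the parameter value α0 + ρ.

```latex
% Continuation step (schematic).  Suppose (A.7) is uniquely solvable for a parameter
% value \alpha_0.  For \rho \in [0, \rho_0] with \rho_0 = 1/(2C), the estimate above gives
\| I_{\alpha_0+\rho}(\Lambda) - I_{\alpha_0+\rho}(\Lambda') \| \;\le\; \tfrac{1}{2}\, \| \Lambda - \Lambda' \| ,
% so I_{\alpha_0+\rho} has a unique fixed point, i.e. (A.7) is uniquely solvable for
% \alpha = \alpha_0 + \rho.  Starting from \alpha = 0 and repeating this N times,
% with 1 \le N\rho_0 \le 1 + \rho_0, one reaches \alpha = 1.
```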

    Remark A.2. It is verified that the FBSDE (2.8) satisfies the assumptions (HA.1) and (HA.2); therefore, there exists a unique solution (x, y, z) for FBSDE (2.8). Consequently, the optimal control of our Problem (LQD) exists and is unique.
