International Journal of Robust and Nonlinear Control

Volume 31, Issue 17 pp. 8481-8503

RESEARCH ARTICLE

Discounted near-optimal regulation of constrained nonlinear systems via generalized value iteration

Ding Wang (Corresponding Author)

[email protected]

orcid.org/0000-0002-7149-5712

Faculty of Information Technology, Beijing University of Technology, Beijing, China

Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, China

Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China

Correspondence: Ding Wang, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.

Email: [email protected]

Mingming Zhao

orcid.org/0000-0002-6405-4652

Faculty of Information Technology, Beijing University of Technology, Beijing, China

Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, China

Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China

Mingming Ha

School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China

Junfei Qiao

orcid.org/0000-0002-1707-6074

Faculty of Information Technology, Beijing University of Technology, Beijing, China

Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, China

Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China

First published: 18 August 2021

https://doi.org/10.1002/rnc.5729

Citations: 8

Funding information: Beijing Natural Science Foundation, JQ19013; National Natural Science Foundation of China, 61773373; 61890930-5; 62021003; National Key Research and Development Project, 2018YFC1900800-5

Abstract

In this article, a generalized value iteration algorithm is developed to address the discounted near-optimal control problem for discrete-time systems with control constraints. The initial cost function may be an arbitrary positive semi-definite function and need not be zero. First, a nonquadratic performance functional is utilized to overcome the challenge caused by saturating actuators. Then, the monotonicity and convergence of the iterative cost function sequence under the discount factor are analyzed. To facilitate the implementation of the iterative algorithm, two neural networks trained with the Levenberg–Marquardt algorithm are constructed to approximate the cost function and the control law. Furthermore, the initial control law is obtained by employing the fixed point iteration approach. Finally, two simulation examples are provided to validate the feasibility of the present strategy. It is emphasized that the established control laws remain within the constraints for randomly given initial state vectors.
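The iteration described in the abstract can be illustrated with a minimal grid-based sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the scalar dynamics `0.8*sin(x) + 0.5*u`, the bound `u_bar`, the weights, the grids, and the use of linear interpolation in place of the two neural networks. The control penalty uses a commonly cited nonquadratic form, 2r times the integral of `u_bar*atanh(s/u_bar)` from 0 to u, which may differ from the functional used by the authors. The initial cost function is the nonzero positive semi-definite function x squared.

```python
import numpy as np

def control_penalty(u, u_bar, r=1.0):
    """Closed form of 2*r * integral_0^u u_bar*atanh(s/u_bar) ds (assumed form)."""
    z = u / u_bar
    return 2.0 * r * u_bar * (u * np.arctanh(z) + 0.5 * u_bar * np.log1p(-z**2))

def generalized_value_iteration(gamma=0.95, u_bar=0.5, iters=200):
    """Discounted value iteration on a grid, started from a nonzero V_0."""
    xs = np.linspace(-1.0, 1.0, 101)                    # state grid
    us = np.linspace(-0.99 * u_bar, 0.99 * u_bar, 41)   # strictly inside the bound
    X, U = np.meshgrid(xs, us, indexing="ij")

    x_next = 0.8 * np.sin(X) + 0.5 * U                  # hypothetical dynamics
    stage = X**2 + control_penalty(U, u_bar)            # state cost + control penalty

    V = xs**2                                           # V_0: positive semi-definite, not zero
    for _ in range(iters):
        # Cost of every (state, control) pair under the current iterate V.
        V_at_next = np.interp(x_next.ravel(), xs, V).reshape(x_next.shape)
        Q = stage + gamma * V_at_next
        V_new = Q.min(axis=1)                           # minimize over the control grid
        residual = np.max(np.abs(V_new - V))            # sup-norm change of this sweep
        V = V_new

    # Greedy control read off the last sweep's (state, control) costs.
    policy = us[Q.argmin(axis=1)]
    return xs, V, policy, residual

if __name__ == "__main__":
    xs, V, policy, residual = generalized_value_iteration()
    print("last sup-norm change:", residual)
    print("largest control magnitude:", np.abs(policy).max())
```

Because the candidate controls are drawn from a grid that lies strictly inside the bound, the resulting control law satisfies the constraint by construction; in a neural implementation the same effect is usually obtained through a bounded output activation, which is again an assumption about the general technique rather than a statement about the article.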

CONFLICT OF INTEREST

The authors declare that there is no potential conflict of interest.

Open Research

DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Special Issue: Emerging Approaches for Nonlinear Parameter Varying (NLPV) Systems, 25 November 2021
