Performance-guaranteed containment control for pure-feedback multi agent systems via reinforcement learning algorithm
Ao Luo
School of Automation and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, China
Corresponding Author
Wenbin Xiao
School of Automation and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, China
School of Information and Electrical Engineering, Hunan University of Science and Technology, Xiangtan, China
Correspondence: Wenbin Xiao, School of Information and Electrical Engineering, Hunan University of Science and Technology, Xiangtan 411201, China.
Email: [email protected]
Xiao-Meng Li
School of Automation and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, China
Deyin Yao
School of Automation and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, China
Qi Zhou
School of Automation and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, China
Funding information: National Natural Science Foundation of China, Grant/Award Numbers: 61973091; 62003098; The China Postdoctoral Science Foundation, Grant/Award Numbers: 2021TQ0079; 2021M700883; The Key Area Research and Development Program of Guangdong Province, Grant/Award Number: 2021B0101410005; The Local Innovative and Research Teams Project of Guangdong Special Support Program, Grant/Award Number: 2019BT02X353
Abstract
In this article, a performance-guaranteed containment control scheme based on a reinforcement learning (RL) algorithm is proposed for a class of pure-feedback multi-agent systems (MASs) with unmeasurable states. The unknown nonlinear functions are approximated by neural networks (NNs), and an adaptive NN state observer is designed for state estimation. Based on the estimated states, the algebraic loop problem is removed by introducing filtered signals, and the actor-critic architecture of the RL algorithm is employed to obtain the optimal controller within the backstepping framework. Unlike many optimal control strategies, which derive the actor and critic updating laws by gradient descent with complicated calculations, this article obtains them through a simpler mechanism based on the uniqueness of the optimal solution. In addition, a predefined performance function and an improved error transformation technique are used to keep the containment error within a prescribed boundary. The stability of the closed-loop system is demonstrated using Lyapunov stability theory and graph theory. Finally, the effectiveness of the proposed method is verified by a simulation example.
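The idea of keeping an error inside a prescribed boundary can be illustrated with a minimal sketch. The abstract does not give the form of the performance function or of the improved error transformation, so everything below is an assumption: an exponentially decaying bound of the kind commonly used in prescribed performance control, and a generic symmetric `atanh` transformation standing in for the article's own technique. The function names and the parameters `rho0`, `rho_inf`, and `decay` are hypothetical.

```python
import math


def performance_bound(t, rho0=2.0, rho_inf=0.1, decay=1.0):
    """Assumed bound rho(t) = (rho0 - rho_inf) * exp(-decay * t) + rho_inf.

    rho0 is the initial bound, rho_inf the steady-state bound, and decay the
    convergence rate. All three are illustrative values, not the article's.
    """
    return (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf


def within_bound(error, t, **params):
    """Return True when |error| lies strictly inside the bound at time t."""
    return abs(error) < performance_bound(t, **params)


def transformed_error(error, t, **params):
    """Map the constrained error to an unconstrained variable.

    Uses z = atanh(error / rho(t)), a generic choice (an assumption): z stays
    finite only while the error stays strictly inside the bound.
    """
    ratio = error / performance_bound(t, **params)
    if abs(ratio) >= 1.0:
        raise ValueError("error is outside the prescribed bound")
    return math.atanh(ratio)


if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0):
        print(t, performance_bound(t), within_bound(0.5, t))
```

Under these assumptions, keeping the transformed variable bounded is what keeps the original error inside the shrinking envelope; the article's improved transformation may differ in form.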
CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.
Open Research
DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.