Data-enabled predictive control for quadcopters
Funding information: Eidgenössische Technische Hochschule Zürich, H2020 European Research Council, OCAL, No. 787845; Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung, NCCR Automation
Abstract
We study the application of a data-enabled predictive control (DeePC) algorithm for position control of real-world nano-quadcopters. The DeePC algorithm is a finite-horizon, optimal control method that uses input/output measurements from the system to predict future trajectories without the need for system identification or state estimation. The algorithm predicts future trajectories of the quadcopter by linearly combining previously measured trajectories (motion primitives). We illustrate the necessity of a regularized variant of the DeePC algorithm to handle the nonlinear nature of the real-world quadcopter dynamics with noisy measurements. Simulation-based analysis is used to gain insights into the effects of regularization, and experimental results validate that these insights carry over to the real-world quadcopter. Moreover, we demonstrate the reliability of the DeePC algorithm by collecting a new set of input/output measurements for every real-world experiment performed. The performance of the DeePC algorithm is compared to Model Predictive Control based on a first-principles model of the quadcopter. The results are demonstrated with a video of successful trajectory tracking of the real-world quadcopter.
1 INTRODUCTION
The analysis and design of control systems is traditionally addressed using a model-based control approach where a model for the system is first identified from data, and the control policy is then designed based on the identified model. The system identification step is often the most time-consuming and challenging part of model-based control approaches.1, 2 System identification often requires expert knowledge and partial system models,3 and unless the control objective is taken into account during the identification process, the obtained model may not be useful for control.4 These observations as well as the advancements in sensing and computation technologies have motivated a tendency toward data-driven control methods yielding many successes.5-8 Such methods bypass the traditional model-based control approach, and design control inputs directly from data. These so-called direct data-driven methods for control design benefit from ease of implementation on complex systems where system identification is too time-consuming and cumbersome. Among these data-driven methods are learning-based and adaptive Model Predictive Control (MPC) approaches, where the unknown system dynamics are substituted with a learned model which maps inputs to output predictions.9-13 However, such methods still require learning an input/output model and often involve (stochastic) function approximation by means of neural networks or Gaussian processes, which come with their own tuning challenges and can be inconsistent across applications.14
One algorithm that does not require any function learning or system identification is the so-called data-enabled predictive control (DeePC) algorithm.15 Instead, this algorithm directly uses previously measured input/output data to predict future trajectories. The previously measured input/output data from the system act as motion primitives that serve as a basis for the subspace of possible system trajectories. The DeePC algorithm builds on the seminal work on linear time invariant (LTI) systems by Willems et al., specifically what is known as the fundamental lemma in behavioral systems theory.16 This result was used by Markovsky et al. for the first time for control purposes allowing for the synthesis of data-driven open loop control for LTI systems.17 The DeePC algorithm extended this method to closed-loop control and was implemented in a receding horizon optimal control setup. This algorithm was shown to be equivalent to MPC for deterministic LTI systems,15 and was later extended giving guarantees on recursive feasibility and closed-loop stability.18 Additionally, numerical case studies have illustrated that the algorithm performs robustly on some stochastic and nonlinear systems and often outperforms system identification followed by conventional MPC.19-21
Several other data-driven control methods have been proposed that make use of input/output data in similar ways as DeePC. One method uses the fundamental lemma to synthesize stabilizing output feedback controllers solving the linear quadratic regulation problem using only input/output data.22 Other methods use previously measured input/output trajectories as motion primitives to compute minimum energy inputs,23 or produce new control inputs for LTI systems.24 All of these methods, including the DeePC algorithm, rely on the linearity property. For nonlinear systems, data-driven control methods that make use of motion primitives to synthesize new trajectories have been proposed.25, 26 Common to these methods is a nonlinear data-fitting step in the generation of the motion primitives. One approach uses sparse identification to fit the raw data to a predefined library of nonlinear primitives.25 An approach tailored to robotics applications learns motion primitives from demonstration trajectories by estimating the parameters of nonlinear Gaussian basis functions.26 By contrast, DeePC directly uses raw data sequences as motion primitives, and there is no data-fitting step. It relies on a robustifying regularization, which is incorporated directly in the optimal control objective, to address nonlinearity and stochasticity.15, 19, 27
The focus of this paper is on implementing this robustified, regularized variant of the DeePC algorithm for the first time on a real-world system. In particular, we seek to analyze how the algorithm can be applied for real-time control of a quadcopter whose dynamics are nonlinear and the measurements are corrupted by noise. The quadcopter is a common benchmark system for verifying data-driven control methods.7, 8, 28-30 It makes for an interesting benchmark because it is nonlinear, open-loop unstable and has fast dynamics. Our main aim is to gain valuable insights on this benchmark which will assist in the implementation of our novel data-driven control method across multiple systems.
Contributions: The DeePC algorithm is implemented for the first time on a real-world system bridging the gap from theory to application. Through this, we gain key insights into choices of the algorithm's hyperparameters, providing tuning guidelines. We demonstrate that the DeePC algorithm is computationally tractable and suitable for real-time control. A video of the DeePC algorithm performing figure 8 trajectory tracking on the real-world quadcopter is provided here: https://doi.org/10.3929/ethz-b-000493419.
Outline: The real-world quadcopter system, problem statement, and DeePC algorithm are introduced in Section 2. The main contributions appear in Section 3, where we present simulation analysis and experimental results, as well as a video of successful trajectory tracking of the quadcopter. We conclude in Section 4 stating some future directions of research.
2 SETTING
We first present the quadcopter system in Section 2.1, providing details about its input/output channels, and the first-principles modeling that is used for simulation-based analysis. We then formally state in Section 2.2 the quadcopter control goal as a general finite-horizon, discrete-time, optimal control problem. Section 2.3 recalls the DeePC algorithm, showing how it can be used to address both LTI and nonlinear stochastic control problems in a data-driven way.
2.1 Quadcopter
For the purpose of simulation, we use a nonlinear, continuous-time quadcopter model. Full details of the model derivation are provided in other works.31, 32 Here we highlight the key definitions, equations, and control architecture. The model presented is also the starting point for the model-based control methods that are used for comparison in Section 3.4.














where is the acceleration due to gravity. An important feature of these equations is that the equilibrium inputs are the same at all positions
and at all yaw angles
.


- the body rate references about the body-frame x and y axes, and
- the total thrust force from the propellers combined.
The outer controller adjusts these three inputs to ensure that the quadcopter tracks a position reference provided by the user, based on feedback of position and orientation measurements provided by an external motion capture system.33, 34 Our aim is to design a data-driven outer controller for this 3-input, 5-output off-the-shelf quadcopter system (see Figure 2 for a schematic of the architecture). Previous work has demonstrated that it is possible to design in simulation a data-driven controller from position and orientation measurements to propeller thrusts directly.15 As the goal of this work is to control a real-world quadcopter, we consider that the inner controller on the off-the-shelf quadcopter constitutes a part of the black-box system under investigation.

2.2 Problem statement























Without knowledge of system (2), solving problem (3) is no longer possible, since we can neither predict forward trajectories of the system nor estimate its current state. To resolve these issues, we approach the problem in a data-driven manner. In particular, we use the DeePC algorithm,15 which replaces the constraints requiring system knowledge by raw input/output data to solve an optimization problem that is similar to (3) and, under assumptions to be recalled next, equivalent to (3).
2.3 Data-enabled predictive control
2.3.1 DeePC for deterministic LTI systems





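Definition 1 (Persistency of excitation [16]). Let $L, T \in \mathbb{Z}_{>0}$ with $T \ge L$. The sequence of signals $u = \mathrm{col}(u_1, \dots, u_T) \in \mathbb{R}^{Tm}$ is called persistently exciting of order $L$ if the block Hankel matrix with $L$ block rows constructed from $u$ has full row rank.
Note that the property of being persistently exciting of order $L$ requires the length of the sequence of signals to be large enough; in particular, since this Hankel matrix has $mL$ rows and $T - L + 1$ columns, where $m$ is the number of inputs, the length must be such that $T \ge (m + 1)L - 1$. Intuitively, a persistently exciting sequence of signals must be sufficiently long and sufficiently rich to excite all aspects of the dynamics (4). The DeePC algorithm relies on the following fundamental result.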
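Theorem 1 (Theorem 1 of Reference [16]). Let $T, t \in \mathbb{Z}_{>0}$. Let $\mathrm{col}(u^{\mathrm{d}}, y^{\mathrm{d}})$ be a trajectory of (4) of length $T$ such that $u^{\mathrm{d}}$ is persistently exciting of order $t + n$. Then $\mathrm{col}(u, y)$ is a trajectory of (4) of length $t$ if and only if there exists $g \in \mathbb{R}^{T - t + 1}$ such that
$$\begin{pmatrix} \mathscr{H}_t(u^{\mathrm{d}}) \\ \mathscr{H}_t(y^{\mathrm{d}}) \end{pmatrix} g = \begin{pmatrix} u \\ y \end{pmatrix},$$
where $\mathscr{H}_t(\cdot)$ denotes the Hankel matrix with $t$ block rows.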
The result above states that the subspace spanned by the columns of the Hankel matrix corresponds exactly to the subspace of possible trajectories of (4). Hence, the Hankel matrix may serve as a nonparametric model for (4), one that is simply constructed from raw time-series data and does not require any learning.
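As a rough illustration of Definition 1, the following sketch builds a block Hankel matrix from a recorded signal and checks persistency of excitation through a rank test. The function names (`block_hankel`, `is_persistently_exciting`) and the test signals are assumptions made for this example and are not taken from the original work.

```python
import numpy as np

def block_hankel(w, depth):
    """Block Hankel matrix with `depth` block rows from a signal w of shape (T, m)."""
    w = np.asarray(w, dtype=float)
    n_cols = w.shape[0] - depth + 1
    if n_cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    # Column j stacks the window w[j], ..., w[j + depth - 1] into one vector.
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(n_cols)])

def is_persistently_exciting(u, order):
    """Check whether the depth-`order` Hankel matrix of u has full row rank."""
    H = block_hankel(u, order)
    return bool(np.linalg.matrix_rank(H) == H.shape[0])

rng = np.random.default_rng(0)
m, order = 2, 4                                           # hypothetical sizes
u_rich = rng.standard_normal((40, m))                     # long and random
u_short = rng.standard_normal(((m + 1) * order - 2, m))   # one sample too short
u_flat = np.ones((40, m))                                 # long but constant

print(is_persistently_exciting(u_rich, order))
print(is_persistently_exciting(u_short, order))
print(is_persistently_exciting(u_flat, order))
```

The short signal fails because its Hankel matrix has fewer columns than rows, and the constant signal fails because all of its columns are identical, which matches the intuition that the data must be both sufficiently long and sufficiently rich.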
In what follows, we will see how the above theorem allows us to perform implicit state estimation as well as predict forward trajectories of the unknown system allowing us to solve an optimization problem equivalent to (3) when the system is of the form (4).






















Every column of the Hankel matrix is a trajectory of the system (motion primitive), and any new trajectory (right-hand side of (6)) can be synthesized by a linear combination of these motion primitives. Hence, given an input sequence u to be applied to the system, one can solve the first three block equations of (6) for g, and the corresponding output sequence is then given by the fourth block equation. The top two block equations in (6) implicitly fix the initial condition from which the future trajectory departs. To fix this initial condition uniquely, the number of past measurements used must be at least the lag of the system (i.e., the number of past measurements required to uniquely identify the current state of the system through back-propagation of the dynamics (4)). This in turn implies that the predicted output trajectory is unique.17 Note that the lag of the system is a priori unknown, but is upper bounded by n. Hence, knowing an upper bound on the state dimension n of the system is sufficient to obtain unique predictions.
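The prediction step described above can be sketched on a toy example. The system matrices `A`, `B`, `C`, the horizon names `T_ini` and `T_f`, and all numerical values below are hypothetical; the model is used only to generate data and to check the prediction, never inside the predictor itself.

```python
import numpy as np

def block_hankel(w, depth):
    """Block Hankel matrix with `depth` block rows from a signal w of shape (T, m)."""
    w = np.asarray(w, dtype=float)
    n_cols = w.shape[0] - depth + 1
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(n_cols)])

# Hypothetical 2-state, single-input, single-output LTI system.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def simulate(u, x0):
    """Outputs y_k = C x_k of x_{k+1} = A x_k + B u_k, as an array of shape (T, 1)."""
    x, ys = np.asarray(x0, dtype=float), []
    for u_k in u:
        ys.append(C @ x)
        x = A @ x + B @ u_k
    return np.array(ys)

rng = np.random.default_rng(1)
T, T_ini, T_f = 60, 4, 6

# Offline: one sufficiently rich experiment fills the Hankel matrices.
u_d = rng.standard_normal((T, 1))
y_d = simulate(u_d, np.zeros(2))
H_u = block_hankel(u_d, T_ini + T_f)
H_y = block_hankel(y_d, T_ini + T_f)
U_p, U_f = H_u[:T_ini], H_u[T_ini:]   # one row per time step since m = p = 1
Y_p, Y_f = H_y[:T_ini], H_y[T_ini:]

# Online: a fresh trajectory starting from an initial state unknown to the predictor.
u_new = rng.standard_normal((T_ini + T_f, 1))
y_new = simulate(u_new, [1.0, -0.5])
u_ini, u_fut = u_new[:T_ini].ravel(), u_new[T_ini:].ravel()
y_ini, y_true = y_new[:T_ini].ravel(), y_new[T_ini:].ravel()

# Solve the first three block equations for g, then read off the predicted outputs.
lhs = np.vstack([U_p, Y_p, U_f])
rhs = np.concatenate([u_ini, y_ini, u_fut])
g, *_ = np.linalg.lstsq(lhs, rhs, rcond=1e-10)
y_pred = Y_f @ g

print(np.max(np.abs(y_pred - y_true)))
```

Because the toy data are noise-free and `T_ini` exceeds the lag of this toy system, the prediction should agree with the simulated outputs up to numerical precision, even though g itself is not unique. The explicit `rcond` is a design choice here so that numerically zero singular values of the rank-deficient left-hand side are discarded.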





2.3.2 Regularized DeePC for nonlinear noisy systems
The goal of this paper is to implement the above DeePC optimization problem to control the real-world quadcopter described in Section 2.1. As the quadcopter dynamics do not satisfy the deterministic LTI assumption necessary to show the equivalence of the MPC optimization problem (3) and the DeePC optimization problem (7), regularizations are needed. Indeed, when the input/output data used for the Hankel matrix in (7) are obtained from a nonlinear system or are corrupted by process or measurement noise (as is the case with any real-world application), the subspace spanned by the columns of the Hankel matrix no longer coincides with the subspace of possible trajectories of the system. In fact, in any real-world problem setting the Hankel matrix used for predictions in (7) will generally be full rank. Hence, the Hankel matrix constraint will imply that any trajectory is possible, leading to poor closed-loop performance of the DeePC algorithm. Furthermore, the online measurements used to set the initial condition from which the predicted trajectory departs are corrupted by measurement noise, and thus may cause poor predictions. Including a 2-norm penalty on the difference between the estimated initial condition and the measured initial condition coincides roughly with a least-squares estimate of the true initial condition.





It has been shown that when the regularization function in (8) is a nonnegative weight multiplied by the q-norm of g, with $q \ge 1$, problem (8) coincides with a distributionally robust problem formulation. Using such a q-norm regularization for the decision variable g induces robustness to all systems (nonlinear or stochastic) that could have produced the data in the Hankel matrices (5) within an s-norm induced Wasserstein ball around the data samples used, where $s$ satisfies $\frac{1}{q} + \frac{1}{s} = 1$.19, 27
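For intuition, and assuming the usual conjugate-exponent (dual norm) relation between the regularization norm q and the norm s defining the Wasserstein ball, the two regularization norms considered later in this paper pair up as follows:

```latex
\frac{1}{q} + \frac{1}{s} = 1
\quad\Longrightarrow\quad
q = 1 \;\leftrightarrow\; s = \infty,
\qquad
q = 2 \;\leftrightarrow\; s = 2 .
```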
The computational complexity of (8) can be characterized by the number of decision variables, equality constraints, and inequality constraints, where the inequality constraints are counted for the case that the input and output constraint sets are box constraint sets. As is expected of a finite-horizon optimal control method, the computational complexity grows with the prediction time horizon. Furthermore, the time horizon used for initial condition estimation and the number of data points also affect the computational complexity. The former is related to the observability of the unknown system (2), the latter to the system's dimensionality.
3 RESULTS
- The nonlinear and stochastic nature of the quadcopter system requires that the regularization function in (8) and the other hyperparameters offered by the DeePC Algorithm 1 be chosen appropriately for the application at hand. This is addressed by the simulation-based analysis in Section 3.2.
- The simulation model is a simplification of the real-world quadcopter system which neglects complex aerodynamic phenomena, drag, delays in actuation, communication and sensing, and process noise. Essentially, the simulation model contains merely the bare Newtonian dynamics, and even those are subject to parametric uncertainties. Therefore, it is not clear that simulation-based parameter selection can be directly transferred to real-world experiments. This is addressed by the experimental results in Section 3.3.
The real-world results were collected from laboratory experiments conducted using a motion capture system to provide measurements of the position and orientation of the quadcopter at a frequency of 25 Hz. Thus, the sampling time in the discrete-time dynamics (2) is 40 ms. The laboratory setup was developed as part of a previous work.35 To give the reader an idea of the scale of the setup, the Crazyflie 2.0 [36] quadcopter weighs 28 grams and a 12 cubic meter flying space was available. Further details on the setup are given in Section 3.3 where the experimental results are presented. The simulation environment uses the model presented in Section 2.1 and the model parameters identified in a previous work.37 These model parameters do not match the specific Crazyflie 2.0 used for the experiments, partially due to additional hardware required for detection by the motion capture system.
3.1 Data collection
As described in Section 2.3, the input signal used in the Hankel matrices appearing in (7) must be persistently exciting of sufficient order. This data can be collected by injecting a random input sequence, or by performing a manual flight experiment where a human performs the function of the outer controller. For repeatability of results, we chose the former. Two possible choices of random input signals to be applied during the data collection phase are a pseudorandom binary sequence (PRBS) designed for multiple inputs,38 or a white noise signal. Both types of perturbations were tested in simulations and showed a negligible difference in the performance of the DeePC algorithm. The results in this paper are presented using a PRBS input signal during the data collection phase because it generally provides better performance for classical system identification techniques.39 The input signals applied for data collection consist of the PRBS excitation signal added to an existing controller that maintains the quadcopter around the hover state. The data collected was used to populate the Hankel matrices in (5).
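A minimal sketch of such an excitation signal is given below. It generates a pseudorandom binary sequence with a standard 7-bit linear feedback shift register (polynomial x^7 + x^6 + 1, period 127) and adds shifted copies of it to a nominal command. The amplitudes, the per-channel shifts, and the constant nominal commands standing in for the output of the hover controller are all hypothetical; the multi-input PRBS design used in the paper (Reference 38) may differ.

```python
def prbs7(length, seed=0x01):
    """Return `length` samples in {-1.0, +1.0} from the LFSR x^7 + x^6 + 1."""
    if not 0 < seed < 128:
        raise ValueError("seed must be a nonzero 7-bit value")
    state, out = seed, []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of bits 7 and 6
        state = ((state << 1) | new_bit) & 0x7F       # shift in the new bit
        out.append(1.0 if new_bit else -1.0)
    return out

def excitation(nominal, amplitude, length, shift):
    """Nominal command plus a scaled PRBS that is delayed by `shift` samples."""
    sequence = prbs7(length + shift)[shift:]
    return [nominal + amplitude * s for s in sequence]

# Hypothetical values for three input channels (two body rates and thrust).
n_samples = 300
channels = [
    excitation(nominal=0.0, amplitude=0.1, length=n_samples, shift=0),
    excitation(nominal=0.0, amplitude=0.1, length=n_samples, shift=42),
    excitation(nominal=0.27, amplitude=0.02, length=n_samples, shift=84),
]
print([len(c) for c in channels])
```

Using delayed copies of one sequence is only one simple way to obtain different signals on each channel; in a closed-loop data collection, `nominal` would be replaced at every sample by the output of the stabilizing controller.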
3.2 Simulation-based analysis and insights




















- the total number of data points used to construct the Hankel matrices in (5),
- the time horizon used for initial condition estimation,
- the weight on the softened initial condition constraint,
- the weight on the regularization of g,
- q, the norm used to regularize g in (10), and
- p, the number of outputs used to construct the Hankel matrices in (5).
Although p may seem fixed by the output measurements available, in the case of quadcopter control, it is reasonable to consider whether to use all measurements for position control, that is, set $p = 5$, or use only the position measurements, that is, set $p = 3$.
Note that if one were to approach the control problem through system identification followed by MPC, a number of hyperparameters would also need to be selected. For example, the MATLAB subspace system identification method N4SID requires choosing a model order, weighting scheme, forward estimation and backward prediction horizons, weighting prefilter, output weighting matrix, and other hyperparameters. More generally, system identification for quadcopters requires significant engineering, and previous works resort to the use of partial model knowledge, such as the presence of integrators40 or the decoupled nature of the dynamics.41, 42 This is in addition to the use of full model knowledge in simulating the system and generating the input/output data for identification in these works. Further, the DeePC hyperparameters affect the closed-loop control performance directly and not through an offline system identification step, which means that they can be easily adapted online on the arrival of new data.





Procedure 1 (Procedure for collecting results in simulation and real-world experiments). For simulation, the system used was a model of the off-the-shelf quadcopter system with dynamics (1) and architecture as in Figure 2, where measurements were affected by zero-mean Gaussian noise with covariance matrix as in Appendix A. For the real-world experiments, the system used was the Crazyflie 2.0.
1. The quadcopter is brought to hover with a stabilizing controller. The system is excited by adding a PRBS signal to the output of the stabilizing controller, as per Section 3.1, for the input/output data collection step of the DeePC algorithm.
2. The regularized DeePC optimization problem (8) is set up with the input/output data collected in step 1.
3. The DeePC controller is turned on and the quadcopter is commanded to track a diagonal step up in position.
4. The resulting closed-loop tracking error is measured from the time index at the start of the step trajectory over the chosen experiment length, which corresponds to 10 seconds in real time.
3.2.1 Sensitivity to the number of data points and the initial-condition horizon
As discussed in Section 2.3, for LTI systems the DeePC algorithm requires a minimum number of data points to satisfy the persistency of excitation property. Since we apply the DeePC algorithm to a nonlinear system subject to measurement noise, it is unclear how many data points are needed to construct the Hankel matrices in (5). Figure 3 shows the sensitivity of the tracking error to the number of data points and to the time horizon used for initial condition estimation (the initial-condition horizon). Figure 3 (left) shows the influence of the number of data points on the tracking error, where for each value considered we show the smallest tracking error achieved over all combinations of the other hyperparameters in the grid given by (11), with the initial-condition horizon held fixed. Similarly, Figure 3 (right) shows the influence of the initial-condition horizon on the tracking error, where for each value considered we show the smallest tracking error achieved over all combinations of the other hyperparameters in the grid given by (11), with the number of data points held fixed.











Here $n$ is the number of states corresponding to a minimal realization of (1) linearized about hover. Note that the minimum number of data points such that the Hankel matrix in (8) is square is directly affected by the number of outputs p. Hence, a larger p requires more data points to satisfy the lower bound in (12) and thus results in more decision variables in problem (8). The distinct improvement in the tracking error when the number of data points is chosen such that the Hankel matrix in (8) is square is also observed in a power system application of DeePC.21
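The two lower bounds on the number of data points can be evaluated with a few lines of arithmetic. The sketch below assumes that the Hankel matrix stacks inputs and outputs over `T_ini + T_f` block rows, so that it has (m + p)(T_ini + T_f) rows and T - T_ini - T_f + 1 columns, and that persistency of excitation is required of order T_ini + T_f + n. The horizon values `T_ini = 6` and `T_f = 25` are hypothetical placeholders; m = 3 inputs, n = 8 states, and p equal to 3 or 5 outputs are the dimensions mentioned in the text.

```python
def min_length_pe(m, order):
    """Smallest T for which an m-input signal can be persistently exciting of `order`."""
    # The Hankel matrix has m * order rows and T - order + 1 columns.
    return (m + 1) * order - 1

def min_length_square(m, p, T_ini, T_f):
    """Smallest T for which the stacked Hankel matrix has as many columns as rows."""
    depth = T_ini + T_f
    # Rows: (m + p) * depth.  Columns: T - depth + 1.
    return (m + p) * depth + depth - 1

m, n = 3, 8            # inputs and linearized state dimension
T_ini, T_f = 6, 25     # hypothetical horizons, for illustration only

for p in (3, 5):
    t_pe = min_length_pe(m, T_ini + T_f + n)
    t_sq = min_length_square(m, p, T_ini, T_f)
    print(p, t_pe, t_sq, max(t_pe, t_sq))
```

Under these assumptions the persistency of excitation bound does not depend on p, whereas the bound for a square Hankel matrix grows with p, which is consistent with the observation that a larger p requires more data points.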
A similar trend is observed in Figure 3 (right) for the initial-condition horizon, where good tracking performance is achieved for values larger than a threshold, and this threshold differs between $p = 3$ and $p = 5$. This suggests that the number of past measurements needed to estimate the initial condition of the unknown system depends on which outputs are used. We observed, however, that choosing a value above these thresholds gives steadier flight of the quadcopter. Under noisy measurements, increasing the initial-condition horizon leads to better initial condition estimates. For the remaining results (simulation and experimental), Procedure 1 was conducted with fixed values of the number of data points and of the initial-condition horizon. This resulted in good tracking error performance for both $p = 3$ and $p = 5$, while keeping the size of the DeePC optimization problem (8) small enough to be computationally tractable in real time.
3.2.2 Sensitivity to the initial-condition weight, the regularization weight, q, and p
Figure 4 shows the results from the grid search as a heat map over the weight on the softened initial condition constraint (the initial-condition weight) and the weight on the regularization of g (the regularization weight), with fixed values of q and p for the purpose of visualization, and fixed values of the number of data points and the initial-condition horizon for the reasons described above. The figure provides the insight that there is a threshold for the initial-condition weight beyond which small tracking error can be achieved. The intuitive explanation for this insight is that a large enough penalization on the softened initial condition constraint ensures that the future predicted trajectory departs from an initial condition close to the actual initial condition. A similar trend of the tracking performance as a function of the initial-condition weight is observed in other numerical case studies of DeePC.15, 21 This suggests that a tuning guideline for the initial-condition weight is to choose it as large as possible without causing the optimization solver to encounter numerical issues.




Figure 4 also exposes a range of the regularization weight on g in which small tracking error is achieved. To investigate this further we consider the grid search results for all combinations of $q \in \{1, 2\}$ and $p \in \{3, 5\}$. Figure 5 shows the results from the grid search over the regularization weight for a fixed value of the initial-condition weight and for all four combinations of q and p; for example, the line for the combination of q and p used in Figure 4 is the slice of Figure 4 at that fixed value of the initial-condition weight. In all cases a small tracking error is achieved for a range of the regularization weight, although one of the four combinations performs relatively poorly. This range with acceptable tracking error is wider for $q = 2$ than for $q = 1$, which suggests that for the setup under consideration, 2-norm regularization is less sensitive to hyperparameter selection than 1-norm regularization. This observation is supported by the heat maps for all four combinations of q and p provided in Appendix B. Based on these insights, for the remainder of the results we fix the initial-condition weight and the norm q, and now investigate in more detail the influence of the regularization weight and the choice of output measurements p.







To provide some intuition for how the regularization weight influences the optimal solution of the regularized DeePC optimization problem (8), we now take a closer look at the closed loop trajectories resulting from two values of this weight: zero (no regularization) and the best value found in the grid search. Figure 6(A, B) shows one position coordinate of the simulated closed loop trajectory over time (solid line), the reference (dotted line), and the trajectory predicted by problem (8) at representative time instants (dashed line).



In the case of no regularization (Figure 6(A), zero regularization weight), the predictions do not correspond to the physics of the model and the actual position diverges, that is, the quadcopter crashes. Since the data used in the Hankel matrix in (8) are obtained from a nonlinear system and are corrupted by measurement noise, the subspace spanned by the columns of the Hankel matrix is the entire ambient space. Hence, without regularization on the decision variable g, the Hankel matrix predicts that every trajectory is possible. The nonzero value of the regularization weight is selected from the grid search result where the DeePC algorithm achieved the smallest tracking error (see Figure 5). We see in Figure 6(B) that desirable reference tracking is achieved and that more physical predictions are computed by the regularized optimization problem (8).
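The loss of low-rank structure under noise can be reproduced on a toy example. In the sketch below, the system matrices, the noise level, and the Hankel depth are hypothetical and are unrelated to the quadcopter; the point is only to compare the rank of the stacked input/output Hankel matrix with and without measurement noise.

```python
import numpy as np

def block_hankel(w, depth):
    """Block Hankel matrix with `depth` block rows from a signal w of shape (T, m)."""
    w = np.asarray(w, dtype=float)
    n_cols = w.shape[0] - depth + 1
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(n_cols)])

# Hypothetical 2-state, single-input, single-output LTI system.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def simulate(u):
    """Outputs of the toy system started from the zero state."""
    x, ys = np.zeros(2), []
    for u_k in u:
        ys.append(C @ x)
        x = A @ x + B @ u_k
    return np.array(ys)

rng = np.random.default_rng(2)
T, depth = 60, 6
u_d = rng.standard_normal((T, 1))
y_clean = simulate(u_d)
y_noisy = y_clean + 0.01 * rng.standard_normal(y_clean.shape)

def stacked_rank(u, y):
    """Numerical rank and row count of the stacked input/output Hankel matrix."""
    H = np.vstack([block_hankel(u, depth), block_hankel(y, depth)])
    return int(np.linalg.matrix_rank(H, tol=1e-8)), H.shape[0]

rank_clean, n_rows = stacked_rank(u_d, y_clean)
rank_noisy, _ = stacked_rank(u_d, y_noisy)
print(rank_clean, rank_noisy, n_rows)
```

With noise-free data the rank should equal the number of input rows plus the state dimension, which is strictly less than the number of rows, so only a proper subspace of trajectories is admitted. With even small noise the matrix becomes full row rank, so any stacked vector can be written as a combination of its columns, which is why the unregularized problem can "predict" arbitrary trajectories.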
An important distinction between the regularization weight and the three hyperparameters discussed above (the number of data points, the initial-condition horizon, and the initial-condition weight) is that the regularization weight cannot be arbitrarily increased, as shown also in Figure 5. The reason is that at a certain level the regularization term in (8) dominates the tracking error term, leading to poor tracking performance and eventually instability of the system. However, the range of the regularization weight resulting in small tracking error is large (see Figure 5), indicating robustness to the choice of this weight.
The regularization weight and q, which parameterize the regularization function in optimization problem (8), are the main parameters of the regularized DeePC algorithm that are not present in model-based control approaches. These hyperparameters provide distributional robustness against the uncertainty in the system generating the input/output data.19, 27 Increasing the regularization weight provides an increased level of robustness at the cost of being conservative. For intuition, the counterparts of the regularization weight in a model-based approach are model-order selection parameters that decide how much of the data should be attributed to the model and how much to noise. Similarly, the choice of the regularization norm q corresponds to the choice of a loss function in system identification, such as the average or the worst-case cost. The range of values of the regularization weight and q which result in a small tracking error depends on the nature of the uncertainty in the system, and the analysis above does not indicate a general guideline that we would expect to apply across multiple systems. Interestingly, however, we observe here and in other applications20, 21 that a particular combination of q and the regularization weight performs well. We will explore this empirical observation further in future work.
3.3 Real-world DeePC implementation
We now investigate how the insights gained through the simulation analysis of Section 3.2 transfer to laboratory experiments on a real-world quadcopter, with the details of the experimental setup provided at the start of Section 3. The experiments are performed as per Procedure 1 (see Section 3.2) and through the results we investigate: (a) whether the insights from the simulation-based analysis are validated in experiments; (b) whether the hyperparameter values identified from the simulation-based analysis can be directly transferred to the laboratory environment; and (c) the reliability of the tracking performance achieved.
Figure 7 provides a schematic of the laboratory setup used to collect the experimental results. The motion capture system consists of multiple cameras placed around the flying space and connected to a dedicated computer. The software running on the motion capture computer provides accurate measurements [34] of the position and orientation of the Crazyflie 2.0 [36] quadcopter. These measurements are available to an offboard laptop where the outer controller from Figure 2 is implemented. The control decisions of the outer controller are sent via the Crazyradio link to the Crazyflie 2.0, where the firmware provided with the quadcopter runs an onboard controller to track them.

The following analysis of performance on the real-world system focuses on the regularization weight and p, as these are the hyperparameters for which the simulation-based analysis of Section 3.2 did not provide clear tuning guidelines. On the other hand, the tuning guidelines found in Section 3.2 for the number of data points, the initial-condition horizon, and the initial-condition weight generalized well to the real-world quadcopter, and no significant new insights were observed when varying these hyperparameters in the real world. Hyperparameter q was set to the 2-norm because it reduces the number of decision variables in the optimization problem (8) to be solved online and hence reduces the online computation time required. Moreover, the simulation-based results in Section 3.2 suggest that similarly low tracking error is achievable with both $q = 1$ and $q = 2$.
Figure 6(C, D) shows the corresponding coordinate of the closed loop trajectory, reference, and DeePC predictions when implemented on the quadcopter using the same hyperparameter values as Figure 6(A, B) respectively. The main feature of Figure 6 is that the simulation and experimental results show qualitatively similar closed-loop trajectories (solid lines) and predictions computed by the DeePC optimization problem (8) (dashed lines). This provides experimental validation of the insight that regularization is required to predict physically reasonable trajectories when applying DeePC to a real system. Moreover, a direct transfer of the hyperparameters selected via simulation to the experiments was possible, and we observed that tracking performance was not significantly improved by adjusting the regularization weight. Appendix C provides a similar comparison for regularization weights above and below the selected value, indicating that the real-world implementation also achieves the best tracking performance at approximately the selected value.
To investigate the reliability of the performance observed in Figure 6(D), and also to investigate the influence of hyperparameter p, Procedure 1 was repeated in 28 experiments for each of $p = 3$ and $p = 5$. To capture different operating conditions, 14 trials were performed with a fully charged battery and 14 with a partially depleted battery. Figures 8 and 9 and Table 1 summarize the results. Figure 8 shows the position time series data (solid grey) of all 28 trajectories for each value of p (one value in panels A, B, C and the other in panels D, E, F), with the average at each time point (dashed) shown to assist with visualization. Figure 9 shows the same data as a top view.








p | Tracking error (a), mean | Tracking error (a), median | Tracking error (a), SD | Solve time (ms), mean | Solve time (ms), median | Solve time (ms), SD |
---|---|---|---|---|---|---|
3 | 75 | 69 | 21 | 4.14 | 3.92 | 1.49 |
5 | 93 | 86 | 23 | 6.66 | 5.70 | 4.78 |
- (a) Computed as described in Procedure 1.
Quantitatively, Table 1 shows that $p = 3$ achieves a lower tracking error compared to $p = 5$, in terms of mean, median, and SD. This is likely due to the orientation measurements having higher noise than the position measurements. This can be addressed by performing a weighted penalization using the covariance matrix of the measurement noise. Qualitatively, Figures 8 and 9 suggest that there is less variation in the closed loop trajectories with $p = 3$ than with $p = 5$. This result on the real-world quadcopter suggests that when applying DeePC to other systems, performance may be improved by discarding measurements with higher noise as long as the system is observable with the remaining measurements.
From the online computation perspective, Table 1 shows that optimization problem (8) is solved sufficiently fast for both p = 3 and p = 5, considering that output measurements are provided for real-time implementation at 25 Hz. For the case of p = 3, there were 451 optimization decision variables, 168 equality constraints, and 300 inequality constraints. As a point of reference, the optimization problem in the output-feedback MPC approach of Section 3.4 had 283 optimization decision variables, 208 equality constraints, and the same number of inequality constraints as the DeePC problem.
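The problem sizes and the real-time budget quoted above can be cross-checked with simple bookkeeping. The sketch below rests on several assumptions that are not stated in this section: the MPC keeps the states at every step and the inputs as decision variables, the DeePC problem has decision variables (g, u, y) with the softened output initial condition moved into the cost, the horizon is 25 steps (inferred from the MPC counts), and the values t_ini = 6 and n_data = 331 are hypothetical values chosen because they reproduce the quoted DeePC counts for three outputs.

```python
def mpc_problem_size(n_states, n_inputs, horizon):
    """Variables and equality constraints for a linear MPC that keeps
    x_0..x_N and u_0..u_{N-1} as decision variables (assumed structure)."""
    n_vars = n_states * (horizon + 1) + n_inputs * horizon
    n_eq = n_states * (horizon + 1)  # initial condition plus N dynamics steps
    return n_vars, n_eq


def deepc_problem_size(n_inputs, n_outputs, horizon, t_ini, n_data):
    """Variables and equality constraints for a DeePC problem in (g, u, y)
    with a hard input initial condition only (assumed structure)."""
    n_g = n_data - t_ini - horizon + 1  # number of Hankel matrix columns
    n_vars = n_g + (n_inputs + n_outputs) * horizon
    n_eq = (n_inputs + n_outputs) * horizon + n_inputs * t_ini
    return n_vars, n_eq


def budget_fraction(solve_time_ms, rate_hz):
    """Fraction of one sampling period consumed by the solver."""
    period_ms = 1000.0 / rate_hz
    return solve_time_ms / period_ms


print(mpc_problem_size(8, 3, 25))            # (283, 208)
print(deepc_problem_size(3, 3, 25, 6, 331))  # (451, 168)
print(budget_fraction(4.14, 25.0), budget_fraction(6.66, 25.0))
```

Under these assumptions the mean solve times in Table 1 use roughly one tenth to one sixth of the 40 ms sampling period.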
A video of the quadcopter successfully tracking step trajectories and a figure 8 using the DeePC algorithm can be found here: https://doi.org/10.3929/ethz-b-000493419.
3.3.1 Summary of hyperparameter selection insights
- Choose the total number of data points as per (12), that is, choose it to be larger than both the minimum amount needed for persistency of excitation in the LTI case and the minimum amount such that the Hankel matrix in (8) is square.
- Choose the time horizon used for initial condition estimation by incrementally increasing it until steady tracking is observed. This coincides with a value that both exceeds the lag of the system in the LTI case and provides good initial condition estimates in the presence of noisy measurements.
- Choose the weight on the softened initial condition constraint as large as possible without causing the optimization solver to encounter numerical issues.
- With regard to p, performance may be improved by discarding measurements with higher noise as long as the system is observable with the remaining measurements.
The selection of the regularization function, parameterized by the regularization weight and the norm q, depends on the nature of the uncertainty in the system generating the input/output data and is expected to vary from one application to another. Preliminary empirical observation suggests that 2-norm regularization (q = 2), combined with the regularization weight given in Appendix A, serves as a good initial choice. The 2-norm regularization is advantageous for real-time control because it reduces the number of decision variables in the optimization problem (8) to be solved online and hence reduces the online computation time required.
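To make the data-length guideline more concrete, the following sketch builds a block Hankel matrix from recorded samples and computes the smallest data length for which it has at least as many columns as rows. It illustrates only the "square" condition; it does not reproduce (12) and does not check persistency of excitation. The function names are hypothetical, and the general (non-square) Hankel convention is assumed.

```python
def block_hankel(samples, depth):
    """Build a block Hankel matrix as a list of rows.

    samples: list of T samples, each a list with one value per channel.
    depth:   number of block rows (for example, initial horizon plus
             prediction horizon).
    The result has depth * n_channels rows and T - depth + 1 columns.
    """
    n_cols = len(samples) - depth + 1
    if depth < 1 or n_cols < 1:
        raise ValueError("need 1 <= depth <= number of samples")
    n_channels = len(samples[0])
    return [
        [samples[i + j][c] for j in range(n_cols)]
        for i in range(depth)
        for c in range(n_channels)
    ]


def min_length_for_square(n_channels, depth):
    """Smallest T such that columns (T - depth + 1) >= rows (n_channels * depth)."""
    return (n_channels + 1) * depth - 1


# Tiny single-channel example with five samples and depth two.
H = block_hankel([[1], [2], [3], [4], [5]], 2)
print(H)  # [[1, 2, 3, 4], [2, 3, 4, 5]]
```

Here `n_channels` would count inputs and outputs together when the input and output Hankel matrices are stacked.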
3.4 Comparison with model-based control
The results in Section 3.3 show that DeePC Algorithm 1 achieves good performance for the step reference tracking task specified in Procedure 1 in a data-driven fashion. We now present a model-based point of comparison that is developed for linear systems. We take a first-principles approach that considers the linearization of the quadcopter dynamics (1) about the hover equilibrium point, and we assume that the inner controller tracks the body rates reference signal without dynamics or delays. We use a sampling time of 0.04 seconds, that is, 25 Hz, to convert the continuous-time linear model to discrete time. The resulting linear system model can be readily derived.44 Hence we consider a model-based controller with eight states and three inputs.
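As an illustration of the discretization step, the sketch below discretizes a double integrator (position and velocity driven by an acceleration input) at the 0.04 second sampling time. The double integrator is a hypothetical stand-in for a single axis and is not the eight-state model used here; a zero-order hold on the input is also an assumption, since the discretization method is not specified.

```python
def discretize_double_integrator(sample_time):
    """Exact zero-order-hold discretization of
    d(pos)/dt = vel, d(vel)/dt = accel (assumed stand-in model)."""
    ts = sample_time
    a_d = [[1.0, ts], [0.0, 1.0]]
    b_d = [0.5 * ts * ts, ts]
    return a_d, b_d


def step(state, accel, a_d, b_d):
    """Advance (pos, vel) by one sample under a constant acceleration."""
    pos, vel = state
    next_pos = a_d[0][0] * pos + a_d[0][1] * vel + b_d[0] * accel
    next_vel = a_d[1][0] * pos + a_d[1][1] * vel + b_d[1] * accel
    return next_pos, next_vel


a_d, b_d = discretize_double_integrator(0.04)
state = (0.0, 0.0)
for _ in range(25):  # 25 steps of 0.04 s correspond to 1 s
    state = step(state, 1.0, a_d, b_d)
print(state)  # close to (0.5, 1.0), matching pos = t^2 / 2 and vel = t at t = 1
```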
The model-based control method we implement is output-feedback MPC, as described in Section 2.2. Optimization problem (3) is solved in a receding horizon fashion with the dynamics function f replaced by the linear time-invariant system model described above, the cost function c given by (9), and all parameters set to the same values as used for the DeePC as given in Appendix A. The state estimate is constructed by directly taking the measurements for the measured states, and the remaining states are estimated as the discrete-time derivative of subsequent measurements. Figure 10 compares a trajectory of this first-principles MPC approach with that of the DeePC. Figure 10(A) shows the time series of the vertical position, and Figure 10(B) shows the trajectory in the horizontal plane.
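The discrete-time derivative used for the unmeasured part of the MPC state estimate can be sketched as a backward difference. The snippet assumes that the differentiated quantity is a position measurement sampled at the 0.04 second period of Section 3.4; the function name and the sample values are hypothetical.

```python
def backward_difference(measurements, sample_time=0.04):
    """Estimate a rate of change from consecutive measurements.

    Returns one estimate per pair of subsequent samples, so the result
    is one element shorter than the input.
    """
    if sample_time <= 0:
        raise ValueError("sample_time must be positive")
    return [
        (curr - prev) / sample_time
        for prev, curr in zip(measurements, measurements[1:])
    ]


# Hypothetical height measurements in meters at 25 Hz.
heights = [0.00, 0.02, 0.06, 0.12]
velocities = backward_difference(heights)
print(velocities)  # approximately 0.5, 1.0 and 1.5 m/s
```

Differencing amplifies measurement noise, so an estimate of this kind is typically only as good as the underlying position measurements.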

Figure 10(A) shows that DeePC and MPC achieve qualitatively similar tracking performance for the vertical position. Both have a similar rise time and settling time, with the most distinct feature being that the DeePC controller overshoots the reference but then settles to a smaller steady-state offset. For MPC, this offset is present because there is a mismatch between the steady-state input assumed by the model and that needed to maintain the real-world quadcopter at steady state. As the DeePC controller is provided with the same steady-state input, this indicates that the structure of the DeePC controller is able, to some extent, to correct for a mismatch in the steady-state input provided. Figure 10(B) shows a clear disparity in the tracking performance of the two controllers in the horizontal plane. Whereas the MPC follows an almost straight-line trajectory from the starting point to the target, the DeePC controller has quite different tracking behavior in the two horizontal directions, a trend also observed in Figure 9 and in our simulation-based tests. This leaves open an interesting direction for further investigation: understanding why the DeePC controller produces a faster rise time in one horizontal direction than in the other.
Overall, for the quadcopter application we see that DeePC performs similarly to MPC where a first-principles model is available. This indicates the potential for DeePC to tackle applications where either a first-principles model is not available or identifying all the necessary model parameters is not feasible.
3.4.1 Model mismatch
In all of our analysis, the off-the-shelf quadcopter is maintained at a zero yaw angle by the inner controller. At that yaw angle, the horizontal axes of the quadcopter body frame are aligned with those of the inertial frame, as is demonstrated in the top view (right) of Figure 1. Therefore, the dynamics along the two horizontal axes are decoupled from each other with respect to the corresponding body rate reference control inputs of the outer controller. In the real-world experimental setup, the yaw angle measurement zero reference point must be calibrated by carefully aligning the quadcopter body frame with the inertial frame, and some calibration error is expected. We now consider a case with a yaw calibration error that is exaggerated for the purpose of demonstration. The quadcopter body frame is rotated by this error angle around the inertial vertical axis at the yaw measurement zero reference point, leading to a misalignment between the inertial and body frames that is unknown to the controller.
To capture this yaw miscalibration in simulation, an offset between the true yaw angle and the yaw angle measurement available to the controller is induced. Figure 11 shows the simulation results of the quadcopter tracking a 1 meter step in a horizontal direction with DeePC and output-feedback MPC. With no knowledge of the coupling between the horizontal dynamics induced by the misalignment of the inertial and body frames, the output-feedback MPC controller causes the quadcopter to deviate considerably along the other horizontal axis and then spiral around the target in the horizontal plane. By contrast, the quadcopter takes a more direct path to its target under DeePC control. This suggests that the DeePC controller implicitly learns a good mapping between the body rate references and the horizontal dynamics from the data collected in the misaligned frames of reference. The mapping is not perfect: a slight spiraling effect is observed as the quadcopter approaches its target, but the improvement over the model-based approach, which equally lacks knowledge of the frame misalignment, is apparent.
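The coupling induced by a yaw miscalibration can be illustrated with a planar rotation. In the sketch below, a command that the controller believes acts along a single horizontal axis is rotated by the yaw error and therefore acquires a component along the other axis. The 20 degree angle is a hypothetical value used only for illustration, and a counterclockwise-positive rotation about a vertical axis pointing up is assumed.

```python
import math


def rotate_about_vertical(x_body, y_body, yaw_rad):
    """Express a horizontal body-frame vector in the inertial frame
    for a body frame rotated by yaw_rad about the vertical axis."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return c * x_body - s * y_body, s * x_body + c * y_body


yaw_error = math.radians(20.0)  # hypothetical miscalibration angle

# A unit command intended purely along the first horizontal axis.
x_inertial, y_inertial = rotate_about_vertical(1.0, 0.0, yaw_error)
print(x_inertial, y_inertial)
```

The second component equals the sine of the yaw error, so it vanishes only when the frames are aligned. A controller whose model assumes aligned frames has no way to anticipate this cross-axis motion.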

The yaw angle mismatch is an example of a bias error that can occur when adopting a linear model-based control approach to a nonlinear system. Such a bias error is present when the linearization is performed at an incorrect operating point. The DeePC algorithm provides some robustness to such a bias error, since it is able to adapt to unknown operating conditions of the system from the data, and also by virtue of the regularization in optimization problem (8). One can further consider a case where the yaw angle measurement calibration drifts slowly over time, and a periodic recalibration is required for a model-based control approach to perform well. Instead of recalibration, the data in the Hankel matrix in (8) can be updated online in the DeePC approach. We will explore this concept further in future work.
4 CONCLUSION
We demonstrated that the regularized DeePC algorithm is suitable for real-time control of a real-world quadcopter, thereby bridging the gap between theory and practice. In the process, we performed a sensitivity analysis on the hyperparameters of the DeePC algorithm in simulation, gaining key insights into their effects. These simulation takeaways generalized well to the real-world quadcopter system, where minimal hyperparameter refinement was performed. Through the real-world implementation, it was demonstrated that the DeePC algorithm is computationally tractable and solvable in real time, with solve times far beneath the real-time requirement. The insights from the simulation and real-world experiments were condensed into a set of hyperparameter selection guidelines expected to assist with applying the DeePC algorithm to other systems (see Section 3.3.1). Future work includes applying the DeePC algorithm to other real-world systems for which no first-principles model can be derived.
ACKNOWLEDGEMENTS
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme grant agreement OCAL, No. 787845, the Swiss National Science Foundation (SNSF) under the National Centre of Competence in Research (NCCR) Automation, and ETH Zürich. Open Access Funding provided by Eidgenössische Technische Hochschule Zürich. [Correction added on 20 May 2022, after first online publication: CSAL funding statement has been added.]
CONFLICT OF INTEREST
The authors declare no potential conflict of interest.
APPENDIX A: PARAMETERS FOR IMPLEMENTATION OF THE DEEPC ALGORITHM
- The total number of data points used to construct the Hankel matrices in (5).
- The time horizon used for initial condition estimation.
- The weight on the softened initial condition constraint.
- The weight on the regularization of g.
- The norm q used to regularize g in (10).
- The number of outputs p used to construct the Hankel matrices in (5).
- The prediction horizon, with its corresponding duration in continuous time.
- The quadratic tracking error cost matrix.
- The quadratic control effort cost matrix.
- The control input constraint set.
- The output constraint set when p = 5. Note that the constraints on the quadcopter orientation are omitted when p = 3.
- The steady-state hovering control inputs.
- The covariance matrix of the measurement noise in simulation when p = 5. Note that when p = 3 the covariance matrix is the top-left 3 × 3 block of this matrix.
APPENDIX B: FURTHER RESULTS FOR THE GRID SEARCH ANALYSIS
For completeness, we include here the results for the grid search analysis, described in Section 3.2, for all hyperparameters considered. Figure B1 bottom left is the same as shown in Section 3.2, and the other plots in Figure B1 are for the remaining hyperparameter combinations.






APPENDIX C: COMPARING SENSITIVITY TO THE REGULARIZATION PARAMETER IN SIMULATION AND EXPERIMENT
Figure C1 shows results similar to Figure 6 for comparing the closed loop trajectories (solid lines) and the predictions computed by the DeePC optimization problem (8) (dashed lines). This shows the same trend that the performance observed in the simulation-based analysis, Figure C1(A–C), is qualitatively similar to that observed in the real-world experiments, Figure C1(D–F).



Qualitatively, the best value of the regularization parameter chosen in simulation also performs best in reality and results in a similar closed-loop trajectory. A smaller value results in a faster but more oscillatory response, and a larger value results in a sluggish response. This figure demonstrates that, despite unmodeled dynamics in simulation, the real-world system behaves similarly to the simulation model when applying DeePC Algorithm 1. Consequently, the simulation-based hyperparameter selection was adopted on the real system with minimal adjustments required.
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in “paper-deepc-2019-for-ijrnl-data-gen” at http://doi.org/10.3929/ethz-b-000490768, Reference 45.
REFERENCES
- * We slightly deviate from the classical definition of a Hankel matrix, which requires it to be square, and allow general dimensions.