Estimation of the Derivatives of a Function in a Convolution Regression Model with Random Design
Abstract
A convolution regression model with random design is considered. We investigate the estimation of the derivatives of an unknown function, an element of the convolution product. We introduce new estimators based on wavelet methods and provide theoretical guarantees on their performance.
1. Introduction
The motivation of this problem is the deconvolution of a signal f from f⋆g perturbed by noise and randomly observed. The function g can represent a driving force that was applied to a physical system. Such situations naturally appear in various applied areas, such as astronomy, optics, seismology, and biology. Model (1) can also be viewed as a natural extension of some 1-periodic convolution regression models, such as those considered by Cavalier and Tsybakov [1], Pensky and Sapatinas [2], and Loubes and Marteau [3]. In the form (1), it has been considered in Bissantz and Birke [4] and Birke et al. [5] with a deterministic design and in Hildebrandt et al. [6] with a random design. These last works focus on kernel methods and establish their asymptotic normality. The estimation of f(m), which generalizes that of f = f(0), is of interest for examining possible bumps and for studying the convexity-concavity properties of f (see, for instance, Prakasa Rao [7], for standard statistical models).
In this paper, we introduce new estimators for f(m) based on wavelet methods. Through the use of a multiresolution analysis, these methods enjoy local adaptivity against discontinuities and provide efficient estimators for a wide variety of unknown functions f(m). Basics on wavelet estimation can be found in, for example, Antoniadis [8], Härdle et al. [9], and Vidakovic [10]. Results on the wavelet estimation of f(m) in other regression frameworks can be found in, for example, Cai [11], Petsa and Sapatinas [12], and Chesneau [13].
The first part of the study is devoted to the case where h, the common density of X1, …, Xn, is known. We develop a linear wavelet estimator and an adaptive nonlinear wavelet estimator. The second one uses the double hard thresholding technique introduced by Delyon and Juditsky [14]. Its construction does not depend on the smoothness of f(m); it is adaptive. We exhibit their rates of convergence via the mean integrated squared error (MISE) under the assumption that f(m) belongs to Besov balls. The obtained rates of convergence coincide with existing results for the estimation of f(m) in the 1-periodic convolution regression models (see, for instance, Chesneau [15]).
The second part is devoted to the case where h is unknown. We construct a new linear wavelet estimator using a plug-in approach for the estimation of h. Its construction follows the idea of the “NES linear wavelet estimator” introduced by Pensky and Vidakovic [16] in another regression context. Then we investigate its MISE properties when f(m) belongs to Besov balls, which naturally depend on the MISE of the considered estimator for h. Furthermore, let us mention that all our results are proved with only moments of order 2 on ξ1, which provides another theoretical contribution to the subject.
The remaining part of this paper is organized as follows. In Section 2 we describe some basics on wavelets and Besov balls and present our wavelet estimation methodology. Section 3 is devoted to our estimators and their performances. The proofs are carried out in Section 4.
2. Preliminaries
This section is devoted to the presentation of the considered wavelet basis, the Besov balls, and our wavelet estimation methodology.
2.1. Wavelet Basis
2.2. Besov Balls
The interest of Besov balls is that they contain various kinds of homogeneous and inhomogeneous functions u. See, for example, Meyer [20], Donoho et al. [21], and Härdle et al. [9].
2.3. Wavelet Estimation
Let f be the unknown function in (1), and let the considered wavelet basis be taken with N > 5m (to ensure that ϕ and ψ belong to the class ). Suppose that f(m) exists with .
The second step is the estimation of and using (Y1, X1), …, (Yn, Xn). The idea of the third step is to exploit the sparse representation of f(m) by selecting the most interesting wavelet coefficients estimators. This selection can be of different natures (truncation, thresholding,…). Finally, we reconstruct these wavelet coefficients estimators on , providing an estimator for f(m).
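As an illustration of this estimate-select-reconstruct scheme, here is a minimal numerical sketch with a Haar basis, a uniform random design (so that the design density is identically 1), and hard thresholding of the empirical wavelet coefficients. The toy regression function, the threshold, and the resolution level are our choices for illustration, not the paper's.

```python
import numpy as np

# Haar scaling function and mother wavelet on [0, 1)
def haar_phi(x):
    return ((x >= 0) & (x < 1)).astype(float)

def haar_psi(x):
    return ((x >= 0) & (x < 0.5)).astype(float) - ((x >= 0.5) & (x < 1)).astype(float)

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, n)            # uniform random design, so h(x) = 1
f = lambda x: np.sin(2 * np.pi * x)     # toy regression function (ours)
Y = f(X) + 0.3 * rng.standard_normal(n)

grid = np.linspace(0.0, 1.0, 512, endpoint=False)
j1 = 4                                   # finest resolution level (illustrative)
lam = np.sqrt(np.log(n) / n)             # illustrative universal-style threshold

# Step 2: empirical scaling coefficient; E[Y * phi(X)] equals the true
# coefficient here because h = 1 on [0, 1)
a0 = np.mean(Y * haar_phi(X))
fhat = a0 * haar_phi(grid)

# Steps 2-4: estimate, hard threshold (selection), and reconstruct
for j in range(j1):
    for k in range(2 ** j):
        psi_jk = lambda x: 2 ** (j / 2) * haar_psi(2 ** j * x - k)
        b = np.mean(Y * psi_jk(X))       # empirical wavelet coefficient
        if abs(b) > lam:                 # keep only the large coefficients
            fhat += b * psi_jk(grid)

mse = np.mean((fhat - f(grid)) ** 2)
```

The selection step discards the small coefficients, which carry mostly noise; this is what makes thresholded estimators locally adaptive to discontinuities.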
3. Rates of Convergence
In this section, we list the assumptions on the model, present our wavelet estimators, and determine their rates of convergence under the MISE over Besov balls.
3.1. Assumptions
Let us recall that f and g are the functions in (1) and h is the density of X1.
- (K1)
We have f(q)(a) = f(q)(b) = 0 for any q ∈ {0, …, m}, , and there exists a known constant C1 > 0 such that supx∈[a,b] | f(x)| ≤ C1.
- (K2)
First of all, let us define the Fourier transform of an integrable function u by
() -
The notation will be used for the complex conjugate.
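Since the displayed definition is used throughout the proofs together with the Parseval identity, we record a standard convention consistent with that use; the exact normalization stated here is our assumption:

```latex
\mathcal{F}(u)(x) = \int_{-\infty}^{+\infty} u(y)\, e^{-ixy}\, \mathrm{d}y,
\qquad x \in \mathbb{R},
```

with $\overline{z}$ denoting the complex conjugate of $z$.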
-
We have and there exist two constants, c1 > 0 and δ ≥ 0, such that
() - (K3)
There exists a constant c2 > 0 such that
()
3.2. When h Is Known
3.2.1. Linear Wavelet Estimator
Proposition 1 presents an elementary property of .
Theorem 2 below investigates the performance of in terms of rates of convergence under the MISE over Besov balls.
Theorem 2. Suppose that (K1)–(K3) are satisfied and that with M > 0, p ≥ 1, r ≥ 1, s ∈ (max(1/p − 1/2,0), N), and N > 5(m + δ + 1). Let be defined by (14) with j0 such that
where s* = s + min(1/2 − 1/p, 0) ([a] denotes the integer part of a).
Then there exists a constant C > 0 such that
Note that the rate of convergence corresponds to the one obtained in the estimation of f(m) in the 1-periodic white noise convolution model with an adapted linear wavelet estimator (see, e.g., Chesneau [15]).
The considered estimator depends on s (the smoothness parameter of f(m)); it is not adaptive. This aspect, as well as the rate of convergence , can be improved with thresholding methods. The next paragraph is devoted to one of them: the hard thresholding method.
3.2.2. Hard Thresholding Wavelet Estimator
The construction of uses the double hard thresholding technique introduced by Delyon and Juditsky [14] and recently improved by Chaubey et al. [25]. The main interest of the thresholding using λj is to make adaptive; the construction (and performance) of does not depend on the knowledge of the smoothness of f(m). The role of the thresholding using ςj in (20) is to relax some usual restrictions on the model. To be more specific, it enables us to suppose only that ξ1 admits finite moments of order 2 (with known, or a known upper bound of ), relaxing the standard assumption , for any .
Further details on the constructions of hard thresholding wavelet estimators can be found in, for example, Donoho and Johnstone [26, 27], Donoho et al. [21, 28], Delyon and Juditsky [14], and Härdle et al. [9].
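The double thresholding idea can be sketched as follows: each term of the empirical coefficient is first truncated at a level ςj (which is what allows working with only second moments of the noise), and the averaged coefficient is then hard thresholded at a level λj. The function below is a generic illustration; the constants and the exact truncation levels differ from the paper's.

```python
import numpy as np

def thresholded_coefficient(Y, psi_jk_X, h_X, kappa=1.0):
    """Sketch of a doubly thresholded wavelet coefficient estimator.

    Each term Y_i * psi_jk(X_i) / h(X_i) is truncated at an illustrative
    level 'varsigma' (the Delyon-Juditsky device, requiring only finite
    second moments of the noise); the resulting average is then hard
    thresholded at an illustrative level 'lam'.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.size
    terms = Y * psi_jk_X / h_X
    varsigma = 2.0 * np.sqrt(n / np.log(n))       # term-level truncation (ours)
    terms = np.where(np.abs(terms) <= varsigma, terms, 0.0)
    b = terms.mean()
    lam = np.sqrt(np.log(n) / n)                  # coefficient threshold (ours)
    return b if abs(b) >= kappa * lam else 0.0
```

A large underlying coefficient survives both thresholds, while a coefficient built from pure noise is set to zero, which is the source of adaptivity.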
Theorem 3 below investigates the performance of in terms of rates of convergence under the MISE over Besov balls.
Theorem 3. Suppose that (K1)–(K3) are satisfied and that with M > 0, r ≥ 1, {p ≥ 2, s ∈ (0, N)} or {p ∈ [1,2), s ∈ ((2m + 2δ + 1)/p, N)}, and N > 5(m + δ + 1). Let be defined by (19). Then there exists a constant C > 0 such that
The proof of Theorem 3 is an application of a general result established by [25, Theorem 6.1]. Let us mention that (ln n/n)^{2s/(2s+2m+2δ+1)} corresponds to the rate of convergence obtained in the estimation of f(m) in the 1-periodic white noise convolution model with an adapted hard thresholding wavelet estimator (see, e.g., Chesneau [15]). In the case m = 0 and δ = 0, this rate of convergence becomes the optimal one in the minimax sense for standard density and regression estimation problems (see Härdle et al. [9]).
- (i)
for the case p ≥ 2, corresponding to the homogeneous zone of Besov balls, (ln n/n)^{2s/(2s+2m+2δ+1)} is equal to the rate of convergence attained by up to a logarithmic term,
- (ii)
for the case p ∈ [1,2), corresponding to the inhomogeneous zone of Besov balls, it is significantly better in terms of power.
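Assuming the linear estimator attains the usual rate n^{-2s_*/(2s_*+2m+2δ+1)} with s_* = s + min(1/2 − 1/p, 0), as in Theorem 2, the comparison of exponents can be summarized as:

```latex
\underbrace{n^{-2s_*/(2s_*+2m+2\delta+1)}}_{\text{linear}}
\quad \text{versus} \quad
\underbrace{\left(\frac{\ln n}{n}\right)^{2s/(2s+2m+2\delta+1)}}_{\text{hard thresholding}}.
```

For p ≥ 2 one has s_* = s and the two exponents coincide; for p ∈ [1,2) one has s_* < s, so the thresholding exponent is strictly larger.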
3.3. When h Is Unknown
There are numerous possibilities for the choice of . For instance, can be a kernel density estimator or a wavelet density estimator (see, e.g., Donoho et al. [21], Härdle et al. [9], and Juditsky and Lambert-Lacroix [29]).
The estimator is derived from the “NES linear wavelet estimator” introduced by Pensky and Vidakovic [16] and recently revisited in a simpler form by Chesneau [13].
Theorem 4 below determines an upper bound of the MISE of .
Theorem 4. Suppose that (K1)–(K3) are satisfied, , and that with M > 0, p ≥ 1, r ≥ 1, s ∈ (max(1/p − 1/2,0), N), and N > 5(m + δ + 1). Let be defined by (24) with j2 such that . Then there exists a constant C > 0 such that
The proof follows the idea of [13, Theorem 3] and uses technical operations on Fourier transforms.
- (i)
- (ii)
if and h satisfy that there exist υ ∈ [0,1] and a constant C > 0 such that
()then, the optimal integer j2 is such that and we obtain the following rate of convergence for :()
- (i)
The relaxation of the assumption (K2), perhaps by considering (K2′): there exist four constants, C1 > 0, , η > 0, and δ ≥ 0, such that
() -
This condition was first introduced by Delaigle and Meister [30] in the context of deconvolution estimation of a function. It implies (K2) and has the advantage of covering some functions g whose Fourier transform has zeros, such as numerous kinds of compactly supported functions.
- (ii)
The construction of an adaptive version of through the use of a thresholding method.
- (iii)
The extension of our results to the risk with p ≥ 1.
4. Proofs
In this section, C denotes any constant that does not depend on j, k, or n. Its value may change from one term to another and may depend on ϕ or ψ.
Proof of Proposition 1. By the independence of X1 and ξ1, , supp(f⋆g) = supp(h) = [a*, b*], and , we have
It follows from (K1) and m integrations by parts that . Using the Fubini theorem, , (30), and the Parseval identity, we obtain
Proposition 1 is proved.
Proof of Theorem 2. We expand the function f(m) on as (8) at the level . Since forms an orthonormal basis of , we get
The Parseval identity yields
Combining (33), (34), and (35), we have
Proof of Theorem 3. For γ ∈ {ϕ, ψ}, any integer j ≥ τ, and k ∈ Λj,
- (a1)
using arguments similar to those in Proposition 1, we obtain
() - (a2)
using (33), (34), and (35) with γ instead of ϕ, we have
()with .
Proof of Theorem 4. We expand the function f(m) on as (8) at the level . Since forms an orthonormal basis of , we get
Upper Bound for S2. Proceeding as in (37), we get
Upper Bound for S1. The triangle inequality gives
Therefore, using and , we obtain
Combining (44), (45), and (62), we obtain the desired result; that is,
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors are thankful to the reviewers for their comments, which have helped improve this work.
Appendix
Let us now present in detail the general result of [25, Theorem 6.1] used in the proof of Theorem 3.
- (i)
n functions q1, …, qn with for any i ∈ {1, …, n},
- (ii)
two sequences of real numbers and satisfying υn → ∞ and μn → ∞ as n → ∞,
-
(A1) any integer j ≥ τ and any k ∈ Λj,
() -
(A2) there exist two constants, θγ > 0 and σ ≥ 0, such that, for any integer j ≥ τ and any k ∈ Λj,
()