Volume 2011, Issue 1, Article ID 604150
Research Article
Open Access

Adaptive Wavelet Estimation of a Biased Density for Strongly Mixing Sequences

Christophe Chesneau

Corresponding Author: Christophe Chesneau, Université de Caen-Basse Normandie, Département de Mathématiques, UFR de Sciences, 14032 Caen, France
First published: 14 April 2011
Academic Editor: Palle E. Jorgensen

Abstract

The estimation of a biased density for exponentially strongly mixing sequences is investigated. We construct a new adaptive wavelet estimator based on a hard thresholding rule. We determine a sharp upper bound of the associated mean integrated square error for a wide class of functions.

1. Introduction

In the standard density estimation problem, we observe n random variables X1, …, Xn with common density function f. The goal is to estimate f from X1, …, Xn. However, in some applications, X1, …, Xn are not accessible; we only have n random variables Z1, …, Zn with the common density
g(x) = w(x)f(x)/μ,  (1.1)
where w denotes a known positive function and μ is the unknown normalization parameter: μ = ∫w(y)f(y)dy. Our goal is to estimate the “biased density” f from Z1, …, Zn. Practical examples can be found in, for example, [1–3] and the survey [4].
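To make the model concrete, the following sketch (an illustration added here, not part of the original analysis) simulates biased observations from (1.1) by rejection sampling: a draw X from f is kept with probability w(X)/sup w, so the retained values have density wf/μ. The weight w(y) = 1 + y, the Beta(2, 4) choice for f, and the helper names are illustrative assumptions only.

```python
import numpy as np
from scipy import stats, integrate

rng = np.random.default_rng(0)

# Illustrative choices only: f is the Beta(2, 4) density on [0, 1] and
# w(y) = 1 + y, so c <= w(y) <= C holds with c = 1 and C = 2.
f = stats.beta(2, 4)

def w(y):
    return 1.0 + y

def sample_biased(n, w_max=2.0):
    """Draw Z_1, ..., Z_n with density g(z) = w(z) f(z) / mu by rejection:
    draw X ~ f and keep it with probability w(X) / sup w."""
    kept = []
    while len(kept) < n:
        x = f.rvs(size=2 * n, random_state=rng)
        u = rng.uniform(size=2 * n)
        kept.extend(x[u < w(x) / w_max].tolist())
    return np.array(kept[:n])

Z = sample_biased(1000)
mu, _ = integrate.quad(lambda y: w(y) * f.pdf(y), 0.0, 1.0)  # mu = int w(y) f(y) dy
```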

The standard i.i.d. case has been investigated in several papers. See, for example, [5–9]. To the best of our knowledge, the dependent case has only been examined in [10] for (positively or negatively) associated Z1, …, Zn. In this paper, we study another dependent (and realistic) structure which has not been addressed earlier: we suppose that Z1, …, Zn is a sample of a strictly stationary and exponentially strongly mixing process (Zi)i∈ℤ (to be defined in Section 2). Such a dependence condition arises for a wide class of GARCH-type time series models classically encountered in finance. See, for example, [11, 12] for an overview.

We focus our attention on wavelet methods because they provide a coherent set of procedures that are spatially adaptive and near optimal over a wide range of function spaces. See, for example, [13, 14] for a detailed coverage of wavelet theory in statistics. We develop two new wavelet estimators: a nonadaptive linear estimator based on projections and an adaptive nonlinear estimator based on the hard thresholding rule introduced in [15]. We measure their performances by determining upper bounds of the mean integrated squared error (MISE) over Besov balls (to be defined in Section 3). We prove that our adaptive estimator attains a sharp rate of convergence, close to the one attained by the linear wavelet estimator (constructed in a nonadaptive fashion to minimize the MISE).

The rest of the paper is organized as follows. Section 2 is devoted to the assumptions on the model. In Section 3, we present wavelets and Besov balls. The considered wavelet estimators are defined in Section 4. Section 5 is devoted to the results. The proofs are postponed to Section 6.

2. Assumptions on the Model

We assume that Z1, …, Zn come from a strictly stationary process (Zi)i∈ℤ. For any m, we define the mth strongly mixing coefficient of (Zi)i∈ℤ by
am = sup_{A ∈ ℱ−∞,0, B ∈ ℱm,∞} |ℙ(A ∩ B) − ℙ(A)ℙ(B)|,  (2.1)
where, for any u ∈ ℤ, ℱ−∞,u is the σ-algebra generated by the random variables …, Zu−1, Zu and ℱu,∞ is the σ-algebra generated by the random variables Zu, Zu+1, ….
We consider the exponentially strongly mixing case, that is, there exist three known constants, γ > 0, c > 0, and θ > 0, such that, for any m,
am ≤ γ exp(−c|m|^θ).  (2.2)
This assumption is satisfied by a large class of GARCH processes. See, for example, [11, 12, 16, 17].
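As a complement, here is one simple way (a sketch with illustrative choices, not a construction from the paper) to generate a strictly stationary sequence Z1, …, Zn that satisfies (2.2) and has a prescribed marginal density: run a Gaussian AR(1) chain, which is geometrically strongly mixing for |ρ| < 1 (so (2.2) holds with θ = 1), and push it through the probability integral transform; applying a fixed measurable map coordinatewise cannot increase the mixing coefficients. The value ρ = 0.5, the Beta(2, 2) marginal, and the function name mixing_sample are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def mixing_sample(n, marginal_ppf, rho=0.5):
    """Strictly stationary sequence with the marginal given by marginal_ppf.
    A Gaussian AR(1) chain (exponentially strongly mixing for |rho| < 1) is
    mapped to Uniform(0, 1) by its stationary CDF, then to the target marginal."""
    x = np.empty(n)
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - rho ** 2))  # stationary start
    eps = rng.normal(size=n)
    for i in range(1, n):
        x[i] = rho * x[i - 1] + eps[i]
    u = stats.norm.cdf(x * np.sqrt(1.0 - rho ** 2))  # Uniform(0, 1) marginals
    return marginal_ppf(u)

# Illustrative target marginal on [0, 1]: a Beta(2, 2) distribution.
Z = mixing_sample(2000, stats.beta(2, 2).ppf)
```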

Note that, when θ → ∞, we recover the standard i.i.d. case.

Without loss of generality, we assume that the supports of f and w are [0,1].

There exist two constants, c > 0 and C > 0, such that
c ≤ w(x) ≤ C,  x ∈ [0,1].  (2.3)
There exists a (known) constant C > 0 such that
sup_{x∈[0,1]} f(x) ≤ C.  (2.4)
For any m ∈ {1, …, n}, let g(Z0,Zm) denote the density of (Z0, Zm). There exists a constant C > 0 such that
sup_{(x,y)∈[0,1]²} g(Z0,Zm)(x,y) ≤ C.  (2.5)

The first two boundedness assumptions are standard in the estimation of biased densities. See, for example, [6–8].

3. Wavelets and Besov Balls

Let N be a positive integer, and let ϕ and ψ be the initial wavelets of the Daubechies family dbN (so that supp(ϕ) = supp(ψ) = [1 − N, N]). Set
ϕj,k(x) = 2^{j/2} ϕ(2^j x − k),  ψj,k(x) = 2^{j/2} ψ(2^j x − k).  (3.1)
With an appropriate treatment at the boundaries, there exists an integer τ satisfying 2^τ ≥ 2N such that the collection ℬ = {ϕτ,k(·), k ∈ {0, …, 2^τ − 1}; ψj,k(·), j ∈ ℕ − {0, …, τ − 1}, k ∈ {0, …, 2^j − 1}} is an orthonormal basis of 𝕃2([0,1]) (the space of square-integrable functions on [0,1]). See [18].
For any integer ℓ ≥ τ, any h ∈ 𝕃2([0,1]) can be expanded on ℬ as
h(x) = Σ_{k=0}^{2^ℓ−1} αℓ,k ϕℓ,k(x) + Σ_{j≥ℓ} Σ_{k=0}^{2^j−1} βj,k ψj,k(x),  (3.2)
where αj,k and βj,k are the wavelet coefficients of h defined by
αj,k = ∫₀¹ h(x)ϕj,k(x) dx,  βj,k = ∫₀¹ h(x)ψj,k(x) dx.  (3.3)
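For numerical experiments, the quantities in (3.1) and (3.3) can be approximated as sketched below. This is an illustration with simplifying assumptions: PyWavelets' tabulated db4 scaling and wavelet functions stand in for ϕ and ψ, no boundary correction is applied (so values near 0 and 1 are only rough), and the helper names phi_jk, psi_jk, and wavelet_coeff are hypothetical.

```python
import numpy as np
import pywt

# Tabulate the db4 scaling function phi and wavelet psi on a fine grid.
# (Illustration: the paper allows any Daubechies wavelet dbN.)
_phi, _psi, _x = pywt.Wavelet('db4').wavefun(level=12)

def phi_jk(x, j, k):
    """phi_{j,k}(x) = 2^{j/2} phi(2^j x - k), by interpolating the tabulated
    scaling function (zero outside its support)."""
    return 2.0 ** (j / 2) * np.interp(2.0 ** j * np.asarray(x) - k,
                                      _x, _phi, left=0.0, right=0.0)

def psi_jk(x, j, k):
    """psi_{j,k}(x) = 2^{j/2} psi(2^j x - k), same construction with psi."""
    return 2.0 ** (j / 2) * np.interp(2.0 ** j * np.asarray(x) - k,
                                      _x, _psi, left=0.0, right=0.0)

def wavelet_coeff(h, basis_jk, j, k, m=4096):
    """Midpoint-rule approximation of int_0^1 h(x) basis_{j,k}(x) dx, i.e. of
    the coefficients alpha_{j,k} or beta_{j,k} in (3.3)."""
    t = (np.arange(m) + 0.5) / m
    return np.mean(h(t) * basis_jk(t, j, k))
```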
Let M > 0, s > 0, p ≥ 1, and r ≥ 1. A function h belongs to the Besov ball B^s_{p,r}(M) if and only if there exists a constant M* > 0 (depending on M) such that the associated wavelet coefficients (3.3) satisfy
(3.4)
In this expression, s is a smoothness parameter and p and r are norm parameters. For particular choices of s, p, and r, B^s_{p,r}(M) contains classical sets of functions such as the Hölder and Sobolev balls. See [19].
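Two standard embeddings between Besov balls, used in the proofs of Section 6 below, are recalled here for convenience (M′ denotes a possibly different radius):

```latex
\[
  p \ge 2:\quad B^{s}_{p,r}(M) \subseteq B^{s}_{2,\infty}(M'),
  \qquad
  p \in [1,2):\quad B^{s}_{p,r}(M) \subseteq B^{\,s+1/2-1/p}_{2,\infty}(M').
\]
```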

4. Estimators

Firstly, we consider the following estimator for μ:
(4.1)
It is obtained by the method of moments (see Proposition 6.2 below).
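Indeed, 𝔼(1/w(Z1)) = ∫(1/w(y))g(y) dy = (1/μ)∫f(y) dy = 1/μ (this is the computation behind Proposition 6.2(1)), which suggests inverting the empirical mean of the 1/w(Zi). The sketch below implements this harmonic-mean-type estimator; the function name mu_hat is hypothetical, and the formula is only assumed to coincide with (4.1).

```python
import numpy as np

def mu_hat(Z, w):
    """Moment estimator of mu, assumed to match (4.1): since
    E(1/w(Z_1)) = 1/mu, invert the empirical mean of 1/w(Z_i)."""
    Z = np.asarray(Z)
    return len(Z) / np.sum(1.0 / w(Z))
```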
Then, for any integer j ≥ τ and any k ∈ {0, …, 2^j − 1}, we estimate the unknown wavelet coefficients
  • (i)

    αj,k = ∫₀¹ f(x)ϕj,k(x) dx by

    (4.2)

  • (ii)

    βj,k = ∫₀¹ f(x)ψj,k(x) dx by

    (4.3)

Note that they are those considered in the i.i.d. case (see, e.g., [8, 9]). Their statistical properties, with our dependent structure, are investigated in Propositions 6.2, 6.3, and 6.4 below.
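A plausible reading of (4.2) and (4.3), consistent with the weighted empirical coefficients used in the i.i.d. biased-density literature [8, 9] and with the identity 𝔼(ϕj,k(Z1)/w(Z1)) = αj,k/μ, is α̂j,k = (μ̂/n)Σi ϕj,k(Zi)/w(Zi) and β̂j,k = (μ̂/n)Σi ψj,k(Zi)/w(Zi). The sketch below implements this assumed form; coeff_hat is a hypothetical name, and phi_jk, psi_jk, and mu_hat are the helpers from the earlier sketches.

```python
import numpy as np

def coeff_hat(Z, w, basis_jk, j, k, mu_est):
    """Assumed form of the estimators (4.2)/(4.3):
    (mu_hat / n) * sum_i basis_{j,k}(Z_i) / w(Z_i)."""
    Z = np.asarray(Z)
    return mu_est / len(Z) * np.sum(basis_jk(Z, j, k) / w(Z))

# Example use (names from the previous sketches):
# mu_est = mu_hat(Z, w)
# a_hat = coeff_hat(Z, w, phi_jk, j, k, mu_est)   # estimate of alpha_{j,k}
# b_hat = coeff_hat(Z, w, psi_jk, j, k, mu_est)   # estimate of beta_{j,k}
```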

Assuming that f ∈ B^s_{p,r}(M) with p ≥ 2, we define the linear estimator f̂L by
(4.4)
where α̂j,k is defined by (4.2) and j0 is the integer satisfying
(4.5)

For a survey on wavelet linear estimators for various density models, we refer the reader to [20]. For the consideration of strongly mixing sequences, see, for example, [21, 22].
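Assuming that (4.4) is the projection estimator f̂L(x) = Σk α̂j0,k ϕj0,k(x) and that (4.5) chooses 2^{j0} of the order n^{1/(2s+1)} (the bias-variance balancing level compatible with the rate in Theorem 5.1), a sketch is given below; the level depends on s, which is why f̂L is not adaptive. The function name f_hat_linear is hypothetical, and phi_jk, coeff_hat, and mu_est come from the earlier sketches.

```python
import numpy as np

def f_hat_linear(x, Z, w, s, phi_jk, coeff_hat, mu_est):
    """Projection (linear wavelet) estimator at level j0 with
    2^{j0} ~ n^{1/(2s+1)} -- an assumed reading of (4.4)-(4.5)."""
    n = len(Z)
    j0 = max(0, int(np.ceil(np.log2(n) / (2.0 * s + 1.0))))
    est = np.zeros_like(np.asarray(x, dtype=float))
    for k in range(2 ** j0):
        est += coeff_hat(Z, w, phi_jk, j0, k, mu_est) * phi_jk(x, j0, k)
    return est
```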

We define the hard thresholding estimator f̂H by
(4.6)
x ∈ [0,1], where α̂j,k is defined by (4.2) and β̂j,k by (4.3), for any random event 𝒜, 𝕀𝒜 is the indicator function of 𝒜, j1 is the integer satisfying
(4.7)
θ is the one in (2.2), κ is a large enough constant (the one in Proposition 6.4 below), and λn is the threshold
λn = ((ln n)^{1+1/θ}/n)^{1/2}.  (4.8)
The feature of the hard thresholding estimator is to estimate only the “large” unknown wavelet coefficients of f, which contain its main characteristics.

For the construction of hard thresholding wavelet estimators in the standard density model, see, for example, [15, 23].
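Under the same assumptions, with λn = ((ln n)^{1+1/θ}/n)^{1/2} as in (4.8) and the guess that (4.7) takes 2^{j1} of the order n/(ln n)^{1+1/θ}, the hard thresholding estimator can be sketched as follows; τ, κ, and θ are supplied by the user, the name f_hat_hard is hypothetical, and the helpers are those of the earlier sketches.

```python
import numpy as np

def f_hat_hard(x, Z, w, theta, kappa, tau, phi_jk, psi_jk, coeff_hat, mu_est):
    """Hard thresholding wavelet estimator: approximation part at level tau
    plus all detail terms whose estimated coefficients exceed kappa*lambda_n.
    lambda_n follows (4.8); the level j1 is an assumed reading of (4.7)."""
    n = len(Z)
    lam = np.sqrt(np.log(n) ** (1.0 + 1.0 / theta) / n)
    j1 = int(np.log2(n / np.log(n) ** (1.0 + 1.0 / theta)))
    est = np.zeros_like(np.asarray(x, dtype=float))
    for k in range(2 ** tau):                       # approximation part
        est += coeff_hat(Z, w, phi_jk, tau, k, mu_est) * phi_jk(x, tau, k)
    for j in range(tau, j1 + 1):                    # thresholded detail part
        for k in range(2 ** j):
            b = coeff_hat(Z, w, psi_jk, j, k, mu_est)
            if abs(b) >= kappa * lam:
                est += b * psi_jk(x, j, k)
    return est
```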

5. Results

Theorem 5.1 (upper bound for f̂L). Consider (1.1) under the assumptions of Section 2. Suppose that f ∈ B^s_{p,r}(M) with s > 0, p ≥ 2, and r ≥ 1. Let f̂L be (4.4). Then there exists a constant C > 0 such that

(5.1)

The proof of Theorem 5.1 uses a suitable decomposition of the MISE and a moment inequality on (4.2) (see Proposition 6.3 below).

Note that n^{−2s/(2s+1)} is the optimal rate of convergence (in the minimax sense) for the standard density model in the independent case (see, e.g., [14, 23]).

Theorem 5.2 (upper bound for f̂H). Consider (1.1) under the assumptions of Section 2. Let f̂H be (4.6). Suppose that f ∈ B^s_{p,r}(M) with r ≥ 1, {p ≥ 2 and s > 0} or {p ∈ [1,2) and s > 1/p}. Then there exists a constant C > 0 such that

(5.2)

The proof of Theorem 5.2 uses a suitable decomposition of the MISE, some moment inequalities on (4.2) and (4.3) (see Proposition 6.3 below), and a concentration inequality on (4.3) (see Proposition 6.4 below).

Theorem 5.2 shows that, besides being adaptive, f̂H attains a rate of convergence close to that of f̂L. The only difference is the logarithmic term (ln n)^{(1+1/θ)(2s/(2s+1))}.

Note that, if we restrict our study to the independent case, that is, θ → ∞, the rate of convergence attained by f̂H becomes the standard one: ((ln n)/n)^{2s/(2s+1)}. See, for example, [14, 15, 23].
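As a concrete reading of these remarks, here is a worked example (the values s = 1 and θ = 1 are chosen purely for illustration) of the bounds indicated by Theorems 5.1 and 5.2:

```latex
% Rates read off from the remarks following Theorems 5.1 and 5.2, for s = 1:
\[
  \mathbb{E}\int_0^1\bigl(\widehat f_L(x)-f(x)\bigr)^2dx
  \le C\,n^{-2s/(2s+1)} = C\,n^{-2/3},
\]
\[
  \mathbb{E}\int_0^1\bigl(\widehat f_H(x)-f(x)\bigr)^2dx
  \le C\Bigl(\frac{(\ln n)^{1+1/\theta}}{n}\Bigr)^{2s/(2s+1)}
  = C\Bigl(\frac{(\ln n)^{2}}{n}\Bigr)^{2/3}
  \quad(\theta = 1).
\]
```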

6. Proofs

In this section, we consider (1.1) under the assumptions of Section 2. Moreover, C denotes any constant that does not depend on j, k, or n. Its value may change from one term to another and may depend on ϕ or ψ.

6.1. Auxiliary Results

Lemma 6.1. For any integer j ≥ τ and any k ∈ {0, …, 2^j − 1}, let α̂j,k be (4.2) and αj,k be the wavelet coefficient (3.3) of f. Then, under the assumptions of Section 2, there exists a constant C > 0 such that

(6.1)
This inequality holds for ψ instead of ϕ (and, a fortiori, β̂j,k defined by (4.3) instead of α̂j,k and βj,k instead of αj,k).

Proof of Lemma 6.1. We have

(6.2)
Due to (2.3), we have and . Therefore
(6.3)
Using (2.4) and the Cauchy-Schwarz inequality, we obtain
(6.4)
Hence
(6.5)
Lemma 6.1 is proved.

Proposition 6.2. For any integer j ≥ τ such that 2^j ≤ n and any k ∈ {0, …, 2^j − 1}, let α̂j,k and μ̂ be (4.2) and (4.1), respectively. Then,

  • (1)

    one has

    (6.6)

  • (2)

    there exists a constant C > 0 such that

    (6.7)

  • (3)

    there exists a constant C > 0 such that

    (6.8)

These results hold for ψ instead of ϕ (and, a fortiori, βj,k instead of αj,k).

Proof of Proposition 6.2. (1) We have

(6.9)
Since f is a density, we obtain
(6.10)

(2) We have

(6.11)
Using (2.3) and (2.4), we have sup_{x∈[0,1]} g(x) ≤ C. Hence,
(6.12)
It follows from the stationarity of (Zi)i∈ℤ and 2^j ≤ n that
(6.13)
where
(6.14)
Let us now bound T1 and T2.

Upper Bound for T1 Using (2.5) and (2.3) and making the change of variables y = 2^j x − k, we obtain

(6.15)
Therefore,
(6.16)

Upper Bound for T2 By the Davydov inequality for strongly mixing processes (see [24]), for any q ∈ (0,1), it holds that

(6.17)
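For the reader's convenience, a standard form of the Davydov inequality (see [24]) is the following: for some absolute constant K, any q ∈ (0,1), and random variables X and Y with finite moments of order 2/(1 − q), measurable with respect to ℱ−∞,0 and ℱm,∞ respectively,

```latex
\[
  |\operatorname{Cov}(X, Y)|
  \le K\, a_m^{\,q}\,
  \bigl(\mathbb{E}|X|^{2/(1-q)}\bigr)^{(1-q)/2}
  \bigl(\mathbb{E}|Y|^{2/(1-q)}\bigr)^{(1-q)/2}.
\]
```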
By (2.3), we have
(6.18)
and, by (6.12),
(6.19)
Therefore,
(6.20)
Since , we have
(6.21)
It follows from (6.13), (6.16), and (6.21) that
(6.22)
Combining (6.11), (6.12), and (6.22), we obtain
(6.23)

(3) Proceeding in a similar fashion to (2), we obtain

(6.24)
Using (2.3) (which implies sup_{x∈[0,1]}(1/w(x)) ≤ C) and applying the Davydov inequality, we obtain
(6.25)
The proof of Proposition 6.2 is complete.

Proposition 6.3. For any integer j ≥ τ such that 2^j ≤ n and any k ∈ {0, …, 2^j − 1}, let αj,k be the wavelet coefficient (3.3) of f and α̂j,k be (4.2). Then,

  • (1)

    there exists a constant C > 0 such that

    (6.26)

  • (2)

    there exists a constant C > 0 such that

    (6.27)

These inequalities hold for β̂j,k defined by (4.3) instead of α̂j,k, and βj,k instead of αj,k.

Proof of Proposition 6.3. (1) Applying Lemma 6.1 and Proposition 6.2, we have

(6.28)

(2) We have

(6.29)
By (2.3), we have 1/μ ≤ C and sup_{x∈[0,1]}(1/w(x)) ≤ C. So,
(6.30)
By (6.4), we have |αj,k | ≤ C. Therefore
(6.31)
It follows from (6.31) and (6.28) that
(6.32)
The proof of Proposition 6.3 is complete.

Proposition 6.4. For any j ∈ {τ, …, j1} and any k ∈ {0, …, 2^j − 1}, let βj,k and β̂j,k be (3.3) and (4.3), respectively, and λn be (4.8). Then there exist two constants, κ > 0 and C > 0, such that

(6.33)

Proof of Proposition 6.4. It follows from Lemma 6.1 that

(6.34)
where
(6.35)
In order to bound P1 and P2, let us present a Bernstein inequality for exponentially strongly mixing processes. We refer to [25, 26].

Lemma 6.5 (see [25, 26]). Let γ > 0, c > 0, θ > 1, and let (Zi)i∈ℤ be a stationary process such that, for any m, the associated mth strongly mixing coefficient (2.2) satisfies am ≤ γ exp(−c|m|^θ). Let n ∈ ℕ*, let h : ℝ → ℝ be a measurable function and, for any i, set Ui = h(Zi). One assumes that 𝔼(U1) = 0 and that there exists a constant M > 0 satisfying |U1| ≤ M < ∞. Then, for any m ∈ {1, …, n} and any λ > 4mM/n, one has

(6.36)

Upper Bound for P1 For any i ∈ {1, …, n}, set

(6.37)
Then U1, …, Un are identically distributed and depend on the stationary strongly mixing process (Zi)i∈ℤ, which satisfies (2.2). Proposition 6.2 gives
(6.38)
and, by (2.3) and (6.4),
(6.39)
It follows from Lemma 6.5, applied with U1, …, Un, λ = κCλn, λn = ((ln n)^{1+1/θ}/n)^{1/2}, m = (u ln n)^{1/θ} with u > 0 (chosen later), and M = C 2^{j/2}, that
(6.40)

Therefore, for large enough κ and u, we have

(6.41)

Upper Bound for P2 For any i ∈ {1, …, n}, set

(6.42)
Then U1, …, Un are identically distributed and depend on the stationary strongly mixing process (Zi)i∈ℤ, which satisfies (2.2). Proposition 6.2 gives
(6.43)
By (2.3), we have
(6.44)
It follows from Lemma 6.5, applied with U1, …, Un, λ = κCλn, λn = ((ln n)^{1+1/θ}/n)^{1/2}, m = (u ln n)^{1/θ} with u > 0 (chosen later), and M = C, that
(6.45)

Therefore, for large enough κ and u, we have

(6.46)
Putting (6.34), (6.41), and (6.46) together ends the proof of Proposition 6.4.

6.2. Proofs of the Main Results

Proof of Theorem 5.1. We expand the function f on ℬ as

(6.47)
where αj0,k = ∫₀¹ f(x)ϕj0,k(x) dx and βj,k = ∫₀¹ f(x)ψj,k(x) dx.

We have, for any x ∈ [0,1],

(6.48)
Since ℬ is an orthonormal basis of 𝕃2([0,1]), we have
(6.49)
Using Proposition 6.3, we obtain
(6.50)
Since p ≥ 2, we have B^s_{p,r}(M) ⊆ B^s_{2,∞}(M). Hence
(6.51)
Therefore,
(6.52)
The proof of Theorem 5.1 is complete.

Proof of Theorem 5.2. We expand the function f on ℬ as

(6.53)
where ατ,k = ∫₀¹ f(x)ϕτ,k(x) dx and βj,k = ∫₀¹ f(x)ψj,k(x) dx.

We have, for any x ∈ [0,1],

(6.54)
Since ℬ is an orthonormal basis of 𝕃2([0,1]), we have
(6.55)
where
(6.56)
Let us bound R, T, and S, in turn.

Upper Bound for R Using Proposition 6.3 and 2s/(2s + 1) < 1, we obtain

(6.57)

Upper Bound for T For r ≥ 1 and p ≥ 2, we have B^s_{p,r}(M) ⊆ B^s_{2,∞}(M). Since 2s/(2s + 1) < 2s, we have

(6.58)
For r ≥ 1 and p ∈ [1,2), we have B^s_{p,r}(M) ⊆ B^{s+1/2−1/p}_{2,∞}(M). Since s > 1/p, we have s + 1/2 − 1/p > s/(2s + 1). So
(6.59)
Hence, for r ≥ 1, {p ≥ 2 and s > 0} or {p ∈ [1,2) and s > 1/p}, we have
(6.60)

Upper Bound for S Note that we can write the term S as

(6.61)
where
(6.62)
Let us investigate the bounds of S1, S2, S3, and S4 in turn.

Upper Bounds for S1 and S3 We have

(6.63)
So,
(6.64)
It follows from the Cauchy-Schwarz inequality and Propositions 6.3 and 6.4 that
(6.65)
Since 2s/(2s + 1) < 1, we have
(6.66)

Upper Bound for S2 Using again Proposition 6.3, we obtain

(6.67)
Hence,
(6.68)
Let j2 be the integer defined by
(6.69)
We have
(6.70)
where
(6.71)
We have
(6.72)
For r ≥ 1 and p ≥ 2, since B^s_{p,r}(M) ⊆ B^s_{2,∞}(M),
(6.73)
For r ≥ 1, p ∈ [1,2), and s > 1/p, using B^s_{p,r}(M) ⊆ B^{s+1/2−1/p}_{2,∞}(M) and (2s + 1)(2 − p)/2 + (s + 1/2 − 1/p)p = 2s, we have
(6.74)
So, for r ≥ 1, {p ≥ 2 and s > 0} or {p ∈ [1,2) and s > 1/p}, we have
(6.75)

Upper Bound for S4 We have

(6.76)
Let j2 be the integer defined by (6.69). Then
(6.77)
where
(6.78)
We have
(6.79)
For r ≥ 1 and p ≥ 2, since B^s_{p,r}(M) ⊆ B^s_{2,∞}(M), we have
(6.80)
For r ≥ 1, p ∈ [1,2), and s > 1/p, using B^s_{p,r}(M) ⊆ B^{s+1/2−1/p}_{2,∞}(M) and (2s + 1)(2 − p)/2 + (s + 1/2 − 1/p)p = 2s, we have
(6.81)
So, for r ≥ 1, {p ≥ 2 and s > 0} or {p ∈ [1,2) and s > 1/p}, we have
(6.82)
It follows from (6.61), (6.66), (6.75), and (6.82) that
(6.83)

Combining (6.55), (6.57), (6.60), and (6.83), we have, for r ≥ 1, {p ≥ 2 and s > 0} or {p ∈ [1,2) and s > 1/p},

(6.84)
The proof of Theorem 5.2 is complete.

Acknowledgment

This work was supported by the ANR grant NatImages (ANR-08-EMER-009).
