We propose a new robust hypothesis test for (possibly non-linear) constraints on M-estimators with possibly non-differentiable estimating functions. The proposed test employs a random normalizing matrix computed from recursive M-estimators to eliminate the nuisance parameters arising from the asymptotic covariance matrix. It does not require consistent estimation of any nuisance parameters, in contrast with the conventional heteroscedasticity-autocorrelation consistent (HAC)-type test and the Kiefer–Vogelsang–Bunzel (KVB)-type test. Our test reduces to the KVB-type test in simple location models with ordinary least-squares estimation, so the error in the rejection probability of our test in a Gaussian location model is $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0001$ . We discuss robust testing in quantile regression and censored regression models in detail. In simulation studies, we find that our test has better size control and better finite sample power than the HAC-type and KVB-type tests.

1. INTRODUCTION

Conventional hypothesis testing rests on consistent estimation of the asymptotic covariance matrix. In time series econometrics, the non-parametric kernel estimator originating from the spectral estimation of Priestley (1981) is a leading example; see also Newey and West (1987, 1994), Andrews (1991) and den Haan and Levin (1997) for econometric contributions. This estimator, which is also known as a heteroscedasticity-autocorrelation consistent (HAC) estimator, leads to asymptotic chi-squared tests that are robust to heteroscedasticity and serial correlations of unknown form, but the testing results can vary with the choices of the kernel function and its bandwidth.

In view of this, Kiefer et al. (2000, KVB hereafter) propose to replace the HAC estimator with a random normalizing matrix to avoid the selection of the bandwidth in the non-parametric kernel estimation in linear regression models. This approach is extended to robust testing in non-linear regression and generalized method of moments (GMM) models; see Bunzel et al. (2001) and Vogelsang (2003) for more details. As for specification testing, Lobato (2001) develops a portmanteau test for serial correlations, and Kuan and Lee (2006) propose general M-tests of moment conditions that are robust not only in the KVB sense but also in the presence of an estimation effect. For robust overidentifying restrictions (OIR) tests, see Sun and Kim (2012) and Lee et al. (2014).

As we will see later, to test for (possibly non-linear) constraints on the class of M-estimators of Huber (1967), a consistent estimator for the derivative of the expectation of the estimating function is needed. When the estimating function is differentiable with respect to the parameter vector, a consistent estimator for this is simply the sample average of the derivative of the estimating function. However, when the estimating function is not differentiable, the estimation is less straightforward; a leading example is the quantile regression (QR) estimator of Koenker and Bassett (1978). Although an explicit form of the derivative of the expectation of the estimating function is available in this case, the expression involves the conditional density of the model errors. Therefore, a consistent estimator for this matrix involves non-parametric kernel estimation of the conditional density. As such, a user-chosen bandwidth is needed and the performance of the HAC-type and KVB-type tests can be sensitive to this choice. Alternatively, one may appeal to the bootstrap method to circumvent consistent estimation of any nuisance matrix (see, e.g., Buchinsky, 1995, and Fitzenberger, 1997), but the resulting tests are computationally demanding. Moreover, tests based on the moving blocks bootstrap (MBB), as suggested by Fitzenberger (1997), can be sensitive to the selection of block length and the number of bootstrap samples. Subsampling may also be applied, but it suffers from a problem similar to MBB.

In this paper, we propose a new robust hypothesis test for possibly non-linear constraints on M-estimators with possibly non-differentiable estimating functions. The proposed test employs a normalizing matrix computed from recursive M-estimators to eliminate the nuisance parameters in the limit and hence does not require consistent estimation of any nuisance parameters, in contrast with the HAC-type and KVB-type tests. This feature makes the proposed test appealing because consistent estimators for such nuisance parameters may not only be difficult to obtain but also sensitive to user-chosen parameters, which can in turn lead to poor finite sample performance of the test. The null limit of the proposed test is shown to be the same as that of Lobato (2001) and hence the asymptotic critical values are already available. Moreover, we show that the proposed test reduces to the KVB-type test in simple location models with ordinary least-squares (OLS) estimation, so the error in rejection probability of the proposed test in a Gaussian location model is $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0002$ , in contrast with the conventional tests for which the error in rejection probability is typically no better than $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0003$ . We consider robust testing in QR and censored regression models in detail. In simulation studies, we find that our test can have better size control and better finite sample power than the HAC-type and KVB-type tests.

Our method is also known as the self-normalization method in the statistics literature. Note that, while Shao (2010) constructs confidence intervals for the parameters that are functionals of the marginal or joint distribution of stationary time series (e.g. mean or normalized spectral means), we consider hypothesis testing on parameters defined in a more general class of econometric models that include these parameters as special cases. Shao (2012) later constructs confidence intervals for the parameters in stationary time series models based on the frequency domain maximum likelihood estimator that can be applied to a large class of long/short memory time series models with weakly dependent innovations; see also Zhou and Shao (2013) and Huang et al. (2015) for inference on parameters in different contexts.

This paper proceeds as follows. In Section 2, we introduce M-estimation and related asymptotic results. Then, in Section 3, we present the proposed test as well as the HAC-type and KVB-type tests. As examples, we discuss robust testing in QR and censored regression models in Section 4. We report simulation results in Section 5, and we conclude in Section 6. All proofs are deferred to the Appendix.

2. M-ESTIMATION AND ASYMPTOTIC RESULTS

Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0004$ be a sequence of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0005$ random vectors and let θ be a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0006$ unknown parameter vector with the true value $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0007$ . In the context of M-estimation, an M-estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0008$ for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0009$ can be defined as the one satisfying the following estimating equation,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0010$ (2.1)

where T is the sample size and ϕ is a measurable and separable $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0011$ -valued estimating function with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0012$ . Clearly, the class of M-estimators includes many estimators as special cases. For example, consider the linear regression model: $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0013$ . The OLS estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0014$ satisfies $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0015$ , the asymmetric least-squares estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0016$ of Newey and Powell (1987) and Kuan et al. (2009) for the τth conditional expectile of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0017$ satisfies $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0018$ , where I is the indicator function, and the QR estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0019$ for the τth conditional quantile of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0020$ is such that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0021$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0022$ ; see Fitzenberger (1997) for more details. The symmetrically trimmed least-squares estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0023$ proposed by Powell (1986b) is also an M-estimator because, for truncated data, it satisfies

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0024$

See also Section 4.2 for the censored regression models. In what follows, [c] denotes the integer part of the real number c, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0025$ denotes the sup norm for a vector, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0026$ denotes convergence in probability, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0027$ denotes convergence in distribution, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0028$ denotes equality in distribution and ⇒ denotes weak convergence of associated probability measures. We also let W_q be a vector of q independent, standard Wiener processes and we let B_q be the Brownian bridge with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0029$ for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0030$ .

2.1. Asymptotic normality of M-estimators

Define

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0031$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0032$ ; for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0033$ , we simply write $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0034$ , which is the sample average of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0035$ . To derive the limiting distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0036$ , we impose the following ‘high-level’ assumptions that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0037$ is $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0038$ consistent and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0039$ obeys a multivariate functional central limit theorem (FCLT).

Assumption 2.1. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0040$ .

Assumption 2.2. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0041$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0042$ and S is the matrix square root of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0043$ (i.e. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0044$ ).

Assumption 2.1 holds under various sets of primitive regularity conditions. These conditions typically constrain some stochastic properties (such as memory, heterogeneity and moment restrictions) of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0045$ and impose some smoothness and domination conditions on ϕ so that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0046$ obeys a weak uniform law of large numbers, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0047$ is continuous in θ, and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0048$ is identifiably unique. Discussions of the consistency of M-estimators can be found, for example, in Huber (1981, Chapter 6), Tauchen (1985, pp. 422–24) and White (1994, pp. 33–35). Assumption 2.2 is more than enough to establish the asymptotic normality of M-estimators but is required to derive the weak limit of recursive M-estimators. Note that the conditions that ensure a multivariate FCLT are sufficient for Assumption 2.2; see, e.g., Corollary 4.2 of Wooldridge and White (1988) or Theorem 7.30 of White (2001). By Assumption 2.2, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0049$ .

When ϕ is twice continuously differentiable with respect to θ and the corresponding second derivative is bounded in probability, the limiting distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0050$ can be easily derived using the first-order Taylor expansion. By expanding the estimating equation (2.1) about $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0051$ , we immediately have under Assumption 2.1 that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0052$

As long as a law of large numbers can be applied to the sample averages of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0053$ , a Bahadur representation for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0054$ can then be given by

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0055$ (2.2)

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0056$ , a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0057$ non-singular matrix of constants. Under Assumption 2.2, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0058$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0059$ .

Because many M-estimators, such as LAD and QR estimators, rely on non-differentiable ϕ, it is not possible to expand (2.1) around $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0060$ using the technique above, but a Bahadur representation as in (2.2) is still available for such M-estimators. To see this, define

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0061$

and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0062$ ; for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0063$ , we simply write $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0064$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0065$ . We now impose the following two assumptions; see also Weiss (1991) for primitive conditions.

Assumption 2.3. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0066$ is twice continuously differentiable with a bounded second derivative and satisfies the following property: $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0067$ .

Assumption 2.4. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0068$ uniformly in $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0069$ and M_o is non-singular.

Given these two assumptions and by a first-order Taylor expansion of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0070$ around $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0071$ , we obtain an expression as in (2.2), and hence the asymptotic normality for such M-estimators follows immediately from Assumption 2.2.

2.2. Weak convergence of recursive M-estimators

Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0072$ be the recursive M-estimator using the subsample of first j observations; that is, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0073$ is the one such that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0074$

To derive its weak limit, we modify Assumptions 2.1 and 2.3 as follows.

Assumption 2.1′. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0075$ , uniformly in $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0076$ .

Assumption 2.3′. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0077$ is twice continuously differentiable with a bounded second derivative and satisfies $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0078$ , uniformly in $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0079$ .

These two assumptions should not be surprising, as the recursive M-estimators and the full-sample M-estimator ought to behave similarly. It can be shown that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0080$

which implies that under Assumption 2.2,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0081$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0082$ . By replacing $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0083$ with the full-sample M-estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0084$ , we can apply the continuous mapping theorem to obtain that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0085$ (2.3)

This result has been established in the literature on testing parameter constancy for linear mean and median regressions; see, e.g. Ploberger et al. (1989), Kuan and Hornik (1995) and Chen and Kuan (2001). Here, we show that it remains valid for recursive M-estimators. This result is needed to construct robust tests for parameter restrictions.
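As an illustration, the recursive estimators and the scaled deviations from the full-sample estimate can be computed as in the following minimal sketch for the OLS case. The function names `recursive_ols` and `scaled_deviations`, the default starting subsample size, and the scaling j/√T applied to the deviations are assumptions made for this example; the scaling follows the familiar form of such results and is not taken verbatim from (2.3).

```python
import numpy as np


def recursive_ols(y, X, start=None):
    """Return subsample sizes j and the OLS estimates based on the first j observations.

    `start` is the smallest subsample size used; defaulting to the number of
    regressors p is an assumption made for this sketch.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    T, p = X.shape
    start = p if start is None else start
    js = np.arange(start, T + 1)
    # One least-squares fit per expanding subsample.
    est = np.vstack([np.linalg.lstsq(X[:j], y[:j], rcond=None)[0] for j in js])
    return js, est


def scaled_deviations(js, est):
    """Scaled deviations (j / sqrt(T)) * (theta_j - theta_T); the last row is zero."""
    T = js[-1]
    return (js[:, None] / np.sqrt(T)) * (est - est[-1])
```

The last row of `est` is the full-sample estimate, so the final scaled deviation is zero by construction, which mirrors the tied-down (Brownian bridge) behaviour of the limit.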

3. TESTS FOR GENERAL HYPOTHESES

The null hypothesis of interest consists of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0086$ possibly non-linear restrictions on $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0087$ ,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0088$ (3.1)

where γ is a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0089$ vector of continuously differentiable functions with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0090$ that is of full row rank in a neighbourhood of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0091$ . Under the assumptions above and using a delta method, we can show that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0092$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0093$ .

3.1. The HAC-type test

To test the hypothesis defined in (3.1), the test statistic for a HAC-type test is defined as $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0094$ , in which $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0095$ is the HAC estimator for the asymptotic covariance matrix of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0096$ such that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0097$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0098$ is a consistent estimator of M_o and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0099$ is a non-parametric kernel estimator of V given by

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0100$

where κ is a kernel function and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0101$ denotes the truncation lag or bandwidth. As $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0102$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0103$ , and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0104$ is consistent for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0105$ , the limiting null distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0106$ is a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0107$ distribution. In this paper, a test based on the test statistic $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0108$ will be called the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0109$ test.
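To make the role of the kernel and the bandwidth concrete, a kernel estimator of this type can be sketched as follows. The Bartlett kernel is an assumption made for this example (the text leaves κ generic), as are the names `bartlett_hac`, `phi` and `bandwidth`; `phi` is taken to be the T × p array of estimating functions evaluated at the full-sample estimate.

```python
import numpy as np


def bartlett_hac(phi, bandwidth):
    """Kernel (HAC) estimate of the long-run variance of the rows of `phi`.

    Uses Bartlett weights 1 - k / bandwidth for lags k < bandwidth; this kernel
    choice is an assumption of the sketch, not prescribed by the text.
    """
    phi = np.asarray(phi, dtype=float)
    if phi.ndim == 1:
        phi = phi[:, None]
    T = phi.shape[0]
    V = phi.T @ phi / T  # lag-0 autocovariance
    for k in range(1, T):
        w = 1.0 - k / bandwidth
        if w <= 0:
            break  # Bartlett weights vanish beyond the bandwidth
        G = phi[k:].T @ phi[:-k] / T  # lag-k autocovariance
        V += w * (G + G.T)
    return V
```

Different values of `bandwidth` can give noticeably different estimates from the same data, which is the sensitivity that motivates the alternatives discussed in this section.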

Although the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0110$ test has a standard null limit and is robust to heteroscedasticity and serial correlations of unknown form, its finite sample performance will depend on the choices of the kernel function and its bandwidth. Implementing the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0111$ test also requires a consistent estimator for M_o. If ϕ is continuously differentiable, then

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0112$

and the sample average of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0113$ is a natural consistent estimator for M_o. However, when ϕ is not differentiable, the estimation is less straightforward. For example, in QR models, M_o involves the conditional density of the error terms. Although kernel-based estimators for M_o are available (see Weiss, 1991), the resulting $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0114$ test would suffer from the problems arising from non-parametric kernel estimation. For more details and more examples, see Section 4. In addition, if the constraints are non-linear, we will need to estimate $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0115$ by $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0116$ . Even if it is straightforward to obtain $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0117$ , the finite sample performance of the tests may be adversely affected by this estimator.

3.2. The KVB-type test

The main idea underlying the KVB approach is to employ a random normalizing matrix in place of the kernel-based covariance matrix estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0118$ to avoid the problems arising from non-parametric kernel estimation. Following this approach, we construct $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0119$ as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0120$

Clearly, it reduces to that of KVB when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0121$ with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0122$ . Given $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0123$ , a KVB-type test statistic is defined as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0124$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0125$ with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0126$ .

It is easy to derive the weak limit of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0127$ (and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0128$ ) when ϕ is smooth (see Kuan and Lee, 2006), but it is less straightforward when ϕ is non-smooth. To allow for non-smooth ϕ, we impose the following assumption, which is similar to condition (v) of Theorem 7.2 in Newey and McFadden (1994).

Assumption 3.1. Define

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0129$

Then, there exists a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0130$ such that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0131$ .

This assumption should not be restrictive because sufficient conditions for Assumption 2.3′ are also sufficient for Assumption 3.1; see, e.g. Huber (1967, p. 227) and Weiss (1991, p. 62).

Lemma 3.1. Suppose that Assumptions 2.1, 2.2, 2.3′, 2.4 and 3.1 hold. Then $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0132$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0133$ and S is the matrix square root of V.

As shown in Lemma 3.1, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0134$ remains random in the limit so the null distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0135$ will not be a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0136$ distribution. However, the following theorem shows that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0137$ is asymptotically pivotal under the null and its weak limit is the same as that of Lobato (2001). Therefore, the corresponding critical values can be found in Lobato (2001, p. 1067).

Theorem 3.1. Suppose that Assumptions 2.1, 2.2, 2.3′, 2.4 and 3.1 hold. Then under the null,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0138$
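Although the critical values are tabulated in Lobato (2001), they can also be approximated by simulation. The sketch below assumes that the limit has the familiar self-normalized form W_q(1)′[∫₀¹ B_q(r)B_q(r)′ dr]⁻¹ W_q(1), with B_q(r) = W_q(r) − rW_q(1); the function name `simulate_lobato_limit`, the grid size and the number of replications are hypothetical choices.

```python
import numpy as np


def simulate_lobato_limit(q=1, n_steps=500, n_rep=2000, seed=0):
    """Draw from W(1)' [int_0^1 B(r) B(r)' dr]^{-1} W(1) using a discretized Wiener process."""
    rng = np.random.default_rng(seed)
    r = np.arange(1, n_steps + 1) / n_steps
    draws = np.empty(n_rep)
    for i in range(n_rep):
        # Scaled partial sums of Gaussian increments approximate W_q on [0, 1].
        increments = rng.standard_normal((n_steps, q)) / np.sqrt(n_steps)
        W = np.cumsum(increments, axis=0)
        B = W - r[:, None] * W[-1]   # Brownian bridge
        P = B.T @ B / n_steps        # Riemann sum for the integral of B B'
        draws[i] = W[-1] @ np.linalg.solve(P, W[-1])
    return draws
```

An approximate critical value at level α is then `np.quantile(simulate_lobato_limit(q), 1 - alpha)`; its accuracy depends on `n_steps` and `n_rep`.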

Although the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0139$ test does not require consistent estimation of V, it still requires consistent estimation of M_o and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0140$ . In particular, as we have discussed previously, it can be hard to obtain M_o and there might be an additional smoothing parameter to pick when ϕ is not differentiable. In view of this, in the next subsection, we propose a new way to construct robust tests without consistent estimation of M_o and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0141$ .

3.3. The proposed test

We propose to use the following normalizing matrix,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0142$ (3.2)

which is computed using only recursive estimators, and define the test statistic of our test as $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0143$ . Note that we do not need to estimate M_o and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0144$ to obtain $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0145$ , and this makes our test appealing. Under the assumptions above, we can show that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0146$ (3.3)

where Ξ is the matrix square root of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0147$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0148$ is the asymptotic variance of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0149$ . It follows that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0150$ (3.4)

Then, (3.3) and (3.4) are sufficient to derive the null distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0151$ , which is summarized in the following theorem.

Theorem 3.2. Suppose that Assumptions 2.1′, 2.2, 2.3′ and 2.4 hold. Then, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0152$ . Moreover,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0153$

under the null hypothesis that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0154$ .

Remark 3.1. In (3.2), the summation starts with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0155$ where p is the number of unknown parameters. When the criterion function is not smooth, it may not be easy to compute the M-estimators. In addition, if the sample size is small, the optimization might not be stable and the global minimization or maximization may not be easy to guarantee. Therefore, one might want to start with a larger subsample size to avoid these problems. In fact, the theory would still hold as long as the summation starts with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0156$ and the sequence $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0157$ satisfies $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0158$ such that if $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0159$ , then

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0160$

as in Theorem 3.2.
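A self-normalized statistic of this kind can be sketched as follows. The normalizing matrix is assumed here to take the form T⁻² Σ_j j² [γ(θ̂_j) − γ(θ̂_T)][γ(θ̂_j) − γ(θ̂_T)]′, with the statistic T γ(θ̂_T)′ Ĉ⁻¹ γ(θ̂_T); this is the standard self-normalization form and may differ in detail from (3.2). The names `self_normalized_wald`, `js`, `est` and `gamma` are hypothetical, and the first entry of `js` plays the role of the starting subsample size discussed in Remark 3.1.

```python
import numpy as np


def self_normalized_wald(js, est, gamma):
    """Self-normalized statistic built only from recursive estimates.

    js    : increasing subsample sizes, the last one being the full sample size T
    est   : array whose rows are the recursive estimates for the sizes in js
    gamma : constraint function mapping a parameter vector to a 1-D array
    """
    js = np.asarray(js, dtype=float)
    T = js[-1]
    g = np.vstack([np.atleast_1d(gamma(theta)) for theta in est])
    dev = js[:, None] * (g - g[-1])   # j * (gamma_j - gamma_T)
    C = dev.T @ dev / T**2            # normalizing matrix (assumed form)
    g_T = g[-1]
    return float(T * g_T @ np.linalg.solve(C, g_T))
```

No estimate of M_o, of the derivative of γ, or of the long-run variance enters the computation; only repeated estimation on expanding subsamples is needed.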

Remark 3.2. To shed more light on the differences and similarities between $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0161$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0162$ , we consider the linear model $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0163$ and the null hypothesis (3.1) with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0164$ . Based on the OLS estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0165$ , the KVB normalizing matrix can be expressed as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0166$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0167$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0168$ . As for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0169$ , we first show that for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0170$ ,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0171$

Then the proposed normalizing matrix becomes

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0172$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0173$ . The result above shows that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0174$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0175$ employ different but similar normalizing matrices in the context of linear regression with OLS estimation. Of particular interest is that in a simple location model (i.e. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0176$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0177$ for all t), these two tests are algebraically equivalent because $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0178$ in this case. Therefore, the error in rejection probability of the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0179$ test is also $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0180$ in a Gaussian location model, as shown in Jansson (2004).
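The equivalence in the location model can be checked numerically. The sketch below assumes that the KVB-type normalizer is built from partial sums of the OLS residuals and that the proposed normalizer is built from the recursive sample means, both scaled by T⁻²; these forms and the function name `normalizers_location` are assumptions made for illustration.

```python
import numpy as np


def normalizers_location(y):
    """Return (KVB-type, recursive-estimator) normalizers in the model y_t = theta + e_t."""
    y = np.asarray(y, dtype=float)
    T = y.size
    j = np.arange(1, T + 1)
    resid = y - y.mean()                      # OLS residuals
    kvb = np.sum(np.cumsum(resid) ** 2) / T**2
    rec = np.cumsum(y) / j                    # recursive sample means
    proposed = np.sum((j * (rec - rec[-1])) ** 2) / T**2
    return kvb, proposed
```

The two values coincide because j(ȳ_j − ȳ_T) = Σ_{t≤j}(y_t − ȳ_T), that is, the scaled deviation of the recursive mean equals the partial sum of the residuals.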

3.4. Asymptotic local power

In this section, we compare the local powers of the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0181$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0182$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0183$ tests under local alternatives defined as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0184$ (3.5)

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0185$ is a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0186$ vector of non-zero constants. Under the local alternatives, the limiting distributions of these tests are summarized in the following theorem. Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0187$ .

Theorem 3.3. Suppose that Assumptions 2.1′, 2.2, 2.3′, 2.4 and 3.1 hold. Then under the local alternatives defined in (3.5), $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0188$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0189$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0190$ .

Given Theorem 3.3, we can now derive the asymptotic local power of these tests. Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0191$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0192$ be the critical values at the α significance level taken from $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0193$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0194$ , respectively. Also, let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0195$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0196$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0197$ be the asymptotic local powers of the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0198$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0199$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0200$ tests, respectively.

Corollary 3.1. Suppose that Assumptions 2.1′, 2.2, 2.3′, 2.4 and 3.1 hold. Then, under the local alternatives, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0201$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0202$ .

It is obvious that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0203$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0204$ have the same asymptotic local power because their limiting distributions are identical. The local power curves of these tests are the same as those in Figure 1 of Kuan and Lee (2006), so we omit them here. As we can see from Figure 1 of Kuan and Lee (2006), $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0205$ has better local power than the other two tests. However, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0206$ may not outperform the other two tests in finite samples because its performance would depend on user-chosen parameters as well.

**Figure 1.** Size-adjusted powers of the OLS-based tests.

4. EXAMPLES

In this section, we illustrate the application of the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0207$ test in QR and in censored regression models. QR is a leading example for a non-differentiable ϕ. In the second example, whether the corresponding ϕ is differentiable or not and whether consistent estimation of M_o is easy or not depend on the estimation method used.

4.1. QR models

In the context of QR, the τth conditional quantile of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0208$ given a vector of explanatory variables x_t is typically specified as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0209$

where F is the conditional distribution function of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0210$ given x_t, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0211$ is a $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0212$ vector of unknown parameters with the true value $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0213$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0214$ is a real-valued function that is continuously differentiable in the neighbourhood of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0215$ . Similar to the mean regression model, the QR model can also be written as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0216$ (4.1)

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0217$ is the error term with the τth conditional quantile being zero under the true value $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0218$ , a condition ensuring the correct specification of model (4.1). Note that model (4.1) with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0219$ is also known as a median regression model.

To estimate the unknown parameter vector $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0220$ , Koenker and Bassett (1978) propose the QR estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0221$ that minimizes the following criterion function:

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0222$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0223$ is known as a check function. An analytic form for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0224$ is generally not available; nonetheless, many algorithms for obtaining $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0225$ have been proposed in the literature; see, e.g. Koenker and d'Orey (1987) and Koenker and Park (1996). Note also that the LAD estimator corresponds to the QR estimator with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0226$ .
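To make the role of the check function concrete, the following minimal sketch treats the simplest case of a constant-only model, where minimizing the average check loss recovers a sample quantile. It assumes the standard Koenker and Bassett form ρ_τ(u) = u(τ − 1{u < 0}); the function names and the toy data are ours and purely illustrative.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0}) (assumed standard form)."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def objective(q, ys, tau):
    """Average check loss of the residuals y - q for a constant fit q."""
    return sum(check_loss(y - q, tau) for y in ys) / len(ys)


def quantile_by_check_loss(ys, tau):
    """Minimize the objective over the data points.

    The objective is piecewise linear and convex in q, so a minimizer
    can always be found among the observed values.
    """
    return min(sorted(ys), key=lambda q: objective(q, ys, tau))


ys = [2.0, -1.0, 0.5, 3.0, 1.0]              # toy data (illustrative only)
median_hat = quantile_by_check_loss(ys, 0.5)  # tau = 0.5 gives the LAD fit
q70_hat = quantile_by_check_loss(ys, 0.7)
```

With τ = 0.5 the loss is half the absolute deviation, so the minimizer is the sample median; this is the constant-only analogue of the LAD estimator mentioned above. The kinks of the objective at the data points are what make the estimating function non-differentiable.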

Note that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0227$ is not differentiable, so the limiting distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0228$ is relatively difficult to derive. Nonetheless, under suitable conditions, the asymptotic normality result remains valid for the QR estimator (or LAD estimator); see Powell (1986a), Weiss (1991) and Fitzenberger (1997), among many others. Consider the linear quantile regression (i.e. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0229$ ). The linear QR estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0230$ is an M-estimator satisfying $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0231$ . For more general non-linear models, we can follow Weiss (1991) and Fitzenberger (1997), and show that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0232$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0233$ ; see also Koenker (2005, p. 124). Therefore, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0234$ is an M-estimator with a non-differentiable ϕ.

To implement HAC-type and KVB-type tests in QR models, it is necessary to estimate M_o consistently, where M_o is given by

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0235$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0236$ is the conditional density of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0237$ . $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0238$ can be estimated by a non-parametric kernel method, but the test performance will depend on the choices of kernel and bandwidth. To circumvent consistent estimation of M_o, one may appeal to MBB or subsampling, but the testing results can be sensitive to the selection of block length instead. Therefore, the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0239$ test seems to be practically useful for testing in QR models because it avoids consistent estimation of M_o and is thus free from user-chosen parameters.

4.2. Censored regression models

Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0240$ be the dependent variable generated according to $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0241$ . If we only observe data points $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0242$ with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0243$ , then $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0244$ forms a censored regression model and the OLS estimator obtained by regressing $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0245$ on x_t is not consistent for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0246$ ; see Amemiya (1984) for an early review. Most studies rely on maximum likelihood (ML) estimation. Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0247$ and let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0248$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0249$ be the corresponding conditional distribution and density functions. Under the assumption of independence, the log likelihood function can then be written as $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0250$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0251$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0252$ and

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0253$

The Gaussian ML estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0254$ is an M-estimator that solves $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0255$ .
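The inconsistency of OLS under censoring, noted at the start of this subsection, can be illustrated with a small simulation. The sketch below is ours and rests on assumptions not taken from the paper: censoring from below at zero, a single standard normal regressor, standard normal errors and a true slope of one.

```python
import random


def ols_slope(xs, ys):
    """Slope coefficient from a simple OLS regression of y on x with an intercept."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
n = 5000
true_slope = 1.0                                   # hypothetical true parameter
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_latent = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]
y_obs = [max(y, 0.0) for y in y_latent]            # assumed censoring at zero

slope_latent = ols_slope(xs, y_latent)     # uses the uncensored variable
slope_censored = ols_slope(xs, y_obs)      # uses the censored variable
```

In this design the slope estimated from the censored variable is attenuated well below the true value, whereas the regression on the latent variable is not; this is the motivation for the ML, censored LAD and symmetrically trimmed estimators discussed in this subsection.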

The Gaussian ML estimator is not robust to conditional heteroscedasticity and non-normality, however. Powell (1984) proposes a censored LAD estimator instead, which satisfies 2.1 with non-differentiable $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0256$ . Although such an estimator is robust to conditional heteroscedasticity and non-normality, hypothesis tests based on this estimator are not easy to implement because

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0257$ (4.2)

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0258$ is the true conditional density of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0259$ . Therefore, the censored LAD-based test would suffer from the same problems as in QR models.

When the conditional distribution of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0260$ is symmetric about $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0261$ , the symmetrically trimmed least-squares estimator introduced by Powell (1986b) is also consistent for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0262$ and robust to conditional heteroscedasticity and non-normality. Moreover, tests based on this estimator are easier to implement. To see this, from equation (2.9) in Powell (1986b), this estimator is an M-estimator with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0263$ . Although ϕ is non-differentiable, the corresponding M_o can be expressed as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0264$

for which consistent estimation is straightforward.

The censored regression model can be applied not only to cross-sectional data but also to time series data. For time series data, the assumption of serial independence may not be appropriate, but Robinson (1982) shows that the Gaussian ML estimator remains consistent and asymptotically normal, albeit with a complicated asymptotic covariance matrix. Therefore, the test developed in this paper can also be useful in this setting.

5. MONTE CARLO SIMULATIONS

In this section, the finite sample performance of the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0265$ test is evaluated via Monte Carlo simulations. We consider three sample sizes ( $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0266$ , 100 and 500) and two different nominal sizes (5% and 10%). The number of replications is 5,000 for size simulations and 1,000 for power simulations. Because the results for different nominal sizes are qualitatively similar, we report only the results for 5% nominal size.
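As a rough guide to the simulation noise in the reported rejection frequencies, the following worked calculation gives the binomial standard error of an empirical rejection rate. It assumes independent replications and is an illustration of ours, not a computation reported in the paper; R denotes the number of replications and p the true rejection probability.

```latex
\[
  \operatorname{se}(\hat p) = \sqrt{\frac{p(1-p)}{R}}, \qquad
  \sqrt{\frac{0.05 \times 0.95}{5000}} \approx 0.0031, \qquad
  \sqrt{\frac{0.5 \times 0.5}{1000}} \approx 0.0158 .
\]
```

Thus an empirical size computed from 5,000 replications at the 5% level has a standard error of about 0.3 percentage points, and a power estimate from 1,000 replications has a standard error of at most about 1.6 percentage points.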

As in KVB, we consider the linear regression specification, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0267$ , where x_t is a 5 × 1 vector of regressors, with the first element equal to 1 for all t and where other elements are mutually independent AR(1) processes. The last four elements are generated in the same way as the AR(1)-HOMO errors introduced below. In the same simulation design, the correlation coefficients and the distributions of the regressors are the same as those of the error terms. θ is an unknown parameter vector with the true value $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0268$ , and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0269$ is an error term. The null hypothesis is given by

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0270$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0271$ is the ith element of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0272$ and q denotes the number of restrictions on $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0273$ . We consider $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0274$ for size simulations and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0275$ for power simulations. To test these hypotheses, we apply the OLS and LAD methods for estimating $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0276$ . It is well known that the OLS estimator enjoys optimality under (i.i.d.) normal errors, but the LAD estimator may be better for leptokurtic errors.

In size simulations, we set $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0277$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0278$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0279$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0280$ represents the conditional variance of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0281$ . The data-generating processes (DGPs) for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0282$ are AR(1)-HOMO and AR(1)-HET. AR(1) indicates that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0283$ is governed by the AR(1) model, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0284$ , where ρ is set to be either 0.5 or 0.8, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0285$ is a white noise with unit variance and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0286$ , which is a scaling factor such that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0287$ . While HOMO stands for conditional homoscedasticity of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0288$ (we set $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0289$ for all t), HET denotes that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0290$ is conditionally heteroscedastic with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0291$ as considered in Fitzenberger (1997, p. 255). We generate $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0292$ from i.i.d. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0293$ and standardized Student's t(4). This enables us to examine if LAD-based tests are more appropriate for leptokurtic data. As for the regressors, they follow the AR(1) model specified as that for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0294$ .
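The AR(1)-HOMO design can be sketched as follows. This is a minimal illustration under assumptions of ours: the scaling factor is taken to be the square root of 1 − ρ², which is the value that gives a stationary AR(1) process unit variance, and the Student's t(4) innovations are standardized by dividing by the square root of 2 because a t(4) variate has variance 2. The conditionally heteroscedastic (HET) design is not reproduced here, and the function names are hypothetical.

```python
import math
import random


def standardized_t4(rng):
    """Draw from Student's t(4), rescaled to unit variance (Var[t(4)] = 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(2.0, 2.0)   # shape 2, scale 2: chi-squared with 4 d.f.
    return z / math.sqrt(chi2 / 4.0) / math.sqrt(2.0)


def ar1_series(n, rho, rng, innovation="normal", burn_in=200):
    """Generate e_t = rho * e_{t-1} + sqrt(1 - rho^2) * u_t (assumed scaling)."""
    scale = math.sqrt(1.0 - rho ** 2)
    if innovation == "normal":
        draw = lambda: rng.gauss(0.0, 1.0)
    else:
        draw = lambda: standardized_t4(rng)
    e = 0.0
    out = []
    for t in range(n + burn_in):
        e = rho * e + scale * draw()
        if t >= burn_in:                # discard the burn-in observations
            out.append(e)
    return out


rng = random.Random(1)
series = ar1_series(20000, 0.5, rng)
mean = sum(series) / len(series)
variance = sum((x - mean) ** 2 for x in series) / len(series)
lag1 = sum((series[t] - mean) * (series[t - 1] - mean)
           for t in range(1, len(series))) / (len(series) * variance)
```

For a long simulated path the sample variance should be close to one and the first-order autocorrelation close to ρ, which is a quick check that the scaling has the intended effect.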

For comparison, we simulate the HAC-type and KVB-type tests. For the former, we consider the Bartlett kernel covariance matrix estimator for which the truncation lag is determined by the non-parametric method of Newey and West (1994) with the weighting vector $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0295$ and two preliminary truncation lags, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0296$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0297$ and 12. Unlike the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0298$ test, these two tests require consistent estimation of M_o. Thus, we employ the estimator $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0299$ for OLS and the following kernel-based estimator for LAD:
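To fix ideas about the two normalizations being compared, the sketch below computes, for a scalar series, a Bartlett-kernel long-run variance estimate and a KVB-type normalizer built from partial sums of the demeaned series. It is a simplified illustration under assumptions of ours: the truncation lag is supplied by the user rather than chosen by the Newey and West (1994) procedure, the Bartlett weights take the common form 1 − j/(lag + 1), and the scalar location setting stands in for the regression setting of the simulations.

```python
def bartlett_lrv(v, lag):
    """Bartlett-kernel long-run variance with weights 1 - j / (lag + 1)."""
    n = len(v)
    vbar = sum(v) / n
    d = [x - vbar for x in v]
    lrv = sum(x * x for x in d) / n                 # lag-0 autocovariance
    for j in range(1, lag + 1):
        gamma_j = sum(d[t] * d[t - j] for t in range(j, n)) / n
        lrv += 2.0 * (1.0 - j / (lag + 1.0)) * gamma_j
    return lrv


def kvb_normalizer(v):
    """KVB-type normalizer: T^{-2} times the sum of squared partial sums."""
    n = len(v)
    vbar = sum(v) / n
    partial = 0.0
    total = 0.0
    for x in v:
        partial += x - vbar                          # partial sum of demeaned data
        total += partial * partial
    return total / n ** 2


v = [1.0, -1.0, 1.0, -1.0]       # toy series (illustrative only)
lrv_lag0 = bartlett_lrv(v, 0)
lrv_lag1 = bartlett_lrv(v, 1)
c_hat = kvb_normalizer(v)
```

The first function depends on the lag, which is the user-chosen parameter discussed in the text; the second has no tuning parameter but does not converge to the long-run variance, which is why KVB-type tests have non-standard limiting distributions.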

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0300$

Here, κ is the Gaussian kernel, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0301$ is the bandwidth that vanishes in the limit and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0302$ are LAD residuals. Although optimal selection of bandwidth has been extensively discussed in the literature on density estimation, it is not clear how the value of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0303$ should be selected such that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0304$ enjoys optimality in some sense. We thus examine the effect of selecting $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0305$ by employing the two bandwidths: (a) $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0306$ ; (b) $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0307$ . Here, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0308$ with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0309$ and R being the sample standard deviation and interquartile range of the LAD residuals, respectively. The first is suggested by Silverman (1986, p. 48) to obtain an optimal rate for density estimation, but the second goes to zero at the same rate as in Koenker (2005, p. 81). Given these choices, the OLS-based tests read $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0310$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0311$ , and the LAD-based tests are denoted as $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0312$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0313$ , where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0314$ (a) and (b).
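The dependence of the kernel-based estimator on the bandwidth can be explored with a sketch such as the following. It is an illustration of ours rather than the estimator used in the simulations: it estimates only the scalar density of the residuals at zero with a Gaussian kernel, and it assumes the familiar Silverman rule of thumb, 0.9 min(s, R/1.34) n^(−1/5), for the bandwidth. The exponent is exposed as a parameter (`rate`, a hypothetical name) so that other rates can be tried; the second bandwidth of the text is not reproduced.

```python
import math
import random
import statistics


def silverman_bandwidth(resid, rate=0.2):
    """Rule-of-thumb bandwidth 0.9 * min(s, R / 1.34) * n^(-rate) (assumed form)."""
    n = len(resid)
    s = statistics.stdev(resid)                      # sample standard deviation
    q1, _, q3 = statistics.quantiles(resid, n=4)     # sample quartiles
    spread = min(s, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-rate)


def gaussian_density_at_zero(resid, h):
    """Kernel estimate of the residual density at zero with a Gaussian kernel."""
    n = len(resid)
    kernel_sum = sum(math.exp(-0.5 * (e / h) ** 2) for e in resid)
    return kernel_sum / (n * h * math.sqrt(2.0 * math.pi))


rng = random.Random(42)
resid = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # stand-in for LAD residuals
h = silverman_bandwidth(resid)
f0_hat = gaussian_density_at_zero(resid, h)
```

Re-running the last two lines with a different `rate` changes `h` and hence the density estimate, which is the sensitivity to user-chosen parameters that the comparison between bandwidths (a) and (b) is designed to reveal.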

The empirical sizes for the OLS-based tests are reported in Tables 1 and 2. Clearly, these tests are all oversized in small samples (e.g. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0315$ ) and the distortions deteriorate when q or ρ becomes larger. We also observe that leptokurtosis has little effect on the size performance, but heteroscedasticity does result in more size distortions especially for leptokurtic data and smaller q. Among these tests, the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0316$ test has the largest size distortions, regardless of the values of a. In particular, its size distortions are much larger for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0317$ and remain even when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0318$ . The other tests are clearly less oversized and a quite encouraging result is that the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0319$ test dominates the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0320$ test in terms of finite sample size. It is also found that when data become more persistent (i.e. ρ becomes larger), the size distortion of the former increases slightly, yet the size distortion of the latter increases dramatically.

Table 1. Empirical sizes of the robust hypothesis tests (OLS, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0321$ )

		AR(1)-HOMO						AR(1)-HET
		$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0322$			t(4)			$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0323$			t(4)
	q	$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0324$	100	500	50	100	500	50	100	500	50	100	500
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0325$	1	6.3	5.3	5.0	5.7	5.8	4.8	9.9	8.2	5.9	11.6	8.9	5.8
	2	7.8	6.3	5.3	7.8	6.3	5.5	10.5	8.2	5.2	11.6	8.4	6.6
	3	10.4	7.7	5.8	9.7	7.6	6.2	12.6	9.2	5.9	13.1	9.5	6.8
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0326$	1	9.9	7.3	5.6	9.7	7.9	5.5	11.7	8.6	6.1	14.0	10.3	5.9
	2	11.3	8.2	5.8	11.8	8.7	6.1	12.7	8.7	5.5	13.9	10.0	7.0
	3	14.5	9.5	6.2	14.3	10.5	6.8	14.0	9.4	6.0	15.7	10.7	7.0
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0327$	1	16.2	12.2	8.3	17.3	13.1	7.6	19.4	13.7	8.4	23.6	16.9	8.2
	2	23.7	17.9	9.9	25.3	17.7	10.5	25.7	18.8	9.6	29.1	19.8	10.1
	3	33.4	22.9	10.5	32.5	21.1	11.7	32.8	22.6	10.4	34.5	22.3	11.6
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0328$	1	23.4	16.6	8.1	25.1	17.5	7.0	27.8	18.3	8.3	31.1	21.3	8.7
	2	37.9	25.3	10.3	40.0	25.9	10.7	40.4	27.4	10.3	43.1	28.0	11.0
	3	52.1	34.9	11.6	51.0	34.5	12.6	53.1	34.9	12.4	53.1	35.7	12.8

Note: The entries are rejection frequencies in percentages; the nominal size is 5%.

Table 2. Empirical sizes of the robust hypothesis tests (OLS, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0329$ )

		AR(1)-HOMO						AR(1)-HET
		$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0330$			t(4)			$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0331$			t(4)
	q	$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0332$	100	500	50	100	500	50	100	500	50	100	500
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0333$	1	7.5	5.9	5.6	7.1	6.3	5.3	11.9	10.4	6.5	13.3	10.6	6.7
	2	10.2	8.7	5.5	9.6	8.3	5.7	14.2	10.8	5.6	14.6	12.1	6.8
	3	14.6	11.4	7.9	13.5	11.7	7.1	17.9	13.7	7.9	17.8	13.9	7.5
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0334$	1	17.8	12.2	6.9	17.4	12.4	7.1	21.0	14.6	7.1	22.7	15.5	7.3
	2	23.1	16.4	7.6	22.9	16.7	8.1	24.7	17.3	7.0	27.3	18.8	8.8
	3	32.3	20.6	9.4	30.6	20.8	9.4	31.4	20.2	9.1	31.9	20.9	8.9
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0335$	1	28.9	23.2	10.7	30.5	23.5	10.5	35.1	27.0	11.7	39.0	29.5	12.7
	2	45.4	35.1	13.6	46.9	34.6	14.8	47.7	37.2	14.9	51.1	37.2	16.1
	3	60.7	46.0	18.3	60.4	44.3	18.5	59.5	45.4	18.2	60.4	44.9	18.5
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0336$	1	34.1	24.6	11.8	35.7	24.9	11.5	40.2	29.4	12.7	43.8	31.7	13.9
	2	53.7	38.6	15.6	55.7	39.2	16.6	56.9	41.4	16.6	59.0	42.0	17.6
	3	70.8	51.6	20.9	69.9	51.3	20.6	70.0	52.5	20.5	71.1	52.1	20.9

Note: The entries are rejection frequencies in percentages; the nominal size is 5%.

The empirical sizes for LAD are summarized in Tables 3 and 4. Generally, the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0337$ test is still the best test and the HAC-type test is the worst. Compared with the preceding tables, we observe that the OLS-based and LAD-based tests have similar patterns regarding size performance. It is also found that the LAD-based $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0338$ test has empirical sizes closer to the nominal size of 5%, but this is not necessarily the case for the other tests. These results suggest that, as far as an accurate finite sample size is concerned, the LAD-based $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0339$ test is preferred for testing in linear regressions.

Table 3. Empirical sizes of the robust hypothesis tests (LAD, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0340$ )

		AR(1)-HOMO						AR(1)-HET
		$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0341$			t(4)			$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0342$			t(4)
	q	$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0343$	100	500	50	100	500	50	100	500	50	100	500
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0344$	1	4.4	3.5	4.2	4.1	3.8	4.1	7.1	5.9	5.2	8.4	7.2	5.7
	2	5.0	4.2	4.3	4.7	4.1	4.0	6.4	4.7	4.1	6.6	4.5	5.0
	3	7.1	5.4	4.2	6.2	4.9	4.3	6.6	4.7	3.8	7.0	5.4	4.2
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0345$	1	6.6	5.5	4.9	6.0	5.7	5.4	9.3	7.4	5.5	12.1	10.0	6.4
	2	7.7	7.0	5.2	8.8	7.0	5.7	9.3	6.6	4.4	11.9	7.9	5.9
	3	11.2	8.2	5.9	10.1	8.5	6.4	9.3	6.2	4.5	12.4	8.8	5.9
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0346$	1	9.3	7.4	5.5	8.8	8.8	7.5	12.3	10.2	6.6	15.9	13.5	8.9
	2	12.8	9.9	6.7	14.3	11.8	8.4	14.9	10.5	6.7	17.6	14.6	9.1
	3	19.4	13.5	8.5	19.2	16.2	10.4	17.0	12.1	8.1	20.0	15.8	10.1
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0347$	1	13.6	9.8	6.8	13.3	10.8	7.6	17.5	11.9	7.9	21.0	16.0	9.1
	2	20.0	14.0	8.0	21.4	15.4	8.8	20.2	13.7	6.8	24.4	16.8	9.0
	3	28.8	20.3	9.7	27.6	20.6	10.5	25.7	15.9	7.4	27.6	19.7	8.9
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0348$	1	17.5	13.2	8.2	17.9	15.0	10.0	21.6	15.5	10.4	25.8	19.9	11.6
	2	28.1	19.7	11.3	30.6	23.0	14.8	28.0	20.0	11.3	33.1	25.0	14.5
	3	41.7	30.3	14.8	42.6	33.5	18.5	37.7	26.2	14.1	40.7	29.8	17.4
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0349$	1	23.7	17.2	8.2	23.6	18.2	9.8	27.7	19.0	10.2	30.7	23.1	11.4
	2	39.1	27.8	11.8	41.2	30.1	15.8	39.0	27.5	11.8	41.9	32.0	14.9
	3	55.8	41.6	16.6	55.0	44.0	20.0	51.6	37.4	15.3	54.2	40.6	18.9

Note: The entries are rejection frequencies in percentages; the nominal size is 5%.

Table 4. Empirical sizes of the robust hypothesis tests (LAD, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0350$ )

		AR(1)-HOMO						AR(1)-HET
		$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0351$			t(4)			$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0352$			t(4)
	q	$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0353$	100	500	50	100	500	50	100	500	50	100	500
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0354$	1	5.3	4.9	4.2	4.6	4.3	4.6	8.9	8.1	5.9	10.0	7.7	5.7
	2	7.1	6.3	5.0	6.8	5.7	4.7	7.8	6.6	4.6	8.3	6.9	4.6
	3	9.2	7.6	6.0	9.1	8.2	5.9	10.3	6.2	5.1	9.2	7.2	4.7
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0355$	1	12.3	8.9	5.7	12.0	8.9	6.1	14.1	10.3	6.1	16.1	11.3	6.6
	2	18.3	12.8	7.2	18.1	13.9	7.0	17.0	11.2	5.5	20.2	13.7	6.2
	3	25.8	17.7	8.4	26.1	17.9	8.0	21.3	13.2	6.7	22.9	15.7	6.4
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0356$	1	15.7	11.5	6.1	15.2	12.3	7.5	17.9	12.9	7.1	20.0	14.6	8.6
	2	24.3	16.7	9.1	25.2	17.9	9.5	23.6	16.1	9.1	27.1	17.7	9.1
	3	36.1	26.0	11.0	36.2	27.0	12.0	30.4	21.0	9.5	32.8	24.0	10.3
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0357$	1	24.5	18.0	9.1	23.6	18.2	9.6	27.9	20.3	10.1	29.8	21.4	10.9
	2	37.2	28.2	12.6	38.4	30.4	13.5	35.8	25.0	9.9	38.4	29.8	12.1
	3	51.5	39.6	17.1	51.9	39.5	16.7	44.3	32.5	13.4	46.6	34.7	13.4
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0358$	1	28.8	21.1	10.7	28.0	22.9	12.6	32.2	23.4	11.8	34.1	27.3	13.3
	2	44.6	34.4	16.9	46.6	36.4	17.2	44.3	33.1	15.8	46.9	35.4	16.4
	3	62.4	50.5	22.7	62.7	50.9	23.2	55.7	43.2	19.1	57.6	46.5	20.0
$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0359$	1	33.1	23.0	11.0	32.7	24.1	12.9	35.7	24.7	11.9	37.0	27.9	13.6
	2	52.6	38.1	18.0	54.1	39.8	18.1	52.0	37.3	17.2	54.0	38.9	17.9
	3	70.3	55.6	24.6	71.7	56.1	25.1	65.5	49.1	20.7	65.6	52.3	21.6

Note: The entries are rejection frequencies in percentages; the nominal size is 5%.

For power simulations, we consider $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0360$ and 100 and the null hypothesis $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0361$ against the alternatives for which $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0362$ . The DGPs considered are AR(1)-HOMO and AR(1)-HET with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0363$ and two error terms: $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0364$ and (standardized) Student's t errors. As shown in the preceding results, the $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0365$ test is slightly oversized for these DGPs, but other tests have substantial size distortions, especially when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0366$ and when inappropriate user-chosen parameters are used. To provide a proper power comparison, we thus simulate the size-adjusted powers. However, it should be stressed that, while size adjustment enables us to compare the power performance of tests with different finite sample sizes, it is generally infeasible in practical applications.
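Size adjustment can be sketched as follows: the critical value is taken from the simulated null distribution of the statistic, so that the test has exactly the nominal size in the simulation, and power is then the rejection frequency under the alternative at that critical value. The function names and the toy statistics below are hypothetical and serve only to show the mechanics.

```python
def size_adjusted_critical_value(null_stats, level=0.05):
    """Empirical critical value from statistics simulated under the null.

    Returns the value just below the top level*100% of the null draws, so
    that (absent ties) exactly that share of null draws exceeds it.
    """
    ordered = sorted(null_stats, reverse=True)
    allowed = int(level * len(ordered) + 1e-9)   # null rejections permitted
    return ordered[allowed]


def rejection_rate(stats, critical_value):
    """Share of simulated statistics strictly above the critical value."""
    return sum(1 for s in stats if s > critical_value) / len(stats)


# Toy statistics standing in for simulated test statistics (illustrative only).
null_stats = [float(i) for i in range(1, 101)]
alt_stats = [s + 10.0 for s in null_stats]

cv = size_adjusted_critical_value(null_stats)
adjusted_size = rejection_rate(null_stats, cv)    # equals the nominal level
adjusted_power = rejection_rate(alt_stats, cv)
```

Because the critical value requires draws from the true null distribution, this calculation is available in a simulation but not in practice, which is the caveat stressed above.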

The power curves for the OLS-based and LAD-based tests are plotted in Figures 1 and 2, respectively, with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0367$ on the horizontal axis. The different panels show the following: AR(1)-HOMO with (a) normal errors and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0368$ , (b) normal errors and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0369$ , (c) t(4) errors and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0370$ and (d) t(4) errors and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0371$ ; AR(1)-HET with normal errors and (e) $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0372$ and (f) $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0373$ . Clearly, their powers grow with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0374$ and T, but are adversely affected by leptokurtosis or heteroscedasticity. Comparing the OLS-based $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0375$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0376$ tests, we find that although the latter delivers slightly higher power when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0377$ , they perform quite similarly when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0378$ , which shows that their power differences disappear very quickly. In the LAD case, the KVB-type test is no longer free from user-chosen parameters and it is of interest to see that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0379$ performs similarly to $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0380$ and outperforms $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0381$ in both samples. Note that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0382$ may be even more powerful than $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0383$ in a larger sample, in contrast with the OLS case. Comparing $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0384$ with the HAC-type test, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0385$ does suffer from power loss in the OLS case, but it still performs similarly to $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0386$ when $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0387$ is small. For LAD, the HAC-type test depends on more user-chosen parameters and it is clear that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0388$ performs better than $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0389$ in a smaller sample.

These simulation results together suggest that the proposed $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0390$ test is practically useful because it dominates the other tests in terms of finite sample size and can enjoy power advantage when the other tests are computed using inappropriate user-chosen parameters. Finally, by comparing Figures 1 and 2 we also find that although the LAD-based $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0391$ test is dominated by the OLS-based $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0392$ test for AR(1)-HOMO with normal errors, the former may perform better when data are leptokurtic (resulting from either heteroscedasticity or leptokurtic errors).

6. CONCLUSIONS

In this paper, we propose new robust hypothesis tests for non-linear constraints on M-estimators with possibly non-differentiable estimating functions. The proposed approach may serve as a good alternative to hypothesis testing because it does not require consistent estimation of any nuisance parameters in the asymptotic covariance matrix. Hence, it circumvents the problems arising from such consistent estimation, a sharp contrast with the HAC-type and KVB-type tests. Our simulations also suggest that the proposed test is practically useful because it performs better than the HAC-type and KVB-type tests in terms of finite sample size, and it has a power advantage when the latter tests are computed with inappropriate user-chosen parameters.

ACKNOWLEDGEMENTS

We thank the editor Michael Jansson, Anil Bera, Joon Park, Werner Ploberger, Jeffrey Racine, two anonymous referees, and the participants at the 2006 Far Eastern Meeting of the Econometric Society in Beijing for helpful suggestions and comments. All errors are our responsibility. The research support from the National Science Council of the Republic of China (NSC94-2415-H-194-009 for W-M. Lee) is also gratefully acknowledged.

Appendix A

Proof of Lemma 3.1. Let $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0393$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0394$ be defined as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0395$

Then it is easy to see that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0396$ with $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0397$ . We first consider the term $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0398$ . By the first-order Taylor expansion about $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0399$ , we have under Assumptions 2.1 and 2.4 that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0400$ (A.1)

where the last equality follows from the Bahadur representation: $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0401$ (as shown in Section 2.1). As a result,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0402$

As for $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0403$ , its sup norm can be expressed as

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0404$

Under Assumptions 2.1 and 3.1, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0405$ . Moreover, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0406$ and, as shown in equation A.1, $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0407$ . These results together imply $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0408$ . As such

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0409$

Given Assumption 2.2 and applying the continuous mapping theorem, we immediately obtain $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0410$ . Applying again the continuous mapping theorem yields the weak limit of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0411$ . $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0412$

Proof of Theorem 3.1. As $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0413$ , we immediately have

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0414$

where $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0415$ and Ξ is a non-singular $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0416$ matrix square root of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0417$ . It then follows from Lemma 3.1 that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0418$ has the following weak limit:

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0419$

Given $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0420$ and applying the delta method,

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0421$

We thus have under the null that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0422$

where Ξ is defined as above. Applying the continuous mapping theorem, we can obtain under the null that

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0423$

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0424$

Proof of Theorem 3.2. Equations (3.3) and (3.4) are sufficient to show Theorem 3.2. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0425$

Proof of Theorem 3.3. Under the local alternative (3.5), we have

$urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0426$

see also the proof of Theorem 3.1. With this result and that $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0427$ , $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0428$ and $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0429$ regardless of the values of $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0430$ , the weak limits for the three tests immediately follow. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0431$

Proof of Corollary 3.1. The result follows directly from Theorem 3.3. $urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0432$

References

Amemiya, T. (1984). Tobit models: a survey. Journal of Econometrics 24, 3–61. doi:10.1016/0304-4076(84)90074-5
Andrews, D. W. K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817–58. doi:10.2307/2938229
Buchinsky, M. (1995). Estimating the asymptotic covariance matrix for quantile regression models. Journal of Econometrics 68, 303–38. doi:10.1016/0304-4076(94)01652-G
Bunzel, H., N. M. Kiefer and T. J. Vogelsang (2001). Simple robust testing of hypotheses in nonlinear models. Journal of the American Statistical Association 96, 1088–96. doi:10.1198/016214501753209068
Chen, M-Y. and C-M. Kuan (2001). Testing parameter constancy in models with infinite variance errors. Economics Letters 72, 11–18. doi:10.1016/S0165-1765(01)00413-X
den Haan, W. J. and A. T. Levin (1997). A practitioner's guide to robust covariance matrix estimation. In G. S. Maddala and C. R. Rao (Eds.), Handbook of Statistics, Volume 15, 299–342. Amsterdam: Elsevier.
Fitzenberger, B. (1997). The moving blocks bootstrap and robust inference for linear least squares and quantile regressions. Journal of Econometrics 82, 235–87. doi:10.1016/S0304-4076(97)00058-4
Huang, Y., S. Volgushev and X. Shao (2015). On self-normalization for censored dependent data. Journal of Time Series Analysis 36, 109–24. doi:10.1111/jtsa.12096
Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In L. M. Le Cam and J. Neyman (Eds.), Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, 221–33. Berkeley, CA: University of California Press.
Huber, P. J. (1981). Robust Statistics. New York, NY: John Wiley. doi:10.1002/0471725250
Jansson, M. (2004). The error in rejection probability of simple autocorrelation robust tests. Econometrica 72, 937–46. doi:10.1111/j.1468-0262.2004.00517.x
Kiefer, N. M., T. J. Vogelsang and H. Bunzel (2000). Simple robust testing of regression hypotheses. Econometrica 68, 695–714. doi:10.1111/1468-0262.00128
Koenker, R. (2005). Quantile Regression. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511754098
Koenker, R. and G. Bassett (1978). Regression quantiles. Econometrica 46, 33–50. doi:10.2307/1913643
Koenker, R. and V. d'Orey (1987). Computing regression quantiles. Applied Statistics 36, 383–93. doi:10.2307/2347802
Koenker, R. and B. J. Park (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics 71, 265–83. doi:10.1016/0304-4076(96)84507-6
Kuan, C-M. and K. Hornik (1995). The generalized fluctuation test: a unifying view. Econometric Reviews 14, 135–61. doi:10.1080/07474939508800311
Kuan, C-M. and W-M. Lee (2006). Robust M tests without consistent estimation of the asymptotic covariance matrix. Journal of the American Statistical Association 101, 1264–75. doi:10.1198/016214506000000375
Kuan, C-M., J-H. Yeh and Y-C. Hsu (2009). Assessing value at risk with CARE, the conditional autoregressive expectile models. Journal of Econometrics 150, 261–70. doi:10.1016/j.jeconom.2008.12.002
Lee, W-M., C-M. Kuan and Y-C. Hsu (2014). Over-identifying restrictions without consistent estimation of the asymptotic covariance matrix. Journal of Econometrics 181, 181–93. doi:10.1016/j.jeconom.2014.04.002
Lobato, I. N. (2001). Testing that a dependent process is uncorrelated. Journal of the American Statistical Association 96, 1066–76. doi:10.1198/016214501753208726
Newey, W. K. and D. L. McFadden (1994). Large sample estimation and hypothesis testing. In R. F. Engle and D. L. McFadden (Eds.), Handbook of Econometrics, Volume 4, 2111–245. Amsterdam: Elsevier. doi:10.1016/S1573-4412(05)80005-4
Newey, W. K. and J. L. Powell (1987). Asymmetric least squares estimation and testing. Econometrica 55, 819–47. doi:10.2307/1911031
Newey, W. K. and K. D. West (1987). A simple positive semi-definite heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55, 703–8. doi:10.2307/1913610
Newey, W. K. and K. D. West (1994). Automatic lag selection in covariance matrix estimation. Review of Economic Studies 61, 631–53. doi:10.2307/2297912
Ploberger, W., W. Krämer and K. Kontrus (1989). A new test for structural stability in the linear regression model. Journal of Econometrics 40, 307–18. doi:10.1016/0304-4076(89)90087-0
Powell, J. L. (1984). Least absolute deviations estimation for the censored regression model. Journal of Econometrics 25, 303–25. doi:10.1016/0304-4076(84)90004-6
Powell, J. L. (1986a). Censored regression quantiles. Journal of Econometrics 32, 143–55. doi:10.1016/0304-4076(86)90016-3
Powell, J. L. (1986b). Symmetrically trimmed least squares estimation for Tobit models. Econometrica 54, 1435–60. doi:10.2307/1914308
Priestley, M. B. (1981). Spectral Analysis and Time Series. San Diego, CA: Academic Press.
Robinson, P. M. (1982). On the asymptotic properties of estimators of models containing limited dependent variables. Econometrica 50, 27–41. doi:10.2307/1912527
Shao, X. (2010). A self-normalized approach to confidence interval construction in time series. Journal of the Royal Statistical Society, Series B 72, 343–66. doi:10.1111/j.1467-9868.2009.00737.x
Shao, X. (2012). Parametric inference in stationary time series models with dependent errors. Scandinavian Journal of Statistics 39, 772–83. doi:10.1111/j.1467-9469.2011.00781.x
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman and Hall. doi:10.1007/978-1-4899-3324-9
Sun, Y. and M. S. Kim (2012). Simple and powerful GMM over-identification tests with accurate size. Journal of Econometrics 166, 267–81. doi:10.1016/j.jeconom.2011.09.039
Tauchen, G. (1985). Diagnostic testing and evaluation of maximum likelihood models. Journal of Econometrics 30, 415–43. doi:10.1016/0304-4076(85)90149-6
Vogelsang, T. J. (2003). Testing in GMM models without truncation. Advances in Econometrics 17, 199–233. doi:10.1016/S0731-9053(03)17010-7
Weiss, A. A. (1991). Estimating nonlinear dynamic models using least absolute error estimation. Econometric Theory 7, 46–68. doi:10.1017/S0266466600004230
White, H. (1994). Estimation, Inference, and Specification Analysis. Cambridge: Cambridge University Press. doi:10.1017/CCOL0521252806
White, H. (2001). Asymptotic Theory for Econometricians (revised ed.). Orlando, FL: Academic Press.
Wooldridge, J. M. and H. White (1988). Some invariance principles and central limit theorems for dependent heterogeneous processes. Econometric Theory 4, 210–30. doi:10.1017/S0266466600012032
Zhou, Z. and X. Shao (2013). Inference for linear models with dependent errors. Journal of the Royal Statistical Society, Series B 75, 323–43. doi:10.1111/j.1467-9868.2012.01044.x

1 This line of research started after the first version of our paper, which was presented at an Econometric Society conference in 2006. We thank a referee for bringing the self-normalization literature to our attention.

2 We thank a referee for pointing this out.


Volume 18, Issue 1, February 2015, Pages 95–116

