Volume 18, Issue 1 pp. 95-116
ARTICLE

Robust hypothesis tests for M-estimators with possibly non-differentiable estimating functions

Wei-Ming Lee


Department of Economics, National Chung Cheng University, Chia-Yi, 621 Taiwan, China

Yu-Chin Hsu


Institute of Economics, Academia Sinica, Taipei, 115 Taiwan, China

Chung-Ming Kuan


Department of Finance, National Taiwan University, Taipei, 106 Taiwan, China

First published: 03 December 2014

Summary

We propose a new robust hypothesis test for (possibly non-linear) constraints on M-estimators with possibly non-differentiable estimating functions. The proposed test employs a random normalizing matrix computed from recursive M-estimators to eliminate the nuisance parameters arising from the asymptotic covariance matrix. It does not require consistent estimation of any nuisance parameters, in contrast with the conventional heteroscedasticity-autocorrelation consistent (HAC)-type test and the Kiefer–Vogelsang–Bunzel (KVB)-type test. Our test reduces to the KVB-type test in simple location models with ordinary least-squares estimation, so the error in the rejection probability of our test in a Gaussian location model is urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0001. We discuss robust testing in quantile regression, and censored regression models in detail. In simulation studies, we find that our test has better size control and better finite sample power than the HAC-type and KVB-type tests.

1. INTRODUCTION

Conventional hypothesis testing rests on consistent estimation of the asymptotic covariance matrix. In time series econometrics, the non-parametric kernel estimator originating from the spectral estimation of Priestley (1981) is a leading example; see also Newey and West (1987, 1994), Andrews (1991) and den Haan and Levin (1997) for econometric contributions. This estimator, which is also known as a heteroscedasticity-autocorrelation consistent (HAC) estimator, leads to asymptotic chi-squared tests that are robust to heteroscedasticity and serial correlations of unknown form, but the testing results can vary with the choices of the kernel function and its bandwidth.

In view of this, Kiefer et al. (2000, KVB hereafter) propose to replace the HAC estimator with a random normalizing matrix to avoid the selection of the bandwidth in the non-parametric kernel estimation in linear regression models. This approach is extended to robust testing in non-linear regression and generalized method of moments (GMM) models; see Bunzel et al. (2001) and Vogelsang (2003) for more details. As for specification testing, Lobato (2001) develops a portmanteau test for serial correlations, and Kuan and Lee (2006) propose general M-tests of moment conditions that are robust not only in the KVB sense but also in the presence of an estimation effect. For robust overidentifying restrictions (OIR) tests, see Sun and Kim (2012) and Lee et al. (2014).

As we will see later, to test for (possibly non-linear) constraints on the class of M-estimators of Huber (1967), a consistent estimator for the derivative of the expectation of the estimating function is needed. When the estimating function is differentiable with respect to the parameter vector, a consistent estimator for this derivative is simply the sample average of the derivative of the estimating function. However, when the estimating function is not differentiable, the estimation is less straightforward; a leading example is the quantile regression (QR) estimator of Koenker and Bassett (1978). Although an explicit form of the derivative of the expectation of the estimating function is available in this case, it involves the conditional density of the model errors. Therefore, a consistent estimator for this matrix involves non-parametric kernel estimation of the conditional density. As such, a user-chosen bandwidth is needed and the performance of the HAC-type and KVB-type tests can be sensitive to this choice. Alternatively, one may appeal to the bootstrap method to circumvent consistent estimation of any nuisance matrix (see, e.g., Buchinsky, 1995, and Fitzenberger, 1997), but the resulting tests are computationally demanding. Moreover, tests based on the moving blocks bootstrap (MBB), as suggested by Fitzenberger (1997), can be sensitive to the selection of block length and the number of bootstrap samples. Subsampling may also be applied, but it suffers from a problem similar to MBB.

In this paper, we propose a new robust hypothesis test for possibly non-linear constraints on M-estimators with possibly non-differentiable estimating functions. The proposed test employs a normalizing matrix computed from recursive M-estimators to eliminate the nuisance parameters in the limit and hence does not require consistent estimation of any nuisance parameters, in contrast with the HAC-type and KVB-type tests. This feature makes the proposed test appealing because consistent estimators for such nuisance parameters may not only be difficult to obtain but also sensitive to user-chosen parameters, which in turn lead to poor finite sample performance of the test. The null limit of the proposed test is shown to be the same as that of Lobato (2001) and hence the asymptotic critical values are already available. Moreover, we show that the proposed test reduces to the KVB-type test in simple location models with ordinary least-squares (OLS) estimation so the error in rejection probability of the proposed test in a Gaussian location model is thus urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0002, in contrast with the conventional tests for which the error in rejection probability is typically no better than urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0003. We consider robust testing in QR and censored regression models in detail. In simulation studies, we find that our test can have better size control and better finite sample power than the HAC-type and KVB-type tests.

Our method is also known as the self-normalization method in the statistics literature. Note that, while Shao (2010) constructs confidence intervals for parameters that are functionals of the marginal or joint distribution of a stationary time series (e.g. the mean or normalized spectral means), we consider hypothesis testing on parameters defined in a more general class of econometric models that include these parameters as special cases. Shao (2012) later constructs confidence intervals for the parameters in stationary time series models based on the frequency domain maximum likelihood estimator, which can be applied to a large class of long/short memory time series models with weakly dependent innovations; see also Zhou and Shao (2013) and Huang et al. (2015) for inference on parameters in different contexts.

This paper proceeds as follows. In Section 2, we introduce M-estimation and related asymptotic results. Then, in Section 3, we present the proposed test as well as the HAC-type and KVB-type tests. As examples, we discuss robust testing in QR and censored regression models in Section 4. We report simulation results in Section 5, and we conclude in Section 6. All proofs are deferred to the Appendix.

2. M-ESTIMATION AND ASYMPTOTIC RESULTS

Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0004 be a sequence of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0005 random vectors and let θ be a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0006 unknown parameter vector with the true value urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0007. In the context of M-estimation, an M-estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0008 for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0009 can be defined as the one satisfying the following estimating equation,
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0010 (2.1)
where T is the sample size and ϕ is a measurable and separable urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0011-valued estimating function with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0012. Clearly, the class of M-estimators includes many estimators as special cases. For example, consider the linear regression model: urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0013. The OLS estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0014 satisfies urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0015, the asymmetric least-squares estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0016 of Newey and Powell (1987) and Kuan et al. (2009) for the τth conditional expectile of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0017 satisfies urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0018, where I is the indicator function, and the QR estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0019 for the τth conditional quantile of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0020 is such that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0021
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0022; see Fitzenberger (1997) for more details. The symmetrically trimmed least-squares estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0023 proposed by Powell (1986b) is also an M-estimator because, for truncated data, it satisfies
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0024
See also Section 4.2 for the censored regression models. In what follows, [c] denotes the integer part of the real number c, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0025 denotes the sup norm for a vector, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0026 denotes convergence in probability, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0027 denotes convergence in distribution, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0028 denotes equality in distribution and ⇒ denotes weak convergence of associated probability measures. We also let Wq be a vector of q independent, standard Wiener processes and we let Bq be the Brownian bridge with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0029 for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0030.
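To make the estimating functions above concrete, the following Python sketch writes out textbook forms of ϕ for the OLS, asymmetric least-squares and QR estimators in a linear model with a scalar regressor. The function names, the sample data and the convention of placing the indicator on strictly negative residuals are illustrative assumptions; they are not taken from the displayed equations.

```python
from statistics import mean, median


def phi_ols(y, x, theta):
    """OLS estimating function x * (y - x * theta), scalar regressor."""
    return x * (y - x * theta)


def phi_expectile(y, x, theta, tau):
    """Asymmetric least-squares form |tau - 1{e < 0}| * x * e (assumed convention)."""
    e = y - x * theta
    return abs(tau - (1.0 if e < 0 else 0.0)) * x * e


def phi_qr(y, x, theta, tau):
    """QR estimating function x * (tau - 1{e < 0}); not differentiable in theta."""
    e = y - x * theta
    return x * (tau - (1.0 if e < 0 else 0.0))


def m_T(phi, ys, xs, theta, *args):
    """Sample average of the estimating function evaluated at theta."""
    return sum(phi(y, x, theta, *args) for y, x in zip(ys, xs)) / len(ys)


# Hypothetical data for a location model (intercept only, x_t = 1).
ys = [2.1, -0.4, 1.3, 0.7, 3.2, 0.9, 1.8]
xs = [1.0] * len(ys)

# The sample mean solves the OLS estimating equation up to rounding error.
ols_value = m_T(phi_ols, ys, xs, mean(ys))

# The sample median solves the QR equation (tau = 0.5) only approximately,
# because the indicator makes the sample average a step function of theta.
qr_value = m_T(phi_qr, ys, xs, median(ys), 0.5)
```

In this sketch the OLS average is zero up to floating-point error, whereas the QR average at the sample median is merely of order 1/T, which illustrates why the estimating equation is usually stated as holding approximately for non-smooth ϕ.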

2.1. Asymptotic normality of M-estimators

Define
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0031
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0032; for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0033, we simply write urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0034, which is the sample average of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0035. To derive the limiting distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0036, we impose the following ‘high-level’ assumptions that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0037 is urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0038 consistent and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0039 obeys a multivariate functional central limit theorem (FCLT).

Assumption 2.1. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0040.

Assumption 2.2. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0041, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0042 and S is the matrix square root of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0043 (i.e. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0044).

Assumption 2.1 holds under various sets of primitive regularity conditions. These conditions typically impose some stochastic properties (such as memory, heterogeneity and moment restrictions) on urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0045 and some smoothness and domination conditions on ϕ so that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0046 obeys a weak uniform law of large numbers, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0047 is continuous in θ, and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0048 is identifiably unique. Discussions of the consistency of M-estimators can be found, for example, in Huber (1981, Chapter 6), Tauchen (1985, pp. 422–24) and White (1994, pp. 33–35). Assumption 2.2 is more than enough to establish the asymptotic normality of M-estimators but is required to derive the weak limit of recursive M-estimators. Note that the conditions that ensure multivariate FCLT are sufficient for Assumption 2.2; see, e.g., Corollary 4.2 of Wooldridge and White (1988) or Theorem 7.30 of White (2001). By Assumption 2.2, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0049.

When ϕ is twice continuously differentiable with respect to θ and the corresponding second derivative is bounded in probability, the limiting distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0050 can be easily derived using a first-order Taylor expansion. By expanding the estimating equation (2.1) about urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0051, we immediately have under Assumption 2.1 that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0052
As long as a law of large numbers can be applied to the sample averages of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0053, a Bahadur representation for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0054 can then be given by
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0055 (2.2)
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0056, a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0057 non-singular matrix of constants. Under Assumption 2.2, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0058, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0059.
Because many M-estimators, such as least absolute deviations (LAD) and QR estimators, rely on non-differentiable ϕ, it is not possible to expand (2.1) around urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0060 using the technique above, but a Bahadur representation as in (2.2) is still available for such M-estimators. To see this, define
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0061
and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0062; for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0063, we simply write urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0064 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0065. We now impose the following two assumptions; see also Weiss (1991) for primitive conditions.

Assumption 2.3. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0066 is twice continuously differentiable with a bounded second derivative and satisfies the following property: urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0067.

Assumption 2.4. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0068 uniformly in urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0069 and Mo is non-singular.

Given these two assumptions and by a first-order Taylor expansion of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0070 around urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0071, we obtain an expression as in (2.2), and hence the asymptotic normality for such M-estimators follows immediately from Assumption 2.2.

2.2. Weak convergence of recursive M-estimators

Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0072 be the recursive M-estimator using the subsample of first j observations; that is, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0073 is the one such that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0074
To derive its weak limit, we modify Assumptions 2.1 and 2.3 as follows.

Assumption 2.1′. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0075, uniformly in urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0076.

Assumption 2.3′. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0077 is twice continuously differentiable with a bounded second derivative and satisfies urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0078, uniformly in urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0079.

These two assumptions are natural because the recursive M-estimators and the full-sample M-estimator ought to behave similarly. It can be shown that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0080
which implies that under Assumption 2.2,
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0081
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0082. By replacing urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0083 with the full-sample M-estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0084, we can apply the continuous mapping theorem to obtain that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0085 (2.3)
This result has been established in the literature on testing parameter constancy for linear mean and median regressions; see, e.g. Ploberger et al. (1989), Kuan and Hornik (1995) and Chen and Kuan (2001). Here, we show that it remains valid for recursive M-estimators. This result is needed to construct robust tests for parameter restrictions.

3. TESTS FOR GENERAL HYPOTHESES

The null hypothesis of interest consists of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0086 possibly non-linear restrictions on urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0087,
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0088 (3.1)
where γ is a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0089 vector of continuously differentiable functions with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0090 that is of full row rank in a neighbourhood of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0091. Under the assumptions above and using a delta method, we can show that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0092, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0093.

3.1. The HAC-type test

To test the hypothesis defined in (3.1), the test statistic for a HAC-type test is defined as urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0094, in which urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0095 is the HAC estimator for the asymptotic covariance matrix of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0096 such that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0097, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0098 is a consistent estimator of Mo and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0099 is a non-parametric kernel estimator of V given by
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0100
where κ is a kernel function and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0101 denotes the truncation lag or bandwidth. As urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0102, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0103, and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0104 is consistent for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0105. The null distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0106 is a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0107 distribution. In this paper, a test based on test statistic urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0108 will be called the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0109 test.
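As an illustration of the kernel estimator and its dependence on the bandwidth, the following Python sketch computes a long-run variance estimate for a scalar series of estimating-function values. It assumes the standard kernel-weighted sum of sample autocovariances with the Bartlett kernel; the function names and the toy series are hypothetical, and the exact form used in the paper's display may differ.

```python
def bartlett(u):
    """Bartlett kernel: 1 - |u| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(u))


def hac_variance(phi, bandwidth):
    """Kernel long-run variance estimate for a scalar series phi_1, ..., phi_T.

    Assumed form: gamma(0) + 2 * sum_{k >= 1} kappa(k / bandwidth) * gamma(k),
    where gamma(k) = T^{-1} * sum_t phi_t * phi_{t-k}.
    """
    T = len(phi)

    def gamma(k):
        return sum(phi[t] * phi[t - k] for t in range(k, T)) / T

    v = gamma(0)
    for k in range(1, T):
        w = bartlett(k / bandwidth)
        if w == 0.0:  # Bartlett weights vanish for all lags k >= bandwidth
            break
        v += 2.0 * w * gamma(k)
    return v


# Toy series with strong negative first-order autocorrelation.
phi_values = [1.0, -1.0, 1.0, -1.0]
v_bandwidth_1 = hac_variance(phi_values, 1)  # only the lag-0 term enters
v_bandwidth_2 = hac_variance(phi_values, 2)  # lag 1 enters with weight 0.5
```

The two calls use the same data but return different values, which is the sensitivity to user-chosen parameters that motivates the KVB-type and proposed tests.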
Although the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0110 test has a standard null limit and is robust to heteroscedasticity and serial correlations of unknown form, its finite sample performance depends on the choices of the kernel function and its bandwidth. Implementing the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0111 test also requires a consistent estimator for Mo. If ϕ is continuously differentiable, then
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0112
and the sample average of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0113 is a natural consistent estimator for Mo. However, when ϕ is not differentiable, the estimation is less straightforward. For example, in QR models, Mo involves the conditional density of the error terms. Although kernel-based estimators for Mo are available (see Weiss, 1991), the resulting urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0114 test would suffer from the problems arising from non-parametric kernel estimation. For more details and more examples, see Section 4. In addition, if the constraints are non-linear, we will need to estimate urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0115 by urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0116. Even if it is straightforward to obtain urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0117, the finite sample performance of the tests may be adversely affected by this estimator.

3.2. The KVB-type test

The main idea underlying the KVB approach is to employ a random normalizing matrix in place of the kernel-based covariance matrix estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0118 to avoid the problems arising from non-parametric kernel estimation. Following this approach, we construct urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0119 as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0120
Clearly, it reduces to that of KVB when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0121 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0122. Given urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0123, a KVB-type test statistic is defined as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0124
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0125 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0126.

It is easy to derive the weak limit of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0127 (and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0128) when ϕ is smooth (see Kuan and Lee, 2006), but it is less straightforward when ϕ is non-smooth. To allow for non-smooth ϕ, we impose the following assumption, which is similar to condition (v) of Theorem 7.2 in Newey and McFadden (1994).

Assumption 3.1. Define

urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0129
Then, there exists a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0130 such that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0131.

This assumption should not be restrictive because sufficient conditions for Assumption 2.3′ are also sufficient for Assumption 3.1; see, e.g. Huber (1967, p. 227) and Weiss (1991, p. 62).

Lemma 3.1. Suppose that Assumptions 2.1, 2.2, 2.3′, 2.4 and 3.1 hold. Then urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0132, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0133 and S is the matrix square root of V.

As shown in Lemma 3.1, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0134 remains random in the limit so the null distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0135 will not be a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0136 distribution. However, the following theorem shows that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0137 is asymptotically pivotal under the null and its weak limit is the same as that of Lobato (2001). Therefore, the corresponding critical values can be found in Lobato (2001, p. 1067).

Theorem 3.1. Suppose that Assumptions 2.1, 2.2, 2.3′, 2.4 and 3.1 hold. Then under the null,

urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0138

Although the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0139 test does not require consistent estimation of V, it still requires consistent estimation of Mo and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0140. In particular, as discussed previously, Mo can be hard to estimate and there might be an additional smoothing parameter to pick when ϕ is not differentiable. In view of this, in the next subsection, we propose a new way to construct robust tests without consistent estimation of Mo and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0141.

3.3. The proposed test

We propose to use the following normalizing matrix,
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0142 (3.2)
which is computed using only recursive estimators, and define the test statistic of our test as urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0143. Note that we do not need to estimate Mo and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0144 to obtain urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0145, and this makes our test appealing. Under the assumptions above, we can show that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0146 (3.3)
where Ξ is the matrix square root of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0147 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0148 is the asymptotic variance of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0149. It follows that
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0150 (3.4)
Then, (3.3) and (3.4) are sufficient to derive the null distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0151, which is summarized in the following theorem.

Theorem 3.2. Suppose that Assumptions 2.1′, 2.2, 2.3′ and 2.4 hold. Then, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0152. Moreover,

urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0153
under the null hypothesis that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0154.

Remark 3.1. In (3.2), the summation starts with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0155, where p is the number of unknown parameters. When the criterion function is not smooth, it may not be easy to compute the M-estimators. In addition, if the sample size is small, the optimization might not be stable and the global minimization or maximization may not be easy to guarantee. Therefore, one might want to start with a larger subsample size to avoid these problems. In fact, the theory would still hold as long as the summation starts with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0156 and the sequence urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0157 satisfies urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0158 such that if urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0159, then

urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0160
as in Theorem 3.2.
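The construction can be sketched in Python for a scalar parameter and the simple null hypothesis θ = θ0. The sketch assumes that the normalizer has the usual self-normalization form, namely T^{-2} times the sum over j of j² times the squared deviation of the recursive estimate from the full-sample estimate, and that the statistic is T times the squared restriction divided by this normalizer; the display in (3.2) may differ in details. The function name, the `start` argument (which plays the role of the first subsample size discussed in Remark 3.1) and the data are hypothetical.

```python
from statistics import median


def self_normalized_stat(ys, estimator, theta0, start=1):
    """Self-normalized statistic for H0: theta = theta0 with scalar theta.

    Assumed form: T * (theta_T - theta0)^2 / C_T, where
    C_T = T^{-2} * sum_{j=start}^{T} j^2 * (theta_j - theta_T)^2
    and theta_j is the estimator applied to the first j observations.
    """
    T = len(ys)
    theta_T = estimator(ys)
    c_T = sum(
        (j * (estimator(ys[:j]) - theta_T)) ** 2 for j in range(start, T + 1)
    ) / T ** 2
    return T * (theta_T - theta0) ** 2 / c_T


# Hypothetical data; in a location model the sample median is the QR
# estimator with tau = 0.5, so no density or bandwidth choice is involved.
ys = [2.1, -0.4, 1.3, 0.7, 3.2, 0.9, 1.8, 0.2, 1.1, 2.6, -0.3, 1.5]

stat_all = self_normalized_stat(ys, median, theta0=0.0)               # j = 1, ..., T
stat_trimmed = self_normalized_stat(ys, median, theta0=0.0, start=4)  # j = 4, ..., T
```

Only repeated calls to the estimator are needed, so the same function works for any scalar M-estimator passed in as `estimator`. Dropping early subsamples removes non-negative terms from the normalizer, so in finite samples the trimmed statistic is never smaller than the untrimmed one; the resulting value would be compared with the critical values tabulated by Lobato (2001).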

Remark 3.2. To shed more light on the differences and similarities between urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0161 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0162, we consider the linear model urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0163 and the null hypothesis (3.1) with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0164. Based on the OLS estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0165, the KVB normalizing matrix can be expressed as

urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0166
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0167 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0168. As for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0169, we first show that for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0170,
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0171
Then the proposed normalizing matrix becomes
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0172
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0173. The result above shows that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0174 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0175 employ different but similar normalizing matrices in the context of linear regression with OLS estimation. Of particular interest is that in a simple location model (i.e. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0176 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0177 for all t), these two tests are algebraically equivalent because urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0178 in this case. Therefore, the error in rejection probability of the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0179 test is also urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0180 in a Gaussian location model, as shown in Jansson (2004).
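The algebraic equivalence in the location model can be checked numerically. The Python sketch below assumes that the KVB normalizer is T^{-2} times the sum of squared partial sums of the OLS residuals and that the proposed normalizer is T^{-2} times the sum of j² times the squared deviation of the recursive sample mean from the full-sample mean. Under these assumed forms the two coincide because j(ȳ_j − ȳ_T) equals the j-th partial sum of the residuals y_i − ȳ_T. Function names and data are hypothetical.

```python
import math


def kvb_normalizer(ys):
    """T^{-2} * sum_t S_t^2, with S_t the partial sums of the OLS residuals."""
    T = len(ys)
    y_bar = sum(ys) / T
    partial_sum = 0.0
    total = 0.0
    for y in ys:
        partial_sum += y - y_bar
        total += partial_sum ** 2
    return total / T ** 2


def recursive_normalizer(ys):
    """T^{-2} * sum_j j^2 * (mean of first j observations - full-sample mean)^2."""
    T = len(ys)
    y_bar = sum(ys) / T
    running_sum = 0.0
    total = 0.0
    for j, y in enumerate(ys, start=1):
        running_sum += y
        total += (j * (running_sum / j - y_bar)) ** 2
    return total / T ** 2


# Hypothetical location-model data (x_t = 1 for all t).
ys = [2.1, -0.4, 1.3, 0.7, 3.2, 0.9, 1.8, 0.2, 1.1, 2.6]
same = math.isclose(kvb_normalizer(ys), recursive_normalizer(ys), rel_tol=1e-9)
```

With regressors other than a constant, the recursive OLS estimates involve subsample moment matrices, so the identity above no longer holds exactly, in line with the "different but similar" description in the remark.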

3.4. Asymptotic local power

In this section, we compare the local powers of the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0181, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0182 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0183 tests under local alternatives defined as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0184 (3.5)
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0185 is a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0186 vector of non-zero constants. Under the local alternatives, the limiting distributions of these tests are summarized in the following theorem. Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0187.

Theorem 3.3. Suppose that Assumptions 2.1′, 2.2, 2.3′, 2.4 and 3.1 hold. Then under the local alternatives defined in (3.5), urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0188, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0189 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0190.

Given Theorem 3.3, we can now derive the asymptotic local power of these tests. Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0191 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0192 be the critical values at the α significance level taken from urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0193 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0194, respectively. Also, let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0195, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0196 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0197 be the asymptotic local powers of the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0198, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0199 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0200 tests, respectively.

Corollary 3.1. Suppose that Assumptions 2.1′, 2.2, 2.3′, 2.4 and 3.1 hold. Then, under the local alternatives, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0201 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0202.

It is obvious that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0203 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0204 have the same asymptotic local power because their limiting distributions are identical. The local power curves of these tests are the same as those in Figure 1 of Kuan and Lee (2006), so we omit them here. As we can see from Figure 1 of Kuan and Lee (2006), urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0205 has better local power than the other two tests. However, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0206 may not outperform the other two tests in finite samples because its performance would depend on user-chosen parameters as well.

Size-adjusted powers of the OLS-based tests.

4. EXAMPLES

In this section, we illustrate the application of the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0207 test in QR and in censored regression models. QR is a leading example for a non-differentiable ϕ. In the second example, whether the corresponding ϕ is differentiable or not and whether consistent estimation of Mo is easy or not depend on the estimation method used.

4.1. QR models

In the context of QR, the τth conditional quantile of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0208 given a vector of explanatory variables xt is typically specified as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0209
where F is the conditional distribution function of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0210 given xt, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0211 is a urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0212 vector of unknown parameters with the true value urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0213 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0214 is a real-valued function that is continuously differentiable in the neighbourhood of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0215. Similar to the mean regression model, the QR model can also be written as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0216(4.1)
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0217 is the error term with the τth conditional quantile being zero under the true value urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0218, a condition ensuring the correct specification for the model 4.1. Note that the model 4.1 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0219 is also known as a median regression model.
To estimate the unknown parameter vector urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0220, Koenker and Bassett (1978) propose the QR estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0221 that minimizes the following criterion function:
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0222
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0223 is known as a check function. An analytic form for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0224 is generally not available; nonetheless, many algorithms for obtaining urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0225 have been proposed in the literature; see, e.g. Koenker and d'Orey (1987) and Koenker and Park (1996). Note also that the LAD estimator corresponds to the QR estimator with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0226.
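As a minimal illustration of the criterion above, the following Python sketch assumes the standard check function ρ_τ(u) = u(τ − 1{u < 0}) and a linear specification. The names check_loss and qr_objective are illustrative and do not come from the paper. In the intercept-only case with τ = 0.5, the minimizer is the sample median, which the sketch confirms by a grid search over the observed values.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0}) (standard form, assumed here)."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def qr_objective(theta, y, X, tau):
    """Average check loss for an assumed linear specification x'theta."""
    return np.mean(check_loss(y - X @ theta, tau))

# Intercept-only example: the minimizer over the data points is a sample quantile.
rng = np.random.default_rng(0)
y = rng.standard_normal(201)      # odd sample size, so the median is unique
X = np.ones((201, 1))
tau = 0.5
grid = np.sort(y)                 # candidate intercepts: the observed values
values = [qr_objective(np.array([g]), y, X, tau) for g in grid]
theta_hat = grid[int(np.argmin(values))]
```

The grid search is used only to keep the sketch self-contained; in practice one would rely on the linear-programming or interior-point algorithms cited above.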

Note that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0227 is not differentiable, so the limiting distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0228 is relatively difficult to derive. Nonetheless, under suitable conditions, the asymptotic normality result remains valid for the QR estimator (or LAD estimator); see Powell (1986a), Weiss (1991) and Fitzenberger (1997), among many others. Consider the linear quantile regression (i.e. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0229). The linear QR estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0230 is an M-estimator satisfying urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0231. For more general non-linear models, we can follow Weiss (1991) and Fitzenberger (1997), and show that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0232, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0233; see also Koenker (2005, p. 124). Therefore, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0234 is an M-estimator with a non-differentiable ϕ.

To implement HAC-type and KVB-type tests in QR models, it is necessary to estimate Mo consistently, where Mo is given by
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0235
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0236 is the conditional density of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0237. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0238 can be estimated by a non-parametric kernel method, but the test performance will depend on the choices of kernel and bandwidth. To circumvent consistent estimation of Mo, one may appeal to MBB or subsampling, but the testing results can be sensitive to the selection of block length instead. Therefore, the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0239 test seems to be practically useful for testing in QR models because it avoids consistent estimation of Mo and is thus free from user-chosen parameters.

4.2. Censored regression models

Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0240 be the dependent variable generated according to urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0241. If we only observe data points urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0242 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0243, then urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0244 forms a censored regression model and the OLS estimator obtained by regressing urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0245 on xt is not consistent for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0246; see Amemiya (1984) for an early review. Most studies rely on maximum likelihood (ML) estimation. Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0247 and let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0248 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0249 be the corresponding conditional distribution and density functions. Under the assumption of independence, the log likelihood function can then be written as urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0250, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0251, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0252 and
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0253
The Gaussian ML estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0254 is an M-estimator that solves urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0255.
The Gaussian ML estimator is not robust to conditional heteroscedasticity and non-normality, however. Powell (1984) proposes a censored LAD estimator instead, which satisfies 2.1 with non-differentiable urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0256. Although such an estimator is robust to conditional heteroscedasticity and non-normality, hypothesis tests based on this estimator are not easy to implement because
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0257(4.2)
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0258 is the true conditional density of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0259. Therefore, the censored LAD-based test would suffer from the same problems as in QR models.
When the conditional distribution of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0260 is symmetric about urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0261, the symmetrically trimmed least-squares estimator introduced by Powell (1986b) is also consistent for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0262 and robust to conditional heteroscedasticity and non-normality. Moreover, tests based on this estimator are easier to implement. To see this, from equation (2.9) in Powell (1986b), this estimator is an M-estimator with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0263. Although ϕ is non-differentiable, the corresponding Mo can be expressed as
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0264
for which consistent estimation is straightforward.

The censored regression model can be applied not only to cross-sectional data but also to time series data. For time series data, the assumption of serial independence may not be appropriate, but Robinson (1982) shows that the Gaussian ML estimator remains consistent and asymptotically normal with a complicated asymptotic covariance matrix. Therefore, the test developed in this paper can also be useful in this setting.
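The inconsistency of OLS under censoring mentioned at the start of this subsection can be seen in a small simulation. The sketch below is only illustrative: it assumes censoring at zero (the observed variable is the maximum of the latent variable and zero), i.i.d. standard normal regressors and errors, and hypothetical parameter values; none of these choices are taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20000
x = rng.standard_normal(T)
X = np.column_stack([np.ones(T), x])
theta_true = np.array([0.0, 1.0])                   # hypothetical true values
y_star = X @ theta_true + rng.standard_normal(T)    # latent dependent variable
y = np.maximum(y_star, 0.0)                         # censoring at zero (assumed form)

# OLS on the latent variable recovers the slope; OLS on the censored variable does not.
ols_latent = np.linalg.lstsq(X, y_star, rcond=None)[0]
ols_censored = np.linalg.lstsq(X, y, rcond=None)[0]
```

In this particular design about half of the observations are censored, and the slope from the censored regression is attenuated towards roughly half of its true value, whereas the regression on the latent variable stays close to the truth.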

5. MONTE CARLO SIMULATIONS

In this section, the finite sample performance of the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0265 test is evaluated via Monte Carlo simulations. We consider three sample sizes (urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0266, 100 and 500) and two different nominal sizes (5% and 10%). The number of replications is 5,000 for size simulations and 1,000 for power simulations. Because the results for different nominal sizes are qualitatively similar, we report only the results for 5% nominal size.

As in KVB, we consider the linear regression specification, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0267, where xt is a 5 × 1 vector of regressors, with the first element equal to 1 for all t and where other elements are mutually independent AR(1) processes. The last four elements are generated in the same way as the AR(1)-HOMO errors introduced below. In the same simulation design, the correlation coefficients and the distributions of the regressors are the same as those of the error terms. θ is an unknown parameter vector with the true value urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0268, and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0269 is an error term. The null hypothesis is given by
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0270
where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0271 is the ith element of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0272 and q denotes the number of restrictions on urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0273. We consider urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0274 for size simulations and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0275 for power simulations. To test these hypotheses, we apply the OLS and LAD methods for estimating urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0276. It is well known that the OLS estimator enjoys optimality under (i.i.d.) normal errors, but the LAD estimator may be better for leptokurtic errors.

In size simulations, we set urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0277 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0278, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0279 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0280 represents the conditional variance of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0281. The data-generating processes (DGPs) for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0282 are AR(1)-HOMO and AR(1)-HET. AR(1) indicates that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0283 is governed by the AR(1) model, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0284, where ρ is set to be either 0.5 or 0.8, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0285 is a white noise with unit variance and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0286, which is a scaling factor such that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0287. While HOMO stands for conditional homoscedasticity of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0288 (we set urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0289 for all t), HET denotes that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0290 is conditionally heteroscedastic with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0291 as considered in Fitzenberger (1997, p. 255). We generate urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0292 from i.i.d. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0293 and standardized Student's t(4). This enables us to examine if LAD-based tests are more appropriate for leptokurtic data. As for the regressors, they follow the AR(1) model specified as that for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0294.
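The AR(1)-HOMO design can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the recursion is taken to be e_t = ρ e_{t−1} + sqrt(1 − ρ²) u_t, so that the series has unit variance, and the Student's t(4) innovations are standardized by dividing by sqrt(2), since the variance of t(4) is 2. The function name ar1_series and the burn-in length are illustrative and not from the paper; the HET case is not covered here.

```python
import numpy as np

def ar1_series(T, rho, rng, dist="normal", burn=200):
    """Simulate a unit-variance AR(1) series (assumed form of the AR(1)-HOMO design)."""
    n = T + burn
    if dist == "normal":
        u = rng.standard_normal(n)
    elif dist == "t4":
        # Var[t(4)] = 4 / (4 - 2) = 2, so divide by sqrt(2) to standardize.
        u = rng.standard_t(4, size=n) / np.sqrt(2.0)
    else:
        raise ValueError("dist must be 'normal' or 't4'")
    scale = np.sqrt(1.0 - rho ** 2)   # scaling factor giving unit unconditional variance
    e = np.empty(n)
    e[0] = u[0]
    for t in range(1, n):
        e[t] = rho * e[t - 1] + scale * u[t]
    return e[burn:]                   # drop the burn-in observations

rng = np.random.default_rng(2)
e = ar1_series(200000, 0.5, rng)
sample_var = e.var()
lag1_corr = np.corrcoef(e[:-1], e[1:])[0, 1]
```

With a long simulated series, the sample variance should be close to one and the first-order autocorrelation close to ρ, which is a quick check that the scaling factor is doing its job.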

For comparison, we simulate the HAC-type and KVB-type tests. For the former, we consider the Bartlett kernel covariance matrix estimator for which the truncation lag is determined by the non-parametric method of Newey and West (1994) with the weighting vector urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0295 and two preliminary truncation lags, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0296, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0297 and 12. Unlike the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0298 test, these two tests require consistent estimation of Mo. Thus, we employ the estimator urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0299 for OLS and the following kernel-based estimator for LAD:
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0300
Here, κ is the Gaussian kernel, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0301 is the bandwidth that vanishes in the limit and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0302 are LAD residuals. Although optimal selection of bandwidth has been extensively discussed in the literature on density estimation, it is not clear how the value of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0303 should be selected such that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0304 enjoys optimality in some sense. We thus examine the effect of selecting urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0305 by employing the two bandwidths: (a) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0306; (b) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0307. Here, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0308 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0309 and R being the sample standard deviation and interquartile range of the LAD residuals, respectively. The first is suggested by Silverman (1986, p. 48) to obtain an optimal rate for density estimation, but the second goes to zero at the same rate as in Koenker (2005, p. 81). Given these choices, the OLS-based tests read urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0310 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0311, and the LAD-based tests are denoted as urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0312 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0313, where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0314 (a) and (b).
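A kernel-based estimator of this type can be sketched as follows. The sketch assumes the common form (1/(T c)) Σ κ(ê_t / c) x_t x_t' with a Gaussian κ, which may differ from the paper's expression in sign or scaling. It also assumes that bandwidth (a) is Silverman's rule 0.9 · min(s, R/1.34) · T^(−1/5), and that bandwidth (b) uses the same constant with the exponent −1/3; the second of these is a guess made purely for illustration. The function names are hypothetical.

```python
import numpy as np

def silverman_scale(resid):
    """min(s, R / 1.34): s is the sample standard deviation, R the interquartile range."""
    s = np.std(resid, ddof=1)
    q75, q25 = np.percentile(resid, [75, 25])
    return min(s, (q75 - q25) / 1.34)

def kernel_m_hat(X, resid, c):
    """Assumed estimator (1 / (T c)) * sum_t kappa(e_t / c) x_t x_t' with Gaussian kappa."""
    T = X.shape[0]
    w = np.exp(-0.5 * (resid / c) ** 2) / np.sqrt(2.0 * np.pi)   # kappa(e_t / c)
    return (X * w[:, None]).T @ X / (T * c)

rng = np.random.default_rng(4)
T = 50000
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
resid = rng.standard_normal(T)                  # stand-in for LAD residuals
A = silverman_scale(resid)
c_a = 0.9 * A * T ** (-1.0 / 5.0)               # bandwidth (a), Silverman's rule (assumed)
c_b = 0.9 * A * T ** (-1.0 / 3.0)               # bandwidth (b), exponent is an assumption
M_a = kernel_m_hat(X, resid, c_a)
M_b = kernel_m_hat(X, resid, c_b)
```

In this toy design the residuals are standard normal and independent of the regressors, so both estimates should be close to the density at zero (about 0.399) times the identity matrix. The narrower bandwidth (b) trades a smaller smoothing bias for a larger variance, which is one way to see why the choice of bandwidth can affect test performance.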

The empirical sizes for the OLS-based tests are reported in Tables 1 and 2. Clearly, these tests are all oversized in small samples (e.g. urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0315) and the distortions deteriorate when q or ρ becomes larger. We also observe that leptokurtosis has little effect on the size performance, but heteroscedasticity does result in more size distortions especially for leptokurtic data and smaller q. Among these tests, the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0316 test has the largest size distortions, regardless of the values of a. In particular, its size distortions are much larger for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0317 and remain even when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0318. The other tests are clearly less oversized and a quite encouraging result is that the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0319 test dominates the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0320 test in terms of finite sample size. It is also found that when data become more persistent (i.e. ρ becomes larger), the size distortion of the former increases slightly, yet the size distortion of the latter increases dramatically.

Table 1. Empirical sizes of the robust hypothesis tests (OLS, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0321)
AR(1)-HOMO AR(1)-HET
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0322 t(4) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0323 t(4)
q urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0324 100 500 50 100 500 50 100 500 50 100 500
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0325 1 6.3 5.3 5.0 5.7 5.8 4.8 9.9 8.2 5.9 11.6 8.9 5.8
2 7.8 6.3 5.3 7.8 6.3 5.5 10.5 8.2 5.2 11.6 8.4 6.6
3 10.4 7.7 5.8 9.7 7.6 6.2 12.6 9.2 5.9 13.1 9.5 6.8
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0326 1 9.9 7.3 5.6 9.7 7.9 5.5 11.7 8.6 6.1 14.0 10.3 5.9
2 11.3 8.2 5.8 11.8 8.7 6.1 12.7 8.7 5.5 13.9 10.0 7.0
3 14.5 9.5 6.2 14.3 10.5 6.8 14.0 9.4 6.0 15.7 10.7 7.0
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0327 1 16.2 12.2 8.3 17.3 13.1 7.6 19.4 13.7 8.4 23.6 16.9 8.2
2 23.7 17.9 9.9 25.3 17.7 10.5 25.7 18.8 9.6 29.1 19.8 10.1
3 33.4 22.9 10.5 32.5 21.1 11.7 32.8 22.6 10.4 34.5 22.3 11.6
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0328 1 23.4 16.6 8.1 25.1 17.5 7.0 27.8 18.3 8.3 31.1 21.3 8.7
2 37.9 25.3 10.3 40.0 25.9 10.7 40.4 27.4 10.3 43.1 28.0 11.0
3 52.1 34.9 11.6 51.0 34.5 12.6 53.1 34.9 12.4 53.1 35.7 12.8

Note

  • The entries are rejection frequencies in percentage; the nominal size is 5%.

Table 2. Empirical sizes of the robust hypothesis tests (OLS, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0329)
AR(1)-HOMO AR(1)-HET
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0330 t(4) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0331 t(4)
q urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0332 100 500 50 100 500 50 100 500 50 100 500
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0333 1 7.5 5.9 5.6 7.1 6.3 5.3 11.9 10.4 6.5 13.3 10.6 6.7
2 10.2 8.7 5.5 9.6 8.3 5.7 14.2 10.8 5.6 14.6 12.1 6.8
3 14.6 11.4 7.9 13.5 11.7 7.1 17.9 13.7 7.9 17.8 13.9 7.5
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0334 1 17.8 12.2 6.9 17.4 12.4 7.1 21.0 14.6 7.1 22.7 15.5 7.3
2 23.1 16.4 7.6 22.9 16.7 8.1 24.7 17.3 7.0 27.3 18.8 8.8
3 32.3 20.6 9.4 30.6 20.8 9.4 31.4 20.2 9.1 31.9 20.9 8.9
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0335 1 28.9 23.2 10.7 30.5 23.5 10.5 35.1 27.0 11.7 39.0 29.5 12.7
2 45.4 35.1 13.6 46.9 34.6 14.8 47.7 37.2 14.9 51.1 37.2 16.1
3 60.7 46.0 18.3 60.4 44.3 18.5 59.5 45.4 18.2 60.4 44.9 18.5
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0336 1 34.1 24.6 11.8 35.7 24.9 11.5 40.2 29.4 12.7 43.8 31.7 13.9
2 53.7 38.6 15.6 55.7 39.2 16.6 56.9 41.4 16.6 59.0 42.0 17.6
3 70.8 51.6 20.9 69.9 51.3 20.6 70.0 52.5 20.5 71.1 52.1 20.9

Note

  • The entries are rejection frequencies in percentage; the nominal size is 5%.

The empirical sizes for LAD are summarized in Tables 3 and 4. Generally, the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0337 test is still the best test and the HAC-type test is the worst. Compared with the preceding tables, we observe that the OLS-based and LAD-based tests have similar patterns regarding size performance. It is also found that the LAD-based urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0338 test has empirical sizes closer to the nominal size of 5%, but this is not necessarily the case for the other tests. These results suggest that, as far as an accurate finite sample size is concerned, the LAD-based urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0339 test is preferred for testing in linear regressions.

Table 3. Empirical sizes of the robust hypothesis tests (LAD, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0340)
AR(1)-HOMO AR(1)-HET
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0341 t(4) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0342 t(4)
q urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0343 100 500 50 100 500 50 100 500 50 100 500
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0344 1 4.4 3.5 4.2 4.1 3.8 4.1 7.1 5.9 5.2 8.4 7.2 5.7
2 5.0 4.2 4.3 4.7 4.1 4.0 6.4 4.7 4.1 6.6 4.5 5.0
3 7.1 5.4 4.2 6.2 4.9 4.3 6.6 4.7 3.8 7.0 5.4 4.2
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0345 1 6.6 5.5 4.9 6.0 5.7 5.4 9.3 7.4 5.5 12.1 10.0 6.4
2 7.7 7.0 5.2 8.8 7.0 5.7 9.3 6.6 4.4 11.9 7.9 5.9
3 11.2 8.2 5.9 10.1 8.5 6.4 9.3 6.2 4.5 12.4 8.8 5.9
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0346 1 9.3 7.4 5.5 8.8 8.8 7.5 12.3 10.2 6.6 15.9 13.5 8.9
2 12.8 9.9 6.7 14.3 11.8 8.4 14.9 10.5 6.7 17.6 14.6 9.1
3 19.4 13.5 8.5 19.2 16.2 10.4 17.0 12.1 8.1 20.0 15.8 10.1
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0347 1 13.6 9.8 6.8 13.3 10.8 7.6 17.5 11.9 7.9 21.0 16.0 9.1
2 20.0 14.0 8.0 21.4 15.4 8.8 20.2 13.7 6.8 24.4 16.8 9.0
3 28.8 20.3 9.7 27.6 20.6 10.5 25.7 15.9 7.4 27.6 19.7 8.9
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0348 1 17.5 13.2 8.2 17.9 15.0 10.0 21.6 15.5 10.4 25.8 19.9 11.6
2 28.1 19.7 11.3 30.6 23.0 14.8 28.0 20.0 11.3 33.1 25.0 14.5
3 41.7 30.3 14.8 42.6 33.5 18.5 37.7 26.2 14.1 40.7 29.8 17.4
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0349 1 23.7 17.2 8.2 23.6 18.2 9.8 27.7 19.0 10.2 30.7 23.1 11.4
2 39.1 27.8 11.8 41.2 30.1 15.8 39.0 27.5 11.8 41.9 32.0 14.9
3 55.8 41.6 16.6 55.0 44.0 20.0 51.6 37.4 15.3 54.2 40.6 18.9

Note

  • The entries are rejection frequencies in percentage; the nominal size is 5%.
Table 4. Empirical sizes of the robust hypothesis tests (LAD, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0350)
AR(1)-HOMO AR(1)-HET
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0351 t(4) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0352 t(4)
q urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0353 100 500 50 100 500 50 100 500 50 100 500
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0354 1 5.3 4.9 4.2 4.6 4.3 4.6 8.9 8.1 5.9 10.0 7.7 5.7
2 7.1 6.3 5.0 6.8 5.7 4.7 7.8 6.6 4.6 8.3 6.9 4.6
3 9.2 7.6 6.0 9.1 8.2 5.9 10.3 6.2 5.1 9.2 7.2 4.7
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0355 1 12.3 8.9 5.7 12.0 8.9 6.1 14.1 10.3 6.1 16.1 11.3 6.6
2 18.3 12.8 7.2 18.1 13.9 7.0 17.0 11.2 5.5 20.2 13.7 6.2
3 25.8 17.7 8.4 26.1 17.9 8.0 21.3 13.2 6.7 22.9 15.7 6.4
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0356 1 15.7 11.5 6.1 15.2 12.3 7.5 17.9 12.9 7.1 20.0 14.6 8.6
2 24.3 16.7 9.1 25.2 17.9 9.5 23.6 16.1 9.1 27.1 17.7 9.1
3 36.1 26.0 11.0 36.2 27.0 12.0 30.4 21.0 9.5 32.8 24.0 10.3
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0357 1 24.5 18.0 9.1 23.6 18.2 9.6 27.9 20.3 10.1 29.8 21.4 10.9
2 37.2 28.2 12.6 38.4 30.4 13.5 35.8 25.0 9.9 38.4 29.8 12.1
3 51.5 39.6 17.1 51.9 39.5 16.7 44.3 32.5 13.4 46.6 34.7 13.4
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0358 1 28.8 21.1 10.7 28.0 22.9 12.6 32.2 23.4 11.8 34.1 27.3 13.3
2 44.6 34.4 16.9 46.6 36.4 17.2 44.3 33.1 15.8 46.9 35.4 16.4
3 62.4 50.5 22.7 62.7 50.9 23.2 55.7 43.2 19.1 57.6 46.5 20.0
urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0359 1 33.1 23.0 11.0 32.7 24.1 12.9 35.7 24.7 11.9 37.0 27.9 13.6
2 52.6 38.1 18.0 54.1 39.8 18.1 52.0 37.3 17.2 54.0 38.9 17.9
3 70.3 55.6 24.6 71.7 56.1 25.1 65.5 49.1 20.7 65.6 52.3 21.6

Note

  • The entries are rejection frequencies in percentage; the nominal size is 5%.

For power simulations, we consider urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0360 and 100 and the null hypothesis urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0361 against the alternatives for which urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0362. The DGPs considered are AR(1)-HOMO and AR(1)-HET with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0363 and two error terms: urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0364 and (standardized) Student's t errors. As shown in the preceding results, the urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0365 test is slightly oversized for these DGPs, but other tests have substantial size distortions, especially when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0366 and when inappropriate user-chosen parameters are used. To provide a proper power comparison, we thus simulate the size-adjusted powers. However, it should be stressed that, while size adjustment enables us to compare the power performance of tests with different finite sample sizes, it is generally infeasible in practical applications.
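The size adjustment described above can be sketched in Python as follows: the critical value is taken as the empirical (1 − α) quantile of the test statistics simulated under the null, and power is the rejection frequency under the alternative using that critical value. The function name and the toy inputs are illustrative only, and the statistic is assumed to reject for large values.

```python
import numpy as np

def size_adjusted_power(null_stats, alt_stats, alpha=0.05):
    """Return the empirical critical value and the size-adjusted rejection frequency."""
    crit = np.quantile(null_stats, 1.0 - alpha)       # empirical (1 - alpha) quantile under H0
    power = np.mean(np.asarray(alt_stats) > crit)     # share of rejections under H1
    return crit, power

# Toy inputs standing in for simulated test statistics.
null_stats = np.arange(1, 101)
alt_stats = [90, 96, 100, 50]
crit, power = size_adjusted_power(null_stats, alt_stats)
```

Because the critical value comes from the simulated null distribution, which is unknown in an application, this adjustment serves only for comparing tests in simulations.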

The power curves for the OLS-based and LAD-based tests are plotted in Figures 1 and 2, respectively, with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0367 on the horizontal axis. The different panels show the following: AR(1)-HOMO with (a) normal errors and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0368, (b) normal errors and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0369, (c) t(4) errors and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0370 and (d) t(4) errors and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0371; AR(1)-HET with normal errors and (e) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0372 and (f) urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0373. Clearly, their powers grow with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0374 and T, but are adversely affected by leptokurtosis or heteroscedasticity. Comparing the OLS-based urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0375 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0376 tests, we find that although the latter delivers slightly higher power when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0377, they perform quite similarly when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0378, which shows that their power differences disappear very quickly. In the LAD case, the KVB-type test is no longer free from user-chosen parameters and it is of interest to see that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0379 performs similarly to urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0380 and outperforms urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0381 in both samples. Note that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0382 may be even more powerful than urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0383 in a larger sample, in contrast with the OLS case.
Comparing urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0384 with the HAC-type test, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0385 does suffer from power loss in the OLS case, but it still performs similarly to urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0386 when urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0387 is small. For LAD, the HAC-type test depends on more user-chosen parameters and it is clear that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0388 performs better than urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0389 in a smaller sample.

Figure 2. Size-adjusted powers of the LAD-based tests.

These simulation results together suggest that the proposed urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0390 test is practically useful because it dominates the other tests in terms of finite sample size and can enjoy power advantage when the other tests are computed using inappropriate user-chosen parameters. Finally, by comparing Figures 1 and 2 we also find that although the LAD-based urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0391 test is dominated by the OLS-based urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0392 test for AR(1)-HOMO with normal errors, the former may perform better when data are leptokurtic (resulting from either heteroscedasticity or leptokurtic errors).

6. CONCLUSIONS

In this paper, we propose new robust hypothesis tests for non-linear constraints on M-estimators with possibly non-differentiable estimating functions. The proposed approach may serve as a good alternative to hypothesis testing because it does not require consistent estimation of any nuisance parameters in the asymptotic covariance matrix. Hence, it circumvents the problems arising from such consistent estimation, a sharp contrast with the HAC-type and KVB-type tests. Our simulations also suggest that the proposed test is practically useful because it performs better than the HAC-type and KVB-type tests in terms of finite sample size, and it has a power advantage when the latter tests are computed with inappropriate user-chosen parameters.

ACKNOWLEDGEMENTS

We thank the editor Michael Jansson, Anil Bera, Joon Park, Werner Ploberger, Jeffrey Racine, two anonymous referees, and the participants at the 2006 Far Eastern Meeting of the Econometric Society in Beijing for helpful suggestions and comments. All errors are our responsibility. The research support from the National Science Council of the Republic of China (NSC94-2415-H-194-009 for W-M. Lee) is also gratefully acknowledged.

    Appendix A

    Proof of Lemma 3.1. Let urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0393 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0394 be defined as

    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0395
    Then it is easy to see that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0396 with urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0397. We first consider the term urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0398. By the first-order Taylor expansion about urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0399, we have under Assumptions 2.1 and 2.4 that
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0400(A.1)
    where the last equality follows from the Bahadur representation: urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0401 (as shown in Section 2.1). As a result,
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0402
    As for urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0403, its sup norm can be expressed as
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0404
    Under Assumptions 2.1 and 3.1, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0405. Moreover, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0406 and, as shown in equation A.1, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0407. These results together imply urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0408. As such
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0409
    Given Assumption 2.2 and applying the continuous mapping theorem, we immediately obtain urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0410. Applying again the continuous mapping theorem yields the weak limit of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0411.urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0412

    Proof of Theorem 3.1. As urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0413, we immediately have

    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0414
    where urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0415 and Ξ is a non-singular urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0416 matrix square root of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0417. It then follows from Lemma 3.1 that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0418 has the following weak limit:
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0419
    Given urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0420 and applying the delta method,
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0421
    We thus have under the null that
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0422
    where Ξ is defined as above. Applying the continuous mapping theorem, we can obtain under the null that
    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0423

    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0424

    Proof of Theorem 3.2. Equations 3.3 and 3.4 are sufficient to show Theorem 3.2.urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0425

    Proof of Theorem 3.3. Under the local alternative 3.5, we have

    urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0426
    see also the proof of Theorem 3.1. With this result and that urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0427, urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0428 and urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0429 regardless of the values of urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0430, the weak limits for the three tests immediately follow.urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0431

    Proof of Corollary 3.1. The result follows directly from Theorem 3.3.urn:x-wiley:13684221:media:ectj12041:ectj12041-math-0432

  1. This line of research started after the first version of our paper, which was presented at an Econometric Society conference in 2006. We thank a referee for bringing the self-normalization literature to our attention.
  2. We thank a referee for pointing this out.