Volume 44, Issue 4 pp. 418-436
Original Article
Open Access

Geometric ergodicity and conditional self-weighted M-estimator of a GRCAR( p ) model with heavy-tailed errors

Xiaoyan Li
College of Mathematics and Statistics, Chongqing University, Chongqing, 401331 China

Jiazhu Pan (Corresponding Author)
Department of Mathematics and Statistics, University of Strathclyde, Glasgow, G1 1XH UK
Correspondence to: Jiazhu Pan, Department of Mathematics and Statistics, University of Strathclyde, 26 Richmond Street, Glasgow G1 1XH, UK. Email: [email protected]

Anchao Song
School of Public Health and Management, Chongqing Medical University, Chongqing, 400016 China
First published: 19 January 2023

Abstract

We establish geometric ergodicity for general stochastic functional autoregressive (linear and nonlinear) models with heavy-tailed errors. Stationarity conditions for a generalized random coefficient autoregressive model (GRCAR($p$)) are obtained as a corollary. A conditional self-weighted M-estimator for the parameters of the GRCAR($p$) model is then proposed, and its asymptotic normality is established while allowing infinite-variance innovations. Simulation experiments are carried out to assess the finite-sample performance of the proposed methodology and theory, and a real heavy-tailed data example is given as an illustration.

1 INTRODUCTION

Suppose that $\{y_t, t \geq 1-p\}$ are observations from the generalized random coefficient autoregressive (GRCAR) model of order $p$ defined by
$$y_t = \phi_{t0} + \phi_{t1} y_{t-1} + \phi_{t2} y_{t-2} + \cdots + \phi_{tp} y_{t-p} + \varepsilon_t, \quad t = 1, 2, \ldots, \tag{1}$$
where $\{(\phi_t', \varepsilon_t)' = (\phi_{t0}, \phi_{t1}, \ldots, \phi_{tp}, \varepsilon_t)', t \geq 1\}$ is a sequence of independent and identically distributed (i.i.d.) random vectors with $E\phi_t = \phi = (\phi_0, \phi_1, \ldots, \phi_p)'$. Here, it is assumed that $(\phi_t', \varepsilon_t)$ is independent of $\mathcal{F}_{t-1} = \sigma(y_{t-1}, y_{t-2}, \ldots, y_{1-p})$. We are interested in the stationarity of the above model and in estimation of the unknown parameter vector $\phi = (\phi_0, \phi_1, \ldots, \phi_p)'$, whose true value is denoted $\phi_0$. Note that, in model (1), the random coefficients are permitted to be correlated with the error process.
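To fix ideas, the following minimal sketch (Python; not part of the original article, with illustrative parameter values) simulates a GRCAR path in which the random coefficients and the error share a common shock, as model (1) permits.

import numpy as np

def simulate_grcar(n, phi, beta, burn=200, seed=0):
    # Simulate y_t = (phi_0 + beta_0*e_t) + sum_i (phi_i + beta_i*e_t) * y_{t-i} + e_t,
    # where the random coefficients share the shock e_t with the error term.
    rng = np.random.default_rng(seed)
    p = len(phi) - 1
    y = np.zeros(n + burn + p)
    for t in range(p, len(y)):
        e = rng.standard_t(df=2)            # heavy-tailed error, here t_2
        coef = phi + beta * e               # realized random coefficients (phi_t0, ..., phi_tp)
        y[t] = coef[0] + coef[1:] @ y[t - p:t][::-1] + e
    return y[-n:]

# e.g. a GRCAR(1) path with mean coefficients (0, 0.5), as in model A of Section 3
y = simulate_grcar(500, phi=np.array([0.0, 0.5]), beta=np.array([0.1, 0.1]))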

Model (1) with $p = 1$ (GRCAR(1)) was first introduced by Hwang and Basawa (1998); it includes the Markovian bilinear model and the random coefficient exponential autoregressive model as special cases. When $\mathrm{Var}(\phi_t) = 0$, model (1) reduces to the ordinary autoregressive (AR($p$)) model, whose theory is well developed. For example, Ling (2005) proposed a self-weighted least absolute deviation estimator and showed its asymptotic normality; this method has been used in many subsequent works, such as Pan et al. (2007), Pan and Chen (2013), and Pan et al. (2015). Wang and Hu (2017) proposed a self-weighted M-estimator for the AR($p$) model and established its asymptotic normality. When $\mathrm{Var}(\phi_t) \neq 0$ with $p = 1$ and $\phi_t$ independent of $\varepsilon_t$, model (1) becomes the first-order random coefficient autoregressive (RCAR(1)) model (see Nicholls and Quinn (1982)), which has frequently been used to describe random perturbations of dynamical systems in economics and biology (see Tong (1990), Yu et al. (2011), Zhang et al. (2015), Araveeporn (2017)). As a generalization of the RCAR and AR models, the GRCAR model has become an important class of nonlinear time series models, since it allows dependence between the random errors and the random coefficients. Estimation of parameters and asymptotic properties of GRCAR models have been studied in the literature. For instance, Hwang and Basawa (1997) established the local asymptotic normality of a class of GRCAR models. Zhao and Wang (2012) constructed confidence regions for the parameters by the empirical likelihood method. Zhao et al. (2013) considered testing the constancy of coefficients in a GRCAR model by the empirical likelihood method. Zhao et al. (2018) studied the variable selection problem in GRCAR models. Zhao et al. (2019) proposed a weighted least squares estimator with empirical likelihood (EL) based weights, using auxiliary information, for GRCAR models. Moreover, time series models with heavy-tailed errors, even with $E(\varepsilon_t^2)$ infinite, arise often in economic and financial modeling. Wu (2013) studied M-estimation for general ARMA processes with infinite variance. Yang and Ling (2017) investigated self-weighted least absolute deviation estimation for heavy-tailed threshold autoregressive models. Fu et al. (2021) studied the asymptotic properties of the conditional self-weighted M-estimator of the GRCAR(1) model with possibly heavy-tailed errors. However, general easy-to-check stationarity conditions, and limiting distributions of robust parameter estimators, for GRCAR($p$) models with heavy-tailed errors have remained open problems.

This article has two aims. First, we establish the geometric ergodicity of general stochastic functional autoregressive (linear and nonlinear) models with possibly heavy-tailed error terms under a mild moment condition; the stationarity conditions for the GRCAR($p$) model follow as a corollary of this general result. Second, motivated by Yang and Ling (2017), Ling (2005), Wang and Hu (2017) and Fu et al. (2021), we study a self-weighted M-estimator (SM-estimator) for the GRCAR($p$) model with possibly infinite variance and show that its limiting distribution is normal. Simulation results and a real data example are given to support our methodology.

The contents of this article are organized as follows. Section 2 presents the main results. Section 3 reports the simulation results. Section 4 shows a real data example. All proofs of our main results are given in Section 5.

2 MAIN RESULTS

2.1 Geometric Ergodicity

We first establish the geometric ergodicity of general stochastic functional autoregressive models (linear and nonlinear) under a mild moment condition. The geometric ergodicity of model (1), which implies its stationarity, is then obtained as a corollary of the main theorem.

Consider a general stochastic functional autoregressive model defined as follows:
$$y_t = \varphi_t(y_{t-1}, \ldots, y_{t-p}) + \varepsilon_t, \quad t \geq 1, \quad (y_0, y_{-1}, \ldots, y_{-p+1})' \in \mathbb{R}^p, \tag{2}$$
where $\{\varphi_t\}$ is a sequence of i.i.d. stochastic functions such that $\{(\varphi_t(\cdot), \varepsilon_t)\}$ is a sequence of i.i.d. random vectors, and $(\varphi_t(\cdot), \varepsilon_t)$ is independent of $\mathcal{F}_{t-1} = \sigma(y_{t-1}, y_{t-2}, \ldots, y_{1-p})$. Both linear and nonlinear autoregressive models are covered. This model can be rewritten in vector form as
$$X_t = \Phi_t(X_{t-1}) + \varepsilon_t U, \quad t \geq 1, \quad X_0 \in \mathbb{R}^p, \tag{3}$$
where $\Phi_t(X_{t-1}) = (\varphi_t(y_{t-1}, \ldots, y_{t-p}), y_{t-1}, \ldots, y_{t-p+1})'$, $X_t = (y_t, \ldots, y_{t-p+1})'$ and $U = (1, 0, \ldots, 0)'$. Under the above conditions, model (3) is a homogeneous Markov chain. It is easily seen that stationarity of $\{y_t, t \geq 1-p\}$ is equivalent to that of $\{X_t, t \geq 0\}$, and the geometric ergodicity of model (2) is equivalent to that of model (3).

Theorem 2.1. Suppose model (3) satisfies:

  • (i)

    There exist a norm $\|\cdot\|_v$ on the $p$-dimensional vector space and constants $0 < \rho < 1$, $0 < \delta < 1$ and $c \geq 0$ such that

    $$E\|\Phi_t(x)\|_v^\delta \leq \rho \|x\|_v^\delta + c, \quad \forall x \in \mathbb{R}^p; \tag{4}$$

  • (ii)

    The density function of $\varepsilon_t$ is continuous and positive everywhere, and $E|\varepsilon_t|^\delta < \infty$ for $\delta$ in (i).

    Then model (3) is geometrically ergodic, which implies that $\{y_t\}$ in model (2) is stationary and geometrically ergodic.

As a more concrete consequence of Theorem 2.1, we have the following corollary.

Corollary 2.2. Suppose model (2) satisfies:

  • (i)

    There exist a constant vector $\varphi = (\varphi_1, \ldots, \varphi_p)'$ and a constant $0 < \delta < 1$ satisfying

    $$1 - \varphi_1 z - \cdots - \varphi_p z^p \neq 0, \quad |z| \leq 1, \tag{5}$$
    such that
    $$\lim_{\|x\| \to \infty} \frac{E|\varphi_t(x) - \varphi' x|^\delta}{\|x\|} = 0 \tag{6}$$
    and, for any $K > 0$,
    $$\sup_{\|x\| \leq K} E|\varphi_t(x) - \varphi' x|^\delta < \infty. \tag{7}$$

  • (ii)

    The density function of $\varepsilon_t$ is continuous and positive everywhere, and $E|\varepsilon_t|^\delta < \infty$ for $\delta$ in (i).

    Then $\{y_t\}$ in model (2) is stationary and geometrically ergodic.

In Corollary 2.2, when $\varphi_t(x) = \phi_{t0} + (\phi_{t1}, \ldots, \phi_{tp})\, x$, $x \in \mathbb{R}^p$, we obtain the stationarity conditions for the GRCAR($p$) model as another corollary of our general result.

Corollary 2.3. Suppose model (1) satisfies:

  • (C.1)

    • (i)

      $1 - \phi_1 z - \cdots - \phi_p z^p \neq 0$ for $|z| \leq 1$;

    • (ii)

      $E|\phi_{ti} - \phi_i|^\delta < \infty$, $i = 1, 2, \ldots, p$, and $E|\phi_{t0}|^\delta < \infty$, for a constant $0 < \delta < 1$;

    • (iii)

      The density function of $\varepsilon_t$ is continuous and positive everywhere, and $E|\varepsilon_t|^\delta < \infty$ for $\delta$ in (ii).

      Then $\{y_t\}$ in model (1) is stationary and geometrically ergodic.

Remark 1. Theorem 2.1 establishes the geometric ergodicity of general stochastic functional autoregressive (linear and nonlinear) models with possibly heavy-tailed error terms under a mild moment condition. The stationarity conditions for the GRCAR($p$) model in Corollary 2.3 are a consequence of Theorem 2.1. The moment condition in Corollary 2.3 is very weak: we only require a finite moment of order $\delta$ ($0 < \delta < 1$) for the error $\varepsilon_t$, which allows, for example, the Cauchy distribution. The condition on the random coefficients keeps the model from straying too far from the linear AR model, a reasonable requirement for any (non-parametric or parametric) AR-type model.
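Condition (i) of Corollary 2.3 can be checked numerically: by the proof of Corollary 2.2, it is equivalent to all eigenvalues of the companion matrix of $(\phi_1, \ldots, \phi_p)$ having modulus less than one. A small sketch (Python; the coefficient values are taken from Step 5 of Section 4, with signs as printed there, purely as an illustration):

import numpy as np

def satisfies_C1_i(phi_mean):
    # Condition (i): 1 - phi_1 z - ... - phi_p z^p != 0 for |z| <= 1, equivalently
    # all eigenvalues of the companion matrix lie strictly inside the unit circle.
    p = len(phi_mean)
    A = np.zeros((p, p))
    A[0, :] = phi_mean                 # first row: (phi_1, ..., phi_p)
    if p > 1:
        A[1:, :-1] = np.eye(p - 1)     # subdiagonal identity block
    return np.max(np.abs(np.linalg.eigvals(A))) < 1

print(satisfies_C1_i([0.0059, 0.0214, 0.1608]))  # True; eigenvalue moduli are about 0.55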

Remark 2. We note that the GRCAR model is a quite broad class of models for time series data. A special case of this class can describe a conditionally heteroscedastic structure. For example, consider the model $y_t = (\phi_0 + \beta_0 \varepsilon_t) + (\phi_1 + \beta_1 \varepsilon_t) y_{t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0, \sigma^2)$. Its conditional mean and conditional variance are

$$E(y_t \mid \mathcal{F}_{t-1}) = \phi_0 + \phi_1 y_{t-1}, \quad \mathrm{Var}(y_t \mid \mathcal{F}_{t-1}) = \sigma^2 (1 + \beta_0 + \beta_1 y_{t-1})^2.$$
This very special case of the GRCAR model thus plays a role similar to an AR(1)–ARCH(1) model, but we do not need to restrict the parameters $\beta_0$ and $\beta_1$ to be non-negative. Furthermore, if the error distribution is changed to $\varepsilon_t \sim t_2$, the model becomes one with infinite variance.
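The conditional moment formulas above are easy to verify by simulation; a minimal sketch (Python; the parameter values are assumed, not from the article):

import numpy as np

# Check E(y_t | y_{t-1}) and Var(y_t | y_{t-1}) by drawing y_t many times
# with y_{t-1} held fixed.
rng = np.random.default_rng(1)
phi0, phi1, beta0, beta1, sigma = 0.0, 0.5, 0.3, 0.2, 1.0
y_prev = 2.0
e = rng.normal(0.0, sigma, size=200_000)
y = (phi0 + beta0 * e) + (phi1 + beta1 * e) * y_prev + e

print(y.mean(), phi0 + phi1 * y_prev)                        # both near 1.0
print(y.var(), sigma**2 * (1 + beta0 + beta1 * y_prev)**2)   # both near 2.89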

2.2 Conditional Self-weighted M-estimation

Denote $X_{t-1} = (1, y_{t-1}, \ldots, y_{t-p})'$. Then model (1) becomes $y_t = \phi_t' X_{t-1} + \varepsilon_t$, where $\phi_t = (\phi_{t0}, \phi_{t1}, \ldots, \phi_{tp})'$. Define the objective function
$$L_n(\phi) = \sum_{t=1}^n \omega_t \, \rho(y_t - \phi' X_{t-1}), \tag{8}$$
where $\phi = E(\phi_t) = (\phi_0, \phi_1, \ldots, \phi_p)'$, $\omega_t$ is a positive weight that is measurable with respect to $\mathcal{F}_{t-1} = \sigma(y_{t-1}, y_{t-2}, \ldots)$, and $\rho(\cdot)$ is a suitable non-negative convex function. The conditional self-weighted M-estimator $\hat{\phi}_{SM}$ of $\phi$ is defined by
$$\hat{\phi}_{SM} = \arg\min_{\phi \in \Theta} L_n(\phi),$$
where $\Theta \subset \mathbb{R}^{p+1}$ is the parameter space containing the true value $\phi_0$.
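Computationally, $\hat{\phi}_{SM}$ is a weighted convex M-estimation problem and can be obtained with a generic optimizer. A minimal sketch (Python/SciPy; the choice $\rho(x) = |x|$ and the weight $w_t = 1/(1 + y_{t-1}^2)$ are assumptions for illustration, not the article's prescription):

import numpy as np
from scipy.optimize import minimize

def sm_estimate(y, p, rho=np.abs):
    # Minimize L_n(phi) = sum_t w_t * rho(y_t - phi' X_{t-1}) over phi in R^{p+1}.
    n = len(y)
    # Rows X_{t-1} = (1, y_{t-1}, ..., y_{t-p}) for t = p+1, ..., n.
    X = np.column_stack([np.ones(n - p)] + [y[p - 1 - j:n - 1 - j] for j in range(p)])
    yt = y[p:]
    w = 1.0 / (1.0 + X[:, 1] ** 2)        # self-weights based on y_{t-1}
    obj = lambda phi: np.sum(w * rho(yt - X @ phi))
    return minimize(obj, x0=np.zeros(p + 1), method="Nelder-Mead").x

With rho=np.abs this gives the self-weighted LAD estimator of Remark 5 below; replacing rho by a Huber function gives the self-weighted Huber estimator.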

2.3 Asymptotic Normality of SM-estimation

To derive the asymptotic properties of $\hat{\phi}_{SM}$, we need the following assumptions:
  • (C.2)

    $\rho(\cdot)$ is a convex function on $\mathbb{R}$ with left derivative $\psi_-$ and right derivative $\psi_+$. Choose a function $\psi$ such that $\psi_- \leq \psi \leq \psi_+$.

  • (C.3)

    $G(t) := E\psi(\varepsilon_1 + t)$ exists, $G(t)$ has a derivative $\lambda > 0$ at $t = 0$, and $G(0) = 0$.

  • (C.4)

    $E\psi^2(\varepsilon_1) = \tau < \infty$ and $E(\psi(\varepsilon_1 + t) - \psi(\varepsilon_1))^2 \to 0$ as $t \to 0$.

  • (C.5)

    $\omega_t = g(y_{t-1}, \ldots, y_{t-p})$, where $g$ is a measurable and positive function on $\mathbb{R}^p$ such that $E[(\omega_t + \omega_t^2)(\|X_{t-1}\| + \|X_{t-1}\|^2)] < \infty$, with $\|v\|$ denoting the Euclidean norm of a vector $v$.

Theorem 2.4. Under (C.1)–(C.5), we have

$$\sqrt{n}\,(\hat{\phi}_{SM} - \phi_0) \stackrel{L}{\longrightarrow} N\!\left(0, \frac{\tau}{\lambda^2}\, \Sigma^{-1} \Omega \Sigma^{-1}\right), \tag{9}$$
where $\Sigma = E(\omega_t X_{t-1} X_{t-1}')$, $\Omega = E(\omega_t^2 X_{t-1} X_{t-1}')$ and $\stackrel{L}{\longrightarrow}$ denotes convergence in distribution.

Remark 3. Assumption (C.1) does not rule out the possibility that $\varepsilon_t$ has infinite variance, or even infinite $E|\varepsilon_t|$. Theorem 2.4 establishes the asymptotic distribution of SM-estimators of the parameters in GRCAR($p$) models with possibly heavy-tailed errors.

Remark 4. It is worth mentioning that Assumptions (C.2)–(C.4) are traditional assumptions for M-estimation in a linear model and can be found in many references, for example, Bai et al. (1992), Wu (2007) and Wang and Zhu (2018). Examples of $\rho(x)$ satisfying the assumptions include $\rho(x) = x^2$, $\rho(x) = |x|$ and $\rho(x) = \frac{1}{2}x^2 I(|x| \leq m) + (m|x| - \frac{1}{2}m^2) I(|x| > m)$, which correspond to the conditional self-weighted least-squares estimator, the conditional self-weighted least absolute deviation estimator and the conditional self-weighted Huber estimator respectively. Assumption (C.5) is a standard condition on the weight $\omega_t$ for the self-weighted method in infinite-variance AR models; it allows $E y_t^2$ to be infinite provided the weight function $\omega_t$ is chosen properly. First, the purpose of the weight $\omega_t$ is to downweight the leverage points in $X_{t-1}$ so that the covariance matrices $\Omega$ and $\Sigma$ in Theorem 2.4 are finite. Second, the weights allow us to approximate $L_n(\phi)$ by a quadratic form. In addition, Theorem 2.4 generalizes the results of Ling (2005), Wang and Hu (2017) and Fu et al. (2021).

Remark 5. For the case $\rho(x) = x^2$ and $E\varepsilon_1^2 = \sigma^2 < \infty$, take $\psi(x) = 2x$ and $\lambda = 2$. Applying Theorem 2.4, we have

$$\sqrt{n}\,(\hat{\phi}_{SM} - \phi_0) \stackrel{L}{\longrightarrow} N(0, \sigma^2 \Sigma^{-1} \Omega \Sigma^{-1}), \tag{10}$$
which gives the asymptotic distribution of the conditional self-weighted least-squares estimator of the parameters in a GRCAR($p$) model with finite variance.

For the case $\rho(x) = |x|$ and $E\varepsilon_1^2 = \infty$, take $\psi(x) = \mathrm{sign}(x)$. Suppose that the errors $\varepsilon_t$ have zero median and a density $f(x)$ satisfying $\sup_{x \in \mathbb{R}} |f(x)| < \infty$. Then $\lambda = 2f(0)$ and $\tau = 1$. Using Theorem 2.4, we have
$$\sqrt{n}\,(\hat{\phi}_{SM} - \phi_0) \stackrel{L}{\longrightarrow} N\!\left(0, \frac{1}{4 f(0)^2}\, \Sigma^{-1} \Omega \Sigma^{-1}\right), \tag{11}$$
which gives the asymptotic distribution of the conditional self-weighted least absolute deviation estimator for a GRCAR($p$) model with infinite variance.

For the case $\rho(x) = \frac{1}{2}x^2 I(|x| \leq m) + (m|x| - \frac{1}{2}m^2) I(|x| > m)$, take $\psi(x) = -m\, I(x < -m) + x\, I(|x| \leq m) + m\, I(x > m)$, and
$$\lambda = \int_{-m}^{m} dF(x), \qquad \tau = m^2 - \int_{-m}^{m} (m^2 - x^2)\, dF(x). \tag{12}$$
Using Theorem 2.4, we have
$$\sqrt{n}\,(\hat{\phi}_{SM} - \phi_0) \stackrel{L}{\longrightarrow} N\!\left(0, \frac{\tau}{\lambda^2}\, \Sigma^{-1} \Omega \Sigma^{-1}\right), \tag{13}$$
which covers the conditional self-weighted Huber estimator for a GRCAR($p$) model with finite or infinite variance.
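For the Huber case, $\lambda$ and $\tau$ in (12) involve only the error distribution $F$; under an assumed error law they can be evaluated numerically and plugged into the sandwich covariance of Theorem 2.4. A sketch (Python; the $t_2$ error law here is an assumed example, not prescribed by the article):

import numpy as np
from scipy import stats
from scipy.integrate import quad

m = 1.5                               # Huber tuning constant
F = stats.t(df=2)                     # assumed error law with infinite variance

lam = F.cdf(m) - F.cdf(-m)            # lambda = integral of dF over [-m, m]
tau = m**2 - quad(lambda x: (m**2 - x**2) * F.pdf(x), -m, m)[0]   # tau from (12)

def sandwich_cov(Sigma_hat, Omega_hat):
    # (tau/lambda^2) * Sigma^{-1} Omega Sigma^{-1}, the covariance in (13)
    Si = np.linalg.inv(Sigma_hat)
    return (tau / lam**2) * Si @ Omega_hat @ Si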

3 SIMULATION STUDIES

We conduct simulation studies through Monte Carlo experiments to assess the finite-sample performance of the proposed estimator, focusing on its accuracy and sampling distribution. The results show that our method performs well.

Data are generated from the following GRCAR($p$) models:

Model A: $y_t = (\phi_0 + 0.1\varepsilon_t) + (\phi_1 + 0.1\varepsilon_t) y_{t-1} + \varepsilon_t$. Consider $\varepsilon_t \sim N(0,1)$ and $\varepsilon_t \sim t_2$.

Model B: $y_t = (\phi_0 + 0.1 u_t) + (\phi_1 + 0.1 u_t) y_{t-1} + \varepsilon_t$, where $u_t \sim N(0,1)$. Consider $\varepsilon_t \sim N(0,1)$, $\varepsilon_t \sim t_2$ and $\varepsilon_t \sim \mathrm{Cauchy}(0,1)$.

The true values of the parameters are $(\phi_0, \phi_1) = (0, 0.5)$ in model A and $(\phi_0, \phi_1) = (0, 0.5)$ in model B. In model A the random coefficients are correlated with the error process, whereas in model B they are independent of it. The three error distributions cover the cases of interest: $N(0,1)$ has finite expectation and finite variance; $t_2$ has finite expectation but infinite variance; $\mathrm{Cauchy}(0,1)$ only has finite moments of order $\delta$ ($0 < \delta < 1$). We set the sample sizes $n = 200$ and $n = 400$. The number of replications is 2000.

Tables I and II list the biases, standard deviations (SDs) and asymptotic standard deviations (ADs) of the conditional self-weighted least absolute deviation estimator ($SM_1$) and the conditional self-weighted Huber estimator ($SM_2$) with the following choices of weight function:
$$w_{1t} = \begin{cases} 1, & |y_{t-1}| \leq K, \\ K^3/|y_{t-1}|^3, & |y_{t-1}| > K, \end{cases} \qquad w_{2t} = I\{|y_{t-1}| \leq K\}, \qquad w_{3t} = \frac{1}{1 + y_{t-1}^2}, \qquad w_{4t} = \frac{1}{(1 + |y_{t-1}|)^2},$$
where $K$ is the 0.9 quantile of $|y_1|, \ldots, |y_n|$. The weight $w_{1t}$ is similar to Ling (2005); the weights $w_{2t}$, $w_{3t}$ and $w_{4t}$ were considered by Yang and Ling (2017). The tuning parameter of the Huber estimator is taken as $m = 1.5$. We define
$$\hat{\Sigma} = \frac{1}{n}\sum_{t=1}^n \omega_t X_{t-1} X_{t-1}', \quad \hat{\Omega} = \frac{1}{n}\sum_{t=1}^n \omega_t^2 X_{t-1} X_{t-1}', \quad \hat{\tau} = \frac{1}{n}\sum_{t=1}^n \psi^2(\hat{\varepsilon}_t), \quad \hat{G}(r) = \frac{1}{n}\sum_{t=1}^n \psi(\hat{\varepsilon}_t + r), \tag{14}$$
where $\hat{\lambda}$ is the derivative of $\hat{G}(r)$ at $r = 0$ and $\{\hat{\varepsilon}_t\}$ is the sequence of residuals of the fitted GRCAR($p$) model. The ADs are calculated by (10)–(14). We estimate $f(0)$ by
$$\hat{f}_n(0) = \frac{1}{\hat{\sigma}_\omega b_n n} \sum_{t=1}^n \omega_t K\!\left(\frac{y_t - \hat{\phi}_n' X_{t-1}}{b_n}\right),$$
where $\hat{\sigma}_\omega = \frac{1}{n}\sum_{t=1}^n \omega_t$, $K(x) = e^x/(1 + e^x)^2$ and $b_n = 1.06 \times n^{-1/5}$. For the choice of the optimal bandwidth and its motivation, we refer to Silverman (1986, p. 40) and Pan et al. (2007). Tables I and II show that all the biases are very small and the SDs and ADs are close to each other, whether $E\varepsilon_t^2$ is finite or infinite and whether the random coefficients are correlated with or independent of the error process. All the biases, SDs and ADs become smaller when $n$ increases from 200 to 400, the $SM_2$ estimators perform better than the $SM_1$ estimators, and the estimators based on $w_{1t}$ are more efficient than the others.
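The weight functions and the plug-in quantities in (14) are straightforward to compute from the data; a sketch of this bookkeeping (Python; it assumes residuals and a $\psi$ function from a previously fitted estimate, and a numerically obtained $\hat{\lambda}$):

import numpy as np

def weights(y_lag, kind, K):
    # Self-weights w_1t, ..., w_4t as functions of y_{t-1}
    a = np.abs(y_lag)
    if kind == 1:
        return np.minimum(1.0, K**3 / np.maximum(a, 1e-12)**3)
    if kind == 2:
        return (a <= K).astype(float)
    if kind == 3:
        return 1.0 / (1.0 + y_lag**2)
    return 1.0 / (1.0 + a) ** 2

def asymptotic_sd(X, w, psi_resid, lam):
    # ADs from (14): sqrt of the diagonal of (tau_hat/lam^2) Sigma^{-1} Omega Sigma^{-1} / n,
    # where X stacks the rows X_{t-1} and psi_resid = psi(eps_hat_t).
    n = len(w)
    Sigma = (X * w[:, None]).T @ X / n
    Omega = (X * (w**2)[:, None]).T @ X / n
    tau = np.mean(psi_resid**2)
    Si = np.linalg.inv(Sigma)
    return np.sqrt(np.diag((tau / lam**2) * Si @ Omega @ Si / n))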
Table I. Bias, SDs and ADs of the SM estimators for model A

                        ε_t ~ N(0,1)                       ε_t ~ t_2
                        ϕ̂0(SM1) ϕ̂0(SM2) ϕ̂1(SM1) ϕ̂1(SM2)  ϕ̂0(SM1) ϕ̂0(SM2) ϕ̂1(SM1) ϕ̂1(SM2)
w_1t  n=200  Bias       0.000   0.000   0.006   0.005     0.001   0.001   0.004   0.005
             SD         0.103   0.084   0.093   0.077     0.123   0.118   0.066   0.065
             AD         0.116   0.075   0.103   0.067     0.137   0.077   0.076   0.043
      n=400  Bias       0.001   0.001   0.003   0.002     0.001   0.000   0.002   0.003
             SD         0.072   0.059   0.065   0.053     0.086   0.082   0.044   0.043
             AD         0.079   0.053   0.070   0.047     0.093   0.054   0.051   0.030
w_2t  n=200  Bias       0.001   0.001   0.002   0.000     0.000   0.002   0.008   0.007
             SD         0.111   0.090   0.090   0.111     0.132   0.125   0.105   0.099
             AD         0.125   0.081   0.151   0.098     0.114   0.082   0.076   0.064
      n=400  Bias       0.002   0.002   0.001   0.001     0.002   0.002   0.003   0.002
             SD         0.078   0.064   0.064   0.076     0.092   0.086   0.044   0.065
             AD         0.086   0.057   0.101   0.068     0.076   0.057   0.051   0.044
w_3t  n=200  Bias       0.001   0.001   0.003   0.004     0.000   0.001   0.004   0.005
             SD         0.110   0.089   0.095   0.078     0.141   0.134   0.078   0.075
             AD         0.126   0.081   0.107   0.069     0.162   0.090   0.090   0.050
      n=400  Bias       0.000   0.001   0.002   0.002     0.001   0.000   0.002   0.002
             SD         0.078   0.064   0.067   0.055     0.099   0.093   0.053   0.052
             AD         0.086   0.057   0.073   0.049     0.110   0.063   0.061   0.035
w_4t  n=200  Bias       0.002   0.001   0.004   0.004     0.001   0.002   0.004   0.005
             SD         0.118   0.096   0.094   0.078     0.152   0.144   0.072   0.070
             AD         0.135   0.087   0.106   0.069     0.174   0.096   0.082   0.046
      n=400  Bias       0.000   0.001   0.002   0.002     0.001   0.000   0.002   0.002
             SD         0.083   0.068   0.066   0.054     0.106   0.100   0.048   0.048
             AD         0.093   0.061   0.073   0.048     0.119   0.068   0.056   0.032
Table II. Bias, SDs and ADs of the SM estimators for model B

                        ε_t ~ N(0,1)                       ε_t ~ t_2                          ε_t ~ Cauchy(0,1)
                        ϕ̂0(SM1) ϕ̂0(SM2) ϕ̂1(SM1) ϕ̂1(SM2)  ϕ̂0(SM1) ϕ̂0(SM2) ϕ̂1(SM1) ϕ̂1(SM2)  ϕ̂0(SM1) ϕ̂0(SM2) ϕ̂1(SM1) ϕ̂1(SM2)
w_1t  n=200  Bias       0.001   0.001   0.006   0.006     0.001   0.000   0.005   0.004     0.000   0.000   0.003   0.003
             SD         0.095   0.075   0.092   0.076     0.110   0.106   0.068   0.064     0.127   0.133   0.045   0.044
             AD         0.109   0.075   0.106   0.073     0.131   0.076   0.079   0.046     0.156   0.077   0.045   0.022
      n=400  Bias       0.000   0.001   0.003   0.003     0.000   0.001   0.002   0.002     0.002   0.002   0.002   0.002
             SD         0.066   0.053   0.065   0.053     0.078   0.075   0.048   0.045     0.090   0.095   0.031   0.031
             AD         0.074   0.053   0.072   0.051     0.089   0.054   0.054   0.032     0.106   0.055   0.031   0.015
w_2t  n=200  Bias       0.002   0.000   0.006   0.006     0.002   0.000   0.005   0.004     0.001   0.001   0.005   0.005
             SD         0.102   0.081   0.136   0.110     0.117   0.112   0.097   0.093     0.133   0.138   0.064   0.064
             AD         0.118   0.081   0.155   0.107     0.139   0.081   0.117   0.069     0.161   0.081   0.070   0.035
      n=400  Bias       0.001   0.001   0.005   0.004     0.000   0.001   0.003   0.003     0.002   0.002   0.003   0.003
             SD         0.071   0.058   0.095   0.076     0.083   0.079   0.070   0.067     0.093   0.100   0.044   0.045
             AD         0.080   0.057   0.105   0.074     0.094   0.057   0.080   0.048     0.110   0.057   0.048   0.024
w_3t  n=200  Bias       0.000   0.001   0.006   0.005     0.002   0.001   0.004   0.002     0.000   0.000   0.004   0.005
             SD         0.101   0.079   0.094   0.077     0.125   0.121   0.079   0.073     0.164   0.175   0.076   0.073
             AD         0.116   0.079   0.107   0.074     0.150   0.087   0.089   0.052     0.200   0.102   0.072   0.036
      n=400  Bias       0.001   0.001   0.002   0.002     0.003   0.001   0.001   0.001     0.000   0.000   0.002   0.002
             SD         0.070   0.057   0.065   0.053     0.090   0.086   0.055   0.051     0.115   0.124   0.051   0.050
             AD         0.079   0.056   0.073   0.052     0.102   0.062   0.061   0.037     0.136   0.072   0.049   0.026
w_4t  n=200  Bias       0.000   0.001   0.006   0.005     0.003   0.001   0.003   0.002     0.001   0.001   0.004   0.005
             SD         0.108   0.085   0.094   0.077     0.136   0.131   0.074   0.068     0.176   0.188   0.066   0.062
             AD         0.125   0.085   0.107   0.074     0.161   0.094   0.082   0.048     0.215   0.109   0.060   0.030
      n=400  Bias       0.001   0.001   0.002   0.002     0.004   0.002   0.001   0.001     0.001   0.001   0.002   0.002
             SD         0.075   0.061   0.065   0.053     0.097   0.092   0.051   0.048     0.125   0.134   0.044   0.043
             AD         0.085   0.060   0.073   0.052     0.110   0.067   0.056   0.034     0.146   0.077   0.041   0.021

To get an overall view of the sampling distributions of the $SM_1$ and $SM_2$ estimators, we simulate 2000 replications for the case $\phi_1 = 0.5$ and $n = 400$ with error distributions $N(0,1)$ and $t_2$ for model A, and for the case $\phi_1 = 0.5$ and $n = 400$ with error distributions $N(0,1)$, $t_2$ and Cauchy for model B. Denote $N_{n1} = \sqrt{n}(\hat{\phi}_1^{SM_1} - 0.5)/\hat{\sigma}_{SM_1}$ and $N_{n2} = \sqrt{n}(\hat{\phi}_1^{SM_2} - 0.5)/\hat{\sigma}_{SM_2}$ when the error distribution is $N(0,1)$, with $N_{t1}$, $N_{t2}$ and $N_{c1}$, $N_{c2}$ defined analogously for the $t_2$ and Cauchy error distributions, where $\hat{\sigma}_{SM_1}$ and $\hat{\sigma}_{SM_2}$ are the SDs of $\hat{\phi}_1^{SM_1}$ and $\hat{\phi}_1^{SM_2}$ respectively. Figure 1 shows the density curves for model A, and Figure 2 those for model B. We can see that the density of $N(0,1)$ is approximated reasonably well by those of $N_{n1}$, $N_{n2}$, $N_{t1}$, $N_{t2}$, $N_{c1}$ and $N_{c2}$ in both models.

Figure 1. The sampling distribution for model A.
Figure 2. The sampling distribution for model B.

In conclusion, the numerical results show that the conditional self-weighted M-estimators perform well in finite samples, whether the variance is finite or infinite.

4 REAL DATA ANALYSIS

The proposed methodology is applied to a real dataset. We consider the Hang Seng Index (HSI), the most influential index of the Hong Kong stock market and one of the most important indices in the Asian financial markets; it has been extensively investigated in the literature. Our dataset consists of the daily Hang Seng closing index from 7 May 2020 to 31 December 2021, downloaded from https://cn.investing.com/. There are 412 available observations in total, denoted by $x_1, x_2, \ldots, x_{412}$. The first 392 observations are used as the training sample to build the model, and the remaining 20 as the test sample to evaluate it. We take the following steps to analyze this dataset with the GRCAR model and the method proposed in this article.
  •  

Step 1. Data transformation: The time plot of $\{x_t\}$ is shown in Figure 3(a). The series is clearly non-stationary, as its level varies over time. To obtain a stationary series, let $y_t = 100 \log(x_t / x_{t-1})$. The sample path of $\{y_t\}$, shown in Figure 3(b), indicates that $\{y_t\}$ is close to stationary.

  •  

Step 2. Model identification: The plots of the sample autocorrelation function (ACF) and sample partial autocorrelation function (PACF) of $\{y_t\}$ are presented in Figure 3(c) and (d) respectively; they provide important information for a tentative identification of the order of a stable AR model. Based on the sample ACF and PACF plots, it is reasonable to fit an AR(3) autocorrelation structure to $\{y_t\}$. Since stock data are affected by various factors, the coefficients of the autoregressive model may change randomly over time and may even be correlated with the error, so we try to fit a GRCAR(3) model instead of an AR(3). We realize, however, that the data may be heavy-tailed, and how to determine the autoregressive order for a time series with infinite variance is a problem that needs further study; here we simply use the sample PACF as a rough indication of the order at which a GRCAR model might be fitted.

  •  

Step 3. Heavy-tail test: To test whether $\{y_t\}$ has a heavy-tailed distribution, we use Hill's estimator (see Drees et al. (2000) and Resnick (2000)) to estimate the tail indices of $y_t$. Let $y_{(1)} > y_{(2)} > \cdots > y_{(n)}$ be the order statistics of $y_t$, $t = 1, \ldots, n$. The estimators of the right-tail and left-tail indices are defined as

$$H_{1k} = \left[\frac{1}{k}\sum_{i=1}^{k} \log\frac{y_{(i)}}{y_{(k+1)}}\right]^{-1}, \qquad H_{2k} = \left[\frac{1}{k}\sum_{i=1}^{k} \log\frac{y_{(n-i+1)}}{y_{(n-k)}}\right]^{-1},$$
respectively (a computational sketch of this step and of Step 1 is given after these steps). Figure 4 displays the Hill estimates of the right-tail and left-tail indices for $1 \leq k \leq 200$. From Figure 4, both tail indices are most likely less than 2, so $y_t$ should be heavy-tailed. It is therefore more appropriate to assume that these data are generated by a process with infinite variance than one with finite variance.

Based on the above discussion, we fit a GRCAR(3) model to the data:

$$y_t = (\phi_0 + \varepsilon_t) + (\phi_1 + \varepsilon_t) y_{t-1} + (\phi_2 + \varepsilon_t) y_{t-2} + (\phi_3 + \varepsilon_t) y_{t-3} + \varepsilon_t, \quad \varepsilon_t \sim t_2. \tag{15}$$

  •  

Step 4. Parameter estimation: The unknown parameters are estimated by different methods on the training data. We calculated the mean absolute errors (MAEs) of the predicted values for the transformed data based on 1000 repetitions; the results are shown in Table III. The self-weighted estimators perform better, especially $SM_2$, so we choose $SM_2$ for further analysis. The estimates are

$$(\hat{\phi}_0, \hat{\phi}_1, \hat{\phi}_2, \hat{\phi}_3) = (0.0493, 0.0059, 0.0214, 0.1608),$$
with corresponding asymptotic standard deviations $0.0028$, $0.0019$, $0.0018$ and $0.0018$ respectively.

  •  

Step 5. Model diagnostics: The absolute values of the eigenvalues of the companion matrix of model (15) are 0.5549, 0.5383 and 0.5383, all less than one, so the model satisfies the stationarity conditions of Corollary 2.3. Figure 5 presents the residuals of the fitted model (15), the normal Q–Q plot of the residuals, the sample autocorrelation function (ACF) of the residuals and the sample ACF of the squared residuals. These plots indicate that model (15) fits the data reasonably well.

  •  

Step 6. Prediction: We use model (15) to predict $y_{393}, y_{394}, \ldots, y_{412}$ in the test set. As the coefficients of model (15) are random, the predicted values of the Hang Seng Index from 3 December 2021 to 31 December 2021, $x_{393}, x_{394}, \ldots, x_{412}$, were calculated by averaging over 1000 repetitions. We compare our GRCAR(3) model with the AR(3) and AR(3)–ARCH(2) models. The predictive performance of the different models is presented in Figure 6: the predicted values from model (15) capture the trend of the real series, and most of them are very close to the true values. Figure 6 and Table IV also show that the GRCAR(3) model outperforms the AR(3) and AR(3)–ARCH(2) models on this dataset.
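The transformation in Step 1 and the Hill estimates in Step 3 are easy to reproduce; a sketch (Python; the file name is a placeholder, and the data must be obtained from the source cited above; the left-tail index is computed here by negating the series, which matches the intent of the $H_{2k}$ formula):

import numpy as np

x = np.loadtxt("hsi_close.txt")                 # hypothetical file of daily HSI closes
y = 100 * np.log(x[1:] / x[:-1])                # Step 1: y_t = 100*log(x_t / x_{t-1})

def hill(z, k):
    # Hill estimate of the tail index from the k largest observations of z
    s = np.sort(z)[::-1]                        # z_(1) > z_(2) > ... > z_(n)
    return 1.0 / np.mean(np.log(s[:k] / s[k]))  # s[k] is z_(k+1)

ks = np.arange(10, 201)
H1 = np.array([hill(y, k) for k in ks])         # right-tail index H_1k
H2 = np.array([hill(-y, k) for k in ks])        # left-tail index H_2k, plotted vs k as in Figure 4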

Table III. Mean absolute errors (MAEs) of predicted values on the training set for the transformed data

Methods   SM_1     SM_2     LAD      LS
MAE       0.9691   0.9529   0.9800   0.9996

Table IV. Some indicators of predictive accuracy on the test set for the original data

            Mean of residual   SD of residual   MAE
AR(3)       32.60              282.93           200.98
GRCAR(3)    8.07               279.34           176.75
Figure 3. (a) Sample path of the data $x_t$, (b) sample path of the data $y_t$, (c) sample ACF of $y_t$, (d) sample PACF of $y_t$.
Figure 4. Hill estimates of the right-tail index $H_{1k}$ (black) and the left-tail index $H_{2k}$ (blue).
Figure 5. (a) The residuals from the fitted GRCAR(3) model, (b) the normal Q–Q plot of the residuals, (c) the sample ACF of the residuals and (d) the sample ACF of the squared residuals.
Figure 6. The predictive performance of the different models.

In summary, our model and method perform well in analysis and forecasting of time series data with heavy-tailed distributions.

5 PROOFS OF THEORETICAL RESULTS

This section presents the proofs of our theoretical results.

Proof of Theorem 2.1. It is easy to see that $\{X_t\}$ defined by (3) is a homogeneous Markov chain. By the conditions of Theorem 2.1, this Markov chain is $\mu_p$-irreducible and aperiodic, where $\mu_p$ denotes Lebesgue measure on $\mathbb{R}^p$, and all bounded sets with positive $\mu_p$-measure are small sets. Take the test function $g(x) = \|x\|_v^\delta$. Using the triangle inequality and $(a+b)^\delta \leq a^\delta + b^\delta$ for $0 < \delta < 1$, it holds that

$$E[g(X_t) \mid X_{t-1} = x] = E g(\Phi_t(x) + \varepsilon_t U) = E\|\Phi_t(x) + \varepsilon_t U\|_v^\delta \leq E\|\Phi_t(x)\|_v^\delta + E\|\varepsilon_t U\|_v^\delta \leq \rho\|x\|_v^\delta + c + (E|\varepsilon_t|^\delta)\|U\|_v^\delta = \rho\|x\|_v^\delta + c + c' = \alpha\|x\|_v^\delta - \big[(\alpha - \rho)\|x\|_v^\delta - c - c'\big],$$
where $0 < \rho < \alpha < 1$ and $c' = (E|\varepsilon_t|^\delta)\|U\|_v^\delta$. Let $C = \{x : \|x\|_v^\delta \leq k\}$ with $k > \max\{1, (c + c')/(\alpha - \rho)\}$; then $C$ is a small set, and
$$E[g(X_t) \mid X_{t-1} = x] \leq \alpha\|x\|_v^\delta - c_1, \ x \notin C; \qquad E[g(X_t) \mid X_{t-1} = x] \leq c_2, \ x \in C,$$
where $c_1 = (\alpha - \rho)k - c - c'$ and $c_2 = \rho k + c + c'$. By the Lyapunov drift criterion (see Meyn and Tweedie (1994)), model (3) is geometrically ergodic, which implies that $\{y_t\}$ in model (2) is stationary and geometrically ergodic.

Proof of Corollary 2.2. We only need to verify condition (i) of Theorem 2.1. Denote

$$A = \begin{pmatrix} \varphi_1 & \varphi_2 & \cdots & \varphi_{p-1} & \varphi_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$
Condition (5) of Corollary 2.2 is equivalent to $\lambda^p - \varphi_1 \lambda^{p-1} - \cdots - \varphi_p \neq 0$ for $|\lambda| \geq 1$, which implies that the roots $\lambda_1, \ldots, \lambda_p$ of $|\lambda I - A| = 0$ satisfy $|\lambda_i| < 1$, $i = 1, 2, \ldots, p$. Put $\rho = \max\{|\lambda_i|, i = 1, 2, \ldots, p\}$; then $0 < \rho < 1$, and there exists a positive definite matrix $V \in \mathbb{R}^{p \times p}$ such that $A'VA \leq \rho^2 V$ (see Ciarlet (1982)). Furthermore,
$$x' A' V A x \leq \rho^2 x' V x, \quad \forall x \in \mathbb{R}^p.$$
Define a norm $\|\cdot\|_v$ by
$$\|x\|_v = (x' V x)^{1/2}, \quad x \in \mathbb{R}^p.$$
Then
$$\|Ax\|_v \leq \rho \|x\|_v, \quad \forall x \in \mathbb{R}^p.$$
Let $H_t(x) = \Phi_t(x) - Ax = (\varphi_t(x) - \varphi' x, 0, \ldots, 0)'$, where $\varphi = (\varphi_1, \ldots, \varphi_p)'$. By the equivalence of norms, there is a positive constant $M$ such that
$$\frac{E\|H_t(x)\|_v^\delta}{\|x\|_v^\delta} \leq M \, \frac{E\|H_t(x)\|^\delta}{\|x\|^\delta} = M \, \frac{E|\varphi_t(x) - \varphi' x|^\delta}{\|x\|^\delta}. \tag{16}$$
By (6), as $\|x\| \to \infty$ the right-hand side of the above inequality tends to 0. Hence there exists a constant $K_0$ such that
$$E\|H_t(x)\|_v^\delta \leq \tfrac{1}{2}(1 - \rho^\delta)\|x\|_v^\delta$$
for $\|x\|_v^\delta > K_0$. Therefore,
$$E\|\Phi_t(x)\|_v^\delta = E\|Ax + H_t(x)\|_v^\delta \leq \|Ax\|_v^\delta + E\|H_t(x)\|_v^\delta \leq \rho^\delta\|x\|_v^\delta + E\|H_t(x)\|_v^\delta \leq \begin{cases} \tfrac{1}{2}(1 + \rho^\delta)\|x\|_v^\delta, & \|x\|_v^\delta \geq K_0, \\ \rho^\delta\|x\|_v^\delta + M_1, & \|x\|_v^\delta < K_0, \end{cases}$$
since, by (7) and (16), when $\|x\|_v^\delta < K_0$ there exists a constant $M_1 \geq 0$ such that $E\|H_t(x)\|_v^\delta \leq M_1$. This shows that condition (i) of Theorem 2.1 holds, and the corollary follows from Theorem 2.1.

Proof of Corollary 2.3. We only need to verify condition (i) of Corollary 2.2.

Define

$$\varphi_t(x) = \phi_{t0} + (\phi_{t1}, \ldots, \phi_{tp})\, x, \quad x \in \mathbb{R}^p.$$
Then
$$|\varphi_t(x) - \varphi' x|^\delta = |\phi_{t0} + (\phi_{t1} - \phi_1, \ldots, \phi_{tp} - \phi_p)\, x|^\delta \leq |\phi_{t0}|^\delta + |(\phi_{t1} - \phi_1, \ldots, \phi_{tp} - \phi_p)\, x|^\delta.$$
Therefore
$$\lim_{\|x\|\to\infty} \frac{E|\varphi_t(x) - \varphi' x|^\delta}{\|x\|} \leq \lim_{\|x\|\to\infty} \frac{E|\phi_{t0}|^\delta}{\|x\|} + \lim_{\|x\|\to\infty} \frac{E|(\phi_{t1} - \phi_1, \ldots, \phi_{tp} - \phi_p)\, x|^\delta}{\|x\|} \leq \lim_{\|x\|\to\infty} \frac{E|\phi_{t0}|^\delta}{\|x\|} + \lim_{\|x\|\to\infty} \frac{E\|(\phi_{t1} - \phi_1, \ldots, \phi_{tp} - \phi_p)\|^\delta\, \|x\|^\delta}{\|x\|} \leq \lim_{\|x\|\to\infty} \frac{E|\phi_{t0}|^\delta}{\|x\|} + \lim_{\|x\|\to\infty} \frac{E\big(\sum_{i=1}^p |\phi_{ti} - \phi_i|\big)^\delta\, \|x\|^\delta}{\|x\|} \leq \lim_{\|x\|\to\infty} \frac{E|\phi_{t0}|^\delta}{\|x\|} + \lim_{\|x\|\to\infty} \frac{\sum_{i=1}^p E|\phi_{ti} - \phi_i|^\delta}{\|x\|^{1-\delta}} = 0,$$
where the second inequality uses the Cauchy–Schwarz inequality, the third uses the triangle inequality, and the last equality holds by condition (ii) of Corollary 2.3. Also, for any $K > 0$,
$$\sup_{\|x\| \leq K} E|\varphi_t(x) - \varphi' x|^\delta \leq E|\phi_{t0}|^\delta + K^\delta \sum_{i=1}^p E|\phi_{ti} - \phi_i|^\delta < \infty.$$
This shows that condition (i) of Corollary 2.2 holds. By Corollary 2.2, $\{y_t\}$ in model (1) is stationary and geometrically ergodic.

In the following, we give two lemmas that will be used frequently in the proof of Theorem 2.4. The first lemma is taken directly from Davis et al. (1992).

Lemma 5.1. Let $V_n(\cdot)$ and $V(\cdot)$ be stochastic processes on $\mathbb{R}^{p+1}$ and suppose that $V_n(\cdot) \stackrel{L}{\longrightarrow} V(\cdot)$ on $C(\mathbb{R}^{p+1})$. Let $\xi_n$ minimize $V_n(\cdot)$ and $\xi$ minimize $V(\cdot)$. If $V_n(\cdot)$ is convex for each $n$ and $\xi$ is unique with probability one, then $\xi_n \stackrel{L}{\longrightarrow} \xi$ on $\mathbb{R}^{p+1}$.

Proof. See Davis et al. (1992).

Lemma 5.2. Under conditions (C.1)–(C.5), we have, as $n \to \infty$:

  • (a)

    $\frac{1}{n}\sum_{t=1}^n \omega_t X_{t-1} X_{t-1}' \stackrel{p}{\longrightarrow} \Sigma$ and $\frac{1}{n}\sum_{t=1}^n \omega_t^2 X_{t-1} X_{t-1}' \stackrel{p}{\longrightarrow} \Omega$;

  • (b)

    for any fixed $(p+1) \times 1$ vector $C$ such that $C' \Omega C > 0$, $\max_{1 \leq t \leq n} |\omega_t C' X_{t-1}| / \sqrt{n} \stackrel{p}{\longrightarrow} 0$;

  • (c)

    $\frac{1}{\sqrt{n}} \sum_{t=1}^n \omega_t X_{t-1} \psi(\varepsilon_t) \stackrel{L}{\longrightarrow} N(0, \tau \Omega)$.

Proof. By condition (C.1), $\{y_t\}$ is stationary and ergodic, from which (a) and (b) follow easily; we omit their proofs and only prove (c). Put $\varsigma_{nt} = \frac{1}{\sqrt{n}}\, \omega_t C' X_{t-1} \psi(\varepsilon_t)$. Then

$$\sum_{t=1}^n \varsigma_{nt} = C'\, \frac{1}{\sqrt{n}} \sum_{t=1}^n \omega_t X_{t-1} \psi(\varepsilon_t),$$
and $\{\varsigma_{nt}, 1 \leq t \leq n\}$ is a sequence of martingale differences with respect to $\mathcal{F}_{t-1}$. By (a), it follows that
$$\sum_{t=1}^n E\Big(\tfrac{1}{n}\omega_t^2 C' X_{t-1} X_{t-1}' C\, \psi^2(\varepsilon_t) \,\Big|\, \mathcal{F}_{t-1}\Big) = \frac{1}{n}\sum_{t=1}^n \big(\omega_t^2 C' X_{t-1} X_{t-1}' C\big)\, E\psi^2(\varepsilon_t) = \tau\, \frac{1}{n}\sum_{t=1}^n \omega_t^2 C' X_{t-1} X_{t-1}' C \stackrel{p}{\longrightarrow} \tau \upsilon, \tag{17}$$
where $\upsilon = C' \Omega C$. Put $\xi_t = \omega_t C' X_{t-1}$. Then, for any $\eta > 0$,
$$\sum_{t=1}^n E\big(\varsigma_{nt}^2 I(|\varsigma_{nt}| > \eta) \,\big|\, \mathcal{F}_{t-1}\big) = \frac{1}{n}\sum_{t=1}^n \xi_t^2\, E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)| > \eta\sqrt{n}) \,\big|\, \mathcal{F}_{t-1}\big) \leq \max_{1\leq t\leq n} E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)| > \eta\sqrt{n}) \,\big|\, \mathcal{F}_{t-1}\big) \cdot \frac{1}{n}\sum_{t=1}^n \xi_t^2. \tag{18}$$
Notice that
$$I(|\xi_t \psi(\varepsilon_t)| > \eta\sqrt{n}) \leq I(|\psi(\varepsilon_t)| > \eta M) + I(|\xi_t|/\sqrt{n} > 1/M)$$
for any fixed $M > 0$. This implies that, for $1 \leq t \leq n$,
$$E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)|/\sqrt{n} > \eta) \,\big|\, \mathcal{F}_{t-1}\big) \leq E\psi^2(\varepsilon_t) I(|\psi(\varepsilon_t)| > \eta M) + E\big(\psi^2(\varepsilon_t) I(|\xi_t|/\sqrt{n} > 1/M) \,\big|\, \mathcal{F}_{t-1}\big) \leq E\psi^2(\varepsilon_1) I(|\psi(\varepsilon_1)| > \eta M) + \tau \max_{1\leq t\leq n} I(|\xi_t|/\sqrt{n} > 1/M) \leq E\psi^2(\varepsilon_1) I(|\psi(\varepsilon_1)| > \eta M) + \tau\, I\Big(\max_{1\leq t\leq n} |\xi_t|/\sqrt{n} > 1/M\Big). \tag{19}$$
This leads to
$$\max_{1\leq t\leq n} E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)|/\sqrt{n} > \eta) \,\big|\, \mathcal{F}_{t-1}\big) \leq E\psi^2(\varepsilon_1) I(|\psi(\varepsilon_1)| > \eta M) + \tau\, I\Big(\max_{1\leq t\leq n} |\xi_t|/\sqrt{n} > 1/M\Big). \tag{20}$$
Thus
$$E\Big[\max_{1\leq t\leq n} E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)|/\sqrt{n} > \eta) \,\big|\, \mathcal{F}_{t-1}\big)\Big] \leq E\psi^2(\varepsilon_1) I(|\psi(\varepsilon_1)| > \eta M) + \tau\, P\Big(\max_{1\leq t\leq n} |\xi_t|/\sqrt{n} > 1/M\Big).$$
Since $E\psi^2(\varepsilon_1) < \infty$, for any $\epsilon > 0$ there exists $M = M(\epsilon)$ such that $E\psi^2(\varepsilon_1) I(|\psi(\varepsilon_1)| > \eta M) < \epsilon$, which implies that
$$E\Big[\max_{1\leq t\leq n} E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)|/\sqrt{n} > \eta) \,\big|\, \mathcal{F}_{t-1}\big)\Big] \leq \tau\, P\Big(\max_{1\leq t\leq n} |\xi_t|/\sqrt{n} > 1/M\Big) + \epsilon. \tag{21}$$
From (b) and (21), the $\limsup$ as $n \to \infty$ of the left-hand side of (21) is at most $\epsilon$; since $\epsilon$ is arbitrary, the limit is 0, and hence
$$\max_{1\leq t\leq n} E\big(\psi^2(\varepsilon_t) I(|\xi_t \psi(\varepsilon_t)|/\sqrt{n} > \eta) \,\big|\, \mathcal{F}_{t-1}\big) = o_p(1), \tag{22}$$
which, combined with (18) and (a), yields, for any $\eta > 0$,
$$\sum_{t=1}^n E\big(\varsigma_{nt}^2 I(|\varsigma_{nt}| > \eta) \,\big|\, \mathcal{F}_{t-1}\big) = o_p(1). \tag{23}$$
Therefore, applying the martingale central limit theorem together with (17) and (23), we have
$$\frac{1}{\sqrt{n}} \sum_{t=1}^n \omega_t X_{t-1} \psi(\varepsilon_t) \stackrel{L}{\longrightarrow} N(0, \tau \Omega).$$
The proof of Lemma 5.2 is complete.

Proof of Theorem 2.4. Denote $\hat{\beta}_n = \sqrt{n}(\hat{\phi}_{SM} - \phi_0)$ and

$$V_n(\mu) = \sum_{t=1}^n \omega_t \left[\rho\Big(\varepsilon_t - \tfrac{1}{\sqrt{n}} \mu' X_{t-1}\Big) - \rho(\varepsilon_t)\right], \quad \mu \in \mathbb{R}^{p+1}, \tag{24}$$
so that $\hat{\beta}_n$ is the value of $\mu$ that minimizes the convex objective function $V_n(\mu)$. Put
$$A_n = \frac{1}{\sqrt{n}} \sum_{t=1}^n \omega_t X_{t-1} \psi(\varepsilon_t), \qquad B_t(\mu) = \omega_t \int_0^{-\mu' X_{t-1}/\sqrt{n}} [\psi(\varepsilon_t + s) - \psi(\varepsilon_t)]\, ds.$$
Then
$$V_n(\mu) = -\mu' A_n + \sum_{t=1}^n B_t(\mu) = -\mu' A_n + \sum_{t=1}^n E(B_t(\mu) \mid \mathcal{F}_{t-1}) + \sum_{t=1}^n \big[B_t(\mu) - E(B_t(\mu) \mid \mathcal{F}_{t-1})\big]. \tag{25}$$
From condition (C.3), we obtain
$$\sum_{t=1}^n E(B_t(\mu) \mid \mathcal{F}_{t-1}) = \sum_{t=1}^n \omega_t \int_0^{-\mu' X_{t-1}/\sqrt{n}} E\psi(\varepsilon_t + s)\, ds = \sum_{t=1}^n \omega_t \int_0^{-\mu' X_{t-1}/\sqrt{n}} \lambda s\, (1 + o(1))\, ds = \frac{\lambda}{2}\, \mu' \Big(\frac{1}{n}\sum_{t=1}^n \omega_t X_{t-1} X_{t-1}'\Big) \mu\, (1 + o(1)) = \frac{\lambda}{2}\, \mu' \Big(\frac{1}{n}\sum_{t=1}^n \omega_t X_{t-1} X_{t-1}'\Big) \mu + o_p(1).$$
Note that $\{B_t(\mu) - E(B_t(\mu) \mid \mathcal{F}_{t-1}), 1 \leq t \leq n\}$ is a sequence of martingale differences. By the Cauchy–Schwarz inequality and (C.4), we get
$$\sum_{t=1}^n E(B_t^2(\mu) \mid \mathcal{F}_{t-1}) = \sum_{t=1}^n E\bigg[\omega_t^2 \bigg(\int_0^{-\mu' X_{t-1}/\sqrt{n}} [\psi(\varepsilon_t + s) - \psi(\varepsilon_t)]\, ds\bigg)^2 \,\bigg|\, \mathcal{F}_{t-1}\bigg] \leq \sum_{t=1}^n \omega_t^2\, \frac{|\mu' X_{t-1}|}{\sqrt{n}}\, \bigg|\int_0^{-\mu' X_{t-1}/\sqrt{n}} E[\psi(\varepsilon_t + s) - \psi(\varepsilon_t)]^2\, ds\bigg| = \mu' \Big(\frac{1}{n}\sum_{t=1}^n \omega_t^2 X_{t-1} X_{t-1}'\Big) \mu \cdot o(1). \tag{26}$$
From Lemma 5.2 and (26), we can obtain
$$\sum_{t=1}^n E B_t^2(\mu) \leq \mu' \Big(\frac{1}{n}\sum_{t=1}^n E \omega_t^2 X_{t-1} X_{t-1}'\Big) \mu \cdot o(1) = \mu' \Omega \mu \cdot o(1) \to 0.$$
Thus
$$E\bigg[\sum_{t=1}^n \big(B_t(\mu) - E(B_t(\mu) \mid \mathcal{F}_{t-1})\big)\bigg]^2 = \sum_{t=1}^n E\big[B_t(\mu) - E(B_t(\mu) \mid \mathcal{F}_{t-1})\big]^2 \leq 2\sum_{t=1}^n E B_t^2(\mu) \to 0.$$
Hence (25) can be rewritten as
$$V_n(\mu) = \frac{\lambda}{2}\, \mu' \Big(\frac{1}{n}\sum_{t=1}^n \omega_t X_{t-1} X_{t-1}'\Big) \mu - \mu' A_n + o_p(1). \tag{27}$$
By Lemma 5.2, we can get
$$V_n(\mu) \stackrel{L}{\longrightarrow} V(\mu) = \frac{\lambda}{2}\, \mu' \Sigma \mu - \mu' A,$$
where $A \sim N(0, \tau \Omega)$. Note that $V(\mu)$ has a unique minimum at $\mu = \frac{1}{\lambda} \Sigma^{-1} A$ almost surely, and $V_n(\mu)$ has convex sample paths by condition (C.2). Applying Lemma 5.1, we have
$$\hat{\beta}_n = \sqrt{n}(\hat{\phi}_{SM} - \phi_0) \stackrel{L}{\longrightarrow} \frac{1}{\lambda} \Sigma^{-1} A \sim N\Big(0, \frac{\tau}{\lambda^2}\, \Sigma^{-1} \Omega \Sigma^{-1}\Big).$$
This completes the proof of Theorem 2.4.

6 CONCLUDING REMARKS

This article establishes the geometric ergodicity of general stochastic functional autoregressive models (linear and nonlinear) under a broad condition. As a corollary, stationarity conditions for GRCAR($p$) models with possibly heavy-tailed errors are obtained, and a conditional self-weighted M-estimator is proposed and shown to be asymptotically normal. The simulation study and a real data example show that our theory and methodology perform well in practice. The general approach to stationarity and estimation presented here could be extended to other time series models, such as heavy-tailed GRCARMA models.

ACKNOWLEDGEMENTS

The authors thank the Editor, the Co-Editor and the Referee(s) for their insightful comments and suggestions, which helped us improve this article significantly. The second author's work was partially supported by the National Natural Science Foundation of China (Grant No. 12171161).

DATA AVAILABILITY STATEMENT

Our dataset consists of the daily Hang Seng closing index from 7 May 2020 to 31 December 2021. The data were downloaded from https://cn.investing.com and are publicly available.
