Volume 45, Issue 6 pp. 910-930
Original Article
Open Access

Threshold Network GARCH Model

Yue Pan
Department of Mathematics and Statistics, University of Strathclyde, Glasgow, UK

Jiazhu Pan (Corresponding Author)
Department of Mathematics and Statistics, University of Strathclyde, Glasgow, UK
Correspondence to: Jiazhu Pan, Department of Mathematics and Statistics, University of Strathclyde, Glasgow G1 1XH, UK. Email: [email protected]
First published: 13 May 2024

Abstract

The Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model and its variations have been widely adopted in the study of financial volatility, but extending GARCH-type models to high-dimensional data remains difficult because of over-parameterization and computational complexity. In this article, we propose a multi-variate GARCH-type model that simplifies the parameterization by utilizing the network structure that can be appropriately specified for certain types of high-dimensional data. Asymmetry in the dynamics of volatilities is also accommodated, as our model adopts a threshold structure. To enable our model to handle data of extremely high dimension, we investigate the near-epoch dependence (NED) of our model, and the asymptotic properties of our quasi-maximum-likelihood estimator (QMLE) are derived from the limit theorems for NED random fields. Simulations are conducted to verify our theoretical results. Finally, we fit our model to the log-returns of four groups of stocks, and the results indicate that bad news is not necessarily more influential on volatility once network effects are taken into account.

1 Introduction

To pursue maximum return or to circumvent potential risk, investors constantly revise their portfolios according to any related information. Understanding how the volatility of financial assets responds to new information is crucial in risk management and a widely studied subject in econometrics and statistics. In the literature, statistical models that describe the formation of financial risks have been developed and applied in practice. The Autoregressive Conditional Heteroscedasticity (ARCH) model was proposed by Engle (1982) for estimating the variance of the United Kingdom's inflation. In an ARCH($p$) model, the volatilities of returns are affected by up to $p$ lags of past observations. Bollerslev (1986) then proposed a generalized ARCH (GARCH) model to accommodate longer memory of past observations. It has become one of the most popular models in econometrics ever since, and numerous variations of the GARCH model have been developed for modeling volatility with complicated structures. See Teräsvirta (2009) for a survey of different GARCH-type models.

When we study the risks of multiple assets simultaneously, the conditional variances that represent individual risks are of interest, as well as the conditional covariances that represent risk-sharing relationships. On the other hand, the risk of a particular individual could be affected by its own covariates and by those of individuals closely related to it. This leads to the need to extend GARCH-type models to the multi-variate case. For an $N$-dimensional time series $\{\mathbf{y}_t\}$, a canonical expression of multi-variate GARCH would be
$$ \mathbf{y}_t=H_t^{1/2}\mathbf{z}_t,\qquad H_t=g\left(\mathbf{y}_{t-1},H_{t-1}\right), $$ (1)
where the random vector $\mathbf{z}_t$ satisfies $\mathbb{E}(\mathbf{z}_t)=\mathbf{0}$ and $\operatorname{var}(\mathbf{z}_t)=I_N$, and there could be various specifications of the function $g(\cdot)$, as it represents the structure of the conditional covariance matrix $H_t$. For more details on this subject, the excellent survey by Bauwens et al. (2006) on the family of multi-variate GARCH models is recommended.

However, in terms of parameter estimation, there are major challenges that can make multi-variate GARCH (MGARCH) models inapplicable in empirical analysis with high-dimensional data. For example, the number of parameters grows at the rate $\mathcal{O}(N^4)$ in the vectorized GARCH (VEC-GARCH) model proposed by Bollerslev et al. (1988). An over-parameterized specification causes high computational complexity and makes it problematic to derive conditions for the positive definiteness of the conditional covariance matrix $H_t$. Plenty of effort has been made in the literature: the diagonal VEC-GARCH (DVEC-GARCH) model by Bollerslev et al. (1988) and the Baba-Engle-Kraft-Kroner GARCH (BEKK-GARCH) model by Engle and Kroner (1995) were proposed with the aim of simplifying the conditions for positive definiteness by imposing structural restrictions on the conditional covariance matrix. The number of parameters can also be significantly reduced, to $\mathcal{O}(N^2)$, in the Constant Conditional Correlation GARCH (CCC-GARCH) model by Bollerslev (1990) and the Dynamic Conditional Correlation GARCH (DCC-GARCH) model by Engle (2002) and Tse and Tsui (2001). On the other hand, as an alternative way to overcome the over-parameterization problem, the idea of factor variables was introduced by Engle et al. (1990) to the multi-variate ARCH model as a dimension-reduction technique. This idea was later brought to the MGARCH model by Bollerslev and Engle (1993), with succeeding work by Pan et al. (2010), Hu and Tsay (2014) and Li et al. (2016). In certain application scenarios where there is a network structure behind the data of interest, multiple variables are connected and a multi-variate GARCH-type model can be fitted. These variations of MGARCH models alleviate the over-parameterization problem to some extent, but the number of parameters nevertheless still grows with the dimension. With this limitation, MGARCH models can only be applied to data of low dimension, such as stock indices of several markets or exchange rates of two currencies (see Karolyi, 1995 and Tse and Tsui, 2001).

Despite the aforementioned difficulties due to dimensionality, for some specific types of multi-variate data where the connections between different components are actually observable, it is still possible to simplify the model setup significantly in the following aspects:
  1. instead of considering both volatilities and co-volatilities, we focus on studying the dynamics of volatilities only;
  2. instead of parameterizing every cross-individual effect, an appropriate network structure can be embedded into the model.

In many cases such a network structure can provide sufficient information about how the influence of pulses travels through the edges between individual nodes. For instance, Nitzan and Libai (2011) found that customers connected with a defecting neighbor are 80% more likely to cancel their cellular service, and Goel and Goldstein (2014) concluded that the accuracy of individual behavior prediction can be significantly improved with network data compared with conventional marketing practices.

Zhou et al. (2020) proposed a network GARCH model (see (2) for the detailed specification) that significantly reduces the parameterization complexity: the number of parameters in their model remains fixed no matter how large the dimension $N$ is. However, they did not fully exploit this advantage, as their discussion of parameter estimation is limited to the case where $N$ is fixed. Such a setting narrows the variety of scenarios in which their model can be applied, since the size of a network is often extremely large. In a study of a social network consisting of 2982 users, Zhu et al. (2017) proposed a network AR model whose least squares estimator is proved to remain valid when both the sample size $T\to\infty$ and the dimension $N\to\infty$. Compared with their AR-type model, the unobserved volatility processes raise difficulties in extending such properties to GARCH-type models. We address this problem by treating our network model as a spatial process on a two-dimensional lattice and adopting the asymptotic theorems for random fields of Jenish and Prucha (2012) in the estimation of parameters. Since their limit theorems require NED, we will establish this property under certain restrictions on the parameters and the network structure. The idea of using limit theorems for spatial processes in the inference of high-dimensional time series has been considered by Xu et al. (2022) in the instrumental-variable quantile regression estimation of their dynamic network quantile regression model. In this article, we introduce this idea, for the first time, into the estimation of high-dimensional GARCH-type models.

Aside from data with a diverging number of dimensions, we also aim to enable our model to handle the asymmetry observed in empirical work, whereby positive and negative pulses affect volatilities differently in magnitude as well as in direction. While most GARCH-type models implicitly assume that volatility responds equally to the magnitude of positive and negative returns, Glosten et al. (1993) proposed the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model with a threshold structure, allowing the volatility to respond asymmetrically in magnitude to positive and negative pulses. The threshold GARCH (TGARCH) model of Zakoïan (1994) also accommodates asymmetry, but in the magnitude of the influence on the conditional standard deviation. The exponential GARCH (EGARCH) model of Nelson (1991) takes a log transformation of the conditional variances, lifting the restriction to non-negative coefficients in conventional GARCH-type models and making it possible to explain asymmetry in the direction in which volatility changes in response to positive and negative news.

To study the asymmetric dynamics of the volatilities of high-dimensional financial data, we propose a threshold network GARCH (TNGARCH) model in Section 2. Stationarity conditions for this model with fixed $N$ are derived in Section 3.1. In Section 3.2 we prove the $L_2$-NED of the proposed model under certain restrictions. The asymptotic properties of the QMLE are investigated in Section 4, in the case where $T\to\infty$ and $N\to\infty$ at a lower rate. We then propose a Wald statistic in Section 5.1 to test the existence of the threshold effect, and in Section 5.2 we introduce a test for high-dimensional white noise proposed by Li et al. (2019). In Section 6, our methodology is tested on simulated data generated from four different kinds of network structure. By applying our model to high-dimensional time series of log-returns in Section 7, we observe an asymmetry, different from that reported in the existing literature, in how much volatility responds to good news and bad news at the individual level. Finally, conclusions and potential directions for future research are summarized in Section 8.

2 Model Setup

Consider an undirected and unweighted network with $N$ nodes. Define the adjacency matrix $A=(a_{ij})_{1\le i,j\le N}$, where $a_{ij}=1$ if there is a connection between node $i$ and node $j$, and $a_{ij}=0$ otherwise. Self-connections are ruled out by setting $a_{ii}=0$ for every node $i$.

The connection can be defined differently according to the practical scenario: for example, two social network accounts that follow each other, or two stocks that share at least one top shareholder. Since the network is undirected, $A$ is symmetric ($a_{ij}=a_{ji}$); hence for any node $i$, the out-degree $d_i^{(\mathrm{out})}=\sum_{j=1}^N a_{ij}$ equals the in-degree $d_i^{(\mathrm{in})}=\sum_{j=1}^N a_{ji}$, and we write $d_i$ for both for convenience.
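As a concrete illustration, the adjacency matrix $A$, the degrees $d_i$ and the row-normalized weights $w_{ij}=a_{ij}/d_i$ used below can be built as in the following sketch; the 5-node edge list is a made-up example, not data from the article.

```python
import numpy as np

# A toy 5-node undirected, unweighted network (hypothetical edges).
N = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
A = np.zeros((N, N), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1          # undirected: a_ij = a_ji, and a_ii = 0

d = A.sum(axis=1)                  # in-degree equals out-degree by symmetry
W = A / d[:, None]                 # row-normalized weights w_ij = a_ij / d_i
```

Since each row of $W$ sums to one, the network term $\lambda\sum_j w_{ij}y_{j,t-1}^2$ in the models below is an average over the neighbors of node $i$.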

For any node $i$ in this network, let $y_{it}$ be the observation at time $t$, and $h_{it}$ be the unobservable conditional heteroscedasticity of $y_{it}$, i.e. $h_{it}:=\operatorname{var}\left(y_{it}\mid\mathscr{H}_{t-1}\right)$, where $\mathscr{H}_{t-1}$ denotes the $\sigma$-algebra consisting of all available information up to time $t-1$. A network GARCH(1, 1) specification of the conditional variance incorporates the network effect:
$$ h_{it}=\omega+\alpha y_{i,t-1}^2+\lambda\sum_{j=1}^N w_{ij}y_{j,t-1}^2+\beta h_{i,t-1},\qquad i=1,2,\dots,N. $$ (2)
Model (2) indicates that the volatility $h_{it}$ of each stock $i$ is influenced not only by its own previous price change, measured by $y_{i,t-1}^2$, but also by the average (with weights $w_{ij}=a_{ij}/d_i$) of $y_{j,t-1}^2$ over all nodes $j$ connected to node $i$. To ensure the positivity of the conditional variance, it is required that $\omega>0$ while $\alpha,\lambda,\beta\ge 0$.
To model the asymmetry in volatility, our TNGARCH model adds a threshold structure to model (2). A TNGARCH(1, 1) model is specified as follows:
$$ \begin{aligned} y_{it}&=\varepsilon_{it}\sqrt{h_{it}},\\ h_{it}&=\omega+\left(\alpha^{(1)}1_{\{y_{i,t-1}\ge 0\}}+\alpha^{(2)}1_{\{y_{i,t-1}<0\}}\right)y_{i,t-1}^2+\lambda\sum_{j=1}^N w_{ij}y_{j,t-1}^2+\beta h_{i,t-1},\\ &\qquad i=1,2,\dots,N, \end{aligned} $$ (3)
where $1_{\{\cdot\}}$ is the indicator function. To ensure the positivity of $h_{it}$, the coefficients $\omega,\alpha^{(1)},\alpha^{(2)},\lambda$ and $\beta$ are subject to the same constraints as in (2). $\{\varepsilon_{it}\}$ is a white noise process satisfying the following assumption:

Assumption 1. ε i t $$ \left\{{\varepsilon}_{it}\right\} $$ is i.i.d. across i $$ i $$ and t $$ t $$ , with non-degenerate distribution, mean 0 and variance 1.

This assumption allows us to investigate, in the next section, the conditions for our model to have a unique strictly stationary solution, which serves as a precondition for further discussion on parameter estimation and statistical inference.
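Model (3) can be simulated directly from its recursion. The sketch below is an illustration of the model, not the authors' code: Gaussian innovations, a ring-shaped network and the parameter values are all assumptions made for the example (and they satisfy the stationarity condition derived in Section 3.1).

```python
import numpy as np

def simulate_tngarch(W, T, omega, a1, a2, lam, beta, burn=200, seed=0):
    """Simulate model (3): h_it = omega + (a1*1{y_{i,t-1}>=0} + a2*1{y_{i,t-1}<0}) * y_{i,t-1}^2
    + lam * sum_j w_ij * y_{j,t-1}^2 + beta * h_{i,t-1},  y_it = eps_it * sqrt(h_it)."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    y, h = np.zeros(N), np.full(N, omega)
    out = np.empty((T, N))
    for t in range(T + burn):
        alpha = np.where(y >= 0, a1, a2)          # threshold on the sign of y_{i,t-1}
        h = omega + alpha * y**2 + lam * (W @ y**2) + beta * h
        y = rng.standard_normal(N) * np.sqrt(h)   # Gaussian innovations (assumption)
        if t >= burn:
            out[t - burn] = y
    return out

# hypothetical ring network; max(a1, a2) + beta + lam = 0.85 < 1
N = 10
A = np.roll(np.eye(N, dtype=int), 1, axis=1) + np.roll(np.eye(N, dtype=int), -1, axis=1)
W = A / A.sum(axis=1, keepdims=True)
Y = simulate_tngarch(W, T=500, omega=0.1, a1=0.05, a2=0.15, lam=0.1, beta=0.6)
```

The burn-in discards the influence of the arbitrary initial values $y_{i0}=0$, $h_{i0}=\omega$.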

3 Stationarity and Near-Epoch Dependence

To derive the conditions under which model (3) is strictly stationary, we rewrite the conditional variance process in vector form
$$ \mathbf{h}_t=\omega\mathbf{1}_N+B_{t-1}\mathbf{h}_{t-1}, $$ (4)
with the following notation:
$$ \begin{aligned} \mathbf{h}_t&=\left(h_{1t},h_{2t},\dots,h_{Nt}\right)'\in\mathbb{R}^N,\\ \mathbf{1}_N&=\left(1,1,\dots,1\right)'\in\mathbb{R}^N,\\ B_{t-1}&=\alpha^{(1)}R_{t-1}E_{t-1}+\alpha^{(2)}\left(I_N-R_{t-1}\right)E_{t-1}+\lambda D^{-1}AE_{t-1}+\beta I_N,\\ R_{t-1}&=\operatorname{diag}\left\{1_{\{y_{1,t-1}\ge 0\}},1_{\{y_{2,t-1}\ge 0\}},\dots,1_{\{y_{N,t-1}\ge 0\}}\right\},\\ E_{t-1}&=\operatorname{diag}\left\{\varepsilon_{1,t-1}^2,\varepsilon_{2,t-1}^2,\dots,\varepsilon_{N,t-1}^2\right\},\\ D&=\operatorname{diag}\left\{d_1,d_2,\dots,d_N\right\}. \end{aligned} $$

In Section 3.1, we discuss the stationarity of (4) when $N$ is fixed. However, to estimate the parameters when $N\to\infty$, limit theorems based on the stationarity and ergodicity of fixed-dimensional time series are not sufficient. Therefore, in Section 3.2 we discuss near-epoch dependence for random fields, which supports the adoption of limit theorems for spatial processes in the subsequent sections.

3.1 Stationarity with N Fixed

Since $y_{it}=\varepsilon_{it}\sqrt{h_{it}}$, $y_{it}\ge 0$ is equivalent to $\varepsilon_{it}\ge 0$. Hence
$$ R_{t-1}=\operatorname{diag}\left\{1_{\{\varepsilon_{1,t-1}\ge 0\}},1_{\{\varepsilon_{2,t-1}\ge 0\}},\dots,1_{\{\varepsilon_{N,t-1}\ge 0\}}\right\}. $$
In this case, the random matrices $\{B_t\}$ are i.i.d. and model (4) is a generalized autoregressive equation in the sense of definition 1.4 in Bougerol and Picard (1992). It is easy to verify that $\mathbb{E}\left(\log^{+}\left\Vert B_0\right\Vert_{\ast}\right)<\infty$. Therefore, the top Lyapunov exponent associated with $\{B_t\}$ is well defined as
$$ \gamma:=\inf\left\{\mathbb{E}\left(\frac{1}{t+1}\log\left\Vert B_t B_{t-1}\dots B_0\right\Vert_{\ast}\right),\ t\in\mathbb{N}\right\}, $$ (5)
where $\left\Vert\cdot\right\Vert_{\ast}$ is the operator norm of $N\times N$ matrices induced by any norm $\left\Vert\cdot\right\Vert$ on $\mathbb{R}^N$ through
$$ \left\Vert M\right\Vert_{\ast}=\sup\left\{\left\Vert M\mathbf{x}\right\Vert/\left\Vert\mathbf{x}\right\Vert;\ \mathbf{x}\in\mathbb{R}^N,\ \mathbf{x}\ne 0\right\}. $$
According to theorem 3.2 in Bougerol and Picard (1992), the series
$$ \mathbf{h}_t=\omega\mathbf{1}_N+\omega\sum_{k=1}^{\infty}B_{t-1}\dots B_{t-k}\mathbf{1}_N, $$ (6)
is the unique strictly stationary and ergodic solution of model (4) if and only if the top Lyapunov exponent $\gamma<0$. Under this condition, the process $\{\mathbf{y}_t\}$, where $\mathbf{y}_t=\left(y_{1t},y_{2t},\dots,y_{Nt}\right)'\in\mathbb{R}^N$, is also strictly stationary and ergodic, since we can easily construct a continuous function $\Lambda:\mathbb{R}^N\to\mathbb{R}^N$ according to (3) such that $\mathbf{y}_t=\Lambda(\mathbf{h}_t)$. Besides, since $y_{it}=\varepsilon_{it}\sqrt{h_{it}}$, the almost sure convergence of (6) guarantees that $\mathbb{E}(h_{it})<\infty$ for any $i$. Thus $\mathbb{E}\left\Vert\mathbf{y}_t\right\Vert^2=\sum_{i=1}^N\mathbb{E}(h_{it})<\infty$, with $\left\Vert\cdot\right\Vert$ being the Euclidean norm.
By the subadditive ergodic theorem in Kingman (1973),
$$ \gamma=\lim_{t\to\infty}\frac{1}{t+1}\log\left\Vert B_t B_{t-1}\dots B_0\right\Vert_{\ast} $$
almost surely. Hence $\gamma$ can be approximated by simulation given a specific distribution of $\varepsilon_{it}$. To reduce the computational burden, we also derive a sufficient condition that is simple and much easier to verify.

Theorem 1. Under Assumption 1, model (4) has a unique strictly stationary and ergodic solution of the form (6) if

$$ \max\left\{\alpha^{(1)},\alpha^{(2)}\right\}+\beta+\lambda<1. $$ (7)
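The simulation approximation of $\gamma$ mentioned above can be sketched as follows: apply the random matrices $B_t$ to a positive vector, renormalize at each step, and average the accumulated log growth rates. Gaussian innovations, the ring network and the parameter values are assumptions made for the example; since they satisfy (7), the estimate should come out negative.

```python
import numpy as np

def top_lyapunov(W, a1, a2, lam, beta, t_max=5000, seed=1):
    """Monte-Carlo approximation of gamma = lim (1/(t+1)) log ||B_t ... B_0||,
    where B_t = (a1*R_t + a2*(I - R_t)) E_t + lam * W E_t + beta * I."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    v = np.ones(N)                      # generic positive starting vector
    acc = 0.0
    for _ in range(t_max):
        eps = rng.standard_normal(N)    # Gaussian innovations (assumption)
        alpha = np.where(eps >= 0, a1, a2)
        e2 = eps**2
        v = alpha * e2 * v + lam * (W @ (e2 * v)) + beta * v   # v <- B_t v
        nrm = np.linalg.norm(v)
        acc += np.log(nrm)              # accumulate log growth, then renormalize
        v = v / nrm
    return acc / t_max

# hypothetical ring network; parameters satisfy (7), so gamma < 0 by Theorem 1
N = 8
A = np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)
W = A / A.sum(axis=1, keepdims=True)
gamma = top_lyapunov(W, a1=0.05, a2=0.15, lam=0.1, beta=0.6)
```

Tracking a single renormalized vector avoids forming the full matrix product, whose entries would overflow or underflow for large $t$.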

3.2 Near-Epoch Dependence for Random Fields

Let $D:=\left\{(i,t):i\in\mathbb{N}_{+},t\in\mathbb{Z}\right\}$ be a lattice in $\mathbb{R}^2$, and let $\rho\left((i,t),(j,\tau)\right):=\max\left\{|i-j|,|t-\tau|\right\}$ measure the distance between any two locations $(i,t),(j,\tau)\in D$. Given observations $\left\{y_{it},1\le i\le N,1\le t\le T\right\}$ from model (3), these observations can be regarded as a triangular array of random fields $\left\{y_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$, with $\left\{D_{NT}, NT\ge 1\right\}$ being a sequence of finite rectangular lattices $D_{NT}:=\left\{(i,t):1\le i\le N,1\le t\le T\right\}$. The growth of the sample size is then ensured by the unbounded expansion of $D_{NT}$ as $NT\to\infty$, represented as $\left|D_{NT}\right|_c\to\infty$, where $\left|\cdot\right|_c$ denotes cardinality. The discussion of the asymptotic behavior of random fields concerns only the expansion of the sample region, so the theoretical results derived in this section apply as long as $\left|D_{NT}\right|_c=NT\to\infty$.

Let $\left\Vert\cdot\right\Vert_p$ denote the $L_p$-norm, i.e. $\left\Vert X\right\Vert_p:=\left(\mathbb{E}|X|^p\right)^{1/p}$ for an arbitrary random variable $X$. The definition of NED random fields is as follows (see definition 1 in Jenish and Prucha, 2012):

Definition 1. A triangular array of random fields $\mathcal{Y}:=\left\{y_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$ is said to be $L_p$-NED ($p\ge 1$) on $\mathcal{E}=\left\{\varepsilon_{it}:(i,t)\in D\right\}$ if $\sup_{(i,t)\in D}\left\Vert y_{it}\right\Vert_p<\infty$ and

$$ \left\Vert y_{it}-\mathbb{E}\left(y_{it}\mid\mathcal{F}_{it}(s)\right)\right\Vert_p\le d_{it}\psi(s), $$

where $\mathcal{F}_{it}(s):=\sigma\left\{\varepsilon_{j\tau}:\rho\left((i,t),(j,\tau)\right)\le s\right\}$, $\psi(s)$ is a non-negative sequence with $\lim_{s\to\infty}\psi(s)=0$, and $\left\{d_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$ is an array of finite positive constants.

Remark. If $\psi(s)=\mathcal{O}(s^{-\mu})$ for some $\mu>0$, then $\mathcal{Y}$ is said to be $L_p$-NED on $\mathcal{E}$ of size-$\mu$; if $\psi(s)=\mathcal{O}(\rho^s)$ for some $0<\rho<1$, then $\mathcal{Y}$ is said to be $L_p$-NED on $\mathcal{E}$ geometrically; if $\sup_{(i,t)\in D}d_{it}<\infty$, then $\mathcal{Y}$ is said to be uniformly $L_p$-NED on the random field $\mathcal{E}$. Note that geometric NED implies NED of size-$\mu$ for every $\mu>0$.

We need the following assumptions before discussing the NED property of $\mathcal{Y}$. Assumption 2 is needed to prove that $\sup_{(i,t)\in D}\left\Vert h_{it}\right\Vert_2<\infty$; Assumption 3 puts a restriction on the sparsity of the network: the strength of the connection between two nodes decays with their distance in case (a), or two nodes are connected only if they are sufficiently close in case (b). Similar restrictions on the network structure can be found in assumption 3 of Xu and Lee (2015) and assumption 3.2 of Xu et al. (2022).

Assumption 2. There exists $\kappa_4:=\mathbb{E}\,\varepsilon_{it}^4<\infty$ such that

$$ \kappa_4\left(\max\left\{\alpha^{(1)},\alpha^{(2)}\right\}+\beta+\lambda\right)^2<1. $$
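Both condition (7) and the moment condition of Assumption 2 are straightforward to check numerically for a given innovation distribution. A minimal sketch, assuming standard normal innovations (for which $\kappa_4=\mathbb{E}\,\varepsilon_{it}^4=3$, so Assumption 2 requires $\max\{\alpha^{(1)},\alpha^{(2)}\}+\beta+\lambda<1/\sqrt{3}\approx 0.577$, strictly stronger than (7)):

```python
def stationarity_ok(a1, a2, lam, beta):
    """Sufficient condition (7) for strict stationarity."""
    return max(a1, a2) + beta + lam < 1

def assumption2_ok(kappa4, a1, a2, lam, beta):
    """Moment condition of Assumption 2: kappa4 * (max{a1, a2} + beta + lam)^2 < 1."""
    return kappa4 * (max(a1, a2) + beta + lam) ** 2 < 1

print(assumption2_ok(3.0, 0.05, 0.15, 0.1, 0.3))   # 0.55 < 0.577  -> True
print(assumption2_ok(3.0, 0.05, 0.15, 0.1, 0.6))   # 0.85 fails    -> False
```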

Assumption 3. The row-normalized adjacency matrix $W$ satisfies one of the following conditions:

  • (a).

    $w_{ij}=\mathcal{O}\left(|i-j|^{-\frac{\mu+2}{2}}\right)$ for some $\mu>0$;

  • (b).

    $w_{ij}\ne 0$ if $|i-j|\le K$ for some constant $K\ge 1$, and $w_{ij}=0$ otherwise.

Theorem 2. If condition (7) holds, then under Assumptions 1, 2 and 3(a), $\left\{h_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$ is uniformly $L_2$-NED on $\left\{\varepsilon_{it}:(i,t)\in D\right\}$ of size-$\mu$, where $\mu$ is the constant in Assumption 3(a). Moreover, if Assumption 3(b) holds instead of 3(a), then $\left\{h_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$ is uniformly and geometrically $L_2$-NED on $\left\{\varepsilon_{it}:(i,t)\in D\right\}$.

Remark. Note that $\left\Vert y_{it}^2-\mathbb{E}\left(y_{it}^2\mid\mathcal{F}_{it}(s)\right)\right\Vert_2=\left\Vert\varepsilon_{it}^2\right\Vert_2\left\Vert h_{it}-\mathbb{E}\left(h_{it}\mid\mathcal{F}_{it}(s)\right)\right\Vert_2$; then Assumption 2 facilitates the $L_2$-NED of the $y_{it}^2$'s given the $L_2$-NED of the $h_{it}$'s. Besides, since $h_{it}\ge\omega>0$, it is easy to verify that $\sqrt{h_{it}}$ is a Lipschitz transformation of $h_{it}$ using the mean value theorem. Then proposition 2 in Jenish and Prucha (2012) allows the $\sqrt{h_{it}}$'s to inherit the NED properties of the $h_{it}$'s, so we can also verify that the $y_{it}$'s are $L_2$-NED.

4 Parameter Estimation

From model (3) we have observations $\left\{y_{it}:(i,t)\in D_{NT}, NT\ge 1\right\}$ with respect to the true parameter vector $\theta_0:=\left(\omega_0,\alpha_0^{(1)},\alpha_0^{(2)},\lambda_0,\beta_0\right)'\in\mathbb{R}^5$. Based on the infinite past of observations, the quasi log-likelihood function is
$$ L_{NT}\left(\theta\right)=\frac{1}{NT}\sum_{i=1}^N\sum_{t=1}^T l_{it}\left(\theta\right),\qquad l_{it}\left(\theta\right)=\log\sigma_{it}^2\left(\theta\right)+\frac{y_{it}^2}{\sigma_{it}^2\left(\theta\right)}, $$ (8)
where $\sigma_{it}^2$ is generated from model (3) as
$$ \sigma_{it}^2=\omega+\left\{\alpha^{(1)}1_{\{y_{i,t-1}\ge 0\}}+\alpha^{(2)}1_{\{y_{i,t-1}<0\}}\right\}y_{i,t-1}^2+\lambda d_i^{-1}\sum_{j=1}^N a_{ij}y_{j,t-1}^2+\beta\sigma_{i,t-1}^2, $$
and $\theta:=\left(\omega,\alpha^{(1)},\alpha^{(2)},\lambda,\beta\right)'\in\mathbb{R}^5$ is the parameter vector.
Since the evaluation of the exact value of (8) is infeasible in practice, it is convenient to approximate (8) by
$$ \tilde{L}_{NT}\left(\theta\right)=\frac{1}{NT}\sum_{i=1}^N\sum_{t=1}^T\tilde{l}_{it}\left(\theta\right),\qquad \tilde{l}_{it}\left(\theta\right)=\log\tilde{\sigma}_{it}^2\left(\theta\right)+\frac{y_{it}^2}{\tilde{\sigma}_{it}^2\left(\theta\right)}, $$ (9)
where $\tilde{\sigma}_{it}^2$ is also generated from model (3) but with initial value $\tilde{\sigma}_{i0}^2=0$. The QMLE of $\theta\in\Theta$ is then given by
$$ \hat{\theta}_{NT}:=\underset{\theta\in\Theta}{\mathrm{argmin}}\ \tilde{L}_{NT}\left(\theta\right). $$
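The approximate objective (9) can be evaluated by the same recursion that defines $\tilde{\sigma}_{it}^2$. The sketch below assumes observations `Y` of shape $T\times N$ and uses the first row only as the lagged term, a choice of initialization since $y_{i0}$ is unavailable; the data in the usage are arbitrary placeholders, and the minimization over $\Theta$ can be left to a generic optimizer such as `scipy.optimize.minimize` with positivity bounds.

```python
import numpy as np

def qll_tilde(theta, Y, W):
    """Approximate quasi log-likelihood (9), with sigma_tilde_{i0}^2 = 0.
    The first row of Y serves as the lagged term, so the sum runs over t >= 2."""
    omega, a1, a2, lam, beta = theta
    T, N = Y.shape
    sig2 = np.zeros(N)                       # initial value sigma_tilde_{i0}^2 = 0
    total = 0.0
    for t in range(1, T):
        ylag2 = Y[t - 1] ** 2
        alpha = np.where(Y[t - 1] >= 0, a1, a2)
        sig2 = omega + alpha * ylag2 + lam * (W @ ylag2) + beta * sig2
        total += np.sum(np.log(sig2) + Y[t] ** 2 / sig2)
    return total / (N * (T - 1))

# hypothetical usage on arbitrary placeholder data:
rng = np.random.default_rng(0)
Yd = rng.standard_normal((300, 6))
A = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
Wd = A / A.sum(axis=1, keepdims=True)
val = qll_tilde([0.1, 0.05, 0.15, 0.1, 0.6], Yd, Wd)
```

Note that $\omega>0$ keeps every $\tilde{\sigma}_{it}^2\ge\omega$, so the logarithm is always well defined.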
To prove the asymptotic properties of $\hat{\theta}_{NT}$, we need the following assumptions in addition to those required by Theorem 2:

Assumption 4. $\Theta$ is a compact subset of $\left\{\theta:\omega>0,\alpha^{(1)}>0,\alpha^{(2)}>0,\lambda>0,\beta>0\right\}$ such that every $\theta\in\Theta$ satisfies (7) and Assumption 2, and the true parameter $\theta_0$ is an interior point of $\Theta$.

Assumption 5. $\sup_{(i,t)\in D}\sup_{\theta\in\Theta}\left\Vert\sigma_{it}^2\left(\theta\right)\right\Vert_p<\infty$ for some $p>1$.

Assumption 6. $\mathbb{E}\,\varepsilon_{it}^{4r}<\infty$ for some $r>2$, and the following bounds hold:

$$ \begin{aligned} &\sup_{(i,t)\in D}\left\Vert\sigma_{it}^2\left(\theta_0\right)\right\Vert_{2r}<\infty;\\ &\sup_{(i,t)\in D}\left\Vert\frac{\partial}{\partial\theta_k}\sigma_{it}^2\left(\theta_0\right)\right\Vert_{2r}<\infty;\\ &\sup_{(i,t)\in D}\left\Vert\frac{\partial^2}{\partial\theta_j\partial\theta_k}\sigma_{it}^2\left(\theta_0\right)\right\Vert_2<\infty, \end{aligned} $$

where $\theta_k$ denotes the $k$th component of the parameter vector $\theta$.

Assumption 7. The NED size $\mu$ in Theorem 2 satisfies $\frac{r-2}{2r-2}\mu>2$, with $r$ as in Assumption 6.

Assumption 4 is also required by Zhou et al. (2020) to prove the asymptotic properties in the case where $N$ is fixed. With both $T\to\infty$ and $N\to\infty$, the additional assumptions above are required in order to apply the limit theorems for random fields. Specifically, Assumption 5 is required for $l_{it}(\theta)$ to satisfy the boundedness condition of the law of large numbers (LLN) for random fields (assumption 2(a) in Jenish and Prucha, 2012); Assumption 6 facilitates the heredity of the NED property from $\sigma_{it}^2(\theta_0)$ to the more complicated first-order and second-order derivatives of $L_{NT}(\theta_0)$; Assumption 7 constrains the decay rate of the NED coefficients, as required by the central limit theorem (CLT) for random fields (assumption 4(c) in Jenish and Prucha, 2012). Of course, as remarked after Definition 1, geometric NED implies NED of size-$\mu$ for every $\mu>0$, so Assumption 7 is trivial under geometric NED.

Theorem 3. Under the assumptions required by Theorem 2 together with Assumptions 4 and 5, the quasi-maximum likelihood estimator $\hat{\theta}_{NT}$ is consistent, i.e.

$$ \hat{\theta}_{NT}\overset{p}{\to}\theta_0, $$

as $NT\to\infty$. If Assumptions 6 and 7 also hold, and the smallest eigenvalue $\lambda_{\min}\left(\Sigma_{NT}\right)$ of

$$ \Sigma_{NT}:=\frac{\kappa_4-1}{NT}\sum_{(i,t)\in D_{NT}}\mathbb{E}\left[\frac{1}{\sigma_{it}^4\left(\theta_0\right)}\frac{\partial}{\partial\theta}\sigma_{it}^2\left(\theta_0\right)\frac{\partial}{\partial\theta'}\sigma_{it}^2\left(\theta_0\right)\right], $$

satisfies

$$ \underset{NT\ge 1}{\operatorname{inf}}\lambda_{\min}\left(\Sigma_{NT}\right)>0, $$ (10)

then $\hat{\theta}_{NT}$ is asymptotically normal as $NT\to\infty$ and $N=o(T)$:

$$ \sqrt{NT}\,\Sigma_{NT}^{1/2}\left(\hat{\theta}_{NT}-\theta_0\right)\overset{d}{\to}\mathrm{N}\left(0,\left(\kappa_4-1\right)^2 I_5\right), $$

where $I_5$ is the $5\times 5$ identity matrix.

Remark. Condition (10) holds if the smallest eigenvalues $$ {\lambda}_{\min}^{\left(i,t\right)} $$ of $$ \mathbb{E}\left[\frac{1}{\sigma_{it}^4\left(\theta_0\right)}\frac{\partial \sigma_{it}^2\left(\theta_0\right)}{\partial\theta}\frac{\partial \sigma_{it}^2\left(\theta_0\right)}{\partial\theta'}\right] $$ satisfy $$ \inf_{NT\ge 1}\inf_{\left(i,t\right)\in D_{NT}}{\lambda}_{\min}^{\left(i,t\right)} > 0. $$

As we will show in the proof of Proposition 5.1, κ 4 $$ {\kappa}_4 $$ and N T $$ {\Sigma}_{NT} $$ above could be approximated by
κ ^ 4 : = 1 N T i = 1 N t = 1 T y i t 4 σ ˜ i t 4 ( θ ^ N T ) , $$ {\hat{\kappa}}_4:= \frac{1}{NT}\sum \limits_{i=1}^N\sum \limits_{t=1}^T\frac{y_{it}^4}{{\tilde{\sigma}}_{it}^4\left({\hat{\theta}}_{NT}\right)}, $$ (11)
and
^ N T : = κ ^ 4 1 N T i = 1 N t = 1 T 1 σ ˜ i t 4 ( θ ^ N T ) σ ˜ i t 2 ( θ ^ N T ) θ σ ˜ i t 2 ( θ ^ N T ) θ , $$ {\hat{\Sigma}}_{NT}:= \frac{{\hat{\kappa}}_4-1}{NT}\sum \limits_{i=1}^N\sum \limits_{t=1}^T\left[\frac{1}{{\tilde{\sigma}}_{it}^4\left({\hat{\theta}}_{NT}\right)}\frac{\partial {\tilde{\sigma}}_{it}^2\left({\hat{\theta}}_{NT}\right)}{\partial \theta}\frac{\partial {\tilde{\sigma}}_{it}^2\left({\hat{\theta}}_{NT}\right)}{\partial {\theta}^{\prime }}\right], $$ (12)
respectively. The latter can be calculated recursively as $$ \frac{\partial }{\partial \theta }{\tilde{\sigma}}_{it}^2\left({\hat{\theta}}_{NT}\right)={\tilde{\mathbf{u}}}_{i,t-1}+\hat{\beta}\frac{\partial }{\partial \theta }{\tilde{\sigma}}_{i,t-1}^2\left({\hat{\theta}}_{NT}\right) $$ where
u ˜ i , t 1 = 1 y i , t 1 2 1 { ε ^ i , t 1 0 } y i , t 1 2 1 { ε ^ i , t 1 < 0 } j = 1 N w i , j y j , t 1 2 σ ˜ i , t 1 2 ( θ ^ N T ) . $$ {\tilde{\mathbf{u}}}_{i,t-1}=\left(\begin{array}{l}1\\ {}{y}_{i,t-1}^2{1}_{\left\{{\hat{\varepsilon}}_{i,t-1}\ge 0\right\}}\\ {}{y}_{i,t-1}^2{1}_{\left\{{\hat{\varepsilon}}_{i,t-1}<0\right\}}\\ {}\sum \limits_{j=1}^N{w}_{i,j}{y}_{j,t-1}^2\\ {}{\tilde{\sigma}}_{i,t-1}^2\left({\hat{\theta}}_{NT}\right)\end{array}\right). $$
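For implementation, the recursion above together with (11) and (12) fits in a few lines. The sketch below is our own illustration, not the authors' code: we assume `y` and `sigma2` are N × T NumPy arrays of observations and fitted conditional variances, `W` is the row-normalized adjacency matrix with entries w_ij, and `beta_hat` is the estimate of β. Since the fitted volatility is positive, the sign of the fitted innovation equals the sign of the observation.

```python
import numpy as np

def kappa4_and_sigma_hat(y, sigma2, W, beta_hat):
    """Plug-in estimates (11) and (12), using the gradient recursion
    d sigma2_it / d theta = u_{i,t-1} + beta * (previous gradient).
    Array layout and names are our own assumptions, not the paper's code."""
    N, T = y.shape
    kappa4 = np.mean(y**4 / sigma2**2)            # kappa4-hat in (11)
    grad = np.zeros((N, 5))                       # gradients, initialized at zero
    Sigma = np.zeros((5, 5))
    for t in range(1, T):
        pos = (y[:, t - 1] >= 0).astype(float)    # sign of eps-hat = sign of y
        u = np.column_stack([
            np.ones(N),                           # d / d omega
            y[:, t - 1]**2 * pos,                 # d / d alpha(1)
            y[:, t - 1]**2 * (1.0 - pos),         # d / d alpha(2)
            W @ y[:, t - 1]**2,                   # d / d lambda
            sigma2[:, t - 1],                     # d / d beta
        ])
        grad = u + beta_hat * grad
        # accumulate (1/sigma^4) * grad grad' terms for (12)
        Sigma += (grad.T / sigma2[:, t]**2) @ grad
    Sigma *= (kappa4 - 1) / (N * T)
    return kappa4, Sigma
```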

5 Tests on Threshold Effect and Residuals

5.1 A Wald Test for the Threshold Effect

Given a null hypothesis
H 0 : Γ θ 0 = η , $$ {H}_0:\Gamma {\theta}_0=\eta, $$ (13)
where Γ $$ \Gamma $$ is an s × 5 $$ s\times 5 $$ matrix with rank s $$ s $$ and η $$ \eta $$ is an s $$ s $$ -dimensional vector, we could define a Wald test statistic as follows:
W N T : = ( Γ θ ^ N T η ) Γ N T ( κ ^ 4 1 ) 2 ^ N T 1 Γ 1 ( Γ θ ^ N T η ) , $$ {W}_{NT}:= {\left(\Gamma {\hat{\theta}}_{NT}-\eta \right)}^{\prime }{\left\{\frac{\Gamma}{NT}{\left({\hat{\kappa}}_4-1\right)}^2{\hat{\Sigma}}_{NT}^{-1}{\Gamma}^{\prime}\right\}}^{-1}\left(\Gamma {\hat{\theta}}_{NT}-\eta \right), $$ (14)
where κ ^ 4 $$ {\hat{\kappa}}_4 $$ and ^ N T $$ {\hat{\Sigma}}_{NT} $$ are defined in (11) and (12).
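As a rough sketch (function and variable names are ours, not the authors' code), the statistic (14) can be evaluated directly from the QMLE output and the plug-in estimates (11)–(12):

```python
import numpy as np

def wald_statistic(theta_hat, Gamma, eta, kappa4_hat, Sigma_hat, N, T):
    """Wald statistic (14) for H0: Gamma @ theta_0 = eta. By Proposition 5.1,
    compare against the upper-alpha quantile of chi^2 with s = rank(Gamma)
    degrees of freedom. A sketch under our own naming conventions."""
    diff = Gamma @ theta_hat - eta
    # middle matrix {Gamma/(NT) (kappa4-1)^2 Sigma^{-1} Gamma'} from (14)
    mid = Gamma @ ((kappa4_hat - 1)**2 * np.linalg.inv(Sigma_hat)) @ Gamma.T / (N * T)
    return float(diff @ np.linalg.solve(mid, diff))
```

For instance, the threshold-effect test used later in the article corresponds to `Gamma = [[0, 1, -1, 0, 0]]` and `eta = [0]`.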

By the asymptotic normality of $$ {\hat{\theta}}_{NT} $$, the statistic $$ {W}_{NT} $$ can be shown to follow a canonical asymptotic distribution, as stated in Proposition 5.1.

Proposition 5.1. Under the same assumptions as Theorem 3, as T → ∞ and N = o(T), the Wald test statistic defined in (14) asymptotically follows a χ² distribution with s degrees of freedom, i.e.

W N T d χ s 2 . $$ {W}_{NT}\overset{d}{\to }{\chi}_s^2. $$

5.2 A White Noise Test on the Residuals

There is a large literature on high-dimensional time series models, including Xu and Lee (2015), Zhu et al. (2017) and Xu et al. (2022) among others, but none of these works uses diagnostic tools to check model adequacy. In this section, we introduce a high-dimensional white noise test developed by Li et al. (2019) that can be applied to the diagnosis of high-dimensional models, including ours.

Assume we have residuals { r t : 1 t T } $$ \left\{{\mathbf{r}}_t:1\le t\le T\right\} $$ , where r t : = ( r 1 t , , r N t ) $$ {\mathbf{r}}_t:= {\left({r}_{1t},\dots, {r}_{Nt}\right)}^{\prime } $$ . We want to test whether { r t : 1 t T } $$ \left\{{\mathbf{r}}_t:1\le t\le T\right\} $$ are high-dimensional white noises, i.e. there exists a matrix P $$ P $$ such that
H 0 : r t = P z t , $$ {H}_0:{\mathbf{r}}_t=P{\mathbf{z}}_t, $$ (15)
where $$ {\mathbf{z}}_t={\left({\varepsilon}_{1t},\dots, {\varepsilon}_{Nt}\right)}^{\prime } $$. The test statistic is the sum of squared singular values of the first q lagged sample autocovariance matrices:
G q : = τ = 1 q t r ( Ŝ τ Ŝ τ ) , $$ {G}_q:= \sum \limits_{\tau =1}^q tr\left({\hat{S}}_{\tau }{\hat{S}}_{\tau}^{\prime}\right), $$ (16)
where Ŝ τ = 1 T t = 1 T r t r t τ $$ {\hat{S}}_{\tau }=\frac{1}{T}{\sum}_{t=1}^T{\mathbf{r}}_t{\mathbf{r}}_{t-\tau}^{\prime } $$ with r t = r t + T $$ {\mathbf{r}}_t={\mathbf{r}}_{t+T} $$ when t 0 $$ t\le 0 $$ .
If P is unknown, the sample covariance matrix of $$ {\mathbf{r}}_t $$ is $$ {\hat{S}}_0=\frac{1}{T}{\sum}_{t=1}^T{\mathbf{r}}_t{\mathbf{r}}_t^{\prime } $$. According to (2.8) in Li et al. (2019), we reject (15) if
G q N 2 q T ŝ 1 2 2 N 2 q T 2 ŝ 2 N T ŝ 1 2 2 > Z α , $$ \frac{G_q-\frac{N^2q}{T}{\hat{s}}_1^2}{\sqrt{\frac{2{N}^2q}{T^2}{\left({\hat{s}}_2-\frac{N}{T}{\hat{s}}_1^2\right)}^2}}>{Z}_{\alpha }, $$
where ŝ 1 = 1 N t r ( Ŝ 0 ) $$ {\hat{s}}_1=\frac{1}{N} tr\left({\hat{S}}_0\right) $$ , ŝ 2 = 1 N t r ( Ŝ 0 2 ) $$ {\hat{s}}_2=\frac{1}{N} tr\left({\hat{S}}_0^2\right) $$ and Z α $$ {Z}_{\alpha } $$ is the upper- α $$ \alpha $$ quantile of standard normal distribution.
Note that $$ \left\{{\mathbf{r}}_t:1\le t\le T\right\} $$ being white noise means that the residuals are uncorrelated over t. However, it does not imply that the residuals are uncorrelated over both i and t; the latter indicates a stronger form of adequacy for a high-dimensional model. We can assume P = I_N in the null hypothesis, and by (2.5) in Li et al. (2019), we reject $$ {H}_0:{\mathbf{r}}_t={\mathbf{z}}_t $$ if
G q N 2 q T 2 N 2 q T 2 + 4 N 3 q 2 ( κ 4 3 ) T 3 + 8 N 3 q 2 T 3 > Z α . $$ \frac{G_q-\frac{N^2q}{T}}{\sqrt{\frac{2{N}^2q}{T^2}+\frac{4{N}^3{q}^2\left({\kappa}_4-3\right)}{T^3}+\frac{8{N}^3{q}^2}{T^3}}}>{Z}_{\alpha }. $$
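Both standardised statistics can be sketched as below, under the assumption that the residual matrix `r` is stored as N × T (the layout and names are ours; see Li et al., 2019 for the exact formulas):

```python
import numpy as np

def hd_white_noise_test(r, q, kappa4=None):
    """G_q from (16) standardised two ways: the first z-score follows (2.8)
    of Li et al. (2019) (P unknown), the second follows (2.5) (P = I_N,
    requires kappa4). Reject H0 at level alpha if the z-score exceeds Z_alpha."""
    N, T = r.shape
    Gq = 0.0
    for tau in range(1, q + 1):
        # circular convention r_t = r_{t+T} for t <= 0
        S_tau = (r @ np.roll(r, tau, axis=1).T) / T
        Gq += np.trace(S_tau @ S_tau.T)
    S0 = (r @ r.T) / T
    s1 = np.trace(S0) / N
    s2 = np.trace(S0 @ S0) / N
    z_unknown = (Gq - N**2 * q / T * s1**2) / np.sqrt(
        2 * N**2 * q / T**2 * (s2 - N / T * s1**2)**2)
    z_identity = None
    if kappa4 is not None:
        z_identity = (Gq - N**2 * q / T) / np.sqrt(
            2 * N**2 * q / T**2
            + 4 * N**3 * q**2 * (kappa4 - 3) / T**3
            + 8 * N**3 * q**2 / T**3)
    return z_unknown, z_identity
```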

6 Simulation Study

6.1 Network Simulation

The symmetric matrix A in model (4) represents an undirected network structure, whose pattern varies across application scenarios. In this simulation study, we use four different mechanisms to simulate the corresponding networks. The network structure in Example 1 conforms to Assumption 3, which is required for geometric NED as shown in Theorem 2. The simulation mechanisms in Examples 2–4 test the robustness of our estimation against network structures that may violate Assumption 3.

Example 1. For each node i ∈ {1, 2, …, N}, it is connected to node j if and only if j is inside i's D-neighborhood. That is, in the adjacency matrix, a_ij = 1 if 0 < |i − j| ≤ D and a_ij = 0 otherwise. Figure 1(a) is a visualization of such a network with N = 100 and D = 10.

Figure 1. Visualized network structures with N = 100. (a) Example 1 (D = 10); (b) Example 2; (c) Example 3; (d) Example 4 (K = 10)

Example 2. (Network structure with random distribution) For each node i ∈ {1, 2, …, N}, we generate D_i from the uniform distribution U(0, 5), and then draw [D_i] samples randomly from {1, 2, …, N} to form a set S_i ([x] denotes the integer part of x). A = (a_ij) is generated by letting a_ij = 1 if j ∈ S_i and a_ij = 0 otherwise. In a network simulated with this mechanism, as indicated in Figure 1(b), there is no significantly influential node (i.e. no node with extremely large in-degree).

Example 3. (Network structure with power-law distribution) Following Clauset et al. (2009), for each node i in such a network, D_i is generated in the same way as in Example 2. Instead of selecting [D_i] samples uniformly from {1, 2, …, N}, the samples are drawn with probabilities $$ {p}_i={s}_i/{\sum}_{i=1}^N{s}_i $$, where s_i is generated from a discrete power-law distribution $$ \mathbb{P}\left\{{s}_i=x\right\}\propto {x}^{-a} $$ with scaling parameter a = 2.5. As shown in Figure 1(c), a few nodes have much larger in-degrees while most have fewer than 2. Compared to Example 2, a network with power-law distribution exhibits larger gaps between the influences of different nodes. This type of network is suitable for modeling social media such as Twitter and Instagram, where celebrities have huge influence while the ordinary majority has little.

Example 4. (Network structure with stochastic blocks) As proposed in Nowicki and Snijders (2001), in a network with stochastic block structure, all nodes are divided into blocks, and nodes from the same block are more likely to be connected than those from different blocks. To simulate this structure, the N nodes are randomly divided into K groups by assigning the labels {1, 2, …, K} to every node with equal probability. For two nodes i and j from the same group, let $$ \mathbb{P}\left({a}_{ij}=1\right)=0.5 $$, while for two nodes from different groups, $$ \mathbb{P}\left({a}_{ij}=1\right)=0.001/N $$. Hence it is very unlikely for nodes to be connected across groups. Our simulated network mimics this characteristic, as Figure 1(d) shows clear boundaries between groups. The block network is also appealing from a practical perspective; for instance, the price of one stock is highly correlated with those in the same industry sector.
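The four simulation mechanisms above can be sketched as follows. This is a minimal illustration under our own conventions; in particular, the finite support used to draw the power-law weights in Example 3 is our own truncation, and the function names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def neighborhood_net(N, D):
    """Example 1: a_ij = 1 iff 0 < |i - j| <= D (symmetric)."""
    idx = np.arange(N)
    A = (np.abs(idx[:, None] - idx[None, :]) <= D).astype(int)
    np.fill_diagonal(A, 0)
    return A

def random_net(N):
    """Example 2: node i links to [D_i] uniform targets, D_i ~ U(0, 5)."""
    A = np.zeros((N, N), dtype=int)
    for i in range(N):
        k = int(rng.uniform(0, 5))                 # integer part of D_i
        A[i, rng.choice(N, size=k, replace=False)] = 1
    np.fill_diagonal(A, 0)
    return A

def powerlaw_net(N, a=2.5):
    """Example 3: targets drawn with probability proportional to s_i, where
    P(s_i = x) is proportional to x^{-a} on a truncated support (our choice)."""
    support = np.arange(1, 1001)
    pmf = support**(-a)
    pmf /= pmf.sum()
    s = rng.choice(support, size=N, p=pmf)
    p = s / s.sum()
    A = np.zeros((N, N), dtype=int)
    for i in range(N):
        k = int(rng.uniform(0, 5))
        A[i, rng.choice(N, size=k, replace=False, p=p)] = 1
    np.fill_diagonal(A, 0)
    return A

def block_net(N, K):
    """Example 4: P(a_ij = 1) = 0.5 within a block, 0.001/N across blocks;
    symmetrized so that A represents an undirected network."""
    labels = rng.integers(0, K, size=N)
    same = labels[:, None] == labels[None, :]
    prob = np.where(same, 0.5, 0.001 / N)
    A = np.triu((rng.random((N, N)) < prob).astype(int), 1)
    return A + A.T
```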

In the next section, the simulation study is carried out on datasets generated according to process (3), in conjunction with the four types of adjacency matrices in Examples 1–4.

6.2 Simulation Results

Setting the true parameter θ0 as (0.1, 0.1, 0.2, 0.2, 0.2)′, we generate data according to process (3) with different sample sizes T and numbers of dimensions N. In our setting, T increases from 50 to 4000, while N increases at the relatively slower rates of O(√T) and O(T/log(T)), respectively, as shown in the following table:

T              50   100   200   500   800   1000   1500   2000   2500   3000   4000
N ≈ √T          7    10    14    22    28     31     38     44     50     54     63
N ≈ T/log(T)   12    21    37    80   119    144    205    263    319    374    482

For each combination of (T, N), M = 1000 datasets are simulated independently according to (3). Based on the mth (m = 1, 2, …, M) dataset, θ0 is estimated and the result is denoted $$ {\hat{\theta}}_m={\left({\hat{\theta}}_{km}\right)}^{\prime }={\left({\hat{\omega}}_m,{\hat{\alpha}}_m^{(1)},{\hat{\alpha}}_m^{(2)},{\hat{\lambda}}_m,{\hat{\beta}}_m\right)}^{\prime } $$. For k = 1, 2, …, 5, the following two measurements are used to evaluate the simulation results:
  1. root-mean-square error: RMSE k = M 1 m = 1 M ( θ ^ k m θ k 0 ) 2 $$ {\mathrm{RMSE}}_k=\sqrt{M^{-1}{\sum}_{m=1}^M{\left({\hat{\theta}}_{km}-{\theta}_{k0}\right)}^2} $$ ,
  2. coverage probability: CP k = M 1 m = 1 M 1 θ k 0 C I k m $$ {\mathrm{CP}}_k={M}^{-1}{\sum}_{m=1}^M{1}_{\left\{{\theta}_{k0}\in C{I}_{km}\right\}} $$ .
CI k m $$ {\mathrm{CI}}_{km} $$ is the 95% confidence interval defined as
CI k m = θ ^ km z 0 . 975 SE ^ km , θ ^ km + z 0 . 975 SE ^ km , $$ {\mathrm{CI}}_{km}=\left({\hat{\theta}}_{\mathrm{km}}-{z}_{0.975}{\hat{\mathrm{SE}}}_{\mathrm{km}},{\hat{\theta}}_{\mathrm{km}}+{z}_{0.975}{\hat{\mathrm{SE}}}_{\mathrm{km}}\right), $$
where the estimated standard error $$ {\hat{\mathrm{SE}}}_{km} $$ is calculated as the square root of the kth diagonal element of $$ {(NT)}^{-1}\left({\hat{\kappa}}_4-1\right){\hat{\Sigma}}^{-1} $$, and z_{0.975} is the 0.975 quantile of the standard normal distribution. To eliminate the effect of starting points, a different initial guess of θ is used for each m.
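The two performance measures and the confidence intervals above can be evaluated as in the following sketch, where `theta_hats` and `se_hats` are assumed to be M × 5 arrays collecting the estimates and estimated standard errors over the M replications (the layout and names are our assumptions):

```python
import numpy as np

Z975 = 1.959963984540054   # 0.975 quantile of the standard normal

def evaluate_simulation(theta_hats, se_hats, theta0):
    """Per-coordinate RMSE_k and coverage probability CP_k for the 95%
    confidence intervals CI_km = theta_hat_km +/- z_0.975 * SE_km."""
    rmse = np.sqrt(np.mean((theta_hats - theta0)**2, axis=0))
    lo = theta_hats - Z975 * se_hats   # CI lower bounds
    hi = theta_hats + Z975 * se_hats   # CI upper bounds
    cp = np.mean((lo <= theta0) & (theta0 <= hi), axis=0)
    return rmse, cp
```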

As demonstrated in line graphs (c) and (d) of Figures 2–5, the consistency of the estimator is evident, since the RMSE drops toward zero as T and N increase. Additionally, $$ \hat{\mathrm{SE}} $$ provides a reliable estimate of the true standard error, since the coverage probability (CP) converges to its theoretical value of 95% in graphs (a) and (b) of Figures 2–5. In conclusion, the asymptotic properties of our estimator in Theorem 3 are well supported by our simulation results, even for the network structures in Examples 2–4 that may violate Assumption 3.

Figure 2. Results of simulation (Example 1). (a) N ≈ √T; (b) N ≈ T/log(T); (c) N ≈ √T; (d) N ≈ T/log(T)
Figure 3. Results of simulation (Example 2). (a) N ≈ √T; (b) N ≈ T/log(T); (c) N ≈ √T; (d) N ≈ T/log(T)
Figure 4. Results of simulation (Example 3). (a) N ≈ √T; (b) N ≈ T/log(T); (c) N ≈ √T; (d) N ≈ T/log(T)
Figure 5. Results of simulation (Example 4). (a) N ≈ √T; (b) N ≈ T/log(T); (c) N ≈ √T; (d) N ≈ T/log(T)

Remark. It is worth noticing that the CPs converge more slowly in general when N = O(T/log(T)) than when N = O(√T). This phenomenon suggests that the performance of the estimator $$ \hat{\mathrm{SE}} $$ is closely related to the ratio of T to N. We repeat the simulation for 61 different combinations of (T, N), and the scatter plot in Figure 6 supports this conjecture: the convergence of $$ \hat{\mathrm{SE}} $$ appears to require only T/N → ∞, which includes the setting of Theorem 3, where T → ∞ and N → ∞ at a slower rate.

Figure 6. Coverage probabilities vary with the ratio of T and N

7 Empirical Data Analysis

In addition to the simulation studies, we test our model using real data from the Chinese Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE). The dataset consists of daily log returns of 286 stocks, observed over the two consecutive years 2019 and 2020 (T = 487, excluding closing days). These stocks come from four industry sectors as follows:
  • 75 stocks from automotive industry sector;
  • 73 stocks from financial industry sector;
  • 68 stocks from information industry sector;
  • 70 stocks from pharmaceutical industry sector.

Our model is tested within each sector, in which the number of stocks is approximately T/log(T) ≈ 79. Hence the estimates and inferences can be trusted according to the simulation study.

As an initial impression of the data from each category, time plots of the daily average log returns are presented in Figure 7. We also have shareholder information for each stock, based on which two stocks are considered connected when they share at least one common shareholder among their top ten shareholders. By this principle, four adjacency matrices are constructed and visualized in Figure 8 for the four industry sectors. Although the sparsity of these four networks is intuitively clear from Figure 8, we use the network density (ND) as a quantified measurement, defined as the ratio of the number of existing edges to the number of potential connections:
$$ \mathrm{ND} := 100\%\times \frac{\sum_{i=1}^N d_i}{N\left(N-1\right)}, $$
where d_i is the degree of node i.
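The density formula can be computed directly from an adjacency matrix; a one-line sketch (the function name is ours):

```python
import numpy as np

def network_density(A):
    """ND = 100% x (sum of degrees d_i) / (N(N-1)) for an adjacency
    matrix A with zero diagonal; a direct transcription of the formula."""
    N = A.shape[0]
    return 100.0 * A.sum() / (N * (N - 1))
```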
Figure 7. Average log returns of stocks from different industry sectors. (a) Automotive industry; (b) Financial industry; (c) Information industry; (d) Pharmaceutical industry
Figure 8. Visualization of networks for stocks from different industry sectors. (a) Automotive industry (ND = 1.26%); (b) Financial industry (ND = 8.11%); (c) Information industry (ND = 1.58%); (d) Pharmaceutical industry (ND = 2.82%)

The results of parameter estimation are summarized in Table I. It is worth noting that the estimated network effect λ for the automotive industry sector is not statistically significant, while all other estimates, whether for other coefficients or from other sectors, are significant at the 5% level. As indicated in Figure 8(a), this could be caused by the sparsity of the network structure, since the automotive industry has the lowest network density. Positive estimates of λ indicate positive correlation between the return of a stock and the returns of its neighbors. Compared with the other parameters, the estimates of β are much larger for all four categories. Strong memory of volatility has been observed in many econometric studies of daily data, and such persistence is stronger for data sampled at higher frequency, according to Nelson (1991).

Table I. Estimation results based on daily log-returns (2019 & 2020) of stocks from four industries.

Parameter   Estimate   SE        p-Value
Automotive industry
ω           0.000099   5.83e-07  <0.05
α(1)        0.199408   1.08e-02  <0.05
α(2)        0.136423   1.01e-02  <0.05
λ           0.004591   4.71e-03  0.16465
β           0.727756   1.17e-02  <0.05
Financial industry
ω           0.000043   3.12e-06  <0.05
α(1)        0.247765   1.41e-02  <0.05
α(2)        0.202237   1.47e-02  <0.05
λ           0.010469   5.35e-03  <0.05
β           0.737272   1.09e-02  <0.05
Information industry
ω           0.000105   6.39e-06  <0.05
α(1)        0.172737   9.34e-03  <0.05
α(2)        0.122312   8.86e-03  <0.05
λ           0.009475   4.03e-03  <0.05
β           0.745699   1.11e-02  <0.05
Pharmaceutical industry
ω           0.000063   4.15e-06  <0.05
α(1)        0.180950   1.05e-02  <0.05
α(2)        0.131722   1.06e-02  <0.05
λ           0.012929   4.06e-03  <0.05
β           0.753305   1.11e-02  <0.05
We now conduct a Wald test for the existence of the threshold effect based on the estimated parameters. Letting Γ := (0, 1, −1, 0, 0) and η := 0 in (13), the null hypothesis is:
H 0 : α 0 ( 1 ) = α 0 ( 2 ) . $$ {H}_0:{\alpha}_0^{(1)}={\alpha}_0^{(2)}. $$
As indicated in Table II, we can reject the null hypothesis with strong confidence and conclude that there is a highly significant threshold effect within each industry sector.
Table II. p-Values of Wald test on H0: α0(1) = α0(2)

Automotive industry   Financial industry   Information industry   Pharmaceutical industry
1.09e-10              2.16e-07             3.8e-06                3.17e-06

Using the diagnostic tool introduced in Section 5.2, we can check model adequacy by inspecting the correlations between the residual vectors $$ {\mathbf{r}}_t={\left[\frac{y_{1t}}{{\tilde{\sigma}}_{1t}\left({\hat{\theta}}_{NT}\right)},\dots, \frac{y_{Nt}}{{\tilde{\sigma}}_{Nt}\left({\hat{\theta}}_{NT}\right)}\right]}^{\prime } $$. We test the null hypothesis $$ {H}_0:{\mathbf{r}}_t=P{\mathbf{z}}_t $$ with P unknown and with P = I_N respectively; the results are summarized in Table III. In all sectors, we cannot reject the hypothesis that the residual vectors are high-dimensional white noise with $$ \mathbb{E}{\mathbf{r}}_t = 0 $$ and $$ \mathrm{Var}\left({\mathbf{r}}_t\right)=P{P}^{\prime } $$ over t. However, the stronger hypothesis $$ {H}_0:{\mathbf{r}}_t={\mathbf{z}}_t $$ is rejected, as there exist correlations between the residuals $$ \frac{y_{it}}{{\tilde{\sigma}}_{it}\left({\hat{\theta}}_{NT}\right)} $$ for different i. We might eliminate this deficiency by heterogeneous parameterization with coefficients ω_i, α_i(1), α_i(2), λ_i and β_i, or by considering a dynamic network structure. However, the purpose of introducing the network structure is to reduce the number of parameters for high-dimensional time series; besides, deriving limit theorems for models with heterogeneous parameters or a dynamic network could be theoretically challenging.

Table III. Results of high-dimensional white noise test on H 0 : r t = P z t $$ {H}_0:{\mathbf{r}}_t=P{\mathbf{z}}_t $$ with q = 3 $$ q=3 $$ and α = 0 . 01 $$ \alpha =0.01 $$
Automotive industry Financial industry Information industry Pharmaceutical industry
P $$ P $$ is unknown Not rejected Not rejected Not rejected Not rejected
P = I N $$ P={I}_N $$ Rejected Rejected Rejected Rejected

On the other hand, our results on the asymmetric effects of positive and negative news are quite different from those derived from univariate data in the literature. For instance, in a study by Engle and Ng (1993) of daily returns of the Japanese stock index TOPIX, it was found that negative news has a larger impact on future volatility. This phenomenon is plausible in a stock market, since investors lose confidence in an asset when it performs badly, adjust their portfolios, and so add more uncertainty to the future. However, this is not necessarily the case if we consider the whole system rather than one individual in isolation, ignoring the possible impact of its neighbors. In our estimation results, α(1) is uniformly larger than α(2), indicating a larger impact of good news on volatility. A more precise conclusion is that the volatility of one individual is more sensitive to its own good news, which does not actually contradict the conclusion of Engle and Ng (1993): in the univariate case, it remains unknown how much of the 'bad news' effect is contributed by bad performance from a systemic perspective. Our results show that good news has a larger 'local influence', as indicated by α(1), while bad news, despite having less 'local influence', may spread faster and have a larger 'global influence' on neighbors through network connections. This possibility suggests a future extension of our model in which the threshold effect is also applied to the coefficient λ, allowing good news and bad news to have asymmetric network effects.

8 Conclusion

In this article, we propose a TNGARCH model that takes into consideration the network effect, as well as the threshold effect of shocks on volatilities. Our model can describe asymmetric properties of volatilities of high-dimensional time series without increasing parameterization complexity. Strict stationarity when N is fixed, as well as near-epoch dependence when N → ∞, are discussed. The parameters are estimated by quasi-maximum likelihood estimation, and the consistency and asymptotic normality of the proposed estimator are proved. The results of the simulation study support the theoretical properties of the QMLE. Finally, our model is fitted to real stock data containing 286 stocks from four industries on the SSE and SZSE. The empirical results reveal that although volatility is more sensitive to bad news in the univariate case, once the network structure is considered, the majority of the revision of an individual's volatility may be due to the impact of bad news from its neighbors; hence the 'local influence' of bad news is not necessarily larger than that of good news.

There is room for extension of our methodology, which could lead to interesting topics for future research. In Theorem 3 we derived asymptotic properties when T → ∞ and N → ∞ at a slower rate; according to the simulation study, our estimation method enables reliable inference on the parameters even when the data have hundreds of dimensions. The limitation is also obvious: as shown in Figures 2–5, to get a decent approximation of the standard errors we need to collect 4000 samples even when the number of dimensions is only about 482, and the dimension could be far higher in real-world situations. User data collected from a social network often consists of millions of individual accounts, whereas it may be impossible to collect a sufficient number of samples over time even at daily frequency. Therefore our model would be applicable on a much larger scale if its statistical properties could be derived when N increases at the same, or even a higher, rate than T. As far as we know, there is no published work in the literature that solves this problem theoretically for GARCH-type models. Another limitation is that the way we consider the network effect is simplified in two respects. First, the network structure is deterministic rather than stochastic over time; embedding a random network in our model would make more sense, but it would raise the complexity of the model and may cause problems in the estimation of parameters (see Chandrasekhar and Lewis, 2011). Second, only one type of individual-to-individual relation is considered, since the network structure is constructed solely from common shareholders. Zhu et al. (2023) constructed their factor-augmented network using several types of relations, including individual-to-individual and factor-to-individual relations. Bringing more information into consideration would possibly improve the adequacy of the model, and we leave this for future research.

Acknowledgments

Many thanks to the Editor, the Co-Editor and the Referees for their insightful comments. The referees' suggestions led to significant improvements in our article. The second author's work was partially supported by the National Natural Science Foundation of China (Grant No. 12171161).

Data Availability Statement

Our dataset consists of daily log returns of 286 stocks, observed over the two consecutive years 2019 and 2020 (excluding closing days), from the Chinese Shanghai Stock Exchange (SSE) and Shenzhen Stock Exchange (SZSE). The data that support the findings of this study are available on request.
