We thank K. A. Aastveit, G. Barone-Adesi, M. Deistler, M. Del Negro, R. Engle, D. Giannone, T. Götz, K. Hrvol'ovà, C. Hurlin, A. Levchenko, S. Ng, A. Onatski, C. Pérignon, G. Urga, M. Watson, and B. Werker for their useful comments. We also thank three referees for insightful comments which helped us to significantly improve the paper. The first author would like to acknowledge that this work has been co-funded by the European Research Council (ERC) under the European Union Horizon 2020 research and innovation programme Proof of Concept Grant Agreement 640924 as well as the Cyprus Research Promotion Foundation and the European Regional Development Fund research project EXCELLENCE/1216/0074. The second author gratefully acknowledges the Swiss National Science Foundation (SNSF) for Grant 105218 162633.

E. Andreou

Department of Economics, University of Cyprus

CEPR

P. Gagliardini

Faculty of Economics, Università della Svizzera italiana (USI, Lugano)

Swiss Finance Institute (SFI)

E. Ghysels

University of North Carolina—Chapel Hill

Kenan-Flagler Business School

CEPR

M. Rubin

EDHEC Business School

First published: 25 July 2019

https://doi.org/10.3982/ECTA14690

Abstract

We derive asymptotic properties of estimators and test statistics to determine, in a grouped data setting, common versus group-specific factors. Despite the fact that our test statistic for the number of common factors, under the null, involves a parameter at the boundary (related to unit canonical correlations), we derive a parameter-free asymptotic Gaussian distribution. We show how the group factor setting applies to mixed-frequency data. As an empirical illustration, we address the question of whether Industrial Production (IP) is still the dominant factor driving the U.S. economy using a mixed-frequency data panel of IP and non-IP sectors. We find that a single common factor explains 89% of IP output growth and 61% of total GDP growth despite the diminishing role of manufacturing.

1 Introduction

Estimating and testing for the existence of common factors among large panels with group-specific factors is of interest in various areas in economics as well as other fields. For instance, for the unobservable pervasive factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0001$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0002$ estimated from two separate panels of data, one may be interested in testing how many factors are common between them. In this paper, we introduce a new test for the number of canonical correlations between vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0003$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0004$ that are equal to 1, and we derive its asymptotic distribution for large T and N, where N denotes the minimum cross-sectional size across groups, in the context of approximate factor models in the spirit of Bai and Ng (2002), Stock and Watson (2002), and Bai (2003). While there is an extensive literature on approximate group factor models, there does not exist a unifying inferential theory for a large panel data framework.1 Our main theoretical contribution is an inference procedure for the number of common and group-specific factors based on canonical correlation analysis of the principal component (PC) estimates for each group. The first-stage estimation of PCs affects the subsequent canonical correlation analysis, and this complicates the asymptotic analysis. As a result, the asymptotic distribution of the test statistics is nonstandard in terms of convergence rates and involves a nontrivial bias correction. 
We show that, under the null of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0005$ common factors across the two groups, the sum of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0006$ largest estimated canonical correlations minus $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0007$ , recentered and rescaled by (parameter-dependent) functions of N and T, converges in distribution to a standard Gaussian. We also provide a feasible version of the statistic, propose estimators for the common and group-specific factors, and characterize their asymptotic distribution. The inference procedure is general in scope and also of interest in many applications other than the one considered in this paper. Our work is most closely related to Chen (2010, 2012), Wang (2012), Ando and Bai (2015), and Breitung and Eickmeier (2016). However, the existing literature does not provide a comprehensive asymptotic treatment of group factor models for large T and N, especially regarding testing hypotheses on the number of common and group-specific factors.

As a specific application of group factor models, we consider panels of data sampled at different frequencies and study the role of Industrial Production (IP) sectors in the U.S. economy. Our empirical application revisits the analysis of Foerster, Sarte, and Watson (2011), who used factor methods to decompose industrial production into components arising from aggregate shocks and idiosyncratic sector-specific shocks. They focused exclusively on the IP sectors. We have fairly extensive data on U.S. industrial production. The data cover 117 sectors that make up aggregate IP, each sector roughly corresponding to a four-digit industry classification using NAICS. These data are published monthly and therefore form a rich panel. On the other hand, contrary to IP, we do not have monthly or quarterly data for the cross-section of U.S. output across non-IP sectors, but we do have such data on an annual basis. Indeed, the U.S. Bureau of Economic Analysis provides Gross Domestic Product (GDP) and Gross Output by industry (not only for IP sectors) annually. Hence, we have a panel consisting of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0008$ (H for high-frequency) IP sector growth series sampled across MT time periods, where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0009$ for quarterly data and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0010$ for monthly data, with T the number of years. Moreover, we also have a panel of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0011$ (L for low-frequency) non-IP sectors, such as Services and Construction, which is only observed over T years. Hence, generically speaking, we have a high-frequency panel data set of size $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0012$ and a low-frequency panel data set of size $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0013$ . 
We allow for the presence of three types of unobservable factors: (1) those which explain variations in both panels/groups, and therefore are common factors, and two kinds of group-specific (in our application, frequency-specific) factors, namely, (2) those exclusively pertaining to IP, and (3) those exclusively affecting non-IP sectors.

Using the inferential theory for group factor models developed in this paper, we find that a single common factor explains around 89% of the variability in the aggregate IP output growth index, and a factor specific to IP has very little additional explanatory power, during the period 1977–2011. This implies that the single common factor can be interpreted as an IP factor. Moreover, a large part of the variability of GDP output growth in service sectors, such as Transportation and warehousing (62%); Arts, entertainment, recreation, accommodation, and food services (53%), as well as other sectors, for example, Retail trade (31%), is also explained by the common factor. A single low-frequency factor, unrelated to manufacturing but related to sectors such as Finance, insurance, real estate, rental and leasing (21%); Educational services, health care, and social assistance (18%), as well as Government (22%), drives GDP growth variability. The results reflect the great advantage of the mixed-frequency setting, compared to the single-frequency one, in the context of our IP and GDP sector application. The mixed-frequency panel setting allows us to identify and estimate the high-frequency values of factors common to IP and non-IP sectors. With IP (i.e., high-frequency) data only, we cannot assess what is common with the non-IP sectors. With low-frequency data only, we cannot estimate the high-frequency common factors from a large cross-section.

The rest of the paper is organized as follows. In Section 2, we introduce the group factor model and discuss identification. In Section 3, we study estimation and inference on the number of common factors. The large sample theory appears in Section 4. Section 5 introduces mixed-frequency group factor models, whereas Section 6 presents the results of a Monte Carlo study. Section 7 covers the empirical application. Section 8 concludes the paper. The Technical Appendix of the paper provides regularity conditions and proofs of theorems. The Supplemental Material (Andreou, Gagliardini, Ghysels, and Rubin (2019)), henceforth referred to as the Online Appendix (OA), provides the proofs of lemmas, reports supplementary theoretical results on identification and estimation, describes the data set used in the empirical application in detail, and presents supplementary empirical results as well as details of the Monte Carlo simulation design and results.

2 Identification in Group Factor Models

We use the following notation for the group factor model setting:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0014$ (2.1)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0015$ collects observations for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0016$ individuals in group j, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0017$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0018$ are the matrices of factor loadings, and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0019$ is the vector of error terms, with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0020$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0021$ (for simplicity, we focus on cases involving only two groups). The dimensions of the common factor $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0022$ and the group-specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0023$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0024$ are respectively $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0025$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0026$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0027$ . In the absence of common factors, we set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0028$ , while in cases without group-specific factors, we set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0029$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0030$ . The group-specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0031$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0032$ are orthogonal to the common factor $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0033$ . Since the unobservable factors can be standardized, we assume

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0034$ (2.2)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0035$ denotes the identity matrix of order k (we refer to (2.2) as Assumption A.2 in the list of regularity conditions in Appendix A). We allow for a nonzero covariance $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0036$ between group-specific factors.

In standard linear latent factor models, the normalization induced by an identity factor variance-covariance matrix identifies the factor space up to an orthogonal rotation (and change of signs). Under an identification condition implied by our set of assumptions, the rotational invariance of (2.1)–(2.2) allows only for separate rotations among the components of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0037$ , among those of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0038$ , and finally those of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0039$ . The rotational invariance of (2.1)–(2.2) therefore maintains the interpretation of common and group-specific factors.2 We consider the generic setting of equation (2.1) and let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0043$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0044$ , be the dimensions of the pervasive factor spaces for the two groups, and define $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0045$ . We collect the factors of each group in the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0046$ -dimensional vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0047$ , and define their variance and covariance matrices: $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0048$ . From (2.2), we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0049$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0050$ . We want to show that the factor space dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0051$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0052$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0053$ are identifiable using canonical correlation analysis applied to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0054$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0055$ . 
In particular, we want to propose an identification strategy for these dimensions and the corresponding factor spaces using canonical correlations and directions. Before stating the main identification result, let us first recall some basics from canonical analysis (see, e.g., Anderson (2003) and Magnus and Neudecker (2007)). Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0056$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0057$ , denote the ordered canonical correlations between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0058$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0059$ . The $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0060$ largest eigenvalues of matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0061$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0062$ are the same, and are equal to the squared canonical correlations $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0063$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0064$ between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0065$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0066$ . The associated eigenvectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0067$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0068$ ), with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0069$ , of matrix R (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0070$ ) standardized such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0071$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0072$ ) are the canonical directions which yield the canonical variables $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0073$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0074$ ).

The next proposition deals with determining $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0075$ , the number of common factors, using canonical correlations between the vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0076$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0077$ which are unobserved and estimated by principal components.

Proposition 1. Under Assumption A.2, the following hold:

(i) If $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0078$ , the largest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0079$ canonical correlations between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0080$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0081$ are equal to 1, and the remaining $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0082$ canonical correlations are strictly less than 1.
(ii) Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0083$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0084$ matrix whose columns are the canonical directions for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0085$ associated with the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0086$ canonical correlations equal to 1, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0087$ . Then $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0088$ (up to an orthogonal matrix), for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0089$ .
(iii) If $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0090$ , all canonical correlations between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0091$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0092$ are strictly less than 1.
(iv) Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0093$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0094$ ) be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0095$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0096$ ) matrix whose columns are the eigenvectors of matrix R (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0097$ ) associated with the smallest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0098$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0099$ ) eigenvalues. Then $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0100$ (up to an orthogonal matrix) for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0101$ .

Proposition 1 shows that the number of common factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0102$ , the common factor space spanned by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0103$ , and the spaces spanned by group-specific factors, can be identified from the canonical correlations and canonical variables of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0104$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0105$ (see OA Appendix C.1 for the proof). Therefore, the factor space dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0106$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0107$ , and factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0108$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0109$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0110$ , are identifiable (up to a rotation) from information that can be inferred by disjoint principal component analysis (PCA) on the two groups. Indeed, disjoint PCA on the two groups allows us to identify the dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0111$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0112$ , and vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0113$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0114$ up to linear one-to-one transformations. The latter indeterminacy does not prevent identifiability of the common and group-specific factors from Proposition 1, due to the invariance of canonical correlations and canonical variables under linear one-to-one transformations of vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0115$ .3
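
As a numerical illustration of Proposition 1(i) and of the invariance property just mentioned, the following Python sketch computes population canonical correlations for a toy configuration. It is only a sketch under stated assumptions: the dimensions, the covariance matrix `Phi` between group-specific factors, the stacking order (common factors first), and the transformation matrices `A1` and `A2` are illustrative choices, and the matrix R is taken to be the standard canonical-analysis product V11^{-1} V12 V22^{-1} V21.

```python
import numpy as np

def canonical_correlations(V11, V12, V22):
    """Canonical correlations implied by second-moment matrices.

    Uses the standard product R = V11^{-1} V12 V22^{-1} V21, whose largest
    eigenvalues are the squared canonical correlations.
    """
    R = np.linalg.solve(V11, V12) @ np.linalg.solve(V22, V12.T)
    eigvals = np.sort(np.linalg.eigvals(R).real)[::-1]
    k_bar = min(V11.shape[0], V22.shape[0])
    # Clip guards against rounding slightly outside [0, 1].
    return np.sqrt(np.clip(eigvals[:k_bar], 0.0, 1.0))

# Hypothetical dimensions: one common factor, two specific factors in
# group 1, one specific factor in group 2.
kc, ks1, ks2 = 1, 2, 1
# Assumed covariance between group-specific factors (singular value < 1).
Phi = np.array([[0.5], [0.3]])

# Stacked factor vectors h_j = (f_c', f_js')' with identity variances.
V11 = np.eye(kc + ks1)
V22 = np.eye(kc + ks2)
V12 = np.zeros((kc + ks1, kc + ks2))
V12[:kc, :kc] = np.eye(kc)   # the common factor is shared by both groups
V12[kc:, kc:] = Phi          # specific factors may be correlated

rho = canonical_correlations(V11, V12, V22)

# Invariance check: replace h_j by A_j h_j for invertible A_j.
A1 = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])
A2 = np.array([[1.0, 2.0], [0.0, 1.0]])
rho_rot = canonical_correlations(A1 @ V11 @ A1.T, A1 @ V12 @ A2.T, A2 @ V22 @ A2.T)

print("canonical correlations:", rho)
print("after linear transformations:", rho_rot)
```

In this configuration exactly `kc` canonical correlations equal 1, the remaining one equals the singular value of `Phi`, and both are unchanged when the stacked factor vectors are replaced by invertible linear transformations, which is the situation faced after disjoint PCA.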

3 Estimation and Inference on the Number of Common Factors

3.1 Estimators

Let us first assume that the true number of factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0117$ in each group $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0118$ is known, and also that the true number of common factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0119$ is known. Proposition 1 suggests the following estimation procedure for the common factors. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0120$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0121$ be estimated (up to a rotation) by extracting the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0122$ principal components (PCs) from each sub-panel j, and denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0123$ these PC estimates of the factors, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0124$ . Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0125$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0126$ matrix of estimated PCs extracted from panel $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0127$ associated with the largest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0128$ eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0129$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0130$ . Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0131$ denote the empirical covariance matrix of the estimated vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0132$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0133$ , that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0134$ , and let matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0135$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0136$ be defined as

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0137$ (3.1)

Note that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0138$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0139$ . Matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0140$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0141$ have the same nonzero eigenvalues. The $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0142$ largest eigenvalues of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0143$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0144$ ), denoted by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0145$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0146$ , are the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0147$ squared sample canonical correlations between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0148$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0149$ . The associated $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0150$ canonical directions, collected in the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0151$ matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0152$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0153$ matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0154$ ), are the eigenvectors associated with the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0155$ largest eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0156$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0157$ ), normalized to have length 1 with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0158$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0159$ ). It also holds that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0160$ (3.2)

Definition 1. Two estimators of the common factors vector are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0161$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0162$ .

From equation (3.2), we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0163$ , and similarly for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0164$ , that is, the estimated common factor values satisfy in-sample the identity variance-covariance normalization in (2.2). Let matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0165$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0166$ ) be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0167$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0168$ ) matrix collecting $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0169$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0170$ ) eigenvectors associated with the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0171$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0172$ ) smallest eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0173$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0174$ ), normalized to have length 1 with respect to the matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0175$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0176$ ). It also holds that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0177$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0178$ . The estimators of the group-specific factors can be defined analogously to the estimators of the common factors: $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0179$ . By construction, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0180$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0181$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0182$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0183$ ) are orthogonal in-sample. An alternative estimator for the group-specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0184$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0185$ ) is obtained by computing the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0186$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0187$ ) principal components of the variance-covariance matrix of the residuals of the regression of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0188$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0189$ ) on the estimated common factors.4 Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0190$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0191$ matrix of estimated common factors, and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0192$ the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0193$ matrix collecting the estimated loadings:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0194$ (3.3)

Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0195$ be the residuals of the regression of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0196$ on the estimated common factor $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0197$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0198$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0199$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0200$ matrix of the regression residuals, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0201$ .

Definition 2. Estimators of the specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0202$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0203$ ) are defined as the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0204$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0205$ ) PCs of sub-panel $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0206$ (resp. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0207$ ), namely, the columns of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0208$ matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0209$ are the eigenvectors associated with the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0210$ largest eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0211$ , normalized to have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0212$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0213$ .

Note that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0214$ is orthogonal in-sample both to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0215$ and to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0216$ . This sample orthogonality property matching the population one (see (2.2)) explains why we focus on the estimation procedure in Definition 2 compared to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0217$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0218$ . Moreover, we define $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0219$ as the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0220$ matrix collecting the loadings estimators:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0221$ (3.4)

where the second equality follows from the in-sample orthogonality of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0222$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0223$ , and the normalization of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0224$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0225$ .
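
The estimation steps of this subsection can be sketched on simulated data as follows. This is a minimal illustration rather than the authors' code: the simulation design (Gaussian factors, loadings, and errors), the sample sizes, and all function and variable names are assumptions, and the PCs are scaled so that their in-sample second-moment matrix is the identity, which we take to be the normalization intended in the text.

```python
import numpy as np

def principal_components(Y, k):
    """First k PCs of a T x N panel, scaled so that F'F / T = I_k."""
    T, N = Y.shape
    _, vecs = np.linalg.eigh(Y @ Y.T / (N * T))   # eigenvalues in ascending order
    return np.sqrt(T) * vecs[:, ::-1][:, :k]

def specific_factors(Yj, F_c, ks):
    """Regress panel j on the common factor estimates, then take PCs of residuals."""
    T = Yj.shape[0]
    lam_c_hat = Yj.T @ F_c / T                    # OLS loadings, since F_c'F_c / T = I
    resid = Yj - F_c @ lam_c_hat.T
    return principal_components(resid, ks), lam_c_hat

# Hypothetical simulation design (not taken from the paper).
rng = np.random.default_rng(0)
T, N1, N2 = 300, 150, 150
kc, ks1, ks2 = 1, 1, 1
f_c = rng.normal(size=(T, kc))
f_s = [rng.normal(size=(T, ks1)), rng.normal(size=(T, ks2))]
Y = []
for N, fs in zip((N1, N2), f_s):
    lam_c = rng.normal(size=(N, kc))
    lam_s = rng.normal(size=(N, fs.shape[1]))
    Y.append(f_c @ lam_c.T + fs @ lam_s.T + rng.normal(size=(T, N)))

# Step 1: PCs from each sub-panel (pervasive factors, up to rotation).
H1 = principal_components(Y[0], kc + ks1)
H2 = principal_components(Y[1], kc + ks2)

# Step 2: sample canonical analysis. With V11_hat = V22_hat = I, the matrix
# R_hat reduces to V12_hat V12_hat'.
V12 = H1.T @ H2 / T
eigvals, eigvecs = np.linalg.eigh(V12 @ V12.T)
rho_hat = np.sqrt(np.clip(eigvals[::-1], 0.0, 1.0))   # sample canonical correlations
W1 = eigvecs[:, ::-1][:, :kc]                         # canonical directions, group 1

# Step 3: common factor estimate built from group 1 (in the spirit of Definition 1).
F_c = H1 @ W1

# Step 4: group-specific factors from regression residuals (in the spirit of Definition 2).
F_s1, L1c = specific_factors(Y[0], F_c, ks1)
F_s2, L2c = specific_factors(Y[1], F_c, ks2)

print("sample canonical correlations:", rho_hat)
print("|corr(F_c, true f_c)|:", abs(np.corrcoef(F_c[:, 0], f_c[:, 0])[0, 1]))
```

Because the residuals are orthogonal to the estimated common factors by construction, the residual-based specific factor estimates inherit the in-sample orthogonality discussed above.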

3.2 Inference on the Number of Common Factors via Canonical Correlations

One of our objectives is to determine how many factors are common between groups in the generic factor model in equation (2.1), that is, we consider the problem of inferring the dimension $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0226$ of the common factor space. To do so, we first consider the case where the number of pervasive factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0227$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0228$ in each sub-panel is known, hence $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0229$ is also known, and we relax this assumption in the next section. From Proposition 1, dimension $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0230$ is the number of unit canonical correlations between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0231$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0232$ . We consider the hypotheses: $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0233$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0234$ and finally, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0235$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0236$ are the ordered canonical correlations of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0237$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0238$ . Hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0239$ corresponds to the absence of common factors. Generically, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0240$ corresponds to the case of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0241$ common factors and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0242$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0243$ group-specific factors in each group. The largest possible number of common factors is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0244$ . 
In order to select the number of common factors, let us consider the following sequence of tests: $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0245$ against $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0246$ , for each $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0247$ . To test $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0248$ against $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0249$ , for any given $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0250$ , we consider

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0251$ (3.5)

The statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0252$ corresponds to the sum of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0253$ largest sample canonical correlations of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0254$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0255$ . We reject the null $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0256$ when $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0257$ is negative and large. The critical value is obtained from the large sample distribution of the statistic when $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0258$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0259$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0260$ , provided in Section 4. The number of common factors is estimated by sequentially applying the tests starting from $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0261$ .
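As an illustration of the quantity described above, the sum of the largest sample canonical correlations between two sets of estimated PCs could be computed along the following lines. This is a minimal sketch, not the authors' code: the function names and the array layout (time in rows, factors in columns) are assumptions, and the inputs are assumed to be already centered, so uncentered second moments are used.

```python
import numpy as np

def canonical_correlations(h1, h2):
    """Sample canonical correlations between the columns of h1 (T x k1)
    and h2 (T x k2), returned in decreasing order.
    Assumption: both blocks are already centered and have full column rank."""
    # Orthonormalise each block with a thin QR decomposition.
    q1, _ = np.linalg.qr(h1)
    q2, _ = np.linalg.qr(h2)
    # The singular values of Q1'Q2 are the canonical correlations.
    rho = np.linalg.svd(q1.T @ q2, compute_uv=False)
    # Guard against rounding slightly outside [0, 1].
    return np.clip(rho, 0.0, 1.0)

def sum_largest_canonical_correlations(h1, h2, r):
    """Sum of the r largest sample canonical correlations (hypothetical helper)."""
    return float(canonical_correlations(h1, h2)[:r].sum())
```

When the two blocks share one direction exactly, the largest canonical correlation equals one up to rounding, which is the degenerate case discussed in Section 4 for observed factors.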

3.3 Estimation and Inference When $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0262$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0263$ Are Unknown

When the true number of pervasive factors is not known, but consistent estimators $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0264$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0265$ , say, are available, the asymptotic distribution and rate of convergence for the test statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0266$ based on $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0267$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0268$ are the same as those based on the true number of factors. Intuitively, this holds because the consistency of estimators $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0269$ , that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0270$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0271$ , implies that the estimation error for the number of pervasive factors is asymptotically negligible.5 Therefore, the asymptotic distributions and rates of convergence of the test statistics and factor estimators will be derived assuming that the true dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0272$ in each group, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0273$ , are known. Examples of consistent estimators for the numbers of pervasive factors include the procedures proposed by Bai and Ng (2002) (applied in Section 7), Onatski (2010), or Ahn and Horenstein (2013).

4 Large Sample Theory

In this section, we derive the large sample distribution of the test statistic for the dimension of the common factor space and provide a feasible version of it. We also define a consistent selection procedure for the number of common factors (the asymptotic distribution of the factor and loading estimates is provided in the OA). We consider the joint asymptotics $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0274$ . Let us denote $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0275$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0276$ . Without loss of generality, we set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0277$ , which implies $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0278$ . We assume that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0279$ (4.1)

which we refer to as Assumption A.1 in the list of regularity conditions in Appendix A. The conditions in (4.1) allow for a wide range of relative growth rates for the time-series and cross-sectional panel dimensions as long as N grows faster than $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0280$ and slower than $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0281$ . They accommodate both the case where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0282$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0283$ grow at the same rate, and the case where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0284$ grows faster than $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0285$ , namely, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0286$ . To derive the large sample distribution of the test statistic for the number of common factors, we deploy an asymptotic expansion for the estimated PCs in each group, which extends results in Bai and Ng (2002, 2006), Stock and Watson (2002), and Bai (2003), and which we report in Proposition 3 in Appendix B. For $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0287$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0288$ , the estimate $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0289$ is asymptotically equivalent (in a sense made precise in Proposition 3), up to negligible terms, to

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0290$ (4.2)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0291$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0292$ is a nonsingular stochastic factor rotation matrix, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0293$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0294$ is the limit average error variance conditional on the sigma field $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0295$ generated by current and past factor values $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0296$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0297$ . The zero-mean term $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0298$ drives the randomness in the group factor estimates conditional on the factor path. Vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0299$ is measurable with respect to the factor path and induces a bias term at order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0300$ in the principal components estimates. Vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0301$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0302$ depend on sample sizes but, for convenience, we omit the indices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0303$ , T.

Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0304$ be the conditional covariance between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0305$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0306$ , that is,

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0307$

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0308$ , for j, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0309$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0310$ . We set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0311$ . Moreover, let us define the (probability) limits $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0312$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0313$ , and let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0314$ be the large sample counterpart of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0315$ .

Theorem 1. Under Assumptions A.1–A.7, and the null hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0316$ of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0317$ common factors, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0318$ (4.3)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0319$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0320$ , and

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0321$

and where the upper index (c) denotes the upper $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0322$ block of a vector, and the upper index $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0323$ denotes the upper-left $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0324$ block of a matrix.

Proof. See Appendix B.1. □

The matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0325$ is the upper-left $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0326$ block of the limit covariance matrix between $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0327$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0328$ , where the weight $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0329$ accounts for the different sample sizes in the two sub-panels. Vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0330$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0331$ are residuals of the orthogonal projection of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0332$ onto $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0333$ in-sample, and of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0334$ onto $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0335$ in the population, respectively. In fact, the orthogonal projection of vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0336$ along vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0337$ can be absorbed in the transformation matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0338$ in expansion (4.2), and therefore is asymptotically immaterial for the computation of canonical correlations and for the large sample distribution of the test statistic.

The asymptotic distribution in Theorem 1 is valid for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0339$ (Assumption A.1). It covers the variety of convergence rates and asymptotic biases and variances the statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0340$ features, for different relative growth rates of sample dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0341$ when $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0342$ , namely,

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0343$

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0344$ if $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0345$ . In particular, the convergence rate of the statistic is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0346$ . When $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0347$ (see below), the convergence rate is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0348$ and the asymptotic variance is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0349$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0350$ . Note that, if the PCs in the groups were observed, then testing for unit canonical correlations would be degenerate, as it involves testing for deterministic relationships between random vectors. The estimation errors of the PCs drive the asymptotic distribution of the statistic, with a nonstandard convergence rate. It might be surprising that we find an asymptotic Gaussian distribution when testing a hypothesis for a parameter at the boundary, that is, canonical correlations equal to 1. What makes the test asymptotically Gaussian is the fact that there is a re-centering of the statistic due to the sampling error in the first-step estimates of the PCs, and a CLT applies to the recentered squared estimation errors. The re-centering term involves a component of order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0351$ and a component of order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0352$ . One may wonder whether this Gaussian asymptotic distribution is a good approximation for the small sample distribution of the recentered and rescaled $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0353$ . In Section 6 and OA Section E, we report the results of extensive Monte Carlo simulations showing that this is the case in a setting that mimics our empirical application.

To get a feasible distributional result for the statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0354$ , we need consistent estimators for the unknown scalars $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0355$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0356$ , and matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0357$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0358$ in Theorem 1. To simplify the analysis, we assume at this stage that the errors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0359$ are (i) uncorrelated across sub-panels j and individuals i, at all leads and lags, and (ii) a conditionally homoscedastic martingale difference sequence for each individual i, conditional on the factor path, that is,

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0360$ (4.4)

for all j, i, t, h (see Assumption A.9). Then, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0361$ (4.5)

Matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0362$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0363$ do not depend on time. The projection residual $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0364$ vanishes because $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0365$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0366$ , is spanned by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0367$ . This explains why $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0368$ is null and the convergence rate is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0369$ . Similarly, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0370$ , so that the bias term at order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0371$ is zero.6 Under (4.4), the bias component $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0376$ in the PC estimates is immaterial since it can be absorbed in the transformation matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0377$ in (4.2). In fact, Connor and Korajczyk (1986) and Bai (2003, Theorem 4) showed that the principal component estimator is consistent even for fixed T in such a case. In Theorem 2 below, we replace $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0378$ with its large sample limit $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0379$ , and matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0380$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0381$ with consistent estimators. We show that the estimation error for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0382$ in the bias adjustment is of order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0383$ , and therefore the asymptotic distribution of the statistic is unchanged.

Theorem 2. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0384$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0385$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0386$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0387$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0388$ are the loadings estimators defined in equations (3.3) and (3.4), $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0389$ with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0390$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0391$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0392$ . Define the test statistic:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0393$ (4.6)

and let Assumptions A.1–A.9 hold. Then:

(i) Under the null hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0394$ of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0395$ common factors, we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0396$ .
(ii) Under the alternative hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0397$ , we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0398$ .

Proof. See Appendix B.2. □

The feasible asymptotic distribution in Theorem 2 is the basis for a one-sided test of the null hypothesis of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0399$ common factors. The rejection region for a test of the null hypothesis at asymptotic level α is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0400$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0401$ is the α-quantile of the standard Gaussian distribution for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0402$ . From Theorem 2 (ii), the test is consistent.
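The decision rule of this one-sided test can be sketched as follows. The sketch assumes that the rejection region takes the form "feasible statistic below the α-quantile of the standard Gaussian distribution", which is consistent with rejecting for large negative values of the statistic; the function name and the default level are illustrative only.

```python
from statistics import NormalDist

def rejects_null(feasible_stat, alpha=0.05):
    """One-sided test of the null of r common factors.
    Assumption: the null is rejected when the standardized (feasible)
    statistic falls below z_alpha, the alpha-quantile of N(0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(alpha)  # negative whenever alpha < 0.5
    return feasible_stat < z_alpha
```

For instance, at the 5% level the critical value is approximately -1.645, so a statistic of -2.0 leads to rejection while a statistic of -1.0 does not.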

One way to implement the model selection procedure to estimate the number of common factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0403$ proposed in Section 3.2 consists in testing sequentially the null hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0404$ , against the alternative $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0405$ , using the test statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0406$ defined in Theorem 2 for any generic number r of common factors. A “naive” procedure is initiated with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0407$ , proceeds backwards, and is stopped at the largest integer $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0408$ such that the null $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0409$ cannot be rejected, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0410$ . Otherwise, set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0411$ if the test rejects the null $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0412$ for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0413$ . This “naive” procedure is not a consistent estimator of the number of common factors. Indeed, asymptotically, a nonzero probability α of underestimating $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0414$ exists coming from the type I error of the test of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0415$ against $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0416$ , when the true number of factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0417$ is strictly positive.
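The backward sequential procedure described in this paragraph can be sketched as below. The dictionary of precomputed feasible statistics and the function name are hypothetical; the rejection rule (statistic below the α-quantile of the standard Gaussian) is assumed as in the one-sided test of Theorem 2. With a fixed α this is the "naive" procedure; a level that shrinks with the sample sizes would be passed in through the same `alpha` argument.

```python
from statistics import NormalDist

def select_num_common_factors(stats_by_r, alpha):
    """Backward sequential selection of the number of common factors.

    stats_by_r: dict mapping each candidate r = 1, ..., r_max to the
                feasible statistic computed under the null of r common
                factors (hypothetical input, computed elsewhere).
    alpha:      level of each individual test, strictly between 0 and 1.
    """
    z_alpha = NormalDist().inv_cdf(alpha)
    # Start from the largest candidate and move backwards.
    for r in sorted(stats_by_r, reverse=True):
        if stats_by_r[r] >= z_alpha:   # null of r common factors not rejected
            return r
    return 0                           # every null rejected: no common factor
```

For example, with statistics `{3: -5.0, 2: -0.3, 1: 0.4}` and `alpha=0.05`, the null is rejected for three common factors but not for two, so the procedure returns 2.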

Building on the results in Pötscher (1983), Cragg and Donald (1997), and Robin and Smith (2000), a consistent estimator of the number of common factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0418$ , for any integer $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0419$ , is obtained allowing the asymptotic size α to go to zero as N, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0420$ . The following Proposition 2 (proved in OA Appendix C.2) defines a consistent inference procedure for the number of common factors.

Proposition 2. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0421$ be a sequence of real scalars defined in the interval $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0422$ for any $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0423$ , such that (i) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0424$ and (ii) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0425$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0426$ . Then, under Assumptions A.1–A.9, the estimator of the number of common factors defined as

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0427$

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0428$ , if $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0429$ for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0430$ , is consistent, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0431$ under $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0432$ , for any integer $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0433$ .

Condition (i) ensures asymptotically zero probability of type I error when testing $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0434$ against $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0435$ . Condition (ii) is a lower bound on the convergence rate to zero of the asymptotic size, and is used to keep asymptotically zero probability of type II error of each step of the procedure. The conditions in Proposition 2 are satisfied, for example, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0436$ such that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0437$ (4.7)

for constants $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0438$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0439$ .7

5 Mixed-Frequency Group Factor Models

The idea to apply group factor analysis to mixed-frequency data is novel as frequency-based grouping can indeed be the basis of identification strategies and statistical inference. In this section, we explore this topic as it pertains to our empirical application. We consider a setting where both low- and high-frequency data are available. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0445$ be the low-frequency (LF) time units. Each time period $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0446$ is divided into M sub-periods with high-frequency (HF) dates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0447$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0448$ . Moreover, we assume a panel data structure with a cross-section of size $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0449$ of high-frequency data and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0450$ of low-frequency data. It will be convenient to use a double time index to differentiate low- and high-frequency data. Specifically, we let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0451$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0452$ , be the high-frequency data observation i during sub-period m of low-frequency period t. Likewise, we let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0453$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0454$ , be the observation of the ith low-frequency series at t. These observations are gathered into the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0455$ -dimensional vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0456$ , for all m, and the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0457$ -dimensional vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0458$ , respectively.

We assume that there are three types of latent pervasive factors, which we denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0459$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0460$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0461$ , respectively. The first represents a vector of factors which affect both high- and low-frequency data (we use again superscript C for common), whereas the other two types of factors affect exclusively high (superscript H) and low (marked by L) frequency data. We denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0462$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0463$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0464$ the dimensions of these factors. The latent factor model with high-frequency data sampling is

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0465$ (5.1)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0466$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0467$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0468$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0469$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0470$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0471$ are matrices of factor loadings. The vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0472$ is unobserved for each high-frequency sub-period and the measurements, denoted by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0473$ , depend on the observation scheme, which can be either flow-sampling or stock-sampling (or some general linear scheme).

In the case of flow-sampling, the low-frequency observations are the sum (or average) of all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0474$ across all m, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0475$ .8 Then, model (5.1) implies

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0476$ (5.2)

Let us define the aggregated variables and innovations $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0477$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0478$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0479$ , and the aggregated factors: $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0480$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0481$ , H, L. Then we can stack the observations $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0482$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0483$ and write

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0484$ (5.3)

that is, the group factor model, with common factor $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0485$ and group-specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0486$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0487$ . The normalized latent common and group-specific factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0488$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0489$ , satisfy the counterpart of (2.2).
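The flow-sampling aggregation that leads to the group factor representation can be illustrated with a short sketch. The array layout (low-frequency periods first, then sub-periods, then series) and the function name are assumptions made for the example; the aggregation is taken to be the within-period sum.

```python
import numpy as np

def flow_aggregate(x_hf):
    """Aggregate a high-frequency panel by summing over sub-periods.

    x_hf: array of shape (T, M, N_H), where entry [t, m, i] is the
          observation of HF series i in sub-period m of LF period t
          (assumed layout).
    Returns an array of shape (T, N_H) of within-period sums.
    """
    x_hf = np.asarray(x_hf)
    return x_hf.sum(axis=1)

# Toy example: T = 2 LF periods, M = 4 sub-periods, N_H = 3 HF series.
x_hf = np.arange(24).reshape(2, 4, 3)
x_hf_aggregated = flow_aggregate(x_hf)
```

The aggregated HF panel has the same time index as the LF panel, so the two can be treated as the two groups of the group factor model.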

The results in Sections 2, 3, and 4 can be applied for identification and inference in the mixed-frequency factor model. Using the same arguments in the mixed-frequency setting of equation (5.3), identification can be achieved for the aggregated factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0490$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0491$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0492$ , and the factor loadings $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0493$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0494$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0495$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0496$ . Consequently, the estimators and test statistics developed for the group factor model (2.1) can also be used to define estimators for the loadings matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0497$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0498$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0499$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0500$ , and the aggregated factor values $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0501$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0502$ , and the test statistic for the common factor space dimension $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0503$ in equation (5.3). We denote these estimators $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0504$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0505$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0506$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0507$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0508$ , and also the infeasible and feasible test statistics $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0509$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0510$ . 
Once the factor loadings are identified from (5.3) and estimated, the values of the common and high-frequency factors for sub-periods $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0511$ are identifiable by cross-sectional regression of the high-frequency data on loadings $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0512$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0513$ in (5.1). More specifically, the estimators of the common and high-frequency factor values are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0514$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0515$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0516$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0517$ (the asymptotic distribution of the factor estimates is provided in OA Proposition D.7). Hence, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0518$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0519$ are obtained by regressing $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0520$ on $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0521$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0522$ across $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0523$ , for any $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0524$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0525$ . Consequently, with flow-sampling, we can identify and estimate $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0526$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0527$ at all high-frequency sub-periods. On the other hand, only $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0528$ , that is, the within-period sum of the low-frequency factor, is identifiable by the paired panel data set consisting of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0529$ combined with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0530$ . 
This is not surprising, since we have no high-frequency observations available for the LF data.
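The cross-sectional regression used to recover the common and high-frequency factor values at a given sub-period can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function and argument names are hypothetical, the loadings are taken as given (in practice they would be the estimates obtained from the aggregated model), and ordinary least squares across series is used.

```python
import numpy as np

def hf_factor_values(x_mt, lam_c, lam_h):
    """Cross-sectional OLS of one HF cross-section on the loadings.

    x_mt:  vector of length N_H with the HF observations at sub-period m
           of LF period t.
    lam_c: N_H x k_C matrix of loadings on the common factors.
    lam_h: N_H x k_H matrix of loadings on the HF-specific factors.
    Returns the estimated common and HF-specific factor values.
    """
    loadings = np.hstack([lam_c, lam_h])            # N_H x (k_C + k_H)
    coef, *_ = np.linalg.lstsq(loadings, x_mt, rcond=None)
    k_c = lam_c.shape[1]
    return coef[:k_c], coef[k_c:]
```

Applying the function to each sub-period and each low-frequency period in turn yields the full high-frequency paths of both types of factors.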

One can consider an alternative approach to inference on the number of common factors and their estimated values. Instead of first aggregating the high-frequency data as in equation (5.3) and then applying PCA in each group, one can extract the principal components directly on the high-frequency panel (and the low-frequency panel) and then aggregate the high-frequency PCA estimates. The procedure then continues identically in both approaches. In our Monte Carlo experiments, the performances of the two approaches are found to be similar (see Section 6 and OA Section E for more details). In the empirical application, the results are almost indistinguishable (see OA Section D.11.2).

6 Monte Carlo Simulation Analysis

The objectives of the Monte Carlo simulation study are: to assess the adequacy of the asymptotic distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0531$ to approximate its small sample counterpart, to evaluate the finite sample size and power properties of tests for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0532$ based on the statistics $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0533$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0534$ , and to compare the sequential testing procedure for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0535$ in Proposition 2, vis-à-vis the alternative procedures suggested by Chen (2012) and Wang (2012). We perform our simulations in the context of the mixed-frequency setting of Section 5 to align the analysis with the empirical application.

Section E of the OA reports a detailed description of the simulation designs and tables of results. The data generating process (DGP) is the high-frequency model (5.1) with flow-sampled LF variables. The idiosyncratic innovations are independent of the factors, serially i.i.d., and possibly weakly cross-sectionally correlated within each panel—corresponding to an approximate factor model. We consider $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0536$ high-frequency sub-periods, as in our empirical application with yearly and quarterly data, and different numbers of factors across DGPs, namely, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0537$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0538$ , and 5. The DGP for the vector of stacked factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0539$ is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0540$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0541$ is a common scalar AR coefficient for all the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0542$ factors and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0543$ . The innovations $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0544$ are i.i.d. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0545$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0546$ is a block-matrix such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0547$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0548$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0549$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0550$ . The scaling term ς ensures that the factor normalization in (2.2) holds for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0551$ , while the scalar parameter ϕ generates correlation between pairs of HF and LF specific factors. 
Factor loadings are simulated from a multivariate zero-mean Gaussian distribution, such that the cross-sectional distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0552$ 's of the regressions of observables on factors mimics the empirical application. We run 4000 simulations for each DGP, and consider $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0553$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0554$ , T as small as the ones in our empirical applications, and progressively increase them.
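
The factor dynamics described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function name `simulate_factors`, the unit-variance covariance matrix with correlation `phi` between paired HF- and LF-specific factors, and the scaling of the innovations by the square root of one minus the squared AR coefficient are assumptions chosen so that the simulated factors have unit unconditional variance. The exact covariance blocks and scaling term are those of OA Section E.

```python
import numpy as np

def simulate_factors(n_periods, k_c, k_h, k_l, ar_coef=0.6, phi=0.7, seed=0):
    """Simulate stacked (common, HF-specific, LF-specific) factors.

    Hypothetical sketch: every factor follows an AR(1) with the same scalar
    coefficient `ar_coef`; the j-th HF-specific and j-th LF-specific factors
    have correlation `phi`; common factors are uncorrelated with the rest.
    """
    rng = np.random.default_rng(seed)
    k = k_c + k_h + k_l
    sigma = np.eye(k)
    for j in range(min(k_h, k_l)):
        h, l = k_c + j, k_c + k_h + j
        sigma[h, l] = sigma[l, h] = phi
    chol = np.linalg.cholesky(sigma)
    # Scaling the innovations keeps the unconditional covariance equal to sigma.
    scale = np.sqrt(1.0 - ar_coef ** 2)
    f = np.empty((n_periods, k))
    f[0] = chol @ rng.standard_normal(k)  # draw from the stationary distribution
    for t in range(1, n_periods):
        f[t] = ar_coef * f[t - 1] + scale * (chol @ rng.standard_normal(k))
    return f
```

Setting `ar_coef=0` gives serially independent factors, and `phi=0` gives uncorrelated specific factors, which correspond to the simplest designs mentioned in the text.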

All the results summarized below are qualitatively similar (1) when different values of the factor autocorrelation $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0555$ are considered, namely, 0 and 0.6, (2) for different (small) levels of the weak cross-sectional correlation of the idiosyncratic errors, and (3) for different magnitudes of the pervasiveness of the factors as measured by the theoretical $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0556$ 's for regressions of the simulated observables on the factors. We refer the reader to the OA for additional details.

6.1 Asymptotic Gaussian Distribution, Size, and Power Properties

First, we want to verify whether the Gaussian asymptotic distribution provides a good small sample approximation for the infeasible statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0559$ . Figure 1 displays the empirical distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0560$ , computed under the null of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0561$ common factors from data simulated from a DGP with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0562$ , and overlaid with the asymptotic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0563$ distribution. For sample sizes as small as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0564$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0565$ , the empirical distribution is close to a normal distribution, but it is centered around a small positive value and is slightly more dispersed than the standard Gaussian: the empirical mean and standard deviation are 0.16 and 1.14, respectively. Nevertheless, the left tail of this empirical distribution resembles that of a standard Gaussian relatively well. As the sample sizes grow to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0566$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0567$ , the empirical distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0568$ has empirical mean and standard deviation of 0.02 and 1.01, respectively, and almost perfectly overlaps with the asymptotic distribution. As these results are qualitatively similar for alternative DGPs and sample sizes, we conclude that our asymptotic theory provides a good approximation also in small samples.

**Figure 1**

Small sample distribution of the recentered and rescaled $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0557$ statistic. The figure displays the histograms of the empirical distribution of the recentered and rescaled $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0558$ statistic computed on mixed-frequency panels of observations, for different sample sizes N_H, N_L, T, simulated from a DGP where k^C = k^H = k^L = 1, all factors and idiosyncratic terms are generated from Gaussian random variables, and M = 4. The solid line corresponds to the asymptotic standard Gaussian distribution of the recentered and rescaled statistic.

The tables in OA Section E.5 display the empirical size of the tests for the null hypotheses of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0569$ or 2 common factors, corresponding to nominal sizes of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0570$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0571$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0572$ . They also report the empirical power of tests for the null hypothesis of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0573$ common factors, when the true number of common factors is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0574$ . We observe that the asymptotic Gaussian distribution provides an overall very good approximation for the left tail of the infeasible test statistics $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0575$ under the null, even for samples as small as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0576$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0577$ , corroborating the graphical evidence of Figure 1. For the vast majority of sample sizes and simulation designs, the size distortions are on the order of 1% to at most 3% for the designs in which $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0578$ . The size distortions for the feasible statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0579$ are from 1% to 12% larger than those of the infeasible statistic when $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0580$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0581$ . 
The designs in which $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0582$ feature larger size distortions in samples as small as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0583$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0584$ because, by construction of the designs, the signal-to-noise ratio for each of the two common factors is halved compared with the designs in which $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0585$ . As expected, when either the sample sizes or the signal-to-noise ratio of the common factors increase, the size distortions decline monotonically toward zero. The power of the feasible test statistics is always equal to 1, with the exception of designs with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0586$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0587$ .
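
As an illustration of how an empirical size or power figure of this kind can be computed from simulated statistics, the sketch below counts rejections against a standard Gaussian critical value. It assumes a one-sided test that rejects in the left tail, as the emphasis on the left tail above suggests; the function name `empirical_rejection_rate` is hypothetical and the code is not taken from the paper or its OA.

```python
import numpy as np
from statistics import NormalDist

def empirical_rejection_rate(stats, alpha):
    """Share of simulated statistics below the alpha-quantile of N(0, 1).

    Under the null this share is the empirical size; under an alternative
    it is the empirical power. Left-tail rejection is an assumption here.
    """
    critical_value = NormalDist().inv_cdf(alpha)
    return float(np.mean(np.asarray(stats, dtype=float) < critical_value))

# Usage: compare the rejection rate with the nominal level.
rng = np.random.default_rng(0)
simulated = rng.standard_normal(4000)  # stand-in for 4000 simulated statistics
size_at_5pct = empirical_rejection_rate(simulated, 0.05)
```

The size distortion is then the difference between `size_at_5pct` and the nominal level 0.05.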

6.2 Estimation of the Number of Common Factors

We compare the following three estimators of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0588$ : (a) the consistent sequential testing procedure of our Proposition 2, (b) a selection procedure based on the penalized information criterion of Theorem 3.7 in Chen (2012), and (c) the three-step selection procedure proposed by Wang (2012).9 We focus on the average estimated number of common factors computed over the 4000 simulations.

We consider both the case in which the true numbers of pervasive factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0595$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0596$ in the two panels are known, and the case where they are estimated using the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0597$ information criterion of Bai and Ng (2002). Generally, the estimates of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0598$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0599$ are very precise and do not affect significantly the estimation of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0600$ . The only exceptions are the smaller samples with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0601$ , and the DGPs with many pervasive factors in the LF panel, say $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0602$ , where the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0603$ criterion tends to severely underestimate the values of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0604$ , while the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0605$ produces better estimates.10 The critical value for our selection procedure is as in equation (4.7), with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0609$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0610$ .

For a small number (say, not larger than 3) of uncorrelated specific factors, the penalized information criterion proposed in Chen (2012) yields the correct number of factors in almost all simulations for any sample size, while our selection procedure is less accurate only for sample sizes as small as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0611$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0612$ : the average value of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0613$ ranges between 0.85 and 1 for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0614$ . The average value of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0615$ for our selection procedure quickly approaches the true value $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0616$ as the sample sizes increase.

The procedure of Chen (2012) overestimates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0617$ when the correlation ϕ among the specific factors increases from 0 to 0.7 and 0.95. The overestimation is much less severe for our sequential testing procedure, even in larger samples, and our procedure also improves faster as the sample sizes increase. We observe a monotonic decrease in the precision across all the estimators when the number of specific factors becomes as large as 5; nevertheless, the deterioration in performance is less pronounced for our procedure. Finally, the consistent three-step selection procedure of Wang (2012) performs similarly to the one of Chen (2012) in DGPs with a small number of uncorrelated specific factors. However, as either ϕ or the number of specific factors increases, this procedure largely overestimates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0618$ and becomes the worst among the three considered.

7 Empirical Application

Recent public policy debates argue that manufacturing has been declining in the United States and most jobs have migrated overseas to lower wage countries. The share of the Industrial Production (IP) sector declined from more than 25% to roughly 18% during our sample period 1977–2011. However, the fact that its size shrank does not necessarily exclude the possibility that the IP sector is still a key factor of total U.S. output. When studying the role of the IP sector, we face a conundrum. On the one hand, we have 117 sectors that make up aggregate IP. These data are published monthly, and therefore cover a rich time series and cross-section. On the other hand, contrary to IP, we do not have monthly or quarterly data regarding the cross-section of U.S. output across non-IP sectors, although we do have such data on an annual basis. Using the class of mixed-frequency group factor models proposed in Section 5, the objective of the empirical application is to shed light on the key question of interest, namely, whether, despite the shrinking size of IP sectors, the factors related to IP are still dominant determinants of U.S. output fluctuations.

7.1 Data Description

For the IP sectors, we use the same 117 IP sectoral growth rate indices sampled at quarterly frequency from 1977.Q1 to 2011.Q4 as in Foerster, Sarte, and Watson (2011), for comparison.11 The data for all the remaining non-IP sectors consist of the annual growth rates of real GDP for the following 42 sectors: 35 Services; Construction; Farms; Forestry, fishing, and related activities; General government (federal); Government enterprises (federal); General government (state and local); and Government enterprises (state and local). These LF data are published by the Bureau of Economic Analysis (BEA).12 Hence we consider the panel of these yearly sectoral GDP data and the quarterly IP data, given that one of the objectives of this application is to study the comovements among these different sectors. A description of the practical implementation of our procedure appears in OA Section D.9.13

7.2 Common, Low-, and High-Frequency Factors

We assume that our data set follows the factor structure for flow-sampling as in equation (5.2), with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0619$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0620$ corresponding to the 117 quarterly IP series and the 42 annual GDP non-IP sector data series, respectively, for the period 1977.Q1–2011.Q4. We exclude the annual series related to IP sectors from the annual GDP panel in order to avoid double counting. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0621$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0622$ panel of the yearly observations of the IP indices growth rates computed as the sum of the quarterly growth rates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0623$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0624$ , for year t, and let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0625$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0626$ panel of the yearly growth rates of the non-IP indices. Let also $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0627$ be the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0628$ panel of quarterly IP indices growth rates.
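
The flow-sampling aggregation described above, in which each yearly growth rate is the sum of the four quarterly growth rates of that year, can be sketched as below. The function name `aggregate_flow` and the array layout (quarters in rows, series in columns, starting in a first quarter) are assumptions of this illustration.

```python
import numpy as np

def aggregate_flow(x_hf, m=4):
    """Sum every block of m consecutive high-frequency rows.

    x_hf has shape (T * m, N): rows are quarters, columns are series.
    The result has shape (T, N): yearly growth rates as sums of quarterly ones.
    """
    x_hf = np.asarray(x_hf, dtype=float)
    n_obs, n_series = x_hf.shape
    if n_obs % m != 0:
        raise ValueError("number of high-frequency rows must be a multiple of m")
    return x_hf.reshape(n_obs // m, m, n_series).sum(axis=1)
```

For the sample 1977.Q1–2011.Q4, a 140 × 117 array of quarterly IP growth rates would be mapped to a 35 × 117 array of yearly growth rates.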

We start by selecting the number of factors in each sub-panel, which are of dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0629$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0630$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0631$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0632$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0633$ . We use the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0634$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0635$ information criteria of Bai and Ng (2002), following the empirical literature. For the panels of IP growth rate at quarterly ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0636$ ) and annual ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0637$ ) frequencies, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0638$ selects two factors for each panel, whereas the more strict $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0639$ criterion selects one factor for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0640$ and two factors for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0641$ . For the annual GDP (non-IP) sectors panel, both $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0642$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0643$ select a single factor.14 Our results corroborate the evidence in Foerster, Sarte, and Watson (2011), suggesting that there are either one or two pervasive factors in the quarterly IP growth data. While the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0647$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0648$ choose factors in an unconditional setup, we are also interested in the explanatory power of these factors in a conditional setup. 
Hence the empirical analysis proceeds with two factors for each panel, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0649$ , in order to avoid potentially omitted factors/variables in explaining economic activity growth and subsequently re-assess the conditional significance of factors using the BIC criterion.15
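
For readers unfamiliar with this type of criterion, the sketch below shows how the number of factors in a single panel can be selected with an information criterion in the style of Bai and Ng (2002). It uses their IC_p2 criterion purely as an illustration; this is an assumption, since the two criteria actually used are named only in the formulas above. The function name `select_num_factors_icp2` is hypothetical, and demeaning the panel is a choice of this sketch.

```python
import numpy as np

def select_num_factors_icp2(x, k_max):
    """Number of factors minimizing IC_p2(k) = ln V(k) + k * penalty.

    V(k) is the average squared residual after removing k principal
    components from the demeaned T x N panel x, and
    penalty = (N + T) / (N * T) * ln(min(N, T)).
    """
    x = np.asarray(x, dtype=float)
    t, n = x.shape
    x = x - x.mean(axis=0)
    sing_vals = np.linalg.svd(x, compute_uv=False)
    total_ss = np.sum(sing_vals ** 2)
    penalty = (n + t) / (n * t) * np.log(min(n, t))
    ic_values = []
    for k in range(k_max + 1):
        v_k = (total_ss - np.sum(sing_vals[:k] ** 2)) / (n * t)
        ic_values.append(np.log(v_k) + k * penalty)
    return int(np.argmin(ic_values))
```

A criterion with a heavier penalty selects weakly fewer factors, which is consistent with the two criteria disagreeing on one of the IP panels.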

In order to select the number of common and frequency-specific factors, we follow our proposed procedure in Proposition 2. The estimated canonical correlations of the first two PC's estimated in each sub-panel $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0650$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0651$ are used to compute the value of the feasible standardized test statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0652$ in (4.6) and Theorem 2, for testing the null hypotheses of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0653$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0654$ common factors.16 The first canonical correlation is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0655$ , while the second one is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0656$ . These results are consistent with the presence of one common factor in each of the two mixed-frequency data sets considered, as represented by hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0657$ in Section 3.2. The values of the statistics are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0658$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0659$ for the null hypotheses of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0660$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0661$ common factors, respectively. The test rejects the null hypothesis of the presence of two common factors ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0662$ ), for significance levels as small as 0.05%, while we cannot reject the null of one common factor at the 5% significance level. Our selection procedure detailed in Proposition 2 with critical level as in (4.7) with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0663$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0664$ , produces the estimate $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0665$ . 
Hence, we select a model with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0666$ .
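
The first step of this procedure, extracting principal components from each sub-panel and computing their canonical correlations, can be sketched as follows. This is an illustration under stated assumptions: both panels are observed at the same (yearly) frequency with the same number of rows, the panels are demeaned, and the function names `pc_factors` and `canonical_correlations` are hypothetical. The standardization of the test statistic in (4.6) and the critical value in (4.7) are not reproduced here.

```python
import numpy as np

def pc_factors(x, k):
    """First k principal-component factor estimates of a T x N panel.

    The factors are normalized so that F'F / T is the identity matrix.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)
    u, _, _ = np.linalg.svd(x, full_matrices=False)
    return np.sqrt(x.shape[0]) * u[:, :k]

def canonical_correlations(f1, f2):
    """Canonical correlations between the columns of f1 and f2, in descending order.

    Computed as the singular values of Q1'Q2, where Q1 and Q2 are
    orthonormal bases of the demeaned column spaces.
    """
    f1 = f1 - f1.mean(axis=0)
    f2 = f2 - f2.mean(axis=0)
    q1, _ = np.linalg.qr(f1)
    q2, _ = np.linalg.qr(f2)
    rho = np.linalg.svd(q1.T @ q2, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)
```

With one common factor and one specific factor per panel, the first canonical correlation should be close to one and the second clearly below one, which is the pattern the test formalizes.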

In Figure 2, Panel (a) plots the IP and GDP growth rates during the period 1977–2011 and the remaining Panels (b)–(d) present the estimated factor paths from the panels of 42 GDP sectors and 117 IP indices for the common, the HF-specific, and the LF-specific factors, respectively. All factors are standardized to have zero mean and unit variance in the sample and their sign is chosen so that the majority of the associated loadings are positive. A visual inspection of the plots reveals that the common factor in Panel (b) resembles the IP index in Panel (a), with a large decline corresponding to the Great Recession following the financial crisis of 2007–2008 and the positive spike associated with the recent economic recovery. On the other hand, the LF-specific factor displayed in Panel (d) features a less dramatic fall during the Great Recession, and actually features a positive spike in 2008, followed by large negative values in the following years. This constitutes preliminary evidence suggesting that some non-IP sectors could feature different responses to the recent financial crisis.
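
The normalization used for the plots (zero mean, unit sample variance, sign chosen so that most loadings are positive) can be sketched as below. The function name `normalize_factor` is hypothetical, and rescaling the loadings so that the product of factor and loadings is preserved up to a constant is an assumption of this illustration.

```python
import numpy as np

def normalize_factor(factor, loadings):
    """Standardize one estimated factor and fix its sign.

    The factor is centered and scaled to unit sample variance; the loadings
    are rescaled accordingly. If fewer loadings are positive than negative,
    both the factor and the loadings are multiplied by -1.
    """
    factor = np.asarray(factor, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    scale = factor.std()
    factor = (factor - factor.mean()) / scale
    loadings = loadings * scale
    if np.sum(loadings > 0) < np.sum(loadings < 0):
        factor, loadings = -factor, -loadings
    return factor, loadings
```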

The relationship of factors with the sectoral GDP and IP growth series, in a regression context, reveals additional information about the conditional correlations of the factors with specific economic activity growth sectors. This in turn can help us shed light on which IP and non-IP series are driving the factors. We start with a disaggregated analysis, and examine the relative importance of the common and frequency-specific factors in explaining the variability across all sectoral growth rates. For each sector in the panel, we regress the GDP or IP index growth rates on (i) the common factor only, (ii) the specific factor only, for non-IP and IP series, respectively, and (iii) both common and specific factors. In Table I, we report the quantiles of the empirical distribution of the adjusted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0671$ (denoted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0672$ ) of these regressions. In addition, we report the percentage of sectors (denoted by %BIC) for which the BIC selects each of the three regression models (i)–(iii), that is, each alternative factor information set (common and/or frequency-specific).17
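
The three-way comparison for a single sector can be sketched as follows. This is a simplified illustration with plain OLS on factors observed at the same frequency as the dependent variable; the restricted MIDAS regressions mentioned in the note to Table II are not reproduced. The function names `ols_fit` and `compare_models`, and the Gaussian form of the BIC, are assumptions of this sketch.

```python
import numpy as np

def ols_fit(y, regressors):
    """Adjusted R-squared and BIC of an OLS regression with an intercept."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    x = np.column_stack([np.ones(n)] + [np.asarray(r, dtype=float) for r in regressors])
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    rss = float(resid @ resid)
    tss = float(np.sum((y - y.mean()) ** 2))
    p = x.shape[1]  # number of estimated coefficients, intercept included
    adj_r2 = 1.0 - (rss / (n - p)) / (tss / (n - 1))
    bic = n * np.log(rss / n) + p * np.log(n)
    return adj_r2, bic

def compare_models(y, common, specific):
    """Fit models (i) common, (ii) specific, (iii) both; return fits and the BIC choice."""
    models = {"common": [common], "specific": [specific], "both": [common, specific]}
    fits = {name: ols_fit(y, regs) for name, regs in models.items()}
    best = min(fits, key=lambda name: fits[name][1])
    return fits, best
```

Repeating `compare_models` over all sectors and tabulating how often each label is returned gives shares of the kind reported in the %BIC column, while the quantiles of the adjusted R-squared values across sectors give the remaining columns.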

Table I. Adjusted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0667$ and Percentage Values of BIC of the Regressions With Common and/or Frequency-Specific Factors From Economic Activity Indices Growth Ratesa

Factors	10%	25%	50%	75%	90%	% BIC
	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0668$ : Quantiles
Observables: Gross Domestic Product, 1977–2011
common	−2.2	−0.5	11.5	28.9	42.9	38.1
common, LF-specific	0.1	9.2	25.4	34.5	60.3	28.6
LF-specific	−2.8	−2.3	5.7	15.7	22.4	33.3
Observables: IP, 1977.Q1–2011.Q4
common	0.3	4.8	20.3	36.0	60.0	42.7
common, HF-specific	1.1	6.8	28.7	45.3	63.4	48.7
HF-specific	−0.7	−0.1	3.0	11.2	23.5	8.5

a The regressions in the first three lines involve the growth rates of the 42 non-IP sectors as dependent variables, while those in the last three lines involve the growth rates of the 117 IP indices as dependent variables. The explanatory variables are factors estimated from the same indices using a mixed-frequency factor model with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0669$ . Reported are the adjusted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0670$ of the regressions on common and/or frequency-specific factors for different quantiles of the cross-section. The last column reports the percentage of sectors for which the BIC selects the corresponding regression model.

From the first three lines in Table I, we observe that adding the LF-specific factor to the common factor regressions for the non-IP indices raises the median $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0674$ by around 14 percentage points (from 11.5% to 25.4%) and the 90% quantile of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0675$ by about 17 percentage points (from 42.9% to 60.3%). Adjusting for the number of variables in the factor regression models, the BIC favors the model with both the common and the LF-specific factors in explaining the GDP growth rate in 29% of the sectors, whereas the model with the common factor alone is selected in about 38% of the series. When the high-frequency-specific factor is added to the common factor, it contributes an increment of around 8 percentage points in the median $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0676$ for the IP sectors. The 49% BIC value provides strong evidence that both the common and high-frequency factors explain the IP sectoral growth rate. Overall, the results in Table I show that the common factor turns out to be pervasive for most of the IP and non-IP sectors alike, as demonstrated by the relative $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0677$ vis-à-vis those with just the frequency-specific factor.

In order to investigate which sectors drive the variation of our estimated factors and provide an economic interpretation to our factors, we list in Table II the highest and lowest ten GDP non-IP sectors in terms of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0693$ when regressed on the common factor only (in Panel A), and both the common and LF-specific factors (in Panel B). We also report the top and bottom ten ranked GDP non-IP sectors with the highest and lowest absolute increments in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0694$ when the LF-specific factor is added to the common one (in Panel C).18

Table II. Regression of Yearly Sectoral GDP Growth on Common and LF-Specific Factors: Adjusted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0678$ a

Panel A. Regressor: common factor		Panel B. Regressors: common and LF-specific factors		Panel C. Increment in adj. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0679$ in Panels A and B
Sector	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0680$	Sector	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0681$	Sector	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0682$
Ten sectors with largest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0683$		Ten sectors with largest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0684$		Ten sectors with largest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0685$
Truck transportation	63.10	Misc. prof., scient., & tech. serv.	66.67	Misc. prof., scient., & tech. serv.	49.69
Accommodation	62.43	Admin. & support services	62.63	Gov. enterprises (state & local)	34.69
Construction	44.05	Truck transportation	62.51	Rental & leasing serv.	29.52
Other transp. & support activ.	43.31	Accommodation	61.48	General gov. (state & local)	24.90
Administrative & support services	42.69	Construction	59.75	Legal services	24.32
Other services, except gov.	42.53	Warehousing & storage	52.53	Motion picture & sound rec.	22.77
Warehousing & storage	40.95	Gov. enterprises (state & local)	45.78	Fed. Res. banks, credit interm.	20.31
Air transportation	31.58	Other services, except gov.	41.75	Administrative & support services	19.95
Retail trade	30.70	Other transportation & support act.	41.71	Social assistance	19.91
Amusem., gambling, & recr. ind.	29.17	Gov. enterprises (federal)	37.78	Real estate	18.14
Ten sectors with smallest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0686$		Ten sectors with smallest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0687$		Ten sectors with smallest $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0688$
Funds, trusts, & other finan. vehicles	−1.23	Ambulatory health care services	7.76	Accommodation	−0.96
Motion picture & sound record. ind.	−1.68	Management of comp. & enterpr.	7.52	Rail transportation	−1.16
Pipeline transportation	−1.74	Funds, trusts, & other fin. vehicles	6.15	Other transportation & support act.	−1.59
Information & data processing services	−1.84	Information & data processing services	1.96	Air transportation	−1.77
Transit & ground passenger transp.	−2.05	Educational services	1.35	Retail trade	−2.15
General gov. (state & local)	−2.12	Insurance carriers & related activities	0.36	Amusements, gambling	−2.15
Forestry, fishing & related activities	−2.33	Water transportation	−0.64	Educational services	−2.62
Water transportation	−2.94	Farms	−1.87	Farms	−2.80
Securities, commodity contr., & investm.	−2.99	Forestry, fishing	−5.31	Forestry, fishing	−2.98
Insurance carriers	−3.03	Securities, commodity contr.	−5.99	Securities, commodity contr.	−3.00

a The adjusted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0689$ , denoted $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0690$ , are reported for the restricted MIDAS regressions of the growth rates of 42 GDP non-IP sectoral indices on the estimated factors. Regressions in Panel A involve a LF explained variable and the estimated common factor. Regressions in Panel B involve a LF explained variable and both the common and LF-specific factors. In Panel C, we report the difference in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0691$ (denoted as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0692$ ) between the regressions in Panel B and regressions in Panel A.

From Panel A, we first note that the common factor alone explains most of the variability of service sectors with direct economic links to IP sectors like Truck transportation, Administrative & support services, and Warehousing & storage, with an $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0695$ ranging from 63% to 41%, as well as Accommodation with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0696$ of 62%. This indicates that the common factor is driven by service sectors related to IP and could thereby be interpreted as an IP factor, as already noted in Figure 2. On the other hand, the common factor turns out to be completely unrelated to most of the Financial, Insurance, and Information services sectors. Turning to Panel C of Table II, which reports the difference in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0697$ between the regressions in Panels A and B, we note that the LF-specific factor explains more than 20% of the variability of output for very heterogeneous services sectors as well as Government (state and local).19 Interpreting these results, we conclude that the LF-specific factor is completely unrelated to service sectors which depend almost exclusively on IP output (e.g., transportation, retail trade), and is a common factor driving the comovement of other non-IP service sectors, such as Professional, scientific, and technical services, Government, and Legal services.

Table II also reveals differences in the dynamics of output growth between two sub-sectors of the financial services industry, Securities and Credit intermediation, which have been extensively studied by Greenwood and Scharfstein (2013). We find that the sub-sectors Funds, trusts, and other financial vehicles as well as Securities, commodity contracts, and investments are unrelated to both the common and LF-specific factors, indicating that their output growth is uncorrelated with the common component of real output growth across the other sectors that correlate with U.S. economic activity. In contrast, the Credit intermediation industry comoves with the other IP and non-IP sectors (see Tables D.24 and D.25 in the OA).

Up to this point, we examined the explanatory power of the factors for sectoral output indices. For non-IP GDP, these indices correspond to the finest level of disaggregation of output growth by sector. In Table III, we report the results of regressions with aggregated indices instead. In particular, we regress the output of each aggregate index on the estimated (a) common factor, (b) frequency-specific factor, or (c) both factors, and report the corresponding $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0710$ of these regressions in the first three columns. The last column in Table III reports the model favored by the BIC among the three regression specifications. It is important to note that now we also include the GDP Manufacturing aggregate index, which is not used in the estimation of the factors. Panel A in Table III shows that the common factor explains around 89% of the variability in the aggregate IP growth index, confirming that this factor can be interpreted as an IP factor during the period 1977–2011. This is further corroborated in Panel B where we obtain an $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0711$ of 82% in the regression of the GDP Manufacturing Index on the common factor alone. As most of the sectors included in the IP index are Manufacturing sectors, this result is not surprising. Yet, it is still worth noting because, as remarked earlier, the GDP data on Manufacturing have not been used in the factor estimation, in order to avoid double-counting these sectors in our mixed-frequency sectoral panel.20

Table III. Regression Results of Aggregate IP and Selected GDP Indices Growth Rates on Estimated Factorsa

Panel A Quarterly observations, 1977.Q1–2011.Q4
	(1)	(2)	(3)	(3)-(1)
Sector	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0698$	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0699$	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0700$		BIC
Industrial Production	89.06	5.02	90.26	1.20	CH
Panel B Yearly observations, 1977–2011
	(1)	(2)	(3)	(3)-(1)
Sector	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0701$	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0702$	$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0703$		BIC
GDP	60.54	8.59	74.21	13.67	CL
GDP—Manufacturing	81.88	−3.03	81.53	−0.35	C
GDP—Agriculture, forestry, fishing, & hunting	1.43	−2.52	−1.26	−2.69	C
GDP—Construction	44.05	11.22	59.75	15.70	CL
GDP—Wholesale trade	20.35	7.90	30.83	10.48	CL
GDP—Retail trade	30.70	−2.86	28.56	−2.15	C
GDP—Transportation & warehousing	62.14	−2.95	60.97	−1.17	C
GDP—Information	12.14	22.28	37.57	25.43	CL
GDP—Finance, insurance, real estate, rental, & leasing	−1.42	21.22	21.11	22.53	L
GDP—Professional and business services	30.02	30.21	65.61	35.59	CL
GDP—Educational serv., health care, and social assistance	−1.38	18.38	18.18	19.56	L
GDP—Arts, entertainment, recreation, accommodation, & food serv.	53.51	−2.23	53.70	0.18	C
GDP—Government	−2.12	22.37	20.47	22.59	L

a The adjusted $R^2$, denoted $\bar{R}^2$, of the regression of growth rates of the aggregate IP index and selected aggregated GDP output indices on the common factor (column $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0706$), the specific HF and LF factors only (columns $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0707$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0708$), and the common and frequency-specific factors together (column (3)) are reported. The fourth column displays the difference between the values in the third and first columns. The last column reports the model chosen by the BIC among the regressions on the common factor, on the frequency-specific factor, or on both factors: C denotes the common factor, H the high-frequency factor, L the low-frequency factor, and CH and CL the corresponding factor combinations. The factors are estimated from the panel of 42 GDP non-IP sectors and 117 IP indices using a mixed-frequency factor model with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0709$.
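
The comparison of the three specifications reported in Table III can be illustrated with a short numerical sketch. The Python code below runs on simulated data and is not the code behind the reported results: the function names `fit_stats` and `compare_specs`, the simulated series, and the Gaussian form of the BIC, $n\log(\mathrm{SSR}/n)+k\log n$, are assumptions made for this example. It regresses a series on a common factor, on a specific factor, and on both, and returns the adjusted $R^2$ and the BIC of each regression.

```python
import numpy as np

def fit_stats(y, X):
    """OLS of y on X with an intercept; returns (adjusted R^2, BIC)."""
    n = y.shape[0]
    Z = np.column_stack([np.ones(n), X])          # add intercept
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    ssr = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    k = Z.shape[1]                                # coefficients incl. intercept
    adj_r2 = 1.0 - (ssr / (n - k)) / (sst / (n - 1))
    # Gaussian BIC up to an additive constant (assumed form, see lead-in)
    bic = n * np.log(ssr / n) + k * np.log(n)
    return adj_r2, bic

def compare_specs(y, common, specific):
    """Fit the three specifications and return the one with the lowest BIC."""
    specs = {
        "C": common,                                  # common factor only
        "S": specific,                                # specific factor only
        "C+S": np.column_stack([common, specific]),   # both factors
    }
    results = {name: fit_stats(y, X) for name, X in specs.items()}
    best = min(results, key=lambda name: results[name][1])
    return results, best

# Simulated example: 35 observations, as in a yearly sample 1977-2011.
rng = np.random.default_rng(0)
n = 35
common = rng.standard_normal(n)
specific = rng.standard_normal(n)
y = 1.0 * common + 0.8 * specific + 0.3 * rng.standard_normal(n)

results, best = compare_specs(y, common, specific)
for name, (adj_r2, bic) in results.items():
    print(f"{name:>3}: adj. R2 = {100 * adj_r2:6.2f}%, BIC = {bic:8.2f}")
print("BIC choice:", best)
```

Because the adjusted $R^2$ penalizes additional regressors, it can be negative when a factor has no explanatory power, which is why some entries of Table III are below zero.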

Looking at the aggregate GDP index, we first note that even if the weight of IP sectors in the aggregate GDP index has always been below 30%, still 61% of its total variability can be explained exclusively by the common factor which, as shown in Panel A, is primarily an IP factor. This implies that there must be substantial comovement between IP and some important service sectors. Moreover, it appears from the first line in Panel B that a relevant part of the variability of the aggregate GDP index not due to the common factor is explained by the LF-specific factor (since the $\bar{R}^2$ increases by about 14 percentage points, from 60.5% to 74.2%). This indicates that significant comovements are present among the most important sectors of the U.S. economy which are not related to manufacturing. Indeed, Panel B indicates that some service sectors such as Professional and Business Services and Information, as well as other sectors such as Wholesale trade and Construction, load significantly on both the common and the LF-specific factor, while some other sectors like Finance and Government load exclusively on the LF-specific factor.21

The BIC in Table III, Panel B, favors the regression model with both the common and low-frequency factors among the three factor regression specifications for the U.S. GDP growth rate, while the low-frequency factor alone yields a low $\bar{R}^2$ of 9%. Similarly, although the HF-specific factor in Panel A seems to be relatively less important in explaining the aggregate IP index (as the $\bar{R}^2$ increases by only about 1 percentage point when it is added as a regressor to the common factor regression model for the IP growth rate), the BIC suggests that both the common and HF factors are important.22 Overall, the small $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0716$ could suggest that the HF-specific factor is pervasive only for a subgroup of IP sectors which have relatively low weights in the index, meaning that their aggregate output is a negligible part of the output of the entire IP sector and, consequently, also of the entire U.S. economy. These results corroborate the findings of Foerster, Sarte, and Watson (2011), who reported that the main results of their paper are qualitatively the same when considering either one or two common factors extracted from the same 117 IP indices used in our study. It is worth emphasizing that the common factor explains a dominant share of the variability: 89% for total IP growth and 61% for GDP growth.

Given that our sample period covers the Great Moderation, characterized by a reduction in the volatility of business-cycle fluctuations starting in the mid-1980s, we revisit this analysis for different sub-samples. The details can be found in OA Section D.11.4; here we briefly discuss the main results. We find a deterioration of the overall fit of approximate factor models during the Great Moderation period starting in 1984 and ending in 2007 (a finding also reported by Foerster, Sarte, and Watson (2011)), with our common factor playing a relatively less significant role during that period. Interestingly, when the financial crisis is added to the Great Moderation (sample 1984–2011), we find patterns closer to the full sample results presented above. The other findings, that is, the exposures of the various sub-indices, appear to be similar in the sub-samples and in the full sample.

8 Conclusions

We present a general framework for group factor models and develop a unified asymptotic theory for large-dimensional data sets, using asymptotic expansions in both the cross-sectional and time-series dimensions. The theory covers the identification of common and group factors, the determination of the number of common and specific factors, and the estimation of loadings and factor values via principal component analysis and canonical correlation analysis. Of special interest is the mixed-frequency group factor model, in which panels sampled at different frequencies provide a natural grouping for extracting factors and allow us to identify and estimate factors that are common across frequencies as well as frequency-specific factors.
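
The estimation strategy summarized above, principal components within each group followed by canonical correlations between the two sets of estimated factors, can be sketched on simulated data. The code below is an illustration only: the function names, the simulated design with one common and one specific factor per group, and the sample sizes are assumptions, and the code does not compute the test statistic of Theorem 1. It shows that the canonical correlation associated with the common factor is close to one, while the remaining one is well below one.

```python
import numpy as np

def pca_factors(Y, k):
    """First k principal-component factor estimates from a T x N panel Y."""
    Y = Y - Y.mean(axis=0)
    T, N = Y.shape
    # Eigen-decomposition of the T x T matrix Y Y' / (N T)
    eigval, eigvec = np.linalg.eigh(Y @ Y.T / (N * T))
    order = np.argsort(eigval)[::-1][:k]          # k largest eigenvalues
    return np.sqrt(T) * eigvec[:, order]          # normalized so that F'F/T = I

def canonical_correlations(F1, F2):
    """Sample canonical correlations between the columns of F1 and F2."""
    Q1, _ = np.linalg.qr(F1 - F1.mean(axis=0))
    Q2, _ = np.linalg.qr(F2 - F2.mean(axis=0))
    # Singular values of Q1'Q2 are the canonical correlations (descending)
    return np.linalg.svd(Q1.T @ Q2, compute_uv=False)

# Simulated design (assumed): one common factor, one specific factor per group.
rng = np.random.default_rng(1)
T, N1, N2 = 200, 100, 100
f_c, f_s1, f_s2 = rng.standard_normal((3, T))
G1 = np.column_stack([f_c, f_s1])                 # factors driving group 1
G2 = np.column_stack([f_c, f_s2])                 # factors driving group 2
Y1 = G1 @ rng.standard_normal((2, N1)) + rng.standard_normal((T, N1))
Y2 = G2 @ rng.standard_normal((2, N2)) + rng.standard_normal((T, N2))

rho = canonical_correlations(pca_factors(Y1, 2), pca_factors(Y2, 2))
print("canonical correlations:", np.round(rho, 3))
```

In this design, the number of canonical correlations close to one equals the number of common factors. Deciding formally how close is close enough is the inference problem addressed by the test statistic of the paper.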

Our theoretical contributions, in particular Theorems 1 and 2, are of interest beyond (mixed-frequency) group factor models. Inference regarding the rank of an unknown, real-valued matrix is an important and well-studied problem.23 For indefinite matrix estimators, there is a well-developed framework; see Donald, Fortuna, and Pipiras (2007). The case of semi-definite matrix estimators still poses many challenges, however, as discussed by Bai and Ng (2007) and more recently by Donald, Fortuna, and Pipiras (2010), who argued that the tests suggested in the literature are not suitable. In fact, when the rank of a generic (positive) semi-definite matrix, say V, needs to be estimated using a semi-definite estimator, say $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0717$, the asymptotic variance-covariance matrix of this estimator, denoted as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0718$, is necessarily singular, as shown in Donald, Fortuna, and Pipiras (2007). Therefore, standard rank tests cannot be applied, as they assume that the matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0719$ is full rank. In addition, our results in Section 4 provide guidance on the construction of the asymptotic distribution of the (sum of the) eigenvalues of a semi-definite matrix, and develop a sequential testing procedure for determining the rank of the matrix itself. This test, for example, would enable us to determine the number of latent dynamic factors in large panels of data without having to estimate them, a problem tackled by Bai and Ng (2007). In their paper, a number (say r) of static factors must first be estimated by PCA from a large panel. Differently from their methodology, and also differently from the solution proposed by Amengual and Watson (2007), we can directly test the rank (say $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0720$) of the residual covariance (or correlation) matrix of a VAR model estimated on the factors themselves. Furthermore, our methods can be used to develop a new test for the question posed by Pelger (2019) as to whether the factor spaces of statistical and economic factors are equal.

There is a plethora of applications to which our theoretical analysis applies. We selected a specific example based on the work of Foerster, Sarte, and Watson (2011), who analyzed the dynamics of comovements across 117 quarterly IP sectors using factor models. We revisit part of their analysis and incorporate the rest, and dominant part, of the U.S. economy, namely, the non-IP sectors whose growth rates we only observe annually. We find evidence for a single common factor among IP and non-IP sectors which explains 89% of the variability of aggregate IP growth and 61% of the variability of aggregate GDP growth.

Despite the generality of our analysis, we can think of many possible extensions, such as models with loadings which change across sub-periods, that is, periodic loadings, or loadings which vary stochastically or feature structural breaks. Moreover, we could consider the problem of specification and estimation of a joint dynamic model for the common and frequency-specific factors extracted with our methodology (see Ghysels (2016) and the references therein for structural Vector Autoregressive (VAR) models with mixed-frequency sampling). Further, in the interest of conciseness, we have focused our analysis on models with two sampling frequencies, leading to group factor models with two groups. Results could be extended to cover the cases with more than two groups, and therefore more than two sampling frequencies. All these extensions are left for future research.

1 Most papers deal with large T and finite cross-sections (e.g., Tucker (1958), Flury (1984), Schott (1991), Gregory and Head (1999), and Kose, Otrok, and Whiteman (2008)). Goyal, Pérignon, and Villa (2008) extended the classical group factor setting to approximate group factor models, but did not derive any asymptotic results.

2 More formally, Proposition D.1 in Appendix D.1 deals with the identification of factor spaces for given dimensions $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0040$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0041$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0042$ . Proposition D.1 is implied by Proposition 1 in Wang (2012).

3 Computing PCs first is necessary because the alternative approach of canonical correlations applied to the raw data may not necessarily uncover pervasive factors. The alternative approach of stacking all groups into one panel and applying standard PCA to estimate common factors is not a solution for at least two reasons: (1) the estimate of the common factor obtained from the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0116$ principal components of the pooled data is inconsistent due to the correlation in the residual terms arising from the group-specific factors, and (2) the combined data may not give the common factors because the common factors may not even be the leading factors in the combined data.

4 This alternative estimation method for the group-specific factors corresponds to the method proposed by Chen (2012) who adopted an information criterion approach to estimate the number of factors, whereas we use a sequential testing method. Compared to Chen (2012), our paper derives results on the asymptotic distribution of the sample canonical correlations and estimated factors, whereas Chen (2012) only has consistency and rate of convergence results.

5 For similar arguments, see footnote 5 of Bai (2003). A word of caution is warranted, however. It is known that pre-testing generates problems in terms of lack of uniform properties, and we therefore abstract from uniformity.

6 If the errors are weakly correlated across series and/or time, consistent estimation of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0372$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0373$ requires thresholding of estimated cross-sectional covariances and/or HAC-type estimators. If the errors are conditionally heteroscedastic, we need consistent estimators of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0374$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0375$ as well.

7 In the empirical application, we use $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0440$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0441$ , which yields, for example, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0442$ close to 0.05 for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0443$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0444$ , that are the smallest cross-sectional size and the low-frequency time-series dimension in our data set. In the Monte Carlo study, we find a good performance of the selection procedure with this choice.

8 In the remainder of this section, we study identification and inference for the model with flow-sampling as it corresponds to the empirical application. The identification with stock-sampling is discussed in OA Section D.3.

9 We thank an anonymous referee for suggesting the following three-step estimation procedure for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0589$ , which is a special case of the one suggested by Wang (2012): (i) estimate the numbers $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0590$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0591$ of pervasive factors in each panel separately, (ii) estimate the number R of pervasive factors in the stacked panel of flow-sampled HF and LF data, (iii) estimate $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0592$ as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0593$ . We use the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0594$ criteria of Bai and Ng (2002) to estimate the number of factors in the first two steps.

10 In unreported results, we have estimated $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0606$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0607$ , and also R, using the ER and GR ratios of Ahn and Horenstein (2013), and noted that they perform similarly or worse than the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0608$ criterion. Alternative estimators, such as the one proposed by Onatski (2010), could also be considered.

11 Following Foerster, Sarte, and Watson (2011), we focus only on quarterly IP data, as they share the main features of the monthly ones but are less noisy/volatile. Details about the data are in OA Section D.10. Note also that we cover the statistical factor model specification of Foerster, Sarte, and Watson (2011), not their structural analysis involving input-output linkages.

12 The sectoral GDP data are not available at quarterly frequency (in contrast to the aggregate GDP index). All growth rates refer to seasonally adjusted real output indices, and are expressed in percentage points.

13 In OA Section D.11.1, we replicate the analysis in Section II.B of Foerster, Sarte, and Watson (2011), in order to rule out the possibilities that (a) sectoral weights in GDP and IP aggregate indexes are the major determinants in explaining the variability of the indexes themselves, and (b) their aggregate variability is driven mainly by sector-specific variability. Our analysis confirms their findings, which justifies the use of a mixed-frequency factor model to study the comovement among sectors.

14 We use $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0644$ as maximum number of factors when computing $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0645$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0646$ .

15 Foerster, Sarte, and Watson (2011) also used two factors while they emphasized the importance of the first factor.

16 Given the good finite sample properties presented in the simulations (in Section 6 and OA) for a range of DGPs, we expect that for our empirical application, the asymptotic theory also provides a good approximation.

17 The regressions in the second and third rows are restricted MIDAS regressions. Those in the fourth, fifth, and sixth rows restrict the estimated coefficients of the common and high-frequency factors to be the same for each quarter, as they are estimated as high-frequency regressions. The empirical distributions of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0673$ corresponding to the first and second lines (resp., fourth and fifth lines) of Table I are represented in the histograms available in OA, Figures D.11(a) and (b) (resp., (c) and (d)).

18 The entire list of non-IP sectors ranked by the three criteria used in Table II is available in Tables D.24–D.26 in the OA, Section D.11.

19 Such services include Miscellaneous professional, scientific, and technical services, Administrative and support services, Legal services, Real estate, some important financial services like Federal Reserve banks, Credit intermediation, and Related activities, Rental and leasing services.

20 A detailed discussion of the difference in the sectoral components of the IP index and the GDP Manufacturing index is provided in OA Section D.10.

21 The results change when we look at the Finance sector disaggregated into (1) Federal Reserve banks, credit intermediation, and related activities, (2) Securities, commodity contracts, and investments, (3) Insurance carriers and related activities, as is evident in Table II.

22 See also Table D.27 in OA Section D.11, for the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0715$ of the regression of all GDP indices on the HF factor only, and all the three factors together.

23 See, for instance, Gill and Lewbel (1992), Cragg and Donald (1996), Robin and Smith (2000), and Kleibergen and Paap (2006).

24 That is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0783$ for some $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0784$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0785$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0786$ , and similarly for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0787$ .

25 If $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1046$ , then $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1047$ is not negligible with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1048$ . Similarly, if $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1049$ , then $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1050$ is not negligible with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1051$ . In those cases, we need a more accurate asymptotic expansion.

26 That is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1078$ , uniformly in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1079$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1080$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1081$ for some $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1082$ .

Appendix

We use the following notation. Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0721$ denote the Frobenius norm of matrix A. We denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0722$ the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0723$ -norm of random matrix Z, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0724$ . We denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0725$ convergence in distribution. For a sigma-field $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0726$ , we denote by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0727$ ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0728$ -stably) the stable convergence on $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0729$ of a sequence of random vectors, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0730$ as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0731$ , for any Borel set A with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0732$ , where ∂A is the boundary of set A, and any measurable set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0733$ (see, e.g., Renyi (1963), Aldous and Eagleson (1978), Hall and Heyde (1980), Kuersteiner and Prucha (2013)). In particular, for a symmetric positive-definite random matrix Ω measurable with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0734$ , by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0735$ ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0736$ -stably) we mean $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0737$ ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0738$ -stably), where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0739$ is independent of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0740$ .

Appendix A: Assumptions

We make the following assumptions:

Assumption A.1. We have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0741$ such that the conditions in (4.1) hold.

Assumption A.2. The unobservable factor process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0742$ satisfies the normalization restrictions in (2.2), with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0743$ positive-definite.

Assumption A.3. The loadings matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0744$ is such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0745$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0746$ is a positive-definite $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0747$ matrix with distinct eigenvalues and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0748$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0749$ .

Assumption A.4. The error terms $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0750$ and the factors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0751$ are such that, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0752$ and all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0753$ : (a) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0754$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0755$ , a.s., where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0756$ , (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0757$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0758$ , for a constant $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0759$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0760$ is defined in Assumption A.5 (b).

Assumption A.5. Define the variables $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0761$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0762$ , indexed by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0763$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0764$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0765$ . (a) For any $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0766$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0767$ we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0768$

as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0769$ , where the asymptotic variance matrix is

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0770$

for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0771$ , for any $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0772$ .

Moreover, for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0773$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0774$ , we have (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0775$ , a.s., and (c) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0776$ , for constants $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0777$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0778$ .

Assumption A.6. (a) The triangular array processes $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0779$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0780$ are strong mixing of size $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0781$ , uniformly in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0782$ .24 Moreover, (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0788$ , as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0789$ , uniformly in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0790$ , and (c) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0791$ , as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0792$ , uniformly in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0793$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0794$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0795$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0796$ .

Assumption A.7. For $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0797$ : (a) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0798$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0799$ , for any $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0800$ and a constant M, where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0801$ ; (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0802$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0803$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0804$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0805$ ; (c) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0806$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0807$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0808$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0809$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0810$ .

Assumption A.8. For $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0811$ : (a) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0812$ , for large δ; (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0813$ , for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0814$ ; (c) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0815$ , for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0816$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0817$ , where either $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0818$ , or $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0819$ , or $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0820$ ; (d) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0821$ , for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0822$ ; where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0823$ are constants, and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0824$ .

Assumption A.9. The error terms are such that: (a) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0825$ , if either $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0826$ , or $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0827$ , (b) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0828$ , (c) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0829$ , say, where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0830$ , for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0831$ .

Assumption A.1 defines the asymptotic scheme. Assumption A.2 concerns the first- and second-order moments of the factor vector. Positive-definiteness of the variance-covariance matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0832$ is necessary for our model to have exactly $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0833$ pervasive factors. It holds if, and only if, the eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0834$ are smaller than 1 in modulus. The zero restrictions on the matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0835$ in (2.2), corresponding to the orthogonality of the common and group-specific factors, as well as the identity diagonal blocks, are identification conditions. Assumption A.3 concerns the empirical cross-sectional second-order moment matrix of the loadings in each group $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0836$ . It implies that matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0837$ has full column-rank, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0838$ large enough, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0839$ . Positive-definiteness of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0840$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0841$ , is also necessary for the existence of exactly $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0842$ pervasive factors. Note that we consider non-random loadings to simplify the assumptions and proofs. If the loadings were random, stochastic convergence could be obtained with a DGP for the loadings which satisfies the conditions of the LLN for weakly dependent data. Assumptions A.2 and A.3 are similar to conditions used in the large-scale factor model literature (see Assumptions A and B in Bai and Ng (2002, 2006), Bai (2003), among others).

Assumption A.4 requires the existence of higher-order moments for the factors and the error terms, similar to, for example, Assumptions A and C.1 in Bai and Ng (2002) and Bai (2003).

Assumption A.5 constrains the amount of admissible cross-sectional dependence of the error terms across different individuals, in the spirit of the framework of weak cross-sectional dependence characterizing “approximate factor models,” introduced by Chamberlain and Rothschild (1983). No distributional assumption is made on the idiosyncratic terms. Assumption A.5 (a) states that the cross-sectional averages of the error terms scaled by factor loadings satisfy a CLT. It corresponds to Assumption F.3 in Bai (2003). We adopt stable convergence on the sigma-field $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0843$ to allow the asymptotic variance-covariance matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0844$ to depend on common factors. That would occur, for example, if there are common components in the conditional volatility processes of the idiosyncratic errors. Assumption F.3 in Bai (2003) applies if the trivial filtration is substituted for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0845$ . Assumption A.5 (b) concerns higher-order conditional moments of the scaled cross-sectional average of error terms. A sufficient condition for Assumption A.5 (b) with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0846$ is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0847$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0848$ , a.s., for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0849$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0850$ . For $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0851$ , it corresponds to Assumption C.3 in Bai (2003). Assumption A.5 (c) concerns the fourth-order moment of cross-sectional averages of squared error terms and corresponds to Assumption C.5 in Bai (2003).
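
The content of Assumption A.5 (a) can be visualized with a small simulation. The sketch below relies on simplifying assumptions that are stronger than Assumption A.5: errors are independent across units, heteroscedastic, and drawn from a centered exponential distribution, and the loadings are fixed draws. The notation $\lambda_i$, $\varepsilon_i$, $\sigma_i$ and all variable names are chosen for this example only. Under these assumptions the scaled cross-sectional sum $N^{-1/2}\sum_i \lambda_i \varepsilon_i$ has variance $N^{-1}\sum_i \sigma_i^2 \lambda_i \lambda_i'$, which the code compares with the sample covariance across simulated draws.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k, draws = 500, 2, 5000

lam = rng.uniform(0.5, 1.5, size=(N, k))      # fixed (non-random) loadings
sigma = rng.uniform(0.5, 2.0, size=N)         # error standard deviation per unit

# Centered exponential errors: mean 0, variance sigma_i^2, skewed (non-Gaussian)
eps = sigma * (rng.exponential(1.0, size=(draws, N)) - 1.0)

# Scaled cross-sectional sums, one k-vector per simulated draw
xi = eps @ lam / np.sqrt(N)

# Variance implied by independence across units versus its simulated counterpart
omega = (lam * sigma[:, None] ** 2).T @ lam / N
omega_hat = np.cov(xi, rowvar=False)

print("implied variance:\n", np.round(omega, 3))
print("simulated variance:\n", np.round(omega_hat, 3))
```

With weakly dependent errors, as allowed by Assumption A.5, the limiting variance would also include cross-covariance terms between units, which this independent design sets to zero.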

Assumption A.6 allows for weak serial dependence in error terms and factor processes. Specifically, Assumption A.6 (a) is a strong mixing condition, where (minus) the mixing size is inversely related to the moment order r introduced in Assumptions A.4 and A.5. We rely on this specific concept of time-series dependence because we use a CLT for data that are near-epoch dependent (NED) on mixing processes (see, e.g., Davidson (1994)), to show the asymptotic Gaussian distribution of the test statistic in Theorem 1. We deploy this specific version of the CLT for dependent data as it allows us to cope with the rather complex nature of the leading term in the asymptotic expansion of the test statistic, that involves the time-series average of the square of a cross-sectional average of scaled errors (instead of an average of averages as in the asymptotic expansion of factor estimates). We use Assumptions A.3, A.4, A.5 (a)–(b), and A.6 to check the conditions of the CLT in Section B.1.6 (i). Assumptions A.6 (b) and (c) require that certain quantities are well-approximated by their projection on a finite number of components of a mixing process to apply the NED property.

Assumption A.7 consists of additional restrictions on the weak cross-sectional and time-series dependence of the error and factor processes, which are used to prove the asymptotic expansions for the PCA estimates of the pervasive factors in the two groups in Proposition 3 in Section B.1.1. Specifically, Assumption A.7 (a) concerns cross-sectional averages of cross-products of error terms at different dates. It requires both that these cross-sectional averages are close to the corresponding population covariances in the large sample limit, and that the latter covariances decay with the time lag in a summable way. Assumptions A.7 (b) and (c) provide bounds on terms involving processes $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0852$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0853$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0854$ , that consist of averages of products of error terms at different dates. We elaborate on the conditions of Assumption A.7 in OA Section D.7 to show that they hold under weak primitive assumptions.

Assumptions A.5, A.6, and A.7 yield conditions of weak cross-sectional and time-series dependence to control terms such as those in Assumptions C, D, E, and F.1–F.3 in Bai (2003). They could be substituted, at the expense of more elaborate proofs, by other weak dependence assumptions for factors and idiosyncratic errors.

Assumption A.8 is used to get bounds on the remainder terms in the asymptotic expansions of estimated factors and loadings in Proposition D.4 uniformly across i and t. These bounds are used to control the estimation error for the re-centering and re-scaling terms of the feasible test statistic in Theorem 2. Specifically, Assumption A.8 (a) is a tail condition on the factor stationary distribution, Assumption A.8 (b) constrains the amount of cross-sectional dependence of the error terms, while Assumption A.8 (d) is a uniform bound on true factor loadings. In Assumption A.8 (c), we require that time-series averages of certain zero-mean processes involving error terms and factors satisfy a large deviation bound. Such a large deviation bound is implied by tail conditions plus restrictions on serial dependence like strong mixing (see, e.g., Theorems 3.1 and 3.2 in Bosq (1998)).

Assumption A.9 simplifies the derivation of the feasible asymptotic distribution of the statistic in Theorem 2. This condition excludes correlation of the error terms across individuals and time (conditional on the factors), as well as conditional heteroscedasticity, and implies a “strict factor model” for each group. In that sense, it is more restrictive than Assumptions A.5, A.6, A.7, and A.8 (b)–(c). Moreover, under Assumption A.9, the matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0855$ in Assumption A.5 (a) simplifies to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0856$ , while $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0857$ if either $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0858$ , or $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0859$ . We note that Assumption A.9 substantially simplifies the proof of Theorem 2, but is not needed in the proofs of Theorem 1 and Propositions D.4 through D.7.

Appendix B: Proofs

B.1 Proof of Theorem 1

The proof of Theorem 1 is structured as follows. We start by deriving an asymptotic expansion for the estimates of the pervasive factors extracted by PCA in each group (Section B.1.1). This result yields an asymptotic expansion for the sample canonical correlation matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0860$ (Section B.1.2), and, in turn, it is used to obtain the asymptotic expansions of the eigenvalues and eigenvectors of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0861$ by perturbation methods (Sections B.1.3 and B.1.4). This yields the asymptotic expansions of the canonical correlations and of the test statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0862$ (Section B.1.5). Finally, the asymptotic Gaussian distribution of the test statistic follows by applying a suitable CLT for dependent triangular arrays (Section B.1.6). The proofs of Proposition 3 and technical Lemmas B.1–B.9 are provided in OA Section C.

B.1.1 Asymptotic Expansion of the Factor Estimates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0863$

Proposition 3. Under Assumptions A.1–A.4, A.5 (b), (c), A.6 (a), and A.7, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0864$ (B.1)

for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0865$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0866$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0867$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0868$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0869$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0870$ , and terms $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0871$ are such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0872$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0873$ as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0874$ , and the matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0875$ converges in probability to a nonstochastic orthogonal $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0876$ matrix, for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0877$ .

In each panel, Proposition 3 provides a more accurate asymptotic expansion of principal components than the known results used to show consistency and asymptotic normality of PCA estimators in large panels (see, e.g., Bai and Ng (2002, 2006), Bai (2003), and Stock and Watson (2002)). We need such a refined result to control higher-order terms in the asymptotic expansion of the test statistic.

B.1.2 Asymptotic Expansion of Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0878$

The canonical correlations and the canonical directions are invariant to one-to-one transformations of the vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0879$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0880$ (see, e.g., Anderson (2003)). Therefore, without loss of generality, for the asymptotic analysis of the test statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0881$ , we can set $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0882$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0883$ , in expansion (B.1). We get

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0884$ (B.2)

where

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0885$ (B.3)

for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0886$ . From the definition of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0887$ in (3.1), and by using (B.2) and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0888$ , we get

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0889$ (B.4)
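The invariance of canonical correlations to one-to-one linear transformations, invoked at the start of this subsection, can be checked numerically. The following is a minimal sketch, not part of the original proof: the data, the matrices `A` and `B`, and the function name `canonical_correlations` are hypothetical, and the canonical correlations are computed from uncentered data via QR and SVD, which is one standard numerical route.

```python
import numpy as np

def canonical_correlations(x, y):
    """Sample canonical correlations between the columns of x and y.

    Orthonormal bases of the two column spaces are obtained by QR, and the
    canonical correlations are the singular values of their cross-product.
    No demeaning is applied (an assumption of this sketch), so the result
    corresponds to uncentered second-moment matrices.
    """
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

rng = np.random.default_rng(0)
T = 400
z = rng.standard_normal((T, 3))
x = z + 0.5 * rng.standard_normal((T, 3))   # hypothetical first vector (T x 3)
y = z[:, :2] + rng.standard_normal((T, 2))  # hypothetical second vector (T x 2)

# Fixed nonsingular transformations (determinants 7 and -7).
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])
B = np.array([[1.0, 2.0], [3.0, -1.0]])

cc_original = canonical_correlations(x, y)
cc_transformed = canonical_correlations(x @ A, y @ B)
```

Because `x @ A` spans the same column space as `x` (and likewise for `y @ B`), the two sets of canonical correlations coincide up to floating-point error, which is why the rotation matrices in expansion (B.1) can be set to the identity for the analysis of the statistic.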

By using the definition of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0890$ in Proposition 3, in the next lemma we derive an upper bound for terms $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0891$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0892$ .

Lemma B.1. Under Assumptions A.1–A.4, A.5 (b)–(c), A.6 (a), and A.7, we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0893$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0894$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0895$ .

Let us now expand matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0896$ at second order in the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0897$ . The reason for going beyond the first order is the following. It turns out that the first-order contribution of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0898$ to the statistic of interest involves leading terms of stochastic orders $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0899$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0900$ (see Lemma B.5 below). The second-order remainder term is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0901$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0902$ is not negligible with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0903$ , when either T is too small compared to N, or N is too small compared to T. In order to get validity of our results for more general conditions on the relative growth rate of N and T such as in Assumption A.1, we consider a second-order expansion. By using $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0904$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0905$ , from (B.4) we get the next lemma.

Lemma B.2. Under Assumptions A.1–A.4, A.5 (b)–(c), A.6 (a), and A.7, the second-order asymptotic expansion of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0906$ is

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0907$ (B.5)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0908$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0909$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0910$ ,

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0911$ (B.6)

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0912$ (B.7)

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0913$ .

Equation (B.5) represents matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0914$ as the sum of the sample canonical correlation matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0915$ computed with the true factor values, the estimation error term $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0916$ that consists of first-order and second-order components $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0917$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0918$ , and the third-order remainder term $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0919$ .

B.1.3 Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0920$ and Its Eigenvalues and Eigenvectors

Let us now characterize matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0921$ and its eigenvalues, that are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0922$ , that is, the squared sample canonical correlations of vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0923$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0924$ , under the null hypothesis of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0925$ common factors among the two groups of observables. Since the vectors $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0926$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0927$ have a common component of dimension $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0928$ , we know that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0929$ a.s. Using the notation

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0930$

we can write matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0931$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0932$ , in (B.3) in block form as

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0933$

By straightforward matrix algebra, we get the next lemma.

Lemma B.3. The matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0934$ is such that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0935$

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0936$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0937$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0938$ .

Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0939$ is the sample canonical correlation matrix for the residuals of the sample orthogonal projections of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0940$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0941$ onto $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0942$ . From Lemma B.3, the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0943$ largest eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0944$ are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0945$ , while the remaining $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0946$ eigenvalues are the eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0947$ and are such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0948$ , a.s. Let us define

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0949$ (B.8)

Then, the eigenvectors associated with the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0950$ unit eigenvalues of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0951$ are spanned by the columns of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0952$ . The columns of matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0953$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0954$ span the space $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0955$ .
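The eigenvalue structure described in this subsection can be illustrated on simulated data. The sketch below is only an illustration under stated assumptions: the canonical correlation matrix is taken to have the form V11^{-1} V12 V22^{-1} V21 with uncentered sample second moments, the factors are i.i.d. Gaussian draws, and all names (`f_c`, `f1_s`, `f2_s`, `canonical_corr_matrix`) are hypothetical.

```python
import numpy as np

def canonical_corr_matrix(h1, h2):
    """R = V11^{-1} V12 V22^{-1} V21 with V_jk = h_j' h_k / T (assumed form)."""
    T = h1.shape[0]
    V11 = h1.T @ h1 / T
    V12 = h1.T @ h2 / T
    V22 = h2.T @ h2 / T
    return np.linalg.solve(V11, V12) @ np.linalg.solve(V22, V12.T)

rng = np.random.default_rng(1)
T, k_c, k1_s, k2_s = 500, 2, 1, 2       # hypothetical sample size and dimensions
f_c = rng.standard_normal((T, k_c))     # common factors, shared by both groups
f1_s = rng.standard_normal((T, k1_s))   # group-1 specific factors
f2_s = rng.standard_normal((T, k2_s))   # group-2 specific factors
h1 = np.hstack([f_c, f1_s])
h2 = np.hstack([f_c, f2_s])

R = canonical_corr_matrix(h1, h2)
# R is not symmetric, so use the general eigenvalue routine and keep real parts.
eigvals = np.sort(np.linalg.eigvals(R).real)[::-1]
```

In this setup the first `k_c` entries of `eigvals` equal one up to rounding, the remaining eigenvalue is strictly below one, and the first `k_c` columns of `R` are unit vectors, consistent with the unit eigenvalues having eigenvectors that select the common component.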

B.1.4 Eigenvalues and Eigenvectors of Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0956$ Obtained by Perturbation Methods

The estimators of the first $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0957$ canonical correlations are such that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0958$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0959$ , are the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0960$ largest eigenvalues of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0961$ . We now derive their asymptotic expansion under the null hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0962$ using perturbation arguments applied to equation (B.5). Let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0963$ be a $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0964$ matrix whose columns are eigenvectors of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0965$ associated with the eigenvalues $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0966$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0967$ . We have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0968$ (B.9)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0969$ is the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0970$ diagonal matrix containing the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0971$ largest eigenvalues of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0972$ . We know from the previous subsection that the eigenspace associated with the largest eigenvalue of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0973$ (equal to 1) has dimension $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0974$ and is spanned by the columns of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0975$ . Since the columns of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0976$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0977$ span $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0978$ , we can write the following expansions:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0979$ (B.10)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0980$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0981$ are defined in equation (B.8), the stochastic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0982$ matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0983$ is nonsingular with probability approaching (w.p.a.) 1, stochastic matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0984$ is diagonal, and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0985$ is a $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0986$ stochastic matrix. By the continuity of the matrix eigenvalue and eigenfunction mappings, and Lemma B.1, we have that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0987$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0988$ converge in probability to null matrices as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0989$ at rate $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0990$ . By substituting the expansions (B.5) and (B.10) into the eigenvalue-eigenvector equation (B.9), using the characterization of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0991$ obtained in Lemma B.3, and keeping terms up to order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0992$ , we get expressions for matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0993$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0994$ . These yield the asymptotic expansions of the eigenvalues and eigenvectors of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0995$ provided in the next lemma.

Lemma B.4. Under Assumptions A.1–A.4, A.5 (b)–(c), A.6 (a), and A.7, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0996$ (B.11)

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0997$ (B.12)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0998$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0999$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1000$ denote the upper-left $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1001$ block, the upper-right $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1002$ block, and the lower-right $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1003$ block of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1004$ , and similarly for the blocks of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1005$ .

In equations (B.11) and (B.12), in the terms that are of second order with respect to $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1006$ , we can replace $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1007$ by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1008$ without changing the order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1009$ of the remainder term. Note that the approximation in (B.11) holds for the terms in the main diagonal, as matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1010$ has been defined to be diagonal.
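The logic of perturbing a repeated unit eigenvalue can be sketched in a simplified symmetric setting. The example below is not the expansion of Lemma B.4: it uses a generic symmetric matrix `M0` with eigenvalue one of multiplicity `k`, a hypothetical symmetric perturbation `S`, and the textbook first-order and second-order perturbation formulas for the sum of the `k` largest eigenvalues.

```python
import numpy as np

k = 2                                   # multiplicity of the unit eigenvalue
M0 = np.diag([1.0, 1.0, 0.4, 0.1])      # unperturbed matrix (hypothetical)
S = np.array([[ 0.5,  0.2, 0.3, -0.1],  # symmetric perturbation (hypothetical)
              [ 0.2, -0.4, 0.1,  0.2],
              [ 0.3,  0.1, 0.7,  0.0],
              [-0.1,  0.2, 0.0, -0.3]])
eps = 1e-3                              # perturbation size

def top_k_eigenvalue_sum(M, k):
    """Sum of the k largest eigenvalues of a symmetric matrix."""
    return np.linalg.eigvalsh(M)[-k:].sum()

exact = top_k_eigenvalue_sum(M0 + eps * S, k)

# First order: trace of the perturbation restricted to the unit eigenspace.
first_order = k + eps * np.trace(S[:k, :k])

# Second order: squared cross terms between the unit eigenspace and the
# remaining eigenvectors, weighted by the inverse eigenvalue gaps.
gaps = 1.0 - np.diag(M0)[k:]
second_order = first_order + eps**2 * np.sum(S[:k, k:] ** 2 / gaps)

err_first = abs(exact - first_order)
err_second = abs(exact - second_order)
```

Here `err_first` is of order `eps**2` and `err_second` is of order `eps**3`, which mirrors why keeping second-order terms in the perturbation reduces the remainder by one order.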

B.1.5 Asymptotic Expansion of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1011$

Let us now derive an asymptotic expansion for the sum of the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1012$ largest canonical correlations $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1013$ . By using the expansion of the matrix square root function in a neighborhood of the identity, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1014$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1015$ , from equation (B.11) we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1016$

Using $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1017$ , this implies

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1018$ (B.13)

by the commutative property of the trace and including third-order terms in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1019$ . We derive the asymptotic expansions of the terms within the trace operator in the r.h.s. of (B.13) by plugging in the expressions of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1020$ and its components from Lemma B.2. After tedious algebra, we get the next lemma.

Lemma B.5. Under Assumptions A.1–A.4, A.5 (b)–(c), A.6 (a), and A.7, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1021$

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1022$ . The terms in the curly brackets are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1023$ .

We have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1024$ from the definitions of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1025$ in Lemma B.1 and of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1026$ in Lemma B.5, and the condition $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1027$ in Assumption A.1. Therefore, the leading stochastic terms in the difference $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1028$ are of order $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1029$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1030$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1031$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1032$ .

From the definition of matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1033$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1034$ , we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1035$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1036$ . Moreover, let us define the process

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1037$ (B.14)

Process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1038$ depends on $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1039$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1040$ , but we do not make this dependence explicit for expository purposes. By using these definitions, together with the commutativity and linearity properties of the trace operator, from Lemma B.5 we get

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1041$ (B.15)

Under our set of assumptions, terms $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1042$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1043$ are $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1044$ . In fact, in the next subsection, we show that these terms are jointly asymptotically Gaussian distributed. The remainder term $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1045$ in the r.h.s. of (B.15) is negligible with respect to the first term in the r.h.s.

B.1.6 Asymptotic Distribution of the Test Statistic Under the Null Hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1052$

From the asymptotic expansion (B.15), we obtain the asymptotic distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1053$ under the null hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1054$ of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1055$ common factors. First, we apply a CLT for weakly dependent triangular array data to prove the asymptotic normality of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1056$ as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1057$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1058$ depends on $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1059$ via process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1060$ defined in (B.14).

(i) CLT for Near-Epoch Dependent (NED) Processes

Let process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1061$ be as defined in Assumption A.6, and let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1062$ for any positive integer m, with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1063$ .

Lemma B.6. Under Assumptions A.3, A.4 (a), (b), A.5 (b), and A.6 (a)–(c), we have

(i) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1064$ is measurable w.r.t. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1065$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1066$ for all $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1067$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1068$ ,
(ii) $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1069$ , for a constant $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1070$ ,
(iii) Process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1071$ is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1072$ near-epoch dependent ( $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1073$ -NED) of size −1 on process $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1074$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1075$ is strong mixing of size $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1076$ , uniformly in $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1077$ ,
(iv) Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1083$ is positive-definite and such that
$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1084$ (B.16)

Then, by an application of the univariate CLT in Corollary 24.7 in Davidson (1994) and the Cramér–Wold device, we have that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1085$ (B.17)

as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1086$ . Let us now compute the limit autocovariance matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1087$ explicitly. By the Law of Iterated Expectation and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1088$ , we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1089$ (B.18)

Moreover, from Assumptions A.3 and A.5 (a), vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1090$ is asymptotically Gaussian for any h, t as $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1091$ :

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1092$ (B.19)

We use the Lebesgue lemma to interchange the limit for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1093$ and the outer expectation in the r.h.s. of (B.18), and the fact that convergence in distribution plus uniform integrability imply convergence of the expectation for a sequence of random variables (see Theorem 25.12 in Billingsley (1995)) to show the next lemma.

Lemma B.7. Under Assumptions A.3 and A.5 (b), we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1094$

Lemma B.7 allows us to deploy the joint asymptotic Gaussian distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1095$ to compute the limit autocovariance $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1096$ . By using that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1097$ is measurable w.r.t. $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1098$ , we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1099$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1100$ . To compute the upper-left block of matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1101$ , we use Theorem 12, p. 284, in Magnus and Neudecker (2007) and Theorem 10.21 in Schott (2005), which provide the covariance between two quadratic forms of Gaussian vectors. We get $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1102$ . Therefore, from (B.16) and Lemma B.7, we get

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1103$ (B.20)
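The covariance between two quadratic forms of a Gaussian vector, used above for the upper-left block, has a standard closed form: for a zero-mean Gaussian vector x with covariance Sigma and symmetric matrices A and B, Cov(x'Ax, x'Bx) = 2 tr(A Sigma B Sigma). The sketch below compares this identity with a Monte Carlo estimate. The matrices `A`, `B`, `Sigma`, the number of draws, and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def quad_form_cov_exact(A, B, Sigma):
    """Cov(x'Ax, x'Bx) = 2 tr(A Sigma B Sigma) for x ~ N(0, Sigma), A and B symmetric."""
    return 2.0 * np.trace(A @ Sigma @ B @ Sigma)

def quad_form_cov_mc(A, B, Sigma, n_draws=400_000, seed=0):
    """Monte Carlo estimate of the same covariance from simulated Gaussian draws."""
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(np.zeros(Sigma.shape[0]), Sigma, size=n_draws)
    qa = np.einsum("ni,ij,nj->n", x, A, x)   # x_n' A x_n for each draw n
    qb = np.einsum("ni,ij,nj->n", x, B, x)   # x_n' B x_n for each draw n
    return np.cov(qa, qb)[0, 1]

Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
A = np.array([[1.0, 0.2], [0.2, 0.5]])
B = np.array([[0.3, -0.1], [-0.1, 1.0]])

cov_exact = quad_form_cov_exact(A, B, Sigma)
cov_mc = quad_form_cov_mc(A, B, Sigma)
```

With these inputs the closed form gives 2 × 2.7575 = 5.515, and the simulated covariance should agree with it up to Monte Carlo error.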

(ii) Asymptotic Gaussian Distribution of the Test Statistic

Let us define vector $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1104$ . From equations (B.15) and (B.20), and by using

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1105$

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1106$ , under the hypothesis of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1107$ common factors in each group, the statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1108$ is such that

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1109$

From equation (B.17), the r.h.s. converges in distribution to a standard normal distribution, which yields Theorem 1. Note that this asymptotic distribution holds for any value of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1110$ , and independently of whether $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1111$ or $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1112$ , because the diverging factors in the numerator and the denominator of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1113$ cancel.

B.2 Proof of Theorem 2

To establish the asymptotic distribution of the feasible statistic in Theorem 2, we need to control the effect of replacing the re-centering and re-scaling terms with their estimates. The latter involve estimates of the factors and loadings. Hence, in OA Section D.4, we derive uniform asymptotic expansions of the factor and loading estimators. These results are instrumental for the proof of Theorem 2, as well as for the proofs of other results in this paper. In Sections B.2.1 and B.2.2, we show the statements in Part (i) and in Part (ii) of Theorem 2, respectively.

B.2.1 Proof of Part (i)

Let us first consider the asymptotic distribution of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1114$ under the null hypothesis of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1115$ common factors. Under the assumptions of Theorem 2, the unfeasible asymptotic distribution in Theorem 1 becomes

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1116$ (B.21)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1117$ and we use (4.5) and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1118$ . Theorem 2 (i) follows, if we prove

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1119$ (B.22)

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1120$ (B.23)

Indeed, the statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1121$ can be rewritten as

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1122$

where the ratio $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1123$ converges in probability to 1 from (B.23), the term within the curly brackets in the first line in the r.h.s. converges in distribution to a standard normal distribution from (B.21), and the term on the second line on the r.h.s. is $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1124$ from (B.22).

Let us now prove equations (B.22) and (B.23) by deriving the asymptotic expansions of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1125$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1126$ . To derive the asymptotic expansion of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1127$ , we use its definition $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1128$ , where the matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1129$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1130$ , involve the estimated loadings and residuals. We plug in the uniform asymptotic expansions from Proposition D.4(ii) in OA Section D.4 to show the next result.

Lemma B.8. Under Assumptions A.1–A.9: (i) The asymptotic expansion of estimator $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1131$ is

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1132$ (B.24)

for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1133$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1134$ with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1135$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1136$ and

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1137$ (B.25)

and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1138$ , $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1139$ are nonsingular matrices w.p.a. 1. (ii) The asymptotic expansion of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1140$ is

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1141$ (B.26)

for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1142$ , where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1143$ , with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1144$ , and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1145$ .

Equation (B.24) allows us to compute the asymptotic approximation of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1146$ by matrix inversion:

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1147$ (B.27)
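The inversion step can be illustrated numerically. The sketch below is not the paper's computation: it uses generic matrices `a` (a nonsingular leading term) and `d` (a perturbation direction), both hypothetical, and checks the standard first-order expansion $(A + E)^{-1} \approx A^{-1} - A^{-1} E A^{-1}$, whose error is of second order in the size of $E$. It assumes NumPy is available.

```python
import numpy as np


def first_order_inverse(a, e):
    """First-order approximation of inv(a + e): inv(a) - inv(a) @ e @ inv(a)."""
    a_inv = np.linalg.inv(a)
    return a_inv - a_inv @ e @ a_inv


rng = np.random.default_rng(0)
k = 4
# Hypothetical well-conditioned leading term and perturbation direction.
a = np.eye(k) + 0.1 * rng.standard_normal((k, k))
d = rng.standard_normal((k, k))

scales = [1e-2, 1e-3, 1e-4]
errors = []
for eps in scales:
    exact = np.linalg.inv(a + eps * d)
    approx = first_order_inverse(a, eps * d)
    # Spectral norm of the approximation error.
    errors.append(np.linalg.norm(exact - approx, ord=2))

for eps, err in zip(scales, errors):
    print(f"eps = {eps:.0e}, approximation error = {err:.3e}")
```

Shrinking the perturbation by a factor of 10 should shrink the error by roughly a factor of 100, which is the sense in which the remainder of a first-order expansion is of smaller order.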

Substituting equations (B.27) and (B.26) into the expression of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1148$ and rearranging terms, we get

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1149$

Therefore, from the definitions of matrices $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1150$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1151$ in Lemma B.8, we have

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1152$ (B.28)

where $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1153$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1154$ , for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1155$ . In particular, the upper-left $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1156$ block of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1157$ vanishes, that is, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1158$ for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1159$ .

From equation (B.28), we get the asymptotic expansion for $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1160$ :

$urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1161$ (B.29)

Moreover, Proposition D.4(ii) implies $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1162$. This equation, together with the asymptotic expansion (B.29) and the commutative property of the trace operator, implies equation (B.22). Similarly, the asymptotic expansion (B.29) and the convergence $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1163$ imply equation (B.23).

B.2.2 Proof of Part (ii)

To prove Theorem 2(ii), we consider the behavior of the statistic $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1164$ under the alternative hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1165$ of fewer than $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1166$ common factors. Specifically, let $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1167$ be the true number of common factors in the DGP. The statistic is given by $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1168$. We rely on the following lemma. For its proof, we assume that $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1169$ is used to estimate the common factor in panel $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1170$, while estimator $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1171$ is used in panel $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1172$.

Lemma B.9. Under the alternative hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1173$, with $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1174$, we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1175$, w.p.a. 1, for a constant $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1176$.

From Lemma B.9 and using $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1177$, where the $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1178$ term follows from the continuity of the eigenvalue mapping, we get $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1179$. Under $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1180$, we have $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1181$ canonical correlations that are equal to 1, while the other ones are strictly smaller than 1. Therefore, $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1182$. Then, from Lemma B.9, we get $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1183$, w.p.a. 1, for a constant $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1184$. The conclusion follows.
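The role of unit canonical correlations in this argument can be seen in a small simulation. The sketch below is only an illustration under simplifying assumptions: it works with simulated true factors rather than estimated ones, all names (`canonical_correlations`, `f_common`, `f_spec_1`, `f_spec_2`, `gap`) are hypothetical, and the gap between `r` and the sum of the `r` largest canonical correlations is used as an illustrative measure, not as the paper's statistic. Canonical correlations are computed as the singular values of the product of orthonormal bases of the two centered factor spaces.

```python
import numpy as np


def canonical_correlations(h1, h2):
    """Sample canonical correlations between the columns of h1 and h2 (T x k arrays)."""
    h1 = h1 - h1.mean(axis=0)
    h2 = h2 - h2.mean(axis=0)
    # Orthonormal bases of the two centered column spaces.
    q1, _ = np.linalg.qr(h1)
    q2, _ = np.linalg.qr(h2)
    # Singular values of q1' q2 are the canonical correlations, in decreasing order.
    rho = np.linalg.svd(q1.T @ q2, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)


rng = np.random.default_rng(0)
T = 500
f_common = rng.standard_normal((T, 1))   # one factor shared by both groups
f_spec_1 = rng.standard_normal((T, 2))   # specific to group 1
f_spec_2 = rng.standard_normal((T, 1))   # specific to group 2

# Each group's factor space is observed only up to an invertible rotation.
h1 = np.hstack([f_common, f_spec_1]) @ rng.standard_normal((3, 3))
h2 = np.hstack([f_common, f_spec_2]) @ rng.standard_normal((2, 2))

rho = canonical_correlations(h1, h2)

# Hypothesize r = 2 common factors although only one is truly common.
r = 2
gap = r - rho[:r].sum()
print("canonical correlations:", rho)
print("r minus sum of r largest canonical correlations:", gap)
```

With one truly common factor, the largest canonical correlation equals 1 up to rounding, while the second stays well below 1, so the gap remains bounded away from zero when `r` exceeds the true number of common factors.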

Supporting Information

References

Ahn, S. C., and A. R. Horenstein (2013): “Eigenvalue Ratio Test for the Number of Factors,” Econometrica, 81, 1203–1227. https://doi.org/10.3982/ECTA8968
Aldous, D., and G. Eagleson (1978): “On Mixing and Stability of Limit Theorems,” Annals of Probability, 6, 325–331. https://doi.org/10.1214/aop/1176995577
Amengual, D., and M. W. Watson (2007): “Consistent Estimation of the Number of Dynamic Factors in a Large N and T Panel,” Journal of Business and Economic Statistics, 25, 91–96. https://doi.org/10.1198/073500106000000585
Anderson, T. W. (2003): An Introduction to Multivariate Statistical Analysis. Wiley-Interscience.
Ando, T., and J. Bai (2015): “Multifactor Asset Pricing With a Large Number of Observable Risk Factors and Unobservable Common and Group-Specific Factors,” Journal of Financial Econometrics, 13, 556–604. https://doi.org/10.1093/jjfinec/nbu026
Andreou, E., P. Gagliardini, E. Ghysels, and M. Rubin (2019): “Supplement to ‘Inference in Group Factor Models With an Application to Mixed-Frequency Data’,” Econometrica Supplemental Material, 87, https://doi.org/10.3982/ECTA14690.
Bai, J. (2003): “Inferential Theory for Factor Models of Large Dimensions,” Econometrica, 71, 135–171. https://doi.org/10.1111/1468-0262.00392
Bai, J., and S. Ng (2002): “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70, 191–221. https://doi.org/10.1111/1468-0262.00273
Bai, J., and S. Ng (2006): “Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions,” Econometrica, 74, 1133–1150. https://doi.org/10.1111/j.1468-0262.2006.00696.x
Bai, J., and S. Ng (2007): “Determining the Number of Primitive Shocks in Factor Models,” Journal of Business and Economic Statistics, 25, 52–60. https://doi.org/10.1198/073500106000000413
Billingsley, P. (1995): Probability and Measure. Wiley.
Bosq, D. (1998): Nonparametric Statistics for Stochastic Processes. Springer-Verlag. https://doi.org/10.1007/978-1-4612-1718-3
Breitung, J., and S. Eickmeier (2016): “Analyzing International Business and Financial Cycles Using Multi-Level Factor Models: A Comparison of Alternative Approaches,” Advances in Econometrics, 15, 177–214.
Chamberlain, G., and M. Rothschild (1983): “Arbitrage, Factor Structure and Mean-Variance Analysis on Large Asset Markets,” Econometrica, 51, 1281–1304. https://doi.org/10.2307/1912275
Chen, P. (2010): “A Grouped Factor Model,” Unpublished Manuscript, MPRA Paper No. 36082.
Chen, P. (2012): “Common Factors and Specific Factors,” Unpublished Manuscript, MPRA Paper No. 36114.
Connor, G., and R. A. Korajczyk (1986): “Performance Measurement With the Arbitrage Pricing Theory: A New Framework for Analysis,” Journal of Financial Economics, 15, 373–394. https://doi.org/10.1016/0304-405X(86)90027-9
Cragg, J. G., and S. G. Donald (1996): “On the Asymptotic Properties of LDU-Based Tests of the Rank of a Matrix,” Journal of the American Statistical Association, 91, 1301–1309. https://doi.org/10.1080/01621459.1996.10476999
Cragg, J. G., and S. G. Donald (1997): “Inferring the Rank of a Matrix,” Journal of Econometrics, 76, 223–250. https://doi.org/10.1016/0304-4076(95)01790-9
Davidson, J. (1994): Stochastic Limit Theory. Oxford University Press. https://doi.org/10.1093/0198774036.001.0001
Donald, S. G., N. Fortuna, and V. Pipiras (2007): “On Rank Estimation in Symmetric Matrices: The Case of Indefinite Matrix Estimators,” Econometric Theory, 23, 1103–1123. https://doi.org/10.1017/S0266466607070478
Donald, S. G., N. Fortuna, and V. Pipiras (2010): “On Rank Estimation in Semidefinite Matrices,” Unpublished Manuscript, CEF.UP Working Paper 2-2010.
Flury, B. N. (1984): “Common Principal Components in k Groups,” Journal of the American Statistical Association, 79, 892–898. https://doi.org/10.1080/01621459.1984.10477108
Foerster, A. T., P.-D. G. Sarte, and M. W. Watson (2011): “Sectoral versus Aggregate Shocks: A Structural Factor Analysis of Industrial Production,” Journal of Political Economy, 119, 1–38. https://doi.org/10.1086/659311
Ghysels, E. (2016): “Macroeconomics and the Reality of Mixed Frequency Data,” Journal of Econometrics, 193, 294–314. https://doi.org/10.1016/j.jeconom.2016.04.008
Gill, L., and A. Lewbel (1992): “Testing the Rank and Definiteness of Estimated Matrices With Applications to Factor, State-Space and ARMA Models,” Journal of the American Statistical Association, 87, 766–776. https://doi.org/10.1080/01621459.1992.10475278
Goyal, A., C. Pérignon, and C. Villa (2008): “How Common Are Common Return Factors Across the NYSE and Nasdaq?” Journal of Financial Economics, 90, 252–271. https://doi.org/10.1016/j.jfineco.2008.01.004
Greenwood, R., and D. Scharfstein (2013): “The Growth of Finance,” Journal of Economic Perspectives, 27, 3–28. https://doi.org/10.1257/jep.27.2.3
Gregory, A. W., and A. C. Head (1999): “Common and Country-Specific Fluctuations in Productivity, Investment, and the Current Account,” Journal of Monetary Economics, 44, 423–451. https://doi.org/10.1016/S0304-3932(99)00035-5
Hall, P., and C. Heyde (1980): Martingale Limit Theory and Its Application. Academic Press.
Kleibergen, F., and R. Paap (2006): “Generalized Reduced Rank Tests Using the Singular Value Decomposition,” Journal of Econometrics, 133, 97–126. https://doi.org/10.1016/j.jeconom.2005.02.011
Kose, A. M., C. Otrok, and C. H. Whiteman (2008): “Understanding the Evolution of World Business Cycles,” Journal of International Economics, 75, 110–130. https://doi.org/10.1016/j.jinteco.2007.10.002
Kuersteiner, G., and I. Prucha (2013): “Limit Theory for Panel Data Models With Cross Sectional Dependence and Sequential Exogeneity,” Journal of Econometrics, 174, 107–126. https://doi.org/10.1016/j.jeconom.2013.02.004
Magnus, J. R., and H. Neudecker (2007): Matrix Differential Calculus With Applications in Statistics and Econometrics. Chichester/New York: John Wiley and Sons.
Onatski, A. (2010): “Determining the Number of Factors From Empirical Distribution of Eigenvalues,” Review of Economics and Statistics, 92, 1004–1016. https://doi.org/10.1162/REST_a_00043
Pelger, M. (2019): “Large-Dimensional Factor Modeling Based on High-Frequency Observations,” Journal of Econometrics, 208(1), 23–42. https://doi.org/10.1016/j.jeconom.2018.09.004
Pötscher, B. M. (1983): “Order Estimation in ARMA-Models by Lagrangian Multiplier Tests,” Annals of Statistics, 11, 872–885. https://doi.org/10.1214/aos/1176346253
Renyi, A. (1963): “On Stable Sequences of Events,” Sankhya, 25, 293–302.
Robin, J. M., and R. Smith (2000): “Tests of Rank,” Econometric Theory, 16, 151–175. https://doi.org/10.1017/S0266466600162012
Schott, J. R. (1991): “Some Tests for Common Principal Component Subspaces in Several Groups,” Biometrika, 78, 771–777. https://doi.org/10.1093/biomet/78.4.771
Schott, J. R. (2005): Matrix Analysis for Statistics (Second Ed.). New York: Wiley.
Stock, J. H., and M. W. Watson (2002): “Forecasting Using Principal Components From a Large Number of Predictors,” Journal of the American Statistical Association, 97, 1167–1179. https://doi.org/10.1198/016214502388618960
Tucker, L. R. (1958): “An Inter-Battery Method of Factor Analysis,” Psychometrika, 23, 111–136. https://doi.org/10.1007/BF02289009
Wang, P. (2012): “Large Dimensional Factor Models With a Multi-Level Factor Structure: Identification, Estimation, and Inference,” Unpublished Manuscript, Hong Kong University of Science and Technology.

Volume 87, Issue 4

July 2019

Pages 1267-1305

Inference in Group Factor Models With an Application to Mixed-Frequency Data

Abstract

1 Introduction

2 Identification in Group Factor Models

3 Estimation and Inference on the Number of Common Factors

3.1 Estimators

3.2 Inference on the Number of Common Factors via Canonical Correlations

3.3 Estimation and Inference When $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0262$ and $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0263$ Are Unknown

4 Large Sample Theory

5 Mixed-Frequency Group Factor Models

6 Monte Carlo Simulation Analysis

6.1 Asymptotic Gaussian Distribution, Size, and Power Properties

6.2 Estimation of the Number of Common Factors

7 Empirical Application

7.1 Data Description

7.2 Common, Low-, and High-Frequency Factors

8 Conclusions

Appendix

Appendix A: Assumptions

Appendix B: Proofs

B.1 Proof of Theorem 1

B.1.1 Asymptotic Expansion of the Factor Estimates $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0863$

B.1.2 Asymptotic Expansion of Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0878$

B.1.3 Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0920$ and Its Eigenvalues and Eigenvectors

B.1.4 Eigenvalues and Eigenvectors of Matrix $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-0956$ Obtained by Perturbation Methods

B.1.5 Asymptotic Expansion of $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1011$

B.1.6 Asymptotic Distribution of the Test Statistic Under the Null Hypothesis $urn:x-wiley:00129682:media:ecta200048:ecta200048-math-1052$

B.2 Proof of Theorem 2

B.2.1 Proof of Part (i)

B.2.2 Proof of Part (ii)

Supporting Information

References

Citing Literature
