"The Apple iPod iTunes Anti-Trust Litigation"
Filing
753
***ERRONEOUS ENTRY, PLEASE REFER TO DOCUMENT NO. 754 *** EXHIBITS re 752 Opposition/Response to Motion, filed by Apple Inc. (Attachments: # 1 Exhibit 2, # 2 Exhibit 3, # 3 Exhibit 4, # 4 Exhibit 5, # 5 Exhibit 6, # 6 Exhibit 7, # 7 Exhibit 8, # 8 Exhibit 9, # 9 Exhibit 11, # 10 Exhibit 12, # 11 Proposed Order)(Related document(s) 752 ) (Kiernan, David) (Filed on 1/14/2014) Modified on 1/14/2014 (jlmS, COURT STAFF).
Exhibit 12
ARTICLE IN PRESS
Journal of Econometrics 141 (2007) 597–620
www.elsevier.com/locate/jeconom
Asymptotic properties of a robust variance matrix
estimator for panel data when T is large
Christian B. Hansen
University of Chicago, Graduate School of Business, 5807 South Woodlawn Ave., Chicago, IL 60637, USA
Available online 20 November 2006
Abstract
I consider the asymptotic properties of a commonly advocated covariance matrix estimator for
panel data. Under asymptotics where the cross-section dimension, n, grows large with the time
dimension, T, fixed, the estimator is consistent while allowing essentially arbitrary correlation within
each individual. However, many panel data sets have a non-negligible time dimension. I extend the
usual analysis to cases where n and T go to infinity jointly and where $T \to \infty$ with n fixed. I provide
conditions under which t and F statistics based on the covariance matrix estimator provide valid
inference and illustrate the properties of the estimator in a simulation study.
© 2007 Elsevier B.V. All rights reserved.
JEL classification: C12; C13; C23
Keywords: Panel; Heteroskedasticity; Autocorrelation; Robust; Covariance matrix
1. Introduction
The use of heteroskedasticity robust covariance matrix estimators, cf. White (1980), in
cross-sectional settings and of heteroskedasticity and autocorrelation consistent (HAC)
covariance matrix estimators, cf. Andrews (1991), in time series contexts is extremely
common in applied econometrics. The popularity of these robust covariance matrix
estimators is due to their consistency under weak functional form assumptions. In
particular, their use allows the researcher to form valid confidence regions about a set of
parameters from a model of interest without specifying an exact process for the
disturbances in the model.
E-mail address: chansen1@chicagoGSB.edu.
0304-4076/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jeconom.2006.10.009
C.B. Hansen / Journal of Econometrics 141 (2007) 597–620
With the increasing availability of panel data, it is natural that the use of robust
covariance matrix estimators for panel data settings that allow for arbitrary within
individual correlation is becoming more common. A recent paper by Bertrand et al.
(2004) illustrated the pitfalls of ignoring serial correlation in panel data, finding through a
simulation study that inference procedures which fail to account for within individual
serial correlation may be severely size distorted. As a potential resolution of this problem,
Bertrand et al. (2004) suggest the use of a robust covariance matrix estimator proposed by
Arellano (1987) and explored in Kezdi (2002) which allows arbitrary within individual
correlation and find in a simulation study that tests based on this estimator of the
covariance parameters have correct size.
One drawback of the estimator of Arellano (1987), hereafter referred to as the
‘‘clustered’’ covariance matrix (CCM) estimator, is that its properties are only known in
conventional panel asymptotics as the cross-section dimension, n, increases with the time
dimension, T, fixed. While many panel data sets are indeed characterized by large n and
relatively small T, this is not necessarily the case. For example, in many differences-in-differences
and policy evaluation studies, the cross-section is composed of states and the
time dimension of yearly or quarterly (or occasionally monthly) observations on each state
for 20 or more years.
In this paper, I address this issue by exploring the theoretical properties of the CCM
estimator in asymptotics that allow n and T to go to infinity jointly and in asymptotics
where T goes to infinity with n fixed. I find that the CCM estimator, appropriately
normalized, is consistent without imposing any conditions on the rate of growth of T
relative to n even when the time series dependence between the observations within each
individual is left unrestricted. In this case, both the OLS estimator and the CCM estimator
converge at only the $\sqrt{n}$-rate, essentially because the only information is coming from
cross-sectional variation. If the time series process is restricted to be strongly mixing, I
show that the OLS estimator is $\sqrt{nT}$-consistent but that, because high lags are not down
weighted, the robust covariance matrix estimator still converges at only the $\sqrt{n}$-rate. This
behavior suggests, as indicated in the simulations found in Kezdi (2002), that it is the n
dimension and not the size of n relative to T that matters for determining the properties of
the CCM estimator.
It is interesting to note that the limiting behavior of $\hat\beta$ changes ‘‘discontinuously’’ as the
amount of dependence is limited. In particular, the rate of convergence of $\hat\beta$ changes from
$\sqrt{n}$ in the ‘‘no-mixing case’’ to $\sqrt{nT}$ when mixing is imposed. However, despite the
difference in the limiting behavior of $\hat\beta$, there is no difference in the behavior of standard
inference procedures based on the CCM estimator between the two cases. In particular, the
same t and F statistics will be valid in either case (and in the $n \to \infty$ with T fixed case)
without reference to the asymptotics or degree of dependence in the data.
I also derive the behavior of the CCM estimator as $T \to \infty$ with n fixed, where I find the
estimator is not consistent but does have a limiting distribution. This result corresponds to
asymptotic results for HAC estimators without truncation found in recent work by Kiefer
and Vogelsang (2002, 2005), Phillips et al. (2003), and Vogelsang (2003). While the limiting
distribution is not proportional to the true covariance matrix in general, it is proportional
to the covariance matrix in the important special case of iid data across individuals,$^{1}$
allowing construction of asymptotically pivotal statistics in this case. In fact, in this case,
the standard t-statistic is not asymptotically normal but converges in distribution to a
random variable which is exactly proportional to a $t_{n-1}$ distribution. This behavior
suggests the use of the $t_{n-1}$ for constructing confidence intervals and tests when the CCM
estimator is used as a general rule, as this will provide asymptotically correct critical values
under any asymptotic sequence.
$^{1}$ Note that this still allows arbitrary correlation and heteroskedasticity within individuals, but restricts that the
pattern is the same across individuals.
I then explore the finite sample behavior of the CCM estimator and tests based upon it
through a short simulation study. The simulation results indicate that tests based on the
robust standard error estimates generally have approximately correct size in serially
correlated panel data even in small samples. However, the standard error estimates
themselves are considerably more variable than their counterparts based on simple
parametric models. The bias of the simple parametric estimators is also typically smaller in
the cases where the parametric model is correct, suggesting that these standard error
estimates are likely preferable when the researcher is confident in the form of the error
process. In the simulation, I also explore the behavior of an analog of White’s (1980) direct
test for heteroskedasticity proposed by Kezdi (2002).$^{2}$ The results indicate the performance
of the test is fairly good for moderate n, though it is quite poor when n is small. This
simulation behavior suggests that this test may be useful for choosing between the use of
robust standard error estimates and standard errors estimated from a more parsimonious
model when n is reasonably large.
The remainder of this paper is organized as follows. In Section 2, I present the basic
framework and the estimator and test statistics that will be considered. The asymptotic
properties of these estimators are collected in Section 3, and Section 4 contains a discussion
of a Monte Carlo study assessing the finite sample performance of the estimators in simple
models. Section 5 concludes.
2. A heteroskedasticity–autocorrelation consistent covariance matrix estimator for panel data
Consider a regression model defined by

$$y_{it} = x_{it}'\beta + \epsilon_{it}, \tag{1}$$

where $i = 1, \ldots, n$ indexes individuals, $t = 1, \ldots, T$ indexes time, $x_{it}$ is a $k \times 1$ vector of
observable covariates, and $\epsilon_{it}$ is an unobservable error component. Note that this
formulation incorporates the standard fixed effects model as well as models which include
other covariates that enter the model with individual specific coefficients, such as
individual specific time trends, where these covariates have been partialed out. In these
cases, the variables $x_{it}$, $y_{it}$, and $\epsilon_{it}$ should be interpreted as residuals from regressions of
$x_{it}^*$, $y_{it}^*$, and $\epsilon_{it}^*$ on an auxiliary set of covariates $z_{it}^*$ from the underlying model
$y_{it}^* = x_{it}^{*\prime}\beta + z_{it}^{*\prime}\gamma + \epsilon_{it}^*$. For example, in the fixed effects model, $Z^*$ is a matrix of dummy
variables for each individual and $\gamma$ is a vector of individual specific fixed effects. In this
case, $x_{it} = x_{it}^* - (1/T)\sum_{t=1}^{T} x_{it}^*$, and $y_{it}$ and $\epsilon_{it}$ are defined similarly. Alternatively, $x_{it}$, $y_{it}$,
and $\epsilon_{it}$ could be interpreted as variables resulting from other transformations which
remove the nuisance parameters from the equation, such as first-differencing to remove the
fixed effects. In what follows, all properties are given in terms of the transformed variables
for convenience. Alternatively, conditions could be imposed on the underlying variables
and the properties derived as $T \to \infty$ as in Hansen (2006).$^{3}$
$^{2}$ Solon and Inoue (2004) offer a different testing procedure for detecting serial correlation in fixed effects panel
models. See also Bhargava et al. (1982), Baltagi and Wu (1999), Wooldridge (2002, pp. 275, 282–283), and
Drukker (2003).
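The within (fixed effects) transformation described above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name `within_transform` and the toy data are hypothetical, and a balanced panel stored as one list per individual is assumed.

```python
def within_transform(panel):
    """Subtract each individual's time mean: x_it = x*_it - (1/T) * sum_t x*_it.

    `panel` is a list of n lists, one per individual, each holding T floats.
    """
    demeaned = []
    for series in panel:
        mean_i = sum(series) / len(series)
        demeaned.append([value - mean_i for value in series])
    return demeaned


# Hypothetical raw data for n = 2 individuals observed over T = 3 periods.
x_star = [[1.0, 2.0, 3.0], [10.0, 10.0, 13.0]]
x = within_transform(x_star)
```

The same function would be applied to the outcome series, after which the transformed variables play the roles of $x_{it}$ and $y_{it}$ in Eq. (1).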
Within each individual, the equations defined by (1) may be stacked and represented in
matrix form as

$$y_i = x_i\beta + \epsilon_i, \tag{2}$$

where $y_i$ is a $T \times 1$ vector of individual outcomes, $x_i$ is a $T \times k$ matrix of observed covariates,
and $\epsilon_i$ is a $T \times 1$ vector of unobservables affecting the outcomes $y_i$ with $E[\epsilon_i\epsilon_i' \mid x_i] = \Omega_i$. The
OLS estimator of $\beta$ from Eq. (2) may then be defined as $\hat\beta = (\sum_{i=1}^{n} x_i'x_i)^{-1}\sum_{i=1}^{n} x_i'y_i$. The
properties of $\hat\beta$ as $n \to \infty$ with T fixed are well known. In particular, under regularity
conditions, $\sqrt{n}(\hat\beta - \beta)$ is asymptotically normal with covariance matrix $Q^{-1}WQ^{-1}$ where
$Q = \lim_n (1/n)\sum_{i=1}^{n} E[x_i'x_i]$ and $W = \lim_n (1/n)\sum_{i=1}^{n} E[x_i'\Omega_i x_i]$.
The problem of robust covariance matrix estimation is then estimating W without
imposing a parametric structure on the $\Omega_i$. In this paper, I consider the estimator suggested
by Arellano (1987) which may be defined as

$$\hat W = \frac{1}{nT}\sum_{i=1}^{n} x_i'\hat\epsilon_i\hat\epsilon_i'x_i, \tag{3}$$

where $\hat\epsilon_i = y_i - x_i\hat\beta$ are OLS residuals from Eq. (2). This estimator is an appealing
generalization of White’s (1980) heteroskedasticity consistent covariance matrix estimator
that allows for arbitrary intertemporal correlation patterns and heteroskedasticity across
individuals.$^{4}$ The estimator is also appealing in that, unlike HAC estimators for time series
data, its implementation does not require the selection of a kernel or bandwidth parameter.
The properties of $\hat W$ under conventional panel asymptotics where $n \to \infty$ with T fixed are
well-established. In the remainder of this paper, I extend this analysis by considering the
properties of $\hat W$ under asymptotic sequences where $T \to \infty$ as well.
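As a rough illustration of Eq. (3), the following sketch computes pooled OLS and $\hat W$ with NumPy. The function name `ols_and_ccm`, the array layout (`x` of shape `(n, T, k)`, `y` of shape `(n, T)`), and the simulated data are assumptions made for this example only; a balanced panel of already transformed variables is assumed.

```python
import numpy as np


def ols_and_ccm(x, y):
    """Pooled OLS and the clustered covariance estimate W_hat of Eq. (3).

    x has shape (n, T, k) and y has shape (n, T); both are assumed to be
    already transformed (e.g. demeaned) and to form a balanced panel.
    """
    n, T, k = x.shape
    xtx = np.einsum("itk,itl->kl", x, x)          # sum_i x_i' x_i
    xty = np.einsum("itk,it->k", x, y)            # sum_i x_i' y_i
    beta_hat = np.linalg.solve(xtx, xty)
    resid = y - x @ beta_hat                      # eps_hat_i, shape (n, T)
    scores = np.einsum("itk,it->ik", x, resid)    # row i is x_i' eps_hat_i
    w_hat = scores.T @ scores / (n * T)           # (1/nT) sum_i x_i' e e' x_i
    return beta_hat, w_hat


rng = np.random.default_rng(0)                    # simulated data, illustration only
x = rng.normal(size=(50, 10, 2))
y = x @ np.array([1.0, -0.5]) + rng.normal(size=(50, 10))
beta_hat, w_hat = ols_and_ccm(x, y)
```

Because each individual contributes the outer product of a single $k \times 1$ score $x_i'\hat\epsilon_i$, no kernel or bandwidth choice appears anywhere in the computation.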
The chief reason for interest in the CCM estimator is for performing inference about $\hat\beta$.
Suppose $\sqrt{d_{nT}}(\hat\beta - \beta) \xrightarrow{d} N(0, B)$ and define an estimator of the asymptotic variance of $\hat\beta$ as
$(1/d_{nT})\hat B$ where $\hat B \xrightarrow{p} B$. The following estimator of the asymptotic variance of $\hat\beta$ based on
$\hat W$ is used throughout the remainder of the paper:

$$\widehat{\mathrm{Avar}}(\hat\beta) = \left(\sum_{i=1}^{n} x_i'x_i\right)^{-1}(nT\hat W)\left(\sum_{i=1}^{n} x_i'x_i\right)^{-1} = \left(\sum_{i=1}^{n} x_i'x_i\right)^{-1}\left(\sum_{i=1}^{n} x_i'\hat\epsilon_i\hat\epsilon_i'x_i\right)\left(\sum_{i=1}^{n} x_i'x_i\right)^{-1}. \tag{4}$$

$^{3}$ This is especially relevant in Theorem 3 where the mixing conditions will not hold for the transformed variables
if, for example, the transformation is to remove fixed effects by differencing out the individual means. Hansen
(2006) provides conditions on the untransformed variables which will cover this case in a different but related
context. This approach complicates the proof and notation and is not pursued here.
$^{4}$ It does, however, ignore the possibility of cross-sectional correlation, and it will be assumed that there is no
cross-sectional correlation for the remainder of the paper.
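The sandwich form of Eq. (4) might be computed as follows. This is only a sketch: `avar_ccm` is a hypothetical helper name, the `(n, T, k)` array layout is an assumption, and the data are simulated with an individual-specific error component purely for illustration.

```python
import numpy as np


def avar_ccm(x, resid):
    """Sandwich estimate of Eq. (4) from regressors x (n, T, k) and OLS residuals (n, T)."""
    xtx = np.einsum("itk,itl->kl", x, x)           # "bread": sum_i x_i' x_i
    scores = np.einsum("itk,it->ik", x, resid)     # row i is x_i' eps_hat_i
    meat = scores.T @ scores                       # sum_i x_i' e e' x_i = nT * W_hat
    bread_inv = np.linalg.inv(xtx)
    return bread_inv @ meat @ bread_inv


rng = np.random.default_rng(3)                     # simulated data, illustration only
n, T, k = 40, 8, 2
x = rng.normal(size=(n, T, k))
y = x @ np.array([1.0, 2.0]) + rng.normal(size=(n, 1)) + rng.normal(size=(n, T))
beta_hat = np.linalg.solve(np.einsum("itk,itl->kl", x, x), np.einsum("itk,it->k", x, y))
avar = avar_ccm(x, y - x @ beta_hat)
std_err = np.sqrt(np.diag(avar))                   # clustered standard errors
```

Note that the function never needs the normalizing sequence $d_{nT}$: the factors of $nT$ in the bread and the meat cancel.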
In addition, for testing the hypothesis $R\beta = r$ for a $q \times k$ matrix R with rank q, the usual t
(for R a $1 \times k$ vector) and Wald statistics can be defined as

$$t^* = \frac{\sqrt{nT}\,(R\hat\beta - r)}{\sqrt{R\hat Q^{-1}\hat W\hat Q^{-1}R'}} \tag{5}$$

and

$$F^* = nT\,(R\hat\beta - r)'\,[R\hat Q^{-1}\hat W\hat Q^{-1}R']^{-1}\,(R\hat\beta - r), \tag{6}$$

respectively, where $\hat W$ is defined above and $\hat Q = (1/nT)\sum_{i=1}^{n} x_i'x_i$. In Section 3, I verify
that, despite differences in the limiting behavior of $\hat\beta$, $t^* \xrightarrow{d} N(0, 1)$, $F^* \xrightarrow{d} \chi^2_q$, and $\widehat{\mathrm{Avar}}(\hat\beta)$ is
valid for estimating the asymptotic variance of $\hat\beta$ as $n \to \infty$ regardless of the behavior of T.
I also consider the behavior of $t^*$ and $F^*$ as $T \to \infty$ with n fixed. In this case, $\hat W$ is not
consistent for W but does have a limiting distribution; and when the data are iid across i,$^{5}$ I
show that $t^* \xrightarrow{d} (n/(n-1))^{1/2}\, t_{n-1}$ and that $F^*$ is asymptotically pivotal and so can be used
to construct valid tests. This behavior suggests that inference using $(n/(n-1))\hat W$ and
forming critical values using a $t_{n-1}$ distribution will be valid regardless of the asymptotic
sequence considered.
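The t-statistic of Eq. (5), with and without the $n/(n-1)$ rescaling of $\hat W$ suggested above, can be sketched as follows. The function name `t_star`, its `rescale` flag, and the simulated data are hypothetical choices for this illustration; `R` is taken to be a length-k vector.

```python
import numpy as np


def t_star(x, y, R, r, rescale=False):
    """t-statistic of Eq. (5) for H0: R beta = r, with R a length-k vector.

    If rescale is True, W_hat is multiplied by n / (n - 1) before use.
    """
    n, T, k = x.shape
    xtx = np.einsum("itk,itl->kl", x, x)
    beta_hat = np.linalg.solve(xtx, np.einsum("itk,it->k", x, y))
    scores = np.einsum("itk,it->ik", x, y - x @ beta_hat)   # x_i' eps_hat_i
    q_inv = np.linalg.inv(xtx / (n * T))                    # Q_hat^{-1}
    w_hat = scores.T @ scores / (n * T)
    if rescale:
        w_hat = w_hat * n / (n - 1)
    denom = np.sqrt(R @ q_inv @ w_hat @ q_inv @ R)
    return np.sqrt(n * T) * (R @ beta_hat - r) / denom


rng = np.random.default_rng(2)                  # simulated data, illustration only
n, T = 12, 25
x = rng.normal(size=(n, T, 2))
y = x @ np.array([0.3, 0.0]) + rng.normal(size=(n, 1)) + rng.normal(size=(n, T))
R = np.array([0.0, 1.0])                        # tests the second coefficient
t_raw = t_star(x, y, R, 0.0)
t_adj = t_star(x, y, R, 0.0, rescale=True)      # compare with t_{n-1} critical values
```

The rescaled statistic is simply `t_raw` multiplied by $\sqrt{(n-1)/n}$; it would then be compared with a quantile of the $t_{n-1}$ distribution rather than the standard normal.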
It is worth noting that the estimator $\hat W$ has also been used extensively in multilevel
models to account for the presence of correlation between individuals within cells; cf.
Liang and Zeger (1986) and Bell and McCaffrey (2002). For example, in a schooling study,
one might have data on individual outcomes where the individuals are grouped into
classes. In this case, the cross-sectional unit of observation could be defined as the class,
and arbitrary correlation between all individuals within each class could be allowed. In this
case, one would expect the presence of a classroom specific random effect resulting in
equicorrelation between all individuals within a class. While this would clearly violate the
mixing assumptions imposed in obtaining the asymptotic behavior as $T \to \infty$ with n fixed,
it would not invalidate the use of $\hat W$ for inference about $\beta$ in cases where n and T go to
infinity jointly.
In addition to being useful for performing inference about $\hat\beta$, $\hat W$ may also be used to test
the specification of simple parametric models of the error process.$^{6}$ Such a test may be
useful for a number of reasons. If a parametric model is correct, the estimates of the
variance of $\hat\beta$ based on this model will tend to behave better than the estimates obtained
from $\hat W$. In particular, parametric estimates of the variance of $\hat\beta$ will often be considerably
less variable and will typically converge faster than estimates made using $\hat W$; and if the
parametric model is deemed to be adequate, this model may be used to perform FGLS
estimation. The FGLS estimator is asymptotically more efficient than the OLS estimator,
and simulation evidence in Hansen (2006) suggests that the efficiency gain to using FGLS
over OLS in serially correlated panel data may be substantial.
$^{5}$ Note that this still allows arbitrary correlation and heteroskedasticity within individuals but restricts that the
pattern is the same across individuals.
$^{6}$ The test considered is a straightforward generalization of the test proposed by White (1980) for
heteroskedasticity and was suggested in the panel context by Kezdi (2002).
To define the specification test, called hereafter the heteroskedasticity–autocorrelation
(HA) test, let $\hat W(\hat\theta) = (1/nT)\sum_{i=1}^{n} x_i'\Omega_i(\hat\theta)x_i$ where $\hat\theta$ are estimates of a finite set of
parameters describing the disturbance process and $\Omega_i(\hat\theta)$ is the implied covariance matrix
for individual i.$^{7}$ Define a test statistic

$$S^* = (nT)\,[\mathrm{vec}(\hat W - \hat W(\hat\theta))]'\,\hat D^{-}\,[\mathrm{vec}(\hat W - \hat W(\hat\theta))], \tag{7}$$

where $\hat D$ is a positive semi-definite weighting matrix that estimates the variance of
$\mathrm{vec}(\hat W - \hat W(\hat\theta))$ and $A^{-}$ is the generalized inverse of a matrix A.$^{8}$ In the following section, it will be
shown that $S^* \xrightarrow{d} \chi^2_{k(k+1)/2}$ for $\hat D$ defined below.
A natural choice for $\hat D$ is

$$\hat D = \frac{1}{nT}\sum_{i=1}^{n}\left[\mathrm{vec}(x_i'\hat\epsilon_i\hat\epsilon_i'x_i - x_i'\Omega_i(\hat\theta)x_i)\right]\left[\mathrm{vec}(x_i'\hat\epsilon_i\hat\epsilon_i'x_i - x_i'\Omega_i(\hat\theta)x_i)\right]'. \tag{8}$$

Under asymptotics where $\{n, T\} \to \infty$ jointly, another potential choice for $\hat D$ is an estimate
of the asymptotic variance of $\hat W$:

$$\hat V = \frac{1}{nT}\sum_{i=1}^{n}\left[\mathrm{vec}(x_i'\hat\epsilon_i\hat\epsilon_i'x_i - \hat W)\right]\left[\mathrm{vec}(x_i'\hat\epsilon_i\hat\epsilon_i'x_i - \hat W)\right]'. \tag{9}$$

That $\hat V$ provides an estimator of the variance of $\mathrm{vec}(\hat W - \hat W(\hat\theta))$ follows from the fact that
as $\{n, T\} \to \infty$, $\mathrm{vec}(\hat W)$ is $\sqrt{n}$-consistent while $\mathrm{vec}(\hat W(\hat\theta))$ will be $\sqrt{nT}$-consistent in many
cases, so $\mathrm{vec}(\hat W(\hat\theta))$ may be taken as a constant relative to $\mathrm{vec}(\hat W)$. The difference in rates
of convergence would arise, for example, in a fixed effects panel model where the errors
follow an AR process with common AR coefficients across individuals. However, it is
important to note that this will not always be the case. In particular, in random effects
models, the estimator of the variance of the individual specific shock will converge at only
a $\sqrt{n}$ rate, implying the same rate of convergence for both the robust and parametric
estimators of the variance. In the following section, I outline the asymptotic properties of
$\hat\beta$, $\hat W$, and $\hat V$ from which the behavior of $t^*$, $F^*$, and $S^*$ will follow. The properties of $\hat D$,
though not discussed, will generally be the same as those of $\hat V$ under the different
asymptotic sequences considered.
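The HA statistic of Eq. (7) with the weighting matrix of Eq. (8) could be sketched as below. This is an illustrative implementation under stated assumptions, not the paper's code: `ha_statistic` is a hypothetical name, `omega` is assumed to hold the fitted $\Omega_i(\hat\theta)$ as an `(n, T, T)` array, the generalized inverse is taken to be the Moore-Penrose pseudo-inverse with an arbitrary cutoff, and the example fits the simplest parametric model (iid errors) to simulated data.

```python
import numpy as np


def ha_statistic(x, resid, omega):
    """HA statistic S* of Eq. (7) using the weighting matrix D_hat of Eq. (8).

    x: (n, T, k) regressors, resid: (n, T) OLS residuals,
    omega: (n, T, T) fitted covariance matrices Omega_i(theta_hat).
    """
    n, T, k = x.shape
    scores = np.einsum("itk,it->ik", x, resid)
    a = np.einsum("ik,il->ikl", scores, scores)        # x_i' e e' x_i
    b = np.einsum("itk,its,isl->ikl", x, omega, x)     # x_i' Omega_i x_i
    # Row-major flattening equals vec(.) here because a - b is symmetric.
    diff = (a - b).reshape(n, k * k)
    v = diff.sum(axis=0) / (n * T)                     # vec(W_hat - W_hat(theta_hat))
    d_hat = diff.T @ diff / (n * T)
    # D_hat is singular (duplicated off-diagonal terms), so a pseudo-inverse is used.
    return n * T * v @ np.linalg.pinv(d_hat, rcond=1e-10) @ v


rng = np.random.default_rng(4)                         # simulated data, illustration only
n, T, k = 60, 6, 2
x = rng.normal(size=(n, T, k))
y = x @ np.array([1.0, -1.0]) + rng.normal(size=(n, T))
beta_hat = np.linalg.solve(np.einsum("itk,itl->kl", x, x), np.einsum("itk,it->k", x, y))
resid = y - x @ beta_hat
sigma2 = np.mean(resid ** 2)                           # theta_hat for the iid-error model
omega = np.broadcast_to(sigma2 * np.eye(T), (n, T, T))
s_star = ha_statistic(x, resid, omega)
```

With k = 2 the statistic would be compared with a $\chi^2_3$ critical value, since $k(k+1)/2 = 3$.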
$^{7}$ Consistency and asymptotic normality of $\hat W(\hat\theta)$ will generally follow from consistency and asymptotic
normality of $\hat\theta$. In particular, defining $W_i(\theta)$ as the derivative of W with respect to $\theta_i$ and letting $\theta$ be a $p \times 1$
vector, a Taylor series expansion of $W(\hat\theta)$ yields $W(\hat\theta) = W(\theta) + \sum_{i=1}^{p} W_i(\bar\theta)(\hat\theta_i - \theta_i)$ where $\bar\theta$ is an intermediate
value. As long as a uniform law of large numbers applies to $W_i(\theta)$, $W(\hat\theta) - W(\theta)$ will inherit the properties of
$\hat\theta - \theta$. The problem is then reduced to finding an estimator of $\theta$ that is consistent and asymptotically normal with a
mean zero asymptotic distribution. Finding such an estimator in fixed effects panel models with serial correlation
and/or heteroskedasticity when $n \to \infty$ and $T/n \to \rho$ where $\rho < \infty$ is complicated, though there are estimators
which exist. See, for example, Nickell (1981), MaCurdy (1982), Solon (1984), Lancaster (2002), Hahn and
Kuersteiner (2002), Hahn and Newey (2004), and Hansen (2006).
$^{8}$ The test could alternatively be defined by only considering the $k(k+1)/2$ unique elements of $\hat W - \hat W(\hat\theta)$ and
using the inverse of the implied covariance matrix. This test will be equivalent to the test outlined above.
3. Asymptotic properties of the robust covariance matrix estimator
To develop the asymptotic inference results, I impose the following conditions.
Assumption 1. $\{x_i, \epsilon_i\}$ are independent across i, and $E[\epsilon_i\epsilon_i' \mid x_i] = \Omega_i$.
Assumption 2. $Q_{nT} = E[\sum_{i=1}^{n} (x_i'x_i/nT)]$ is uniformly positive definite with constant limit Q
where limits are taken as $n \to \infty$ with T fixed in Theorem 1, as $\{n, T\} \to \infty$ in Theorems 2
and 3, and as $T \to \infty$ with n fixed in Theorem 4.
In addition, I impose either Assumption 3(a) or Assumption 3(b) depending on the
context.
Assumption 3.
(a) $E[\epsilon_i \mid x_i] = 0$.
(b) $E[x_{it}\epsilon_{it}] = 0$.
Assumptions 1–3 are quite standard for panel data models. Assumption 1 imposes
independence across individuals, ruling out cross-sectional correlation, but leaves the time
series correlation unconstrained and allows general heterogeneity across individuals.
Assumption 2 is a standard full rank condition, and the restriction that $Q_{nT}$ has a constant
limit could be relaxed at the cost of more complicated notation. Assumption 3 imposes
that one of two orthogonality conditions is satisfied. Assumption 3(b) imposes that $x_{it}$
and $\epsilon_{it}$ are uncorrelated and is weaker than the strict exogeneity imposed in Assumption 3(a).
Assumption 3(a) is stronger than necessary, but it simplifies the proof of asymptotic
normality of $\hat W$ and consistency of $\hat V$. In addition, Assumption 3(a) would typically
be imposed in fixed effects models.$^{9}$
The first theorem, which is stated here for completeness, collects the properties of $\hat\beta$ and
$\hat W$ in asymptotics where $n \to \infty$ with T fixed.
Theorem 1. Suppose the data are generated by model (1), that Assumptions 1 and 2 are
satisfied, and that $n \to \infty$ with T fixed.
(i) If Assumption 3(b) holds and $E|x_{ith}|^{4+\delta} < \Delta < \infty$ and $E|\epsilon_{it}|^{4+\delta} < \Delta < \infty$ for some $\delta > 0$,
then

$$\sqrt{nT}\,(\hat\beta - \beta) \xrightarrow{d} Q^{-1} N\!\left(0,\; W = \lim_{n} \frac{1}{nT}\sum_{i=1}^{n} E[x_i'\Omega_i x_i]\right),$$

and

$$\hat W \xrightarrow{p} W.$$

(ii) In addition, if Assumption 3(a) holds and $E|x_{ith}|^{8+\delta} < \Delta < \infty$ and $E|\epsilon_{it}|^{8+\delta} < \Delta < \infty$ for
some $\delta > 0$, then

$$\sqrt{nT}\,[\mathrm{vec}(\hat W - W)] \xrightarrow{d} N\!\left(0,\; V = \lim_{n} \frac{1}{nT}\sum_{i=1}^{n} E\left[(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))'\right]\right),$$

and

$$\hat V \xrightarrow{p} V.$$

$^{9}$ Note that a balanced panel has also implicitly been assumed. All of the results with the exception of Corollary
4.1 could be extended to accommodate unbalanced panels at the cost of more complicated notation.
Remark 3.1. It follows from Theorem 1(i) that the asymptotic variance of $\hat\beta$ can be
estimated using (4) since

$$\widehat{\mathrm{Avar}}(\hat\beta) = \left(\sum_{i=1}^{n} x_i'x_i\right)^{-1}\left(\sum_{i=1}^{n} x_i'\hat\epsilon_i\hat\epsilon_i'x_i\right)\left(\sum_{i=1}^{n} x_i'x_i\right)^{-1} = \frac{1}{nT}\left(\frac{1}{nT}\sum_{i=1}^{n} x_i'x_i\right)^{-1}\hat W\left(\frac{1}{nT}\sum_{i=1}^{n} x_i'x_i\right)^{-1} = \frac{1}{nT}\,\hat Q^{-1}\hat W\hat Q^{-1},$$

where $\hat Q^{-1}\hat W\hat Q^{-1} \xrightarrow{p} Q^{-1}WQ^{-1}$. It also follows immediately from the definitions of $t^*$ and
$F^*$ in Eqs. (5) and (6) and Theorem 1(i) that, under the null hypothesis, $t^* \xrightarrow{d} N(0, 1)$ and
$F^* \xrightarrow{d} \chi^2_q$. Similarly, using Theorem 1(ii) and assuming $\hat W(\hat\theta)$ has properties similar to those
of $\hat W$, it will follow that the HA test statistic, $S^*$, formed using $\hat D$ defined above converges
in distribution to a $\chi^2_{k(k+1)/2}$ under the null hypothesis.
Theorem 1 verifies that $\hat\beta$ and $\hat W$ are consistent and asymptotically normal as $n \to \infty$
with T fixed without imposing any restrictions on the time series dimension. In the
following results, I consider alternate asymptotic approximations under the assumption
that both n and T are going to infinity.$^{10}$ In these cases, consistency and asymptotic
normality of suitably normalized versions of $\hat W$ are established under weak conditions.
Theorem 2, given immediately below, covers the case where n and T are going to infinity
and there is not weak dependence in the time series. In particular, the results of Theorem 2
are only interesting in the case where $W = \lim_{n,T} (1/nT^2)\sum_{i=1}^{n} E[x_i'\Omega_i x_i] > 0$. Perhaps the
leading case where this behavior would occur is in a model where $\epsilon_{it}$ includes an individual
specific random effect that is uncorrelated with $x_{it}$ and the estimated model does not include
an individual specific effect. In this case, all observations for a given individual will be
equicorrelated, and the condition given above will hold. Theorem 3, given following
Theorem 2, covers the case where there is mixing in the time series.
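As a worked illustration of the equicorrelated case, consider a deliberately simple special case that is not taken from the paper: a single constant regressor, $x_{it} = 1$, and an error $\epsilon_{it} = a_i + u_{it}$, where $a_i$ has variance $\sigma_a^2$, $u_{it}$ is iid over t with variance $\sigma_u^2$, and the two components are independent. The symbols $a_i$, $u_{it}$, $\sigma_a^2$, $\sigma_u^2$, and $\iota_T$ (a $T \times 1$ vector of ones) are assumptions introduced only for this sketch.

```latex
\begin{align*}
\Omega_i &= \sigma_a^2\,\iota_T\iota_T' + \sigma_u^2 I_T, \qquad x_i = \iota_T, \\
x_i'\Omega_i x_i &= \sigma_a^2 T^2 + \sigma_u^2 T, \\
\frac{1}{nT^2}\sum_{i=1}^{n} E[x_i'\Omega_i x_i] &= \sigma_a^2 + \frac{\sigma_u^2}{T}
  \;\longrightarrow\; \sigma_a^2 > 0 \quad \text{as } T \to \infty .
\end{align*}
```

If instead $\sigma_a^2 = 0$, the same expression tends to zero, and it is the $1/nT$ normalization that yields the nondegenerate limit $\sigma_u^2$.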
Theorem 2. Suppose the data are generated by model (1), that Assumptions 1 and 2 are
satisfied, and that $\{n, T\} \to \infty$ jointly.
(i) If Assumption 3(b) holds and $E|x_{ith}|^{4+\delta} < \Delta < \infty$ and $E|\epsilon_{it}|^{4+\delta} < \Delta < \infty$ for some $\delta > 0$,
then

$$\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} Q^{-1} N\!\left(0,\; W = \lim_{n,T} \frac{1}{nT^2}\sum_{i=1}^{n} E[x_i'\Omega_i x_i]\right),$$

and

$$\hat W/T \xrightarrow{p} W.$$

(ii) In addition, if Assumption 3(a) holds and $E|x_{ith}|^{8+\delta} < \Delta < \infty$ and $E|\epsilon_{it}|^{8+\delta} < \Delta < \infty$ for
some $\delta > 0$, then

$$\sqrt{n}\,[\mathrm{vec}(\hat W/T - W)] \xrightarrow{d} N\!\left(0,\; V = \lim_{n,T} \frac{1}{nT^4}\sum_{i=1}^{n} E\left[(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))'\right]\right),$$

and

$$\hat V/T^3 \xrightarrow{p} V.$$

$^{10}$ One could also consider sequential limits in which one takes limits as n or T goes to infinity with the other
dimension fixed and then lets the other dimension go to infinity. It could be shown that under the conditions of
Theorem 2 and appropriate normalizations sequential limits taken first with respect to either n or T would yield
the same results as the joint limit. Similarly, under the conditions of Theorem 3, the sequential limits taken first
with respect to either n or T would produce the same results as the joint limit.
Remark 3.2. It is important to note that the results presented in Theorem 2 are not
interesting in the setting where the $\{j, k\}$ element of $\Omega_i$ becomes small when $|j - k|$ is large
since in these circumstances $(1/nT^2)\sum_{i=1}^{n} E[x_i'\Omega_i x_i] \to 0$. Theorem 3 presents results which
are relevant in this case.
Remark 3.3. Theorem 2 verifies consistency and asymptotic normality of both $\hat\beta$ and $\hat W$
while imposing essentially no constraints on the time series dependence in the data. The
large cross-section effectively allows the time series dimension to be ignored even when T is
large. However, without constraints on the time series, $\hat\beta$ is $\sqrt{n}$-consistent, not $\sqrt{nT}$-consistent.
Intuitively, the slower rate of convergence is due to the fact that there may be
little information contained in the time series since it is allowed to be arbitrarily dependent.
Remark 3.4. The fact that $\hat\beta$ and $\hat W$ are not $\sqrt{nT}$-consistent will not affect practical
implementation of inference about $\hat\beta$. In particular, the estimate of the asymptotic variance
of $\hat\beta$ based on Eq. (4) is

$$\widehat{\mathrm{Avar}}(\hat\beta) = \left(\sum_{i=1}^{n} x_i'x_i\right)^{-1}\left(\sum_{i=1}^{n} x_i'\hat\epsilon_i\hat\epsilon_i'x_i\right)\left(\sum_{i=1}^{n} x_i'x_i\right)^{-1} = \frac{1}{n}\left(\frac{1}{nT}\sum_{i=1}^{n} x_i'x_i\right)^{-1}(\hat W/T)\left(\frac{1}{nT}\sum_{i=1}^{n} x_i'x_i\right)^{-1} = \frac{1}{n}\,\hat Q^{-1}(\hat W/T)\hat Q^{-1},$$

where $\hat Q^{-1}(\hat W/T)\hat Q^{-1} \xrightarrow{p} Q^{-1}WQ^{-1}$. The t-statistic defined in Eq. (5) may also be expressed
as

$$t^* = \frac{\sqrt{nT}\,(R\hat\beta - r)}{\sqrt{R\hat Q^{-1}\hat W\hat Q^{-1}R'}} = \frac{\sqrt{n}\,(R\hat\beta - r)}{\sqrt{R\hat Q^{-1}(\hat W/T)\hat Q^{-1}R'}}$$

which converges in distribution to a $N(0, 1)$ random variable under the null hypothesis,
$R\beta = r$, by Theorem 2(i). Similarly, it follows that $F^* \xrightarrow{d} \chi^2_q$ under the null. Finally, the HA
test statistic, $S^*$, defined above also satisfies

$$S^* = (nT)\,[\mathrm{vec}(\hat W - \hat W(\hat\theta))]'\,\hat D^{-}\,\mathrm{vec}(\hat W - \hat W(\hat\theta)) = n\,[\mathrm{vec}(\hat W/T - \hat W(\hat\theta)/T)]'\,(\hat D/T^3)^{-}\,\mathrm{vec}(\hat W/T - \hat W(\hat\theta)/T),$$

which converges in distribution to a $\chi^2_{k(k+1)/2}$ under the conditions of the theorem and the
additional assumption that $\hat W(\hat\theta)$ behaves similarly to $\hat V$.
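The practical point of Remark 3.4, that the t-statistic does not depend on which normalization is used, can be checked numerically. The sketch below uses simulated data with a single regressor and an individual effect left in the error (so the errors are not mixing); all variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)                      # simulated data, illustration only
n, T = 30, 40
x = 1.0 + rng.normal(size=(n, T, 1))
effect = rng.normal(size=(n, 1))                    # individual effect left in the error
y = 0.5 * x[:, :, 0] + effect + rng.normal(size=(n, T))

xtx = np.einsum("itk,itl->kl", x, x)
beta_hat = np.linalg.solve(xtx, np.einsum("itk,it->k", x, y))
scores = np.einsum("itk,it->ik", x, y - x @ beta_hat)
q_hat = xtx / (n * T)
w_hat = scores.T @ scores / (n * T)
gap = beta_hat[0] - 0.5                             # R beta_hat - r with R = 1, r = 0.5

# Eq. (5) with the sqrt(nT) normalization.
t_nT = np.sqrt(n * T) * gap / np.sqrt(w_hat[0, 0] / q_hat[0, 0] ** 2)
# Same statistic written with the sqrt(n) normalization and W_hat / T.
t_n = np.sqrt(n) * gap / np.sqrt((w_hat[0, 0] / T) / q_hat[0, 0] ** 2)
# Same statistic from the sandwich variance of Eq. (4), with no rate supplied.
avar = np.linalg.inv(xtx) @ (scores.T @ scores) @ np.linalg.inv(xtx)
t_avar = gap / np.sqrt(avar[0, 0])
```

All three expressions are algebraically identical, so the researcher does not need to know which rate of convergence applies in order to compute the statistic.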
The previous theorem establishes the properties of $\hat\beta$ and the robust variance matrix
estimator as n and T go to infinity jointly without imposing restrictions on the time series
dependence. While the result is interesting, there are many cases in which one might expect
the time series dependence to diminish over time. In the following theorem, the properties
of $\hat\beta$ and $\hat W$ are established under the assumption that the data are strong mixing in the
time series dimension.
Theorem 3. Suppose the data are generated by model (1), that Assumptions 1 and 2 are
satisfied, and that $\{n, T\} \to \infty$ jointly.
(i) If Assumption 3(b) is satisfied, $E|x_{ith}|^{r+\delta} < \Delta$ and $E|\epsilon_{it}|^{r+\delta} < \Delta$ for some $\delta > 0$, and $\{x_{it}, \epsilon_{it}\}$
is a strong mixing sequence in t with $\alpha$ of size $-3r/(r-4)$ for $r > 4$,

$$\sqrt{nT}\,(\hat\beta - \beta) \xrightarrow{d} Q^{-1} N\!\left(0,\; W = \lim_{n,T} \frac{1}{nT}\sum_{i=1}^{n} E[x_i'\Omega_i x_i]\right)$$

and

$$\hat W - W \xrightarrow{p} 0.$$

(ii) In addition, if Assumption 3(a) is satisfied, $E|x_{ith}|^{r+\delta} < \Delta$ and $E|\epsilon_{it}|^{r+\delta} < \Delta$ for some $\delta > 0$,
and $\{x_{it}, \epsilon_{it}\}$ is a strong mixing sequence in t with $\alpha$ of size $-7r/(r-8)$ for $r > 8$,

$$\sqrt{n}\,[\mathrm{vec}(\hat W - W)] \xrightarrow{d} N\!\left(0,\; V = \lim_{n,T} \frac{1}{nT^2}\sum_{i=1}^{n} E\left[(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))(\mathrm{vec}(x_i'\epsilon_i\epsilon_i'x_i - W))'\right]\right),$$

and

$$\hat V/T \xrightarrow{p} V.$$
Remark 3.5. Theorem 3 verifies consistency and asymptotic normality of both $\hat\beta$ and $\hat W$
under fairly conventional conditions on the time series dependence of the variables. The
added restriction on the time series dependence allows estimation of $\beta$ at the $\sqrt{nT}$-rate,
which differs from the case above where $\hat\beta$ is only $\sqrt{n}$-consistent. Intuitively, the increase in
the rate of convergence is due to the fact that under the mixing conditions, the time series is
more informative than in the case analyzed in Theorem 2.
Remark 3.6. It follows immediately from the conclusions of Theorem 3 and the definitions
of $\widehat{\mathrm{Avar}}(\hat\beta)$, $t^*$, and $F^*$ in Eqs. (4)–(6) that $\widehat{\mathrm{Avar}}(\hat\beta)$ is valid for estimating the asymptotic
variance of $\hat\beta$ and that $t^* \xrightarrow{d} N(0, 1)$ and $F^* \xrightarrow{d} \chi^2_q$ under the null hypothesis. The HA test
statistic, $S^*$, also satisfies

$$S^* = (nT)\,[\mathrm{vec}(\hat W - \hat W(\hat\theta))]'\,\hat D^{-}\,\mathrm{vec}(\hat W - \hat W(\hat\theta)) = n\,[\mathrm{vec}(\hat W - \hat W(\hat\theta))]'\,(\hat D/T)^{-}\,\mathrm{vec}(\hat W - \hat W(\hat\theta)),$$

which converges in distribution to a $\chi^2_{k(k+1)/2}$ under the conditions of the theorem and the
assumption that $\hat D$ behaves similarly to $\hat V$. In this case, $\hat V$ could also typically be used as
the weighting matrix in forming $S^*$ since it will often be the case that $\hat W(\hat\theta)$ will be
$\sqrt{nT}$-consistent while $\hat W$ is $\sqrt{n}$-consistent.
Theorems 1–3 establish that conventional estimators of the asymptotic variance of $\hat\beta$ and
t and F statistics formed using $\hat W$ have their usual properties as long as $n \to \infty$ regardless
of the behavior of T. In addition, the results indicate that it is essentially only the size of n
that matters for the asymptotic behavior of the estimators under these sequences. To
complete the theoretical analysis, I present the asymptotic properties of $\hat W$ as $T \to \infty$ with
n fixed below. The results are interesting in providing a justification for a commonly used
procedure and in unifying the results and the different asymptotics considered.
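Theorem 4. Suppose the data are generated by model (1), that Assumptions 1, 2, and 3(b) are
satisfied, and that $T \to \infty$ with n fixed. If $E|x_{ith}|^{r+\delta} < \Delta$, $E|\epsilon_{it}|^{r+\delta} < \Delta$, and $\{x_{it}, \epsilon_{it}\}$ is a strong
mixing sequence in t with $\alpha$ of size $-3r/(r-4)$ for $r > 4$, then

$$\sqrt{nT}\,(\hat\beta - \beta) \xrightarrow{d} Q^{-1}N(0, W); \qquad x_i'x_i/nT - Q_i/n \xrightarrow{p} 0; \qquad x_i'\epsilon_i/\sqrt{nT} \xrightarrow{d} N(0, W_i/n),$$

and

$$\begin{aligned}
\hat W \xrightarrow{d} U = \frac{1}{n}\sum_{i=1}^{n}\Bigg(
 & \Lambda_i B_i B_i' \Lambda_i
 - \Lambda_i B_i \Big(\sum_{j=1}^{n} B_j'\Lambda_j\Big)\Big(\sum_{j=1}^{n} Q_j\Big)^{-1} Q_i
 - Q_i \Big(\sum_{j=1}^{n} Q_j\Big)^{-1}\Big(\sum_{j=1}^{n} \Lambda_j B_j\Big) B_i'\Lambda_i \\
 & + Q_i \Big(\sum_{j=1}^{n} Q_j\Big)^{-1}\Big(\sum_{j=1}^{n} \Lambda_j B_j\Big)\Big(\sum_{j=1}^{n} B_j'\Lambda_j\Big)\Big(\sum_{j=1}^{n} Q_j\Big)^{-1} Q_i
\Bigg),
\end{aligned}$$

where $W_i = \lim_T (1/T)E[x_i'\Omega_i x_i]$, $W = \lim_T (1/nT)\sum_i E[x_i'\Omega_i x_i]$, $B_i \sim N(0, I_k)$ is a k-dimensional
normal vector with $E[B_iB_j'] = 0$ for $i \neq j$, and $\Lambda_i = W_i^{1/2}$.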
Remark 3.7. Theorem 4 verifies that $\hat W$ is not consistent but does have a limiting distribution as
$T \to \infty$ with n fixed. Unfortunately, the result here differs from results obtained in Phillips
et al. (2003), Kiefer and Vogelsang (2002, 2005), and Vogelsang (2003) who consider HAC
estimation in time series data without truncation in that how to construct asymptotically pivotal
statistics from U is not immediately obvious. However, in one important special case, U is
proportional to the true covariance matrix allowing construction of asymptotically pivotal tests.
Corollary 4.1. Suppose the conditions of Theorem 4 are satisfied and that $Q_i = Q$ and
$W_i = W$ for all i. Then
$$\widehat W \xrightarrow{d} U = \Lambda\, \frac{1}{n}\Bigg( \sum_{i=1}^{n} B_i B_i' - \frac{1}{n}\sum_{i=1}^{n} B_i \sum_{i=1}^{n} B_i' \Bigg) \Lambda$$
for $B_i$ defined in Theorem 4 and $\Lambda = W^{1/2}$. Then, for testing the null hypothesis $H_0: R\beta = r$
against the alternative $H_1: R\beta \neq r$ for a $q \times k$ matrix R with rank q, the limiting distributions
of the conventional Wald ($F^*$) and t-type ($t^*$) tests under $H_0$ are
$$F^* = (nT)(R\hat\beta - r)'\,[R\widehat Q^{-1}\widehat W\widehat Q^{-1}R']^{-1}(R\hat\beta - r)
 \xrightarrow{d} \widetilde B_{q,n}' \Bigg[\frac{1}{n}\Bigg(\sum_i B_{q,i}B_{q,i}' - \widetilde B_{q,n}\widetilde B_{q,n}'\Bigg)\Bigg]^{-1} \widetilde B_{q,n} = \frac{nq}{n-q}\, F_{q,n-q}, \qquad (10)$$
and
$$t^* = \frac{\sqrt{nT}\,(R\hat\beta - r)}{\sqrt{R\widehat Q^{-1}\widehat W\widehat Q^{-1}R'}}
 \xrightarrow{d} \frac{\widetilde B_{1,n}}{\sqrt{(1/n)\big(\sum_i B_{1,i}^2 - \widetilde B_{1,n}^2\big)}} = \sqrt{\frac{n}{n-1}}\; t_{n-1}, \qquad (11)$$
where $B_{q,i} \sim N(0, I_q)$, $\widetilde B_{q,n} = (1/\sqrt{n})\sum_{i=1}^{n} B_{q,i}$, $t_{n-1}$ is a t distribution with n − 1 degrees of
freedom, and $F_{q,n-q}$ is an F distribution with q numerator and n − q denominator degrees of
freedom.
Corollary 4.1 gives the limiting distribution of $\widehat W$ as $T \to \infty$ under the additional
restriction that $Q_i = Q$ and $W_i = W$ for all i. These restrictions would be satisfied when
the data vectors for each individual $\{x_i, y_i\}$ are iid across i. While this is more restrictive
than the condition imposed in Assumption 1, it still allows for quite general forms of
conditional heteroskedasticity and does not impose any structure on the time series process
within individuals.
The most interesting feature about the result in Corollary 4.1 is that under the
conditions imposed, the limiting distribution of $\widehat W$ is proportional to the actual covariance
matrix in the data. This allows construction of asymptotically pivotal statistics based on
standard t and Wald tests as in Phillips et al. (2003), Kiefer and Vogelsang (2002, 2005),
and Vogelsang (2003). This is particularly convenient in the panel case since the limiting
distribution of the t-statistic is exactly $\sqrt{n/(n-1)}\; t_{n-1}$ where $t_{n-1}$ denotes the t
distribution with n − 1 degrees of freedom.11 It is also interesting that $EU = (1 - (1/n))W$.
This suggests normalizing the estimator $\widehat W$ by $n/(n-1)$ will result in an asymptotically
unbiased estimator in asymptotics where $T \to \infty$ with n fixed and will likely reduce the
finite-sample bias under asymptotics where $n \to \infty$. In addition, the t-statistic constructed
based on the estimator defined by $(n/(n-1))\widehat W$ will be asymptotically distributed as a $t_{n-1}$
for which critical values are readily available.12
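The expectation $EU = (1 - 1/n)W$ can be checked directly from the expression for U in Corollary 4.1. The following is a short sketch of that calculation. It uses only the facts stated in Theorem 4, reading $E[B_i B_j'] = 0$ as holding for $i \neq j$ with $E[B_i B_i'] = I_k$, and it assumes that $\Lambda$ is the symmetric square root of W, so that $\Lambda\Lambda = W$.

```latex
\begin{aligned}
E\Big[\sum_{i=1}^{n} B_i B_i'\Big] &= n I_k, \\
E\Big[\sum_{i=1}^{n} B_i \sum_{j=1}^{n} B_j'\Big] &= \sum_{i=1}^{n} E[B_i B_i'] = n I_k, \\
EU &= \Lambda\,\frac{1}{n}\Big(n I_k - \frac{1}{n}\, n I_k\Big)\Lambda
    = \Big(1 - \frac{1}{n}\Big)\Lambda\Lambda
    = \Big(1 - \frac{1}{n}\Big) W .
\end{aligned}
```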
The conclusions of Corollary 4.1 suggest a simple procedure for testing hypotheses
regarding regression coefficients which will be valid under any of the asymptotics
considered. Using $(n/(n-1))\widehat W$ and obtaining critical values from a $t_{n-1}$ distribution will
yield tests which are asymptotically valid regardless of the asymptotic sequence since the
11 If n = 1, $\widehat W$ is identically equal to 0. In this case, it is easy to verify that U equals 0, though the results of
Theorem 4 and Corollary 4.1 are obviously uninteresting in this case.
12 This is essentially the normalization used in Stata’s cluster command, which normalizes $\widehat W$ by
$[(nT - 1)/(nT - k)]\,[n/(n - 1)]$, where the normalization is motivated as a finite-sample adjustment under the
usual $n \to \infty$, T fixed asymptotics; see Stata User’s Guide Release 8, p. 275 (Stata Corporation, 2003).
$t_{n-1} \to N(0, 1)$ and $n/(n-1) \to 1$ as $n \to \infty$. Thus, this approach will yield valid tests
under any of the asymptotics considered in the presence of quite general heteroskedasticity
and serial correlation.13
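The procedure described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the code used in the paper: the function name `clustered_se` and its arguments are hypothetical, the regressors are assumed to already include whatever constant or fixed-effect transformation the model calls for, and the only scaling applied is the n/(n − 1) factor discussed in the text (not the additional Stata adjustment of footnote 12).

```python
import numpy as np

def clustered_se(y, X, ids):
    """OLS with clustered standard errors scaled by n/(n-1).

    y: (N,) outcomes; X: (N, k) regressors; ids: (N,) cluster labels.
    Returns (beta_hat, standard_errors, number_of_clusters).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    clusters = np.unique(ids)
    n = clusters.size
    k = X.shape[1]
    meat = np.zeros((k, k))
    for c in clusters:
        mask = ids == c
        score = X[mask].T @ resid[mask]   # x_i' e_i for individual i
        meat += np.outer(score, score)
    # Sandwich form of the variance of beta_hat, with the n/(n-1) scaling
    V = (n / (n - 1)) * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V)), n
```

A t-statistic for a single coefficient is then `(beta[j] - b0) / se[j]`, to be compared with critical values from a t distribution with n − 1 degrees of freedom, for example `scipy.stats.t.ppf(0.975, n - 1)` for a two-sided 5% test.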
In addition, it is important to note that in the cases where there is weak dependence in
the time series and T is large, more efficient estimators of the covariance matrix which
make use of this information are available. In particular, standard time series HAC
estimators which downweight the correlation between observations that are far apart will
have faster rates of convergence than the CCM estimator.
Finally, it is worth noting that the maximum rank of $\widehat W$ will generally be n − 1, which
suggests that $\widehat W$ will be rank deficient when $k > n - 1$. Since $\widehat W$ is supposed to estimate a
full rank matrix, it seems likely that inference based on $\widehat W$ will perform poorly in these
cases. Also, the above development ignores time effects, which will often be included in
panel data models. Under T fixed, $n \to \infty$ asymptotics, the time effects can be included in
the covariate vector $x_{it}$ and pose no additional complications. However, as $T \to \infty$,
they also need to be considered separately from x and partialed out with the individual
fixed effects. This partialing out will generally result in the presence of an $O(1/n)$
correlation between individuals. When n is large, this correlation should not matter,
but in the fixed n, $T \to \infty$ case, it will invalidate the results. The effect of the presence
of time effects was explored in a simulation study with the same design as that
reported in the following section where each model was estimated including a full set of
time fixed effects. The results, which are not reported below but are available upon
request, show that tests based on $\widehat W$ are somewhat more size distorted than when no
time effects are included for small n, but that this size distortion diminishes quickly
as n increases.
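The rank bound noted at the start of the preceding paragraph can be illustrated numerically. The sketch below uses made-up dimensions (n = 4, T = 30, k = 5) and an arbitrary simulated design, none of which come from the paper. It relies on the OLS normal equations, which force the individual score vectors $x_i'\hat\varepsilon_i$ to sum to zero, so at most n − 1 of them can be linearly independent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 4, 30, 5                      # hypothetical sizes with k > n - 1
X = rng.normal(size=(n * T, k))
y = X @ np.ones(k) + rng.normal(size=n * T)
ids = np.repeat(np.arange(n), T)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# One k-vector x_i' e_i per individual, stacked into an (n, k) array
scores = np.stack([X[ids == i].T @ resid[ids == i] for i in range(n)])
W_hat = scores.T @ scores / (n * T)

# The scores sum to zero by the normal equations, which caps the rank of W_hat
scores_sum_to_zero = np.allclose(scores.sum(axis=0), 0.0, atol=1e-8)
rank_W = np.linalg.matrix_rank(W_hat)
print(scores_sum_to_zero, rank_W, "<=", n - 1)
```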
4. Monte Carlo evidence
The asymptotic results presented above suggest that tests based on the robust standard error estimates should have good properties regardless of the relative sizes of n
and T. I report results from a simple simulation study used to assess the finite sample
effectiveness of the robust covariance matrix estimator and tests based upon it below.
Specifically, the simulation focuses on t-tests for regression coefficients and the HA test
discussed above.
The Monte Carlo simulations are based on two different specifications: a ‘‘fixed effect’’
specification and a ‘‘random effects’’ specification. The terminology refers to the fact that
in the ‘‘fixed effect’’ specification, the models will be estimated including individual specific
fixed effects with the goal of focusing on the case where the underlying disturbances exhibit
weak dependence. In the ‘‘random effects’’ specification individual specific effects are not
estimated and the goal is to examine the behavior of the CCM estimator and tests based
upon it in an equicorrelated model.
The fixed effect specification is
$$y_{it} = x_{it}'\beta + \alpha_i + \epsilon_{it},$$
where $x_{it}$ is a scalar and $\alpha_i$ is an individual specific effect. The data generating process for
the fixed effect specification allows for serial correlation in both $x_{it}$ and $\epsilon_{it}$ and
13 This argument also applies to testing multiple parameters using $F^*$.
heteroskedasticity:
$$x_{it} = .5\,x_{it-1} + v_{it}, \qquad v_{it} \sim N(0, .75),$$
$$\epsilon_{it} = \rho\,\epsilon_{it-1} + \sqrt{a_0 + a_1 x_{it}^2}\; u_{it}, \qquad u_{it} \sim N(0, 1 - \rho^2),$$
$$\alpha_i \sim N(0, .5).$$
Data are simulated using four different values of $\rho$, $\rho \in \{0, .3, .6, .9\}$, in both the
homoskedastic ($a_0 = 1$, $a_1 = 0$) and heteroskedastic ($a_0 = a_1 = .5$) cases, resulting in a
total of eight distinct parameter settings. The models are estimated including $x_{it}$ and a full
set of individual specific fixed effects.14
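The fixed effect data generating process above can be simulated along the following lines. This is a sketch under stated assumptions, not the author's simulation code: the second argument of each N(·, ·) is read as a variance, the value of β and the burn-in length used to initialize the two autoregressions are not given in the text and are chosen here purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def simulate_fixed_effects(n, T, rho, a0, a1, beta=1.0, burn=100, seed=0):
    """Draw one (n, T) panel from the fixed effect design.

    beta and burn are illustrative assumptions; the text does not state them.
    """
    rng = np.random.default_rng(seed)
    total = T + burn
    v = rng.normal(0.0, np.sqrt(0.75), size=(n, total))
    u = rng.normal(0.0, np.sqrt(1.0 - rho**2), size=(n, total))
    x = np.zeros((n, total))
    eps = np.zeros((n, total))
    for t in range(1, total):
        x[:, t] = 0.5 * x[:, t - 1] + v[:, t]
        scale = np.sqrt(a0 + a1 * x[:, t] ** 2)   # heteroskedasticity term
        eps[:, t] = rho * eps[:, t - 1] + scale * u[:, t]
    alpha = rng.normal(0.0, np.sqrt(0.5), size=(n, 1))
    x, eps = x[:, burn:], eps[:, burn:]            # drop the burn-in periods
    y = beta * x + alpha + eps
    return y, x
```

For example, `simulate_fixed_effects(10, 50, rho=0.3, a0=0.5, a1=0.5)` draws one heteroskedastic panel with n = 10 and T = 50.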
The random effects specification is
$$y_{it} = x_{it}'\beta + \varepsilon_{it},$$
where $x_{it}$ is a normally distributed scalar with $E[x_{it}^2] = 1$ and $E[x_{it_1} x_{it_2}] = .8$ for all $t_1 \neq t_2$. $\varepsilon_{it}$
contains an individual specific random component and a random error term:
$$\varepsilon_{it} = \alpha_i + u_{it},$$
$$\alpha_i \sim N(0, \rho),$$
$$u_{it} \sim N(0, 1 - \rho).$$
Note that the random effects data generating process implies that $E[\varepsilon_{it_1}\varepsilon_{it_2}] = \rho$ for $t_1 \neq t_2$.
Three values of $\rho$ are employed for the random effects specification: .3, .6, and .9. The
model is estimated by regressing $y_{it}$ on $x_{it}$ and a constant.
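One way to generate data with the moments stated for the random effects design is sketched below. The construction of $x_{it}$ as a common component plus idiosyncratic noise is an assumption made here to match $E[x_{it}^2] = 1$ and $E[x_{it_1}x_{it_2}] = .8$; the text does not say how the covariate was actually drawn. The value of β, the zero mean of x, and the function name are likewise illustrative.

```python
import numpy as np

def simulate_random_effects(n, T, rho, beta=1.0, seed=0):
    """Draw one (n, T) panel from the equicorrelated design.

    x_it = sqrt(.8) c_i + sqrt(.2) w_it is one construction that gives
    E[x_it^2] = 1 and E[x_it1 x_it2] = .8; it is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    common = rng.normal(size=(n, 1))
    x = np.sqrt(0.8) * common + np.sqrt(0.2) * rng.normal(size=(n, T))
    alpha = rng.normal(0.0, np.sqrt(rho), size=(n, 1))      # variance rho
    u = rng.normal(0.0, np.sqrt(1.0 - rho), size=(n, T))    # variance 1 - rho
    eps = alpha + u
    y = beta * x + eps
    return y, x, eps
```

With a large number of individuals, the sample average of $\varepsilon_{i1}\varepsilon_{i2}$ should be close to ρ and that of $x_{i1}x_{i2}$ close to .8, which gives a quick check on the construction.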
The fixed effects model is commonly used in empirical work when panel data are
available. The random effects specification is also widely used in the policy evaluation
literature. In many policy evaluation studies, the covariate of interest is a policy variable
that is highly correlated within aggregate cells, often with a correlation of one, which has
led to the dominance of the random effects estimator in this context. For example, a
researcher may desire to estimate the effect of classroom level policies on student-level
micro data containing observations from multiple classrooms. In this setting, T indexes the
number of students within each class, n indexes the number of classrooms, and ai is a
classroom specific random effect. The CCM estimator has been widely utilized in such
situations in order to consistently estimate standard errors.15
Simulation results for various values of the cross-sectional (n) and time (T) dimensions
are reported. For each {n, T} combination, reported results for each of the 11 parameter
settings (eight for the fixed effects specification and three for the random effects
specification) are based on 1,000 simulation repetitions. Each simulation estimates three
types of standard errors for $\hat\beta$: unadjusted OLS standard errors, $\hat s_{OLS}$, CCM standard
errors, $\hat s_{CLUS}$, and standard errors consistent with an AR(1) process, $\hat s_{AR(1)}$.16 For the
14 Since $\alpha_i$ is uncorrelated with $x_i$, this model could be estimated using random effects. I chose to consider a
different specification for the random effects estimates where the $x_{it}$ were generated to more closely resemble
covariates which appear in policy analysis studies.
15 This is, in fact, one of the original motivations for the development of the CCM estimator, cf. Liang and
Zeger (1986).
16 $\hat s_{AR(1)}$ imposes the parametric structure implied by an AR(1) process. The $\rho$ parameter is estimated from the
OLS residuals using the procedure described in Hansen (2006) which consistently estimates AR parameters in
fixed effects panel models. The standard errors are then computed as $(X'X)^{-1}X'\Omega(\hat\rho)X(X'X)^{-1}$ where $\Omega(\hat\rho)$ is the
covariance matrix implied by an AR(1) process.
random effects specification, standard errors consistent with random effects, $\hat s_{RE}$, are
substituted for $\hat s_{AR(1)}$.17 $\hat s_{CLUS}$ is consistent for all parameter settings. $\hat s_{OLS}$ is consistent only
in the iid case (the homoskedastic data generating process with $\rho = 0$). $\hat s_{AR(1)}$ is consistent
in all homoskedastic data generating processes, and $\hat s_{RE}$ is consistent in all models for
which it is reported. In all cases, the CCM estimator is computed using the normalization
implied by $T \to \infty$ with n fixed asymptotics; that is, the CCM estimator is computed as
$(n/(n-1))\widehat W$ for $\widehat W$ defined in Eq. (3).
Tables 1–4 present the results of the Monte Carlo study, where each table corresponds to
a different {n, T} combination.18 In each table, Panel A presents the fixed effects results for
the homoskedastic and heteroskedastic cases, while Panel B presents the random effects
results. Column (1) presents t-test rejection rates for 5% level tests based on OLS, CCM,
and AR(1) standard errors. The critical values for tests based on OLS and AR(1) errors
are taken from a $t_{nT-n-1}$ distribution, and the critical values for tests based on clustered
standard errors are taken from a $t_{n-1}$ distribution. Columns (2) and (3) present the mean
and standard deviation of the estimated standard errors respectively. Column (4) presents
the standard deviation of the $\hat\beta$’s. The difference between columns (2) and (4) is therefore
the bias of the estimated standard errors. Finally, column (5) presents the rejection rates
for the HA test described above which tests the null hypothesis that both the CCM
estimator and the parametric estimator are consistent.
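The quantities in columns (1) to (4) are simple summaries over simulation repetitions. A minimal sketch of how they could be computed from stored draws follows. The function name and argument layout are hypothetical, and the use of the sample standard deviation with `ddof=1` is an assumption, since the text does not state which divisor was used.

```python
import numpy as np

def summarize(beta_hats, std_errs, beta_true, crit):
    """Summaries over repetitions for one estimator and one design.

    beta_hats, std_errs: one entry per repetition; crit: two-sided critical value.
    """
    beta_hats = np.asarray(beta_hats, dtype=float)
    std_errs = np.asarray(std_errs, dtype=float)
    t_stats = (beta_hats - beta_true) / std_errs
    return {
        "rejection_rate": float(np.mean(np.abs(t_stats) > crit)),  # column (1)
        "mean_se": float(std_errs.mean()),                          # column (2)
        "std_se": float(std_errs.std(ddof=1)),                      # column (3)
        "std_beta": float(beta_hats.std(ddof=1)),                   # column (4)
    }
```

The bias of the standard error estimate described in the text is then `mean_se - std_beta`.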
As expected, tests based on $\hat s_{OLS}$ and $\hat s_{AR(1)}$ perform well in the cases where the assumed
model is consistent with the data across the full range of n and T combinations. The results
are also consistent with the asymptotic theory, clearly illustrating the $\sqrt{nT}$-consistency of $\hat\beta$
and $\widehat W$ with the bias of $\widehat W$ and the variance of both $\hat\beta$ and $\widehat W$ decreasing as either n or T
increases. Of course, when the assumed parametric model is inconsistent with the data,
tests based on parametric standard errors suffer from size distortions and the standard
error estimates are biased. The RE tests have the correct size for moderate and large n, but
not for small n (i.e. n ¼ 10); and as indicated by the asymptotic theory, the T dimension
has no apparent impact on the size of RE based tests or the overall performance of the RE
estimates.
Tests based on the CCM estimator have approximately correct size across all
combinations of n and T and all models of the disturbances considered in the fixed effect
specification. The estimator does, however, display a moderate bias in the small n case; it
seems likely that this bias does not translate into a large size distortion due to the fact that
the bias is small relative to the standard error of the estimator and the use of the $t_{n-1}$
distribution to obtain the critical values. While the clustered standard errors perform well
in terms of size of tests and reasonably well in terms of bias, the simulations reveal that a
potential weakness of the clustered estimator is a relatively high variance. The CCM
estimates have a substantially higher standard deviation than the other estimators and
this difference, in percentage terms, increases with T. This behavior is consistent with the
17 $\hat s_{RE}$ is estimated in a manner analogous to $\hat s_{AR(1)}$ where the covariance parameters are estimated in the usual
manner from the OLS and within residuals.
18 Tables 1–4 correspond to {n, T} = {10, 10}, {n, T} = {10, 50}, {n, T} = {50, 10}, {n, T} = {50, 50}, respectively.
Additional results for {n, T} = {10, 200}, {n, T} = {50, 20}, {n, T} = {50, 200}, {n, T} = {200, 10}, and {n, T} =
{200, 50} are available from the author upon request. The results are consistent with the asymptotic theory with
the performance of the CCM estimator improving as either n or T increases in the fixed effects specification and as
n increases in the random effects specification. In the random effects case, the performance does not appear to be
greatly influenced by the size of T relative to n.
Table 1
Data generating process: N = 10, T = 10
Columns: (1) t-test rejection rate; (2) Mean (s.e.); (3) Std (s.e.); (4) Std ($\hat\beta$); (5) HA test rejection rate.

                 (1)      (2)      (3)      (4)      (5)
A. Fixed effects
Homoskedastic, ρ = 0
  OLS            0.038    0.1180   0.0133   0.1152   0.152
  Cluster        0.043    0.1149   0.0330   0.1152   -
  AR1            0.041    0.1170   0.0141   0.1152   0.135
Homoskedastic, ρ = .3
  OLS            0.082    0.1130   0.0136   0.1269   0.095
  Cluster        0.054    0.1212   0.0357   0.1269   -
  AR1            0.055    0.1240   0.0161   0.1269   0.133
Homoskedastic, ρ = .6
  OLS            0.093    0.1005   0.0133   0.1231   0.074
  Cluster        0.060    0.1167   0.0352   0.1231   -
  AR1            0.051    0.1219   0.0181   0.1231   0.123
Homoskedastic, ρ = .9
  OLS            0.145    0.0609   0.0090   0.0818   0.038
  Cluster        0.053    0.0772   0.0249   0.0818   -
  AR1            0.054    0.0795   0.0136   0.0818   0.085
Heteroskedastic, ρ = 0
  OLS            0.126    0.1150   0.0126   0.1502   0.051
  Cluster        0.057    0.1410   0.0458   0.1502   -
  AR1            0.126    0.1140   0.0137   0.1502   0.042
Heteroskedastic, ρ = .3
  OLS            0.171    0.1165   0.0137   0.1708   0.036
  Cluster        0.068    0.1538   0.0500   0.1708   -
  AR1            0.143    0.1284   0.0172   0.1708   0.044
Heteroskedastic, ρ = .6
  OLS            0.187    0.1238   0.0153   0.1853   0.027
  Cluster        0.074    0.1717   0.0572   0.1853   -
  AR1            0.117    0.1503   0.0219   0.1853   0.049
Heteroskedastic, ρ = .9
  OLS            0.198    0.1406   0.0209   0.2181   0.031
  Cluster        0.087    0.1872   0.0641   0.2181   -
  AR1            0.097    0.1830   0.0336   0.2181   0.074
B. Random effects
ρ = .3
  OLS            0.295    0.1063   0.0231   0.1926   0.017
  Cluster        0.115    0.1561   0.0609   0.1926   -
  RE             0.097    0.1693   0.0460   0.1926   0.027
ρ = .6
  OLS            0.399    0.1030   0.0248   0.2438   0.054
  Cluster        0.118    0.2024   0.0788   0.2438   -
  RE             0.094    0.2180   0.0600   0.2438   0.023
ρ = .9
  OLS            0.482    0.0987   0.0293   0.2925   0.093
  Cluster        0.108    0.2346   0.0909   0.2925   -
  RE             0.095    0.2546   0.0723   0.2925   0.018
Table 2
Data generating process: N = 10, T = 50
Columns: (1) t-test rejection rate; (2) Mean (s.e.); (3) Std (s.e.); (4) Std ($\hat\beta$); (5) HA test rejection rate.

                 (1)      (2)      (3)      (4)      (5)
A. Fixed effects
Homoskedastic, ρ = 0
  OLS            0.054    0.0462   0.0024   0.0472   0.184
  Cluster        0.050    0.0449   0.0117   0.0472   -
  AR1            0.057    0.0460   0.0026   0.0472   0.185
Homoskedastic, ρ = .3
  OLS            0.088    0.0459   0.0024   0.0519   0.077
  Cluster        0.043    0.0520   0.0133   0.0519   -
  AR1            0.050    0.0529   0.0031   0.0519   0.159
Homoskedastic, ρ = .6
  OLS            0.155    0.0447   0.0028   0.0590   0.049
  Cluster        0.042    0.0574   0.0150   0.0590   -
  AR1            0.047    0.0598   0.0044   0.0590   0.184
Homoskedastic, ρ = .9
  OLS            0.225    0.0372   0.0034   0.0600   0.046
  Cluster        0.046    0.0562   0.0159   0.0600   -
  AR1            0.049    0.0583   0.0072   0.0600   0.150
Heteroskedastic, ρ = 0
  OLS            0.158    0.0459   0.0021   0.0637   0.052
  Cluster        0.051    0.0606   0.0169   0.0637   -
  AR1            0.162    0.0458   0.0023   0.0637   0.057
Heteroskedastic, ρ = .3
  OLS            0.199    0.0479   0.0022   0.0724   0.046
  Cluster        0.041    0.0735   0.0198   0.0724   -
  AR1            0.142    0.0553   0.0032   0.0724   0.047
Heteroskedastic, ρ = .6
  OLS            0.229    0.0558   0.0031   0.0934   0.067
  Cluster        0.043    0.0928   0.0260   0.0934   -
  AR1            0.112    0.0748   0.0054   0.0934   0.059
Heteroskedastic, ρ = .9
  OLS            0.239    0.0857   0.0079   0.1490   0.059
  Cluster        0.046    0.1428   0.0451   0.1490   -
  AR1            0.076    0.1338   0.0163   0.1490   0.099
B. Random effects
ρ = .3
  OLS            0.568    0.0471   0.0092   0.1636   0.147
  Cluster        0.104    0.1356   0.0547   0.1636   -
  RE             0.097    0.1475   0.0413   0.1626   0.014
ρ = .6
  OLS            0.703    0.0466   0.0105   0.2331   0.212
  Cluster        0.104    0.1897   0.0727   0.2331   -
  RE             0.095    0.2079   0.0567   0.2331   0.007
ρ = .9
  OLS            0.744    0.0450   0.0130   0.2785   0.245
  Cluster        0.106    0.2310   0.0920   0.2785   -
  RE             0.103    0.2539   0.0701   0.2785   0.014
Table 3
Data generating process: N = 50, T = 10
Columns: (1) t-test rejection rate; (2) Mean (s.e.); (3) Std (s.e.); (4) Std ($\hat\beta$); (5) HA test rejection rate.

                 (1)      (2)      (3)      (4)      (5)
A. Fixed effects
Homoskedastic, ρ = 0
  OLS            0.049    0.0522   0.0026   0.0526   0.106
  Cluster        0.057    0.0515   0.0062   0.0526   -
  AR1            0.047    0.0522   0.0028   0.0526   0.099
Homoskedastic, ρ = .3
  OLS            0.080    0.0500   0.0027   0.0569   0.053
  Cluster        0.059    0.0552   0.0072   0.0569   -
  AR1            0.055    0.0556   0.0033   0.0569   0.092
Homoskedastic, ρ = .6
  OLS            0.102    0.0447   0.0026   0.0539   0.132
  Cluster        0.048    0.0549   0.0071   0.0539   -
  AR1            0.049    0.0553   0.0037   0.0539   0.072
Homoskedastic, ρ = .9
  OLS            0.156    0.0273   0.0273   0.0387   0.220
  Cluster        0.075    0.0364   0.0367   0.0387   -
  AR1            0.067    0.0367   0.0367   0.0387   0.078
Heteroskedastic, ρ = 0
  OLS            0.119    0.0517   0.0025   0.0659   0.213
  Cluster        0.047    0.0673   0.0093   0.0659   -
  AR1            0.116    0.0516   0.0028   0.0659   0.210
Heteroskedastic, ρ = .3
  OLS            0.197    0.0521   0.0026   0.0768   0.369
  Cluster        0.062    0.0741   0.0114   0.0768   -
  AR1            0.139    0.0581   0.0033   0.0768   0.140
Heteroskedastic, ρ = .6
  OLS            0.214    0.0558   0.0031   0.0840   0.451
  Cluster        0.048    0.0820   0.0126   0.0840   -
  AR1            0.108    0.0688   0.0045   0.0840   0.056
Heteroskedastic, ρ = .9
  OLS            0.152    0.0623   0.0043   0.0883   0.324
  Cluster        0.038    0.0899   0.0144   0.0883   -
  AR1            0.057    0.0834   0.0070   0.0883   0.023
B. Random effects
ρ = .3
  OLS            0.291    0.0451   0.0041   0.0822   0.673
  Cluster        0.062    0.0776   0.0135   0.0822   -
  RE             0.059    0.0788   0.0091   0.0822   0.058
ρ = .6
  OLS            0.357    0.0452   0.0049   0.1034   0.892
  Cluster        0.073    0.1004   0.0183   0.1034   -
  RE             0.068    0.1028   0.0127   0.1034   0.054
ρ = .9
  OLS            0.497    0.0447   0.0056   0.1246   0.943
  Cluster        0.062    0.1192   0.0212   0.1246   -
  RE             0.063    0.1210   0.0147   0.1246   0.048
Table 4
Data generating process: N = 50, T = 20
Columns: (1) t-test rejection rate; (2) Mean (s.e.); (3) Std (s.e.); (4) Std ($\hat\beta$); (5) HA test rejection rate.

                 (1)      (2)      (3)      (4)      (5)
A. Fixed effects
Homoskedastic, ρ = 0
  OLS            0.050    0.0342   0.0013   0.0341   0.097
  Cluster        0.049    0.0341   0.0040   0.0341   -
  AR1            0.052    0.0342   0.0014   0.0341   0.088
Homoskedastic, ρ = .3
  OLS            0.094    0.0334   0.0013   0.0393   0.077
  Cluster        0.051    0.0379   0.0045   0.0393   -
  AR1            0.056    0.0382   0.0016   0.0393   0.086
Homoskedastic, ρ = .6
  OLS            0.120    0.0315   0.0014   0.0414   0.300
  Cluster        0.059    0.0407   0.0052   0.0414   -
  AR1            0.050    0.0412   0.0021   0.0414   0.092
Homoskedastic, ρ = .9
  OLS            0.200    0.0222   0.0013   0.0336   0.580
  Cluster        0.059    0.0327   0.0047   0.0336   -
  AR1            0.060    0.0329   0.0024   0.0336   0.094
Heteroskedastic, ρ = 0
  OLS            0.168    0.0340   0.0011   0.0479   0.408
  Cluster        0.063    0.0458   0.0056   0.0479   -
  AR1            0.171    0.0340   0.0012   0.0479   0.406
Heteroskedastic, ρ = .3
  OLS            0.209    0.0350   0.0012   0.0536   0.675
  Cluster        0.051    0.0527   0.0068   0.0536   -
  AR1            0.145    0.0399   0.0016   0.0536   0.294
Heteroskedastic, ρ = .6
  OLS            0.228    0.0394   0.0017   0.0653   0.802
  Cluster        0.050    0.0636   0.0084   0.0653   -
  AR1            0.119    0.0514   0.0027   0.0653   0.123
Heteroskedastic, ρ = .9
  OLS            0.196    0.0507   0.0028   0.0775   0.681
  Cluster        0.036    0.0809   0.0131   0.0775   -
  AR1            0.058    0.0751   0.0056   0.0775   0.034
B. Random effects
ρ = .3
  OLS            0.405    0.0320   0.0029   0.0756   0.915
  Cluster        0.069    0.0726   0.0131   0.0756   -
  RE             0.063    0.0738   0.0085   0.0756   0.064
ρ = .6
  OLS            0.515    0.0318   0.0033   0.1012   0.944
  Cluster        0.066    0.0976   0.0169   0.1012   -
  RE             0.055    0.0996   0.0118   0.1012   0.055
ρ = .9
  OLS            0.614    0.0314   0.0038   0.1203   0.948
  Cluster        0.054    0.1166   0.0204   0.1203   -
  RE             0.051    0.1194   0.0140   0.1203   0.053
$\sqrt{n}$-consistency of the estimator and does suggest that if a parametric estimator is
available, it may have better properties for estimating the variance of $\hat\beta$.
The clustered estimator performs less well in the random effects specification. For small
n, tests based on the CCM estimator suffer from a substantial size distortion for all values
of T. For moderate to large values of n, the tests have the correct size, and the overall
performance does not appear to depend on T. In addition, the variance of $\hat\beta$ does not
appear to decrease as T increases. These results are consistent with the lack of $\sqrt{nT}$-consistency
in this case.19
The performance of the HA test is much less robust than that of t-tests based on
clustered standard errors. For small n, the tests are badly size distorted and have essentially
no power against any alternative hypotheses. As n and T grow, the test performance
improves. With n ¼ 50, the test remains size distorted, but it does have some power against
alternatives that increases as T increases. The HA test also performs poorly for the random
effects specification for small n. However, for moderate or large n, the test has both the
correct size and good power.
Overall, the simulation results support the use of clustered standard errors for
performing inference on regression coefficient estimates in serially correlated panel data,
though they also suggest care should be taken if n is small and one suspects a ‘‘random
effects’’ structure. The poor performance of $\widehat W$ in ‘‘random effects’’ models with small n is
already well-known; see for example Bell and McCaffrey (2002) who also suggest a bias
reduction for $\widehat W$ in this case. However, that the estimator does quite well even for small n
in the serially correlated case where the errors are mixing is somewhat surprising and is a
new result which is suggested by the asymptotic analysis of the previous section. The
simulation results confirm the asymptotic results, suggesting that the clustered standard
errors are consistent as long as $n \to \infty$ and that they are not sensitive to the size of n
relative to T. The chief drawback of the CCM estimator is that the robustness comes at the
cost of increasing the variance of the standard error estimate relative to that of standard
errors estimated through more parsimonious models.
The HA test offers one simple information based criterion for choosing between the
CCM estimator and a simple parametric model of the error process. However, the
simulation evidence regarding its usefulness is mixed. In particular, the properties of the
test are poor in small sample settings where there is likely to be the largest gain to using a
parsimonious model. However, in moderate sized samples, the test performs reasonably
well, and there still may be gains to using a simple parametric model in these cases.
5. Conclusion
This paper explores the asymptotic behavior of the robust covariance matrix estimator
of Arellano (1987). It extends the usual analysis performed under asymptotics where $n \to \infty$
with T fixed to cases where n and T go to infinity jointly, considering both non-mixing
and mixing cases, and to the case where $T \to \infty$ with n fixed. The limiting behavior of the
OLS estimator, $\hat\beta$, in each case is different. However, the analysis shows that the
conventional estimator of the asymptotic variance and the usual t and F statistics have the
same properties regardless of the behavior of the time series as long as $n \to \infty$. In addition,
19 The inconsistency of $\hat\beta$ when T increases with n fixed in differences-in-differences and policy evaluation studies
has also been discussed in Donald and Lang (2001).
when $T \to \infty$ with n fixed and the data satisfy mixing conditions and an iid assumption
across individuals, the usual t and F statistics can be used for inference despite the fact that
the robust covariance matrix estimator is not consistent but converges in distribution to a
limiting random variable. In this case, it is shown that the t statistic constructed using
$n/(n-1)$ times the estimator of Arellano (1987) is asymptotically $t_{n-1}$, suggesting the use
of $n/(n-1)$ times the estimator of Arellano (1987) and critical values obtained from a $t_{n-1}$
in all cases. The use of this procedure is also supported in a short simulation experiment,
which verifies that it produces tests with approximately correct size regardless of the
relative size of n and T in cases where the time series correlation between observations
diminishes as the distance between observations increases. The simulations also verify that
tests based on the robust standard errors are consistent as n increases regardless of the
relative size of n and T even in cases when the data are equicorrelated.
Acknowledgments
The research reported in this paper was motivated through conversations with Byron
Lutz, to whom I am very grateful for input in developing this paper. I would like to thank
Whitney Newey and Victor Chernozhukov as well as anonymous referees and a coeditor
for helpful comments and suggestions. This work was partially supported by the William
S. Fishman Faculty Research Fund at the Graduate School of Business, the University of
Chicago. All remaining errors are mine.
Appendix
For brevity, sketches of the proofs are provided below. More detailed versions are
available in an additional Technical Appendix from the author upon request and in
Hansen (2004).
Proof of Theorem 1. $\hat\beta - \beta \xrightarrow{p} 0$ and $\sqrt{nT}(\hat\beta - \beta) \xrightarrow{d} Q^{-1}N\big(0,\, W = \lim_n (1/nT)\sum_{i=1}^{n} E[x_i'\Omega_i x_i]\big)$
follow immediately under the conditions of Theorem 1 from the Markov
LLN and the Liapounov CLT. The remaining conclusions follow from repeated use of the
Cauchy–Schwarz inequality, Minkowski’s inequality, the Markov LLN, and the
Liapounov CLT. □
The proofs of Theorems 2 and 3 make use of the following lemmas which provide a LLN
and CLT for inid data as $\{n, T\} \to \infty$ jointly.
Lemma 1. Suppose $\{Z_{i,T}\}$ are independent across i for all T with $E[Z_{i,T}] = \mu_{i,T}$ and
$E|Z_{i,T}|^{1+\delta} < \Delta < \infty$ for some $\delta > 0$ and all i, T. Then $(1/n)\sum_{i=1}^{n}(Z_{i,T} - \mu_{i,T}) \xrightarrow{p} 0$ as
$\{n, T\} \to \infty$ jointly.
Proof. The proof follows from standard arguments, cf. Chung (2001) Chapter 5. Details
are given in Hansen (2004). □
Lemma 2. For $k \times 1$ vectors $Z_{i,T}$, suppose $\{Z_{i,T}\}$ are independent across i for all T with
$E[Z_{i,T}] = 0$, $E[Z_{i,T}Z_{i,T}'] = \Omega_{i,T}$, and $E\|Z_{i,T}\|^{2+\delta} < \Delta < \infty$ for some $\delta > 0$. Assume
$\Omega = \lim_{n,T}(1/n)\sum_{i=1}^{n}\Omega_{i,T}$ is positive definite with minimum eigenvalue $\lambda_{\min} > 0$. Then
$(1/\sqrt{n})\sum_{i=1}^{n} Z_{i,T} \xrightarrow{d} N(0, \Omega)$ as $\{n, T\} \to \infty$ jointly.
Proof. The result follows from verifying the Lindeberg condition of Theorem 2 in Phillips
and Moon (1999) using an argument similar to that used in the proof of Theorem 3 in
Phillips and Moon (1999). Details are given in Hansen (2004). □
Proof of Theorem 2. The conclusions follow from conventional arguments making
repeated use of the Cauchy–Schwarz inequality, Minkowski’s inequality, and Lemmas 1
and 2. □
In addition to using Lemmas 1 and 2, I make use of the following mixing inequality,
restated from Doukhan (1994) Theorem 2 with a slight change of notation, to establish the
properties of the estimators as $\{n, T\} \to \infty$ when mixing conditions are imposed. Its proof
may be found in Doukhan (1994, p. 25–30).
Lemma 3. Let $\{z_t\}$ be a strong mixing sequence with $E[z_t] = 0$, $E\|z_t\|^{\tau+\epsilon} < \Delta < \infty$, and mixing
coefficient $\alpha(m)$ of size $(1 - c)r/(r - c)$ where $c \in 2\mathbb{N}$, $c \geq \tau$, and $r > c$. Then there is a constant
C depending only on $\tau$ and $\alpha(m)$ such that $E\big|\sum_{t=1}^{T} z_t\big|^{\tau} \leq C\,D(\tau, \epsilon, T)$ with $D(\tau, \epsilon, T)$
defined in Doukhan (1994) and satisfying $D(\tau, \epsilon, T) = O(T)$ if $\tau \leq 2$ and $D(\tau, \epsilon, T) =
O(T^{\tau/2})$ if $\tau > 2$.
Proof of Theorem 3. The conclusions follow under the conditions of the theorem by
making use of the Cauchy–Schwarz inequality, Minkowski’s inequality, and Lemma 3 to
verify the conditions of Lemmas 1 and 2. □
Proof of Theorem 4. Under the hypotheses of the theorem, $\sqrt{nT}(\hat\beta - \beta) \xrightarrow{d} Q^{-1}N(0, W)$,
$x_i'x_i/T - Q_i \xrightarrow{p} 0$, and $x_i'\varepsilon_i/\sqrt{T} \xrightarrow{d} N(0, W_i)$ are immediate from a LLN and CLT for
mixing sequences, cf. White (2001, Theorems 3.47 and 5.20). The conclusion then follows
from the definition of $\widehat W$ and $\hat\varepsilon_i$. □
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
pffiffiffiffiffiffiffi
bÀ1 b bÀ1
Proof of Corollary 4.1. Consider tà ¼ nT ðRb À rÞ= RQ W Q R0 . Under the null
b
pffiffiffiffiffiffiffi
P
b
nT Rðb À bÞ ¼ Rðð1=nTÞ i x0i xi ÞÀ1
hypothesis, Rb ¼ r, so the numerator of tà is
pffiffiffiffiffiffiffi P
P
pffiffiffi
d
ðð1= nT Þ i x0i i Þ ! RQÀ1 L i Bi = n. From Theorem 4 and the hypotheses of the
Corollary, the denominator of tà converges in distribution to
vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
!
u
n
n
n
X
u
1X X 0
À1 1
0
tRQ
L
Bi Bi À
Bi
Bi LQÀ1 R0 .
n
n i¼1
i¼1
i¼1
It follows from the Continuous Mapping Theorem that
P
pffiffiffi
RQÀ1 L i Bi = n
à d
t ! qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi .
P
P
P
ð1=nÞRQÀ1 Lð n Bi B0i À ð1=nÞ n Bi n B0i ÞLQÀ1 R0
i¼1
i¼1
i¼1
Define d ¼ ðRQÀ1 LLQÀ1 R0 Þ1=2 , so
P
pffiffiffi
d i B1;i = n
d
tà ! U ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
P
P
P
ðd2 =nÞð n B1;i B01;i À ð1=nÞ n B1;i n B01;i Þ
i¼1
i¼1
i¼1
e
B1;n
¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi .
P
e2
ð1=nÞð i B2 À B1;n Þ
1;i
It is straightforward to show that $\tilde{B}_{1,n} \sim N(0,1)$, that $\sum_i B_{1,i}^2 - \tilde{B}_{1,n}^2 \sim \chi^2_{n-1}$, and that $\sum_i B_{1,i}^2 - \tilde{B}_{1,n}^2$ and $\tilde{B}_{1,n}$ are independent, from which it follows that
\[
U = \Big(\frac{n}{n-1}\Big)^{1/2}\frac{\tilde{B}_{1,n}}{\sqrt{\Big(\sum_i B_{1,i}^2 - \tilde{B}_{1,n}^2\Big)\big/(n-1)}} \sim \Big(\frac{n}{n-1}\Big)^{1/2} t_{n-1}.
\]
The result for $F^*$ is obtained through a similar argument, using a result from Rao (2002), Chapter 8b, to verify that the resulting quantity follows an F distribution. □
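The limiting distribution in the proof of Corollary 4.1 can be checked numerically. The sketch below is illustrative only and is not part of the paper: it assumes the $B_{1,i}$ are i.i.d. standard normal scalars and that $\tilde{B}_{1,n} = \sum_i B_{1,i}/\sqrt{n}$, as the final equality for $U$ implies. The function names `simulate_U` and `scaled_t_draws`, the choice $n = 10$, and the number of replications are hypothetical. It draws $U$ directly from its definition and compares its empirical quantiles with those of $(n/(n-1))^{1/2}\,t_{n-1}$.

```python
import numpy as np


def simulate_U(n, reps, seed=0):
    """Draw `reps` realizations of U from the proof of Corollary 4.1.

    Assumption (not stated in this section): B_{1,i} are i.i.d. N(0, 1)
    and B_tilde = sum_i B_{1,i} / sqrt(n).
    """
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((reps, n))           # each row holds B_{1,1..n}
    B_tilde = B.sum(axis=1) / np.sqrt(n)         # claimed to be N(0, 1)
    S = (B ** 2).sum(axis=1) - B_tilde ** 2      # claimed to be chi^2_{n-1}
    return B_tilde / np.sqrt(S / n)              # U as defined in the proof


def scaled_t_draws(n, reps, seed=1):
    """Draw from the claimed limit, sqrt(n / (n - 1)) * t_{n-1}."""
    rng = np.random.default_rng(seed)
    return np.sqrt(n / (n - 1)) * rng.standard_t(n - 1, size=reps)


n, reps = 10, 200_000                            # hypothetical settings
probs = [0.05, 0.25, 0.50, 0.75, 0.95]

U = simulate_U(n, reps)
ref = scaled_t_draws(n, reps)

print("quantiles of U:              ", np.round(np.quantile(U, probs), 3))
print("quantiles of scaled t_{n-1}: ", np.round(np.quantile(ref, probs), 3))
```

Under these assumptions, $U$ equals $(n/(n-1))^{1/2}$ times the usual one-sample t-statistic computed from $B_{1,1},\dots,B_{1,n}$, so the two sets of quantiles should agree up to simulation noise. For small $n$ the scaling factor is far from one, which is why the correction matters when the number of cross-sectional units is small.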
References
Andrews, D.W.K., 1991. Heteroskedasticity and autocorrelation consistent covariance matrix estimation.
Econometrica 59 (3), 817–858.
Arellano, M., 1987. Computing robust standard errors for within-groups estimators. Oxford Bulletin of
Economics and Statistics 49 (4), 431–434.
Baltagi, B.H., Wu, P.X., 1999. Unequally spaced panel data regressions with AR(1) disturbances. Econometric
Theory 15, 814–823.
Bell, R.M., McCaffrey, D.F., 2002. Bias reduction in standard errors for linear regression with multi-stage
samples. Mimeo RAND.
Bertrand, M., Duflo, E., Mullainathan, S., 2004. How much should we trust differences-in-differences estimates?
Quarterly Journal of Economics 119 (1), 249–275.
Bhargava, A., Franzini, L., Narendranathan, W., 1982. Serial correlation and the fixed effects model. Review of
Economic Studies 49, 533–549.
Chung, K.L., 2001. A Course in Probability Theory, third ed. Academic Press, San Diego.
Donald, S., Lang, K., 2001. Inference with differences in differences and other panel data. Mimeo.
Doukhan, P., 1994. Mixing: properties and examples. In: Fienberg, S., Gani, J., Krickeberg, K., Olkin, I.,
Wermuth, N. (Eds.), Lecture Notes in Statistics, vol. 85. Springer, New York.
Drukker, D.M., 2003. Testing for serial correlation in linear panel-data models. Stata Journal 3, 168–177.
Hahn, J., Kuersteiner, G.M., 2002. Asymptotically unbiased inference for a dynamic panel model with fixed
effects when both N and T are large. Econometrica 70 (4), 1639–1657.
Hahn, J., Newey, W.K., 2004. Jackknife and analytical bias reduction for nonlinear panel models. Econometrica
72 (4), 1295–1319.
Hansen, C.B., 2004. Inference in linear panel data models with serial correlation and an essay on the impact of
401(k) participation on the wealth distribution. Ph.D. Dissertation, Massachusetts Institute of Technology.
Hansen, C.B., 2006. Generalized least squares inference in multilevel models with serial correlation and fixed
effects. Journal of Econometrics, doi:10.1016/j.jeconom.2006.07.011.
Kezdi, G., 2002. Robust standard error estimation in fixed-effects panel models. Mimeo.
Kiefer, N.M., Vogelsang, T.J., 2002. Heteroskedasticity–autocorrelation robust testing using bandwidth equal to
sample size. Econometric Theory 18, 1350–1366.
Kiefer, N.M., Vogelsang, T.J., 2005. A new asymptotic theory for heteroskedasticity–autocorrelation robust tests.
Econometric Theory 21, 1130–1164.
Lancaster, T., 2002. Orthogonal parameters and panel data. Review of Economic Studies 69, 647–666.
Liang, K.-Y., Zeger, S., 1986. Longitudinal data analysis using generalized linear models. Biometrika 73 (1),
13–22.
MaCurdy, T.E., 1982. The use of time series processes to model the error structure of earnings in a longitudinal
data analysis. Journal of Econometrics 18 (1), 83–114.
Nickell, S., 1981. Biases in dynamic models with fixed effects. Econometrica 49 (6), 1417–1426.
Phillips, P.C.B., Moon, H.R., 1999. Linear regression limit theory for nonstationary panel data. Econometrica 67
(5), 1057–1111.
Phillips, P.C.B., Sun, Y., Jin, S., 2003. Consistent HAC estimation and robust regression testing using sharp
origin kernels with no truncation. Cowles Foundation Discussion Paper 1407.
Rao, C.R., 2002. Linear Statistical Inference and Its Applications. Wiley-Interscience.
Solon, G., 1984. Estimating autocorrelations in fixed effects models. NBER Technical Working Paper no. 32.
Solon, G., Inoue, A., 2004. A portmanteau test for serially correlated errors in fixed effects models. Mimeo.
Stata Corporation, 2003. Stata User’s Guide Release 8. Stata Press, College Station, Texas.
Vogelsang, T.J., 2003. Testing in GMM models without truncation. In: Fomby, T.B., Hill, R.C. (Eds.), Advances
in Econometrics, volume 17, Maximum Likelihood Estimation of Misspecified Models: Twenty Years Later.
Elsevier, Amsterdam, pp. 192–233.
White, H., 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for
heteroskedasticity. Econometrica 48 (4), 817–838.
White, H., 2001. Asymptotic Theory for Econometricians, revised edition. Academic Press, San Diego.
Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data. The MIT Press, Cambridge,
MA.