serial correlation coefficient - manfred mudelsee
TRANSCRIPT
Investigating a new estimator of theserial correlation coefficient
Manfred Mudelsee
Institute of Mathematics and StatisticsUniversity of Kent at Canterbury
CanterburyKent CT2 7NFUnited Kingdom
8 May 1998
Abstract
A new estimator of the lag-1 serial correlation coefficient ρ,∑n−1i=1 ε3
i εi+1 /∑n−1
i=1 ε4i , is compared with the old estimator,∑n−1
i=1 εiεi+1 /∑n−1
i=1 ε2i , for a stationary AR(1) process with known mean. The
mean and the variance of both estimators are calculated using the second or-
der Taylor expansion of a ratio. No further approximation is used. In case of
the mean of the old estimator, we derive Marriott and Pope’s (1954) formula,
with (n− 1)−1 instead of (n)−1, and an additional term ∝ (n− 1)−2. In case
of the variance of the old estimator, Bartlett’s (1946) formula results, with
(n− 1)−1 instead of (n)−1. The theoretical expressions are corroborated with
simulation experiments. The main results are as follows. (1) The new esti-
mator has a larger negative bias and a larger variance than the old estimator.
(2) The theoretical results for the mean and the variance of the old estimator
describe the principal behaviours over the entire range of ρ, in particular, the
decline to zero negative bias as ρ approaches unity.
1 Introduction
Consider the following estimators of the serial correlation coefficient ρ oflag 1 from a time series εi (i = 1, . . . , n) sampled from a process Ei withknown (zero) mean:
ρ̂ =
∑n−1i=1 εiεi+1∑n−1
i=1 ε2i
, (1)
ρ̂? =
∑n−1i=1 ε3
i εi+1∑n−1i=1 ε4
i
. (2)
Estimators of type (1) are well-known (e. g. Bartlett 1946, Marriott andPope 1954), whereas (2) is new to my best knowledge.
Let Ei be the stationary AR(1) process,
E1 ∼ N(0, 1),
Ei = ρ Ei−1 + Ui, i = 2, . . . , n, (3)
with Ui i. i. d. ∼ N(0, σ2) and σ2 = 1− ρ2. The following misjudgement ofmine led to the present study. ρ̂? minimises
n−1∑i=1
(εi+1 − ρεi)2 ε2
i ,
which is a weighted sum of squares. However, since Ei has constant varianceunity, no weighting should be necessary.
Nevertheless, we compare estimators (1) and (2) for process (3). We re-strict ourselves to 0 < ρ < 1. We first calculate their variances, respectivelymeans, up to the second order of the Taylor expansion of a ratio, similarlyto Bartlett (1946), respectively Marriott and Pope (1954). Since we concen-trate on the lag-1 estimators and process (3), no further approximation isnecessary. These theoretical expressions and also those of White (1961) forestimator (1) are then examined with simulation experiments.
2 Series expansions
2.1 Old estimator
2.1.1 Variance
We write (1) as
ρ̂ =1n
∑n−1i=1 εiεi+1
1n
∑n−1i=1 ε2
i
(4)
=c
v, say,
1
and have, up to the second order of deviations from the means of c and v,the variance
var(ρ̂) = var(c/v) ' var(c)
E2(v)− 2 E(c)
cov(v, c)
E3(v)+ E2(c)
var(v)
E4(v). (5)
We now apply a standard result for quadravariate standard Gaussian distri-butions with serial correlations ρj (e. g. Priestley 1981:325),
cov(εaεa+s, εbεb+s+t) = ρb−a ρb−a+t + ρb−a+s+t ρb−a−s,
from which we derive, for εi following process (3):
cov
(1
n
n−1∑a=1
εaεa+s,1
n
n−1∑b=1
εbεb+s+t
)=
=1
n2
n−1∑a,b=1
(ρ|b−a| ρ|b−a+t| + ρ|b−a+s+t| ρ|b−a−s|). (6)
Without further approximation we can directly derive the various constituentsof the right-hand side of (5) by the means of (6).
var(c) = var
(1
n
n−1∑i=1
εiεi+1
)= (put s = 1 and t = 0 in (6))
=1
n2
n−1∑a,b=1
ρ2|b−a| +n−1∑
a,b=1
ρ|b−a+1| ρ|b−a−1|
.
We find, by summing over the points of the a-b lattice in a ‘diagonal’ manner,
n−1∑a,b=1
ρ2|b−a| = (n− 1) + 2n−2∑i=1
i ρ2(n−i−1)
= (n− 1)
(2
1− ρ2− 1
)+ 2
(ρ2n−4 − ρ−2)
(1− ρ−2)2(7)
(at the last step we have used the arithmetic-geometric progression), and
n−1∑a,b=1
ρ|b−a+1| ρ|b−a−1| = (n− 1) ρ2 + 2n−2∑i=1
i ρ2(n−i−1)
= (n− 1)
(2
1− ρ2+ ρ2 − 2
)+ 2
(ρ2n−4 − ρ−2)
(1− ρ−2)2. (8)
(7) and (8) give var(c). Further,
cov(v, c) = cov
(1
n
n−1∑i=1
ε2i ,
1
n
n−1∑i=1
εiεi+1
)
2
= (put s = 0 and t = 1 in (6))
=2
n2
n−1∑a,b=1
ρ|b−a| ρ|b−a+1|.
We find
n−1∑a,b=1
ρ|b−a| ρ|b−a+1| = ρ2n−3 +n−2∑i=1
(2 i + 1) ρ2(n−i)−3
= (n− 1)2
ρ
(1
1− ρ2− 1
)
+2(ρ2n−5 − ρ−3)
(1− ρ−2)2+
(ρ2n−3 − ρ−1)
(1− ρ−2). (9)
(At the last step we have used the arithmetic-geometric and also the geomet-ric progression.) (9) gives cov(v, c). Further,
var(v) = var
(1
n
n−1∑i=1
ε2i
)= (put s = t = 0 in (6))
=2
n2
n−1∑a,b=1
ρ2|b−a|,
which is given from (7). We further need
E(v) =n− 1
n(10)
and
E(c) =ρ(n− 1)
n. (11)
All three terms contributing to var(ρ̂) have a part ∝ (n− 1)−2. However,these parts cancel out:
var(ρ̂) ' (1− ρ2)
(n− 1). (12)
Bartlett (1946) already gave, up to the order (n)−1,
var(ρ̂) ' (1− ρ2)
n, (13)
which cannot be distinguished from our result.
3
2.1.2 Mean
We have, up to the second order of deviations from the means of c andv, the mean
E(ρ̂) = E(c/v) ' E(c)
E(v)− cov(v, c)
E2(v)+ E(c)
var(v)
E3(v). (14)
As above, we derive cov(v, c) and var(v) from (9) and (7), respectively, andhave E(v) and E(c) given by (10) and (11), respectively, yielding
E(ρ̂) ' ρ− 2 ρ
(n− 1)+
2
(n− 1)2
(ρ− ρ2n−1)
(1− ρ2). (15)
Marriott and Pope (1954) investigated the estimator
ρ̂′ =1
n−1
∑n−1i=1 εiεi+1
1n
∑n−1i=1 ε2
i
and gave, up to the order (n)−1,
E(ρ̂′) ' ρ− 2 ρ
n,
which cannot be distinguished from the first two terms of our result.
2.1.3 Unknown mean of the process
I have also tried to calculate up to the same order of approximation themean and the variance of the estimator∑n−1
i=1
(εi − 1
n−1
∑n−1j=1 εj
) (εi+1 − 1
n−1
∑n−1j=1 εj+1
)∑n−1
i=1
(εi − 1
n−1
∑n−1j=1 εj
)2 ,
for the case when the mean of the process is unknown. That would requireto calculate the following sums over the points of a cubic lattice:
n−1∑a,b,c=1
ρ|b−a| ρ|c−a−s|
for s = 0 and 1. However, I was unable to obtain exact formulas.
2.2 New estimator
2.2.1 Variance
We write (2) as
ρ̂? =1n
∑n−1i=1 ε3
i εi+11n
∑n−1i=1 ε4
i
=d
w, say.
4
var(ρ̂?), up to the second order of the Taylor expansion, is given by (5),substituting d for c and w for v, respectively. We need the following resultfor octavariate standard Gaussian distributions with serial correlations ρj
(Appendix),
cov(ε3aεa+s, ε
3bεb+s+t) = 9 ρb−a ρb−a+t + 9 ρb−a+s+t ρb−a−s
+ 18 ρs ρb−a ρb−a+s+t + 18 ρb−a ρb−a−s ρs+t
+ 18 ρs ρ2b−a ρs+t + 18 ρ2
b−a ρb−a+s+t ρb−a−s
+ 6 ρ3b−a ρb−a+t, (16)
from which we derive, for εi following process (3):
cov
(1
n
n−1∑a=1
ε3aεa+s,
1
n
n−1∑b=1
ε3bεb+s+t
)=
=1
n2
n−1∑a,b=1
( 9 ρ|b−a| ρ|b−a+t| + 9 ρ|b−a+s+t| ρ|b−a−s|
+ 18 ρ|s| ρ|b−a| ρ|b−a+s+t| + 18 ρ|b−a| ρ|b−a−s| ρ|s+t|
+ 18 ρ|s| ρ2|b−a| ρ|s+t| + 18 ρ2|b−a| ρ|b−a+s+t| ρ|b−a−s|
+ 6 ρ3|b−a| ρ|b−a+t| ) . (17)
Without further simplification we can directly obtain the various constituentsof var(ρ̂?) by the means of (17).
var(d) = var
(1
n
n−1∑i=1
ε3i εi+1
)= (put s = 1 and t = 0 in (17))
=1
n2
9n−1∑
a,b=1
ρ2|b−a| + 9n−1∑
a,b=1
ρ|b−a+1| ρ|b−a−1|
+ 18 ρn−1∑
a,b=1
ρ|b−a| ρ|b−a+1| + 18 ρn−1∑
a,b=1
ρ|b−a| ρ|b−a−1|
+ 18 ρ2n−1∑
a,b=1
ρ2|b−a| + 18n−1∑
a,b=1
ρ2|b−a| ρ|b−a+1| ρ|b−a−1|
+ 6n−1∑
a,b=1
ρ4|b−a|
.
The first and the fifth sum is given by (7), the second sum by (8) and thethird by (9). We find,
n−1∑a,b=1
ρ|b−a| ρ|b−a−1| =n−1∑
a,b=1
ρ|b−a| ρ|b−a+1|,
5
which is given by (9),
n−1∑a,b=1
ρ2|b−a| ρ|b−a+1| ρ|b−a−1| = (n− 1) ρ2 + 2n−2∑i=1
i ρ4(n−i−1)
= (n− 1)
(2
1− ρ4+ ρ2 − 2
)+ 2
(ρ4n−8 − ρ−4)
(1− ρ−4)2(18)
and
n−1∑a,b=1
ρ4|b−a| = (n− 1) + 2n−2∑i=1
i ρ4(n−i−1)
= (n− 1)
(2
1− ρ4− 1
)+ 2
(ρ4n−8 − ρ−4)
(1− ρ−4)2. (19)
(7), (8), (9), (18) and (19) give var(d). Further,
cov(w, d) = cov
(1
n
n−1∑i=1
ε4i ,
1
n
n−1∑i=1
ε3i εi+1
)
= (put s = 0 and t = 1 in (17))
=1
n2
36n−1∑
a,b=1
ρ|b−a| ρ|b−a+1| + 36 ρn−1∑
a,b=1
ρ2|b−a|
+ 24n−1∑
a,b=1
ρ3|b−a| ρ|b−a+1|
.
The last sum of this expression,
n−1∑a,b=1
ρ3|b−a| ρ|b−a+1| = (n− 1) ρ +n−2∑i=1
i ρ4(n−i)−3 +n−2∑i=1
i ρ4(n−i)−5
= (n− 1)
(1
ρ−1 − ρ3+
1
ρ− ρ5− 1
ρ
)
+(ρ4n−7 + ρ4n−9) (1− ρ−4n+4)
(1− ρ−4)2. (20)
(7), (9) and (20) give cov(w, d). Further,
var(w) = var
(1
n
n−1∑i=1
ε4i
)= (put s = t = 0 in (17))
=1
n2
72n−1∑
a,b=1
ρ2|b−a| + 24n−1∑
a,b=1
ρ4|b−a|
,
6
which is given from (7) and (19), respectively. We further need
E(w) =3 (n− 1)
n(21)
and
E(d) =3 ρ (n− 1)
n. (22)
As for the old estimator, the terms ∝ (n−1)−2 which contribute to var(ρ̂?)cancel out:
var(ρ̂?) ' 5
3
(1− ρ2)
(n− 1). (23)
2.2.2 Mean
We use (14) with d instead of c and w instead of v, respectively. Wederive cov(w, d) from (7), (9) and (20), further, var(w) from (7) and (19),and we have E(w) and E(d) given by (21) and (22), respectively. This yields,up to the second order of deviations from the means of d and w,
E(ρ̂?) ' ρ − 1
(n− 1)
4
3ρ
(5− 2
1 + ρ2
)
+1
(n− 1)2
[4
(ρ− ρ2n−1)
(1− ρ2)+
8
3
(ρ3 − ρ4n−1)
(1− ρ2) (1 + ρ2)2
]. (24)
2.3 Remark
ρ̂? is found to have a larger negative bias and also a larger variance than ρ̂.In the simulation experiment, we intend not only to prove that. We are alsointerested to examine the significance of the term ∝ (n−1)−2 in (15). In thecomparison we include the theoretical results of White (1961) who studiedthe old estimator (1) for process (3). He calculated the kth moment of (c/v)via the joint moment generating function m(c, v), with c and v defined as in(4) without the factors 1/n,
E(ρ̂k) =∫ 0
−∞
∫ vk
−∞. . .∫ v2
−∞
∂km(c, v)
∂ck
∣∣∣∣∣c =0
dv1 dv2 . . . dvk.
He expanded the integrand to terms of order n−3 and ρ4 and gave the fol-lowing results (the index ‘W ’ refers to his study):
var(ρ̂)W '(
1
n− 1
n2+
5
n3
)−(
1
n− 9
n2+
53
n3
)ρ2 − 12
n3ρ4, (25)
E(ρ̂)W '(1− 2
n+
4
n2− 2
n3
)ρ +
2
n2ρ3 +
2
n2ρ5. (26)
7
3 Simulation experiments
In the first experiment, for each combination of n and ρ listed in Table1, we generated a set of 250 000 time series from process (3). For every timeseries, ρ̂? and ρ̂ have been calculated. The sample means, µρ̂? sim and µρ̂ sim,and the sample standard deviations, σρ̂? sim and σρ̂ sim, over the simulationsare listed in Table 1. Therein, our theoretical values—from (12), (15), (23)and (24)—are included. In case of the means, these are additionally splittedinto terms ∝ (n− 1)−2 and terms of lower order. Also the theoretical valuesfrom White’s (1961) formula, herein (25) and (26), are listed.
We further investigated whether the distribution of ρ̂? approaches Gaus-sianity as fast as that of ρ̂. Fig. 1 shows histograms of ρ̂?, respectively ρ̂,from the first simulation experiment in comparison with Gaussian distribu-tions N(µρ̂? sim, σ2
ρ̂? sim), respectively N(µρ̂ sim, σ2ρ̂ sim).
In the second simulation experiment (Fig. 2), formulas (12) versus (25)and (15) versus (26) are examined over the entire range of ρ, with particularinterest on ρ large. We plot the negative bias, ρ − E(ρ̂), and
√var(ρ̂). For
each combination of ρ and n, the number of simulated time series after (3)is 250 000.
The error due to the limited number of simulations is small enough fornot influencing any intended comparison.
4 Results and discussion
The first simulation experiment (Table 1) confirms, over a broad range ofρ and n, that ρ̂? has a larger bias and a larger variance, respectively, than ρ̂.
In general, the deviations between the theoretical results and the simula-tion results decrease with n increased, as is to be expected.
For any combination of ρ and n listed in Table 1, our terms ∝ (n−1)−2 in(15), respectively (24), bring these theoretical results closer to the simulationresults, with the exception of E(ρ̂) for ρ = 0.9 and n = 800 where thesimulation noise prevents that comparison. For ρ large, respectively n small,these second order terms contribute heavier.
The frequency distribution of ρ̂? (Fig. 1) has a similar shape as that ofρ̂. It is shifted to smaller values against that and also broader, reflecting thelarger negative bias, respectively the larger variance. The functional formof the distribution of ρ̂ is approximately the Leipnik distribution, which isheavier skewed for ρ large and tends to Gaussianity with n increased. Bothestimators seem to approach Gaussianity equally fast (Fig. 1).
The second simulation experiment (Fig. 2) compares White’s (1961) the-oretical results for ρ̂, (25) and (26), with ours, (12) and (15). It shows thathis are better performing up to a certain value of ρ.
Above, our formulas describe better the simulation, particularly, the de-
8
cline to zero negative bias, respectively zero variance, as ρ approaches unity.The decline to zero negative bias is caused by the term ∝ (n− 1)−2 in (15).Those declines are reasonable, since for ρ = 1 all time series points haveequal value and c = v in (4).
For small ρ, his formulas perform better since they are more accuratewith respect to powers of (1/n) than ours. For larger ρ his approximationbecomes less accurate. In particular, for n ≥ 3 and ρ > 0, ρ− E(ρ̂)W cannotbecome zero. For n ≥ 10, d
dρ(ρ− E(ρ̂)W ) cannot become negative. For n ≥ 8,
var(ρ̂)W cannot become zero. That means, for those cases White’s (1961)formulas cannot produce the decline to zero negative bias and zero variance,respectively.
The fact that Bartlett’s (1946) formula for the variance, (13), describesthe simulation better than (12) for ρ → 0, is regarded as spurious.
These results mean that the expansion formulas for var(ρ̂) and E(ρ̂), (5)and (14), respectively, are sufficient to derive the principal behaviours forρ → 1. In case of the mean, however, no further approximation is allowedwhich would lead to the term ∝ (n− 1)−2 in (15) be neglected.
5 Conclusions
1. In case of the stationary AR(1) process (3) with known mean, the newestimator, ρ̂?, has a larger negative bias and a larger variance than theold estimator, ρ̂. Its distribution tends to Gaussianity about equallyfast.
2. Our formula (15) for E(ρ̂) describes the expected decline to zero nega-tive bias for ρ → 1.
3. The formula (12) for var(ρ̂) describes the expected decline to zero vari-ance for ρ → 1.
4. The second order Taylor expansion of a ratio is sufficient to derivethese two principal behaviours for ρ → 1, if no further approximationsare made. This condition could be fulfilled since we had concentratedourselves on the lag-1 estimator and process (3).
5. For unknown mean of the process, it is more complicated to derive theequations with the same accuracy.
Acknowledgements
Q. Yao is appreciated for comments on the manuscript. The present studywas carried out while the author was Marie Curie Research Fellow (EU-TMRgrant ERBFMBICT971919).
9
References
Bartlett, M. S., 1946, On the theoretical specification and sampling prop-erties of autocorrelated time-series: Journal of the Royal Statistical SocietySupplement, v. 8, p. 27-41 (Corrigenda: 1948, Journal of the Royal Statis-tical Society Series B, v. 10, no. 1).
Marriott, F. H. C., and Pope, J. A., 1954, Bias in the estimation of au-tocorrelations: Biometrika, v. 41, p. 390-402.
Priestley, M. B., 1981, Spectral Analysis and Time Series: London, Aca-demic Press, 890 p.
Scott, D. W., 1979, On optimal and data-based histograms: Biometrika,v. 66, p. 605-610.
White, J. S., 1961, Asymptotic expansions for the mean and variance ofthe serial correlation coefficient: Biometrika, v. 48, p. 85-94.
Appendix
We assume that εi is drawn from a standard Gaussian distribution withserial correlations ρj. We derive (16) by means of the moment generatingfunction. Write
cov(ε3aεa+s, ε
3bεb+s+t) = E(ε3
aεa+sε3bεb+s+t) − E(ε3
aεa+s) E(ε3bεb+s+t)
= E(ε3aεa+sε
3bεb+s+t) − 9 ρs ρs+t.
Now, E(ε3aεa+sε
3bεb+s+t) is the coefficient of (t31 t2 t33 t4)/(3! 1! 3! 1!) in the mo-
ment generating function
m(t1, t2, t3, t4) = exp
1
2
4∑i=1
4∑j=1
σijtitj
,
with
σ11 = σ22 = σ33 = σ44 = 1,
σ12 = σ21 = ρs,
σ13 = σ31 = ρb−a,
σ14 = σ41 = ρb−a+s+t,
σ23 = σ32 = ρb−a−s,
10
σ24 = σ42 = ρb−a+t,
σ34 = σ43 = ρs+t.
Terms ∝ (t31 t2 t33 t4) in m(t1, t2, t3, t4) can only be within the fourth order ofthe series expansion of the exponential function, i. e., within
1
4!
1
2
4∑i=1
4∑j=1
σijtitj
4
,
further restricting, within
1
384(t21 + t23 + 2 σ12t1t2 + 2 σ13t1t3
+ 2 σ14t1t4 + 2 σ23t2t3 + 2 σ24t2t4 + 2 σ34t3t4)4,
further restricting, within
1
384(4 σ2
13t21t
23 + 2 t21t
23 + 4 σ12t
31t2 + 4 σ13t
31t3 + 4 σ14t
31t4 + 4 σ23t
21t2t3
+ 4 σ24t21t2t4 + 4 σ34t
21t3t4 + 4 σ12t1t2t
23 + 4 σ13t1t
33 + 4 σ14t1t
23t4
+ 4 σ23t2t33 + 4 σ24t2t
23t4 + 4 σ34t
33t4 + 8 σ12σ13t
21t2t3 + 8 σ12σ14t
21t2t4
+ 8 σ12σ34t1t2t3t4 + 8 σ13σ14t21t3t4 + 8 σ13σ23t1t2t
23 + 8 σ13σ24t1t2t3t4
+ 8 σ13σ34t1t23t4 + 8 σ14σ23t1t2t3t4 + 8 σ23σ34t2t
23t4)
2.
Finally, these terms are
1
384(2 · 4 σ2
13t21t
23 · 8 σ12σ34t1t2t3t4 + 2 · 4 σ2
13t21t
23 · 8 σ13σ24t1t2t3t4
+ 2 · 4 σ213t
21t
23 · 8 σ14σ23t1t2t3t4 + 2 · 2 t21t
23 · 8 σ12σ34t1t2t3t4
+ 2 · 2 t21t23 · 8 σ13σ24t1t2t3t4 + 2 · 2 t21t
23 · 8 σ14σ23t1t2t3t4
+ 2 · 4 σ12t31t2 · 4 σ34t
33t4 + 2 · 4 σ13t
31t3 · 4 σ24t2t
23t4
+ 2 · 4 σ13t31t3 · 8 σ23σ34t2t
23t4 + 2 · 4 σ14t
31t4 · 4 σ23t2t
33
+ 2 · 4 σ23t21t2t3 · 4 σ14t1t
23t4 + 2 · 4 σ23t
21t2t3 · 8 σ13σ34t1t
23t4
+ 2 · 4 σ24t21t2t4 · 4 σ13t1t
33 + 2 · 4 σ34t
21t3t4 · 4 σ12t1t2t
23
+ 2 · 4 σ34t21t3t4 · 8 σ13σ23t1t2t
23 + 2 · 4 σ12t1t2t
23 · 8 σ13σ14t
21t3t4
+ 2 · 4 σ13t1t33 · 8 σ12σ14t
21t2t4 + 2 · 4 σ14t1t
23t4 · 8 σ12σ13t
21t2t3
+ 2 · 8 σ12σ13t21t2t3 · 8 σ13σ34t1t
23t4 + 2 · 8 σ13σ14t
21t3t4 · 8 σ13σ23t1t2t
23).
11
Thus, we find
E(ε3aεa+sε
3bεb+s+t) =
3! 1! 3! 1!
384(96 σ12σ34 + 96 σ13σ24 + 96 σ14σ23
+ 192 σ12σ13σ14 + 192 σ13σ23σ34
+ 192 σ12σ213σ34 + 192 σ2
13σ214σ23
+ 64 σ313σ24).
This gives the final result
cov(ε3aεa+s, ε
3bεb+s+t) = 9 ρb−a ρb−a+t + 9 ρb−a+s+t ρb−a−s
+ 18 ρs ρb−a ρb−a+s+t + 18 ρb−a ρb−a−s ρs+t
+ 18 ρs ρ2b−a ρs+t + 18 ρ2
b−a ρb−a+s+t ρb−a−s
+ 6 ρ3b−a ρb−a+t.
It should be noted that this result has been checked using the joint cumulantof order eight, in my assessment without less effort.
12
Tab
le1:
Fir
stsi
mul
atio
nex
peri
men
t.T
henu
mbe
rof
sim
ulat
edti
me
seri
esaf
ter
(3)
isin
each
case
250
000.
S.d
.:st
anda
rdde
viat
ion.
The
orde
rof
the
unce
rtai
nty
ofth
em
ean,
resp
ecti
vely
the
stan
dard
devi
atio
n,du
eto
the
limit
ednu
mbe
rof
sim
ulat
ions
,is
S.d
./√
250
000.
Sim
:si
mul
atio
n
(sam
ple
mea
nan
dsa
mpl
est
anda
rdde
viat
ion,
µρ̂
?sim
and
σρ̂
?sim
,re
spec
tive
lyµ
ρ̂sim
and
σρ̂
sim
).T
(24,
23):
theo
reti
calva
lue,
this
stud
y,(2
4),
resp
ective
ly(2
3).
2nd:
term∝
(n−
1)−
2in
(15)
,re
spec
tive
ly(2
4).
1st:
term
sno
t∝
(n−
1)−
2in
(15)
,re
spec
tive
ly(2
4).
ρ̂?
Sim
ρ̂Si
mρ̂
?T
(24)
1st
ρ̂?
T(2
4)2n
dρ̂
?T
(24,
23)
ρ̂T
(15)
1st
ρ̂T
(15)
2nd
ρ̂T
(15,
12)
ρ̂T
(26,
25)
ρ=
.200
Mea
n.1
5437
433
0.1
6867
697
4.1
0883
190
8.0
1054
171
6.1
1937
362
5.1
5555
555
5.0
0514
403
2.1
6069
958
8.1
6776
640
0
n=
10S.d
..3
2250
070
3.3
0752
412
9.4
2163
702
1.3
2659
863
2.3
0407
367
5
S.d
./
500
.000
645
001
.000
615
048
ρ=
.200
Mea
n.1
7146
424
4.1
8238
719
7.1
5681
511
4.0
0236
531
5.1
5918
043
0.1
7894
736
8.0
0115
420
1.1
8010
156
9.1
8199
160
0
n=
20S.d
..2
3935
337
6.2
1695
536
0.2
9019
050
0.2
2478
059
4.2
1623
505
7
S.d
./
500
.000
478
706
.000
433
910
ρ=
.200
Mea
n.1
8620
844
8.1
9224
108
8.1
8325
484
0.0
0035
563
4.1
8361
047
5.1
9183
673
4.0
0017
353
8.1
9201
027
3.1
9232
345
6
n=
50S.d
..1
6227
341
1.1
3773
871
5.1
8070
158
0.1
3997
084
2.1
3772
031
9
S.d
./
500
.000
324
546
.000
275
477
ρ=
.500
Mea
n.3
8829
477
0.4
2268
803
4.2
4814
814
8.0
3643
334
4.2
8458
149
3.3
8888
888
8.0
1646
084
2.4
0534
973
1.4
2212
500
0
n=
10S.d
..3
0363
569
6.2
9179
935
2.3
7267
799
6.2
8867
513
4.2
8017
851
4
S.d
./
500
.000
607
271
.000
583
598
ρ=
.500
Mea
n.4
6474
052
0.4
8087
103
2.4
5374
149
6.0
0122
911
7.4
5497
061
4.4
7959
183
6.0
0055
532
4.4
8014
716
0.4
8091
700
0
n=
50S.d
..1
4310
087
2.1
2423
156
1.1
5971
914
1.1
2371
791
4.1
2420
950
0
S.d
./
500
.000
286
201
.000
248
463
ρ=
.500
Mea
n.4
8642
804
2.4
9349
091
5.4
8478
747
2.0
0013
292
6.4
8492
039
8.4
9328
859
0.0
0006
005
7.4
9334
864
7.4
9343
581
4
n=
150
S.d
..0
8658
971
1.0
7103
908
3.0
9159
291
3.0
7094
756
5.0
7108
367
5
S.d
./
500
.000
173
179
.000
142
078
13
Tab
le1:
(Con
tinu
ed)
ρ̂?
Sim
ρ̂Si
mρ̂
?T
(24)
1st
ρ̂?
T(2
4)2n
dρ̂
?T
(24,
23)
ρ̂T
(15)
1st
ρ̂T
(15)
2nd
ρ̂T
(15,
12)
ρ̂T
(26,
25)
ρ=
.900
Mea
n.7
9482
476
5.8
3198
103
0.6
5399
825
5.0
6017
638
2.7
1417
463
7.8
0526
315
7.0
2576
401
2.8
3102
717
0.8
2537
245
0
n=
20S.d
..1
5289
861
7.1
4164
999
8.1
2909
944
4.1
0000
000
0.1
3964
096
8
S.d
./
500
.000
305
797
.000
283
299
ρ=
.900
Mea
n.8
6759
846
6.8
8323
932
7.8
5278
754
3.0
0225
185
8.8
5503
940
2.8
8181
818
1.0
0096
660
3.8
8278
478
5.8
8262
209
8
n=
100
S.d
..0
5614
476
9.0
4955
218
9.0
5655
663
7.0
4380
858
2.0
4983
168
4
S.d
./
500
.000
112
289
.000
099
104
ρ=
.900
Mea
n.8
9464
605
5.8
9775
169
9.8
9415
014
6.0
0003
457
1.8
9418
471
7.8
9774
718
3.0
0001
483
9.8
9776
202
3.8
9775
974
4
n=
800
S.d
..0
1917
615
3.0
1573
253
5.0
1990
800
7.0
1542
067
5.0
1572
382
4
S.d
./
500
.000
038
352
.000
031
465
ρ=
.980
Mea
n.9
0367
096
8.9
2936
735
1.7
0630
147
0.1
8279
971
1.8
8910
118
2.8
7684
210
5.0
7347
766
7.9
5031
977
2.9
0078
056
3
n=
20S.d
..1
1756
580
5.1
0651
587
3.0
5893
796
9.0
4565
315
4.1
1818
543
8
S.d
./
500
.000
235
131
.000
213
031
ρ=
.980
Mea
n.9
6434
177
3.9
7133
571
9.9
5386
797
9.0
0291
532
0.9
5678
329
9.9
7015
075
3.0
0124
943
8.9
7140
019
2.9
7039
001
0
n=
200
S.d
..0
2159
105
1.0
1902
243
0.0
1821
148
7.0
1410
655
7.0
1954
402
2
S.d
./
500
.000
043
182
.000
038
044
ρ=
.980
Mea
n.9
7775
960
2.9
7903
384
1.9
7739
856
3.0
0002
889
9.9
7742
746
2.9
7901
950
9.0
0001
238
6.9
7903
189
5.9
7902
190
2
n=
2000
S.d
..0
0552
619
3.0
0465
273
5.0
0574
599
9.0
0445
083
1.0
0465
873
1
S.d
./
500
.000
011
052
.000
009
305
ρ=
.999
Mea
n.9
9510
599
1.9
9660
456
7.9
8832
531
5.0
0622
666
5.9
9455
198
1.9
9499
599
1.0
0253
513
8.9
9753
113
0.9
9503
590
4
n=
500
S.d
..0
0569
976
4.0
0484
587
7.0
0258
392
8.0
0200
150
2.0
0595
376
0
S.d
./
500
.000
011
399
.000
009
691
ρ=
.999
Mea
n.9
9758
724
0.9
9818
015
5.9
9633
533
4.0
0057
443
4.9
9690
976
8.9
9800
050
0.0
0024
554
3.9
9824
604
4.9
9800
299
4
n=
2000
S.d
..0
0187
081
6.0
0162
235
4.0
0129
099
4.0
0100
000
0.0
0172
844
4
S.d
./
500
.000
003
741
.000
003
244
ρ=
.999
Mea
n.9
9890
792
5.9
9896
063
0.9
9889
346
4.0
0000
093
2.9
9889
439
7.9
9896
003
9.0
0000
039
9.9
9896
043
9.9
9896
004
3
n=
5000
0S.d
..0
0024
757
7.0
0020
794
8.0
0025
813
6.0
0019
995
1.0
0020
777
9
S.d
./
500
.000
000
495
.000
000
415
14
-0.5 0.0 0.5 1.0
ρ = 0.20n = 10
-0.5 0.0 0.5
ρ = 0.20n = 20
-0.2 0.0 0.2 0.4 0.6
ρ = 0.20n = 50
Figure 1: First simulation experiment (cf. Table 1). Histograms of ρ̂?
(thick line), respectively ρ̂ (thin line), compared with Gaussian distributionsN(µρ̂? sim, σ2
ρ̂? sim) (heavy line), respectively N(µρ̂ sim, σ2ρ̂ sim) (light line). The
number of histogram classes follows Scott (1979).
15
ρ = 0.50n = 10
ρ = 0.50n = 50
ρ = 0.50n = 150
-0.5 0.0 0.5 1.0
0.0 0.5
0.3 0.4 0.5 0.6 0.7
Figure 1: (Continued)
16
ρ = 0.90n = 20
ρ = 0.90n = 100
ρ = 0.90n = 800
0.5 1.0
0.7 0.8 0.9 1.0
0.85 0.90 0.95
Figure 1: (Continued)
17
ρ = 0.98n = 20
ρ = 0.98n = 200
ρ = 0.98n = 2000
0.6 0.8 1.0 1.2
0.90 0.95 1.00
0.96 0.97 0.98 0.99
Figure 1: (Continued)
18
ρ = 0.999n = 500
ρ = 0.999n = 2000
ρ = 0.999n = 50000
0.98 0.99 1.00 1.01
0.995 1.000
0.9985 0.9990 0.9995
Figure 1: (Continued)
19
0.0 0.2 0.4 0.6 0.8 1.0
0.00
0.05
0.10
ρ - E(ρ^)
n = 10
SQRT[var(ρ^)]
n = 10
0.0 0.2 0.4 0.6 0.8 1.0
ρ
0.00
0.10
0.20
0.30
Figure 2: Second simulation experiment. Above: negative bias, below:standard deviation. Simulation results (dots). Theoretical result, our for-mula, (12) and (15), respectively, (heavy line). Theoretical result, White’s(1961) formula, (25) and (26), respectively, (light line).
20
0.00
0.04
0.080.0 0.2 0.4 0.6 0.8 1.0
ρ - E(ρ^)
n = 20
0.0 0.2 0.4 0.6 0.8 1.0
ρ
0.00
0.10
0.20
SQRT[var(ρ^)]
n = 20
Figure 2: (Continued)
21
0.0 0.2 0.4 0.6 0.8 1.0
0.00
0.01
0.02
ρ - E(ρ^)
n = 100
SQRT[var(ρ^)]
n = 100
0.0 0.2 0.4 0.6 0.8 1.0
ρ
0.00
0.05
0.10
Figure 2: (Continued)
22