Lecture Notes on
ST304-TIME SERIES ANALYSIS &
FORECASTING
by
Professor Howell Tong
© 2005, 2006
CHAPTER ONE
GENERAL
(0) Examples of Time Series.
Some examples will be shown in the lecture but you are encouraged to provide your
own!
(1) t ∈ {0, ±1, ±2, . . .} = T , called the parameter space. Xt takes real values, so the
state space is R. {Xt : t = 0, ±1, ±2, . . .} denotes a sequence of random variables
(r.v.s.) indexed by t, i.e. for each t, Xt is a real-valued r.v., and {Xt} is a stochastic
process. If t denotes time, {Xt : t = 0, ±1, ±2, . . .} is called a time series (in
discrete time), abbreviated T.S.
(2) {Xt} is strictly stationary (≡ stationary in the strict sense) if for any n ≥ 1, any
t1, t2, . . . , tn ∈ T and any k ∈ T , the joint distribution of {Xt1 , Xt2 , . . . , Xtn} is the
same as the joint distribution of {Xt1+k, Xt2+k, . . . , Xtn+k}.
We then have statistical equilibrium.
(3) {Xt} is second order stationary (≡ stationary in wide sense ≡ covariance stationary
≡ wide sense stationary) if
(i) E(Xt) = µ < ∞, independent of t;
(ii) Var Xt = σ² < ∞, independent of t, and cov(Xt, Xt+τ ) is a function of τ only.
This course is mainly concerned with wide sense stationary T.S. For simplicity, we
call them stationary T.S.
N.B. Strict stationarity + existence of µ, σ² ⇒ wide sense stationarity, but the
converse does not hold.
(4) cov(Xt, Xt+τ ) ≜ γ(τ), the autocovariance function at lag τ .
{γ(τ) : τ = 0, ±1, ±2, . . .} is called the autocovariance function (a.c.v.f.).
γ(τ) = γ(−τ)
because cov(Xt, Xt+τ ) = cov(Xt+τ , Xt).
ρ(τ) ≜ γ(τ)/γ(0), the normalisation.
{ρ(τ) : τ = 0,±1,±2, . . .} is called the autocorrelation function (a.c.r.f.). Sometimes
write γτ for γ(τ) and ρτ for ρ(τ).
(i) γ(0) = σ2 = var Xt (obvious)
(ii) |γ(τ)| ≤ γ(0) for all τ (by the Cauchy–Schwarz inequality)
(iii) γ(τ) = γ(−τ) ∀τ (done)
(i′) ρ(0) = 1
(ii′) |ρ(τ)| ≤ 1 ∀τ
(iii′) ρ(−τ) = ρ(τ) ∀τ
Finally, one important property of the autocovariance function of a stationary time
series. For any real constants a1, a2,
Var(a1X1 + a2X2) = cov(a1X1 + a2X2, a1X1 + a2X2)
= a1a1cov(X1, X1) + a1a2cov(X1, X2) + a2a1cov(X2, X1) + a2a2cov(X2, X2) ≥ 0.
Thus, ∑_{i=1}^{2} ∑_{j=1}^{2} γ(i − j) ai aj ≥ 0. In fact, this property holds for any
real constants a1, a2, . . . , ak and any k-subset of the time series, e.g. {Xt1 , Xt2 , . . . , Xtk},
so that we have ∑_{i=1}^{k} ∑_{j=1}^{k} γ(ti − tj) ai aj ≥ 0.
The above property is called the positive semi-definiteness of the acvf.
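As a numerical illustration (the code and parameter values below are my own, not part of the notes), one can build the matrix [γ(ti − tj)] for an arbitrary k-subset of time points, here using the acvf γ(τ) = σ²λ^{|τ|}/(1 − λ²) of a stationary AR(1) derived in Chapter Two, and check that its eigenvalues are non-negative:

```python
import numpy as np

def acvf_matrix(gamma, times):
    """Build the k x k matrix [gamma(t_i - t_j)] for a k-subset of time points."""
    return np.array([[gamma(ti - tj) for tj in times] for ti in times])

# Illustrative acvf: gamma(tau) = sigma^2 * lam^|tau| / (1 - lam^2),
# the acvf of a stationary AR(1) with |lam| < 1 (see Chapter Two).
sigma2, lam = 1.0, 0.6
gamma = lambda tau: sigma2 * lam ** abs(tau) / (1 - lam ** 2)

G = acvf_matrix(gamma, times=[1, 3, 4, 8, 9])

# Positive semi-definiteness: for any real a, sum_i sum_j gamma(t_i - t_j) a_i a_j
# = a' G a >= 0, which holds iff all eigenvalues of the symmetric matrix G are >= 0.
eigvals = np.linalg.eigvalsh(G)
print(eigvals.min() >= -1e-12)
```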
(5) Often we assume that Xt1 , Xt2 , . . . , Xtn are jointly Gaussian for any n ∈ Z+ and any
t1, t2, . . . , tn ∈ T . Then {Xt : t = 0, ±1, . . .} is called a Gaussian T.S.
For a Gaussian T.S., strict stationarity ≡ wide sense stationarity.
CHAPTER TWO
STANDARD MODELS
(0) Strict White Noise Process/White Noise Process
Notation:
Xt ∼ SWN(µ, σ²) or IID(µ, σ²) / Xt ∼ WN(µ, σ²)
{Xt} is a sequence of independent/uncorrelated random variables with mean µ <∞
(usually but not always zero) and variance σ2(<∞).
γ(τ) = σ² for τ = 0, and 0 otherwise (i.e. o.w.);
ρ(τ) = 1 for τ = 0, and 0 o.w.
W.N. is the basic building block for T.S. models.
At the elementary level, the Xt’s are assumed to have a joint Gaussian distribution.
(Recall that in this case independence ≡ uncorrelatedness.)
(1) Moving Average Model
Let’s start with a moving average model of order 1.
Notation: Xt ∼ MA(1)
Xt = b0Yt + b1Yt−1,
where Yt ∼ WN(0, σ²).
Without loss of generality set b0 = 1, because the above model is equivalent to
Xt = et + c1et−1,
where et ∼ WN(0, b0²σ²) and c1 = b1/b0.
An MA(1) model is stationary because:
(i) EXt = (b0 + b1)E(Yt), independent of t. [Note: In this case, the mean of Xt is
zero since the mean of Yt is zero. If we want a non-zero mean for Xt, we can
either use a Yt with non-zero mean or modify the MA(1) model to
Xt = a + b0Yt + b1Yt−1,
where the mean of Xt is simply equal to a.]
(ii) Var Xt = σ²(b0² + b1²), independent of t.
(iii)
γ(1) = cov(Xt+1, Xt)
= E[(b0Yt+1 + b1Yt)(b0Yt + b1Yt−1)]
= b0b1σ²,
i.e. independent of t.
γ(2) = cov(Xt+2, Xt) = cov(b0Yt+2 + b1Yt+1, b0Yt + b1Yt−1) = 0,
the zero covariance being due to the non-overlapping of the respective Y-components.
A similar argument then leads to γ(τ) = 0 for τ = 3, 4, . . . .
The corresponding acrf is easily calculated: ρ(0) = 1, ρ(1) = γ(1)/γ(0) =
b0b1/(b0² + b1²), ρ(2) = ρ(3) = . . . = 0.
[Note: Recall that γ(−τ) = γ(τ).]
Thus, for an MA(1) model the acvf (and equivalently the acrf) ‘cuts off at lag 1’,
in the sense that the function is zero after lag 1.
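As a quick numerical check (a sketch with illustrative values b0 = 1, b1 = 0.5, σ² = 2 of my own choosing), the theoretical ρ(1) = b0b1/(b0² + b1²) can be compared with the sample acf of a long simulated MA(1) path, making the cut-off at lag 1 visible:

```python
import numpy as np

# MA(1): X_t = b0*Y_t + b1*Y_{t-1}, Y_t ~ WN(0, sigma2).
b0, b1, sigma2 = 1.0, 0.5, 2.0   # illustrative values

gamma0 = sigma2 * (b0**2 + b1**2)   # Var X_t
gamma1 = sigma2 * b0 * b1           # cov(X_{t+1}, X_t)
rho1 = gamma1 / gamma0              # = b0*b1/(b0^2 + b1^2) = 0.4 here

# Compare with a long simulated path (fixed seed for reproducibility).
rng = np.random.default_rng(0)
y = rng.normal(0.0, np.sqrt(sigma2), size=200_001)
x = b0 * y[1:] + b1 * y[:-1]
xc = x - x.mean()

def sample_acf(x, lag):
    return (x[lag:] * x[:len(x) - lag]).sum() / (x * x).sum()

# Sample acf is close to rho1 at lag 1 and close to 0 at lag 2: the cut-off.
print(rho1, sample_acf(xc, 1), sample_acf(xc, 2))
```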
The above discussion generalises to an MA model of general order, i.e. to the
model
Xt = b0Yt + b1Yt−1 + . . . + bℓYt−ℓ,
where ℓ ≥ 1 and Yt ∼ WN(0, σ²).
We denote this model by MA(ℓ). This model is again stationary with mean
(b0 + b1 + . . . + bℓ)E(Yt), variance σ²(b0² + b1² + . . . + bℓ²) and
γ(τ) = cov(Xt+τ , Xt)
= E[(b0Yt+τ + b1Yt+τ−1 + . . . + bℓYt+τ−ℓ)(b0Yt + b1Yt−1 + . . . + bℓYt−ℓ)]
= (b|τ| b0 + b|τ|+1 b1 + . . . + bℓ bℓ−|τ|)σ² for |τ| ≤ ℓ
(pairing each b|τ|+s with bs, the coefficient of Yt−s),
= 0 for |τ| > ℓ.
(iii′)
ρ(τ) = ∑_{s=|τ|}^{ℓ} bs bs−|τ| / ∑_{j=0}^{ℓ} bj² for |τ| ≤ ℓ,
ρ(τ) = 0 for |τ| > ℓ.
For an MA(ℓ), the acrf cuts off at lag ℓ.
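The formula (iii′) is straightforward to code. The sketch below (function name and coefficients are my own illustrative choices) evaluates the theoretical acrf of an MA(ℓ) and exhibits the cut-off at lag ℓ:

```python
import numpy as np

def ma_acrf(b, max_lag):
    """Theoretical acrf of X_t = b[0]*Y_t + ... + b[l]*Y_{t-l}, Y_t ~ WN(0, sigma^2):
    rho(tau) = sum_{s=|tau|}^{l} b_s b_{s-|tau|} / sum_j b_j^2 (sigma^2 cancels)."""
    b = np.asarray(b, dtype=float)
    denom = (b ** 2).sum()
    rho = []
    for tau in range(max_lag + 1):
        num = (b[tau:] * b[:len(b) - tau]).sum() if tau < len(b) else 0.0
        rho.append(num / denom)
    return np.array(rho)

# An MA(2) with illustrative coefficients; rho(tau) = 0 for tau > 2.
rho = ma_acrf([1.0, 0.4, -0.3], max_lag=6)
print(rho)   # rho(0) = 1 and the acrf cuts off at the order of the model
```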
(2) Autoregressive Model
We start with the first order model.
Notation: Xt ∼ AR(1)
(2.1) Xt ∼ AR(1)
Xt − λXt−1 = Yt, where Yt ∼ WN(0, σ²).
This model need not be stationary at this point.
∴ Xt = Yt + λXt−1
= Yt + λ(Yt−1 + λXt−2)
= Yt + λYt−1 + λ²Xt−2
= . . .
= ∑_{s=0}^{m} λ^s Yt−s + λ^{m+1} Xt−m−1.
∴ E[Xt − ∑_{s=0}^{m} λ^s Yt−s]² = λ^{2m+2} E(X²t−m−1) → 0 as m → ∞ . . . ⊗
provided |λ| < 1.
∴ |λ| < 1 ⇒ Xt = ∑_{s=0}^{∞} λ^s Yt−s in the “mean squares sense” defined in ⊗. It
has now become an MA(∞) model.
∴ |λ| < 1 ⇒ {Xt} is stationary with EXt = (1/(1 − λ))EYt and
γ(τ) = ∑_{s=0}^{∞} λ^s λ^{|τ|+s} σ² = σ² λ^{|τ|}/(1 − λ²).
For:
EXt = E(∑_{s=0}^{∞} λ^s Yt−s) = ∑_{s=0}^{∞} λ^s E(Yt−s) = E(Yt) ∑_{s=0}^{∞} λ^s
= (1/(1 − λ))E(Yt).
Short Cut: If we know {Xt} is stationary(!), then EXt − λEXt−1 = EYt, which gives
immediately EXt = (1/(1 − λ))EYt because EXt−1 = EXt.
For τ > 0, cov(Xt, Xt+τ ) = cov(∑_{s=0}^{∞} λ^s Yt−s, ∑_{r=0}^{∞} λ^r Yt+τ−r)
= ∑_{s=0}^{∞} λ^s λ^{τ+s} σ².
Similarly for τ < 0.
Short Cut: If we know {Xt} is stationary, then
Xt − λXt−1 = Yt ⇒ cov(Xt, Xt−1) − λ cov(Xt−1, Xt−1) = cov(Yt, Xt−1),
i.e. γ(1) − λγ(0) = 0.
Similarly, cov(Xt, Xt−2) − λ cov(Xt−1, Xt−2) = cov(Yt, Xt−2),
i.e. γ(2) − λγ(1) = 0.
In general, we have the so-called Yule–Walker equation:
γ(τ ) − λγ(τ − 1) = 0 for τ = 1, 2, . . . ;
cf. Xt − λXt−1 = Yt, the model.
To find γ(0), i.e. Var Xt:
From Xt − λXt−1 = Yt,
we have cov(Xt, Xt) − λ cov(Xt−1, Xt) = cov(Yt, Xt),
i.e. γ(0) − λγ(1) = cov(Yt, Yt) = σ²
[∵ Xt = Yt + fn.{Yt−1, Yt−2, . . .}, so cov(Yt, Xt) = cov(Yt, Yt)].
∴ γ(0) − λ(λγ(0)) = σ²,
i.e. γ(0) = σ²/(1 − λ²), the boundary condition of the Yule–Walker equation.
N.B. In simulation, start with X0, and set
Xt − λXt−1 = Yt, t = 1, 2, 3, . . . .
Using Y1, Y2, . . . from a random number generator, we typically get a picture
showing an initial transient effect due to warming up, to be discarded if stationarity
is intended.
ρ(τ) = γ(τ)/γ(0) = λ|τ |, τ = 0,±1,±2, . . .
ρ(τ)→ 0 exponentially as τ →∞
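The N.B. above can be sketched in code (the values λ = 0.7, σ² = 1 and the burn-in length are my own illustrative choices): simulate the recursion, discard the warming-up transient, and compare the sample acf with ρ(τ) = λ^{|τ|} and the sample variance with γ(0) = σ²/(1 − λ²):

```python
import numpy as np

# AR(1): X_t = lam*X_{t-1} + Y_t, |lam| < 1, Gaussian WN innovations.
lam, sigma2 = 0.7, 1.0       # illustrative values
n, burn = 200_000, 500       # 'burn' = transient to discard (warming up)

rng = np.random.default_rng(1)
y = rng.normal(0.0, np.sqrt(sigma2), size=n + burn)
x = np.zeros(n + burn)       # start from X_0 = 0
for t in range(1, n + burn):
    x[t] = lam * x[t - 1] + y[t]
x = x[burn:]                 # discard the transient
xc = x - x.mean()

def sample_acf(x, lag):
    return (x[lag:] * x[:len(x) - lag]).sum() / (x * x).sum()

# Theory: rho(tau) = lam^|tau|, gamma(0) = sigma^2/(1 - lam^2) ~ 1.96 here.
print(sample_acf(xc, 1), sample_acf(xc, 3), x.var(), sigma2 / (1 - lam**2))
```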
(2.2) Xt ∼ AR(2)
Xt + a1Xt−1 + a2Xt−2 = Yt.
To solve Xt in terms of Yt, Yt−1, Yt−2, . . . , it is best to introduce the B operator,
namely BXt = Xt−1. Then the model becomes
(1 + a1B + a2B²)Xt = Yt, where B²Xt = B(BXt) = Xt−2,
i.e.
(1 − λ1B)(1 − λ2B)Xt = Yt,
where λ1 and λ2 are the distinct roots of the characteristic equation z² + a1z + a2 = 0,
so that λ1λ2 = a2, λ1 + λ2 = −a1.
∴ Xt = [(1 − λ1B)(1 − λ2B)]⁻¹Yt
= (formally) 1/((1 − λ1B)(1 − λ2B)) Yt
= (1/(λ1 − λ2)){λ1/(1 − λ1B) − λ2/(1 − λ2B)}Yt
= ∑_{s=0}^{∞} ((λ1^{s+1} − λ2^{s+1})/(λ1 − λ2)) Yt−s (in mean squares sense)
provided |λ1| < 1, |λ2| < 1,
i.e. Xt = ∑_{s=0}^{∞} hs Yt−s, with hs independent of t, provided |λ1| < 1, |λ2| < 1.
Also, |λ1| < 1, |λ2| < 1 ⇒ EXt = (∑_{s=0}^{∞} hs)(EYt) = K·EYt, K < ∞,
cov(Xt, Xt+|τ|) = ∑_{s=0}^{∞} hs hs+|τ| σ²,
and
Var Xt = (∑_{s=0}^{∞} hs²)σ² = cσ², c < ∞.
Note: hs is a function of λ1, λ2, which are functions of a1 and a2.
∴ |λ1| < 1 and |λ2| < 1 ⇒ {Xt} is stationary.
Easier Method to evaluate mean and a.c.v.f. for stationary AR(2)
Xt + a1Xt−1 + a2Xt−2 = Yt.
Suppose we know that a1 and a2 have been so chosen as to ensure stationarity. Then
we may proceed as follows.
By stationarity, EXt = EXt−1 = EXt−2 = µX say,
∴ µX(1 + a1 + a2) = EYt = µY say,
∴ µX = µY /(1 + a1 + a2).
Now, cov(Xt−τ , Xt + a1Xt−1 + a2Xt−2) = cov(Xt−τ , Yt):
τ = 0: γ(0) + a1γ(1) + a2γ(2) = σ²
τ > 0: γ(τ ) + a1γ(τ − 1) + a2γ(τ − 2) = 0
(the Yule–Walker equations).
For τ > 0, ρ(τ ) + a1ρ(τ − 1) + a2ρ(τ − 2) = 0 . . . . . . ⊗
For τ = 1, ρ(1) + a1 + a2ρ(1) = 0 ⇒ ρ(1) = −a1/(1 + a2).
In practice, compute recursively ρ(2), ρ(3), . . . , using ⊗ and starting with ρ(0) = 1,
ρ(1) = −a1/(1 + a2).
To solve ⊗ in closed form we quote a standard result from difference equations:
ρ(τ ) = Aλ1^τ + Bλ2^τ,
where λ1, λ2 are the solutions of z² + a1z + a2 = 0, subject to ρ(0) = 1 and
ρ(1) = −a1/(1 + a2),
∴ ρ(τ ) = [(1 − λ2²)λ1^{|τ|+1} − (1 − λ1²)λ2^{|τ|+1}] / [(λ1 − λ2)(1 + λ1λ2)].
This result is only useful as a closed form analytic solution.
To find γ(0):
Now γ(0) + a1γ(1) + a2γ(2) = σ²
⇒ γ(0)[1 + a1ρ(1) + a2ρ(2)] = σ²
⇒ γ(0) = (1 + a2)σ² / [(1 − a2)(1 − a1 + a2)(1 + a1 + a2)].
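In code, the practical recipe above (the recursion ⊗ plus the boundary condition for γ(0)) looks like the following sketch; the coefficients a1 = −0.5, a2 = 0.3 are illustrative stationary choices of my own:

```python
# AR(2): X_t + a1*X_{t-1} + a2*X_{t-2} = Y_t, assumed stationary.
a1, a2, sigma2 = -0.5, 0.3, 1.0   # illustrative stationary choice

# Yule-Walker recursion for the acrf, started from rho(0), rho(1):
rho = [1.0, -a1 / (1 + a2)]
for tau in range(2, 11):
    rho.append(-a1 * rho[tau - 1] - a2 * rho[tau - 2])

# Boundary condition for gamma(0):
gamma0 = (1 + a2) * sigma2 / ((1 - a2) * (1 - a1 + a2) * (1 + a1 + a2))

# Consistency check: gamma(0)*[1 + a1*rho(1) + a2*rho(2)] should equal sigma^2.
print(rho[:4], gamma0, gamma0 * (1 + a1 * rho[1] + a2 * rho[2]))
```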
Behaviour of a.c.r.f.
(a) λ1, λ2 real:
ρ(τ ) → 0 exponentially as τ → ∞.
Typically |λ1| > |λ2|; then ρ(τ ) ∼ cλ1^τ for τ large.
ρ(τ ) is always positive if λ1 > 0, and alternates between positive and negative
if λ1 < 0.
(b) λ1 = √a2 e^{iφ}, λ2 = √a2 e^{−iφ} (0 ≤ a1² < 4a2), where
φ = cos⁻¹(−a1/(2√a2)) ∈ [0, π).
It can be shown that the acrf is an exponentially damped sine function. The algebra
is messy, so we skip it and only state the result:
ρ(τ ) = a2^{τ/2} sin(τφ + ψ)/ sin ψ, τ ≥ 0,
where tan ψ = ((1 + a2)/(1 − a2)) tan φ.
Therefore ρ(τ ) lies on an exponentially damped sine wave with period 2π/φ.
Note: The closer a2 is to 1, the weaker the damping.
(2.3) Xt ∼ AR(k)
The discussions above on AR(1) and AR(2) models suggest the following results for
a general AR(k) model, which, for the sake of completeness, we state/summarize
without proof.
Xt + a1Xt−1 + . . .+ akXt−k = Yt, Yt ∼WN(0, σ2).
Let the characteristic equation z^k + a1z^{k−1} + . . . + ak = 0 have distinct roots
λ1, λ2, . . . , λk such that |λj| < 1 for all j. In this case, Xt = ∑_{s=0}^{∞} hs Yt−s (in
mean squares sense), where ∑_{s=0}^{∞} |hs| < ∞ and ∑_{s=0}^{∞} hs² < ∞. In fact
hs = ∑_{j=1}^{k} cj λj^s, where cj = λj^{k−1} / ∏_{l≠j}(λj − λl).
∴ |λj| < 1 for j = 1, 2, . . . , k ⇒ {Xt} is stationary.
Under stationarity we can find mean and a.c.v.f.:
(i) µX(1 + a1 + a2 + . . . + ak) = µY ,
∴ µX = µY / ∑_{j=0}^{k} aj, (a0 = 1).
(ii) cov(Xt−τ , Xt + a1Xt−1 + . . . + akXt−k) = 0 for τ > 0, and = σ² for τ = 0,
leading to the Yule–Walker equation:
ρ(τ ) + a1ρ(τ − 1) + . . . + akρ(τ − k) = 0, τ > 0.
It can be shown that the general solution is
ρ(τ ) = B1λ1^{|τ|} + B2λ2^{|τ|} + . . . + Bkλk^{|τ|},
where B1, B2, . . . , Bk are determined from the initial values ρ(0), ρ(1), . . . , ρ(k − 1).
[In practice, as in AR(1) and AR(2), we find ρ(1), ρ(2), . . . , ρ(k − 1) first by solving
the (k − 1) equations obtained by putting τ = k − 1, k − 2, . . . , 1 in the Y–W
equations and recalling ρ(r) = ρ(−r):
ρ(k − 1) + a1ρ(k − 2) + . . . + (ak−2 + ak)ρ(1) + ak−1 = 0
ρ(k − 2) + . . . + (ak−3 + ak−1)ρ(1) + ak−2 = 0
. . .
akρ(k − 1) + ak−1ρ(k − 2) + . . . + (a2 + 1)ρ(1) + a1 = 0.
Then find the other ρ(τ )’s using the Y–W equation (putting τ = k, k + 1, . . .).]
To find γ(0), use
cov(Xt, Xt + a1Xt−1 + . . . + akXt−k) = σ²,
i.e. γ(0)[∑_{r=0}^{k} ar ρ(r)] = σ².
(2.4) Characteristics of AR(k)
We know the a.c.r.f. of an AR(k) tails off (exponentially). It is not always easy to
see this in practice, due to sampling fluctuations.
As a complementary tool, we introduce another useful concept as follows.
The partial autocorrelation function (pacf) π(·) of a stationary T.S. is defined by
π(1) = corr(X1, X0) = ρ(1),
π(2) = corr(X2 − Ê(X2|X1), X0 − Ê(X0|X1)),
π(3) = corr(X3 − Ê(X3|X2, X1), X0 − Ê(X0|X1, X2)),
etc.
Here, typically, Ê(X3|X2, X1) denotes the linear regression of X3 on X2, X1.
Example (AR(1)):
Xt = −a1Xt−1 + Yt
π(1) = corr(X1, X0) = ρ(1) = −a1
π(2) = corr(X2 − Ê(X2|X1), X0 − Ê(X0|X1))
Now,
Ê(X2|X1) = Ê(−a1X1 + Y2|X1)
= −a1X1 + Ê(Y2|X1)
= −a1X1.
∴ X2 − Ê(X2|X1) = X2 + a1X1 = Y2.
Next, X0 − Ê(X0|X1) is a function of X0 and X1, which is independent of Y2.
∴ π(2) = 0.
A similar argument shows that π(3) = π(4) = . . . = 0. In words, for an AR(1), the
pacf cuts off at 1.
The argument can be extended to an AR(k), leading to the result:
∴ for a stationary AR(k), the pacf cuts off at k.
Notes:
(1) For Xt ∼ AR(k) with Yt ∼ IID N(0, σ²), we can verify that π(k) = −ak.
(2) For Xt ∼ MA(ℓ), π(k) ≠ 0 for all k, but π(k) ↓ 0 in modulus as k → ∞,
i.e. for an MA(ℓ), the pacf tails off.
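The pacf can be computed from the acrf by the Durbin–Levinson recursion, which is equivalent to the regression definition above (the recursion is standard, but the function below is my own sketch). For an AR(1) with ρ(τ) = (−a1)^τ it reproduces the cut-off at 1:

```python
def pacf_from_acrf(rho, kmax):
    """Partial autocorrelations pi(1..kmax) from an acrf via the
    Durbin-Levinson recursion (equivalent to the regression definition)."""
    pi = []
    phi_prev = []
    for k in range(1, kmax + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pi.append(phi_kk)   # pi(k) = last regression coefficient at order k
        phi_prev = phi
    return pi

# Stationary AR(1): X_t = -a1*X_{t-1} + Y_t has rho(tau) = (-a1)^tau.
a1 = -0.6
rho = [(-a1) ** tau for tau in range(8)]
pi = pacf_from_acrf(rho, 5)
print(pi)   # pi(1) = -a1 = 0.6; pi(2) = pi(3) = ... = 0: pacf cuts off at 1
```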
(2.5) Duality and Invertibility
We have seen that the root condition |λj| < 1, all j, ensures that an AR(k) admits an
MA(∞) representation; i.e. Xt + a1Xt−1 + . . . + akXt−k = Yt admits the representation
Xt = ∑_{s=0}^{∞} hs Yt−s (in mean squares sense, or m.s. for short).
We have also seen the duality:
          acrf            pacf            root condition
AR(k)     tails off       cuts off at k   stationarity
MA(ℓ)     cuts off at ℓ   tails off       ???
Essentially, stationarity of an AR(k) model entails writing it as an MA(∞) in the
mean squares sense. A dual question is: under what condition can we convert an
MA(ℓ) into an AR(∞)?
Let’s look at an MA(1) model:
Xt = Yt − βYt−1
Formally,
Yt = Xt + βYt−1 = Xt + β{Xt−1 + βYt−2} = . . .
That is,
Yt = Xt + βXt−1 + β²Xt−2 + . . . + β^m Xt−m + β^{m+1} Yt−m−1,
which resembles the argument in section 2.1! Thus, as in that section, if |β| < 1,
Yt = ∑_{j=0}^{∞} β^j Xt−j (in the mean squares sense). We say the MA(1) model is
invertible if |β| < 1, so that it is then equivalent to an AR(∞) model (in the mean
squares sense).
Note that β is the root of the so-called ‘characteristic equation’ b0z + b1 = 0, where
b0 = 1 and b1 = −β. Thus, equivalently, we say that if the root of the characteristic
equation lies inside the unit circle, the MA(1) model is invertible.
The above discussion suggests the following answer to the general question: under
what condition can we convert an MA(ℓ) into an AR(∞)? [We shall skip the proof!]
Xt = b0Yt + b1Yt−1 + . . . + bℓYt−ℓ
= (b0 + b1B + . . . + bℓB^ℓ)Yt.
Formally,
Yt = (b0 + b1B + . . . + bℓB^ℓ)⁻¹Xt.
The inverse operator exists if and only if the roots λ1, . . . , λℓ of the characteristic
equation
b0z^ℓ + b1z^{ℓ−1} + . . . + bℓ = 0
all lie inside the unit circle. In this case, there exist gs’s such that
Yt = ∑_{s=0}^{∞} gs Xt−s, with ∑_{s=0}^{∞} |gs| < ∞ and ∑_{s=0}^{∞} gs² < ∞.
We call such an MA model invertible.
|λj| < 1 for j = 1, 2, . . . , ℓ ⇒ MA(ℓ) is invertible.
Notes:
(1) In practice, we observe {Xt}. The WN {Yt} is stipulated as a conceptual
building block and is unobservable. It turns out that to build an MA model for the
observations {X1, X2, . . . , Xn}, only invertible ones can be admitted/identified
uniquely. To see how non-uniqueness can arise, consider
Xt = Yt + 2Yt−1, Yt ∼ WN(0, 1) . . . (I)
and
Xt = εt + (1/2)εt−1, εt ∼ WN(0, 4) . . . (II)
In each case, σX² = 5, ρ(1) = 2/5, ρ(τ ) = 0 for τ ≥ 2. Model (I) is not invertible
but Model (II) is.
For Gaussian data, there is no information to enable us to say whether (I) or (II)
is the true generating mechanism. As far as Gaussian data are concerned, both
models are equally valid. To achieve uniqueness, we impose invertibility. There are
other bonuses in doing so.
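A quick numerical check of this claim, directly coding the MA(1) formulas for the two models above (nothing here beyond those models): both (I) and (II) have variance 5 and ρ(1) = 2/5, yet only (II) has its MA coefficient inside the unit circle:

```python
# Model (I):  X_t = Y_t + 2*Y_{t-1},   Y_t ~ WN(0, 1)
# Model (II): X_t = e_t + 0.5*e_{t-1}, e_t ~ WN(0, 4)
def ma1_moments(b1, sigma2):
    """Variance and lag-1 autocorrelation of X_t = Y_t + b1*Y_{t-1}."""
    gamma0 = sigma2 * (1 + b1 ** 2)
    rho1 = b1 / (1 + b1 ** 2)
    return gamma0, rho1

g1, r1 = ma1_moments(2.0, 1.0)   # model (I)
g2, r2 = ma1_moments(0.5, 4.0)   # model (II)
print(g1, r1, g2, r2)            # 5.0 0.4 5.0 0.4: identical second moments

# Invertibility of an MA(1) requires the coefficient of Y_{t-1} (the root of
# the characteristic equation) to lie inside the unit circle.
print(abs(2.0) < 1, abs(0.5) < 1)   # False True: (I) not invertible, (II) is
```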
(2) An MA(∞) is sometimes called a linear representation (some authors further
impose iid).
(3) Some books use the characteristic equation in the form b0 + b1z + . . . + bℓz^ℓ = 0.
The roots are then λj⁻¹ and we want these to lie outside the unit circle.
(2.6) ARMA Models
Notation: Xt ∼ ARMA(k, `)
A wider class of models is obtained upon marrying AR with MA. Thus,
Xt + ∑_{j=1}^{k} aj Xt−j = Yt + ∑_{j=1}^{ℓ} bj Yt−j.
In short: α(B)Xt = β(B)Yt, where Yt ∼ WN(0, σ²). WLOG we have assumed
EYt = 0, ∴ EXt = 0. [If Xt has non-zero mean, we can first replace Xt by
X′t = Xt − EXt and then consider the above model for X′t; alternatively, insert a
real constant a0 on the left-hand side of the model.] WLOG b0 = 1. [Otherwise
absorb b0 in σ².]
If the roots of z^k + a1z^{k−1} + . . . + ak = 0 all lie inside the unit circle, i.e. |λj| < 1
for all j, then {Xt} is stationary and ∃{hs} such that Xt = ∑_{s=0}^{∞} hs Yt−s (in
the mean squares sense).
Formally, hs is the coefficient of z^s in the power series expansion of
(1 + b1z + b2z² + . . . + bℓz^ℓ) / (1 + a1z + a2z² + . . . + akz^k).
Assume that stationarity obtains. Then
(I) µX ∑_{j=0}^{k} aj = µY ∑_{j=0}^{ℓ} bj (a0 = b0 = 1);
(II) To get the γ(τ )’s is a much harder problem!
(See APPENDIX for non-examinable information.)
In practice, ARMA models provide a more parsimonious fit to data (i.e. involve
fewer unknown parameters) than pure AR or pure MA models.
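The coefficients hs can be read off the power series expansion by long division, as stated above. A minimal sketch (the function name and the ARMA(1,1) coefficients are my own illustrative choices):

```python
def arma_psi(a, b, nterms):
    """Coefficients h_s of (1 + b1*z + ... + bl*z^l)/(1 + a1*z + ... + ak*z^k),
    by long division: h_s = b_s - sum_{j=1}^{min(s,k)} a_j * h_{s-j}."""
    h = []
    for s in range(nterms):
        bs = b[s] if s < len(b) else 0.0
        h.append(bs - sum(a[j] * h[s - j] for j in range(1, min(s, len(a) - 1) + 1)))
    return h

# ARMA(1,1): X_t + a1*X_{t-1} = Y_t + b1*Y_{t-1} (illustrative coefficients).
a = [1.0, -0.5]   # a0 = 1, a1 = -0.5
b = [1.0, 0.3]    # b0 = 1, b1 = 0.3
h = arma_psi(a, b, 6)
print(h)   # h0 = 1; h_s = (0.3 + 0.5) * 0.5**(s-1) for s >= 1
```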
(2.7) Periodic Process (also called Harmonic Process)
In all the above models, ρ(τ ) → 0 as τ → ∞. Does there exist a stationary T.S.
model whose ρ(τ ) does not tend to 0 as τ → ∞?
Consider
Xt = A cos(ωt+ Φ)
where A, ω are constants and Φ ∼ U(−π, π).
EXt = (A/2π) ∫_{−π}^{π} cos(ωt + φ) dφ = 0 for all t.
γ(τ ) = E(XtXt+τ )
= (A²/2π) ∫_{−π}^{π} cos(ωt + φ) cos(ωt + ωτ + φ) dφ
= (A²/4π) ∫_{−π}^{π} {cos[2(ωt + φ) + ωτ ] + cos ωτ} dφ
= (A²/2) cos ωτ, independent of t. ∴ stationary.
Being a cosine function, γ(τ) 6→ 0 as τ →∞.
Notes:
(i) This can be generalised to
Xt = ∑_{i=1}^{k} Ai cos(ωit + Φi),
where {Ai}, {ωi}, k are constants and Φ1, . . . , Φk are IID r.v.s ∼ U(−π, π).
(ii) This model is somewhat pathological, because Xt is a ‘deterministic’ periodic
wave once the phase angles Φi are fixed.
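The calculation of EXt and γ(τ) above integrates over the uniformly distributed phase Φ. A sketch (the values A = 2, ω = 0.8 are my own) that approximates those integrals by averaging over a fine uniform grid of phase angles confirms γ(τ) = (A²/2) cos ωτ, which does not decay:

```python
import numpy as np

# X_t = A*cos(omega*t + Phi), Phi ~ U(-pi, pi); average over a fine grid of
# phase angles to approximate the expectations (deterministic, no sampling).
A, omega, t = 2.0, 0.8, 5
phis = np.linspace(-np.pi, np.pi, 100_000, endpoint=False)

def x(t, phi):
    return A * np.cos(omega * t + phi)

mean_t = x(t, phis).mean()   # approximates E X_t = 0
gamma = [(x(t, phis) * x(t + tau, phis)).mean() for tau in range(6)]
theory = [A ** 2 / 2 * np.cos(omega * tau) for tau in range(6)]
print(mean_t)    # ~ 0
print(gamma)     # matches (A^2/2)*cos(omega*tau); does not decay to 0
```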
APPENDIX: Evaluation of the γ(τ )’s for ARMA(k, ℓ) models:
(i) Let
g(τ ) = cov(Xt−τ , Xt + a1Xt−1 + . . . + akXt−k)
= cov(Xt−τ , Yt + b1Yt−1 + . . . + bℓYt−ℓ)
= γ(τ ) + a1γ(τ − 1) + . . . + akγ(τ − k)
= ∑_{r=0}^{k} ar γ(τ − r) . . . (1)
= α(B)γ(τ ), τ ≥ 0.
Clearly,
g(τ ) = 0 for τ > ℓ . . . (2)
because Xt−τ is uncorrelated with Yt + b1Yt−1 + . . . + bℓYt−ℓ (cf. the Yule–Walker
equation).
(ii)
∑_{r=0}^{k} ar g(τ + r) = ∑_{r=0}^{k} ar cov(Xt−τ−r, β(B)Yt)
= cov(∑_{r=0}^{k} ar Xt−τ−r, β(B)Yt)
= cov(α(B)Xt−τ , β(B)Yt)
= cov(β(B)Yt−τ , β(B)Yt) [since α(B)Xt−τ = β(B)Yt−τ ]
= cov(zt−τ , zt), say, where zt = β(B)Yt ∼ MA(ℓ)
= γ(z)(τ ), say . . . (3)
= σ² ∑_{s=0}^{ℓ−τ} bs bs+τ for 0 ≤ τ ≤ ℓ, and 0 for τ > ℓ.
(iii) Putting τ = ℓ, ℓ − 1, . . . , 0 in (3) and using (2), we have
g(ℓ) = γ(z)(ℓ)
g(ℓ − 1) + a1g(ℓ) = γ(z)(ℓ − 1)
g(ℓ − 2) + a1g(ℓ − 1) + a2g(ℓ) = γ(z)(ℓ − 2)
. . .
g(0) + . . . + aℓ−2g(ℓ − 2) + aℓ−1g(ℓ − 1) + aℓg(ℓ) = γ(z)(0).
Hence we obtain g(0), g(1), . . . , g(ℓ),
∴ g(τ ) is determined ∀τ ≥ 0.
(iv) Then solve the (k + 1) equations from (1) for γ(0), γ(1), . . . , γ(k):
γ(k) + a1γ(k − 1) + . . . + ak−1γ(1) + akγ(0) = g(k)
γ(k − 1) + a1γ(k − 2) + . . . + (ak−2 + ak)γ(1) + ak−1γ(0) = g(k − 1)
. . .
akγ(k) + . . . + a1γ(1) + γ(0) = g(0).
(v) Finally γ(k + 1), γ(k + 2), . . . can be determined successively from
γ(k), γ(k − 1), . . . , γ(0) and g(k + 1), g(k + 2), . . . .
CHAPTER THREE
SPECTRAL ANALYSIS
(1) It may be proved that if ∑_{τ=−∞}^{∞} |ρ(τ )| < ∞, then there exists a function
f(ω), which has the properties of a probability density function, such that
(I) ρ(τ ) = ∫_{−π}^{π} e^{iωτ} f(ω) dω.
Here, f is called the spectral density function.
(2) Conversely, it may be shown that
(II) f(ω) = (1/2π) ∑_{τ=−∞}^{∞} ρ(τ ) e^{−iτω}.
Note: |f(ω)| ≤ (1/2π) ∑_{τ=−∞}^{∞} |ρ(τ )| < ∞ if ∑_{τ=−∞}^{∞} |ρ(τ )| < ∞.
(3) ρ(0) = 1 = ∫_{−π}^{π} f(ω) dω, i.e.
γ(0) = Var Xt = ∫_{−π}^{π} h(ω) dω,
where h(ω) = (Var Xt)f(ω); h is the non-normalised s.d.f. and f the normalised
s.d.f.
This is an ANALYSIS OF VARIANCE of the stationary T.S. {Xt}: h(ω)dω = amount
of “variation” of {Xt} over the frequency range (ω, ω + dω)
[i.e. the average squared amplitude of oscillations within the frequency range
(ω, ω + dω)].
(4) ρ(−τ) = ρ(τ)⇒ f(−ω) = f(ω) from (II).
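As a sketch of (II)–(4) in code (the MA(1) coefficient 0.5 is an illustrative choice of mine): for an MA(1) the sum in (II) has only the terms τ = −1, 0, 1, and one can check numerically that f integrates to ρ(0) = 1 and is even:

```python
import numpy as np

b1 = 0.5
rho1 = b1 / (1 + b1 ** 2)   # acrf of an MA(1); rho(tau) = 0 for |tau| > 1

def f(omega):
    # (II): f(w) = (1/2pi) * sum_tau rho(tau) e^{-i tau w}; only tau = -1, 0, 1
    # survive, and the two complex exponentials combine into 2*rho(1)*cos(w).
    return (1 + 2 * rho1 * np.cos(omega)) / (2 * np.pi)

# Midpoint rule over (-pi, pi) to approximate the integral in (3).
N = 100_000
w = -np.pi + (np.arange(N) + 0.5) * 2 * np.pi / N
area = f(w).sum() * 2 * np.pi / N
print(area)                        # ~ 1 = rho(0), property (3)
print(np.allclose(f(w), f(-w)))    # f(-w) = f(w), property (4)
```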
(5) An Example of Spectral Density Function
AR(1): Xt − λXt−1 = Yt, Yt ∼ WN(0, σ²)