
Identification of Static and Dynamic Models of Strategic Interactions

Lecture notes by Han Hong

Department of Economics, Stanford University

5th June 2007

Identifying Dynamic Discrete Decision Processes: two periods

• Thierry Magnac and David Thesmar (2002)

• Two-period model.

• i ∈ I = {1, . . . , K}.

• State variable h = (x, ε).

• ε = (ε_1, . . . , ε_K).

• Period-1 utility: u_i(x, ε).

• Next-period state variables h′ = (x′, ε′) drawn conditional on h = (x, ε).

Assumptions

• Additive separability:

  ∀i ∈ I, u_i(x, ε) = u*_i(x) + ε_i,

  where ε is independent of x.

• Conditional independence: the random preference shocks ε′ and ε in the two periods are independent of each other, and independent of x and d = i.

• Discrete support: the support of the first-period state variable x (resp. second-period x′) is X (resp. X′). The joint support X = X ∪ X′ is discrete and finite, i.e.,

  X = {x_1, . . . , x_#X}

• Transition matrix of the (x, ε) process:

  P(h′|h, d) = P(x′, ε′|x, d) = G(ε′) P(x′|x, d)

• Bellman equation:

  v_i(x, ε) = u*_i(x) + ε_i + β E[ max_j v_j(x′, ε′) | x, d = i ]

• Decompose

  v_i(x, ε) = v*_i(x) + ε_i

  where

  v*_i(x) = u*_i(x) + β E[ max_j (v*_j(x′) + ε′_j) | x, d = i ]

• What can be recovered from the data: ∀(d, d′) ∈ I², ∀(x, x′) ∈ X × X′,

  P(d′, x′, d|x) = P(d′|x′) P(x′|x, d) P(d|x)

• P(x′|x, d) is nonparametrically specified and is exactly identified from the data.

• Can P(d′|x′) and P(d|x) be used to identify the structural parameters

  b = (u*_1(X), . . . , u*_K(X), v*_1(X′), . . . , v*_K(X′), G, β),

  where f(X) is a short-cut for {f(x), ∀x ∈ X}? For example,

  u*_1(X) = {u*_1(x), ∀x ∈ X}.

• The probability that the agent chooses d = i given the structure b and the observable state variable x: ∀(x, i) ∈ X × I,

  p_i(x; b) = P( v*_i(x; b) + ε_i = max_j (v*_j(x; b) + ε_j) | x, b ).

• Definition of identification

• Observable vector of choice probabilities:

  p(x) = (p_1(x), . . . , p_K(x))

• Number of observable choice probabilities versus number of parameters that can be identified.

• Mapping between the observable choice probabilities and the vector of value functions:

  ∀(x, i) ∈ X × I, v*_i(x) = v*_K(x) + q_i(p(x); G)

  For example, with i.i.d. extreme value errors, q_i(p(x); G) = log p_i(x) − log p_K(x).

• There is no loss of generality in setting v*_K(X′) = 0. Then for i = 1, . . . , K − 1:

  v*_i(X′) = q_i(p(X′); G)

• There is also no loss of generality in setting u*_K(x) = 0.

• Expected second-period value function: for v*(x′) = (v*_1(x′), . . . , v*_K(x′)), define

  R(v*(x′); G) = E_G max_{i∈I} (v*_i(x′) + ε_i)

  Decompose the first-period total utility function

  v*_i(x) = u*_i(x) + β E[ R(v*(x′); G) | x, d = i ]

• Recall

  q_i(x) = v*_i(x) − v*_K(x)

  Now

  v*_i(x) = u*_i(x) + β E[ R(v*(x′); G) | x, d = i ]

  and since u*_K(x) = 0:

  v*_K(x) = β E[ R(v*(x′); G) | x, d = K ]

• Combine these relations:

  q_i(x) = v*_i(x) − v*_K(x)
         = u*_i(x) + β E[ R(v*(x′); G) | x, d = i ] − β E[ R(v*(x′); G) | x, d = K ]

• Therefore u*_i(x) can be recovered by

  u*_i(x) = q_i(x) − β E[ R(v*(x′); G) | x, d = i ] + β E[ R(v*(x′); G) | x, d = K ]

• So given β, G, u*_K(x) = 0, and v*_K(x′) = 0, the other utility functions u*_i(x), i = 1, . . . , K − 1, and the other second-period value functions v*_i(x′), i = 1, . . . , K − 1, can be identified.

• Estimation follows from identification.

• Exclusion restrictions might identify β: if ∃(x_1, x_2) ∈ X², ∃i ∈ I, such that x_1 ≠ x_2 and u*_i(x_1) = u*_i(x_2), but x_1 and x_2 still generate different transition probabilities P(x′|x, d), then q_i(x_1) should be different from q_i(x_2).

• Identifying β:

  q_i(x_1) − β E[ R(v*(x′); G) | x_1, d = i ] + β E[ R(v*(x′); G) | x_1, d = K ]
  − { q_i(x_2) − β E[ R(v*(x′); G) | x_2, d = i ] + β E[ R(v*(x′); G) | x_2, d = K ] } = 0.

• Parametric restriction: they said it can be used to identify G. Is this true?

Single agent dynamic discrete choice model: infinite horizon

• Players are forward looking.

• Infinite Horizon, Stationary, Markov Transition

• Now players maximize expected discounted utility using discount factor β.

  W_i(s, ε_i; σ) = max_{a_i∈A_i} { Π_i(a_i, s) + ε_i(a_i)
    + β ∫ Σ_{a_{−i}} W_i(s′, ε′_i; σ) g(s′|s, a_i, a_{−i}) σ_{−i}(a_{−i}|s) f(ε′_i) dε′_i }

• Definition: A Markov Perfect Equilibrium is a collection of δ_i(s, ε_i), i = 1, . . . , n, such that for all i, all s and all ε_i, δ_i(s, ε_i) maximizes W_i(s, ε_i; σ_i, σ_{−i}).

• Conditional independence:

• ε distributed i.i.d. over time.

• State variables evolve according to g(s′|s, a_i, a_{−i}).

• Define the choice-specific value function

  V_i(a_i, s) = Π_i(a_i, s) + β E[ V_i(s′) | s, a_i ].

• Players choose a_i to maximize V_i(a_i, s) + ε_i(a_i).

• Ex ante value function (social surplus function):

  V_i(s) = E_{ε_i} max_{a_i} [ V_i(a_i, s) + ε_i(a_i) ]
         = G( V_i(a_i, s), ∀a_i = 0, . . . , K )
         = G( V_i(a_i, s) − V_i(0, s), ∀a_i = 1, . . . , K ) + V_i(0, s)

• When the error terms are extreme value distributed,

  V_i(s) = log Σ_{k=0}^{K} exp(V_i(k, s))
         = log Σ_{k=0}^{K} exp(V_i(k, s) − V_i(0, s)) + V_i(0, s).

• Relationship between Π_i(a_i, s) and V_i(a_i, s):

  V_i(a_i, s) = Π_i(a_i, s) + β E[ G(V_i(a_i, s′), ∀a_i = 0, . . . , K) | s, a_i ]
              = Π_i(a_i, s) + β E[ G(V_i(k, s′) − V_i(0, s′), ∀k = 1, . . . , K) | s, a_i ] + β E[ V_i(0, s′) | s, a_i ]

• With extreme value distributed error terms,

  V_i(a_i, s) = Π_i(a_i, s) + β E[ log Σ_{k=0}^{K} exp(V_i(k, s′) − V_i(0, s′)) | s, a_i ] + β E[ V_i(0, s′) | s, a_i ]

• Hotz and Miller (1993): one-to-one mapping between σ_i(a_i|s) and differences in choice-specific value functions:

  (V_i(1, s) − V_i(0, s), . . . , V_i(K, s) − V_i(0, s)) = Ω_i(σ_i(0|s), . . . , σ_i(K|s))

• Example: i.i.d. extreme value f(ε_i):

  σ_i(a_i|s) = exp(V_i(a_i, s) − V_i(0, s)) / Σ_{k=0}^{K} exp(V_i(k, s) − V_i(0, s))

• Inverse mapping:

  log(σ_i(k|s)) − log(σ_i(0|s)) = V_i(k, s) − V_i(0, s)

• Since we can recover V_i(k, s) − V_i(0, s), we only need to know V_i(0, s) to recover V_i(k, s), ∀k.

• If we know V_i(0, s), the mapping between V_i(a_i, s) and Π_i(a_i, s) is one-to-one.

• Identify V_i(0, s) first. Set a_i = 0:

  V_i(0, s) = Π_i(0, s) + β E[ log Σ_{k=0}^{K} exp(V_i(k, s′) − V_i(0, s′)) | s, 0 ] + β E[ V_i(0, s′) | s, 0 ]

• This is a contraction mapping with a unique fixed point, which can be computed by iteration.

• Add V_i(0, s) to V_i(k, s) − V_i(0, s) to identify all V_i(k, s).

• Then all Π_i(k, s) are calculated from V_i(k, s) through

  Π_i(k, s) = V_i(k, s) − β E[ V_i(s′) | s, k ].
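As a concrete illustration of this recursion, here is a minimal numerical sketch for the logit case on a finite state space. It is my sketch rather than code from the notes: sigma (estimated choice probabilities) and trans (estimated transition matrices, one per action) are assumed inputs, and the Euler-constant term is omitted as in the formulas above.

```python
# Recover V_i(k,s) and Pi_i(k,s) from choice probabilities, assuming logit
# errors, a finite state space, known beta, and the normalization Pi(0,s)=0.
import numpy as np

def recover_payoffs(sigma, trans, beta, tol=1e-12):
    """sigma: (S, K+1) array with sigma[s, k] = P(a = k | s);
    trans: (K+1, S, S) array with trans[k][s, s2] = P(s2 | s, a = k)."""
    # Hotz-Miller inversion: V(k,s) - V(0,s) = log sigma(k|s) - log sigma(0|s)
    dV = np.log(sigma) - np.log(sigma[:, [0]])
    emax = np.log(np.exp(dV).sum(axis=1))   # log sum_k exp(V(k,s) - V(0,s))
    # Contraction for V(0,s): V(0,s) = beta * E[emax(s') + V(0,s') | s, a=0]
    V0 = np.zeros(sigma.shape[0])
    while True:
        V0_new = beta * trans[0] @ (emax + V0)
        if np.max(np.abs(V0_new - V0)) < tol:
            break
        V0 = V0_new
    V = dV + V0[:, None]                    # all choice-specific values V(k,s)
    Vbar = emax + V0                        # ex ante value: log sum_k exp(V(k,s))
    # Forward-compute flow payoffs: Pi(k,s) = V(k,s) - beta * E[Vbar(s') | s, k]
    Pi = np.stack([V[:, k] - beta * trans[k] @ Vbar
                   for k in range(sigma.shape[1])], axis=1)
    return V, Pi
```

A useful numerical check: at the fixed point the recovered Π_i(0, s) is zero by construction.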

• Why normalize Πi (0, s) = 0?

• Why not Vi (0, s) = 0?

• If a firm stays out of the market in period t, its current profit is 0, but the option value of future entry might depend on market size, the number of other firms, etc.

• These state variables might evolve stochastically.

• Rest of the identification arguments: identical to the static model.

• Nonparametric and Semiparametric Estimation

• Hotz-Miller inversion recovers V_i(k, s) − V_i(0, s) instead of Π_i(k, s) − Π_i(0, s).

• Nonparametrically compute V_i(0, s) using

  V_i(0, s) = β E[ log Σ_{k=0}^{K} exp(V_i(k, s′) − V_i(0, s′)) | s, 0 ] + β E[ V_i(0, s′) | s, 0 ]

• Obtain V_i(k, s) and forward compute Π_i(k, s).

• The rest is identical to the static model.

• In semiparametric models, θ converges at a T^{1/2} rate and has normal asymptotics.

• Apply the results of Newey (1994) to derive the appropriate “influence functions”.

• The asymptotic distribution is invariant to the choice of method used to estimate the first stage.

• With a proper weighting function (which needs to be estimated nonparametrically), one can achieve the same efficiency as full information maximum likelihood.

• These results hold for both static and dynamic models.

Discrete Games

• Dynamic and static discrete games.

• Private information assumption.

• No unobserved state variables.

• No distinction for identification purposes.

• Therefore it suffices to study static games.

• Results immediately translate into dynamic results.

Notation

• Players, i = 1, ..., n.

• Actions a_i ∈ {0, 1, . . . , K}.

• A = {0, 1, . . . , K}^n and a = (a_1, . . . , a_n).

• s_i ∈ S_i: state for player i.

• S = Π_i S_i and s = (s_1, . . . , s_n) ∈ S.

• s is common knowledge and also observed by the econometrician.

• For each agent i, there are K + 1 state variables ε_i(a_i).

• ε_i(a_i): private information of each agent.

• ε_i = (ε_i(0), . . . , ε_i(K)).

• Density f(ε_i), i.i.d. across i = 1, . . . , n.

• Period utility for player i with action profile a:

  u_i(a, s, ε_i; θ) = Π_i(a_i, a_{−i}, s; θ) + ε_i(a_i)

• Example: the period profit of firm i for entering the market.

• Generalizes a standard discrete choice model.

• Agents act in isolation in standard discrete choice models.

• Unlike a standard discrete choice model, a−i enters utility.

• Player i's decision rule is a function a_i = δ_i(s, ε_i).

• Note that ε_{−i} does not enter.

• ε_{−i} is private information of the other players.

• Conditional choice probability σ_i(a_i|s) for player i:

  σ_i(a_i = k|s) = ∫ 1{δ_i(s, ε_i) = k} f(ε_i) dε_i.

• The choice probability is conditional on s, the public information.

• Choice-specific expected payoff for player i:

  Π_i(a_i, s; θ) = Σ_{a_{−i}} Π_i(a_i, a_{−i}, s; θ) σ_{−i}(a_{−i}|s).

• Expected utility from choosing a_i, excluding the preference shock.

• The optimal action for player i satisfies:

  σ_i(a_i|s) = Prob{ ε_i : Π_i(a_i, s; θ) + ε_i(a_i) > Π_i(a′_i, s; θ) + ε_i(a′_i) for all a′_i ≠ a_i }.

• Π_i(a_i, a_{−i}, s; θ) is often a linear function, e.g.:

  Π_i(a_i, a_{−i}, s) = s′β + δ Σ_{j≠i} 1{a_j = 1}   if a_i = 1,
                      = 0                             if a_i = 0

• Mean utility from not entering is normalized to zero.

• δ measures the influence of j's entry choice on i's profit.

• If firms compete with each other: δ < 0.

• β measures the impact of the state variables on profits.

• ε_i(a_i) capture shocks to the profitability of entry.

• Often ε_i(a_i) are assumed to be i.i.d. extreme value distributed:

  f(ε_i(k)) = e^{−ε_i(k)} e^{−e^{−ε_i(k)}}.

Nonparametric Identification

A1 Assume that the error terms ε_i(a_i) are distributed i.i.d. across actions a_i and agents i, and come from a known parametric family.

• Not possible to allow nonparametric mean utility and error terms at once, even in simple single-agent problems (e.g. a probit).

• In Bajari, Hong and Ryan (2005), even a single-agent model is not identified without an independence assumption.

• It is well known that Π_i(0, s) is not identified.

• σ_i(a_i|s) is a function only of Π_i(a_i, s) − Π_i(0, s).

• Suppose ε_i(a_i) is extreme value; then

  σ_i(a_i|s) = exp(Π_i(a_i, s) − Π_i(0, s)) / Σ_{k=0}^{K} exp(Π_i(k, s) − Π_i(0, s))

A2 For all i and all a_{−i} and s, Π_i(a_i = 0, a_{−i}, s) = 0.

• One can only learn choice-specific value functions up to a first difference, so a normalization is needed.

• Similar to the “outside good” assumption in a single-agent model.

• Entry: the utility from not entering is normalized to zero.

• Hotz and Miller (1993) inversion, for any k, k′:

  log(σ_i(k|s)) − log(σ_i(k′|s)) = Π_i(k, s) − Π_i(k′, s).

• More generally, let Γ_i : {0, . . . , K} × S → [0, 1]:

  (σ_i(0|s), . . . , σ_i(K|s)) = Γ_i(Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s))

• And the inverse Γ_i^{−1}:

  (Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s)) = Γ_i^{−1}(σ_i(0|s), . . . , σ_i(K|s))

• Invert equilibrium choice probabilities to nonparametrically recover Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s).

• Π_i(a_i, s) is known from our inversion, and the probabilities σ_i can be observed by the econometrician.

• Next step: how to recover Πi (ai , a−i , s) from Πi (ai , s).

• Requires inversion of the following system:

  Π_i(a_i, s) = Σ_{a_{−i}} σ_{−i}(a_{−i}|s) Π_i(a_i, a_{−i}, s),   ∀i = 1, . . . , n, a_i = 1, . . . , K.

• Given s, there are n × K × (K + 1)^{n−1} unknown utilities across all agents.

• Only n × K known expected utilities.

• Obvious solution: impose exclusion restrictions.

• Partition s = (s_i, s_{−i}), and suppose

  Π_i(a_i, a_{−i}, s) = Π_i(a_i, a_{−i}, s_i)

  depends only on the subvector s_i. Then

  Π_i(a_i, s_{−i}, s_i) = Σ_{a_{−i}} σ_{−i}(a_{−i}|s_{−i}, s_i) Π_i(a_i, a_{−i}, s_i).

• Identification: given each s_i, the second-moment matrix of the “regressors” σ_{−i}(a_{−i}|s_{−i}, s_i),

  E[ σ_{−i}(a_{−i}|s_{−i}, s_i) σ_{−i}(a_{−i}|s_{−i}, s_i)′ ],

  is nonsingular.

• Needs at least (K + 1)^{n−1} points in the support of the conditional distribution of s_{−i} given s_i; see the least-squares sketch below.
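As an illustration of this inversion, here is a minimal least-squares sketch. It is not from the notes: the input names (Pi_bar, sig) and the stacking of opponents' action profiles into J = (K + 1)^{n−1} columns are my assumptions.

```python
# Recover Pi_i(a_i, a_{-i}, s_i) from choice-specific expected payoffs under
# the exclusion restriction, for one fixed (a_i, s_i). Assumed inputs:
#   Pi_bar: (M,) expected payoffs Pi_i(a_i, s_{-i}^m, s_i) at M support
#           points of s_{-i} given s_i
#   sig:    (M, J) opponents' joint choice probabilities, one column per
#           action profile a_{-i}, J = (K + 1)**(n - 1)
import numpy as np

def recover_structural_payoffs(Pi_bar, sig):
    # Pi_bar = sig @ pi, so pi is identified when the second-moment matrix
    # of sig is nonsingular, requiring M >= J support points of s_{-i}.
    pi, *_ = np.linalg.lstsq(sig, Pi_bar, rcond=None)
    return pi  # pi[j] = Pi_i(a_i, a_{-i} = j-th profile, s_i)
```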

• Nonparametric estimation

• Semiparametric estimation

• Linear probability model

• Fixed-effect panel data

• Multiple equilibria computation.

• Unobserved heterogeneity

• Consider the single-agent dynamic model first.

• Random coefficient model:

  Dan Ackerberg, “A New Use of Importance Sampling to Reduce Computational Burden in Simulation Estimation,” working paper, 2001.

• Bayesian methods:

  Imai et al., “Bayesian Estimation of Dynamic Discrete Choice Models,” working paper, 2005.

  Andriy Norets, “Dynamic Discrete Choice Models with Serially Correlated Unobservables,” working paper, 2006.

Ackerberg 2001

• Static utility: x′_i u + ε_i, with ε_i extreme value.

• Forward-looking agent with discounting.

• Random coefficient: u ∼ g(·|x_i, θ).

• Moments to match:

  (1/n) Σ_{i=1}^{n} [ 1(y_i = a) − P(y_i = a|x_i, θ) ] t(x_i).

• Conditional choice probabilities:

  P(y_i = a|x_i, θ) = ∫ [ e^{V(a,x_i,u)} / Σ_{a′∈A} e^{V(a′,x_i,u)} ] g(u|x_i, θ) du

                    = ∫ [ e^{V(a,x_i,u)} / Σ_{a′∈A} e^{V(a′,x_i,u)} ] [ g(u|x_i, θ) / q(u|x_i) ] q(u|x_i) du

• Using S draws u_s, s = 1, . . . , S, from q(u|x_i), the conditional choice probability can be simulated as:

  P(y_i = a|x_i, θ) = (1/S) Σ_{s=1}^{S} [ e^{V(a,x_i,u_{is})} / Σ_{a′∈A} e^{V(a′,x_i,u_{is})} ] [ g(u_{is}|x_i, θ) / q(u_{is}|x_i) ]

• Simulated moment conditions:

  (1/n) Σ_{i=1}^{n} [ 1(y_i = a) − P(y_i = a|x_i, θ) ] t(x_i),

  which is

  (1/n) Σ_{i=1}^{n} [ 1(y_i = a) − (1/S) Σ_{s=1}^{S} ( e^{V(a,x_i,u_{is})} / Σ_{a′∈A} e^{V(a′,x_i,u_{is})} ) ( g(u_{is}|x_i, θ) / q(u_{is}|x_i) ) ] t(x_i).

• Separation of simulation and value function computation from the estimation step.

• Draw u_{is}, i = 1, . . . , n, s = 1, . . . , S, before estimation starts.

• Compute all V(a, x_i, u_{is}) for all i = 1, . . . , n and s = 1, . . . , S beforehand.

• No need to recompute V(a, x_i, u_{is}) when estimating θ.

• θ only reweights the density ratio

  g(u_{is}|x_i, θ) / q(u_{is}|x_i)

  during the estimation optimization.

• The same logic applies to simulated MLE and to Bayesian analysis; see the sketch below.
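A minimal self-contained sketch of this separation, with made-up data and a placeholder payoff standing in for the dynamic value function (in the real application V(a, x_i, u_{is}) is the expensive DP solution computed once up front); the densities N(θ, 1) for g and N(0, 4) for q are my assumptions.

```python
# A toy version of the reuse trick: draws and "value functions" are computed
# once; changing theta only changes the importance sampling weights.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, S = 500, 200
x = rng.uniform(0.5, 1.5, size=n)           # observed covariates x_i
u = rng.normal(0.0, 2.0, size=(n, S))       # u_{is} ~ q = N(0, 2^2), drawn once

# Placeholder for V(a, x_i, u_{is}): here V(1, x, u) = x * u and V(0, ., .) = 0.
V1 = x[:, None] * u
ccp = np.exp(V1) / (1.0 + np.exp(V1))       # logit choice probability per draw

def sim_choice_prob(theta):
    """Simulated P(y_i = 1 | x_i, theta), with g(u | x, theta) = N(theta, 1)."""
    w = norm.pdf(u, loc=theta, scale=1.0) / norm.pdf(u, loc=0.0, scale=2.0)
    return (ccp * w).mean(axis=1)           # V is never recomputed as theta moves

print(sim_choice_prob(0.5)[:3])             # cheap to re-evaluate at many thetas
```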

Norets 2006

• Bayesian method for serially correlated unobserved state variables.

• s_t = (y_t, x_t); x_t is observed, y_t is not observed.

• Use a Gibbs sampler:

• Given θ, d_t, x_t, draw y_t.

• Given d_t, x_t, y_t, draw θ.

• Joint likelihood:

  π(θ) p(y_{T,i}, x_{T,i}, d_{T,i}, . . . , y_{1,i}, x_{1,i}, d_{1,i}|θ)
  = π(θ) Π_{t=1}^{T} p(d_{t,i}|y_{t,i}, x_{t,i}, θ) f(x_{t,i}, y_{t,i}|x_{t−1,i}, y_{t−1,i}, d_{t−1,i}; θ).

• The conditional choice probability can be an indicator:

  p(d_{i,t}|y_{i,t}, x_{t,i}, θ) = 1( V(y_{t,i}, x_{t,i}, d_{t,i}; θ) ≥ V(y_{t,i}, x_{t,i}, d; θ), ∀d ∈ D ).

• Break the unobservables into serially independent and serially dependent components: y_t = (ν_t, ε_t),

  f(x_{t+1}, ν_{t+1}, ε_{t+1}|x_t, ν_t, ε_t, d; θ) = p(ν_{t+1}|x_{t+1}, ε_{t+1}; θ) p(x_{t+1}, ε_{t+1}|x_t, ε_t, d; θ).

• The joint likelihood becomes

  π(θ) Π_{i,t} p(d_{t,i}|V_{t,i}) p(V_{i,t}|x_{i,t}, ε_{i,t}; θ) p(x_{t,i}, ε_{t,i}|x_{t−1,i}, ε_{t−1,i}, d_{t−1,i}; θ)

• V_{i,t} can be drawn “analytically” conditional on (x_{i,t}, ε_{i,t}; θ), subject to the constraints specified by the p(d_{t,i}|V_{t,i}) indicators.

• θ and ε_{i,t} are drawn using Metropolis-Hastings steps.

• “Analytic” drawing of V_{t,i} = V_{t,d,i} = V(s_{t,i}, d; θ), d ∈ D, where s = (x, ε, ν), requires value function updating.

• For example:

  u(s_{t,i}, d; θ) = u(x_{t,i}, d; θ) + ν_{t,d,i} + ε_{t,d,i}.

• Then

  V(s_{t,i}, d; θ) = u(x_{t,i}, d; θ) + ν_{t,d,i} + ε_{t,d,i} + β E[ V(s_{t+1}; θ) | ε_{t,i}, x_{t,i}, d; θ ].

• At every step θ^m, the expected value function

  E[ V(s_{t+1}; θ^m) | ε^m_{t,i}, x^m_{t,i}, d; θ^m ]

  is updated by averaging over the near history of θ draws on the MCMC chain and over the importance sampling draws of the ε's.

• How to update the approximate value function V^m(s^{m,j}; θ^m):

  V^m(s; θ^m) = max_{d∈D} { u(s, d; θ^m) + β E^{(m)}[ V(s′; θ^m) | s, d; θ^m ] }.

• At each iteration m, draw random states s^{m,j}, j = 1, . . . , N(m), from an i.i.d. density g(·) > 0.

• At each iteration m, only keep track of the history of length N(m):

  { θ^k; s^{k,j}, V^k(s^{k,j}; θ^k), j = 1, . . . , N(k) }_{k=m−N(m)}^{m−1}.

• In this history, find the N(m) parameter draws θ^{k_i}, i = 1, . . . , N(m), closest to the current θ.

• Only the value functions in the importance sampling that correspond to these nearest neighbors are used in the approximation by averaging.

• Update the value function as:

  E^{(m)}[ V(s′; θ) | s, d; θ ]
  = Σ_{i=1}^{N(m)} Σ_{j=1}^{N(k_i)} V^{k_i}(s^{k_i,j}; θ^{k_i}) [ f(s^{k_i,j}|s, d; θ) / g(s^{k_i,j}) ] / [ Σ_{r=1}^{N(m)} Σ_{q=1}^{N(k_r)} f(s^{k_r,q}|s, d; θ) / g(s^{k_r,q}) ]
  = Σ_{i=1}^{N(m)} Σ_{j=1}^{N(k_i)} V^{k_i}(s^{k_i,j}; θ^{k_i}) W_{k_i,j,m}(s, d, θ)

• The weights simplify with i.i.d. known unobservable components.

• The expected max value function can be integrated out with extreme value errors. A sketch of the averaging step follows.
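A minimal sketch of the averaging step, under simplifying assumptions of mine (a scalar state, caller-supplied transition density f_pdf and proposal density g_pdf, and a history list storing past (θ^k, s^{k,j}, V^k) triples):

```python
# Nearest-neighbor importance-sampling average for E[V(s') | s, d; theta].
import numpy as np

def expected_value_function(s, d, theta, history, f_pdf, g_pdf, n_nearest):
    """history: list of (theta_k, s_draws_k, V_vals_k) stored along the chain,
    where s_draws_k ~ g and V_vals_k[j] = V^k(s_draws_k[j]; theta_k);
    f_pdf(s_next, s, d, theta): transition density; g_pdf: proposal density."""
    # keep the stored iterations whose theta_k is closest to the current theta
    nearest = sorted(history, key=lambda h: abs(h[0] - theta))[:n_nearest]
    s_all = np.concatenate([h[1] for h in nearest])
    V_all = np.concatenate([h[2] for h in nearest])
    # self-normalized importance sampling weights f(s'|s,d;theta) / g(s')
    w = f_pdf(s_all, s, d, theta) / g_pdf(s_all)
    return float(np.sum(V_all * w) / np.sum(w))
```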

Particle filtering

• Used in dynamic macro models by Fernández-Villaverde and Rubio-Ramírez.

• Particle filtering is an importance sampling method.

• x is the latent state, y is the observable random variable.

• We are interested in the posterior distribution of x given y .

• Want to compute E(t(X)|y):

  ∫ t(x) p(x|y) dx = ∫ t(x) [ p(y|x) p(x) / p(y) ] dx

                   = [ ∫ t(x) ( p(y|x) p(x) / g(x) ) g(x) dx ] / [ ∫ ( p(y|x) p(x) / g(x) ) g(x) dx ]

• If we have r = 1, . . . , R draws from the density g(x), then we can approximate

  E(t(X)|y) ≈ [ (1/R) Σ_{r=1}^{R} t(x_r) w(x_r|y) ] / [ (1/R) Σ_{r=1}^{R} w(x_r|y) ],

  where

  w(x_r|y) = p(y|x_r) p(x_r) / g(x_r).

• Or one can write

  E(t(X)|y) ≈ Σ_{r=1}^{R} t(x_r) w̄(x_r|y)

  where

  w̄(x_r|y) = w(x_r|y) / Σ_{r=1}^{R} w(x_r|y).

• In other words, given R draws from g(x), t(X) is integrated against a discrete distribution that places weight w̄(x_r|y) on each of the r = 1, . . . , R points of the discrete support.

• Alternatively, one can compute

  E(t(X)|y) ≈ (1/R) Σ_{r=1}^{R} t(x̃_r),

  where x̃_r, r = 1, . . . , R, are R draws from the weighted empirical distribution on the x_r, r = 1, . . . , R, with weights w̄(x_r|y).

• When g(x) = p(x), w(x|y) = p(y|x), and

  w̄(x_r|y) = p(y|x_r) / Σ_{r=1}^{R} p(y|x_r).
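A tiny self-contained check of the two estimators above, in a made-up Gaussian example where the answer is known: with prior p(x) = N(0, 1) and likelihood p(y|x) = N(y; x, 1), the posterior mean at y = 1 is 0.5.

```python
# Self-normalized importance sampling with g = p: weights are p(y|x_r).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
R, y = 100_000, 1.0
x = rng.normal(size=R)                    # draws from g = p, the N(0,1) prior
w = norm.pdf(y, loc=x, scale=1.0)         # w(x_r|y) = p(y|x_r) when g = p
w_bar = w / w.sum()                       # normalized weights
est_weighted = np.sum(x * w_bar)                       # weighted-average estimator
est_resample = rng.choice(x, size=R, p=w_bar).mean()   # resampling estimator
print(est_weighted, est_resample)         # both close to E[X|y=1] = 0.5
```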

• In a Bayesian setup, the coefficients θ are the latent state variables x. p(x) is then the prior distribution of θ.

• A random coefficient model is just like a hierarchical Bayesian model where the µ are the hyper-parameters of the prior distribution, and the hyper-parameters are to be estimated. The prior for θ can be made dependent on both µ and covariates (state variables) s.

• Computing the weights p(y|θ) involves value function iteration and is computationally difficult.

• If we use a GMM objective function instead of the likelihood to form p(y|θ), there might be some computational savings because only one simulation r is needed for each observation.

• A more interesting case is when there are dynamic unobservable state variables.

• Suppose s = (x, v), such that x is observed but v is not observed.

• We might be interested in the posterior distribution of the entire sample path of v, in addition to those of θ, the “parameters”.

• Particle filtering can be used in this case.

• How can computation be maximally sped up?

• Do particles lend themselves to parallel processing?

  p(v_{0:t}|y_{1:t}, x_{0:t}) = p(v_{0:t}, y_{1:t}, x_{0:t}) / p(y_{1:t}, x_{0:t})

  = [ p(v_{0:t−1}, y_{1:t−1}, x_{0:t−1}) p(v_t, y_t, x_t|v_{0:t−1}, y_{1:t−1}, x_{0:t−1}) ] / [ p(y_{1:t−1}, x_{0:t−1}) p(y_t, x_t|y_{1:t−1}, x_{0:t−1}) ]

  = p(v_{0:t−1}|y_{1:t−1}, x_{0:t−1}) p(v_t, y_t, x_t|v_{0:t−1}, y_{1:t−1}, x_{0:t−1}) / p(y_t, x_t|y_{1:t−1}, x_{0:t−1})

  = p(v_{0:t−1}|y_{1:t−1}, x_{0:t−1}) p(y_t|x_t, v_t) p(v_t, x_t|v_{0:t−1}, y_{1:t−1}, x_{0:t−1}) / p(y_t, x_t|y_{1:t−1}, x_{0:t−1})

  = p(v_{0:t−1}|y_{1:t−1}, x_{0:t−1}) p(y_t|x_t, v_t) p(v_t, x_t|v_{t−1}, y_{t−1}, x_{t−1}) / p(y_t, x_t|y_{1:t−1}, x_{0:t−1})

  The fourth equality follows from the conditional independence assumption and the Markovian structure, and the fifth equality follows from the Markovian structure of the state variable transition.

• Therefore we only need to make the following small modifications to the filtering algorithm:

• First, for each particle in period t − 1, with its associated value of v_{t−1}, simulate from

  p(v_t|x_t, v_{t−1}, y_{t−1}, x_{t−1}) = p(v_t, x_t|v_{t−1}, y_{t−1}, x_{t−1}) / p(x_t|v_{t−1}, y_{t−1}, x_{t−1}).

• If v_t and x_t are conditionally independent given v_{t−1}, y_{t−1}, x_{t−1}, then we can directly simulate from

  p(v_t|v_{t−1}, y_{t−1}, x_{t−1}).

• Then reweight by p(y_t|x_t, v_t).

• The weights are not easy to compute because they involve either value function iteration or backward recursion.

• In summary, the recursive particle filtering method has three components:

  p(v_{0:t}|x_{1:t}, y_{1:t}) ∝ p(v_{0:t−1}|y_{1:t−1}, x_{0:t−1}) p(v_t|x_t, v_{t−1}, y_{t−1}, x_{t−1}) p(y_t|x_t, v_t)

• The first part, p(v_{0:t−1}|y_{1:t−1}, x_{0:t−1}), defines the recursion.

• The second part, p(v_t|x_t, v_{t−1}, y_{t−1}, x_{t−1}), defines the importance sampling density.

• The third part, p(y_t|x_t, v_t), defines the weights on the particles; a sketch follows below.
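A minimal sketch of these three steps, with made-up AR(1) dynamics for v_t and a logit measurement p(y_t|x_t, v_t) so the weights are available in closed form (in the dynamic discrete choice application they would instead come from value function iteration or backward recursion):

```python
# Bootstrap particle filter for a latent AR(1) state with binary observations.
import numpy as np

rng = np.random.default_rng(0)
R, T, rho = 2000, 50, 0.9

# Simulate fake data: latent AR(1) v_t, exogenous x_t, binary choice y_t.
v = np.zeros(T); x = rng.normal(size=T); y = np.zeros(T, dtype=int)
for t in range(1, T):
    v[t] = rho * v[t - 1] + rng.normal()
    y[t] = rng.random() < 1.0 / (1.0 + np.exp(-(x[t] + v[t])))

particles = rng.normal(size=R)              # draws of v_0
for t in range(1, T):
    # (2) importance sampling density: propagate v_t | v_{t-1} via the transition
    particles = rho * particles + rng.normal(size=R)
    # (3) weights: p(y_t | x_t, v_t), here a logit probability
    p1 = 1.0 / (1.0 + np.exp(-(x[t] + particles)))
    w = p1 if y[t] == 1 else 1.0 - p1
    # (1) recursion: resample to carry p(v_t | y_{1:t}, x_{1:t}) forward
    particles = rng.choice(particles, size=R, p=w / w.sum())

print("filtered mean of v_T:", particles.mean(), " true v_T:", v[-1])
```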

• Relation to the macro model of Jesús Fernández-Villaverde and Juan Rubio-Ramírez.

• Their paper: “Estimating Macroeconomic Models: A Likelihood Approach,” forthcoming, Review of Economic Studies.

• Basically, very similar.

• They also have feedback from y_t to the latent state variables x_{t+1}, v_{t+1}:

  p(v_t, x_t|v_{0:t−1}, y_{1:t−1}, x_{0:t−1}) = p(v_t, x_t|v_{t−1}, y_{t−1}, x_{t−1}),

  unlike stochastic volatility models, where the transitions of v_t, x_t are autonomous.

• They introduce this dependence by allowing for singularity in the measurement equation:

  Y_t = g(S_t, V_t; γ)

• The states in their transition equation are not necessarily all latent:

  S_t = f(S_{t−1}, W_t; γ)

• Feedback from Y_t to the latent states is allowed when Y_t is a subcomponent of S_t and is directly observed, such that that particular component of g(·) has no noise V_t.

• In their Assumption 2, they assume that both Y and V are continuously distributed and that g(·) (and f(·) as well) is invertible.

• Then the conditional density of Y_t (which basically is used to calculate the weights) can be determined from the density of V_t and the Jacobian of the transformation.

• They do mention, though, in one brief sentence, that this assumption can possibly be relaxed as long as the weights can be computed.

• Our discrete Y_t model falls into this extension, where there is no invertibility.

• The weights can still be computed through the logit probability form and the value function iterations (or backward recursion).

• Calculating value functions by backward induction:

• Assume a binary choice model.

• At time T:

  V_T(s) = log[ exp(Π(s, 1)) + exp(Π(s, 0)) ]

• Suppose V_{t+1}(s) is known; then at time t:

  V_t(s, 1) = Π(s, 1) + β E[ V_{t+1}(s′) | s, 1 ]
  V_t(s, 0) = Π(s, 0) + β E[ V_{t+1}(s′) | s, 0 ]
  V_t(s) = log[ exp(V_t(s, 1)) + exp(V_t(s, 0)) ].
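A minimal sketch of this backward recursion on a finite state space; the flow payoffs Pi[s, a] and transition matrices trans[a] are assumed inputs, and the Euler-constant term is again omitted as in the formulas above.

```python
# Backward induction for a binary-choice logit model with T periods.
import numpy as np

def backward_induction(Pi, trans, beta, T):
    """Pi: (S, 2) flow payoffs; trans: (2, S, S) transitions P(s'|s, a)."""
    V = np.log(np.exp(Pi[:, 1]) + np.exp(Pi[:, 0]))     # terminal V_T(s)
    values = [V]
    for t in range(T - 1, 0, -1):                       # roll back to t = 1
        V1 = Pi[:, 1] + beta * trans[1] @ V             # V_t(s, 1)
        V0 = Pi[:, 0] + beta * trans[0] @ V             # V_t(s, 0)
        V = np.log(np.exp(V1) + np.exp(V0))             # V_t(s)
        values.append(V)
    return values[::-1]                                 # [V_1, ..., V_T]
```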
