PROBABILISTIC ANALYSIS OF MFGS III. MASTER EQUATIONS, GAMES WITH
COMMON NOISE, AND WITH MAJOR AND MINOR PLAYERS
René Carmona
Department of Operations Research & Financial Engineering, PACM
Princeton University
Minerva Lectures, Columbia University, October 27, 2016
RECALL THE JOINT CHAIN RULE
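- If $u$ is smooth
- If $d\xi_t = \eta_t\,dt + \gamma_t\,dW_t$
- If $dX_t = b_t\,dt + \sigma_t\,dW_t$ and $\mu_t = \mathbb{P}_{X_t}$

then

\[
\begin{aligned}
u(t,\xi_t,\mu_t) = {} & u(0,\xi_0,\mu_0) + \int_0^t \partial_x u(s,\xi_s,\mu_s)\cdot(\gamma_s\,dW_s) \\
& + \int_0^t \Big(\partial_t u(s,\xi_s,\mu_s) + \partial_x u(s,\xi_s,\mu_s)\cdot\eta_s + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}u(s,\xi_s,\mu_s)\,\gamma_s\gamma_s^\dagger\big]\Big)\,ds \\
& + \int_0^t \tilde{\mathbb{E}}\big[\partial_\mu u(s,\xi_s,\mu_s)(\tilde X_s)\cdot\tilde b_s\big]\,ds \\
& + \tfrac12\int_0^t \tilde{\mathbb{E}}\Big[\mathrm{trace}\big(\partial_v[\partial_\mu u(s,\xi_s,\mu_s)](\tilde X_s)\,\tilde\sigma_s\tilde\sigma_s^\dagger\big)\Big]\,ds
\end{aligned}
\]

where the process $(\tilde X_t,\tilde b_t,\tilde\sigma_t)_{0\le t\le T}$ is an independent copy of the process $(X_t,b_t,\sigma_t)_{0\le t\le T}$, defined on a different probability space $(\tilde\Omega,\tilde{\mathcal F},\tilde{\mathbb P})$.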
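A quick sanity check of the measure-derivative terms, as a minimal sketch under assumptions that are not in the slides: take $u(\mu) = \int x^2\,d\mu(x)$ (no dependence on $t$ or $\xi$), constant coefficients $b$ and $\sigma$, and a Gaussian initial law. Then $\partial_\mu u(\mu)(v) = 2v$ and $\partial_v\partial_\mu u(\mu)(v) = 2$, so the chain rule predicts $\frac{d}{dt}u(\mu_t) = 2b\,E[X_t] + \sigma^2$, which can be compared with a finite difference of the exact second moment.

```python
def second_moment(t, m0, s0, b, sigma):
    """Exact E[X_t^2] when X_0 ~ N(m0, s0^2) and dX_t = b dt + sigma dW_t."""
    mean = m0 + b * t
    var = s0 ** 2 + sigma ** 2 * t
    return mean ** 2 + var


def chain_rule_rate(t, m0, b, sigma):
    """Drift predicted by the chain rule for u(mu) = int x^2 dmu(x).

    d_mu u(mu)(v) = 2 v and d_v d_mu u(mu)(v) = 2, hence
    E[2 X_t b] + 0.5 * E[2 sigma^2] = 2 b E[X_t] + sigma^2.
    """
    return 2.0 * b * (m0 + b * t) + sigma ** 2


# Illustrative parameter values (assumptions, not from the slides).
m0, s0, b, sigma, t, h = 0.5, 1.0, 0.3, 0.8, 0.7, 1e-5

# Central finite difference of the exact second moment.
finite_diff = (second_moment(t + h, m0, s0, b, sigma)
               - second_moment(t - h, m0, s0, b, sigma)) / (2 * h)
predicted = chain_rule_rate(t, m0, b, sigma)
print(finite_diff, predicted)
```

The two printed numbers should agree up to rounding, since the second moment is a quadratic polynomial in $t$ in this toy case.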
DERIVING THE MASTER EQUATION
- If we assume a form of Dynamic Programming Principle
- If $(t,x,\mu)\mapsto U(t,x,\mu) = V^\mu(t,x)$ is the master field
- We expect
\[
\Big(U(t,X_t,\mu_t) + \int_0^t f\big(s,X_s,\mu_s,\hat\alpha(s,X_s,\mu_s,Y_s)\big)\,ds\Big)_{0\le t\le T}
\]
to be a martingale whenever $(X_t,Y_t,Z_t)_{0\le t\le T}$ is the solution of the FBSDE characterizing the optimal path under $(\mu_t)_{0\le t\le T}$.
- So compute its Itô differential and set the drift to 0
AN EXAMPLE OF DERIVATION
\[ dX_t = b(t,X_t,\mu_t,\alpha_t)\,dt + dW_t \]
\[ H(t,x,\mu,y,\alpha) = b(t,x,\mu,\alpha)\cdot y + f(t,x,\mu,\alpha) \]
\[ \hat\alpha(t,x,\mu,y) = \arg\inf_{\alpha} H(t,x,\mu,y,\alpha) \]
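To make the minimizer of the Hamiltonian concrete, here is a small sketch under an assumed specification that the slides do not fix: scalar control, $b = \alpha$ and $f = \tfrac12\alpha^2$, so that $H = \alpha y + \tfrac12\alpha^2$ and the minimizer is $-y$. The helper names `hamiltonian`, `argmin_golden` and `alpha_hat` are hypothetical.

```python
def hamiltonian(y, a):
    """Reduced Hamiltonian H = b*y + f under the assumed b = a, f = a^2 / 2."""
    return a * y + 0.5 * a * a


def argmin_golden(func, lo, hi, tol=1e-8):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            # Minimizer lies in [lo, d]; reuse c as the new right probe.
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            # Minimizer lies in [c, hi]; reuse d as the new left probe.
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


def alpha_hat(y):
    """Numerical minimizer of a -> H(y, a); analytically equal to -y here."""
    return argmin_golden(lambda a: hamiltonian(y, a), -10.0, 10.0)


print(alpha_hat(1.5))  # close to -1.5
```

In this quadratic case the search is unnecessary, but the same pattern applies when the first-order condition has no closed form.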
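Itô's Formula with $\mu_t = \mathbb{P}_{X_t}$
(set $\hat\alpha_t = \hat\alpha(t,X_t,\mu_t,\partial_x U(t,X_t,\mu_t))$ and $b_t = b(t,X_t,\mu_t,\hat\alpha_t)$)

\[
\begin{aligned}
d\Big[U(t,X_t,\mu_t) + \int_0^t f(s,X_s,\mu_s,\hat\alpha_s)\,ds\Big] = {} & \Big(\partial_t U(t,X_t,\mu_t) + b_t\cdot\partial_x U(t,X_t,\mu_t) + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}U(t,X_t,\mu_t)\big] + f(t,X_t,\mu_t,\hat\alpha_t)\Big)\,dt \\
& + \tilde{\mathbb{E}}\Big[\tilde b_t\cdot\partial_\mu U(t,X_t,\mu_t)(\tilde X_t) + \tfrac12\,\mathrm{trace}\big(\partial_v\partial_\mu U(t,X_t,\mu_t)(\tilde X_t)\big)\Big]\,dt \\
& + \partial_x U(t,X_t,\mu_t)\cdot dW_t
\end{aligned}
\]

where $(\tilde X_t,\tilde b_t)$ is an independent copy of $(X_t,b_t)$.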
THE ACTUAL MASTER EQUATION
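\[
\begin{aligned}
& \partial_t U(t,x,\mu) + b\big(t,x,\mu,\hat\alpha(t,x,\mu,\partial_x U(t,x,\mu))\big)\cdot\partial_x U(t,x,\mu) \\
& \quad + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}U(t,x,\mu)\big] + f\big(t,x,\mu,\hat\alpha(t,x,\mu,\partial_x U(t,x,\mu))\big) \\
& \quad + \int_{\mathbb{R}^d}\Big[b\big(t,x',\mu,\hat\alpha(t,x',\mu,\partial_x U(t,x',\mu))\big)\cdot\partial_\mu U(t,x,\mu)(x') + \tfrac12\,\mathrm{trace}\big(\partial_v\partial_\mu U(t,x,\mu)(x')\big)\Big]\,d\mu(x') = 0,
\end{aligned}
\]

for $(t,x,\mu)\in[0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)$, with the terminal condition $U(T,x,\mu) = g(x,\mu)$.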
MEAN FIELD GAMES WITH A COMMON NOISE
Starting with a finite player game, i.e.
Simultaneous Minimization of
\[
J^i(\boldsymbol\alpha) = \mathbb{E}\Big[\int_0^T f(t,X^i_t,\mu^N_t,\alpha^i_t)\,dt + g(X^i_T,\mu^N_T)\Big],\qquad i = 1,\dots,N
\]
under constraints (dynamics of players' private states)
\[
dX^i_t = b(t,X^i_t,\mu^N_t,\alpha^i_t)\,dt + \sigma(t,X^i_t,\mu^N_t,\alpha^i_t)\,dW^i_t + \sigma^0(t,X^i_t,\mu^N_t,\alpha^i_t)\,dW^0_t
\]
for independent Wiener processes $W^k = (W^k_t)_{0\le t\le T}$, $k = 0,1,\dots,N$.
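The $N$-player dynamics can be simulated with an Euler scheme. The sketch below is illustrative only and rests on assumptions that are not in the slides: scalar states, zero controls, drift $b = -(x - \bar\mu^N)$ where $\bar\mu^N$ is the mean of the empirical measure, and constant volatilities `sigma` (idiosyncratic) and `sigma0` (common). The function name `simulate_players` is hypothetical.

```python
import random


def simulate_players(n_steps, horizon, sigma, sigma0, x0, seed=0):
    """Euler scheme for N interacting states driven by private and common noise.

    Returns the terminal states and the terminal value of the common noise W^0.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = dt ** 0.5
    x = list(x0)
    w0 = 0.0
    for _ in range(n_steps):
        mean = sum(x) / len(x)            # mean of the empirical measure mu^N_t
        dw0 = rng.gauss(0.0, sqrt_dt)     # increment of W^0, shared by all players
        w0 += dw0
        x = [
            xi
            - (xi - mean) * dt                  # assumed mean-reverting drift
            + sigma * rng.gauss(0.0, sqrt_dt)   # private noise W^i
            + sigma0 * dw0                      # common noise W^0
            for xi in x
        ]
    return x, w0


states, w0_T = simulate_players(200, 1.0, 0.3, 0.5, [0.0, 1.0, 2.0, 3.0])
print(sum(states) / len(states), w0_T)
```

With `sigma = 0` the empirical mean moves only through the common noise, which is the effect that survives in the large-$N$ limit.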
LARGE GAME ASYMPTOTICS (CONT.)
Conditional Law of Large Numbers
- If we consider exchangeable equilibria $(\alpha^1_t,\dots,\alpha^N_t)$, then
- By de Finetti LLN
\[ \lim_{N\to\infty}\mu^N_t = \mathbb{P}_{X^1_t\,|\,\mathcal{F}^0_t} \]
- Dynamics of player 1 (or any other player) becomes
\[ dX^1_t = b(t,X^1_t,\mu_t,\alpha^1_t)\,dt + \sigma(t,X^1_t,\mu_t,\alpha^1_t)\,dW_t + \sigma^0(t,X^1_t,\mu_t,\alpha^1_t)\,dW^0_t \]
with $\mu_t = \mathbb{P}_{X^1_t\,|\,\mathcal{F}^0_t}$.
- Cost to player 1 (or any other player) becomes
\[ \mathbb{E}\Big[\int_0^T f(t,X^1_t,\mu_t,\alpha^1_t)\,dt + g(X^1_T,\mu_T)\Big] \]
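The conditional law of large numbers can be illustrated on a toy case that is an assumption of this sketch, not a model from the slides: states $X^i = \sigma W^i_1 + \sigma^0 W^0_1$ with no drift and no control. Conditionally on the common noise, the empirical mean converges to $\sigma^0 W^0_1$, which is random, and not to the unconditional mean $0$.

```python
import random


def empirical_vs_conditional_mean(n_players, sigma, sigma0, seed=0):
    """Return (empirical mean of the N states, conditional mean given W^0)."""
    rng = random.Random(seed)
    w0 = rng.gauss(0.0, 1.0)  # common noise W^0_1, one draw shared by everyone
    states = [sigma * rng.gauss(0.0, 1.0) + sigma0 * w0 for _ in range(n_players)]
    return sum(states) / n_players, sigma0 * w0


for n in (10, 1000, 20000):
    emp, cond = empirical_vs_conditional_mean(n, 1.0, 0.5, seed=42)
    print(n, emp, cond)
```

The gap between the two values is of order $\sigma/\sqrt{N}$, while the common limit changes with the draw of $W^0$.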
MFG WITH COMMON NOISE PARADIGM
1. For each fixed measure-valued $(\mathcal{F}^0_t)$-adapted process $\mu = (\mu_t)$ in $\mathcal{P}(\mathbb{R})$, solve the standard stochastic control problem
\[ \hat\alpha = \arg\inf_{\alpha}\ \mathbb{E}\Big[\int_0^T f(t,X_t,\mu_t,\alpha_t)\,dt + g(X_T,\mu_T)\Big] \]
subject to
\[ dX_t = b(t,X_t,\mu_t,\alpha_t)\,dt + \sigma(t,X_t,\mu_t,\alpha_t)\,dW_t + \sigma^0(t,X_t,\mu_t,\alpha_t)\,dW^0_t; \]
2. Fixed Point Problem: determine $\mu = (\mu_t)$ so that
\[ \forall t\in[0,T],\qquad \mathbb{P}_{X_t\,|\,\mathcal{F}^0_t} = \mu_t \quad \text{a.s.} \]
Once this is done one expects that, if $\hat\alpha_t = \phi^*(t,X_t)$, then in the $N$ player game the strategies
\[ \alpha^{j*}_t = \phi^*(t,X^j_t),\qquad j = 1,\dots,N \]
form an approximate Nash equilibrium for the game with N players.
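The two steps above (best response to a frozen flow, then a fixed point on the flow) can be sketched on a deliberately simple static toy game. Everything in it is an assumption made for illustration: there is no time dynamics and no common noise, a player with state $x$ pays $\tfrac12 a^2 + \tfrac12(x + a - c\,m)^2$, and $m$ stands for the mean of the terminal states $x + a$. The first-order condition gives $a = (c\,m - x)/2$, and the consistent mean is $m = m_0/(2 - c)$, with $m_0$ the initial mean.

```python
def best_response(x, m, c):
    """Step 1: minimizer of 0.5*a^2 + 0.5*(x + a - c*m)^2 for a frozen mean m."""
    return 0.5 * (c * m - x)


def solve_fixed_point(xs, c, n_iter=200, tol=1e-12):
    """Step 2: Picard iteration on the mean of the terminal states."""
    m = 0.0
    for _ in range(n_iter):
        terminal = [x + best_response(x, m, c) for x in xs]
        m_new = sum(terminal) / len(terminal)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


initial_states = [0.0, 1.0, 2.0, 3.0]   # illustrative population, mean 1.5
m_star = solve_fixed_point(initial_states, c=0.5)
print(m_star)  # compare with 1.5 / (2 - 0.5)
```

The iteration contracts with factor $c/2$ in this toy; in the actual MFG with common noise the unknown is a measure-valued process, and convergence of such an iteration is not guaranteed.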
PONTRYAGIN STOCHASTIC MAXIMUM PRINCIPLE
Freeze $\mu = (\mu_t)_{0\le t\le T}$, write (reduced) Hamiltonian
\[ H(t,x,\mu,y,\alpha) = b(t,x,\mu,\alpha)\cdot y + f(t,x,\mu,\alpha) \]
Standard definition
Given an admissible control $\alpha = (\alpha_t)_{0\le t\le T}$ and the corresponding controlled state process $X^\alpha = (X^\alpha_t)_{0\le t\le T}$, any triple $(Y_t,Z_t,Z^0_t)_{0\le t\le T}$ satisfying:
\[
\begin{cases}
dY_t = -\partial_x H(t,X^\alpha_t,\mu_t,Y_t,\alpha_t)\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t \\
Y_T = \partial_x g(X^\alpha_T,\mu_T)
\end{cases}
\]
is called a set of adjoint processes
STOCHASTIC CONTROL STEP SOLUTION
Determine
\[ \hat\alpha(t,x,\mu,y) = \arg\inf_{\alpha} H(t,x,\mu,y,\alpha) \]
Inject in FORWARD and BACKWARD dynamics and SOLVE
\[
\begin{cases}
dX_t = b(t,X_t,\mu_t,\hat\alpha(t,X_t,\mu_t,Y_t))\,dt + \sigma(t,X_t)\,dW_t + \sigma^0(t,X_t)\,dW^0_t, \\
dY_t = -\partial_x H^{\mu_t}(t,X_t,Y_t,\hat\alpha(t,X_t,\mu_t,Y_t))\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t
\end{cases}
\]
with $X_0 = x_0$ and $Y_T = \partial_x g(X_T,\mu_T)$
Standard FBSDE (for each fixed $t\mapsto\mu_t$)
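For intuition on how such an FBSDE can be solved, consider a toy linear-quadratic case that is an assumption of this sketch and not the slides' model: $dX_t = \alpha_t\,dt + \sigma\,dW_t$, running cost $\tfrac12\alpha_t^2$, terminal cost $\tfrac12 q X_T^2$, no measure dependence and no common noise. Then $\hat\alpha = -y$, the adjoint equation has zero drift, and the ansatz $Y_t = \eta_t X_t$ reduces the FBSDE to the Riccati equation $\eta' = \eta^2$, $\eta_T = q$, whose solution is $\eta_t = q/(1 + q(T - t))$.

```python
def riccati_backward(q, horizon, n_steps):
    """Integrate eta' = eta^2 backward from eta(T) = q with RK4; return eta(0)."""
    h = horizon / n_steps
    eta = q

    def rhs(e):
        return e * e

    for _ in range(n_steps):
        # One RK4 step from t down to t - h (time runs backward).
        k1 = rhs(eta)
        k2 = rhs(eta - 0.5 * h * k1)
        k3 = rhs(eta - 0.5 * h * k2)
        k4 = rhs(eta - h * k3)
        eta -= h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return eta


q, T = 2.0, 1.0
eta0_numeric = riccati_backward(q, T, 200)
eta0_exact = q / (1.0 + q * T)
print(eta0_numeric, eta0_exact)
```

The optimal feedback in this toy is $\hat\alpha_t = -\eta_t X_t$; with measure dependence and common noise the decoupling field also depends on $\mu_t$, which is what makes the general problem harder.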
FIXED POINT STEP
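Solve the fixed point problem
\[ (\mu_t)_{0\le t\le T} \longrightarrow (X_t)_{0\le t\le T} \longrightarrow (\mathbb{P}_{X_t\,|\,\mathcal{F}^0_t})_{0\le t\le T} \]
Note: if we enforce $\mu_t = \mathbb{P}_{X_t\,|\,\mathcal{F}^0_t}$ for all $0\le t\le T$ in the FBSDE we have
\[
\begin{cases}
dX_t = b\big(t,X_t,\mathbb{P}_{X_t|\mathcal{F}^0_t},\hat\alpha^{\mathbb{P}_{X_t|\mathcal{F}^0_t}}(t,X_t,Y_t)\big)\,dt + \sigma(t,X_t)\,dW_t + \sigma^0(t,X_t)\,dW^0_t, \\
dY_t = -\partial_x H^{\mathbb{P}_{X_t|\mathcal{F}^0_t}}\big(t,X_t,Y_t,\hat\alpha^{\mathbb{P}_{X_t|\mathcal{F}^0_t}}(t,X_t,Y_t)\big)\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t,
\end{cases}
\]
with $X_0 = x_0$ and $Y_T = \partial_x g(X_T,\mathbb{P}_{X_T|\mathcal{F}^0_T})$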
FBSDE of Conditional McKean-Vlasov type !!!
Very difficult
SEVERAL APPROACHES
- Relaxed Controls (R.C. - Delarue - Lacker)
- FBSDEs of Conditional McKean-Vlasov Type (R.C. - Delarue)
  - Serious measurability difficulties
  - e.g. may need extra martingale terms (and jumps) in BSDE if filtrations not Brownian
- Develop Theory
  - SDEs of Conditional McKean-Vlasov Type (baby steps in R.C. - Zhu)
  - Conditional Propagation of Chaos (baby steps in R.C. - Zhu)
- Introduce notions of strong and weak solutions to MFG problem
  - Extend Yamada-Watanabe theory to MFG set-up
  - Existence for a finite common noise (Schauder Theorem)
  - Weak Solutions by Limiting arguments
  - Uniqueness via Monotonicity or Strong Convexity
  - Strong Solutions via extension of Yamada-Watanabe
MFGs with Major and Minor Players
R.C. - G. Zhu, R.C. - P. Wang
MFG WITH MAJOR AND MINOR PLAYERS. I
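State equations
\[
\begin{cases}
dX^0_t = b_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + \sigma_0(t,X^0_t,\mu_t,\alpha^0_t)\,dW^0_t \\
dX_t = b(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dt + \sigma(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dW_t,
\end{cases}
\]
Costs
\[
\begin{cases}
J^0(\boldsymbol\alpha^0,\boldsymbol\alpha) = \mathbb{E}\Big[\int_0^T f_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + g_0(X^0_T,\mu_T)\Big] \\
J(\boldsymbol\alpha^0,\boldsymbol\alpha) = \mathbb{E}\Big[\int_0^T f(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dt + g(X_T,\mu_T)\Big],
\end{cases}
\]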
OPEN LOOP VERSION OF THE MFG PROBLEM
The controls used by the major player and the representative minor player are of the form:
\[ \alpha^0_t = \phi^0(t,W^0_{[0,T]}),\quad\text{and}\quad \alpha_t = \phi(t,W^0_{[0,T]},W_{[0,T]}), \qquad (1) \]
for deterministic progressively measurable functions
\[ \phi^0 : [0,T]\times C([0,T];\mathbb{R}^{d_0}) \to A_0 \]
and
\[ \phi : [0,T]\times C([0,T];\mathbb{R}^{d})\times C([0,T];\mathbb{R}^{d}) \to A \]
THE MAJOR PLAYER BEST RESPONSE
Assume representative minor player uses the open loop control given by $\phi : (t,w^0,w)\mapsto\phi(t,w^0,w)$,
Major player minimizes
\[ J^{\phi,0}(\boldsymbol\alpha^0) = \mathbb{E}\Big[\int_0^T f_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + g_0(X^0_T,\mu_T)\Big] \]
under the dynamical constraints:
\[
\begin{cases}
dX^0_t = b_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + \sigma_0(t,X^0_t,\mu_t,\alpha^0_t)\,dW^0_t \\
dX_t = b(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\alpha^0_t)\,dt \\
\qquad\quad + \sigma(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\alpha^0_t)\,dW_t,
\end{cases}
\]
where $\mu_t = \mathcal{L}(X_t\,|\,W^0_{[0,t]})$ is the conditional distribution of $X_t$ given $W^0_{[0,t]}$.
Major player problem as the search for:
\[ \phi^{0,*}(\phi) = \arg\inf_{\alpha^0_t = \phi^0(t,W^0_{[0,T]})} J^{\phi,0}(\boldsymbol\alpha^0) \qquad (2) \]
Optimal control of the conditional McKean-Vlasov type!
THE REP. MINOR PLAYER BEST RESPONSE
System against which best response is sought comprises
- a major player
- a field of minor players different from the representative minor player
- Major player uses strategy $\alpha^0_t = \phi^0(t,W^0_{[0,T]})$
- Representative of the field of minor players uses strategy $\alpha_t = \phi(t,W^0_{[0,T]},W_{[0,T]})$.
State dynamics
\[
\begin{cases}
dX^0_t = b_0(t,X^0_t,\mu_t,\phi^0(t,W^0_{[0,T]}))\,dt + \sigma_0(t,X^0_t,\mu_t,\phi^0(t,W^0_{[0,T]}))\,dW^0_t \\
dX_t = b(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\phi^0(t,W^0_{[0,T]}))\,dt \\
\qquad\quad + \sigma(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\phi^0(t,W^0_{[0,T]}))\,dW_t,
\end{cases}
\]
where $\mu_t = \mathcal{L}(X_t\,|\,W^0_{[0,t]})$ is the conditional distribution of $X_t$ given $W^0_{[0,t]}$.
Given $\phi^0$ and $\phi$, SDE of (conditional) McKean-Vlasov type
THE REP. MINOR PLAYER BEST RESPONSE (CONT.)
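Representative minor player chooses a strategy $\bar\alpha_t = \bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]})$ to minimize
\[ J^{\phi^0,\phi}(\bar{\boldsymbol\alpha}) = \mathbb{E}\Big[\int_0^T f(t,\bar X_t,\mu_t,X^0_t,\bar\alpha_t,\phi^0(t,W^0_{[0,T]}))\,dt + g(\bar X_T,\mu_T)\Big], \]
where the dynamics of the virtual state $\bar X_t$ are given by:
\[
\begin{aligned}
d\bar X_t = {} & b(t,\bar X_t,\mu_t,X^0_t,\bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]}),\phi^0(t,W^0_{[0,T]}))\,dt \\
& + \sigma(t,\bar X_t,\mu_t,X^0_t,\bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]}),\phi^0(t,W^0_{[0,T]}))\,d\bar W_t,
\end{aligned}
\]
for a Wiener process $\bar W = (\bar W_t)_{0\le t\le T}$ independent of the other Wiener processes.
- Optimization problem NOT of McKean-Vlasov type.
- Classical optimal control problem with random coefficients
\[ \phi^*(\phi^0,\phi) = \arg\inf_{\bar\alpha_t = \bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]})} J^{\phi^0,\phi}(\bar{\boldsymbol\alpha}) \]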
NASH EQUILIBRIUM
Search for Best Response Map Fixed Point
\[ (\hat\phi^0,\hat\phi) = \big(\phi^{0,*}(\hat\phi),\ \phi^*(\hat\phi^0,\hat\phi)\big). \]
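The structure of this fixed point, where the minor player's best response depends on both the major player's strategy and the field's strategy, can be mimicked in a scalar static toy. All of it is assumed for illustration and none of it comes from the slides: the major player minimizes $(a^0 - 1 - c_0 a)^2$ over $a^0$, and the representative minor player minimizes $(\bar a - c_1 a^0 - c_2 a)^2$ over $\bar a$, where $a$ is the action used by the field.

```python
def major_best_response(a_field, c0):
    """Minimizer over a0 of (a0 - 1 - c0 * a_field)^2."""
    return 1.0 + c0 * a_field


def minor_best_response(a0, a_field, c1, c2):
    """Minimizer over a_bar of (a_bar - c1 * a0 - c2 * a_field)^2."""
    return c1 * a0 + c2 * a_field


def nash_fixed_point(c0, c1, c2, n_iter=500, tol=1e-13):
    """Iterate (a0, a) -> (BR_major(a), BR_minor(a0, a)) until it stabilizes."""
    a0, a = 0.0, 0.0
    for _ in range(n_iter):
        a0_new = major_best_response(a, c0)
        a_new = minor_best_response(a0, a, c1, c2)
        if max(abs(a0_new - a0), abs(a_new - a)) < tol:
            return a0_new, a_new
        a0, a = a0_new, a_new
    return a0, a


a0_star, a_star = nash_fixed_point(0.3, 0.4, 0.2)
print(a0_star, a_star)
```

At the fixed point the field's action coincides with the representative player's best response, which gives $a^0 = (1 - c_2)/(1 - c_2 - c_0 c_1)$ and $a = c_1/(1 - c_2 - c_0 c_1)$ in this toy.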
CLOSED LOOP VERSIONS OF THE MFG PROBLEM
- Closed Loop Version
Controls of the major player and the representative minor player are of the form:
\[ \alpha^0_t = \phi^0(t,X^0_{[0,T]},\mu_t),\quad\text{and}\quad \alpha_t = \phi(t,X_{[0,T]},\mu_t,X^0_{[0,T]}), \]
for deterministic progressively measurable functions $\phi^0 : [0,T]\times C([0,T];\mathbb{R}^{d_0}) \to A_0$ and $\phi : [0,T]\times C([0,T];\mathbb{R}^{d})\times C([0,T];\mathbb{R}^{d}) \to A$.
- Markovian Version
Controls of the major player and the representative minor player are of the form:
\[ \alpha^0_t = \phi^0(t,X^0_t,\mu_t),\quad\text{and}\quad \alpha_t = \phi(t,X_t,\mu_t,X^0_t), \]
for deterministic feedback functions $\phi^0 : [0,T]\times\mathbb{R}^{d_0}\times\mathcal{P}_2(\mathbb{R}^d) \to A_0$ and $\phi : [0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^{d_0} \to A$.
LINEAR QUADRATIC MODELS
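State dynamics
\[
\begin{cases}
dX^0_t = (L_0 X^0_t + B_0\alpha^0_t + F_0\bar X_t)\,dt + D_0\,dW^0_t \\
dX_t = (L X_t + B\alpha_t + F\bar X_t + G X^0_t)\,dt + D\,dW_t
\end{cases}
\]
where $\bar X_t = \mathbb{E}[X_t\,|\,\mathcal{F}^0_t]$, and $(\mathcal{F}^0_t)_{t\ge 0}$ is the filtration generated by $W^0$
Costs
\[
\begin{cases}
J^0(\boldsymbol\alpha^0,\boldsymbol\alpha) = \mathbb{E}\Big[\int_0^T \big[(X^0_t - H_0\bar X_t - \eta_0)^\dagger Q_0 (X^0_t - H_0\bar X_t - \eta_0) + \alpha^{0\dagger}_t R_0\,\alpha^0_t\big]\,dt\Big] \\
J(\boldsymbol\alpha^0,\boldsymbol\alpha) = \mathbb{E}\Big[\int_0^T \big[(X_t - H X^0_t - H_1\bar X_t - \eta)^\dagger Q (X_t - H X^0_t - H_1\bar X_t - \eta) + \alpha^\dagger_t R\,\alpha_t\big]\,dt\Big]
\end{cases}
\]
in which $Q$, $Q_0$, $R$, $R_0$ are symmetric matrices, and $R$, $R_0$ are assumed to be positive definite.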
EQUILIBRIA
- Open Loop Version
  - Optimization problems + fixed point $\Longrightarrow$ large FBSDE
  - affine FBSDE solved by a large matrix Riccati equation
- Closed Loop Version
  - Fixed point step more difficult
  - Search limited to controls of the form
\[
\begin{aligned}
\alpha^0_t &= \phi^0(t,X^0_t,\bar X_t) = \phi^0_0(t) + \phi^0_1(t)X^0_t + \phi^0_2(t)\bar X_t \\
\alpha_t &= \phi(t,X_t,X^0_t,\bar X_t) = \phi_0(t) + \phi_1(t)X_t + \phi_2(t)X^0_t + \phi_3(t)\bar X_t
\end{aligned}
\]
  - Optimization problems + fixed point $\Longrightarrow$ large FBSDE
  - affine FBSDE solved by a large matrix Riccati equation
Solutions are not the same !!!!
APPLICATION TO FLOCKING
Inspired by the Nourian-Caines-Malhamé generalization of the basic Cucker-Smale model.
- $\beta = 0$, so state reduces to velocity as cost depends only on velocity
- $V^{0,N}_t$ velocity of the (major player) leader at time $t$
- $V^{i,N}_t$ the velocity of the $i$-th follower, $i = 1,\dots,N$, at time $t$
- Linear dynamics
\[
\begin{cases}
dV^{0,N}_t = \alpha^0_t\,dt + \Sigma_0\,dW^0_t \\
dV^{i,N}_t = \alpha^i_t\,dt + \Sigma\,dW^i_t
\end{cases}
\]
- Minimization of Quadratic costs
\[ J^0 = \mathbb{E}\Big[\int_0^T \big(\lambda_0\|V^{0,N}_t - \nu_t\|^2 + \lambda_1\|V^{0,N}_t - \bar V^N_t\|^2 + (1-\lambda_0-\lambda_1)\|\alpha^0_t\|^2\big)\,dt\Big] \]
  - $\bar V^N_t := \frac{1}{N}\sum_{i=1}^N V^{i,N}_t$ the average velocity of the followers,
  - deterministic function $[0,T]\ni t\mapsto\nu_t\in\mathbb{R}^d$ (leader's free will)
  - $\lambda_0$ and $\lambda_1$ are positive real numbers satisfying $\lambda_0 + \lambda_1 \le 1$
\[ J^i = \mathbb{E}\Big[\int_0^T \big(l_0\|V^{i,N}_t - V^{0,N}_t\|^2 + l_1\|V^{i,N}_t - \bar V^N_t\|^2 + (1-l_0-l_1)\|\alpha^i_t\|^2\big)\,dt\Big] \]
$l_0 \ge 0$ and $l_1 \ge 0$, $l_0 + l_1 \le 1$.
SAMPLE TRAJECTORIES IN EQUILIBRIUM
ν(t) := [−2π sin(2πt),2π cos(2πt)]
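To see what this target velocity means geometrically, the sketch below integrates $\nu$ in time with a midpoint rule. The starting point $(1, 0)$, the unit horizon and the function names are assumptions of the illustration. A position that followed $\nu$ exactly would trace the unit circle $(\cos 2\pi t, \sin 2\pi t)$.

```python
import math


def nu(t):
    """Target velocity from the slide: (-2*pi*sin(2*pi*t), 2*pi*cos(2*pi*t))."""
    return (-2.0 * math.pi * math.sin(2.0 * math.pi * t),
            2.0 * math.pi * math.cos(2.0 * math.pi * t))


def integrate_reference(start, horizon, n_steps):
    """Midpoint-rule integral of nu over [0, horizon], starting from `start`."""
    h = horizon / n_steps
    x, y = start
    path = [(x, y)]
    for k in range(n_steps):
        vx, vy = nu((k + 0.5) * h)   # velocity at the midpoint of the step
        x += h * vx
        y += h * vy
        path.append((x, y))
    return path


path = integrate_reference((1.0, 0.0), 1.0, 1000)
radii = [math.hypot(px, py) for px, py in path]
print(path[-1], min(radii), max(radii))
```

The equilibrium trajectories deviate from this circle because the leader also pays for distance to the followers and for control effort.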
[Figure: two panels with axes x and y. Panel parameters: k0 = 0.80, k1 = 0.19, l0 = 0.19, l1 = 0.80; and k0 = 0.80, k1 = 0.19, l0 = 0.80, l1 = 0.19]
FIGURE: Optimal velocity and trajectory of follower and leaders
SAMPLE TRAJECTORIES IN EQUILIBRIUM
ν(t) := [−2π sin(2πt),2π cos(2πt)]
[Figure: two panels with axes x and y. Panel parameters: k0 = 0.19, k1 = 0.80, l0 = 0.19, l1 = 0.80; and k0 = 0.19, k1 = 0.80, l0 = 0.80, l1 = 0.19]
FIGURE: Optimal velocity and trajectory of follower and leaders
CONDITIONAL PROPAGATION OF CHAOS
[Figure: five panels, for N = 5, N = 10, N = 20, N = 50 and N = 100, each with a scale running from −1 to 1]
FIGURE: Conditional correlation of 5 followers' velocities