Reinforcement Learning (2nd)
ujava.org Workshop
2016-08-12
www.idosi.com
CEO Shindong KANG
ujava.org
spaceapi.org
Reinforcement Learning for Brick Game
To Flip Pancake
Crawling Robot on Carpet
Pavlov's Dog
Pavlov
Reinforcement
Reinforcement Learning
Forecast
Forecast with probability
Unknown model & real facts
Deep Neural Network
Bayesian Probability
Variance
Random Variable
Types of Random Variable
Discrete Probability Distribution
Continuous Probability Distribution, Probability Density Function
Density
Expected value
$EV = \sum_x x \, P(x)$
Expected Value for Continuous variable
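As a minimal sketch of the discrete case, the expected value is the probability-weighted sum of the outcomes. The fair six-sided die and the helper name `expected_value` are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction

def expected_value(outcomes, probs):
    """Return sum of x * P(x) for a discrete random variable."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probs))

# Illustrative assumption: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

die_ev = expected_value(faces, probs)
print(die_ev)  # 7/2
```

Exact fractions are used here so the sum-to-one check does not suffer from floating-point rounding. For a continuous variable the sum becomes an integral of x times the density.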
Covariance
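A small sketch of how variance and covariance relate: variance is the covariance of a variable with itself. The sample data and the population (divide by n) convention are assumptions chosen for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of (x - mean_x) * (y - mean_y)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """Population variance, expressed as the covariance of xs with itself."""
    return covariance(xs, xs)

# Hypothetical data: ys is exactly twice xs, so the two move together.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(variance(xs))        # 1.25
print(covariance(xs, ys))  # 2.5
```

A positive covariance means the two variables tend to rise together; a value near zero means no linear relationship.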
Probability
Conditional Probability
Bayes rule
Bayesian Probability
P(fair|H) = ?
P(A) = P(fair)
P(B) = P(H)
P(B|A) = P(H|fair)
P(fair|H) = P(H|fair) P(fair) / P(H) = 1/3
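The numeric inputs of this example did not survive in the slide text, so the following is only a sketch under an assumed setup: one fair coin and one two-headed coin, one of which is picked at random and shows heads. That assumption yields a posterior of 1/3 for the fair coin, consistent with the result the slide appears to reach. The variable names are illustrative.

```python
from fractions import Fraction

def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Bayes rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b

# Assumed setup (not from the slides): a fair coin and a two-headed coin,
# each equally likely to be the one that was picked.
p_fair = Fraction(1, 2)           # P(A) = P(fair)
p_h_given_fair = Fraction(1, 2)   # P(B|A) = P(H|fair)
p_h_given_other = Fraction(1, 1)  # P(H|two-headed)

# Total probability of heads: P(B) = P(H)
p_h = p_h_given_fair * p_fair + p_h_given_other * (1 - p_fair)

p_fair_given_h = posterior(p_fair, p_h_given_fair, p_h)
print(p_h)             # 3/4
print(p_fair_given_h)  # 1/3
```

Seeing heads lowers the belief that the coin is fair from 1/2 to 1/3, because the two-headed coin explains the observation better.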
Brownian motion
Brownian motion, Gaussian distribution
Snapshot of state
Markov Chain
Process Probability
Episode process: s1 → s2 → s3
s1, s2 = ?
s2, s3 = ?
s1, s3 = ?
Markov Process
Math Product Symbol
Markov Process
Stochastic Matrix
0.4 0.6
0.7 0.3
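A brief sketch of how a stochastic matrix drives a first-order Markov process, using the 2x2 matrix from the slide. Reading rows as the current state and columns as the next state is an assumption, as are the example episode, the start state, and the function names.

```python
# Transition matrix from the slide; assumed layout: row = current state,
# column = next state, so each row must sum to 1.
P = [[0.4, 0.6],
     [0.7, 0.3]]

def is_row_stochastic(matrix, tol=1e-9):
    """True if all entries are non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) < tol
        for row in matrix
    )

def episode_probability(matrix, states, start_prob=1.0):
    """Product of one-step transition probabilities along an episode."""
    prob = start_prob
    for current, nxt in zip(states, states[1:]):
        prob *= matrix[current][nxt]
    return prob

def step(matrix, dist):
    """Advance a state distribution by one step: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# Hypothetical episode over state indices: state 0, then 1, then 0 again.
episode = [0, 1, 0]
p_episode = episode_probability(P, episode)  # 0.6 * 0.7, about 0.42

# Distribution after two steps when starting in state 0 with certainty.
two_step = step(P, step(P, [1.0, 0.0]))      # about [0.58, 0.42]

print(is_row_stochastic(P), p_episode, two_step)
```

The Markov property is what allows the episode probability to factor into a product of one-step terms: each factor depends only on the current state.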
2 Snapshots of state
Direction using Second Order
Markov Process
3 Snapshots of state
Acceleration using 3rd order
Exploitation and Exploration
State-action exploration vs. Parameter exploration
Multi-armed bandit problem
Thompson sampling
Simulated Bandit Performance
Multi-armed bandit problem
Multi-Armed Bandit Algorithms
MAB Reward
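A minimal sketch of Thompson sampling on a Bernoulli multi-armed bandit, assuming a uniform Beta(1, 1) prior per arm. The reward probabilities, round count, seed, and function name are illustrative assumptions, not values from the slides.

```python
import random

def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return per-arm counts and total reward."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulated environment: Bernoulli reward with the arm's true rate.
        reward = 1 if rng.random() < true_probs[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward

    return successes, failures, total_reward

# Hypothetical three-armed bandit; the last arm is the best one.
successes, failures, total_reward = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
pulls = [s + f for s, f in zip(successes, failures)]
print(pulls, total_reward)
```

Exploration happens because uncertain arms have wide posteriors and occasionally produce the highest sample; exploitation takes over as the posteriors narrow around the true rates.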
Function's Probability Distribution
Function's Probability Distribution ?
Function's Probability Distribution
y = ax^2 + b
Function's Probability Distribution with Gaussian Distribution
Gaussian Process Regression
Gaussian Process
From C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, the MIT Press, 2006
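A compact sketch of Gaussian process regression with a squared-exponential kernel, computing the posterior mean and covariance via a Cholesky factorization. The training data come from y = ax^2 + b with assumed values a = 1.0 and b = 0.5; the kernel hyperparameters, noise level, and function names are also assumptions. NumPy is required.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between two 1-D input arrays."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-6):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)

    L = np.linalg.cholesky(K)                    # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                         # K_s^T K^{-1} y
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                         # K_ss - K_s^T K^{-1} K_s
    return mean, cov

# Assumed coefficients for the quadratic y = a x^2 + b.
a, b = 1.0, 0.5
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = a * x_train**2 + b

x_test = np.linspace(-2.0, 2.0, 9)               # includes the training inputs
mean, cov = gp_posterior(x_train, y_train, x_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))  # clip tiny negative round-off

for x, m, s in zip(x_test, mean, std):
    print(f"x={x:+.1f}  mean={m:.3f}  std={s:.3f}")
```

The posterior standard deviation shrinks to nearly zero at the observed inputs and grows between them. That per-point uncertainty is what a sampling-based strategy such as Thompson sampling can draw from when choosing where to evaluate next.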
Thompson sampling
Thank you!
Intelligent City Ltd.
Shindong KANG