07 Approximate Inference in BN

Bayesian Networks
Unit 7: Approximate Inference in Bayesian Networks
Wang, Yuan-Kai (王元凱), [email protected], http://www.ykwang.tw
Department of Electrical Engineering, Fu Jen University (輔仁大學電機工程系)
2006~2011

Reference this document as: Wang, Yuan-Kai, "Approximate Inference in Bayesian Networks," Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.



Page 2: 07 approximate inference in bn


Goal of This Unit
• P(X|e) inference for Bayesian networks
• Why approximate inference
  – Exact inference is too slow because of exponential complexity
• Using approximate approaches
  – Sampling methods
    • Likelihood weighting sampling
    • Markov chain Monte Carlo sampling
  – Loopy belief propagation
  – Variational methods

Page 3: 07 approximate inference in bn


Related Units
• Background
  – Probabilistic graphical models
  – Exact inference in BN
• Next units
  – Probabilistic inference over time

Page 4: 07 approximate inference in bn


Self-Study References
• Chapter 14, Artificial Intelligence: A Modern Approach, 2nd ed., by S. Russell & P. Norvig, Prentice Hall, 2003.
• "Inference in Bayesian networks," B. D'Ambrosio, AI Magazine, 1999.
• "Probabilistic inference in graphical models," M. I. Jordan & Y. Weiss.
• "An introduction to MCMC for machine learning," C. Andrieu, N. de Freitas, A. Doucet, & M. I. Jordan, Machine Learning, vol. 50, pp. 5-43, 2003.
• Computational Statistics Handbook with MATLAB, W. L. Martinez and A. R. Martinez, Chapman & Hall/CRC, 2002
  – Chapter 3: Sampling Concepts
  – Chapter 4: Generating Random Variables

Page 5: 07 approximate inference in bn


Structure of Related Lecture Notes

[Figure: pipeline diagram. A Problem is modeled as a PGM Representation; Inference answers a Query on it; Learning (structure learning, parameter learning) builds the PGM from Data.]
• Representation: Unit 5: BN; Unit 9: Hybrid BN; Units 10~15: Naïve Bayes, MRF, HMM, DBN, Kalman filter
• Inference: Unit 6: Exact inference; Unit 7: Approximate inference; Unit 8: Temporal inference
• Learning: Units 16~: MLE, EM
• Example network: B → A ← E, A → J, A → M, with CPTs P(B), P(E), P(A|B,E), P(J|A), P(M|A)

Page 6: 07 approximate inference in bn


Contents
1. Sampling ............................................ 11
2. Random Number Generator ............. 20
3. Stochastic Simulation ....................... 70
4. Markov Chain Monte Carlo ........... 113
5. Loopy Belief Propagation .............. 145
6. Variational Methods ....................... 146
7. Implementation ............................... 147
8. Summary ......................................... 148
9. References ...................................... 151

Page 7: 07 approximate inference in bn


4 Steps of Inference
• Step 1: Bayes' theorem
  P(X|E=e) = P(X, E=e) / P(E=e) ∝ P(X, E=e)
• Step 2: Marginalization
  P(X, E=e) = Σ_{h∈H} P(X, E=e, H=h)
• Step 3: Conditional independence
  P(X, E=e) = Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))
• Step 4: Product-sum computation (enumeration)
  – Exact inference
  – Approximate inference

Page 8: 07 approximate inference in bn


Five Types of Queries in Inference
• For a probabilistic graphical model G
• Given a set of evidence E=e
• Query the PGM with
  – P(e): likelihood query
  – arg max P(e): maximum likelihood query
  – P(X|e): posterior belief query
  – arg max_x P(X=x|e): maximum a posteriori (MAP) query (single query variable)
  – arg max_{x1...xk} P(X1=x1, ..., Xk=xk|e): most probable explanation (MPE) query

Page 9: 07 approximate inference in bn


Approximate Inference vs. Exact Inference
• Exact inference: P(X|E) = 0.71828
  – Gets the exact probability value
  – Uses the inference steps derived from probabilistic formulas
  – Needs exponential time complexity
• Approximate inference: P(X|E) ≈ 0.71
  – Gets an approximate probability value
  – Uses sampling theory
  – Needs only polynomial time complexity; fast computation

Page 10: 07 approximate inference in bn


Why Approximate Inference
• Large treewidth
  – Large, highly connected graphical models
  – Treewidth may be large (>40) even in sparse networks
• In many applications, approximations are sufficient
  – Example: P(X=x|e) = 0.3183098861
  – Maybe P(X=x|e) ≈ 0.3 is a good enough approximation
  – e.g., we take action only if P(X=x|e) > 0.5

Page 11: 07 approximate inference in bn


1. Sampling
• 1.1 What Is Sampling
• 1.2 Sampling for Inference

Page 12: 07 approximate inference in bn


Basic Idea of Sampling
• Why sampling
  – Estimate some values by random number generation
• 1. Sampling
  – Random number generation
  – Draw N samples from a known distribution P
  – Generate N random numbers from a known distribution S
• 2. Estimation
  – Compute an approximate probability P̂(X|E), which approximates the real posterior probability P(X|E)

Page 13: 07 approximate inference in bn


1.1 What Is Sampling
• A very simple example with one random variable: coin toss
  – Tossing the coin gets head or tail
  – It is a Boolean R.V.
    • Coin = head or tail
  – If it is an unbiased coin, head and tail have equal probability
    • A prior probability distribution P(Coin) = <0.5, 0.5>
    • Uniform distribution
  – Assume we have a coin but do not know whether it is unbiased

Page 14: 07 approximate inference in bn


Sampling of Coin Toss
• Sampling in this example = flipping the coin many times N
  – e.g., N = 1000 times
  – One flip gets one sample
  – Ideally, 500 heads, 500 tails
    • P(head) = 500/1000 = 0.5, P(tail) = 500/1000 = 0.5
  – Practically, 501 heads, 499 tails
    • P(head) = 501/1000 = 0.501, P(tail) = 499/1000 = 0.499
• After the sampling,
  – We can estimate the probability distribution
  – Check if the coin is biased

Page 15: 07 approximate inference in bn


Sampling & Estimation (Math)
• For a Boolean random variable X
  – P(X) is the prior distribution = <P(x), P(¬x)>
  – Use a sampling algorithm to generate N samples
  – Say N(x) is the number of samples in which x is true, N(¬x) the number in which x is false
• Estimates:
  N(x)/N = P̂(x) ≈ P(x),  N(¬x)/N = P̂(¬x) ≈ P(¬x)
• In the limit:
  lim_{N→∞} N(x)/N = P(x),  lim_{N→∞} N(¬x)/N = P(¬x)

Page 16: 07 approximate inference in bn


1.2 Sampling for Inference
• Given a Bayesian network G including (X1, ..., Xn)
  – We get a joint probability distribution
    P(X1, ..., Xn) = Π_i P(Xi | Pa(Xi))
• For a query P(X|E=e)
  – P(X|e) ∝ Σ_h Π_i P(Xi | Pa(Xi))
  – It is hard to compute
    • Needs time exponential in the number of Xi
  – We will try to use sampling to compute it

Page 17: 07 approximate inference in bn


Compute P(X|e) by Sampling
• Sampling
  – Generate N samples from P(X1, ..., Xn) = Π_i P(Xi | Pa(Xi))
• Estimation
  – Use the N samples to estimate P(X,e) ≈ N(X,e)/N
  – Use the N samples to estimate P(e) ≈ N(e)/N
  – Estimate P(X|e) by P(X,e) / P(e)
• Explained in Sections 2, 3, 4

Page 18: 07 approximate inference in bn


What Is a Sampling Algorithm
• The algorithm to
  – Generate samples from a known probability distribution P
  – Estimate the approximate probability P̂

Page 19: 07 approximate inference in bn


Various Sampling Algorithms
• Stochastic simulation (Section 3)
  – Direct sampling
  – Rejection sampling
    • Reject samples disagreeing with evidence
  – Likelihood weighting
    • Use evidence to weight samples
• Markov chain Monte Carlo (MCMC) (Section 4)
  – Sample from a stochastic process whose stationary distribution is the true posterior

Page 20: 07 approximate inference in bn


2. Random Number Generator
• Very important for sampling algorithms
• Introduces basic concepts related to sampling of Bayesian networks
• Subsections
  – 2.1 Univariate
  – 2.2 Multivariate

Page 21: 07 approximate inference in bn


RNG in Programming Languages
• Random number generator (RNG)
  – C/C++: rand()
  – Java: Math.random()
  – Matlab: rand()
• Why should we discuss it?
  – These generate random numbers with a uniform distribution
  – How do we generate
    • Gaussian, ...
    • Multivariate, dependent random variables
    • Non-closed-form distributions?

Page 22: 07 approximate inference in bn


Generate a Random Number (1/2)
• Examples in C
  – int i = rand();
  – Returns 0 ~ RAND_MAX (at least 32767)
  – It generates integers
• Generate a random number between 1 and n (n < RAND_MAX)
  – int i = 1 + (rand() % n);
  – (rand() % n) returns a number between 0 and n-1
  – Add 1 to make the random number between 1 and n
  – It generates integers, not real numbers

Page 23: 07 approximate inference in bn


Generate a Random Number (2/2)
• Ex: integer between 1 and 6
  – 1 + (rand() % 6)
• Ex: real number between 0 and 1
  – double x = (double)rand() / RAND_MAX;
    (the cast matters: rand() / RAND_MAX in integer arithmetic is almost always 0)
• Exercise
  – Real number between 10 and 20

Page 24: 07 approximate inference in bn


Generate Many Random Numbers Repeatedly
• Using a loop for repeated generation
  – for (int i = 0; i < 1000; i++)
    { rand(); }
  – int i, j[1000];
    for (i = 0; i < 1000; i++)
    { j[i] = 1 + rand() % 6; }
• rand() generates each number uniformly (uniform distribution)

Page 25: 07 approximate inference in bn


Why Generate Random Numbers
• Simulate random behavior
• Make random decisions
• Estimate some values

Page 26: 07 approximate inference in bn


Random Behavior/Decision (1/2)
• Flip a coin for a decision (Boolean)
  – Fair: each face has equal probability
  – int coin_face;
    if (rand() > RAND_MAX/2) coin_face = 1;
    else coin_face = 0;
  – int coin_face;
    coin_face = rand() % 2;

Page 27: 07 approximate inference in bn


Random Behavior/Decision (2/2)
• Random decision among multiple choices
  – Discrete random variable
• Ex: roll a die
  – Fair: each face has equal probability
  – int die_face; // Random variable
    die_face = rand() % 6; // 0~5; use 1 + rand() % 6 for faces 1~6
• Uniform distribution

Page 28: 07 approximate inference in bn


Estimation
• If we can simulate a random behavior
• We can estimate some values
  – First, we repeat the random behavior
  – Then we estimate the value

Page 29: 07 approximate inference in bn


Example: The Coin Toss
• Flip the coin 1000 times to estimate the fairness of the coin
  int coin_face;             // Random variable
  int frequency[2] = {0, 0};
  for (i = 0; i < 1000; i++)
  { coin_face = rand() % 2;
    frequency[coin_face]++;
  }
[Figure: histogram of frequency over coin_face = 0, 1; roughly uniform]

Page 30: 07 approximate inference in bn


Example : Area of Circle (Estimation)
  double x, y;               // Two random variables
  int N = 1000, NCircle = 0;
  double Area;
  for (i = 0; i < N; i++)
  { x = (double)rand() / RAND_MAX;
    y = (double)rand() / RAND_MAX;
    if ((x*x + y*y) <= 1)
      NCircle = NCircle + 1;
  }
  Area = 4.0 * NCircle / N;  // floating-point division; int/int would truncate
• The estimated area of the unit circle ≈ π
• x and y are independent
• We call (x, y) a sample

Page 31: 07 approximate inference in bn


Multiple Dependent Random Variables
• Markov chain: n random variables
  X1 → ... → Xk → ... → Xn
• Bayesian network: 5 random variables
  Burglary → Alarm ← Earthquake, Alarm → JohnCalls, Alarm → MaryCalls
• The variables are dependent
• What is a sample?

Page 32: 07 approximate inference in bn


Sampling
• It is to randomly generate a sample
  – For a random variable X (univariate) or a set of random variables X1, ..., Xn (multivariate)
    • Boolean, discrete, continuous
    • Independent, dependent
  – According to a probability distribution P(X)
    • Discrete X: histogram
    • Continuous X:
      – Uniform, Gaussian, or
      – Any distribution: Gaussian mixture models

Page 33: 07 approximate inference in bn


Sub-Sections for Generating a Sample
• 2.1 Univariate
  – Uniform, Gaussian, Gaussian mixture
• 2.2 Multivariate
  – Uniform
  – Gaussian
    • Independent, dependent
  – Any distribution
    • Gaussian mixture
      – Independent, dependent
    • Bayesian network

Page 34: 07 approximate inference in bn


2.1 Univariate
• For a random variable X
  – Boolean, discrete, continuous, hybrid
• We know P(X) is
  – Uniform, Gaussian, Gaussian mixture
• Generate a sample of X according to P(X)

Page 35: 07 approximate inference in bn


Uniform Generator
• Every programming language provides a rand()/random() function to generate a uniformly distributed number
  – Integer within [0, MAX)
• Sampling a Boolean uniform number
  – rand() % 2
• Sampling a discrete uniform number within [0, d)
  – rand() % d
• Sampling a continuous uniform number
  – Within [0, 1): (double)rand() / MAX
  – Within [a, b): a + ((double)rand() / MAX) * (b - a)

Page 36: 07 approximate inference in bn


Example : Uniform Generator
• Matlab:
  x=rand(1,10000);
  h=hist(x,20);
  bar(h);
[Figure: bar chart of the 20 histogram bins, each with roughly 500 counts]

Page 37: 07 approximate inference in bn


Gaussian Generator (1/2)
• Samples of a Gaussian can be obtained from the uniform distribution
• There are functions in C/Java/Matlab to randomly generate a univariate Gaussian real number with (μ, σ) = (0, 1)
  – C: Numerical Recipes in C
  – Java: Random.nextGaussian()
  – Matlab: randn()
• Suppose it is called Gaussian()

Page 38: 07 approximate inference in bn


Gaussian Generator (2/2)
• Sampling a continuous Gaussian number with (μ, σ)
  – (Gaussian() * σ) + μ
• Sampling a discrete Gaussian number with (μ, σ)?

Page 39: 07 approximate inference in bn


Example : Gaussian Generator (1/2)
• Pseudo code
  – Assume Gaussian() is a pseudo function that generates N(0,1) numbers
  – double x[10000];
    for (i = 0; i < 10000; i++)
      x[i] = Gaussian();
  – for (i = 0; i < 10000; i++)
      x[i] = mu + Gaussian() * sigma;

Page 40: 07 approximate inference in bn


Example : Gaussian Generator (2/2)
• Matlab
  x=randn(1,10000);
  h=hist(x,20);
  bar(h);
• Java
  Random r = new Random();
  double[] x = new double[10000];
  for (i = 0; i < 10000; i++)
    x[i] = r.nextGaussian();
[Figure: histogram of the 10000 samples; bell-shaped, peaking around 1600 counts in the center bins]

Page 41: 07 approximate inference in bn


Gaussian Mixture Generator (1/2)
• Random variable X with a Gaussian
  – P(X) = N(X; μ, σ)
• Random variable Y with a Gaussian mixture
  – P(Y) = Σ_m π_m N(Y; μ_m, σ_m)

Page 42: 07 approximate inference in bn


Gaussian Mixture Generator (2/2)
• Generate N samples of X
  – for (i = 0; i < N; i++)
      x[i] = (Gaussian() * sigma) + mu;
• Generate N samples of Y with a mixture of M Gaussians
  – Each Gaussian m has π_m, μ_m, σ_m
  – for (m = 0; m < M; m++)
      for (i = 0; i < N*π_m; i++)
        y[m][i] = (Gaussian() * σ_m) + μ_m;

Page 43: 07 approximate inference in bn


Example : Gaussian Mixture Generator
• Matlab:
  N=10000; pi1=0.8; pi2=0.2;
  mu1=0; mu2=15; sigma1=3; sigma2=5;
  x1 = mu1 + randn(1,N*pi1) * sigma1;
  x2 = mu2 + randn(1,N*pi2) * sigma2;
  x = [x1, x2];
  h=hist(x,50);
  bar(h);
[Figure: histogram over 50 bins; a large mode near 0 and a smaller, wider mode near 15]

Page 44: 07 approximate inference in bn


2.2 Multivariate
• For random variables X1, ..., Xn
  – Boolean, discrete, continuous, hybrid
• We know P(X1, ..., Xn) is
  – Uniform, Gaussian, Gaussian mixture, any distribution
• Generate a sample (X1, ..., Xn) according to P(X1, ..., Xn)
  – Independent
  – Dependent

Page 45: 07 approximate inference in bn


Multivariate Boolean Uniform Generator
• Boolean random variables X1, ..., Xn
  int X[n]; // A sample
  for (i = 0; i < n; i++)
    X[i] = rand() % 2;

Page 46: 07 approximate inference in bn


Multivariate Discrete Uniform Generator
• Discrete random variables X1, ..., Xn
  – Each with d discrete values: [0, d-1]
  – Each Xi is uniformly distributed
  – X1, ..., Xn must be independent
• int X[n]; // A sample
  for (i = 0; i < n; i++)
    X[i] = rand() % d;

Page 47: 07 approximate inference in bn


Multivariate Gaussian Generator - Independent (1/2)
• Pseudo code
• For n random variables X = (X1, ..., Xn)
  – Gaussian: N(X; μ, Σ)
    • Mean vector: μ
    • Covariance matrix: Σ = [σij]
• X1, ..., Xn are independent
  – σij = 0 for i ≠ j
• Generate a sample of X ⇒ generate each Xi independently

Page 48: 07 approximate inference in bn


Multivariate Gaussian Generator - Independent (2/2)
• Generate a sample of X = (X1, ..., Xn) with μi = 0, σii = 1, σij = 0 for i ≠ j
  – double X[n]; // a sample
    for (i = 0; i < n; i++)
      X[i] = Gaussian();
• Generate a sample of X = (X1, ..., Xn) with μi ≠ 0, σii ≠ 1, σij = 0 for i ≠ j
  – double X[n]; // a sample
    for (i = 0; i < n; i++)
      X[i] = mu[i] + Gaussian() * sigma[i]; // sigma[i] = sqrt(σii), the standard deviation

Page 49: 07 approximate inference in bn


Example – Matlab (1/2)
• Assume μX = (0,0)^T, ΣX = [1 0; 0 1]
  mx=[0 0]';
  Cx=[1 0; 0 1];
  x1=-3:0.1:3;
  x2=-3:0.1:3;
  for i=1:length(x1),
    for j=1:length(x2),
      f(i,j)=(1/(2*pi*det(Cx)^(1/2)))*exp((-1/2)*([x1(i) x2(j)]-mx')*inv(Cx)*([x1(i);x2(j)]-mx));
    end
  end
  mesh(x1,x2,f)
  pause;
  contour(x1,x2,f)
  pause

Page 50: 07 approximate inference in bn


Example – Matlab (2/2)
• Randomly generate 1000 samples for μX = (0,0)^T, ΣX = [1 0; 0 1]
  y1=randn(1,1000);
  y2=randn(1,1000);
  plot(y1,y2,'.');

Page 51: 07 approximate inference in bn


Multivariate Gaussian Generator - Dependent (1/4)
• For n random variables X = (X1, ..., Xn)
  – Gaussian: N(X; μ, Σ)
    • Mean vector: μ
    • Covariance matrix: Σ = [σij]
  – Σ is a positive definite matrix
    • Symmetric and all eigenvalues (pivots) > 0
  – For a general matrix A: A = LDU
    • L: lower triangular, U: upper triangular, D: diagonal matrix of pivots
  – For a symmetric matrix S: S = LDL^T
  – For a positive definite matrix Σ: Σ = LDL^T = (L√D)(L√D)^T = PP^T
  – This is called Cholesky decomposition
• X1, ..., Xn are dependent
  – σij ≠ 0

Page 52: 07 approximate inference in bn


Multivariate Gaussian Generator - Dependent (2/4)
• Generate a sample of X with μ, Σ
  – Perform Cholesky decomposition of Σ
    • Cholesky decomposition is pivot decomposition for a positive definite matrix
    • Σ = PP^T
  – Generate an independent Gaussian Y = (Y1, ..., Yn) with μi = 0, σi = 1
  – X = PY + μ

Page 53: 07 approximate inference in bn


Multivariate Gaussian Generator - Dependent (3/4)
• Pseudo code to generate a sample of X with μ, Σ
  Matrix Sigma;
  Vector mu;
  Vector X(n), Y(n);        // a sample
  Matrix P = chol(Sigma)';  // Cholesky decomposition
                            // (Matlab's chol returns the upper-triangular factor,
                            //  so transpose to get the lower-triangular P with Σ = PP^T)
  for (i = 0; i < n; i++) Y(i) = Gaussian();
  X = P * Y + mu;

Page 54: 07 approximate inference in bn


Multivariate Gaussian Generator - Dependent (4/4)
• Proof
  – For n random variables X = (X1, ..., Xn) with μ, Σ
  – Generate n independent, zero-mean, unit-variance normal random variables Y = (Y1, ..., Yn)
    μY = (0, ..., 0)^T, ΣY = I (the identity matrix)
  – Take X = PY + μ, where Σ = PP^T
  – Covariance matrix of X:
    E{(X−μ)(X−μ)^T} = E{(PY)(PY)^T} = E{P YY^T P^T} = P E{YY^T} P^T = P I P^T = PP^T = Σ

Page 55: 07 approximate inference in bn


Example – Matlab (1/4)
• Assume μX = (0,0)^T, ΣX = [1 1/2; 1/2 1], so P = [1, 0; 1/2, sqrt(3)/2]
• Matlab:
  mx=[0 0]';
  Cx=[1 1/2; 1/2 1];
  P=chol(Cx)'; % transposed: chol returns the upper-triangular factor

Page 56: 07 approximate inference in bn


Example – Matlab (2/4)
• Randomly generate 1000 samples for μX = (0,0)^T, ΣX = [1 1/2; 1/2 1]
  mx=zeros(2,1000);
  y1=randn(1,1000);
  y2=randn(1,1000);
  y=[y1;y2];
  P=[1, 0; 1/2, sqrt(3)/2];
  x=P*y+mx;
  x1=x(1,:);
  x2=x(2,:);
  plot(x1,x2,'.');
  r=corrcoef(x1',x2');

Page 57: 07 approximate inference in bn


Example – Matlab (3/4)
• Assume μX = (5,5)^T, ΣX = [1 9/10; 9/10 1], so P = [1, 0; 9/10, sqrt(19)/10]
• Matlab:
  mx=[5 5]';
  Cx=[1 9/10; 9/10 1];
  P=chol(Cx)'; % transposed: chol returns the upper-triangular factor

Page 58: 07 approximate inference in bn


Example – Matlab (4/4)
• Randomly generate 1000 samples for μX = (5,5)^T, ΣX = [1 9/10; 9/10 1]
  mx=5*ones(2,1000);
  y1=randn(1,1000);
  y2=randn(1,1000);
  y=[y1;y2];
  P=[1, 0; 9/10, sqrt(19)/10];
  x=P*y+mx;
  x1=x(1,:);
  x2=x(2,:);
  plot(x1,x2,'.');
  r=corrcoef(x1',x2');

Page 59: 07 approximate inference in bn


Multivariate Gaussian Mixture Generator
• Generate N samples of X with a mixture of M Gaussians (Matlab-like pseudo code)
  for (m = 0; m < M; m++)
  { Matrix P = chol(Sigma_m)' // Cholesky decomposition
    for (i = 0; i < N*pi_m; i++)
    { // Generate n independent normally distributed R.V. (mu=0, sigma=1)
      y = randn(1, n)
      // Transform y into x
      x = P * y + mu_m
    }
  }

Page 60: 07 approximate inference in bn


Example – Matlab (1/4)
• Combine the previous two Gaussians: π1 = 0.5, π2 = 0.5,
  μ1 = (0,0)^T, Σ1 = [1 1/2; 1/2 1]
  μ2 = (5,5)^T, Σ2 = [1 9/10; 9/10 1]
[Figure: scatter plot of the mixture samples; two clusters of equal size, centered at (0,0) and (5,5)]

Page 61: 07 approximate inference in bn


Example – Matlab (2/4)
  pi1=0.5; pi2=0.5; N=2000;
  mx1=zeros(2,pi1*N); Cx1=[1 1/2; 1/2 1];
  P1=chol(Cx1)'; %P1=[1, 0; 1/2, sqrt(3)/2];
  y1_1=randn(1,pi1*N); y1_2=randn(1,pi1*N);
  y1=[y1_1;y1_2];
  x1=P1*y1+mx1; x1_1=x1(1,:); x1_2=x1(2,:);

  mx2=5*ones(2,pi2*N); Cx2=[1 9/10; 9/10 1];
  P2=chol(Cx2)'; %P2=[1, 0; 9/10, sqrt(19)/10];
  y2_1=randn(1,pi2*N); y2_2=randn(1,pi2*N);
  y2=[y2_1;y2_2];
  x2=P2*y2+mx2; x2_1=x2(1,:); x2_2=x2(2,:);

  z1=[x1_1,x2_1]; z2=[x1_2,x2_2];
  plot(z1,z2,'.');

Page 62: 07 approximate inference in bn


Example – Matlab (3/4)
• Combine the previous two Gaussians: π1 = 0.2, π2 = 0.8,
  μ1 = (0,0)^T, Σ1 = [1 1/2; 1/2 1]
  μ2 = (5,5)^T, Σ2 = [1 9/10; 9/10 1]
[Figure: scatter plot of the mixture samples; a small cluster at (0,0) and a larger cluster at (5,5)]

Page 63: 07 approximate inference in bn


Example – Matlab (4/4)
  pi1=0.2; pi2=0.8; N=2000;
  mx1=zeros(2,pi1*N); Cx1=[1 1/2; 1/2 1];
  P1=chol(Cx1)'; %P1=[1, 0; 1/2, sqrt(3)/2];
  y1_1=randn(1,pi1*N); y1_2=randn(1,pi1*N);
  y1=[y1_1;y1_2];
  x1=P1*y1+mx1; x1_1=x1(1,:); x1_2=x1(2,:);

  mx2=5*ones(2,pi2*N); Cx2=[1 9/10; 9/10 1];
  P2=chol(Cx2)'; %P2=[1, 0; 9/10, sqrt(19)/10];
  y2_1=randn(1,pi2*N); y2_2=randn(1,pi2*N);
  y2=[y2_1;y2_2];
  x2=P2*y2+mx2; x2_1=x2(1,:); x2_2=x2(2,:);

  z1=[x1_1,x2_1]; z2=[x1_2,x2_2];
  plot(z1,z2,'.');

Page 64: 07 approximate inference in bn


Exercise
• Write a program to randomly generate 1000 samples of a 3-dimensional Gaussian with μ = (5, 10, -3), Σ = (2,1,3; 4,2,2; 3,1,2)
  (Check that Σ is symmetric positive definite before applying Cholesky decomposition.)

Page 65: 07 approximate inference in bn


Any Distribution
• For random variables X1, ..., Xn
  – Boolean, discrete, continuous, hybrid
• We know P(X1, ..., Xn) has no closed-form formula
  – Independent: P(X1, ..., Xn) = P(X1) ... P(Xn)
  – Dependent: P(X1, ..., Xn) = Π_i P(Xi | Parent(Xi))
• Generate a sample (X1, ..., Xn) according to P(X1, ..., Xn)
  – Independent: generate each Xi by P(Xi)
  – Dependent: generate each Xi by P(Xi | Parent(Xi))

Page 66: 07 approximate inference in bn


Two Boolean R.V.s - Independent
• X1, X2 have distributions:
  – P(X1) = <0.67, 0.33>, P(X2) = <0.75, 0.25>
  int X1, X2;
  for (i = 0; i < 1000; i++)
  { if (rand() > RAND_MAX/3)
      X1 = 1;
    else X1 = 0;
    if (rand() > RAND_MAX/4)
      X2 = 1;
    else X2 = 0;
  }
[Figure: histograms of X1 (P(X1=1) = 0.67) and X2 (P(X2=1) = 0.75)]

Page 67: 07 approximate inference in bn


Two Boolean R.V.s - Dependent
• X1, X2 have distributions:
  – P(X1) = <0.67, 0.33>
  – P(X2|X1=T) = <0.75, 0.25>, P(X2|X1=F) = <0.8, 0.2>
• Generate a sample (x1, x2)
  if (rand() > RAND_MAX/3) x1 = 1;
  else x1 = 0;
  if (x1 == 1)
  { if (rand() > RAND_MAX/4) x2 = 1;
    else x2 = 0;
  }
  else // x1 == 0
  { if (rand() > RAND_MAX/5) x2 = 1;
    else x2 = 0;
  }

Page 68: 07 approximate inference in bn


Markov Chain
• Markov chain: n random variables
  X1 → ... → Xk → ... → Xn


Bayesian Network
• Example: 5 random variables

  Burglary → Alarm ← Earthquake
  Alarm → JohnCalls, Alarm → MaryCalls


3. Stochastic Simulation
• Also called
  – Monte Carlo methods
  – Sampling methods
• Sub-sections
  – 3.1 Direct sampling
  – 3.2 Rejection sampling
  – 3.3 Likelihood weighting


3.1 Direct Sampling
• Generate N samples randomly
• For the inference P(X|E)
  – P(X|E) = P(X∧E) / P(E)
  – Get N(E) and N(X∧E) from the N samples
    • N(E): no. of samples with E
    • N(X∧E): no. of samples with X and E
  – P(E) ≈ N(E)/N, P(X∧E) ≈ N(X∧E)/N
  – P(X|E) ≈ N(X∧E) / N(E)


Example (1/4)
• For the sprinkler network
  – Estimate P(r|¬w) by direct sampling
  – 4 random variables
  – A sample = (c, s, r, w)


Example (2/4)
• Generate 1000 samples

  Cloudy  Sprinkler  Rain  WetGrass
    T         T       T       F
    F         T       T       F
    F         F       T       T
    T         T       T       F
    T         T       T       F
   ...       ...     ...     ...
    F         T       T       F


Example (3/4)
• P(r|¬w) = P(r, ¬w) / P(¬w)

  Cloudy  Sprinkler  Rain  WetGrass
    T         T       T       F
    F         T       T       F
    F         F       T       T
    T         T       F       F
   ...       ...     ...     ...
    F         T       T       F

  N¬w: no. of samples with WetGrass=false
  Nr∧¬w: no. of samples with (Rain=true ∧ WetGrass=false)
  P(r|¬w) ≈ Nr∧¬w / N¬w


Example (4/4)
• P(R|¬w)
  – = P(R, ¬w)/P(¬w)
  – = < P(r ∧ ¬w)/P(¬w), P(¬r ∧ ¬w)/P(¬w) >

  Cloudy  Sprinkler  Rain  WetGrass
    T         T       T       F
    F         T       T       F
    F         F       T       T
    T         T       F       F
   ...       ...     ...     ...
    F         T       T       F


How to Generate a Sample for the Bayesian Network? (1/3)
• The sprinkler Bayesian network
• Assume a sampling order: [Cloudy, Sprinkler, Rain, WetGrass]
• A sample is an atomic event:
  (cloudy, sprinkler, rain, wetgrass) = (T, F, T, T)


How to Generate a Sample for the Bayesian Network? (2/3)

  int C, S, R, W;
  for (i = 0; i < 1000; i++) {
      if (rand() > RAND_MAX/2) C = T; else C = F;
      if (rand() > RAND_MAX/2) S = T; else S = F;
      if (rand() > RAND_MAX/2) R = T; else R = F;
      if (rand() > RAND_MAX/2) W = T; else W = F;
  }

  Incorrect implementation: every variable is sampled with probability 0.5, ignoring the CPTs


How to Generate a Sample for the Bayesian Network? (3/3)

  int C, S, R, W;
  for (i = 0; i < 1000; i++) {
      if (rand() > RAND_MAX/2) C = T;
      else                     C = F;
      if (C == T)
          if (rand() > RAND_MAX*0.9) S = T;  /* P(s|c) = 0.1 */
          else                       S = F;
      else  /* C == F */
          if (rand() > RAND_MAX/2) S = T;    /* P(s|~c) = 0.5 */
          else                     S = F;
      ...
  }


An Example Generating One Sample (1/8)
• The sampling algorithm
  1. Sample from P(Cloudy) = <0.5, 0.5>
     – Suppose it returns true
  2. Sample from P(Sprinkler|Cloudy=true) = <0.1, 0.9>
     – Suppose it returns false
  3. Sample from P(Rain|Cloudy=true) = <0.8, 0.2>
     – Suppose it returns true
  4. Sample from P(WetGrass|Sprinkler=false, Rain=true) = <0.9, 0.1>
     – Suppose it returns true


An Example Generating One Sample (2/8)

Samples:

C S R W


An Example Generating One Sample (3/8)
Random sampling: Cloudy
Return: Cloudy=true
Samples:
  C S R W
  c


An Example Generating One Sample (4/8)
Random sampling: 1. Sprinkler, 2. Rain, given Cloudy=true
Samples:
  C S R W
  c


An Example Generating One Sample (5/8)
Random sampling: Sprinkler, given Cloudy=true
Return: Sprinkler=false
Samples:
  C S R W
  c ¬s


An Example Generating One Sample (6/8)
Random sampling: Rain, given Cloudy=true
Return: Rain=true
Samples:
  C S R W
  c ¬s r


An Example Generating One Sample (7/8)
Random sampling: WetGrass, given Rain=true, Sprinkler=false
Samples:
  C S R W
  c ¬s r


An Example Generating One Sample (8/8)
Random sampling: WetGrass, given Rain=true, Sprinkler=false
Return: WetGrass=true
Samples:
  C S R W
  c ¬s r w


The Algorithm (1/2)
• To generate one sample


The Algorithm (2/2)
• In the previous example
  – We got a sample [true, false, true, true] of the Bayesian network using Prior-Sample
• The sampling of a Bayesian network
  – Repeat the sampling N times
  – We get N samples
• We can use the N samples to compute any query probability in the Bayesian network


How It Works (1/2)
• Why can any probability be answered from the sampling?
  – The N samples actually form a full joint distribution table (FJD)

  Samples:            FJD:
  C S R W             C S R W    P
  T T T F             T T T F   0.02
  F T T F             F T T F   0.13
  F F T T             F F T T   0.04
  T T F F             T T F F   0.15
  ...                 ...


Why It Works (2/2)
• A sample is an atomic event (x1, ..., xn)
• P(x1, ..., xn) ≈ N(x1, ..., xn) / N
• Therefore, an FJD is generated from the N samples
• Note: the number of samples N can be far smaller than the 2^n table entries


Exercise: Direct Sampling

Network: smart → prepared, study → prepared;
         smart → pass, prepared → pass, fair → pass

p(smart) = .8,  p(study) = .6,  p(fair) = .9

p(prep|…)     smart   ¬smart
  study        .9       .7
  ¬study       .5       .1

p(pass|…)          smart            ¬smart
                prep   ¬prep     prep   ¬prep
  fair           .9      .7       .7      .2
  ¬fair          .1      .1       .1      .1

Query: What is the probability that a student studied, given that they pass the exam?


Problems of Direct Sampling
• It needs to generate very many samples in order to obtain the approximate FJD
• For a query of conditional probability P(X|e)
  – Can we just approximate the conditional probability?
  – Yes, the following two algorithms will do this


3.2 Rejection Sampling
• P̂(X|e) is estimated from samples agreeing with e


An Example
• Estimate P(Rain|Sprinkler=true) using 100 samples
  – 27 samples have Sprinkler=true
  – Of these, 8 have Rain=true and 19 have Rain=false
  – P(Rain|Sprinkler=true) = Normalize(<8, 19>) = <0.296, 0.704>
• Similar to a basic real-world empirical estimation procedure


Analysis of Rejection Sampling

  P̂(X|e) = N(X, e) / N(e) ≈ P(X, e) / P(e) = P(X|e)

• Hence rejection sampling returns consistent posterior estimates
• Problem: expensive if P(e) is small
  – P(e) drops off exponentially with the number of evidence variables!


3.3 Likelihood Weighting
• Avoids the inefficiency of rejection sampling
  – By generating only events consistent with the evidence variables e
• Idea
  – Fix the evidence variables
  – Randomly sample only the hidden variables to generate a sample event
  – Weight each sample event by the likelihood it accords the evidence
• Events have different weights


An Example (1/9)
• Query P(Rain|sprinkler, wetgrass)


An Example (2/9)
1. Set the weight w = 1.0
2. Sample from P(Cloudy) = <0.5, 0.5>
   • Suppose it returns true
3. The evidence Sprinkler=true, so we set
   w ← w × P(sprinkler|cloudy) = 1 × 0.1 = 0.1
4. Sample from P(Rain|cloudy) = <0.8, 0.2>
   • Suppose it returns true
5. The evidence WetGrass=true, so we set
   w ← w × P(wetgrass|sprinkler, rain) = 0.1 × 0.99 = 0.099

A sample event (true, true, true, true) with weight 0.099


An Example (3/9)

w = 1.0


An Example (4/9)

w = 1.0


An Example (5/9)

w = 1.0


An Example (6/9)

w = 1.0 × 0.1 = 0.1


An Example (7/9)

w = 1.0 × 0.1 = 0.1


An Example (8/9)

w = 1.0 × 0.1 = 0.1


An Example (9/9)

w = 1.0 × 0.1 × 0.99 = 0.099


The Algorithm (1/2)
• The example generates a sample event (true, true, true, true) for the query P(Rain|sprinkler, wetgrass)
• Repeat the sampling N times
  – We get N sample events
  – Each event has a likelihood weight w
  – w1 = total weight of events with rain=true, w2 = total weight of events with rain=false
• P(Rain|sprinkler, wetgrass) = < w1/(w1+w2), w2/(w1+w2) >


The Algorithm (2/2)


Exercise: Likelihood Weighting

Network: smart → prepared, study → prepared;
         smart → pass, prepared → pass, fair → pass

p(smart) = .8,  p(study) = .6,  p(fair) = .9

p(prep|…)     smart   ¬smart
  study        .9       .7
  ¬study       .5       .1

p(pass|…)          smart            ¬smart
                prep   ¬prep     prep   ¬prep
  fair           .9      .7       .7      .2
  ¬fair          .1      .1       .1      .1

Query: What is the probability that a student studied, given that they pass the exam?


Analysis (1/3)
• Why does the algorithm work for P(X|E=e)?
• Let the sampling probability for WEIGHTED-SAMPLE be S_WS
  – The evidence variables E are fixed with e
  – All the other variables Z = {X} ∪ Y
  – The algorithm samples each variable in Z given its parent values:

  S_WS(z, e) = ∏_{i=1}^{l} P(z_i | parents(Z_i))


Analysis (2/3)
• The likelihood weight w for a given sample (z, e) = (x, y, e) is

  w(z, e) = ∏_{i=1}^{m} P(e_i | parents(E_i))

• The weighted probability of a sample (z, e) = (x, y, e) is

  S_WS(z, e) w(z, e) = ∏_{i=1}^{l} P(z_i | parents(Z_i)) ∏_{i=1}^{m} P(e_i | parents(E_i))
                     = P(x, y, e)

  (because P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | parents(X_i)))


Analysis (3/3)

  P̂(x|e) = α Σ_y N_WS(x, y, e) w(x, y, e)     (from the generated samples)
          ≈ α' Σ_y S_WS(x, y, e) w(x, y, e)    (for large N)
          = α' Σ_y P(x, y, e)
          = α' P(x, e)
          = P(x|e)

So the algorithm works


Discussions
• Likelihood weighting is efficient because it uses all the samples generated
• However, it suffers a degradation in performance as the no. of evidence variables increases, because
  – Most samples will have very low weights
  – The weighted estimate will be dominated by the tiny fraction of samples that accord more than an infinitesimal likelihood to the evidence


4. Inference by MCMC
• Key idea
  – Sampling process as a Markov chain
    • Next sample depends on the previous one
  – Approximate any posterior distribution
• "State" of network = current assignment to all variables
• Generate next state by sampling one variable given its Markov blanket
• Sample each variable in turn, keeping evidence fixed


The Markov Chain
• With Sprinkler=true, WetGrass=true, there are four states:
  (Cloudy, Rain) ∈ { (T,T), (T,F), (F,T), (F,F) }


Markov Blanket Sampling
• Markov blanket of Cloudy is
  – Sprinkler and Rain
• Markov blanket of Rain is
  – Cloudy, Sprinkler, and WetGrass
• Probability given the Markov blanket is calculated as follows:

  P(x'_i | MB(X_i)) = α P(x'_i | Parents(X_i)) ∏_{Z_j ∈ Children(X_i)} P(z_j | Parents(Z_j))

  (α normalizes over the values x'_i)


An Example (1/2)
• Estimate P(Rain|sprinkler, wetgrass)
• Loop for N times
  – Sample Cloudy or Rain given its Markov blanket
• Count the number of times Rain=true and Rain=false in the samples


An Example (2/2)
• E.g., visit 100 states
  – 31 have Rain=true
  – 69 have Rain=false
• P(Rain|sprinkler, wetgrass) = Normalize(<31, 69>) = <0.31, 0.69>


The Algorithm


Why It Works
• Skipped
  – Details in pp. 517-518 of the AIMA 2e textbook


Sub-Sections
• 4.1 Markov chain theory
• 4.2 Two MCMC sampling algorithms


4.1 Markov Chain Theory
• Suppose X1, X2, … take some set of values
  – w.l.o.g. these values are 1, 2, ...
• A Markov chain is a process that corresponds to the network:

  X1 → X2 → X3 → … → Xn → …

• To quantify the chain, we need to specify
  – Initial probability: P(X1)
  – Transition probability: P(Xt+1|Xt)
• A Markov chain has stationary transition probability: P(Xt+1|Xt) is the same for all times t


Irreducible Chains
• A state j is accessible from state i if there is an n such that P(Xn = j | X1 = i) > 0
  – There is a positive probability of reaching j from i after some number of steps
• A chain is irreducible if every state is accessible from every state


Ergodic Chains
• A state i is positively recurrent if there is a finite expected time to get back to state i after being in state i
  – If X has a finite number of states, it suffices that i is accessible from itself
• A chain is ergodic if it is irreducible and every state is positively recurrent


(A)periodic Chains
• A state i is periodic if there is an integer d > 1 such that P(Xn = i | X1 = i) = 0 whenever n is not divisible by d
• Intuition: state i may occur only every d steps
• A chain is aperiodic if it contains no periodic state


Stationary Probabilities
Thm:
• If a chain is ergodic and aperiodic, then the limit

  lim_{n→∞} P(Xn = j | X1 = i)

  exists, and does not depend on i
• Moreover, let

  P*(X = j) = lim_{n→∞} P(Xn = j | X1 = i)

  then P*(X) is the unique probability satisfying

  P*(X = j) = Σ_i P(Xt+1 = j | Xt = i) P*(X = i)


Stationary Probabilities
• The probability P*(X) is the stationary probability of the process
• Regardless of the starting point, the process will converge to this probability
• The rate of convergence depends on properties of the transition probability


Sampling from the Stationary Probability
• This theory suggests how to sample from the stationary probability:
  – Set X1 = i, for some random/arbitrary i
  – For t = 1, 2, …, n
    • Sample a value x_{t+1} for X_{t+1} from P(X_{t+1}|X_t = x_t)
  – Return x_n
• If n is large enough, then this is a sample from P*(X)


Designing Markov Chains
• How do we construct the right chain to sample from?
  – Ensuring aperiodicity and irreducibility is usually easy
• The problem is ensuring the desired stationary probability


Designing Markov Chains
Key tool:
• If the transition probability satisfies the detailed-balance condition

  Q(X = i) P(Xt+1 = j | Xt = i) = Q(X = j) P(Xt+1 = i | Xt = j)
  whenever P(Xt+1 = j | Xt = i) > 0

  then P*(X) = Q(X)
• This gives a local criterion for checking that the chain will have the right stationary distribution


MCMC Methods
• We can use these results to sample from P(X1,…,Xn|e)
Idea:
• Construct an ergodic & aperiodic Markov chain such that P*(X1,…,Xn) = P(X1,…,Xn|e)
• Simulate the chain n steps to get a sample


MCMC Methods
Notes:
• The Markov chain variable Y takes as values assignments to all variables that are consistent with the evidence
• For simplicity, we will denote such a state using the vector of variables:

  V(Y) = { (x1, …, xn) ∈ V(X1) × … × V(Xn) | x1, …, xn satisfy e }


4.2 Two MCMC Sampling Algorithms
• Gibbs sampler
• Metropolis-Hastings sampler


Gibbs Sampler
• One of the simplest MCMC methods
• Each transition changes the state of only one Xi
• The transition probability is defined by P itself as a stochastic procedure:
  – Input: a state x1,…,xn
  – Choose i at random (uniform probability)
  – Sample x'_i from P(Xi | x1, …, x_{i-1}, x_{i+1}, …, xn, e)
  – Let x'_j = x_j for all j ≠ i
  – Return x'_1,…,x'_n


Correctness of Gibbs Sampler
• How do we show correctness?


Correctness of Gibbs Sampler
• By the chain rule,
  P(x1,…,x_{i-1}, x_i, x_{i+1},…,xn | e)
  = P(x1,…,x_{i-1}, x_{i+1},…,xn | e) P(x_i | x1,…,x_{i-1}, x_{i+1},…,xn, e)
• Thus, we get (transition)

  P(x1,…,x_{i-1}, x_i, x_{i+1},…,xn | e) / P(x1,…,x_{i-1}, x'_i, x_{i+1},…,xn | e)
    = P(x_i | x1,…,x_{i-1}, x_{i+1},…,xn, e) / P(x'_i | x1,…,x_{i-1}, x_{i+1},…,xn, e)

• Since we choose i from the same distribution at each stage, this procedure satisfies the ratio criterion


Gibbs Sampling for Bayesian Networks
• Why is the Gibbs sampler "easy" in BNs?
• Recall that the Markov blanket of a variable separates it from the other variables in the network
  – P(Xi | X1,…,X_{i-1},X_{i+1},…,Xn) = P(Xi | Mb_i)
• This property allows us to use local computations to perform sampling in each transition


Gibbs Sampling in Bayesian Networks
• How do we evaluate P(Xi | x1,…,x_{i-1},x_{i+1},…,xn)?
• Let Y1, …, Yk be the children of Xi
  – By definition of Mb_i, the parents of Yj are in Mb_i ∪ {Xi}
• It is easy to show that

  P(x_i | Mb_i) = P(x_i | Pa_i) ∏_j P(y_j | pa_{Y_j})
                  / Σ_{x'_i} P(x'_i | Pa_i) ∏_j P(y_j | pa_{Y_j})

  (the denominator sums the numerator over the values x'_i of Xi)


Metropolis-Hastings
• More general than Gibbs (Gibbs is a special case of M-H)
• Proposal distribution: an arbitrary q(x'|x) that is ergodic and aperiodic (e.g., uniform)
• Transition to x' happens with probability

  α(x'|x) = min(1, P(x') q(x|x') / (P(x) q(x'|x)))

• Useful when normalizing P(x) is infeasible, since only ratios of P are needed
• Requirement: q(x'|x) = 0 implies P(x') = 0 or q(x|x') = 0


Sampling Strategy
• How do we collect the samples?
Strategy I:
• Run the chain M times, each for N steps
  – Each run starts from a different starting point
• Return the last state in each run
  (M chains)


Sampling Strategy
Strategy II:
• Run one chain for a long time
• After some "burn in" period, sample points every some fixed number of steps
  ("burn in", then M samples from one chain)


Comparing Strategies
Strategy I:
  – Better chance of "covering" the space of points, especially if the chain is slow to reach stationarity
  – Have to perform "burn in" steps for each chain
Strategy II:
  – Perform "burn in" only once
  – Samples might be correlated (although only weakly)
Hybrid strategy:
  – Run several chains, sample a few times from each
  – Combines benefits of both strategies


Short Summary - Approximate Inference
• Monte Carlo (sampling) methods:
  – Pro: simplicity of implementation and theoretical guarantee of convergence
  – Con: can be slow to converge, and convergence can be hard to diagnose
• Variational methods - your presentation
• Loopy belief propagation and generalized belief propagation - your presentation


Exercise: MCMC Sampling

Network: smart → prepared, study → prepared;
         smart → pass, prepared → pass, fair → pass

p(smart) = .8,  p(study) = .6,  p(fair) = .9

p(prep|…)     smart   ¬smart
  study        .9       .7
  ¬study       .5       .1

p(pass|…)          smart            ¬smart
                prep   ¬prep     prep   ¬prep
  fair           .9      .7       .7      .2
  ¬fair          .1      .1       .1      .1

Query: What is the probability that a student studied, given that they pass the exam?


Main Computational Problems
1. Difficult to tell if convergence has been achieved
2. Can be wasteful if the Markov blanket is large
   – P(Xi|MB(Xi)) won't change much (law of large numbers)


5. Loopy Belief Propagation
• TBU


6. Variational Methods
• TBU


7. Implementation by PNL

  Algorithm             PNL                  GeNIe
  Enumeration           -                    v (Naïve)
  Variable Elimination  -                    -
  Belief Propagation    v (Pearl)            v (Polytree)
  Junction Tree         v                    v (Clustering)
  Direct Sampling       -                    v (Logic)
  Likelihood Sampling   v (LWSampling)       v (Likelihood sampling)
  MCMC Sampling         v (GibbsWithAnneal)  (other 5 samplings)


8. Summary
• Exact inference by variable elimination
  – Polytime on polytrees
  – NP-hard on general graphs
  – Space = time; very sensitive to topology


Summary
• Approximate inference by LW, MCMC
  – LW does poorly when there is lots of (downstream) evidence
  – LW, MCMC are generally insensitive to topology
  – Convergence can be very slow with probabilities close to 1 or 0
  – Can handle arbitrary combinations of discrete and continuous variables

Page 150: 07 approximate inference in bn


Summary
• What we know
– What a Bayesian network is
– How to do inference, given a Bayesian network
• However, we still need to know
– How to learn CPTs
– How to build, or automatically learn, the structure of a Bayesian network from a given set of data

Page 151: 07 approximate inference in bn


9. References
• General introduction to probabilistic inference in BN
– B. D'Ambrosio, "Inference in Bayesian networks," AI Magazine, 1999.
– M. I. Jordan and Y. Weiss, "Probabilistic inference in graphical models."
– C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, "An introduction to MCMC for machine learning," Machine Learning, vol. 50, pp. 5-43, 2003.

Page 152: 07 approximate inference in bn


Recent Books
• R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall, 2004.
• C. Borgelt and R. Kruse, Graphical Models: Methods for Data Analysis and Mining, Wiley, 2002.
• D. Edwards, Introduction to Graphical Modelling, 2nd ed., Springer, 2000.
• S. L. Lauritzen, Graphical Models, Oxford, 1996.
• M. I. Jordan (ed.), Learning in Graphical Models, MIT, 2001.

Page 153: 07 approximate inference in bn


Appendix
• Theoretical analysis of approximation error

Page 154: 07 approximate inference in bn


Types of Approximations: Absolute Error
• An estimate q of P(X=x|e) has absolute error ε if
  P(X=x|e) − ε ≤ q ≤ P(X=x|e) + ε
  or equivalently
  q − ε ≤ P(X=x|e) ≤ q + ε
• Not always what we want: error ε = 0.001 is
– Unacceptable if P(X = x | e) = 0.0001
– Overly precise if P(X = x | e) = 0.3
[Figure: the interval [0, 1] with q at the center of an uncertainty band of width 2ε]

Page 155: 07 approximate inference in bn


Types of Approximations: Relative Error
• An estimate q of P(X=x|e) has relative error ε if
  P(X=x|e)(1−ε) ≤ q ≤ P(X=x|e)(1+ε)
  or equivalently
  q/(1+ε) ≤ P(X=x|e) ≤ q/(1−ε)
• Sensitivity of the approximation depends on the actual value of the desired result
[Figure: the interval [0, 1] with q bracketed by q/(1+ε) and q/(1−ε)]
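The contrast between the two error notions can be checked numerically. A small Python sketch (the helper names and the numeric values are illustrative only):

```python
def within_absolute(q, p, eps):
    """q approximates p with absolute error eps: p - eps <= q <= p + eps."""
    return p - eps <= q <= p + eps

def within_relative(q, p, eps):
    """q approximates p with relative error eps: p(1-eps) <= q <= p(1+eps)."""
    return p * (1 - eps) <= q <= p * (1 + eps)

# With eps = 0.001: q = 0.0009 counts as an acceptable absolute-error
# estimate of p = 0.0001 even though it is 9x too large; the relative
# error criterion rejects it, matching the slide's point.
ok_abs = within_absolute(0.0009, 0.0001, 0.001)   # True
ok_rel = within_relative(0.0009, 0.0001, 0.001)   # False
```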

Page 156: 07 approximate inference in bn


Complexity
• Recall, exact inference is NP-hard
• Is approximate inference any easier?
• Construction for exact inference:
– Input: a 3-SAT problem φ
– Output: a BN such that P(X=t) > 0 iff φ is satisfiable

Page 157: 07 approximate inference in bn


Complexity: Relative Error
• Suppose that q is a relative-error ε estimate of P(X = t)
• If φ is not satisfiable, then P(X = t) = 0, so
  0 = P(X = t)(1 − ε) ≤ q ≤ P(X = t)(1 + ε) = 0
  Thus, if q > 0, then φ is satisfiable
• An immediate consequence:
  Thm: Given ε, finding an ε-relative-error approximation is NP-hard

Page 158: 07 approximate inference in bn


Complexity: Absolute Error
• Thm: If ε < 0.5, then finding an estimate of P(X=x|e) with absolute error ε is NP-hard

Page 159: 07 approximate inference in bn


Likelihood Weighting
• Can we ensure that all of our samples satisfy e?
• One simple solution:
– When we need to sample a variable that is assigned a value by e, use the specified value
• For example: we know Y = 1
– Sample X from P(X)
– Then take Y = 1
• Is this a sample from P(X,Y | Y = 1)?
[Network: X → Y]

Page 160: 07 approximate inference in bn


Likelihood Weighting
• Problem: these samples of X are from P(X), not from P(X | Y = 1)
• Solution:
– Penalize samples in which P(Y=1|X) is small
• We now sample as follows:
– Let x[i] be a sample from P(X)
– Let w[i] be P(Y = 1 | X = x[i])

  P(X = x | Y = 1) ≈ Σ_{i: x[i]=x} w[i] / Σ_i w[i]

[Network: X → Y]

Page 161: 07 approximate inference in bn


Likelihood Weighting
• Why does this make sense?
• When N is large, we expect to sample N·P(X = x) samples with x[i] = x
• Thus,
  Σ_{i: x[i]=x} w[i] ≈ N·P(X = x)·P(Y = 1 | X = x) = N·P(X = x, Y = 1)
• When we normalize, we get an approximation of the conditional probability

Page 162: 07 approximate inference in bn


Likelihood Weighting: Example (pages 162-166)

[Figure: Burglary network — Burglary → Alarm ← Earthquake, Alarm → Call, Earthquake → Radio; sample columns B E A C R]

CPTs (as recoverable from the slides):
  P(b) = 0.03          P(e) = 0.001
  P(a | B,E): 0.98, 0.4, 0.7, 0.01 over the four configurations of (B, E)
  P(c | a) = 0.8       P(c | ¬a) = 0.05
  P(r | e) = 0.3       P(r | ¬e) = 0.001

These slides step through the construction of one weighted sample with evidence on Alarm and Radio: each non-evidence variable (B, E, C) is sampled from its CPT given the values already assigned, while each evidence variable keeps its observed value and contributes its CPT probability as a factor to the weight. The running weight shown for the final sample is 0.6 × 0.3.

Page 167: 07 approximate inference in bn


Likelihood Weighting
• Let X1, …, Xn be an order of the variables consistent with arc direction
• w ← 1
• for i = 1, …, n do
– if Xi = xi has been observed
  • w ← w · P(Xi = xi | pai)
– else
  • sample xi from P(Xi | pai)
• return x1, …, xn, and w
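The loop above can be sketched in Python on the two-node network X → Y used earlier. The CPT numbers and the names weighted_sample and lw_query_x1 are illustrative assumptions, not from the slides:

```python
import random

# Illustrative two-node network X -> Y (made-up CPT values).
P_X1 = 0.3                        # P(X=1)
P_Y1_GIVEN_X = {0: 0.2, 1: 0.9}   # P(Y=1 | X)

def weighted_sample(evidence, rng):
    """One pass of the likelihood-weighting loop: sample unobserved
    variables from their CPTs; multiply w by the CPT probability of
    each observed variable instead of sampling it."""
    w = 1.0
    sample = {}
    # X (no parents)
    if 'X' in evidence:
        sample['X'] = evidence['X']
        w *= P_X1 if sample['X'] == 1 else 1 - P_X1
    else:
        sample['X'] = 1 if rng.random() < P_X1 else 0
    # Y (parent X)
    p_y1 = P_Y1_GIVEN_X[sample['X']]
    if 'Y' in evidence:
        sample['Y'] = evidence['Y']
        w *= p_y1 if sample['Y'] == 1 else 1 - p_y1
    else:
        sample['Y'] = 1 if rng.random() < p_y1 else 0
    return sample, w

def lw_query_x1(evidence, n=50000, seed=0):
    """Estimate P(X=1 | evidence) as a weight-normalized count."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        s, w = weighted_sample(evidence, rng)
        num += w * (s['X'] == 1)
        den += w
    return num / den

# Exact answer: 0.3*0.9 / (0.3*0.9 + 0.7*0.2) ≈ 0.659
estimate = lw_query_x1({'Y': 1})
```

Note that every sample is consistent with the evidence by construction; the weight corrects for having forced the evidence values.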

Page 168: 07 approximate inference in bn


Importance Sampling
• A method for evaluating the expectation of f under P(x), ⟨f⟩_P(X)
• Discrete:    ⟨f⟩_P(X) = Σ_x f(x) P(x)
• Continuous:  ⟨f⟩_P(X) = ∫ f(x) P(x) dx
• If we could sample from P:
  ⟨f⟩_P(X) ≈ (1/R) Σ_r f(x[r])

Page 169: 07 approximate inference in bn


Importance Sampling
A general method for evaluating ⟨f⟩_P(X) when we cannot sample from P(X).
Idea: Choose an approximating distribution Q(X) and sample from it.

  ⟨f⟩_P(X) = ∫ f(x) P(x) dx = ∫ f(x) [P(x)/Q(x)] Q(x) dx = ⟨f(X) W(X)⟩_Q(X),
  where W(X) = P(X)/Q(X)

Using this we can now sample from Q:
• If we could generate samples from P(X):
  ⟨f⟩_P(X) ≈ (1/M) Σ_{m=1}^{M} f(x[m])
• Now that we generate the samples from Q(X):
  ⟨f⟩_P(X) ≈ (1/M) Σ_{m=1}^{M} f(x[m]) w(m)

Page 170: 07 approximate inference in bn


(Unnormalized) Importance Sampling

1. For m = 1:M
   – Sample x[m] from Q(X)
   – Calculate w(m) = P(x[m]) / Q(x[m])
2. Estimate the expectation of f(X) using
   ⟨f⟩_P(X) ≈ (1/M) Σ_{m=1}^{M} f(x[m]) w(m)

Requirements:
• Wherever P(x) > 0, also Q(x) > 0 (don't ignore possible scenarios)
• Possible to calculate P(x), Q(x) for a specific X = x
• It is possible to sample from Q(X)

Page 171: 07 approximate inference in bn


Normalized Importance Sampling
Assume that we cannot evaluate P(X=x) but can evaluate P'(X=x) = αP(X=x)
(e.g., in a Bayesian network we can evaluate the joint P(X, e) but not P(X | e)).
We define w'(X) = P'(X)/Q(X). We can then evaluate α:

  ⟨w'(X)⟩_Q(X) = Σ_x Q(x) [P'(x)/Q(x)] = Σ_x P'(x) = α

and then:

  ⟨f⟩_P(X) = ∫ f(x) P(x) dx
           = (1/α) ∫ f(x) [P'(x)/Q(x)] Q(x) dx
           = (1/α) ⟨f(X) w'(X)⟩_Q(X)
           = ⟨f(X) w'(X)⟩_Q(X) / ⟨w'(X)⟩_Q(X)

In the last step we simply replace α with the expression above.

Page 172: 07 approximate inference in bn


Normalized Importance Sampling
We can now estimate the expectation of f(X) similarly to unnormalized importance sampling, by sampling x[m] from Q(X) and then

  ⟨f⟩_P(X) ≈ Σ_{m=1}^{M} f(x[m]) w'(m) / Σ_{m=1}^{M} w'(m)

(hence the name "normalized")
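A numerical sketch of the self-normalizing estimator (Python, illustrative numbers): P is known only up to a constant as P', which mirrors the Bayesian-network situation where P(x, e) is computable but P(x | e) is not.

```python
import random

# P'(x) = alpha * P(x); the normalizer alpha = 10 is never used below.
P_UNNORM = {0: 2.0, 1: 5.0, 2: 3.0}
Q = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}  # proposal we can sample from
f = lambda x: x                      # true E_P[f] = (0*2 + 1*5 + 2*3)/10 = 1.1

def normalized_is(n=100000, seed=0):
    """Normalized importance sampling: ratio of weighted f-sum to
    weight-sum, so the unknown normalizer alpha cancels."""
    rng = random.Random(seed)
    xs = list(Q)
    num = den = 0.0
    for _ in range(n):
        x = rng.choices(xs, weights=[Q[v] for v in xs])[0]
        w = P_UNNORM[x] / Q[x]   # w'(x) = P'(x)/Q(x)
        num += f(x) * w
        den += w                 # sum of w' estimates M * alpha
    return num / den             # self-normalizing estimate

estimate = normalized_is()  # should be close to 1.1
```

The likelihood-weighting estimator earlier in this unit has exactly this weight-normalized form.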

Page 173: 07 approximate inference in bn


Importance Sampling Weaknesses
• Important to choose a sampling distribution with heavy tails
– So as not to "miss" large values of f
• Many-dimensional importance sampling:
– The "typical set" of P may take a long time to find, unless Q is a good approximation to P
– Weights vary by factors exponential in N
• Similar issues arise for likelihood weighting