bayesian regression intro with r

Bayesian Regression An Introduction Using R J Guzmán, PhD January 2011

Upload: josue-guzman

Post on 22-May-2015


DESCRIPTION

Some introductory notes on Bayesian regression using Program R.

TRANSCRIPT

Page 1: Bayesian regression intro with r

Bayesian Regression An Introduction Using R

J Guzmán, PhD

January 2011

Page 2: Bayesian regression intro with r

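Multiple Regression
•  Model:

$\mu \mid X, \beta = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$

$y \mid X, \beta = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$

$\varepsilon \sim N(0, \sigma^2)$

•  k predictors or explanatory variables
•  n observations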
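To make the model concrete, the sketch below simulates data from it with k = 2 predictors and checks that least squares roughly recovers the coefficients. The sample size, coefficient values and variable names are illustrative assumptions, not taken from the slides.

```r
# Simulate n observations from y = b0 + b1*x1 + b2*x2 + e, e ~ N(0, sigma^2).
# All numeric values below are illustrative assumptions.
set.seed(1)
n = 200
beta = c(2, 0.5, -1)     # (beta0, beta1, beta2)
sigma = 0.3

x1 = runif(n, 0, 10)     # a continuous predictor
x2 = rbinom(n, 1, 0.5)   # a 0/1 indicator, like size or status
mu = beta[1] + beta[2] * x1 + beta[3] * x2   # mean response
y = mu + rnorm(n, 0, sigma)                  # add normal errors

sim.fit = lm(y ~ x1 + x2)
coef(sim.fit)            # should be close to beta
```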

Page 3: Bayesian regression intro with r

Multiple Regression
•  Matrix notation:

$y \mid \beta, \sigma^2, X \sim N(X\beta, \sigma^2 I)$

•  $X$: matrix of explanatory variables
•  $\beta$: vector of regression coefficients
•  $y$: response vector (variable of interest)
•  $\varepsilon$: vector of normally distributed random errors
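As a small check of the matrix notation, the following sketch builds a design matrix with an intercept column and confirms that the matrix product agrees with the term-by-term sum. The tiny data frame and coefficient values are made up for illustration.

```r
# Build the design matrix X (intercept column plus predictors) and check
# that X %*% beta reproduces beta0 + beta1*x1 + beta2*x2 row by row.
# The data frame and beta values are illustrative assumptions.
d = data.frame(x1 = c(1, 2, 3, 4), x2 = c(0, 1, 0, 1))
beta = c(2, 0.5, -1)

X = model.matrix(~ x1 + x2, data = d)   # columns: (Intercept), x1, x2
mu.matrix = drop(X %*% beta)            # matrix form of the mean
mu.sum = beta[1] + beta[2] * d$x1 + beta[3] * d$x2

all.equal(unname(mu.matrix), mu.sum)
```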

Page 4: Bayesian regression intro with r

Example: Birds’ Extinction
•  Albert, 2009. Bayesian Computation with R, 2nd edition. Springer.
•  Birds’ extinction – data from 16 islands
•  species – name of bird species
•  time – average time of extinction on the islands
•  nesting – average number of nesting pairs
•  size – species’ size; 1 = large, 0 = small
•  status – species’ status; 1 = resident, 0 = migrant

Page 5: Bayesian regression intro with r

Example

library(LearnBayes)
data(birdextinct)
attach(birdextinct)
birdextinct[1:15, ]

Page 6: Bayesian regression intro with r

Linear Model
•  Graphical display:

log.time = log(time)
plot(log.time ~ nesting)

•  Linear Model

t.lm = lm(log.time ~ nesting + size + status, x = TRUE, y = TRUE)
summary(t.lm)

Page 7: Bayesian regression intro with r

Birds’ Extinction

•  From output:
a.  longer extinction times for species with a larger number of nesting pairs
b.  shorter extinction times for large birds
c.  longer extinction times for resident birds

Page 8: Bayesian regression intro with r

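Bayesian Regression

•  Model: $y \mid \beta, \sigma^2, X \sim N(X\beta, \sigma^2 I)$
•  Two unknown parameters: $\beta$ and $\sigma^2$
•  Assume a noninformative joint prior: $f(\beta, \sigma^2) \propto 1/\sigma^2$
•  Joint posterior: $f(\beta, \sigma^2 \mid y) = f(\beta \mid y, \sigma^2) \times f(\sigma^2 \mid y)$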
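One way to see where this factorization comes from is to write out likelihood times prior and split the quadratic form. This is a standard derivation sketched here for orientation; the symbol $b$ denotes the least-squares estimate.

```latex
\begin{aligned}
f(\beta, \sigma^2 \mid y)
  &\propto f(y \mid \beta, \sigma^2)\, f(\beta, \sigma^2) \\
  &\propto (\sigma^2)^{-n/2}
     \exp\!\left[-\frac{(y - X\beta)^\top (y - X\beta)}{2\sigma^2}\right]
     \times \frac{1}{\sigma^2}, \\[4pt]
(y - X\beta)^\top (y - X\beta)
  &= (y - Xb)^\top (y - Xb) + (\beta - b)^\top X^\top X\, (\beta - b),
  \qquad b = (X^\top X)^{-1} X^\top y .
\end{aligned}
```

The term in $(\beta - b)$ is a quadratic form centred at $b$, which yields the normal factor for $\beta$ given $\sigma^2$. The remaining term involves $\sigma^2$ alone and yields the inverse gamma factor.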

Page 9: Bayesian regression intro with r

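Bayesian Regression
•  Posterior distribution of $\beta$, conditional on the error variance $\sigma^2$, is Multivariate Normal with:

Mean $= b = (X^\top X)^{-1} X^\top y$
$\mathrm{var}(\beta) = \sigma^2 (X^\top X)^{-1}$

•  Marginal posterior distribution of $\sigma^2$ is Inverse Gamma$[(n - k - 1)/2,\ S/2]$, where $k + 1$ is the number of coefficients in $\beta$ (intercept included)

$S = (y - Xb)^\top (y - Xb)$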
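The quantities above can be computed directly from the formulas and compared with what lm() reports. The sketch below does this on simulated data, which is an assumption made only to keep the example self-contained.

```r
# Compute b, (X'X)^{-1} and S from the formulas and compare with lm().
# The simulated data are an illustrative assumption.
set.seed(2)
n = 50
x1 = rnorm(n)
x2 = rnorm(n)
y = 1 + 2 * x1 - x2 + rnorm(n, 0, 0.5)

X = cbind(1, x1, x2)                 # design matrix with intercept column
XtX.inv = solve(t(X) %*% X)          # (X'X)^{-1}
b = drop(XtX.inv %*% t(X) %*% y)     # posterior mean of beta given sigma^2
S = sum((y - X %*% b)^2)             # residual sum of squares

check.fit = lm(y ~ x1 + x2)
all.equal(unname(b), unname(coef(check.fit)))   # same estimates as lm()
all.equal(S, sum(residuals(check.fit)^2))       # same S as lm()

# Inverse Gamma parameters; ncol(X) counts the intercept as a coefficient
c(shape = (n - ncol(X)) / 2, rate = S / 2)
```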

Page 10: Bayesian regression intro with r

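Prediction

•  To predict a future response $y_p$ based on covariate vector $x_p$
•  Conditional on $\beta$ and $\sigma^2$: $y_p \sim N(x_p \beta, \sigma^2)$
•  Posterior predictive distribution:

$f(y_p \mid y) = \iint f(y_p \mid \beta, \sigma^2)\, f(\beta, \sigma^2 \mid y)\, d\beta\, d\sigma^2$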
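The integral is rarely evaluated in closed form. A common approach, written here as a sketch with $m$ denoting the number of posterior draws, is to approximate it by averaging over simulated parameter values:

```latex
f(y_p \mid y) \approx \frac{1}{m} \sum_{j=1}^{m}
  f\!\left(y_p \mid \beta^{(j)}, \sigma^{2(j)}\right),
\qquad
\left(\beta^{(j)}, \sigma^{2(j)}\right) \sim f(\beta, \sigma^2 \mid y)
```

Equivalently, drawing $y_p^{(j)} \sim N(x_p \beta^{(j)}, \sigma^{2(j)})$ for each $j$ gives a sample from the posterior predictive distribution.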

Page 11: Bayesian regression intro with r

Computation
•  Simulate from joint posterior of $\beta$ and $\sigma^2$:
– Simulate value of $\sigma^2$ from $f(\sigma^2 \mid y)$
– Simulate value of $\beta$ from $f(\beta \mid y, \sigma^2)$
•  Use Albert’s blinreg( ) to perform simulation
•  To predict mean response $\mu_y$ use Albert’s blinregexpected( )
•  To predict future response $y_p$ use Albert’s blinregpred( )
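The two simulation steps can be sketched in base R without any add-on package. The function sim.posterior() below is a hypothetical helper written for illustration; it is not part of LearnBayes and is not claimed to match blinreg( ) line for line. The data are simulated.

```r
# Base-R sketch of composition sampling from the joint posterior.
# sim.posterior() is a hypothetical helper, not a LearnBayes function.
sim.posterior = function(y, X, m) {
  n = nrow(X); p = ncol(X)
  XtX.inv = solve(crossprod(X))              # (X'X)^{-1}
  b = drop(XtX.inv %*% crossprod(X, y))      # least-squares estimate
  S = sum((y - X %*% b)^2)                   # residual sum of squares
  # Step 1: sigma^2 ~ Inverse Gamma((n - p)/2, S/2)
  sigma2 = 1 / rgamma(m, shape = (n - p) / 2, rate = S / 2)
  # Step 2: beta | sigma^2 ~ N(b, sigma^2 * (X'X)^{-1})
  Z = matrix(rnorm(m * p), m, p)
  beta = sqrt(sigma2) * (Z %*% chol(XtX.inv))   # row i scaled by sigma_i
  beta = sweep(beta, 2, b, "+")                 # shift every row by b
  list(beta = beta, sigma = sqrt(sigma2))
}

# Illustrative data (an assumption, not the bird data)
set.seed(3)
x = rnorm(40)
y = 1 + 2 * x + rnorm(40, 0, 0.5)
draws = sim.posterior(y, cbind(1, x), m = 5000)
colMeans(draws$beta)   # should be close to the least-squares estimates
```

The returned list mirrors the beta and sigma components that the later slides read from par.sample.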

Page 12: Bayesian regression intro with r

blinreg( )

•  To sample from joint posterior of $\beta$ and $\sigma^2$
•  Inputs: y, X & no. of simulations m

par.sample = blinreg(t.lm$y, t.lm$x, 5000)

•  $\sigma^2$ is simulated from the Inverse Gamma:

S = sum(t.lm$residuals^2)
shape = t.lm$df.residual/2; rate = S/2
library(MCMCpack)
sigma2 = rinvgamma(1, shape, scale = rate)

Page 13: Bayesian regression intro with r

Computation
•  $\beta$ is simulated from Multivariate Normal:

Mean $= b = (X^\top X)^{-1} X^\top y$
$\mathrm{var}(\beta) = \sigma^2 (X^\top X)^{-1}$

MSE = sum(t.lm$residuals^2)/t.lm$df.residual
vbeta = vcov(t.lm)/MSE
beta = rmnorm(1, t.lm$coef, vbeta*sigma2)
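Dividing vcov() by MSE works because lm() reports the coefficient covariance as MSE times the inverse of X'X. A quick check on simulated data (the data and the name g.fit are illustrative assumptions):

```r
# Check that vcov(fit) / MSE equals (X'X)^{-1}; data are illustrative.
set.seed(4)
x1 = rnorm(30)
x2 = rnorm(30)
y = 1 + x1 + 0.5 * x2 + rnorm(30)
g.fit = lm(y ~ x1 + x2, x = TRUE)    # x = TRUE stores the design matrix

MSE = sum(g.fit$residuals^2) / g.fit$df.residual
XtX.inv = solve(t(g.fit$x) %*% g.fit$x)
all.equal(vcov(g.fit) / MSE, XtX.inv, check.attributes = FALSE)
```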

Page 14: Bayesian regression intro with r

Graphical Display

op = par(mfrow = c(2,2))
hist(par.sample$beta[, 2], main= "Nesting", xlab=expression(beta[1]))
hist(par.sample$beta[, 3], main= "Size", xlab=expression(beta[2]))

Page 15: Bayesian regression intro with r

Graphical Display

hist(par.sample$beta[, 4], main= "Status", xlab = expression(beta[3]))

hist(par.sample$sigma, main= "Error SD", xlab=expression(sigma))

par(op)

Page 16: Bayesian regression intro with r

Parameters’ Summary

•  Compute 2.5th, 50th & 97.5th percentiles

apply(par.sample$beta, 2, quantile, c(.025, .5, .975))
quantile(par.sample$sigma, c(.025, .5, .975))

Page 17: Bayesian regression intro with r

Mean Response

Covariate Set   Nesting   Size    Status
A               4         Small   Migrant
B               4         Small   Resident
C               4         Large   Migrant
D               4         Large   Resident

Page 18: Bayesian regression intro with r

Mean Response

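cov1 = c(1, 4, 0, 0)   # A: small, migrant
cov2 = c(1, 4, 0, 1)   # B: small, resident
cov3 = c(1, 4, 1, 0)   # C: large, migrant
cov4 = c(1, 4, 1, 1)   # D: large, resident
X1 = rbind(cov1, cov2, cov3, cov4)   # columns: intercept, nesting, size, status
mean.draw = blinregexpected(X1, par.sample)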
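Conceptually, draws of the mean response come from combining every simulated coefficient vector with every covariate row. The sketch below shows that matrix operation in base R. It assumes this is what blinregexpected( ) does internally, and beta.draws is a made-up stand-in for par.sample$beta, not output from the bird data.

```r
# Sketch of the matrix operation behind mean-response draws:
# each posterior draw of beta is combined with each covariate row.
# beta.draws is a stand-in for par.sample$beta (values are made up).
set.seed(5)
m = 1000
beta.draws = cbind(rnorm(m, 1, 0.2), rnorm(m, 0.5, 0.05),
                   rnorm(m, -1, 0.2), rnorm(m, 1, 0.2))
X.new = rbind(c(1, 4, 0, 0),    # columns: intercept, nesting, size, status
              c(1, 4, 1, 1))
mu.draws = beta.draws %*% t(X.new)   # m x 2: one column per covariate set

dim(mu.draws)
apply(mu.draws, 2, quantile, c(.025, .5, .975))
```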

Page 19: Bayesian regression intro with r

Mean Response

op = par(mfrow = c(2,2))
hist(mean.draw[, 1], main = "Covariate Set A", xlab = "Log Time")
hist(mean.draw[, 2], main = "Covariate Set B", xlab = "Log Time")

Page 20: Bayesian regression intro with r

Mean Response

hist(mean.draw[, 3], main = "Covariate Set C", xlab = "Log Time")
hist(mean.draw[, 4], main = "Covariate Set D", xlab = "Log Time")

par(op)

Page 21: Bayesian regression intro with r

Mean Response Summary

•  Compute 2.5th, 50th & 97.5th percentiles

apply(mean.draw, 2, quantile, c(.025, .5, .975))

Page 22: Bayesian regression intro with r

Future Response

pred.draw = blinregpred(X1, par.sample)
op = par(mfrow = c(2,2))
hist(pred.draw[, 1], main="Covariate Set A", xlab="Log Time")
hist(pred.draw[, 2], main="Covariate Set B", xlab="Log Time")

Page 23: Bayesian regression intro with r

Future Response

hist(pred.draw[, 3], main="Covariate Set C", xlab="Log Time")

hist(pred.draw[, 4], main="Covariate Set D", xlab="Log Time")

par(op)
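A draw of a future response differs from a draw of the mean response only by the added error term. The sketch below illustrates that step with made-up inputs, under the assumption that blinregpred( ) works this way. The names mu.draw and sigma.draw are hypothetical stand-ins for posterior draws.

```r
# Sketch: turn draws of the mean response and sigma into draws of a
# future response by adding N(0, sigma^2) noise. Inputs are made up.
set.seed(6)
m = 2000
mu.draw = rnorm(m, 2, 0.1)                 # stand-in for draws of x_p %*% beta
sigma.draw = sqrt(1 / rgamma(m, 30, 15))   # stand-in for draws of sigma
yp.draw = rnorm(m, mean = mu.draw, sd = sigma.draw)

# Predictive intervals are wider than intervals for the mean response
quantile(mu.draw, c(.025, .975))
quantile(yp.draw, c(.025, .975))
```

This is why the future-response histograms are more spread out than the mean-response histograms for the same covariate set.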

Page 24: Bayesian regression intro with r

Future Response Summary

•  Compute 2.5th, 50th & 97.5th percentiles

apply(pred.draw, 2, quantile, c(.025, .5, .975))