Bayesian Quadrature, University of Toronto (duvenaud/talks/intro_bq.pdf, 2017-05-29)
TRANSCRIPT
[Page 1]
Bayesian Quadrature: Model-based Approximate Integration
David Duvenaud, University of Cambridge
December 8, 2012
[Page 2]
The Quadrature Problem
- We want to estimate an integral
  Z = ∫ f(x) p(x) dx
- Most computational problems in inference correspond to integrals: expectations, marginal distributions, integrating out nuisance parameters, normalization constants, and model comparison.
[Figure: the integrand f(x) and the input density p(x), plotted against x]
[Pages 3-7]
Sampling Methods
- Monte Carlo methods: sample from p(x) and take the empirical mean:
  Z ≈ (1/N) ∑_{i=1}^{N} f(x_i)
- Possibly sub-optimal for two reasons:
  - random bunching-up of samples
  - often, nearby function values will be similar
- Model-based and quasi-Monte Carlo methods spread out samples to achieve faster convergence.
[Figure: f(x), p(x), and the sample locations]
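The Monte Carlo estimator above can be sketched in a few lines. The integrand f(x) = x² and the standard-normal density here are illustrative choices, not from the slides; for them the true value of Z = ∫ x² N(x; 0, 1) dx is 1.

```python
import numpy as np

def mc_estimate(f, sample_p, n, seed=0):
    """Plain Monte Carlo: draw x_i ~ p, return the empirical mean of f(x_i)."""
    rng = np.random.default_rng(seed)
    xs = sample_p(rng, n)
    return np.mean(f(xs))

# Illustrative integrand and density: Z = E[X^2] under N(0, 1), which equals 1.
f = lambda x: x**2
sample_p = lambda rng, n: rng.standard_normal(n)
est = mc_estimate(f, sample_p, n=200_000)
```

The error of this estimator shrinks like O(1/√N) regardless of how smooth f is, which is exactly the inefficiency the model-based methods on the following slides try to exploit.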
[Pages 8-15]
Model-based Integration
- Place a prior on f, for example a GP.
- The posterior over f implies a posterior over Z.
- We'll call the approach using a GP prior Bayesian Quadrature.
[Figure: GP mean and ±2 SD bands over f(x), the input density p(x), the sample locations, and the implied posterior p(Z), which narrows as samples are added]
[Pages 16-19]
Bayesian Quadrature Estimator
- The posterior over Z has a mean linear in the observed function values f(x_s):
  E_GP[Z | f(x_s)] = z^T K^{-1} f(x_s) = ∑_{n=1}^{N} w_n f(x_n),
  where w = K^{-1} z, z_n = ∫ k(x, x_n) p(x) dx, and K is the kernel matrix of the sample locations.
- It is natural to choose samples to minimize the posterior variance of Z:
  V[Z | f(x_s)] = ∫∫ k(x, x') p(x) p(x') dx dx' − z^T K^{-1} z
- This variance doesn't depend on the function values at all!
- Choosing samples sequentially to minimize the variance: Sequential Bayesian Quadrature.
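For an RBF kernel with a Gaussian input density, the integrals z_n and ∫∫ k(x, x') p(x) p(x') dx dx' have closed forms, so the estimator above can be sketched directly. The lengthscale, density, sample grid, and test integrand below are illustrative assumptions, not values from the talk.

```python
import numpy as np

def bq_mean_var(xs, fs, ell=1.0, sigma=1.0, jitter=1e-10):
    """Bayesian Quadrature for Z = ∫ f(x) p(x) dx with RBF kernel
    k(x, x') = exp(-(x - x')^2 / (2 ell^2)) and p(x) = N(0, sigma^2).

    Returns the posterior mean z^T K^{-1} f and the posterior variance
    ∫∫ k p p - z^T K^{-1} z, using the Gaussian-convolution closed forms."""
    xs = np.asarray(xs, float)
    K = np.exp(-0.5 * (xs[:, None] - xs[None, :])**2 / ell**2)
    K += jitter * np.eye(len(xs))          # small jitter for numerical stability
    # z_n = ∫ k(x, x_n) N(x; 0, sigma^2) dx (convolution of two Gaussians)
    s2 = ell**2 + sigma**2
    z = np.sqrt(ell**2 / s2) * np.exp(-0.5 * xs**2 / s2)
    w = np.linalg.solve(K, z)              # BQ weights w = K^{-1} z
    mean = w @ np.asarray(fs, float)
    # ∫∫ k(x, x') p(x) p(x') dx dx' = sqrt(ell^2 / (ell^2 + 2 sigma^2))
    var = np.sqrt(ell**2 / (ell**2 + 2 * sigma**2)) - z @ w
    return mean, var

# Illustrative check: f(x) = exp(-x^2), p = N(0, 1); the true Z is 1/sqrt(3).
xs = np.linspace(-4, 4, 20)
mean, var = bq_mean_var(xs, np.exp(-xs**2), ell=0.6, sigma=1.0)
```

Note that `var` is computed without ever touching `fs`, matching the slide's observation that the posterior variance doesn't depend on the function values.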
[Pages 20-22]
Things you can do with Bayesian Quadrature
- Can incorporate knowledge of the function, such as symmetries:
  f(x, y) = f(y, x) ⇔ k_s(x, y, x', y') = k(x, y, x', y') + k(x, y', x', y) + k(x', y, x, y') + k(x', y', x, y)
- Can condition on gradient observations.
- The posterior variance is a natural convergence diagnostic.
[Figure: running estimate of Z with shrinking error bars, plotted against the number of samples (100 to 200)]
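The symmetrized kernel above assigns the same covariance to (x, y) and (y, x), so GP samples are symmetric functions by construction. The slide's four-term sum can be read as averaging the base kernel over the argument-swap group; the sketch below uses that group-averaging form with an RBF base kernel as an illustrative choice.

```python
import numpy as np

def k(x, y, x2, y2, ell=1.0):
    """Illustrative RBF base kernel between points (x, y) and (x2, y2) in R^2."""
    return np.exp(-((x - x2)**2 + (y - y2)**2) / (2 * ell**2))

def k_sym(x, y, x2, y2):
    """Symmetrized kernel: average the base kernel over swapping the
    coordinates of each input point, giving k_s((x,y),(x',y')) = k_s((y,x),(x',y'))."""
    return (k(x, y, x2, y2) + k(y, x, x2, y2)
            + k(x, y, y2, x2) + k(y, x, y2, x2))

# Swapping the coordinates of either input leaves the covariance unchanged.
a = k_sym(0.3, 1.7, -0.2, 0.5)
b = k_sym(1.7, 0.3, -0.2, 0.5)   # swap first input's coordinates
c = k_sym(0.3, 1.7, 0.5, -0.2)   # swap second input's coordinates
```

Because the sum runs over the full swap group on both arguments, the result is a valid (positive semidefinite) kernel whenever the base kernel is.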
[Pages 23-26]
More things you can do with Bayesian Quadrature
- Can compute the marginal likelihood of the GP and learn the kernel.
- Can compute marginals with error bars, in two ways:
  - simply from the GP posterior, or
  - by recomputing f_θ(x) for different θ with the same x.
- Much nicer than histograms!
[Figure: estimated marginal likelihood of σ_x (range 0.2 to 1.2) with error bars, with the true σ_x marked]
[Pages 27-32]
Rates of Convergence
What is the rate of convergence of SBQ when its assumptions are true?
[Figures: expected variance / MMD; empirical rates in the RKHS; empirical rates out of the RKHS; bound on the Bayesian error]
[Pages 33-37]
GPs vs Log-GPs for Inference
[Figure, left: the true likelihood ℓ(x) with evaluations, a GP posterior, and a Log-GP posterior. Right: log ℓ(x) with evaluations and a GP posterior fit in log space.]
[Pages 38-42]
GPs vs Log-GPs
[Figure, left: a sharply peaked likelihood ℓ(x) with evaluations, a GP posterior, and a Log-GP posterior. Right: log ℓ(x), spanning roughly −200 to 0, with evaluations and a GP posterior fit in log space.]
[Pages 43-44]
Integrating under Log-GPs
[Figure, left: ℓ(x) with evaluations, the mean of a GP, a Log-GP posterior, and an approximate Log-GP using inducing points. Right: log ℓ(x) with evaluations, a GP posterior, and the inducing points.]
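The idea in this section, modeling log ℓ rather than ℓ so that the surrogate respects positivity, can be sketched with a plain GP interpolant. The RBF kernel, lengthscale, grid, and test likelihood below are illustrative assumptions, not values from the talk.

```python
import numpy as np

def gp_mean(xs, ys, xq, ell=1.0, jitter=1e-10):
    """Posterior mean of a zero-mean GP with RBF kernel, interpolating (xs, ys)."""
    K = np.exp(-0.5 * (xs[:, None] - xs[None, :])**2 / ell**2)
    K += jitter * np.eye(len(xs))
    Kq = np.exp(-0.5 * (xq[:, None] - xs[None, :])**2 / ell**2)
    return Kq @ np.linalg.solve(K, ys)

# Illustrative peaked likelihood: large dynamic range, strictly positive.
ell_fn = lambda x: np.exp(-8 * x**2)
xs = np.linspace(-3, 3, 15)
xq = np.linspace(-3, 3, 201)

# GP directly on ell: the posterior mean can overshoot below zero near the peak.
direct = gp_mean(xs, ell_fn(xs), xq)
# Log-GP: fit in log space, exponentiate; positive by construction.
via_log = np.exp(gp_mean(xs, np.log(ell_fn(xs)), xq))
```

This is why the talk's figures show the Log-GP posterior hugging the peaked likelihood while the plain GP posterior misbehaves; the price, as the slides note, is that Z is no longer a linear functional of the modeled quantity, so integrating under a Log-GP needs further approximation (for example the inducing-point scheme shown above).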
[Pages 45-47]
Conclusions
- Model-based integration allows active learning about integrals, can require fewer samples than MCMC, and allows us to check our assumptions.
- BQ has nice convergence properties if its assumptions are correct.
- For inference, a GP is not an especially appropriate model, but other models are intractable.
[Pages 48-52]
Limitations and Future Directions
- Right now, BQ really only works in low dimensions (< 10), when the function is fairly smooth, and is only worth using when computing f(x) is expensive.
- How to extend to high dimensions? Gradient observations are helpful, but a D-dimensional gradient is D separate observations.
- It seems unlikely that we'll find another tractable nonparametric distribution like GPs; should we accept that we'll need a second round of approximate integration on a surrogate model?
- How much overhead is worthwhile? Bounded-rationality work seems relevant.

Thanks!