Learning with Intractable Inference and Partial Supervision
Jacob Steinhardt
Stanford University
September 8, 2015
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 1 / 31
Motivation
An Example
Company officials refused to comment. → 公司官员拒绝对此发表评论。
He said the company would appeal. → 他表示该公司将提出上诉。

Statistical reasoning: aggregate data across sentences to reach conclusions.
Computational reasoning: focus on easily disambiguated words first.

Tension: statistics wants to expose information (aggregation), while computer science wants to hide it (abstraction, adaptivity).

Statistical inference is computationally intractable.
How can we bring these two paradigms together?
Formal Setting
1. Motivation
2. Formal Setting
3. Reified Context Models
4. Relaxed Supervision
5. Open Questions
Formal Setting
Setting: Structured Prediction
input x :
output y : v o l c a n i c
Goal: learn θ to maximize E_{x,y∼D}[log p_θ(y | x)]
Structured output space Y — requires inference
Formal Setting
Supervised Learning is Easy
Recall: want to maximize E[log p_θ(y | x)].

Suppose p_θ(y | x) ∝ exp(θ⊤φ(x, y)). Then:

    ∇_θ log p_θ(y | x) = φ(x, y) − E_{ỹ∼p_θ(·|x)}[φ(x, ỹ)]

where the first term is given by the supervision and the second requires inference.

Inference errors will be corrected by the supervision signal φ(x, y) over the course of learning.
In practice, anything reasonable (MCMC, beam search) works.
Conceptually, one can use Searn (Daumé III et al., 2009) or pseudolikelihood (Besag, 1975) to obviate the need for inference.

Approximate inference is easy in supervised settings,
unless we care about estimating uncertainty (calibration, precision/recall).
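The supervised gradient above can be made concrete with a tiny numeric sketch. This is not code from the talk: the feature map and label space are hypothetical, and the inference term is computed by exact enumeration rather than MCMC or beam search.

```python
import numpy as np

# Tiny log-linear model p_θ(y | x) ∝ exp(θ·φ(x, y)), with the output
# space small enough to enumerate the inference term exactly.
Y = [0, 1, 2]

def phi(x, y):
    # Hypothetical feature map: a one-hot vector on y, scaled by x.
    f = np.zeros(len(Y))
    f[y] = x
    return f

def grad_log_likelihood(theta, x, y):
    # Exact softmax over the (enumerable) output space.
    scores = np.array([theta @ phi(x, yp) for yp in Y])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    expected_phi = sum(p * phi(x, yp) for p, yp in zip(probs, Y))
    return phi(x, y) - expected_phi  # "given" term minus "inference" term
```

At θ = 0 the model is uniform, so the gradient simply pushes probability toward the observed label; as the model fits the data, the two terms cancel.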
Formal Setting
Partially Supervised Structured Prediction
input x : Company officials refused to comment.
latent z :
output y : 公司官员拒绝对此发表评论。

Goal: learn θ to maximize E_{x,y∼D}[log p_θ(y | x)],
where p_θ(y | x) = Σ_z p_θ(y, z | x).

Again assume p_θ(y, z | x) ∝ exp(θ⊤φ(x, z, y)). Then

    ∇_θ log p_θ(y | x) = E_{z∼p_θ(·|x,y)}[φ(x, z, y)] − E_{(z,ỹ)∼p_θ(·|x)}[φ(x, z, ỹ)]

where the first term requires inference on z and the second requires inference on (z, y).

Inference errors on z get reinforced during learning.
Inference is often hardest (and most consequential) at the beginning of learning!
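The latent-variable gradient can be sketched the same way. Again this is an illustration, not the talk's code: the feature map is hypothetical, and both expectations are computed by exact enumeration over tiny z and y spaces, where a real system would have to approximate them.

```python
import numpy as np
from itertools import product

Z = [0, 1]  # tiny latent space
Y = [0, 1]  # tiny output space

def phi(x, z, y):
    # Hypothetical feature map: an indicator on the pair (z, y), scaled by x.
    f = np.zeros(len(Z) * len(Y))
    f[z * len(Y) + y] = x
    return f

def grad_marginal_log_likelihood(theta, x, y):
    # Joint distribution p_θ(z, y' | x) by exact enumeration.
    pairs = list(product(Z, Y))
    scores = np.array([theta @ phi(x, z, yp) for z, yp in pairs])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    # Second term: expectation of φ under p_θ(z, y' | x)  ("inference on z, y").
    e_joint = sum(p * phi(x, z, yp) for p, (z, yp) in zip(probs, pairs))
    # First term: condition on the observed y  ("inference on z").
    mask = np.array([yp == y for _, yp in pairs])
    cond = probs * mask
    cond /= cond.sum()
    e_cond = sum(p * phi(x, z, yp) for p, (z, yp) in zip(cond, pairs))
    return e_cond - e_joint
```

Note that, unlike the supervised case, the first term is itself an inference problem, which is why errors in it can get baked into θ and reinforced.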
Formal Setting
This Work
Two thrusts:
1. How can we reify computation as part of a statistical model?
2. How can we relax the supervision signal to aid computation while still maintaining consistent parameter estimates?
Formal Setting
Related Work
Learning tractable models / accounting for approximations
- sum-product networks (Poon & Domingos, 2011)
- max-violation perceptron (Huang, Fayong, & Guo, 2012; Zhang et al., 2013; Yu et al., 2013)
- fast-mixing Markov chains (S. & Liang, 2015)
- many others (Barbu, 2009; Daumé III, Langford, & Marcu, 2009; Domke, 2011; Stoyanov, Ropson, & Eisner, 2011; Niepert & Domingos, 2014; Li & Zemel, 2014; Shi, S., & Liang, 2015)

Improving expressivity of variational inference
- combining with MCMC (Salimans, Kingma, & Welling, 2015)
- using neural networks (Kingma & Welling, 2013; Mnih & Gregor, 2014)

Computational-statistical tradeoffs
- huge body of recent work (Berthet & Rigollet, 2013; Chandrasekaran & Jordan, 2013; Zhang et al., 2013; Zhang, Wainwright, & Jordan, 2014; Christiano, 2014; Daniely, Linial, & Shalev-Shwartz, 2014; Garg, Ma, & Nguyen, 2014; Shamir, 2014; Braverman et al., 2015; S. & Duchi, 2015; S., Valiant, & Wager, 2015)
Reified Context Models
1. Motivation
2. Formal Setting
3. Reified Context Models
4. Relaxed Supervision
5. Open Questions
Reified Context Models
Structured Prediction Task
input x :
output y : v o l c a n i c
Reified Context Models
Contexts Are Key
True output prefix:   v  o  l  c  a

DP:            v   *o   **l   ***c   (contexts keep only the last character)
beam search:   v   vo   vol   volc   (contexts keep the entire prefix)

Key idea: contexts!

*o  def=  { ao, bo, co, ... }
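The contrast between the two context sets can be sketched as follows; the two-letter alphabet and the helper names are illustrative, not from the talk.

```python
from itertools import product

ALPHABET = "ab"  # illustrative two-letter alphabet

def prefix_contexts(length):
    # Beam-search contexts: whole prefixes (exponentially many before pruning).
    return ["".join(p) for p in product(ALPHABET, repeat=length)]

def collapsed_contexts(length):
    # DP contexts: each prefix collapses to "*...*c", keeping only its last character.
    return sorted({"*" * (length - 1) + p[-1] for p in prefix_contexts(length)})
```

Beam search must prune its exponentially many full-prefix contexts, while the DP's collapsed contexts stay few in number but forget all long-range information.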
Reified Context Models
Desiderata
Short contexts (r, *o, **l, ***c / v, *a, **i, ***r):
  coverage
  better uncertainty estimates (precision)
  stabler partially supervised learning updates

Long contexts (r, ro, rol, rolc / v, ra, ral, ralc):
  expressivity
  capture complex dependencies

Mixed contexts (r, ro, rol, *olc / v, ra, ral, ***c / y, *o, *ol, ***r / *, **, ***, ****):
  ← best of both worlds
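One way to picture a mixed context set is as wildcard patterns over prefixes, where '*' matches any single character; this is only an illustration of the slide's notation, not the talk's implementation.

```python
# Illustrative mixed context set at position 4: one full prefix, two
# partial-suffix patterns, and a catch-all "****" that guarantees coverage.
contexts = ["rolc", "*olc", "***c", "****"]

def matches(prefix, context):
    # True if the wildcard pattern `context` covers `prefix`.
    return len(prefix) == len(context) and all(
        c == "*" or c == p for p, c in zip(prefix, context)
    )

def covering(prefix):
    # All contexts in the set that cover a given prefix.
    return [c for c in contexts if matches(prefix, c)]
```

Because "****" is in the set, every length-4 prefix is covered (coverage), while prefixes like "rolc" also match the long, expressive contexts.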
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 12 / 31
![Page 44: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/44.jpg)
Reified Context Models
Reifying Contexts
input x: (handwriting image)
output y: v o l c a n i c
context c: v  *o  *ol  *olc  · · ·

[Diagram: per-position "context sets" C1, ..., C4, each mixing a few long contexts (r, ro, rol, ...) with wildcard-collapsed short ones (*, **, ***, ...).]

Challenge: how to trade off contexts of different lengths?
=⇒ Reify contexts as part of the model!
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 13 / 31
![Page 50: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/50.jpg)
Reified Context Models
Reified Context Models
Given:
context sets C1, ..., CL
features φi(ci−1, yi)
Define the model
pθ(y1:L, c1:L−1) ∝ exp( ∑i=1..L θ⊤ φi(ci−1, yi) ) · κ(y, c)

where the factor κ(y, c) enforces consistency between the outputs y and the contexts c.
Graphical model structure:
[Diagram: chain-structured factor graph over outputs Y1, ..., Y5 and contexts C1, ..., C4, linked by consistency factors κ and feature factors φ2, ..., φ5.]

Inference via forward-backward!
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 14 / 31
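The chain structure above can be exercised with a small forward pass. The sketch below is illustrative, not the paper's implementation: contexts are suffix strings in which '*' matches any character, `kappa` plays the role of the consistency factor, and `potential` stands in for exp(θ⊤φi(ci−1, yi)).

```python
import math

# Illustrative sketch of forward inference in a reified context model.
# The '*' wildcard convention and all helper names are assumptions,
# not taken from the talk.
ALPHABET = "abcde"

def kappa(c_prev, y, c):
    """Consistency factor: context c must match the last len(c)
    characters of c_prev + y, with '*' in c matching any character."""
    s = c_prev + y
    if len(c) > len(s):
        return 0.0
    tail = s[-len(c):] if c else ""
    return float(all(a == b or a == "*" for a, b in zip(c, tail)))

def log_partition(context_sets, potential):
    """Forward algorithm over (y_i, c_i); potential(i, c_prev, y)
    plays the role of exp(theta . phi_i(c_prev, y))."""
    alpha = {"": 1.0}  # empty context before position 1
    for i, C in enumerate(context_sets):
        nxt = dict.fromkeys(C, 0.0)
        for c_prev, a in alpha.items():
            for y in ALPHABET:
                w = a * potential(i, c_prev, y)
                for c in C:
                    nxt[c] += w * kappa(c_prev, y, c)
        alpha = nxt
    return math.log(sum(alpha.values()))

# With uniform potentials, Z simply counts consistent (y, c) paths.
Z = math.exp(log_partition([["a", "b", "*"], ["*a", "**"]],
                           lambda i, c_prev, y: 1.0))
```

Swapping learned potentials into the same recursion (plus a matching backward pass) yields the forward-backward marginals the slide refers to.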
![Page 57: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/57.jpg)
Reified Context Models
Adaptive Context Selection
Select context sets Ci during forward pass of inference
Greedily select contexts with largest mass
[Diagram: greedy selection — single-character candidates a, b, c, d, e, ... are pruned to C1 = {c, e, ?}; their extensions ca, cb, ..., ea, eb, ..., ?a, ... are pruned to C2 = {ca, ?a, ??}; and so on ('?' collapses the low-mass contexts).]
Biases towards short contexts unless there is high confidence.
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 15 / 31
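One round of the greedy step can be sketched as follows. This is a simplification under assumed conventions — equal-length candidate contexts, and all dropped contexts pooled into a single all-wildcard '?' context — whereas the slides' second step also keeps partially wildcarded contexts like ?a.

```python
def select_contexts(mass, k):
    """Greedy context selection (illustrative): keep the k-1 highest-mass
    contexts exactly and collapse everything else into one all-wildcard
    context that pools the leftover probability mass."""
    ranked = sorted(mass, key=mass.get, reverse=True)
    kept = {c: mass[c] for c in ranked[:k - 1]}
    rest = ranked[k - 1:]
    if rest:
        wild = "?" * len(rest[0])  # assumes equal-length candidates
        kept[wild] = sum(mass[c] for c in rest)
    return kept

# Matches the slide's first step, C1 = {c, e, ?}:
C1 = select_contexts({"a": 0.05, "b": 0.05, "c": 0.5, "d": 0.1, "e": 0.3}, k=3)
```

Because selection happens during the forward pass, later context sets adapt to the choices made at earlier positions.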
![Page 67: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/67.jpg)
Reified Context Models
Precision
input x: (handwriting image)
output y: v o l c a n i c

The model assigns a probability to each prediction, so it can predict on only the most confident subset.

Measure precision (# of correct words) vs. recall (# of words predicted).
Comparison: beam search.
[Plot: precision (0.86–1.00) vs. recall (0.0–1.0) on word recognition, comparing beam search and RCM.]
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 16 / 31
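The curve above can be reproduced from per-word confidences alone. A minimal sketch using the slide's definitions (recall = fraction of words predicted, precision = fraction of those that are correct):

```python
def precision_recall_points(predictions):
    """predictions: list of (confidence, is_correct) pairs, one per word.
    Predicting only the n most confident words gives one curve point
    per n."""
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    total, correct, points = len(ranked), 0, []
    for n, (_, ok) in enumerate(ranked, start=1):
        correct += ok
        points.append((n / total, correct / n))  # (recall, precision)
    return points

# Hypothetical confidences, just to exercise the sweep:
points = precision_recall_points(
    [(0.9, True), (0.8, True), (0.6, False), (0.4, True)])
```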
![Page 72: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/72.jpg)
Reified Context Models
Partially Supervised Learning
Decipherment task:
cipher: am ↦ 5, I ↦ 13, what ↦ 54, …
latent z: I  am  what  I  am
output y: 13  5  54  13  5
Goal: determine cipher
Fit a 2nd-order HMM with EM, using RCMs for the approximate E-step.
Use the learned emissions to determine the cipher.
Again, compare to beam search (Nuhn et al., 2013).
Fraction of correctly mapped words:
[Plot: mapping accuracy (0.0–0.8) vs. training passes (0–20) on decipherment, comparing RCM and beam search.]
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 17 / 31
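The last two steps — reading a cipher guess off the learned emissions and scoring it — can be sketched as below. The data layout and the argmax inversion heuristic are assumptions for illustration, not the talk's actual procedure.

```python
def cipher_from_emissions(emission):
    """emission[z][y]: learned probability that plaintext word z emits
    code y. Invert it: map each code to its most probable source word."""
    codes = {y for row in emission.values() for y in row}
    return {y: max(emission, key=lambda z: emission[z].get(y, 0.0))
            for y in codes}

def mapping_accuracy(guess, truth):
    """Fraction of codes mapped to the correct plaintext word."""
    return sum(guess.get(y) == z for y, z in truth.items()) / len(truth)

# Toy emissions consistent with the slide's cipher (am->5, I->13, what->54):
guess = cipher_from_emissions({
    "I":    {13: 0.9, 5: 0.1},
    "am":   {5: 0.8, 54: 0.2},
    "what": {54: 0.7, 13: 0.3},
})
acc = mapping_accuracy(guess, {13: "I", 5: "am", 54: "what"})
```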
![Page 80: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/80.jpg)
Reified Context Models
Contexts During Training
Context lengths increase smoothly during training:
[Plot: average context length (1.5–4.5) vs. number of training passes (0–20) on decipherment; example contexts sharpen from ****** to ***ing to idding.]
Start of training: little information, short contexts.
End of training: lots of information, long contexts.
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 18 / 31
![Page 84: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/84.jpg)
Reified Context Models
Discussion
RCMs provide both expressivity and coverage, which enable:
More accurate uncertainty estimates (precision)
Better partially supervised learning updates
Reproducible experiments on Codalab: codalab.org/worksheets
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 19 / 31
Relaxed Supervision
1 Motivation
2 Formal Setting
3 Reified Context Models
4 Relaxed Supervision
5 Open Questions
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 20 / 31
Relaxed Supervision
Intractable Supervision
Sometimes, even supervision is intractable:
input x: What is the largest city in California?
latent z: argmax(λx. CITY(x) ∧ LOC(x, CA), λx. POPULATION(x))
output y: Los Angeles
Inference is intractable no matter how simple the model is,
but there are likely statistical relationships to exploit (e.g. between CITY and Los Angeles).
We need a way to relax the likelihood
while maintaining good statistical properties (asymptotic consistency).
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 21 / 31
Relaxed Supervision
Approach
[Figure: the (θ, β) parameter space, partitioned into tractable and intractable regions.]
Start with an intractable likelihood q(y | z) and a model family pθ(z | x).
Replace q(y | z) with a family of likelihoods qβ(y | z) (some very easy).
Derive constraints on (θ, β) that ensure tractability.
Learn within the tractable region.
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 22 / 31
Relaxed Supervision
Relaxed Supervision: Example
input x: Company officials refused to comment.
latent z:
output y: 公司官员拒绝对此发表评论。
Idea: instead of requiring the model's output ỹ to match the observed output y, penalize it by some weighted distance distβ(ỹ, y):
ℓ(θ, β; x, y) = −log( ∑_{z, ỹ} pθ(z, ỹ | x) exp(−distβ(ỹ, y)) )
As β → ∞, we recover the original objective; but optimizing will send β → 0!
Two questions:
How to create natural pressure to increase β?
How to define distances for general problems?
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 23 / 31
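The relaxed objective can be evaluated directly on a toy problem. Below is a minimal sketch, assuming a small latent space, a deterministic identity map ỹ = f(z) = z, a fixed toy distribution pθ(z | x), and a position-wise weighted mismatch distance; all of these choices are illustrative, not from the talk. It checks the two limits: β → 0 makes the loss vanish, while large β recovers the original negative log-likelihood.

```python
import math

# Latent z ranges over length-2 strings over {'c', 'd'}; f(z) = z (identity),
# so the candidate output y~ coincides with z and the sum collapses to a sum over z.
Z = ['cc', 'cd', 'dc', 'dd']
P = {'cc': 0.1, 'cd': 0.2, 'dc': 0.3, 'dd': 0.4}  # toy p_theta(z | x)

def dist_beta(y_tilde, y, beta):
    # Position-wise weighted mismatch: dist = sum_j beta_j * [y_tilde_j != y_j]
    return sum(b for b, (a, c) in zip(beta, zip(y_tilde, y)) if a != c)

def relaxed_loss(y, beta):
    # l(theta, beta; x, y) = -log sum_{z, y~} p(z, y~ | x) exp(-dist_beta(y~, y))
    total = sum(P[z] * math.exp(-dist_beta(z, y, beta)) for z in Z)
    return -math.log(total)

print(relaxed_loss('dd', beta=[0.0, 0.0]))    # beta -> 0: inner sum is 1, loss -> 0
print(relaxed_loss('dd', beta=[20.0, 20.0]))  # beta large: approx -log p(z = y)
print(-math.log(P['dd']))                     # original (unrelaxed) NLL
```

This makes the instability concrete: the loss is monotonically driven down by shrinking β, which motivates the normalization term introduced later in the talk.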
Relaxed Supervision
Relaxed Supervision: Formal Framework
Assume (WLOG) that z → y is deterministic: y = f(z).
Let S(z, y) ∈ {0, 1} encode the constraint [f(z) = y].
Take projections πj : Y → Yj, j = 1, …, k.
Let Sj(z, y) = [πj(f(z)) = πj(y)] be the projected constraint.
Define the distance function:
distβ(z, y) = ∑_{j=1}^{k} βj · (1 − Sj(z, y)).
Note: we can featurize distβ as −β⊤ψ(z, y), where ψj = Sj − 1.
Lemma
Suppose that π1 × ⋯ × πk is injective. Then S(z, y) = ∧_{j=1}^{k} Sj(z, y).
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 24 / 31
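The projection construction is easy to exercise concretely. A sketch, assuming Y is the set of length-3 strings, f is the identity, and πj picks out the j-th character (so the πj are jointly injective and the lemma applies); these choices are illustrative:

```python
def f(z):
    return z  # identity for illustration; in general f maps latent z to output y

def S(z, y):
    return f(z) == y  # the full constraint [f(z) = y]

def S_j(z, y, j):
    return f(z)[j] == y[j]  # projected constraint [pi_j(f(z)) = pi_j(y)]

def dist_beta(z, y, beta):
    # dist_beta(z, y) = sum_j beta_j * (1 - S_j(z, y))
    return sum(b * (1 - S_j(z, y, j)) for j, b in enumerate(beta))

def psi(z, y, k):
    # Feature map with psi_j = S_j - 1, so dist_beta = -beta^T psi(z, y)
    return [S_j(z, y, j) - 1 for j in range(k)]

z, y, beta = 'abc', 'abd', [1.0, 2.0, 3.0]
assert S(z, y) == all(S_j(z, y, j) for j in range(3))  # the lemma (injective case)
assert dist_beta(z, y, beta) == 3.0                    # only position 2 mismatches
assert dist_beta(z, y, beta) == -sum(b * p for b, p in zip(beta, psi(z, y, 3)))
```

The last assertion checks the featurization identity distβ = −β⊤ψ(z, y) term by term.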
Relaxed Supervision
Example: Unordered Supervision
input x: a b a a
latent z: d c d d
output y: {c : 1, d : 3}
Let count(·, j) count the number of occurrences of character j.
Decomposition: with f(z) = multiset(z) and πj(y) = count(y, j),
S(z, y) = [y = multiset(z)] ⟺ ∧_{j=1}^{V} Sj(z, y), where Sj(z, y) = [count(z, j) = count(y, j)].
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 25 / 31
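The count decomposition can be sketched in a few lines; the vocabulary below is an illustrative assumption, and `Counter` plays the role of the multiset:

```python
from collections import Counter

def multiset(z):
    return Counter(z)  # f(z) = multiset(z)

vocab = 'abcd'  # assumed vocabulary of size V
z = 'dcdd'
y = Counter({'c': 1, 'd': 3})

S = (multiset(z) == y)                         # [y = multiset(z)]
S_j = [multiset(z)[j] == y[j] for j in vocab]  # [count(z, j) = count(y, j)]
assert S and all(S_j)                          # both sides of the equivalence hold

z_bad = 'dccd'
assert multiset(z_bad) != y
assert not all(multiset(z_bad)[j] == y[j] for j in vocab)
```

Note that `Counter` returns 0 for absent keys, which matches the convention count(y, j) = 0 for characters not in y.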
Relaxed Supervision
Example: Conjunctive Semantic Parsing
Side information: predicates {Q1, …, Qm}, e.g. Q6 = [DOG] = the set of all dogs.
input x: brown dog (input utterance)
latent z: (Q11, Q6) (set of all brown objects, set of all dogs)
output y: Q11 ∩ Q6 (denotation, observed as a set)
For z = (Qj1, …, QjL), define the denotation ⟦z⟧ = Qj1 ∩ ⋯ ∩ QjL.
Decomposition: with πj(y) = I[y ⊆ Qj],
S(z, y) = [y = ⟦z⟧] ⟺ ∧_{j=1}^{m} Sj(z, y), where Sj(z, y) = (I[⟦z⟧ ⊆ Qj] = I[y ⊆ Qj]).
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 26 / 31
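A small concrete instance of the conjunctive decomposition, with an assumed 10-element domain and invented predicate sets (the indices 6 and 11 echo the slide, but the contents are illustrative). The projected constraints reject a wrong parse here, though the full equivalence of the lemma requires the projections to be jointly injective:

```python
DOMAIN = set(range(10))
Q = {
    6:  {1, 2, 3},     # e.g. [DOG]: the set of all dogs (illustrative)
    11: {2, 3, 4, 5},  # e.g. [BROWN]: the set of all brown objects (illustrative)
    7:  {0, 9},        # an unrelated predicate
}

def denote(z):
    # [[z]] = Q_{j1} ∩ ... ∩ Q_{jL} for z = (j1, ..., jL)
    out = set(DOMAIN)
    for j in z:
        out &= Q[j]
    return out

z_good, z_bad = (11, 6), (6,)
y = denote(z_good)  # observed denotation: {2, 3}

def S(z):
    return denote(z) == y  # [y = [[z]]]

def S_all(z):
    # conjunction of projected constraints: I[[[z]] ⊆ Q_j] = I[y ⊆ Q_j] for all j
    return all((denote(z) <= Qj) == (y <= Qj) for Qj in Q.values())

assert S(z_good) and S_all(z_good)
assert not S(z_bad) and not S_all(z_bad)  # the projections catch the wrong parse
```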
Relaxed Supervision
Normalization Constant
Create pressure to increase β by adding a normalization constant:
qβ(y | z) = exp(β⊤ψ(z, y) − A(β)), where β⊤ψ(z, y) = −distβ(z, y),
ℓ(θ, β; x, y) = −log( ∑_z pθ(z | x) qβ(y | z) ).
Lemma
Given π1, …, πk, let A(β) := ∑_{j=1}^{k} log(1 + (|Yj| − 1) exp(−βj)). Then ∑_y exp(−distβ(z, y)) ≤ exp(A(β)) for all z.
Lemma
Jointly minimizing L(θ, β) = E[ℓ(θ, β; x, y)] yields a consistent estimate of the true parameters θ∗.
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 27 / 31
![Page 130: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/130.jpg)
Relaxed Supervision

Constraints for Efficient Inference

Inference task:

∇θ log pθ(y | x) = E_{z∼pθ(·|x,y)}[φ(x, z, y)] − E_{z,ỹ∼pθ(·|x)}[φ(x, z, ỹ)]
                   (sample z given x, y)          (sample z, ỹ given x)

pθ,β(z | x, y) ∝ pθ(z | x) qβ(y | z) ∝ pθ(z | x) exp(β⊤ψ(z, y)).

Rejection sampler:
  sample z from pθ(z | x);
  accept with probability exp(β⊤ψ(z, y)).

Bound the expected number of samples:

Σ_{(x,y)∈Data} ( Σ_z pθ(z | x) exp(β⊤ψ(z, y)) )⁻¹ ≤ τ.  (1)

Ratio of normalization constants: can optimize subject to (1) (similar to CCCP).
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 28 / 31
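The rejection sampler above can be sketched directly for a small discrete z. This is an illustrative NumPy sketch under assumed inputs (`p_z_given_x`, `psi`, `beta` as on the previous slides); note that since β⊤ψ(z, y) = −distβ(z, y) ≤ 0, the acceptance probability exp(β⊤ψ(z, y)) is always a valid probability.

```python
import numpy as np

def rejection_sample(p_z_given_x, psi, beta, rng, max_tries=100_000):
    """Sample z ~ p_{theta,beta}(z | x, y) by rejection:
    propose z ~ p_theta(z | x), accept with probability exp(beta . psi(z, y))."""
    accept_prob = np.exp(psi @ np.asarray(beta, dtype=float))  # <= 1 since psi <= 0, beta >= 0
    for tries in range(1, max_tries + 1):
        z = rng.choice(len(p_z_given_x), p=p_z_given_x)
        if rng.random() < accept_prob[z]:
            return z, tries  # tries = number of proposals consumed
    raise RuntimeError("acceptance probability too low")
```

The expected value of `tries` for one example is ( Σ_z pθ(z | x) exp(β⊤ψ(z, y)) )⁻¹, i.e. exactly the per-example term that constraint (1) bounds by τ across the dataset.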
![Page 136: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/136.jpg)
Relaxed Supervision

Experiments

Conjunctive semantic parsing: two plots over 50 training iterations comparing AdaptBeta(500) with FixedBeta(0.5), FixedBeta(0.2), and FixedBeta(0.1). Left panel: accuracy (0.0 to 1.0) vs. iteration. Right panel: number of samples (10⁰ to 10⁵, log scale) vs. iteration.
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 29 / 31
![Page 137: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/137.jpg)
Open Questions
1 Motivation
2 Formal Setting
3 Reified Context Models
4 Relaxed Supervision
5 Open Questions
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 30 / 31
![Page 138: Learning with Intractable Inference and Partial Supervision - Stanford … · 2015. 12. 3. · Learning with Intractable Inference and Partial Supervision Jacob Steinhardt Stanford](https://reader033.vdocuments.pub/reader033/viewer/2022051922/60101b6e2472f778594f9e0f/html5/thumbnails/138.jpg)
Open Questions

Scale up to larger tasks: semantic parsing, reinforcement learning, program induction

Extend to Bayesian models

Understand non-convex optimization: metacomputation using Reified Context Models?

Probabilistic abstract interpretation

Statistics & Computation: still a long way to go

Thanks! 谢谢 (Thank you!)
J. Steinhardt (Stanford) Learning and Inference September 8, 2015 31 / 31