DISCUSSION of
Bayesian Computation via empirical likelihood
Stefano Cabras, [email protected] Carlos III de Madrid (Spain)
Universita di Cagliari (Italy)
Padova, 21-Mar-2013
Summary
◮ Problem:
  ◮ a statistical model f(y | θ);
  ◮ a prior π(θ) on θ;
  ◮ we want to obtain the posterior
    π_N(θ | y) ∝ L_N(θ) π(θ).
◮ BUT
  ◮ IF L_N(θ) is not available:
    ◮ THEN use ABC;
  ◮ IF it is not even possible to simulate from f(y | θ):
    ◮ THEN replace L_N(θ) with L_EL(θ) (the proposed BCel procedure):
      π(θ | y) ∝ L_EL(θ) × π(θ).
... what remains of f(y | θ)?
◮ Recall that the Empirical Likelihood is defined, for an iid sample, by means of a set of constraints:
  E_{f(y|θ)}[h(Y, θ)] = 0.
◮ The relation between θ and the observations Y is model-conditioned and expressed by h(Y, θ);
◮ Constraints are model-driven, so there is still a faint trace of f(y | θ) in BCel.
◮ Examples:
  ◮ The coalescent model example is illuminating in suggesting the score of the pairwise likelihood;
  ◮ The residuals in GARCH models.
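For reference, the constraint above enters through the standard (Owen-type) definition of the empirical likelihood for an iid sample y_1, ..., y_n. The weights p_i and the multiplier λ are notation introduced here, not taken from the slides:

```latex
L_{EL}(\theta) \;=\; \max_{p_1,\dots,p_n} \prod_{i=1}^{n} p_i
\quad \text{subject to} \quad
p_i \ge 0, \qquad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i\, h(y_i,\theta) = 0 .

% When 0 lies inside the convex hull of the h(y_i, \theta), the maximiser is
p_i \;=\; \frac{1}{n\,\bigl(1 + \lambda^{\top} h(y_i,\theta)\bigr)},
\qquad \text{with } \lambda \text{ solving } \quad
\sum_{i=1}^{n} \frac{h(y_i,\theta)}{1 + \lambda^{\top} h(y_i,\theta)} = 0 .
```

The model f(y | θ) appears only through the choice of h, which is the "trace" referred to above.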
... a suggestion
What if we do not even know h(·)?
... how to elicit h(·) automatically
◮ Set h(Y, θ) = Y − g(θ), where
  g(θ) = E_{f(y|θ)}(Y | θ)
  is the regression function of Y | θ;
◮ g(θ) should be replaced by an estimator ĝ(θ).
How to estimate g(θ)?
◮ Use a once-and-for-all pilot-run simulation study: [1]
  1. Consider a grid (or regular lattice) of M points for θ: θ_1, ..., θ_M;
  2. Simulate the corresponding y_1, ..., y_M;
  3. Regress y_1, ..., y_M on θ_1, ..., θ_M, obtaining ĝ(θ).
[1] ... similar to Fearnhead, P. and D. Prangle (JRSS-B, 2012) or Cabras, Castellanos, Ruli (Ercim-2012, Oviedo).
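The three pilot-run steps can be sketched as follows. This is only an illustration: the slides do not say which regression method is used, so the Nadaraya-Watson kernel smoother, the bandwidth of 0.5, and the function names `pilot_run` and `kernel_regression` are all assumptions. The simulator is the toy model y ∼ N(|θ|, 1) used in this discussion.

```python
import numpy as np

def pilot_run(simulate, theta_grid, rng):
    """Steps 1-2: simulate one y for each grid point theta_j."""
    y = np.array([simulate(t, rng) for t in theta_grid])
    return theta_grid, y

def kernel_regression(theta_pilot, y_pilot, bandwidth):
    """Step 3: return g_hat, a Nadaraya-Watson smoother of y on theta.

    The smoother is an assumed choice; any regression method could be used.
    """
    def g_hat(theta):
        theta = np.atleast_1d(np.asarray(theta, dtype=float))
        # Gaussian kernel weights between query points and pilot points.
        z = (theta[:, None] - theta_pilot[None, :]) / bandwidth
        w = np.exp(-0.5 * z**2)
        return (w @ y_pilot) / w.sum(axis=1)
    return g_hat

rng = np.random.default_rng(0)
theta_grid = np.linspace(-10.0, 10.0, 1000)   # M = 1000 grid points

def simulate(theta, rng):
    # Toy model: y ~ N(|theta|, 1).
    return rng.normal(abs(theta), 1.0)

theta_pilot, y_pilot = pilot_run(simulate, theta_grid, rng)
g_hat = kernel_regression(theta_pilot, y_pilot, bandwidth=0.5)
```

Calling `g_hat(2.0)` then gives an estimate of g(2) = 2. The smoother is biased near the kink at θ = 0, which is one reason the choice of regression method matters.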
... example: y ∼ N(|θ|, 1)
For a pilot run of M = 1000 we have g(θ) = |θ|.
[Figure: "Pilot-Run s.s.": simulated y against θ for θ from −10 to 10, with the curve g(θ).]
... example: y ∼ N(|θ|, 1)
Suppose we draw a sample of size n = 100 at θ = 2:
[Figure: "Histogram of y"; x-axis: y, y-axis: Frequency.]
... example: y ∼ N(|θ|, 1)
The Empirical Likelihood is:
[Figure: empirical likelihood ("Emp. Lik.") against θ for θ from −4 to 4.]
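A minimal sketch of how such a curve could be computed, assuming the constraint h(y, θ) = y − |θ| (that is, g(θ) = |θ| taken as known). The function name `log_el_ratio`, the bisection solver, the seed and the grid are illustrative choices, and the code is not claimed to reproduce the figure's scale.

```python
import numpy as np

def log_el_ratio(h):
    """Log empirical likelihood ratio for the scalar constraint E[h] = 0.

    Uses p_i = 1 / (n (1 + lam * h_i)), with lam found by bisection on
    sum_i h_i / (1 + lam * h_i) = 0, which is decreasing in lam.
    """
    h = np.asarray(h, dtype=float)
    if h.min() >= 0.0 or h.max() <= 0.0:
        return -np.inf  # zero is not inside the convex hull of the h_i
    # lam must keep every 1 + lam * h_i positive; stay just inside the bounds.
    lo = (-1.0 / h.max()) * (1.0 - 1e-10)
    hi = (-1.0 / h.min()) * (1.0 - 1e-10)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.sum(h / (1.0 + mid * h)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return -np.sum(np.log1p(lam * h))

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=100)          # n = 100 draws at theta = 2
theta_grid = np.linspace(-4.0, 4.0, 161)
log_el = np.array([log_el_ratio(y - abs(t)) for t in theta_grid])
theta_max = theta_grid[np.argmax(log_el)]
```

Because h depends on θ only through |θ|, the resulting curve is symmetric in θ and peaks where |θ| is close to the sample mean of y.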
1st Point: Do we necessarily have to use f(y | θ)?
◮ The above data may have been drawn from, e.g., a Half Normal;
◮ How is this reflected in BCel?
  ◮ For given data y;
  ◮ and h(Y, θ) fixed;
  ◮ L_EL(θ) is the same regardless of f(y | θ).
Can we ignore f(y | θ)?
2nd Point: Sample free vs Simulation free
◮ The Empirical Likelihood is "simulation free" but not "sample free", i.e.
  ◮ L_EL(θ) → L_N(θ) for n → ∞,
  ◮ implying π(θ | y) → π_N(θ | y) asymptotically in n.
◮ ABC is "sample free" but not "simulation free", i.e.
  ◮ π(θ | ρ(s(y), s_obs) < ε) → π_N(θ | y) as ε → 0,
  ◮ implying convergence in the number of simulations if s(y) were sufficient.
A quick answer recommends using BCel,
BUT would a small sample recommend ABC?
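To make the "not simulation free" side concrete, here is a minimal ABC rejection sketch on the toy model y ∼ N(|θ|, 1). Everything specific is an assumption made for illustration: the uniform prior on [−10, 10], the sample mean as summary s(y), the absolute difference as distance ρ, the tolerance ε = 0.2, and the function names.

```python
import numpy as np

def abc_rejection(s_obs, prior_sample, simulate_summary, eps, n_sims, rng):
    """Keep the prior draws whose simulated summary lies within eps of s_obs."""
    theta = prior_sample(n_sims, rng)
    s_sim = simulate_summary(theta, rng)
    return theta[np.abs(s_sim - s_obs) < eps]

rng = np.random.default_rng(2)
n = 100
y_obs = rng.normal(2.0, 1.0, size=n)      # observed sample, generated at theta = 2
s_obs = y_obs.mean()

def prior_sample(m, rng):
    # Assumed prior: uniform on [-10, 10].
    return rng.uniform(-10.0, 10.0, size=m)

def simulate_summary(theta, rng):
    # One simulated data set of size n per theta; summary = sample mean.
    data = rng.normal(np.abs(theta)[:, None], 1.0, size=(theta.size, n))
    return data.mean(axis=1)

accepted = abc_rejection(s_obs, prior_sample, simulate_summary,
                         eps=0.2, n_sims=20000, rng=rng)
```

Each accepted draw costs a full simulated data set, and shrinking ε lowers the acceptance rate, whereas the empirical likelihood needs no simulation but relies on n being large.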
3rd Point: How to validate a pseudo-posterior π(θ | y) ∝ L_EL(θ) × π(θ)?
◮ The use of pseudo-likelihoods is not new in the Bayesian setting:
  ◮ Empirical Likelihoods:
    ◮ Lazar (Biometrika, 2003): examples and coverages of C.I.;
    ◮ Mengersen et al. (PNAS, 2012): examples and coverages of C.I.;
    ◮ ...
  ◮ Modified-Likelihoods:
    ◮ Ventura et al. (JASA, 2009): second-order matching properties;
    ◮ Chang and Mukerjee (Stat. & Prob. Letters, 2006): examples;
    ◮ ...
  ◮ Quasi-Likelihoods:
    ◮ Lin (Statist. Methodol., 2006): examples;
    ◮ Greco et al. (JSPI, 2008): robustness properties;
    ◮ Ventura et al. (JSPI, 2010): examples and coverages of C.I.;
    ◮ ...
3rd Point: How to validate a pseudo-posterior π(θ | y) ∝ L_EL(θ) × π(θ)?
◮ Monahan & Boos (Biometrika, 1992) proposed a notion of validity:
  π(θ | y) should obey the laws of probability in a fashion that is consistent with statements derived from Bayes' rule.
◮ Very difficult!
How to validate the pseudo-posterior π(θ | y) when this is not possible?
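For a scalar θ, this notion is usually operationalised through coverage. The statement below is a sketch from memory, in notation introduced here (H, U), and should be checked against the original paper:

```latex
% Draw \theta from the prior and Y from the model, then evaluate the
% pseudo-posterior distribution function at the drawn \theta:
\theta \sim \pi(\theta), \qquad Y \mid \theta \sim f(y \mid \theta), \qquad
H \;=\; \int_{-\infty}^{\theta} \pi(t \mid Y)\, dt .

% Validity by coverage asks that, for every (proper) prior \pi,
H \;\sim\; U(0,1),
% i.e. posterior sets of probability \alpha cover \theta with probability
% \alpha under the joint distribution of (\theta, Y).
```

Checking this requires draws of Y from f(y | θ), which is exactly what is unavailable in the setting where BCel is proposed.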
... Last point: ABC is still a terrific tool
◮ ... a lot of references:
  ◮ Statistical journals;
  ◮ Twitter;
  ◮ Xi'an's blog (xianblog.wordpress.com)
◮ ... it is tailored to Approximate L_N(θ).
Where is the A in BCel?