Applied Bayesian Analysis for the Social Sciences

Philip Pendergast, Computing and Research Services, Department of Sociology, [email protected]. Sponsored by Computing and Research Services and the Institute of Behavioral Science.

Post on 05-Jan-2016

Page 2: Applied Bayesian Analysis for the Social Sciences

Suspending Disbelief-- Faith in Classical Statistics

What are some issues that we have with classical statistics? Think back to your introductory class…

Page 3: Applied Bayesian Analysis for the Social Sciences

Suspending Disbelief-- Faith in Classical Statistics

• Conducting an infinite number of experiments / repeated sampling
• Assume that some parameter is unknown but has a fixed value
• P-value worship
• Null hypothesis testing
• Multiple comparisons
• Strict data assumptions, often unmet
• Confidence interval interpretation
• Small samples are an issue

Page 4: Applied Bayesian Analysis for the Social Sciences

The Coin Flip

• Frequentist
– We can determine the bias of a coin (b) by repeatedly flipping it and counting heads. As long as we repeat the process enough times, we should be able to estimate the “true” bias of the coin.
– If p < .05 under the null hypothesis that b = 0.5, we reject the null hypothesis that the coin is unbiased.
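The frequentist decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name `binomial_two_sided_p` and the flip counts are hypothetical, and the p-value is the exact two-sided binomial one (summing every outcome at least as unlikely as the observed count).

```python
from math import comb

def binomial_two_sided_p(heads: int, flips: int, b0: float = 0.5) -> float:
    """Exact two-sided p-value for the null hypothesis that the bias equals b0."""
    def pmf(k: int) -> float:
        return comb(flips, k) * b0**k * (1 - b0) ** (flips - k)

    observed = pmf(heads)
    # Add up the probability of every outcome no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical data: two coins, 100 flips each.
p_fair = binomial_two_sided_p(heads=52, flips=100)
p_biased = binomial_two_sided_p(heads=65, flips=100)
print(f"52/100 heads: p = {p_fair:.3f}")
print(f"65/100 heads: p = {p_biased:.4f}")
```

Only the second coin falls below the .05 threshold, so only there would we reject b = 0.5.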

Page 5: Applied Bayesian Analysis for the Social Sciences

The Nail Flip

• Frequentist
– We determine the bias of a nail (b) by repeatedly flipping it and counting “heads” (landing on its flat base).
– If p < .05 under the null hypothesis that b = 0.5, we reject the null hypothesis that the nail is unbiased.

Does this seem reasonable? Don’t we know that the nail is biased?

Page 6: Applied Bayesian Analysis for the Social Sciences

Classical Statistics is Atheoretical

• Science is an iterative process; we should learn from past research.
• Theory should guide us in how we analyze data.
– Typically, beyond the literature review, theory informs:
  • Variable selection
  • Model building
  • Choice of model (e.g. SEM, HLM)
  • NOT the actual way parameters are estimated in the analysis

Page 7: Applied Bayesian Analysis for the Social Sciences

Bayesian Statistics and Theory

• Bayesian statistics considers $\theta$ to be unknown, possessing a probability distribution reflecting our degree of uncertainty about it.
• We take into consideration theory and uncertainty when estimating $\theta$.
• The Posterior: A probability distribution for $\theta$ given our data on hand.
• The Data: Need only meet the assumption of exchangeability.
• The Prior: A distribution based on our knowledge about $\theta$ and how certain we are of it.

$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$$
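To make the proportionality concrete, here is a minimal Python sketch that evaluates prior times likelihood on a grid of candidate values and normalizes the result. The coin-flip data (7 heads in 10 flips), the flat prior, and the name `grid_posterior` are all assumptions chosen for illustration.

```python
def grid_posterior(heads, flips, prior, grid_size=101):
    """Approximate the posterior on an evenly spaced grid over [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior = prior * binomial likelihood (constant factor dropped).
    unnormalized = [prior(t) * t**heads * (1 - t) ** (flips - heads) for t in grid]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]

def flat_prior(theta):
    return 1.0  # every value of theta is equally plausible beforehand

grid, post = grid_posterior(heads=7, flips=10, prior=flat_prior)
post_mean = sum(t * p for t, p in zip(grid, post))
post_mode = grid[post.index(max(post))]
print(f"posterior mean ~ {post_mean:.3f}, posterior mode ~ {post_mode:.2f}")
```

Swapping `flat_prior` for a function that concentrates mass near a theoretically expected value shows how the prior pulls the posterior.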

Page 8: Applied Bayesian Analysis for the Social Sciences

The Nail Flip

• Bayesian
– Prior Beliefs: We consult several nail experts, who are relatively certain that nails land on their heads only 1 time in 50, or 2% of the time.
– Data on Hand: We flip the nail 100 times.
– Posterior: We combine our prior beliefs with the data on hand and sample from the resulting posterior distribution to see whether the experts’ opinions are reasonable and/or whether our nail shares a similar bias to other nails.

Well, if we examine the anatomy of the nail…
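One way to picture the nail example numerically is sketched below. The slides do not say how the experts' belief is encoded or what the 100 flips produced, so everything specific here is an assumption: the prior is taken to be Beta(1, 49) (mean 2%, worth about 50 imaginary flips), and the data are a hypothetical 5 heads in 100 flips.

```python
def beta_binomial_update(a, b, heads, flips):
    """A Beta(a, b) prior plus binomial data gives Beta(a + heads, b + tails)."""
    return a + heads, b + (flips - heads)

def beta_mean(a, b):
    return a / (a + b)

prior_a, prior_b = 1.0, 49.0      # assumed encoding of "1 time in 50"
heads, flips = 5, 100             # hypothetical outcome of our 100 flips

post_a, post_b = beta_binomial_update(prior_a, prior_b, heads, flips)
print(f"prior mean     = {beta_mean(prior_a, prior_b):.3f}")
print(f"sample rate    = {heads / flips:.3f}")
print(f"posterior mean = {beta_mean(post_a, post_b):.3f}")
```

The posterior mean lands between the experts' 2% and the observed 5%, weighted by how much information each side carries.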

Page 9: Applied Bayesian Analysis for the Social Sciences

Priors and Subjectivity

• “B-B-B-Bbbbut wait, aren’t these priors subjective? We are objective scientists!”
– Variable selection, model choice, and research questions are all subjective decisions.

• By making these subjective decisions explicit, we open ourselves to critique and are forced to thoughtfully choose and defend our choice of priors. If we have no good theory, we must choose a prior that lets the data speak for itself.

Page 10: Applied Bayesian Analysis for the Social Sciences

Choosing Sensible Priors

• How much do we know? How accurate do we take this information to be?
– Informative priors: Historical data, expert opinion, past research findings, theoretical implications.
– Non-informative prior: Uniform distribution over a sensible range of values.
• If the prior has high precision ($1/\sigma^2$) or N is small, the prior will heavily influence the posterior distribution. If the prior has low precision or N is large, the data influence the posterior more.
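The trade-off between prior precision and sample size can be seen in the textbook case of a normal mean with known data variance, where the posterior mean is a precision-weighted average of the prior mean and the sample mean. The sketch below assumes that model; the function name and all numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, data_mean, data_var, n):
    """Posterior mean and variance for a normal mean with known data variance."""
    prior_precision = 1.0 / prior_var      # precision = 1 / sigma^2
    data_precision = n / data_var          # n observations, each with variance data_var
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * data_mean) / post_precision
    return post_mean, 1.0 / post_precision

# Prior centered at 0, data centered at 10, data variance 4.
tight_small, _ = normal_posterior(0.0, 0.01, 10.0, 4.0, n=5)      # precise prior, small N
vague_small, _ = normal_posterior(0.0, 100.0, 10.0, 4.0, n=5)     # vague prior, small N
tight_large, _ = normal_posterior(0.0, 0.01, 10.0, 4.0, n=5000)   # precise prior, large N
print(tight_small, vague_small, tight_large)
```

With a precise prior and five observations the posterior mean stays near the prior; with a vague prior or thousands of observations it moves to the data.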

Page 11: Applied Bayesian Analysis for the Social Sciences

Conjugate Prior Distributions

• Conjugate priors have a distribution that yields a posterior distribution in the same family as the prior when combined with data.

Data Distribution    Conjugate Prior
Normal               Normal or Uniform
Poisson              Gamma
Binomial             Beta
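As a small illustration of the Poisson row, the sketch below updates a Gamma prior with count data; the posterior is again a Gamma, with the counts added to the shape and the number of observations added to the rate. The shape/rate parameterization, the prior values, and the counts are assumptions made for the example.

```python
def gamma_poisson_update(shape, rate, counts):
    """A Gamma(shape, rate) prior with Poisson counts gives a Gamma posterior."""
    return shape + sum(counts), rate + len(counts)

counts = [3, 1, 4, 2, 0]                      # hypothetical event counts
prior_shape, prior_rate = 2.0, 1.0            # hypothetical prior, mean 2.0
post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, counts)

post_mean = post_shape / post_rate            # mean of a Gamma is shape / rate
print(f"posterior: Gamma({post_shape}, {post_rate}), mean = {post_mean:.2f}")
```

Because the posterior stays in the Gamma family, no sampling is needed to summarize it in this simple case.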

Page 12: Applied Bayesian Analysis for the Social Sciences

The Posterior Distribution and Monte Carlo Integration

• Recall that $p(\theta \mid y)$ is a probability distribution.
• It is computationally demanding to directly derive summary measures of $p(\theta \mid y)$.
• Instead, we repeatedly sample from $p(\theta \mid y)$ and summarize the distribution formed by these samples.
– This is called Monte Carlo integration.
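A minimal sketch of Monte Carlo integration in Python follows. It assumes, purely for illustration, that the posterior is a Beta(8, 4) distribution (what a flat prior and 7 heads in 10 flips would give), so the sampled summaries can be compared with the known mean of 8/12.

```python
import random

rng = random.Random(1)                      # fixed seed so the run is repeatable
draws = [rng.betavariate(8, 4) for _ in range(20000)]

# Summaries are just averages over the draws.
mc_mean = sum(draws) / len(draws)
mc_prob_above_half = sum(d > 0.5 for d in draws) / len(draws)

print(f"Monte Carlo mean    = {mc_mean:.3f} (exact: {8 / 12:.3f})")
print(f"P(theta > 0.5 | y) ~ {mc_prob_above_half:.3f}")
```

Any other summary, such as a quantile or the probability of a region, is computed the same way from the same draws.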

Page 13: Applied Bayesian Analysis for the Social Sciences

Monte Carlo Markov Chains, Explained

Page 14: Applied Bayesian Analysis for the Social Sciences

Markov Chains, Continued

• We specify the number of chains as well as the number of iterations made.
• The chains “dance” around the posterior from their starting values, moving to areas of higher density.
• Chains stabilize around the posterior mean.
• Once stabilized, discard early iterations (burn-in samples).
• Estimates of the posterior come from the post-burn-in period.
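The behavior described above can be sketched with a random-walk Metropolis sampler, one of the simplest MCMC algorithms. The slides do not say which sampler is used, so this is an assumed stand-in; the target (7 heads in 10 flips under a flat prior), the step size, and the function names are all hypothetical.

```python
import math
import random

def log_posterior(theta):
    """Unnormalized log posterior for 7 heads in 10 flips under a flat prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return 7 * math.log(theta) + 3 * math.log(1.0 - theta)

def metropolis_chain(start, n_iter, step, seed):
    """Propose a nearby value; accept with probability min(1, posterior ratio)."""
    rng = random.Random(seed)
    current, chain = start, []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        log_ratio = log_posterior(proposal) - log_posterior(current)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
        chain.append(current)
    return chain

n_iter, burn_in = 5000, 1000
starts = [0.05, 0.50, 0.95]                 # deliberately spread-out starting values
chains = [metropolis_chain(s, n_iter, step=0.2, seed=i) for i, s in enumerate(starts)]
kept = [chain[burn_in:] for chain in chains]            # discard burn-in samples
chain_means = [sum(k) / len(k) for k in kept]
print([round(m, 3) for m in chain_means])
```

Despite the different starting values, the post-burn-in means of the three chains should all sit close to the exact posterior mean of 8/12.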

Page 15: Applied Bayesian Analysis for the Social Sciences

Bayesian Analysis (Finally!)

• Decide on a model.
• Specify the number of Markov chains, the number of iterations, a burn-in period, and your prior beliefs.
• Run model diagnostics to check for convergence.
• Compare results of models with different specifications of priors, parameters, etc. to see which best “returns” the data in hand or obtains the best model fit (e.g. BIC, Bayes factor, deviance).
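The slides do not name a specific convergence diagnostic. One widely used choice is the Gelman-Rubin statistic (R-hat), which compares the variance between chains with the variance within chains; values near 1 suggest the chains agree. The sketch below is a basic version of that statistic, run on simulated chains that are hypothetical stand-ins for real sampler output.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic R-hat for equal-length chains: sqrt(pooled variance / within variance)."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])     # average within-chain variance
    between = n * variance(chain_means)              # variance between chain means
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Three chains sampling the same distribution (converged).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Three chains stuck in different places (not converged).
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 3.0, 6.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"R-hat, well mixed: {rhat_mixed:.3f}")
print(f"R-hat, stuck:      {rhat_stuck:.3f}")
```

A common rule of thumb treats values well above 1 as a sign to run the chains longer or revisit the model.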

Page 16: Applied Bayesian Analysis for the Social Sciences

Overcoming Classical Shortcomings

Classical shortcoming → Bayesian response

• Conducting an infinite number of experiments / repeated sampling → Only use data on hand, no extrapolating to other potential(ly conflicting) data
• Assume that some parameter is unknown but has a fixed value → Directly estimate our uncertainty about $\theta$
• P-value worship → Report HDIs (highest density intervals), thoughtfully draw conclusions
• Null hypothesis testing → More meaningful hypothesis testing (e.g. different priors)
• Multiple comparisons → Not an issue
• Strict data assumptions, often unmet → Minimal assumptions (exchangeability)
• Confidence interval interpretation → HDI shows the believability (probability) of values
• Small samples are an issue → If strong priors, still useful
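Since the HDI carries much of the reporting burden in this comparison, here is a minimal sketch of how one can be estimated from posterior draws: sort the draws and take the narrowest window that contains the requested share of them. The Beta(8, 4) posterior and the function name `hdi` are assumptions for illustration.

```python
import random

def hdi(samples, mass=0.95):
    """Narrowest interval containing the given share of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, round(mass * n))         # number of draws the interval must hold
    widths = [ordered[i + window - 1] - ordered[i] for i in range(n - window + 1)]
    start = widths.index(min(widths))
    return ordered[start], ordered[start + window - 1]

rng = random.Random(2)
draws = [rng.betavariate(8, 4) for _ in range(20000)]   # hypothetical posterior draws
low, high = hdi(draws)
inside = sum(low <= d <= high for d in draws) / len(draws)
print(f"95% HDI ~ ({low:.3f}, {high:.3f}), share of draws inside = {inside:.3f}")
```

Unlike a confidence interval, this interval can be read directly: given the model and prior, 95% of the posterior probability lies inside it.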

Page 17: Applied Bayesian Analysis for the Social Sciences

References

Kruschke, J. K. (2011). Doing Bayesian Data Analysis: A tutorial with R and BUGS. Oxford: Academic Press.

Kaplan, D. (2014). Bayesian Statistics for the Social Sciences. New York: Guilford Press.

Page 18: Applied Bayesian Analysis for the Social Sciences

R “MCMCpack” Tutorial

• Run simple models predicting job satisfaction as a function of income.

• One model uses an uninformative prior (specifically, the uniform distribution)

• The other uses an informed prior from earlier data

• Compare the Bayes Factors to see which “retrieves” the data better (i.e. is a better fit)
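The tutorial itself runs in R with MCMCpack. As a language-neutral illustration of what a Bayes factor compares, the Python sketch below returns to the nail flip, where the marginal likelihood has a closed form. The priors (uniform Beta(1, 1) versus an assumed informed Beta(1, 49)) and the data (3 heads in 100 flips) are hypothetical and are not the job satisfaction data used in the tutorial.

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(heads, flips, a, b):
    """Log probability of the data averaged over a Beta(a, b) prior."""
    tails = flips - heads
    return log(comb(flips, heads)) + log_beta(a + heads, b + tails) - log_beta(a, b)

heads, flips = 3, 100                                   # hypothetical nail data
log_m_informed = log_marginal_likelihood(heads, flips, 1.0, 49.0)
log_m_uniform = log_marginal_likelihood(heads, flips, 1.0, 1.0)

bayes_factor = exp(log_m_informed - log_m_uniform)
print(f"Bayes factor (informed vs uniform prior) = {bayes_factor:.1f}")
```

A Bayes factor above 1 favors the informed prior for these data; had the nail landed on its head half the time, the same calculation would favor the uniform prior instead.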

Page 19: Applied Bayesian Analysis for the Social Sciences

R “MCMCpack” Tutorial

• Open R.
• Click “Packages” --> “Set CRAN mirror” --> pick any mirror in the US.
• Open “Packages” again --> “Install Packages” --> scroll down to MCMCpack.
• Say “yes” to a new library.
• Type “library(MCMCpack)” to load it; also type “library(foreign)” to enable reading of the Stata file.