Uploaded by doomy-jones on 06-Mar-2016

  • Introduction to Probability and Statistics

    Handout #8

    Instructor: Lingzhou Xue

    TA: Daniel Eck

    The pdf file for this class is available on the class web page.

    1

  • Chapter 9

    One and Two-Sample Estimation Problems

    Statistical Inference

    2

  • Statistical Inference

    Estimation: taking a random sample from the distribution to elicit some information about the unknown parameter θ.

    Example: a candidate for public office may wish to estimate the true proportion of voters favoring him by obtaining the opinions from a random sample of 100 eligible voters.

    Hypothesis Testing: hypothesis tests are procedures for making rational decisions about the reality of effects.

    Example: one is interested in finding out whether or not a new drug is efficacious. He or she conducts a series of drug trials and studies whether patients who received the drug did better than those who received a placebo.

    3

  • Classical Methods of Estimation

    4

  • Point Estimate

    A point estimate is a single value used to estimate a population parameter.

    Interval Estimate

    An interval estimate is an interval, or range of values, used to

    estimate a population parameter.

    5

  • What are the desirable properties of a good decision function

    that would influence us to choose one estimator rather than

    another?

    Unbiased Estimator

    A statistic Θ̂ is said to be an unbiased estimator of the parameter θ if

    E(Θ̂) = θ.

    Bias

    Any sampling procedure that produces inferences that consistently overestimate or consistently underestimate some characteristic of the population is said to be biased.

    6

  • Example 1

    If X1, X2, . . . , Xn are binomial random variables with parameters n and p, show that

    1. P1 = X̄/n is an unbiased estimator of p, where X̄ = (X1 + X2 + · · · + Xn)/n.

    2. P2 = X1/n is an unbiased estimator of p.

    3. P3 = (X1 + Xn)/(n + 1) is not an unbiased estimator of p.

    7
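A quick Monte Carlo check of Example 1 (a sketch, not a proof — the sample size n = 10, p = 0.3, and the replicate count are arbitrary illustration choices):

```python
import random

def binomial(n, p):
    """One Binomial(n, p) draw as a sum of n Bernoulli(p) trials."""
    return sum(random.random() < p for _ in range(n))

random.seed(0)
n, p, reps = 10, 0.3, 20_000

p1_sum = p2_sum = p3_sum = 0.0
for _ in range(reps):
    xs = [binomial(n, p) for _ in range(n)]   # X1, ..., Xn
    xbar = sum(xs) / n
    p1_sum += xbar / n                        # P1 = X̄/n
    p2_sum += xs[0] / n                       # P2 = X1/n
    p3_sum += (xs[0] + xs[-1]) / (n + 1)      # P3 = (X1 + Xn)/(n + 1)

print(round(p1_sum / reps, 2))   # near p = 0.3 (unbiased)
print(round(p2_sum / reps, 2))   # near p = 0.3 (unbiased)
print(round(p3_sum / reps, 2))   # near 2np/(n + 1) ≈ 0.55, not p (biased)
```

The averages of P1 and P2 settle at p, while P3 settles at 2np/(n + 1), matching the algebraic check E(P3) = [E(X1) + E(Xn)]/(n + 1) = 2np/(n + 1).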

  • Example 2

    Show that S² is an unbiased estimator of the parameter σ², which illustrates why we divide by n − 1 rather than n when the variance is estimated.

    Proof:

    8
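The proof left blank on the slide is the standard argument; using E(Xi) = μ, Var(Xi) = σ², and Var(X̄) = σ²/n, a sketch runs:

```latex
\begin{align*}
E(S^2) &= \frac{1}{n-1}\,E\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]
        = \frac{1}{n-1}\,E\left[\sum_{i=1}^{n}(X_i-\mu)^2 - n(\bar{X}-\mu)^2\right]\\
       &= \frac{1}{n-1}\left[n\sigma^2 - n\cdot\frac{\sigma^2}{n}\right]
        = \frac{(n-1)\sigma^2}{n-1} = \sigma^2 .
\end{align*}
```

Dividing by n instead of n − 1 would give expectation (n − 1)σ²/n, a systematic underestimate of σ².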

  • Most Efficient Estimator

    If we consider all possible unbiased estimators of some parameter θ, the one with the smallest variance is called the most efficient estimator of θ.

    If Θ̂_1 and Θ̂_2 are two unbiased estimators of the same population parameter θ and σ²_{Θ̂_1} < σ²_{Θ̂_2}, we say that Θ̂_1 is a more efficient estimator of θ than Θ̂_2.

    9

    Sampling distributions of different estimators of θ.

    10

  • Example 3

    If X1, X2, . . . , Xn are binomial random variables with parameters n and p, show that

    1. P1 = X̄/n is an unbiased estimator of p, where X̄ = (X1 + X2 + · · · + Xn)/n.

    2. P2 = X1/n is an unbiased estimator of p.

    Which one is the more efficient estimator of p?

    11
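Both P1 and P2 in Example 3 are unbiased, so efficiency is decided by variance. Assuming the Xi are independent, Var(P1) = p(1 − p)/n² while Var(P2) = p(1 − p)/n, so P1 is the more efficient estimator. A simulation sketch (n, p, and the replicate count are arbitrary choices):

```python
import random

def binomial(n, p):
    """One Binomial(n, p) draw as a sum of n Bernoulli(p) trials."""
    return sum(random.random() < p for _ in range(n))

def sample_var(vals):
    """Sample variance with the n - 1 divisor."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)

random.seed(1)
n, p, reps = 10, 0.3, 20_000

p1_vals, p2_vals = [], []
for _ in range(reps):
    xs = [binomial(n, p) for _ in range(n)]
    p1_vals.append(sum(xs) / n / n)   # P1 = X̄/n, uses all n observations
    p2_vals.append(xs[0] / n)         # P2 = X1/n, uses only the first

print(sample_var(p1_vals))  # near p(1 - p)/n^2 = 0.0021
print(sample_var(p2_vals))  # near p(1 - p)/n  = 0.021
```

P1 pools all n observations, which shrinks its variance by a factor of n relative to P2.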

  • The Notion of an Interval Estimate

    Even the most efficient unbiased estimator is unlikely to estimate the population parameter exactly. It is true that our accuracy increases with large samples, but there is still no reason why we should expect a point estimate from a given sample to be exactly equal to the population parameter it is supposed to estimate. There are many situations in which it is preferable to determine an interval within which we would expect to find the value of the parameter. Such an interval is called an interval estimate.

    12

  • Interval Estimation

    An interval estimate of a population parameter θ is an interval of the form θ̂_L < θ < θ̂_U, where the endpoints θ̂_L and θ̂_U depend on the value of the statistic Θ̂ for a particular sample and also on the sampling distribution of Θ̂.

    13

  • Interpretation of Interval Estimation

    Since different samples will generally yield different values of Θ̂ and, therefore, different values θ̂_L and θ̂_U, these endpoints of the interval are values of corresponding random variables Θ̂_L and Θ̂_U.

    From the sampling distribution of Θ̂ we shall be able to determine Θ̂_L and Θ̂_U such that

    P(Θ̂_L < θ < Θ̂_U) = 1 − α,  0 < α < 1.

    Then we have a probability of 1 − α of selecting a random sample that will produce an interval containing θ.

    14
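This interpretation can be illustrated by simulation: with σ known, repeatedly draw samples, form the interval x̄ ± z_{α/2} σ/√n, and count how often it contains the true μ (all numbers here are made-up illustration values):

```python
import random
from statistics import NormalDist

random.seed(2)
mu, sigma, n = 5.0, 2.0, 30       # true parameters (pretend mu is unknown)
alpha, reps = 0.05, 5_000

z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{α/2} ≈ 1.96
half_width = z * sigma / n ** 0.5

covered = 0
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if xbar - half_width < mu < xbar + half_width:
        covered += 1

print(covered / reps)   # close to 1 - alpha = 0.95
```

Roughly 95% of the randomly generated intervals contain μ, which is exactly the "probability 1 − α of selecting a random sample" statement above: the probability attaches to the procedure, not to any single computed interval.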

  • Confidence Interval

    The interval θ̂_L < θ < θ̂_U, computed from the selected sample, is then called a 100(1 − α)% confidence interval, the fraction 1 − α is called the confidence coefficient or the degree of confidence, and the endpoints θ̂_L and θ̂_U are called the lower and upper confidence limits.

    The wider the confidence interval is, the more confident we can be that the given interval contains the unknown parameter.

    15

  • Example 4

    It is better to be 95% confident that the average life of a certain

    television transistor is between 6 and 7 years than to be 99%

    confident that it is between 3 and 10 years.

    Ideally, we prefer a short interval with a high degree of

    confidence. Sometimes, restrictions on the size of our sample

    prevent us from achieving short intervals without sacrificing some

    of our degree of confidence.

    16

  • Single Sample: Estimating the Mean

    17

  • If n is sufficiently large, according to the central limit theorem, we can establish a confidence interval for μ by considering the sampling distribution of X̄. We expect the sampling distribution of X̄ to be approximately normal with mean μ_X̄ = μ and standard deviation σ_X̄ = σ/√n. Then we have

    P(−z_{α/2} < Z < z_{α/2}) = 1 − α,

    where

    Z = (X̄ − μ)/(σ/√n).

    Hence

    P(X̄ − z_{α/2} σ/√n < μ < X̄ + z_{α/2} σ/√n) = 1 − α.

    Upper one-sided bound:

    P((X̄ − μ)/(σ/√n) > −z_α) = 1 − α,  which gives  P(μ < X̄ + z_α σ/√n) = 1 − α.

    27
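The algebra above rearranges directly into a computable interval, x̄ ± z_{α/2} σ/√n. A small helper sketch (the sample numbers below are made up for illustration):

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided 100*conf% confidence interval for mu with sigma known."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # z_{α/2}
    half = z * sigma / n ** 0.5
    return xbar - half, xbar + half

lo, hi = z_interval(xbar=2.6, sigma=0.3, n=36, conf=0.95)
print(round(lo, 3), round(hi, 3))   # 2.502 2.698
```

With x̄ = 2.6, σ = 0.3, and n = 36, the 95% interval is 2.6 ± 1.96 · 0.3/6 ≈ (2.502, 2.698).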

  • One-Sided Confidence Bounds on μ; σ Known

    If X̄ is the mean of a random sample of size n from a population with variance σ², the one-sided 100(1 − α)% confidence bounds for μ are given by

    Lower one-sided bound:

    x̄ − z_α σ/√n.

    Upper one-sided bound:

    x̄ + z_α σ/√n.

    28

  • Example 8

    In a psychological testing experiment, 36 subjects are selected randomly and their reaction time, in seconds, to a particular experiment is measured. Past experience suggests that the variance in reaction time to these types of stimuli is 4 sec². The average time for the subjects was 6.2 seconds. Give an upper 95% bound for the mean reaction time.

    Solution:

    29
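The solution left blank on the slide can be sketched numerically: with σ² = 4 known and n = 36, the upper bound is x̄ + z_{0.05} σ/√n:

```python
from statistics import NormalDist

xbar, var, n, conf = 6.2, 4.0, 36, 0.95    # numbers from Example 8
z = NormalDist().inv_cdf(conf)             # z_{0.05} ≈ 1.645
upper = xbar + z * var ** 0.5 / n ** 0.5   # xbar + z * sigma / sqrt(n)
print(round(upper, 2))                     # 6.75
```

So we can be 95% confident that the mean reaction time is below about 6.75 seconds.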

  • The Case of σ Unknown

    30

  • In a situation where σ is unknown, the random variable

    T = (X̄ − μ)/(S/√n)

    has a Student t-distribution with ν = n − 1 degrees of freedom, where n is the sample size. T can be used to construct a confidence interval on μ.

    P(−t_{α/2} < T < t_{α/2}) = 1 − α,

    where t_{α/2} is the t-value with ν = n − 1 degrees of freedom.