-
Introduction to Probability and Statistics
Handout #8
Instructor: Lingzhou Xue
TA: Daniel Eck
The PDF file for this class is available on the class web page.
1
-
Chapter 9
One and Two-Sample Estimation Problems
Statistical Inference
2
-
Statistical Inference
Estimation: taking a random sample from the distribution to elicit some information about the unknown parameter θ.
Example: a candidate for public office may wish to estimate the true proportion of voters favoring him by obtaining the opinions from a random sample of 100 eligible voters.
Hypothesis Testing: hypothesis tests are procedures for making rational decisions about the reality of effects.
Example: one is interested in finding out whether or not a new drug is efficacious. He or she conducts a series of drug trials and studies whether the patients who received the drug did better than those who received a placebo.
3
-
Classical Methods of Estimation
4
-
Point Estimate
A point estimate is a single value used to estimate a population parameter.
Interval Estimate
An interval estimate is an interval, or range of values, used to
estimate a population parameter.
5
-
What are the desirable properties of a good decision function
that would influence us to choose one estimator rather than
another?
Unbiased Estimator
A statistic Θ̂ is said to be an unbiased estimator of the parameter θ if
E(Θ̂) = θ.
Bias
Any sampling procedure that produces inferences that consistently overestimate or consistently underestimate some characteristic of the population is said to be biased.
6
-
Example 1
If X1, X2, . . . , Xn are binomial random variables with parameters n and p, show that
1. P̂1 = X̄/n is an unbiased estimator of p, where X̄ = (X1 + X2 + · · · + Xn)/n.
2. P̂2 = X1/n is an unbiased estimator of p.
3. P̂3 = (X1 + Xn)/(n + 1) is not an unbiased estimator of p.
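A quick Monte Carlo check of these three claims — a sketch only; the values n = 10, p = 0.3, the number of replications, and the pure-Python binomial sampler are illustration choices, not part of the example:

```python
import random

random.seed(0)
n, p, reps = 10, 0.3, 20000  # arbitrary illustration choices

def binomial(n, p):
    """Draw one Binomial(n, p) variate as a sum of Bernoulli trials."""
    return sum(random.random() < p for _ in range(n))

p1_vals, p2_vals, p3_vals = [], [], []
for _ in range(reps):
    xs = [binomial(n, p) for _ in range(n)]      # X1, ..., Xn
    xbar = sum(xs) / n
    p1_vals.append(xbar / n)                     # P̂1 = X̄/n
    p2_vals.append(xs[0] / n)                    # P̂2 = X1/n
    p3_vals.append((xs[0] + xs[-1]) / (n + 1))   # P̂3 = (X1 + Xn)/(n + 1)

print(sum(p1_vals) / reps)  # ≈ p = 0.3 (unbiased)
print(sum(p2_vals) / reps)  # ≈ p = 0.3 (unbiased)
print(sum(p3_vals) / reps)  # ≈ 2np/(n + 1) = 6/11 ≈ 0.545, not p (biased)
```

Since E(Xi) = np, the averages of P̂1 and P̂2 settle near p, while P̂3 settles near 2np/(n + 1).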
7
-
Example 2
Show that S² is an unbiased estimator of the parameter σ², which illustrates why we divide by n − 1 rather than n when the variance is estimated.
Proof:
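Before the algebraic proof, the claim can be checked by simulation — a sketch, assuming normal data with σ² = 4 and sample size n = 5 (both arbitrary choices):

```python
import random

random.seed(1)
sigma2, n, reps = 4.0, 5, 50000  # arbitrary illustration choices

s2_n_minus_1, s2_n = [], []
for _ in range(reps):
    xs = [random.gauss(0, 2) for _ in range(n)]  # Normal(0, σ = 2)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2_n_minus_1.append(ss / (n - 1))  # S², divide by n - 1 (unbiased)
    s2_n.append(ss / n)                # divide by n: E = (n - 1)σ²/n, biased low

print(sum(s2_n_minus_1) / reps)  # ≈ σ² = 4
print(sum(s2_n) / reps)          # ≈ (n - 1)σ²/n = 3.2
```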
8
-
Most Efficient Estimator
If we consider all possible unbiased estimators of some parameter θ, the one with the smallest variance is called the most efficient estimator of θ.
If Θ̂1 and Θ̂2 are two unbiased estimators of the same population parameter θ and σ²_{Θ̂1} < σ²_{Θ̂2}, we say that Θ̂1 is a more efficient estimator of θ than Θ̂2.
9
-
[Figure: sampling distributions of different estimators of θ.]
10
-
Example 3
If X1, X2, . . . , Xn are binomial random variables with parameters n and p, show that
1. P̂1 = X̄/n is an unbiased estimator of p, where X̄ = (X1 + X2 + · · · + Xn)/n.
2. P̂2 = X1/n is an unbiased estimator of p.
Which one is the more efficient estimator of p?
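A simulation sketch of the comparison (n = 10, p = 0.3, and the helper functions are arbitrary illustration choices). Both estimators are unbiased, but theory gives Var(P̂1) = p(1 − p)/n² and Var(P̂2) = p(1 − p)/n, so P̂1 is the more efficient:

```python
import random

random.seed(2)
n, p, reps = 10, 0.3, 20000  # arbitrary illustration choices

def binomial(n, p):
    """Draw one Binomial(n, p) variate as a sum of Bernoulli trials."""
    return sum(random.random() < p for _ in range(n))

p1_vals, p2_vals = [], []
for _ in range(reps):
    xs = [binomial(n, p) for _ in range(n)]
    p1_vals.append(sum(xs) / n / n)  # P̂1 = X̄/n
    p2_vals.append(xs[0] / n)        # P̂2 = X1/n

def var(vals):
    """Plain sample variance of the simulated estimates."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

print(var(p1_vals))  # ≈ p(1 - p)/n² = 0.0021
print(var(p2_vals))  # ≈ p(1 - p)/n  = 0.021
```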
11
-
The Notion of an Interval Estimate
Even the most efficient unbiased estimator is unlikely to estimate the population parameter exactly. It is true that our accuracy increases with large samples, but there is still no reason why we should expect a point estimate from a given sample to be exactly equal to the population parameter it is supposed to estimate. There are many situations in which it is preferable to determine an interval within which we would expect to find the value of the parameter. Such an interval is called an interval estimate.
12
-
Interval Estimation
An interval estimate of a population parameter θ is an interval of the form θ̂_L < θ < θ̂_U, where θ̂_L and θ̂_U depend on the value of the statistic Θ̂ for a particular sample and also on the sampling distribution of Θ̂.
13
-
Interpretation of Interval Estimation
Since different samples will generally yield different values of Θ̂ and, therefore, different values θ̂_L and θ̂_U, these endpoints of the interval are values of corresponding random variables Θ̂_L and Θ̂_U.
From the sampling distribution of Θ̂ we shall be able to determine Θ̂_L and Θ̂_U such that
P(Θ̂_L < θ < Θ̂_U) = 1 − α, 0 < α < 1.
Then we have a probability of 1 − α of selecting a random sample that will produce an interval containing θ.
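The "probability 1 − α" statement can be illustrated by repeated sampling — a sketch assuming normal data with known σ and the interval X̄ ± z_{α/2}·σ/√n; the values μ = 5, σ = 2, n = 30, α = 0.05 are arbitrary choices:

```python
import random
import statistics

random.seed(3)
mu, sigma, n, alpha, reps = 5.0, 2.0, 30, 0.05, 5000  # arbitrary choices
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)    # z_{α/2} ≈ 1.96

covered = 0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    half = z * sigma / n ** 0.5          # half-width z_{α/2}·σ/√n
    if xbar - half < mu < xbar + half:   # did this interval capture μ?
        covered += 1

print(covered / reps)  # ≈ 1 - α = 0.95
```

About 95% of the randomly generated intervals contain the true μ, which is exactly what the probability statement above asserts.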
14
-
Confidence Interval
The interval θ̂_L < θ < θ̂_U, computed from the selected sample, is then called a 100(1 − α)% confidence interval; the fraction 1 − α is called the confidence coefficient or the degree of confidence, and the endpoints, θ̂_L and θ̂_U, are called the lower and upper confidence limits.
The wider the confidence interval is, the more confident we can be that the given interval contains the unknown parameter.
15
-
Example 4
It is better to be 95% confident that the average life of a certain
television transistor is between 6 and 7 years than to be 99%
confident that it is between 3 and 10 years.
Ideally, we prefer a short interval with a high degree of
confidence. Sometimes, restrictions on the size of our sample
prevent us from achieving short intervals without sacrificing some
of our degree of confidence.
16
-
Single Sample: Estimating the Mean
17
-
If n is sufficiently large, according to the central limit theorem, we can establish a confidence interval for μ by considering the sampling distribution of X̄. We expect the sampling distribution of X̄ to be approximately normal with mean μ_X̄ = μ and standard deviation σ_X̄ = σ/√n. Then, we have
P(−z_{α/2} < Z < z_{α/2}) = 1 − α,
where
Z = (X̄ − μ)/(σ/√n).
Hence
P(X̄ − z_{α/2}·σ/√n < μ < X̄ + z_{α/2}·σ/√n) = 1 − α.
Upper one-sided bound:
P((X̄ − μ)/(σ/√n) > −z_α) = 1 − α, which gives P(μ < X̄ + z_α·σ/√n) = 1 − α.
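The two-sided interval can be computed directly — a sketch on hypothetical numbers (x̄ = 10, σ = 3, n = 25, α = 0.05), using Python's standard library for the normal quantile:

```python
import statistics

# Hypothetical data summary: these numbers are illustration choices.
xbar, sigma, n, alpha = 10.0, 3.0, 25, 0.05

z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # z_{α/2} ≈ 1.96
half = z * sigma / n ** 0.5                          # z_{α/2}·σ/√n
lower, upper = xbar - half, xbar + half              # x̄ ± z_{α/2}·σ/√n

print(round(lower, 3), round(upper, 3))  # 8.824 11.176
```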
27
-
One-Sided Confidence Bounds on μ; σ Known
If X̄ is the mean of a random sample of size n from a population with variance σ², the one-sided 100(1 − α)% confidence bounds for μ are given by
Lower one-sided bound: x̄ − z_α·σ/√n.
Upper one-sided bound: x̄ + z_α·σ/√n.
28
-
Example 8
In a psychological testing experiment, 36 subjects are selected randomly and their reaction time, in seconds, to a particular experiment is measured. Past experience suggests that the variance in reaction time to these types of stimuli is 4 sec². The average time for the subjects was 6.2 seconds. Give an upper 95% bound for the mean reaction time.
Solution:
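A sketch of the computation, using the numbers given in the example (x̄ = 6.2, σ = √4 = 2, n = 36, α = 0.05) and the standard library's normal quantile:

```python
import statistics

# Data summary from the example.
xbar, sigma, n, alpha = 6.2, 2.0, 36, 0.05

z = statistics.NormalDist().inv_cdf(1 - alpha)  # z_0.05 ≈ 1.645
upper = xbar + z * sigma / n ** 0.5             # μ < x̄ + z_α·σ/√n

print(round(upper, 3))  # 6.748
```

So we are 95% confident that the mean reaction time is below about 6.748 seconds.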
29
-
The Case of σ Unknown
30
-
In a situation with σ unknown, the random variable
T = (X̄ − μ)/(S/√n)
has a Student t-distribution with ν = n − 1 degrees of freedom when the sample size is n. T can be used to construct a confidence interval on μ:
P(−t_{α/2}(ν = n − 1) < T < t_{α/2}(ν = n − 1)) = 1 − α.