chapter 3: discrete random variables and their distributions


Chapter 3: Discrete Random Variables

Professor Ron Fricker
Naval Postgraduate School
Monterey, California

3/15/15

Reading Assignment: Sections 3.1 – 3.8 & 3.11 – 3.12

Goals for this Chapter

•  Learn about discrete random variables
   –  Probability distributions
   –  Expected value, variance, and standard deviation
•  Define specific distributions
   –  Binomial
   –  Geometric
   –  Negative Binomial
   –  Hypergeometric
   –  Poisson
•  Define and apply Tchebysheff’s Theorem


Sections 3.1 and 3.2:Some Definitions

•  Random variables are a fundamental concept in probability
   –  They’re random because they take on random values based on the outcome of the experiment
   –  Every random variable is associated with a probability distribution that specifies the possible r.v. values & the probability each value will occur
•  Definition 3.1: A random variable is said to be discrete if it can assume only a finite or countably infinite number of distinct values
   –  E.g., if Y represents the outcome of a six-sided die, it can only take on values 1, 2, 3, 4, 5, and 6

Remember the Notation

•  Capital Roman letters denote random variables, typically X, Y, and Z
   –  These are “placeholders” for the values an experiment will take on once it is complete
•  Small Roman letters denote the value of a random variable, typically x, y, and z
   –  They are the outcomes of the experiment
•  So, we can write things like P(Y = y), which in words is the “probability that random variable Y takes on the value y”
   –  Get comfortable with this notation – you will use it a lot in this class and in this curriculum!

The Probability of a R.V. Outcome

•  Definition 3.2: The probability that Y takes on the value y, P(Y = y), is defined as the sum of the probabilities of all the sample points in S that are assigned the value y–  We sometimes denote P(Y = y) by p(y)

•  Because p(y) is a function that assigns probabilities to each value of the random variable Y, it is sometimes referred to as a probability function


Probability Distributions

•  Definition 3.3: The probability distribution for a discrete random variable Y can be represented by a formula, table, or a graph that provides p(y) = P(Y = y) for all y
•  Note that p(y) ≥ 0 for all y, but the probability distribution only assigns nonzero probabilities to a countable number of y values
   –  It is convention that any value y not explicitly assigned a positive probability is understood to be zero: p(y) = 0

Example

•  Basically, the probability distribution of the random variable Y tells us how the total probability of 1 is “distributed” among all the possible values that Y can have

•  Example: Let the random variable Y be the sum of a pair of dice:

      y      |  2     3     4     5     6     7     8     9     10    11    12
      P(Y=y) | 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

•  The table is the probability distribution of Y!
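The table above can be checked by brute-force enumeration of the 36 equally likely outcomes. A minimal sketch (in Python rather than the R used later in these slides; variable names are illustrative):

```python
from fractions import Fraction
from collections import Counter

# Tally the sum over all 36 equally likely (die1, die2) outcomes
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

# p(y) = (number of sample points with sum y) / 36
pmf = {y: Fraction(c, 36) for y, c in sorted(counts.items())}

print(pmf[7])             # 1/6, i.e., 6/36
print(sum(pmf.values()))  # 1
```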

Textbook Example 3.1

An XO has six junior officers in the wardroom, three men and three women. He has to choose two for a special project. So as not to play favorites, he decides to select the two at random. Let Y denote the number of women selected. Find the probability distribution for Y.

Solution:

Textbook Example 3.1 Solution (cont’d)


Example 3.1 Results Summary

•  As shown below, the results can be summarized in either a table or a probability histogram

•  Can also represent the result as a formula:

   p(y) = C(3, y) C(3, 2−y) / C(6, 2),   y = 0, 1, 2

•  Note that the area “under the curve” in each bar is the probability of the discrete outcome

Source: Wackerly, D.D., W.M. Mendenhall III, and R.L. Scheaffer (2008). Mathematical Statistics with Applications, 7th edition, Thomson Brooks/Cole.

Some Common Terminology

•  For discrete random variables, the probability distribution is often called the probability mass function (or pmf)
   –  The notation for a pmf is P(Y = y)
   –  Also, the probability distribution for continuous random variables (that we’ll get to in the next chapter) is often called the probability density function (or pdf)
•  The expression P(Y ≤ y) is often referred to as the cumulative distribution function (or cdf)

Graphically Depicting the PMF and CDF

[Figure: side-by-side plots for y = 0, 1, 2, 3 — the probability mass function P(Y = y) on the left and the cumulative distribution function P(Y ≤ y) on the right, both on a 0.0–1.0 vertical scale]

Probability Axioms Redux

•  Theorem 3.1: For any discrete probability distribution, the following must be true:
   1.  p(y) ≥ 0 for all y
   2.  Σ_y p(y) = 1, where the summation is over all values of y with nonzero probability
   –  Note these are just the Probability Axioms from Chapter 2 applied to random variables

Section 3.2 Homework

•  Do problems 3.1, 3.2, 3.6, 3.9


Section 3.3:Expected Value

•  Definition 3.4: Let Y be a discrete r.v. with probability function p(y). Then the expected value of Y, E(Y), is defined to be

   E(Y) = Σ_y y p(y)

•  To be precise about the definition, the expected value only exists if the above sum is absolutely convergent, meaning

   Σ_y |y| p(y) < ∞

Expected Value, continued

•  We often use the Greek letter µ (pronounced “mu”) as shorthand to denote the expected value, so E(Y) ≡ µ
•  Example: Consider a r.v. Y with the following probability distribution:

      y    |  0    1    2
      p(y) | 1/4  1/2  1/4

•  Then

   E(Y) = Σ_y y p(y) = 0 × 1/4 + 1 × 1/2 + 2 × 1/4 = 1

Example

•  Now, remember that we defined probability in terms of relative frequency
   –  So, if the experiment in the example is repeated 4 million times, we would expect approximately
      •  1 million 0s (i.e., 4 million × 1/4 = 1 million)
      •  2 million 1s
      •  1 million 2s
   –  With this data, the average is

   µ ≈ (Σ_{i=1}^n y_i) / n
     = (1,000,000 × 0 + 2,000,000 × 1 + 1,000,000 × 2) / 4,000,000
     = 0 × 1/4 + 1 × 1/2 + 2 × 1/4
     = Σ_{y=0}^{2} y p(y) = 1

Application Example

•  You’re going to independently shoot two missiles at two targets, one at each
   –  Under a given set of conditions, the missile has a probability of kill of p_k = 0.7
   –  If Y is the number of kills, then p(y) is

      y | p(y)
      0 | 0.3 × 0.3 = 0.09
      1 | 0.3 × 0.7 + 0.7 × 0.3 = 0.42
      2 | 0.7 × 0.7 = 0.49

   –  And, E(Y), the expected number of kills, is

      E(Y) = Σ_{y=0}^{2} y p(y) = 0 × 0.09 + 1 × 0.42 + 2 × 0.49 = 1.4
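This expected-kills calculation can be reproduced directly from the table (a Python sketch standing in for the R used later in the course; names are illustrative):

```python
pk = 0.7  # probability of kill for each independent shot (from the slide)

# p(y) for Y = number of kills out of two independent shots
p = {0: 0.3 * 0.3, 1: 0.3 * 0.7 + 0.7 * 0.3, 2: 0.7 * 0.7}

# E(Y) = sum over y of y * p(y); for two independent shots this is np = 1.4
expected_kills = sum(y * py for y, py in p.items())
```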

Expected Value of a Function of a R.V.

•  Theorem 3.2: Let Y be a discrete r.v. with probability function p(y) and let g(Y) be a real-valued function of Y. Then the expected value of g(Y) is

   E[g(Y)] = Σ_{all y} g(y) × p(y)

•  Note that this is much like the calculation in Definition 3.4 except we’re doing it on g(y) instead of just y

Proof of Theorem 3.2

Variance & Standard Deviation of a R.V.

•  Definition 3.5: For a r.v. Y with mean E(Y) ≡ µ, the variance of Y, V(Y), is defined as the expected value of (Y − µ)². That is,

   V(Y) = E[(Y − µ)²]

•  The variance is literally the average squared deviation of Y from its mean
   –  It’s a measure of how variable a random variable is
   –  The bigger the variance, the more “spread out” (around the mean) the values are that the r.v. can take on

The Standard Deviation of a R.V.

•  The standard deviation of Y is the positive square root of V(Y)
•  Conventional notation often denotes the variance as V(Y) ≡ σ²
   –  Hence, the standard deviation is often denoted by the Greek letter sigma: σ
•  Note that the variance is in the mean’s units squared
   –  In comparison, the standard deviation is in the same units as the mean
   –  This often makes the standard deviation easier to understand and interpret

Textbook Example 3.2

•  The probability distribution for a r.v. Y is given below in Table 3.3. Find the mean, variance, and standard deviation of Y.

•  Solution:


Textbook Example 3.2 Solution, cont’d


Interpreting Example 3.2 Results


The Expected Value of a Constant is a Constant

•  Theorem 3.3: Let Y be a discrete r.v. with probability function p(y) and let c be a constant. Then E(c) = c.
•  Proof:

Another Expected Value Theorem

•  Theorem 3.4: Let Y be a discrete r.v. with probability function p(y), let g(Y) be a function of Y, and let c be a constant. Then

   E[c g(Y)] = c E[g(Y)]

•  Proof:

And Another Expected Value Theorem

•  Theorem 3.5: Let Y be a discrete r.v. with probability function p(y), and let g1(Y), g2(Y), …, gk(Y) be k functions of Y. Then

   E[g1(Y) + g2(Y) + ··· + gk(Y)] = E[g1(Y)] + E[g2(Y)] + ··· + E[gk(Y)]

•  Proof:

A Theorem Simplifying Variance Calcs

•  Theorem 3.6: Let Y be a discrete r.v. with probability function p(y) and mean µ. Then

   V(Y) ≡ σ² = E[(Y − µ)²] = E(Y²) − µ²

•  Proof:
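The shortcut can be sanity-checked numerically on the pmf from the earlier expected-value example (p(0) = 1/4, p(1) = 1/2, p(2) = 1/4). A Python sketch, with illustrative names:

```python
# pmf from the expected-value example: p(0) = 1/4, p(1) = 1/2, p(2) = 1/4
p = {0: 0.25, 1: 0.50, 2: 0.25}

mu = sum(y * py for y, py in p.items())       # E(Y) = 1
ey2 = sum(y**2 * py for y, py in p.items())   # E(Y^2) = 1.5

var_by_definition = sum((y - mu)**2 * py for y, py in p.items())  # E[(Y - mu)^2]
var_by_shortcut = ey2 - mu**2                                     # E(Y^2) - mu^2

print(mu, var_by_definition, var_by_shortcut)  # 1.0 0.5 0.5
```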

Proof of Theorem 3.6 Continued


Textbook Example 3.3

•  Use Theorem 3.6 to find the variance of the r.v. Y in Example 3.2.
   –  Remember we found that µ = 1.75 and the probability function is given in Table 3.3
•  Solution:

Textbook Example 3.4

•  As the manager of a repair depot, you have the choice of buying one of two machines: either type A or type B
•  Let t denote the number of hours of daily operation
•  The number of daily repairs is a r.v.
   –  Y1 is the number of daily repairs to machine A
   –  Y2 is the number of daily repairs to machine B

   where E(Y1) = V(Y1) = 0.10t and E(Y2) = V(Y2) = 0.12t

Textbook Example 3.4 (continued)

•  The daily cost of operating A is

   C_A(t) = 10t + 30 Y1²

•  And the daily cost of operating B is

   C_B(t) = 8t + 30 Y2²

•  Given the assumptions in the text, which machine minimizes the expected costs for a
   –  10 hour workday?
   –  20 hour workday?

Textbook Example 3.4 Solution


Textbook Example 3.4 Solution Cont’d


Section 3.3 Homework

•  Do problems 3.12, 3.14, 3.15, 3.23, 3.33
   –  Problem 3.33 hint: Define a new r.v. Z = aY + b. Then

      V(Z) = E([Z − E(Z)]²) = E([(aY + b) − E(aY + b)]²)

Special Probability Distributions

•  Up to this point, we have calculated various probability distributions that are unique to a particular problem or scenario

•  However, there are some probability distributions that apply in many situations

•  We are going to learn about a whole bunch of these now
   –  They will come up over and over again in this curriculum and in your future life as an ORSA
   –  It behooves you to learn these and commit the details to memory

Section 3.4:Binomial Probability Distribution

•  The binomial distribution applies when there is:
   –  A sequence of independent and identical trials
   –  Each trial can result in only one of two outcomes
•  Examples:
   –  A sequence of helmet tests where each shot is either a penetration or not
   –  A set of artillery shots where each shot is either a kill or not
   –  A medical treatment where each person is either cured or not
   –  A retention bonus experiment where each sailor either reenlists or not

Binomial Experiments

•  Definition 3.6: A binomial experiment has the following properties:
   1.  The experiment consists of a fixed number, n, of identical trials
   2.  Each trial results in one of two outcomes: “success,” S, or “failure,” F
   3.  The probability of success on any trial is p and remains the same from trial to trial
       •  The probability of failure is q = 1 − p
   4.  The trials are independent
   5.  The random variable of interest is Y, the number of successes out of the n trials

Textbook Example 3.5

•  An aircraft early warning detection system consists of 4 units that operate independently. Suppose each has probability 0.95 of detecting an intruder aircraft. For an intruder, the random variable Y is the number of radar units that do not detect the intruder plane. Is this a binomial experiment?

•  Solution:


Textbook Example 3.5 Solution Continued


Textbook Example 3.6

•  Suppose that 40% of a large population of registered voters favor candidate Jones. A random sample of n = 10 voters will be selected and Y, the number favoring Jones, is to be observed. Is this a binomial experiment?

•  Solution:


Textbook Example 3.6 Solution Continued


Deriving the Formula for the Binomial Probability Distribution

•  A sequence of n trials is conducted that can result in a success (“S”) or a failure (“F”)

•  One such sequence might look like this:

   F S F S S F ··· S S F S    (y S’s and n − y F’s)

•  Now each S occurred with probability p and each F with probability 1 − p
   –  So, the probability the above sequence occurred is

   (1 − p) × p × (1 − p) × p × p × (1 − p) × ··· × p × p × (1 − p) × p = p^y (1 − p)^(n−y)

Deriving the Formula for the Binomial Probability Distribution (cont’d)

•  But now note that the observed sequence is just one way that y successes and n − y failures could have occurred
•  In fact, there are C(n, y) ways we could observe y successes in n trials
•  So, the probability of observing y successes in n trials (given the probability of success on each trial is a constant p) is

   p(y) = C(n, y) p^y (1 − p)^(n−y)

The Binomial Distribution

•  Definition 3.7: A random variable Y is said to have a binomial distribution based on n trials with success probability p if and only if

   p(y) = C(n, y) p^y q^(n−y),   y = 0, 1, 2, …, n and 0 ≤ p ≤ 1

•  Note, for example, that if the experiment is all failures then

   p(0) = C(n, 0) p^0 q^n = q^n

Proof the Binomial Distribution Sums to 1

•  Remember that for a valid probability distribution

   Σ_y p(y) = 1

•  For the binomial, we have

   Σ_y p(y) = Σ_{y=0}^{n} C(n, y) p^y q^(n−y) = (p + q)^n = 1^n = 1

•  Small aside: A binomial r.v. with n = 1 is also called a Bernoulli random variable
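The pmf, the fact that it sums to 1, and the mean and variance formulas of Theorem 3.7 (proved later in this section) can all be checked numerically. A Python sketch using the Example 3.5 setting (n = 4 radar units, each failing to detect with probability 0.05); names are illustrative:

```python
from math import comb

def binom_pmf(y, n, p):
    """Binomial probability p(y) = C(n, y) p^y q^(n-y)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Example 3.5 setting: n = 4 radar units; each independently FAILS to
# detect an intruder with probability p = 1 - 0.95 = 0.05
n, p = 4, 0.05
pmf = [binom_pmf(y, n, p) for y in range(n + 1)]

total = sum(pmf)                                         # should be 1
mean = sum(y * pmf[y] for y in range(n + 1))             # should be np = 0.2
var = sum((y - mean)**2 * pmf[y] for y in range(n + 1))  # should be npq = 0.19
```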

Example Binomial Distributions


Interactive Binomial Distribution Demo


From R (with the shiny package installed), run: shiny::runGitHub('DiscreteDistnDemos','rdfricker')

Reading Table 1 in Appendix 3

Textbook Example 3.7

•  Suppose that a lot of 5,000 electrical fuses contain 5% defectives. If a sample of 5 fuses is tested, find the probability of observing at least one defective.

•  Solution:

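A sketch of the computation, treating the 5 sampled fuses as (approximately) independent trials with p = 0.05, so P(at least one defective) = 1 − q⁵. Python standing in for the R call pbinom(); names are illustrative:

```python
n, p = 5, 0.05   # sample of n = 5 fuses; each defective with probability 0.05

# P(no defectives) = p(0) = q^n, so
# P(at least one defective) = 1 - q^n
p_none = (1 - p)**n
p_at_least_one = 1 - p_none

print(round(p_at_least_one, 4))  # 0.2262
```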

Solution in R

•  In R, the binomial functions are dbinom() and pbinom()
   –  P(Y = y0): dbinom(y0, n, p)
   –  P(Y ≤ y0): pbinom(y0, n, p)
•  So,

The R Help Page


Textbook Example 3.8

•  Data show that 30% of people recover from a particular illness without treatment
   –  Ten ill people are selected at random to receive a new treatment and 9 recover after the treatment
   –  Assuming the medication is worthless (i.e., has no effect on the illness), what is the probability that prior to treatment we would have expected to observe that at least 9 of the 10 treated people recover?
•  Solution:

Textbook Example 3.8 Solution Continued


Checking the Solution in R


Textbook Example 3.9

•  The lot of electrical fuses in Example 3.7 is supposed to contain only 5% defectives. If n = 20 fuses are sampled, what’s the probability of observing at least 4 defectives?

•  Solution:

•  Note:

Mean and Variance of a Binomial R.V.

•  Theorem 3.7: Let Y be a binomial random variable based on n trials and success probability p. Then

   µ ≡ E(Y) = np   and   σ² ≡ V(Y) = npq

•  Proof that E(Y) = np:

Proof Continued

•  Proof that V(Y) = npq:

Skip Textbook Example 3.10

•  This is a peek ahead into OA3102 and chapter 9 of the textbook
   –  It’s a statistics problem: Given some data, how do we estimate the probability?
   –  We’ll save these and similar problems until OA3102

Section 3.4 Homework

•  Do problems 3.37, 3.40, 3.41, 3.44, 3.46, 3.48


Section 3.5:Geometric Probability Distribution

•  Imagine (like with the binomial) you have a situation with a sequence of independent trials, where there are only two outcomes (success and failure), and the trials have probability of success p
•  Now, if we let Y be the number of trials until the first success (where we won’t fix the number of trials in advance), then Y has a geometric distribution
   –  Example: The number of pings with a radar or sonar until first detection of a (stationary) object has a geometric distribution

Deriving the Geometric Distribution

•  Since Y is the number of trials up to and including the first success, we have

   p(y) = P(FF···F S) = (1 − p)^(y−1) p,   with y − 1 F’s before the first S

•  Definition 3.8: A random variable Y is said to have a geometric distribution with success probability p if and only if

   p(y) = q^(y−1) p,   y = 1, 2, 3, …,  0 ≤ p ≤ 1
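A sketch of the pmf applied to the setting of Example 3.11 below (engine failure with p = 0.02 in any one-hour period): the engine runs more than two hours with probability 1 − p(1) − p(2) = q². Python, with illustrative names:

```python
def geom_pmf(y, p):
    """Geometric probability p(y) = q^(y-1) p for y = 1, 2, 3, ..."""
    return (1 - p)**(y - 1) * p

p = 0.02  # probability the engine fails in any one-hour period

# P(engine runs more than two hours) = 1 - p(1) - p(2), i.e., q^2
p_more_than_two = 1 - geom_pmf(1, p) - geom_pmf(2, p)

print(round(p_more_than_two, 4))  # 0.9604
```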

Example of a Geometric Distribution


Interactive Geometric Distribution Demo


From R (with the shiny package installed), run: shiny::runGitHub('DiscreteDistnDemos','rdfricker')

Proof that the Geometric Distribution is a Proper Probability Distribution

1.  Via inspection of the histogram or the definition of p(y), it should be obvious that p(y) ≥ 0 for all y
2.  Proof that Σ_y p(y) = 1:

   Σ_y p(y) = Σ_y q^(y−1) p = 1

Textbook Example 3.11

•  Assume the probability an engine fails during any one-hour period is constant at p = 0.02. Find the probability that the engine continues operating for more than two hours.

•  Solution:


Solution in R

•  Similar to the binomial functions, for the geometric R has dgeom() and pgeom()
   –  P(Y = y0): dgeom(y0-1, p)
   –  P(Y ≤ y0): pgeom(y0-1, p)
•  Why y0-1 in the R expressions?
   –  R defines the geometric as the number of failures before the first success
•  So,

The R Help Page


Mean and Variance of a Geometric R.V.

•  Theorem 3.8: Let Y be a geometric random variable with success probability p. Then

   µ = E(Y) = 1/p   and   σ² = V(Y) = (1 − p)/p²

•  Proof that E(Y) = 1/p:

Proof of E(Y) Continued

Proof of V(Y) Continued

•  Proof that V(Y) = (1 − p)/p²:

Textbook Example 3.12

•  As in Example 3.11, assume the probability an engine fails during any one-hour period is constant at p = 0.02. Let Y denote the number of one-hour intervals until the failure. Find the mean and standard deviation of Y.

•  Solution:


Application Example (Skip Example 3.13)

•  Under a given set of conditions, the MK-48 torpedo has a probability of hitting a particular surface target of p = 1/3. What is the average number of torpedoes you will need to shoot until you first hit the target?

•  Solution:


Section 3.5 Homework

•  Do problems 3.67, 3.68, 3.73, 3.74


Section 3.6:Negative Binomial Probability Distribution

•  Assume you have a situation with only two outcomes (success and failure) and constant probability of success p
•  If Y is the number of trials until the rth success (r > 1), then Y has a negative binomial dist’n
   –  Example: Elimination of a particular hardened target will require two direct hits by a Tomahawk cruise missile.
      •  The probability of one direct hit is p = 0.3.
      •  Shooting a sequence of missiles, what is the probability that the fifth missile eliminates the target?

Deriving the Negative Binomial Distribution

•  Since Y is the number of trials until the rth success:
   –  The last observation is a success and
   –  The prior y − 1 trials contain exactly r − 1 successes
•  So:
   •  The probability the last observation is a success is p and
   •  The probability the prior y − 1 trials contain exactly r − 1 successes is

   C(y−1, r−1) p^(r−1) (1 − p)^((y−1)−(r−1)) = C(y−1, r−1) p^(r−1) (1 − p)^(y−r)

Deriving the Negative Binomial Distribution

•  Multiplying these two, since they are independent, gives the result
•  Definition 3.9: A random variable Y is said to have a negative binomial distribution if and only if

   p(y) = C(y−1, r−1) p^r q^(y−r),   y = r, r + 1, r + 2, …,  0 ≤ p ≤ 1
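Applied to the Tomahawk example above (r = 2 hits needed, p = 0.3 per missile), the probability the fifth missile eliminates the target is C(4, 1)(0.3)²(0.7)³. A Python sketch with illustrative names:

```python
from math import comb

def nbinom_pmf(y, r, p):
    """Negative binomial p(y) = C(y-1, r-1) p^r q^(y-r), y = r, r+1, ..."""
    return comb(y - 1, r - 1) * p**r * (1 - p)**(y - r)

# r = 2 direct hits required, p = 0.3 per missile:
# probability the fifth missile shot eliminates the target
p5 = nbinom_pmf(5, r=2, p=0.3)

print(round(p5, 5))  # 0.12348
```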

Interactive Negative Binomial Demo


From R (with the shiny package installed), run: shiny::runGitHub('DiscreteDistnDemos','rdfricker')

Textbook Example 3.14

•  A geological study indicates that wells drilled in a particular region should strike oil with probability p = 0.2. Find the probability that the third oil strike comes on the fifth well drilled.

•  Solution:


Using R to Calculate Probabilities

•  Following the naming convention in R, the functions for the negative binomial are dnbinom() and pnbinom()
   –  P(Y = y0): dnbinom(y0-r, r, p)
   –  P(Y ≤ y0): pnbinom(y0-r, r, p)
•  Why y0-r in the R expressions?
   –  The negative binomial R function’s first argument is the number of failures before the rth success
•  So, back to the missile example:

   P(Y ≤ 5) =

The R Help Page


Application Example

•  As the DCA on your ship, from experience you know the probability a randomly selected P-250 pump is operational is p = 0.5.
   –  What’s the probability that you’ll get 3 operational pumps in the first 3 tries?

      P(Y = 3) = C(2, 2) (0.5)³ (0.5)⁰ = (0.5)³ = 0.125

   –  What’s the probability that you have to try 6 pumps in order to get 3 operational pumps?

Mean and Variance of a Negative Binomial Random Variable

•  Theorem 3.9: If Y is a r.v. with a negative binomial distribution, then the mean is E(Y) = r/p and the variance is

   V(Y) = r(1 − p)/p²

•  We’ll skip the proof…

Textbook Example 3.15

•  A repair depot has a large inventory of retrograde biometrics systems from Afghanistan, 20% of which need repair, and for which the depot only has three repair kits
   –  A technician is sent to repair as many systems as possible. To do so, she takes a system and tests it. If it works, she sets it aside. If not, she repairs it using one of the kits
   –  Suppose it takes 10 minutes to test a system and, if it is broken, another 20 minutes to repair
•  Find the mean and variance of the total time it takes the technician to use the 3 repair kits

Textbook Example 3.15

•  Solution:


Section 3.6 Homework

•  Do problems 3.90, 3.91, 3.92, 3.93, 3.95


Section 3.7:Hypergeometric Probability Distribution

•  To apply the binomial distribution to some finite population of size N, the sample size n must be much smaller than N
   –  See Example 3.6 about voting, where we were approximating the use of the binomial
   –  The problem is that if n is large compared to N, then the probability of selection is not constant
•  Example: We are going to sample without replacement n = 2 out of a population of N = 3
   –  First draw: P(selecting an item) = 1/3
   –  Second draw: P(selecting an item) = 1/2

Hypergeometric Distribution Set-up

•  What to do when n is large compared to N?
   –  Answer: We derive a new probability distribution
•  Here’s the set-up:
   –  In the population, there are only two possible types of elements
      •  The textbook labels them “red” and “black”
      •  Think of it like you have a bowl full of red and black marbles (N) and you close your eyes and grab a handful of them (n)
   –  The outcome of interest, Y, is the number of red elements in the sample
      •  I.e., the number of red marbles in your hand

Hypergeometric Distribution Derivation

•  There are C(N, n) ways to choose a sample of size n out of a population of size N
•  So, the probability of choosing any one particular sample (Ei) is

   P(Ei) = 1 / C(N, n)   for all Ei ∈ S

•  Now, if there are r red elements in the population, then the number of ways of picking y red elements out of the r is C(r, y)

Hypergeometric Distribution Derivation

•  If there are r red elements then there must be N − r black elements
   –  And, if you selected y red then there must be n − y black elements in the sample
   –  Thus, there are C(N−r, n−y) ways to select the black elements
•  Using the mn rule, it thus follows that there are C(r, y) C(N−r, n−y) ways to choose y red and n − y black elements from a population of size N containing r red elements

The Hypergeometric Distribution

•  Definition 3.10: A random variable Y has a hypergeometric distribution if and only if

   P(Y = y) ≡ p(y) = C(r, y) C(N−r, n−y) / C(N, n)

   where y is an integer 0, 1, 2, …, n if r ≥ n, or 0, 1, 2, …, r if r < n, subject to the restrictions y ≤ r and n − y ≤ N − r
   –  That is, the restrictions are n ≤ N, r ≤ N, and y ≤ min(n, r)
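A sketch of the pmf applied to Example 3.16 below (N = 20 Soldiers, n = 10 selected at random, r = 5 best qualified); Python, with illustrative names:

```python
from math import comb

def hyper_pmf(y, r, N, n):
    """Hypergeometric p(y) = C(r, y) C(N-r, n-y) / C(N, n)."""
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

# Example 3.16: select n = 10 of N = 20 Soldiers at random; Y counts how
# many of the r = 5 best-qualified Soldiers are chosen
p_all_five = hyper_pmf(5, r=5, N=20, n=10)

print(round(p_all_five, 5))  # 0.01625
```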

Interactive Hypergeometric Demo


From R (with the shiny package installed), run: shiny::runGitHub('DiscreteDistnDemos','rdfricker')

Textbook Example 3.16

•  Out of a group of 20 Soldiers, 10 are randomly selected for a special mission. What is the probability that the 5 best qualified Soldiers out of the group of 20 have been selected?

•  Solution:


Check the Solution in R

•  The functions for the hypergeometric are dhyper() and phyper()
   –  P(Y = y0): dhyper(y0, r, N-r, n)
   –  P(Y ≤ y0): phyper(y0, r, N-r, n)
•  So, back to the example:
•  And, from first principles:

The R Help Page


Mean and Variance of a Hypergeometric Random Variable

•  Theorem 3.10: If Y is a r.v. with a hypergeometric distribution, then the mean is

   E(Y) = nr/N

   and the variance is

   V(Y) = n (r/N) ((N − r)/N) ((N − n)/(N − 1))

•  We’ll skip the proof…

Textbook Example 3.17

•  Advanced combat helmets are shipped in lots of size 20. Five of the 20 helmets are ballistically tested (and destroyed) to check for quality. If more than one helmet is found defective, the lot is rejected.
•  If the lot of 20 actually contains 4 defectives, what is:
   –  The probability the lot will be rejected?
   –  The expected number of defectives in the sample of 5 helmets?
   –  The variance of the number of defectives in the sample of 5 helmets?

Textbook Example 3.17 Solution


Textbook Example 3.17 Solution (cont’d)


Section 3.7 Homework

•  Do problems 3.102, 3.113, 3.114, 3.115


Section 3.8:Poisson Probability Distribution

•  The Poisson distribution is often a good model for the number of rare events Y that occur in space or time
   –  “Rare” means that the probability that two events happen in the same (very small) increment of space or time is zero (or essentially zero)
•  Useful for modeling the number of events that occur per unit space or time
   –  Number of accidents/incidents, equipment failures, arrivals, defects, etc.
•  Can also be used to approximate the binomial

Named After Siméon Poisson

•  Studied at the École Polytechnique under Laplace and Lagrange
•  Published this distribution, together with his probability theory, in 1837 in his work Recherches sur la Probabilité des Jugements en Matière Criminelle et en Matière Civile (“Research on the Probability of Judgments in Criminal and Civil Matters”)

Poisson is a Limiting Form of the Binomial

•  For λ = np:

   lim_{n→∞} C(n, y) p^y (1 − p)^(n−y)
      = lim_{n→∞} [n(n−1)···(n−y+1) / y!] (λ/n)^y (1 − λ/n)^(n−y)
      = lim_{n→∞} (λ^y / y!) (1 − λ/n)^n (1 − λ/n)^(−y) [n(n−1)···(n−y+1) / n^y]
      = (λ^y / y!) lim_{n→∞} (1 − λ/n)^n (1 − λ/n)^(−y) (1 − 1/n)(1 − 2/n) ··· (1 − (y−1)/n)

•  Now, note that

   lim_{n→∞} (1 − λ/n)^n = e^(−λ)

   and all other terms to the right of it have a limit of 1
•  Thus,

   lim_{n→∞} C(n, y) p^y (1 − p)^(n−y) = (λ^y / y!) e^(−λ)

The Poisson Distribution

•  Definition 3.11: A random variable Y has a Poisson distribution if and only if

   P(Y = y) = (λ^y / y!) e^(−λ),   y = 0, 1, 2, …,  λ > 0

•  It’s related to the binomial, as we just saw
   –  But in the binomial n is finite and fixed
   –  In the Poisson there is no n per se
      •  Instead, we’re looking at infinitesimal regions or time slices where at most one event can occur
•  The parameter λ (Greek letter lambda) is the average number of events per unit
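A sketch of the pmf, applied to the setting of Example 3.19 later in this section (an average of λ = 1 visit per half-hour period, so the probability of no visit is e^(−1)); Python, with illustrative names:

```python
from math import exp, factorial

def pois_pmf(y, lam):
    """Poisson probability P(Y = y) = lam^y e^(-lam) / y!"""
    return lam**y * exp(-lam) / factorial(y)

lam = 1  # average of one visit per half-hour period

p_miss = pois_pmf(0, lam)         # P(Y = 0) = e^(-1)
p_at_least_one = 1 - p_miss

print(round(p_miss, 4))  # 0.3679
```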

What’s the Difference Between Binomial and Poisson R.V.s?

•  Binomial:
   –  Event is the number of “successes” in a sequence of n independent trials
   –  Example: number of operational units out of n sampled
   –  E(Y) = np > np(1 − p) = V(Y)
•  Poisson:
   –  Events occur on a per unit basis (per unit time, per unit area, per unit volume, etc.)
   –  Example: number of defectives per unit
   –  And, as we’ll see, E(Y) = V(Y) = λ

Interactive Poisson Distribution Demo


From R (with the shiny package installed), run: shiny::runGitHub('DiscreteDistnDemos','rdfricker')

Textbook Example 3.18

•  Show that the Poisson distribution satisfies the requirements for a probability distribution:
   –  p(y) ≥ 0 for all y
   –  Σ_y p(y) = 1
•  Solution:

Textbook Example 3.18 Solution (cont’d)


Textbook Example 3.19

•  Suppose a UAV patrol pattern is devised so the UAV may randomly visit a given location Y = 0, 1, 2, 3, … times per half hour.
   –  Each location is visited an average of once per time period.
   –  Calculate the probability that the UAV will miss the location during a half-hour period.
   –  What is the probability the location will be visited exactly once? Twice? At least once?
•  Solution:

Textbook Example 3.19 Solution (cont’d)


Check the Solution in R

•  The functions for the Poisson are dpois() and ppois()
   –  P(Y = y0): dpois(y0, lambda)
   –  P(Y ≤ y0): ppois(y0, lambda)
•  So, back to the example:
•  And, from first principles:

Table 3 of Appendix 3 (page 843)

See the Book for the Rest of Table 3

Poisson Processes

•  A Poisson process is a stochastic process that counts the number of events per unit of space or time

•  In a Poisson process with a mean of λ occurrences per unit, the number of occurrences Y in a units has a Poisson distribution with mean aλ
–  A key assumption with Poisson processes is that the numbers of occurrences in disjoint intervals are independent

Textbook Example 3.20

•  A type of tree has seedlings dispersed in a large area with a mean density of five seedlings per square yard. What is the probability that none of ten randomly selected one-square-yard regions has seedlings?

•  Solution:


Textbook Example 3.20 Solution (cont’d)

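The solution slide is blank for in-class work; a sketch of the computation in Python. Since occurrences in disjoint regions are independent, ten square yards form a Poisson process with mean 10 × 5 = 50, and "no seedlings anywhere" is P(Y = 0):

```python
from math import exp

lam_per_yd = 5.0                       # mean seedlings per square yard
p_none_one_yd = exp(-lam_per_yd)       # P(0 seedlings in one region) = e^-5
p_none_ten_yd = exp(-10 * lam_per_yd)  # ten regions: Poisson(50), P(Y = 0) = e^-50
# Equivalently, by independence of the ten regions: (e^-5)^10
check = p_none_one_yd ** 10
```

The answer e^-50 is astronomically small, about 2 × 10^-22.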

Textbook Example 3.21

•  Suppose Y has a binomial distribution with n = 20 and p = 0.1. Find the exact value of P(Y ≤ 3) using the binomial table. Then approximate this using the Poisson table and compare.

•  Solution:

Textbook Example 3.21 (cont’d)


•  From Table 1:

Textbook Example 3.21 (cont’d)


•  From Table 3:

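The table lookups shown on these slides can be reproduced numerically. A Python sketch computing both the exact binomial value and its Poisson approximation with λ = np = 2:

```python
from math import comb, exp, factorial

n, p = 20, 0.1
# Exact binomial: P(Y <= 3) = sum of the pmf for y = 0, 1, 2, 3
exact = sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(4))
lam = n * p  # Poisson approximation with lambda = np = 2
approx = sum(exp(-lam) * lam**y / factorial(y) for y in range(4))
# Matches the table values: 0.867 (binomial, Table 1) vs 0.857 (Poisson, Table 3)
```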

More on Using the Poisson to Approximate the Binomial

•  For large n and small p, it can be numerically unstable to do some binomial calculations
–  Say n = 1,000,000 and p = 0.00001 (so λ = np = 10):

P(Y = 3) = (1,000,000 choose 3)(.00001)^3 (.99999)^999997 = 0.00756648

•  But if it were n = 1×10^10 and p = 1×10^-9 (where we still have λ = 10), my version of Excel can't calculate it

Poisson Approximation

•  For λ = np = 10:

P(Y = 3) = e^(-10) 10^3 / 3! = 0.00756666

•  In R:

•  Very close to the correct answer and much easier to calculate

•  And if it were n = 1×10^10 and p = 1×10^-9, λ = np is still 10, so there's no problem with the calculation
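One standard way around the instability (a sketch, not from the slides): evaluate the binomial pmf in log space, where the huge binomial coefficient and the tiny power cancel before exponentiating. Comparing it against the Poisson approximation for the slide's numbers:

```python
from math import exp, factorial, lgamma, log, log1p

def binom_pmf(y, n, p):
    # Evaluate the binomial pmf in log space to avoid overflow/underflow:
    # log C(n, y) = lgamma(n+1) - lgamma(y+1) - lgamma(n-y+1)
    logpmf = (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
              + y * log(p) + (n - y) * log1p(-p))
    return exp(logpmf)

n, p, y = 10**6, 1e-5, 3
exact = binom_pmf(y, n, p)                   # about 0.00756648
lam = n * p                                  # = 10
approx = exp(-lam) * lam**y / factorial(y)   # about 0.00756666
```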

Mean and Variance of a Poisson Random Variable

•  Theorem 3.11: If Y is a r.v. with a Poisson distribution with parameter λ, then the mean and variance are

E(Y) = V(Y) = λ

•  Proof:

Proof of Theorem 3.11 Continued

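The proof slides are left blank for in-class derivation; a sketch of the standard argument, using the Poisson pmf p(y) = e^{-λ}λ^y/y!:

```latex
\begin{align*}
E(Y) &= \sum_{y=0}^{\infty} y \,\frac{e^{-\lambda}\lambda^{y}}{y!}
      = \lambda e^{-\lambda}\sum_{y=1}^{\infty} \frac{\lambda^{y-1}}{(y-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda \\
E[Y(Y-1)] &= \sum_{y=2}^{\infty} y(y-1)\,\frac{e^{-\lambda}\lambda^{y}}{y!}
      = \lambda^{2} e^{-\lambda}\sum_{y=2}^{\infty} \frac{\lambda^{y-2}}{(y-2)!}
      = \lambda^{2} \\
V(Y) &= E[Y(Y-1)] + E(Y) - [E(Y)]^{2}
      = \lambda^{2} + \lambda - \lambda^{2} = \lambda
\end{align*}
```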

Textbook Example 3.22

•  Industrial accidents occur according to a Poisson process with an average of 3 accidents/month.
–  During the last 2 months, 10 accidents have occurred.
–  Does this number seem highly improbable if the mean is still 3?
–  Is there evidence that perhaps there is an increase in the mean?

•  Solution:

Textbook Example 3.22 Solution (cont’d)

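The solution slide is blank for in-class work; a sketch of the computation in Python. Over 2 months the count is Poisson with mean 3 × 2 = 6, and the relevant quantity is the tail probability of seeing 10 or more accidents:

```python
from math import exp, factorial

lam = 3 * 2   # 3 accidents/month over 2 months -> Poisson mean 6
p_le_9 = sum(exp(-lam) * lam**y / factorial(y) for y in range(10))
p_ge_10 = 1 - p_le_9   # P(Y >= 10) if the mean really is still 3/month
# About 0.084: unusual, but not overwhelming evidence of an increased rate
```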


Application Example

•  Navy Ashore Class A Mishaps occur at the rate of 15 per 100,000 sailors per year (Source: Navy Safety Center, 1996-2000 statistics)
–  Hampton Roads, VA, has about 80,000 sailors assigned
–  What is the chance that Hampton Roads will have more than 10 Class A Ashore Mishaps in the next 6 months?

•  Solution:

Application Example Solution (cont’d)

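The solution slide is blank for in-class work; a sketch in Python, assuming the intended approach is the Poisson-process rate scaling from the previous slides (15 per 100,000 sailors per year, scaled to 80,000 sailors over half a year):

```python
from math import exp, factorial

# Rate scaling: 15 per 100,000 sailors/year, 80,000 sailors, 6 months
lam = 15 * (80_000 / 100_000) * 0.5   # = 6 expected mishaps
p_le_10 = sum(exp(-lam) * lam**y / factorial(y) for y in range(11))
p_more_than_10 = 1 - p_le_10          # P(Y > 10), about 0.043
```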


Section 3.8 Homework

•  Do problems 3.121, 3.122, 3.127, 3.128, 3.134


Section 3.11: Tchebysheff’s Theorem

•  The empirical rule says that approximately 95% of observations from a symmetric (or “bell-shaped”) probability distribution will be in the interval μ ± 2σ
–  It’s a useful rule of thumb for assessing whether an observation is “unusual”

•  Tchebysheff’s Theorem provides a conservative lower bound for the probability that an observation falls in the interval μ ± kσ
–  It applies to any probability distribution

Tchebysheff’s Theorem

•  Theorem 3.14: Let Y be a r.v. with mean μ and finite variance σ². Then, for any constant k > 0,

P(|Y − μ| < kσ) ≥ 1 − 1/k²   and   P(|Y − μ| ≥ kσ) ≤ 1/k²

•  Equivalently:

P(μ − kσ ≤ Y ≤ μ + kσ) ≥ 1 − 1/k²   and   P(Y ≤ μ − kσ or Y ≥ μ + kσ) ≤ 1/k²

Tchebysheff’s Theorem Explained

P(μ − kσ ≤ Y ≤ μ + kσ) ≥ 1 − 1/k²
–  Left side: the true probability that Y is within k standard deviations of the mean
–  Right side: a lower bound on that true probability

P(Y ≤ μ − kσ or Y ≥ μ + kσ) ≤ 1/k²
–  Left side: the true probability that Y is outside of k standard deviations from the mean
–  Right side: an upper bound on that true probability

Tchebysheff in a Picture

•  Consider Y with a binomial distribution with n = 20 & p = 0.5
–  Then μ = 10 & σ = √5
–  The probability histogram shows

P(|Y − 10| ≥ 2√5) = P(Y ≤ 5.5 or Y ≥ 14.5) = 0.04

–  Tchebysheff says

P(|Y − 10| ≥ 2√5) ≤ 1/2² = 0.25

–  Note that Tchebysheff is correct (0.04 ≤ 0.25), but quite conservative in this case
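The 0.04 on this slide comes from summing the binomial pmf outside μ ± 2σ; a quick numerical check in Python:

```python
from math import comb, sqrt

n, p = 20, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 10, sigma = sqrt(5) ~ 2.236
k = 2
# Exact probability that Y falls at least k standard deviations from the mean
tail = sum(comb(n, y) * p**y * (1 - p)**(n - y)
           for y in range(n + 1) if abs(y - mu) >= k * sigma)
bound = 1 / k**2   # Tchebysheff's upper bound on that probability
# tail ~ 0.04, well under the conservative bound of 0.25
```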

Textbook Example 3.28

•  The number of customers per day at a sales counter, Y, has mean 20 and standard deviation 2. The probability distribution is not known. What can be said about the probability that tomorrow 16 < Y < 24?

•  Solution:
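The solution slide is blank for in-class work; a sketch in Python. The interval (16, 24) is μ ± 2σ, so Tchebysheff's lower bound applies with k = 2:

```python
def tchebysheff_lower_bound(k):
    # Lower bound on P(|Y - mu| < k*sigma); valid for ANY distribution
    return 1 - 1 / k**2

mu, sigma = 20, 2
k = (24 - mu) / sigma   # (16, 24) is mu +/- 2*sigma, so k = 2
bound = tchebysheff_lower_bound(k)   # P(16 < Y < 24) >= 0.75
```

The same function answers the supply-depot example on the next slide (k = 3 gives an upper tail bound of 1/9).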

Example

•  On average, a supply depot gets 156 requisitions per day with a standard deviation of 22. What is an upper bound on the probability that the number of requisitions in one day will be more than 3 standard deviations away from the mean?

•  Solution:


Section 3.11 Homework

•  Do problems 3.167, 3.168, 3.173, 3.174

•  From Chpt. 1 (p. 10), the empirical rule says:
✓  μ ± σ contains approx. 68% of the probability
✓  μ ± 2σ contains approx. 95% of the probability
✓  μ ± 3σ contains almost all of the probability

Section 3.12: Summary

•  We’ve developed five discrete distributions that are widely applicable
–  Here’s a table with their means and variances:
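The table on this slide is an image from the book that did not survive extraction; here it is reconstructed from the standard results (same notation as the text: n trials, success probability p, r successes sought, population size N with r of one type):

```latex
\begin{tabular}{lll}
Distribution & $E(Y)$ & $V(Y)$ \\ \hline
Binomial & $np$ & $np(1-p)$ \\
Geometric & $\dfrac{1}{p}$ & $\dfrac{1-p}{p^{2}}$ \\
Negative binomial & $\dfrac{r}{p}$ & $\dfrac{r(1-p)}{p^{2}}$ \\
Hypergeometric & $\dfrac{nr}{N}$ &
  $n\dfrac{r}{N}\Bigl(\dfrac{N-r}{N}\Bigr)\Bigl(\dfrac{N-n}{N-1}\Bigr)$ \\
Poisson & $\lambda$ & $\lambda$
\end{tabular}
```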

Memorize These Five Distributions


Summary (continued)

•  In addition to Tables 1 and 3 for looking up probabilities associated with the binomial and Poisson distributions, you can use R:

–  You’ll need to be able to use the tables on exams, but outside of class I use R


What We Have Just Learned

•  Learned about discrete random variables
–  Probability distributions
–  Expected value, variance, and standard deviation

•  Defined specific distributions
–  Binomial
–  Geometric
–  Negative Binomial
–  Hypergeometric
–  Poisson

•  Defined and applied Tchebysheff’s Theorem