cs#109 lecture#9 april#15th,2016 - stanford...
TRANSCRIPT
CS 109Lecture 9
April 15th, 2016
Four Prototypical Trajectories
Review
Bernoulli:§ indicator of coin flip X ~ Ber(p)
Binomial: § # successes in n coin flips X ~ Bin(n, p)
Poisson: § # successes in n coin flips X ~ Poi(λ)
Geometric: § # coin flips until success X ~ Geo(p)
Negative Binomial: § # trials until r successes X ~ NegBin(r, p)
Hyper Geometric: § # white balls drawn without replacement from urn with N balls, m are white: X ~ HypG(n, N, m)
Discrete Distributions
Supreme Court case: Berghuis v. SmithIf a group is underrepresented in a jury pool, how do you tell?
§ Article by Erin Miller – Friday, January 22, 2010§ Thanks to (former CS109er) Josh Falk for this article
Justice Breyer [Stanford Alum] opened the questioning by invoking the binomial theorem. He hypothesized a scenario involving “an urn with a thousand balls, and sixty are blue, and nine hundred forty are purple, and then you select them at random… twelve at a time.” According to Justice Breyer and the binomial theorem, if the purple balls were under represented jurors then “you would expect… something like a third to a half of juries would have at least one minority person” on them.
Balls, Urns and the Supreme Court
• Should model this combinatorially (X ~ HypGeo)§ Ball draws not independent trials (balls not replaced)
• Exact solution:P(draw 12 purple balls) = ≈ 0.4739
P(draw ≥ 1 blue ball) = 1 – P(draw 12 purple) ≈ 0.5261
• Approximation using Binomial distribution§ Assume P(blue ball) constant for every draw = 60/1000§ X = # blue balls drawn. X ~ Bin(12, 60/1000 = 0.06)§ P(X ≥ 1) = 1 – P(X = 0) ≈ 1 – 0.4759 = 0.5240
In Breyer’s description, should actually expect just over half of juries to have at least one black person on them
⎟⎟⎠
⎞⎜⎜⎝
⎛⎟⎟⎠
⎞⎜⎜⎝
⎛
121000
12940
Justin Breyer Meets CS109
Demo
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
0.45
0.50
0 1 2 3 4 5 6 7 8 9 10 11 12
P(X
= x
)
# Underrepresented Jurrors
Underrepresented Juror PMF
• Determine N = how many of some species remain§ Randomly tag m of species (e.g., with white paint)§ Allow animals to mix randomly (assuming no breeding)§ Later, randomly observe another n of the species§ X = number of tagged animals in observed group of n§ X ~ HypG(n, N, m)
• “Maximum Likelihood” estimate§ Set N to be value that maximizes:
for the value i of X that you observed à = mn/i• Calculated by assuming: i = E[X] = nm/N
⎟⎟⎠
⎞⎜⎜⎝
⎛
⎟⎟⎠
⎞⎜⎜⎝
⎛
−
−⎟⎟⎠
⎞⎜⎜⎝
⎛
==
nN
inmN
im
iXP )(
Endangered Species
N̂
Four Prototypical Trajectories
End Review
• So far, all random variables we saw were discrete§ Have finite or countably infinite values (e.g., integers)§ Usually, values are binary or represent a count
• Now it’s time for continuous random variables§ Have (uncountably) infinite values (e.g., real numbers)§ Usually represent measurements (arbitrary precision)
o Height (centimeters), Weight (lbs.), Time (seconds), etc.
• Difference between how many and how much
• Generally, it means replace with ∑=
b
axxf )( ∫
b
a
dxxf )(
From Discrete to Continuous
Integrals
*loving, not scary
• X is a Continuous Random Variable if there is function f(x) ≥ 0 for -∞ ≤ x ≤ ∞, such that:
• f is a Probability Density Function (PDF) if:
∫=≤≤b
adxxfbXaP )()(
1)()( ==∞<<−∞ ∫∞
∞−dxxfXP
Continuous Random Variables
• Say f is a Probability Density Function (PDF)
§ f(x) is not a probability, it is probability/units of X§ Not meaningful without some subinterval over X
§ Contrast with Probability Mass Function (PMF) in discrete case:
where for X taking on values x1, x2, x3, ...
0)()( === ∫a
adxxfaXP
)()( aXPap ==
1)(1
=∑∞
=iixp
1)()( ==∞<<−∞ ∫∞
∞−dxxfXP
Probability Density Function
• For a continuous random variable X, the Cumulative Distribution Function (CDF) is:
• Density f is derivative of CDF F:
• For continuous f and small ε :
§ So, ratio of probabilities can still be meaningful:o P(X = 1)/P(X = 2) ≈ (ε f(1))/(ε f(2)) = f(1)/f(2)
∫+
−
≈=+≤≤−2/
2/
)()()( 22
ε
ε
εεεa
a
afdxxfaXaP
∫∞−
=≤=<=a
dxxfaXPaXPaF )( )()()(
)()( aFdadaf =
Cumulative Distribution Function
• X is continuous random variable (CRV) with PDF:
§ What is C?
§ What is P(X > 1)?
⎩⎨⎧ <<−
= otherwise 0
2 x 0n whe)24()(
2xxCxf
13
22 1)24(0
232
2
0
2 =⎟⎟⎠
⎞⎜⎜⎝
⎛−⇒=−∫xxCdxxxC
83 1
38 10
3168 =⇒=⇒=⎟⎟
⎠
⎞⎜⎜⎝
⎛−⎟⎠
⎞⎜⎝
⎛ − CCC
21
322
3168
83
322
83)24()(
1
232
2
1
2
183 =⎥
⎦
⎤⎢⎣
⎡⎟⎠
⎞⎜⎝
⎛ −−⎟⎠
⎞⎜⎝
⎛ −=⎟⎟⎠
⎞⎜⎜⎝
⎛−=−= ∫∫
∞ xxdxxxdxxf
Simple Example
• X = days of use before your disk crashes
§ First, determine λ to have actual PDFo Good integral to know:
§ What is P(50 < X < 150)?
§ What is P(X < 10)?
⎩⎨⎧ ≥
=−
otherwise0 0
)(100/ xe
xfxλ
uu edue =∫100
1100
1 10010010010
100/100/100/ =⇒=−=−==∞−−− ∫∫ − λλλλλ xxx edxedxe
383.0 )50()150( 2/12/3
50
150100/150
50
100/100
1 ≈+−=−==− −−−−∫ eeedxeFF xx
095.01 )10( 10/1
0
10100/10
0
100/100
1 ≈+−=−== −−−∫ eedxeF xx
Disk Crashes
For continuousRV X:For discrete RV X:
∑=x
xpxXE )( ][
222 ])[(][])[()(Var XEXEXEX −=−= µ
dxxfxXE ∫∞
∞−
= )( ][
dxxfxgXgE ∫∞
∞−
= )() ()]([∑=x
xpxgXgE )( )()]([
bXaEbaXE +=+ ][][
∑=x
nn xpxXE )( ][ dxxfxXE nn ∫∞
∞−
= )( ][
)(Var)(Var 2 XabaX =+
For both discrete and continuous RVs:
Expectation and Variance
• X is a continuous random variable with PDF:
§ What is E[X]?
§ What is Var(X)?
⎩⎨⎧ ≤≤
=otherwise0
10 2)( xxxf
32
322)(][
013
1
0
2 ==== ∫∫
∞
∞−
xdxxdxxfxXE
x
)(xf
21
212)(][
014
1
0
322 ==== ∫∫
∞
∞−
xdxxdxxfxXE
181
32
21])[(][)(
222 =⎟
⎠
⎞⎜⎝
⎛−=−= XEXEXVar
Linearly Increasing Density
• X is a Uniform Random Variable: X ~ Uni(α, β)§ Probability Density Function (PDF):
o Sometimes defined over range α < x < β
§ (for α ≤ a ≤ b ≤ β)
§
§
⎪⎩
⎪⎨⎧ ≤≤
= −
otherwise0
)(1
βααβ xxf
αβ −−
==≤≤ ∫abdxxfbxaP
b
a
)()(
2)(2)(2)(][
222
βααβαβ
αβ α
ββ
ααβ
+=
−
−=
−=== ∫∫ −
∞
∞−
xdxdxxfxXE x
12)()(2αβ −
=XVar
αβ −1
x
)(xf
Uniform Random Variable
• X ~ Uni(0, 20)
§ P(X < 6)?
§ P(4 < X < 17)?
206
201)6(
6
0
==< ∫ dxxP
2013
204
2017
201)174(
17
4
=−==<< ∫ dxxP
⎪⎩
⎪⎨⎧ ≤≤
=otherwise0
200 )( 201
xxf
Fun with the Uniform Distribution
Riding the Marguerite
• Say the Marguerite bus stops at the Gates bldg. at 15 minute intervals (2:00, 2:15, 2:30, etc.)§ Passenger arrives at stop uniformly between 2-2:30pm§ X ~ Uni(0, 30)
• P(Passenger waits < 5 minutes for bus)?§ Must arrive between 2:10-2:15pm or 2:25-2:30pm
• P(Passenger waits > 14 minutes for bus)?§ Must arrive between 2:00-2:01pm or 2:15-2:16pm
31
305
305)3025()1510(
30
25301
15
10301 =+=+=<<+<< ∫∫ dxdxxPXP
151
301
301)1615()10(
16
15301
1
0301 =+=+=<<+<< ∫∫ dxdxxPXP
Riding the Marguerite
• Biking to a class on campus§ Leave t minutes before class starts§ X = travel time (minutes). X has PDF: f(x)§ If early, incur cost: c/min. If late, incur cost: k/min.
§ Choose t (when to leave) to minimize E[C(X, t)]:
dxxftxkdxxfxtcdxxftXCtXCEt
t
)( )()( )()( ),()],([00
∫∫∫∞∞
−+−==
⎪⎩
⎪⎨⎧
≥−
<−=
txtXktxXtc
tXC if)(
if )(),( :Cost
When to Leave for Class
Minimization via Differentiation• Want to minimize w.r.t. t:
§ Differentiate E[C(X, t)] w.r.t. t, and set = 0 (to obtain t*):o Leibniz integral rule:
∫∫ ∂
∂+−=
)(
)(1
12
2)(
)(
2
1
2
1
),()),(()()),(()(),(tf
tf
tf
tf
dxttxgttfg
dttdfttfg
dttdfdxtxg
dtd
∫∫∞
−−−+−=t
t
dxxkftfttkdxxcftfttctXCEdtd )()()()()()()],([
0
dxxftxkdxxfxtctXCEt
t
)( )()( )()],([0
∫∫∞
−+−=
kcktFtFktcF+
=⇒−−= *)( *)](1[*)(0