ec-512ec512 lecnotes pt2
Post on 01-Jun-2018
8/9/2019 EC-512EC512 LecNotes Pt2
Hypothesis Testing
Hypothesis testing: use of statistics to determine the probability that a given hypothesis is true / false.
Hypothesis: some theory or claim that has been put forward because it is believed to be true, but has not been proved.
In the present context, a hypothesis is a conjecture about the distribution of some random variables – often a statement about the mean and variance of an r.v.
Hypothesis testing tests a null hypothesis $H_0$ against the alternative hypothesis $H_1$.
Decision Errors
The outcome of a hypothesis test is either "reject $H_0$" or "do not reject $H_0$".
When we perform a statistical test we hope that our decision will be correct, but sometimes it will be wrong.
There are two possible errors that can be made in a hypothesis test.

Truth     Decision: Accept H_0     Decision: Reject H_0
H_0       Correct                  Type I Error
H_1       Type II Error            Correct
Steps in Hypothesis Testing
Hypothesis testing is a proof by contradiction.
Step 1: Formulate the null and alternative hypotheses. Assume $H_0$ is true.
Step 2: Collect data.
Step 3: Evaluate whether the data are consistent with the statistical hypothesis – identify a test statistic that will be computed from the collected data and used to assess the truth of the null hypothesis.
Approaches: (1) Frequentist or classical.
(2) Bayesian.
(3) Likelihood.
Some Definitions
Critical region or rejection region – that region of the data-space, or the corresponding range of the test statistic, for which the null hypothesis will be rejected.
"Size" or significance level of the test – prob. of incorrectly rejecting $H_0$ = prob. of Type I error = $\alpha$.
When the value of the test statistic is found in the rejection region, the test is said to be "statistically significant" at the $\alpha$ level.
Prob. of Type II error = $\beta$.
"Power" of a test – prob. of correctly rejecting $H_0$ = $1 - \beta$.
The p-value is the min. significance level for which we would still reject the null hypothesis. The p-value is a measure of how much evidence there is against the null hypothesis.
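As a rough numerical illustration of these definitions, the sketch below computes $\alpha$, $\beta$ and the power of a one-sided test on a single Gaussian sample. The two means, the variance and the threshold are assumed values picked only for illustration; they do not come from the notes.

```python
from statistics import NormalDist

def error_rates(mu0, mu1, sigma, threshold):
    """One-sided test: reject H0 when the sample exceeds `threshold`."""
    # alpha: probability of landing in the rejection region when H0 is true
    alpha = 1.0 - NormalDist(mu0, sigma).cdf(threshold)
    # beta: probability of staying outside the rejection region when H1 is true
    beta = NormalDist(mu1, sigma).cdf(threshold)
    power = 1.0 - beta
    return alpha, beta, power

# Assumed example: H0 mean 0, H1 mean 3, unit variance, threshold 1.645
alpha, beta, power = error_rates(0.0, 3.0, 1.0, 1.645)
print(f"alpha={alpha:.4f} beta={beta:.4f} power={power:.4f}")
```

Moving the threshold to the right lowers $\alpha$ but raises $\beta$, which is the trade-off the definitions describe.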
Example (contd.)
Measure sample value = $x_s$.
Evaluate
$$z = \frac{x_s - \mu_0}{\sigma}$$
If $|z| > 2$, then we decide that the sample has not come from the zero-mean Gaussian process under consideration.
Check that the "size" or "significance level" or the probability of Type I error for the set criterion is $\alpha \approx 0.05$.
The p-value for the measured sample is
$$p = 2 \int_{\mu_0 + |x_s - \mu_0|}^{+\infty} f_0(x)\,dx$$
where $f_0(x) = N(\mu_0, \sigma^2)$.
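The check suggested in this example can be sketched numerically as follows. The function name and the sample value 2.5 are assumptions made for illustration, and the test is taken to be two-sided (reject when $|z| > 2$), matching the factor 2 in the p-value.

```python
from statistics import NormalDist

def two_sided_z_test(x_s, mu0=0.0, sigma=1.0, z_crit=2.0):
    """Return z, the two-sided p-value, the test size, and the decision."""
    std = NormalDist()  # standard normal N(0, 1)
    z = (x_s - mu0) / sigma
    # p-value: twice the upper-tail area beyond |z|
    p_value = 2.0 * (1.0 - std.cdf(abs(z)))
    # size of the test: probability that |z| > z_crit when H0 is true
    alpha = 2.0 * (1.0 - std.cdf(z_crit))
    return z, p_value, alpha, abs(z) > z_crit

# Assumed sample value, chosen only for illustration
z, p_value, alpha, reject = two_sided_z_test(2.5)
print(f"z={z:.2f} p={p_value:.4f} alpha={alpha:.4f} reject={reject}")
```

With the criterion $|z| > 2$ the size comes out near 0.0455, which is the "about 0.05" referred to in the example.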
Example (contd.)
Bayesian Hypothesis Testing
Given prior probabilities $P(H_0)$ and $P(H_1)$.
Given likelihoods $p(x_s | H_0)$ and $p(x_s | H_1)$, where $x_s$ is the collected data.
Calculate posterior odds ratio = Bayes factor × prior odds ratio:
$$\frac{P(H_0 | x_s)}{P(H_1 | x_s)} = \frac{p(x_s | H_0)}{p(x_s | H_1)} \times \frac{P(H_0)}{P(H_1)}$$
Bayesian Method (contd.)
Decide for $H_0$ if the posterior odds are greater than 1, else decide for $H_1$.
Basically based on the MAP criterion.
Prior odds are modified after observing the data.
Integrates prior probabilities associated with competing hypotheses into the assessment of which hypothesis is the most likely for the data in hand.
Saying it another way, likelihoods are modified with the prior probabilities.
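A minimal sketch of this decision rule is given below. The two Gaussian hypotheses, the observed value 0.2 and the priors are assumed for illustration only; `posterior_odds` and `decide` are hypothetical helper names.

```python
from statistics import NormalDist

def posterior_odds(x_s, prior_h0, pdf_h0, pdf_h1):
    """Posterior odds of H0 against H1 = Bayes factor x prior odds."""
    bayes_factor = pdf_h0(x_s) / pdf_h1(x_s)
    prior_odds = prior_h0 / (1.0 - prior_h0)
    return bayes_factor * prior_odds

def decide(odds):
    """Decide for H0 when the posterior odds exceed 1, else for H1."""
    return "H0" if odds > 1.0 else "H1"

# Assumed hypotheses: H0 ~ N(0, 1), H1 ~ N(2, 1)
h0 = NormalDist(0.0, 1.0)
h1 = NormalDist(2.0, 1.0)
for prior_h0 in (0.5, 0.1):
    odds = posterior_odds(0.2, prior_h0, h0.pdf, h1.pdf)
    print(prior_h0, round(odds, 3), decide(odds))
```

The same observation leads to different decisions under the two priors, which shows how the prior odds are modified by the data rather than replaced by them.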
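Bayesian Method (contd.)
$$P_E = P(\text{Type I Error})\,P(H_0) + P(\text{Type II Error})\,P(H_1)$$
$$P(\text{Type I Error}) = \int_{R_1} p(x | H_0)\,dx$$
$$P(\text{Type II Error}) = \int_{R_0} p(x | H_1)\,dx$$
where $R_0$ and $R_1$ are the regions in which $H_0$ and $H_1$ are accepted.
Decision boundary: $p(x | H_0)\,P(H_0) = p(x | H_1)\,P(H_1)$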
8/9/2019 EC-512EC512 LecNotes Pt2
12/29
Bayesian Method (contd.)
When the hypotheses are defined over some parameter $\theta$:
$$\frac{P(H_0 | x_s)}{P(H_1 | x_s)} = \frac{\int p(x_s | \theta, H_0)\,p(\theta | H_0)\,d\theta}{\int p(x_s | \theta, H_1)\,p(\theta | H_1)\,d\theta} \times \frac{P(H_0)}{P(H_1)}$$
Useful for compound hypothesis testing.
Example: We toss a coin 100 times and obtain 60 heads and 40 tails. What is the evidence against the hypothesis that the coin is fair?
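One way to work the coin example is sketched below. The notes do not state the alternative hypothesis, so the choice made here is an assumption: under $H_1$ the head probability $\theta$ is uniform on $(0, 1)$, in which case the integral of the binomial likelihood over $\theta$ equals $1/(n+1)$. The counts 60 and 100 are the values assumed for the example.

```python
from math import comb

def coin_bayes_factor(heads, tosses):
    """Bayes factor of H0 (fair coin) against H1 (theta ~ Uniform(0, 1))."""
    # H0 is a point hypothesis, theta = 0.5, so no integration is needed
    evidence_h0 = comb(tosses, heads) * 0.5 ** tosses
    # H1 (assumed): integrating the binomial pmf over a uniform prior gives 1/(n+1)
    evidence_h1 = 1.0 / (tosses + 1)
    return evidence_h0 / evidence_h1

bayes_factor = coin_bayes_factor(60, 100)
print(f"Bayes factor (H0 against H1) = {bayes_factor:.3f}")
```

Under this assumed prior the Bayes factor comes out close to 1, so the data hardly shift the odds in either direction; a different prior on $\theta$ would give a different number.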
Minimum Prob. of Error
We attach some cost function: $C_{ij}$ = cost incurred by accepting $H_i$ when $H_j$ is true.
$R_0$ is the region of accepting $H_0$, $R_1$ is the region of rejecting $H_0$ (accepting $H_1$) – we have to determine proper decision regions so that the overall cost is minimized.
Overall cost value:
$$C = \int_{R_0} \left[ C_{00}\,p(x | H_0)\,P(H_0) + C_{01}\,p(x | H_1)\,P(H_1) \right] dx + \int_{R_1} \left[ C_{10}\,p(x | H_0)\,P(H_0) + C_{11}\,p(x | H_1)\,P(H_1) \right] dx$$
Minimum Prob. of Error (contd.)
Cost is minimized if we decide for $H_0$ when
$$C_{00}\,p(x_s | H_0)\,P(H_0) + C_{01}\,p(x_s | H_1)\,P(H_1) < C_{10}\,p(x_s | H_0)\,P(H_0) + C_{11}\,p(x_s | H_1)\,P(H_1)$$
Considering a zero-one cost function, decide for $H_0$ when
$$p(x_s | H_1)\,P(H_1) < p(x_s | H_0)\,P(H_0)$$
This is essentially the Bayes decision criterion.
In this case the overall error (both Type I and Type II) is minimized.
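The cost comparison can be sketched as a small function. Here `cost[i][j]` plays the role of $C_{ij}$, and the likelihood values, prior and cost matrices are assumed numbers used only to show the rule in action.

```python
def min_cost_decision(lik0, lik1, prior0, cost):
    """Return 0 to accept H0 or 1 to accept H1.

    lik0, lik1: values of p(x_s|H0) and p(x_s|H1) at the observed sample.
    cost[i][j]: cost of accepting H_i when H_j is true.
    """
    prior1 = 1.0 - prior0
    cost_accept_h0 = cost[0][0] * lik0 * prior0 + cost[0][1] * lik1 * prior1
    cost_accept_h1 = cost[1][0] * lik0 * prior0 + cost[1][1] * lik1 * prior1
    return 0 if cost_accept_h0 < cost_accept_h1 else 1

zero_one = [[0, 1], [1, 0]]            # zero-one cost: every error costs 1
costly_miss = [[0, 10], [1, 0]]        # assumed: a miss costs 10x a false alarm
print(min_cost_decision(0.3, 0.1, 0.5, zero_one))
print(min_cost_decision(0.3, 0.1, 0.5, costly_miss))
```

With the zero-one cost the rule reduces to comparing $p(x_s|H_0)P(H_0)$ with $p(x_s|H_1)P(H_1)$; raising the cost of a miss makes the same observation tip toward $H_1$.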
Minimum Prob. of Error (contd.)
Fisherian Hypothesis Testing
Construct a statistical null hypothesis.
Choose an appropriate distribution or test statistic.
Collect the data with random samples.
Determine the p-value assuming the null hypothesis is true.
Reject the null hypothesis if p is small.
Neyman-Pearson Method
Similar to Fisher's approach, but:
Set significance level in advance,
Focus on Type I and Type II errors, as well as power of tests.
For any $\alpha$ there is an infinite number of possible decision rules (infinite number of critical regions).
Each critical region has a power.
Neyman-Pearson Method (contd.)
False Alarm: Wrongly rejecting $H_0$.
Detection: Rightly detecting $H_0$ not true (rejecting $H_0$).
Miss: Wrongly accepting $H_0$.
Check that prob. of false alarm $P_F = \alpha$ and prob. of miss is $\beta$. Prob. of correct acceptance is $(1 - \alpha)$ and prob. of detection $P_D = (1 - \beta)$.
So, we have two degrees of freedom.
Aim to increase $P_D$ while decreasing $P_F$ – generally not possible simultaneously.
Neyman-Pearson Method (contd.)
Neyman-Pearson Method (contd.)
Relation between $P_F$ and $P_D$ is commonly given by the receiver operating characteristic (ROC) curve.
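A few points of such a curve can be computed for a simple threshold test. The sketch assumes $H_0$: $N(0, 1)$ and $H_1$: $N(1, 1)$ and rejects $H_0$ when the sample exceeds the threshold; the distributions and thresholds are illustrative choices, not values from the notes.

```python
from statistics import NormalDist

def roc_points(mu0, mu1, sigma, thresholds):
    """Return (P_F, P_D) pairs for the rule 'reject H0 when x > threshold'."""
    h0 = NormalDist(mu0, sigma)
    h1 = NormalDist(mu1, sigma)
    # P_F: area of p(x|H0) above the threshold; P_D: area of p(x|H1) above it
    return [(1.0 - h0.cdf(t), 1.0 - h1.cdf(t)) for t in thresholds]

thresholds = [-1.0, 0.0, 0.5, 1.0, 2.0]
points = roc_points(0.0, 1.0, 1.0, thresholds)
for t, (p_f, p_d) in zip(thresholds, points):
    print(f"threshold={t:+.1f}  P_F={p_f:.3f}  P_D={p_d:.3f}")
```

Raising the threshold lowers $P_F$ and $P_D$ together, which is why the two cannot be improved simultaneously with a single threshold.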
Neyman-Pearson Lemma
Hypothesis Testing using LRT
Recall the criterion used for minimizing the cost function in hypothesis testing (decision making): Reject $H_0$ when
$$C_{00}\,p(x_s | H_0)\,P(H_0) + C_{01}\,p(x_s | H_1)\,P(H_1) > C_{10}\,p(x_s | H_0)\,P(H_0) + C_{11}\,p(x_s | H_1)\,P(H_1)$$
This can be rewritten (assuming $C_{10} > C_{00}$) as
$$\frac{p(x_s | H_0)}{p(x_s | H_1)} < \frac{C_{01} - C_{11}}{C_{10} - C_{00}} \times \frac{P(H_1)}{P(H_0)}$$
So, if the threshold $k$ in the LRT equals the right-hand side of this inequality, then the cost function is minimized.
Also, check that if $k = P(H_1)/P(H_0)$ for the zero-one cost function then it is essentially the Bayes criterion based hypothesis testing.
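The threshold form of the test can be sketched as below. The helper names are hypothetical, `cost[i][j]` stands for $C_{ij}$, and the prior and likelihood values are assumed for illustration; the sketch also assumes $C_{10} > C_{00}$ so that the direction of the inequality is preserved.

```python
def lrt_threshold(prior0, cost):
    """Threshold k that minimizes the overall cost (assumes cost[1][0] > cost[0][0])."""
    prior1 = 1.0 - prior0
    cost_ratio = (cost[0][1] - cost[1][1]) / (cost[1][0] - cost[0][0])
    return cost_ratio * prior1 / prior0

def lrt_reject_h0(lik0, lik1, k):
    """Reject H0 when the likelihood ratio p(x_s|H0) / p(x_s|H1) falls below k."""
    return lik0 / lik1 < k

zero_one = [[0, 1], [1, 0]]
k = lrt_threshold(0.8, zero_one)       # assumed prior P(H0) = 0.8
print(k)
print(lrt_reject_h0(0.10, 0.30, k))
print(lrt_reject_h0(0.05, 0.30, k))
```

For the zero-one cost the threshold reduces to $P(H_1)/P(H_0)$, which is the Bayes criterion mentioned in the notes.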
Points to Note
Maximum Likelihood Estimation
Assumed that the parametric form of $p(x)$ is known, but depends on some parameters $\theta_1, \theta_2, \theta_3, \ldots$
So, once we can find (estimate) these parameter values then the density function is uniquely determined.
Observe a set of i.i.d. training samples $x_1, x_2, \ldots, x_n$.
Likelihood:
$$p(X | \theta) = p(x_1, x_2, \ldots, x_n | \theta) = \prod_{k=1}^{n} p(x_k | \theta)$$
MLE finds the values of parameters for which the likelihood is maximized.
MLE (contd.)
To find the parameter value that maximizes the likelihood we need to differentiate w.r.t. $\theta_k$ and then equate to zero.
Equivalently, we may differentiate the log-likelihood function – this is easier to work with.
Log-likelihood:
$$l(\theta) = \ln p(X | \theta) = \sum_{k=1}^{n} \ln p(x_k | \theta)$$
Solve for $\theta$:
$$\nabla_{\theta}\, l(\theta) = \sum_{k=1}^{n} \nabla_{\theta} \ln p(x_k | \theta) = 0$$
Explain with examples: Gaussian case with unknown mean, and with unknown mean and variance.
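For the Gaussian case with unknown mean and variance, setting the gradient of the log-likelihood to zero gives the sample mean and the sample variance with divisor $n$. A minimal sketch follows; the data values are made up for illustration.

```python
def gaussian_mle(samples):
    """ML estimates of the mean and variance of a univariate Gaussian."""
    n = len(samples)
    mu_hat = sum(samples) / n
    # The ML variance divides by n (not n - 1), so it is a biased estimate
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat

# Illustrative data, not from the notes
mu_hat, var_hat = gaussian_mle([2, 4, 4, 4, 5, 5, 7, 9])
print(mu_hat, var_hat)
```

When only the mean is unknown, the first return value alone is the ML estimate.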
MAP Estimation
Prior probability of different parameter values is given by $p(\theta)$.
We can determine the posterior prob. $p(\theta | X)$ for the given training samples.
We look for the parameter values that maximize this posterior prob.
That is, we maximize $p(X | \theta)\,p(\theta)$.
Equivalently, since $l(\theta) = \ln p(X | \theta)$, we maximize $l(\theta) + \ln p(\theta)$.
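A brute-force sketch of MAP estimation for a Gaussian mean is shown below: it scans a grid of candidate means and keeps the one with the largest log-likelihood plus log-prior. The known variance, the Gaussian prior, the data and the grid are all assumptions made for illustration; a grid search is used only because it mirrors the definition directly.

```python
from math import log, pi

def log_gauss(x, mean, var):
    """Log of the univariate Gaussian density N(mean, var) at x."""
    return -0.5 * log(2.0 * pi * var) - (x - mean) ** 2 / (2.0 * var)

def map_mean_grid(samples, sigma2, mu0, sigma0_2, lo, hi, steps=10001):
    """Grid search for the mean maximizing log-likelihood + log-prior."""
    best_mu, best_score = lo, float("-inf")
    for i in range(steps):
        mu = lo + (hi - lo) * i / (steps - 1)
        log_lik = sum(log_gauss(x, mu, sigma2) for x in samples)
        score = log_lik + log_gauss(mu, mu0, sigma0_2)  # add the log-prior
        if score > best_score:
            best_mu, best_score = mu, score
    return best_mu

# Assumed setup: known variance 1, prior N(0, 1) on the mean
samples = [1.8, 2.2, 2.0, 2.4]
mu_map = map_mean_grid(samples, 1.0, 0.0, 1.0, lo=0.0, hi=4.0)
print(round(mu_map, 3))
```

The estimate lands between the prior mean 0 and the sample mean 2.1, pulled toward the data as more samples are added.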
Bayesian Estimation
The parametric form of $p(x)$ is known but the parameter values are not known.
Basic goal is to compute the density function from a given set of samples, i.e. $p(x | X)$, which is close to the unknown $p(x)$:
$$p(x | X) = \int p(x | \theta)\,p(\theta | X)\,d\theta$$
So, we need to compute $p(\theta | X)$ – called a reproducing density when it keeps the same functional form as the prior.
Initial knowledge about the parameter values is contained in a known prior density $p(\theta)$ – called a conjugate prior in that case.
Rest of our knowledge about the parameters is contained in the sample set.
Bayesian Estimation (contd.)
Using Bayes formula:
$$p(\theta | X) = \frac{p(X | \theta)\,p(\theta)}{\int p(X | \theta)\,p(\theta)\,d\theta}$$
Since the samples are drawn independently according to the unknown prob. density $p(x)$,
$$p(X | \theta) = \prod_{k=1}^{n} p(x_k | \theta)$$
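Bayesian Estimation – Example
Given:
$$p(x | \mu) \sim N(\mu, \sigma^2), \qquad p(\mu) \sim N(\mu_0, \sigma_0^2)$$
Reproducing density:
$$p(\mu | X) = \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left[ -\frac{1}{2} \left( \frac{\mu - \mu_n}{\sigma_n} \right)^2 \right]$$
where
$$\mu_n = \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2}\,\hat{\mu}_n + \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}\,\mu_0, \qquad \sigma_n^2 = \frac{\sigma_0^2\,\sigma^2}{n\sigma_0^2 + \sigma^2}, \qquad \hat{\mu}_n = \frac{1}{n} \sum_{k=1}^{n} x_k$$
Finally, density function:
$$p(x | X) \sim N(\mu_n, \sigma^2 + \sigma_n^2)$$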
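The update formulas of this example translate directly into a few lines of code. The function name, data, known variance and prior parameters below are assumed values for illustration.

```python
def gaussian_posterior(samples, sigma2, mu0, sigma0_2):
    """Posterior mean and variance of mu, plus the predictive variance.

    sigma2: known variance of x given mu; (mu0, sigma0_2): Gaussian prior on mu.
    """
    n = len(samples)
    sample_mean = sum(samples) / n
    denom = n * sigma0_2 + sigma2
    # mu_n is a weighted average of the sample mean and the prior mean
    mu_n = (n * sigma0_2 / denom) * sample_mean + (sigma2 / denom) * mu0
    sigma_n2 = sigma0_2 * sigma2 / denom
    # p(x|X) is Gaussian with mean mu_n and variance sigma2 + sigma_n2
    return mu_n, sigma_n2, sigma2 + sigma_n2

# Assumed setup: known variance 1, prior N(0, 1) on the mean
mu_n, sigma_n2, predictive_var = gaussian_posterior([1.8, 2.2, 2.0, 2.4], 1.0, 0.0, 1.0)
print(round(mu_n, 3), round(sigma_n2, 3), round(predictive_var, 3))
```

As $n$ grows, $\sigma_n^2$ shrinks toward zero and $\mu_n$ approaches the sample mean, so the prior matters less and less.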