PAC Learning Framework
Dept. of CS&T, Tsinghua University
Slides: mi.eng.cam.ac.uk/~cz277/doc/Slides-PAC.pdf (30 slides)

TRANSCRIPT

1 Questions for Learning Algorithms
2 Basis of PAC
  - Introduction
  - Basic Symbols
  - Error of a Hypothesis
  - PAC Learnability
3 Sample Complexity for Finite Hypothesis Spaces
  - Consistent Learner
  - Agnostic Learning and Inconsistent Hypotheses
  - PAC-Learnability of Other Concept Classes
4 Sample Complexity for Infinite Hypothesis Spaces
5 Some More General Scenarios
  - Papers in Recent Years
6 Criticisms of the PAC Model

Questions for Learning Algorithms

Sample Complexity
  - How many training examples do we need to converge to a successful hypothesis with high probability?

Computational Complexity
  - How much computational effort is needed to converge to a successful hypothesis with high probability?

Mistake Bound
  - How many training examples will the learner misclassify before converging to a successful hypothesis?

Questions for Learning Algorithms

What does it mean for the learner to be "successful"?
  - The output hypothesis is identical to the target concept
  - The output hypothesis agrees with the target concept most of the time

The answer to the question depends on the particular learning model in mind


Introduction

Probably Approximately Correct (PAC) Learning

Analysis under the framework
  - Sample Complexity
  - Computational Complexity

We restrict the discussion to
  - Learning boolean-valued concepts
  - Learning from noise-free training data

Basic Symbols

Example: the students who passed the mid-term exam and handed in all the homework, and who passed the final exam or handed in a reading report, passed the course Applied Stochastic Process.

X: instance set, x ∈ X
  |X| = 2 × 2 × 2 × 2 = 16

C: concept set, c ∈ C
  C = {((1, 1, 1, 0), 1), ((1, 0, 1, 1), 1), ((1, 1, 1, 1), 1)}

H: set of all possible hypotheses, h ∈ H
  |H| = 1 + 3 × 3 × 3 × 3 = 82

𝒟: instance distribution; D: training set
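The counts on this slide can be checked by enumeration. A minimal sketch in Python, under the assumption (consistent with the |H| = 1 + 3^4 count) that hypotheses are conjunctions over the four boolean features, where each feature may be required true, required false, or ignored, plus a single always-false hypothesis:

```python
from itertools import product

# Hypothesis = 4-tuple over {1, 0, '?'}: feature required true, required
# false, or ignored. None stands for the single always-false hypothesis.
def predict(h, x):
    if h is None:                      # the empty hypothesis rejects everything
        return 0
    return int(all(c == '?' or c == xi for c, xi in zip(h, x)))

X = list(product([0, 1], repeat=4))                 # |X| = 2^4 = 16
H = [None] + list(product([0, 1, '?'], repeat=4))   # |H| = 1 + 3^4 = 82

print(len(X), len(H))                               # 16 82
print(predict(('?', 0, 1, '?'), (1, 0, 1, 1)))      # 1
```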

Basic Symbols

Model of learning from examples: x is drawn randomly from X according to 𝒟

Error of a Hypothesis
True Error

Definition
The true error of hypothesis h with respect to target concept c and distribution 𝒟 is the probability that h will misclassify an instance drawn randomly according to 𝒟, i.e.

  error_𝒟(h) ≡ P(c Δ h)

A Δ B is the symmetric difference: A Δ B = (A ∪ B) \ (A ∩ B)
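The definition can be turned into a quick numerical check. A hedged sketch: estimating error_𝒟(h) = P(c Δ h) by sampling, with a toy h, c, and distribution chosen here purely for illustration:

```python
import random

def true_error(h, c, draw, n_samples=100_000):
    """Monte Carlo estimate of error_D(h) = P(h(x) != c(x)) for x ~ D.

    h, c -- boolean predicates (hypothesis and target concept)
    draw -- callable sampling one instance from the distribution D
    """
    return sum(h(x) != c(x) for x in (draw() for _ in range(n_samples))) / n_samples

# Toy example: X = [0, 1), D uniform, c = [x >= 0.5], h = [x >= 0.6].
# The symmetric difference c delta h is [0.5, 0.6), so error_D(h) = 0.1.
random.seed(0)
est = true_error(lambda x: x >= 0.6, lambda x: x >= 0.5, random.random)
print(round(est, 2))  # ≈ 0.1
```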

PAC Learnability
Learnability ("Successful")

Requiring error_𝒟(h) = 0 is futile
  - We would need to provide training examples corresponding to every possible instance in X
  - Randomly drawn training examples always have some nonzero probability of being misleading

PAC Learnability
  - We require only that h's true error be bounded by some constant ε that can be made arbitrarily small
  - We require only that h's probability of failure be bounded by some constant δ that can be made arbitrarily small
  - (1 − δ) → "probably"; ε → "approximately correct"
  - We require only that the learner probably learn a hypothesis that is approximately correct, hence PAC

PAC Learnability

Definition
Consider a concept class C defined over a set of instances X of length n, and a learner L using hypothesis space H. C is PAC-learnable by L using H if, for all c ∈ C, all distributions 𝒟 over X, all ε with 0 < ε < 1/2, and all δ with 0 < δ < 1/2, learner L will with probability at least (1 − δ) output a hypothesis h ∈ H such that error_𝒟(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).

L → (1 − δ), ε
L → polynomial time, polynomial number of samples
X → n
C → size(c)


Sample Complexity for Finite Hypothesis Spaces

Sample Complexity
  - The growth in the number of required training examples with problem size, called the sample complexity of the learning problem

Consistent Learner

Definition
A hypothesis h is consistent with training set D if and only if h(x) = c(x) for every training example in D, i.e.

  Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)

A learner is consistent if it outputs hypotheses that perfectly fit the training data

Consistent Learner

Version Space

Definition
The version space VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}

C = {(1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 1)}, c = (1, 0, 1, 1)
D = {((1, 0, 1, 1), 1), ((1, 0, 0, 1), 0)}
⇒ VS_{H,D} = {(?, ?, 1, ?), ..., (1, 0, 1, 1), ..., (1, 1, 1, 0), ..., (1, 1, 1, 1)}

ε-Exhausted
To bound the number of examples needed by any consistent learner, we need only bound the number of examples needed to assure that the version space contains no unacceptable hypotheses

Definition
VS_{H,D} is said to be ε-exhausted with respect to c and 𝒟 if every hypothesis h in VS_{H,D} has error less than ε with respect to c and 𝒟, i.e.

  (∀h ∈ VS_{H,D}) error_𝒟(h) < ε
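The version space on this slide can be computed by brute force. A sketch, assuming the hypothesis representation implied by |H| = 1 + 3^4: conjunctions where each feature is required true, required false, or ignored (entries in {1, 0, '?'}), plus one always-false hypothesis (None):

```python
from itertools import product

def predict(h, x):
    if h is None:                  # the always-false hypothesis
        return 0
    return int(all(c == '?' or c == xi for c, xi in zip(h, x)))

H = [None] + list(product([0, 1, '?'], repeat=4))   # |H| = 1 + 3^4 = 82

# Training set from the slide: one positive and one negative example
D = [((1, 0, 1, 1), 1), ((1, 0, 0, 1), 0)]

VS = [h for h in H if all(predict(h, x) == y for x, y in D)]
print(len(VS))                                       # 8
print((1, 0, 1, 1) in VS, ('?', '?', 1, '?') in VS)  # True True
```

The two examples differ only in the third feature, so every surviving conjunction must pin that feature to 1; the remaining three positions are each free to match the positive example or be '?', giving 2 × 2 × 2 = 8 consistent hypotheses.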

Consistent Learner

ε-Exhausting the Version Space
A probabilistic way to bound the probability that the version space will be ε-exhausted after a given number of training examples, without knowing the identity of the target concept or the distribution from which training examples are drawn:

Theorem
If the hypothesis space H is finite, and D is a sequence of m ≥ 1 independent randomly drawn examples of some target concept c, then for any 0 ≤ ε ≤ 1, the probability that the version space VS_{H,D} is not ε-exhausted is less than or equal to

  |H| e^(−εm)

Consistent Learner

ε-Exhausting the Version Space
Proof:
Let h1, h2, ..., hk be all the hypotheses in H that have true error greater than ε with respect to c. The probability that any one such hypothesis is consistent with m randomly drawn examples is at most (1 − ε)^m, so the probability that at least one such hypothesis remains in VS_{H,D} is at most

  k(1 − ε)^m                                  (1)
  k(1 − ε)^m ≤ |H|(1 − ε)^m ≤ |H| e^(−εm)     (2)

Solving for m: requiring this failure probability to be at most δ,

  |H| e^(−εm) ≤ δ                             (3)
  m ≥ (1/ε)(ln |H| + ln(1/δ))                 (4)
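The bound |H| e^(−εm) can be checked empirically. A sketch that simulates a setting like the running example (conjunctions over four boolean features, uniform instance distribution, and a conjunction target picked here for illustration), comparing the observed frequency of a non-ε-exhausted version space against the bound:

```python
import math
import random
from itertools import product

def predict(h, x):
    if h is None:                 # the always-false hypothesis
        return 0
    return int(all(c == '?' or c == xi for c, xi in zip(h, x)))

X = list(product([0, 1], repeat=4))
H = [None] + list(product([0, 1, '?'], repeat=4))
c = (1, 0, 1, 1)                  # target concept, chosen for illustration

def exact_error(h):               # true error under the uniform distribution
    return sum(predict(h, x) != predict(c, x) for x in X) / len(X)

eps, m, trials = 0.3, 20, 1000
bound = len(H) * math.exp(-eps * m)          # |H| e^(-eps m) ≈ 0.203

random.seed(1)
not_exhausted = 0
for _ in range(trials):
    D = [(x, predict(c, x)) for x in (random.choice(X) for _ in range(m))]
    VS = [h for h in H if all(predict(h, x) == y for x, y in D)]
    if any(exact_error(h) >= eps for h in VS):   # a bad hypothesis survived
        not_exhausted += 1

print(not_exhausted / trials, "<=", round(bound, 3))
```

The observed failure rate sits well below the bound, which is expected: the union-bound step in the proof is loose for overlapping hypotheses.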

Consistent Learner

Conjunctions of Boolean Literals
  - n = 4, |H| = 3^4, ε = 0.1, δ = 0.05; then

      m ≥ (1/ε)(n ln 3 + ln(1/δ)) = 10 × (4 ln 3 + ln 20) ≈ 74

    |X| = 16 < m₀ = 74 (the bound asks for more training examples than there are distinct instances in X)
  - The weakness of this bound is mainly due to the |H| term, which arises in the proof when summing, over all possible hypotheses, the probability that a single hypothesis could be unacceptable
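Equation (4) with |H| = 3^n gives the m ≈ 74 figure directly. A small helper, sketched here for checking such numbers:

```python
import math

def consistent_sample_bound(hyp_space_size, eps, delta):
    """Smallest integer m with m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(hyp_space_size) + math.log(1.0 / delta)) / eps)

# Conjunctions of n = 4 boolean literals: |H| = 3**4
print(consistent_sample_bound(3**4, eps=0.1, delta=0.05))  # 74
```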

Agnostic Learning and Inconsistent Hypotheses

Agnostic Learner: if H does not contain the target concept c, then a zero-error hypothesis cannot always be found

Definition
A learner that makes no assumption that the target concept is representable by H, and that simply finds the hypothesis with minimum training error, is often called an agnostic learner

Inconsistent Hypotheses
  - Let error_D(h) denote the training error of hypothesis h, and let h_best denote the hypothesis from H having the lowest training error over the training set
  - Use the Hoeffding bound, which characterizes the deviation between the true probability of some event and its observed frequency over m independent trials. To ensure the true error satisfies error_𝒟(h_best) ≤ ε + error_D(h_best):

      P((∃h ∈ H) error_𝒟(h) > error_D(h) + ε) ≤ |H| e^(−2mε²)

      m ≥ (1/(2ε²))(ln |H| + ln(1/δ))

Page 48: PAC Learning Framework - University of Cambridgemi.eng.cam.ac.uk/~cz277/doc/Slides-PAC.pdf · PAC Learning Framework Ü⁄ Dept. of CS&T, Tsinghua University Ü⁄ (Dept. of CS&T,

PAC-Learnability of Other Concept Classes

Unbiased Learner

- The unbiased concept class C contains every teachable concept relative to X. Suppose that instances in X are defined by n boolean features; then |C| = 2^|X| = 2^(2^n), and

m ≥ (1/ε)(2^n ln 2 + ln(1/δ))

so the sample complexity is exponential in n: C is not PAC learnable.

k-Term DNF and k-CNF Concepts

- k-Term DNF has polynomial sample complexity, but it does not have polynomial computational complexity for a learner using H = C.
- k-CNF has polynomial computational complexity per example and still has polynomial sample complexity.
- k-Term DNF ⊆ k-CNF, so k-Term DNF is PAC learnable by a learner using H = k-CNF.
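The exponential blow-up for the unbiased class is easy to see numerically. A small sketch (the function name and the parameter values are illustrative):

```python
import math

def unbiased_sample_complexity(n, epsilon, delta):
    """Consistent-learner bound m >= (1/eps)(ln|C| + ln(1/delta)) with the
    unbiased class |C| = 2^(2^n), so ln|C| = 2^n ln 2: exponential in n."""
    return math.ceil(((2 ** n) * math.log(2) + math.log(1.0 / delta)) / epsilon)

for n in (5, 10, 20):
    print(n, unbiased_sample_complexity(n, 0.1, 0.05))
# 5 boolean features already require hundreds of examples; 20 require millions.
```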



Sample Complexity for Infinite Hypothesis Spaces

Shatter

Definition

A set of instances S is shattered by hypothesis space H if and only if for every dichotomy of S there exists some hypothesis in H consistent with this dichotomy.

VC Dimension

Definition

The Vapnik-Chervonenkis dimension VC(H) of hypothesis space H defined over instance space X is the size of the largest finite subset of X shattered by H. If arbitrarily large finite sets of X can be shattered by H, then VC(H) = ∞.

Shattering a set of size d requires 2^d distinct hypotheses, so for finite H:

VC(H) = d = log2(2^d) ≤ log2 |H|
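The shattering definition can be checked by brute force for small cases. A sketch, assuming interval concepts [a, b] on the real line and a finite grid of candidate endpoints (all names here are illustrative):

```python
def shatters(points, hypotheses, classify):
    """True iff every dichotomy of `points` is realized by some hypothesis."""
    realized = {tuple(classify(h, x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

# Concepts: closed intervals [a, b]; x is labeled positive iff a <= x <= b.
in_interval = lambda h, x: h[0] <= x <= h[1]
grid = [0.5, 1.5, 2.5, 3.5]
intervals = [(a, b) for a in grid for b in grid if a <= b]

print(shatters([1.0, 2.0], intervals, in_interval))       # two points: shattered
print(shatters([1.0, 2.0, 3.0], intervals, in_interval))  # three points: not shattered
```

No interval realizes the dichotomy (+, -, +) on three collinear points, so intervals shatter some 2-point set but no 3-point set: VC(intervals) = 2.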


Sample Complexity and the VC Dimension

Upper Bound (sufficient to PAC-learn any target in C)

m ≥ (1/ε)(4 log2(2/δ) + 8 VC(H) log2(13/ε))

Lower Bound (necessary for any successful PAC learner)

m ≥ max{ (1/ε) ln(1/δ), (VC(C) − 1)/(32ε) }
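Both bounds are straightforward to evaluate. A sketch (natural log is used in the lower bound, per the usual statement; the example VC dimension of 3 for linear separators in the plane is a standard fact, not from the slides):

```python
import math

def vc_upper_bound(d, epsilon, delta):
    """Sufficient: m >= (1/eps)(4 log2(2/delta) + 8 d log2(13/eps))."""
    return math.ceil((4 * math.log2(2.0 / delta)
                      + 8 * d * math.log2(13.0 / epsilon)) / epsilon)

def vc_lower_bound(d, epsilon, delta):
    """Necessary: m >= max((1/eps) ln(1/delta), (d - 1)/(32 eps))."""
    return math.ceil(max(math.log(1.0 / delta) / epsilon,
                         (d - 1) / (32.0 * epsilon)))

# Linear separators in the plane: VC dimension 3
print(vc_lower_bound(3, 0.1, 0.05), vc_upper_bound(3, 0.1, 0.05))  # -> 30 1899
```

The wide gap between the two bounds is typical; they pin down the sample complexity only up to constants and a log factor.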



Some More General Scenario: Learning Real-Valued Target Functions

Definition

A concept class is a nonempty set C ⊆ 2^X of concepts. Assume X is a fixed set, either finite, countably infinite, [0, 1]^n, or E^n for some n ≥ 1. In the latter cases, we assume that each c ∈ C is a Borel set. For x = (x1, ..., xm) ∈ X^m, m ≥ 1, the m-sample of c ∈ C generated by x is given by sam_c(x) = (⟨x1, I_c(x1)⟩, ..., ⟨xm, I_c(xm)⟩). The sample space of C, denoted S_C, is the set of all m-samples over all c ∈ C and all x ∈ X^m, for all m ≥ 1. ...

Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 34(4), 929-965, 1989.

Natarajan, B. K. Machine Learning: A Theoretical Approach. San Mateo, CA: Morgan Kaufmann, 1991.
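The m-sample construction in the definition is just a sequence of instance/indicator pairs, which a tiny sketch makes concrete (the concept and instances below are made up for illustration):

```python
def m_sample(concept, xs):
    """sam_c(x): the sequence of pairs <x_i, I_c(x_i)>, where I_c is the
    indicator function of membership in the concept c."""
    return [(x, int(x in concept)) for x in xs]

# A concept over a small finite instance space
c = {2, 3, 5, 7}
print(m_sample(c, [1, 2, 3, 4]))  # -> [(1, 0), (2, 1), (3, 1), (4, 0)]
```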


Some More General Scenario: Learning from Certain Types of Noisy Data

Laird, P. Learning from Good and Bad Data. Dordrecht: Kluwer Academic Publishers, 1988.

Kearns, M. J., & Vazirani, U. V. An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press, 1994.


Papers in Recent Years

Auer, P., & Ortner, R. A new PAC bound for intersection-closed concept classes. Machine Learning, 2007.


Criticisms of the PAC Model

- The worst-case emphasis in the model makes it unusable in practice.
- The notions of a target concept and noise-free training data are unrealistic in practice.
- Without its computational part, the results are subsumed by Statistical Learning Theory.


References

Valiant, L. A theory of the learnable. Communications of the ACM, 27, 1984.

Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 34(4), 929-965, 1989.

Haussler, D. Overview of the Probably Approximately Correct (PAC) Learning Framework. An introduction to the topic.


Vidyasagar, M. A Theory of Learning and Generalization: With Applications to Neural Networks and Control Systems. London; New York: Springer, 1997.

Vapnik, V. N. Statistical Learning Theory. New York: Wiley, 1998.

Mitchell, T. M. Machine Learning (Chinese translation). Beijing: China Machine Press, 2003.


Thanks!
