2.supervised learning(epoch#2)-1

42
Introduction to Machine Learning with Python 2. Supervised Learning(1) Honedae Machine Learning Study Epoch #2 1

Upload: haesun-park

Post on 21-Jan-2018

459 views

Category:

Software


2 download

TRANSCRIPT

Page 1: 2.supervised learning(epoch#2)-1

Introduction to

Machine Learning with Python

2. Supervised Learning(1)

Honedae Machine Learning Study Epoch #2

1

Page 2: 2.supervised learning(epoch#2)-1

Contacts

Haesun Park

Email : [email protected]

Meetup: https://www.meetup.com/Hongdae-Machine-Learning-Study/

Facebook : https://facebook.com/haesunrpark

Blog : https://tensorflow.blog

2

Page 3: 2.supervised learning(epoch#2)-1

Book

ํŒŒ์ด์ฌ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผํ™œ์šฉํ•œ๋จธ์‹ ๋Ÿฌ๋‹, ๋ฐ•ํ•ด์„ .

(Introduction to Machine Learning with Python, Andreas

Muller & Sarah Guido์˜๋ฒˆ์—ญ์„œ์ž…๋‹ˆ๋‹ค.)

๋ฒˆ์—ญ์„œ์˜ 1์žฅ๊ณผ 2์žฅ์€๋ธ”๋กœ๊ทธ์—์„œ๋ฌด๋ฃŒ๋กœ์ฝ์„์ˆ˜์žˆ์Šต๋‹ˆ๋‹ค.

์›์„œ์—๋Œ€ํ•œํ”„๋ฆฌ๋ทฐ๋ฅผ์˜จ๋ผ์ธ์—์„œ๋ณผ์ˆ˜์žˆ์Šต๋‹ˆ๋‹ค.

Github:

https://github.com/rickiepark/introduction_to_ml_with_python/

3

Page 4: 2.supervised learning(epoch#2)-1

์ง€๋„ํ•™์Šต

4

Page 5: 2.supervised learning(epoch#2)-1

์ง€๋„ํ•™์Šตsupervised learning

์ž…๋ ฅ๊ณผ์ถœ๋ ฅ์˜์ƒ˜ํ”Œ๋ฐ์ดํ„ฐ๊ฐ€์žˆ์„๋•Œ์ƒˆ๋กœ์šด์ž…๋ ฅ์—๋Œ€ํ•œ์ถœ๋ ฅ์„์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

5

ํ›ˆ๋ จ(training, learning)

์˜ˆ์ธก(predict, Inference)

์‚ฌ์ „์—๋ฐ์ดํ„ฐ๋ฅผ๋งŒ๋“œ๋Š”๋…ธ๋ ฅ์ดํ•„์š”ํ•ฉ๋‹ˆ๋‹ค(์ž๋™ํ™”ํ•„์š”).

Page 6: 2.supervised learning(epoch#2)-1

๋ถ„๋ฅ˜classification์™€ํšŒ๊ท€regression

๋ถ„๋ฅ˜๋Š”๊ฐ€๋Šฅ์„ฑ์žˆ๋Š”ํด๋ž˜์Šค๋ ˆ์ด๋ธ”์ค‘ํ•˜๋‚˜๋ฅผ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

์ด์ง„๋ถ„๋ฅ˜binary classification์˜์˜ˆ: ๋‘๊ฐœํด๋ž˜์Šค๋ถ„๋ฅ˜(์ŠคํŒธ, 0-์Œ์„ฑํด๋ž˜์Šค, 1-์–‘์„ฑํด๋ž˜์Šค)

๋‹ค์ค‘๋ถ„๋ฅ˜multiclass classification์˜์˜ˆ: ์…‹์ด์ƒ์˜ํด๋ž˜์Šค๋ถ„๋ฅ˜(๋ถ“๊ฝƒํ’ˆ์ข…)

์–ธ์–ด์˜์ข…๋ฅ˜๋ฅผ๋ถ„๋ฅ˜(ํ•œ๊ตญ์–ด์™€ ํ”„๋ž‘์Šค์–ด์‚ฌ์ด์—๋‹ค๋ฅธ์–ธ์–ด๊ฐ€์—†์Œ)

๋Œ€๋ถ€๋ถ„๋จธ์‹ ๋Ÿฌ๋‹๋ฌธ์ œ๊ฐ€๋ถ„๋ฅ˜์˜๋ฌธ์ œ์ž…๋‹ˆ๋‹ค.

ํšŒ๊ท€๋Š”์—ฐ์†์ ์ธ์ˆซ์ž(์‹ค์ˆ˜)๋ฅผ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

์˜ˆ)๊ต์œก, ๋‚˜์ด, ์ฃผ๊ฑฐ์ง€๋ฅผ๋ฐ”ํƒ•์œผ๋กœ์—ฐ๊ฐ„์†Œ๋“์˜ˆ์ธก

์ „๋…„๋„์ˆ˜ํ™•๋Ÿ‰, ๋‚ ์”จ, ๊ณ ์šฉ์ž์ˆ˜๋กœ์˜ฌํ•ด์ˆ˜ํ™•๋Ÿ‰์˜ˆ์ธก

์˜ˆ์ธก๊ฐ’์—๋ฏธ๋ฌ˜ํ•œ์ฐจ์ด๊ฐ€ํฌ๊ฒŒ์ค‘์š”ํ•˜์ง€์•Š์Šต๋‹ˆ๋‹ค.(์—ฐ๋ด‰ ์˜ˆ์ธก์˜๊ฒฝ์šฐ 40,000,001

์›์ด๋‚˜ 39,999,999์›์ด๋ฌธ์ œ๊ฐ€๋˜์ง€์•Š์Šต๋‹ˆ๋‹ค.)

ํ†ต๊ณ„ํ•™์—์„œ๋งŽ์ด์—ฐ๊ตฌ๋˜์–ด์™”์Šต๋‹ˆ๋‹ค.6

Page 7: 2.supervised learning(epoch#2)-1

ํšŒ๊ท€ - Regression

7

โ€œregression toward the meanโ€

ํ”„๋žœ์‹œ์Šค๊ณจํ„ดFrancis Galton

Page 8: 2.supervised learning(epoch#2)-1

์ผ๋ฐ˜ํ™”, ๊ณผ๋Œ€์ ํ•ฉ, ๊ณผ์†Œ์ ํ•ฉ

์ผ๋ฐ˜ํ™”generalization

ํ›ˆ๋ จ์„ธํŠธ๋กœํ•™์Šตํ•œ๋ชจ๋ธ์„ํ…Œ์ŠคํŠธ์„ธํŠธ์—์ ์šฉํ•˜๋Š”๊ฒƒ์ž…๋‹ˆ๋‹ค.

(์˜ˆ, ๋ชจ๋ธ์ดํ…Œ์ŠคํŠธ์„ธํŠธ์—์ž˜์ผ๋ฐ˜ํ™”๊ฐ€๋˜์—ˆ๋‹ค)

๊ณผ๋Œ€์ ํ•ฉoverfitting

ํ›ˆ๋ จ์„ธํŠธ์—๋„ˆ๋ฌด๋งž์ถ”์–ด์ ธ์žˆ์–ดํ…Œ์ŠคํŠธ์„ธํŠธ์˜์„ฑ๋Šฅ์ €ํ•˜๋˜๋Š”ํ˜„์ƒ์ž…๋‹ˆ๋‹ค.

(์˜ˆ, ๋ชจ๋ธ์ดํ›ˆ๋ จ์„ธํŠธ์—๊ณผ๋Œ€์ ํ•ฉ๋˜์–ด ์žˆ๋‹ค)

๊ณผ์†Œ์ ํ•ฉunderfitting

ํ›ˆ๋ จ์„ธํŠธ๋ฅผ์ถฉ๋ถ„ํžˆ๋ฐ˜์˜ํ•˜์ง€๋ชปํ•ดํ›ˆ๋ จ์„ธํŠธ, ํ…Œ์ŠคํŠธ์„ธํŠธ์—์„œ๋ชจ๋‘์„ฑ๋Šฅ์ €ํ•˜๋˜๋Š”

ํ˜„์ƒ์ž…๋‹ˆ๋‹ค. (์˜ˆ, ๋ชจ๋ธ์ด๋„ˆ๋ฌด๋‹จ์ˆœํ•˜์—ฌ๊ณผ์†Œ์ ํ•ฉ๋˜์–ด ์žˆ๋‹ค)

8

Page 9: 2.supervised learning(epoch#2)-1

๋„ˆ๋ฌด์ƒ์„ธํ•œ๋ชจ๋ธ๊ณผ๋Œ€์ ํ•ฉ

9

45์„ธ์ด์ƒ, ์ž๋…€์…‹๋ฏธ๋งŒ,

์ดํ˜ผํ•˜์ง€์•Š์€๊ณ ๊ฐ์ด์š”ํŠธ๋ฅผ์‚ด๊ฒƒ์ด๋‹ค.

์š”ํŠธํšŒ์‚ฌ์˜๊ณ ๊ฐ๋ฐ์ดํ„ฐ๋ฅผํ™œ์šฉํ•ด์š”ํŠธ๋ฅผ์‚ฌ๋ ค๋Š”์‚ฌ๋žŒ์„์˜ˆ์ธกํ•˜๋ ค๊ณ ํ•ฉ๋‹ˆ๋‹ค.

๋งŒ์•ฝ 10,000๊ฐœ์˜๋ฐ์ดํ„ฐ๊ฐ€๋ชจ๋‘์ด์กฐ๊ฑด์„๋งŒ์กฑํ•œ๋‹ค๋ฉด?

Page 10: 2.supervised learning(epoch#2)-1

๋„ˆ๋ฌด๊ฐ„๋‹จํ•œ๋ชจ๋ธ๊ณผ์†Œ์ ํ•ฉ

10

์ง‘์ด์žˆ๋Š”๊ณ ๊ฐ์€๋ชจ๋‘์š”ํŠธ๋ฅผ์‚ด๊ฒƒ์ด๋‹ค.

Page 11: 2.supervised learning(epoch#2)-1

๋ชจ๋ธ๋ณต์žก๋„๊ณก์„ 

11

๋ฐ์ดํ„ฐ์—๋Œ€ํ•œ๋ฏผ๊ฐ์„ฑ์•Œ๊ณ ๋ฆฌ์ฆ˜์ž์ฒด์˜ค์ฐจ(y = ax + b ์˜ b๊ฐ€์•„๋‹˜)

ํŽธํ–ฅ-๋ถ„์‚ฐํŠธ๋ ˆ์ด๋“œ์˜คํ”„bias-variance tradeoff๋ผ๊ณ ๋„๋ถ€๋ฆ…๋‹ˆ๋‹ค.

Page 12: 2.supervised learning(epoch#2)-1

๋ฐ์ดํ„ฐ์…‹๊ณผ๋ณต์žก๋„๊ด€๊ณ„

๋ฐ์ดํ„ฐ๊ฐ€๋งŽ์œผ๋ฉด๋‹ค์–‘์„ฑ์ด์ปค์ ธ๋ณต์žกํ•œ๋ชจ๋ธ์„๋งŒ๋“ค์ˆ˜์žˆ์Šต๋‹ˆ๋‹ค.

10,000๋ช…์˜๊ณ ๊ฐ๋ฐ์ดํ„ฐ์—์„œ 45์„ธ์ด์ƒ, ์ž๋…€์…‹๋ฏธ๋งŒ, ์ดํ˜ผํ•˜์ง€์•Š์€๊ณ ๊ฐ์ด

์š”ํŠธ๋ฅผ์‚ฌ๋ คํ•œ๋‹ค๋ฉด ์ด์ „๋ณด๋‹คํ›จ์”ฌ์‹ ๋ขฐ๋†’์€๋ชจ๋ธ์ด๋ผ๊ณ ํ• ์ˆ˜์žˆ์Šต๋‹ˆ๋‹ค.

12

ํ›ˆ๋ จ

ํ…Œ์ŠคํŠธ

(less complex) (more complex)

Page 13: 2.supervised learning(epoch#2)-1

๋ฐ์ดํ„ฐ์…‹

forge ๋ฐ์ดํ„ฐ์…‹

์ธ์œ„์ ์œผ๋กœ๋งŒ๋“ ์ด์ง„๋ถ„๋ฅ˜๋ฐ์ดํ„ฐ์…‹, 26๊ฐœ์˜๋ฐ์ดํ„ฐํฌ์ธํŠธ, 2๊ฐœ์˜์†์„ฑ

13

Page 14: 2.supervised learning(epoch#2)-1

๋ฐ์ดํ„ฐ์…‹

wave ๋ฐ์ดํ„ฐ์…‹

์ธ์œ„์ ์œผ๋กœ๋งŒ๋“ ํšŒ๊ท€๋ฐ์ดํ„ฐ์…‹, 40๊ฐœ์˜๋ฐ์ดํ„ฐํฌ์ธํŠธ, 1๊ฐœ์˜์†์„ฑ

14

Page 15: 2.supervised learning(epoch#2)-1

๋ฐ์ดํ„ฐ์…‹

์œ„์Šค์ฝ˜์‹ ์œ ๋ฐฉ์•”๋ฐ์ดํ„ฐ์…‹

์•…์„ฑMalignant(1)/์–‘์„ฑBenign(0)์˜ ์ด์ง„๋ถ„๋ฅ˜, load_breast_cancer(),

569๊ฐœ์˜๋ฐ์ดํ„ฐํฌ์ธํŠธ, 30๊ฐœ์˜ํŠน์„ฑ

๋ณด์Šคํ„ด์ฃผํƒ๊ฐ€๊ฒฉ๋ฐ์ดํ„ฐ์…‹

1970๋…„๋Œ€๋ณด์Šคํ„ด์ฃผ๋ณ€์˜์ฃผํƒํ‰๊ท ๊ฐ€๊ฒฉ์˜ˆ์ธก, ํšŒ๊ท€๋ฐ์ดํ„ฐ์…‹, load_boston(),

506๊ฐœ์˜๋ฐ์ดํ„ฐํฌ์ธํŠธ, 13๊ฐœ์˜ํŠน์„ฑ

ํ™•์žฅ๋ณด์Šคํ„ด์ฃผํƒ๊ฐ€๊ฒฉ๋ฐ์ดํ„ฐ์…‹

ํŠน์„ฑ๋ผ๋ฆฌ๊ณฑํ•˜์—ฌ์ƒˆ๋กœ์šดํŠน์„ฑ์„๋งŒ๋“ฆ(ํŠน์„ฑ๊ณตํ•™, 4์žฅ, PolynomialFeatures),

mglearn.datasets.load_extended_boston(), 104๊ฐœ์˜๋ฐ์ดํ„ฐํฌ์ธํŠธ

์ค‘๋ณต์กฐํ•ฉ๊ณต์‹์€ ๐‘›๐‘˜

= ๐‘›+๐‘˜โˆ’1๐‘˜

์ด๋ฏ€๋กœ 132

= 13+2โˆ’12

=14!

2! 14โˆ’2 != 91

15

์–‘์„ฑpositive, ์Œ์„ฑnegative

ํŠน์„ฑ์˜์ œ๊ณฑ์ด๋งŒ๋“ค์–ด์ง

Page 16: 2.supervised learning(epoch#2)-1

load_breast_cancer()

16

(์ƒ˜ํ”Œ, ํŠน์„ฑ)

Bunch ํด๋ž˜์Šค์˜ํ‚ค

ํƒ€๊นƒ

Page 17: 2.supervised learning(epoch#2)-1

load_boston()

17

์›๋ž˜ 13๊ฐœํŠน์„ฑ

ํ™•์žฅ๋œ 104๊ฐœํŠน์„ฑ์ด์ฑ…์˜์˜ˆ์ œ๋ฅผ์œ„ํ•ด์„œ๋งŒ๋“ ํ™•์žฅ๋œ๋ฐ์ดํ„ฐ์…‹

Page 18: 2.supervised learning(epoch#2)-1

k-์ตœ๊ทผ์ ‘์ด์›ƒ๋ถ„๋ฅ˜

18

Page 19: 2.supervised learning(epoch#2)-1

k-์ตœ๊ทผ์ ‘์ด์›ƒ๋ถ„๋ฅ˜

์ƒˆ๋กœ์šด๋ฐ์ดํ„ฐํฌ์ธํŠธ์—๊ฐ€๊นŒ์šด์ด์›ƒ์ค‘๋‹ค์ˆ˜ํด๋ž˜์Šคmajority voting๋ฅผ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

forge ๋ฐ์ดํ„ฐ์…‹์— 1-์ตœ๊ทผ์ ‘์ด์›ƒ, 3-์ตœ๊ทผ์ ‘์ด์›ƒ์ ์šฉํ•˜๋ฉด๋‹ค์Œ๊ณผ๊ฐ™์Šต๋‹ˆ๋‹ค.

19

์˜ˆ์ธก์ด๋ฐ”๋€œ

Page 20: 2.supervised learning(epoch#2)-1

k-NN ๋ถ„๋ฅ˜๊ธฐ

20

ํ›ˆ๋ จ์„ธํŠธ์™€ํ…Œ์ŠคํŠธ์„ธํŠธ๋กœ๋ถ„๋ฆฌ

๋ชจ๋ธ๊ฐ์ฒด์ƒ์„ฑ

๋ชจ๋ธํ‰๊ฐ€

๋ชจ๋ธํ•™์Šต

Page 21: 2.supervised learning(epoch#2)-1

KNeighborsClassifier ๋ถ„์„

21

more complex

๊ณผ๋Œ€์ ํ•ฉless complex

๊ณผ์†Œ์ ํ•ฉ

์ผ์ •ํ•œ๊ฐ„๊ฒฉ์œผ๋กœ์ƒ์„ฑํ•œํฌ์ธํŠธ์˜์˜ˆ์ธก๊ฐ’์„์ด์šฉํ•ด๊ฒฐ์ •๊ฒฝ๊ณ„๊ตฌ๋ถ„

Page 22: 2.supervised learning(epoch#2)-1

k-NN ๋ถ„๋ฅ˜๊ธฐ์˜๋ชจ๋ธ๋ณต์žก๋„

22

more complex less complex

์ตœ์ ์ 

๊ณผ๋Œ€์ ํ•ฉ

๊ณผ์†Œ์ ํ•ฉ

Page 23: 2.supervised learning(epoch#2)-1

k-์ตœ๊ทผ์ ‘์ด์›ƒํšŒ๊ท€

23

Page 24: 2.supervised learning(epoch#2)-1

k-์ตœ๊ทผ์ ‘์ด์›ƒํšŒ๊ท€

์ƒˆ๋กœ์šด๋ฐ์ดํ„ฐํฌ์ธํŠธ์—๊ฐ€๊นŒ์šด์ด์›ƒ์˜์ถœ๋ ฅ๊ฐ’ํ‰๊ท ์ด์˜ˆ์ธก์ด๋ฉ๋‹ˆ๋‹ค.

wave ๋ฐ์ดํ„ฐ์…‹(1์ฐจ์›)์— 1-์ตœ๊ทผ์ ‘์ด์›ƒ, 3-์ตœ๊ทผ์ ‘์ด์›ƒ์ ์šฉํ•˜๋ฉด๋‹ค์Œ๊ณผ๊ฐ™์Šต๋‹ˆ๋‹ค.

24

ํ…Œ์ŠคํŠธ๋ฐ์ดํ„ฐ

์˜ˆ์ธก

Page 25: 2.supervised learning(epoch#2)-1

k-NN ์ถ”์ •๊ธฐ

25

ํ›ˆ๋ จ์„ธํŠธ์™€ํ…Œ์ŠคํŠธ์„ธํŠธ๋กœ๋ถ„๋ฆฌ

๋ชจ๋ธ๊ฐ์ฒด์ƒ์„ฑ

๋ชจ๋ธํ‰๊ฐ€

๋ชจ๋ธํ•™์Šต

Page 26: 2.supervised learning(epoch#2)-1

ํšŒ๊ท€๋ชจ๋ธ์˜ํ‰๊ฐ€

ํšŒ๊ท€๋ชจ๋ธ์˜ score() ํ•จ์ˆ˜๋Š”๊ฒฐ์ •๊ณ„์ˆ˜ ๐‘…2๋ฅผ๋ฐ˜ํ™˜

๐‘…2 = 1 โˆ’ ๐‘–=0

๐‘› ๐‘ฆ โˆ’ ๐‘ฆ 2

๐‘–=0๐‘› ๐‘ฆ โˆ’ ๐‘ฆ 2 ๐‘ฆ:ํƒ€๊นƒ๊ฐ’ ๐‘ฆ:ํƒ€๊นƒ๊ฐ’์˜ํ‰๊ท  ๐‘ฆ:๋ชจ๋ธ์˜์˜ˆ์ธก

์™„๋ฒฝ์˜ˆ์ธก: ํƒ€๊นƒ๊ฐ’==์˜ˆ์ธก๋ถ„์ž==0, ๐‘…2 = 1

ํƒ€๊นƒ๊ฐ’์˜ํ‰๊ท ์ •๋„์˜ˆ์ธก: ๋ถ„์žโ‰ˆ๋ถ„๋ชจ, ๐‘…2 = 0์ด๋จ

ํ‰๊ท ๋ณด๋‹ค๋‚˜์˜๊ฒŒ์˜ˆ์ธกํ•˜๋ฉด์Œ์ˆ˜๊ฐ€๋ ์ˆ˜์žˆ์Œ

26

Page 27: 2.supervised learning(epoch#2)-1

KNeighborsRegressor ๋ถ„์„

27

more complex

๊ณผ๋Œ€์ ํ•ฉ

less complex

๊ณผ์†Œ์ ํ•ฉ

Page 28: 2.supervised learning(epoch#2)-1

์žฅ๋‹จ์ ๊ณผ๋งค๊ฐœ๋ณ€์ˆ˜

์ค‘์š”๋งค๊ฐœ๋ณ€์ˆ˜ํฌ์ธํŠธ์‚ฌ์ด์˜๊ฑฐ๋ฆฌ์ธก์ •๋ฐฉ๋ฒ•: metric ๋งค๊ฐœ๋ณ€์ˆ˜๊ธฐ๋ณธ๊ฐ’ โ€˜minkowskiโ€™์ด๊ณ 

p ๋งค๊ฐœ๋ณ€์ˆ˜๊ธฐ๋ณธ๊ฐ’ 2 ์ผ๋•Œ,

์œ ํด๋ฆฌ๋””์•ˆ๊ฑฐ๋ฆฌ ๐‘–=0๐‘š ๐‘ฅ1

(๐‘–)โˆ’ ๐‘ฅ2

(๐‘–) 2

์ด์›ƒ์˜์ˆ˜(n_neighbors): 3๊ฐœ๋‚˜ 5๊ฐœ(๊ธฐ๋ณธ๊ฐ’)๊ฐ€ ๋ณดํŽธ์ 

์žฅ์  ์ดํ•ดํ•˜๊ธฐ์‰ฌ์›€, ํŠน๋ณ„ํ•œ์กฐ์ •์—†์ด์ž˜์ž‘๋™, ์ฒ˜์Œ์‹œ๋„ํ•˜๋Š”๋ชจ๋ธ๋กœ์ ํ•ฉ,

๋น„๊ต์ ๋ชจ๋ธ์„๋น ๋ฅด๊ฒŒ๋งŒ๋“ค์ˆ˜์žˆ์Œ

๋‹จ์  ํŠน์„ฑ๊ฐœ์ˆ˜๋‚˜์ƒ˜ํ”Œ๊ฐœ์ˆ˜๊ฐ€ํฌ๋ฉด์˜ˆ์ธก์ด๋Š๋ฆผ,

๋ฐ์ดํ„ฐ์ „์ฒ˜๋ฆฌ์ค‘์š”(์Šค์ผ€์ผ์ด ์ž‘์€ํŠน์„ฑ์—์˜ํ–ฅ์„์ค„์ด์ง€์•Š์œผ๋ ค๋ฉด์ •๊ทœํ™”ํ•„์š”),

(์ˆ˜๋ฐฑ๊ฐœ์ด์ƒ์˜) ๋งŽ์€ํŠน์„ฑ์„๊ฐ€์ง„๋ฐ์ดํ„ฐ์…‹์—๋Š”์ž˜์ž‘๋™ํ•˜์ง€์•Š์Œ,

ํฌ์†Œํ•œ๋ฐ์ดํ„ฐ์…‹์— ์ž˜์ž‘๋™ํ•˜์ง€์•Š์Œ

28

Page 29: 2.supervised learning(epoch#2)-1

์„ ํ˜•๋ชจ๋ธ - ํšŒ๊ท€

29

Page 30: 2.supervised learning(epoch#2)-1

ํšŒ๊ท€์˜์„ ํ˜•๋ชจ๋ธ

์„ ํ˜•ํ•จ์ˆ˜(linear model)๋ฅผ์‚ฌ์šฉํ•˜์—ฌ์˜ˆ์ธก

๐‘ฆ = ๐‘ค 0 ร— ๐‘ฅ 0 + ๐‘ค 1 ร— ๐‘ฅ 1 + โ‹ฏ+ ๐‘ค ๐‘ ร— ๐‘ฅ ๐‘ + ๐‘

๐‘ฆ = ๐‘ค0 ร— ๐‘ฅ0 + ๐‘ค1 ร— ๐‘ฅ1 + โ‹ฏ+ ๐‘ค๐‘ ร— ๐‘ฅ๐‘ + ๐‘

ํŠน์„ฑ : ๐‘ฅ 0 ~๐‘ฅ ๐‘ ํŠน์„ฑ๊ฐœ์ˆ˜: (p + 1)

๋ชจ๋ธํŒŒ๋ผ๋ฏธํ„ฐmodel parameter: ๐‘ค 0 ~๐‘ค ๐‘ , ๐‘

๐œ” : ๊ฐ€์ค‘์น˜weight, ๊ณ„์ˆ˜coefficient, ๐œƒ, ๐›ฝ e.g. model.coef_

๐˜ฃ : ์ ˆํŽธintercept, ํŽธํ–ฅbias e.g. model.intercept_

ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐhyperparameter: ํ•™์Šต๋˜์ง€์•Š๊ณ ์ง์ ‘์„ค์ •ํ•ด์ฃผ์–ด์•ผํ•จ(๋งค๊ฐœ๋ณ€์ˆ˜)

e.g. KNeighborsRegressor์˜ n_neighbors

30

์‚ฌ์šฉ์ž๊ฐ€์ง€์ •ํ•œ๋งค๊ฐœ๋ณ€์ˆ˜์™€๊ตฌ๋ถ„

Page 31: 2.supervised learning(epoch#2)-1

wave ๋ฐ์ดํ„ฐ์…‹(ํŠน์„ฑ 1๊ฐœ)์œผ๋กœ๋น„๊ต

31

์„ ํ˜•๋ชจ๋ธ k-์ตœ๊ทผ์ ‘์ด์›ƒ

๐‘ฆ = ๐‘ค 0 ร— ๐‘ฅ 0 + ๐‘

k-์ตœ๊ทผ์ ‘์ด์›ƒ์—๋น„ํ•ด๋‹จ์ˆœํ•ด๋ณด์ด์ง€๋งŒํŠน์„ฑ์ด๋งŽ์œผ๋ฉด์˜คํžˆ๋ ค๊ณผ๋Œ€์ ํ•ฉ๋˜๊ธฐ์‰ฝ์Šต๋‹ˆ๋‹ค.

์„ ํ˜•๋ชจ๋ธ์€ํŠน์„ฑ์ด๋‘๊ฐœ์ด๋ฉดํ‰๋ฉด, ์„ธ๊ฐœ์ด์ƒ์€์ดˆํ‰๋ฉดhyperplane์ด๋ฉ๋‹ˆ๋‹ค.

Page 32: 2.supervised learning(epoch#2)-1

์ตœ์†Œ์ œ๊ณฑ๋ฒ•OLS, ordinary least squares

ํ‰๊ท ์ œ๊ณฑ์˜ค์ฐจmean square error(MSE =1

๐‘› ๐‘–=0

๐‘› ๐‘ฆ๐‘– โˆ’ ๐‘ฆ๐‘–2)๋ฅผ์ตœ์†Œํ™”

LinearRegression: ์ •๊ทœ๋ฐฉ์ •์‹normal equation ๐›ฝ = ๐‘‹๐‘‡๐‘‹ โˆ’1๐‘‹๐‘‡๐‘ฆ์„์‚ฌ์šฉํ•˜์—ฌ w, b๋ฅผ๊ตฌํ•จ

32

ํ›ˆ๋ จ์„ธํŠธ์™€ํ…Œ์ŠคํŠธ์„ธํŠธ๋กœ๋ถ„๋ฆฌ

๋ชจ๋ธํ‰๊ฐ€

๋ชจ๋ธ๊ฐ์ฒด์ƒ์„ฑ & ํ•™์Šต

๊ณผ์†Œ์ ํ•ฉ(1์ฐจ์›๋ฐ์ดํ„ฐ์…‹์ด๋ผ์˜ˆ์ƒํ•œ๋Œ€๋กœ)

๊ฐ€์ค‘์น˜๋Š”ํŠน์„ฑ์˜๊ฐœ์ˆ˜๋งŒํผ

Page 33: 2.supervised learning(epoch#2)-1

์ตœ์†Œ์ œ๊ณฑ๋ฒ•OLS, ordinary least squares

๋ณด์Šคํ„ด์ฃผํƒ๊ฐ€๊ฒฉ๋ฐ์ดํ„ฐ์…‹, 506๊ฐœ์ƒ˜ํ”Œ, 104๊ฐœํŠน์„ฑ

33

ํ›ˆ๋ จ์„ธํŠธ์™€ํ…Œ์ŠคํŠธ์„ธํŠธ์˜ R2 ์ ์ˆ˜์ฐจ์ด๊ฐ€ํผํŠน์„ฑ์ด๋งŽ์•„๊ฐ€์ค‘์น˜๊ฐ€ํ’๋ถ€ํ•ด์ ธ(104๊ฐœ์˜์ฐจ์›) ๊ณผ๋Œ€์ ํ•ฉ๋จ

๋™์ผํ•œ๊ธฐ๋ณธ๋งค๊ฐœ๋ณ€์ˆ˜๋กœํ›ˆ๋ จ

Page 34: 2.supervised learning(epoch#2)-1

๋ฆฟ์ง€ridge

์„ ํ˜•๋ชจ๋ธ(MSE ์ตœ์†Œํ™”) + ๊ฐ€์ค‘์น˜์ตœ์†Œํ™”(๊ฐ€๋Šฅํ•œ 0์—๊ฐ€๊น๊ฒŒ)

L2 ๊ทœ์ œregularization : L2 ๋…ธ๋ฆ„norm์˜์ œ๊ณฑ ๐‘ค 22 = ๐‘—=1

๐‘š ๐‘ค๐‘—2

๋น„์šฉํ•จ์ˆ˜cost function : ๐‘€๐‘†๐ธ + ๐›ผ ๐‘—=1๐‘š ๐‘ค๐‘—

2

์†์‹คํ•จ์ˆ˜loss function, ๋ชฉ์ ํ•จ์ˆ˜objective function ๋ผ๊ณ ๋„๋ถ€๋ฆ„

๐›ผ๊ฐ€ํฌ๋ฉดํŽ˜๋„ํ‹ฐ๊ฐ€์ปค์ ธ ๐‘ค๐‘—๊ฐ€์ž‘์•„์ ธ์•ผํ•จ(๊ณผ๋Œ€์ ํ•ฉ๋ฐฉ์ง€)

๐‘ค๐‘—๊ฐ€ 0์—๊ฐ€๊น๊ฒŒ๋˜์ง€๋งŒ 0์ด๋˜์ง€๋Š”์•Š์Œ

34

์ตœ์ ๊ฐ’ํŽ˜๋„ํ‹ฐpenalty

โบโ†‘

โบโ†“

Page 35: 2.supervised learning(epoch#2)-1

Ridge ํด๋ž˜์Šค

35

(๊ฐ€์ค‘์น˜๊ฐ€๊ทœ์ œ๋˜์–ด) ๊ณผ๋Œ€์ ํ•ฉ์ด์ค„๊ณ ํ…Œ์ŠคํŠธ์„ธํŠธ์ ์ˆ˜๊ฐ€์ƒ์Šน๋จ

๊ธฐ๋ณธ๊ฐ’ alpha=1.0

์ œ์•ฝ์ด๋„ˆ๋ฌด์ปค์ง๊ณผ์†Œ์ ํ•ฉ alpha=0.00001

๋กœํ•˜๋ฉด๋น„์Šทํ•ด์ง

์ตœ์†Œ์ œ๊ณฑ๋ฒ•

Page 36: 2.supervised learning(epoch#2)-1

Ridge.coef_

36

alpha ๊ฐ’์—๋”ฐ๋ฅธ coef_ ๊ฐ’์˜๋ณ€ํ™”

Page 37: 2.supervised learning(epoch#2)-1

๊ทœ์ œ์™€ํ›ˆ๋ จ๋ฐ์ดํ„ฐ์˜๊ด€๊ณ„

๋ณด์Šคํ„ด์ฃผํƒ๊ฐ€๊ฒฉ๋ฐ์ดํ„ฐ์…‹์˜ํ•™์Šต๊ณก์„  : LinearRegression vs Ridge(alpha=1)

37

ํ›ˆ๋ จ์„ธํŠธ์˜์ ์ˆ˜๋Š”๋ฆฟ์ง€๊ฐ€๋”๋‚ฎ์Œ

ํ…Œ์ŠคํŠธ์„ธํŠธ์˜์ ์ˆ˜๋Š”๋ฆฟ์ง€๊ฐ€๋”๋†’์Œ

๋ฐ์ดํ„ฐ๊ฐ€์ ์„๋•LinearRegression ํ•™์Šต์•ˆ๋จR2 < 0

(4~10 samples per weight)

๋ฐ์ดํ„ฐ๊ฐ€๋งŽ์œผ๋ฉด๊ทœ์ œํšจ๊ณผ๊ฐ์†Œ(๋ณต์žกํ•œ๋ชจ๋ธ์ด๊ฐ€๋Šฅํ•ด์ง)

๋ฐ์ดํ„ฐ๊ฐ€๋งŽ์•„์ง€๋ฉด๊ณผ๋Œ€์ ํ•ฉ์ค„์–ด๋“ฌ

Page 38: 2.supervised learning(epoch#2)-1

๋ผ์˜Lasso

38

์„ ํ˜•๋ชจ๋ธ(MSE ์ตœ์†Œํ™”) + ๊ฐ€์ค‘์น˜์ตœ์†Œํ™”(๊ฐ€๋Šฅํ•œ 0์œผ๋กœ)

L1 ๊ทœ์ œ : L1 ๋…ธ๋ฆ„ ๐‘ค 1 = ๐‘—=1๐‘š ๐‘ค๐‘—

๋น„์šฉํ•จ์ˆ˜cost function : ๐‘€๐‘†๐ธ + ๐›ผ ๐‘—=1๐‘š ๐‘ค๐‘—

๐›ผ๊ฐ€ํฌ๋ฉดํŽ˜๋„ํ‹ฐ๊ฐ€์ปค์ ธ ๐‘ค๐‘—๊ฐ€๋”์ž‘์•„์ ธ์•ผํ•จ

๐‘ค๐‘—๊ฐ€ 0์ด๋ ์ˆ˜์žˆ์Œ(ํŠน์„ฑ์„ ํƒ์˜ํšจ๊ณผ)

์ผ๋ถ€๊ณ„์ˆ˜๊ฐ€ 0์ด๋˜๋ฉด๋ชจ๋ธ์„์ดํ•ดํ•˜๊ธฐ์‰ฝ๊ณ 

์ค‘์š”ํ•œํŠน์„ฑ์„ํŒŒ์•…ํ•˜๊ธฐ์‰ฝ์Šต๋‹ˆ๋‹ค.

์ตœ์ ๊ฐ’

โบโ†‘

โบโ†“

Page 39: 2.supervised learning(epoch#2)-1

Lasso

39

alpha=1.0, max_iter=1000

๊ณผ์†Œ์ ํ•ฉ(๊ทœ์ œ๊ฐ€๋„ˆ๋ฌดํผ), 4๊ฐœ์˜ํŠน์„ฑ๋งŒํ™œ์šฉ๋จ

๊ทœ์ œ๋ฅผ๋„ˆ๋ฌด๋‚ฎ์ถ”๋ฉด LinearRegression๊ณผ๋น„์Šท

์ขŒํ‘œํ•˜๊ฐ•๋ฒ•coordinate descent

์ตœ์†Œ์ œ๊ณฑ๋ฒ•

๋ฆฟ์ง€์™€๋น„์Šท

Page 40: 2.supervised learning(epoch#2)-1

Lasso.coef_

40

alpha ๊ฐ’์—๋”ฐ๋ฅธ coef_ ๊ฐ’์˜๋ณ€ํ™”

Page 41: 2.supervised learning(epoch#2)-1

Ridge vs Lasso

์ผ๋ฐ˜์ ์œผ๋กœ๋ฆฟ์ง€๊ฐ€๋ผ์˜๋ณด๋‹ค์„ ํ˜ธ๋จ

L2 ํŽ˜๋„ํ‹ฐ๊ฐ€ L1 ํŽ˜๋„ํ‹ฐ๋ณด๋‹ค์„ ํ˜ธ๋จ (SGD์—์„œ ๊ฐ•ํ•œ์ˆ˜๋ ด)

๋งŽ์€ํŠน์„ฑ์ค‘์ผ๋ถ€๋งŒ์ค‘์š”ํ•˜๋‹ค๊ณ ํŒ๋‹จ๋˜๋ฉด๋ผ์˜

๋ถ„์„ํ•˜๊ณ ์ดํ•ดํ•˜๊ธฐ์‰ฌ์šด๋ชจ๋ธ์„์›ํ• ๋•Œ๋Š”๋ผ์˜

41

Page 42: 2.supervised learning(epoch#2)-1

ElasticNet

๋ฆฟ์ง€์™€๋ผ์˜ํŽ˜๋„ํ‹ฐ๊ฒฐํ•ฉ (R์˜ glmnet)

alpha, l1_ratio ๋งค๊ฐœ๋ณ€์ˆ˜๋กœ L1 ๊ทœ์ œ์™€ L2 ๊ทœ์ œ์˜์–‘์„์กฐ์ ˆ

์‚ฌ์‹ค Lasso๋Š” ElasticNet(l1_ratio=1.0)์™€๋™์ผ

42

๐‘€๐‘†๐ธ + ๐›ผ ร— ๐‘™1_๐‘Ÿ๐‘Ž๐‘ก๐‘–๐‘œ

๐‘—=1

๐‘š

๐‘ฅ๐‘— +1

2๐›ผ ร— 1 โˆ’ ๐‘™1_๐‘Ÿ๐‘Ž๐‘ก๐‘–๐‘œ

๐‘—=1

๐‘š

๐‘ค๐‘—2

๐‘™1 = ๐›ผ ร— ๐‘™1_๐‘Ÿ๐‘Ž๐‘ก๐‘–๐‘œ ๐‘™2 = ๐›ผ ร— 1 โˆ’ ๐‘™1_๐‘Ÿ๐‘Ž๐‘ก๐‘–๐‘œ

๐›ผ = ๐‘™1 + ๐‘™2 ๐‘™1_๐‘Ÿ๐‘Ž๐‘ก๐‘–๐‘œ =๐‘™1

๐‘™1 + ๐‘™2๐‘™1๊ณผ ๐‘™2์—๋งž์ถ”์–ด๐›ผ์™€ ๐‘™1_ratio์„์กฐ์ ˆ