Let's Learn Machine Learning with TensorFlow

Post on 21-Apr-2017

Category:

Data & Analytics

Let's Learn: Machine Learning with TensorFlow

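Name: Jang Hoon (장훈)
Age: 30 (1988-06-16)
Email: lunawyrd@gmail.com
Blog: http://devhoon.tistory.com
Github: https://github.com/jang-hoon

[Career highlights]
2016.08 ~ present: EXEM Co., Ltd., development of an APM (monitoring program for WAS and web servers)
2015.09 ~ 2016.03: 두꺼비세상 Co., Ltd., development of the 두꺼비세상 Android app and of REST, Batch, and Gateway servers with Spring Boot
2015.04 ~ 2015.08: 옐로쇼핑미디어 Co., Ltd., development of the 쿠차 Android app
2009.12 ~ 2013.02: 아이콘랩 Co., Ltd., Android and BlackBerry app development (alternative military service as industrial technical personnel)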

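Let's build an app that recognizes handwriting

1. The app takes handwriting as input,

2. sends the handwriting to a server,

3. the server analyzes the handwriting it received,

4. and returns the result to the app.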

How do we analyze handwriting?

After some searching...
Basic principle: Perceptron

Learning method: Regression

Classification method: Classification

activation(y = Wx + b) = 1 or 0

Perceptron algorithm (1957)

Frank Rosenblatt (July 11, 1928 – July 11, 1971)
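
The activation rule on this slide can be sketched in a few lines of plain Python. This is only an illustration: the slide does not say where the cutoff between 1 and 0 lies, so the `y > 0` threshold and the AND-gate weights `W_AND` and `B_AND` below are assumptions chosen for the example.

```python
def perceptron(x, W, b):
    """Return 1 if the weighted sum W.x + b is positive, otherwise 0."""
    y = sum(w_i * x_i for w_i, x_i in zip(W, x)) + b
    return 1 if y > 0 else 0  # assumed threshold at 0


# Hypothetical weights that make the perceptron act like a logical AND.
W_AND = [1.0, 1.0]
B_AND = -1.5

for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(inputs, perceptron(inputs, W_AND, B_AND))
```

Only the input (1, 1) pushes the weighted sum above zero, so it is the only case that activates.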

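Regression

"A statistical analysis technique that treats one variable (Y) as explained by another variable (X) and investigates the functional relationship between them (Y = WX + b)."

- Google

ex) How does ice cream sales volume (Y) change as the average temperature (X) rises?

y = Wx + b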

x    y (actual data)    y_ (predicted data)
1    1                  ?
2    2                  ?
3    3                  ?
4    4                  ?
5    5                  ?

How do we choose W and b?

Hypothesis

y = 1x + 0

The value we have in mind

Goodness-of-fit check (mean squared error, MSE)

Cost function

$$ \text{cost} = \frac{1}{m} \sum (y - y\_)^2 $$

y (actual)    y_ (predicted)
1             ?
2             ?
3             ?
4             ?
5             ?
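
To make the cost function concrete, here is a small sketch that evaluates the mean squared error for the table's data under two hypotheses. The function names `mse` and `hypothesis` are made up for this example, and the second hypothesis (W = 2) is an arbitrary wrong guess added for contrast.

```python
def mse(ys, predictions):
    """Mean squared error: average of (y - y_)^2 over all m samples."""
    m = len(ys)
    return sum((y - y_) ** 2 for y, y_ in zip(ys, predictions)) / m


def hypothesis(x, W, b):
    """Linear hypothesis y_ = W * x + b."""
    return W * x + b


xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]

# The hypothesis from the slides, y = 1x + 0, fits the data exactly.
good = mse(ys, [hypothesis(x, 1, 0) for x in xs])
# A deliberately wrong guess, y = 2x + 0, is penalized.
bad = mse(ys, [hypothesis(x, 2, 0) for x in xs])

print(good, bad)
```

The better the hypothesis matches the actual data, the closer the cost gets to 0.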

How do we minimize the cost?

Gradient Descent Algorithm
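
Before handing the work to TensorFlow, it may help to see gradient descent written out by hand. The sketch below minimizes the MSE cost for y = Wx + b using the same training data, learning rate (0.1), and step count (2001) as the TensorFlow slides. Starting from W = 0 and b = 0 instead of random values is an assumption made here so the run is repeatable, and the function name is hypothetical.

```python
def gradient_descent(xs, ys, learning_rate=0.1, steps=2001):
    """Fit y = W * x + b by repeatedly stepping against the MSE gradient."""
    W, b = 0.0, 0.0  # assumed starting point (the slides use random values)
    m = len(xs)
    for _ in range(steps):
        errors = [(W * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of cost = (1/m) * sum(error^2)
        grad_W = 2.0 / m * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / m * sum(errors)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    return W, b


W, b = gradient_descent([1, 2, 3], [1, 2, 3])
print(round(W, 4), round(b, 4))
```

For this data the values settle near W = 1 and b = 0. A learning rate that is too large makes the updates overshoot and diverge, so it has to be tuned to the data.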

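Setting up the hypothesis

    # TensorFlow 1.x API (placeholders and sessions)
    import tensorflow as tf

    # Training data
    x_data = [1, 2, 3]
    y_data = [1, 2, 3]

    # W and b start at random values in [-1, 1); X and Y are fed in at run time
    W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
    b = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
    X = tf.placeholder(tf.float32)
    Y = tf.placeholder(tf.float32)

    # Hypothesis: Y_ = W * X + b
    hypothesis = W * X + b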

Cost function and error correction

    # Cost function: cost = mean((Y_ - Y)^2)
    cost = tf.reduce_mean(tf.square(hypothesis - Y))

    # Gradient descent adjusts W and b in the direction that reduces cost
    a = tf.Variable(0.1)  # learning rate (alpha)
    optimizer = tf.train.GradientDescentOptimizer(a)
    train = optimizer.minimize(cost)

Training by running TensorFlow

    # tf.global_variables_initializer() replaces the deprecated tf.initialize_all_variables()
    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)

    # Training
    for step in range(2001):
        # hypothesis = W * X + b
        # cost = tf.reduce_mean(tf.square(hypothesis - Y))
        # train = tf.train.GradientDescentOptimizer(0.1).minimize(cost)
        sess.run(train, feed_dict={X: x_data, Y: y_data})
        if step % 20 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W), sess.run(b))

Testing the trained model

    # Test the hypothesis, hypothesis = W * X + b
    print(sess.run(hypothesis, feed_dict={X: 5}))
    print(sess.run(hypothesis, feed_dict={X: 2.5}))

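Classification

"Classification is a general process related to categorization, the process in which ideas and objects are recognized, differentiated and understood."

- Wikipedia

ex) Sorting news articles into weather, economy, entertainment, sports, and so on

Sorting mail into regular mail and spam

Telling whether a credit card usage pattern is normal or unusual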

Logistic Regression (Binary Classification)

X (study hours)    Y (pass or fail)
1                  0
2                  0
3                  1
4                  1
5                  1

[Figure: Y = Wx + b plotted against study hours (x-axis ticks 1 to 5); the y-axis is marked 0 (fail), 0.5, and 1 (pass)]
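
One common way to turn Y = Wx + b into a pass/fail decision is to squash it with the sigmoid function and compare the result against 0.5, the value marked on the figure's y-axis. The slide does not name the sigmoid, so treat it as standard logistic regression background and not as the author's exact method. The weights W = 2 and b = -5 are hand-picked for this sketch, not learned.

```python
import math


def sigmoid(z):
    """Map any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_pass(hours, W, b, threshold=0.5):
    """Return 1 (pass) if the predicted probability reaches the threshold, else 0 (fail)."""
    probability = sigmoid(W * hours + b)
    return 1 if probability >= threshold else 0


# Hypothetical weights: the decision boundary sits at 2.5 study hours.
W, b = 2.0, -5.0

study_hours = [1, 2, 3, 4, 5]
print([predict_pass(h, W, b) for h in study_hours])
```

With these weights the predictions reproduce the Y column of the table: fail for 1 and 2 hours, pass from 3 hours on.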

Multinomial Classification

X1 (study hours)    X2 (attendance)    Y (grade)
3                   1                  C
2                   3                  B
3                   5                  A
2                   3                  B
5                   1                  C

[Figure: grades A, B, and C plotted by attendance and study hours, with three lines labeled Y = Wx + b]

Softmax Normalization

$$ N(x_i) = \frac{x_i}{\sum_n x_n} $$
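
The slide shows the normalization idea: divide each score by the total so the results sum to 1. The softmax function (the one `tf.nn.softmax` computes later in the slides) first exponentiates each score and then normalizes. A minimal sketch follows; the example scores are made up, and subtracting the maximum is a common numerical-stability trick added here, not something from the slides.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum does not change the result but avoids overflow.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for classes A, B, C
print(probs, sum(probs))
```

The largest score receives the largest probability, and the order of the classes is preserved.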

Goodness-of-fit check (mean squared error, MSE)

Cross Entropy
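
For classification, the cost used in the training code later in the slides is the cross entropy, cost = -∑ y * log(y_), where y is a one-hot label and y_ is the predicted probability vector. Below is a small sketch with invented probabilities. Skipping terms where y is 0 is a choice made here so that log(0) is never evaluated.

```python
import math


def cross_entropy(one_hot_label, predicted_probs):
    """cost = -sum(y * log(y_)); only the true class contributes for a one-hot label."""
    return -sum(
        y * math.log(y_)
        for y, y_ in zip(one_hot_label, predicted_probs)
        if y > 0
    )


label = [0, 1, 0]  # the true class is the second one

confident = cross_entropy(label, [0.1, 0.8, 0.1])  # mostly right
wrong = cross_entropy(label, [0.8, 0.1, 0.1])      # mostly wrong

print(confident, wrong)
```

A prediction that puts high probability on the true class gets a small cost, while a confident wrong prediction is punished heavily.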

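Let's build an app that recognizes handwriting

1. Preprocess the digit drawn in the app into a form that can be analyzed

2. Send the preprocessed data to the server

3. The server applies the preprocessed data to a pre-trained model (y = Wx + b)

4. The model's output is used to find the best-matching digit

5. The digit found is returned to the app

Handwriting recognition app demo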

Image resizing (Android)

    Bitmap image = MainActivity.getBitmapFromView(mDrawingView);
    int width = image.getWidth();
    int height = image.getHeight();

    // Scale to a width of 28 px, keeping the aspect ratio.
    // Float division avoids a zero scale when the view is narrower than 28 px.
    int scaleWidth = 28;
    float scale = (float) width / scaleWidth;
    int scaleHeight = Math.round(height / scale);

    Bitmap newImage = Bitmap.createScaledBitmap(image, scaleWidth, scaleHeight, false);

Padding the missing area (Android)

    List<Float> pixelList = new ArrayList<>();
    StringBuilder builder = new StringBuilder();

    // Two blank rows (all 0.0) above the image
    for (int w = 0; w < newImage.getWidth(); w++) {
        pixelList.add(0.0f);
    }

    for (int w = 0; w < newImage.getWidth(); w++) {
        pixelList.add(0.0f);
    }

Thresholding (Android)

    for (int h = 0; h < newImage.getHeight(); h++) {
        builder.setLength(0);
        for (int w = 0; w < newImage.getWidth(); w++) {
            // getPixel returns an ARGB int; -1 (0xFFFFFFFF) is opaque white
            int pixel = newImage.getPixel(w, h);
            float color = (pixel == -1) ? 0.0f : 1.0f;
            builder.append(color + " ");
            pixelList.add(color);
        }
    }

    // One blank row below the image
    for (int w = 0; w < newImage.getWidth(); w++) {
        pixelList.add(0.0f);
    }
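
The model expects exactly 28 x 28 = 784 values, so the thresholded rows plus the padding rows have to add up to 28 rows. The Python sketch below mirrors the two Android steps for illustration only. Unlike the Android code, which pads a fixed number of rows, it computes how many blank rows are missing and splits them above and below the image. That centering, the function names, and the 25-row sample image are all assumptions, and the sketch expects an image no taller than 28 rows.

```python
def threshold_rows(pixels, background=-1):
    """Map each ARGB int to 0.0 (background, opaque white) or 1.0 (ink)."""
    return [[0.0 if p == background else 1.0 for p in row] for row in pixels]


def pad_rows(rows, size=28):
    """Add blank rows above and below until there are `size` rows, then flatten."""
    width = len(rows[0])
    missing = size - len(rows)
    top = missing // 2
    bottom = missing - top
    blank = [0.0] * width
    padded = [blank[:] for _ in range(top)] + rows + [blank[:] for _ in range(bottom)]
    return [value for row in padded for value in row]


# Hypothetical 28-wide, 25-high image: white background with one black pixel.
image = [[-1] * 28 for _ in range(25)]
image[10][14] = -16777216  # 0xFF000000 as a signed 32-bit int, opaque black

flat = pad_rows(threshold_rows(image))
print(len(flat), sum(flat))
```

The flattened list has 784 entries, which is the shape the server-side model is fed.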

Loading the samples and initialization

    from tensorflow.examples.tutorials.mnist import input_data
    import tensorflow as tf
    import numpy as np

    # Load the training data
    mnist = input_data.read_data_sets("MNIST__data/", one_hot=True)

    x = tf.placeholder("float", [None, 784])  # [?] x [784]
    y = tf.placeholder("float", [None, 10])   # [?] x [10]
    W = tf.Variable(tf.zeros([784, 10]))      # [784] x [10]
    b = tf.Variable(tf.zeros([10]))           # [10]

Softmax, cost function, and error correction

    # y_ = softmax(Wx + b)
    y_ = tf.nn.softmax(tf.matmul(x, W) + b)

    # cost = -∑(y * log(y_))
    cross_entropy = -tf.reduce_sum(y * tf.log(y_))

    # Gradient descent adjusts W and b in the direction that reduces cost
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

Training by running TensorFlow

    sess = tf.Session()
    # tf.global_variables_initializer() replaces the deprecated tf.initialize_all_variables()
    sess.run(tf.global_variables_initializer())

    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # y_ = tf.nn.softmax(tf.matmul(x, W) + b)
        # cross_entropy = -tf.reduce_sum(y * tf.log(y_))
        # train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
        sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})

    # A prediction is correct when the most probable class matches the label
    correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
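
The accuracy computation compares the index of the largest predicted probability with the index of the 1 in the one-hot label, then averages the matches. The same logic in plain Python, with four invented three-class predictions, might look like this (the helper names are hypothetical):

```python
def argmax(values):
    """Index of the largest value, like tf.argmax along one row."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows where the predicted class equals the labeled class."""
    correct = [argmax(p) == argmax(l) for p, l in zip(predictions, labels)]
    return sum(correct) / len(correct)


predictions = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],  # predicts class 0, but the label says class 1
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]

print(accuracy(predictions, labels))
```

Three of the four rows match, so the sketch reports an accuracy of 0.75.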

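Analysis with the trained model

    # Imports assume a Django view, which matches the HttpResponse and request.body usage
    import json
    from django.http import HttpResponse

    def analysis(request):
        message = "Test"
        if request.method == "POST":
            # The request body is a JSON array of 784 pixel values
            body_unicode = request.body.decode('utf-8')
            body = json.loads(body_unicode)

            test_array = np.array(body)
            test_array = test_array.reshape((1, 784))
            # y_ = tf.nn.softmax(tf.matmul(x, W) + b)
            test_result = sess.run(tf.argmax(y_, 1), feed_dict={x: test_array})
            print(test_result[0])
            message = test_result[0]

        return HttpResponse(message)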
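
The view reads the request body as a JSON array and reshapes it to one row of 784 values. The sketch below shows what that payload could look like on the wire and how it round-trips. The helper names `build_payload` and `parse_payload` and the length check are assumptions for illustration, not part of the original app.

```python
import json


def build_payload(pixels):
    """Client side: serialize 784 pixel values (0.0 or 1.0) as a JSON array."""
    if len(pixels) != 784:
        raise ValueError("expected 784 values, got %d" % len(pixels))
    return json.dumps(pixels)


def parse_payload(body_bytes):
    """Server side: decode the body into a batch of one row, i.e. shape (1, 784)."""
    values = json.loads(body_bytes.decode("utf-8"))
    return [values]


pixels = [0.0] * 784
pixels[322] = 1.0  # one hypothetical ink pixel

body = build_payload(pixels).encode("utf-8")
batch = parse_payload(body)
print(len(batch), len(batch[0]))
```

Checking the length on the client keeps a malformed drawing from reaching the reshape call on the server, where it would raise an error.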

Q & A

Thank you.
