EMT Machine Learning, Week 12: Anomaly Detection
Who am I?
Development Experience
◆ Image Recognition using Neural Network
◆ Bio-Medical Data Processing
◆ Human Brain Mapping on High Performance Computing
◆ Medical Image Reconstruction (Computed Tomography)
◆ Enterprise System
◆ Open Source Software Developer
Open Source Software Developer
◆ Linux Kernel & LLVM
◆ OPNFV (NFV & SDN) & OpenStack
◆ Machine Learning (TensorFlow)
Book
◆ Unix V6 Kernel
Korea Open Source Software Lab.
Mario Cho
Problem Motivation
• Just like in other learning problems, we are given a dataset.
• We want to know whether a given example is abnormal (anomalous) or not.
• Define a "model" p(x) that tells us the probability that an example is not anomalous.
  - Use a threshold (epsilon) as a dividing line, so we can say which examples are anomalous and which are not.
Example of Anomaly detection
• Aircraft engine features:
• Dataset: { x(1), x(2), x(3), ..., x(m) }
• New engine: xtest
• Features
  - x1 = heat generated
  - x2 = vibration intensity
  - x3 = …
  - ...
  - xm = ...
Example of Anomaly detection
• Aircraft engine features:
• Features
  - x1 = heat generated
  - x2 = vibration intensity
Example of Anomaly detection
• Density estimation
• Dataset: { x(1), x(2), x(3), ..., x(m) }
• Is the new engine xtest anomalous?
• Given the model p(x):
  - p(xtest) < ε → flag as an anomaly
  - p(xtest) >= ε → not an anomaly (normal)
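The slides do not say which form p(x) takes. As a minimal sketch, assuming p(x) is a product of independent per-feature Gaussians, the density-estimation rule above could look like the following. The function names (`fit_gaussian`, `p`, `is_anomaly`), the engine readings, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math

def fit_gaussian(rows):
    """Estimate a mean and variance for each feature from the training rows."""
    m = len(rows)
    n = len(rows[0])
    mu = [sum(r[j] for r in rows) / m for j in range(n)]
    var = [sum((r[j] - mu[j]) ** 2 for r in rows) / m for j in range(n)]
    return mu, var

def p(x, mu, var):
    """Density of x under independent per-feature Gaussians (an assumed model)."""
    density = 1.0
    for xj, mj, vj in zip(x, mu, var):
        density *= math.exp(-((xj - mj) ** 2) / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return density

def is_anomaly(x, mu, var, epsilon):
    """Apply the slide's rule: flag x when p(x) < epsilon."""
    return p(x, mu, var) < epsilon

# Hypothetical engine readings: (heat generated, vibration intensity)
train = [(10.0, 5.0), (10.5, 5.2), (9.8, 4.9), (10.2, 5.1), (9.9, 4.8), (10.1, 5.0)]
mu, var = fit_gaussian(train)
epsilon = 0.02  # assumed threshold

typical = (10.1, 5.0)   # close to the training data
outlier = (14.0, 8.0)   # far from the training data

print("typical flagged:", is_anomaly(typical, mu, var, epsilon))
print("outlier flagged:", is_anomaly(outlier, mu, var, epsilon))
```

Any other density model could be swapped in for `p`; only the comparison against the threshold comes from the slide.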
Anomaly detection example
• Fraud detection
  - x(i) = features of user i's activities
  - Model p(x) from data
  - Identify unusual users by checking which have p(x) < ε
• Manufacturing
  - x(i) = features of process i
  - Model p(x) from measured data
  - Identify unusual products by checking which have p(x) < ε
• Monitoring computers in a data center
  - x(i) = features of machine i
  - x1 = memory use
  - x2 = number of disk accesses / sec
  - x3 = CPU load
  - Identify unusual status by checking which have p(x) < ε
Example of Anomaly detection
p(xtest(1)) = 0.0426
p(xtest(1)) >= ε (ε = 0.02)
xtest(1): normal
p(xtest(2)) = 0.0021
p(xtest(2)) < ε (ε = 0.02)
xtest(2): anomalous
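The two cases above can be checked mechanically. This short sketch applies the threshold rule to the slide's own numbers; the helper name `classify` is hypothetical.

```python
EPSILON = 0.02  # threshold used on the slide

def classify(p_x, epsilon=EPSILON):
    """Return 'anomalous' when p(x) < epsilon, otherwise 'normal'."""
    return "anomalous" if p_x < epsilon else "normal"

# p(x) values taken from the slide
p_values = {"xtest(1)": 0.0426, "xtest(2)": 0.0021}
labels = {name: classify(p_x) for name, p_x in p_values.items()}

for name, label in labels.items():
    print(f"p({name}) = {p_values[name]} -> {label}")
```

Note that a value exactly equal to the threshold counts as normal, matching the ">=" on the slide.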
Anomaly detection vs. Supervised learning
Anomaly detection
• Very small number of positive examples
  - Positive (y = 1): 0~20
  - Negative (y = 0): large
• Many different "types" of anomalies.
• Hard for an algorithm to learn from the few positive examples what anomalies look like.
• Future anomalies may look nothing like any of the anomalous examples we've seen so far.
Supervised learning
• Large numbers of both positive and negative examples
  - Positive (y = 1): large
  - Negative (y = 0): large
• Enough positive examples for the algorithm to get a sense of what positive examples are like.
• Many different "types" of anomalies.
• Easy for an algorithm to learn from similar examples.
• Future positive examples are likely to be similar to the ones in the training set.
Anomaly detection vs. Supervised learning
Anomaly detection
• Fraud detection
• Manufacturing
  - e.g. aircraft engines, manufacturing processes
• Monitoring machines
Supervised learning
• Email spam classification
• Weather prediction (sunny / rainy / cloudy)
• Cancer classification
Error analysis for anomaly detection
• Want
  - p(x) large for normal examples x.
  - p(x) small for anomalous examples x.
• Most common problem:
  - p(x) is comparable (say, both large) for normal and anomalous examples.
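One way to spot this problem, sketched here under the assumption that a few labeled examples are available for checking, is to list the known anomalies whose p(x) is not below the threshold. The helper name `missed_anomalies` and all p(x) values below are hypothetical.

```python
def missed_anomalies(p_anomalous, epsilon):
    """Indices of known anomalies whose p(x) is not below the threshold."""
    return [i for i, p_x in enumerate(p_anomalous) if p_x >= epsilon]

# Hypothetical p(x) values on a small labeled check set
p_normal = [0.31, 0.27, 0.35, 0.22]
p_anomalous = [0.0008, 0.29, 0.004]
epsilon = 0.02  # assumed threshold

missed = missed_anomalies(p_anomalous, epsilon)
for i in missed:
    print(f"anomaly #{i} has p(x) = {p_anomalous[i]}, comparable to normal examples")
```

Here the second anomaly scores about as high as the normal examples, so the current features do not separate it. Looking at such examples by hand is one way to decide which additional feature might help.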
Monitoring computers in a data center
• Choose features that might take on unusually large or small values in the event of an anomaly.
• x(i) = features of machine i
  - x1 = memory use
  - x2 = number of disk accesses / sec
  - x3 = CPU load
  - x4 = network traffic
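As an illustration of a feature that becomes unusually large during an anomaly, the sketch below combines two of the listed features into a ratio of CPU load to network traffic. This derived feature is an assumption for illustration and does not appear on the slide; the machine names and readings are hypothetical.

```python
def cpu_per_traffic(cpu_load, network_traffic, floor=1e-9):
    """Ratio of CPU load to network traffic; floor avoids division by zero."""
    return cpu_load / max(network_traffic, floor)

# Hypothetical readings for three machines
machines = {
    "web-01": {"cpu_load": 0.40, "network_traffic": 400.0},
    "web-02": {"cpu_load": 0.45, "network_traffic": 450.0},
    "web-03": {"cpu_load": 0.95, "network_traffic": 5.0},  # busy CPU, almost no traffic
}

ratios = {name: cpu_per_traffic(**f) for name, f in machines.items()}
for name, ratio in ratios.items():
    print(f"{name}: cpu/traffic = {ratio:.4f}")
```

Taken one at a time, a high CPU load or a low traffic level might each look plausible. The ratio makes the combination stand out, which is the kind of feature the first bullet asks for.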