Demystifying Machine Learning

Post on 06-Aug-2015

Category:

Technology

@louisdorard

#papisconnect

PredictiveAPIs

Student Researcher Data Scientist Developer Non-technical

GREAT TALKS ON “REAL-WORLD” MACHINE LEARNING IMPLEMENTATIONS FROM ALL OVER THE WORLD

ANDRÉS GONZALEZ, CLEVERTASK

32.5% 33.8%

Familiarity with Predictive

• Machine Learning

• Use cases

• Limitations

• Predictive APIs

• Does it work?

• Case study

• ML Canvas

–Mike Gualtieri, Principal Analyst at Forrester

“Predictive apps are the next big thing in app development.”

–Waqar Hasan, VISA

“Predictive is the ‘killer app’ for big data.”

1. Machine Learning

2. Data

BUT

–McKinsey & Co. (2011)

“A significant constraint on realizing value from big data will be a shortage of talent, particularly of people with deep expertise in statistics and machine learning.”

Demystifying Machine Learning

“Which type of email is this?

— Spam/Ham”

⇒ Classification

“How much is this house worth?

— X $”

⇒ Regression

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
3         1          1012             1951        house      -
2         1.5        968              1976        townhouse  447,000
4         -          1315             1950        house      648,000
3         2          1599             1964        house      -
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4         -          -                2001        house      855,000
3         2.5        1472             2005        house      -
4         3.5        1714             2005        townhouse  -
2         2          1113             1999        condo      -
1         -          769              1999        condo      315,000

ML is a set of AI techniques where “intelligence” is built by referring to examples

Use cases

• Real-estate: property → price (Zillow)

• Spam: email → spam indicator (Gmail)

• Priority inbox: email → importance indicator (Gmail)

• Crowd prediction: location & context → #people (Tranquilien)

I. Get more customers

• Reduce churn: customer → churn indicator

• Score leads: customer → revenue

• Optimize campaigns: customer & campaign → interest indicator

II. Serve customers better

• Cross-sell: customer & product → purchase indicator

• Increase engagement: user & item → interest indicator

• Optimize pricing: product & price → #sales

III. Serve customers more efficiently

• Predict demand: context → demand

• Automate tasks: credit application → repayment indicator

• Use predictive enterprise apps

Predictive enterprise apps

• Priority filtering: message → priority indicator

• Message routing: request → employee

• Auto-configuration: user & actions → settings

RULES

–Katherine Barr, Partner at VC-firm MDV

"Pairing human workers with machine learning and automation will transform knowledge work and unleash new levels of human productivity and creativity."

Limitations

Need examples of inputs AND outputs

What if not enough data points?

What if similar inputs have dissimilar outputs?

Bedrooms  Bathrooms  Price ($)
3         2          500,000
3         2          800,000
1         1          300,000
1         1          800,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Price ($)
3         2          800              1950        500,000
3         2          1000             1950        800,000
1         1          500              1950        300,000
1         1          500              2014        800,000
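One way to spot the "similar inputs, dissimilar outputs" problem is to group the examples by their input values and flag any group whose outputs are far apart. The Python sketch below does this for the rows of the two tables above. The function name `conflicting_groups` and the 100,000 $ spread threshold are assumptions made for illustration, not part of the talk.

```python
from collections import defaultdict

def conflicting_groups(rows, n_features, max_spread=100_000):
    """Group rows by their first n_features values and return the groups
    whose outputs (the value right after the features) differ by more
    than max_spread."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[:n_features])].append(row[n_features])
    return {k: v for k, v in groups.items() if max(v) - min(v) > max_spread}

# Rows from the first table: (bedrooms, bathrooms, price)
two_features = [(3, 2, 500_000), (3, 2, 800_000),
                (1, 1, 300_000), (1, 1, 800_000)]

# Rows from the second table: (bedrooms, bathrooms, surface, year built, price)
four_features = [(3, 2, 800, 1950, 500_000), (3, 2, 1000, 1950, 800_000),
                 (1, 1, 500, 1950, 300_000), (1, 1, 500, 2014, 800_000)]

print(conflicting_groups(two_features, n_features=2))   # two conflicting groups
print(conflicting_groups(four_features, n_features=4))  # {} once features are added
```

With only two features, identical inputs map to very different prices; adding surface and year built makes every input distinct, so the conflict disappears.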

–@louisdorard

“A model can only be as good as the data it was given to train on”

Predictive APIs: ML for all

HTML / CSS / JavaScript

squarespace.com

The two phases of machine learning:

• TRAIN a model

• PREDICT with a model

The two methods of predictive APIs:

• TRAIN a model

• PREDICT with a model

The two methods of predictive APIs:

• model = create_model(dataset)

• predicted_output = create_prediction(model, new_input)

The two methods of predictive APIs:

• model = create_model('training.csv')

• predicted_output = create_prediction(model, new_input)
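The two calls above can be mimicked locally to see the shape of the workflow. The sketch below is not any vendor's API: `create_model` and `create_prediction` only reuse the names from the slide, an in-memory CSV stands in for 'training.csv', and the "model" is a deliberately naive 1-nearest-neighbour lookup chosen for brevity. The training rows are four complete rows from the house table; the column names are assumptions.

```python
import csv
import io

def create_model(csv_file):
    """TRAIN: read examples from a CSV file object. The last column is the
    output; all other columns are numeric inputs."""
    rows = list(csv.reader(csv_file))
    header, examples = rows[0], rows[1:]
    return {
        "header": header,
        "inputs": [[float(v) for v in r[:-1]] for r in examples],
        "outputs": [float(r[-1]) for r in examples],
    }

def create_prediction(model, new_input):
    """PREDICT: return the output of the closest training example
    (squared Euclidean distance on the raw input values)."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(model["inputs"])),
               key=lambda i: distance(model["inputs"][i], new_input))
    return model["outputs"][best]

# Stand-in for 'training.csv' (hypothetical column names).
training_csv = io.StringIO(
    "bedrooms,bathrooms,surface,year_built,price\n"
    "3,1,860,1950,565000\n"
    "2,1.5,968,1976,447000\n"
    "1,1,530,2007,122000\n"
    "4,2,1574,1964,835000\n"
)

model = create_model(training_csv)
predicted_output = create_prediction(model, [3, 2, 1599, 1964])
print(predicted_output)  # 835000.0 (price of the nearest training example)
```

A real predictive API would hide a far better learning algorithm behind the same two calls; the point here is only that the caller never has to see it.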

“Is this email important?

— Yes/No”

“Is this customer going to leave next month?

— Yes/No”

“What is the sentiment of this tweet?

— Positive/Neutral/Negative”

The two phases of machine learning:

• TRAIN a model

• PREDICT with an already existing model

“Is this email spam?

— Yes/No”

Does it work? How well?

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
3         1          1012             1951        house      -
2         1.5        968              1976        townhouse  447,000
4         -          1315             1950        house      648,000
3         2          1599             1964        house      -
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4         -          -                2001        house      855,000
3         2.5        1472             2005        house      -
4         3.5        1714             2005        townhouse  -
2         2          1113             1999        condo      -
1         -          769              1999        condo      315,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
2         1.5        968              1976        townhouse  447,000
4         -          1315             1950        house      648,000
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4         -          -                2001        house      855,000
1         -          769              1999        condo      315,000
3         1          1012             1951        house      -
3         2          1599             1964        house      -
3         2.5        1472             2005        house      -
4         3.5        1714             2005        townhouse  -
2         2          1113             1999        condo      -

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
2         1.5        968              1976        townhouse  447,000
4         -          1315             1950        house      648,000
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4         -          -                2001        house      855,000
1         -          769              1999        condo      315,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
2         1.5        968              1976        townhouse  447,000
3         1          860              1950        house      565,000
1         -          769              1999        condo      315,000
4         -          1315             1950        house      648,000
4         2          1574             1964        house      835,000
3         2          987              1951        townhouse  790,000
4         -          -                2001        house      855,000
1         1          530              2007        condo      122,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
4         2          1574             1964        house      835,000
3         2          987              1951        townhouse  790,000
4         -          -                2001        house      855,000
1         1          530              2007        condo      122,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
2         1.5        968              1976        townhouse  447,000
3         1          860              1950        house      565,000
1         -          769              1999        condo      315,000
4         -          1315             1950        house      648,000
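The shuffle-then-split step shown in the tables above can be sketched in a few lines of Python. The helper name `split_train_test`, the 50/50 ratio and the fixed seed are assumptions for illustration. The rows are the eight examples with a known price; missing values are written as `None`, and their column placement is a reading of the flattened slide table rather than something the transcript states.

```python
import random

def split_train_test(rows, test_fraction=0.5, seed=0):
    """Shuffle a copy of rows, then return (train, test).
    The first part of the shuffled list is used for training and the
    remainder is held out for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_test
    return shuffled[:n_train], shuffled[n_train:]

# (bedrooms, bathrooms, surface, year built, type, price)
rows = [
    (3, 1, 860, 1950, "house", 565_000),
    (2, 1.5, 968, 1976, "townhouse", 447_000),
    (4, None, 1315, 1950, "house", 648_000),
    (3, 2, 987, 1951, "townhouse", 790_000),
    (1, 1, 530, 2007, "condo", 122_000),
    (4, 2, 1574, 1964, "house", 835_000),
    (4, None, None, 2001, "house", 855_000),
    (1, None, 769, 1999, "condo", 315_000),
]

train, test = split_train_test(rows)
print(len(train), "training rows,", len(test), "test rows")
```

The model is trained only on `train`; the prices in `test` are hidden from it and used afterwards to check its predictions.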

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)  Price ($)
4         2          1574             1964        house      835,000    835,000
3         2          987              1951        townhouse  790,000    790,000
4         -          -                2001        house      855,000    855,000
1         1          530              2007        condo      122,000    122,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)  Price ($)
4         2          1574             1964        house      -          835,000
3         2          987              1951        townhouse  -          790,000
4         -          -                2001        house      -          855,000
1         1          530              2007        condo      -          122,000

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Predicted price ($)  Actual price ($)
4         2          1574             1964        house      818,000              835,000
3         2          987              1951        townhouse  800,000              790,000
4         -          -                2001        house      915,000              855,000
1         1          530              2007        condo      100,000              122,000

Predicted price ($)  Actual price ($)
818,000              835,000
800,000              790,000
915,000              855,000
100,000              122,000
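Comparing the two price columns above can be reduced to a single number. The sketch below computes the mean absolute error (MAE) and the root mean squared error (RMSE) on the four test houses. It takes the first column as the predicted price and the second as the actual price, an assumption based on the second column matching the known prices; the variable names are illustrative.

```python
import math

predicted = [818_000, 800_000, 915_000, 100_000]
actual = [835_000, 790_000, 855_000, 122_000]

# Signed error per house: positive means the model over-estimated.
errors = [p - a for p, a in zip(predicted, actual)]

mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  {mae:,.0f} $")   # MAE:  27,250 $
print(f"RMSE: {rmse:,.0f} $")
```

RMSE comes out higher than MAE here because it penalises the one large miss (60,000 $ on the third house) more heavily.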

Need real-time machine learning?

The two phases of machine learning:

• TRAIN a model

• PREDICT with a model

• Training time

• Prediction time

• Accuracy

Case study: churn analysis

• Who: SaaS company selling monthly subscription

• Question asked: “Is this customer going to leave within 1 month?”

• Input: customer

• Output: no-churn (negative) or churn (positive)

• Data collection: history up until 1 month ago

• Baseline: if no usage for more than 15 days then churn
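The baseline in the last bullet is simple enough to write down directly. A minimal Python sketch, assuming each customer record carries a hypothetical `days_since_last_use` field; the field name and the three sample customers are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: str
    days_since_last_use: int  # hypothetical field, not from the talk

def baseline_predicts_churn(customer, max_idle_days=15):
    """Baseline rule from the slide: no usage for more than 15 days => churn."""
    return customer.days_since_last_use > max_idle_days

# Made-up customers, only to exercise the rule.
customers = [Customer("a", 3), Customer("b", 15), Customer("c", 40)]

for c in customers:
    label = "churn" if baseline_predicts_churn(c) else "no-churn"
    print(c.customer_id, label)
```

Any learned model has to beat this one-line rule to justify its extra complexity.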

Learning: OK

but

• How to represent customers?

• What to do after predicting churn?

Customer representation:

• basic info (age, income, etc.)

• usage of service (# times used app, avg time spent, features used, etc.)

• interactions with customer support (how many, topics of questions, satisfaction ratings)

Taking action to prevent churn:

• contact customers (in which order?)

• switch to different plan

• give special offer

• no action?

Measuring accuracy:

• #TP (we predict the customer churns and they do)

• #FP (we predict the customer churns but they don’t)

• #FN (we predict the customer doesn’t churn but they do)

• Compare to baseline
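The counts above can be computed from two lists of churn labels, one predicted and one observed. In the sketch below all label values are made up for illustration, and `confusion_counts` and `precision_recall` are hypothetical helper names. Precision and recall are the standard ratios derived from these counts; they are added here as one convenient way to compare a model with the baseline.

```python
def confusion_counts(predicted, actual):
    """Return (#TP, #FP, #FN) for boolean churn labels (True = churn)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return tp, fp, fn

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels for six customers.
actual        = [True, False, True, True, False, False]
model_pred    = [True, True, False, True, False, False]
baseline_pred = [False, True, False, True, True, False]

for name, pred in [("model", model_pred), ("baseline", baseline_pred)]:
    tp, fp, fn = confusion_counts(pred, actual)
    precision, recall = precision_recall(tp, fp, fn)
    print(f"{name}: TP={tp} FP={fp} FN={fn} "
          f"precision={precision:.2f} recall={recall:.2f}")
```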

Estimating Return On Investment:

• Taking action for #TP and #FP customers has a cost

• We earn #TP × success rate × revenue per customer per month

• Compare to baseline
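The return estimate above is a short piece of arithmetic: the cost of acting on every flagged customer (#TP + #FP), set against the revenue retained from the true positives who are won back. The sketch below assumes a flat cost per action, and every number in it is hypothetical; `monthly_net_gain` is an illustrative name.

```python
def monthly_net_gain(tp, fp, success_rate, revenue_per_customer, cost_per_action):
    """Net monthly gain of acting on all customers predicted to churn.

    gain = #TP * success rate * revenue per customer per month
    cost = (#TP + #FP) * cost per action   (flat cost is an assumption)
    """
    gain = tp * success_rate * revenue_per_customer
    cost = (tp + fp) * cost_per_action
    return gain - cost

# Hypothetical figures for a model and for the baseline rule.
model_gain = monthly_net_gain(tp=40, fp=20, success_rate=0.25,
                              revenue_per_customer=100.0, cost_per_action=5.0)
baseline_gain = monthly_net_gain(tp=25, fp=50, success_rate=0.25,
                                 revenue_per_customer=100.0, cost_per_action=5.0)

print("model:   ", model_gain)     # 700.0
print("baseline:", baseline_gain)  # 250.0
```

False negatives do not appear in the formula because no action is taken for them, but each one is a customer lost without any attempt to retain them.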

Machine Learning Canvas

PREDICTIONS OBJECTIVES DATA

Context

Who will use the predictive system / who will be affected by it? Provide some background.

Value Proposition

What are we trying to do? E.g. spend less time on X, increase Y...

Data Sources

Where do/can we get data from? (internal database, 3rd party API, etc.)

Problem

Question to predict answers to (in plain English)

Input (i.e. question "parameter")

Possible outputs (i.e. "answers")

Type of problem (e.g. classification, regression, recommendation...)

Baseline

What is an alternative way of making predictions (e.g. manual rules based on feature values)?

Performance evaluation

Domain-specific / bottom-line metrics for monitoring performance in production

Prediction accuracy metrics (e.g. MSE if regression; % accuracy, #FP for classification)

Offline performance evaluation method (e.g. cross-validation or simple training/test split)

Dataset

How do we collect data (inputs and outputs)? How many data points?

Features

Used to represent inputs and extracted from data sources above. Group by types and mention key features if too many to list all.

Using predictions

When do we make predictions and how many?

What is the time constraint for making those predictions?

How do we use predictions and confidence values?

Learning predictive models

When do we create/update models? With which data / how much?

What is the time constraint for creating a model?

Criteria for deploying model (e.g. minimum performance value: absolute, relative to baseline or to previous model)

IDEAS

SPECS

DEPLOYMENT

BACKGROUND

ENGINE SPECS

INTEGRATION

PREDICTIONS OBJECTIVES DATA

BACKGROUND End-user Value prop Sources

ENGINE SPECS ML problem Perf eval Preparation

INTEGRATION Using pred Learning model

Why fill in ML canvas?

• Target the right problem for your company

• Choose right algorithm, infrastructure, or ML solution

• Guide project management

• Improve team communication

machinelearningcanvas.com

Recap

• Need examples of inputs AND outputs

• Need enough examples

• ML to create value from data

• 2 phases: TRAIN and PREDICT

• Predictive APIs make it more accessible

• Good data is essential

• What do we do with predictions?

• Measure performance with accuracy, time and bottom-line

• Also: deploy, maintain, improve…

louisdorard.com
