Impact of IoT analytics on the development budget

Dr. Boris Adryan @BorisAdryan

Industry of Things World, Berlin, 19th September 2016



Dr. Boris Adryan

• with Zühlke Engineering since September 2016
• longstanding IoT enthusiast
• Founder of thingslearn Ltd.
• Board Member & Strategic Advisor for Pycom (microcontrollers), BioSelf (biosensors) and OpenSensors (IoT platform)
• before: research group leader for data analytics and machine learning at University of Cambridge, England.


I disagree with the notion that data is the new oil. It’s as infinite as the sun, and just like the power of the sun, we’re barely using it at the moment.

Mike Gualtieri, Forrester Research

5V of Big Data

Volume: “doesn’t fit on my local drive”
Velocity: “process deals with hundreds of events per second”
Variety: “wouldn’t even know how to save this in an RDBMS”
Value: “actionable insight”
Veracity: “not sure how current, valid or complete it is”

It’s worth looking at the actual data problem before hiring a ‘big data specialist’ or buying an ‘analytics solution’.

IoT ≠ Big Data

Sensor devices produce large and small data.

You may not immediately know how to deal with them - but that doesn’t automatically make them ‘big data’.

39% of survey participants are worried about the cost of an industrial IoT solution.

“Why aren’t you doing IoT?”

Hardware is often perceived as an investment that customers understand and therefore anticipate.

This talk is about unfounded IoT fears.

There’s an air of magic around data and analytics. This leads to fear of:

• having to hire specialists (for both data plumbing and analytics)
• having to buy expensive services
• losing control over the process due to a lack of understanding

You want actionable insight.

[Diagram: data → “magic” (here be dragons!: how to deal and what to do with the data) → insight → whatever you do in your vertical: ✓ better ✓ faster ✓ cheaper]

✓ small (fits on your drive)
✓ you know exactly what you’re looking for
→ not a ‘data problem’: ask your programmer

✓ large (think data centre-scale)
✓ you know exactly what you’re looking for
→ potentially ‘big data’: ask your sysadmin, then your programmer

Do you need to employ a specialist?


Let’s talk about IoT and the cloud

You have a choice. Actually, too much of it.

“My data problem must be special!”

Customers fear costs because they’re facing:

✓ unstructured data
✓ distributed ingestion and storage

Or they believe from hearsay that IoT automatically requires:

✓ real-time analytics
✓ sophisticated machine learning

My company went to an IoT conference & all I got was this t-shirt and a bunch of buzzwords.

“I receive unstructured data!”

RDBMS

name    age
Boris   40

name    city    job
Boris   Fra…    IoT

key-value DBs

name: Boris, age: 40, city: Frankfurt
name: Boris, job: IoT / data science
name: Ilka, age: 39, city: Frankfurt
name: Ilka, job: pharma R&D

SQL-ish syntax

NoSQL DBs run on commodity hardware: neither a ‘big data’ nor a ‘cloud’ problem.
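To make the key-value idea concrete, here is a minimal sketch in plain Python, not any particular NoSQL product. The records mirror the slide; the `select` helper and its parameters are hypothetical names chosen for illustration. It shows that records with different fields can live side by side and still be queried in a SQL-ish way.

```python
# Records mirroring the slide: each one carries only the fields it has.
records = [
    {"name": "Boris", "age": 40, "city": "Frankfurt"},
    {"name": "Boris", "job": "IoT / data science"},
    {"name": "Ilka", "age": 39, "city": "Frankfurt"},
    {"name": "Ilka", "job": "pharma R&D"},
]


def select(rows, fields, where):
    """Rough stand-in for 'SELECT fields FROM rows WHERE ...' (hypothetical helper).

    A record that lacks a field named in `where` simply does not match;
    requested fields that a record lacks come back as None.
    """
    result = []
    for row in rows:
        if all(row.get(key) == value for key, value in where.items()):
            result.append({field: row.get(field) for field in fields})
    return result


in_frankfurt = select(records, fields=["name", "age"], where={"city": "Frankfurt"})
print(in_frankfurt)
```

No schema migration is needed when a new field such as `job` shows up, which is the point of the slide: this runs on a laptop.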

“I got too many things!”

[Diagram: many things sending data over time to several brokers, which write to several storage nodes]

Even standard cloud offerings can do distributed ingestion and storage very well.

not a big data ‘problem’
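One common way to spread many things across several brokers is to hash the device ID. The sketch below assumes that approach; the slides do not say how any given cloud offering does it. The broker names, device IDs and the `pick_broker` function are hypothetical.

```python
import hashlib


def pick_broker(device_id: str, brokers: list[str]) -> str:
    """Map a device ID to one broker via a stable hash (same ID, same broker)."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


brokers = ["broker-0", "broker-1", "broker-2", "broker-3"]  # hypothetical names
devices = [f"thing-{n}" for n in range(1000)]               # hypothetical device IDs

# Count how many devices end up on each broker.
load = {broker: 0 for broker in brokers}
for device in devices:
    load[pick_broker(device, brokers)] += 1

print(load)
```

A plain modulo scheme like this reshuffles most devices when a broker is added or removed; managed services typically hide that detail, which is why this rarely needs a specialist.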

Adapting a PaaS to your needs.

your USP:
• Your apps & corporate design
• Your products and analytical services
• Your devices

Zühlke IoT Platform: standard components (still, tedious to configure):
• Security
• I/O / broker
• fast storage
• device management
• gateway
• portal & user management
• basic analytics

You want actionable insight.

[Diagram: data → “magic” (here be dragons!: how to deal and what to do with the data) → insight → whatever you do in your vertical: ✓ better ✓ faster ✓ cheaper]

Basic data plumbing and storage is usually not the issue.

The message is that there are known knowns. There are things we know that we know. There are known unknowns. That is to say there are things that we now know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.

Donald Rumsfeld, ex-US Secretary of Defense

✓ small or large
✓ you don’t know what to connect or how to find it (the “known unknowns”)
✓ you want to increase operational awareness (the “unknown unknowns”)
→ a ‘data science problem’
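The three cases from the slides (small and known, large and known, unknown) can be written down as a simple decision rule. This is only a sketch of that rule of thumb; the function name `triage` and its two boolean parameters are hypothetical.

```python
def triage(fits_on_local_drive: bool, know_what_to_look_for: bool) -> str:
    """Return the slide's rule-of-thumb verdict for a data problem."""
    if not know_what_to_look_for:
        # Size does not matter here: small or large, it is a data science problem.
        return "a 'data science problem'"
    if fits_on_local_drive:
        return "not a 'data problem': ask your programmer"
    return "potentially 'big data': ask your sysadmin, then your programmer"


print(triage(fits_on_local_drive=True, know_what_to_look_for=True))
print(triage(fits_on_local_drive=False, know_what_to_look_for=True))
print(triage(fits_on_local_drive=True, know_what_to_look_for=False))
```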

We can help to establish a machine learning pipeline to extract relevant information automatically.


Do you need to employ a specialist?

you may just need a one-off solution

What is machine learning?

unsupervised learning - get an overview of what’s in your data set

supervised learning - teach the machine to classify data on the basis of some previous training

statistical analysis - find rules and outliers on the basis of numerical data
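As an illustration of the “statistical analysis” item (finding outliers in numerical data), here is a minimal z-score sketch using only the standard library. The readings, the threshold of 2.0 and the function name `find_outliers` are assumptions for this example; real sensor data often calls for more robust statistics.

```python
from statistics import mean, stdev


def find_outliers(values, z_threshold=2.0):
    """Return (index, value) pairs lying more than z_threshold sample
    standard deviations away from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]


# Hypothetical temperature readings with one obvious glitch.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.0, 20.2, 20.1]
print(find_outliers(readings))
```

Note that in a small sample a single extreme value inflates the standard deviation itself, so the threshold has to be chosen with the sample size in mind.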

          wheels  motor  windows  seats
skates    4       n      n        0
bike      2       n      n        1
car       4       y      y        4
bus       6       y      y        9
lorry     6       y      y        2

very relevant for predictive maintenance etc.
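The vehicle features above (wheels, motor, windows, seats) can serve as a tiny training set for supervised learning. The sketch below uses a 1-nearest-neighbour rule, which is one of many possible classifiers and not necessarily the one meant on the slide. The function name `classify` and the encoding of y/n as 1/0 are assumptions.

```python
# Feature order: wheels, motor (1 = yes, 0 = no), windows (1 = yes, 0 = no), seats.
TRAINING = {
    "skates": (4, 0, 0, 0),
    "bike": (2, 0, 0, 1),
    "car": (4, 1, 1, 4),
    "bus": (6, 1, 1, 9),
    "lorry": (6, 1, 1, 2),
}


def classify(features):
    """Return the label of the training example closest to `features`
    (squared Euclidean distance, no feature scaling)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(TRAINING[label], features))

    return min(TRAINING, key=distance)


# An unseen vehicle: 4 wheels, motor, windows, 5 seats.
print(classify((4, 1, 1, 5)))
```

With only one example per class this is a toy; it shows the mechanics of “previous training” rather than a usable model.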

How does classification work?

data: weather forecast, airport location, # of gates, # of runways, # of snowploughs, airline, aircraft
training: flights cancelled in the past
classifier (BLACK BOX): ranked list of relevant features, weight of features, thresholds for features, performance metric
new data → classifier → prediction

[Flowchart: data → training → classifier → performance assessment → good enough? yes: success! no: more data for training]
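The train, assess, “good enough?” loop can be sketched with a deliberately simple classifier: a single threshold on one feature. Everything here is hypothetical, including the snow-depth data, the 0.9 accuracy target, the batch size and the function names; it illustrates the loop, not a recommended model.

```python
def train_stump(samples):
    """Pick the threshold on one feature that classifies most samples correctly.
    A sample is predicted 'cancelled' when value >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold, _ in samples:
        correct = sum((value >= threshold) == label for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, samples):
    """Fraction of samples the threshold classifies correctly."""
    return sum((value >= threshold) == label for value, label in samples) / len(samples)


# Hypothetical data: (snow depth in cm, flight cancelled?)
pool = [(2, False), (30, True), (4, False), (25, True),
        (8, False), (12, True), (9, False), (10, True)]
holdout = [(1, False), (5, False), (9, False), (10, True), (11, True), (20, True)]

TARGET = 0.9  # "good enough?" criterion, an assumption
BATCH = 2     # how much more training data each round adds

n_used = BATCH
while True:
    threshold = train_stump(pool[:n_used])  # training
    score = accuracy(threshold, holdout)    # performance assessment
    print(f"{n_used} samples: threshold={threshold}, accuracy={score:.2f}")
    if score >= TARGET or n_used >= len(pool):
        break                               # success, or no more data
    n_used += BATCH                         # more data for training
```

Assessing on data that was not used for training is what keeps the “good enough?” check honest.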

Is this reliable?

[ROC plot: sensitivity (“true positives”) on the y-axis against 1-specificity (“false positives”) on the x-axis, both from 0 to 1.0, with a region marked “worse than random guess”]
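Each point on such a plot comes from choosing one decision threshold and counting outcomes. The following sketch computes a single point; the scores, labels and the function name `roc_point` are made up for illustration, and the code assumes both classes are present in the labels.

```python
def roc_point(scores, labels, threshold):
    """Return (1 - specificity, sensitivity) for one decision threshold.
    A sample is predicted positive when its score is >= threshold."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)          # true positive rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return false_positive_rate, sensitivity


# Hypothetical classifier scores and the true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, False, True, False, True, False, False]

for t in (0.0, 0.5, 1.0):
    print(t, roc_point(scores, labels, t))
```

A point with sensitivity above its false positive rate lies above the diagonal, that is, better than a random guess.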

Where is your classifier located?

[Diagram: data → “magic” (here be dragons!) → insight → whatever you do in your vertical: ✓ better ✓ faster ✓ cheaper. Inside the “magic”: model building and training (R & D), then operation and performance tracking (on device, cloud or mobile app).]

Is analytics just data crunching?

[Diagram: sound profile → “predictive maintenance classifier” → assessment result]

“Do I need real-time analytics?”

timescale                  example question
microseconds to seconds    am I falling? → counteract
seconds to minutes         battery level: should I land?
minutes to hours           how many times did I stall?
hours to weeks             what’s the best weather for flying?

where: on device, on stream, in batch (in process, in database)
insight: operational insight, performance insight, strategic insight
methods: e.g. Kalman filter, e.g. with machine learning, e.g. rules engine, e.g. summary stats
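The slide lists a Kalman filter as one example method. As a rough idea of how little code the simplest case needs, here is a scalar (one-dimensional) Kalman filter with a constant-state model. The noise variances, initial values and the function name `kalman_1d` are assumptions for this sketch; a real device would use a model matched to its sensors.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter assuming the true value stays (nearly) constant.
    Returns the filtered estimate after each measurement."""
    estimate, variance = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement according to the gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Hypothetical sensor that keeps reporting 5.0 while the filter starts at 0.0.
filtered = kalman_1d([5.0] * 50, process_var=0.01, measurement_var=1.0)
print(round(filtered[0], 3), round(filtered[-1], 3))
```

The loop uses a handful of multiplications per sample and no stored history, which is why this kind of analytics fits on a microcontroller.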

Can IoT ever be real-time?

zone 1: real-time [µs]
zone 2: real-time [ms]
zone 3: real-time [s]

Edge, fog and cloud computing

Edge
Pro:
- immediate compression from raw data to actionable information
- cuts down traffic
- fast response
Con:
- loses potentially valuable raw data
- developing analytics on embedded systems requires specialists
- compute costs valuable battery life

Fog
Pro:
- same as Edge
- closer to ‘normal’ development work
- gateways often mains-powered
Con:
- loses potentially valuable raw data

Cloud
Pro:
- compute power
- scalability
- familiarity for developers
- integration center across all data
Con:
- traffic
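The edge trade-off (compress raw data into actionable information, at the price of losing the raw data) can be illustrated with a small aggregation step that an edge device or gateway might run before sending anything upstream. The readings, the alarm threshold and the function name `summarise_window` are hypothetical.

```python
from statistics import mean


def summarise_window(samples, alarm_threshold):
    """Reduce a window of raw readings to a small summary message.
    The raw samples are not kept, which is exactly the 'Con' on the slide."""
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": round(mean(samples), 2),
        "alarm": max(samples) > alarm_threshold,
    }


# Hypothetical window of temperature readings collected on the device.
raw = [21.0, 21.2, 21.1, 29.5, 21.3, 21.0]
summary = summarise_window(raw, alarm_threshold=25.0)
print(summary)
```

Six readings become one message here; whether the single spike at 29.5 mattered in detail can no longer be reconstructed in the cloud.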

Some of our examples for real-time analytics

Choosing the appropriate method and toolset on every level.

Options for cloud-based real-time analytics

Some features can cost a bit, especially when you don’t really know what you’re doing and want to ‘try it out’.

A badly configured SMACK stack on your own commodity hardware can be slow and unreliable.

your pre-trained classifier

My current pet hate: Deep Learning

Deep learning has delivered impressive results mimicking human reasoning, strategic thinking and creativity.

At the same time, big players have released libraries such that even ‘script kiddies’ can apply deep learning.

It’s already leading to uncritical use of deep learning when other methods would be more appropriate.

Summary

‣ Super-fast analytics and state-of-the-art methods are not automatically the most useful solution.

‣ A good understanding of the type of insight that is required by the business model is essential.

‣ There are many solutions readily available that might enable IoT projects very cost-effectively.

Zühlke can advise on your options around IoT and data analytics, and provide complete solutions where needed.