Holistic, Distributed Stream Processing in IoT Environments
23. 03. ‘18 UK Systems Research Workshop
Peter MichalákCDT Cloud Computing
for Big Data
Prof. Paul WatsonSchool of Computing
Prof. Mike TrenellMedical School
Dr. Sarah HeapsSchool of Maths
and Stats
Stream Processing in IoT
Telemetry
Sensors Field Gateways
MessageBroker
Stream Processing Engine
Clouds
1
Telemetry
Command and ControlNotification
Sensors Field Gateways
MessageBroker
Stream Processing Engine
Clouds
DB
Holistic Stream Processing in IoT
2
Healthcare use case Behavioural Prompts & Feedback
BehaviouralPrompt
Feedback
3
ESPer/Spark/Storm
sky is the limit
C / JavaScript
Objective C / Swift
Java / Kotlin ..
Challenges
• Energy
• Pebble Watch battery life ~7 days
• Streaming raw accelerometer data reduces battery life ~18 hours
• Hand-crafting bespoke solutions
• Definition of data stream processing
• Programming for Heterogeneous Platforms
• Multitude of devices, APIs and programming languages
PATH2iot systemAutomating Computational Placement
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Physical Plan
Execution PlanPATHfinder
PATHdeployer
PATHmonitor
Resource Catalogue
Cost ModelsPlatform Specific Compilation
PATH2iot
Re-optimisation onInfrastructure change
• PATHfinder• Automated Computational
Decisions• Non-functional requirements• Device-specific compilation
• PATHdeployer• A deployment tool delivers
configuration to enable computation
• PATHmonitor• Future work for PATH2iot
5
PATHfinder : High-Level Declarative Description of Computation
Non-functional Requirements
Description of the ComputationIn
put Resource
Catalogue
• Event Processing Language (EPL) from Esper
• High Level Declarative Description of Computation
• SQL based with extended grammar to support CEP operations
• Decomposable into directed graph of stream operators
6
1. INSERT INTO AccelEventSELECT getAccelData(25, 60) FROM AccelEventSource
2. INSERT INTO EdEventSELECT Math.pow(x*x+y*y+z*z, 0.5) AS ed, tsFROM AccelEvent WHERE vibe=0
3. INSERT INTO StepEventSELECT ed1(’ts’) as ts FROM EdEventMATCH RECOGNIZE (MEASURES A AS ed1, B AS ed2 PATTERN (A B) DEFINE A AS (A.ed > THR), B AS (B.ed ≤ THR))
4. INSERT INTO StepCount SELECT count(*) as steps FROM StepEvent.win:time_batch(120 sec)
5. SELECT persistResult(steps, "time_series", "step_sum") FROM StepCount
[1] N. Zhao, “Full-featured pedometer design realized with 3-axis digital accelerometer,” Analog Dialogue, vol. 44, no. 06, 2010.
Step count algorithm[1] in EPL
PATHfinder : High-Level Declarative Description of Computation
Non-functional
Requirements
Description of
the ComputationInpu
t
Resource
Catalogue
7
ΩgetAccelData(25, 60)
σvibe=0
Πy*y
ΩMath.pow(ed, 0.5)
"(120)
Ωmatch_recognize(A.ed > THR) AND (B.ed <= THR)
ΩCOUNT(∗)
ΩpersistResult(“steps”, “step_sum”, “time_series”)
Πx*x Πz*z
Π+
Π+
Q5
Q4
Q3
Q2
Q1
1. INSERT INTO AccelEventSELECT getAccelData(25, 60) FROM AccelEventSource
2. INSERT INTO EdEventSELECT Math.pow(x*x+y*y+z*z, 0.5) AS ed, tsFROM AccelEvent WHERE vibe=0
3. INSERT INTO StepEventSELECT ed1(’ts’) as ts FROM EdEventMATCH RECOGNIZE (MEASURES A AS ed1, B AS ed2 PATTERN (A B) DEFINE A AS (A.ed > THR), B AS (B.ed ≤ THR))
4. INSERT INTO StepCount SELECT count(*) as steps FROM StepEvent.win:time_batch(120 sec)
5. SELECT persistResult(steps, "time_series", "step_sum") FROM StepCount
[1] N. Zhao, “Full-featured pedometer design realized with 3-axis digital accelerometer,” Analog Dialogue, vol. 44, no. 06, 2010.
Step count algorithm[1] in EPL
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Resource Catalogue
• Logical Optimisation[2]
• Pushing Selects &Windows closer to the data source
• Physical Optimisation• Enumerating Physical Plans• Placement of the physical plans
on available infrastructure• 225 Physical Plans
• Physical Plan Pruning• Removing non-deployable plans
based on infrastructure capabilities
• 18 Physical Plans
Physical Plan
8
PATHfinder: Optimisation
[2] Hirzel, Martin, Robert Soulé, Scott Schneider, Buğra Gedik, and Robert Grimm. ‘A Catalog of Stream Processing Optimizations’. ACM Computing Surveys 46
PP0 PP1 PP2 PP3 PP4 PP5
PP6 PP7 PP8 PP9 PP10 PP11
PP12 PP13 PP14 PP15 PP16 PP17
PP18 PP19 PP20 PP21 … PP225
x xx
xx
x x
xxxx
x
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Resource Catalogue
• Cost Models• Energy cost model[3]
• Power coefficients with confidence Intervals• Estimated Battery LifePhysical Plan
Cost Models
9
PATHfinder: Energy Cost Model
[3] M. Forshaw, N. Thomas, and A. S. McGough, “The case for energy- aware simulation and modelling of internet of things (iot),” ACM ENERGY-SIM, 2016.
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Resource Catalogue
• Cost Models• Energy cost model• Power coefficients with
confidence Intervals• Estimated Battery LifePhysical Plan
Cost Models
10
PATHfinder : Energy Cost Model
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Resource Catalogue
• Cost Models• Energy cost model• Power coefficients with
confidence Intervals• Estimated Battery LifePhysical Plan
Cost Models
11
PATHfinder : Energy Cost Model
Non-functional Requirements
Description of the ComputationIn
put
Logical Plan
Resource Catalogue
• Cost Models• Energy cost model• Power coefficients with
confidence Intervals• Estimated Battery LifePhysical Plan
Cost Models
120
20
40
60
100 150 200Time (s)
Aver
age
Pow
er C
onsu
mpt
ion
(mW
)
Exec Plan 040 with 120 s cycle
1 2 3 54
PATHfinder : Energy Cost Model
PATHdeployer : Architecture Overview
PATHdeployer
PATHfinder
Infra Req EPL
1
2
3
Input
Neo4j
D2ESPer D2ESPer
ZooKeeper
4
ActiveMQ
InfluxDB
Cloud Deployment
...
VersatilePebble(C)
VersatilePebble(JavaScript)
REST API(Flask)
/path2iot/pebble/set_config
/path2iot/pebble/get_config
5
MariaDB
data stream
IoT Deployment• Cloud Deployment• ZooKeeper: configuration delivery• ActiveMQ: event propagation• D2ESPer: in-house built dynamic
ESPer based stream processing tool• InfluxDB: time series database
• IoT Deployment• Flask REST API: configuration
delivery for IoT devices• MariaDB: storage endpoint• IoT agents – iPhone, Pebble Watch
13
Plan watch phone cloud Energy Impact (mJ)pp00 Ω1 !1 Ω2 Ω3 sxfer Ω4 ω1 Ω6 Ω5
pp01 Ω1 !1 Ω2 sxfer Ω3 Ω4 ω1 Ω6 Ω5
pp02 Ω1 !1 sxfer Ω2 Ω3 Ω4 ω1 Ω6 Ω5
pp03 Ω1 sxfer !1 Ω2 Ω3 Ω4 ω1 Ω6 Ω5
pp04 Ω1 !1 Ω2 Ω3 ω1 sxfer Ω4 Ω6 Ω5
pp05 Ω1 !1 Ω2 Ω3 sxfer ω1 Ω4 Ω6 Ω5
pp06 Ω1 !1 Ω2 sxfer Ω3 ω1 Ω4 Ω6 Ω5
pp07 Ω1 !1 sxfer Ω2 Ω3 ω1 Ω4 Ω6 Ω5
pp08 Ω1 sxfer !1 Ω2 Ω3 ω1 Ω4 Ω6 Ω5
pp09 Ω1 !1 Ω2 ω1 sxfer Ω3 Ω4 Ω6 Ω5
pp10 Ω1 !1 Ω2 sxfer ω1 Ω3 Ω4 Ω6 Ω5
pp11 Ω1 !1 sxfer Ω2 ω1 Ω3 Ω4 Ω6 Ω5
pp12 Ω1 sxfer !1 Ω2 ω1 Ω3 Ω4 Ω6 Ω5
pp13 Ω1 !1 ω1 sxfer Ω2 Ω3 Ω4 Ω6 Ω5
pp14 Ω1 !1 sxfer ω1 Ω2 Ω3 Ω4 Ω6 Ω5
pp15 Ω1 sxfer !1 ω1 Ω2 Ω3 Ω4 Ω6 Ω5
pp16 Ω1 ω1 sxfer !1 Ω2 Ω3 Ω4 Ω6 Ω5
pp17 Ω1 sxfer ω1 !1 Ω2 Ω3 Ω4 Ω6 Ω526.62
10.51
26.62
26.71
10.6
26.62
26.71
27.05
5.87
26.62
26.71
27.05
27.08
5.91
26.62
26.71
27.05
27.08
pp00
pp01
pp02
pp03
pp04
pp05
pp06
pp07
pp08
pp09
pp10
pp11
pp12
pp13
pp14
pp15
pp16
pp17
0 10 20Energy Impact (mJ)
Phys
ical
Pla
n
Physical Plan − Energy Impact overview
Results
baseline
best plan
Results
15
27.08 27.05 26.71 26.62
5.91
27.08 27.05 26.71 26.62
5.87
27.05 26.71 26.62
10.6
26.71 26.62
10.51
26.62
0
10
20
pp00 pp01 pp02 pp03 pp04 pp05 pp06 pp07 pp08 pp09 pp10 pp11 pp12 pp13 pp14 pp15 pp16 pp17Physical Plan
Ener
gy Im
pact
(mJ)
Physical Plan − Energy Impact overview
Plan watch phone cloud EI (mJ)
95 % conf int
Power (mW)
Bat Life(h)
pp03 Ω1 sxfer !1 Ω2 Ω3 Ω4 ω1 Ω6 Ω5 5.87 5.70 - 6.05 5.88 18.0-18.2
pp09 Ω1 !1 Ω2 ω1 sxfer Ω3 Ω4 Ω6 Ω5 26.62 26.48 - 26.76 26.69 79.5-84.4
27.08 27.05 26.71 26.62
5.91
27.08 27.05 26.71 26.62
5.87
27.05 26.71 26.62
10.6
26.71 26.62
10.51
26.62
0
10
20
pp00 pp01 pp02 pp03 pp04 pp05 pp06 pp07 pp08 pp09 pp10 pp11 pp12 pp13 pp14 pp15 pp16 pp17Physical Plan
Ener
gy Im
pact
(mJ)
Physical Plan − Energy Impact overview
• 453 % battery life improvement[5]
• 3x data reduction between wearable and cloud
• Non-functional requirement satisfied
[5] P. Michalák, P. Watson, “PATH2iot: A Holistic, Distributed Stream Processing System”, CloudCom, 2017
Holistic Distributed Stream Processing in IoT Environments• Holistic, Distributed Stream Processing System
• Design and open-source implementation[6]
• EPL decomposition• Logical and Physical Optimisation
• Energy Impact coefficients for Pebble Watch• Battery life increased dramatically
• PoC Deployment Architecture• Future work on Multi-objective optimisation
• e.g. Bandwidth, Performance, Accuracy
16[6] https://github.com/PetoMichalak/iotPower