Information Engineering in the Age of the Internet of Things
Payam Barnaghi
Institute for Communication Systems (ICS)/5G Innovation Centre
University of Surrey
Guildford, United Kingdom
Digital Catapult, December 2015
“A hundred years hence people will be so avid of every moment of life, life will be so full of busy delight, that time-saving inventions will be at a huge premium…”
“…It is not because we shall be hurried in nerve-shattering anxiety, but because we shall value at its true worth the refining and restful influence of leisure, that we shall be impatient of the minor tasks of every day….”
The March 26, 1906, New Zealand Star:
Source: http://paleofuture.com
IBM Mainframe 360 (source: Wikipedia)
The Apollo 11 Command Module (1965) had 64 kilobytes of memory and operated at 0.043 MHz.
An iPhone 5s has a CPU running at speeds of up to 1.3 GHz and has 512 MB to 1 GB of memory.
Cray-1 (1975) produced 80 million floating-point operations per second (FLOPS).
Ten years later, Cray-2 produced 1.9 GFLOPS.
An iPhone 5s produces 76.8 GFLOPS, nearly a thousand times more than the Cray-1.
Cray-2 used 200 kilowatts of power.
Source: Nick T., PhoneArena.com, 2014
Image source: http://blog.opower.com/
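The "nearly a thousand times" comparison can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the FLOPS figures quoted above; the constant and function names are illustrative.

```python
# Figures as quoted on the slide (FLOPS = floating-point operations per second).
CRAY_1_FLOPS = 80e6        # Cray-1 (1975): 80 million FLOPS
CRAY_2_FLOPS = 1.9e9       # Cray-2, ten years later: 1.9 GFLOPS
IPHONE_5S_FLOPS = 76.8e9   # iPhone 5s: 76.8 GFLOPS


def speedup(new_flops: float, old_flops: float) -> float:
    """Return how many times more FLOPS the newer machine delivers."""
    return new_flops / old_flops


if __name__ == "__main__":
    print(f"iPhone 5s vs Cray-1: {speedup(IPHONE_5S_FLOPS, CRAY_1_FLOPS):.0f}x")
    print(f"iPhone 5s vs Cray-2: {speedup(IPHONE_5S_FLOPS, CRAY_2_FLOPS):.1f}x")
```

Against the Cray-1 the ratio is 960, which is where "nearly a thousand times" comes from; against the Cray-2 it is roughly 40.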
Computing Power
− Smaller size
− More powerful
− More memory and more storage
− "Moore's law": over the history of computing, the number of transistors in a dense integrated circuit has doubled approximately every two years.
Smaller in size but larger in scale
The old Internet timeline
Source: Internet Society
Connectivity and information exchange were (and are) the main motivation behind the Internet, but content and services are now the key elements;
and all of these started growing rapidly with the introduction of the World Wide Web (and linked information, search and discovery services).
Early days of the web
Search on the Internet/Web in the early days
AnyPlace AnyTime
AnyThing
Data Volume
Security, Reliability, Trust and Privacy
Societal Impacts, Economic Values and Viability
Services and Applications
Networking and Communication
Sensor devices are becoming widely available
- Programmable devices
- Off-the-shelf gadgets/tools
Internet of Things: The story so far
− RFID-based solutions
− Wireless Sensor and Actuator Networks, solutions for communication technologies, energy efficiency, routing, …
− Smart Devices/Web-enabled Apps/Services, initial products, vertical applications, early concepts and demos, …
− Physical-Cyber-Social Systems, Linked-data, semantics, M2M, more products, more heterogeneity, solutions for control and monitoring, …
− Future: Cloud, Big (IoT) Data Analytics, Interoperability, Enhanced Cellular/Wireless Com. for IoT, real-world operational use-cases and Industry and B2B services/applications, more Standards…
(Figure labels: Motion sensor, Motion sensor, ECG sensor)
Real world data
Data in the IoT
− Data is collected by sensory devices and also crowd sensing resources.
− It is time and location dependent.
− It can be noisy and the quality can vary.
− It is often continuous (streaming data).
− There are several important issues such as:
  − Device/network management
  − Actuation and feedback (command and control)
  − Service and entity descriptions
IoT data: challenges
− Multi-modal, distributed and heterogeneous
− Noisy and incomplete
− Time and location dependent
− Dynamic and varies in quality
− Crowdsourced data can be unreliable
− Requires (near-) real-time analysis
− Privacy and security are important issues
− Data can be biased; we need to know our data!
P. Barnaghi, A. Sheth, C. Henson, "From data to actionable knowledge: Big Data Challenges in the Web of Things," IEEE Intelligent Systems, vol. 28, issue 6, Dec 2013.
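Several of these properties (time and location dependence, varying quality, streaming) can be made concrete by carrying them with every reading. The sketch below is one possible shape for such a record and a freshness/quality filter; the field names, the quality score in [0, 1] and the thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Observation:
    """One sensor reading plus the context needed to interpret it."""
    sensor_id: str
    value: float
    unit: str
    observed_at: datetime   # time dependence
    latitude: float         # location dependence
    longitude: float
    quality: float          # hypothetical trust score in [0, 1]


def is_usable(obs: Observation, now: datetime,
              max_age: timedelta, min_quality: float) -> bool:
    """Accept a reading only if it is fresh enough and trusted enough."""
    age = now - obs.observed_at
    return timedelta(0) <= age <= max_age and obs.quality >= min_quality


def filter_stream(stream, now, max_age, min_quality):
    """Lazily pass through the usable readings of a (possibly endless) stream."""
    return (obs for obs in stream if is_usable(obs, now, max_age, min_quality))


if __name__ == "__main__":
    now = datetime(2015, 12, 1, 12, 0, tzinfo=timezone.utc)
    readings = [  # made-up example values
        Observation("t1", 21.4, "Celsius", now - timedelta(seconds=10), 51.24, -0.57, 0.9),
        Observation("t2", 35.0, "Celsius", now - timedelta(hours=2), 51.24, -0.57, 0.9),
        Observation("t3", 21.9, "Celsius", now - timedelta(seconds=5), 51.24, -0.57, 0.2),
    ]
    for obs in filter_stream(readings, now, timedelta(minutes=5), 0.5):
        print(obs.sensor_id, obs.value, obs.unit)
```

Only the first reading passes: the second is too old and the third falls below the quality threshold.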
Making IoT data widely available and (re-)usable
− Machine-readable and/or human-interpretable metadata
− Open IoT data portals (static and streaming)
− Open APIs and discoverable interfaces
− Discoverable data and patterns (and resources)
− (Easily) sharable and connectable data
− Opportunistic and ad-hoc discovery, matching and mash-up solutions
− Quality and fit-for-purpose data
− Trust-, privacy- and security-aware data collection, sharing and access solutions
Device/Data interoperability
Slide adapted from the IoT talk given by Jan Holler of Ericsson at IoT Week 2015 in Lisbon.
Heterogeneity, multi-modality and volume are among the key issues.
We need interoperable and machine-interpretable solutions…
A bit of history
− “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation.” (Tim Berners-Lee et al., 2001)
Image source: Miller 2004
Semantics & the IoT
− The Semantic Sensor (& Actuator) Web is an extension of the current Web/Internet in which information is given well-defined meaning, better enabling objects, devices and people to work in co-operation and also enabling autonomous interactions between devices and/or objects.
Semantic Descriptions in Semantic (Web) World
The world of IoT and Semantics
Example: IoT-A information model
Some good existing models: SSN Ontology
Ontology link: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
M. Compton, P. Barnaghi, L. Bermudez, et al., "The SSN Ontology of the W3C Semantic Sensor Network Incubator Group", Journal of Web Semantics, 2012.
But why do we still not have fully integrated semantic solutions in the IoT?
Semantic Web these days…
Several existing models
A simple Google search for “IoT data model” returned around 3,720,000 results.
We have good models and description frameworks;
the problem is that having good models and developing ontologies is not enough.
Semantic descriptions are intermediary solutions, not the end product.
They should be transparent to the end-user and probably to the data producer as well.
A WoT/IoT Framework
[Figure labels: WSN, Network-enabled Devices, Gateway, CoAP, HTTP, MQTT, 6LowPAN, Semantically annotate data; example resource URL: http://mynet1/snodeA23/readTemp?]
And several other protocols and solutions…
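One way to read the "Semantically annotate data" step in this framework is that a bare value fetched from a device is wrapped in a self-describing record before it is published. The sketch below is a minimal illustration under that assumption; the key names are invented for the example and are not taken from any particular ontology, and the reading itself is made up.

```python
import json
from datetime import datetime, timezone


def annotate(raw_value, source_url, observed_property, unit, observed_at):
    """Wrap a bare sensor value in a self-describing record.

    The key names are illustrative assumptions, not terms from a
    specific ontology.
    """
    return {
        "source": source_url,
        "observedProperty": observed_property,
        "unit": unit,
        "value": raw_value,
        "observedAt": observed_at.isoformat(),
    }


if __name__ == "__main__":
    record = annotate(
        raw_value=21.5,  # made-up reading
        source_url="http://mynet1/snodeA23/readTemp?",  # resource URL shown on the slide
        observed_property="temperature",
        unit="Celsius",
        observed_at=datetime(2015, 12, 1, 12, 30, 23, tzinfo=timezone.utc),
    )
    print(json.dumps(record, indent=2))
```

The same wrapping could happen on the device, at the gateway or further upstream; where to do it is a design choice.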
Publishing Semantic annotations
− We need a model (ontology); this is often the easy part for a single application.
− Interoperability between the models is a big issue.
− Expressiveness vs. complexity is a challenge.
− How and where to add the semantics
− Where to publish and store them
− Semantic descriptions for data, streams, devices (resources) and entities that are represented by the devices, and descriptions of the services.
Simplicity can be very useful…
Hyper/CAT
Source: Toby Jaffey, HyperCat Consortium, http://www.hypercat.io/standard.html
- Servers provide catalogues of resources to clients.
- A catalogue is an array of URIs.
- Each resource in the catalogue is annotated with metadata (RDF-like triples).
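The three points above describe a very small data model, which can be sketched in a few lines. This is an in-memory illustration of the idea only, not the Hyper/CAT wire format; the class names, the example URIs and the `urn:example:...` relation are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A catalogue entry: a URI plus (relation, value) metadata pairs."""
    uri: str
    metadata: list = field(default_factory=list)

    def annotate(self, relation: str, value: str) -> None:
        self.metadata.append((relation, value))


@dataclass
class Catalogue:
    """What a server would hand to a client: a list of annotated URIs."""
    resources: list = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)

    def uris(self) -> list:
        return [r.uri for r in self.resources]

    def triples(self):
        """Yield RDF-like (subject, relation, value) triples."""
        for r in self.resources:
            for relation, value in r.metadata:
                yield (r.uri, relation, value)

    def find(self, relation: str, value: str) -> list:
        """Return the URIs of resources carrying the given metadata pair."""
        return [r.uri for r in self.resources if (relation, value) in r.metadata]


if __name__ == "__main__":
    cat = Catalogue()
    temp = Resource("http://example.org/sensors/temp1")  # hypothetical URI
    temp.annotate("urn:example:rel:type", "temperature")
    cat.add(temp)
    print(cat.uris())
    print(cat.find("urn:example:rel:type", "temperature"))
```

Keeping the model this flat is what makes it easy for a client to fetch a catalogue and filter it without an ontology reasoner.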
Hyper/CAT model
Source: Toby Jaffey, HyperCat Consortium, http://www.hypercat.io/standard.html
Perhaps complex models are (sometimes) good for publishing research papers….
But they are often difficult to implement and use in real world products.
What happens afterwards is more important
− How to use the data (documentation, tools, interfaces)
− How to index and query the annotated data
− How to make the publication suitable for constrained environments and/or allow them to scale
− How to query them (considering that here we are dealing with live data, and reducing processing time and latency is often crucial)
− Linking to other sources
The IoT is a dynamic, online and rapidly changing world
isPartOf
Annotation for the (Semantic) Web
Annotation for the IoT
Image sources: ABC Australia and 2dolphins.com
Tools and APIs, e.g. Sense2Web
P. Barnaghi, M. Presser, K. Moessner, "Publishing Linked Sensor Data", in Proc. of the 3rd Int. Workshop on Semantic Sensor Networks (SSN), ISWC2010, 2010.
Tools and APIs, e.g. FIWARE IoT Discovery Generic Enabler
http://catalogue.fiware.org/enablers/iot-discovery/documentation
Model documentation, e.g. SAO ontology
http://iot.ee.surrey.ac.uk/citypulse/ontologies/sao/sao
Providing flexible APIs, e.g. SAOPY
http://iot.ee.surrey.ac.uk/citypulse/ontologies/sao/saopy.html
Creating common vocabularies and taxonomies is equally important,
e.g. event and unit taxonomies, common formats for representing location, vocabularies to describe proximity and relations between Things and their data.
We should accept the fact that sometimes we do not need (full) semantic descriptions.
Think of the applications and use-cases before starting to annotate the data.
An example: a discovery method in the IoT
[Figure labels: time, location, type; Query formulating: [#location | #type | time]; Discovery ID; Discovery/DHT Server; Data repository (archived data); #location#type; Data hypercube; Gateway; Core network. Legend: Network Connection, Logical Connection, Data.]
S. A. Hoseinitabatabaei, P. Barnaghi, C. Wang, R. Tafazolli, L. Dong, "Method and Apparatus for Scalable Data Discovery in IoT Systems", US Patents, 2015.
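The figure suggests that location and type attributes are hashed into a discovery ID, which a Discovery/DHT server resolves to the repositories holding matching data. The sketch below illustrates that general idea only and is not the method described in the cited patent: the hashing scheme, the ID layout and the in-memory `DiscoveryServer` class are all assumptions, and time-based filtering is left to the repository.

```python
import hashlib


def short_hash(text: str, length: int = 8) -> str:
    """Hash a normalised attribute value to a short hexadecimal token."""
    normalised = text.strip().lower().encode("utf-8")
    return hashlib.sha256(normalised).hexdigest()[:length]


def discovery_id(location: str, data_type: str) -> str:
    """Build a lookup key of the form #location|#type (hypothetical layout)."""
    return f"{short_hash(location)}|{short_hash(data_type)}"


class DiscoveryServer:
    """In-memory stand-in for a Discovery/DHT server."""

    def __init__(self):
        self._index = {}

    def register(self, location: str, data_type: str, repository_url: str) -> None:
        key = discovery_id(location, data_type)
        self._index.setdefault(key, set()).add(repository_url)

    def lookup(self, location: str, data_type: str) -> list:
        """Return the repositories registered for this location and type."""
        return sorted(self._index.get(discovery_id(location, data_type), set()))


if __name__ == "__main__":
    server = DiscoveryServer()
    # Hypothetical repository URL, for illustration only.
    server.register("Guildford", "temperature", "http://repo-a.example/data")
    print(server.lookup("guildford", "Temperature"))
    print(server.lookup("London", "temperature"))
```

Because the key is derived from the attributes alone, any gateway can compute it locally and ask the server directly, without a central catalogue of sensor descriptions.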
101 Smart City Use-case Scenarios
http://www.ict-citypulse.eu/page/content/smart-city-use-cases-and-requirements
Semantic descriptions can be fairly static on the Web;
In the IoT, the meaning of data and the annotations can change over time/space…
Dynamic Semantics
<iot:measurement>
  <iot:type>temp</iot:type>
  <iot:unit>Celsius</iot:unit>
  <time>12:30:23UTC</time>
  <iot:accuracy>80%</iot:accuracy>
  <loc:long>51.2365</loc:long>
  <loc:lat>0.5703</loc:lat>
</iot:measurement>
But this could be a function of time and location;
What would be the accuracy 5 seconds after the measurement?
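To make the question concrete, accuracy can be treated as a function of elapsed time rather than a fixed annotation. The sketch below assumes an exponential decay with a configurable half-life; that decay model and the 60-second half-life are pure assumptions chosen for illustration, and location dependence is not modelled.

```python
def accuracy_at(initial_accuracy: float, seconds_elapsed: float,
                half_life_s: float) -> float:
    """Accuracy after `seconds_elapsed`, assuming it halves every `half_life_s` seconds."""
    if seconds_elapsed < 0:
        raise ValueError("seconds_elapsed must be non-negative")
    if half_life_s <= 0:
        raise ValueError("half_life_s must be positive")
    return initial_accuracy * 0.5 ** (seconds_elapsed / half_life_s)


if __name__ == "__main__":
    # 80% is the accuracy annotated on the measurement; the half-life is assumed.
    for t in (0, 5, 60):
        print(f"t = {t:>2} s: accuracy = {accuracy_at(0.80, t, half_life_s=60):.3f}")
```

A real deployment would need a decay model that matches the observed phenomenon; a slowly changing room temperature and a fast-changing traffic count would call for very different half-lives.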
Dynamic annotations for data in the process chain
S. Kolozali et al., "A Knowledge-based Approach for Real-Time IoT Data Stream Annotation and Processing", iThings 2014, 2014.
Dynamic annotations for provenance data
S. Kolozali et al., "A Knowledge-based Approach for Real-Time IoT Data Stream Annotation and Processing", iThings 2014, 2014.
Metadata (Semantic) or higher-level event descriptions can also be learned and created automatically.
Extraction of events and semantics from social media
City Infrastructure
Tweets from a city
https://osf.io/b4q2t/
Pramod Anantharam, Payam Barnaghi, Krishnaprasad Thirunarayan, Amit P Sheth, "Extracting City Traffic Events from Social Streams", ACM Transactions on Intelligent Systems and Technology, 2015
Ontology learning from real world data
Frieder Ganz, Payam Barnaghi, Francois Carrez, "Automated Semantic Knowledge Acquisition from Sensor Data", IEEE Systems Journal, 2014.
Overall we need semantics and metadata in the IoT and these play a key role in providing interoperability.
However, we should design and use data publication, access and sharing solutions and their associated metadata models carefully, taking into account the constraints and dynamicity of IoT environments.
Data Lifecycle
Source: The IET Technical Report, Digital Technology Adoption in the Smart Built Environment: Challenges and opportunities of data driven systems for building, community and city-scale applications, http://www.theiet.org/sectors/built-environment/resources/digital-technology.cfm
Information Engineering for the IoT: Design Commandments
#1: Design for large-scale and provide tools and APIs.
#2: Think of who will use the data and how when you design your models.
#3: Provide means to update and change the semantic annotations.
Smart data collection
− Smart data collection
− Sooner or later we need to think whether we need to collect that data, how often we need to collect it and what volume.
− Intelligent data processing (selective attention and information extraction)
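One simple way to collect less without losing much is to report a reading only when it differs enough from the last reported one (often called send-on-delta). The sketch below illustrates that idea; the threshold and the sample values are made up.

```python
def send_on_delta(readings, threshold):
    """Yield only readings that differ from the last reported value by at least `threshold`."""
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) >= threshold:
            last_sent = value
            yield value


if __name__ == "__main__":
    samples = [20.0, 20.1, 20.2, 21.0, 21.1, 19.9]  # made-up temperatures
    reported = list(send_on_delta(samples, threshold=0.5))
    print(f"reported {len(reported)} of {len(samples)} samples: {reported}")
```

With these values only three of the six samples are transmitted, which saves bandwidth and energy at the cost of a bounded error of less than the threshold.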
(image source: KRISTEN NICOLE, siliconangle.com)
Image sources: The Daily Mail, http://helenography.net/, http://edwud.com/
Designing for real world problems
#4: Create tools, open APIs and datasets for validation, evaluation, and interoperability testing.
#5: Create common vocabularies and provide documentation.
#6: Of course you can always create a better model, but try to re-use existing ones as much as you can.
Reference Datasets
http://iot.ee.surrey.ac.uk:8080/datasets.html
Importance of Complementary Data
Open Data/Open APIs
− Open data is often misinterpreted as free and publicly available data.
− You can have open data, but with the right access controls, and with trust and privacy.
#7: Link your data and descriptions to other existing resources.
#8: Define rules and/or best practices for providing the values for each attribute.
#9: Remember the widely used (semantic) models on the Web are simple ones like FOAF.
Best Practices: an example (early draft)
Spatial Data on the Web: Best Practices (early draft)
#10: Design for different audiences (data consumers, developers, providers) and think about real impact and sustainability.
#11: Specify (and encourage others to do the same) data governance and privacy procedures, explain the ownership and re-use rules, and give control to the owners of data.
Source: LA Times, http://documents.latimes.com/la-2013/
Future cities: A view from 1998
Source: http://robertluisrabello.com/denial/traffic-in-la/#gallery[default]/0/
Source: Wikipedia
Back to the Future: 2013
Users in control or losing control?
Image source: Julian Walker, Flickr
#12: Semantics and information engineering are only one part of the solution and often not the end-product, so the focus of the design should be on creating effective methods, tools and APIs to handle and process the semantics.
Technical challenges and research directions
Technical (and non-technical) Challenges
− Creating common models to represent, publish, (re-)use and share IoT data.
− Creating an IoT data market and data-driven innovation.
− Developing standards.
− Providing best practices, demonstrators and open data portals for streaming and dynamic IoT data.
− Providing governance, dependability, reliability, trust and security models.
Research challenges
− Transforming raw data into actionable information.
− Machine learning and data analytics for large-scale, multi-modal and dynamic (streaming) data.
− Making data more accessible and discoverable.
− Energy-efficient and computationally efficient data collection, aggregation and abstraction (for both edge and Cloud processing).
Research challenges (continued)
− Integration and combination of Physical-Cyber-Social data.
− Use of data for automated interactions and autonomous services in different domains.
− Resource-aware and context-aware security, privacy and trust solutions.
Desire for innovation
Driverless Car of the Future (1957)
Image: Courtesy of http://paleofuture.com
In conclusion
− IoT information engineering is different from common models of web data and/or other types of big data.
− Data collection in the IoT comes at the cost of bandwidth, network, energy and other resources.
− Data collection, delivery and processing also depend on multiple layers of the network.
− We need more resource-aware data analytics methods and cross-layer optimisations (Deep IoT).
− The solutions should work across different systems and multiple platforms (Ecosystem of systems).
− Data sources are more than physical (sensory) observations.
− The IoT requires integration and processing of physical-cyber-social data.
− The extracted insights and information should be converted into feedback and/or actionable information.
Let’s hope
− The Internet of the Future will be for everyone, everywhere, available at any time.
− People will have control over their data.
− Data will be used for helping people.
− Smart applications will contribute to a better life and to a better use of our resources in the world!
Other challenges and topics that I didn't talk about
Resilience and reliability
Noise and incomplete data
Cloud and distributed computing
Networks, test-beds and mobility
Mobile computing
Services
IET sector briefing report
Available at: http://www.theiet.org/sectors/built-environment/resources/digital-technology.cfm
Useful information:
http://www.raeng.org.uk/publications/reports/connecting-data-driving-productivity