The Web of Data: The W3C Semantic Web Initiative
From the 19 February 2014 NISO Virtual Conference, “The Semantic Web Coming of Age: Technologies and Implementations”: The Web of Data, by Ralph Swick, Domain Lead of the Information and Knowledge Domain at W3C.
![Page 1: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/1.jpg)
The Web of Data
NISO Virtual Conference
19 February 2014
Ralph Swick, W3C
![Page 2: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/2.jpg)
Agenda
• Data is changing our lives
• W3C’s traditional focus
• Expanding scope of W3C’s data activities
![Page 3: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/3.jpg)
Web has transformed our relation
to computers and to data
• A computer in every pocket
• Apps leveraging context
– geolocation and other sensors
– social context (“I’m at the conference, too!”)
• Change in the use of search
– people search for answers, not sites
– answers from aggregated data
(Siri, Google Now, Wolfram Alpha)
![Page 4: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/4.jpg)
Apps are using data from many
sources
• Social networking
• Mobile devices
• Sensors
• Open data
![Page 5: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/5.jpg)
Imagine…
• A “Web” where
– documents are available for download
on the Internet
– but there would be no hyperlinks
among them
![Page 6: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/6.jpg)
Data on the Web is not enough…
• We need a proper infrastructure for a
real Web of Data where:
– data are available on the Web
• accessible via standard Web technologies
– data are interlinked over the Web
– data can be integrated over the Web
• This is Linked Data
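
A minimal sketch of what "interlinked" and "integrated" can mean in practice, assuming data is modeled as (subject, predicate, object) triples. All URIs, the two datasets, and the helper names `merge` and `follow` are hypothetical illustrations, not part of the talk.

```python
# Two independently published datasets, each a set of
# (subject, predicate, object) triples. All URIs are hypothetical.
CITY_DATA = {
    ("http://example.org/city/Kolkata",
     "http://example.org/vocab/country",
     "http://example.org/country/India"),
}
COUNTRY_DATA = {
    ("http://example.org/country/India",
     "http://example.org/vocab/capital",
     "http://example.org/city/NewDelhi"),
}


def merge(*datasets):
    """Integrate datasets by set union; shared URIs line up automatically."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def follow(graph, start, *predicates):
    """Follow a chain of predicates from a start URI; return reachable objects."""
    current = {start}
    for predicate in predicates:
        current = {o for (s, p, o) in graph if s in current and p == predicate}
    return current


graph = merge(CITY_DATA, COUNTRY_DATA)
capitals = follow(
    graph,
    "http://example.org/city/Kolkata",
    "http://example.org/vocab/country",
    "http://example.org/vocab/capital",
)
print(capitals)
```

Because both datasets use the same URI for the country, the second hop works without any mapping step; that shared identifier is the "hyperlink" between the two sources.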
![Page 7: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/7.jpg)
Agenda
• Data is changing our lives
• W3C’s traditional focus
• Expanding scope of W3C’s data activities
![Page 8: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/8.jpg)
Semantic Web Core
• RDF data model
• RDF Schema vocabulary design
• RDB2RDF relational DB export
• SPARQL query
• SKOS vocabulary description
• OWL ontological inference
• RIF rules interchange
• LDP read-write Web of Data
• POWDER description resources
• GRDDL app-specific XML
![Page 9: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/9.jpg)
Need for RDF schemas
• First step towards the “extra knowledge”:
– define the terms we can use
– what restrictions apply
– what extra relationships are there?
• “RDF Vocabulary Description Language”
– the term “Schema” is retained for historical
reasons…
![Page 10: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/10.jpg)
Vocabularies
• There is a need for “languages” that
– define such vocabularies
– assign clear “semantics” on how new
relationships can be deduced
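
As a rough illustration of "deducing new relationships", the sketch below applies two RDF Schema entailment patterns: `rdfs:subClassOf` is transitive, and an instance of a subclass is also an instance of the superclass. The `ex:` terms and the function name `rdfs_closure` are hypothetical; a real reasoner covers many more rules.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Repeatedly apply two RDFS patterns until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in inferred:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in inferred:
                # subClassOf is transitive: s ⊆ o and o ⊆ o2 gives s ⊆ o2.
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # An instance of s is also an instance of its superclass o.
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical vocabulary and one instance.
data = {
    ("ex:Novel", SUBCLASS, "ex:Book"),
    ("ex:Book", SUBCLASS, "ex:Work"),
    ("ex:Ulysses", RDF_TYPE, "ex:Novel"),
}
closure = rdfs_closure(data)
for triple in sorted(closure - data):
    print(triple)
```

The three printed triples were never stated by the publisher; they follow from the vocabulary definition alone.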
![Page 11: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/11.jpg)
SKOS
• SKOS provides a simple bridge
between the “print world” and the
(Semantic) Web
• Thesauri, glossaries, etc., from the
library community can be made
available
• SKOS can also be used to organize
tags, annotate other vocabularies,
…
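
A small sketch of how a thesaurus hierarchy might be represented once it is expressed with SKOS-style broader links. The concept identifiers, labels, and the function `broader_chain` are hypothetical, and the sketch assumes each concept has at most one broader concept, which SKOS itself does not require.

```python
# Hypothetical concepts: each maps to its single broader concept.
BROADER = {
    "ex:sonnets": "ex:poetry",
    "ex:poetry": "ex:literature",
}

# Preferred labels, in the spirit of skos:prefLabel.
PREF_LABEL = {
    "ex:sonnets": "Sonnets",
    "ex:poetry": "Poetry",
    "ex:literature": "Literature",
}


def broader_chain(concept):
    """Walk upward through broader links (akin to skos:broaderTransitive)."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


path = [PREF_LABEL[c] for c in broader_chain("ex:sonnets")]
print(" > ".join(reversed(path)))
```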
![Page 12: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/12.jpg)
Semantic Web/Linked Data Today
• Standards are mature
– some level of maintenance work is always needed
• Server-side applications dominate
• Commercial applications exist, e.g.:
– direct integration/usage of linked data on the Web
– consumption of other formats converted internally to a
common format (RDF)
![Page 13: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/13.jpg)
Challenge: leverage data in
interoperable apps
• Public, private, behind enterprise firewalls
• From informal to highly curated
• From machine readable to human readable
– HTML tables, Twitter feeds, local vocabularies,
spreadsheets, …
• Expressed in diverse data models
– tree, graph, table, …
• Serialized in many ways
– XML, CSV, RDF, PDF, JSON, HTML Tables,…
![Page 14: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/14.jpg)
The Linking Open Data Project
![Page 15: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/15.jpg)
Linked Data Principles
Is your data 5 Star?
• ★ Available on the Web in some format (i.e., use a URI to access the data)
• ★★ Available as machine-readable structured data (e.g., Excel instead of an image scan)
• ★★★ As before, but using a non-proprietary format (e.g., CSV instead of Excel)
• ★★★★ All the above, plus use open standards (RDF & Co.) to identify things, so that people could point at your stuff
• ★★★★★ All the above, plus link your data to other people’s data to provide context
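
The five levels are cumulative: each star assumes the ones before it. A minimal sketch of that logic follows; the `Dataset` fields and the function `star_rating` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    on_web: bool               # reachable at a URI
    machine_readable: bool     # structured data, not a scan
    open_format: bool          # non-proprietary format such as CSV
    uses_open_standards: bool  # RDF & Co. to identify things
    links_to_other_data: bool  # links out to other people's data


def star_rating(dataset):
    """Count stars in order, stopping at the first unmet criterion."""
    criteria = (
        dataset.on_web,
        dataset.machine_readable,
        dataset.open_format,
        dataset.uses_open_standards,
        dataset.links_to_other_data,
    )
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


csv_on_web = Dataset(True, True, True, False, False)
scanned_pdf = Dataset(True, False, False, False, False)
print(star_rating(csv_on_web), star_rating(scanned_pdf))
```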
![Page 16: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/16.jpg)
A Three Star Example
![Page 17: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/17.jpg)
The importance of Linked Data
• Provide a core set of data that
applications can build on
– stable references for “things”,
• e.g., http://dbpedia.org/resource/Kolkata/
– many many relationships that applications
may reuse
– a “nucleus” for a larger, semantically
enabled Web!
![Page 18: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/18.jpg)
Linked Data Platform (LDP)
• Define an HTTP/RESTful infrastructure to publish, read, write, or modify linked data
– typical usage: data-intensive application in a browser, application integration using shared data…
• The infrastructure should be easy to implement and install
– provides an “entry point” for Linked Data applications!
• The work is nearing completion
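
A sketch of what "write" could look like over plain HTTP, assuming the POST-to-container pattern described in the LDP 1.0 specification. The container URL, the Turtle payload, and the helper `build_create_request` are hypothetical; the request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical LDP container; no such server is assumed to exist.
CONTAINER = "http://example.org/ldp/notes/"

# Turtle body for the new resource; <> refers to the resource being created.
TURTLE = b"""@prefix dcterms: <http://purl.org/dc/terms/> .
<> dcterms:title "Conference notes" .
"""


def build_create_request(container_url, turtle_body, slug):
    """Build (but do not send) a POST that asks a container to create a resource."""
    return urllib.request.Request(
        container_url,
        data=turtle_body,
        method="POST",
        headers={
            "Content-Type": "text/turtle",
            # Slug is a hint for the new resource's URI; servers may ignore it.
            "Slug": slug,
            "Link": '<http://www.w3.org/ns/ldp#Resource>; rel="type"',
        },
    )


request = build_create_request(CONTAINER, TURTLE, "niso-2014")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call against a real server; reading and modifying would use GET and PUT on the resulting resource URI in the same style.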
![Page 19: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/19.jpg)
RDF with HTML: RDFa
• By adding some “meta” information, the same source can be reused
– typical example: your personal information, like address, should be readable for humans and processable by machines
• Some solutions have emerged:
– add extra statements in microdata or RDFa that can be converted to RDF
• microdata can be used for a (useful) subset of RDF
• RDFa is, essentially, a complete serialization of RDF
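
To make "readable for humans and processable by machines" concrete, the sketch below pulls property/value pairs out of a small RDFa-annotated HTML fragment. It is deliberately naive and is not a conforming RDFa processor: it ignores nesting, `typeof`, prefixes, and the `content` attribute. The HTML fragment, the person, and the class name `PropertyExtractor` are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment using RDFa attributes with the schema.org vocabulary.
HTML = """
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>
  <span property="email">jane@example.org</span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect (property URI, text) pairs from elements carrying `property`."""

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.current_property = None
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "vocab" in attributes:
            self.vocab = attributes["vocab"]
        self.current_property = attributes.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self.current_property and text:
            self.pairs.append((self.vocab + self.current_property, text))

    def handle_endtag(self, tag):
        self.current_property = None


extractor = PropertyExtractor()
extractor.feed(HTML)
for predicate, value in extractor.pairs:
    print(predicate, "->", value)
```

The same markup renders as ordinary text in a browser, while a consumer like this one recovers the statements behind it.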
![Page 20: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/20.jpg)
schema.org
• Schema.org is a cooperation of search engines (Bing, Google, Yahoo!, and Yandex)
• It is a large vocabulary that they all understand
• The terms are extracted from HTML5+microdata or HTML5+RDFa
– the various partners use it for different purposes
– it can be used by anyone outside of the search world!
![Page 21: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/21.jpg)
![Page 22: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/22.jpg)
Some things to remember when
you publish data
• Publish your data first, do user interfaces later!
– the “raw data” can become useful in its own right and others may use it
– you can add value later by providing nice user access
• If possible, publish your data in RDF but if you cannot, others may help you in conversions
– trust the community…
• Add links to other data. “Just” publishing isn’t enough…
![Page 23: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/23.jpg)
Some things to remember when
you publish data (2)
• Think about persistence and versioning
– others may depend on the data you publish…
• Be thoughtful about the URIs you choose
• Try to avoid reinventing the wheel when
choosing vocabularies
![Page 24: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/24.jpg)
Some things to remember when
you publish data (3)
• Document your data, i.e., provide
metadata
– there are vocabularies to do this
• Data Catalog Vocabulary (DCAT)
• Vocabulary of Interlinked Datasets (VoID)
• DCTERMS
• vocabularies for licensing (Open Data Commons,
government licenses)
– this area is still very much in development…
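
A sketch of dataset documentation using DCAT and DCTERMS terms, emitted as Turtle text. The dataset URI, title, license URI, download URL, and the function `describe_dataset` are hypothetical; the sketch assumes the title contains no characters that need escaping in Turtle.

```python
def describe_dataset(uri, title, license_uri, download_url, media_type):
    """Return a Turtle description of one dataset with one distribution."""
    return "\n".join([
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{uri}> a dcat:Dataset ;",
        f'    dcterms:title "{title}" ;',
        "    dcat:distribution [",
        "        a dcat:Distribution ;",
        f"        dcterms:license <{license_uri}> ;",
        f"        dcat:downloadURL <{download_url}> ;",
        f'        dcat:mediaType "{media_type}"',
        "    ] .",
    ])


turtle = describe_dataset(
    uri="http://example.org/dataset/holdings",          # hypothetical
    title="Library holdings",                           # hypothetical
    license_uri="http://example.org/licenses/open",     # hypothetical
    download_url="http://example.org/data/holdings.csv",
    media_type="text/csv",
)
print(turtle)
```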
![Page 25: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/25.jpg)
Agenda
• Data is changing our lives
• W3C’s work on data integration
• Expanding scope of W3C’s data activities
![Page 26: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/26.jpg)
New work underway
• CSV on the Web
• Data on the Web Best Practices
• Vocabulary management
![Page 27: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/27.jpg)
What we are hearing
• CSV is everywhere
– can be huge data sets, not easily readable in a spreadsheet
or Google Refine
– meaning of data not in machine-readable form
– data is not necessarily used for web-scale integration but
rather immediate usage
• Metadata is essential
• Conversion is an issue
• European Commission Study on business models
for Linked Open Government Data (BM4LOGD)
![Page 28: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/28.jpg)
Linked Data Benefits (BM4LOGD)
• Flexible data integration
– Streamlined internal processes
– Where working relationships already exist, much easier to
share
– Linking reference collections; discovery of new relationships
• Increase in data quality
– More use of data internally brings errors to light
– Use of open standards increases quality of system
• New services
• Cost reduction
– Increased efficiency
– Increase in data usage due to LOD enrichment
![Page 29: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/29.jpg)
CSV on the Web
• How W3C can help
– metadata vocabulary to describe CSV data (structure,
reference to access rights, annotations, etc.)
– metadata discovery (e.g., part of an HTTP header, special
rows and columns, packaging formats…)
– mapping content to RDF, JSON, XML
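
A sketch of the idea behind the third point: a small metadata description says which column identifies the row's subject and which property each remaining column stands for, and a generic converter turns rows into triples. The metadata layout, the URIs, the sample values, and the function `csv_to_triples` are hypothetical and do not follow any published W3C format.

```python
import csv
import io

# Illustrative CSV content; the names and numbers are made up.
CSV_TEXT = "city,population\nSpringfield,30000\nShelbyville,25000\n"

# Hypothetical metadata: how to interpret the columns of the file above.
METADATA = {
    "subject_column": "city",
    "subject_base": "http://example.org/city/",
    "columns": {"population": "http://example.org/vocab/population"},
}


def csv_to_triples(csv_text, metadata):
    """Map each CSV row to (subject, predicate, value) triples via the metadata."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = metadata["subject_base"] + row[metadata["subject_column"]]
        for column, predicate in metadata["columns"].items():
            triples.append((subject, predicate, row[column]))
    return triples


triples = csv_to_triples(CSV_TEXT, METADATA)
for triple in triples:
    print(triple)
```

Keeping the interpretation in separate metadata means the CSV file itself stays untouched, which matters when publishers cannot or will not change their export format.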
![Page 30: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/30.jpg)
Best practices
• Document best practices for the data publishers
– URI design, management of persistence, versioning
– business models
– use of core metadata vocabularies (provenance, access
control, ownership)
• Specific vocabularies
– quality, application descriptions, …
![Page 31: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/31.jpg)
Vocabulary management:
challenge
• Interoperable vocabularies are key for (meta)data
• At the moment, it is a fairly chaotic world…
– many, possibly overlapping vocabularies
– difficult to locate the one that is needed
– vocabularies may not be properly managed, maintained,
versioned, or kept persistent…
![Page 32: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/32.jpg)
Vocabulary management: how
W3C can help
• Provide a space where
– communities can develop vocabularies (through, e.g.,
CGs, possibly WGs)
– host vocabularies at W3C if requested
– annotate vocabularies with a proper set of metadata terms
– establish a vocabulary directory
• The exact structure is still being discussed
![Page 33: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/33.jpg)
Summary
• Data-driven smart apps are one of the major growth
engines for the worldwide software market.
• We need to meet developers where they are.
• 5 Star Benefits of LOD
– Greater efficiency, better provision of the task
– Greater flexibility leads to lower costs for future projects
– New services, new connections, new discoveries
– Improved navigation within and between datasets
– Others can build apps based on your data
![Page 34: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/34.jpg)
Available specifications:
Primers, Guides
• Primers:
– RDF Primer
– OWL Guide
– SKOS Primer
– GRDDL Primer
– RDFa Primer
• The W3C Semantic Web Activity Wiki has links to all
the specifications
![Page 35: The Web of Data: The W3C Semantic Web Initiative](https://reader033.vdocuments.pub/reader033/viewer/2022060108/55508460b4c905a85c8b4869/html5/thumbnails/35.jpg)
These slides are on the Web at http://www.w3.org/2014/Talks/0219-NISO-RRS, with thanks to Ivan Herman, W3C, and Phil Archer, W3C.