linked data technology & status

Linked Data & Semantic Web Technology Linked Data Technology & Status Dr. Myungjin Lee

Upload: owen

Post on 24-Feb-2016


Page 1: Linked Data Technology & Status

Linked Data & Semantic Web Technology

Linked Data Technology & Status

Dr. Myungjin Lee

Page 2: Linked Data Technology & Status

The Semantic Web

• XML: an elemental syntax for content structure within documents

• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships

• OWL: more vocabulary for describing properties and classes

• RDF Schema: a vocabulary for describing properties and classes of RDF-based resources

• SPARQL: a protocol and query language for semantic web data sources

• RIF: to exchange rules between many "rules languages"

• URI: a string of characters used to identify a name or a resource

http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/#(24)

Page 3: Linked Data Technology & Status

What is Linked Data?

Linked data describes a method of publishing structured data so that it can be interlinked and become more useful.

The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data.

- A roadmap to the Semantic Web by Tim Berners-Lee

http://www.w3.org/DesignIssues/LinkedData.html

Page 4: Linked Data Technology & Status

Four Principles of Linked Data

1. Use URIs to identify things.

2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.

3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.

4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.

http://www.w3.org/DesignIssues/LinkedData.html

Page 5: Linked Data Technology & Status

5 Star Linked Data

★ Available on the web (whatever format) but with an open licence, to be Open Data

★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table)

★★★ As (2) plus non-proprietary format (e.g. CSV instead of Excel)

★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: link your data to other people’s data to provide context

http://www.w3.org/DesignIssues/LinkedData.html

Page 6: Linked Data Technology & Status

The Basic Requirements for Linked Data

• XML: an elemental syntax for content structure within documents

• RDF: a simple language for expressing data models, which refer to objects ("resources") and their relationships

• RDF Schema: a vocabulary for describing properties and classes of RDF-based resources

• SPARQL: a protocol and query language for semantic web data sources

• URI: a string of characters used to identify a name or a resource

Page 7: Linked Data Technology & Status

http://www.google.co.kr/search?q=namdeamun

Page 8: Linked Data Technology & Status


URI, Thing, and Representation

Thing

URI

Representation

http://data.kdata.kr/resource/Namdaemun

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>Namdaemun | kdata.kr</title> <link rel="alternate" type="application/rdf+xml" href="http://data.kdata.kr/data/Namdaemun" title="RDF" /></head> <body onLoad="init();"> <div id="header"> <div> <h1 id="title">Namdaemun</h1> <div id="homelink"> &nbsp;at <a href="http://kdata.kr">kdata.kr</a>

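[Diagram: Thing, URI, and Representation connected by the relations "identifies and names", "represents", "looks up", "links", and "refers"; the actors are Person and Machine. Linked URIs shown in the diagram:]

URI: http://dbpedia.org/resource/Namdaemun

URI: http://data.kdata.kr/resource/Sungnyemun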

http://www.slideshare.net/lysander07/open-hpi-semweb02part1

Page 9: Linked Data Technology & Status

http://www.w3.org/TR/cooluris/

Page 10: Linked Data Technology & Status

URIs for Real-World Objects

• Be on the Web

– Given only a URI, machines and people should be able to retrieve a description about the resource identified by the URI from the Web.

• Be unambiguous

– There should be no confusion between identifiers for Web documents and identifiers for other resources.

http://www.w3.org/TR/cooluris/

Page 11: Linked Data Technology & Status

URIs for Real-World Objects

<URI-of-alice> a foaf:Person;
    foaf:name "Alice";
    foaf:mbox <mailto:[email protected]>;
    foaf:homepage <http://www.example.com/people/alice> .

[Diagram: the resource identifier (URI), labelled ID, is associated with an RDF document URI for semantic web applications and an HTML document URI for web browsers.]

http://www.w3.org/TR/cooluris/

Page 12: Linked Data Technology & Status

Distinguishing between Representations and Descriptions

Thing: http://data.kdata.kr/resource/Namdaemun

↓ 303 redirect

Generic Document: http://data.kdata.kr/page/Namdaemun

↓ content negotiation

RDF (application/rdf+xml): http://data.kdata.kr/page/Namdaemun.rdf

HTML (text/html): http://data.kdata.kr/page/Namdaemun.html
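The redirect-then-negotiate flow on this slide can be sketched as a pure function, without a real web server. This is a minimal sketch and not kdata.kr's actual implementation: `respond`, `parse_accept`, the `REPRESENTATIONS` table, the HTML default, and the `Content-Location` and `Vary` headers are assumptions made for illustration; only the URIs and media types come from the slide.

```python
BASE = "http://data.kdata.kr"

# Media type -> suffix of the matching representation URI (from the slide).
REPRESENTATIONS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}
DEFAULT_TYPE = "text/html"  # assumption: serve HTML when the client accepts anything


def parse_accept(header):
    """Return the media types of an Accept header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def respond(path, accept=DEFAULT_TYPE):
    """Return (status, headers) for a GET request; hypothetical server logic."""
    if path.startswith("/resource/"):
        # The thing itself is not a document: answer 303 See Other and
        # point at the generic document that describes it.
        name = path[len("/resource/"):]
        return 303, {"Location": f"{BASE}/page/{name}"}
    if path.startswith("/page/"):
        # Content negotiation: pick the first acceptable representation.
        for media_type in parse_accept(accept):
            if media_type == "*/*":
                media_type = DEFAULT_TYPE
            if media_type in REPRESENTATIONS:
                return 200, {
                    "Content-Type": media_type,
                    "Content-Location": BASE + path + REPRESENTATIONS[media_type],
                    "Vary": "Accept",
                }
        return 406, {}  # nothing the client accepts is available
    return 404, {}


status, headers = respond("/resource/Namdaemun")
print(status, headers["Location"])
status, headers = respond("/page/Namdaemun", "application/rdf+xml;q=0.9, text/html;q=0.5")
print(status, headers["Content-Location"])
```

A semantic web client would send `Accept: application/rdf+xml` and end up at the `.rdf` document, while a browser asking for `text/html` ends up at the `.html` document; both start from the same identifier URI.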

Page 13: Linked Data Technology & Status

Cool URIs

• Simplicity

– short and mnemonic

• Stability

– remain as long as possible

• Manageability

– issue your URIs in a way that you can manage

http://www.w3.org/TR/cooluris/

Page 14: Linked Data Technology & Status

Designing URI Sets for the UK Public Sector

• URIs:

– name the set and describe its characteristics

– identify the real-world ‘Things’ in a single concept

– provide a means of looking up data on the web

– provide mechanisms to:

• look up an Identifier URI and be redirected to its Document URI

• discover and get each of the Representation URIs

URI Type | URI structure | Examples

Identifier | http://{domain}/id/{concept}/{reference} | http://education.data.gov.uk/id/school/78

https://www.gov.uk/government/publications/designing-uri-sets-for-the-uk-public-sector

http://data.gov.uk/resources/uris

Page 15: Linked Data Technology & Status

URI Design Principles: Creating Unique URIs for Government Linked Data

• URI Template:

'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+

• States and Territories

– Owner

• federal

– Suggested

• http://BASE/id/us/state/NAME

– Example

• http://logd.tw.rpi.edu/id/us/state/Vermont

http://logd.tw.rpi.edu/instance-hub-uri-design
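The URI template on this slide can be filled in mechanically. The following is a minimal sketch, assuming that every path segment should be percent-encoded; the function name `build_identifier_uri` is hypothetical and not part of the cited design document.

```python
from urllib.parse import quote


def build_identifier_uri(base, org, category, *tokens):
    """Fill 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+ ."""
    if not tokens:
        raise ValueError("the template requires at least one TOKEN")
    segments = [org, category, *tokens]
    # Percent-encode each segment so spaces or slashes cannot break the path.
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"http://{base}/id/{path}"


print(build_identifier_uri("logd.tw.rpi.edu", "us", "state", "Vermont"))
```

With the slide's values (BASE `logd.tw.rpi.edu`, ORG `us`, CATEGORY `state`, TOKEN `Vermont`) this reproduces the example URI for Vermont.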

Page 16: Linked Data Technology & Status

XML (Extensible Markup Language)

• a textual data format for the representation of arbitrary data structures over the Internet

• both human-readable and machine-readable

<title> W3C Demonstrates …</title>
<date> 12 February 2013</date>
<body> W3C invites media, analysts, and other attendees of Mobile World Congress …</body>

[Diagram: the concept of XML shown as Content, Structure, and Presentation, illustrated with the elements title, date, body, bold1, and bold2. Related Recommendations: XML DTD, XML Schema, XSLT, XSL-FO, XPath.]

http://en.wikipedia.org/wiki/Xml

Page 17: Linked Data Technology & Status

Data Representation of XML

• Various ways to represent data using XML

– Myungjin Lee is Hye-jin’s husband.

<conjugalrelation>
  <husband>Myungjin Lee</husband>
  <wife>Hye-jin Han</wife>
</conjugalrelation>

<conjugalrelation husband="Myungjin Lee">
  <wife>Hye-jin Han</wife>
</conjugalrelation>

<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />

• We need a method to represent data at an abstract level.
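To make the problem concrete, here is a small sketch of what a consumer has to do to read the same fact from all three XML variants. The helper names `read_role` and `extract_couple` are hypothetical; the point is only that the reader must know every possible encoding (attribute or child element) in advance.

```python
import xml.etree.ElementTree as ET

# The three encodings of the same statement from the slide.
VARIANTS = [
    "<conjugalrelation><husband>Myungjin Lee</husband>"
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee">'
    "<wife>Hye-jin Han</wife></conjugalrelation>",
    '<conjugalrelation husband="Myungjin Lee" wife="Hye-jin Han" />',
]


def read_role(root, role):
    """Look for `role` first as an attribute, then as a child element."""
    if role in root.attrib:
        return root.attrib[role]
    child = root.find(role)
    return child.text if child is not None else None


def extract_couple(xml_text):
    """Return (husband, wife) regardless of which variant was used."""
    root = ET.fromstring(xml_text)
    return read_role(root, "husband"), read_role(root, "wife")


couples = [extract_couple(variant) for variant in VARIANTS]
print(couples)
```

All three documents yield the same pair, but only because the code special-cases each layout. An abstract data model removes that guesswork.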

Page 18: Linked Data Technology & Status

RDF (Resource Description Framework)

• a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats

– Myungjin Lee is Hye-jin’s husband.

[Diagram: the statement drawn as a graph with an edge labelled hasWife]

http://en.wikipedia.org/wiki/Resource_Description_Framework

Page 19: Linked Data Technology & Status

Data Representation of RDF

Triple

Subject (URI reference): http://semantics.kr/myungjinlee

Predicate (URI reference): http://semantics.kr/rel/hasWife

Object (URI reference or Literal): http://semantics.kr/hye-jinhan
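The triple above can be held in a tiny data structure and written out in the line-based N-Triples form. This is a minimal sketch in plain Python rather than a real RDF library; the `Triple` class, the `to_ntriples` helper, and the `http://semantics.kr/rel/name` predicate used for the literal case are assumptions for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                  # URI reference
    predicate: str                # URI reference
    obj: str                      # URI reference or literal
    obj_is_literal: bool = False  # True when `obj` is a plain string literal


def to_ntriples(triple):
    """Serialize one triple as a single N-Triples line."""
    def uri(value):
        return f"<{value}>"

    if triple.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = uri(triple.obj)
    return f"{uri(triple.subject)} {uri(triple.predicate)} {obj} ."


has_wife = Triple(
    "http://semantics.kr/myungjinlee",
    "http://semantics.kr/rel/hasWife",
    "http://semantics.kr/hye-jinhan",
)
# Hypothetical predicate, used only to show an object that is a literal.
name = Triple(
    "http://semantics.kr/myungjinlee",
    "http://semantics.kr/rel/name",
    "Myungjin Lee",
    obj_is_literal=True,
)
print(to_ntriples(has_wife))
print(to_ntriples(name))
```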

Page 20: Linked Data Technology & Status

RDF Example

[Graph: node and edge labels as they appear on the slide, paired by name]

Subject: http://www.cars.com/car#A6

http://www.w3.org/1999/02/22-rdf-syntax-ns#type → http://www.cars.com/car#Car

http://www.cars.com/car#body_style → http://www.cars.com/car#Sedan

http://www.cars.com/car#doors → 4

http://www.cars.com/car#drivetrain → http://www.cars.com/car#AWD

http://www.cars.com/car#engine → http://www.cars.com/car#GDI

http://www.cars.com/car#fuel → http://www.cars.com/car#Gasoline

http://www.cars.com/car#transmission → http://www.cars.com/car#Auto_8-Speed

http://www.cars.com/car#wheelbase → 115”

Page 21: Linked Data Technology & Status

RDF Serialization

• N-Triples

– RDF Test Cases, W3C Recommendation, 10 February 2004

– a line-based, plain text serialization format for storing and transmitting RDF data

• Notation 3 (N3)

– a shorthand non-XML serialization of RDF models, designed with human-readability in mind

– much more compact and readable than XML RDF notation

• Turtle (Terse RDF Triple Language)

– W3C Candidate Recommendation, 19 February 2013

– a format for expressing data in the Resource Description Framework (RDF) data model

– a subset of Notation3 (N3) language, and a superset of the minimal N-Triples format

• RDF/XML

– W3C Recommendation, 10 February 2004

– an XML syntax for writing down and exchanging RDF graphs

http://en.wikipedia.org/wiki/N-Triples

http://en.wikipedia.org/wiki/Notation3

http://en.wikipedia.org/wiki/Turtle_(syntax)

Page 22: Linked Data Technology & Status

N-Triples

<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .

RDF/XML

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>

N3

@prefix dc: <http://purl.org/dc/elements/1.1/>.

<http://en.wikipedia.org/wiki/Tony_Benn>
    dc:title "Tony Benn";
    dc:publisher "Wikipedia".

Turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/stuff/1.0/> .

<http://www.w3.org/TR/rdf-syntax-grammar>
    dc:title "RDF/XML Syntax Specification (Revised)" ;
    ex:editor [
        ex:fullname "Dave Beckett";
        ex:homePage <http://purl.org/net/dajobe/>
    ] .

Page 23: Linked Data Technology & Status

http://www.w3.org/TR/rdf11-concepts/

Page 24: Linked Data Technology & Status

RDF 1.0 vs RDF 1.1

Feature | RDF 1.0 | RDF 1.1

Resource Identification | URI | IRI (Internationalized Resource Identifier)

Multiple RDF Graphs | No | Yes

HTML content for literal value | No | rdf:HTML

Page 25: Linked Data Technology & Status


Recommendations of RDF

http://www.w3.org/standards/techs/rdf#w3c_all

Page 26: Linked Data Technology & Status

RDF Schema

• W3C Recommendation, 10 February 2004

• to define classes and properties that may be used to describe classes, properties and other resources

• RDF Schema allows

– Definition of Classes

– Definition of Properties and Restrictions

– Definition of Hierarchies

http://www.slideshare.net/lysander07/openhpi-22

Page 27: Linked Data Technology & Status

RDF Schema Example

[Diagram: an RDF Schema graph split into a TBox (terminological component) and an ABox (assertion component). Nodes: rdfs:Class, rdf:Property, car:Vehicle, car:Car, car:Style, car:body_style, car:A6, car:Sedan. Edge labels: rdfs:subClassOf, rdfs:domain, rdfs:range, rdf:type, car:body_style.]

Page 28: Linked Data Technology & Status

RDF Semantics

• to provide a formal meaning based on a model-theoretic semantics in its abstract syntax

<x, y> is in IEXT(I(rdfs:subClassOf))

if and only if x and y are in IC

and ICEXT(x) is a subset of ICEXT(y)

[Diagram: nodes car:A6, car:Car, and car:Vehicle, with one rdfs:subClassOf edge and two rdf:type edges]
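One practical consequence of the subset condition is that an instance of a subclass is also an instance of the superclass. The sketch below applies the two standard RDFS entailment rules for class hierarchies (rdfs9 for instance membership and rdfs11 for transitivity of rdfs:subClassOf) to plain tuples. It is an illustrative forward-chaining loop, not a complete RDFS reasoner, and the assumption that the diagram shows car:A6 typed as car:Car is read from the node labels.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rules rdfs9 and rdfs11 until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        new = set()
        for sub, sup in subclass_pairs:
            for s, p, o in inferred:
                # rdfs9: an instance of a subclass is an instance of the superclass.
                if p == RDF_TYPE and o == sub:
                    new.add((s, RDF_TYPE, sup))
                # rdfs11: rdfs:subClassOf is transitive.
                if p == SUBCLASS_OF and o == sub:
                    new.add((s, SUBCLASS_OF, sup))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


graph = {
    ("car:A6", RDF_TYPE, "car:Car"),
    ("car:Car", SUBCLASS_OF, "car:Vehicle"),
}
closure = rdfs_closure(graph)
print(sorted(closure - graph))
```

The only new triple is that car:A6 is also of type car:Vehicle, which matches the requirement that ICEXT(car:Car) be a subset of ICEXT(car:Vehicle).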

Page 29: Linked Data Technology & Status

SPARQL

• Why do we need a query language for RDF?

– Why do we need a query language for RDB?

– to get to the knowledge from RDF

• SPARQL Protocol and RDF Query Language

– to retrieve and manipulate data stored in Resource Description Framework format

– to use SPARQL via HTTP

http://www.slideshare.net/lysander07/openhpi-semweb03part1

Page 30: Linked Data Technology & Status

SPARQL Example

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
  ?person a foaf:Person.
  ?person foaf:name ?name.
  ?person foaf:mbox ?email.
}

Results from the RDF Knowledge Base:

?name | ?email

Myungjin Lee | [email protected]

Gildong Hong | [email protected]

Grace Byun | [email protected]
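To show what the WHERE clause does, the sketch below evaluates the same three triple patterns against an in-memory list of triples by extending variable bindings one pattern at a time. It is a toy matcher and not a SPARQL engine; the `select` and `match` helpers and the sample people and mailboxes are invented for illustration, and any term starting with `?` is treated as a variable.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical knowledge base: these people and mailboxes are invented.
TRIPLES = [
    ("ex:alice", RDF_TYPE, FOAF + "Person"),
    ("ex:alice", FOAF + "name", "Alice"),
    ("ex:alice", FOAF + "mbox", "mailto:alice@example.org"),
    ("ex:bob", RDF_TYPE, FOAF + "Person"),
    ("ex:bob", FOAF + "name", "Bob"),  # no mbox, so Bob is filtered out
]


def match(pattern, triple, binding):
    """Return `binding` extended so that `pattern` equals `triple`, else None."""
    extended = dict(binding)
    for pattern_term, value in zip(pattern, triple):
        if pattern_term.startswith("?"):  # a variable
            if extended.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:  # a constant that must match exactly
            return None
    return extended


def select(variables, patterns, triples):
    """Evaluate a basic graph pattern and project the requested variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return [tuple(binding[v] for v in variables) for binding in bindings]


rows = select(
    ["?name", "?email"],
    [
        ("?person", RDF_TYPE, FOAF + "Person"),
        ("?person", FOAF + "name", "?name"),
        ("?person", FOAF + "mbox", "?email"),
    ],
    TRIPLES,
)
print(rows)
```

Only resources that satisfy all three patterns with the same `?person` appear in the result, which is why the table on the slide lists only people with both a name and a mailbox.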

Page 31: Linked Data Technology & Status

SPARQL Query Forms

• SELECT query

– Used to extract raw values from a SPARQL endpoint; the results are returned in a table format.

• CONSTRUCT query

– Used to extract information from the SPARQL endpoint and transform the results into valid RDF.

• ASK query

– Used to provide a simple True/False result for a query on a SPARQL endpoint.

• DESCRIBE query

– Used to extract an RDF graph from the SPARQL endpoint, the contents of which are left to the endpoint to decide based on what the maintainer deems as useful information.

http://en.wikipedia.org/wiki/SPARQL

Page 32: Linked Data Technology & Status

OWL (Web Ontology Language)

• knowledge representation languages for authoring ontologies

• If you need more expressiveness, use OWL

– such as:

Man ∩ Woman = Ø

[Diagram: three Person nodes connected by descendant edges]

[Diagram: Husband and Wife in a 1:1 relation]

[Diagram: nodes _01, Action, ActionMovie, and Genre, with edges hasGenre, subClassOf, and type]

Page 33: Linked Data Technology & Status

Linked Data Service

What more do we need?

[Diagram: a Linked Data Service built on an RDBMS, a Triple Store, and HTML pages, annotated with SPARQL, R2RML, Linked Data Platform, RDFa, and GRDDL, and producing RDF Knowledge]

Page 34: Linked Data Technology & Status

R2RML

• RDB to RDF Mapping Language

• W3C Recommendation 27 September 2012

• a language for expressing customized mappings from relational databases to RDF datasets

RDB

[Input table not captured in the transcript]

R2RML

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "EMP" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "ENAME" ];
    ].

Result

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".

http://www.w3.org/TR/r2rml/
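The effect of the triples map can be imitated in a few lines: fill the subject template from the row, emit the class triple, then emit one triple per predicate-object map. This is a rough sketch, not an R2RML processor. The dictionary form of the mapping and the `apply_triples_map` function are assumptions, and the sample row is inferred from the result shown on the slide (EMPNO 7369, ENAME SMITH).

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The mapping from the slide, reduced to a plain dict (an assumed
# in-memory form; nothing here parses real R2RML).
TRIPLES_MAP = {
    "table": "EMP",
    "subject_template": "http://data.example.com/employee/{EMPNO}",
    "subject_class": "http://example.com/ns#Employee",
    "predicate_object_maps": [
        {"predicate": "http://example.com/ns#name", "column": "ENAME"},
    ],
}


def apply_triples_map(mapping, row):
    """Generate (subject, predicate, object) tuples for one table row."""
    # Replace each {COLUMN} placeholder with the row's value for that column.
    subject = re.sub(
        r"\{(\w+)\}",
        lambda found: str(row[found.group(1)]),
        mapping["subject_template"],
    )
    triples = [(subject, RDF_TYPE, mapping["subject_class"])]
    for po_map in mapping["predicate_object_maps"]:
        value = row.get(po_map["column"])
        if value is not None:  # a NULL column generates no triple
            triples.append((subject, po_map["predicate"], str(value)))
    return triples


row = {"EMPNO": 7369, "ENAME": "SMITH"}
for triple in apply_triples_map(TRIPLES_MAP, row):
    print(triple)
```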

Page 35: Linked Data Technology & Status

Linked Data Platform

• A set of best practices and simple approach for a read-write Linked Data architecture, based on HTTP access to web resources that describe their state using RDF

• W3C Working Draft 25 October 2012

http://www.w3.org/TR/ldp/

Page 36: Linked Data Technology & Status

RDFa (the Resource Description Framework in attributes)

• W3C Recommendation, 07 June 2012

• to express machine-readable data in Web documents like HTML, SVG, and XML

Example

<p vocab="http://schema.org/" resource="#manu" typeof="Person">
  My name is <span property="name">Manu Sporny</span>
  and you can give me a ring via <span property="telephone">1-800-555-0199</span>.
  <img property="image" src="http://manu.sporny.org/images/manu.png" />
</p>

http://www.w3.org/TR/xhtml-rdfa-primer/
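As a rough illustration of how a machine can read the markup above, the sketch below collects each `property` attribute together with its value using Python's standard `html.parser`. It is far from a conforming RDFa processor: it ignores `vocab`, `resource`, and `typeof`, does not handle nested property elements, and the `PropertyCollector` class is a name made up for this example.

```python
from html.parser import HTMLParser

SNIPPET = """<p vocab="http://schema.org/" resource="#manu" typeof="Person">
  My name is <span property="name">Manu Sporny</span>
  and you can give me a ring via <span property="telephone">1-800-555-0199</span>.
  <img property="image" src="http://manu.sporny.org/images/manu.png" />
</p>"""


class PropertyCollector(HTMLParser):
    """Collect property -> value pairs from simple RDFa-annotated HTML."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None  # property whose text content is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "src" in attributes:
            # For elements such as <img>, the value is the src attribute.
            self.values[prop] = attributes["src"]
        else:
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            self.values[self._current] = "".join(self._text).strip()
            self._current = None


collector = PropertyCollector()
collector.feed(SNIPPET)
collector.close()
print(collector.values)
```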

Page 37: Linked Data Technology & Status

GRDDL (Gleaning Resource Descriptions from Dialects of Languages)

• a mechanism and markup format for Gleaning Resource Descriptions from Dialects of Languages to obtain RDF triples out of XML documents, including XHTML

HTML

<html xmlns:grddl='http://www.w3.org/2003/g/data-view#'
      grddl:transformation="glean_title.xsl getAuthor.xsl">
<head><title>Are You Experienced?</title></head>
...

glean_title.xsl

<xsl:stylesheet version="1.0">
  <xsl:template match="/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="{$subject}">
        <dc:title>
          <xsl:value-of select="/html:html/html:head/html:title"/>
        </dc:title>
      </rdf:Description>
    </rdf:RDF>
  </xsl:template>
</xsl:stylesheet>

RDF

<rdf:RDF>
  <rdf:Description rdf:about="">
    <dc:title>Are You Experienced?</dc:title>
  </rdf:Description>
</rdf:RDF>

http://www.w3.org/TR/grddl/

Page 38: Linked Data Technology & Status

Jena Platform

[Diagram: the Linked Data Service architecture (RDBMS, Triple Store, HTML pages, SPARQL) annotated with the Jena components TDB & SDB, Jena API, Fuseki, and ARQ & LARQ]

http://jena.apache.org/

Page 39: Linked Data Technology & Status

OpenLink Virtuoso

• a middleware and database engine hybrid that combines the functionality of a traditional RDBMS, ORDBMS, RDF, XML, etc.

– Relational Data Management

– RDF Data Management

– XML Data Management

– Free Text Content Management & Full Text Indexing

– Document Web Server

– Linked Data Server

– Web Application Server

– Web Services Deployment (SOAP or REST)

http://virtuoso.openlinksw.com/

Page 40: Linked Data Technology & Status

OpenLink Virtuoso Coverage

[Diagram: the Linked Data Service architecture (RDBMS, Triple Store, HTML pages, SPARQL) annotated with the Virtuoso components Sponger, SPARQL Server, and Storage and Inference]

Page 41: Linked Data Technology & Status

The Linking Open Data cloud diagram

http://lod-cloud.net/

Page 42: Linked Data Technology & Status

[Diagram legend: Media, User Generated Content, Publications, Government, Geographic, Cross-Domain, Life Sciences]

Domain | Number of datasets | Triples | (Out-)Links

Media | 25 | 1,841,852,061 | 50,440,705

Geographic | 31 | 6,145,532,484 | 35,812,328

Government | 49 | 13,315,009,400 | 19,343,519

Publications | 87 | 2,950,720,693 | 139,925,218

Cross-domain | 41 | 4,184,635,715 | 63,183,065

Life Sciences | 41 | 3,036,336,004 | 191,844,090

User-generated Content | 20 | 134,127,413 | 3,449,143

Total | 295 | 31,634,213,770 | 503,998,829

http://www.slideshare.net/lysander07/13-semantic-web-technologies-linked-data-semantic-search

Page 43: Linked Data Technology & Status

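KDATA (Linked Data for Korea)

Domain | Triples

Country codes | 3,899

Entertainment | 44,278

Administrative districts | 2,969

Elementary, middle, and high schools | 126,469

Offices of education | 1,130

Universities | 2,833

Social enterprises | 5,539

Public restrooms in Seoul | 47,340

Baseball players and teams | 228,872

Subway stations | 4,450

History | 5,392

Standard terms for administrative data | 109,101

Hanok villages | 1,155

Public WiFi installation information | 1,671

KDATA classification terms | 808

Traditional markets | 4,535

National parks | 10,605

Cultural heritage | 80,156

Public sports facilities | 49,799

Biological taxonomy | 3,256

Cultural facilities | 9,418

Park information and programs | 2,429

Exemplary price-stabilizing businesses | 16,212

Product lists of exemplary price-stabilizing businesses | 14,300

Certified products for public facilities | 6,931

Snow-removal box locations | 39,218

Wild flora and fauna information | 115,099

Wild flora and fauna sighting information | 139,608

Total | 1,077,472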

http://kdata.kr/index.jsp

Page 44: Linked Data Technology & Status

http://data.kdata.kr/resource/Namdaemun

HTML

[Screenshot of the HTML page not captured in the transcript]

RDF

<rdf:RDF>
<rdf:Description rdf:about="http://data.kdata.kr/data/Namdaemun?output=rdfxml">
  <rdfs:label>RDF description of Namdaemun</rdfs:label>
  <foaf:primaryTopic>
    <kdc:StateDesignatedHeritage rdf:about="http://data.kdata.kr/resource/Namdaemun">
      <rdfs:label>남대문 </rdfs:label>
      <rdfs:label>숭례문 </rdfs:label>
      <foaf:depiction rdf:resource="20060227132556895000.jpg"/>
      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Namdaemun"/>
...
</rdf:RDF>

SPARQL

select ?s
where {
  ?s rdf:type <http://data.kdata.kr/class/NationalTreasure> .
  ?s rdfs:label "남대문 " .
}

Page 45: Linked Data Technology & Status

Contents Search on the Semantic Web

Dr. Myungjin Lee

e-Mail : [email protected]

Twitter : http://twitter.com/MyungjinLee

Facebook : http://www.facebook.com/mjinlee

SlideShare : http://www.slideshare.net/onlyjiny/

Thanks for your attention.