virtualizing relational databases as graphs: a multi-model approach

42
Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com Virtualizing Relational Databases as Graphs: a multi-model approach Juan F. Sequeda, Ph.D Co-Founder Capsenta 1 (i.e. Want Graphs? Have Relational? No Problem!) Smart Data/Graphorum Conference – February 1, 2017

Upload: juan-sequeda

Post on 13-Apr-2017

142 views

Category:

Technology


0 download

TRANSCRIPT

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Virtualizing Relational Databases as Graphs: a multi-model approach

Juan F. Sequeda, Ph.DCo-FounderCapsenta

1

(i.e. Want Graphs? Have Relational? No Problem!)

Smart  Data/Graphorum Conference  – February  1,  2017

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Background

• Ph.D.  from  UT  Austin  Computer  Science• Research  on  Integrating  Relational  Databases  with  Semantics  and  Graphs

• Editor  W3C  Standard  on  Mapping  Relational  Databases  to  Graphs

• Co-­‐Founder  Capsenta,  spinout  from  UTCS

2

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Takeaway

• Relational  Databases  can  be  virtualized  as  Graphs!

• Do  you  really  need  to  create  another  database?

3

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com 4

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Graphs  are  Cool!

5

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Flexible

6

:US_Constitution_1992/section/123

“Excessive   bail  shall  not  be  required,  nor  

excessive   fines  imposed,  nor  cruel  and  unusual  punishments   inflicted.”

:text

:US_Constitution_1992 “United   States   of  America  1789  (rev.  1992)”

:text

:isSectionOf

:Cruelty:hasTopic

“Prohibition  of  cruel  or  degrading  treatment”

:label

“inhumane   treatment”

:keyword

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Integration

7

:US_Constitution_1992/section/123

“Excessive   bail  shall   not  be  required,   nor   excessive  fines   imposed,   nor  cruel  and  unusual   punishments  

inflicted.”

:text

:US_Constitution_1992 “United   States  of  America  1789   (rev.  1992)”

:isSectionOf

:Cruelty:hasTopic

“Prohibition   of  cruel   or  degrading   treatment”

:label

“inhumane   treatment”

:keyword

:text

:EighthAmendment_USConstitution :Farmer_vs_Brennan

:lawsApplied

“A  prison   official’s  ‘deliberate   indifference’  to  a  substantial   risk  of  a  serious   harm   to  an  inmate    

violates   the  Eighth  Amendment”

:holding:sameAs

:Prisons_in_Indiana

:LGBT_right_case_laws

:subject :subject

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Data  and  Metadata  are  One

8

:US_Constitution_1992/section/123

“Excessive   bail  shall   not  be  required,   nor   excessive  fines   imposed,   nor  cruel  and  unusual   punishments  

inflicted.”

:text

:US_Constitution_1992 “United   States  of  America  1789   (rev.  1992)”

:isSectionOf

:Cruelty:hasTopic

“Prohibition   of  cruel   or  degrading   treatment”

:label

“inhumane   treatment”

:keyword

:text

:Section :Constitution:Topic

:Rights_and_Duties

:Physical_Integrity_Rights

:subClass

:subClass

:subClass

:hasTopic :isSectionOf

:type

:type

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Common  denominator  

9

<constitution id=“US_Constitution_1992”><section id="US_Constitution_1992/section/123">

<text>Excessive bail shall ...</text></section><topic>Cruelty</topic>

</constitution>

“Excessive   bail  shall  not  be  required,  nor  excessive   fines  imposed,   nor  cruel and  unusual  punishments   inflicted.”

id text topic123 Excessive  bail  shall…   Cruelty

:US_Constitution_1992/section/123

“Excessive   bail  shall   not  be  required,   nor   excessive  fines   imposed,   nor  cruel  and  unusual   punishments  

inflicted.”

:text:Cruelty

:hasTopic

XML Text

Tabular

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Traversal,  Navigation,  Reachability

10

:US_Constitution_1992/section/123

“Excessive   bail  shall   not  be  required,   nor   excessive  fines   imposed,   nor  cruel  and  unusual   punishments  

inflicted.”

:text

:US_Constitution_1992 “United   States  of  America  1789   (rev.  1992)”

:isSectionOf

:Cruelty:hasTopic

“Prohibition   of  cruel   or  degrading   treatment”

:label

“inhumane   treatment”

:keyword

:text

:EighthAmendment_USConstitution :Farmer_vs_Brennan

:lawsApplied

“A  prison   official’s  ‘deliberate   indifference’  to  a  substantial   risk  of  a  serious   harm   to  an  inmate    

violates   the  Eighth  Amendment”

:holding:sameAs

:Prisons_in_Indiana

:LGBT_right_case_laws

:subject :subject

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Semantics

11

:US_Constitution_1992/section/123

“Excessive   bail  shall   not  be  required,   nor   excessive  fines   imposed,   nor  cruel  and  unusual   punishments  

inflicted.”

:text:Cruelty

:hasTopic

“Prohibition   of  cruel   or  degrading   treatment”

:label

“inhumane   treatment”

:keyword

:Physical_Integrity_Rights

:subClass

:hasTopic

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

(Summary)  Why  are  Graphs  Cool?

12

• Flexible• Integration• Data  and  Metadata  are  one

• Common  Denominator• Traversal,  Navigation,  Reachability

• Semantics

ACM  Computing  Surveys  2008

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Are  Relational  Databases  cool?  

13

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Flexible

14

id attr1 attr2 attr3 attr4 … attrn …

id attribute value

id attr1 val1 attr2 val2 attr3 val3

id valueattr1

id valueattr2

id valueattr3

Copeland  and  Khoshafian.  A  decomposition   storage  model.   SIGMOD  1985

Agrawal  et  al.  Storage  and  Querying  of  E-­‐Commerce  Data.  VLDB   2001

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Integration

15

ExtractTransformLoad

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Data  and  Metadata  are  One

16

CREATE TABLE Person (id int primary key,Name varchar not null,…

)

Company closePrice closeDate

IBM 130 1/15/2016

MSFT 50.99 1/15/2016

IBM MSFT closeDate

130 50.99 1/15/2016

closePrice closeDate130 1/15/2016

closePrice closeDate

50.99 1/15/2016

IBM MSFT

-­‐ Krishnamurthy  et  al.  Language  features  for  interoperability  of  databases  with  schematic  discrepancies.  SIGMOD  1991-­‐ Lakshmanan et  al.  SchemaSQL -­‐ A  Language  for  Interoperability  in  Relational  Multi-­‐database  Systems.  VLDB  1996

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Common  denominator  

• Social  Network• Hierarchical  Data

17

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Traversal,  Navigation,  Reachability

• Write  a  bunch  of  Joins• Recursion

18

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Semantics

• Views• Triggers

19

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

(Summary)  Are  Relational  Databases  Cool?

• Flexible• Integration• Data  and  Metadata  are  one• Common  Denominator• Traversal,  Navigation,  Reachability• Semantics

20

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Ditch  your  Relational  Database  and  move  to  Graphs!?

Feasible?

21

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Query  Federation

What  if  …

22

Virtualize  Relational  Databases  as  Graphs

Keep  your  legacy  data  in  the  RDBMS

Run  graph  queries  over  the  virtual  graph  data

Add  new  data  that  doesn’t  fit  into  the  schema  into  a  separate  graph

Federate  queries  over  Virtualized  Graph  and  the  Real  Graph

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Query  Federation

What  if  …

23

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

What  type  of  graphs  are  we  talking  about?

24

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Property  Graphs  vs  RDF  Graphs

25

:Bob :Alicefoaf:knows

“Bob  Smith”

foaf:name

“Alice  Smith”

foaf:name

id1 id2

knowskey value

name Bob  Smith

key value

name AliceSmith

key value

since 2005

:g1

2005

:since

http://db-­‐engines.com/en/ranking/graph+dbms http://db-­‐engines.com/en/ranking/rdf+store

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

RDF:  Resource  Description  Framework

• Graph  Data  Model• Subject  (Node)  – Predicate  (Edge)  – Object  (Node)• W3C  Standard  for  data  on  the  web• URIs

26

:US_Constitution_1992/section/123

:US_Constitution_1992 “United   States  of  America  1789   (rev.  1992)”

:isSectionOf

:text

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Schemas,  Taxonomies,  Ontologies

27

:Cruelty

:Section :Constitution:Topic

:Rights_and_Duties

:Physical_Integrity_Rights

:subClass

:subClass

:subClass

:hasTopic :isSectionOf

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

SELECT ?x ?titleWHERE{ ?x :hasTopic :cruelty.?x :text ?t

}

SPARQL  Graph

RDF  Graph

:USConst/146

:cruelty

:prerel

“Excesses  bail  shall  not  be  required  …  “

:text

?x :cruelty:hasTopic

?text

SPARQL  Protocol  and  RDF  Query  Language

28

SPARQL   is  a  Query  Language• Graph  query  language   for  RDF• Match   SPARQL  graph  with  RDF  graph• Much  more  features  in  SPARQL  1.1:  Property  Paths

SPARQL   is  a  Protocol• Send  query  over  HTTP   GET  or  POST• Response   to  a  query  is  either   in  XML,  JSON  or  CSV  format

:text

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

SPARQL  Graph

RDF  Graph

:USConst/146

:cruelty

:prerel

“Excesses  bail  shall  not  be  required  …  “

:text

?x :cruelty:hasTopic

?text

:text

SPARQL  Protocol  and  RDF  Query  Language

29

?x ?text

:USConst/146 “Excesses  bail  shall  not  be  required  …  “

SELECT ?x ?titleWHERE{ ?x :hasTopic :cruelty.?x :text ?t

}

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Virtualizing  Relational  Databases  as  Graphs

30

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Relational  Database  to  RDF  (RDB2RDF)

31

ID NAME AGE CID

1 Alice 25 100

2 Bob NULL 100

Person

CID NAME

100 Austin

200 Madrid

City

<Person/1>

<City/100>

Alice25

Austin

<Person/2>

Bob

<City/200> Madrid

foaf:namefoaf:name foaf:age

rdfs:label

rdfs:label

foaf:based_near

Mapping

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

W3C  RDB2RDF  Standards

• W3C  Standards  to  map  Relational  Data  to  RDF

• A  Direct  Mapping  of  Relational  Data  to  RDF– Default  automatic  mapping  of  relational  data  to  RDF

• R2RML:  RDB  to  RDF  Mapping  Language– Customizable  language  to  map  relational  data  to  RDF

32

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

RDF

W3C  Direct  Mapping

33

RelationalDatabase

Direct  MappingEngine

Input:  Database  (Schema  and  Data)Primary  KeysForeign  Keys

OutputRDF  graph

https://www.w3.org/TR/rdb-­‐direct-­‐mapping/

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

W3C  Direct  Mapping  Result

34

ID NAME AGE CID

1 Alice 25 100

2 Bob NULL 100

Person

CID NAME

100 Austin

200 Madrid

City

<Person/ID=1>

<City/CID=100>

Alice25

Austin

<Person/ID=2>

Bob

<City/CID=200> Madrid

Person#Name Person#Age

City#Name

City#Name

Person#ref-­‐CID

Direct  Mapping

Person#Name

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

R2RML

35

R2RMLEngine

R2RMLFile

:Cruelty

:Section :Constitution:Topic

:Rights_and_Duties

:Physical_Integrity_Rights

:subClass:subClass

:subClass

:hasTopic :isSectionOf

RDF

RelationalDatabase

Target  Schema

https://www.w3.org/TR/r2rml/

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

<TriplesMap1>a rr:TriplesMap;

rr:logicalTable [ rr:tableName”Person" ];

rr:subjectMap [ rr:template"http://www.ex.com/Person/{ID}";

rr:classfoaf:Person ];

rr:predicateObjectMap [ rr:predicate foaf:based_near ; rr:objectMap [

rr:parentTripelMap <TripleMap2>;rr:joinCondition [

rr:child “CID”;rr:parent “CID”;

]]

].

<TriplesMap2>a rr:TriplesMap;

rr:logicalTable [ rr:tableName ”City" ];

rr:subjectMap [ rr:template "http://ex.com/City/{CID}";rr:class ex:City ];

rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:column ”TITLE" ]

].

Example  R2RML

36

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Graph  Data  Virtualization

37

SPARQL

RDBMS Graph

SQL

SQL  Results

SPARQLResults

R2RML  Mappingby

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Performance  of  Ultrawrap

• Reuse  existing  relational  infrastructure– 30+  years  of  optimizations– Semantic  Query  Optimizations

• Result:  SPARQL  as  fast  as  SQL

38

Sequeda  J.  Integrating  Relational  Databases  with  the  Semantic  Web.  IOS  Press.  2016http://www.iospress.nl/book/integrating-­‐relational-­‐databases-­‐with-­‐the-­‐semantic-­‐web/

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Demo

39

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Semantic  Data  Virtualization  as  a  Bridge  for  BI

40

HIVEImpala,   etc

OracleSQL  

Server

PostgresUnstructured

Semi-­‐Structured

Enterprise  Knowledge  Graph

Search ReportsAPI BI  Connectors(Tableau,   …)

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

Takeaway:  Tipping  Point

41

Relational  Database

Graphs

• Flexible• Integration• Data  and  Metadata  are  One• Common  Denominator• Traversal,  Navigation,  Reachability  • Semantics

Do  you  really  need  another  database?  

Relational  Databases  can  be  virtualized  as  Graphs

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com

THANK  YOU

Juan  Sequeda,  Ph.DCo-­‐Founder  – [email protected]@juansequeda

42