Transforming XPath Queries for Bottom-Up Query Processing (ISDB'02, Sept. 27, 2002)


Page 1: Sept. 27, 2002 ISDB’02 Transforming XPath Queries for Bottom-Up Query Processing Yoshiharu Ishikawa Takaaki Nagai Hiroyuki Kitagawa University of Tsukuba

Sept. 27, 2002, ISDB'02

Transforming XPath Queries for Bottom-Up Query Processing

Yoshiharu Ishikawa, Takaaki Nagai, Hiroyuki Kitagawa
University of Tsukuba
{ishikawa,kitagawa}@is.tsukuba.ac.jp

Page 2:

Presentation Overview

- Background
- Motivation and Our Approach
- The Proximal Nodes Model
- Query Translation
- Translation Example
- Related Work
- Conclusions and Future Work

Page 3:

Background

- XML: content-description language on the Web
- XPath
  - pattern-based query language for XML
  - extracts XML nodes based on the specified pattern
  - has navigational semantics
  - XSLT uses XPath for node specification
  - XQuery also uses XPath

Page 4:

XML Example

    <itemlist>
      <item category="audio equipment">
        <catalog-info>
          <type>CD player</type>
          <manufacturer>Star Electronics</manufacturer>
          <catalog-no>CDP-R55N</catalog-no>
        </catalog-info>
        <sales-info>
          <prod-year>2001</prod-year>
          <price>125.00</price>
        </sales-info>
      </item>
      ...
    </itemlist>

Page 5:

XPath Query

- Sample query Q: retrieve the prices of CD players

    /itemlist/item[@category = "audio equipment"][catalog-info/type = "CD player"]/sales-info/price

- An XPath sentence contains location steps separated by "/"
  - a location step has the format axis::node_test[predicate]...[predicate]
  - location steps can be abbreviated,
    e.g., /descendant::foo → //foo, /attribute::bar → @bar
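As a point of reference, the sample query Q can be evaluated in the usual top-down fashion with Python's standard `xml.etree.ElementTree`. This is only a sketch: ElementTree implements a limited XPath subset, so the second predicate is checked in Python code, and the names `XML` and `cd_player_prices` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The XML example from the slides, with the elided items left out.
XML = """<itemlist>
  <item category="audio equipment">
    <catalog-info>
      <type>CD player</type>
      <manufacturer>Star Electronics</manufacturer>
      <catalog-no>CDP-R55N</catalog-no>
    </catalog-info>
    <sales-info>
      <prod-year>2001</prod-year>
      <price>125.00</price>
    </sales-info>
  </item>
</itemlist>"""


def cd_player_prices(xml_text):
    """Top-down evaluation of query Q: start at the root and walk downward."""
    root = ET.fromstring(xml_text)  # root is the <itemlist> element
    prices = []
    # Step item[@category = "audio equipment"]
    for item in root.findall("item[@category='audio equipment']"):
        # Predicate [catalog-info/type = "CD player"], checked by hand
        types = item.findall("catalog-info/type")
        if any(t.text == "CD player" for t in types):
            # Steps sales-info/price
            prices.extend(p.text for p in item.findall("sales-info/price"))
    return prices


print(cd_player_prices(XML))
```

Note that every `item` element is visited, whether or not it describes a CD player; that cost is what the bottom-up approach in the following slides tries to avoid.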

Page 7:

XPath Semantics

- XPath assumes top-down query processing
  - not efficient for large XML databases
- Bottom-up processing is better in some cases

query: /article/authors[author = "Miller"]

[Figure: the article / authors / author tree with author values "Smith", "White", "Chen", and "Miller", comparing top-down and bottom-up evaluation of the query]

Page 8:

Bottom-Up Query Processing

- We can process the example query when
  - we can determine the specified leaf elements (i.e., "Miller") with the help of an index, and
  - we can select the parent for a specific author node.
- We do not need to access all the authors/author elements

[Figure: the full article / authors / author tree next to the single article / authors / author / "Miller" path that bottom-up processing touches]
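The two conditions above (an index on leaf values and a way to reach a node's parent) can be sketched in a few lines of Python. The `Node` class, the shape of the example tree (one `authors` element holding all four `author` elements), and the function names are assumptions made for this illustration; they are not taken from the slides.

```python
class Node:
    """Element node with a parent pointer, so that we can walk upward."""

    def __init__(self, name, text=None, parent=None):
        self.name = name
        self.text = text
        self.parent = parent


def build_example():
    """Hypothetical tree: article > authors > four author elements."""
    article = Node("article")
    authors = Node("authors", parent=article)
    nodes = [article, authors]
    for value in ["Smith", "White", "Chen", "Miller"]:
        nodes.append(Node("author", text=value, parent=authors))
    return nodes


def build_value_index(nodes):
    """Map (element name, text value) to the nodes carrying that value."""
    index = {}
    for node in nodes:
        if node.text is not None:
            index.setdefault((node.name, node.text), []).append(node)
    return index


def authors_with(index, value):
    """Bottom-up evaluation of /article/authors[author = value]."""
    result = []
    for author in index.get(("author", value), []):  # index lookup on the leaf
        authors = author.parent                       # one step upward
        if authors is None or authors.name != "authors":
            continue
        article = authors.parent
        # /article must be the document element, so it has no parent itself.
        if (article is not None and article.name == "article"
                and article.parent is None and authors not in result):
            result.append(authors)
    return result


index = build_value_index(build_example())
print([n.name for n in authors_with(index, "Miller")])
```

Only the `author` node found through the index and its two ancestors are touched; the "Smith", "White", and "Chen" nodes are never visited at query time.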

Page 9:

Our Objective and Approach

- Our Objective
  - Efficient bottom-up processing of XPath queries with the help of index structures
- Our Approach
  - Use of the proximal nodes model as the underlying retrieval model
    - The model enables bottom-up query evaluation
  - Development of transformation rules from XPath queries to proximal nodes expressions

Page 11:

The Proximal Nodes Model (1)

- Proposed by Navarro and Baeza-Yates [7] as a structured document retrieval model
- Uses a bottom-up query processing approach
- XML data can be treated as nested nodes:
  - a node corresponds to an element or attribute in XML
  - each node has an associated text region (called the segment); segments can be nested
- Expressive power and efficiency are well-balanced
  - evaluation cost is almost O(n), where n is the number of nodes

Page 12:

The Proximal Nodes Model (2)

The model consists of three components:

- Text pattern matching language
  - specifies pattern matching conditions
  - implementation dependent
  - returns a set of the matched nodes
  - example: "ABC Corporation"
- Retrieval operators based on document structures
  - return a set of nodes for a given element or attribute name
  - example: chapter, price
- Operators to integrate partial retrieval results
  - calculate the result node set from the given node sets
  - efficient computation based on segment relationships

Page 13:

Proximal Nodes Operators

- P in Q: a set of P nodes contained in one or more Q nodes
- P with Q: a set of P nodes that contain one or more Q nodes
- P child Q: a set of P nodes each of which is a child of a Q node
- P parent Q: a set of P nodes each of which is a parent of a Q node
- P + Q: the union of P and Q
- P - Q: the difference of P and Q
- P is Q: the intersection of P and Q
- P same Q: a set of P nodes each of which is equal to a Q node

P and Q are nodes with associated segments
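To make the operator semantics concrete, here is a naive Python sketch over node sets. Several things are assumptions of this sketch rather than part of the model as presented: a node is a `(name, start, end, depth)` tuple, containment is non-strict, "same" compares segment extents only, and child/parent are decided from containment plus nesting depth. The quadratic loops are for clarity; the model itself relies on efficient segment-based algorithms.

```python
from collections import namedtuple

# start/end delimit the node's segment; depth is its nesting level.
# Text-pattern matches may use depth=None (they only take part in "same").
Node = namedtuple("Node", ["name", "start", "end", "depth"])


def contains(q, p):
    """True if the segment of q includes the segment of p (non-strict)."""
    return q.start <= p.start and p.end <= q.end


def op_in(P, Q):      # P in Q
    return {p for p in P if any(contains(q, p) for q in Q)}


def op_with(P, Q):    # P with Q
    return {p for p in P if any(contains(p, q) for q in Q)}


def op_child(P, Q):   # P child Q
    return {p for p in P
            if any(contains(q, p) and p.depth == q.depth + 1 for q in Q)}


def op_parent(P, Q):  # P parent Q
    return {p for p in P
            if any(contains(p, q) and q.depth == p.depth + 1 for q in Q)}


def op_same(P, Q):    # P same Q: equal segment extents
    return {p for p in P
            if any((p.start, p.end) == (q.start, q.end) for q in Q)}


def op_union(P, Q):   # P + Q
    return P | Q


def op_diff(P, Q):    # P - Q
    return P - Q


def op_is(P, Q):      # P is Q
    return P & Q


# Hypothetical segments for one item of the XML example.
item = Node("item", 0, 100, 1)
catalog = Node("catalog-info", 5, 50, 2)
type_ = Node("type", 10, 20, 3)
match = Node("CD player", 10, 20, None)  # result of a text-pattern lookup

print(op_with({item}, op_same({type_}, {match})))
```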

Page 14:

Example of Proximal Nodes Expression

- Example expression of the proximal nodes model

    item with (type same "CD player")

- Query processing steps
  1. determine the node sets that correspond to the elements "item" and "type" using indexes
  2. determine the node set that corresponds to the pattern "CD player" using an index
  3. compute the result of the "same" operator
  4. compute the result of the "with" operator
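The four steps can be traced end to end in Python. The sketch below is an illustration under stated assumptions: segments are approximated by intervals handed out during a depth-first walk of an ElementTree document, the text index maps the exact text of a leaf element to that leaf's interval (so that "same" reduces to interval equality), and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indexes(xml_text):
    """Index (start, end) intervals by element name and by leaf text."""
    by_name, by_text = defaultdict(set), defaultdict(set)
    counter = 0

    def walk(elem):
        nonlocal counter
        start = counter          # interval opens when we enter the element
        counter += 1
        for child in elem:
            walk(child)
        end = counter            # and closes when we leave it
        counter += 1
        by_name[elem.tag].add((start, end))
        text = (elem.text or "").strip()
        if text and len(elem) == 0:
            by_text[text].add((start, end))

    walk(ET.fromstring(xml_text))
    return by_name, by_text


def same(P, Q):
    """P same Q, with nodes identified by their intervals."""
    return P & Q


def with_(P, Q):
    """P with Q: the P intervals that enclose at least one Q interval."""
    return {p for p in P if any(p[0] <= q[0] and q[1] <= p[1] for q in Q)}


def items_with_cd_player(xml_text):
    by_name, by_text = build_indexes(xml_text)
    items, types = by_name["item"], by_name["type"]  # step 1
    matches = by_text["CD player"]                    # step 2
    typed = same(types, matches)                      # step 3
    return with_(items, typed)                        # step 4


DOC = ("<itemlist>"
       "<item><catalog-info><type>CD player</type></catalog-info></item>"
       "<item><catalog-info><type>Amplifier</type></catalog-info></item>"
       "</itemlist>")
print(items_with_cd_player(DOC))
```

In a real system the indexes would be built once and stored; only steps 1 to 4 run at query time, and none of them walks the document tree.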

Page 16:

Translation Rules (1)

- Supports major XPath patterns
- Based on the XPath semantic description by Wadler [10]
  - Use of denotational semantics

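Page 17:

Translation Rules (2)

\[
\begin{aligned}
\mathcal{S} &: \mathit{Axis} \to \mathit{Pattern} \to \mathit{NodeName} \to \mathit{Set}(\mathit{Segment}) \\
\mathcal{S}_a[\![p_1 \mid p_2]\!]x &= \mathcal{S}_a[\![p_1]\!]x + \mathcal{S}_a[\![p_2]\!]x \\
\mathcal{S}_a[\![/p]\!]x &= \mathcal{S}_a[\![p]\!]\,\mathit{Root} \\
\mathcal{S}_a[\![\mathit{text}()]\!]x &= \mathit{Text}\ \mathcal{A}[\![a]\!]\ x \\
\mathcal{S}_a[\![p_1/p_2]\!]x &= \mathcal{S}_a[\![p_2]\!]x_1\ \mathbf{child}\ x_1, \quad \text{where } x_1 = \mathcal{S}_a[\![p_1]\!]x \\
\mathcal{S}_a[\![a_1 :: p]\!]x &= \mathcal{S}_{a_1}[\![p]\!]x \\
\mathcal{S}_a[\![n]\!]x &=
  \begin{cases}
    n\ \mathcal{A}[\![a]\!]\ x, & \text{when } x \text{ has } \mathcal{A}[\![a]\!] \text{ as } \mathcal{P}[\![a]\!] \text{ with name } n \\
    \mathit{error}, & \text{otherwise}
  \end{cases}
\end{aligned}
\]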

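Page 18:

Translation Rules (3)

\[
\begin{aligned}
\mathcal{S}_a[\![*]\!]x &= \mathcal{S}_a[\![n_1]\!]x + \cdots + \mathcal{S}_a[\![n_m]\!]x, \\
  &\quad \text{where } n_1, \ldots, n_m \text{ are all the } \mathcal{A}[\![a]\!]\text{'s of } x \text{ that have the node type } \mathcal{P}[\![a]\!] \\
\mathcal{S}_a[\![p[q]]\!]x &= \ldots, \quad \text{where } q \text{ is a numeric expression} \\
\mathcal{S}_a[\![p[q]]\!]x &= x_1\ \mathbf{with}\ \mathcal{Q}[\![q]\!]x_1, \\
  &\quad \text{where } q \text{ is a nonnumeric expression and } x_1 = \mathcal{S}_a[\![p]\!]x \\
\mathcal{S}_a[\![@n]\!]x &= \mathcal{S}_a[\![\mathit{attribute} :: n]\!]x \\
\mathcal{S}_a[\![@*]\!]x &= \mathcal{S}_a[\![@n_1]\!]x + \cdots + \mathcal{S}_a[\![@n_m]\!]x, \\
  &\quad \text{where } @n_1, \ldots, @n_m \text{ are all the attributes of } x \text{ with } \mathcal{A}[\![a]\!]
\end{aligned}
\]

[The right-hand side of the numeric-predicate rule, an expression over q and S_a[[p]]x, is not legible in this transcript.]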

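Page 19:

Auxiliary Functions

\[
\begin{aligned}
\mathcal{A} &: \mathit{Axis} \to \mathit{Operator} &\qquad \mathcal{P} &: \mathit{Axis} \to \mathit{Nodetype} \\
\mathcal{A}[\![\mathit{child}]\!] &= \mathbf{child} & \mathcal{P}[\![\mathit{child}]\!] &= \mathit{Element} \\
\mathcal{A}[\![\mathit{parent}]\!] &= \mathbf{parent} & \mathcal{P}[\![\mathit{parent}]\!] &= \mathit{Element} \\
\mathcal{A}[\![\mathit{descendant}]\!] &= \mathbf{in} & \mathcal{P}[\![\mathit{descendant}]\!] &= \mathit{Element} \\
\mathcal{A}[\![\mathit{ancestor}]\!] &= \mathbf{with} & \mathcal{P}[\![\mathit{ancestor}]\!] &= \mathit{Element} \\
\mathcal{A}[\![\mathit{attribute}]\!] &= \mathbf{child} & \mathcal{P}[\![\mathit{attribute}]\!] &= \mathit{Attribute}
\end{aligned}
\]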
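The axis tables above are small enough to write down directly, and together with the name-test rule (a name n on axis a becomes "n A[[a]] x") they already translate single location steps. The following Python sketch is an illustration only: it emits expression strings, omits the validity check on the context x, handles just name tests, `axis::name`, and `@name`, and the function name `translate_step` is hypothetical.

```python
# A : Axis -> Operator and P : Axis -> Nodetype, as listed on the slide.
AXIS_OPERATOR = {
    "child": "child",
    "parent": "parent",
    "descendant": "in",
    "ancestor": "with",
    "attribute": "child",
}
AXIS_NODETYPE = {
    "child": "Element",
    "parent": "Element",
    "descendant": "Element",
    "ancestor": "Element",
    "attribute": "Attribute",
}


def translate_step(step, context, axis="child"):
    """Translate one location step into a proximal nodes expression string."""
    if step.startswith("@"):
        # @n is treated as attribute::n
        return translate_step("attribute::" + step[1:], context, axis)
    if "::" in step:
        # a1::p continues the translation of p on axis a1
        new_axis, rest = step.split("::", 1)
        return translate_step(rest, context, new_axis)
    # Name test n on the current axis a: n A[[a]] x
    return f"({step} {AXIS_OPERATOR[axis]} {context})"


print(translate_step("price", "t4"))
print(translate_step("descendant::type", "t1"))
print(translate_step("@category", "item"))
```

Predicates, unions, and multi-step paths would need the remaining rules; they are left out here to keep the sketch short.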

Page 20:

Simplification Using the Knowledge of Document Structure

- If we know the DTD of the target XML, we can derive simpler translation results

\[
\begin{aligned}
\text{original:} \quad & \mathcal{S}_a[\![p_1/p_2]\!]x = \mathcal{S}_a[\![p_2]\!]x_1\ \mathbf{child}\ x_1, \quad \text{where } x_1 = \mathcal{S}_a[\![p_1]\!]x \\
& \text{if we know } p_2 \text{ only appears as the child of } p_1, \\
\text{simplified rule:} \quad & \mathcal{S}_a[\![p_1/p_2]\!]x = \mathcal{S}_a[\![p_2]\!]x_1, \quad \text{where } x_1 = \mathcal{S}_a[\![p_1]\!]x \\[1ex]
\text{original:} \quad & \mathcal{S}_a[\![n]\!]x =
  \begin{cases}
    n\ \mathcal{A}[\![a]\!]\ x, & \text{when } x \text{ has } \mathcal{A}[\![a]\!] \text{ as } \mathcal{P}[\![a]\!] \text{ with name } n \\
    \mathit{error}, & \text{otherwise}
  \end{cases} \\
& \text{if we know the node set corresponding to } n \text{ satisfies the relationship } n\ \mathcal{A}[\![a]\!]\ x, \\
\text{simplified rule:} \quad & \mathcal{S}_a[\![n]\!]x = n
\end{aligned}
\]

Page 22:

Translation Example

- Original query Q

    /itemlist/item[@category = "audio equipment"][catalog-info/type = "CD player"]/sales-info/price

- Translation result:

    t1 = item with (item with (category same "audio equipment"))
    t2 = catalog-info child t1
    t3 = t1 with (t1 with (((type child t2) child t2) same "CD player"))
    t4 = sales-info child t3
    ans = (((price child t4) child t4) child t3) child itemlist

Page 23:

Simplification of Query Plan (1)

- The translated result contains multiple applications of the same operator
- We can delete redundant operators by considering the operator semantics
- Example:

    t1 = item with (item with (category same "audio equipment"))
       → item with (category same "audio equipment")
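The rewrite in this example can be phrased as a small rule on expression trees: "P with (P with X)" becomes "P with X". The Python sketch below assumes a tuple representation `(operator, left, right)` with strings at the leaves; both the representation and the function name `simplify` are choices made for illustration, and only this one rule is implemented.

```python
def simplify(expr):
    """Rewrite `P with (P with X)` to `P with X`, working bottom-up.

    An expression is a string (element name, pattern, or variable) or a
    tuple (op, left, right).
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    if (op == "with" and isinstance(right, tuple)
            and right[0] == "with" and right[1] == left):
        # The outer `with` repeats the test already made by the inner one.
        return right
    return (op, left, right)


t1 = ("with", "item",
      ("with", "item", ("same", "category", '"audio equipment"')))
print(simplify(t1))
```

The rule is sound when a node counts as containing itself, which is how containment is read here; under a strict reading of "contains" the two expressions could differ.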

Page 24:

Simplification of Query Plan (2)

- If we can use the DTD information, we can further simplify the expressions
- Example:

    t3 = t1 with ((type child (catalog-info child t1)) same "CD player")
       → t1 with ((type in t1) same "CD player")

- Simplified query plan for query Q

    t1 = item with (category same "audio equipment")
    ans = price in (t1 with ((type in t1) same "CD player"))

Page 26:

Related Work

- Translation of XQL queries into proximal nodes expressions (Baeza-Yates & Navarro [2])
- Rewriting techniques for XQL queries (Wood [13])
- Use of document structure for query optimization [3,11,12,13]
- Optimization of regular path expressions in the context of semistructured DBs [4,8]

Page 28:

Conclusions and Future Work

- Conclusions
  - Bottom-up processing approach for XPath queries
    - Support of major XPath query patterns
    - Translation to proximal nodes expressions
    - Simplification and optimization techniques
- Future work
  - Support of more complete XPath semantics
  - Application of a hybrid approach (top-down and bottom-up)