Approximate Counting of Frequent Query Patterns over XQuery Stream

Liang Huai Yang, Mong Li Lee, Wynne Hsu
DASFAA 2004
Speaker: Ming Jing Tsai
2004/5/28



Page 2: Introduction

Efficient approach to improving an XML management system:
  Cache frequently retrieved results
  Frequent query patterns

Applications:
  Search engines
  XML query systems

Page 3: Preliminaries

XQuery stream: S = QPT1, QPT2, …, QPTN

Query pattern tree (QPT):
  Node labels are drawn from {"*", "//"} ∪ tagset

Rooted subtree (RST) of a QPT, with RST = (V', E') and QPT = (V, E):
  root(RST) = root(QPT)
  V' ⊆ V, E' ⊆ E
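The rooted-subtree condition can be checked directly from its definition. The following is a minimal sketch, not taken from the paper: the `Tree` class and `is_rooted_subtree` function are hypothetical names, nodes are identified by their labels (which only works when labels are unique within a tree), and the candidate RST is assumed to be a connected tree already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    """Hypothetical tree representation: a root, a node set V and an edge set E."""
    root: str
    nodes: frozenset
    edges: frozenset  # (parent, child) pairs


def is_rooted_subtree(rst: Tree, qpt: Tree) -> bool:
    # root(RST) = root(QPT), V' is a subset of V, E' is a subset of E
    return (
        rst.root == qpt.root
        and rst.nodes <= qpt.nodes
        and rst.edges <= qpt.edges
    )


# Illustrative data only: a book query with a nested author element.
qpt = Tree(
    root="book",
    nodes=frozenset({"book", "title", "author", "fn", "ln", "price"}),
    edges=frozenset({
        ("book", "title"), ("book", "author"), ("book", "price"),
        ("author", "fn"), ("author", "ln"),
    }),
)
rst = Tree(
    root="book",
    nodes=frozenset({"book", "title", "author", "price"}),
    edges=frozenset({("book", "title"), ("book", "author"), ("book", "price")}),
)

print(is_rooted_subtree(rst, qpt))
```

A subtree that starts at `author` instead of `book` fails the check because the roots differ, which is the point of the "rooted" restriction.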

Page 4: QPT

[Figure: three example query pattern trees QPT1, QPT2 and QPT3, each rooted at book and using the labels title, author, price, fn, ln and section, together with one example RST: book with children title, author and price.]

Page 5: Approximate Counting

With minimum support σ, error bound ε and N queries seen so far:
  rst.count_app ≥ (σ − ε)N
  rst.count_app ≥ rst.count_true − εN

The XQuery stream is divided into buckets of width w = ⌈1/ε⌉
The current bucket is b_current = ⌈N/w⌉
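The bucket scheme on this slide has the same shape as the classic lossy counting algorithm for streams. The sketch below shows that general technique under simplifying assumptions: each pattern is treated as an opaque hashable key, whereas the talk counts rooted subtrees of each query. The function names `lossy_count` and `frequent` are hypothetical.

```python
import math


def lossy_count(stream, epsilon):
    """Lossy counting over a stream of hashable items (illustrative sketch)."""
    w = math.ceil(1 / epsilon)      # bucket width
    entries = {}                    # item -> [count_app, delta]
    n = 0
    for item in stream:
        n += 1
        b_current = math.ceil(n / w)
        if item in entries:
            entries[item][0] += 1
        else:
            # delta bounds how many occurrences may have been missed so far
            entries[item] = [1, b_current - 1]
        if n % w == 0:
            # at each bucket boundary, drop entries that cannot be frequent
            stale = [k for k, (c, d) in entries.items() if c + d <= b_current]
            for k in stale:
                del entries[k]
    return entries, n


def frequent(entries, n, sigma, epsilon):
    """Report items whose approximate count is at least (sigma - epsilon) * n."""
    return {k: c for k, (c, d) in entries.items() if c >= (sigma - epsilon) * n}


stream = ["a", "b", "a", "c", "a", "a", "d", "a", "e", "a"]
entries, n = lossy_count(stream, epsilon=0.2)
print(frequent(entries, n, sigma=0.5, epsilon=0.2))
```

With ε = 0.2 the bucket width is 5, so the rare items `b`, `c`, `d` and `e` are dropped at the two bucket boundaries and only `a` survives with its full count of 6.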

Page 6: D-GQPT

[Figure: global query pattern tree with numbered nodes: 1 book, 2 title, 3 author, 4 fn, 5 ln, 6 section, 7 title, 8 price. The RST book with children title, author and price is shown next to its string encoding.]

1,2,-1,3,-1,8,-1
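The string 1,2,-1,3,-1,8,-1 is consistent with a preorder traversal that emits a node number on the way down and -1 each time it backtracks to the parent. A minimal sketch of such an encoder follows, assuming a tree is given as a hypothetical `(node_id, children)` tuple and that the root's own closing -1 is omitted, as it is in the slide's string.

```python
def encode(tree):
    """Preorder encoding with -1 as the backtrack marker (illustrative sketch)."""
    node_id, children = tree
    out = [node_id]
    for child in children:
        out.extend(encode(child))
        out.append(-1)  # return from the child to this node
    return out


# RST from the slide: book (1) with children title (2), author (3), price (8)
rst = (1, [(2, []), (3, []), (8, [])])
print(",".join(str(x) for x in encode(rst)))
```

Nested children work the same way: a tree with root 1, child 3, and grandchildren 4 and 5 encodes as 1,3,4,-1,5,-1,-1.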

Page 7: D-GQPT

[Figure: the same global query pattern tree and RST as on Page 6, shown with a second string encoding.]

1,2,-1,4,-1,9,-1

Page 8: ECTree

[Figure: ECTree of candidate RSTs drawn as trees of node numbers. It starts from the single-node tree 1 and the two-node trees 1-2, 1-3, 1-6 and 1-8; larger candidates, up to root 1 with children 3, 6 and 8, are generated level by level with Gjoin and Grmlne.]

Page 9: Candidate Generation

Rightmost active leaf node expansion:
  Grmlne(RST^k_i) = RST^{k+1}_{ir}

Join:
  Gjoin(RST^k_i) = { RST^{k+1}_{ij} | RST^{k+1}_{ij} = RST^k_i ⋈ RST^k_j, j = i+1, …, N }

Page 10: Prune

RST^{k+1} does not exist in ECTree:
  RST^{k+1}.Δ = b_current − β
  If |RST^{k+1}.tidlist| < β, prune

RST^{k+1} exists in ECTree:
  RST^{k+1}.count_app = RST^{k+1}.count_app + |RST^{k+1}.tidlist|
  If RST^{k+1}.count_app + RST^{k+1}.Δ < b_current, prune

Join result with RST^{k+1}

Subtree induced by RST^{k+1}
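The two pruning cases can be sketched as a single update function. This is an illustration under stated assumptions, not the paper's implementation: the ECTree is modelled as a plain dictionary keyed by the pattern's string encoding, β is passed in as a parameter because the slide does not define it, and a newly inserted pattern is assumed to start with count_app equal to the size of its tidlist. The name `update_or_prune` is hypothetical.

```python
def update_or_prune(ectree, pattern, tidlist, b_current, beta):
    """Apply the slide's two pruning rules; return True if the pattern is kept."""
    support = len(tidlist)
    entry = ectree.get(pattern)
    if entry is None:
        # Pattern not yet in the ECTree: prune if |tidlist| < beta
        if support < beta:
            return False
        # Assumption: the initial approximate count is the tidlist size
        ectree[pattern] = {"count_app": support, "delta": b_current - beta}
        return True
    # Pattern already in the ECTree: add the new occurrences, then test
    entry["count_app"] += support
    if entry["count_app"] + entry["delta"] < b_current:
        del ectree[pattern]
        return False
    return True


ectree = {}
kept_new = update_or_prune(ectree, "1,2,-1", [1, 2, 3], b_current=3, beta=2)
kept_rare = update_or_prune(ectree, "1,3,-1", [7], b_current=3, beta=2)
print(kept_new, kept_rare, ectree)
```

The two rules agree with each other: for a new pattern, count_app + Δ < b_current reduces to |tidlist| < β once Δ = b_current − β is substituted.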

Page 11: AppXQSMiner

Page 12: AppXQSMiner

Page 13: ECTree

[Figure: the ECTree example from Page 8, shown again.]

Page 14: Experiment

Platform: Pentium 4 2.4 GHz, 1 GB RAM, Windows XP

Datasets:
  DBLP DTD: 98 nodes
  Shakespeare's Play DTD: 23 nodes

Page 15: Experiment (error = 0.1σ)

Page 16: Experiment (error = 0.1σ)

Page 17: Experiment (sup = 0.005)

Page 18: Experiment (sup = 0.005)

Page 19: Experiment (error = 0.05σ)

Page 20: Experiment (error = 0.05σ)