Analysis of Tree Edit Distance Algorithms
Serge Dulucq and Hélène
B89902009 黃鼎翔, B89902011 田知本, B89902045 巨彥霖
Motivation
One way of comparing two ordered trees is by measuring their edit distance
Application areas
• Comparison of hierarchically structured data
• Alignment of RNA secondary structures in computational biology
Two algorithms using dynamic programming
• Zhang-Shasha
• Klein
Purpose
A general analysis of dynamic programming for edit distance algorithms
• Study the complexity of those decompositions by counting the exact number of distinct recursive calls
Define a new edit distance algorithm for trees that improves on the original algorithms with respect to the number of recursive calls
Trees and forests
A tree is a node (called the root) connected to an ordered sequence of disjoint trees
Such a sequence is called a forest
We write l(A1◦…◦An) for the tree composed of the node l connected to the sequence of trees A1, …, An
[Figure: the tree l(A1◦…◦An); two trees on the nodes 1 to 5 that are not equal as ordered trees (≠)]
|F| denotes the number of nodes of the forest F
SF(F) is the set of all subforests of F
F(i), i is a node of F, denotes the subtree of F rooted at i
deg(i) is the degree of i, that is the number of children of i
[Figure: an example forest F with nodes 1 to 10, so |F| = 10; one of its subforests, an element of SF(F); the subtree F(2); deg(4) = 2]
Edit distance
Let F and G be two forests. The edit distance between F and G, denoted d(F, G), is the minimal cost of edit operations needed to transform F into G
Operations
• Substitution
• Insertion
• Deletion
Let Cs, Ci, Cd denote the costs of substitution, insertion and deletion
Recursive relationship (1/3)
Strings
• u, v are strings; x, y are alphabet symbols
• d(xu, yv) = min{ Cd(x) + d(u, yv), Ci(y) + d(xu, v), Cs(x, y) + d(u, v) }
• d(ux, vy) = min{ Cd(x) + d(u, vy), Ci(y) + d(ux, v), Cs(x, y) + d(u, v) }
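The string recurrence d(xu, yv) can be sketched directly in Python with memoization. Unit costs for Cd and Ci, and a 0/1 cost for Cs, are assumptions made for the example; the function name `edit_distance` is illustrative.

```python
from functools import lru_cache

def edit_distance(s, t,
                  cd=lambda x: 1,                         # assumed deletion cost
                  ci=lambda y: 1,                         # assumed insertion cost
                  cs=lambda x, y: 0 if x == y else 1):    # assumed substitution cost
    """d(s, t) computed with the recurrence on first symbols."""
    @lru_cache(maxsize=None)
    def d(i, j):
        # d(i, j) is the distance between the suffixes s[i:] and t[j:]
        if i == len(s):
            return sum(ci(y) for y in t[j:])
        if j == len(t):
            return sum(cd(x) for x in s[i:])
        x, y = s[i], t[j]
        return min(cd(x) + d(i + 1, j),          # delete x
                   ci(y) + d(i, j + 1),          # insert y
                   cs(x, y) + d(i + 1, j + 1))   # substitute x by y
    return d(0, 0)

print(edit_distance("kitten", "sitting"))  # 3
```

The memo table has at most (|s|+1)(|t|+1) entries; the rest of the slides count the analogous set of distinct subproblems for forests.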
Recursive relationship (2/3)
Trees
• l, l’ are roots; F, F’ are forests
• d(l(F), l’(F’)) = min{ Cd(l) + d(F, l’(F’)), Ci(l’) + d(l(F), F’), Cs(l, l’) + d(F, F’) }
Recursive relationship (3/3)
Forests
• T, T’ are forests
• Left decomposition
d(l(F)◦T, l’(F’)◦T’) = min{ Cd(l) + d(F◦T, l’(F’)◦T’), Ci(l’) + d(l(F)◦T, F’◦T’), d(l(F), l’(F’)) + d(T, T’) }
• Right decomposition
d(T◦l(F), T’◦l’(F’)) = min{ Cd(l) + d(T◦F, T’◦l’(F’)), Ci(l’) + d(T◦l(F), T’◦F’), d(l(F), l’(F’)) + d(T, T’) }
• A direction indicates whether the left or the right decomposition is applied
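A minimal sketch of the tree and forest recurrences, assuming unit costs and the tuple encoding `(label, children)` for trees (both are assumptions for illustration). The direction is chosen by a caller-supplied `strategy(F, G)` function, a hypothetical name, that returns "left" or "right".

```python
from functools import lru_cache

def size(forest):
    return sum(1 + size(kids) for _, kids in forest)

def make_distance(strategy):
    """Build d(F, G) for a strategy(F, G) returning "left" or "right"."""
    @lru_cache(maxsize=None)
    def d(F, G):
        if not F:
            return size(G)                 # insert every node of G
        if not G:
            return size(F)                 # delete every node of F
        if strategy(F, G) == "left":
            (l, Fk), T = F[0], F[1:]
            (m, Gk), U = G[0], G[1:]
            f_without_root, g_without_root = Fk + T, Gk + U
        else:
            (l, Fk), T = F[-1], F[:-1]
            (m, Gk), U = G[-1], G[:-1]
            f_without_root, g_without_root = T + Fk, U + Gk
        if not T and not U:                # two single trees: rule (2/3)
            third = (0 if l == m else 1) + d(Fk, Gk)
        else:                              # general forests: rule (3/3)
            third = d(((l, Fk),), ((m, Gk),)) + d(T, U)
        return min(1 + d(f_without_root, G),   # delete l
                   1 + d(F, g_without_root),   # insert l'
                   third)
    return d

leaf = lambda x: (x, ())
A = (("a", (leaf("b"), leaf("c"))),)       # the tree a(b, c)
B = (("a", (leaf("c"),)),)                 # the tree a(c)
d_left = make_distance(lambda F, G: "left")
d_right = make_distance(lambda F, G: "right")
print(d_left(A, B), d_right(A, B))         # 1 1
```

Both directions return the same distance; what changes is the set of subproblems that end up in the cache (`d_left.cache_info().currsize`), which is what the following slides count.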
Example
[Figure: an example tree with nodes 1 to 5 decomposed step by step; panels: Left decomposition, Right decomposition]
Strategy & Relevant forests
Let F and G be two forests. A strategy is a mapping from SF(F)×SF(G) to {left, right}
Let (F, F’) be a pair of forests provided with a strategy φ. The set RFφ(F, F’) of relevant forests is defined as the least subset of SF(F)×SF(F’) such that if the decomposition of (F, F’) meets the pair (G, G’), then (G, G’) belongs to RFφ(F, F’)
RFφ(F) and RFφ(F’) denote the projections of RFφ(F, F’) on SF(F) and SF(F’)
#relevant denotes the number of relevant forests
Proposition (1/2)
• F = F’ = Ø → RFφ(F, F’) = Ø
• φ(F, F’) = left, F = l(G)◦T, F’ = Ø → RFφ(F, F’) = {(F, F’)} ∪ RFφ(G◦T, F’)
• φ(F, F’) = right, F = T◦l(G), F’ = Ø → RFφ(F, F’) = {(F, F’)} ∪ RFφ(T◦G, F’)
• φ(F, F’) = left, F = Ø, F’ = l’(G’)◦T’ → RFφ(F, F’) = {(F, F’)} ∪ RFφ(F, G’◦T’)
Reminder:
d(l(G)◦T, l’(G’)◦T’) = min{ Cd(l) + d(G◦T, l’(G’)◦T’), Ci(l’) + d(l(G)◦T, G’◦T’), d(l(G), l’(G’)) + d(T, T’) }
d(T◦l(G), T’◦l’(G’)) = min{ Cd(l) + d(T◦G, T’◦l’(G’)), Ci(l’) + d(T◦l(G), T’◦G’), d(l(G), l’(G’)) + d(T, T’) }
Proposition (2/2)
• φ(F, F’) = right, F = Ø, F’ = T’◦l’(G’) → RFφ(F, F’) = {(F, F’)} ∪ RFφ(F, T’◦G’)
• φ(F, F’) = left, F = l(G)◦T, F’ = l’(G’)◦T’ → RFφ(F, F’) = {(F, F’)} ∪ RFφ(G◦T, F’) ∪ RFφ(F, G’◦T’) ∪ RFφ(l(G), l’(G’)) ∪ RFφ(T, T’)
• φ(F, F’) = right, F = T◦l(G), F’ = T’◦l’(G’) → RFφ(F, F’) = {(F, F’)} ∪ RFφ(T◦G, F’) ∪ RFφ(F, T’◦G’) ∪ RFφ(l(G), l’(G’)) ∪ RFφ(T, T’)
Reminder:
d(l(G)◦T, l’(G’)◦T’) = min{ Cd(l) + d(G◦T, l’(G’)◦T’), Ci(l’) + d(l(G)◦T, G’◦T’), d(l(G), l’(G’)) + d(T, T’) }
d(T◦l(G), T’◦l’(G’)) = min{ Cd(l) + d(T◦G, T’◦l’(G’)), Ci(l’) + d(T◦l(G), T’◦G’), d(l(G), l’(G’)) + d(T, T’) }
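The rules of the proposition can be turned into a small enumeration of RFφ(F, F’). This is a sketch under the assumption that trees are encoded as `(label, children)` tuples; `relevant_pairs` and `always_left` are hypothetical names. An explicit stack and a `seen` set are used because the rule for two single trees refers back to the pair itself.

```python
def relevant_pairs(F, G, strategy):
    """RF_phi(F, G), built from the rules of the proposition."""
    seen, stack = set(), [(F, G)]
    while stack:
        F, G = stack.pop()
        if (not F and not G) or (F, G) in seen:
            continue
        seen.add((F, G))
        left = strategy(F, G) == "left"
        if F:   # remove the leftmost (or rightmost) root of F
            root_f, T = (F[0], F[1:]) if left else (F[-1], F[:-1])
            stack.append((root_f[1] + T if left else T + root_f[1], G))
        if G:   # remove the leftmost (or rightmost) root of G
            root_g, U = (G[0], G[1:]) if left else (G[-1], G[:-1])
            stack.append((F, root_g[1] + U if left else U + root_g[1]))
        if F and G:
            stack.append(((root_f,), (root_g,)))   # RF(l(G), l'(G'))
            stack.append((T, U))                   # RF(T, T')
    return seen

leaf = lambda x: (x, ())
A = (("a", (leaf("b"), leaf("c"))),)   # a(b, c)
B = (("x", (leaf("y"),)),)             # x(y)
always_left = lambda F, G: "left"
pairs = relevant_pairs(A, B, always_left)
nonempty = [p for p in pairs if p[0] and p[1]]
print(len(nonempty))  # 8
```

On this small input, the 8 pairs with two non-empty components are exactly the 4 forests reached from A by left decompositions combined with the 2 forests reached from B.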
Lemma 1
Given a tree A = l(A1◦…◦An), for any strategy we have
#relevant(A) ≥ |A| - |Ai| + #relevant(A1) + … + #relevant(An)
where i ∈ [1…n] is such that the size of Ai is maximal
Proof (1/2)
Let F = A1◦…◦An ⇒ RF(A) = {A} ∪ RF(F) ⇒ #relevant(A) = 1 + #relevant(F)
When n = 1:
F = A1, A = l(A1) ⇒ #relevant(A) = 1 + #relevant(A1) ≥ |A| - |A1| + #relevant(A1)
When n > 1:
Suppose the direction is left. Let A1 = l(F1), T = A2◦…◦An
RF(F) = {F} ∪ RF(A1) ∪ RF(T) ∪ RF(F1◦T)
| RF(F1◦T) – (RF(F1) ∪ RF(T)) | ≥ min{|F1|, |T|}
⇒ #relevant(F) ≥ 1 + #relevant(A1) + #relevant(T) + min{|F1|, |T|}
Let j ∈ [2…n] such that |Aj| is maximal among |A2|, …, |An|
⇒ #relevant(F) ≥ 1 + #relevant(A1) + … + #relevant(An) + |T| - |Aj| + min{|F1|, |T|}
Take a look
Goal: #relevant(A) ≥ |A| - |Ai| + #relevant(A1) + … + #relevant(An)
Since #relevant(A) = 1 + #relevant(F) and |A| = 1 + |F|, this is equivalent to
#relevant(F) ≥ |F| - |Ai| + #relevant(A1) + … + #relevant(An)
So far: #relevant(F) ≥ 1 + |T| - |Aj| + min{|F1|, |T|} + #relevant(A1) + … + #relevant(An)
Proof (2/2)
It remains to show 1 + |T| - |Aj| + min{|F1|, |T|} ≥ |F| - |Ai|
1) If |F1| ≤ |T|
⇒ 1 + |T| + min{|F1|, |T|} = |F|
Since |Aj| ≤ |Ai|, ∴ 1 + |T| - |Aj| + min{|F1|, |T|} = |F| - |Aj| ≥ |F| - |Ai|
2) If |F1| > |T|
⇒ |F| - |Ai| = |T| (∵ i = 1)
∴ 1 + |T| - |Aj| + min{|F1|, |T|} = 1 + |T| + |T| - |Aj| ≥ 1 + |T| > |F| - |Ai|
∴ #relevant(F) ≥ |F| - |Ai| + #relevant(A1) + … + #relevant(An)
⇒ #relevant(A) ≥ |A| - |Ai| + #relevant(A1) + … + #relevant(An)
Lemma 2
For every natural number n, there exists a tree A of size n such that for any strategy, #relevant(A) is bounded below by a function in Ω(n log n)
• For the complete balanced binary tree Tn of size n, prove by induction on n that
#relevant(Tn) ≥ (n+1) log2(n+1) / 2
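The induction can be checked numerically. The sketch below unrolls the lower bound of Lemma 1 on the complete binary tree with 2^h - 1 nodes (a leaf contributes 1) and compares it with (n+1) log2(n+1) / 2; the function name `lower_bound_recursion` and the parametrization by the height h are assumptions made for the example.

```python
def lower_bound_recursion(h):
    """Lemma 1 lower bound unrolled on the complete binary tree with 2**h - 1 nodes."""
    if h == 1:
        return 1                        # a single node
    n = 2**h - 1                        # |A|
    child = 2**(h - 1) - 1              # |A1| = |A2|
    # |A| - |Ai| + bound(A1) + bound(A2)
    return n - child + 2 * lower_bound_recursion(h - 1)

for h in range(1, 11):
    n = 2**h - 1
    closed_form = (n + 1) * h // 2      # (n+1) * log2(n+1) / 2, with log2(n+1) = h
    assert lower_bound_recursion(h) >= closed_form

print(lower_bound_recursion(3))  # 12, for the tree with 7 nodes
```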
Idea
Suppose the direction is left
RF(l(F)◦T) = {l(F)◦T} ∪ RF(l(F)) ∪ RF(F◦T) ∪ RF(T)
Since T ⊆ F◦T, we want to eliminate the nodes of F first in F◦T, so that RF(F◦T) and RF(T) share as many relevant forests as possible!
Cover
Let F be a forest. A cover r of F is a mapping from F to F∪{left, right} satisfying for each node i in F
• if deg(i) = 0 or 1, then r(i) ∈ {left, right}
• if deg(i) > 1, then r(i) is a child of i
[Figure: example covers of a tree with nodes 1 to 4, with the labels left, right]
Cover strategy
Given a pair of trees (A, B) and a cover r for A, we associate a unique strategy φ as follows.
• if deg(i) = 0 or 1, then φ(A(i), G) = r(i), for each forest G in B
• if A(i) is of the form l(A1◦…◦An) with n > 1, then let p ∈ {1, …, n} be such that the favorite child r(i) is the root of Ap. For each forest G of B, we define
φ(A(i), G) = right whenever p = 1, left otherwise
φ(T◦Ap◦…◦An, G) = left, for each forest T of A1◦…◦Ap-1
φ(Ap◦T, G) = right, for each forest T of Ap+1◦…◦An
The tree A is called the cover tree. A strategy is a cover strategy if there exists a cover tree associated to it
[Figure: a node i with subtrees A1, A2, A3, A4, compared with a forest G]
Some Tasks
The order of our tasks
• Study tree A …
• Study tree B …
• Combine the results for tree A and tree B
• Obtain # distinct pairs (recursively)
Tree A
Focus on relevant(A) (detail)
Cover strategies in A
A leads B along
Lemma 4
RF(l(F)◦T) = {l(F)◦T, F1◦T, …, Fk◦T} ∪ RF(l(F)) ∪ RF(T)
What is this for?
Terms: k = |F|, the number of nodes of F. Fk+1 is the forest obtained from Fk by a left decomposition, so F1, F2, …, Fk are the forests produced by a sequence of left decompositions.
Goal: with a cover strategy such that φ(l(F)◦T) = left, see whether the number of recursive calls can be reduced.
[Figure: RF(l(F)◦T) drawn as the union of RF(l(F)), RF(T) and RF(F◦T)]
Since this is a cover strategy, the direction is left:
RF(l(F)◦T) = {l(F)◦T} ∪ RF(l(F)) ∪ RF(T) ∪ RF(F◦T)
Conclusion
RF(l(F)◦T) = {l(F)◦T, F1◦T, …, Fk◦T} ∪ RF(l(F)) ∪ RF(T)
Lemma 5
#relevant(A) = |A| - |Aj| + #relevant(A1) + #relevant(A2) + … + #relevant(An)
Terms: A = l(A1◦A2◦…◦An). Aj is the favorite child of A.
Goal: compute the number of relevant forests of a cover tree
Part 1: |A| - |Aj|
Note:
Φ(A(i), G) = right whenever p = 1, left otherwise
Φ(T◦Ap◦…◦An, G) = left, for each forest T of A1◦…◦Ap-1
Φ(Ap◦T, G) = right, for each forest T of Ap+1◦…◦An
Explanation: since Aj is the favorite child of A, |A| - |Aj| is the number of elements of {A} ∪ {all forests containing Aj}
Part 2: #relevant(A1) + #relevant(A2) + … + #relevant(An)
Note: RF(A1◦A2◦A3◦A4◦…◦An) = {A1◦A2◦A3◦A4◦…◦An} ∪ RF(F1◦A2◦A3◦A4◦…◦An) ∪ RF(A1) ∪ RF(A2◦A3◦A4◦…◦An)
[Figure: the forest A1, A2, A3, A4, …, An with the favorite child Aj marked]
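The recurrence of Lemma 5 is short enough to evaluate directly. The sketch assumes trees encoded as `(label, children)` tuples and a hypothetical `favorite(kids)` function returning the index of the favorite child; a leaf counts for one relevant forest. The 7-node example tree is an assumption for illustration.

```python
def size(t):
    return 1 + sum(size(k) for k in t[1])

def relevant_count(t, favorite):
    """#relevant(A) = |A| - |Aj| + #relevant(A1) + ... + #relevant(An)."""
    kids = t[1]
    if not kids:
        return 1
    j = favorite(kids)   # index of the favorite child Aj
    return size(t) - size(kids[j]) + sum(relevant_count(k, favorite) for k in kids)

# Three possible ways of choosing the favorite child
rightmost = lambda kids: len(kids) - 1
leftmost = lambda kids: 0
largest = lambda kids: max(range(len(kids)), key=lambda i: size(kids[i]))

leaf = lambda x: (x, ())
# Hypothetical tree 1(2, 3(5, 6), 4(7))
A = (1, (leaf(2), (3, (leaf(5), leaf(6))), (4, (leaf(7),))))
print(relevant_count(A, rightmost))  # 12
print(relevant_count(A, leftmost))   # 13
print(relevant_count(A, largest))    # 11
```

Choosing the largest child as favorite removes the largest term |Aj|, which is the choice for which Lemma 1 states its lower bound.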
free node
What is a free node?
• Not an only child
• Not its parent's favorite child
Definition
• the root of A
• a node whose parent is of degree greater than 1 and which is not the favorite child
[Figure: a tree with its favorite children and free nodes marked]
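As an illustration of the definition, the following sketch lists the free nodes of a tree for a given choice of favorite children. The tuple encoding, the `favorite(kids)` callback and the example tree are assumptions, not taken from the slides.

```python
def free_nodes(t, favorite):
    """Labels of the free nodes: the root, plus every node whose parent
    has degree > 1 and that is not the parent's favorite child."""
    result = [t[0]]

    def visit(node):
        kids = node[1]
        fav = favorite(kids) if len(kids) > 1 else None
        for idx, kid in enumerate(kids):
            if len(kids) > 1 and idx != fav:
                result.append(kid[0])
            visit(kid)

    visit(t)
    return result

leaf = lambda x: (x, ())
# Hypothetical tree 1(2, 3(5, 6), 4(7)), favorite child = rightmost child
A = (1, (leaf(2), (3, (leaf(5), leaf(6))), (4, (leaf(7),))))
print(free_nodes(A, lambda kids: len(kids) - 1))  # [1, 2, 3, 5]
```

Node 7 is an only child and nodes 4 and 6 are favorite children, so none of them is free.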
Tree B
B is led along by A, so B has no cover strategy of its own
Focus on the following three things:
• Rightmost forests
• Leftmost forests
• Special forests
Three Things (1)
Definition
• Rightmost forests: all subforests produced by starting from B and applying a sequence of left decompositions until the end
• Leftmost forests: all subforests produced by starting from B and applying a sequence of right decompositions until the end
• Special forests: all subforests produced by starting from B and applying a sequence of left or right decompositions until the end
Rightmost ∪ leftmost = special? NO!
Example
[Figure: a tree B with nodes 1 to 7 and all rightmost forests of B, obtained by left decomposition]
Three Things (2)
Three categories
• relevant forests of A fall within three categories
(α) those that are compared with all rightmost forests of B
(β) those that are compared with all leftmost forests of B
(γ) those that are compared with all special forests of B
Why?
Three Things (3)
The number of rightmost, leftmost and special forests (#right, #left, #special)
• #right(B) = ∑(|B(i)|, i ∈ B) - ∑(|B(i)|, i is a rightmost child)
• #left(B) = ∑(|B(i)|, i ∈ B) - ∑(|B(i)|, i is a leftmost child)
• #special(B) = |B|(|B|+3) / 2 - ∑(|B(i)|, i ∈ B)
Explanation of #right(B), #left(B)
Rightmost forests: every cover strategy has the rightmost child as favorite child, because all decompositions are left decompositions
Leftmost forests: every cover strategy has the leftmost child as favorite child, because all decompositions are right decompositions
#relevant(B) = |B| - |Bj| + #relevant(B1) + … + #relevant(Bn)
#right(B) = |B| - |B_right| + #right(B1) + … + #right(Bn), where B_right is the subtree of the rightmost child
#left(B) = |B| - |B_left| + #left(B1) + … + #left(Bn), where B_left is the subtree of the leftmost child
Applying these recursively gives
#right(B) = ∑(|B(i)|, i ∈ B) - ∑(|B(i)|, i is a rightmost child)
#left(B) = ∑(|B(i)|, i ∈ B) - ∑(|B(i)|, i is a leftmost child)
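The three closed formulas can be compared against a brute-force enumeration. The sketch below assumes the `(label, children)` tuple encoding and a hypothetical 7-node tree; `closure` collects every non-empty forest reachable with the allowed decompositions (including the detached tree and the remaining forest), and `formula_counts` evaluates #right, #left and #special from subtree sizes. An only child is counted as both a leftmost and a rightmost child.

```python
def size(forest):
    return sum(1 + size(kids) for _, kids in forest)

def closure(tree, directions):
    """All non-empty forests reachable from the tree with the given decompositions."""
    seen, stack = set(), [(tree,)]
    while stack:
        F = stack.pop()
        if not F or F in seen:
            continue
        seen.add(F)
        if "left" in directions:
            (_, kids), T = F[0], F[1:]
            stack += [kids + T, (F[0],), T]
        if "right" in directions:
            (_, kids), T = F[-1], F[:-1]
            stack += [T + kids, (F[-1],), T]
    return seen

def formula_counts(tree):
    """(#right, #left, #special) from the closed formulas."""
    total = right_sum = left_sum = 0
    stack = [(tree, False, False)]
    while stack:
        node, is_leftmost, is_rightmost = stack.pop()
        s = size((node,))
        total += s
        left_sum += s if is_leftmost else 0
        right_sum += s if is_rightmost else 0
        kids = node[1]
        for idx, kid in enumerate(kids):
            stack.append((kid, idx == 0, idx == len(kids) - 1))
    n = size((tree,))
    return total - right_sum, total - left_sum, n * (n + 3) // 2 - total

leaf = lambda x: (x, ())
# Hypothetical tree 1(2, 3(5, 6), 4(7))
B = (1, (leaf(2), (3, (leaf(5), leaf(6))), (4, (leaf(7),))))
print(formula_counts(B))                   # (12, 13, 19)
print(len(closure(B, {"left"})))           # 12 rightmost forests
print(len(closure(B, {"right"})))          # 13 leftmost forests
print(len(closure(B, {"left", "right"})))  # 19 special forests
```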
Review
comparison
two types (with respect to A)
• Tree comparison: free node, favorite child
• Forest comparison
Lemma 6
let F be a relevant forest of A
• if the direction is left, then F is at least compared with all rightmost forests of B
• if the direction is right, then F is at least compared with all leftmost forests of B
Why?
A quick warm-up
free node's comparison
Lemma 7
• let i be a free node of A
if the direction of i is left, then A(i) is (α)
if the direction of i is right, then A(i) is (β)
Reminder:
(α) those that are compared with all rightmost forests of B
(β) those that are compared with all leftmost forests of B
(γ) those that are compared with all special forests of B
Explanation of Lemma 7, for the statement "if the direction of i is left, then A(i) is (α)"
Consider G, the largest forest of B such that (A(i), G) belongs to RF(A, B) and G is not a rightmost forest
G is certainly not B, so consider how (A(i), G) can be produced. There are four possible cases:
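Case 1: left component unchanged, root deleted in the right component
since the direction of A(i) is left,
there exist a node l and two forests H and P such that G = H◦P
then (A(i), l(H)◦P) is in RF(A, B)
(A(i), l(H)◦P) -> (A(i), G) by deleting the root l in the right component
G is the largest forest that is not rightmost => l(H)◦P is a rightmost forest of B
=> G = H◦P is also a rightmost forest of B
Case 2: root deleted in the left component, right component unchanged
there exists a node l such that one of the following holds
(l◦A(i), G) -> (A(i), G) by deleting a root in the left component
(A(i)◦l, G) -> (A(i), G) by deleting a root in the left component
(l(A(i)), G) -> (A(i), G) by deleting a root in the left component
Case 3: tree comparison step
(A(i)◦F1, G◦F2) -> (A(i), G) by the tree comparison step
(F1◦A(i), F2◦G) -> (A(i), G) by the tree comparison step
Case 4: forest comparison step
(T1◦A(i), T2◦G) -> (A(i), G) by the forest comparison step
(A(i)◦T1, G◦T2) -> (A(i), G) by the forest comparison step
Contradiction: not a free node! G is a tree!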
forests' comparison
Lemma 8
• let F be a relevant forest of A that is not a tree. Let i be the lowest common ancestor of the set of nodes of F and j be the favorite child of i
(1) if F is a rightmost forest whose leftmost tree is not A(j), then F has the same category as A(i)
(2) if F is a leftmost forest, then F has the same category as A(i)
(3) else F is (γ)
Explanation of Lemma 8, (1) and (2)
The fact…
(1) and (2) are trivial!
For any forest:
if it satisfies (1) -> the decomposition has not yet touched the favorite child (which is on the right)
if it satisfies (2) -> the decomposition has not yet touched the favorite child (which is on the left)
All forests descending from the common ancestor (LCA) have the same category
Explanation of Lemma 8, (3)
For any forest that satisfies neither (1) nor (2), its leftmost tree must be the favorite child of the common ancestor, so its current direction is right …
now consider a forest G …
If G is a rightmost forest of B:
since F is not a leftmost forest,
the favorite child of F's common ancestor is not the leftmost child => the direction of A(i) is left
by the lemma => (A(i), G) exists
(A(i), G) -> (F, G) by a sequence of steps that delete a root in the left component and leave the right component unchanged
If G is not a rightmost forest of B:
turning B into G requires at least one right decomposition
and the current direction of F is precisely right
so (F, G) exists
favorite child's comparison
Lemma 9
• let i be a node of A that is not free, and j be the parent of i
(1) if the direction of i is left, i is the rightmost child of j and A(j) is left, then A(i) is (α)
(2) if the direction of i is right, i is the leftmost child of j and A(j) is right, then A(i) is (β)
(3) else A(i) is (γ)
Explanation of Lemma 9
The fact…
all are trivial!
(1) the world of left
(2) the world of right
(3) the rest…
Is it really trivial?
Notation
let i be a node of A, let j be the parent of i (if i is not the root)
• Free(A(i)): #relevant(A(i), B) if i is free
• Right(A(i)): #relevant(A(i), B) if A(j) is (α)
• Left(A(i)): #relevant(A(i), B) if A(j) is (β)
• All(A(i)): #relevant(A(i), B) if A(j) is (γ)
So, #relevant(A, B) = Free(A)
Theorem
let (A, B) be a pair of trees, A being a cover tree
• 7 cases
Case 1
If A is reduced to a single node whose direction is right
Free(A) = #left(B)
Right(A) = #special(B)
Left(A) = #left(B)
All(A) = #special(B)
Case 2
If A is reduced to a single node whose direction is left
Free(A) = #right(B)
Right(A) = #right(B)
Left(A) = #special(B)
All(A) = #special(B)
Case 3
if A = l(A’) and the direction of l is right (A’ is a tree)
Free(A) = #left(B) + Left(A’)
Right(A) = #special(B) + All(A’)
Left(A) = #left(B) + Left(A’)
All(A) = #special(B) + All(A’)
Case 4
if A = l(A’) and the direction of l is left (A’ is a tree)
Free(A) = #right(B) + Right(A’)
Right(A) = #right(B) + Right(A’)
Left(A) = #special(B) + All(A’)
All(A) = #special(B) + All(A’)
Case 5
if A = l(A1◦…◦An) and the favorite child is the leftmost child
Free(A) = #left(B)(|A|-|A1|) + Left(A1) + Free(A2) + … + Free(An)
Right(A) = #special(B)(|A|-|A1|) + All(A1) + Free(A2) + … + Free(An)
Left(A) = #left(B)(|A|-|A1|) + Left(A1) + Free(A2) + … + Free(An)
All(A) = #special(B)(|A|-|A1|) + All(A1) + Free(A2) + … + Free(An)
Case 6
if A = l(A1◦…◦An) and the favorite child is the rightmost child
Free(A) = #right(B)(|A|-|An|) + Right(An) + Free(A1) + … + Free(An-1)
Right(A) = #right(B)(|A|-|An|) + Right(An) + Free(A1) + … + Free(An-1)
Left(A) = #special(B)(|A|-|An|) + All(An) + Free(A1) + … + Free(An-1)
All(A) = #special(B)(|A|-|An|) + All(An) + Free(A1) + … + Free(An-1)
Case 7
if A = l(A1◦…◦An) and the favorite child is Aj, with 1 < j < n
Free(A) = #right(B)(1+|A1◦…◦Aj-1|) + #special(B)(|Aj+1◦…◦An|) + All(Aj) + Free(A1) + … + Free(Aj-1) + Free(Aj+1) + … + Free(An)
Right(A) = #right(B)(1+|A1◦…◦Aj-1|) + #special(B)(|Aj+1◦…◦An|) + All(Aj) + Free(A1) + … + Free(Aj-1) + Free(Aj+1) + … + Free(An)
Left(A) = #special(B)(|A|-|Aj|) + All(Aj) + Free(A1) + … + Free(Aj-1) + Free(Aj+1) + … + Free(An)
All(A) = #special(B)(|A|-|Aj|) + All(Aj) + Free(A1) + … + Free(Aj-1) + Free(Aj+1) + … + Free(An)
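The seven cases can be sketched as one recursive function. Everything about the encoding is an assumption made for illustration: a cover tree node is `(choice, children)`, where `choice` is "left" or "right" for a node with 0 or 1 child and the 0-based index of the favorite child otherwise, and `rb`, `lb`, `sb` stand for #right(B), #left(B), #special(B). Two readings of the slides are also assumed and marked in the comments.

```python
def size(t):
    return 1 + sum(size(k) for k in t[1])

def counts(t, rb, lb, sb):
    """Return (Free, Right, Left, All) for the cover tree t."""
    choice, kids = t
    if len(kids) <= 1:                        # cases 1 to 4
        f, r, l, a = counts(kids[0], rb, lb, sb) if kids else (0, 0, 0, 0)
        if choice == "right":
            return (lb + l, sb + a, lb + l, sb + a)
        # Assumption: for a left direction, Right equals Free (it mirrors
        # Left in the right-direction case, as in Case 4).
        return (rb + r, rb + r, sb + a, sb + a)
    sub = [counts(k, rb, lb, sb) for k in kids]
    j, n = choice, len(kids)
    others = sum(s[0] for i, s in enumerate(sub) if i != j)   # Free of the other children
    rest = size(t) - size(kids[j])            # |A| - |Aj|
    _, r, l, a = sub[j]
    all_ = sb * rest + a + others
    if j == 0:                                # case 5: favorite is leftmost
        free = lb * rest + l + others
        return (free, all_, free, all_)
    if j == n - 1:                            # case 6: favorite is rightmost
        free = rb * rest + r + others
        return (free, free, all_, all_)
    # case 7. Assumption: the #special(B) factor counts the nodes of
    # A_{j+1}..A_n, so that both factors add up to |A| - |Aj|.
    before = 1 + sum(size(k) for k in kids[:j])
    free = rb * before + sb * (rest - before) + a + others
    return (free, free, all_, all_)

leaf = ("left", ())
# Hypothetical 7-node cover tree: every direction is left and every
# favorite child is the rightmost child.
A = (2, (leaf, (1, (leaf, leaf)), ("left", (leaf,))))
print(counts(A, 12, 13, 19)[0])   # 144
```

With this all-left cover, Free(A) is #right(B) times the number of rightmost forests of A (12 for this shape), which gives 12 × 12 = 144.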
conclusion
Steps
• Take two trees A and B
• Compute #right(B), #left(B), #special(B)
• #relevant(A, B) = Free(A), where Free(A) is computed recursively by the theorem
example
For the Zhang-Shasha algorithm
#relevant(A, B) = #right(A) * #right(B)
Why?
Choose the favorite child (1)
Choose a good favorite child to minimize Free(A)
Free(A) = min{
Case 5 (the favorite child is the leftmost child),
Case 6 (the favorite child is the rightmost child),
Case 7 (the favorite child is in the middle) }
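A minimal sketch of this minimization, under stated assumptions: trees are `(label, children)` tuples, `rb`, `lb`, `sb` stand for #right(B), #left(B), #special(B), and each of Free, Right, Left, All is minimized independently for every subtree (each subtree appears in exactly one context, so its cover can be chosen for that context). The function name `best_counts` is hypothetical, and Case 7 uses the reading in which the #special(B) factor counts the nodes to the right of the favorite child.

```python
def size(t):
    return 1 + sum(size(k) for k in t[1])

def best_counts(t, rb, lb, sb):
    """Component-wise minimum of (Free, Right, Left, All) over all covers of t."""
    kids = t[1]
    if len(kids) <= 1:
        f, r, l, a = best_counts(kids[0], rb, lb, sb) if kids else (0, 0, 0, 0)
        as_right = (lb + l, sb + a, lb + l, sb + a)    # cases 1 and 3
        as_left = (rb + r, rb + r, sb + a, sb + a)     # cases 2 and 4
        return tuple(map(min, as_right, as_left))
    sub = [best_counts(k, rb, lb, sb) for k in kids]
    free_sum = sum(s[0] for s in sub)
    n, total = len(kids), size(t)
    candidates = []
    for j, (f, r, l, a) in enumerate(sub):             # try every favorite child
        others = free_sum - f
        rest = total - size(kids[j])                   # |A| - |Aj|
        if j == 0:                                     # case 5
            fr, rt, lf = lb * rest + l, sb * rest + a, lb * rest + l
        elif j == n - 1:                               # case 6
            fr, rt, lf = rb * rest + r, rb * rest + r, sb * rest + a
        else:                                          # case 7
            before = 1 + sum(size(k) for k in kids[:j])
            mixed = rb * before + sb * (rest - before) + a
            fr, rt, lf = mixed, mixed, sb * rest + a
        candidates.append((fr + others, rt + others, lf + others,
                           sb * rest + a + others))
    return tuple(map(min, *candidates))

leaf = lambda x: (x, ())
# Hypothetical tree 1(2, 3(5, 6), 4(7)); 12, 13, 19 are the values of
# #right, #left, #special for a tree B of the same shape.
A = (1, (leaf(2), (3, (leaf(5), leaf(6))), (4, (leaf(7),))))
print(best_counts(A, 12, 13, 19)[0])   # 144
```

For this particular pair the minimum is reached by the all-left cover (144), so the best cover does not beat the Zhang-Shasha count here; other pairs of trees can behave differently.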
Choose the favorite child (2)
Is this really good?
Not necessarily!
Why?
It needs preprocessing time!