data structures and algorithms 12 - edx5 目录页 ming zhang “data structures and...

15
Data Structures and Algorithms(12) Instructor: Ming Zhang Textbook Authors: Ming Zhang, Tengjiao Wang and Haiyan Zhao Higher Education Press, 2008.6 (the "Eleventh Five-Year" national planning textbook) https://courses.edx.org/courses/PekingX/04830050x/2T2014/ Ming Zhang "Data Structures and Algorithms"

Upload: others

Post on 14-Jul-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

Data Structures and Algorithms(12)

Instructor: Ming ZhangTextbook Authors: Ming Zhang, Tengjiao Wang and Haiyan Zhao

Higher Education Press, 2008.6 (the "Eleventh Five-Year" national planning textbook)

https://courses.edx.org/courses/PekingX/04830050x/2T2014/

Ming Zhang "Data Structures and Algorithms"

Page 2: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

2

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Chapter 12 Advanced data structure

• 12.1 Multidimensional Array

• 12.2 Generalized Lists

• 12.3 Storage management

• 12.4 Trie

• 12.5 Improved binary search tree

Page 3: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

3

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

12.3 Trie

• Ideal situation: The average time of insertion,

deletion, and search is O(logN)

• Input 9, 4, 2, 6, 7, 15, 12, 21

• Output 2, 4, 6, 7, 9, 12, 15, 21

9

154

62 12

7

21

9

15

4

6

2

12

7

21

Page 4: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

4

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Structure of Trie

• Space division of key

• “trie” comes from “retrieval”

• Application

• Information retrieval

• Large scale of English dictionary

• 26-branch Trie

• Binary Trie

• Letters (numbers) represented as binary coding

• Coding includes just 0 and 1

12.4 Trie

Page 5: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

5

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Store words ‘and, ant, bad, bee’

Tree of English words: 26-branch Trie

12.4 Trie

Subtree ‘an’ contains

set {and, ant} that

every word from the

set has the same

prefix ‘an’.

a

n

d t

b

a

d

e

e

and ant bad bee

A subtree contains the words with the same prefix

Page 6: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

6

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

12.4 Trie

、Store words an , and ant , bad , bee

a

n

dt

b

a e

and antbad bee

an

*

* ** *

Page 7: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

7

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Compact the Single Paths close to the leaf

12.4 Trie

Store words an、and、ant、bad、bee

a

n

dt

b

a e

and ant

bad bee

an

*

Page 8: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

8

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Binary Trie

12.4 Trie

Elements are 2、5、9、17、41、45、63

0 (<32) 1 (>32)

0 (<16) 1 (>16)

0 (<8)

0 (<4)

1 (>48)

1 (>8)

1 (>4)

2 5

9

17

41

631 (>40)

45

0

(<44) 1 (>44)

0 (<48)

Page 9: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

9

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

PATRICIA Structure

12.4 Trie

101xxx

10xxxx

0000xx

001xxx

0001xx

Code: 2:000010 5:000101 9:001001

17:010001 41:101001 45:101101 63:111111

0xxxxx 1xxxxx

00xxxx 01xxxx 11xxxx

000xxx

2 5

9

17 63

0

1 1

2 2

3

41

1010xx 1011xx

45

3

Compression

Page 10: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

10

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Characteristics of PATRICIA Tree

• The compressed PATRICIA tree is a

full binary tree

• Every internal node represents a 1-bit

comparison

• Always at least two children are

generated

• The number of comparisons will

not exceed the length of the key

12.4 Trie

Page 11: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

11

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

12.4 Trie

Suffix Trees

c

a

ab

ababc

abc

b

a

b

c

c

b

a

b

c

c

c

b

babc

bc

Suffix Trie

ab

ababc

ab

abcc

bc

abcc

abc

babc

bc

c

b

Suffix Tree

implicit statesexplicit states

T = ababc

Ascending Orderababcabcabcbcc

Page 12: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

12

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

12.4 Trie

Suffix Array

3 1 1 0 2 0 1 0 0 -

5 ALAM$

1 ALAYALAM$

7 AM$

3 AYALAM$

6 LAM$

2 LAYALAM$

0 MALAYALAM$

8 M$

4 YALAM$

9 $

M A L A Y A L A M $0 1 2 3 4 5 6 7 8 9

5 1 7 3 6 2 0 8 4 9

Suffix Array

The longest common prefix arraySuffix 5 and Suffix 1 share “ALA”Suffix 1 and Suffix 7 share “A” LCP always adjacent

Page 13: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

13

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

3 1 1 0 2 0 1 0 0 -

12.4 Trie

5 ALAM$

1 ALAYALAM$

7 AM$

3 AYALAM$

6 LAM$

2 LAYALAM$

0 MALAYALAM$

8 M$

4 YALAM$

9 $SA

lcp

LA

5 1

7 3 4 2

D = 3

D = 1

D = 0

D = 2

5 1 7 3 6 2 0 8 4 9

Page 14: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

14

目录页

Ming Zhang “Data Structures and Algorithms"

Chapter 12

Advanced Data

Structure

Discussions

•Can Trie handle Chinese characters?

What about PATRICIA Trie structure?

• Learn related document about

Suffix Array and Suffix Tree. And

think about their applications.

12.4 Trie

Page 15: Data Structures and Algorithms 12 - edX5 目录页 Ming Zhang “Data Structures and Algorithms" Chapter 12 Advanced Data Structure Store words ‘and, ant, bad, bee’ Tree of English

Data Structures and Algorithms

Thanks

the National Elaborate Course (Only available for IPs in China)http://www.jpk.pku.edu.cn/pkujpk/course/sjjg/

Ming Zhang, Tengjiao Wang and Haiyan ZhaoHigher Education Press, 2008.6 (awarded as the "Eleventh Five-Year" national planning textbook)

Ming Zhang “Data Structures and Algorithms”