5. Collision Resolution by Progressive Overflow (contents.kocw.net/kocw/document/2014/yeungnam/...)

Page 1: 5. Collision Resolution by Progressive Overflow · contents.kocw.net/KOCW/document/2014/Yeungnam/... · 2016-09-09

5. Collision Resolution by Progressive Overflow

Progressive overflow is also called linear probing.

5.1 How Progressive Overflow Works

Basic concept
- When a collision occurs, insert the record into the next free slot after its home address (Figure 10.4).
- If the end of the file is reached, continue the search from the beginning of the file (Figure 10.5). The file behaves like a circular queue.

Searching for a key
- Compute h(Key) and start at that address.
- If the key is not stored at h(Key), keep searching the following addresses.
- Until when? Is an infinite loop possible?
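The two questions above (when to stop, and whether the search can loop forever) can be illustrated with a minimal sketch. It assumes an in-memory list standing in for the file, `None` as the empty-slot marker, and a hypothetical hash function `h` (sum of character codes modulo the table size); none of these come from the slides. The search stops at the first empty slot, and the probe count is capped at the table size, so it cannot loop forever even when the file is full.

```python
from typing import List, Optional

def h(key: str, n: int) -> int:
    """Hypothetical hash function: sum of character codes modulo table size."""
    return sum(ord(c) for c in key) % n

def insert(table: List[Optional[str]], key: str) -> int:
    """Store key in the first free slot at or after its home address."""
    n = len(table)
    home = h(key, n)
    for i in range(n):                 # at most n probes, so no infinite loop
        addr = (home + i) % n          # wrap around at the end of the file
        if table[addr] is None:
            table[addr] = key
            return addr
    raise RuntimeError("file is full")

def search(table: List[Optional[str]], key: str) -> Optional[int]:
    """Return the address of key, or None if it is not in the file."""
    n = len(table)
    home = h(key, n)
    for i in range(n):
        addr = (home + i) % n
        if table[addr] is None:        # an empty slot ends the search
            return None
        if table[addr] == key:
            return addr
    return None                        # scanned every address without success

table: List[Optional[str]] = [None] * 5
# "ab" and "ba" collide at address 0; "c" and "h" collide at address 4,
# so "h" wraps around to the start of the table.
addresses = [insert(table, k) for k in ("ab", "ba", "c", "h")]
print(addresses)
```

Capping the loop at the table size is one possible answer to the infinite-loop question; another common choice is to stop when the probe returns to the home address.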

Yeungnam University Database Laboratory · Algorithm: Chapter 10 (Page 20)

FIGURE 10.4 Collision resolution with progressive overflow.
[Figure: the hash function maps the key York to home address 6, which is busy. The 2nd and 3rd tries (addresses 7 and 8) are also busy. The 4th try (address 9) is open and becomes York's actual address. The occupied slots shown hold the records Novak, Rosen, Jasper, and Moreley.]

FIGURE 10.5 Searching for an address beyond the end of a file.
[Figure: the hash routine maps the key Blue to address 99, the last address in the file, which is occupied by Jello. The search wraps around to address 0.]

5.2 Search Length

Definition
- The number of disk accesses required to retrieve a given key (Figure 10.6).

average search length = total search length / total number of records

Relationship to packing density (Figure 10.7)

Key     Home address
Adams   20
Bates   21
Cole    21
Dean    22
Evans   20

Actual address   Record        Home address   Number of accesses needed to retrieve
20               Adams . . .   20             1
21               Bates . . .   21             1
22               Cole . . .    21             2
23               Dean . . .    22             2
24               Evans . . .   20             5
25

FIGURE 10.6 Illustration of the effects of clustering of records. As keys are clustered, the number of accesses required to access later keys can become large.
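As a worked check of the average search length definition, the sketch below loads the five keys of Figure 10.6 in order using progressive overflow and counts the accesses each key needs. The dictionary-based "file" is an assumption made for brevity; only the keys and home addresses come from the figure.

```python
# Home addresses from Figure 10.6, in loading order.
homes = [("Adams", 20), ("Bates", 21), ("Cole", 21), ("Dean", 22), ("Evans", 20)]

occupied = {}        # actual address -> key
search_length = {}   # key -> number of accesses needed to retrieve it

for key, home in homes:
    addr = home
    while addr in occupied:      # progressive overflow: try the next address
        addr += 1                # (no wraparound needed in this small example)
    occupied[addr] = key
    search_length[key] = addr - home + 1

average = sum(search_length.values()) / len(search_length)
print(search_length)
print(average)
```

The total search length is 1 + 1 + 2 + 2 + 5 = 11 over 5 records, so the average is 2.2. Evans alone accounts for almost half of the total because it lands at the end of the cluster.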

FIGURE 10.7 Average search length versus packing density in a hashed file in which one record can be stored per address, progressive overflow is used to resolve collisions, and the file has just been loaded.
[Plot: x-axis, packing density (20 to 100); y-axis, average search length (1 to 5).]

6. Storing More than One Record per Address: Buckets

Bucket?
- A block of records that is retrieved in one disk access.
- The result of hashing determines the bucket address.
- Synonyms are stored together in one bucket.

Bucket overflow?

Key     Home address
Green   30
Hall    30
Jenks   32
King    33
Land    33
Marx    33
Nutt    33

Bucket address   Bucket contents
30               Green . . .   Hall . . .
31
32               Jenks . . .
33               King . . .    Land . . .    Marx . . .    (Nutt . . . is an overflow record)

FIGURE 10.8 An illustration of buckets. Each bucket can hold up to three records. Only one synonym (Nutt) results in overflow.
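The bucket example of Figure 10.8 can be reproduced with a short sketch. The in-memory dictionary of lists is an assumption; only the keys, home addresses, and the bucket size of three are taken from the figure. Records that do not fit are simply collected in an `overflow` list, since where they go depends on the overflow method in use.

```python
BUCKET_SIZE = 3  # each bucket holds up to three records, as in Figure 10.8

keys = [("Green", 30), ("Hall", 30), ("Jenks", 32), ("King", 33),
        ("Land", 33), ("Marx", 33), ("Nutt", 33)]

buckets = {}     # bucket address -> list of keys stored there
overflow = []    # keys that did not fit in their home bucket

for key, home in keys:
    bucket = buckets.setdefault(home, [])
    if len(bucket) < BUCKET_SIZE:
        bucket.append(key)       # synonyms share the bucket
    else:
        overflow.append(key)     # bucket overflow

print(buckets)
print(overflow)
```

Four keys hash to bucket 33, but only the fourth (Nutt) overflows. With one record per address, three of those four would have been overflow records.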

6.1 Effects of Buckets on Performance

Packing density

packing density = r / (bN)

r: number of records
N: number of addresses
b: number of records that fit in a bucket

Example
- N = 1,000, r = 750, b = 1
- N = 500, r = 750, b = 2
- The packing density is the same in both cases: 750 / (1 x 1,000) = 750 / (2 x 500) = 0.75.

Performance improvement
- The ratio r/N increases.
- p(0) decreases, so space utilization increases (Table 10.3).
- The probability of overflow decreases (Tables 10.4 and 10.5).

                                File without buckets   File with buckets
Number of records               r = 750                r = 750
Number of addresses             N = 1,000              N = 500
Bucket size                     b = 1                  b = 2
Packing density                 0.75                   0.75
Ratio of records to addresses   r/N = 0.75             r/N = 1.5

TABLE 10.3 Poisson distributions for two different file organizations

p(x)   File without buckets (r/N = 0.75)   File with buckets (r/N = 1.5)
p(0)   0.472                               0.223
p(1)   0.354                               0.335
p(2)   0.133                               0.251
p(3)   0.033                               0.126
p(4)   0.006                               0.047
p(5)   0.001                               0.014
p(6)   -                                   0.004
p(7)   -                                   0.001
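The entries of Table 10.3 follow from the standard Poisson formula p(x) = (r/N)^x e^(-r/N) / x!, which is not written out on the slide. A minimal sketch that recomputes both columns (the function name `poisson` is chosen here for illustration):

```python
import math

def poisson(x: int, ratio: float) -> float:
    """Probability that exactly x records hash to a given address (ratio = r/N)."""
    return ratio ** x * math.exp(-ratio) / math.factorial(x)

# One row per x: without buckets (r/N = 0.75), with buckets (r/N = 1.5).
for x in range(8):
    print(f"p({x})  {poisson(x, 0.75):.3f}  {poisson(x, 1.5):.3f}")
```

Doubling r/N while halving the number of addresses cuts p(0), the fraction of addresses that receive no record, from 0.472 to 0.223.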

TABLE 10.4 Synonyms causing collisions as a percent of records for different packing densities and different bucket sizes

Packing        Bucket size
density (%)    1      2      5      10     100
10             4.8    0.6    0.0    0.0    0.0
20             9.4    2.2    0.1    0.0    0.0
30             13.6   4.5    0.4    0.0    0.0
40             17.6   7.3    1.1    0.1    0.0
50             21.3   10.4   2.5    0.4    0.0
60             24.8   13.7   4.5    1.3    0.0
70             28.1   17.0   7.1    2.9    0.0
75             29.6   18.7   8.6    4.0    0.0
80             31.2   20.4   10.3   5.3    0.1
90             34.1   23.8   13.8   8.6    0.8
100            36.8   27.1   17.6   12.5   4.0
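The percentages in Table 10.4 can be approximated from the Poisson distribution: an address with bucket size b that receives x > b records contributes x - b overflow records. The sketch below sums that expectation and divides by r/N. It is an illustration under that assumption, not the procedure the slides prescribe; the function names are hypothetical, and the Poisson term is evaluated in log space so that large bucket sizes do not overflow floating-point arithmetic.

```python
import math

def poisson(x: int, ratio: float) -> float:
    """Poisson probability, computed in log space for numerical safety."""
    return math.exp(x * math.log(ratio) - ratio - math.lgamma(x + 1))

def overflow_percent(density: float, b: int, terms: int = 200) -> float:
    """Expected overflow records as a percentage of all records."""
    ratio = density * b   # r/N = packing density * bucket size
    per_address = sum((x - b) * poisson(x, ratio)
                      for x in range(b + 1, b + terms))
    return 100 * per_address / ratio

for b in (1, 2, 5, 10):
    print(b, round(overflow_percent(0.75, b), 1))
```

At a packing density of 75% this gives about 29.6% for b = 1 and 18.7% for b = 2, matching the 75 row of the table.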

TABLE 10.5 Average number of accesses required in a successful search by progressive overflow

Packing        Bucket size
density (%)    1       2      5      10     50
10             1.06    1.01   1.00   1.00   1.00
30             1.21    1.06   1.00   1.00   1.00
40             1.33    1.10   1.01   1.00   1.00
50             1.50    1.18   1.03   1.00   1.00
60             1.75    1.29   1.07   1.01   1.00
70             2.17    1.49   1.14   1.04   1.00
80             3.00    1.90   1.29   1.11   1.01
90             5.50    3.15   1.78   1.35   1.04
95             10.50   5.6    2.7    1.8    1.1

Adapted from Donald Knuth, The Art of Computer Programming, Vol. 3, ©1973, Addison-Wesley, Reading, Mass. Page 536. Reprinted with permission.
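For bucket size 1, the first column of Table 10.5 agrees with Knuth's well-known estimate for a successful search under linear probing, (1/2)(1 + 1/(1 - d)), where d is the packing density as a fraction. The slide does not state this formula; the sketch below is offered only as a way to check that column, and it does not cover the larger bucket sizes.

```python
def expected_accesses(density: float) -> float:
    """Estimate for a successful search with linear probing, bucket size 1."""
    return 0.5 * (1 + 1 / (1 - density))

for pct in (10, 30, 40, 50, 60, 70, 80, 90, 95):
    print(pct, round(expected_accesses(pct / 100), 2))
```

The estimate grows without bound as the density approaches 100%, which is why the last rows of the table climb so steeply.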

6.2 Implementation Issues

- A hashed index is a fixed-length file (compare with a B-tree).
- Initialize the index in advance (to mark empty slots).
- Consider this together with the index page structure.
- Consider how insertion, deletion, and update relate to the hash function.

Bucket Structure
- A counter field is required.
- A way to represent an empty slot is required.

An empty bucket:   0   / / / / /   / / / / /   / / / / /   / / / / /   / / / / /
Two entries:       2   JONES   ARNSWORTH   / / / / /   / / / / /   / / / / /
A full bucket:     5   JONES   ARNSWORTH   STOCKTON   BRICE   THROOP
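One way to realize this bucket layout is a fixed-length byte record holding a counter followed by a fixed number of key slots. The sketch below uses Python's `struct` module. The key width of 10 bytes, the 2-byte little-endian counter, and the function names are assumptions for illustration; the slash filler for empty slots mirrors the slide.

```python
import struct

SLOTS = 5          # records per bucket, as in the five-slot example above
KEY_LEN = 10       # assumed fixed key width in bytes
EMPTY = b"/" * KEY_LEN
BUCKET = struct.Struct(f"<H{SLOTS * KEY_LEN}s")   # counter + packed key slots

def pack_bucket(keys):
    """Serialize a list of keys into one fixed-length bucket."""
    if len(keys) > SLOTS or any(len(k) > KEY_LEN for k in keys):
        raise ValueError("bucket overflow or key too long")
    slots = [k.encode("ascii").ljust(KEY_LEN, b" ") for k in keys]
    slots += [EMPTY] * (SLOTS - len(keys))        # mark unused slots
    return BUCKET.pack(len(keys), b"".join(slots))

def unpack_bucket(data):
    """Return the keys stored in a serialized bucket."""
    count, body = BUCKET.unpack(data)
    return [body[i * KEY_LEN:(i + 1) * KEY_LEN].decode("ascii").rstrip()
            for i in range(count)]

two = pack_bucket(["JONES", "ARNSWORTH"])
print(len(two), unpack_bucket(two))
```

Because the counter says how many slots are in use, the reader never has to interpret the filler bytes; they only make an empty bucket recognizable when the file is inspected directly.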

Initializing and Loading

Initializing a file for hashing
- Allocate and initialize the file space before storing any data.
- Consider this from the viewpoint of clustering.
- The record insertion and search algorithms become simpler.
- If the file is not initialized this way, how can the problem be solved?

Loading a hash file
- Difference from an ordinary fixed-length record file: data is accessed only through calls to the hash() function.
- Key search and insertion procedure (linear probing).
- What if the number of records exceeds the file size?

7. Making Deletions

Considerations when deleting a record
- Because of collisions, a record may be stored at an address other than its home address.
- Records may need to be rearranged after a deletion.
- The freed slot should be available to a new record later.
- Example: Figures 10.9 and 10.10

Record   Home address   Actual address
Adams    5              5
Jones    6              6
Morris   6              7
Smith    5              8

Address   Contents
4
5         Adams . . .
6         Jones . . .
7         Morris . . .
8         Smith . . .

FIGURE 10.9 File organization before deletions.

Address   Contents
4
5         Adams . . .
6         Jones . . .
7
8         Smith . . .

FIGURE 10.10 The same organization as in Fig. 10.9, with Morris deleted.

Address   Contents
4
5         Adams . . .
6         Jones . . .
7         # # # # # #
8         Smith . . .

FIGURE 10.11 The same file as in Fig. 10.9 after the insertion of a tombstone for Morris.

7.1 Tombstones for Handling Deletions

Basic concept
- After deleting a record, leave a marker (a tombstone) showing that a record was deleted there.
- A tombstone is distinguished from a free slot.
- Example: Figure 10.11

Advantages
- When a search encounters a tombstone, it keeps searching.
- A new record can be stored in the tombstone's slot.

Cautions
- Search delay (consider deleting Smith in Figure 10.11).
- Checking for duplicates becomes harder (consider inserting a new Smith in Figure 10.11).
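The tombstone rules above can be sketched for a progressive-overflow file as follows. Home addresses are passed in explicitly because the slides do not define a hash function; the sentinel object, the function names, and the key Turner are assumptions. The insert routine addresses the duplicate-check caution: it remembers the first tombstone it passes but keeps probing until it reaches an empty slot, so an existing copy of the key further along is found first.

```python
EMPTY = None
TOMBSTONE = object()   # sentinel that can never equal a real key

def find(table, key, home):
    """Return the address of key, or None; tombstones do not stop the search."""
    n = len(table)
    for i in range(n):
        addr = (home + i) % n
        if table[addr] is EMPTY:
            return None
        if table[addr] == key:
            return addr
    return None

def delete(table, key, home):
    addr = find(table, key, home)
    if addr is not None:
        table[addr] = TOMBSTONE          # leave a marker, not a free slot

def insert(table, key, home):
    """Insert key, reusing the first tombstone seen, after checking for duplicates."""
    n = len(table)
    reusable = None
    for i in range(n):
        addr = (home + i) % n
        slot = table[addr]
        if slot == key:
            return addr                  # already present: no duplicate stored
        if slot is TOMBSTONE and reusable is None:
            reusable = addr
        if slot is EMPTY:
            target = addr if reusable is None else reusable
            table[target] = key
            return target
    if reusable is None:
        raise RuntimeError("file is full")
    table[reusable] = key
    return reusable

# Rebuild Figure 10.9, then delete Morris as in Figure 10.11.
table = [EMPTY] * 10
for key, home in [("Adams", 5), ("Jones", 6), ("Morris", 6), ("Smith", 5)]:
    insert(table, key, home)
delete(table, "Morris", 6)
smith_addr = find(table, "Smith", 5)           # search continues past the tombstone
dup_addr = insert(table, "Smith", 5)           # finds the existing Smith
new_addr = insert(table, "Turner", 6)          # reuses the tombstone slot
```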

7.2 Effects of Deletions and Additions on Performance

Problem
- Repeated deletions leave too many tombstones, which causes search delay.

Solutions
- Local reorganization at each deletion
- Complete reorganization after a threshold is reached
- Use a different collision resolution algorithm

8. Other Collision Resolution Techniques

8.1 Double Hashing

Basic concept
- Problem with progressive overflow: clusters of records can form in the hash table.
- When a collision occurs, hash again with a new (second) function.

Advantages and disadvantages
- Records are spread evenly over the hash table.
- An overflow record may require several disk seeks.
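A minimal sketch of double hashing, assuming integer keys, a prime table size, and two hypothetical hash functions `h1` and `h2` (none of which are specified on the slide). The second hash gives each key its own step size, so synonyms that share a home address do not pile up in adjacent slots.

```python
def h1(key: int, n: int) -> int:
    """First hash: the home address."""
    return key % n

def h2(key: int, n: int) -> int:
    """Second hash: a step size in 1..n-1 (never 0, so the probe always moves)."""
    return 1 + key % (n - 1)

def insert(table, key):
    n = len(table)
    home, step = h1(key, n), h2(key, n)
    for i in range(n):                   # with n prime, this visits every address
        addr = (home + i * step) % n
        if table[addr] is None:
            table[addr] = key
            return addr
    raise RuntimeError("file is full")

table = [None] * 11                      # prime table size
# 22, 33 and 44 are synonyms: all three have home address 0.
placed = {k: insert(table, k) for k in (22, 33, 44)}
print(placed)
```

With progressive overflow the three synonyms would occupy addresses 0, 1, and 2. Here they land at 0, 4, and 5, which avoids the cluster but means the overflow records are no longer next to the home address, the source of the extra disk seeks noted above.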

8.2 Chained Progressive Overflow

- Synonyms are linked together with pointers.
- On overflow, only the synonyms need to be searched.
- The average search length decreases (Figure 10.13).
- What if a record's home address has already been assigned to another key? Use two-pass loading.

Two-pass loading
- 1st pass: load the home records.
- 2nd pass: assign the overflow records to the empty slots.
- Question: what if a new home record is inserted later?
- Chaining with a separate overflow area

Average Search Length

Key     Home address   Actual address   Search length
Adams   20             20               1
Bates   21             21               1
Cole    20             22               3
Dean    21             23               3
Evans   24             24               1
Flint   20             25               6

Average search length = (1 + 1 + 3 + 3 + 1 + 6) / 6 = 2.5

Home address   Actual address   Data          Address of next synonym   Search length
20             20               Adams . . .   22                        1
21             21               Bates . . .   23                        1
20             22               Cole . . .    25                        2
21             23               Dean . . .    -1                        2
24             24               Evans . . .   -1                        1
20             25               Flint . . .   -1                        3

FIGURE 10.13 Hashing with chained progressive overflow. Adams, Cole, and Flint are synonyms; Bates and Dean are synonyms.
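The search lengths in Figure 10.13 can be checked by following the synonym links from each key's home address. The sketch below copies the rows of the figure into an in-memory dictionary (an assumption made for brevity) and counts one access per slot visited.

```python
# (home address, actual address, key, address of next synonym) from Figure 10.13
rows = [(20, 20, "Adams", 22), (21, 21, "Bates", 23), (20, 22, "Cole", 25),
        (21, 23, "Dean", -1), (24, 24, "Evans", -1), (20, 25, "Flint", -1)]

slots = {actual: (key, nxt) for _home, actual, key, nxt in rows}
home_of = {key: home for home, _actual, key, _nxt in rows}

def search_length(key):
    """Count accesses while following the synonym chain from the home address."""
    addr, accesses = home_of[key], 0
    while addr != -1:
        stored, nxt = slots[addr]
        accesses += 1
        if stored == key:
            return accesses
        addr = nxt                       # jump straight to the next synonym
    raise KeyError(key)

lengths = {key: search_length(key) for _h, _a, key, _n in rows}
average = sum(lengths.values()) / len(lengths)
print(lengths)
print(round(average, 2))
```

The total is 1 + 1 + 2 + 2 + 1 + 3 = 10 accesses for 6 records, an average of about 1.67, compared with 2.5 for the same keys under plain progressive overflow.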

8.3 Chaining with a Separate Overflow Area

The hash table is split into two parts.
- Prime data area: stores the home records.
- Overflow area: linked lists of overflow records.

Advantages and disadvantages
- A potential home address is never occupied by an overflow record: if the home address is free, the record is stored there.
- What if the overflow area is on a different cylinder?

Primary data area
Home address   Record        Pointer to first overflow record
20             Adams . . .   0
21             Bates . . .   1
22
23
24             Evans . . .   -1

Overflow area
Address   Record        Pointer to next synonym
0         Cole . . .    2
1         Dean . . .    -1
2         Flint . . .   -1
3

FIGURE 10.14 Chaining to a separate overflow area. Adams, Cole, and Flint are synonyms; Bates and Dean are synonyms.
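The layout in Figure 10.14 can be rebuilt with a short sketch. The dictionary for the primary data area, the list for the overflow area, and the function name `insert` are assumptions for illustration; the keys, home addresses, and the use of -1 as the end-of-chain marker come from the figure.

```python
primary = {}      # home address -> [key, index of first overflow record or -1]
overflow = []     # overflow area: [key, index of next synonym or -1]

def insert(key, home):
    if home not in primary:              # home address free: store there
        primary[home] = [key, -1]
        return
    overflow.append([key, -1])           # otherwise append to the overflow area
    new_index = len(overflow) - 1
    entry = primary[home]
    if entry[1] == -1:                   # first overflow record for this address
        entry[1] = new_index
        return
    idx = entry[1]
    while overflow[idx][1] != -1:        # walk to the end of the synonym chain
        idx = overflow[idx][1]
    overflow[idx][1] = new_index

for key, home in [("Adams", 20), ("Bates", 21), ("Cole", 20),
                  ("Dean", 21), ("Evans", 24), ("Flint", 20)]:
    insert(key, home)

print(primary)
print(overflow)
```

Overflow records never take a slot in the primary area, so a later home record for address 22 or 23 would still find its home address free. The cost is that following a chain may move the disk arm to wherever the overflow area is stored.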