[pgday.seoul 2017] 6. GIN vs GiST Index Story - 박진우
TRANSCRIPT
Contents
1. Index
2. Heap
3. B-tree and GIN
4. R-tree and GiST
5. Summary
Why Index??
Spatial Index
Visibility Index
Full Text Search
Index
An index holds mapping information for the specified column(s).
Ex) CREATE INDEX test1_id_index ON test1 (id);
PostgreSQL supports the following index types:
• B-Tree : numbers, text, dates, etc..
• Generalized Inverted Index (GIN)
• Generalized Search Tree (GiST)
• Space-Partitioned GiST (SP-GiST)
• Block Range Indexes (BRIN)
• Hash
Heap
What is a heap? The form in which a table is stored, with no sort order applied to its rows.
[Figure: a table file drawn as Block 0 through Block 4, each block holding items 0 to 3]
TID: physical location of a heap tuple
Ex) Berlin: item 2 of block 0.
Item pointer: Berlin (0,2)
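The TID idea above can be sketched in a few lines of Python. This is only a toy model: the heap is a list of blocks, each block a list of item slots, and every city except Berlin is made up for illustration. It follows the slide's 0-based numbering; in PostgreSQL itself the item number inside a ctid counts from 1.

```python
# Toy heap file: a list of blocks, each block a list of item slots.
# Only Berlin at (0, 2) comes from the slide; the other rows are invented.
heap = [
    ["Paris", "Rome", "Berlin", "Madrid"],  # block 0
    ["Seoul", "Busan", "Tokyo", "Osaka"],   # block 1
]


def fetch(heap, tid):
    """Return the tuple stored at TID = (block number, item number)."""
    block_no, item_no = tid
    return heap[block_no][item_no]


berlin_tid = (0, 2)
print(fetch(heap, berlin_tid))
```

Because the TID names the block and the slot directly, fetching the tuple needs no search at all.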
• A table file consists of n blocks.
• The default size of a block (page) is 8192 bytes (8 KB).
• A page consists of header info, record data, and free space.
Seq. Scan VS. Index Scan
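A rough way to see the difference between the two scans, assuming a toy heap keyed by TID and an "index" that is just a Python dict from column value to TID. The rows and the visit counters are hypothetical and only count heap tuples touched; they say nothing about real PostgreSQL costs.

```python
# Heap rows in no particular order: TID -> (id, city). All rows are invented.
heap = {
    (0, 0): (42, "Seoul"),
    (0, 1): (7, "Busan"),
    (1, 0): (19, "Berlin"),
    (1, 1): (3, "Daegu"),
}


def seq_scan(heap, wanted_id):
    """Visit heap tuples one by one; return (city, tuples visited)."""
    visited = 0
    for _tid, (row_id, city) in heap.items():
        visited += 1
        if row_id == wanted_id:
            return city, visited
    return None, visited


# The "index": a mapping from the indexed column value to the row's TID.
index = {row_id: tid for tid, (row_id, _city) in heap.items()}


def index_scan(heap, index, wanted_id):
    """Find the TID in the index, then fetch exactly one heap tuple."""
    tid = index.get(wanted_id)
    if tid is None:
        return None, 0
    return heap[tid][1], 1


print(seq_scan(heap, 3))
print(index_scan(heap, index, 3))
```

The sequential scan has to walk the heap until it finds the row, while the index scan goes straight to one TID.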
B-tree
• The default index type
• Usage:
postgres=# CREATE INDEX indexname ON tablename (columnname);
Ex) CREATE INDEX test1_id_index ON test1 (id);
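What a B-tree buys over an unordered mapping is key order. The sketch below is an assumption-laden stand-in: a sorted Python list plays the role of the B-tree leaf level, and `bisect` plays the role of the descent from the root. The keys and TIDs are invented.

```python
import bisect

# Sorted (key, TID) pairs standing in for the leaf level of a B-tree.
entries = sorted([(42, (0, 0)), (7, (0, 1)), (19, (1, 0)), (3, (1, 1))])
keys = [key for key, _tid in entries]


def lookup(key):
    """Equality search: binary search for the key, return its TID or None."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return entries[i][1]
    return None


def range_scan(low, high):
    """TIDs of all keys with low <= key <= high, in key order."""
    lo = bisect.bisect_left(keys, low)
    hi = bisect.bisect_right(keys, high)
    return [tid for _key, tid in entries[lo:hi]]


print(lookup(19))
print(range_scan(5, 20))
```

Equality and range conditions both reduce to finding a position in sorted order, which is why this index type suits numbers, text, and dates.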
GIN
• Generalized Inverted Index (GIN)
Heap entries: Seoul (0,12), Seoul (4,2), Seoul (1,9), Seoul (4,1), Busan (2,2)
Posting list:
Seoul: (0,12), (4,2), (1,9), (4,1)
Busan: (2,2)
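The Seoul and Busan entries above can be turned into posting lists with a few lines of Python. This is a minimal sketch of the inverted-index idea only (one entry per distinct key, with all of that key's TIDs attached); it is not how GIN lays out its pages.

```python
from collections import defaultdict

# (key, TID) pairs as they appear in the heap on the slide.
heap_entries = [
    ("Seoul", (0, 12)),
    ("Seoul", (4, 2)),
    ("Seoul", (1, 9)),
    ("Seoul", (4, 1)),
    ("Busan", (2, 2)),
]


def build_posting_lists(entries):
    """Group TIDs by key: each distinct key is stored once."""
    posting = defaultdict(list)
    for key, tid in entries:
        posting[key].append(tid)
    return dict(posting)


posting = build_posting_lists(heap_entries)
print(posting["Seoul"])
print(posting["Busan"])
```

Five heap entries collapse into two index keys, which is why this layout suits columns with many duplicate values.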
[Figure: Posting tree]
[Figure: Posting list]
1. Text retrieval
postgres=# -- create a table with a text column
postgres=# CREATE TABLE t1 (id serial, t text);
CREATE TABLE
postgres=# CREATE INDEX t1_idx ON t1 USING gin (to_tsvector('english', t));
CREATE INDEX
postgres=# INSERT INTO t1 VALUES (1, 'a fat cat sat on a mat and ate a fat rat');
INSERT 0 1
postgres=# INSERT INTO t1 VALUES (2, 'a fat dog sat on a mat and ate a fat chop');
INSERT 0 1
postgres=# -- is there a row where column t contains the two words? (syntax contains some magic to hit index)
postgres=# SELECT * FROM t1 WHERE to_tsvector('english', t) @@ to_tsquery('fat & rat');
 id |                    t
----+------------------------------------------
  1 | a fat cat sat on a mat and ate a fat rat
(1 row)
postgres=# CREATE INDEX indexname ON tablename USING GIN (columnname);
2. Array
postgres=# -- create a table where one column consists of an integer array
postgres=# CREATE TABLE t2 (id serial, temperatures INTEGER[]);
CREATE TABLE
postgres=# CREATE INDEX t2_idx ON t2 USING gin (temperatures);
CREATE INDEX
postgres=# INSERT INTO t2 VALUES (1, '{11, 12, 13, 14}');
INSERT 0 1
postgres=# INSERT INTO t2 VALUES (2, '{21, 22, 23, 24}');
INSERT 0 1
postgres=# -- Is there a row with the two array elements 12 and 11?
postgres=# SELECT * FROM t2 WHERE temperatures @> '{12, 11}';
 id | temperatures
----+---------------
  1 | {11,12,13,14}
(1 row)
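The array query above can be read as an intersection of posting lists, one per requested element. The sketch below reuses the two rows of t2 and keeps row ids instead of TIDs for brevity. The function names are hypothetical, and treating an empty query as matching nothing is an assumption of this sketch, not PostgreSQL's behavior.

```python
# The two rows inserted into t2: id -> temperatures.
rows = {1: [11, 12, 13, 14], 2: [21, 22, 23, 24]}


def build_index(rows):
    """Inverted index: array element -> set of row ids containing it."""
    index = {}
    for row_id, values in rows.items():
        for value in values:
            index.setdefault(value, set()).add(row_id)
    return index


def contains_all(index, wanted):
    """Row ids whose array contains every element of `wanted`."""
    postings = [index.get(value, set()) for value in wanted]
    # Assumption of this sketch: an empty query matches nothing.
    return set.intersection(*postings) if postings else set()


index = build_index(rows)
print(contains_all(index, [12, 11]))
```

Only the posting lists for 12 and 11 are read; the array stored in row 2 is never inspected.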
GiST
• Supports operators such as “contains”, “left of”, and “overlaps”.
• Use cases: full text search, geometric operations (PostGIS, etc.), handling ranges (time, etc.)
• Supports KNN search; its structure is based on the B-tree and R-tree.
R-tree (Rectangle-tree)
[Figure: Linear Indexing]
[Figure: Multi-Dimensional]
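The point of grouping rectangles under bounding rectangles is that a search can skip whole groups. The following is a hand-built, two-level sketch under that assumption: the rectangles, their names, and the grouping are invented, and a real R-tree also handles insertion and node splitting, which is omitted here.

```python
# Rectangles are (xmin, ymin, xmax, ymax).
def overlaps(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def bounding_box(rects):
    """Smallest rectangle that encloses all of `rects`."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


# Hand-built leaves: each one groups rectangles that lie close together.
leaves = [
    {"a": (0, 0, 1, 1), "b": (1, 1, 2, 2)},
    {"c": (8, 8, 9, 9), "d": (9, 7, 10, 8)},
]
tree = [(bounding_box(list(leaf.values())), leaf) for leaf in leaves]


def search(tree, query):
    """Return (sorted names overlapping query, number of leaves opened)."""
    hits, leaves_visited = [], 0
    for mbr, leaf in tree:
        if not overlaps(mbr, query):
            continue  # the whole leaf is skipped without looking inside
        leaves_visited += 1
        hits.extend(name for name, rect in leaf.items() if overlaps(rect, query))
    return sorted(hits), leaves_visited


print(search(tree, (0.5, 0.5, 1.5, 1.5)))
```

The second leaf is rejected by a single bounding-box test, so rectangles c and d are never compared with the query.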
GiST
postgres=# CREATE INDEX indexname ON tablename USING GIST (columnname);

postgres=# -- create a table with a column of non-trivial type
postgres=# CREATE TABLE t3 (id serial, c circle);
CREATE TABLE
postgres=# CREATE INDEX t3_idx ON t3 USING gist(c);
CREATE INDEX
postgres=# INSERT INTO t3 VALUES (1, circle '((0, 0), 0.5)');
INSERT 0 1
postgres=# INSERT INTO t3 VALUES (2, circle '((1, 0), 0.5)');
INSERT 0 1
postgres=# INSERT INTO t3 VALUES (3, circle '((0.3, 0.3), 0.3)');
INSERT 0 1
postgres=# -- which circles lie in the bounds of the unit circle?
postgres=# SELECT * FROM t3 WHERE circle '((0, 0), 1)' @> c;
 id |        c
----+-----------------
  1 | <(0,0),0.5>
  3 | <(0.3,0.3),0.3>
(2 rows)
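To see why rows 1 and 3 come back, the containment test can be worked out by hand: one circle contains another when the distance between the centers plus the inner radius does not exceed the outer radius. The Python sketch below checks only that predicate on the three rows of t3; it does not model how GiST stores or searches them.

```python
import math

# The three rows of t3: id -> (center x, center y, radius).
circles = {
    1: (0.0, 0.0, 0.5),
    2: (1.0, 0.0, 0.5),
    3: (0.3, 0.3, 0.3),
}


def contains(outer, inner):
    """True if circle `inner` lies entirely within circle `outer`."""
    ox, oy, outer_r = outer
    ix, iy, inner_r = inner
    center_distance = math.hypot(ix - ox, iy - oy)
    return center_distance + inner_r <= outer_r


unit_circle = (0.0, 0.0, 1.0)
inside = [row_id for row_id, c in circles.items() if contains(unit_circle, c)]
print(inside)
```

Circle 2 reaches out to x = 1.5, beyond the unit circle's radius of 1, so it is the only row excluded.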
Supported data types
summary
• B-tree is ideal for unique values
• GIN is ideal for indexes with many duplicates
• GiST for everything else
Experiments lead to the following observations:
creation time - GIN takes about 3 times longer to build than GiST
size of index - GIN is 2-3 times larger than GiST
search time - GIN is about 3 times faster than GiST
update time - GIN is about 10 times slower than GiST
Thank you for listening.