
In-Memory Database

전준민, 정주성, 이한민, 곽하녹

Table of Contents

1. Introduction

2. Disk Resident DB vs In-Memory DB

3. Column Store

4. Durability

5. Data Overflow

6. Products of IMDB

7. Optimization Aspects on IMDB

1. Introduction

What is In-Memory Database (IMDB)?

Architecture

Rise of IMDB

Applications

Myths about IMDB

What is In-Memory Database (IMDB)?

• An in-memory database system is a database management system that stores data entirely in main memory.


Architecture

• Fast data access
• Algorithms optimized on main memory
• Efficient memory usage
• Durability

Rise of the IMDB

• Multicore Processors
• Cheaper and Bigger Memories
• Demands on Fast Databases


Applications

• Low-latency, high volume systems

Myths about IMDB

• Given the same amount of RAM, disk DBs can perform at the same speed as IMDBs (by using caching technology).

• If a RAM disk is created and a traditional disk DB is deployed on it, it delivers the same performance as an in-memory database.

• Both claims are myths: a disk DB still carries overhead that an IMDB avoids
  • writes to disk
  • buffer manager
  • indexes designed for disk
  • redundant data

2. Disk Resident DB (DRDB) vs In-Memory DB (IMDB)

DRDB vs IMDB : Overview

Indexes

Concurrency Control

DRDB vs IMDB : Overview [1]

| | DRDB | IMDB |
|---|---|---|
| File I/O | Carries file I/O burden | No file I/O burden |
| Storage Usage | Assumes storage is abundant | Uses storage more efficiently |
| Algorithms | Algorithms optimized for disk | Algorithms optimized for memory |
| CPU Cycles | More CPU cycles | Fewer CPU cycles |
| Persistence | Non-volatile | Volatile |
| Lock | Fine locks | Coarse locks |

Indexes: B+-Tree in DRDB [2]

• The redundant data are kept in some index structures, to reduce I/O.

Indexes: T-Tree in IMDB [3]

• The indexes in IMDBs are focused on reduced memory consumption and CPU cycles.

• In 1986, Lehman and Carey proposed the T-tree as an index structure for main memory databases.

• T-tree indexes are more efficient than B-trees in that they require less memory space and fewer CPU cycles.

Indexes: T-Tree in IMDB

• The T-tree evolved from AVL Trees and B-Trees.

Indexes: Hash indexes in IMDB

• Hash indexes are used for key-value based in-memory databases (cache servers) such as Redis and Memcached.

Concurrency Control

• In DRDBs, locking granules are small (fine-grained).
  • To reduce contention
  • To increase parallelism

• In IMDBs, locks are coarse-grained thanks to fast processing.
  • Locking granules can be as large as a relation or an entire database
  • No need to look up a hash table for locks
  • Serial scheduling is enough in most cases
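The serial-scheduling point can be sketched as follows. This is a minimal illustration, not how any particular IMDB implements it: `SerialExecutor` and `deposit` are hypothetical names, and the "database" is a plain dictionary. Running transactions one at a time behaves like holding a single database-wide lock for each transaction, so no per-record lock table is needed.

```python
from collections import deque


class SerialExecutor:
    """Runs queued transactions one at a time against an in-memory table."""

    def __init__(self):
        self.table = {}        # the whole "database": key -> value
        self.queue = deque()   # pending transactions, in arrival order

    def submit(self, txn):
        # txn is any callable that takes the table and reads or mutates it.
        self.queue.append(txn)

    def run(self):
        # Serial schedule: each transaction has the table to itself, which is
        # equivalent to one database-wide lock held for the whole transaction.
        results = []
        while self.queue:
            txn = self.queue.popleft()
            results.append(txn(self.table))
        return results


def deposit(account, amount):
    """Build a transaction that adds `amount` to `account` (hypothetical example)."""
    def txn(table):
        table[account] = table.get(account, 0) + amount
        return table[account]
    return txn


executor = SerialExecutor()
executor.submit(deposit("alice", 100))
executor.submit(deposit("alice", 50))
balances = executor.run()
```

Because each transaction finishes quickly when all data is in memory, the queue drains fast enough that the lost concurrency often matters little.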

3. Column Store

What is Column Store?

Benefits of Column Store

Delta Storage

What is Column Store?

• Column Store
  • stores data tables as columns of data rather than as rows of data
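The difference between the two layouts can be shown with a small sketch. The table contents and the helper `to_columns` are made up for illustration; a real column store keeps each column in a contiguous, typed array rather than a Python list.

```python
# Row layout: one record per entry, all attributes stored together.
rows = [
    {"id": 1, "city": "Seoul", "sales": 120},
    {"id": 2, "city": "Busan", "sales": 80},
    {"id": 3, "city": "Seoul", "sales": 200},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


# Column layout: one list per attribute, values in row order.
columns = to_columns(rows)

# The same aggregation over one attribute in both layouts.
total_from_rows = sum(row["sales"] for row in rows)   # visits every record
total_from_columns = sum(columns["sales"])            # visits one column only
```

The aggregation over the column layout reads only the `sales` values, which is why column stores suit scans and aggregations.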

Benefits of Column Store [4]

• Column stores are more suitable in IMDB than row stores
  • Better parallelism
  • Better compression
  • Faster data access
    • Using parallel processing.
    • Especially for aggregations.

Benefits of Column Store: Parallelism [5]

• Column storage can easily be separated into equal parts, which leads to effective parallel processing.
• Highly parallelized scan operations are available, which are faster than indexed searches.
• The row store cannot compete if processing is set-oriented and requires column operations, and most applications are based on set-oriented processing rather than direct tuple access.

Benefits of Column Store: Parallelism

• Highly parallelized scan operations using column stores are faster than using just ordinary indexes.
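A partitioned scan can be sketched as below. The helper names (`split_equal`, `count_matches`, `parallel_count`) and the worker count are assumptions for illustration. In CPython the threads mainly show the partitioning pattern rather than deliver a real speedup; an actual engine would run each chunk on its own core.

```python
from concurrent.futures import ThreadPoolExecutor


def split_equal(column, parts):
    """Split a column into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(column), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(column[start:end])
        start = end
    return chunks


def count_matches(chunk, predicate):
    """Scan one chunk and count the values that satisfy the predicate."""
    return sum(1 for value in chunk if predicate(value))


def parallel_count(column, predicate, workers=4):
    """Scan all chunks concurrently, then combine the partial counts."""
    chunks = split_equal(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda c: count_matches(c, predicate), chunks)
        return sum(partial)


sales = list(range(1, 101))                       # a made-up "sales" column
over_90 = parallel_count(sales, lambda v: v > 90)
```

Each worker needs only its own slice of the column and a final sum, so no coordination is required during the scan itself.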

Benefits of Column Store: Compression

• Column stores allow highly efficient compression because columns often contain only a few distinct values.
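One common way to exploit a small number of distinct values is dictionary encoding. The slides do not name a specific scheme, so treat this as an assumed example; the function names and the sample column are illustrative.

```python
def dictionary_encode(column):
    """Replace each value by a small integer code; return (dictionary, codes)."""
    dictionary = sorted(set(column))                       # distinct values only
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


city = ["Seoul", "Busan", "Seoul", "Seoul", "Daegu", "Busan"]
dictionary, codes = dictionary_encode(city)
```

With three distinct values, each entry needs only two bits instead of a full string, and the saving grows with the length of the column.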

Delta Storage [6]

• Since writing on compressed column stores in real time is inefficient, delta storage techniques are used.

• Delta Storage
  • optimized for write operations

• Main Storage
  • compressed column store

Delta Storage

• INSERT
  • An INSERT statement inserts a new record in the delta storage. The merge process will move the record from delta to main.

• DELETE
  • A DELETE statement will select the record and mark it as invalid by setting a flag (for main or delta). The merge process will delete the record from memory once there is no open transaction active for it anymore.

• UPDATE
  • An UPDATE statement will insert a new version of the record. The merge process will move the latest version from delta to main. Old versions will be deleted once there is no open transaction active for them anymore.

Delta Storage: Simplified View of Insert-Only Approach

Delta Storage

• The merge process starts when the delta storage grows big enough.
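The INSERT, DELETE, UPDATE and merge behaviour described above can be sketched with a toy table. Everything here is a simplifying assumption: `DeltaMainTable` is a hypothetical name, the merge trigger is a fixed record count, the main part is a sorted list standing in for the compressed column store, and the merge assumes that no open transaction still needs an old version.

```python
class DeltaMainTable:
    """Toy insert-only table: read-optimized main part plus write-optimized delta."""

    def __init__(self, merge_threshold=3):
        self.main = []    # records as [key, value, valid]; stands in for the compressed store
        self.delta = []   # same record layout, append-only
        self.merge_threshold = merge_threshold   # hypothetical trigger: delta record count

    def _invalidate(self, key):
        # Mark the current version as invalid by clearing its flag (main or delta).
        for record in self.main + self.delta:
            if record[0] == key and record[2]:
                record[2] = False

    def insert(self, key, value):
        self.delta.append([key, value, True])
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def delete(self, key):
        self._invalidate(key)

    def update(self, key, value):
        # Insert-only: the old version is flagged, a new version is appended.
        self._invalidate(key)
        self.insert(key, value)

    def get(self, key):
        for record in self.delta + self.main:
            if record[0] == key and record[2]:
                return record[1]
        return None

    def merge(self):
        # Assumes no open transaction still needs the invalidated versions.
        survivors = [r for r in self.main + self.delta if r[2]]
        self.main = sorted(survivors, key=lambda r: r[0])
        self.delta = []


table = DeltaMainTable(merge_threshold=3)
table.insert("a", 1)
table.insert("b", 2)
table.update("a", 10)   # third record in the delta, so a merge runs
table.delete("b")       # only flags the record; it is dropped at the next merge
```

Writes never touch the main part in place, which is what keeps the compressed store cheap to maintain.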

4. Durability

Logging and Checkpointing

Command Logging

NVM Logging

Durability

• Durability is difficult to support in IMDBs
• Many IMDBs have added durability via the following mechanisms
  • Checkpoints
  • Transaction logging

Checkpointing

• Checkpoints in DRDB
  • Bring pages on disk up to date
  • Reduce the work of recovery

• Checkpoints in IMDB
  • Make a copy of the data on disks (snapshot)
  • Truncate the logs
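How a snapshot and log truncation fit together in an IMDB can be sketched as follows. This is a minimal model under stated assumptions: `CheckpointedStore` is a hypothetical class, and the lists and dictionaries named `redo_log` and `snapshot` stand in for files on disk.

```python
import copy


class CheckpointedStore:
    """Toy key-value store with a redo log and snapshot checkpoints."""

    def __init__(self):
        self.data = {}       # in-memory state, lost on a crash
        self.redo_log = []   # stands in for the redo log file on disk
        self.snapshot = {}   # stands in for the checkpoint image on disk

    def put(self, key, value):
        self.redo_log.append((key, value))   # log the change before applying it
        self.data[key] = value

    def checkpoint(self):
        # Copy the full in-memory state, then drop the log entries it covers.
        self.snapshot = copy.deepcopy(self.data)
        self.redo_log.clear()

    def crash(self):
        self.data = {}

    def recover(self):
        # Load the last snapshot, then replay what was logged after it.
        self.data = copy.deepcopy(self.snapshot)
        for key, value in self.redo_log:
            self.data[key] = value


store = CheckpointedStore()
store.put("x", 1)
store.put("y", 2)
store.checkpoint()
store.put("x", 3)
log_length_after_checkpoint = len(store.redo_log)
store.crash()
store.recover()
```

Only the single write made after the checkpoint has to be replayed, which is why the log needs to be kept only until the next checkpoint.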

Logging and Checkpointing

[Figure: logging and checkpointing data flow. Labels: Transaction, Log Buffer, Memory Tablespace, log sync, Physical Disk, Checkpoint Image File, REDO Log File]

• Problem
  • Log I/O becomes a bottleneck

• How long do we need to keep the log?
  • Until the next checkpoint

Logging and Checkpointing [7]

• TPC-C benchmarking on DRDBs (New Order transaction)
  • Logging takes up a non-trivial portion of the transaction cost
  • The portion is larger for IMDBs

Command Logging [8]

• Light-weight, coarse-grained logging technique
• Logical logging
• Advantages
  • Writes substantially fewer bytes per transaction than physical logging
  • Reduces run-time overhead
• Disadvantages
  • Slow recovery

• Failures that require recovery to ensure system availability are much less frequent

• 1.5X higher throughput than a main-memory optimized implementation of physical logging
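The idea of logging the command rather than its effects can be sketched as below. This is an illustration only: `PROCEDURES`, `procedure`, `transfer` and `CommandLoggedDB` are hypothetical names, the initial state stands in for the last checkpoint, and replay is only correct when the logged procedures are deterministic.

```python
PROCEDURES = {}   # registry of stored procedures, looked up by name during replay


def procedure(fn):
    """Register a function as a stored procedure (hypothetical helper)."""
    PROCEDURES[fn.__name__] = fn
    return fn


@procedure
def transfer(db, src, dst, amount):
    db[src] -= amount
    db[dst] += amount


class CommandLoggedDB:
    def __init__(self, initial):
        self.initial = dict(initial)   # stands in for the last checkpoint
        self.db = dict(initial)
        self.command_log = []          # one small entry per transaction

    def execute(self, name, *args):
        # Log only the procedure name and its parameters, not the modified tuples.
        self.command_log.append((name, args))
        PROCEDURES[name](self.db, *args)

    def recover(self):
        # Recovery re-executes every logged transaction, which is why it is slow.
        self.db = dict(self.initial)
        for name, args in self.command_log:
            PROCEDURES[name](self.db, *args)
        return self.db


db = CommandLoggedDB({"alice": 100, "bob": 0})
db.execute("transfer", "alice", "bob", 30)
db.execute("transfer", "alice", "bob", 20)
state_before_crash = dict(db.db)
db.db = {}                  # simulate losing the in-memory state
recovered = db.recover()
```

A physical log would record the new balance of every modified row instead, which means more bytes per transaction but no re-execution at recovery time.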

NVM Logging [9]

• NVM (Non-Volatile Memory)
  • low read/write latency like DRAM
  • persistent write like SSD

| | DRAM | NAND Flash | NVM |
|---|---|---|---|
| Byte-Addressable | Yes | No | Yes |
| Capacity | 1X | 4X | 2-4X |
| Latency | 1X | 400X | 3-5X |

NVM+DRAM Architecture

• DBMS relies on both DRAM and NVM

5. Data Overflow

Anti-caching

Project Siberia

Data overflow

• Datasets may not fit in DRAM
• IMDB Solutions
  • Anti-caching
  • Project Siberia

Anti-caching [10]

• Used in H-Store
• Cold data is moved to disk in a safe manner
• Bloom filter used for tracking data
• Manages cold data by maintaining an LRU chain

Anti-caching

• Fine-grained eviction
  • eviction is performed at tuple-level, not page-level

• Non-blocking fetches
  • a transaction that accesses evicted data is simply aborted and then restarted at a later point
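Tuple-level eviction and the abort-and-restart pattern can be sketched as follows. This is a simplified model, not H-Store's implementation: `AntiCacheTable`, `EvictedAccess` and `run_transaction` are hypothetical names, a dictionary stands in for the on-disk store, and the fetch happens synchronously here, whereas the point of non-blocking fetches is that other transactions keep running while the data is brought back.

```python
from collections import OrderedDict


class EvictedAccess(Exception):
    """Raised when a transaction touches a tuple that has been evicted."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


class AntiCacheTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.hot = OrderedDict()   # LRU chain: least recently used tuple first
        self.cold = {}             # stands in for the on-disk anti-cache

    def _evict_if_needed(self):
        # Eviction works on single tuples, not on pages.
        while len(self.hot) > self.capacity:
            key, value = self.hot.popitem(last=False)
            self.cold[key] = value

    def put(self, key, value):
        self.cold.pop(key, None)
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._evict_if_needed()

    def get(self, key):
        if key in self.cold:
            raise EvictedAccess(key)   # abort instead of blocking on disk
        self.hot.move_to_end(key)      # mark as most recently used
        return self.hot[key]

    def fetch(self, key):
        # Bring an evicted tuple back into memory.
        self.put(key, self.cold[key])


def run_transaction(table, txn, max_restarts=3):
    """Run txn; on an evicted access, fetch the tuple and restart the transaction."""
    for _ in range(max_restarts + 1):
        try:
            return txn(table)
        except EvictedAccess as miss:
            table.fetch(miss.key)
    raise RuntimeError("too many restarts")


table = AntiCacheTable(capacity=2)
table.put("t1", "a")
table.put("t2", "b")
table.put("t3", "c")     # t1 is the least recently used tuple and is evicted
value = run_transaction(table, lambda t: t.get("t1"))
```

Fetching `t1` pushes `t2` out in turn, so memory always holds the most recently used tuples.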

Project Siberia [11]

• Used in Hekaton
• Automatically and transparently maintains cold data on cheaper secondary storage
• Allows more data to fit in memory
• Log-based management of cold data

6. Products of IMDB

H-Store / VoltDB

Hekaton

SAP HANA

In-memory NoSQL Databases

Products of IMDB

H-Store / VoltDB

• Distributed row-based in-memory relational database
• Targeted for high-performance OLTP processing
• Light-weight logging strategy
• Anti-caching

Hekaton

• Memory-optimized OLTP engine
• Fully integrated into Microsoft SQL Server
• Multi-version concurrency control
• Project Siberia

SAP HANA

• A distributed in-memory database featured for the integration of OLTP and OLAP

• Provides rich data analytics functionality by offering multiple query language interfaces (e.g., standard SQL, SQLScript, MDX, WIPE, FOX and R)

SAP HANA

• Three-level column-oriented unified table structure

In-memory NoSQL Databases

• RAMCloud
  • Distributed in-memory key-value store, featured for low latency, high availability and high memory utilization

• Bitsy
  • Embeddable in-memory graph database that implements the Blueprints API, with ACID guarantees on transactions based on optimistic concurrency control

Comparison of IMDB [12]

| Category | Systems | Data Model | Workloads | Indexes | Fault Tolerance | Memory Overflow |
|---|---|---|---|---|---|---|
| Relational Databases | H-Store | relation (row) | OLTP | hashing, B+-tree, binary tree | command logging, checkpoint, replica | anti-caching |
| Relational Databases | Hekaton | relation (row) | OLTP | latch-free hashing, Bw-tree | logging, checkpoint, replica | Project Siberia |
| Relational Databases | SAP HANA | relation, graph, text | OLTP, OLAP | timeline index | logging, checkpoint, standby server | table/partition-level swapping |
| NoSQL Databases | RAMCloud | key-value | object operations | hashing | logging, replica | N/A |
| Graph Databases | Bitsy | N/A | OLTP | optimistic concurrency control | logging, backup | N/A |

7. Optimization Aspects on IMDB

Optimization Aspects on IMDB [12]

| Aspects | Concerns | Related Work |
|---|---|---|
| Index | cache consciousness, time/space efficiency | T-Tree, CSS-Trees, CSB+-Trees, BD-Tree |
| Data Layout | cache consciousness, space efficiency | columnar layout, HANA Hybrid Store, log structure |
| Concurrency Control | overhead, correctness | virtual snapshot, transactional memory |
| Query Processing | code locality, time efficiency | stored procedure, JIT compilation, sorting |
| Fault Tolerance | durability, correlated failures, availability | group commit and log coalescing, NVM, command logging, remote logging |
| Data Overflow | locality, paging, hot/cold classification | anti-caching, Hekaton Siberia, data compression, virtual memory management, pointer swizzling |

References

[1] Garcia-Molina, Hector, and Kenneth Salem. "Main memory database systems: An overview." Knowledge and Data Engineering, IEEE Transactions on 4.6 (1992): 509-516.

[2] Comer, Douglas. "Ubiquitous B-tree." ACM Computing Surveys (CSUR) 11.2 (1979): 121-137.

[3] Lehman, Tobin J., and Michael J. Carey. "A study of index structures for main memory database management systems." Conference on Very Large Data Bases. Vol. 294. 1986.

[4] Abadi, Daniel J., Samuel R. Madden, and Nabil Hachem. "Column-stores vs. row-stores: how different are they really?" Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 2008.

[5] Plattner, Hasso. "A common database approach for OLTP and OLAP using an in-memory column database." Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 2009.

[6] Färber, Franz, et al. "The SAP HANA Database--An Architecture Overview." IEEE Data Eng. Bull. 35.1 (2012): 28-33.

[7] Harizopoulos, Stavros, et al. "OLTP through the looking glass, and what we found there." Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 2008.

[8] Malviya, Nirmesh, et al. "Rethinking main memory oltp recovery." Data Engineering (ICDE), 2014 IEEE 30th International Conference on. IEEE, 2014.

[9] DeBrabant, Justin, et al. "A Prolegomenon on OLTP Database Systems for Non-Volatile Memory." Proceedings of the VLDB Endowment 7.14 (2014).

[10] DeBrabant, Justin, et al. "Anti-caching: A new approach to database management system architecture." Proceedings of the VLDB Endowment 6.14 (2013): 1942-1953.

[11] Eldawy, Ahmed, Justin Levandoski, and Paul Larson. "Trekking through siberia: Managing cold data in a memory-optimized database." Proceedings of the VLDB Endowment 7.11 (2014).

[12] Zhang, Hao, et al. "In-memory big data management and processing: A survey." (2015).
