Cassandra data modeling: data modeling for NoSQL DBMS...

Post on 19-Jul-2015

Category: Technology

Cassandra data modeling

Andrey Kozlov

Cassandra features

● “Masterless” architecture

● Scalable

● Fast inserts

SSTable for storing data

Cassandra Data Structure

● Column Family

  o Row

    - Column

Map<RowKey, Map<ColumnKey, ColumnValue>>
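The nested-map view above can be sketched in Python. This is only an illustrative model, assuming plain dictionaries stand in for Cassandra's storage structures; the helper names `put_column` and `get_row` are hypothetical and not part of any Cassandra API. The sample values are the employee records used later in the deck.

```python
from typing import Dict

# Hypothetical in-memory stand-in for one column family:
# Map<RowKey, Map<ColumnKey, ColumnValue>>
ColumnFamily = Dict[str, Dict[str, object]]


def put_column(cf: ColumnFamily, row_key: str, column_key: str, value: object) -> None:
    """Store one column value under the given row key."""
    cf.setdefault(row_key, {})[column_key] = value


def get_row(cf: ColumnFamily, row_key: str) -> Dict[str, object]:
    """Return the columns of a row, ordered by column key."""
    return dict(sorted(cf.get(row_key, {}).items()))


employees: ColumnFamily = {}
put_column(employees, "john", "role", "dev")
put_column(employees, "john", "age", 37)
put_column(employees, "eric", "age", 38)
put_column(employees, "eric", "role", "ceo")

print(get_row(employees, "john"))
```

Each outer key plays the role of a row key, and each inner dictionary holds that row's columns, so a lookup by row key returns all of its columns at once.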

Simple Data Table

CREATE TABLE employees (
  name text,
  age int,
  role text,
  PRIMARY KEY (name));

INSERT INTO employees (name, age, role) VALUES ('john', 37, 'dev');
INSERT INTO employees (name, age, role) VALUES ('eric', 38, 'ceo');

 name | age | role
------+-----+------
 eric | 38  | ceo
 john | 37  | dev

      | age | role
 john | 37  | dev

      | age | role
 eric | 38  | ceo

Data Table with Composite key

CREATE TABLE employees (
  company text,
  name text,
  age int,
  role text,
  PRIMARY KEY (company, name)
);

 company | name | age | role
---------+------+-----+------
 OSC     | eric | 38  | ceo
 OSC     | john | 37  | dev
 RKG     | anya | 29  | lead
 RKG     | ben  | 27  | dev
 RKG     | chan | 35  | ops

     | eric:age | eric:role | john:age | john:role
 OSC | 38       | ceo       | 37       | dev

     | anya:age | anya:role | ben:age | ben:role | chan:age | chan:role
 RKG | 29       | lead      | 27      | dev      | 35       | ops
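The mapping from table rows to one wide row per company can be sketched as follows. This is a simplified model, assuming a dictionary per partition and a `name:column` string as the cell key; the function name `to_wide_rows` is hypothetical, and the real on-disk encoding is not shown in the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Rows of the employees table as shown above: (company, name, age, role)
rows: List[Tuple[str, str, int, str]] = [
    ("OSC", "eric", 38, "ceo"),
    ("OSC", "john", 37, "dev"),
    ("RKG", "anya", 29, "lead"),
    ("RKG", "ben", 27, "dev"),
    ("RKG", "chan", 35, "ops"),
]


def to_wide_rows(cql_rows: List[Tuple[str, str, int, str]]) -> Dict[str, Dict[str, object]]:
    """Group rows by the first key part (company) and prefix every
    cell name with the second key part (name)."""
    storage: Dict[str, Dict[str, object]] = defaultdict(dict)
    for company, name, age, role in cql_rows:
        storage[company][f"{name}:age"] = age
        storage[company][f"{name}:role"] = role
    return dict(storage)


wide = to_wide_rows(rows)
for partition_key, cells in wide.items():
    print(partition_key, cells)
```

The first component of the composite key selects the storage row, while the second component becomes part of each cell name, which is why all employees of one company end up in the same row.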

Select by Composite key

CREATE TABLE no_column_skip (
  a int,
  b int,
  c int,
  d int,
  e int,
  PRIMARY KEY (a, b, c, d));

Valid:

SELECT ... WHERE a=0 AND (b, c) > (1, 2)

SELECT ... WHERE a=0 AND (b) > (3)

SELECT ... WHERE a=0 AND (b, c, d) > (1, 2, 5)

Not Valid:

SELECT ... WHERE a=0 AND (b, d) > (1, 2)

SELECT ... WHERE a=0 AND (c) > (3)

SELECT ... WHERE (b, c, d) > (1, 2)
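The pattern behind the valid and invalid queries above can be expressed as a small check. This is a simplified sketch that covers only the cases on this slide, not the full CQL validation rules; the function name `is_valid_slice` and its parameters are hypothetical. It assumes a query is accepted when the partition key `a` is restricted by equality, the tuple of columns is a prefix of the clustering columns `(b, c, d)` with no column skipped, and the value tuple has the same length as the column tuple.

```python
from typing import Sequence

PARTITION_KEY = "a"
CLUSTERING_COLUMNS = ("b", "c", "d")  # order taken from PRIMARY KEY (a, b, c, d)


def is_valid_slice(
    equals: Sequence[str],
    tuple_columns: Sequence[str],
    tuple_values: Sequence[int],
) -> bool:
    """Return True if the tuple comparison follows the rules sketched above."""
    if PARTITION_KEY not in equals:
        return False  # the partition key must be restricted
    if len(tuple_columns) != len(tuple_values):
        return False  # both sides of the comparison need the same arity
    n = len(tuple_columns)
    # The compared columns must start at b and skip nothing.
    return n > 0 and tuple(tuple_columns) == CLUSTERING_COLUMNS[:n]


# Valid examples from the slide
print(is_valid_slice(["a"], ["b", "c"], [1, 2]))
print(is_valid_slice(["a"], ["b"], [3]))
print(is_valid_slice(["a"], ["b", "c", "d"], [1, 2, 5]))

# Invalid examples from the slide
print(is_valid_slice(["a"], ["b", "d"], [1, 2]))
print(is_valid_slice(["a"], ["c"], [3]))
print(is_valid_slice([], ["b", "c", "d"], [1, 2]))
```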

Many to Many

Normalized with one additional table

Normalized with two tables

Partial denormalization

Partial denormalization with composite keys

Event data for time period

Secondary index template

CREATE TABLE playlists (
  id uuid,
  song_order int,
  song_id uuid,
  title text,
  album text,
  artist text,
  PRIMARY KEY (id, song_order));

-- playlist_1, song_1, etc. are placeholders for uuid literals
INSERT INTO playlists (id, song_order, song_id, title, artist, album)
  VALUES (playlist_1, 1, song_1, 'La Grange', 'ZZ Top', 'Tres Hombres');

INSERT INTO playlists (id, song_order, song_id, title, artist, album)
  VALUES (playlist_1, 2, song_2, 'Moving in Stereo', 'Fu Manchu', 'We Must Obey');

INSERT INTO playlists (id, song_order, song_id, title, artist, album)
  VALUES (playlist_2, 1, song_3, 'Hang On', 'Fu Manchu', 'California Crossing');

CREATE INDEX ON playlists (artist);

Secondary index template

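 id         | song_order | song_id | title            | artist    | album
------------+------------+---------+------------------+-----------+---------------------
 playlist_1 | 1          | song_1  | La Grange        | ZZ Top    | Tres Hombres
            | 2          | song_2  | Moving in Stereo | Fu Manchu | We Must Obey
            | ...
 playlist_2 | 1          | song_3  | Hang On          | Fu Manchu | California Crossing
            | ...

 artist    | playlist_id | song_order
-----------+-------------+------------
 ZZ Top    | playlist_1  | 1
           | ...
 Fu Manchu | playlist_1  | 2
           | playlist_2  | 1
           | ...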
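The second table above is an inverted lookup from artist to the primary keys of matching rows. A minimal sketch of that idea, assuming an in-memory list of rows and a dictionary as the index; `build_artist_index` and `songs_by_artist` are hypothetical helpers, not how Cassandra implements `CREATE INDEX` internally. The row values are taken from the tables on this slide.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (id, song_order, song_id, title, artist, album)
playlists: List[Tuple[str, int, str, str, str, str]] = [
    ("playlist_1", 1, "song_1", "La Grange", "ZZ Top", "Tres Hombres"),
    ("playlist_1", 2, "song_2", "Moving in Stereo", "Fu Manchu", "We Must Obey"),
    ("playlist_2", 1, "song_3", "Hang On", "Fu Manchu", "California Crossing"),
]


def build_artist_index(rows) -> Dict[str, List[Tuple[str, int]]]:
    """Map each artist to the (playlist id, song_order) keys of its rows."""
    index: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for playlist_id, song_order, _song_id, _title, artist, _album in rows:
        index[artist].append((playlist_id, song_order))
    return dict(index)


artist_index = build_artist_index(playlists)


def songs_by_artist(artist: str) -> List[str]:
    """Resolve index entries back to song titles in the base table."""
    wanted = set(artist_index.get(artist, []))
    return [row[3] for row in playlists if (row[0], row[1]) in wanted]


print(artist_index["Fu Manchu"])
print(songs_by_artist("Fu Manchu"))
```

A query by artist first reads the index entry and then fetches the referenced rows by primary key, which matches the two-table picture on the slide.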

Secondary index distribution

Thank you for your attention
