Using Q4M: a message queue storage engine for MySQL
Cybozu Labs, Inc.
Kazuho Oku

Background

Apr 22 2009 Using Q4M 2

Who am I?

- Name: Kazuho Oku (奥 一穂)
- Original developer of Palmscape / Xiino
    - the oldest web browser for Palm OS
- Working at Cybozu Labs since 2005
    - research subsidiary of Cybozu, Inc. in Japan

About Cybozu, Inc.

- Japan's largest groupware vendor
- Mostly sells software products rather than services
- Some of our apps bundle MySQL as storage

About Pathtraq

- Started in Aug. 2007
- Web ranking service
    - one of Japan's largest
    - like Alexa, but semi-realtime and per-page
    - running on MySQL
- Need for a fast and reliable message relay
    - for communication between the main server and content analysis server(s)

Design Goals of Q4M

- Robust
    - do not lose data on OS crash or power failure
- Fast
    - transfer thousands of messages per second
- Easy to use
    - use SQL for access / maintenance
    - integration into MySQL
        - no more separate daemons to take care of

What is a Message Queue?

- Middleware for persistent asynchronous communication
    - communicate between fixed pairs (parties)
    - a.k.a. Message Oriented Middleware
- MQ is intermediate storage
    - RDBMS is persistent storage
- Senders / receivers may go down

Minimal Configuration of a MQ

Senders and receivers access a single queue

[Diagram: Sender → Queue → Receiver]

MQ and Relays

Separate queue for sender and receiver

Messages relayed between queues

[Diagram: Sender → Queue → Relay → Queue → Receiver]

Merits of Message Relays

- Destination can be changed easily
    - relays may transfer messages to different locations depending on their headers
- Robustness against network failure
    - no loss or duplicates when the relay fails
- Logging, multicasting, etc.

Message Brokers

- Publish / subscribe model
- Separation between components and their integration
- Components read / write to predefined queues
- Integration is the definition of routing rules between the message queues
- Messages are often transformed (filtered) within the relay agent

What about Q4M?

- Q4M itself is a message queue
- Can connect Q4M instances to create a message relay
- Provides API for creating message relays and brokers

Performance of Q4M

- Over 7,000 messages/sec.
    - message size: avg. 512 bytes
    - syncing to disk
- More than enough for most needs
    - if you need more, just scale out
- Can coexist with other storage engines without sacrificing their performance

see http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php

Applications of Q4M

Asynchronous Updates

Mixi (one of Japan's largest SNSs) uses Q4M to buffer writes to the DB, offloading peak demand

from http://alpha.mixi.co.jp/blog/?p=272

Connecting Distant Servers

Pathtraq uses Q4M to create a relay between its database and content analysis processes

[Diagram: Pathtraq DB and Content Analysis Processes, linked by a MySQL connection over SSL with gzip. Contents to be analyzed flow to the analysis processes; results of the analysis flow back.]

To Prefetch Data

- livedoor Reader (web-based feed aggregator) uses Q4M to prefetch data from database to memcached
- Uses Q4M for scheduling web crawlers as well

from http://d.hatena.ne.jp/mala/20081212/1229074359

Scheduling Web Crawlers

- Web crawlers with retry-on-error
- Sample code included in the Q4M distribution

[Diagram: URL DB, Request Queue, Spiders, Retry Queue, Re-scheduler. Arrow labels: "Read URL", "Store Result", "If failed to fetch, store URL in retry queue".]

Delayed Content Generation

- Hatetter (RSS feed-to-Twitter-API gateway) uses Q4M to delay content generation
- Source code: github.com/yappo/website-hatetter

User Notifications

For sending notifications from web services

[Diagram: App. Logic, DB, Queue(s), SMTP Agent, IM Agent]

Installing Q4M

- Compatible with MySQL 5.1
- Download from q4m.31tools.com
    - binary releases available for some platforms
- Installing from source:
    - requires source code of MySQL
    - ./configure && make && make install
    - run support-files/install.sql

Configuration Options of Q4M

- --with-sync=no|fsync|fdatasync|fcntl
    - controls synchronization to disk
    - default: fdatasync on Linux
- --enable-mmap
    - mmap'ed reads lead to higher throughput
    - default: yes
- --with-delete=pwrite|msync
    - msync recommended on Linux >= 2.6.20 if you need really high performance

Q4M Basics

The Model

[Diagram: several Publishers → Q4M table → Subscribers]

- Various publishers write to the queue
- A set of subscribers consumes the entries in the queue

Creating a Q4M Table

ENGINE=QUEUE creates a Q4M table

No primary keys or indexes

Sorted by insertion order (it’s a queue)

    mysql> CREATE TABLE qt (
        ->   id int(10) unsigned NOT NULL,
        ->   message varchar(255) NOT NULL
        -> ) ENGINE=QUEUE;
    Query OK, 0 rows affected (0.42 sec)

Modifying Data on a Q4M Table

No restrictions for INSERT and DELETE

No support for UPDATE

    mysql> INSERT INTO qt (id,message)
        -> VALUES
        ->   (1,'Hello'),
        ->   (2,'Bonjour'),
        ->   (3,'Hola');
    Query OK, 3 rows affected (0.02 sec)

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    |  2 | Bonjour |
    |  3 | Hola    |
    +----+---------+
    3 rows in set (0.00 sec)

SELECT from a Q4M Table

Works the same as other storage engines

SELECT COUNT(*) is cached

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    |  2 | Bonjour |
    |  3 | Hola    |
    +----+---------+
    3 rows in set (0.00 sec)

    mysql> SELECT COUNT(*) FROM qt;
    +----------+
    | COUNT(*) |
    +----------+
    |        3 |
    +----------+
    1 row in set (0.00 sec)

How to subscribe to a queue?


Calling queue_wait()

After calling, only one row becomes visible from the connection

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    |  2 | Bonjour |
    |  3 | Hola    |
    +----+---------+
    3 rows in set (0.00 sec)

    mysql> SELECT queue_wait('qt');
    +------------------+
    | queue_wait('qt') |
    +------------------+
    |                1 |
    +------------------+
    1 row in set (0.00 sec)

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    +----+---------+
    1 row in set (0.00 sec)

OWNER Mode and NON-OWNER Mode

In OWNER mode, only the OWNED row is visible

OWNED row becomes invisible from other connections

rows of other storage engines are visible

[Diagram: in NON-OWNER mode the connection sees all rows (1,'Hello'; 2,'Bonjour'; 3,'Hola'). queue_wait() switches to OWNER mode, where only the owned row (1,'Hello') is visible. queue_end() or queue_abort() switches back to NON-OWNER mode.]

Returning to NON-OWNER mode

By calling queue_abort, the connection returns to NON-OWNER mode

    mysql> SELECT QUEUE_ABORT();
    +---------------+
    | QUEUE_ABORT() |
    +---------------+
    |             1 |
    +---------------+
    1 row in set (0.00 sec)

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    |  2 | Bonjour |
    |  3 | Hola    |
    +----+---------+
    3 rows in set (0.01 sec)

Consuming a Row

By calling queue_end, the OWNED row is deleted, and connection returns to NON-OWNER mode

    mysql> SELECT queue_wait('qt');
    (snip)
    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  1 | Hello   |
    +----+---------+
    1 row in set (0.01 sec)

    mysql> SELECT queue_end();
    +-------------+
    | queue_end() |
    +-------------+
    |           1 |
    +-------------+
    1 row in set (0.01 sec)

    mysql> SELECT * FROM qt;
    +----+---------+
    | id | message |
    +----+---------+
    |  2 | Bonjour |
    |  3 | Hola    |
    +----+---------+
    2 rows in set (0.00 sec)
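The switching between NON-OWNER and OWNER mode shown in the mysql sessions can be mimicked with a small in-memory model. This is only a sketch of the visibility rules as the slides describe them: the class name Q4MTable and the connection labels are hypothetical, there is no blocking or timeout, and nothing here is Q4M's actual implementation.

```python
class Q4MTable:
    """Toy model of row visibility in a Q4M table (hypothetical, not the real engine)."""

    def __init__(self, rows):
        self.rows = list(rows)   # kept in insertion order
        self.owned = {}          # connection -> owned row (or None); key present = OWNER mode

    def select(self, conn):
        if conn in self.owned:                       # OWNER mode: only the owned row
            row = self.owned[conn]
            return [] if row is None else [row]
        taken = [r for r in self.owned.values() if r is not None]
        return [r for r in self.rows if r not in taken]   # rows owned elsewhere are hidden

    def queue_wait(self, conn):
        if conn in self.owned:
            self.queue_end(conn)     # queue_wait in OWNER mode deletes the owned row
        free = self.select(conn)     # rows nobody owns
        self.owned[conn] = free[0] if free else None   # OWNER mode even with no row
        return 1 if free else 0

    def queue_abort(self, conn):
        self.owned.pop(conn, None)   # release the row, back to NON-OWNER mode
        return 1

    def queue_end(self, conn):
        row = self.owned.pop(conn, None)
        if row is not None:
            self.rows.remove(row)    # consume (delete) the owned row
        return 1


qt = Q4MTable([(1, "Hello"), (2, "Bonjour"), (3, "Hola")])
before = qt.select("conn1")
qt.queue_wait("conn1")
owner_view = qt.select("conn1")      # what the owning connection sees
other_view = qt.select("conn2")      # what every other connection sees
qt.queue_abort("conn1")
after_abort = qt.select("conn1")
qt.queue_wait("conn1")
qt.queue_end("conn1")
after_end = qt.select("conn1")
```

The demo at the bottom replays the sessions from the slides: own a row, abort, own it again, then consume it.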

Writing a Subscriber

- Call two functions: queue_wait, queue_end
- Multiple subscribers can be run concurrently
    - each row in the queue is consumed only once

    while (true) {
        SELECT queue_wait('qt');      # switch to owner mode
        rows := SELECT * FROM qt;     # obtain data
        if (count(rows) != 0)         # if we have any data, then
            handle_row(rows[0]);      # consume the row
        SELECT queue_end();           # erase the row from queue
    }
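One iteration of the queue_wait / queue_end subscriber loop might look like this in Python, written against the generic DB-API 2.0 cursor interface (cursor(), execute(), fetchall()). Only the SQL statements come from the slides. The function name consume_one and the recording test double are hypothetical, and calling queue_abort() when the handler fails is a choice made for this sketch so that the row is released instead of deleted.

```python
def consume_one(conn, handle_row):
    """One pass of the subscriber loop; returns True if a row was handled."""
    cur = conn.cursor()
    cur.execute("SELECT queue_wait('qt')")   # enter OWNER mode
    cur.fetchall()
    cur.execute("SELECT * FROM qt")          # in OWNER mode only the owned row is visible
    rows = cur.fetchall()
    try:
        if rows:
            handle_row(rows[0])
    except Exception:
        cur.execute("SELECT queue_abort()")  # assumption: release the row so it can be retried
        cur.fetchall()
        raise
    cur.execute("SELECT queue_end()")        # delete the owned row, back to NON-OWNER mode
    cur.fetchall()
    return bool(rows)


class RecordingCursor:
    """Test double that records SQL and returns canned rows (not a real driver)."""

    def __init__(self, log, owned_rows):
        self.log, self.owned_rows, self._result = log, owned_rows, []

    def execute(self, sql, params=None):
        self.log.append(sql)
        is_select_rows = sql.startswith("SELECT * FROM")
        self._result = list(self.owned_rows) if is_select_rows else [(1,)]

    def fetchall(self):
        return self._result


class RecordingConnection:
    def __init__(self, owned_rows):
        self.log, self.owned_rows = [], owned_rows

    def cursor(self):
        return RecordingCursor(self.log, self.owned_rows)


def failing_handler(row):
    raise RuntimeError("handler failed")


seen = []
ok_conn = RecordingConnection([(1, "Hello")])
consume_one(ok_conn, seen.append)
ok_log = ok_conn.log

failed_conn = RecordingConnection([(1, "Hello")])
try:
    consume_one(failed_conn, failing_handler)
except RuntimeError:
    pass
failed_log = failed_conn.log
```

With a real MySQL driver, conn would be an open connection to a server that has Q4M installed, and consume_one would be called in a loop.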

Writing a Subscriber (cont'd)

- Or call queue_wait as a condition
    - Warning: conflicts with trigger-based insertions

    while (true) {
        rows := SELECT * FROM qt WHERE queue_wait('qt');
        if (count(rows) != 0)
            handle_row(rows[0]);
        SELECT queue_end();
    }

The Model – with code

[Diagram: each Publisher runs "INSERT INTO queue ..." against the Q4M table; each Subscriber runs the loop below.]

    while (true) {
        rows := SELECT * FROM qt WHERE queue_wait('qt');
        if (count(rows) != 0)
            handle_row(rows[0]);
        SELECT queue_end();
    }

Three Functions in Detail

queue_wait(table)

- Enters OWNER mode
    - 0 or 1 row becomes OWNED
    - enters OWNER mode even if no rows were available
- Default timeout: 60 seconds
- Returns 1 if a row is OWNED (0 on timeout)
- If called within OWNER mode, the owned row is deleted

Revisiting Subscriber Code

Calls to queue_end just before queue_wait can be omitted

    while (true) {
        rows := SELECT * FROM qt WHERE queue_wait('qt');
        if (count(rows) != 0)
            handle_row(rows[0]);
        SELECT queue_end();   # can be omitted: the next queue_wait deletes the owned row
    }
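Because queue_wait deletes the row owned from the previous iteration, a subscriber loop with no queue_end still consumes each row exactly once. The sketch below illustrates this with two subscribers taking turns on an in-memory stand-in. ToyQueue and subscriber_step are hypothetical names, and the model does not block or time out the way the real function does.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a Q4M table (hypothetical; no blocking, no timeout)."""

    def __init__(self, rows):
        self.pending = deque(rows)   # rows that no subscriber owns yet
        self.owned = {}              # subscriber name -> row it currently owns

    def queue_wait(self, who):
        # A row still owned from the previous iteration is deleted here,
        # so an explicit queue_end() right before queue_wait() is redundant.
        self.owned.pop(who, None)
        if not self.pending:
            return 0                 # the real function would wait up to the timeout
        self.owned[who] = self.pending.popleft()
        return 1

    def select(self, who):
        # OWNER mode: only the owned row (if any) is visible
        return [self.owned[who]] if who in self.owned else []


def subscriber_step(queue, who, handled):
    """Loop body without queue_end(); returns False when nothing was owned."""
    if not queue.queue_wait(who):
        return False
    handled.append((who, queue.select(who)[0]))
    return True


queue = ToyQueue(["Hello", "Bonjour", "Hola"])
handled = []
while any([subscriber_step(queue, name, handled) for name in ("sub1", "sub2")]):
    pass  # two subscribers take turns; real ones would run concurrently
```

One caveat the model leaves out: if a subscriber disconnects after handling a row but before its next queue_wait, the row is released rather than deleted, so the handler should tolerate seeing it again.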

Conditional queue_wait()

- Consume only rows that match a condition
    - rows that do not match will be left untouched
    - only numeric columns can be checked
    - fast: the condition is tested once per row

examples:
    SELECT queue_wait('table:(col_a*3)+col_b<col_c');
    SELECT queue_wait('table:retry_count<5');
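The selection rule of conditional queue_wait (own the first row whose numeric columns satisfy the condition, leave the others untouched) can be sketched in Python as follows. The helper conditional_wait, the table name crawl_queue and the sample rows are all hypothetical. Python's eval stands in for Q4M's own expression evaluator purely for brevity, and the two only agree on simple arithmetic and comparisons like the ones in the slide's examples.

```python
def conditional_wait(rows, tbl_cond):
    """Return (table, index of the first matching row), or (table, None) if none match."""
    table, _, cond = tbl_cond.partition(":")
    for index, row in enumerate(rows):
        # eval() with builtins disabled keeps the sketch short; column names
        # are looked up in the row dictionary
        if not cond or eval(cond, {"__builtins__": {}}, dict(row)):
            return table, index
    return table, None


rows = [
    {"id": 1, "retry_count": 7},
    {"id": 2, "retry_count": 2},
    {"id": 3, "retry_count": 0},
]
matched = conditional_wait(rows, "crawl_queue:retry_count<5")
unmatched = conditional_wait(rows, "crawl_queue:retry_count>9")
arithmetic = conditional_wait(
    [{"col_a": 1, "col_b": 1, "col_c": 5}], "t:(col_a*3)+col_b<col_c"
)
```

Here the row with id 1 is skipped and stays in the queue, which is the "left untouched" behaviour the slide describes.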

queue_wait(tbl_cond,[tbl_cond…,timeout])

- Accepts multiple tables and a timeout
- Data searched from leftmost table to right
- Returns table index (the leftmost table is 1) of the newly owned row
- Returns zero if no rows are being owned

example:
    SELECT queue_wait('table_A','table_B',60);
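The return value of the multi-table form of queue_wait can be modelled in a few lines: tables are searched left to right, and the 1-based position of the first table with an available row is returned, or 0 when every table is empty. The function queue_wait_multi and the dictionary of lists are hypothetical stand-ins, and the sketch returns immediately instead of waiting for the timeout.

```python
def queue_wait_multi(tables, *names):
    """Return (1-based index of the first non-empty table, its first row), or (0, None)."""
    for index, name in enumerate(names, start=1):
        if tables[name]:
            return index, tables[name][0]
    return 0, None


only_b = queue_wait_multi({"table_A": [], "table_B": ["b1"]}, "table_A", "table_B")
both = queue_wait_multi({"table_A": ["a1"], "table_B": ["b1"]}, "table_A", "table_B")
empty = queue_wait_multi({"table_A": [], "table_B": []}, "table_A", "table_B")
```

Since the leftmost table always wins when it has data, listing tables in priority order gives a simple prioritised queue.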

Functions for Exiting OWNER Mode

- queue_end
    - deletes the owned row and exits OWNER mode
- queue_abort
    - releases (instead of deleting) the owned row and exits OWNER mode
    - closing a MySQL connection does the same thing

Relaying and Routing Messages


The Problem

Relay (or router) consists of more than 3 processes, 2 conns

No losses, no duplicates on crash or disconnection

[Diagram: Q4M Table (source) → Relay Program → Q4M Table (dest.)]

Internal Row ID

- Every row has an internal row ID
    - invisible from the Q4M table definition
    - monotonically increasing 64-bit integer
- Used for detecting duplicates
    - use two functions to skip duplicates
    - data loss prevented by using queue_wait / queue_end

queue_rowid()

- Returns row ID of the OWNED row (if any)
    - returns NULL if no row is OWNED
- Call when retrieving data from source

queue_set_srcid(src_tbl_id, mode, src_row_id)

- Call before inserting a row into the destination table
    - checks if the row is already inserted into the table, and ignores the next INSERT if true
- Parameters:
    - src_tbl_id - id to determine source table (0 to 63)
    - mode - "a" to drop duplicates, "w" to reset
    - src_row_id - row ID obtained from source table

Pseudo Code

Relays data from src_tbl to dest_tbl

    while (true) {
        # wait for data
        SELECT queue_wait(src_tbl) => src_db;
        # read row and rowid
        row := (SELECT * FROM src_tbl => src_db);
        rowid := (SELECT queue_rowid() => src_db);
        # insert the row after setting srcid
        SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db;
        INSERT INTO dest_tbl (row) => dest_db;
    }
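The relay's "no losses, no duplicates" property can be illustrated with a toy model of the destination side. How Q4M tracks source row IDs internally is not stated in the slides. This sketch assumes the destination remembers the last accepted row ID per source slot, which is enough given that row IDs increase monotonically. DestTable and relay_once are hypothetical names, and the meaning of mode "w" (forget the remembered ID) is an interpretation of the slide's "to reset".

```python
class DestTable:
    """Toy destination table with duplicate detection (assumed mechanism, not Q4M's code)."""

    def __init__(self):
        self.rows = []
        self.last_rowid = {}     # source slot (0..63) -> last source row ID accepted
        self._pending = None     # (slot, row_id) set by queue_set_srcid for the next INSERT

    def queue_set_srcid(self, src_tbl_id, mode, src_row_id):
        if mode == "w":
            self.last_rowid.pop(src_tbl_id, None)   # assumption: "w" resets duplicate info
        self._pending = (src_tbl_id, src_row_id)

    def insert(self, row):
        if self._pending is not None:
            slot, row_id = self._pending
            self._pending = None
            if row_id <= self.last_rowid.get(slot, -1):
                return False                        # already relayed: ignore this INSERT
            self.last_rowid[slot] = row_id
        self.rows.append(row)
        return True


def relay_once(source, dest, slot=0, crash_before_end=False):
    """Relay the first source row; source is a list of (row_id, row) pairs."""
    row_id, row = source[0]                  # queue_wait + SELECT + queue_rowid()
    dest.queue_set_srcid(slot, "a", row_id)
    dest.insert(row)
    if crash_before_end:
        return                               # row stays in the source, as after queue_abort()
    source.pop(0)                            # owned source row is deleted


source = [(10, "a"), (11, "b")]
dest = DestTable()
relay_once(source, dest, crash_before_end=True)   # relay dies after the INSERT
relay_once(source, dest)                          # retry: duplicate INSERT is ignored
relay_once(source, dest)
```

The simulated crash happens after the destination INSERT but before the source row is deleted, which is exactly the window in which a naive relay would deliver the message twice.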

q4m-forward

- Simple forwarder script
    - installed into mysql-dir/bin

usage: q4m-forward [options] src_addr dest_addr

example:
    % support-files/q4m-forward \
        "dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX" \
        "dbi:mysql:database=db2;table=tbl2;host=bar;user=foo"

options:
    --reset         reset duplicate check info.
    --sender=idx    slot no. used for checking duplicates (0..63, default: 0)
    --help

Limitations and the Future of Q4M

Things that Need to be Fixed

- Table compaction is a blocking operation
    - runs when live data becomes <25% of log file
    - very bad, though not as bad as it seems
        - it's fast since it's a sequential write operation
- Relays are slow
    - since transfer is done row-by-row
- Binlog does not work
    - since MQ replication should be synchronous

Future of Q4M

- 2-phase commit with other storage engines (maybe)
    - queue consumption and InnoDB updates can become an atomic operation

Thank you

http://q4m.31tools.com/
