Shard-Query, an MPP database for the cloud using the LAMP stack

Author: justin-swanhart

Post on 11-Aug-2014

Category: Data & Analytics


DESCRIPTION

This combined #SFMySQL and #SFPHP meetup talked about Shard-Query. You can find the video to accompany this set of slides here: https://www.youtube.com/watch?v=vC3mL_5DfEM

TRANSCRIPT

Shard-Query
AN MPP DATABASE FOR THE CLOUD USING THE LAMP STACK

Introduction
Presenter
  Justin Swanhart
  Principal Support Engineer at Percona
  Previously a trainer and consultant at Percona too
Developer
  Swanhart-tools
  Shard-Query: MPP sharding middleware for MySQL
  Flexviews: Materialized views (fast refresh) for MySQL
  bcmath UDF: arbitrary precision math for MySQL

Intended Audience
MySQL users with data too large to query efficiently using a single machine
Big Data
Analytics / OLAP
User generated content analysis
People interested in distributed database processing

Terms

MPP
Massively Parallel Processing
An MPP system is a system that can process a SQL statement in parallel on a single machine or even many machines
A collection of machines is often called a Grid
MPP is also sometimes called Grid Computing

MPP (cont)
Not many open source databases (none?) support MPP
Community editions of closed source offerings are limited
Some closed source databases include Vertica, Greenplum, Redshift

The Cloud
Managed collection of virtual servers
Easy to add servers on demand
Ideal for a federated, distributed database grid
  Easy to scale up by moving to a VM with more cores
  Easy to scale out by adding machines
Amazon is one of the most popular cloud environments

LAMP stack
Linux
  Amazon Linux
  RHEL
  Ubuntu LTS, etc.
Apache Web Server
  Most popular web server on the planet
MySQL
  The world's most popular open source database
PHP
  High level language makes development easier

Database Middleware
A piece of software that sits between an end-user application and the database
Operates on the queries submitted by the application, then returns the results to the application
Usually a proxy of some sort
MySQL Proxy is the open source, user-configurable proxy for MySQL
  Supports Lua scripts which intercept queries
Shard-Query can use MySQL Proxy out of the box

Message Queue / Job Server
Accepts jobs or messages and places them in a queue
A worker reads jobs/messages from the queue and acts on them
Offers support for asynchronous jobs

Gearman
My job server of choice for PHP
Has two different PHP interfaces (pear and pecl)
SQ comes bundled with a modified version* of the pear interface
Excellent integration with MySQL as well (UDF)
* Removes warnings triggered by modern PHP strict mode

Sharding
It is short for Shared Nothing
Means splitting up your data onto more than one machine
Tables that are split up are called sharded tables
Lookup tables are not sharded. In other words, they must be duplicated on all nodes
Shard-Query supports directory based or hash based sharding

Shard mapper
Shard-Query supports DIRECTORY and HASH mapping out of the box
DIRECTORY based sharding allows you to add or remove shards from the system, but lookups may go over the network, reducing performance* compared to HASH mapping
HASH based sharding uses a hash algorithm to balance rows over the sharded database. However, since a HASH algorithm is used, the number of database shards cannot change after initial data loading.
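The trade-off between the two mappers can be sketched in a few lines of Python. This is an illustration only, not Shard-Query's own code: the CRC32 hash, the round-robin placement of new keys, and the names hash_shard and DirectoryMapper are all assumptions made for the example, and the in-memory dict stands in for a directory table that a real system would look up over the network.

```python
import zlib


def hash_shard(key, num_shards):
    """Map a shard key to a shard index with a stable hash.

    CRC32 is an assumption for this sketch; any stable hash would do.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


class DirectoryMapper:
    """Hypothetical directory mapper: an explicit key -> shard lookup table."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.directory = {}  # stands in for a directory table

    def add_shard(self, name):
        # Adding a shard does not touch keys that are already placed.
        self.shards.append(name)

    def map(self, key):
        if key not in self.directory:
            # New keys are placed round-robin (an assumption for this sketch).
            index = len(self.directory) % len(self.shards)
            self.directory[key] = self.shards[index]
        return self.directory[key]


# HASH mapping: changing the shard count changes where most keys land,
# which is why the shard count is fixed once data has been loaded.
keys = range(1000)
moved = sum(1 for k in keys if hash_shard(k, 4) != hash_shard(k, 5))
print(f"{moved} of {len(keys)} keys would move when going from 4 to 5 shards")

# DIRECTORY mapping: existing keys keep their shard when a shard is added.
mapper = DirectoryMapper(["shard1", "shard2"])
before = {k: mapper.map(k) for k in range(10)}
mapper.add_shard("shard3")
after = {k: mapper.map(k) for k in range(10)}
print("directory placements unchanged:", before == after)
```

The hash version needs no lookup at all, which is where its speed advantage comes from; the directory version pays for a lookup on every shard-key query but can grow or shrink without reloading data.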
* The performance reduction applies only to queries like select count(*) from table where customer_id = 50

What is big data
Most machine generated data
Line order information for a large organization like Wal-Mart
Any data so large that you can't effectively operate on it on one machine
  For example, an important query that needs to run daily executes in greater than 24 hours. It is impossible to meet the daily goal unless you can find a way to make the query execute faster.
  These kinds of problems can happen on relatively small amounts of data (tens of gigabytes)

Analytics (OLAP) versus OLTP
OLTP is focused on short lived small transactions that read or write small amounts of data
OLAP is focused on bulk loading and reading large amounts of data in a single query.
  Aggregation queries are OLAP queries
Shard-Query is designed for analytics (OLAP), not OLTP
  It must parse all commands sent to it (and make multiple round trips)
  Minimum query time of around 20ms

PROBLEM: Single Threaded Queries
THE BIGGEST BOTTLENECK IN ANALYTICAL QUERIES IS THE SPEED OF A SINGLE CORE

Single thread queries in the database
MySQL, PostgreSQL, Firebird and all other major open source databases have single threaded queries
This means that a single query can only ever utilize the resources of a single core
As the data size grows, analytical queries get slower and slower
In memory, as the data grows the speed decreases because the data is accessed in a single query
As the number of rows to be examined increases, performance decreases

Why single threaded
MySQL is optimized for getting small amounts of data quickly (OLTP)
It was created at a time when having more than one CPU was not common
Adding parallelism now is a very complex task, particularly since MySQL supports multiple storage engines
So adding parallel query is not a high priority (not even on the roadmap)
Designed to run LOTS of small queries simultaneously, not one big query

Single Threading bad for IO
If the data set is significantly larger than memory, single threaded queries often cause the buffer pool to "churn"
For example, small lookup tables can easily be pushed out of the buffer pool, resulting in frequent IO to look up values
While SSDs may help somewhat, one database thread cannot read from an SSD at maximum device capacity
While the disk may be capable of 1000+ MB/sec, a single thread is generally limited to

query($sql);
$endtime = microtime(true);

// Errors are returned as a member of the object
if(!empty($shard_query->errors)) {
  echo "ERRORS RETURNED BY OPERATION:\n";
  print_r($shard_query->errors);
}

// Fetch and print rows through the bundled data access layer (DAL)
if(is_resource($stmt) || is_object($stmt)) {
  $count=0;
  while($row = $shard_query->DAL->my_fetch_assoc($stmt)) {
    print_r($row);
    ++$count;
  }
  echo "$count rows returned\n";
  $shard_query->DAL->my_free_result($stmt);
} else {
  if(!empty($shard_query->info)) print_r($shard_query->info);
  echo "no query results\n";
}
echo "Exec time: " . ($endtime - $stime) . "\n";

Simple data access layer comes with Shard-Query
Errors are returned as a member of the object
Run the query

[Diagram: Shard-Query Architecture. Components shown: PHP OO, Apache Web Interface, MySQL Proxy, Gearman Message Queue, Workers, MySQL database shards]

Apache web interface
GUI
Easy to set up
Run queries and get results
Serves as an example of using Shard-Query in a web app with asynchronous queries
Submits queries via Gearman
Simple HTTP authentication

MySQL Proxy Interface
Lua script for MySQL Proxy
Supports most SHOW commands
Intercepts queries, and sends them to Shard-Query using the MySQL Gearman UDF
Serves as another example of using Gearman to execute queries.
Behaves slightly differently than MySQL for some commands

Shard-Query Data Flow
Map/reduce like workflow
  Query submitted
  SQL is parsed
  Query rewrite for parallelism yields multiple queries
  Gearman Jobs (map/combine)
  Final Aggregation (reduce)
  Return result

SQL Parser
Find it at http://github.com/greenlion/php-sql-parser
Supports
  SELECT/INSERT/UPDATE/DELETE
  REPLACE
  RENAME
  SHOW/SET
  DROP/CREATE INDEX/CREATE TABLE
  EXPLAIN/DESCRIBE
Used by SugarCRM too, as well as other open source projects.

Query Rewrite for parallelism
Shard-Query has to manipulate the SQL statement so that it can be executed over more than one partition or machine
COUNT() turns into SUM of COUNTs from each query
AVG turns into SUM and COUNT
SEMI-JOIN is turned into a materialized join
STDDEV/VARIANCE are rewritten as well, using the sum of squares method
Push down LIMIT when possible

Query Rewrite for parallelism (cont)
Because lookup tables are duplicated on all shards, the query executes in a shared-nothing way
All joins, filtering and aggregation are pushed down
  Means very little data must flow between nodes in most cases
High performance
  Meets or beats Amazon Redshift in testing at 200GB of data

Map/Combine
The store_resultset Gearman worker runs SQL and stores the result in a table
To keep the number of rows in the table (and the time it takes to aggregate results in the end) small, an INSERT ON DUPLICATE KEY UPDATE (ODKU) statement is used when inserting the rows
There is a UNIQUE KEY over the GROUP BY attributes to facilitate the upsert

Final aggregation
Shard-Query has to return a proper result, combining the results in the result table together to return the correct answer
Again, for example COUNT must be rewritten as SUM to combine all the counts (from each shard) in the result table
Aggregated result is returned to the client

Shard-Query Flow as SQL
$ ./run_query --verbose
select count(*) from lineorder;

Shard-Query optimizer messages:
SQL TO SEND TO SHARDS:
Array
(
  [0] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1
  [1] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1
  [2] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1
  [3] => SELECT COUNT(*) AS expr_2913896658 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1
)
SQL TO SEND TO COORDINATOR NODE:
SELECT SUM(expr_2913896658) AS ` count ` FROM `aggregation_tmp_58392079`
Array
(
  [count ] => 0
)
1 rows returned
Exec time: 0.03083610534668

Initial query
Query rewrite / map
Final aggregation / reduce
Final result

Map/Combine example
select LO_OrderDateKey, count(*) from lineorder group by LO_OrderDateKey;

Shard-Query optimizer messages:
* The following projections may be selected for a UNIQUE CHECK on the storage node operation: expr$0
* storage node result set merge optimization enabled: ON DUPLICATE KEY UPDATE expr_2445085448=expr_2445085448 + VALUES(expr_2445085448)
SQL TO SEND TO SHARDS:
Array
(
  [0] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1 GROUP BY expr$0
  [1] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1 GROUP BY expr$0
  [2] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1 GROUP BY expr$0
  [3] => SELECT LO_OrderDateKey AS expr$0,COUNT(*) AS expr_2445085448 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1 GROUP BY expr$0
)
SQL TO SEND TO COORDINATOR NODE:
SELECT expr$0 AS `LO_OrderDateKey`,SUM(expr_2445085448) AS ` count ` FROM `aggregation_tmp_12033903` GROUP BY expr$0

combine
reduce

Use cases
Machine generated data
  Sensor readings
  Metrics
  Logs
Any large table with short lookup tables
  Star schema are ideal

Call detail records
Shard-Query is used in the billing system of a large cellular provider
CDRs generate a lot of data
Shard-Query includes a fast PERCENTILE function

Green energy meter processing
High volume of data means sharding is necessary
With Shard-Query, reporting is possible over all the shards, making queries possible that would not work with Fabric or other sharding solutions
Used in India for reporting on a green power grid

Log analysis
Performance logs from a web application for example
Aggregate many different statistics and shard if log volumes are high enough
Search text logs with regular expressions

Performance
Star Schema Benchmark SF 20
  119 million rows of data (12GB)
Infobright Community Database
Only 1st query from each flight selected
Unsharded compared to four shards (box has 4 cpu - Amazon m1.xlarge)

        MySQL     Shard-Query
COLD    35.39s    11.62s
HOT     10.99s    2.95s

Query 1
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder
join dim_date on lo_orderdatekey = d_datekey
where d_year = 1993
and lo_discount between 1 and 3
and lo_quantity < 25;

        MySQL     Shard-Query
COLD    34.24s    12.74s
HOT     12.74s    3.26s

Query 2
select sum(lo_revenue), d_year, p_brand
from lineorder
join dim_date on lo_orderdatekey = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;

        MySQL     Shard-Query
COLD    27.29s    7.97s
HOT     18.89s    5.06s

Query 3
select c_nation, s_nation, d_year, sum(lo_revenue) as revenue
from customer
join lineorder on lo_custkey = c_customerkey
join supplier on lo_suppkey = s_suppkey
join dim_date on lo_orderdatekey = d_datekey
where c_region = 'ASIA'
and s_region = 'ASIA'
and d_year >= 1992
and d_year