Enjoying Self-Service Data Exploration with Apache Drill - 2014/11/06 Cloudera World Tokyo...

Enjoying Self-Service Data Exploration with Apache Drill. Akihiko Kusanagi (草薙 昭彦), MapR Technologies. November 6, 2014. © 2014 MapR Technologies

Upload: mapr-technologies-japan

Post on 05-Jul-2015


DESCRIPTION

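Among the many SQL-on-Hadoop engines, Apache Drill stands out for its standard SQL compliance, its flexible and dynamic interpretation of data, and its support for a wide range of data sources and storage formats. Built around a demo, this talk shows how to enjoy searching and analyzing data with Drill's convenient features. These are the slides from a lightning talk session at Cloudera World Tokyo 2014, held on November 6, 2014.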

TRANSCRIPT

  • 1. Enjoying Self-Service Data Exploration with Apache Drill. Akihiko Kusanagi (MapR Technologies), November 6, 2014

  • 2. [slide text not recovered]
  • 3. [slide text not recovered]
  • 4. [slide text not recovered]
  • 5. [slide text not recovered]
  • 6. Hadoop/NoSQL, BI, DB, DB
  • 7. Web; Hadoop/NoSQL, BI, DB, DB; Hadoop, NoSQL, Hadoop
  • 8. Data Agility; MapReduce, Hive; SQL-on-Hadoop; (IT)
  • 9. Apache Drill
  • 10. Apache Drill; Google Dremel (BigQuery); http://incubator.apache.org/drill/; GitHub: https://github.com/apache/incubator-drill
  • 11. Apache Drill
        Agility: JSON, ETL
        Flexibility: CSV, TSV, JSON, Parquet, Hive, HBase, MongoDB, REST; Hive, HBase, JSON
        Familiarity: SQL, Hive UDF
  • 12. Drill: 2 [rest of heading not recovered]

        $ tar xzf apache-drill-0.6.0.tar.gz
        $ ./apache-drill-0.6.0/bin/sqlline -u jdbc:drill:zk=local

        0: jdbc:drill:zk=local>
        SELECT columns[1] AS LOCATION, columns[2] AS _MONTH, max(columns[6]) AS MAX_TEMP
        FROM dfs.`/root/drillwork/tokyo_2013.csv`
        GROUP BY columns[1], columns[2];
        +------------+------------+------------+
        | LOCATION   | _MONTH     | MAX_TEMP   |
        +------------+------------+------------+
        |            | 01         | 9.9        |
        |            | 02         | 9.8        |
        |            | 03         | 9.5        |
        |            | 04         | 25.5       |
        |            | 05         | 29.4       |
        |            | 06         | 31.9       |
        |            | 07         | 36.1       |
        |            | 08         | 38.6       |
        |            | 09         | 36.5       |
        |            | 10         | 30.7       |
        |            | 11         | 21.8       |
        |            | 12         | 9.7        |
        +------------+------------+------------+
        12 rows selected (0.543 seconds)

  • 13. [slide text not recovered]
  • 14. Hive, HDFS, CSV, HBase, MongoDB; CSV, MongoDB, Join, JSON
  • 15. Apache Drill: SQL
        ETL; HBase, NoSQL; SQL (JSON); ANSI SQL; BI; Hive UDF
        AGILITY: INSTANT INSIGHTS TO BIG DATA
        FLEXIBILITY: ONE INTERFACE FOR HADOOP & NOSQL
        FAMILIARITY: EXISTING SKILLS & TECHNOLOGIES
  • 16. Ver. 0.6: SQL, HBase, Hive, ANSI SQL, JDBC/ODBC/REST
        Ver. 1.0: YARN, Window, Non-Equi, Insert/Update/Delete, Java API
  • 17. Q&A. @mapr_japan [email protected]
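
A note on the slide 12 query: it applies max() to columns[6] directly. Drill exposes the fields of a headerless CSV file as text, so this max() is probably a string comparison rather than a numeric one, which may explain why several months in the output show maxima below 10. The Python sketch below reproduces the effect under stated assumptions: the sample rows, the location name, and the `monthly_max` helper are all hypothetical and not taken from the talk. In Drill itself the usual remedy would be a cast, for example max(CAST(columns[6] AS DOUBLE)).

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows (not data from the talk). The layout mirrors the
# slide's query: columns[1] = location, columns[2] = month,
# columns[6] = daily maximum temperature.
SAMPLE_CSV = """\
2013,Tokyo,01,05,x,x,9.9
2013,Tokyo,01,20,x,x,12.4
2013,Tokyo,03,10,x,x,25.3
2013,Tokyo,03,11,x,x,9.5
"""


def monthly_max(text, numeric):
    """Group by (columns[1], columns[2]) and take the max of columns[6].

    With numeric=False the values stay strings, so max() compares them
    character by character, as a text column would be compared.
    """
    groups = defaultdict(list)
    for columns in csv.reader(io.StringIO(text)):
        value = float(columns[6]) if numeric else columns[6]
        groups[(columns[1], columns[2])].append(value)
    return {key: max(values) for key, values in groups.items()}


as_text = monthly_max(SAMPLE_CSV, numeric=False)
as_number = monthly_max(SAMPLE_CSV, numeric=True)
print(as_text)
print(as_number)
```

In the string comparison "9.9" sorts after "12.4" because '9' is greater than '1', so the text-based maximum understates any month that contains both single-digit and double-digit readings.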