Learning the Massively Parallel SQL Engine Impala from the Basics #cwt2015
TRANSCRIPT
-
SQLImpala
| Cloudera
-
2 2015 Cloudera, Inc. All rights reserved.
Customer Operations Engineer
-
Impala 2.0 Roadmap
-
Impala
-
Cloudera Impala Hadoop SQL http://impala.io/
-
Impala HDFS HBase
Hive, ODBC / JDBC, Kerberos / LDAP, CDH, Cloudera / Oracle / MapR / Amazon
-
Impala Hive
Hadoop
Impala Hive
-
Impala -> SQL -> Hive -> (e.g. nested type)
-
SQL on Hadoop Impala
JDBC/ODBC BI (e.g. Tableau, Zoomdata, MicroStrategy, QlikView, SAS)
SQL Hive (MapReduce/Spark)
ETL Spark SQL
Spark SQL
CDH 5.4 Hive on Spark / Spark SQL
-
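Impala
impalad, catalogd, StateStore
[Diagram: Impala architecture. Clients (impala-shell command-line client, SQL App over ODBC / JDBC) connect to impalad. Each impalad contains a Query Planner, Query Coordinator, and Query Exec Engine and sits with an HDFS DataNode. Also shown: StateStore, catalogd, Hive Metastore, HDFS NN.]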
-
Impala Daemon (impalad) HDFS DataNode impalad impalad
[Diagram: three impalad instances, each containing a Query Planner, Query Coordinator, and Query Exec Engine, each paired with an HDFS DataNode.]
-
Catalog Service (catalogd) impalad HDFS Hive
impalad DDL Hive Metastore
[Diagram: catalogd shown with the Hive Metastore, HDFS NN, and StateStore, plus one impalad (Query Planner, Query Coordinator, Query Exec Engine) on an HDFS DataNode. Labels: DDL; Block, Hive.]
-
StateStore 1 impalad
[Diagram: StateStore shown with catalogd and one impalad (Query Planner, Query Coordinator, Query Exec Engine) on an HDFS DataNode.]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over HBase and an HDFS DataNode.]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over HBase and an HDFS DataNode. Step label: SQL.]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over HBase and an HDFS DataNode. Step label: impalad.]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over an HDFS DataNode. Step label: HDFS (JOIN).]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over HBase and an HDFS DataNode. Step label: impalad.]
-
Impala
[Diagram: a SQL App (ODBC / JDBC) and three nodes, each with a Query Planner, Query Coordinator, and Query Exec Engine over HBase and an HDFS DataNode. Step label on the SQL App: HiveQL.]
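
The component labels on these slides (a Query Planner, Query Coordinator, and Query Exec Engine on every node) suggest a scatter-gather pattern: one node coordinates, every node works on its local data, and the partial results are merged. The following is only a toy Python sketch of that pattern, not Impala code. The data, the function names `exec_fragment` and `coordinate`, and the use of threads to stand in for nodes are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows stored on one node.
NODE_DATA = [
    [("jp", 3), ("us", 5)],
    [("jp", 2), ("de", 7)],
    [("us", 1), ("de", 1), ("jp", 4)],
]


def exec_fragment(rows):
    """Per-node step: partial SUM(value) GROUP BY key over local rows only."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def coordinate(node_data):
    """Coordinator step: fan the fragment out, then merge the partial sums."""
    with ThreadPoolExecutor(max_workers=len(node_data)) as pool:
        partials = list(pool.map(exec_fragment, node_data))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return dict(sorted(total.items()))


result = coordinate(NODE_DATA)
print(result)
```

The point of the sketch is that each node only touches its own rows, and only the small per-key partial sums travel to the coordinator.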
-
Disk
MapReduce Disk
Impala
-
UDF: UDF, UDAF
Impala C++ UDF
Java Hive UDF
Python UDF
https://github.com/cloudera/impyla
-
100 10
10 1
1000 GB
100 GB
Group A
Group B
-
Impala
(Authentication) Kerberos/LDAP
(Authorization) Sentry (HDFS)
(Audit) Cloudera Navigator
-
I/O
bzip2
-
Parquet Impala
I/O Impala snappy
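
Parquet is a columnar file format, and this slide pairs it with I/O. As a rough illustration of why column orientation can reduce the amount of data read, here is a minimal Python sketch. It does not read or write real Parquet files. The table contents and the function names are hypothetical, and "values touched" is only a stand-in for I/O.

```python
# Hypothetical three-column table (id, country, amount) with four rows.
rows = [
    (1, "jp", 10.0),
    (2, "us", 12.5),
    (3, "jp", 7.5),
    (4, "de", 3.0),
]

# Row-oriented layout: the values of one row are stored next to each other.
row_store = [value for row in rows for value in row]

# Column-oriented layout: the values of one column are stored next to each other.
column_store = {
    "id": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def sum_amount_row_store(store, width=3, col=2):
    """Scan every stored value, keeping only the 'amount' position of each row."""
    total, touched = 0.0, 0
    for i, value in enumerate(store):
        touched += 1
        if i % width == col:
            total += value
    return total, touched


def sum_amount_column_store(store):
    """Scan only the 'amount' column."""
    column = store["amount"]
    return sum(column), len(column)


row_total, row_touched = sum_amount_row_store(row_store)
col_total, col_touched = sum_amount_column_store(column_store)
print(row_total, row_touched)
print(col_total, col_touched)
```

Both scans return the same sum, but the column-oriented scan touches one third of the values for this three-column table.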
-
HBase Impala
Impala HBase External systems
put SELECT * FROM hbase_tbl
INSERT / INSERT VALUES get, scan
put/get Hadoop NoSQL
Impala HBase HDFS
HBase
-
Kudu Parquet HDFS Kudu
CDH 5.4
-
2.0
-
Impala 2.0 (CDH 5.2)
Disk (disk spill)
SQL 2003 window functions (RANK, LAG), WHERE, data types (VARCHAR, CHAR), functions (VAR_SAMP, VAR_POP)
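
To show what the functions named on this slide compute, here is a small sketch that runs RANK and LAG against an in-memory SQLite database (as a stand-in, not Impala; it needs SQLite 3.25 or later for window functions). The `sales` table and its rows are made up for the example. SQLite has no VAR_SAMP or VAR_POP, so the sample and population variance are computed with Python's `statistics` module instead.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 30), ("east", 3, 20),
     ("west", 1, 5), ("west", 2, 15)],
)

# RANK orders rows within each region by amount;
# LAG fetches the previous day's amount within the same region.
query = """
SELECT region, day, amount,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank,
       LAG(amount) OVER (PARTITION BY region ORDER BY day) AS prev_amount
FROM sales
ORDER BY region, day
"""
window_rows = conn.execute(query).fetchall()
for row in window_rows:
    print(row)

amounts = [r[0] for r in conn.execute("SELECT amount FROM sales ORDER BY rowid")]
var_samp = statistics.variance(amounts)   # sample variance, divides by n - 1
var_pop = statistics.pvariance(amounts)   # population variance, divides by n
print(var_samp, var_pop)
```

The first row of each region has no previous day, so LAG returns NULL there (`None` in Python).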
-
Impala 2.1 (CDH 5.3)
StateStore
-
Impala 2.2 (CDH 5.4)
Amazon S3 (unsupported)
Cloudera Navigator
-
Roadmap
-
2015
Nested type
EMC Isilon
-
2015/2016
Llama YARN
-
2016
20 (multicore join/runtime/HW)
(nested type/UDF)
(Disk Spill) SQL
-
Cloudera Impala Hadoop SQL
BI
-
Impala
-
Impala 4
WebUI Hue
QuickStartVM
Cloud Cloudera Live
Cloudera Manager
-
Hue
Hue HP: http://gethue.com/
Hue Demo site: http://demo.gethue.com/
Query Editors: Hive/Impala
-
QuickStartVM
Download site: http://www.cloudera.com/content/www/en-us/downloads/quickstart_vms/5-4.html
VM CDH Cloudera Manager (default) 8-10GB
-
Cloudera Live
Web site: http://www.cloudera.com/content/www/en-us/get-started/cloudera-live.html
Cloud (AWS) (m4.xlarge x 4)
Tableau/Zoomdata (m4.xlarge +1)
60 AWS
-
Cloudera Manager
root(TUI)
Readme
OS
$ curl -O http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
$ chmod 755 cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin --i-agree-to-all-licenses --noprompt --noreadme
-
Cloudera Manager
2 3
Cloudera Manager CDH
1
-
Impala Document: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/impala.html
Engineer Blog: http://blog.cloudera.com/
Cloudera Blog. Impala
-
CDH [email protected]
Cloudera http://community.cloudera.com/ 10%
-
Hadoop http://gihyo.jp/admin/serial/01/how_hadoop_works
gihyo.jp Impala 2015/12 - 2016/1
-
We are hiring!
-
Thank you.