UNICORN · 2018. 8. 1. · Machine learning accuracy. Architecture: Elastic Load Balancing (ELB), Amazon...
TRANSCRIPT
What is UNICORN?
UNICORN's challenges
UNICORN case studies
What is UNICORN?
An Automated Marketing Platform for app developers.
UNICORN's challenges
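UNICORN supports ad buying via RTB (real-time bidding).
[Diagram: User, Site, UNICORN, and RTB, connected by steps ① to ⑤]
① A user visits a site
② UNICORN receives an ad bid request
③ It selects the best ad
④ It decides the bid price
⑤ The ad is shown if the bid is higher than the other DSPs' bids
200,000 requests per second
More than 10,000 kinds of ad inventory
Around 10 ms
100+ factors
Machine learning accuracy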
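Steps ② to ⑤ can be sketched as a small bidding function. This is a minimal illustration, not UNICORN's implementation: the `Ad` type, the `predicted_value` field (standing in for a machine-learning score), the `margin` parameter, and the pricing rule are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Ad:
    ad_id: str
    predicted_value: float  # hypothetical ML score: expected value of one impression


def choose_bid(ads: Sequence[Ad], margin: float = 0.2) -> Optional[tuple[Ad, float]]:
    """Steps 3 and 4: pick the best ad, then decide the bid price."""
    if not ads:
        return None  # nothing to bid with
    best = max(ads, key=lambda ad: ad.predicted_value)
    # Assumed pricing rule: bid the predicted value minus a fixed margin.
    return best, best.predicted_value * (1.0 - margin)


def wins_auction(our_bid: float, other_dsp_bids: Sequence[float]) -> bool:
    """Step 5: the ad is shown only if our bid beats every other DSP's bid."""
    return all(our_bid > other for other in other_dsp_bids)
```

At 200,000 requests per second and a budget of around 10 ms, both functions would have to run without any blocking I/O, which is why the selection here is a plain in-memory `max`.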
Architecture
Amazon EC2 Elastic Load Balancing (ELB) Fluentd
Amazon S3
Kafka
Amazon Aurora MongoDB
Amazon Athena
Amazon Redshift
Flink
Amazon Glue Machine Learning
Real Time
Analytics
Storage
Database
Architecture
Real Time: max 200,000 QPS
Storage: PB-size data
Analytics: fast and cheaper
Database: variety
UNICORN case studies
Architecture
By leveraging S3, we cut HTTP request processing time down to the 10 ms range!
Architecture - Real Time
Request
Program
Cache
Response
Database
New Request Connections
Maintenance
Architecture - Real Time
Shared Memory
Response
Request
Program
S3 Database
Fast
High Availability
Easy to maintain
10ms
100ms → 10ms!
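One way to read the change above is that the program no longer queries a database on the request path: it answers each request from memory, and that memory is refreshed from S3 in the background. The sketch below illustrates that pattern under that assumption; the slides do not show the actual mechanism, and "Shared Memory" may refer to memory shared between processes rather than the in-process dictionary used here. `InMemorySnapshot`, `loader`, and `interval_s` are hypothetical names, and `loader` stands in for whatever downloads and parses the S3 object.

```python
import threading
from typing import Callable, Dict, Optional


class InMemorySnapshot:
    """Serves lookups from memory; a background thread reloads the data."""

    def __init__(self, loader: Callable[[], Dict[str, str]], interval_s: float = 60.0):
        self._loader = loader  # hypothetical: fetches and parses the S3 object
        self._interval_s = interval_s
        self._data: Dict[str, str] = loader()  # initial load before serving traffic
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._refresh_loop, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()

    def _refresh_loop(self) -> None:
        # Event.wait() returns False on timeout, so this reloads once per interval.
        while not self._stop.wait(self._interval_s):
            try:
                fresh = self._loader()
            except Exception:
                continue  # keep serving the previous snapshot if a reload fails
            self._data = fresh  # swap the reference; readers never see a partial update

    def get(self, key: str) -> Optional[str]:
        # Request path: a dictionary lookup, no network or database round trip.
        return self._data.get(key)
```

Because the request path never opens a connection, a slow or unavailable data source delays only the next refresh, not the response.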
Architecture
Using S3 and Athena, we succeeded in speeding up machine learning!
Architecture - Storage
Aurora Redshift MongoDB
Driver
Hive
Report Aggregation Machine Learning
HDFS
Hard to auto-scale
Waits for the slowest
Architecture - Storage
Aurora Redshift MongoDB
Glue
Athena
Report Aggregation Machine Learning
S3
Read Fast
Async
Architecture - Analytics
Amazon EC2 Elastic Load Balancing (ELB) Fluentd
Amazon S3
Kafka
Amazon Aurora MongoDB
Amazon Athena
Amazon Redshift
Flink
Amazon Glue Machine Learning
Analytics
40(0) TB / Day
10,000 Jobs / Day
Architecture - Analytics
Parquet
Architecture - Analytics
Athena Cost
90% Down!
Architecture - Analytics
S3 Cost
x 10!
JSON and Parquet
Run an Athena SELECT against 5 target files.
SELECT col_1 FROM table
JSON:
Athena Scan Data: 5 GB * 5 = 25 GB
S3 Request Count: 5
Parquet:
Athena Scan Data: 5 GB * 5 / 10 = 2.5 GB
S3 Request Count: 5
With Parquet, the Athena cost drops to 1/10!
Note: 5 GB / file, 10 columns / row
JSON and Parquet
Run an Athena SELECT against 5 target files.
SELECT col_1 … col_10 FROM table
JSON:
Athena Scan Data: 5 GB * 5 = 25 GB
S3 Request Count: 5
Parquet:
Athena Scan Data: 5 GB * 5 = 25 GB
S3 Request Count: 5 * 10 = 50
Note: 5 GB / file, 10 columns / row
If you ignore the number of Parquet files, S3 costs go up!
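The arithmetic on these two slides can be reproduced with a small model. It follows the slides' simplifications, which are assumptions rather than a description of how Athena actually reads data: every column has the same size, a Parquet file is as large as its JSON counterpart, JSON is always scanned in full with one request per file, and Parquet costs one S3 request per selected column per file. The function name and parameters are illustrative.

```python
def athena_scan_and_requests(fmt, files, gb_per_file, total_columns, selected_columns):
    """Return (scanned_gb, s3_requests) under the slides' simplified model."""
    if fmt == "json":
        # Row-oriented text: every file is read in full, one request per file.
        return files * gb_per_file, files
    if fmt == "parquet":
        # Columnar: only the selected columns are read, one request per column per file.
        scanned = files * gb_per_file * selected_columns / total_columns
        return scanned, files * selected_columns
    raise ValueError(f"unknown format: {fmt}")


# 5 files of 5 GB each, 10 columns per row, selecting 1 column and then all 10.
for fmt in ("json", "parquet"):
    for cols in (1, 10):
        gb, reqs = athena_scan_and_requests(fmt, 5, 5, 10, cols)
        print(f"{fmt:8s} columns={cols:2d} scan={gb:g} GB requests={reqs}")
```

Selecting one column gives Parquet a tenfold saving in scanned data; selecting all ten removes that saving and multiplies the request count by ten, which is the trade-off the slides point out.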
JSON and Parquet
When using Parquet, you also need to take the number of S3 accesses into account!
Cost = Athena Scan Data + S3 Requests Count
Solution: when generating Parquet, consolidate the output into one file per partition.
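To see why consolidating files helps, the cost formula can be written out with explicit prices. This is a sketch: the two price parameters are placeholders to be filled in by the reader, not current AWS prices, the partition and file counts are made up for illustration, and the per-column request model is the slides' simplification.

```python
def query_cost(scanned_tb, s3_requests, price_per_tb_scanned, price_per_1000_requests):
    """Cost = Athena scan charge + S3 request charge."""
    return scanned_tb * price_per_tb_scanned + s3_requests / 1000 * price_per_1000_requests


def parquet_requests(partitions, files_per_partition, selected_columns):
    """S3 requests under the slides' model: one per selected column per file."""
    return partitions * files_per_partition * selected_columns


# Same data volume, 10 selected columns, 100 partitions (illustrative numbers).
fragmented = parquet_requests(partitions=100, files_per_partition=200, selected_columns=10)
compacted = parquet_requests(partitions=100, files_per_partition=1, selected_columns=10)
print(fragmented, compacted)  # 200000 1000
```

The scanned volume is the same in both cases, so only the request term of `query_cost` changes; with one file per partition it shrinks in proportion to the number of files removed.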
Team:
Engineers: 5
BizDev: 3
AWS EC2: 400+ instances
We can handle ad traffic from nearly all mobile devices in Japan.
Next, we are expanding globally. QPS will keep climbing!
Thank you for your attention.