20170108 Microsoft Big Data Integrated Solution: Cortana Intelligence Suite
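About me…
• Jianguo High School; Department of Applied Mathematics, Fu Jen Catholic University; Institute of Industrial Engineering, National Chiao Tung University. After earning an M.S. degree in Computer Science from the University of Southern California in 1999, stayed in the US and started as a software engineer at a small startup. Later served as senior engineer at US. Interactive Inc. and technical lead at Sierra Systems.
• Returned to Taiwan in 2005 and joined Microsoft Taiwan. Served as senior architect at the Microsoft Technology Center (MTC) and senior director in the Developer Experience and Platform Evangelism group (DX). Currently senior program manager for Azure CAT, Greater China, providing enterprises with best-practice public cloud implementations and contributing to Microsoft product planning.
• Many years of experience helping large enterprises and startups adopt next-generation technology platforms, and in technical marketing. Beyond Azure public cloud solutions, also has experience with enterprise search solutions, enterprise single-entry portals, Windows Apps, and Office Apps.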
[Diagram: Azure service map, with groups labeled Platform Services, Infrastructure Services, Hybrid Cloud, and Security & Management. Services shown:]
Web Apps, Mobile Apps, API Apps, Notification Hubs, Backup, StorSimple, Azure Site Recovery, Import/Export, SQL Database, DocumentDB, Redis Cache, Azure Search, Storage Tables, SQL Data Warehouse, Azure AD Health Monitoring, AD Privileged Identity Management, Operational Analytics, Cloud Services, Batch, RemoteApp, Service Fabric, Visual Studio, Application Insights, VS Team Services, Domain Services, HDInsight, Machine Learning, Stream Analytics, Data Factory, Event Hubs, Data Lake Analytics Service, IoT Hub, Data Catalog, Azure Active Directory, Multi-Factor Authentication, Automation, Portal, Key Vault, Store/Marketplace, VM Image Gallery & VM Depot, Azure AD B2C, Scheduler, Xamarin, HockeyApp, Power BI Embedded, SQL Server Stretch Database, Mobile Engagement, Functions, Cognitive Services, Bot Framework, Cortana, Security Center, Container Service, VM Scale Sets, Data Lake Store, BizTalk Services, Service Bus, Logic Apps, API Management, Content Delivery Network, Media Services, Media Analytics
Topic: Turning data into intelligence and decisions
[Diagram: Cortana Intelligence Suite, flowing from Data to Intelligence to Action.]
• Data Sources: apps, sensors and devices
• Information Management: Data Factory, Data Catalog, Event Hubs
• Big Data Stores: Data Lake Store, SQL Data Warehouse
• Machine Learning and Analytics: Machine Learning, Data Lake Analytics, HDInsight (Hadoop and Spark), Stream Analytics
• Intelligence: Cognitive Services, Bot Framework, Cortana
• Dashboards & Visualizations: Power BI
• Action: people, automated systems, apps (web, mobile, bots)
Vehicle Telemetry Analytics: collecting and analyzing vehicle telemetry
https://channel9.msdn.com/Shows/Azure-Friday/Cortana-Analytics-Vehicle-Telemetry-Analytics-Solution-Template
Information Management
[Diagram: Data Sources (apps, sensors and devices) feed Information Management: Event Hubs, Data Catalog, Data Factory.]
Data Factory: Compose and orchestrate data services at scale (coordinating and integrating big data collection)
[Diagram: ingesting from data sources, including SQL sources.]
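• Create, schedule, orchestrate, and manage data pipelines
• Visualize data lineage
• Connect to on-premises databases or cloud data sources
• Monitor data pipeline health
• Automate cloud resource management
• Transform and analyze: for example, feed semi-structured data into Hadoop and transform it with Hive, Pig, or custom code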
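To make the idea of orchestrating and monitoring a pipeline concrete, here is a minimal sketch in plain Python. It is not the Data Factory API: the activity names, the `run_pipeline` helper, and the status strings are all hypothetical, chosen only to show activities running in dependency order with a health status recorded for each.

```python
from graphlib import TopologicalSorter

# Hypothetical activities; each maps to the set of activities it depends on.
pipeline = {
    "copy_onprem_sql": set(),
    "copy_cloud_logs": set(),
    "hive_transform": {"copy_onprem_sql", "copy_cloud_logs"},
    "load_warehouse": {"hive_transform"},
}


def run_pipeline(dependencies, run_activity):
    """Run activities in dependency order and record a status for each one."""
    status = {}
    for name in TopologicalSorter(dependencies).static_order():
        # Skip an activity when any upstream activity did not succeed.
        if any(status.get(dep) != "Succeeded" for dep in dependencies[name]):
            status[name] = "Skipped"
            continue
        try:
            run_activity(name)
            status[name] = "Succeeded"
        except Exception:
            status[name] = "Failed"
    return status


def demo_activity(name):
    # Stand-in for real work such as a copy or a Hive job.
    if name == "copy_cloud_logs":
        raise RuntimeError("source unavailable")


status = run_pipeline(pipeline, demo_activity)
for name, state in status.items():
    print(f"{name}: {state}")
```

In this sketch a failed copy causes every downstream activity to be skipped, which is the kind of information a pipeline health view would surface.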
Data Catalog: Get more value from your enterprise data assets
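• Register enterprise data assets
• Discover data assets quickly, cutting the time spent looking for data
• Leave data where it lives; connect with the tools you choose
• Control which users can discover registered data assets
• Integrate with existing tools and processes through the open REST API
• Lower the barriers between users across the data ecosystem and close the gap between IT and the business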
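The register, discover, and control-visibility ideas above can be sketched as a tiny in-memory catalog. This is illustrative only and does not use the Data Catalog REST API; `DataAsset`, `Catalog`, and every field name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    name: str
    location: str  # where the data actually lives; the catalog stores metadata only
    tags: set = field(default_factory=set)
    allowed_users: set = field(default_factory=set)  # empty set: visible to everyone


class Catalog:
    def __init__(self):
        self._assets = {}

    def register(self, asset):
        """Register (or re-register) an asset under its name."""
        self._assets[asset.name] = asset

    def search(self, term, user):
        """Return names of assets matching `term` that `user` may discover."""
        term = term.lower()
        hits = []
        for asset in self._assets.values():
            visible = not asset.allowed_users or user in asset.allowed_users
            matches = term in asset.name.lower() or any(
                term in tag.lower() for tag in asset.tags
            )
            if visible and matches:
                hits.append(asset.name)
        return sorted(hits)


catalog = Catalog()
catalog.register(DataAsset("sales_orders", "sql://onprem/sales", {"sales", "orders"}))
catalog.register(
    DataAsset("sales_forecast", "adl://lake/forecast", {"sales"}, {"analyst"})
)
print(catalog.search("sales", user="analyst"))
print(catalog.search("sales", user="intern"))
```

The second search returns fewer results because the forecast asset is restricted to a named set of users.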
Event Hubs: Ingest events from websites, apps and devices at cloud scale
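• Log millions of events per second, in real time (AMQP & HTTPS)
• Flexible authorization and throttling for connecting devices
• Time-based event buffering
• Elastic scale: scale up and down
• Native client libraries connect the widest range of platforms (.NET, Java, REST)
• Plug-and-play adapters for a variety of cloud services (e.g., NiFi, ELK stack, etc.)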
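As a rough illustration of time-based event buffering, the sketch below keeps events in per-key partitions and drops them once they are older than a retention window. It is a toy model, not the Event Hubs client library: the `EventBuffer` class, the hash-based partition choice, and the 60-second retention are assumptions made for the example.

```python
import hashlib
from collections import deque


class EventBuffer:
    def __init__(self, partitions=4, retention_seconds=60):
        self.retention = retention_seconds
        self.partitions = [deque() for _ in range(partitions)]

    def _partition_for(self, key):
        # Stable hash so that the same key always lands in the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def publish(self, key, body, timestamp):
        """Append an event; timestamps are assumed to be non-decreasing."""
        index = self._partition_for(key)
        self.partitions[index].append((timestamp, key, body))
        return index

    def expire(self, now):
        """Drop events older than the retention window; return how many were dropped."""
        removed = 0
        for partition in self.partitions:
            while partition and now - partition[0][0] > self.retention:
                partition.popleft()
                removed += 1
        return removed

    def read(self, index):
        return [body for _, _, body in self.partitions[index]]


buffer = EventBuffer(partitions=4, retention_seconds=60)
first = buffer.publish("device-1", {"temp": 21}, timestamp=0)
second = buffer.publish("device-1", {"temp": 22}, timestamp=50)
dropped = buffer.expire(now=70)
print(first == second, dropped, buffer.read(first))
```

Readers can consume at their own pace within the retention window, which is the property the buffering bullet refers to.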
[Diagram: Event Hubs in context. Labels: Azure API Management, backend services, data sources (apps, sensors and devices), Event Hubs, Stream Analytics, SQL Database, Machine Learning, HDInsight, Storage, Power BI.]
Big Data Stores: store massive volumes of data and scale on demand
[Diagram: Data Sources and Information Management feed the Big Data Stores: SQL Data Warehouse and Data Lake Store.]
Data Lake Store: A hyper-scale repository for big data analytics workloads (a no-limits data lake that powers intelligent action)
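• Supports any application that uses the open Apache Hadoop Distributed File System (HDFS) standard
• No limit on data size: it can store trillions of files, and a single file can be larger than 1 PB, which is 200 times larger than other cloud stores
• Suited to any type of data, such as large datasets like high-resolution video, genomic and seismic datasets, and medical data
• Scales throughput to support massively parallel analytics
• Always encrypted, with role-based security and auditing: built-in single sign-on (SSO), multi-factor authentication, and seamless management of millions of identities. Use role-based access control (RBAC) to grant POSIX-style ACLs to users and groups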
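For readers unfamiliar with POSIX-style ACLs, the following sketch shows one simplified way such a permission check can be evaluated. It is not Data Lake Store code: the `Acl` type and `is_allowed` function are hypothetical, and the ACL mask that full POSIX ACLs include is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Acl:
    owner: str
    owning_group: str
    owner_perms: str = "rwx"
    group_perms: str = "r-x"
    other_perms: str = "---"
    named_users: dict = field(default_factory=dict)   # user -> permission string
    named_groups: dict = field(default_factory=dict)  # group -> permission string


def is_allowed(acl, user, groups, wanted):
    """Return True if `user` (a member of `groups`) holds `wanted`: 'r', 'w' or 'x'."""
    if user == acl.owner:
        return wanted in acl.owner_perms
    if user in acl.named_users:
        # A named-user entry takes precedence over any group entry.
        return wanted in acl.named_users[user]
    group_perms = []
    if acl.owning_group in groups:
        group_perms.append(acl.group_perms)
    group_perms += [p for g, p in acl.named_groups.items() if g in groups]
    if group_perms:
        # Any matching group entry that grants the permission is enough.
        return any(wanted in perms for perms in group_perms)
    return wanted in acl.other_perms


acl = Acl(
    owner="alice",
    owning_group="analysts",
    named_users={"bob": "r--"},
    named_groups={"etl": "rw-"},
)
print(is_allowed(acl, "bob", ["etl"], "w"))    # bob's own entry wins over the group
print(is_allowed(acl, "carol", ["etl"], "w"))  # carol is matched through the etl group
```

The order of checks (owner, named user, groups, other) is what makes the result predictable when a user matches several entries.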
[Diagram: data from LOB applications, social, devices, clickstream, sensors, video, web, and relational sources lands in ADL Store and is analyzed with HDInsight, ADL Analytics, Machine Learning, Spark, and R.]
SQL Data Warehouse
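• Combines the SQL Server relational database with Azure cloud scale-out capabilities
• Grow or shrink storage, and pause or resume compute, in seconds
• Proven in practice: supports 1.3 billion daily AAD authentications across 600 million user accounts
• Built-in Microsoft PolyBase technology simplifies distributed analytics, letting you run a single T-SQL query across multiple data sources with familiar tools
• Seamless integration with SQL Server Integration Services, Azure Analysis Services, Azure Stream Analytics, Azure Machine Learning, Azure Data Factory, and Azure Storage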
[Diagram: an intelligent app built from Power BI, App Service, SQL Database, SQL Data Warehouse, Machine Learning, and Hadoop.]
Machine Learning and Analytics: predict outcomes and make decisions
[Diagram: Data Sources, Information Management, and Big Data Stores feed Machine Learning and Analytics: Machine Learning, Data Lake Analytics, HDInsight (Hadoop and Spark), Stream Analytics.]
Machine Learning: Easily build, deploy, and share predictive analytics solutions, whether or not you are a data scientist
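• Offers a range of robust learning algorithm modules, training data, and visual inspection
• Supports R & Python and hundreds of CRAN packages
• Deploy predictive models to the cloud as web services for scalability and reliability, with support for single or batch calls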
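A single call to a deployed scoring web service typically means posting a small JSON document. The sketch below only builds that document; the `Inputs` / `ColumnNames` / `Values` / `GlobalParameters` layout is an assumption based on the request-response format used by Azure Machine Learning Studio web services at the time, and the column names are hypothetical. The API help page of the deployed service is the authoritative source for the real shape.

```python
import json


def build_scoring_request(column_names, rows, global_parameters=None):
    """Build a request body for one scoring call (assumed payload layout)."""
    return {
        "Inputs": {
            "input1": {
                "ColumnNames": list(column_names),
                # Values are sent as strings, one inner list per row to score.
                "Values": [[str(value) for value in row] for row in rows],
            }
        },
        "GlobalParameters": dict(global_parameters or {}),
    }


# Hypothetical feature columns for a Titanic-style survival model.
payload = build_scoring_request(["Pclass", "Sex", "Age"], [[3, "male", 22]])
body = json.dumps(payload)
print(body)
```

Sending `body` requires the service URL and API key from the deployed web service, which are not shown here.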
Would you have survived if you had boarded the Titanic? http://demos.datasciencedojo.com/demo/titanic/
Publish as a web service: https://docs.microsoft.com/zh-tw/azure/machine-learning/machine-learning-excel-add-in-for-web-services
Machine learning algorithm cheat sheet: https://docs.microsoft.com/zh-tw/azure/machine-learning/machine-learning-algorithm-cheat-sheet
Data Lake Analytics: easily develop and run massively parallel data transformation and processing programs in U-SQL, R, Python, and .NET to process petabyte-scale data
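• A public cloud platform service built on the YARN architecture
• On demand: start processing big data jobs within seconds
• Use the technologies you already know: U-SQL, Spark, Hive, HBase and Storm
• Process petabyte-scale data. With no infrastructure to manage, you can process data on demand, scale instantly, and pay only per job
• Works with optimized data virtualization of relational sources such as Azure SQL Database and Azure SQL Data Warehouse
• Backed by a 99.9% SLA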
[Diagram: Data Lake Analytics working across SQL DW, SQL DB, Storage Blobs, Data Lake Store, and SQL DB in a VM.]
HDInsight: Comprehensive set of managed Apache big data projects (a cloud Spark and Hadoop service for the enterprise)
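• The only fully managed cloud Hadoop offering, providing optimized open source analytics clusters for Spark, Hive, MapReduce, HBase, Storm, Kafka, and R Server
• Supports development environments such as Visual Studio, Eclipse, and IntelliJ
• Explore data through the two most popular notebooks, Jupyter and Zeppelin, which are integrated
• Supports Linux and Windows and deploys within seconds; there is no need to patch the operating system or upgrade the Hadoop version, because Azure does it for you
• Multithreaded math libraries and transparent parallelization in R Server handle up to 1,000 times more data, up to 50 times faster, than open source R
• Decoupled compute and storage let you scale workloads up or down cost-effectively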
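MapReduce is one of the engines listed above. As a reminder of the programming model, here is a word count written as explicit map, shuffle, and reduce steps in plain Python. It runs in a single process and does not use Hadoop or any HDInsight API; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


counts = word_count(["big data", "Big Data stores"])
print(counts)
```

On a cluster, the map and reduce calls run on many nodes in parallel and the shuffle moves data between them; the logic per record stays the same.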
[Diagram: HDInsight core engine and workloads. Batch: MapReduce; Script: Pig; SQL: Hive; NoSQL: HBase; Streaming: Storm; In-Memory: Spark.]
Stream Analytics: Real-time stream processing in the cloud
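• Analyze data streams in real time, for example in IoT solutions
• Process millions of events per second
• Processes data at high throughput with predictable results and no data loss
• Build real-time dashboards and alerts for data from devices and applications
• Correlate multiple data streams
• Use the familiar SQL language to speed up development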
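A common streaming query groups events into fixed, non-overlapping time windows and aggregates each window. The sketch below computes a per-device average over tumbling windows in plain Python; it is only meant to show what such a SQL-style windowed GROUP BY computes, not how Stream Analytics implements it. The event tuple layout and the function name are assumptions.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average `value` per (window start, device).

    `events` is an iterable of (timestamp_seconds, device_id, value) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, device) -> [total, count]
    for timestamp, device, value in events:
        # Every event belongs to exactly one window: [start, start + window_seconds).
        window_start = (timestamp // window_seconds) * window_seconds
        bucket = sums[(window_start, device)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


events = [(1, "a", 10.0), (4, "a", 20.0), (11, "a", 30.0), (3, "b", 5.0)]
averages = tumbling_window_avg(events, window_seconds=10)
for (window_start, device), average in averages.items():
    print(window_start, device, average)
```

A real streaming engine emits each window as soon as it closes instead of waiting for the whole input, but the grouping rule is the same.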
[Diagram: Stream Analytics with inputs (Event Hubs, Blob Storage) and outputs (SQL Database, Event Hubs, Power BI, Blob Storage, Table Storage).]
Demo: Cortana Intelligence Gallery https://gallery.cortanaintelligence.com
Intelligence
[Diagram: the Intelligence layer (Cortana, Bot Framework, Cognitive Services) sits on top of Data Sources, Information Management, Big Data Stores, and Machine Learning and Analytics.]
Cognitive Services
• Easy: Roll your own with REST APIs. Simple to add: just a few lines of code required.
• Flexible: Integrate into the language and platform of your choice. Breadth of offerings helps you find the right API for your app.
• Tested: Built by experts in their field from Microsoft Research, Bing, and Azure Machine Learning. Quality documentation, sample code, and community support.
Get a key
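"Just a few lines of code" usually means one authenticated HTTP POST. The sketch below builds such a request with the Python standard library and deliberately does not send it. The endpoint URL, key, and payload are placeholders, and the use of the `Ocp-Apim-Subscription-Key` header is stated here as an assumption about how the key is passed; check the reference page of the specific API before relying on it.

```python
import json
import urllib.request


def build_request(endpoint, subscription_key, payload):
    """Build (but do not send) a JSON POST request carrying the subscription key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


request = build_request(
    "https://example.invalid/vision/analyze",      # hypothetical endpoint
    "YOUR-KEY",                                     # placeholder key
    {"url": "https://example.invalid/photo.jpg"},  # hypothetical payload
)
print(request.get_method(), request.full_url)
# To send it, pass `request` to urllib.request.urlopen with a real endpoint and key.
```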
Bot Framework: your bots, wherever your users converse
• Bot Connector Service: A service to register your bot, configure channels and publish to the Bot Directory. Connect your bot(s) seamlessly to text/SMS, Office 365 mail, Skype, Slack, Twitter, and more.
• Bot Builder SDK: An open source SDK hosted on GitHub. Everything you need to build great dialogs within your Node.js or C# bot.
• Bot Directory: A public directory of bots registered through the Bot Connector Service. Discover, try, and add bots to conversation experiences.
https://dev.botframework.com/
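At its core a bot receives a message, consults some conversation state, and returns a reply. The sketch below shows that loop as a single Python function. It does not use the Bot Builder SDK (which targets Node.js and C#); the dictionary shape with `type` and `text` keys and the `handle_activity` name are simplifications assumed for illustration.

```python
def handle_activity(activity, state):
    """Return a reply for an incoming activity, updating per-conversation state."""
    if activity.get("type") != "message":
        return None  # ignore anything that is not a user message
    text = activity.get("text", "").strip()
    if "name" not in state:
        if state.get("asked_name"):
            # The previous turn asked for the name, so this message is the answer.
            state["name"] = text
            reply = f"Nice to meet you, {text}."
        else:
            state["asked_name"] = True
            reply = "Hi! What is your name?"
    else:
        reply = f"{state['name']}, you said: {text}"
    return {"type": "message", "text": reply}


state = {}
for incoming in ["hello", "Ann", "help"]:
    reply = handle_activity({"type": "message", "text": incoming}, state)
    print(reply["text"])
```

Keeping the state outside the function is what lets the same handler serve many conversations, one state object per conversation.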
Conversation as a Platform
[Diagram: four numbered steps, each paired with a component.]
• Build a great bot: Microsoft Bot Builder (SDKs)
• Connect your bot to channels: Microsoft Bot Connector
• Make your bot discoverable: Microsoft Bot Directory
• Add smarts to your bot: Knowledge & Intelligence Services
Public APIs (Cognitive Services): Vision, Web Search, Language, Speech, Knowledge, …ML
Private APIs: Dialog Manager, Knowledge & Action Graph, Entity
Dashboards & Visualizations: turn data into easy-to-understand, interactive visuals
[Diagram: Power BI is the Dashboards & Visualizations layer on top of the rest of the suite.]
Power BI: Keep a pulse on your business with live, interactive dashboards (turn data into interactive, real-time charts you can view anytime, anywhere)
[Diagram: Event Hubs, Stream Analytics, Machine Learning, Storage, SQL Database, and HDInsight feeding Power BI.]
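• Power BI dashboards are designed for business users; no IT background is required
• Real-time access anytime, anywhere from mobile devices
• A variety of interactive, real-time 2D/3D charts
• Reports can also be embedded in your existing web pages
• Integrates data from Excel spreadsheets, on-premises data, Hadoop datasets, data streams, and cloud services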
Topic: Turning data into intelligence and decisions
[Diagram: Cortana Intelligence Suite overview, from data sources through intelligence to action.]