Introduction to Azure HDInsight
Data types commonly processed with Hadoop
1. Sentiment: Understand how your customers feel about your brand
2. Clickstream: Capture and analyze website visitors' data trails and optimize your website
3. Sensor/machine: Discover patterns in data streaming automatically from remote sensors and machines
4. Geographic: Analyze location-based data to manage operations where they occur
5. Server logs: Research logs to diagnose process failures and prevent security breaches
6. Unstructured data (txt, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents
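Azure HDInsight overview
Hadoop Meets the Cloud: a Hadoop service managed by Microsoft
Uses 100% open-source Apache Hadoop
Compatible with .NET and Java tools
Hadoop versions can be upgraded automatically
Set up and running within minutes, with no hardware to purchase
Runs on Windows or Linux
Provision and configure the service, use it, then delete it; the data can be retained
Technical support provided by Microsoft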
[Diagram: cluster architecture. Labels: Name Node, Job Tracker, Data Node (x4), Task Tracker (x4), HMaster, Coordination, Region Server (x4)]
[Diagram labels: Stream processing; Search and query; Data analytics (Excel); Web/thick client dashboards; Devices to take action; RabbitMQ / ActiveMQ]
Other Hadoop components and tools
Ambari: Cluster provisioning, management, and monitoring
Avro (Microsoft .NET Library for Avro): Data serialization for the Microsoft .NET environment
MapReduce and YARN: Distributed processing and resource management
Oozie: Workflow management
Phoenix: Relational database layer over HBase
Pig: Simpler scripting for MapReduce transformations
Sqoop: Data import and export
Tez: Allows data-intensive processes to run efficiently at scale
ZooKeeper: Coordination of processes in distributed systems
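To make the MapReduce entry in the component list above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example. It runs in plain Python on one machine and does not use Hadoop or HDInsight; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence (hypothetical helper)."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    # Made-up sample input standing in for server log lines.
    sample = ["error disk full", "warning disk slow", "error network down"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel across the data nodes and the framework performs the shuffle; the sketch only shows how data flows between the phases.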
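HDInsight advantages
Automated provisioning of Hadoop clusters
Uses the latest stable Hadoop components
Provides high availability and reliability for clusters
Economical, efficient storage through Azure Blob storage
Integrates with other Azure services, including Web apps and SQL Database
Low cost of entry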
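As a small illustration of the Azure Blob storage point above: HDInsight clusters typically address blob data through URIs of the form wasb://<container>@<account>.blob.core.windows.net/<path> (wasbs:// for TLS). The helper below is a hedged sketch that only builds such a string; the function name wasb_uri and the container and account names in the example are hypothetical, and nothing here contacts Azure.

```python
def wasb_uri(container, account, path="", secure=False):
    """Build a wasb:// (or wasbs://) URI for a blob path.

    Hypothetical helper: it only formats a string and does not check
    that the container, account, or path actually exist.
    """
    scheme = "wasbs" if secure else "wasb"
    # Strip leading slashes so the result has exactly one after the host.
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # "data" and "mystore" are made-up names for illustration only.
    print(wasb_uri("data", "mystore", "/logs/2016/"))
```

Because the data lives in the storage account rather than on the cluster nodes, a path like this remains valid after the cluster itself is deleted, which is what allows the data to be kept when the service is cancelled.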
be removed January 1, 2017
https://portal.azure.com
https://azure.microsoft.com/en-us/documentation/templates/?term=hdinsight
Cluster deployment