azure resource monitoring cloud talk_20161128
TRANSCRIPT
Azure Resource Monitoring
VĂN ĐÌNH PHÚC – FPT Software
LÊ ĐỨC TIỆP – FPT Software
ĐOÀN NGỌC HUY – FPT Software
Ping Me
Van Dinh Phuc (Philip Van)
Personal email: [email protected] : @phucvdb
Skype: @phucvdb
Technology Domain:
◼Virtualization & Cloud Technologies, with a focus on infrastructure (VDI, EUC, IaaS, PaaS)
◼Linux Container
◼Innovation Technologies
My current job: Cloud Solution Architect – FSO.CLI.R&D
Agenda
◼Holistic Azure Resource Monitoring
◼Introduction to Azure Monitoring Services
◼Introduction to On-premise Monitoring (HuyDN6 – FHO.STU)
◼Open Source Monitoring Tools for Azure
◼Case Study with an FSO Project (TiepLD2 – FSU1.BU2)
◼Q&A
Holistic Azure Resource Monitoring
Conceptual Model for monitoring and diagnostics
Non-compute resources Compute resources
Monitoring and diagnostics scenarios
◼Health monitoring
◼Availability monitoring
◼Performance monitoring
◼Security monitoring
◼SLA monitoring
◼Auditing
◼Usage monitoring
◼Issue tracking
The monitoring and diagnostics pipeline
Collecting and storing data
Azure Diagnostics
Azure Diagnostics gathers data from the following sources for each compute node, aggregates it, and then uploads it to Azure Storage:
◼IIS logs
◼IIS Failed Request logs
◼Windows event logs
◼Performance counters
◼Crash dumps
◼Azure Diagnostics infrastructure logs
◼Custom error logs
◼.NET EventSource
◼Manifest-based ETW
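The collect, aggregate, upload flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Azure Diagnostics is implemented: the record layout, the node names, and the `storage` dictionary standing in for Azure Storage are all assumptions.

```python
from collections import defaultdict

# Source names taken from the slide; the record layout is hypothetical.
SOURCES = ("IIS logs", "Windows event logs", "Performance counters", "Crash dumps")


def aggregate(records):
    """Group raw (node, source, payload) records by node, then by source."""
    batches = defaultdict(lambda: defaultdict(list))
    for node, source, payload in records:
        if source not in SOURCES:
            continue  # skip sources this sketch does not model
        batches[node][source].append(payload)
    return {node: dict(by_source) for node, by_source in batches.items()}


def upload(batches, storage):
    """Stand-in for the upload step: 'storage' is a plain dict, not Azure Storage."""
    for node, by_source in batches.items():
        for source, payloads in by_source.items():
            storage.setdefault((node, source), []).extend(payloads)
    return storage


records = [
    ("web-0", "IIS logs", "GET / 200"),
    ("web-0", "Performance counters", "cpu=41"),
    ("web-1", "IIS logs", "GET /health 200"),
    ("web-1", "Unknown source", "dropped by the filter"),
]
storage = upload(aggregate(records), {})
```

Keying the stored data by (node, source) keeps each compute node's data separate, which matches the "for each compute node" wording on the slide.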
Introduction to Azure Monitoring Services
◼Resource Monitoring
◼Billing Monitor & Report
◼Logging
◼Security & Compliance
◼OMS
◼Demo
Azure’s Monitoring Offerings
The best monitoring strategy combines use of all three to gain comprehensive, detailed insight into the health of your services:
◼Azure Monitor – Offers visualization, query, routing, alerting, autoscale, and automation on data both from the Azure infrastructure (Activity Log) and each individual Azure resource (Diagnostic Logs)
◼Application Insights – Provides rich detection and diagnostics for issues at the application layer of your service, well-integrated on top of data from Azure Monitoring
◼Log Analytics, part of Operations Management Suite – Provides a holistic IT management solution for both on-premises and third-party cloud-based infrastructure (such as AWS) in addition to Azure resources.
Resource Monitoring
Azure Monitor:
◼Free
◼A metrics infrastructure
◼Monitoring sources
  ◼Activity Logs (cannot be deleted by Azure users) – retained for 90 days
  ◼Resource Metrics – retained for 7 days
  ◼Diagnostic logs
◼Supports almost all Azure services
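The retention windows above decide how far back you can still query. A minimal sketch, assuming the 90-day and 7-day figures from the slide (they may differ in current Azure) and hypothetical names `RETENTION` and `still_retained`:

```python
from datetime import datetime, timedelta

# Retention windows as stated on the slide; treat them as assumptions.
RETENTION = {
    "activity_log": timedelta(days=90),
    "resource_metric": timedelta(days=7),
}


def still_retained(kind, recorded_at, now):
    """Return True if a record of this kind is still inside its retention window."""
    return now - recorded_at <= RETENTION[kind]


now = datetime(2016, 11, 28)
# An activity log entry that is 74 days old is inside the 90-day window.
activity_ok = still_retained("activity_log", datetime(2016, 9, 15), now)
# A metric sample that is 27 days old is outside the 7-day window.
metric_ok = still_retained("resource_metric", datetime(2016, 11, 1), now)
```

Anything you need beyond these windows has to be exported somewhere else before it ages out.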
Azure Monitor
◼Features:
  ◼Metrics
  ◼Alerts (Application Insights, Log Analytics – OMS, Azure Monitor)
  ◼Autoscale
  ◼Activity log
  ◼Diagnostic Logs
  ◼Partner Integrations
Application Insights
◼An extensible Application Performance Management (APM) service for web developers
◼Run time vs build time
◼Supported platforms:
  ◼ASP.NET
  ◼Java
  ◼JavaScript
  ◼Node.js
  ◼SharePoint sites
  ◼SCOM / OMS / Power BI integration
  ◼Others (link)
◼Pricing models: Basic / Enterprise
Billing Monitor & Report
◼Azure supports several features:
  ◼Resource costs in a resource group (RG)
  ◼Subscription
  ◼Azure billing (Preview)
  ◼Azure EA portal
  ◼Partner integration
  ◼Azure Resource Usage API (Preview) / Azure Resource RateCard API (Preview)
◼You should tag your resources on Azure
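One reason tagging helps with billing is that costs can then be grouped per tag value. The sketch below shows the idea on hand-written data; the resource names, the `project` tag, and the cost figures are made up, and no Azure billing API is called.

```python
from collections import defaultdict


def cost_by_tag(resources, tag_key):
    """Sum cost per value of one tag; untagged resources land under 'untagged'."""
    totals = defaultdict(float)
    for res in resources:
        tag_value = res.get("tags", {}).get(tag_key, "untagged")
        totals[tag_value] += res["cost"]
    return dict(totals)


# Hypothetical usage records; a real report would come from the billing data.
resources = [
    {"name": "vm-web", "cost": 120.0, "tags": {"project": "shop"}},
    {"name": "sql-db", "cost": 80.0, "tags": {"project": "shop"}},
    {"name": "vm-test", "cost": 30.0, "tags": {"project": "lab"}},
    {"name": "old-disk", "cost": 5.0},
]
totals = cost_by_tag(resources, "project")
```

The "untagged" bucket is useful on its own: it shows how much spend cannot yet be attributed to any project.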
Security Monitoring – Azure Security Center
Azure Advisor (Preview)
◼A personalized recommendation engine that provides proactive best practices guidance for optimally configuring your Azure resources
◼Free Azure product while in public preview
Introduction to Azure Monitoring Services
◼Resource Monitoring
◼Billing Monitor & Report
◼Logging
◼Security & Compliance
◼OMS
◼Demo
Challenges?
◼Unify and index data
  ◼Collect multiple types of data from multiple sources
  ◼Make data searchable
◼Extract crucial information
  ◼Make data understandable
  ◼Who is the audience? Are we talking with developers or operations?
  ◼Data from Application Insights is developer centric
  ◼Operations and developers need to take the right action
◼Integrate into DevOps lifecycle
  ◼Provide integration points for DevOps lifecycle and tooling
  ◼Hooks for problem and incident management systems
  ◼Hooks for providing information to Plan + Track phase
Operations Management Suite
Services in OMS
Log Analytics
Azure Automation
Azure Site Recovery
Azure Backup
Insights & Analytics
Security & Protection
Configuration & Automation
Backup & Disaster Recovery
Log Analytics
Collect
◼Collect logs and performance data from systems and storage
◼Designate custom logs and fields
◼Solutions provide additional metrics and insights
Correlate
◼Automatic indexing of ingested log data
◼Correlate data over whole infrastructure
◼Automatic actions and remediation
Analyze
◼Dashboards with high-level actionable intelligence
◼Query and analyze data
◼Advanced analytics with Power BI
Log Analytics - Architecture
Diagram labels: OMS Workspace; VM with agent; SCOM management servers; SCOM agents; direct agents; Portal; OMS Repository; Azure resources; Solutions; Log Query; OMS Service; Local; Other Clouds; Azure Storage; Azure
Log Analytics - Data flow
Log Analytics to the rescue
◼Central repository
  ◼Data agnostic
  ◼Infrastructure data (logs, performance)
  ◼Application data (AppInsight, Service Fabric, …)
◼Automated ingestion
  ◼Store data in one big pool
◼Indexing of data
  ◼Data is made searchable
◼Display data
  ◼Enable human readable information
  ◼Provide crucial information at a glance
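The "one big pool, indexed, searchable" idea can be illustrated with a toy inverted index. This is a sketch of the concept only and says nothing about how Log Analytics stores or indexes data; the class `LogPool` and its methods are hypothetical.

```python
from collections import defaultdict


class LogPool:
    """Toy central repository: records from any source go into one pool."""

    def __init__(self):
        self.records = []
        self.index = defaultdict(set)  # token -> ids of records containing it

    def ingest(self, source, message):
        """Store a record and index every whitespace-separated token in it."""
        rec_id = len(self.records)
        self.records.append({"source": source, "message": message})
        for token in message.lower().split():
            self.index[token].add(rec_id)
        return rec_id

    def search(self, term):
        """Return all records whose message contains the term as a token."""
        ids = sorted(self.index.get(term.lower(), ()))
        return [self.records[i] for i in ids]


pool = LogPool()
pool.ingest("infrastructure", "disk latency timeout on vm-web")
pool.ingest("application", "request timeout in checkout service")
pool.ingest("application", "user login ok")
hits = pool.search("timeout")
```

Because the pool is data agnostic, one search returns both the infrastructure record and the application record, which is the correlation benefit the slide points at.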
Diagram labels: OMS Workspace; OMS Repository; Service Fabric; VMs with Agent; OMS Service; Azure Storage; Azure; Application Insights
Innovation Pipeline
Distributed data sources - Support
◼Windows Event Log
◼Windows Perf Counter
◼ETW Traces
◼Application Insights
◼Load Testing Utilities
◼Application Events
◼Stream Analytics
◼Event Hub
OMS support phases
Azure Automation to the rescue
Integration
◼Enables integration into DevOps lifecycle
◼Enables integration into 3rd party tooling like ITSM systems
Diagram labels: OMS Workspace; OMS Repository; Service Fabric; VMs with Agent; OMS Service; Azure Storage; Azure; Azure Automation; Local; Application Insights; Runbook Automation; Webhook; ITSM Systems; Visual Studio Team Services
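The Webhook and Runbook Automation labels suggest an alert being pushed to a runbook over HTTP. A minimal sketch of building such a call with the Python standard library follows. The URL, the token, and the alert fields are placeholders, and the payload shape is an assumption; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, alert):
    """Build (but do not send) an HTTP POST carrying the alert as a JSON body."""
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL and hypothetical alert fields, for illustration only.
alert = {"resource": "vm-web", "metric": "cpu", "value": 95, "severity": "critical"}
request = build_webhook_request("https://example.invalid/webhooks?token=PLACEHOLDER", alert)
```

Sending it would be a call to `urllib.request.urlopen(request)`; the same request shape could target an ITSM system's inbound hook instead of a runbook.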
Introduction to On-premise Monitoring
Key Capabilities & Features (1 of 3)
Infrastructure monitoring
◼Multiple OS supported (Windows/Linux/Unix)
  • Server availability
  • Performance: CPU, memory, file system, disk space, swap space
  • Critical services & applications
  • Event logs
  • Important log & configuration files
◼Network topology discovery
  • Discover devices attached to the network to generate a topology
  • Connection health, port/interface status, device performance
◼Hardware
  • Storage device
  • Server components: physical disk, network interface, CPU, RAM…
Key Capabilities & Features (2 of 3)
Application & Workload monitoring
◼Application monitoring
  • AD, Exchange, Lync, SharePoint, IIS…
  • Distributed application: 3-tier application, LoB web application, messaging services
  • .NET, Java applications
◼Multiple scenarios supported
  • Average response time
  • Component availability
  • Key application performance metrics
  • SLA state
◼Integration with other systems
  • Ticket system
  • System Center suite
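Average response time, component availability, and SLA state can all be derived from the same probe results. A small sketch with made-up samples; the function name `summarize`, the sample layout, and the SLA target are assumptions and not SCOM's own calculation.

```python
def summarize(samples, sla_target=0.99):
    """Summarize probe results given as (succeeded, response_ms) pairs."""
    if not samples:
        raise ValueError("no samples to summarize")
    ok_times = [ms for succeeded, ms in samples if succeeded]
    availability = len(ok_times) / len(samples)
    # Average only over successful probes; failed probes have no meaningful time.
    avg_ms = sum(ok_times) / len(ok_times) if ok_times else None
    return {
        "availability": availability,
        "avg_response_ms": avg_ms,
        "sla_met": availability >= sla_target,
    }


# Four hypothetical probes: three succeeded, one failed.
samples = [(True, 120.0), (True, 80.0), (False, 0.0), (True, 100.0)]
report = summarize(samples, sla_target=0.7)
```

With these samples the component was available for three of four probes, so a 70% target is met while a 99% target would not be.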
Key Capabilities & Features (3 of 3)
Hybrid & Public Cloud Monitoring
◼Virtual workload
  • Physical host status
  • Virtual machines
  • Virtual network
◼Public cloud
  • IaaS: monitor as an extended infrastructure system
  • PaaS: use REST APIs to remotely discover and collect performance information about services
  • SaaS: depends on the application (limited)
SCOM Architecture
How Objects Are Discovered and Monitored
◼An Operations Manager agent is a service that is installed on a computer
◼Sends a heartbeat to the Management Server every 60 seconds
◼Runs on the monitored server as a service called Health Service
◼Collects data and compares sampled data to predefined values
◼Creates alerts, runs responses, and reports them to the Management Server
◼A Management Server receives and distributes configurations to agents on monitored computers.
◼Agent vs Agentless
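The agent behaviour listed above (heartbeat, compare samples to predefined values, create alerts) can be sketched as two small functions. The 60-second interval comes from the slide; the tolerance of three missed heartbeats, the metric names, and the thresholds are assumptions for illustration, not SCOM defaults quoted from its documentation.

```python
HEARTBEAT_INTERVAL_S = 60   # from the slide
MISSED_BEATS_ALLOWED = 3    # hypothetical tolerance before the agent counts as down


def agent_is_alive(last_heartbeat_s, now_s):
    """Management-server side: is the last heartbeat recent enough?"""
    return now_s - last_heartbeat_s <= HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED


def evaluate(sample, thresholds):
    """Agent side: compare sampled values to predefined values, return alerts."""
    return [
        f"{name} {value} exceeds {thresholds[name]}"
        for name, value in sample.items()
        if name in thresholds and value > thresholds[name]
    ]


# Hypothetical sample and thresholds, both in percent.
alerts = evaluate({"cpu": 95, "memory": 40}, {"cpu": 90, "memory": 80})
alive = agent_is_alive(last_heartbeat_s=0, now_s=120)
```

Splitting the two checks mirrors the slide: the agent evaluates data locally, while the management server only needs the heartbeat to know the agent itself is healthy.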
Visualize the IT Infrastructure
◼Performance Dashboard
◼Summary Dashboard
◼SLA Dashboard
◼Topology Dashboard
Provides multiple dashboard templates to visualize the IT infrastructure
Built-in report capability
◼Shows historical information about the environment
◼Provides historical trending of health and performance, detailed information about configuration, capacity planning assistance, and the ability to deliver monitoring and SLA information
◼Includes hundreds of pre-built reports as well as the ability to create custom reports
◼Supports favorite and scheduled reports in multiple formats (XML, PDF, Excel, Word…)
Availability Report, Alert Report, Performance Report
SCOM vs OMS
SCOM = Monitoring
◼A monitoring tool with some basic log analytics capabilities
◼Monitors workloads, distributed applications and so on, whether on-premises or in the cloud
◼Settings and monitors can be customized
◼Needs maintenance and version upgrades
OMS = Log Analytics
◼An enhanced log analyzer with some basic monitoring
◼Analytics tool with many preconfigured solutions to monitor workloads in the cloud
◼Microsoft SaaS
◼No need to maintain it or add new features yourself
SCOM & OMS: Better Together
Open Source Monitoring Tools for Azure
◼Summary
◼Feature Comparison
◼Demo
About tools
◼Almost all of them support monitoring Azure VMs
◼Some tools support monitoring a Web App via an HTTP check
◼Common tools
  ◼Zabbix
  ◼Nagios
  ◼Ganglia
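An HTTP check of the kind mentioned above boils down to requesting a URL and looking at the status code. The sketch below uses only the Python standard library and is not taken from Zabbix, Nagios, or Ganglia. The function name `http_check`, the 2xx rule, and the injectable `opener` parameter (added so the check can be exercised without a network) are assumptions.

```python
import urllib.error
import urllib.request


def http_check(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (healthy, detail); healthy means the request returned a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar problems.
        return False, f"unreachable: {exc}"
    return 200 <= status < 300, f"HTTP {status}"
```

A call such as `http_check("https://<your-app>.azurewebsites.net/")` would probe a Web App, where the host name is a placeholder to replace with your own. A monitoring tool would run this on a schedule and raise an alert after a number of consecutive failures.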
Case Study with an FSO Project