Post on 19-Dec-2015
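ISA5428: Pervasive Computing
Autonomic Computing
Prof. Chung-Ta King, Institute of Information Systems and Applications, National Tsing Hua University
Second Semester, Academic Year 91 (ROC calendar; spring 2003)
(Slides are taken from the presentations by Alan Ganek, Alfred Spector, Jeff Kephart of IBM)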
2
Trillions of heterogeneous computing devices connected to the Internet
Dream of Pervasive Computing …
or Nightmare!
3
Core of the Problem
• Complexity in systems themselves and in the operating environment
– As systems become more interconnected and diverse, architects are less able to anticipate and design interactions among components
– → push to runtime, late binding, e.g., hot-plug, JVM, JIT compilation, service discovery, mobile agents, …
• Complexity management → human intervention and IT costs
4
Need Complexity Management
• But complexity is beyond what humans can handle → human out of the control loop → autonomic
• Even though we are moving along this direction, is there any systematic way of addressing this issue?
• Autonomic Computing
5
Alan G. Ganek
Vice President
IBM Autonomic Computing
http://www.ibm.com/autonomic/
Autonomic Computing
6
Complex Heterogeneous Infrastructures Are a Reality!
[Figure labels: Directory and Security Services; Existing Applications and Data; Business Data; Data Server; Web Application Server; Storage Area Network; BPs and External Services; Web Server; DNS Server; Data]
• Dozens of systems and applications
• Hundreds of components
• Thousands of tuning parameters
7
Industry Trends
• Administration of systems is increasingly difficult
– 100s of configuration, tuning parameters for DB2
• Heterogeneous systems are increasingly connected
– Integration becoming ever more difficult
• Architects can't plan interactions among components
– Increasingly dynamic; frequently with unanticipated components
• More burden must be assumed at run time
– But human administrators can't assume the burden
  • 6:1 cost ratio between storage admin and storage
  • 40% outages due to operator error
• Need self-managing computing systems
– Behavior specified by sys admins via high-level policies
– System and its components figure out how to carry out policies
8
Autonomic Computing Vision
• "Intelligent" open systems that…
– Manage complexity
– "Know" themselves
– Continuously tune themselves
– Adapt to unpredictable conditions
– Prevent and recover from failures
– Provide a safe environment
• Self-management:
– Free administrators from details of operations
– Provide peak performance 24/7
– Concentrate on high-level decisions and policies
9
Self-managing Systems That …
• Increase Responsiveness: adapt to dynamically changing environments
• Business Resiliency: discover, diagnose, and act to prevent disruptions
• Operational Efficiency: tune resources and balance workloads to maximize use of IT resources
• Secure Information and Resources: anticipate, detect, identify, and protect against attacks
Aware/Proactive
10
Self-Configuring Example: DB2 Configuration Advisor
11
Self-Healing Example: IBM Electronic Service Agent
12
Self-Optimizing: Enterprise Workload Management
[Figure labels: Internet; Appliance Servers; Web Application Servers; Data and Transaction Servers; Internet/Extranet; Business Partners]
Self-tuning, end-to-end performance management
• Dynamic allocation of network resources
• Workload balancing & routing
• Cross-platform reporting
• Policy-based for various classes of users & applications
Heterogeneous, distributed components working together
13
Self-Protecting Example: IBM Tivoli Risk Manager
• Automate incident response
• Protect systems and data
• Help prevent service disruptions
• Rapid / automated analysis of complex situations
[Figure labels: Internet; Intranet; Router; Web Server; Firewall; Application Server; Intrusion Detection System (IDS); Security Event; Risk Manager; Risk Mgr IDS Rules; Event Database; Correlation Engine]
"The Tivoli security management software portfolio is helping our clients extend their businesses to the Internet while providing security and privacy..."
Mark Ford, Principal, Deloitte & Touche
14
Evolving towards Self-management
• Self-configure
– Today: Corporate data centers are multi-vendor, multi-platform. Installing, configuring, integrating systems is time-consuming, error-prone.
– The autonomic future: Automated configuration of components, systems according to high-level policies; rest of system adjusts automatically. Seamless, like adding new cell to body or new individual to population.
• Self-heal
– Today: Problem determination in large, complex systems can take a team of programmers weeks.
– The autonomic future: Automated detection, diagnosis, and repair of localized software/hardware problems.
• Self-optimize
– Today: WebSphere, DB2 have hundreds of nonlinear tuning parameters; many new ones with each release.
– The autonomic future: Components and systems will continually seek opportunities to improve their own performance and efficiency.
• Self-protect
– Today: Manual detection and recovery from attacks and cascading failures.
– The autonomic future: Automated defense against malicious attacks or cascading failures; use early warning to anticipate and prevent system-wide failures.
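As a rough illustration of the self-optimize row, the sketch below shows a component that keeps probing one tuning parameter and keeps any change that improves a measured score. It is a minimal greedy hill climb, not the method used by WebSphere or DB2; the `throughput` function, the parameter name `buffer_mb`, and the optimum at 64 are all made up for the example.

```python
def tune(measure, value: float, step: float, rounds: int = 20) -> float:
    """Greedy hill climbing on one parameter; `measure` returns a score (higher is better)."""
    best = measure(value)
    for _ in range(rounds):
        improved = False
        for candidate in (value + step, value - step):
            score = measure(candidate)
            if score > best:
                # Keep the first neighbouring setting that performs better.
                value, best, improved = candidate, score, True
                break
        if not improved:
            step /= 2  # no neighbour is better: search closer to the current setting
    return value


def throughput(buffer_mb: float) -> float:
    """Hypothetical workload model whose best buffer size is 64 MB."""
    return -(buffer_mb - 64.0) ** 2


best_buffer = tune(throughput, value=16.0, step=16.0)
print(best_buffer)
```

A real system has hundreds of interacting, nonlinear parameters, so a one-dimensional search like this only conveys the idea of continuous self-tuning.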
15
Evolving to Autonomic Computing
Manual → Autonomic: Basic (Level 1), Managed (Level 2), Predictive (Level 3), Adaptive (Level 4), Autonomic (Level 5)
• Basic (Level 1)
– Characteristics: Multiple sources of system-generated data
– Skills: Requires extensive, highly skilled IT staff
– Benefits: Basic requirements met
16
Evolving to Autonomic Computing
• Managed (Level 2)
– Characteristics: Consolidation of data and actions through management tools
– Skills: IT staff analyzes and takes actions
– Benefits: Greater system awareness; improved productivity
17
Evolving to Autonomic Computing
• Predictive (Level 3)
– Characteristics: System monitors, correlates and recommends actions
– Skills: IT staff approves and initiates actions
– Benefits: Reduced dependency on deep skills; faster/better decision making
18
Evolving to Autonomic Computing
• Adaptive (Level 4)
– Characteristics: System monitors, correlates and takes action
– Skills: IT staff manages performance against SLAs
– Benefits: Balanced human/system interaction; IT agility and resiliency
19
Evolving to Autonomic Computing
• Autonomic (Level 5)
– Characteristics: Integrated components dynamically managed by business rules/policies
– Skills: IT staff focuses on enabling business needs
– Benefits: Business policy drives IT management; business agility and resiliency
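The difference between the five levels is mainly who analyzes and who acts. The sketch below encodes that split for a single system-generated recommendation. The names `Maturity` and `handle` and the returned strings are illustrative assumptions, not part of IBM's model.

```python
from enum import IntEnum


class Maturity(IntEnum):
    BASIC = 1
    MANAGED = 2
    PREDICTIVE = 3
    ADAPTIVE = 4
    AUTONOMIC = 5


def handle(level: Maturity, recommendation: str, approved_by_staff: bool = False) -> str:
    """Describe what happens to a recommended action at a given maturity level."""
    if level <= Maturity.MANAGED:
        # Levels 1-2: IT staff analyze the data and take actions themselves.
        return "staff analyzes and acts"
    if level == Maturity.PREDICTIVE:
        # Level 3: the system recommends; staff approve and initiate.
        if approved_by_staff:
            return f"execute {recommendation}"
        return f"await approval for {recommendation}"
    # Levels 4-5: the system takes action itself; staff watch SLAs or set business policy.
    return f"execute {recommendation}"


print(handle(Maturity.PREDICTIVE, "add_server"))
print(handle(Maturity.ADAPTIVE, "add_server"))
```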
20
IBM’s Architecture Model
• Intelligent control loop:
– Implementing self-managing attributes involves an intelligent control loop
21
Control Loops Delivered in 2 Ways
• Combinations of management tools
• Resource provider
22
3 Layers of Control Loop Management
• Composite resources tied to business decision-making
• Composite resources decision-making, e.g., cluster servers
• Resource elements managing themselves
23
Autonomic Element - Structure
• Fundamental atom of the architecture
– Managed element(s)
  • Database, storage
– Autonomic manager
• Responsible for:
– Providing its service
– Managing own behavior in accordance with policies
– Interacting with other autonomic elements
[Figure: An Autonomic Element. Autonomic Manager: Monitor, Analyze, Plan, Execute, Knowledge, with Sensors and Effectors. Managed Element: Sensors, Effectors.]
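The structure above can be sketched as a small monitor-analyze-plan-execute loop over shared knowledge. This is a minimal illustration under stated assumptions, not IBM's implementation: the server pool, the load thresholds, and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedElement:
    """Hypothetical managed resource: a server pool with one sensor and one effector."""
    servers: int = 2
    requests_per_sec: float = 300.0

    def sense(self) -> dict:
        # Sensor: expose the current state to the manager.
        return {"load_per_server": self.requests_per_sec / self.servers}

    def effect(self, action: str) -> None:
        # Effector: apply a change requested by the manager.
        if action == "add_server":
            self.servers += 1
        elif action == "remove_server" and self.servers > 1:
            self.servers -= 1


@dataclass
class AutonomicManager:
    element: ManagedElement
    # Knowledge: policy thresholds (assumed values) plus a log of recent activity.
    knowledge: dict = field(
        default_factory=lambda: {"max_load": 100.0, "min_load": 30.0, "log": []}
    )

    def monitor(self) -> dict:
        return self.element.sense()

    def analyze(self, metrics: dict) -> str:
        load = metrics["load_per_server"]
        if load > self.knowledge["max_load"]:
            return "overloaded"
        if load < self.knowledge["min_load"]:
            return "underloaded"
        return "ok"

    def plan(self, symptom: str) -> list:
        plans = {"overloaded": ["add_server"], "underloaded": ["remove_server"]}
        return plans.get(symptom, [])

    def execute(self, actions: list) -> None:
        for action in actions:
            self.element.effect(action)
            self.knowledge["log"].append(action)

    def run_once(self) -> str:
        symptom = self.analyze(self.monitor())
        self.execute(self.plan(symptom))
        return symptom


manager = AutonomicManager(ManagedElement())
while manager.run_once() != "ok":
    pass  # keep looping until the policy thresholds are satisfied
print(manager.element.servers, manager.knowledge["log"])
```

Changing the thresholds in `knowledge` changes the behavior without touching the loop, which is the sense in which the element manages itself in accordance with policies.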
24
Autonomic Manager Substructure
• Monitor: Metric Managers, Filters, Simple Correlators
• Analyze
• Plan: Policy Transforms, Plan Generators, Policy Interpreter
• Execute: Service Dispatcher, Distribution Engine, Scheduler Engine, Workflow Engine
• Knowledge: Policy, Calendar, Topology, Recent Activity Log
• Other components shown: Rules Engines, Analysis Engines, Policy Validations, Policy Resolution
• Sensors, Effectors
• Interfaces:
– Alerts, events & problem analysis request interface
– SLA/Policy interface, interprets & translates into "control logic"
25
Autonomic Elements - Interaction
• Relationships
– Dynamic, ephemeral
– Formed by agreement
  • May be negotiated
– Full spectrum
  • Peer-to-peer
  • Hierarchical
– Subject to policies
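One way to picture a relationship that is formed by agreement and subject to policies: a consuming element picks among offers from providing elements, and its own policy limits decide which offers are acceptable. The `Offer` fields, the limits, and the cheapest-acceptable rule are assumptions made for this sketch; the slides do not specify a negotiation protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    provider: str
    response_ms: int  # promised response time
    price: float      # cost per request


def negotiate(offers, max_response_ms: int, max_price: float) -> Optional[Offer]:
    """Return the cheapest offer that satisfies the consumer's policy, or None."""
    acceptable = [
        o for o in offers
        if o.response_ms <= max_response_ms and o.price <= max_price
    ]
    return min(acceptable, key=lambda o: o.price, default=None)


offers = [
    Offer("storage-a", response_ms=20, price=0.05),
    Offer("storage-b", response_ms=50, price=0.02),
    Offer("storage-c", response_ms=200, price=0.01),
]
agreement = negotiate(offers, max_response_ms=100, max_price=0.04)
print(agreement)
```

Because the agreement is recomputed from current offers and policies, it can be dropped and re-formed as conditions change, which matches the dynamic, ephemeral character of the relationships.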
26
Multiple Contexts for Autonomic Behavior
• System Elements (intra-element self-management): Servers, Storage, Network Devices, Middleware, Database, Applications
• Groups of Elements (inter-element self-management): Server Farm, Enterprise Network, Storage Pool
• Business Solutions (business policies, processes, contracts): Customer Relationship Management, Enterprise Resource Planning
27
Mapping to IT Processes
28
Levels of Maturity
29
Enabled capabilities
Core technologies
Administrative Console
Policy Infrastructure
Data Collection (Logging/Tracing)
Infrastructure Provisioning
Install/Dependency Management
Heterogeneous Workload Management
Solution Management
Policy-based Management
End-to-end Problem Determination
Automated Root Cause Analysis
Auto-Update
Identity/Security Management
Auto-Detection
Dynamic Provisioning
Autonomic Computing Requires Core Technologies
30
Integrated Solutions Console for Common System Administration
• Value:
– One consistent interface across product portfolio
– Common runtime infrastructure and development tools based on industry standards, component reuse
– Provides a presentation framework for other autonomic core technologies
Customer pain point: Complexity of operations
Standards-based: J2EE, JSR168
31
Log and Trace Tool for Problem Determination
• Value:
– Introduces standard interfaces and formats for logging and tracing
– Central point of interaction with multiple data sources
– Correlated views of data
– Reduced time spent in problem analysis
[Figure labels: Data Producers (Log A, Log B, eServer Log, ..., each with an Embedded adapter, a Logging Agent, and Common situations and data model); Standard Interface; Parsers; Collectors; Data Store; Data Exploiters (Analysis Engine, Viewer, ISC, ...)]
Customer pain point: Difficulty in analyzing problems in multi-component systems
Standards-based: JSR47, Apache
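The core idea here is that per-product parsers or adapters translate each log format into one common data model, so that viewers and analysis engines only need to understand a single format. The sketch below illustrates that idea. The two log formats, the `CommonEvent` fields, and the function names are invented for the example and are not the formats of the IBM tool.

```python
import re
from dataclasses import dataclass


@dataclass
class CommonEvent:
    """Hypothetical common data model shared by all data exploiters."""
    source: str
    severity: str
    message: str


def parse_log_a(line: str) -> CommonEvent:
    # Assumed format A: "ERROR|disk full"
    severity, message = line.split("|", 1)
    return CommonEvent("A", severity.strip().upper(), message.strip())


def parse_log_b(line: str) -> CommonEvent:
    # Assumed format B: "[warn] cache miss rate high"
    match = re.match(r"\[(\w+)\]\s*(.*)", line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return CommonEvent("B", match.group(1).upper(), match.group(2))


PARSERS = {"A": parse_log_a, "B": parse_log_b}


def collect(raw):
    """Turn (source, line) pairs from different producers into common events."""
    return [PARSERS[source](line) for source, line in raw]


events = collect([("A", "ERROR|disk full"), ("B", "[warn] cache miss rate high")])
errors = [e for e in events if e.severity == "ERROR"]
print(errors)
```

Supporting a new product then means adding one parser, while every consumer of `CommonEvent` stays unchanged.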
32
Install/Config Package for Solution Install
• Value:
– One consistent software installation technology across all products
– Consistent and up-to-date configuration and dependency data, key to building self-configuring autonomic systems
– Reduced deployment time with fewer errors
– Reduced software maintenance time, improved analysis of failed system components
– Component-based install for IBM and non-IBM products
Package Structure (from the install package developer)
• Product Files (binaries, etc.)
• Deployment Descriptor
• Install Actions
• Configuration Actions
• Verification Actions
• Dependency Checkers
• Custom Extensions
• GUI Interface
Package description fields
• Meta-Data: Name, UUID, Vendor, Version
• Configuration Properties: Install Input, Runtime Attributes
• Dependencies: HW, SW, OS, Configuration, Extensions
• Install Actions: Extensions
• Verification Actions: Extensions
• Configuration Actions: Extensions
Customer pain point: Difficulty of deployment in complex systems
Standards-based: OGSA, Web Services
Partnering with InstallShield
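To make the role of dependency data concrete, here is a minimal sketch of a dependency checker that compares a package's declared requirements against what is already installed. The `Descriptor` class, the package names, and the dotted-number version scheme are assumptions for illustration; the actual descriptor format is not given in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """Hypothetical, much-reduced deployment descriptor."""
    name: str
    version: str
    dependencies: dict = field(default_factory=dict)  # package name -> minimum version


def version_tuple(version: str) -> tuple:
    # "8.1" -> (8, 1); assumes purely numeric dotted versions.
    return tuple(int(part) for part in version.split("."))


def missing_dependencies(pkg: Descriptor, installed: dict) -> list:
    """List every declared dependency that is absent or too old."""
    problems = []
    for dep, minimum in pkg.dependencies.items():
        have = installed.get(dep)
        if have is None:
            problems.append(f"{dep} is not installed")
        elif version_tuple(have) < version_tuple(minimum):
            problems.append(f"{dep} {have} is older than required {minimum}")
    return problems


app = Descriptor("example-app", "1.0", {"example-db": "8.1", "example-jvm": "1.4"})
report = missing_dependencies(app, installed={"example-db": "7.2"})
print(report)
```

Running such a check before any install action is one way consistent dependency data can reduce failed deployments.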
33
Policy Tools for Policy-based Management
• Value:
– Uniform cross-product policy definition and management infrastructure, needed for delivering system-wide self-management capabilities
– Simplifies management of multiple products; reduced TCO
– Easier to dynamically change configuration in on-demand environment
Customer pain point: Complexity of product and systems management
Standards-based: DMTF, OASIS, OGSA
[Figure labels: Definition; Validation; Local Repository; Distribution (push or pull); Enforcement Points; Resources; Activate; Implement; Adaptation; Monitor; Facts; Analysis]
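The policy path named on this slide (definition, validation, distribution, enforcement) can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the `Policy` fields, the set of known actions, and the push-style `distribute` function do not come from the IBM tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    metric: str   # fact to watch, e.g. "cpu"
    limit: float  # threshold above which the action applies
    action: str   # what the enforcement point should do


KNOWN_ACTIONS = {"throttle", "add_capacity", "alert"}  # assumed action vocabulary


def validate(policy: Policy) -> bool:
    """Reject policies with a non-positive limit or an unknown action."""
    return policy.limit > 0 and policy.action in KNOWN_ACTIONS


class EnforcementPoint:
    """Applies the policies it has received to facts about one resource."""

    def __init__(self, resource: str):
        self.resource = resource
        self.policies = []

    def receive(self, policy: Policy) -> None:
        self.policies.append(policy)

    def enforce(self, facts: dict) -> list:
        return [p.action for p in self.policies if facts.get(p.metric, 0.0) > p.limit]


def distribute(policies, points) -> int:
    """Push every valid policy to every enforcement point; return how many were valid."""
    valid = [p for p in policies if validate(p)]
    for point in points:
        for policy in valid:
            point.receive(policy)
    return len(valid)


points = [EnforcementPoint("web"), EnforcementPoint("db")]
accepted = distribute(
    [Policy("cpu", 0.8, "throttle"), Policy("cpu", 0.5, "reboot")], points
)
print(accepted, points[0].enforce({"cpu": 0.9}))
```

Validating once, before distribution, keeps a malformed policy from reaching any resource.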
34
Technologies for Implementing Autonomic Managers
Value:
• Components to simplify the incorporation of autonomic functions into applications
– Building blocks for self-management
– Monitoring, analysis, planning and execution components
– Including autonomic computing technologies, grid tools, and services
• Pluggable
– Defines interfaces and provides implementations for each major toolkit component
Customer pain point: How to implement end-to-end autonomic solutions
Standards-based: OGSA, W3C
35
Summary of Autonomic Computing Architecture
• Based on a distributed, service-oriented architectural approach, e.g., OGSA
– Every component provides or consumes services
– Policy-based management
• Autonomic elements
– Make every component resilient, robust, self-managing
– Behavior is specified and driven by policies
• Relationships between autonomic elements
– Based on agreements established and maintained by autonomic elements
– Governed by policies
– Give rise to resiliency, robustness, self-management of system
36
Summary