CogMan: Cognitive Network Management Architecture - PhD Thesis Defense -
Sungsu Kim [email protected]. Supervisor: Prof. James Won-Ki Hong. June 27, 2013. Distributed Processing & Network Management Lab., Dept. of Computer Science and Engineering, POSTECH, Korea.
Sungsu Kim, POSTECH PhD Thesis Defense 1/37
CogMan: Cognitive Network Management Architecture
- PhD Thesis Defense -
Sungsu Kim [email protected]
Supervisor: Prof. James Won-Ki Hong
June 27, 2013
Distributed Processing & Network Management Lab.
Dept. of Computer Science and Engineering
POSTECH, Korea
Table of Contents
01 Introduction
• Network management approaches
• Research motivation
• Problems
• Research approach
02 Related Work
• Autonomic control loop
• Human cognition model
03 CogMan
• Conceptual representation of CogMan
• Cognitive control loop
• Reasoning for the reflective loop
04 Validation
• SDN overview
• Failure recovery problems in SDN
• Experiment results
05 Concluding Remarks
• Summary
• Contributions
• Future work
Introduction
Network Management Approaches
Traditional approach
• Managed network to administrator: monitoring data (port up/down state, number of packets in/out, network alarms)
• Administrator to managed network: commands for reconfiguration
Autonomic approach
• Autonomic network management system: Monitor, Analyze, Decision making, Execute, Policy repository
• Managed network to management system: monitoring data (port up/down state, number of packets in/out, network alarms)
• Management system to managed network: commands for reconfiguration
Research Motivation
• Previous studies have discussed various autonomic network management technologies
• Existing autonomic network management technologies depend heavily on policies to fix problems
• Autonomic network management systems are not widely deployed in real networks, and most networks are still managed by human administrators
• In new networking architectures, such as Software Defined Networking (SDN) and OpenFlow networks, network control is centralized, so an autonomic network management approach is appropriate for control and management
Problems in Autonomic Network Management
• Understanding of the current state of the managed network is weak
• Autonomic network management systems cannot solve complex problems
• The response time of autonomic network management systems is slow
Research Approach
Previous research
• Existing autonomic network management systems cannot handle complex problems
• Autonomic network management systems are not deployed in real networks
Proposed method
• Efficient management of complex problems
• An autonomic network management architecture based on the human cognition model
• Validation of the architecture in an SDN network
Related Work
Related Work (1/3)
IBM MAPE [IBM, '03]
• Autonomic manager: Monitor, Analyze, Plan, Execute, built around shared Knowledge
• Sensors and effectors connect the autonomic manager to the managed resources
Related Work (2/3)
FOCALE control loop [Strassner, '07]
• Managed resource: data and events pass through model-based translation
• Analyze data and events, then determine the actual state
• Current state = desired state? (YES / NO); on NO, define new device configuration(s)
• Ontological comparison; reasoning and learning
• Autonomic manager: controls each step of the loop
• Policy manager: policies control application of intelligence
• Context manager
Related Work (3/3)
Human cognition model [Shrobe, '06]
• Three layers: Reflective, Deliberative, Reactive
• Three columns: Perception, Control, Actuation
• The body interacts with the world: sensory image in, actions (posture, locomotion) out
• Other elements in the figure: perceptual goal, motor goal, sensorimotor transformation, conceptual gist, intellectual goals, recall and attention algorithm, emotions, behavioral plan
CogMan: Cognitive Network Management Architecture
Conceptual Representation of CogMan
• Managed resource(s): send vendor-specific data (e.g., Cisco data, Juniper data) and receive vendor-specific commands
• Observe & Normalize: information model mapping turns vendor-specific data into normalized data
• Compare: correlate alarms (e.g., two port down alarms), compare state, and classify problems
• Reasoning: required to solve complex problems
• Act: a set of actions for reconfiguration
• Autonomic manager, with support from the policy manager (policy, business goals) and the user interface
Problem classification
• Reactive loop: a single failure; a backup path is prepared
• Deliberative loop: a single failure; a backup path is prepared; an optimal path is required
• Reflective loop: multiple failures and the backup path has failed
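The three cases listed on this slide can be read as a small decision function. The following is a minimal sketch of that reading, not the thesis implementation: the function name `classify_problem`, its parameters, and the choice to treat either multiple failures or a failed backup path as sufficient for the reflective loop are assumptions made here for illustration.

```python
from enum import Enum


class Loop(Enum):
    REACTIVE = "reactive"
    DELIBERATIVE = "deliberative"
    REFLECTIVE = "reflective"


def classify_problem(num_failures: int, backup_path_ok: bool,
                     optimal_path_required: bool = False) -> Loop:
    """Pick a control loop for a fault (hypothetical helper).

    The parameter names are assumptions; the slide only lists the cases.
    """
    # Multiple failures, or a backup path that has itself failed,
    # cannot be handled by a prepared path: reflective loop (assumed OR).
    if num_failures > 1 or not backup_path_ok:
        return Loop.REFLECTIVE
    # Single failure, backup path prepared, optimal path also required.
    if optimal_path_required:
        return Loop.DELIBERATIVE
    # Single failure with a prepared backup path.
    return Loop.REACTIVE
```

For example, `classify_problem(1, True)` selects the reactive loop, while `classify_problem(2, False)` selects the reflective loop.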
Cognitive Control Loop (1/2)
Original FOCALE control loop + human cognition model
• FOCALE: managed resource(s), Observe, Normalize, Compare, Plan & Decide, Act, with perception, decision making, and actuation
• Human cognition model: Reflective, Deliberative, and Reactive layers across Perception, Control, and Actuation (same figure as in Related Work (3/3))
Cognitive Control Loop (2/2)
• Loop steps around the managed resource(s): Observe, Normalize, Compare, Plan & Decide, Reasoning, Act
• Perception, decision making (Reflective, Deliberative, Reactive), actuation
• Reactive loop: problems can be solved fast
• Deliberative loop: problems defined by policy
• Reflective loop: complex problems; a reasoning algorithm is necessary
Reasoning for the Reflective Loop
• A reasoning algorithm is used to solve complex problems
• Multiple failures cannot be recovered if the backup paths have also failed
• We propose a Fast Flow Setup (FFS) algorithm to recover from multiple failures in SDN networks
− FFS recovers from failures fast even if backup paths have failed
− FFS reduces the load on the SDN controller
Validation: Fault Management in SDN Networks
Software Defined Networking (SDN)
SDN: separation of data and control planes
• Traditional networks: switches combine the management plane, control plane (routing), and data plane
• SDN networks: a controller provides logically centralized control; switches are reached through an API to the data plane (e.g., OpenFlow)
OpenFlow Flow Table Entry
A flow table entry has three parts: matching fields, action, and stats.
Matching fields (+ mask selecting which fields to match)
• L1: switch port
• L2: MAC src, MAC dst, Eth type, VLAN ID
• L3: IP src, IP dst, IP prot
• L4: TCP sport, TCP dport
Action
1. Forward packet to port(s)
2. Encapsulate and forward to controller
3. Drop packet
4. Send to normal processing pipeline
5. Modify fields
Stats
• Packet + byte counters
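To make the three parts of a flow table entry concrete, here is a minimal sketch of a lookup over such entries. It is an illustration only: the class `FlowEntry`, the function `lookup`, the field names such as `ip_dst`, and the action strings are hypothetical, and the first-match rule used here ignores the priorities that a real OpenFlow switch applies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FlowEntry:
    # Matching fields; a field absent from this dict is wildcarded,
    # which plays the role of the mask on the slide.
    match: Dict[str, object]
    # Action, kept as an opaque string in this sketch.
    action: str
    # Stats: packet and byte counters.
    packet_count: int = 0
    byte_count: int = 0

    def matches(self, packet: Dict[str, object]) -> bool:
        return all(packet.get(name) == value
                   for name, value in self.match.items())


def lookup(table: List[FlowEntry], packet: Dict[str, object],
           size: int) -> Optional[str]:
    """Return the action of the first matching entry and update its stats."""
    for entry in table:
        if entry.matches(packet):
            entry.packet_count += 1
            entry.byte_count += size
            return entry.action
    return None  # table miss
```

An entry with an empty `match` dict matches every packet, so placing one at the end of the table gives a catch-all rule such as forwarding to the controller.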
Failure Recovery in SDN Networks
Traditional IP networks
• Distributed routing protocols reroute packets to alternative paths
• Manual reconfiguration
• Path protection (MPLS)
SDN networks
• Protection
− Backup paths
− Fast failure recovery time (less than 50 ms)
• Restoration
− Redirect affected flows one by one
− Failure recovery time is relatively long
Restoration Example
Figure: controller; switches A, B, C, D, E; Host 1 and Host 2; working path and backup path marked
• Switches send port down messages to the controller
• The controller then:
1. Obtains the affected flows (host1 → host2)
2. Finds an alternative path for each flow, e.g., path: <ACED>
3. Sets up the alternative paths
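Step 2 of the restoration example, finding an alternative path once a link is down, can be sketched with a breadth-first search. This is an illustrative sketch rather than the controller's actual routing code: the function name `find_path` and the link list are assumptions (the slide's figure is not recoverable here), chosen so that both <ABD> and <ACED> exist in the topology.

```python
from collections import deque

# Assumed topology, not taken from the figure.
LINKS = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]


def find_path(links, src, dst, failed=frozenset()):
    """Shortest path by hop count that avoids the failed links, or None."""
    down = {frozenset(link) for link in failed}
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in down:
            continue  # skip links reported down
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Restoration repeats this search for every affected flow, one by one, which is one reason its recovery time grows with the number of flows.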
Protection Example
Figure: controller; switches A, B, C, D, E; Host 1 and Host 2; working path and backup path marked
• The controller sets the working and backup paths in advance
1. Switch A detects port down
2. Switch A sends packets to the backup path
Problems in SDN Fault Management
Protection can recover from a failure in 50 ms
• Protection is the best solution for a single failure
Problems of the protection mechanism
• Extra packet exchanges are required during flow setup
• Protection cannot handle multiple failures that affect both working and backup paths
• In practice, providing perfect protection for all links is difficult
Restoration is an appropriate method for multiple failures
• Failure recovery time of restoration is longer than 200 ms
Why Does Restoration Take So Long?
Flow setup example (controller; switches A, B, C, D, E; Host 1 sends a packet with dst: host2)
• The ingress switch asks the controller
• The controller calculates the path between host1 and host2: path = <ABD>
• The controller adds flow entries to A, B, and D
Fast Flow Setup (FFS)
Original flow setup requires many packet exchanges
• We propose a Fast Flow Setup (FFS) algorithm
• FFS implants path information into a flow entry
• This reduces the number of packet exchanges for flow setup

                           Original flow setup   Fast Flow Setup (FFS)
Control packet exchanges   1 + n                 2
Delay                      (1 + n) * t           2 * t
Controller load            high                  low
Tasks of switches          Flow entry setup      IP header inspection, flow entry setup

n = number of switches in a path
t = latency between the controller and a switch
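The two cost formulas in the comparison can be evaluated directly. The sketch below only restates the slide's expressions; the function names `original_setup` and `ffs_setup` are hypothetical, and the values of n and t in the usage note are illustrative, not measurements.

```python
def original_setup(n: int, t: float) -> dict:
    """Original flow setup: 1 + n control packet exchanges, delay (1 + n) * t."""
    return {"exchanges": 1 + n, "delay": (1 + n) * t}


def ffs_setup(n: int, t: float) -> dict:
    """FFS: 2 control packet exchanges and delay 2 * t, independent of n."""
    return {"exchanges": 2, "delay": 2 * t}
```

With n = 3 switches and an assumed t = 5 ms, the original setup needs 4 exchanges and 20 ms, while FFS needs 2 exchanges and 10 ms; the gap widens as the path gets longer because only the original cost depends on n.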
Example of the FFS Algorithm
Figure: controller; switches A, B, C, D, E; Host 1 sends a packet with dst: host2
• The ingress switch asks the controller
1. The controller calculates the path between host1 and host2: path = <ABD>
2. The controller implants the path <BD> into the flow table entry
• Figure labels: flow entry <DB>; packet headers shown along the path: "dst: host2 D B", "dst: host2 D", "dst: host2"
The Proposed SDN Fault Management
Reactive
• Predefined backup path
• Single failure
• Short duration flow
Deliberative
• Predefined backup path
• Single failure
• Long-lived flow
Reflective
• No predefined backup path
• Multiple failures
• FFS algorithm is used for recovery
System Architecture
• CogMan processes: Observe, Normalize, Compare, Plan & Decide, Reasoning, Act
• Functions for actual fault management: port state alarm, port state handler, alarm clustering, affected flow detector, routing, path encoder, flow table modifier, Flow_mod message
Prototype Implementation
Floodlight controller
• Controller core
• Management module: protection, FFS algorithm, CogMan, FOCALE, MAPE
OpenFlow network
• Topology construction
• Fault injection
• Switches S1 to S6; Host 1 ... Host n
Recovery Time (Single Failure)
CogMan (protection) vs. restoration
Number of affected flows = 10
Recovery Time (Multiple Failures)
CogMan (FFS) vs. FOCALE (restoration)
Minimum: recovery time of the first affected flow
Maximum: recovery time of the last affected flow
Packet Exchange Ratio
Packet exchanges between the controller and switches
Figure panels: packet exchange ratio; traffic volume
Number of affected flows = 50
Packet Exchanges for Flow Setup
Number of packet exchanges required to set up a flow (normal vs. protection)
Figure panels: number of packet exchanges; analytic and measured difference
Concluding Remarks
Summary
• Autonomic network management technologies are required to solve complex problems
• An autonomic network management architecture based on the cognition model is proposed
• FFS is proposed for fast recovery from multiple failures
• The algorithm and architecture are validated by conducting experiments in an SDN network
Contributions
• The problems of network management approaches are described
• By applying a human cognition model to the FOCALE control loop, we propose CogMan, which is able to handle complex problems
• A novel failure recovery mechanism, which can be used instead of restoration, is described for fast failure recovery in SDN networks
• A complete monitoring, analysis, and recovery cycle for managing faults in SDN networks is described. Experiments in our testbed show that the proposed methods recover from various failure cases
Future Work
• Validation of the proposed methods in a large-scale testbed
• Combination of protection and FFS for recovery from multiple failures in 50 ms
• Applying CogMan to other management cases
− E.g., Quality of Service (QoS) management of video streaming services
• Feasibility test for replacing the current flow setup algorithm
Thank you for taking the time despite your busy schedules.
Q&A
Publications (1/2)
International Journal/Magazine Papers (2)
• Sungsu Kim, Joon-Myung Kang, Sin-seok Seo, and James Won-Ki Hong, “A Cognitive Model based Approach for Autonomic Fault Management in OpenFlow Networks,” International Journal of Network Management (IJNM), (submitted) (SCIE).
• Taesang Choi, Tae-Ho Lee, Nodir Kodirov, Jaegi Lee, Doyeon Kim, Joon-Myung Kang, Sungsu Kim, John Strassner, and James Won-Ki Hong, “HiMang: Highly Manageable Network and Service Architecture for New Generation,” Journal of Communications and Networks, vol. 13, no. 6, pp. 547-551, Dec. 30, 2011. (SCI)
International Conference/Workshop Papers (9)
• Sungsu Kim, Sin-seok Seo, Joon-Myung Kang, Guy Pujolle, and James Won-Ki Hong, “Autonomic Resource Allocation for Video Streaming Services in Content Delivery Networks,” Global Information Infrastructure and Networking Symposium (GIIS 2012), Chroni, Venezuela, Dec. 2012.
• Sungsu Kim, Sin-seok Seo, Joon-Myung Kang, and James Won-Ki Hong, “Autonomic Fault Management based on Cognitive Control Loops,” 2012 IEEE/IFIP International Workshop on Management of the Future Internet (ManFI 2012), Maui, Hawaii, USA, April 20, 2012, pp. 1104-1110.
• Sungsu Kim, John Strassner, and James Won-Ki Hong, “Semantic Overlay Network for Peer-to-Peer Hybrid Information Search and Retrieval,” 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011), Dublin, Ireland, May 23-27, 2011, pp. 430-437.
• Arum Kwon, Joon-Myung Kang, Sin-seok Seo, Sung-Su Kim, Jae Yoon Chung, John Strassner, and James Won-Ki Hong, “The Design of a Quality of Experience Model for Providing High Quality Multimedia Services,” Lecture Notes in Computer Science, Vol. 6473, Modelling Autonomic Communication Environments, 5th International Workshop on Modelling Autonomic Communication Environments (MACE 2010), Niagara Falls, Canada, Oct. 28, 2010, pp. 24-36.
• Sin-seok Seo, Sung-Su Kim, Nazim Agoulmine, and James Won-Ki Hong, “On Achieving Self-Organization in Mobile WiMAX Network,” the 5th IEEE/IFIP International Workshop on Broadband Convergence Networks (BcN 2010), Osaka, Japan, Apr. 19, 2010, pp. 43-50.
• Sung-Su Kim, Young J. Won, John Strassner, and James Won-Ki Hong, “Manageability of the Internet: Management with New Functionality,” the 12th IEEE/IFIP Network Operations and Management Symposium (NOMS 2010), Osaka, Japan, Apr. 19-23, 2010.
Publications (2/2)
• John Strassner, SungSu Kim, and James Won-Ki Hong, “Using Semantics to Learn About Routing Data for Improved Network Management in the Future Internet,” the 1st IEEE/IFIP International Workshop on Knowledge Management for Future Services and Networks, Osaka, Japan, Apr. 23, 2010.
• John Strassner, Sung-Su Kim, and James Won-Ki Hong, “Semantic Routing for Improved Network Management in the Future Internet,” Recent Trends in Wireless and Mobile Networks (WiMo), 2010.
• Sung-Su Kim, Young J. Won, Mi-Jung Choi, James W. Hong, and John Strassner, “Towards Management of the Future Internet,” IFIP/IEEE Workshop on Management of the Future Internet (in conjunction with IM 2009), New York, USA, June 5, 2009, pp. 1-6.
Domestic Journal / Conference Papers (6)
Appendix
Related Work
• Knowledge Representation
• Control Loop
• Management Architecture
• Autonomic Network Management
Knowledge Representation
Information model
• A representation of concepts and relationships, constraints, rules, and operations to specify data semantics
Data, information, knowledge
• Data: numbers and words without relationships
• Information: numbers and words with relationships
• Knowledge: inferences derived from information

Feature          DEN-ng                    SID           CIM
Patterns         Many more used than SID   4             Not used
Policy model     DEN-ng v6.6.4             DEN-ng v3.5   Simple IETF model
ECA model        YES                       YES           NO
Metadata model   YES                       NO            NO
Policy Continuum
• Business view: John gets a gold service
• Unique ID, subscription, SLA (Gold, Silver, Bronze)
• Network/system view: SRC/DST IP address
• Device view: device configuration (DiffServ, bandwidth configuration)
Model based Translation Layer
• Figure elements: DEN-ng, MBTL, intermediate form, managed resources (Cisco, Juniper, Nortel) reached through CLI and SNMP
• The MBTL exchanges vendor-neutral commands/data on one side and raw vendor data on the other
Example
• Raw data: Trap name: egpNeighborLoss
• Event {Source=IP address; Problem=egp_neighbor_loss}
• Event ev = new Event(); ev.Type; ev.Problem
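The trap example on this slide, a raw egpNeighborLoss trap becoming a vendor-neutral event, can be sketched as a lookup-based normalization step. This is a simplified stand-in and not the MBTL itself, which is model based (DEN-ng); the names `Event`, `TRAP_TO_PROBLEM`, and `normalize` are hypothetical, and the IP address in the usage note is a documentation address chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str   # IP address of the reporting device
    problem: str  # vendor-neutral problem name


# Assumed mapping table; only the pair shown on the slide is included.
TRAP_TO_PROBLEM = {
    "egpNeighborLoss": "egp_neighbor_loss",
}


def normalize(trap_name: str, source_ip: str) -> Event:
    """Translate a raw trap name into a vendor-neutral Event."""
    try:
        problem = TRAP_TO_PROBLEM[trap_name]
    except KeyError:
        raise ValueError(f"no translation known for trap {trap_name!r}")
    return Event(source=source_ip, problem=problem)
```

For instance, `normalize("egpNeighborLoss", "192.0.2.1")` yields an event whose `problem` is `egp_neighbor_loss`, regardless of which vendor's device raised the trap.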
Control Loop (2/3)
OODA loop [Boyd, '95]
• Stages: Observe, Orient, Decide, Act
• Observe: observations, fed by outside information, unfolding circumstances, and unfolding interaction with the environment
• Orient: cultural traditions, genetic heritage, new information, previous experience, analyses & synthesis
• Decide: decision
• Act: action
• Other figure labels: implicit guidance and control (twice), act on hypothesis, act on decision
Hierarchical Management Architecture
Management Architecture
Client-server based architecture
• Centralized management
• Poor scalability
P2P based management architecture [14]
• Highly distributed and scalable
− Load balancing of management tasks
• Overhead for exchanging information between management nodes
Hierarchical management architecture [13]
• Distributed and scalable
• May not be appropriate for dynamic environments, such as virtual networks or cloud computing
− Requires algorithms for structuring management nodes
Related Projects
Comparison with related projects

Feature                           Knowledge Plane [20]   4WARD [21]              AutoI [22]                FAME [24]                 CogMan
Autonomic principle               No                     Yes                     Yes                       Yes                       Yes
Self-organizing                   Yes                    Yes                     Yes                       Yes                       Yes
Knowledge representation          Not defined            Data-oriented mapping   Model-based translation   Model-based translation   Model-based translation
Accommodates heterogeneous data   No                     Limited                 Yes                       Yes                       Yes
Structure of components           No                     Yes                     No                        No                        Yes
Solution Approach (3/4)
Hierarchical management architecture
• Figure: a hierarchy of ANMs above AEMs, with each AEM attached to a managed resource
• Levels labelled in the figure: network domain, network, network device
Detailed Algorithms and Implementation
Alternative Path Setting (1/2)
Calculate alternative paths in advance
• E.g., K-shortest path algorithm
Push the alternative path into the OpenFlow header
• Extract the part where the old path and the alternative path differ
Figure (each switch is annotated with its in and out port numbers):
• Original path: switches 1, 2, 3, 5
• Alternative path: switches 1, 4, 3, 5
• Path difference: switches 1, 4, 3
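One way to read "extract the different part" is to keep the segment of the alternative path between the last switch shared before the paths diverge and the first switch shared after they rejoin. The sketch below implements that reading; it is an interpretation rather than the thesis algorithm, and the function name `path_difference` is hypothetical. It works on switch IDs only and ignores port numbers.

```python
def path_difference(old, new):
    """Segment of `new` from the divergence switch to the rejoin switch."""
    if old == new:
        return []
    shortest = min(len(old), len(new))
    # Length of the common prefix.
    i = 0
    while i < shortest and old[i] == new[i]:
        i += 1
    # Length of the common suffix, not overlapping the prefix.
    j = 0
    while j < shortest - i and old[-1 - j] == new[-1 - j]:
        j += 1
    start = max(i - 1, 0)                       # keep the divergence switch
    end = len(new) - j + 1 if j > 0 else len(new)  # keep the rejoin switch
    return new[start:end]
```

With the switch sequences from the figure, original path 1, 2, 3, 5 and alternative path 1, 4, 3, 5, this returns 1, 4, 3, which matches the path difference shown on the slide.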
Alternative Path Setting (2/2)
Put the path difference into the OpenFlow header
• ofp_action_output, pad field
• The out port numbers of the switches are put into the pad field

struct ofp_action_output {
    uint16_t type;
    uint16_t len;
    uint32_t port;
    uint16_t max_len;
    uint8_t pad[6];
};

Figure: the path difference (switches 1, 4, 3) is encoded in pad[0] to pad[5] as a flag plus entries for Switch 1, Switch 4, and Switch 3 (values shown: 1, 1, NULL, 1, NULL, 2)
Failure Recovery Procedure (1/2)
If a switch cannot send a packet as its action specifies:
1: Check the flow table action again
2: If the port state is down or deleted,
3: examine the ofp_action_output pad field
4: If flag == 1,
5: change the out port of the flow table action to pad[5]
6: Set the alternative path in the IP option field of the packet
Figure: output action path in pad[0] to pad[5] (flag, Switch 1, Switch 4, Switch 3; values shown: 1, 1, NULL, 1, NULL, 2) and the resulting IP options [0] to [5] (flag, Switch 4, Switch 3; values shown: 1, 1, NULL, NULL, NULL, 1)
Failure Recovery Procedure (2/2)
When a switch receives a new flow that its flow table does not know:
1: If flag in IP option == 1,
2: make a new output action as specified in the IP option
3: Add the flow and action to the flow table
4: Delete option[5] and shift right 8 bits
5: If option[5] == NULL,
6: remove the IP option field
Figure: IP options [0] to [5] on arrival (flag, Switch 4, Switch 3; values shown: 1, 1, NULL, NULL, NULL, 1) and the new IP options after processing (flag, Switch 3; values shown: 1, NULL, NULL, NULL, NULL, 1)
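The hop-by-hop consumption of the implanted path can be sketched as follows. This is a simplified model built on assumptions, not the switch implementation: the six-slot layout (slot 0 as flag, the next hop's out port in slot 5, earlier slots holding later hops) is inferred from steps that read and delete entry [5], and the names `encode` and `consume` are hypothetical. Real switches would operate on bytes in the pad field and the IP option field.

```python
NULL = None
SLOTS = 6  # same size as pad[6] in ofp_action_output


def encode(out_ports):
    """Pack out ports (first hop first) so the next hop sits in slot 5."""
    if len(out_ports) > SLOTS - 1:
        raise ValueError("path too long for the available slots")
    padding = [NULL] * (SLOTS - 1 - len(out_ports))
    return [1] + padding + list(reversed(out_ports))  # slot 0 is the flag


def consume(slots):
    """Process one hop: return (out_port, remaining_slots or None)."""
    if slots[0] != 1:
        raise ValueError("flag not set: no implanted path")
    out_port = slots[5]
    # Delete slot 5 and shift the remaining entries right by one slot.
    shifted = [slots[0], NULL] + slots[1:5]
    if shifted[5] is NULL:
        return out_port, None  # nothing left: remove the option field
    return out_port, shifted
```

Starting from `encode([2, 1, 1])`, three successive calls to `consume` yield out ports 2, 1, and 1, and the last call returns `None` for the remaining slots, corresponding to removal of the IP option field at the final switch.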
Evaluation (2/4)
Recovery time of FFS (number of running flows = 100)
[Figure: packet interarrival time (ms) versus packet count; annotation: 20 ms]
Evaluation (4/4)
Recovery time of FFS (number of running flows = 200)
[Figure: packet interarrival time (ms) versus packet count; annotation: 20 ms]
Evaluation
Use Case: Fault Management
Figure: the cognitive control loop (Observe, Normalize, Compare, Plan & Decide, Reasoning, Act) around the managed resource(s), with perception, decision making (Reflective, Deliberative, Reactive), and actuation
• Observe: port down messages, end-to-end connectivity failed message
• Normalize: transform vendor-specific data to vendor-neutral data
• Compare: correlate network alarms with a causality graph (switch port down, switch-to-switch ping failed, QoS alarm)
• Decide:
− Reactive case: backup paths are prepared for the failure
− Deliberative case: redirect to the optimal path (temporary backup path)
− Reflective case: no backup path is prepared