dragonflow austin summit talk
TRANSCRIPT
Eran Gampel, HuaweiLi Ma, AWCloudGal Sagie, Huawei
Highlights • Integral “Big Tent” project in OpenStack• Designed for High Scale, Performance and Low Latency• Lightweight and Simple• Easily Extendable• Distributed SDN Control Plane• Focus on advanced networking services• Distributes Policy Level Abstraction to the Compute Nodes
Neutron-Server
Dragonflow Plugin
DBOVSDragonflow
DBDriver
Compute Node
OVSDragonflow
DBDriver
Compute Node
OVSDragonflowDB
Driver
Compute Node
OVSDragonflowDB
Driver
Compute Node
DB
VM VM..VM VM..
VM VM.. VM VM..
Distributed SDN
“Under The Hood”
Compute Node Compute Node Compute Node Dragonflow
Network DB
OVS
NeutronServer
Redis
OVSDB-Server
ETCD RethinkDBRAMCloud
Kernel Datapath Module
NIC
User Space
Kernel Space
DB DriversRedis ETCD RethinkDBRMC
Future
Dragonflow PluginRoute Core
API SG
vswitchd
Container
VM Dragonflow ControllerAbstraction Layer
L2 App L3 AppDHCP App
FWaaS LBaaS …FIP/DNAT
Pluggable DB Layer
NB D
B Dr
iver
s
SB DB Drivers
SmartNIC OVSDB
ZooKeeper
ETCD
RMC
Redis
OpenFlow
SG App
ZooKeeper
Zookeeper
Pub/Sub DriversRedis ØMQ
Current Release Features (Mitaka)L2 core API, IPv4, IPv6
GRE/VxLAN/STT/Geneve tunneling protocols Distributed L3 Virtual RouterDistributed DHCP Pluggable Distributed Database
ETCD, RethinkDB, RAMCloud, Redis, ZooKeeperPluggable Publish-Subscribe
ØMQ, RedisSecurity Groups
OVS Flows leveraging connection tracking integrationDistributed DNATSelective Proactive Distribution
Tenant Based
Pluggable Database FrameworkRequirements
HA + ScalabilityDifferent Environments have different requirements
Performance, Latency, Scalability, etc.
Why Pluggable?Long time to productizeMature Open Source alternativesAllow us to focus on the networking services only
Distributed DB
DB Data 3DB Data 2DB Data 1
Full Distribution
Compute Node 1
DragonflowLocal Cache
OVS
Compute Node NDragonflow
OVS
Local Cache
Dragonflow DB DriversRedis ETCD ZookeeperRMC
DB Data 3DB Data 2DB Data 1
DB Data 3DB Data 2DB Data 1
DistributedDatabase
DB Data 3DB Data 2
DB Data 1
Selective Proactive Distribution
Compute Node 1
DragonflowLocal Cache
OVS
DB Data 1
Compute Node NDragonflow
OVS
Local Cache
DB Data 3DB Data 2
Dragonflow DB DriversRedis ETCD ZookeeperRMC
Selective Proactive Distribution
Compute Node 1
DragonflowLocal Cache
OVS
Net1 – VM1, VM2
Compute Node 2Dragonflow
OVS
Local CacheNet2 – VM3, VM4
VM1 VM2 VM3 VM4
Distributed DB
Net2 – VM3, VM4Net1 – VM1, VM2
Compute Node
DragonflowLocal Controller
SubscriberRedis ØMQ
Compute Node
DragonflowLocal Controller
SubscriberRedis ØMQ
Compute Node
DragonflowLocal Controller
SubscriberRedis ØMQ
Compute Node
DragonflowLocal Controller
SubscriberRedis ØMQ
Neutron ServerDragonflow
PluginPublisher
Redis ØMQ
Neutron ServerDragonflow
PluginPublisher
Redis ØMQ
Neutron ServerDragonflow
PluginPublisher
Redis ØMQ. . .
DB
. . .
Pluggable Pub/Sub
Example Distributed DHCP
Page 11
Network Node
DHCP namespace
DHCP namespace
DHCP namespace
DHCP namespace
Neutron DHCP Implementation
DHCP namespace
dnsmasq
DHCPAgent
Neutron Server
Message QueueExample• 100 Tenants• 3 vNet / tenant= 300 DHCP Servers
1 VM Send DHCP_DISCOVER
2 Classify Flow as DHCP, Forward to Controller
3 DHCP App sends DHCP_OFFER back to VM
4 VM Send DHCP_REQUEST
5 Classify Flow as DHCP, Forward to Controller
6 DHCP App populates DHCP_OPTIONS from DB/CFG and send DHCP_ACK
Dragonflow Distributed DHCP
DHCP DISCOVER
VM DHCP SERVER
DHCP OFFER DHCPREQUEST
DHCPACK
13
46
7
Compute Node
Dragonflow
VM
OVS
VM
1 2
br-intqvoXXX qvoXXX
OpenFlow
14
25
7
Dragonflow ControllerAbstraction Layer
L2App
L3App
DHCPApp SG
36
Pluggable DB Layer
DistributedDB
This haseverythingin it
Is Dragonflow Ready?AWCloud Point of View
Who is AWcloud?A pure OpenStack player in ChinaStarted in 2012Intel Capital-backed StartupBroad deployment of production clouds in ChinaLarge-scale infra operations backgroundHighly diverse set of workloads and use cases
Case Study: a typical large-scale deployment• A public cloud for local enterprises• Cooperated with Gaoxin Yiyun, Dell and Intel• Located in Guizhou Province
• The heart of big data industry in China• 2500+ Physical Servers deployed in the data center
• So far, 500+ physical servers have been virtualized by AWcloud OpenStack distribution.
Case Study: a typical large-scale deployment
Dispatch Network Policy to Compute NodesRequirements:
ScalabilityReliability
Currently, we use Neutron OVS plugin…but as workloads increase…
Limitations in Large-scale deployments- Messaging- Persistent HA DB to store network policy
Scale-OUT
Scalability in Persistent StorageRDBMS used in OpenStack
• The semantic is too strong for scale-out applications• Critical performance loss due to semantics
We Believe• Centralized clustering cannot practically scale-out in data center size
We Need• A distributed data storage system which is…
• Optimized for READ• Reached a CONSISTENT state for the whole system• Always HIGH AVAILABILITY• Able to work properly under network PARTITION
Necessary
Scalability in Persistent Storage
We prefer BASE systems for data backends• Basically Available• Soft-state• Eventual consistent
? Is there any open source solution that can meet our requirements?
Scalable Persistent Storage in Dragonflow
• A pluggable Key-Value Interface Layer• Supported Solutions
• ETCD• RAMCloud• ZooKeeper• Redis Is it enough?
Scalable and reliable?
DB Consistency: Common Problem to all SDN Solutions
SDN ControllerNorth-bound Interface (REST?)
South-bound Interface (Openflow)
SDN Apps
SDN DB
NeutronDB
Neutron-serverML2-Core-Plugin
ML2.Drivers.Mechanism.XXX
Services-PluginService
Network
Neutron API Nova API
CLI / Dashboard (Horizon) / Orchestration Tool (Heat)
Switch
Nova
Nova Compute
VM VM
Nova Compute
VM VM
Virtual Switch Virtual SwitchNeutron
Plugin AgentNeutron
Plugin Agent
Message Queue (AMQP)
Neutron-L3-Agent
Neutron-DHCP-Agent
Load
Bal
ance
r
Fire
wall
VPN
L3 S
ervic
es
Topo
logy
Mgr
.
Ove
rlay
Mgr
.
Secu
rity
Vendor-specific API
DB Consistency: Common Problem to all SDN Solutions
Neutron DB
Relational Database
ACID system
Stores the whole virtualized network topology for OpenStack
Dragonflow DB
Key-value Store
BASE system
Stores the ‘partial’ virtualized network topology used in Dragonflow
DB Consistency: Common Problem to all SDN Solution • Problem 1
• Neutron DB transaction is committed, but the related operations on Dragonflow DB have failed.
• Problem 2• Concurrent APIs cause multiple transactions on a given Neutron object. Neutron
DB can deal with it very well due to its ACID nature. How about Dragonflow DB?
• Problem 3• Nested transactions can be done in Neutron DB. How about Dragonflow DB?
• Problem N…
Some thoughts on DB Consistency• Remove Neutron DB
• Complicated Solution when involving ML2• Cannot be done in a short period of time
• Introduce the pluggable key-value store into Neutron• How to work with SQLAlchemy?• Need much more time on evaluation and deep
discussion.
• Any other simple and straightforward solutions?
DB Consistency in Dragonflow
• Introduce a distributed lock for coordination• Guarantee the atomicity of a given API• Implemented in the Neutron core plugin layer• Project-based lock allows concurrency
DragonflowNorth-bound Interface
South-bound Interface (Openflow)
SDN Apps
DF DB
NeutronDB
Neutron-serverCore Plugin
Dragonflow Neutron Plugin
Neutron API
CLI / Dashboard (Horizon) / Orchestration Tool (Heat)
Topo
logy
Mgr
.
Ove
rlay
Mgr
.
Secu
rity
Obtain distributed lock
Dragonflow NB API
DB Consistency in Dragonflow
• Introduce an object synchronization mechanism• All the objects stored in both databases are versioned.• Take advantage of CAS operations of the Dragonflow DB.• Sync the object when something unexpected happens.
SDN DBNeutronDB
Network_ID Name Status MTU VLAN Availability Zone Subnets
Object_ID = Network_ID Version = 5
Read
Notify
compare & swap <- VersionCompute Node Compute Node Compute Node
Dragonflow Local Controller
SubscribervSwitch Flush Flows
Roadmap
OpenStack Networking Challenges
• Scalability• Networking does not scale (<500 compute nodes)
• Performance• Networking performance is low (namespace overhead, huge
control plane overhead, …)• Operability
• Reference implementation has lots of maintenance problems (e.g. thousands of concurrent DHCP servers, namespaces, etc.)
Dragonflow ScalabilityScale(# Compute
Nodes)
today
10,000
2,500
Time to Market
n+2
1,000
Dragonflow& Redis
Dragonflow &RAMCloud &
ØMQOptimized Hybrid
(Reactive/Proactive)Dragonflow4,000
n+1soon
RoadmapAdditional DB Drivers ZooKeeper, Redis…Selective Proactive DB Pluggable Pub/Sub Mechanism DB Consistency Distributed DNATSecurity GroupHierarchical Port Binding (SDN ToR) move to ML2Containers (Kuryr plugin and nested VM support)Topology Service Injection / Service ChainingInter Cloud Connectivity (Border Gateway / L2GW)Optimize Scale and Performance
Rack
ToR
Overlay Virtual Topology
UnderlayHardware
DB
CoreRouterODL
NeutronServer ML2 Mechanism Drivers
ODLDragonflow
Hierarchical Network Management
Dragonflow and Containers
Network Services…
My Cloud
NATIPS
FW
DHCP
LBL3
L2 VPN
Topology Based Service InjectionCompute Node
OVS
VM 1 VM 2
Table 0 Table 1 Table N…
ExternalApp
External App
Table
OpenFlow / Other API
Service Injection Hooks
Newton Release New Applications IGMP Application Distributed Load Balancing (East/West) Brute Force prevention DNS service Distributed Metadata proxy Port Fault Detection
Ride the Dragon!
• Documentation • https://wiki.openstack.org/wiki/Dragonflow
• Bugs & blueprints • https://launchpad.net/dragonflow
• DF IRC channel • #openstack-dragonflow• Weekly on Monday at 0900 UTC in #openstack-meeting-4
(IRC)
Welcome to join discussion!
• Dragonflow Work Sessions• April 27th, 9:50am, Hilton Austin MR414 – Testing & CI• April 27th, 11:00am, Hilton Austin MR414 – DB Consistency• April 28th, 11:20am, Hilton Austin MR406 – Roadmap• April 28th, 1:30pm, Hilton Austin MR414 – PUB/SUB• April 28th, 2:20pm, Hilton Austin MR414 – ML2 & Topology
Injection