Achieving High Utilization with Software-Driven WAN

Chi-Yao Hong (UIUC), Srikanth Kandula, Ratul Mahajan, Ming Zhang, Vijay Gill, Mohan Nanduri, Roger Wattenhofer (ETH)
Microsoft

Abstract: We present SWAN, a system that boosts the utilization of inter-datacenter networks by centrally controlling when and how much traffic each service sends and frequently re-configuring the network’s data plane to match current traffic demand. But done simplistically, these re-configurations can also cause severe, transient congestion because different switches may apply updates at different times. We develop a novel technique that leverages a small amount of scratch capacity on links to apply updates in a provably congestion-free manner, without making any assumptions about the order and timing of updates at individual switches. Further, to scale to large networks in the face of limited forwarding table capacity, SWAN greedily selects a small set of entries that can best satisfy current demand. It updates this set without disrupting traffic by leveraging a small amount of scratch capacity in forwarding tables. Experiments using a testbed prototype and data-driven simulations of two production networks show that SWAN carries 60% more traffic than the current practice.

Categories and Subject Descriptors: C.2.1 [Computer-Communication Networks]: Network Architecture and Design

Keywords: Inter-DC WAN; software-defined networking

1. INTRODUCTION

The wide area network (WAN) that connects the datacenters (DC) is critical infrastructure for providers of online services such as Amazon, Google, and Microsoft. Many services rely on low-latency inter-DC communication for good user experience and on high-throughput transfers for reliability (e.g., when replicating updates).
Given the need for high capacity (inter-DC traffic is a significant fraction of Internet traffic and rapidly growing [20]) and unique traffic characteristics, the inter-DC WAN is often a dedicated network, distinct from the WAN that connects with ISPs to reach end users [15]. It is an expensive resource, with amortized annual cost of 100s of millions of dollars, as it provides 100s of Gbps to Tbps of capacity over long distances.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. SIGCOMM’13, August 12–16, 2013, Hong Kong, China. Copyright 2013 ACM 978-1-4503-2056-6/13/08 ...$15.00.

However, providers are unable to fully leverage this investment today. Inter-DC WANs have extremely poor efficiency; the average utilization of even the busier links is 40-60%. One culprit is the lack of coordination among the services that use the network. Barring coarse, static limits in some cases, services send traffic whenever they want and however much they want. As a result, the network cycles through periods of peaks and troughs. Since it must be provisioned for peak usage to avoid congestion, the network is under-subscribed on average. Observe that network usage does not have to be this way if we can exploit the characteristics of inter-DC traffic. Some inter-DC services are delay-tolerant. We can tamp the cyclical behavior if such traffic is sent when the demand from other traffic is low.
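The effect of shifting delay-tolerant traffic into troughs can be illustrated with a small calculation. The sketch below is only an illustration under assumed numbers: the hourly demand values, the function names, and the water-filling placement rule are all hypothetical and are not taken from the paper. It compares the capacity needed (the peak load) and the resulting average utilization when delay-tolerant traffic follows the same cycle as other traffic versus when it is placed into the quietest slots.

```python
def peak_and_utilization(load):
    """Return (peak load, average utilization if capacity equals the peak)."""
    peak = max(load)
    return peak, sum(load) / (len(load) * peak)


def fill_troughs(interactive, deferrable_total):
    """Place deferrable volume into the slots with the lowest interactive load.

    Uses bisection to find the level up to which the quietest slots are
    raised so that the added volume equals deferrable_total.
    """
    lo, hi = min(interactive), max(interactive) + deferrable_total
    for _ in range(60):
        level = (lo + hi) / 2
        absorbed = sum(max(0.0, level - x) for x in interactive)
        if absorbed < deferrable_total:
            lo = level
        else:
            hi = level
    return [max(x, hi) for x in interactive]


# Hypothetical per-slot demand (arbitrary units), not measured data.
interactive = [2, 2, 4, 8, 10, 8, 4, 2]   # cannot be delayed
background = [1, 1, 2, 4, 5, 4, 2, 1]     # delay-tolerant, same daily cycle

# Uncoordinated: delay-tolerant traffic is sent whenever it arrives.
uncoordinated = [i + b for i, b in zip(interactive, background)]
# Coordinated: the same total volume is sent when other demand is low.
coordinated = fill_troughs(interactive, sum(background))

print(peak_and_utilization(uncoordinated))
print(peak_and_utilization(coordinated))
```

With these assumed numbers, both schedules carry the same total volume, but the coordinated one needs a lower peak capacity and therefore runs at a higher average utilization.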
This coordination will boost average utilization and enable the network to either carry more traffic with the same capacity or use less capacity to carry the same traffic.¹

Another culprit behind poor efficiency is the distributed resource allocation model of today, typically implemented using MPLS TE (Multiprotocol Label Switching Traffic Engineering) [4, 24]. In this model, no entity has a global view and ingress routers greedily select paths for their traffic. As a result, the network can get stuck in locally optimal routing patterns that are globally suboptimal [27].

We present SWAN (Software-driven WAN), a system that enables inter-DC WANs to carry significantly more traffic. By itself, carrying more traffic is straightforward: we can let loose bandwidth-hungry services. SWAN achieves high efficiency while meeting policy goals such as preferential treatment for higher-priority services and fairness among similar services. Per observations above, its two key aspects are i) globally coordinating the sending rates of services; and ii) centrally allocating network paths. Based on current service demands and network topology, SWAN decides how much traffic each service can send and configures the network’s data plane to carry that traffic.

Maintaining high utilization requires frequent updates to the network’s data plane, as traffic demand or network topology changes. A key challenge is to implement these updates without causing transient congestion that can hurt latency-sensitive traffic. The underlying problem is that the updates are not atomic as they require changes to multiple switches. Even if the before and after states are not congested, congestion can occur during updates if traffic that a link is supposed to carry after the update arrives before the traffic that is supposed to leave has left. The extent and duration of such congestion is worse when the network is busier and has larger RTTs (which lead to greater temporal disparity in the application of updates).
Both these conditions hold

¹ In some networks, fault tolerance is another reason for low utilization; the network is provisioned such that there is ample capacity even after (common) failures. However, in inter-DC WANs, traffic that needs strong protection is a small subset of the overall traffic, and existing technologies can tag and protect such traffic in the face of failures (§2).
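The transient congestion described in the introduction can be made concrete with a toy example. The sketch below is an illustration under assumed values, not SWAN's algorithm: the two flows, the link names, the capacity, and the hand-picked intermediate configuration are all hypothetical. It relies on one observation: while an update is in flight, each flow may independently still follow its old allocation or already follow its new one, so the worst load a link can see is the sum, over flows, of the larger of the flow's old and new rate on that link.

```python
CAPACITY = 10  # hypothetical capacity of every link, arbitrary units


def worst_case_load(old, new):
    """Highest load any link can see while flows switch from `old` to `new`
    in an arbitrary order and with arbitrary timing."""
    links = {link for cfg in (old, new) for alloc in cfg.values() for link in alloc}
    return max(
        sum(max(old[f].get(link, 0), new[f].get(link, 0)) for f in old)
        for link in links
    )


# Hypothetical configurations: flow -> {link: rate}. Flows A and B swap links.
before = {"A": {"link1": 6}, "B": {"link2": 6}}
after = {"A": {"link2": 6}, "B": {"link1": 6}}

# Hand-picked intermediate step in which each flow is split across both links.
halfway = {"A": {"link1": 3, "link2": 3}, "B": {"link1": 3, "link2": 3}}

# One-shot update: a link may briefly carry both flows at full rate.
direct = worst_case_load(before, after)

# Two-step update through the intermediate configuration.
staged = max(worst_case_load(before, halfway), worst_case_load(halfway, after))

print("one-shot worst case:", direct, "capacity:", CAPACITY)
print("staged worst case:  ", staged, "capacity:", CAPACITY)
```

In this toy setting the before and after states each load every link to 6 of 10 units, yet the one-shot update can momentarily exceed capacity. The staged update stays within capacity for any update order because each link has 4 units of headroom, which plays the role of the scratch capacity mentioned in the abstract.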
