Post on 21-Dec-2015
5/3/05 CS118/Spring05
Plan Ahead
5th week: Congestion control, TCP delay modeling; network protocols: IPv4, IPv6
6th week: Network routing, routing in the Internet
7th week: Midterm; broadcast and multicast routing
Before final: Data link layer, Ethernet, switches, wireless networking
Congestion Control
Congestion: “too many sources sending too much data too fast for network to handle”
Scenario 1: 2 identical senders, 2 receivers, one router w/ infinite buffer, no retransmission
  When congested: large delays; maximum achievable throughput
Congestion: scenario 2
One router, finite buffers; senders retransmit when timeout
[Figure: three plots of λout versus λin, axes running up to R/2; one plot also marks R/4]
  λin = λout
  Retransmission of delayed (not lost) packet makes λ'in much larger than λout
  Data losses lead to λ'in > λout
Congestion: scenario 3
Q: what happens as λin and λ'in increase?
[Figure: Host A and Host B send through routers with finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data; λout]
• Long delays
• Superfluous retransmissions
• When a packet is dropped, any “upstream transmission capacity” used for that packet was wasted!
Approaches towards congestion control
Network-assisted congestion control: routers provide feedback to end hosts
  Single bit congestion indication
  Explicit rate sender should send at
End-end congestion control: no explicit feedback from network
  Congestion inferred from end-system observed loss, delay
  Approach taken by TCP
TCP Congestion Control
Add a “congestion control window” CongWin on top of the flow-control window
Sender limits: LastByteSent - LastByteAcked ≤ CongWin
How to adjust CongWin:
  CongWin initialized to 1 MSS, increase quickly until loss (= congestion)
  Upon loss: decrease CongWin, then begin probing (increasing) again
  Two “phases”: (1) slow start, (2) congestion avoidance
  • threshold defines the boundary between the two
How the sender infers congestion: timeout, or 3 duplicate ACKs
[Figure: CongWin drawn next to the receiver window (recvwin)]
Basic idea: learn from observations
When CongWin < threshold, increase CongWin exponentially
When CongWin ≥ threshold, increase CongWin linearly
If packet lost, have gone too far: threshold = CongWin / 2
  If 3 dup. ACKs: network capable of delivering some packets, CongWin cut in half
  If timeout: slow-start again (CongWin = 1 MSS)
Additive Increase, Multiplicative Decrease (AIMD)
TCP SlowStart & Congestion Avoidance
initialize:
    Congwin = 1
    threshold = RcvWindow

/* slow start */
if (Congwin < threshold) {
    for every segment ACKed: Congwin++
} until (loss event)

/* slowstart is over */
{
    for every w segments ACKed: Congwin++
} until (loss event)

/* loss detected */
threshold = Congwin / 2
if (3 dup. ACKs) Congwin = threshold
else Congwin = 1 mss

[Figure: slow-start timeline; one segment is sent in the first RTT, two segments in the next, then four segments]
TCP sender congestion control
State | Event | TCP Sender Action | Commentary
Slow Start (SS) | Received ACK for previously unacked data | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to “Congestion Avoidance” | Resulting in a doubling of CongWin every RTT
Congestion Avoidance (CA) | Received ACK for previously unacked data | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
SS or CA | Loss event detected by 3 duplicate ACKs | Threshold = CongWin/2, CongWin = Threshold, set state to “Congestion Avoidance” | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
SS or CA | Timeout | Threshold = CongWin/2, CongWin = 1 MSS, set state to “Slow Start” | Enter slow start
SS or CA | Duplicate ACK | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
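The state table above can be exercised with a small toy model. This is only a sketch under stated assumptions: windows are measured in units of MSS (so MSS = 1.0), the class name `TcpSender` and its method names are made up for illustration, and real TCP details such as retransmission and timers are left out.

```python
MSS = 1.0  # assumption: window sizes are counted in units of one MSS


class TcpSender:
    """Toy model of the sender state table (hypothetical names, not a real API)."""

    def __init__(self, threshold=8.0):
        self.congwin = MSS          # CongWin starts at 1 MSS
        self.threshold = threshold
        self.state = "SS"           # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.congwin += MSS     # doubles CongWin every RTT
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += MSS * (MSS / self.congwin)  # about +1 MSS per RTT

    def on_dup_ack(self):
        """Duplicate ACK: only count it, until the third one signals a loss."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"
            self.dup_acks = 0

    def on_timeout(self):
        self.threshold = self.congwin / 2
        self.congwin = MSS
        self.state = "SS"
        self.dup_acks = 0
```

For example, three new ACKs in slow start take CongWin from 1 to 4 MSS; three duplicate ACKs then halve it to 2 MSS and move the sender to congestion avoidance.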
Is TCP fair?
Fairness: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity
Example: 2 competing connections, same RTT
  Additive increase gives slope of 1
  Multiplicative decrease decreases throughput proportionally
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R. Plot of Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the “equal bandwidth share” line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]
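Why additive increase plus multiplicative decrease drifts toward the equal-share line can be seen in a few lines of simulation. This is a simplified sketch, assuming both connections see every loss at the same time and grow by one unit per round; the function name `aimd_two_flows` and the numbers in the usage note are illustrative only.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Simulate two synchronized AIMD flows sharing one bottleneck.

    Each round: if the combined rate exceeds the capacity, both flows halve
    (multiplicative decrease); otherwise both add 1 (additive increase).
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # halving also halves the gap between flows
        else:
            x1, x2 = x1 + 1, x2 + 1   # equal increase leaves the gap unchanged
    return x1, x2
```

Starting from an unfair split such as `aimd_two_flows(1, 50, 100, 500)`, the gap between the two rates shrinks at every loss event, so the rates end up nearly equal.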
Fairness (more)
Fairness and UDP
  Multimedia apps often do not use TCP
    do not want rate throttled by congestion control
  Instead use UDP:
    pump audio/video at constant rate, tolerate packet loss
  Research area: TCP friendly
Fairness and parallel TCP connections
  Nothing prevents app from opening parallel connections between 2 hosts
  Web browsers do this
  Example: link of rate R supporting 9 connections
    new app asks for 1 TCP, gets rate R/10
    new app asks for 11 TCPs, gets R/2!
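The arithmetic behind the parallel-connection example can be checked directly. A minimal sketch, assuming every TCP connection on the link gets an equal share of R; the helper name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing_connections, app_connections):
    """Fraction of the link rate R that an app obtains with its connections,
    assuming each TCP connection on the link gets an equal share."""
    total = existing_connections + app_connections
    return Fraction(app_connections, total)
```

With 9 existing connections, `app_share(9, 1)` is 1/10 and `app_share(9, 11)` is 11/20, which is the "R/2" of the slide, rounded.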
Delay modeling
Q: How long does it take to receive an object from a Web server after sending a request?
Ignoring congestion, delay is influenced by:
  TCP connection establishment
  data transmission delay
  slow start
Assumptions:
  Assume one link between client and server of rate R
  No retransmissions (no loss, no corruption)
Window size:
  First assume: fixed congestion window, W segments
  Then dynamic window, modeling slow start
Fixed congestion window (1)
Notations:
  S: # bits in one segment
  O: # bits in one object
  R: bandwidth
  W: window size (# segments)
  K: O/(WS), the number of windows that cover the object
  Q: # times server idles if O = ∞
  P = min(Q, K-1)
First case: WS/R > S/R + RTT
  ACK for first segment in window returns before window’s worth of data sent
  delay = 2RTT + O/R
Fixed congestion window (2)
Second case: WS/R < RTT + S/R
  Server waits for ACK after sending window’s worth of data
  delay = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
  The last term is the server's waiting time
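Both fixed-window cases fit in one small function. This is a sketch under assumptions: K is rounded up to a whole number of windows, all quantities use consistent units (bits, bits per second, seconds), and the function name `fixed_window_delay` is made up for illustration.

```python
import math


def fixed_window_delay(O, S, R, W, RTT):
    """Delay to fetch an O-bit object with a fixed window of W segments.

    O: object size in bits, S: segment size in bits, R: link rate in bits/s,
    W: window size in segments, RTT: round-trip time in seconds.
    """
    K = math.ceil(O / (W * S))          # number of windows covering the object
    stall = S / R + RTT - W * S / R     # idle time after each window, if positive
    if stall <= 0:
        # First case: the ACK returns before the window is exhausted.
        return 2 * RTT + O / R
    # Second case: the server waits after each of the first K-1 windows.
    return 2 * RTT + O / R + (K - 1) * stall
```

For instance, with O = 8000, S = 1000, R = 1000 and RTT = 2, a window of W = 4 gives 12 seconds (first case), while W = 2 gives 15 seconds (second case, three stalls of 1 second each).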
TCP Delay Modeling: Slow Start (1)
[Figure: timeline with time at client and time at server. The client initiates the TCP connection and requests the object (one RTT each); the server sends a first window = S/R, second window = 2S/R, third window = 4S/R, fourth window = 8S/R, until transmission is complete and the object is delivered]
Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K-1, Q} = 2
Server idles P = 2 times
Delay components:
• 2 RTTs for connection establishment and request
• O/R to transmit object
• Server's idle time: server idles P = min{K-1, Q} times
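TCP Delay Modeling: Slow Start (2)
Now suppose window grows according to slow start
The delay for one object is:
\[
\begin{aligned}
\text{delay} &= 2\,RTT + \frac{O}{R} + \sum_{p=1}^{P} \text{idleTime}_p \\
&= 2\,RTT + \frac{O}{R} + \sum_{k=1}^{P} \left[ \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \right] \\
&= 2\,RTT + \frac{O}{R} + P\left[ RTT + \frac{S}{R} \right] - (2^{P} - 1)\frac{S}{R}
\end{aligned}
\]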
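The slow-start delay can be computed by adding up the idle times window by window and compared with the closed form. A sketch under assumptions: the window doubles each round starting from one segment, K and P are found by simple loops rather than logarithms, and the name `slow_start_delay` is hypothetical.

```python
def slow_start_delay(O, S, R, RTT):
    """Return (delay, K, P) for fetching an O-bit object under slow start.

    O: object size in bits, S: segment size in bits,
    R: link rate in bits/s, RTT: round-trip time in seconds.
    """
    # K: number of windows (1, 2, 4, ... segments) needed to cover the object.
    K, covered = 0, 0
    while covered < O:
        covered += (2 ** K) * S
        K += 1

    # The server can idle only after windows 1..K-1, and only while the
    # window is still too small to keep the link busy for S/R + RTT.
    idle_total, P = 0.0, 0
    for k in range(1, K):
        stall = S / R + RTT - (2 ** (k - 1)) * S / R
        if stall <= 0:
            break
        idle_total += stall
        P += 1

    return 2 * RTT + O / R + idle_total, K, P
```

With O = 15000, S = 1000, R = 1000 and RTT = 2 this reproduces the example with O/S = 15: K = 4, P = 2, and a delay of 22 seconds, which equals 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R.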
HTTP Modeling
Assume Web page consists of:
  1 base HTML page (of size O bits)
  M images (each of size O bits)
Non-persistent HTTP:
  M+1 TCP connections in series
  Response time = (M+1)O/R + (M+1)2RTT + sum of idle times
Persistent HTTP:
  2 RTT to request and receive base HTML file
  1 RTT to request and receive M images
  Response time = (M+1)O/R + 3RTT + sum of idle times
Non-persistent HTTP with X parallel connections:
  Suppose M/X integer
  1 TCP connection for base file
  M/X sets of parallel connections for images
  Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
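The three response-time formulas are easy to compare numerically. A minimal sketch, assuming the sum of idle times is supplied by the caller (default 0, which ignores slow-start stalls) and using a made-up function name `http_response_times`.

```python
def http_response_times(O, R, RTT, M, X, idle=0.0):
    """Response times (seconds) for the three HTTP variants.

    O: object size in bits, R: link rate in bits/s, RTT: seconds,
    M: number of images, X: number of parallel connections,
    idle: sum of idle times (assumed given; 0 ignores slow-start stalls).
    """
    if M % X != 0:
        raise ValueError("the parallel formula assumes M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non-persistent": transmit + (M + 1) * 2 * RTT + idle,
        "persistent": transmit + 3 * RTT + idle,
        "parallel non-persistent": transmit + (M // X + 1) * 2 * RTT + idle,
    }
```

For O = 5 Kbytes (40000 bits), M = 10, X = 5 and RTT = 100 msec on a 1 Mbps link, the transmission term is 0.44 s for all three variants, and the RTT terms add 2.2 s, 0.3 s and 0.6 s respectively.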
[Figure: HTTP response time (in seconds, axis 0 to 20) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP; RTT = 100 msec, O = 5 Kbytes, M = 10 and X = 5]
For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.
[Figure: HTTP response time (in seconds, axis 0 to 70) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP; RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5]
For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement, particularly in networks with a high delay-bandwidth product.
Network layer
Transport segment from sending to receiving host
  Source host: encapsulates segments into packets
  Destination host: delivers segments to transport layer
Network layer protocols in every host and router
Each router examines header fields in all packets passing through it
  Routing: calculate the best path to each destination
  Forwarding: move packets from input to output
[Figure: source S sends a segment, wrapped in a network protocol header, through three routers (R) to destination D, which hands the segment to the transport protocol]
Makeup lectures on Monday June 6
There will be no class on Thursday June 9
To make it up: 8-9:50am Boelter 5422, or 6-7:50pm Boelter 5419
  Pick the lesser evil
Additional office hours on the final exam day:
  Saturday June 11: 10:00AM - 1:00PM
  And the Final exam is: 3:00 - 6:00PM
Always keep the big picture in mind
[Figure: protocol stacks along a path host, router, router, host. Each host runs HTTP, TCP, and IP over an Ethernet interface; each router runs IP over an Ethernet interface and a SONET interface. An HTTP message is carried in a TCP segment, which is carried in an IP packet on every hop]
Network layer: Connection vs. connection-less service
Virtual Circuit network provides connection-oriented service
  source-to-dest path works in a way much like telephone circuit
Datagram network provides connectionless service
The two services are analogous to TCP vs. UDP at transport-layer, but:
  Network delivery service: host-to-host
  No choice: a given network provides one or the other but not both (unlike the transport layer, which offers both)
Virtual circuit Network
Use a signaling protocol to set up connection before data can flow
Every router on source-dest path maintains “state” for each passing connection
  link, router resources (bandwidth, buffers) allocated to each VC
Each packet carries VC identifier (not destination host address)
VC number must be changed on each link
  New VC number comes from forwarding table
[Figure: two hosts, each with application, transport, network, data link, and physical layers, exchange signaling: 1. Initiate call, 2. Incoming call, 3. Accept call, 4. Call connected, 5. Data flow begins, 6. Receive data]
Forwarding table
[Figure: a router with interface numbers 1, 2, 3; VC numbers 12, 22, 32 label successive links of one circuit]
Forwarding table in northwest router:
Incoming interface | Incoming VC # | Outgoing interface | Outgoing VC #
1 | 12 | 2 | 22
2 | 63 | 1 | 18
3 | 7 | 2 | 17
1 | 97 | 3 | 87
… | … | … | …
Routers maintain connection state information!
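The VC forwarding table is a plain lookup keyed on the incoming interface and VC number. A minimal sketch using the four rows shown on the slide; the names `VC_TABLE` and `forward` are illustrative.

```python
# Forwarding table of the northwest router, as listed on the slide:
# (incoming interface, incoming VC #) -> (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (2, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward(in_interface, in_vc):
    """Return (outgoing interface, new VC #), or None if no circuit is set up."""
    return VC_TABLE.get((in_interface, in_vc))
```

A packet arriving on interface 1 with VC number 12 leaves on interface 2 relabeled with VC number 22; a packet on an unknown circuit has no entry, which is why the connection must be set up first.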
Internet: A Datagram Network
[Figure: host H1 reaches host H8 through routers R1, R2, and R3 over ETH, FDDI, and WLAN links; every host and router runs IP on top of its link layers]
Hosts are connected to subnets
Subnets are interconnected by IP routers
All hosts and routers speak IP
  routers also “speak” many different link layer protocols
IP provides two basic functions
  Globally unique address for all connected points
  Best effort datagram delivery from source to destination hosts
  • Fragmentation/reassembly of packets whenever needed
The Internet Network layer
Host, router network layer functions:
Transport layer: TCP, UDP
Network layer:
  Routing protocols: RIP, OSPF, BGP …
  Forwarding table
  IP protocol: addressing conventions, datagram format, packet handling conventions
  ICMP protocol: error reporting, router “signaling”
Link layer
Physical layer
[Figure: the layer stack above, with part of the network layer marked “Router function”]
IP datagram format
[Figure: IP datagram layout, 32 bits per row]
Basic header:
  ver: IP version number
  head. len: header length
  type of service
  total length
  16-bit identifier, flgs, fragment offset: 3 fields used for packet fragmentation/reassembly
  time to live: max number of remaining hops
  protocol: upper layer protocol to deliver payload to
  IP header checksum
  source IP address
  destination IP address
Options (if any): e.g. timestamp, route recording, specify list of routers to visit
Data (variable length, typically a TCP or UDP segment)
How much overhead for a TCP segment?
  20 bytes of TCP + 20 bytes of IP = 40 bytes
IP Address structure
• 32 bits (4 bytes), uniquely identifies a host or router interface
  - interface: connection between host/router and physical link
IP address space: 2-level hierarchy (Network-ID, host-ID)
173.1.1.1 = 10101101 00000001 00000001 00000001
What’s a network? (from IP address perspective)
  device interfaces with same network part of IP address
  can physically reach each other without going thru a router
[Figure: three LANs: 173.1.1.1, 173.1.1.2, 173.1.1.3, 173.1.1.4; 173.1.2.1, 173.1.2.2, 173.1.2.9; 173.1.3.1, 173.1.3.2, 173.1.3.27]
IP Address: how many bits for net-ID
Original IP design: class-based address
Two changes added over the last 25 years
  Subnetting: add a hidden level to address hierarchy
  • An organization gets one address block, then splits the host part into two parts: subnet and host parts
  CIDR: Classless InterDomain Routing (today)
  • network portion of address of arbitrary length
Class | Leading bits | Layout | Address range
A | 0 | network, host | 1.0.0.0 to 127.255.255.255
B | 10 | network, host | 128.0.0.0 to 191.255.255.255
C | 110 | network, host | 192.0.0.0 to 223.255.255.255
D | 1110 | multicast address | 224.0.0.0 to 239.255.255.255
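The class of an address follows from the leading bits of its first byte. A small sketch, assuming dotted-decimal input; the function name `address_class` is made up, and the "E" result (first byte 240 to 255, reserved) is not on the slide and is included only so the function covers all inputs.

```python
def address_class(dotted):
    """Classify a dotted-decimal IPv4 address by the leading bits of its first byte."""
    first_octet = int(dotted.split(".")[0])
    if first_octet < 128:    # leading bit  0
        return "A"
    if first_octet < 192:    # leading bits 10
        return "B"
    if first_octet < 224:    # leading bits 110
        return "C"
    if first_octet < 240:    # leading bits 1110 (multicast)
        return "D"
    return "E"               # assumption: reserved range, not listed on the slide
```

For example, 173.1.1.1 starts with 10101101, so it falls in class B, while 200.23.16.0 is a class C address.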
Classless InterDomain Routing
Address format: a.b.c.d/x, where x is the # of bits in the network portion
  Example: 200.23.16.0/23 = 11001000 00010111 00010000 00000000; the first 23 bits are the network part, the remaining 9 bits the host part
Internet Service Providers get blocks of IP addresses from the Internet address authority
Internet customers get portion of their ISP’s addr. block
Block | Binary address | Prefix
ISP's block | 11001000 00010111 00010000 00000000 | 200.23.16.0/20
Organization 0 | 11001000 00010111 00010000 00000000 | 200.23.16.0/23
Organization 1 | 11001000 00010111 00010010 00000000 | 200.23.18.0/23
Organization 2 | 11001000 00010111 00010100 00000000 | 200.23.20.0/23
... | ….. | ….
Organization 7 | 11001000 00010111 00011110 00000000 | 200.23.30.0/23
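The split of the ISP's /20 block into eight /23 customer blocks can be reproduced with Python's standard `ipaddress` module. This is only an illustrative sketch; the variable names are arbitrary.

```python
import ipaddress

# The ISP's block from the slide.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Carve it into /23 blocks: 2**(23 - 20) = 8 of them, one per organization.
org_blocks = list(isp_block.subnets(new_prefix=23))

for number, block in enumerate(org_blocks):
    print(f"Organization {number}: {block}")
```

The first block is 200.23.16.0/23 and the last is 200.23.30.0/23, matching Organization 0 and Organization 7.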
Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
[Figure: Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) and tells the Internet “Send me anything with addresses beginning 200.23.16.0/20”. ISPs-R-Us tells the Internet “Send me anything with addresses beginning 199.31.0.0/16”]
Hierarchical addressing: route aggregation
Route aggregation helps reduce routing table size
Multi-homing defeats address aggregation
  ISPs-R-Us has a more specific route to Org. 7
[Figure: Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) and advertises “Send me anything with addresses beginning 200.23.16.0/20”. Organization 7 is multi-homed, so ISPs-R-Us advertises “Send me anything with addresses beginning 199.31.0.0/16, or 200.23.30.0/23”]
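How the more specific route wins can be sketched with a longest-prefix-match lookup. This assumes routers prefer the longest matching prefix (the usual rule, though the slide only says "more specific route"); the table holds the three advertisements from the slide, and the names `ROUTES` and `lookup` are hypothetical.

```python
import ipaddress

# Advertisements seen by the rest of the Internet in the multi-homing example.
ROUTES = [
    ("200.23.16.0/20", "Fly-By-Night-ISP"),
    ("199.31.0.0/16", "ISPs-R-Us"),
    ("200.23.30.0/23", "ISPs-R-Us"),   # more specific route to Org. 7
]


def lookup(dest):
    """Return the next hop whose prefix matches dest with the most bits, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None
```

An address in Organization 7, such as 200.23.30.5, matches both the /20 and the /23, and the /23 sends it to ISPs-R-Us; an address in Organization 1, such as 200.23.18.1, still goes to Fly-By-Night-ISP.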
IP Subnet
Subnet mask: indicates the portion of the address that is considered as “network ID” by the local site
  Subnet mask does not need to align with a byte boundary
  Each host must be configured with both an IP address and a subnet mask
Subnets are invisible outside of the local site
  Backbone routers only know how to forward packets to the network ID
  Within the organization, routers store: [subnet, mask, next hop]
Subnet advantages: aggregate local info., keep backbone routers' table size small
[Figure: viewed from outside, an address is Network ID + Host ID; viewed from inside, the mask 11111111111111111111110000000000 leaves a 10-bit host ID]
An example
Look up IP addr. 131.179.96.15
[Figure: routers A, B, and C on the path from the global Internet through UCLA to UCLA CS (131.179.96.0)]
Backbone table:
Network# | next-hop
131.179 | B
Table inside the organization:
Network# | mask | next-hop
131.179.96 | 255.255.255.0 | C
…… | ……… | ..
The address 131.179.96.15:
  as a class-B address: Network ID 131.179, host ID 96.15
  as a subnetted address with subnet mask 255.255.255.0 (111111111111111111111111 00000000): subnet 131.179.96, host 15
Getting an IP packet from source to dest.
[Figure: three subnets joined by one router: 173.1.1.1, 173.1.1.2, 173.1.1.3, 173.1.1.4; 173.1.2.1, 173.1.2.2, 173.1.2.9; 173.1.3.1, 173.1.3.2, 173.1.3.27. Hosts A and B are on the same subnet, host E on another]
Source host A → destination B:
  Host A: [A’s addr & subnet mask] = [B’s addr & subnet mask]?
  Yes: B is on the same net, use link layer to send pkt directly to B
Source host A → destination E:
  [A’s addr & subnet mask] = [E’s addr & subnet mask]?
  No: send pkt to default router 173.1.1.4
Router: is E on any of my directly connected subnets?
  Yes: send pkt directly to E
  No: forward to another router according to routing table
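The host-side decision above is a single mask-and-compare. A minimal sketch with assumptions stated loudly: the slide gives neither the subnet mask nor the exact addresses of B and E, so the mask 255.255.255.0 and the addresses in the usage note are illustrative guesses, and `first_hop` is a made-up name.

```python
import ipaddress


def first_hop(src, dst, mask, default_router):
    """Return the address the source host hands the packet to at the link layer.

    If [src & mask] equals [dst & mask], the destination is on the same net and
    is reached directly; otherwise the packet goes to the default router.
    """
    m = int(ipaddress.ip_address(mask))
    same_net = (int(ipaddress.ip_address(src)) & m) == (int(ipaddress.ip_address(dst)) & m)
    return dst if same_net else default_router
```

Assuming A = 173.1.1.1 with mask 255.255.255.0, a destination 173.1.1.3 is delivered directly, while a destination 173.1.2.2 is handed to the default router 173.1.1.4.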
IP Fragmentation & Reassembly
Different subnets have different MTUs (Maximum Transmission Unit)
Sender host always uses its max MTU size
Routers “fragment” IP packets if the next link has a smaller MTU
  chop packets to the MTU size of next link
  further fragmentation down the path possible
Packet reassembled at dest. host
H1 sending an IP packet of 1300 byte data to H2:
[Figure: path H1, R1, R2, R3, H2. The 1300B packet starts on a link with MTU=1500B and meets a link with MTU=532B, where it continues as 512B and 276B fragments; reassembly at H2]
IP Fragmentation: An example
[Figure: the 1300B packet from H1 reaches R2; the next link has MTU=532B, so it continues toward R3 and H2 as 512B and 276B fragments, with reassembly at H2]
Header fields of each packet (rest of the IP header omitted):
ver | hlen | TOS | total length | identifier | flags (0, DF, MF) | offset | data
4 | 5 | TOS | 1320 | 7394 | 0 0 0 | 0 | 1300 bytes
4 | 5 | TOS | 532 | 7394 | 0 0 1 | 0 | 512 bytes
4 | 5 | TOS | 532 | 7394 | 0 0 1 | 64 | 512 bytes
4 | 5 | TOS | 296 | 7394 | 0 0 0 | 128 | 276 bytes
At destination:
- identifier: tells which pieces belong to the same packet
- the last fragment: MF=0
- the offsets tell whether there are holes missing in the middle
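The fragment sizes and offsets in the example can be derived mechanically. A minimal sketch, assuming a 20-byte header without options and offsets counted in 8-byte units (which is why 512 bytes of data correspond to offset 64); the function name `fragment` and the dictionary keys are illustrative.

```python
def fragment(data_len, mtu, header_len=20):
    """Split data_len bytes of IP payload into fragments that fit the MTU.

    Returns one dict per fragment with total length (bytes), MF flag,
    and fragment offset (in 8-byte units).
    """
    # Payload per fragment must fit the MTU and be a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any data")

    fragments, sent = [], 0
    while sent < data_len:
        size = min(max_payload, data_len - sent)
        more = 1 if sent + size < data_len else 0   # MF=0 only on the last fragment
        fragments.append({"total_length": size + header_len, "mf": more, "offset": sent // 8})
        sent += size
    return fragments
```

Calling `fragment(1300, 532)` yields total lengths 532, 532, 296 with MF flags 1, 1, 0 and offsets 0, 64, 128, the same values as in the table.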
ICMP: Internet Control Message Protocol
Used by hosts & routers to communicate network-level information
  error reporting: unreachable host, network, port, protocol
  echo request/reply
ICMP msgs carried in IP packets
Type | Code | Description
0 | 0 | echo reply (ping)
3 | 0 | dest. network unreachable
3 | 1 | dest host unreachable
3 | 2 | dest protocol unreachable
3 | 3 | dest port unreachable
3 | 6 | dest network unknown
3 | 7 | dest host unknown
4 | 0 | source quench (congestion control - not used)
8 | 0 | echo request (ping)
9 | 0 | route advertisement
10 | 0 | router discovery
11 | 0 | TTL expired
12 | 0 | bad IP header
ICMP message format (following the IP header):
  type | code | checksum
  unused (or used by certain ICMP types)
  IP header and first 64 bits of data, or data (according to ICMP types)
NAT: Network Address Translation
[Figure: local network (e.g., home network) 10.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4 on the inside and 138.76.29.7 facing the rest of the Internet]
Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
NAT: Network Address Translation
[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 behind a NAT router with inside address 10.0.0.4 and outside address 138.76.29.7; the numbered datagrams below are marked on the path]
NAT translation table:
WAN side addr | LAN side addr
138.76.29.7, 5001 | 10.0.0.1, 3345
…… | ……
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345  D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001  D: 128.119.40.186, 80
3: Reply arrives, dest. address: 138.76.29.7, 5001
   S: 128.119.40.186, 80  D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80  D: 10.0.0.1, 3345
NAT implementation
NAT router must do the following:
  Outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
  • . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
  Remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
  Incoming datagrams: replace (NAT IP address, new port #) in destination fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
Problems due to NAT
  Increased network complexity, reduced robustness
  Cannot run services from inside a NAT box
  Address shortage should instead be solved by IPv6
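The three duties of the NAT router listed above (rewrite outgoing, remember the pair, rewrite incoming) can be sketched as a pair of dictionaries. This is a toy model under assumptions: new ports are handed out sequentially starting at 5001 (chosen only to match the example), entries never expire, and the class name `NatRouter` is made up.

```python
class NatRouter:
    """Toy NAT translation table (hypothetical names, no timeouts or protocols)."""

    def __init__(self, nat_ip, first_port=5001):
        self.nat_ip = nat_ip
        self.next_port = first_port   # assumption: sequential port allocation
        self.lan_to_wan = {}
        self.wan_to_lan = {}

    def outgoing(self, src):
        """Map a LAN-side (ip, port) to its WAN-side (NAT ip, new port)."""
        if src not in self.lan_to_wan:
            wan = (self.nat_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[src] = wan    # remember the translation pair
            self.wan_to_lan[wan] = src
        return self.lan_to_wan[src]

    def incoming(self, dst):
        """Map a WAN-side destination back to the LAN-side (ip, port), if known."""
        return self.wan_to_lan.get(dst)
```

In the example, `outgoing(("10.0.0.1", 3345))` on a router created with `NatRouter("138.76.29.7")` returns `("138.76.29.7", 5001)`, and `incoming(("138.76.29.7", 5001))` maps the reply back. A datagram for a port with no table entry has nowhere to go, which illustrates why services inside a NAT box are hard to reach.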
IPv6
Motivation: 32-bit address space exhaustion
Take the opportunity for some clean-up
IPv6 datagram format:
  Address length changed from 32 bits to 128 bits
  Fragmentation fields moved out of base header
  IP options moved out of base header
  • Header Length field eliminated
  Header Checksum eliminated
  Type of Service field eliminated
  Time to Live → Hop Limit, Protocol → Next Header
  Precedence → Priority, added Flow Label field
  Length field excludes IPv6 header
IPv6 header format
[Figure: header layouts, 32 bits per row]
IPv6 header:
  Version | Priority | Flow Label
  Payload Length | Next Header | Hop Limit
  Source Address (16 bytes, 128 bits)
  Destination Address (16 bytes)
IPv4 header:
  Version | Hdr Len | Prec | TOS | Total Length
  Identification | Flags | Fragment Offset
  Time to Live | Protocol | Header Checksum
  Source Address
  Destination Address
  Options | Padding
Changes from IPv4
Priority: identify priority among datagrams in flow
Flow Label: identify datagrams in same “flow” (concept of “flow” not well defined)
Next header: identify upper layer protocol for data
Options: allowed, but outside of the basic header, indicated by “Next Header” field
Checksum: removed entirely to reduce processing time at each hop
ICMPv6: new version of ICMP
  additional message types, e.g. “Packet Too Big”
  multicast group management functions
Transition From IPv4 To IPv6
Not all routers can be upgraded simultaneously
To allow the Internet to operate with mixed IPv4 and IPv6 routers: tunneling
Logical view: A (IPv6), B (IPv6), tunnel, E (IPv6), F (IPv6)
Physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)
A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)
B-to-E across C and D: IPv6 inside IPv4 (outer header Src: B, Dest: E; inside it the datagram Flow: X, Src: A, Dest: F, data)
E-to-F: IPv6 (Flow: X, Src: A, Dest: F, data)
Interplay between routing and forwarding
The routing algorithm determines the local forwarding table:
header value | output link
0100 | 3
0101 | 2
0111 | 2
1001 | 1
[Figure: a router with output links 1, 2, 3; the value in the arriving packet’s header is 0111]
Router Architecture Overview
Two key router functions:
  Run routing algorithms/protocols (RIP, OSPF, BGP)
  Forward datagrams from incoming to outgoing link
Input Port Functions
Physical layer: bit-level reception
Data link layer: e.g., Ethernet (see chapter 5)
Decentralized switching:
  given datagram dest., look up output port using forwarding table in input port memory
  goal: complete input port processing at ‘line speed’
  queuing: if datagrams arrive faster than forwarding rate into switch fabric