
Upload: jeffrey-berry

Post on 04-Jan-2016


Distributed Processing Systems (Networking Technologies)

오 상 규 (Oh Sang-Kyu)
Graduate School of Information and Communication, Sogang University (서강대학교 정보통신 대학원)
Email: sgoh@macrmimpact.com


Networking Issues in Distributed Systems

- Networking facilities in distributed systems are implemented by various hardware components (e.g., switches, NICs) and software components (protocols, communication handlers, device drivers, etc.).
- The resulting functionality and performance are determined by all of these components, not simply by the networking hardware.
- Communication subsystem: a collection of hardware and software components that provide the communication facility for a distributed system.
- In distributed systems, the impact of network technologies on the communication subsystem is usually considered.


Network Performance Parameters

- Latency: the time required to transfer an empty message between two computers. It includes the software overheads in accessing the network, routing delays, and propagation delay.
- Data transfer rate: the speed at which data can be transferred between two computers in the network once transmission has begun (bits per second). It is determined primarily by physical characteristics.
- Message transfer time = latency + length / data transfer rate.
- In distributed systems, messages are usually small, so latency is more significant than transfer rate.
- Total system bandwidth: the total volume of traffic that can be transferred across the network in a given time (a measure of throughput).
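The message transfer time formula can be evaluated directly. The sketch below is illustrative only: the function name and the sample latency, message sizes, and rate are assumptions chosen for the example, not figures from the slides.

```python
def message_transfer_time(latency_s: float, length_bits: float, rate_bps: float) -> float:
    """Message transfer time = latency + length / data transfer rate."""
    return latency_s + length_bits / rate_bps


if __name__ == "__main__":
    latency = 1e-3   # assumed 1 ms latency
    rate = 10e6      # assumed 10 Mbps data transfer rate
    for size_bytes in (100, 1_000, 100_000):
        total = message_transfer_time(latency, size_bytes * 8, rate)
        share = latency / total
        print(f"{size_bytes:>7} bytes: {total * 1e3:.3f} ms, latency share {share:.0%}")
```

With small messages the latency term dominates the total, which is the point made about distributed systems above.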


Network Requirements

- Performance requirements
  - Must achieve performance comparable to a centralized architecture.
- Reliability requirements
  - Guarantees of reliability are usually required for most distributed system applications.
  - Errors in communication media are very rare these days.
  - Errors are often due to timing failures in the sender and receiver software.
  - Detection and correction are often performed by applications.
- Communication protocols and network interfaces should be changed or selected based on the requirements.


Types of Network (1)

Local Area Networks (LAN)
- Relatively high speed: fiber-optic or coaxial cable.
- Within a single building or campus.
- Broadcast communication (no message routing required).
- Shared channel, so conflict resolution is needed.
- Achieve low latency.
- Transfer rate: 0.2 to 100 Mbps, now more than 1 Gbps.

  Network type       Standard         Data transfer rate (Mbits/sec)
  Ethernet           IEEE 802.3       10
  FDDI               FDDI-I           100
  Apple LocalTalk    Apple Computer   0.23
  IBM Token Ring     IEEE 802.5       4 or 16
  Fast Ethernet      IEEE 802.3u      100
  Gigabit Ethernet   IEEE 802.3z      1000
  ATM                ATM Forum        25, 155, 622
  Myrinet            Myricom          1280


Types of Network (2)

Wide Area Networks (WAN)
- Interconnect host computers separated by large distances.
- Packet switches or packet switching exchanges (PSE):
  - Linked by communication circuits.
  - Route messages or packets (store-and-forward communication).
- Latency: 0.1 to 0.5 seconds.
- ISDN: multiples of the basic channel speed of 64 Kbps (2B+D channels).
- T1/T3 and Frame Relay.
- B-ISDN (ATM)
  - Cell-relay method.
  - Speed: 155 Mbps (OC-3), 622 Mbps (OC-12), 2.4 Gbps (OC-48), 9.6 Gbps (OC-192).
  - Lower latency than packet-switching networks.
- WDM (Wavelength Division Multiplexing).


Introduction to Ethernet (1)

- Developed at Xerox PARC in 1973.
- IEEE/ISO standard 802.3.
- 10 Mbps over coaxial or UTP cable.
- Bus topology.
- Carrier sense multiple access with collision detection (CSMA/CD).
- Packet broadcasting:
  - All stations listen for packets that are addressed to them.
  - Any station wishing to transmit a message broadcasts one or more packets (frames).
  - Each packet contains the address of the destination station, the address of the sending station, and the data.
  - Maximum frame size = 1518 bytes (up to 1500 bytes of data).
  - An address consisting of all 1s is the broadcast address.
  - The NIC must be implemented to recognize multicast addresses.


Introduction to Ethernet (2)

- The Ethernet protocol is implemented in the Ethernet NIC.
- Packet layout:

  Field                   Length (bytes)
  Destination address     6
  Source address          6
  Type                    2
  Data for transmission   46 to 1500
  Frame check sequence    4

- The type field is used by upper-layer protocols to distinguish packets of various types.
- The IEEE is the authority that allocates Ethernet addresses (MAC addresses) to manufacturers.
- The 64-byte lower bound on the packet length is needed for collision detection: a station must still be transmitting when a collision at the far end of the network propagates back to it.
  - Maximum distance = 2.5 km with 4 repeaters allowed => 51.2 us => 64 bytes at 10 Mbps.
  - If the network speed goes up, either the minimum packet length must go up or the maximum cable length must come down. Consider Fast Ethernet (100 Mbps) and Gigabit Ethernet (1 Gbps).
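The relation between the 51.2 us window and the 64-byte minimum is simple arithmetic, sketched below. The function names are illustrative; only the 10 Mbps, 51.2 us, and 64-byte figures come from the slide.

```python
def min_frame_bytes(rate_bps: float, window_s: float) -> float:
    """Smallest frame that keeps the sender transmitting for the whole window."""
    return rate_bps * window_s / 8


def window_for_frame_s(rate_bps: float, frame_bytes: int) -> float:
    """Time a frame of the given size occupies the sender at the given rate."""
    return frame_bytes * 8 / rate_bps


if __name__ == "__main__":
    # Classic Ethernet: 10 Mbps with a 51.2 us collision window.
    print(min_frame_bytes(10e6, 51.2e-6))   # about 64 bytes

    # Keeping 64-byte frames at higher speeds shrinks the window, and with it
    # the distance a collision may travel and still be detected.
    for rate in (10e6, 100e6, 1e9):
        print(f"{rate / 1e6:>6.0f} Mbps: {window_for_frame_s(rate, 64) * 1e6:.3f} us")
```

At 100 Mbps a 64-byte frame lasts a tenth as long as at 10 Mbps, so the allowed network diameter has to shrink accordingly unless the minimum frame grows.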


Ethernet Collision Handling

- Carrier sensing: the NIC listens for the presence of a signal and waits until no signal is present on the cable before transmitting.
- Collision detection:
  - While transmitting, the station listens on its input port and compares the received signal with the signal it is sending.
  - If they differ, the station stops transmitting and produces a jamming signal on the cable.
  - All transmitting and listening stations cancel their current packet.
- Back-off (binary exponential back-off algorithm): each station involved in a collision waits a time nτ before retransmitting, where n is a random integer and τ is the time taken for a signal to reach all stations.
- Checksum: the NIC at the receiving station computes the check sequence and compares it with the checksum in the packet. If the comparison fails, the packet is rejected.
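The back-off rule can be sketched as follows. The slide only says that n is a random integer; the doubling range, the cap at 2^10, the 16-attempt limit, and the 51.2 us slot are the usual IEEE 802.3 truncated binary exponential back-off parameters as commonly described, and are assumptions here rather than details taken from the slide.

```python
import random

SLOT_TIME_US = 51.2   # assumed slot time for 10 Mbps Ethernet
MAX_EXPONENT = 10     # assumed cap on the doubling of the range
MAX_ATTEMPTS = 16     # assumed limit before the frame is abandoned


def backoff_delay_us(collisions: int, rng=random) -> float:
    """Delay before the next attempt after the given number of collisions."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions > MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; frame is abandoned")
    k = min(collisions, MAX_EXPONENT)
    n = rng.randrange(2 ** k)   # n is uniform in 0 .. 2^k - 1
    return n * SLOT_TIME_US


if __name__ == "__main__":
    rng = random.Random(1)
    for attempt in range(1, 6):
        print(attempt, backoff_delay_us(attempt, rng))
```

Because the range doubles after each collision, stations that keep colliding spread their retries over a wider interval, which reduces the chance of colliding again.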


Asynchronous Transfer Mode (ATM)

Synchronous Transfer Mode (STM) vs. Asynchronous Transfer Mode (ATM)

- Synchronous: the network's bandwidth is divided into fundamental units called time slots or buckets.
  - Example: narrowband ISDN (2B + D), where B = 64 Kbps and D = 16 Kbps.
  - [Figure: an ISDN frame of 48 bits in 250 us, with 8-bit B1 and B2 fields alternating, each followed by a 1-bit D field.]
  - Disadvantage: significant waste of bandwidth.
- ATM transfers data in fixed-size units called cells (53 bytes).
- Fast cell switching (label switching) with short, fixed-size packets.
- ATM overcomes the disadvantage of STM by using statistical multiplexing.
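The ISDN figures are consistent with each other, as the short calculation below shows. It assumes my reading of the frame figure: four 8-bit B fields (two for B1, two for B2) and four 1-bit D fields in each 48-bit frame, with one frame every 250 us.

```python
FRAME_BITS = 48
FRAMES_PER_SECOND = 4000   # one frame every 250 us
B_FIELDS_PER_CHANNEL = 2   # assumed: each B channel appears twice per frame
B_FIELD_BITS = 8
D_BITS_PER_FRAME = 4       # assumed: four 1-bit D fields per frame


def isdn_rates() -> dict:
    """Bit rates (bits per second) implied by the frame layout."""
    b_rate = B_FIELDS_PER_CHANNEL * B_FIELD_BITS * FRAMES_PER_SECOND
    d_rate = D_BITS_PER_FRAME * FRAMES_PER_SECOND
    return {
        "line": FRAME_BITS * FRAMES_PER_SECOND,
        "B": b_rate,
        "D": d_rate,
        "2B+D": 2 * b_rate + d_rate,
    }


if __name__ == "__main__":
    for name, rate in isdn_rates().items():
        print(f"{name:>5}: {rate / 1000:.0f} Kbps")
```

Each B slot is reserved whether or not its channel has data to send, which is the bandwidth waste that statistical multiplexing in ATM is meant to avoid.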


Accessing the Network (Software's View)

[Figure: layered view of network access, from top to bottom]
- User level: application or other tools, using an application programming interface (API) such as sockets or the ATM API.
- System call boundary between user and kernel.
- Kernel: TCP/UDP, IP, and the ATM protocol.
- Device driver interface (DDI): NDIS, ODI, DLPI, etc.
- ATM device driver and other drivers.
- Hardware-specific interface.
- ATM adapter card.


Host Network Interface (Traditional Case)

[Figure: data path in a conventional protocol stack. The sender and the receiver each have the layers application, socket layer, TCP, IP, interface driver, and network MAC. Data passes through user buffering, across the system call interface, and through kernel buffering; numbered steps 1 to 5 mark the path on each side.]


DMA-Based Host Network Interface

[Figure: a host with an application, host processor, and DMA. Data moves between the user buffer, the kernel buffer, and the network buffer on the network interface, which connects to the network.]


Networking Technologies Case Studies 1

Gigabit LAN - Myrinet


Introduction

- A new type of local area network.
- Arose from two ARPA-funded projects: the Mosaic multicomputer (Caltech) and the Atomic LAN (USC/ISI).
- Based on the multicomputer message-passing technology used for packet communication and switching within massively parallel processors (MPPs): in effect, an MPP message-passing network at campus-area scale.
- A gigabit-per-second LAN developed by Myricom, Inc.
- Bandwidth: 1.28 Gbps (2.56 Gbps full duplex).
- Components of Myrinet:
  - Switch
  - Host interface
  - Cable (link)
  - Myrinet software


Components of Myrinet

[Figure: Myrinet components: host interface card, 16-port Myrinet switch, optical fibre converters, and Myrinet-SAN cable.]


A Possible Configuration

- As the figure shows, a Myrinet LAN consists of point-to-point, full-duplex links that connect hosts and switches.
- The multiple-port switches may be connected by links to other switches and to the single-port host interfaces in any topology, including those with cycles.


Myrinet Link

[Figure: System A and System B, each with a port, joined by a link (cable) that carries one 1.28 Gbps channel in each direction.]

- A Myrinet link is composed of a full-duplex pair of Myrinet channels. The connection of a link to a system is called a port.
- Bandwidth of a single channel: 1.28 Gbps.
- 25 m was chosen as the maximum length for electrical cables.


Myrinet Channel

- Channels convey packets, which are arbitrary-length sequences of packet-data bytes.
- The channel maintains the framing of packets, that is, which byte is the head of a packet and which byte is the tail.
- The flow of information on a channel may be blocked (stopped) temporarily by the receiver. This flow control is provided on every link.
- The port circuits detect the condition in which the port is unused, its link is disconnected, or its link is connected to an unpowered component. In this condition, the port sender is not blocked; instead, outgoing packets are dropped.


Physical Channel Structure

- The reference implementation of a Myrinet channel connects a sender (output) and a receiver (input) through a 9-wire, parallel communication medium.
- If d = 1, bits 7 to 0 convey a data byte. If d = 0, bits 7 to 0 contain the code for a control symbol.

[Figure: a single channel structure.]
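A 9-bit character of this kind can be modeled as a small integer. The sketch below assumes the d bit is placed above the eight data bits (bit 8); the slide does not give the wire ordering or the control symbol codes, so the code value used in the demo is hypothetical.

```python
D_BIT = 1 << 8   # assumed position of the d flag, above bits 7..0


def encode_data(byte: int) -> int:
    """9-bit character carrying a packet-data byte (d = 1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("data byte out of range")
    return D_BIT | byte


def encode_control(code: int) -> int:
    """9-bit character carrying a control symbol code (d = 0)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("control code out of range")
    return code


def decode(char: int) -> tuple:
    """Return ("data", byte) or ("control", code) for a 9-bit character."""
    kind = "data" if char & D_BIT else "control"
    return kind, char & 0xFF


if __name__ == "__main__":
    print(decode(encode_data(0x41)))      # ('data', 65)
    print(decode(encode_control(0x0F)))   # ('control', 15), hypothetical code
```

Keeping the flag outside the data byte means all 256 byte values remain available for packet data, and control symbols can never be confused with payload.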


Control Symbols

[Table: Myrinet control symbols; the table contents did not survive in the transcript.]


Flow Control

- Flow control is accomplished on Myrinet channels by the receiver injecting STOP and GO control symbols into the stream being produced by the sender of the opposite-going channel.
- It applies only to packet data; all control symbols are exempt from flow control and have priority over packet data.
- The STOP and GO control symbols used for flow control have priority over all other control symbols.
- The Myrinet receiver contains a "slack buffer" in order to manage flow control.


Slack Buffer Operation

- If the downstream flow is blocked so that the slack buffer fills to the STOP line, the receiver generates a STOP control symbol that stops the flow before the buffer overflows.
- When the downstream flow resumes, the receiver generates a GO control symbol when the level reaches the GO line.
- The buffer positions between GO and STOP provide hysteresis, ensuring that STOP and GO control symbols do not consume excessive bandwidth on the opposite-going channel.

[Figure: slack buffer with the STOP and GO lines marked. Region sizes: 32 bytes, 16 bytes, and 32 bytes; the Slack_A and Slack_B regions are labeled.]

- Slack_A must be large enough to stop the sender before the Slack_A part of the buffer overflows.
- The Slack_B part of the slack buffer is required only for performance reasons.
- The hysteresis parameter is important only for reducing the number of STOP and GO symbols that must be sent on the opposite-going channel.
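The STOP/GO behavior can be sketched as a small state machine. The thresholds below follow one reading of the figure and are assumptions: 32 bytes of Slack_B below the GO line, 16 bytes of hysteresis between GO and STOP, and 32 bytes of Slack_A above the STOP line, for 80 bytes in total. The class and method names are illustrative.

```python
class SlackBuffer:
    """Receiver-side buffer that asks the sender to STOP and GO with hysteresis."""

    def __init__(self, capacity: int = 80, stop_level: int = 48, go_level: int = 32):
        self.capacity = capacity
        self.stop_level = stop_level
        self.go_level = go_level
        self.level = 0
        self.sender_stopped = False

    def receive(self, nbytes: int = 1):
        """Bytes arrive from the link; returns "STOP" when it must be sent."""
        if self.level + nbytes > self.capacity:
            raise OverflowError("slack buffer overflow")
        self.level += nbytes
        if not self.sender_stopped and self.level >= self.stop_level:
            self.sender_stopped = True
            return "STOP"
        return None

    def drain(self, nbytes: int = 1):
        """Bytes leave downstream; returns "GO" when it must be sent."""
        self.level = max(0, self.level - nbytes)
        if self.sender_stopped and self.level <= self.go_level:
            self.sender_stopped = False
            return "GO"
        return None


if __name__ == "__main__":
    buf = SlackBuffer()
    print([s for s in (buf.receive() for _ in range(48)) if s])   # ['STOP']
    buf.receive(10)        # bytes already in flight land in Slack_A
    print(buf.drain(26))   # level 32 reaches the GO line: 'GO'
```

Bytes sent before the STOP symbol takes effect are absorbed by the space above the STOP line, and the gap between the two thresholds keeps the receiver from emitting a control symbol on every byte.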


Packet Format (1)

[Figure: Myrinet packet layout]
- Header (variable length): one byte per switch on the route, each with MSB = 1 (to switch) followed by a port number, then one byte with MSB = 0 (to host) followed by a type.
- Payload (arbitrary length), delivered to the destination host.
- Trailer (1 byte): CRC.
- GAP.

A single Myrinet channel conveys a sequence of discrete (9-bit) characters composed of packet-data bytes (a set of 256 characters) and a set of control symbols (IDLE, GAP, GO, STOP, ...).

Control symbols are interleaved with the packet data in order to perform packet framing, flow control, and other functions. For example, the sequence

..GAP,GO,IDLE,d0,d1,IDLE,GO,STOP,d2,d3,GAP,STOP,GAP..

includes the 4-byte packet (d0,d1,d2,d3), which is framed by GAP. The GO and STOP control symbols, as well as IDLE, may be used to fill unused cycles either within or between packets.
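The framing rule in the example sequence can be sketched as a small parser. This is an illustration only: characters are represented as strings, the control symbol set is limited to the four names on the slide, and the function name is hypothetical.

```python
CONTROL_SYMBOLS = {"IDLE", "GAP", "GO", "STOP"}


def extract_packets(stream):
    """Split a character stream into packets framed by GAP symbols.

    Non-GAP control symbols are skipped wherever they occur; data after the
    last GAP is not returned because its packet is not yet terminated.
    """
    packets = []
    current = []
    for ch in stream:
        if ch == "GAP":
            if current:
                packets.append(current)
                current = []
        elif ch in CONTROL_SYMBOLS:
            continue
        else:
            current.append(ch)
    return packets


if __name__ == "__main__":
    stream = ["GAP", "GO", "IDLE", "d0", "d1", "IDLE", "GO", "STOP",
              "d2", "d3", "GAP", "STOP", "GAP"]
    print(extract_packets(stream))   # [['d0', 'd1', 'd2', 'd3']]
```

A real receiver would also act on GO and STOP rather than discard them; here they are dropped because only the framing is being shown.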


Packet Format (2)

- When a packet enters a host interface, the leading byte identifies the type of the packet: a mapping packet, a network management packet, or a packet with an IP packet as its payload.
- The most-significant bit of each header byte distinguishes between to-switch and to-host bytes.


Routing and CRC Check

- When a packet enters a switch, the leading byte of the header determines the outgoing port and is then stripped off the packet.
- The Myrinet packet trailer carries an 8-bit CRC character computed on the entire packet.
- Because the packet header is modified at each switch, Myrinet recomputes the CRC on each link.
- Errors are tracked by XORing CRCs:
  1) The incoming CRC is XORed with the CRC computed on the received packet.
  2) The outgoing link computes the CRC of the modified packet and XORs it with the result of 1).
  3) A packet that arrived correctly leaves with a correct CRC; otherwise the outgoing CRC is wrong and the error remains detectable.

[Figure: header bytes present on each link of a route through two switches]

  Link                   Header on this link
  interface -> switch    10000101 10000010 00000001
  switch -> switch       10000010 00000001
  switch -> interface    00000001
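The header stripping and the CRC XOR steps can be sketched together. The slides do not give the CRC polynomial or the width of the port field, so the polynomial 0x07 and the 7-bit port mask below are assumptions for illustration, and all function names are hypothetical. The header bytes are the ones shown in the figure.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 (assumed polynomial, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def make_packet(header: bytes, payload: bytes) -> bytes:
    body = header + payload
    return body + bytes([crc8(body)])


def crc_ok(packet: bytes) -> bool:
    return crc8(packet[:-1]) == packet[-1]


def switch_forward(packet: bytes):
    """Strip the leading routing byte; return (output port, outgoing packet)."""
    body, received_crc = packet[:-1], packet[-1]
    lead = body[0]
    if not lead & 0x80:
        raise ValueError("leading byte is not a to-switch byte")
    port = lead & 0x7F                    # assumed port field: low 7 bits
    error = received_crc ^ crc8(body)     # step 1: zero if the packet is intact
    new_body = body[1:]
    out_crc = crc8(new_body) ^ error      # step 2: carry any error forward
    return port, new_body + bytes([out_crc])


if __name__ == "__main__":
    pkt = make_packet(bytes([0b10000101, 0b10000010, 0b00000001]), b"hello")
    port1, pkt = switch_forward(pkt)
    port2, pkt = switch_forward(pkt)
    print(port1, port2, pkt[0], crc_ok(pkt))   # 5 2 1 True
```

Because a nonzero difference found in step 1 is folded into every later CRC, a packet corrupted on an early link still fails the check at the destination even though each switch rewrites the trailer.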


Wormhole Routing (Switching)

[Figure, two panels]
- (a) Packet switching: a switch with ports 0 to 3 holds whole packets (P0, P1, P2) in a packet buffer.
- (b) Wormhole switching: a packet is spread as phits (address/header first, then payload) across the small buffers of switch 1 and switch 2 and the link between them.


Host Interface (1)

- The host interface consists of two major components: the LANai chip and its associated SRAM.
- The LANai is a custom VLSI chip that controls the data transfer between the host and the network.


Host Interface (2)

- The main component of the LANai is a programmable microcontroller, which controls the DMA engine responsible for the two data transfer directions: host <-> onboard memory and onboard memory <-> network.
- Message data must first be written to the SRAM before it can be injected into the network.
- The SRAM stores the Myrinet Control Program (MCP) and packets.
- Besides controlling the data transfer, the LANai is also responsible for automatic network mapping and for monitoring the network status.
- The LANai communicates with the host's device drivers or user-level libraries through job queues residing in the SRAM.


Myrinet Software: MCP (1)

Two categories of software provide access to and control of Myrinet: the MCP executing on the processors in the host interfaces, and the device drivers and operating systems executing in the hosts.

MCP (Myrinet Control Program)
- The program that runs on the LANai chip on the host interface board.
- The MCP's job is to transfer messages between the host and the network.
- It checks the validity of each incoming packet, interprets headers, and transfers packet data to the specified scatter buffers in host memory.
- It then signals the arrival of a packet in an acknowledgement queue and, optionally, by producing an interrupt for the host.
- It performs continuous mapping, monitoring (remapping), and route selection, which makes the network self-configuring and self-healing.


Myrinet Software: MCP (2)

- The MCP is not burned into the LANai board. Instead, it is loaded into the LANai by the host device driver when the host boots.
- The software interface between the MCP and the host is called a channel.
- A channel is a set of queues on the LANai board shared by the MCP and the host.
- There are three queues in a channel:
  - Send queue
  - Receive buffer queue
  - Command/ack queue
- Each queue has a single producer and a single consumer. On the host side, a set of single-producer, single-consumer command and ack queues controls the interface.
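A single-producer, single-consumer queue is commonly built as a ring buffer in which each side advances only its own index. The sketch below shows that general technique; it is not the MCP's actual data layout, and the class and method names are hypothetical.

```python
class SpscQueue:
    """Fixed-size ring buffer for one producer and one consumer."""

    def __init__(self, capacity: int):
        # One slot is left empty so that "full" and "empty" can be told apart.
        self._slots = [None] * (capacity + 1)
        self._head = 0   # advanced only by the consumer
        self._tail = 0   # advanced only by the producer

    def try_put(self, item) -> bool:
        nxt = (self._tail + 1) % len(self._slots)
        if nxt == self._head:
            return False                  # queue is full
        self._slots[self._tail] = item
        self._tail = nxt                  # publish after the slot is written
        return True

    def try_get(self):
        if self._head == self._tail:
            return None                   # queue is empty
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        return item


if __name__ == "__main__":
    # A channel modeled as the three queues named on the slide.
    channel = {name: SpscQueue(8) for name in ("send", "receive_buffer", "command_ack")}
    channel["send"].try_put(b"packet-1")
    print(channel["send"].try_get())      # b'packet-1'
```

With exactly one writer per index, the host and the interface processor can share such a queue without a lock, which is one plausible reason for the single-producer, single-consumer restriction.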


Myrinet Software: Mapping Operation (1)

One host interface in each Myrinet maps the network by sending mapping packets to other interfaces and to itself.

Myrinet mapping operations:
- Mapper: the MCP with the highest address.
- To determine whether a host is connected to a certain port on a switch, the mapper sends a scout message addressed to a host connected to that port.
- If a host exists at that port, the host's MCP receives the scout message and sends an acknowledgement back to the mapper.
- When the mapper receives the acknowledgement, it knows that a host exists at that port.
- If the mapper receives no response, then either nothing is connected to the port or another switch is connected to it.
- To distinguish these cases, the mapper sends a scout message with a route that goes through the port, into the second switch, back through the port of the first switch, and back to the mapper.


Myrinet Software: Mapping Operation (2)

- If the mapper receives this scout message back, then another switch is attached to the first switch at the port the mapper was examining. Otherwise, the port is unconnected.
- The mapper recursively examines the whole network in this manner.
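The probing procedure can be sketched against a simulated topology. Everything here is a simplification for illustration: the topology is a plain dictionary, the two scout functions are lookups standing in for real messages and timeouts, and switches carry identifiers so that cycles can be detected, which a real mapper would have to infer from routes.

```python
def host_scout_acked(topology, switch, port) -> bool:
    """Stand-in for: send a host scout through this port and wait for an ack."""
    peer = topology[switch].get(port)
    return peer is not None and peer[0] == "host"


def loopback_scout_returned(topology, switch, port) -> bool:
    """Stand-in for: send a scout routed out of the port and back again."""
    peer = topology[switch].get(port)
    return peer is not None and peer[0] == "switch"


def map_network(topology, first_switch, ports_per_switch):
    """Return (hosts found with their attachment points, switches found)."""
    hosts = {}
    seen = {first_switch}
    pending = [first_switch]
    while pending:
        switch = pending.pop()
        for port in range(ports_per_switch):
            if host_scout_acked(topology, switch, port):
                hosts[topology[switch][port][1]] = (switch, port)
            elif loopback_scout_returned(topology, switch, port):
                neighbor = topology[switch][port][1]
                if neighbor not in seen:      # guards against cycles
                    seen.add(neighbor)
                    pending.append(neighbor)
            # otherwise: no response to either scout, so the port is unconnected
    return hosts, seen


if __name__ == "__main__":
    topology = {
        "s1": {0: ("host", "mapper"), 1: ("host", "a"), 2: ("switch", "s2")},
        "s2": {0: ("switch", "s1"), 1: ("host", "b")},
    }
    print(map_network(topology, "s1", ports_per_switch=4))
```

The order of the two probes mirrors the slides: a host scout first, and only if that goes unanswered a loopback scout to tell an attached switch from an empty port.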


Myrinet Software: Host Software

- The host software provides the interface between Unix user processes and the host-interface board.
- Figure 7 is a copy diagram that illustrates the steps involved in moving information from a user process to the network:
  - One-copy TCP/IP interface
  - Zero-copy operation


Software and Performance

- All Myrinet specifications are open and public. The device driver code and the MCP are distributed as source code to serve as documentation and as a base for porting new protocol layers onto Myrinet.
- Device drivers are available for Linux, Solaris, Windows NT, and DEC Unix, and for SPARC, Alpha, MIPS, and PowerPC processors.
- A patched GNU C compiler is available for developing MCP programs.
- A new protocol layer called GM has been developed by Myricom and replaces the old device drivers and MCP programs. It provides reliable, ordered delivery of messages and supports protected kernel-level as well as user-level access routines to the Myrinet hardware.
- Myrinet delivers higher performance than any other LAN known so far.


Summary

- Myrinet offers a good price/performance ratio.
- Host interfaces cost in the range of $1300 to $1800, whereas an 8-port SAN switch costs approximately $2000.
- The great flexibility of the hardware, due to its programmable microcontroller, is one of the major advantages of Myrinet, but it can also be a performance bottleneck, since the LANai runs only at moderate frequencies.
- The buffering of messages in the onboard SRAM prevents the implementation of true zero-copy protocols, since there is no direct interface to the network from the host's view.
- This might be a reason for the moderate bandwidth for small and medium-sized messages.
- Myrinet has shown its scalability in large cluster configurations (more than 256 nodes).


Networking Technologies Case Studies 2

LPN - Fiber Channel


Existing Channel Architecture

[Figure: a workstation, scanner, disk subsystem, tape subsystem, supercomputer, and disk array attached through separate channel interfaces, labeled SCSI and 2 x HIPPI.]


New Demands in Interconnect (1)

Limitations of SCSI

[Figure: a host adapter connected to devices 1, 2, 3, ..., N in an Ultra SCSI daisy chain (20 to 40 MB/s) that ends in a terminator (T).]

- Data-intensive applications.
- Larger configurations.
- Cable length restriction.
- Increasing device and distance connectivity requirements.


New Demands in Interconnect (2)

Fiber Channel: a channel-network hybrid. It is a high-performance interconnect standard designed for bi-directional, point-to-point serial data channels between desktop workstations, mass storage subsystems, peripherals, and host systems.

[Figure: Fiber Channel combines attributes of channels with attributes of networks.]
- Attributes of channels: simplicity, low latency, guaranteed delivery.
- Attributes of networks: connectivity, distance, multiplexing.


Fiber Channel Architecture

[Figure: an interconnection fabric linking a workstation, scanner, disk subsystem, tape subsystem, disk array, mainframe, and supercomputer.]


Characteristics of Fiber Channel

- Full-duplex link with two fibers per link.
- High throughput: over 100 MB/s (current: 1.6 Gbps).
- Support for distances up to 10 km.
- High capacity utilization.
- Greater connectivity.
- Ability to carry multiple existing interface command sets.
- Simpler and less costly system.


Three Topologies

[Figure: the three topologies: point to point, loop (FC-AL), and switch / fabric. Labels show hosts, a fabric switch, a bridge to parallel SCSI, and the port types N, NL, F, FL, and E.]


Types of PortsTypes of PortsTypes of PortsTypes of Ports

N_Port : Node port.

NL_Port : Node port + Loop port.

F_Port : Fabric port.

FL_Port : Fabric port + Loop port.

L_Port : Loop port.

E_Port : Switch to Switch communication port.

G_Port : Generic port. Behaves as an E_Port, FL_Port or F_Port, depending on what it is connected to.

Page 45: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Topology Explained (1)

Point to Point (dedicated bandwidth)

Fabric port name : Not applicable

Node port name : N_Port

Connection : Dedicated-bandwidth connection between two N_Ports.

Loop (shared bandwidth)

Fabric port name : Not applicable

Node port name : NL_Port

Connection : Shared-bandwidth network. Individual NL_Ports use an arbitration scheme to gain control of the loop. After gaining control, an NL_Port establishes a point-to-point logical connection with another NL_Port.

Page 46: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Topology Explained (2)

Switched (scaled bandwidth)

Fabric port names : F_Port, FL_Port, E_Port, G_Port

Node port names : N_Port, NL_Port

Connection :

- Scaled bandwidth.
- N_Ports and NL_Ports connect to other nodes via a switch.
- Each connection is full-duplex for very high system bandwidth.
- N_Ports connect to F_Ports and NL_Ports connect to FL_Ports.
- Switches connect to other switches via an expansion port, or E_Port.
- A switch port that can act as either an F_Port or an E_Port is called a generic port, or G_Port.
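
The port pairings described in the topology slides can be collected in a small lookup. The Python sketch below is illustrative only: `ALLOWED_LINKS` and `can_link` are hypothetical names, and the table encodes just the pairings stated above (two N_Ports for point to point, NL_Ports with each other on a loop, N_Port to F_Port, NL_Port to FL_Port, E_Port to E_Port). A G_Port is assumed to have already settled into one of its concrete roles before the check.

```python
# Port pairings taken from the topology slides; all names are illustrative.
ALLOWED_LINKS = {
    frozenset(["N_Port"]): "point to point",          # N_Port <-> N_Port
    frozenset(["NL_Port"]): "loop",                    # NL_Port <-> NL_Port
    frozenset(["N_Port", "F_Port"]): "switched",
    frozenset(["NL_Port", "FL_Port"]): "switched",
    frozenset(["E_Port"]): "switched (inter-switch link)",
}


def can_link(port_a: str, port_b: str):
    """Return the topology for a listed pairing, or None if it is not listed."""
    return ALLOWED_LINKS.get(frozenset([port_a, port_b]))


print(can_link("N_Port", "F_Port"))    # a node attached to a switch
print(can_link("N_Port", "FL_Port"))   # not one of the listed pairings
```

Using `frozenset` keys makes the lookup order-independent, so `can_link("F_Port", "N_Port")` gives the same answer as the first call.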

Page 47: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Physical Components

Adapters or Interface Cards

Cable Types

Media Converters

Hubs

Routers and Bridges

Switches and Fabrics

Page 48: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Protocol Layers

FC-4 : Upper-layer protocol mappings. Channels : SCSI, HIPPI, IPI-3, SBCCS. Networks : IP, IEEE 802, ATM.

FC-3 : Common Services

FC-2 : Signaling Protocol

FC-1 : Transmission Protocol (encode / decode)

FC-0 : Physical Interface / Media at 133 Mbps, 266 Mbps, 531 Mbps, 1.05 Gbps, 2.12 Gbps or 4.25 Gbps

FC-PH : The physical and signaling standard, covering FC-0, FC-1 and FC-2.

Page 49: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Physical Layer (1)

FC-0 : Physical Interface and Media

Allows a variety of physical media and data rates.

Physical media : optical fiber, coaxial cable and shielded twisted pair.

FC-1 : Byte synchronization and encoding

Defines the signal encoding technique used for transmission and for synchronization across the point-to-point link.

The 8B/10B encoding/decoding scheme provides DC balance, is simple to implement, and provides useful error-detection capability.

A special code character maintains byte and word alignment.
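
Because 8B/10B sends 10 line bits for every 8 data bits, the usable data rate is 80% of the line rate. The short Python calculation below works this through. It assumes a nominal line rate of 1.0625 Gbaud, a figure that is not stated on this slide (the comparison table later rounds it to 1.06 Gbps), and the function name is made up for illustration.

```python
def payload_rate_mbytes(line_rate_gbaud: float) -> float:
    """Usable byte rate in MB/s of a link that uses 8B/10B encoding."""
    line_bits_per_second = line_rate_gbaud * 1e9
    data_bits_per_second = line_bits_per_second * 8 / 10  # 8 data bits per 10 line bits
    return data_bits_per_second / 8 / 1e6                  # bits -> bytes -> MB


# Assumed nominal line rate of 1.0625 Gbaud.
print(payload_rate_mbytes(1.0625))  # 106.25
```

The result, a little over 100 MB/s, agrees with the throughput figure quoted in the characteristics slide.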

Page 50: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Physical Layer (2)

FC-2 : Actual transport mechanism

A robust 32-bit CRC detects transmission errors to ensure data integrity.

Various classes of service through the fabric.

Constructs to support efficient multiplexing of operations.

A flow control scheme that provides a guaranteed delivery capability.

A built-in protocol.

Optional headers that may be used for network routing.

Generic functions that are common across multiple upper-level protocols.

A process that provides segmentation and reassembly of data.
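
Two of the FC-2 functions above, segmentation and reassembly and CRC-based error detection, can be sketched together in Python. This is a simplified illustration, not the Fiber Channel wire format: `zlib.crc32` stands in for the frame CRC, the CRC here covers only the payload, and `segment`, `reassemble` and `MAX_PAYLOAD` are hypothetical names. The 2112-byte limit is the maximum data field shown on the frame-structure slide.

```python
import zlib

MAX_PAYLOAD = 2112  # maximum data field per frame, from the frame-structure slide


def segment(data: bytes, max_payload: int = MAX_PAYLOAD):
    """Split data into (payload, crc) pairs, one pair per frame."""
    frames = []
    for offset in range(0, len(data), max_payload):
        payload = data[offset:offset + max_payload]
        frames.append((payload, zlib.crc32(payload)))
    return frames


def reassemble(frames) -> bytes:
    """Verify each frame's CRC and join the payloads in order."""
    parts = []
    for index, (payload, crc) in enumerate(frames):
        if zlib.crc32(payload) != crc:
            raise ValueError(f"CRC mismatch in frame {index}")
        parts.append(payload)
    return b"".join(parts)


message = bytes(range(256)) * 20   # 5120 bytes, which needs 3 frames
frames = segment(message)
print(len(frames), reassemble(frames) == message)  # 3 True
```

Flipping any single bit of a payload before calling `reassemble` raises `ValueError`, which is the error-detection behaviour the CRC is there to provide.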

Page 51: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Class of Service

Class 1 : Acknowledged connection service

Flow control : Buffer to buffer, node to node

Delivery : Reliable and guaranteed

Use : Large files and absolute quality of service

Class 2 : Acknowledged connectionless service

Flow control : Buffer to buffer, node to node

Delivery : Guaranteed

Use : Networking, OLTP

Class 3 : Unacknowledged connectionless service

Flow control : Buffer to buffer

Delivery : Reliable

Use : Storage networking, broadcast, multicast

Class 4 : Fractional-bandwidth connection-oriented service

Flow control : Buffer to buffer, node to node

Delivery : Reliable and guaranteed

Use : Real-time systems, isochronous data, real-time audio / video

Class 6 : Simplex connection service

Flow control : None

Delivery : Reliable

Use : Video distribution (one to many), data acquisition
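
The buffer-to-buffer flow control listed for most classes is credit based: a sender may transmit only while it holds credits, and each credit corresponds to a free receive buffer at the other end of the link. The Python sketch below models that idea in its simplest form. The class and method names are hypothetical, and the credit return is modelled as a plain method call rather than a signal on the wire.

```python
from collections import deque


class CreditedLink:
    """Sender's view of a link with buffer-to-buffer credit (illustrative)."""

    def __init__(self, credits: int):
        self.credits = credits          # free receive buffers advertised by the peer
        self.receiver_queue = deque()   # frames currently held in the peer's buffers

    def send(self, frame: bytes) -> bool:
        """Transmit only if a credit is available, so the receiver is never overrun."""
        if self.credits == 0:
            return False                # sender waits; the frame is not dropped
        self.credits -= 1
        self.receiver_queue.append(frame)
        return True

    def receiver_drains_one(self) -> bytes:
        """Receiver frees one buffer and hands one credit back to the sender."""
        frame = self.receiver_queue.popleft()
        self.credits += 1
        return frame


link = CreditedLink(credits=2)
results = [link.send(b"f1"), link.send(b"f2"), link.send(b"f3")]
link.receiver_drains_one()              # frees a buffer, returns a credit
results.append(link.send(b"f3"))
print(results)  # [True, True, False, True]
```

Because the sender stalls instead of overrunning the receiver, no frame is lost to buffer overflow, which is the basis of the "congestion data loss : none" entry in the technology comparison.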

Page 52: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Physical Frame Structure

[Figure: an exchange is made up of sequences, and each sequence is made up of frames.]

Frame fields, in transmission order :

Start of Frame (ordered set) : 4 bytes

Header : 24 bytes

Payload : up to 2112 bytes (64-byte optional header + 2048-byte payload)

CRC check : 4 bytes

End of Frame (ordered set) : 4 bytes
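
The field sizes on this slide are enough to work out the largest frame and how much of it is payload. The Python sketch below adds them up. The constant and function names are made up for illustration, and the calculation ignores anything transmitted between frames.

```python
SOF_BYTES = 4           # start-of-frame ordered set
HEADER_BYTES = 24
MAX_DATA_FIELD = 2112   # 64-byte optional header + 2048-byte payload
CRC_BYTES = 4
EOF_BYTES = 4           # end-of-frame ordered set


def frame_bytes(data_field: int) -> int:
    """Total frame length for a data field of the given size."""
    if not 0 <= data_field <= MAX_DATA_FIELD:
        raise ValueError("data field must be between 0 and 2112 bytes")
    return SOF_BYTES + HEADER_BYTES + data_field + CRC_BYTES + EOF_BYTES


def efficiency(useful_bytes: int, data_field: int = MAX_DATA_FIELD) -> float:
    """Fraction of the frame that carries useful bytes."""
    return useful_bytes / frame_bytes(data_field)


print(frame_bytes(MAX_DATA_FIELD))     # 2148
print(f"{efficiency(2112):.1%}")       # full data field used as payload
print(f"{efficiency(2048):.1%}")       # 64 bytes spent on optional headers
```

Even with the optional headers present, fixed overhead stays within a few percent of a full-size frame.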

Page 53: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Fiber Channel Upper Layer (FC-3 & FC-4)

FC-3 : Common services layer - port-related services.

Striping : Makes use of multiple N_Ports in parallel to transmit a single information unit across multiple links simultaneously.

Hunt Groups : A set of associated N_Ports at a single node.

Multicast : Delivers a single transmission to multiple destination N_Ports.

FC-4 : Upper layer protocol mapping - supports various protocols.

- SCSI (Small Computer System Interface)
- IPI (Intelligent Peripheral Interface)
- HIPPI (High Performance Parallel Interface)
- IP (Internet Protocol)
- ATM (Asynchronous Transfer Mode)
- Link Encapsulation (FC-LE) using IS 8802.2
- SBCCS (Single Byte Command Code Set Mapping)
- Audio Video Fast File Transfer
- Audio Video Real Time Stream Transfer
- Avionics
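
The striping service described under FC-3 can be pictured as dealing pieces of one information unit across several ports and putting them back in order at the far end. The Python sketch below shows a round-robin version of that idea. It is an assumption-laden illustration: the slide does not say how data is divided among ports, and `stripe`, `unstripe` and the chunk size are hypothetical.

```python
def stripe(data: bytes, num_ports: int, chunk_size: int):
    """Deal fixed-size chunks of data across num_ports lanes, round robin."""
    lanes = [[] for _ in range(num_ports)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        lanes[index % num_ports].append(data[offset:offset + chunk_size])
    return lanes


def unstripe(lanes) -> bytes:
    """Rebuild the original data by reading the lanes in the same round-robin order."""
    chunks = []
    longest = max((len(lane) for lane in lanes), default=0)
    for round_index in range(longest):
        for lane in lanes:
            if round_index < len(lane):
                chunks.append(lane[round_index])
    return b"".join(chunks)


unit = bytes(range(10))
lanes = stripe(unit, num_ports=3, chunk_size=2)
print([len(lane) for lane in lanes])   # [2, 2, 1]
print(unstripe(lanes) == unit)         # True
```

With three lanes, each link carries roughly a third of the information unit, which is where the parallel speed-up comes from.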

Page 54: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Typical Fiber Channel SAN Environment 1

Page 55: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Typical Fiber Channel SAN Environment 2

[Figure: servers (Sun Solaris, Linux, HP-UX, IBM AIX, Compaq Tru64, MS NT and others) attach to both a LAN/MAN/WAN and a SAN. The SAN is built from McData, Brocade, QLogic and Vixel devices plus a Crossroads bridge, and connects IBM, EMC and HDS disk subsystems, an EMC Clariion, a Compaq RAID, a Dell JBOD, and IBM and STK tape subsystems.]

Page 56: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Typical Fiber Channel SAN Environment 3

Page 57: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Technology Comparison

Feature | Fibre Channel | Gigabit Ethernet | ATM

Technology application | Storage, network, video, clusters | Network | Network, video

Topologies | Point to point, loop hub, switched | Point to point, hub, switched | Switched

Baud rate | 1.06 Gbps | 1.25 Gbps | 622 Mbps

Scalability to higher data rates | 2.12, 4.25 Gbps | Not defined | 1.24 Gbps

Guaranteed delivery | Yes | No | No

Congestion data loss | None | Yes | Yes

Frame size | Variable, 0-2 KB | Variable, 0-1.5 KB | Fixed, 53 bytes

Flow control | Credit based | Rate based | Rate based

Physical media | Copper and fibre | Copper and fibre | Copper and fibre

Protocols supported | Network, SCSI, video | Network | Network, video

Page 58: Distributed Processing Systems (Networking Technologies) Distributed Processing Systems (Networking Technologies) 오 상 규 서강대학교 정보통신 대학원 Email : sgoh@macrmimpact.com

Summary

Price Performance Leadership

Solution Leadership

Reliable

Gigabit Bandwidth Now

Multiple Topologies

Multiple protocols

Scalable

Congestion free

High Efficiency