Interconnection Networks (crc.aut.ac.ir/cloudcourse93/14_in_940210.pdf), course: Computing...
MIMD
• Multiprocessor (shared memory)
[Figure: processors P1, P2, ..., Pn connected through an interconnection network (IN) to memory modules M1, M2, ..., Mn (tightly coupled architecture)]
Shared Memory
• Uniform Memory Access (UMA)
  • Tightly coupled system
• Non-Uniform Memory Access (NUMA)
  • Loosely coupled system
  • Cedar from the University of Illinois
  • BBN Butterfly
• Cache-Only Memory Access (COMA)
  • Uses global distributed caches
  • Kendall Square Research KSR-1
MIMD (cont.)
[Figure: Cedar (loosely coupled architecture). Global memory modules GM1, GM2, ..., GMn attach to a global interconnection network (Global IN); in each cluster, processors P1, P2, ..., Pn reach cluster memories CM1, CM2, CM3 through a cluster-level network labeled CIN]
MIMD (cont.)
[Figure: BBN Butterfly (loosely coupled architecture). Each processor is paired with its own memory (P1 M1, P2 M2, ..., Pn Mn), and the pairs communicate through an interconnection network (IN)]
MIMD (cont.)
• Data flow machine
  • An instruction is ready for execution as soon as the data for all of its operands are available
  • Purely self-contained
  • No program counter
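The firing rule above can be illustrated with a minimal sketch. The function `run_dataflow` and the node names are hypothetical, chosen only for this example; the sketch assumes each instruction is a function plus a list of operand names, and it fires every instruction whose operands are available, with no program counter deciding the order.

```python
import operator

def run_dataflow(nodes, inputs):
    """nodes: name -> (function, operand names); inputs: name -> value."""
    values = dict(inputs)          # data tokens produced so far
    pending = dict(nodes)          # instructions that have not fired yet
    order = []                     # firing order, driven only by data availability
    while pending:
        ready = [name for name, (_, ops) in pending.items()
                 if all(op in values for op in ops)]
        if not ready:
            raise ValueError("some operands never become available")
        for name in ready:
            func, ops = pending.pop(name)
            values[name] = func(*(values[op] for op in ops))
            order.append(name)
    return values, order

# (a + b) * (c - d), deliberately listed with the multiply first
nodes = {
    "prod": (operator.mul, ["sum", "diff"]),
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["c", "d"]),
}
values, order = run_dataflow(nodes, {"a": 1, "b": 2, "c": 5, "d": 3})
print(order, values["prod"])
```

The multiply is listed first but fires last, because its operands only exist after the add and the subtract have produced them.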
Hybrid Architecture
• Combines features of different architectures to provide better performance for parallel computations.
• Two types of parallelism
  • Control parallelism (MIMD)
  • Data parallelism (SIMD)
Neural Networks (Definition)
• A large number of PEs
• Connected in parallel
• Capable of learning
• Adaptive to change
• Able to cope with serious disruptions
• Power of connectivity vs. power of processors
Interconnection Network (IN)
• The measure of an IN is “how quickly it can deliver how much of what’s needed to the right place, reliably and at good cost and value”.
Performance Criteria for IN
• Latency
  • Transit time for a single message
• Bandwidth
  • How much message traffic the IN can handle, e.g., in Mbytes/s
• Connectivity
  • How many immediate neighbors each node has, and how often each neighbor can be reached
• Hardware cost
  • What fraction of the total hardware cost the IN represents, e.g., wires, switches, connectors, arbitration logic, …
Performance Criteria for IN (cont.)
• Reliability
  • Redundant paths
• Functionality
  • Additional functions performed by the IN, such as message combining and fault tolerance
  • E.g., data routing, interrupt handling, request/message combining, coherence
• Scalability
  • The ability of the network to be expanded
Definitions
• Node degree
  • The number of links (edges) connected to the node
• Diameter
  • The largest minimum distance between any pair of nodes. The minimum distance between a pair of nodes is the minimum number of communication links (hops) that data from one node must traverse to reach the other.
• Network size
  • The number of nodes in the IN
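The three definitions above can be checked on a small network. The following is a minimal sketch, assuming the network is connected and stored as an adjacency list (a dict from node to neighbor list); the helper names `degree`, `hops_from`, `diameter`, and `ring` are illustrative only.

```python
from collections import deque

def degree(adj, node):
    """Node degree: number of links connected to the node."""
    return len(adj[node])

def hops_from(adj, src):
    """Minimum hop count from src to every reachable node (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest minimum distance over all pairs of nodes (network assumed connected)."""
    return max(max(hops_from(adj, s).values()) for s in adj)

def ring(n):
    """Bidirectional ring with n >= 3 nodes, used here only as test input."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

net = ring(8)
print(len(net), degree(net, 0), diameter(net))  # network size, node degree, diameter
```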
Data Routing
• Functions in data routing
  • Shifting
  • Rotation
  • Permutation (one-to-one)
  • Broadcast (one-to-all)
  • Multicast (many-to-many)
  • Personalized communication (one-to-many)
  • Shuffle / Exchange
Static Networks (cont.)
• Ring
  • Degree = 2
  • Diameter:
    • Unidirectional: $n - 1$
    • Bidirectional: $\lceil (n-1)/2 \rceil$
Static Networks (cont.)
• Binary tree
  • Degree:
    • Leaf = 1
    • Root = 2
    • Others = 3
  • Diameter: $2(h - 1)$, where $h = \lceil \log_2 N \rceil$
Static Networks (cont.)
• Fat tree
  • Degree and diameter are the same as for the binary tree
  • Because traffic is heaviest toward the root, the number of links increases gradually toward the root (e.g., CM-5).
Static Networks (cont.)
• Star
  • Degree:
    • Central = $n - 1$
    • Others = 1
  • Diameter = 2
[Figure: network with eight nodes, N0 to N7]
Static Networks (cont.)
Shuffle($s_{n-1} s_{n-2} \ldots s_0$) = $s_{n-2} s_{n-3} \ldots s_0 s_{n-1}$
Exchange($s_{n-1} s_{n-2} \ldots s_1 s_0$) = $s_{n-1} s_{n-2} \ldots s_1 \bar{s}_0$
Source  Destination
000     000
001     010
010     100
011     110
100     001
101     011
110     101
111     111
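The two functions can be sketched on integer addresses, assuming `n` is the number of address bits. Shuffle is a one-bit left rotation and exchange complements the least significant bit; the function names are illustrative.

```python
def shuffle(addr, n):
    """Rotate the n-bit address left by one position."""
    msb = (addr >> (n - 1)) & 1
    return ((addr << 1) & ((1 << n) - 1)) | msb

def exchange(addr):
    """Complement the least significant address bit."""
    return addr ^ 1

# Print the shuffle mapping for 3-bit addresses (8 nodes)
for src in range(8):
    print(format(src, "03b"), format(shuffle(src, 3), "03b"))
```

For 3-bit addresses the printed pairs match the Source/Destination table above.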
Shuffle-Exchange Network
• For N=8
• Applications: the shuffle-exchange network provides suitable interconnection patterns for implementing certain parallel algorithms, such as polynomial evaluation, Fast Fourier Transform (FFT), sorting, and matrix transposition.
Mesh Routing Algorithm
• A simple routing algorithm routes a packet from source S to destination D in a mesh with $n^2$ nodes.
1. Compute the row distance R as $R = \lfloor D/n \rfloor - \lfloor S/n \rfloor$.
2. Compute the column distance C as $C = (D \bmod n) - (S \bmod n)$.
3. Add the values R and C to the packet header at the source node.
4. Starting from the source, send the packet for R rows and then for C columns.
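The four steps can be sketched as follows, assuming the nodes of the n × n mesh are numbered row by row starting from 0, so that moving one row changes the node number by n and moving one column changes it by 1. The function name `mesh_route` is illustrative, and a negative distance is taken to mean moving toward lower row or column indices.

```python
def mesh_route(s, d, n):
    """Return (R, C, path) for routing from node s to node d in an n x n mesh."""
    r = d // n - s // n          # row distance
    c = d % n - s % n            # column distance
    path = [s]
    node = s
    row_step = n if r > 0 else -n
    for _ in range(abs(r)):      # first travel R rows
        node += row_step
        path.append(node)
    col_step = 1 if c > 0 else -1
    for _ in range(abs(c)):      # then travel C columns
        node += col_step
        path.append(node)
    return r, c, path

print(mesh_route(6, 12, 4))
```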
Example (Mesh)
To route a packet from node 6 (i.e., S = 6) to node 12 (i.e., D = 12), the packet goes through two paths, as shown in the figure:
$R = \lfloor 12/4 \rfloor - \lfloor 6/4 \rfloor = 2$, $C = 0 - 2 = -2$.
Static Networks (cont.)
• HyperCube
  • Degree = n
  • Diameter = n
  • Address bits = n
  • Dimensions = n
  • Neighbors = n
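These properties follow from the usual addressing of an n-dimensional hypercube, in which two nodes are linked when their n-bit addresses differ in exactly one bit. A minimal sketch under that assumption, with illustrative function names:

```python
def hypercube_neighbors(node, n):
    """Flip each of the n address bits in turn: one neighbor per dimension."""
    return [node ^ (1 << i) for i in range(n)]

def hypercube_distance(a, b):
    """Minimum hop count: the number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

n = 3
print(hypercube_neighbors(0b000, n))
print(max(hypercube_distance(a, b) for a in range(2 ** n) for b in range(2 ** n)))
```

Each node has n neighbors, and the farthest pair of nodes differs in all n bits, which gives both the degree and the diameter listed above.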
Static Networks (cont.)
• n-Mesh
  • Degree:
    • Corner = n
    • Internal = 2n
    • n < Others < 2n
  • Diameter = $\sum_{i=0}^{n-1} (k_i - 1)$, where $k_i$ is the number of nodes along dimension $i$
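As a quick check of the degree and diameter figures, here is a small sketch that assumes a mesh without wraparound links, described by a tuple `dims` holding the number of nodes along each dimension. The function names are illustrative.

```python
def mesh_degree(coord, dims):
    """One link per direction in which a neighbor exists (no wraparound)."""
    return sum((c > 0) + (c < k - 1) for c, k in zip(coord, dims))

def mesh_diameter(dims):
    """Corner-to-opposite-corner distance: sum of (k_i - 1) over all dimensions."""
    return sum(k - 1 for k in dims)

dims = (4, 4)                      # a 2-dimensional 4 x 4 mesh
print(mesh_degree((0, 0), dims))   # corner node
print(mesh_degree((0, 1), dims))   # edge node
print(mesh_degree((1, 1), dims))   # internal node
print(mesh_diameter(dims))
```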
Static Networks (cont.)
• k-Ary n-cube
  • Degree:
    • If k = 2 then Degree = n
    • If k > 2 then Degree = 2n
  • Diameter = $n \lfloor k/2 \rfloor$
[Figure: (a) 4-ary 2-cube network; (b) 3-ary 3-cube network]
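The degree and diameter values can be checked by brute force on the two networks named in the figure. This is a minimal sketch, assuming each node is an n-tuple of digits in 0..k-1 and each dimension forms a ring of k nodes (wraparound links). The function names are illustrative.

```python
from collections import deque

def kary_ncube_neighbors(node, k):
    """Step +1 and -1 (mod k) along each dimension; duplicates collapse when k = 2."""
    result = set()
    for dim, digit in enumerate(node):
        for step in (1, -1):
            nxt = list(node)
            nxt[dim] = (digit + step) % k
            result.add(tuple(nxt))
    result.discard(node)
    return result

def kary_ncube_diameter(k, n):
    """Largest hop count from the all-zero node; every node looks the same by symmetry."""
    start = (0,) * n
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in kary_ncube_neighbors(u, k):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

print(len(kary_ncube_neighbors((0, 0), 4)), kary_ncube_diameter(4, 2))     # 4-ary 2-cube
print(len(kary_ncube_neighbors((0, 0, 0), 3)), kary_ncube_diameter(3, 3))  # 3-ary 3-cube
```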
Cache Coherence
• Multiprocessor environment
• Cache dedicated to each processor
• Cache coherence problem
  • How to keep multiple copies of the data consistent during execution?
Cache Coherence Mechanisms
1. Hardware-based schemes
  • Snoopy cache protocols
    • Used when the IN has broadcast features
  • Directory cache protocols
    • Used when the IN has no broadcast features
2. Software-based schemes
3. Combination of both
Snoopy Cache Protocol
[Figure: a two-processor configuration with copies of data block x, shown for write-through and write-back caches]
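The transcript does not preserve the details of the protocol on this slide, so the following is only a rough sketch of one common variant: a write-through cache with write-invalidate snooping over a broadcast bus. The classes `Bus` and `SnoopyCache` are hypothetical names for this illustration, not part of the lecture material.

```python
class Bus:
    """Shared broadcast medium: holds main memory and the list of attached caches."""
    def __init__(self):
        self.memory = {}
        self.caches = []

class SnoopyCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                      # address -> locally cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch the block from memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.memory[addr] = value        # write-through: memory is updated at once
        for other in self.bus.caches:        # the write is broadcast on the bus
            if other is not self:
                other.lines.pop(addr, None)  # snooping caches invalidate their copy

bus = Bus()
bus.memory["x"] = 1
p1, p2 = SnoopyCache(bus), SnoopyCache(bus)
p1.read("x"); p2.read("x")                   # both processors now cache block x
p1.write("x", 2)                             # P2's stale copy is invalidated
print(p2.read("x"))                          # P2 misses and refetches the new value
```

Without the invalidation loop, P2 would keep returning its old copy of x, which is the coherence problem described earlier.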
Dynamic Networks (Single-Stage)
In a single-stage network, any permutation can be reached in at most $3 \log_2 N - 1$ passes.
Interconnection Design Decisions
• Considerations in selecting the architecture of an interconnection network
  • Operation mode
  • Control strategy
  • Network topology
  • Switching methodology
  • Functional characteristics of the switch
Interconnection Design Decisions (cont.)
• Operation mode
  • Synchronous
  • Asynchronous
  • Combined
• Control strategy
  • Centralized control
  • Distributed control
• Switching methodology
  • Circuit switching
  • Packet switching
  • Integrated switching