354 33 powerpoint-slides ch17
TRANSCRIPT
-
8/12/2019 354 33 Powerpoint-slides CH17
1/78
N. Senthil Kumar, M. Saravanan & S. Jeevananthan
-
Chapter 17
MULTIPROCESSOR CONFIGURATION
Oxford University Press 2013. All rights reserved.
-
Introduction
The speed of any microprocessor-based system depends on the clock frequency at which the connected processors and peripherals work.
When bulk I/O data transfer is done under the control of the microprocessor, the processor has to spend most of its time idle.
Enhancing the speed of a system involving several connected processors requires a certain topology. Such a system is known as a multiprocessor system.
-
Introduction
The simplest type of multiprocessor system contains a CPU (such as the 8086) and a numeric data processor (NDP) and/or an input/output processor (IOP).
The NDP (such as the 8087) and the IOP work in synchronism with the main processor to complete specific tasks and are known as coprocessors.
Additional hardware circuits, such as a bus arbiter and a bus controller, may be needed to coordinate the activities of the processors working at the same time in the system.
-
Multiprocessor Configuration vs. Single-chip Microprocessor
Several processors may be combined to fit the needs of an application while avoiding the expense of the unneeded capabilities of a single complex multiple-chip processor.
The modularity of a multiprocessor system provides a means for expansion, because it is easy to add more processors as the need arises.
-
Multiprocessor Configuration vs. Single-chip Microprocessor
In a multiprocessor system, tasks are divided among the modules. If a failure occurs, it is easier and cheaper to find and replace the malfunctioning processor than to find and replace the failing part in a complex processor.
-
Different Configurations of Multiprocessor System
The maximum mode operation of the 8086 is specifically designed to implement multiprocessor systems.
There are three basic configurations: coprocessor, closely coupled, and loosely coupled.
The first two configurations are very similar in that the CPU (i.e., the 8086) and the external or supporting processor share not only the entire memory and I/O subsystem but also the same bus control logic and clock generator, as shown in the figure.
-
Closely Coupled Multiprocessor Configuration
In a closely coupled configuration, the supporting processor may act independently of the CPU, whereas in a coprocessor design it depends on the CPU and must interact directly with it.
-
Loosely Coupled Multiprocessor Configuration
In a loosely coupled configuration, several modules may share the system resources, and the system bus control logic must resolve the bus contention problem.
Each potential bus master runs independently, and there are no direct connections between them, as shown in the figure.
In addition to the shared resources, each module may include its own memory and I/O devices.
-
Loosely Coupled Multiprocessor Configuration
The processors in the separate modules can simultaneously access their private subsystems through the local buses and perform their local data references and instruction fetches independently, thus improving the degree of concurrent processing.
In a loosely coupled multiprocessor system, two 8086 processors cannot be tied directly together.
Each CPU has its own bus control logic, and bus arbitration is resolved by extending this logic and adding external logic that is common to all master modules.
-
Loosely Coupled Multiprocessor Configuration
Therefore, several CPUs can form a very large system, and each CPU may have independent processors and/or a coprocessor attached to it.
-
Advantages of Loosely Coupled Configuration
The system can be expanded in a modular form. Each bus master module is an independent unit that normally resides on a separate PC board; hence, a bus master module can be added or removed without affecting the other modules in the system.
High system throughput can be achieved by having more than one CPU.
A failure in one module does not cause a breakdown of the entire system, and the faulty module can be easily detected and replaced.
-
Advantages of Loosely Coupled Configuration
Each bus master may have a local bus to access dedicated memory or I/O devices, so that a greater degree of parallel processing can be achieved.
-
Daisy Chaining
There are three schemes for establishing priority: daisy chaining, polling, and independent requesting.
-
Daisy Chaining
In this simple and low-cost method, all the masters use the same line for making bus requests.
In response to a bus request (BR) signal, the controller sends a bus grant (BG) signal if the bus busy signal is inactive.
The grant signal propagates serially through each master until it encounters the first one that is requesting access to the bus.
-
Who Gets Priority in Daisy Chaining?
The first requesting module blocks the propagation of the bus grant signal, activates the bus busy line, and gains control of the bus.
Any other requesting module located after the master that has gained control of the bus will not receive the grant signal; therefore, the priority is determined by the physical location of the modules.
The requesting module located closest to the controller has the highest priority.
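The grant propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `daisy_chain_grant` and the list-of-booleans model of the request line are hypothetical, and index 0 is taken to be the master physically closest to the controller.

```python
def daisy_chain_grant(requests, bus_busy=False):
    """Return the position of the master that captures the bus grant, or None.

    requests[i] is True when master i is asserting the shared bus-request line.
    Index 0 is assumed to be the master closest to the controller.
    """
    # The controller issues a grant only when someone requests and the bus is free.
    if bus_busy or not any(requests):
        return None
    # The grant ripples down the chain; the first requesting master blocks it.
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None


# Masters 1 and 2 both request; master 1 is closer to the controller and wins.
winner = daisy_chain_grant([False, True, True])
```

Because the loop always starts at index 0, the priority is fixed by position, which mirrors the fixed physical priority of the scheme.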
-
Merits and Demerits of Daisy Chaining
Compared to the other two methods, the daisy chain scheme requires the least number of control lines, and this number is independent of the number of modules in the system.
However, arbitration is slow due to the propagation delay of the bus grant signal through the different masters.
-
Merits and Demerits of Daisy Chaining
This delay is proportional to the number of modules; therefore, a daisy-chain-based system is limited to multiprocessor systems having only a few modules.
Further, the priority of each module is fixed by its physical location, and the failure of a module in the system causes the whole system to fail.
-
Polling
It uses a set of lines sufficient to address each module.
In response to a bus request (BR), the controller generates and sends out a sequence of module addresses to the requesting modules.
-
Polling
When a requesting module recognizes its address, it activates the busy line and begins to use the bus.
The major advantage of polling is that the priority can be changed dynamically by altering the polling sequence (i.e., the order in which the module addresses are sent) stored in the controller.
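A minimal sketch of polling arbitration follows, to show how changing the stored sequence changes the priority. The function name `poll_for_grant` and the use of small integers as module addresses are assumptions made for the example.

```python
def poll_for_grant(requesting_modules, polling_sequence):
    """Return the address of the first requesting module in polling order.

    requesting_modules -- set of module addresses currently asserting BR.
    polling_sequence   -- order in which the controller sends out addresses.
    """
    for address in polling_sequence:
        # A requesting module that recognizes its address takes the bus.
        if address in requesting_modules:
            return address
    return None


requesting = {1, 3}
first_order = poll_for_grant(requesting, [0, 1, 2, 3])   # module 1 is reached first
second_order = poll_for_grant(requesting, [3, 2, 1, 0])  # reversed order favours module 3
```

The same set of requests yields a different winner once the sequence is altered, with no rewiring of the modules.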
-
Independent Requesting
This scheme is the fastest but needs the most BR and BG lines (2m lines for m modules); it resolves the priority in a parallel fashion.
-
Independent Requesting
Each module has a separate pair of bus request (BR) and bus grant (BG) lines, and each pair has a priority assigned to it.
The controller includes a priority decoder, which selects the request with the highest priority and returns the corresponding bus grant signal.
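The priority decoder can be modelled roughly as below. The names `priority_decode` and `control_lines_needed` are hypothetical, and the convention that a smaller number means a higher priority is an assumption of this sketch, not something the slides specify.

```python
def priority_decode(requests, priorities):
    """Return the index of the module whose BG line is asserted, or None.

    requests[i]   -- True when module i asserts its own BR line.
    priorities[i] -- rank of BR/BG pair i; smaller means higher priority
                     (a convention assumed for this sketch).
    """
    if len(requests) != len(priorities):
        raise ValueError("one priority is needed per BR/BG pair")
    candidates = [i for i, asserted in enumerate(requests) if asserted]
    if not candidates:
        return None
    # All requests are examined at once, so the decision does not ripple
    # through the modules as it does in a daisy chain.
    return min(candidates, key=lambda i: priorities[i])


def control_lines_needed(module_count):
    """One BR line and one BG line per module: 2m lines for m modules."""
    return 2 * module_count
```

For example, with requests from modules 0 and 2 and priorities `[2, 0, 1]`, module 2 is granted the bus because its rank (1) beats module 0's rank (2).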
-
Interconnection Topologies in a Multiprocessor System
A microprocessor with its external bus connections needs memory to form a minimum workable processing system.
In a multiprocessor system, a number of microprocessors are connected with each other using a single bus.
The bus is also used to address a multi-port memory or a shared single I/O port.
The method of communication among the microprocessors in a multiprocessor system
-
Shared Bus Architecture
The shared bus architecture uses a common memory, which may be partitioned into local memory banks for different processors.
Only one processor at a time performs a bus cycle to fetch instructions or data from the memory.
-
Multi-port Memory
The processors P1 and P2 address a multi-port memory, which can be accessed by both processors at the same time.
Each processor also has a local memory, which it uses to store the instructions and data for the execution of its individual task.
The multi-port memory may be used for storing the instructions, data, and results to be shared by more than one processor.
-
Linked Input/Output
The linked input/output interconnection uses the input/output capabilities of a microprocessor-based system to communicate with other systems.
Direct access to common instructions and data held in a local system memory is not possible in this method.
-
Crossbar Switching
It uses an extension of the concept of shared memory for a number of processors.
In this method, more than one processor can simultaneously access different shared memory modules as long as there is no conflict.
The total memory is divided into modules. While one processor is accessing a memory module, another processor is denied access to the same module until the former processor relinquishes it.
-
Crossbar Switching
The crossbar switch provides the interconnection paths between the memory modules and the processors.
In a crossbar switch interconnection, several parallel data paths are possible. Each node of the crossbar represents a bus switch.
All these nodes may be controlled by one of these processors or by a separate one.
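The conflict rule for crossbar switching (parallel access to different modules, denial on the same module) can be illustrated with a small sketch. The function name `crossbar_arbitrate`, the processor and module labels, and the first-come ordering of requests are all assumptions made for the example.

```python
def crossbar_arbitrate(requests, busy_modules=()):
    """Decide which processors get a path through the crossbar.

    requests     -- list of (processor, memory_module) pairs in arrival order.
    busy_modules -- modules already held by some processor.
    Returns (granted, denied): a dict processor -> module, and a list of
    processors that must wait until the module is relinquished.
    """
    in_use = set(busy_modules)
    granted = {}
    denied = []
    for processor, module in requests:
        if module in in_use:
            denied.append(processor)      # conflict: module is already in use
        else:
            in_use.add(module)            # close the switch at this node
            granted[processor] = module
    return granted, denied


granted, denied = crossbar_arbitrate([("P1", "M1"), ("P2", "M2"), ("P3", "M1")])
```

Here P1 and P2 proceed in parallel on different modules, while P3 waits because M1 is held by P1.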
-
Physical Interconnections between Processors in a Multiprocessor System
Star configuration
Loop configuration
Complete interconnection
Regular topologies
Irregular topologies
-
Star Configuration
All the processors are connected to a central switching element via dedicated paths.
The central switching element may be an independent processor.
The switching element controls the interconnections between the processing elements.
-
Ring or Loop Configuration
-
Completely Connected Configuration
Every processing element can communicate directly with any other processing element.
For n processing elements, the required number of dedicated interconnection paths is n(n-1)/2.
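As a quick check on how fast the number of dedicated paths grows, the sketch below counts one path per pair of processing elements, which gives n(n-1)/2 for n elements. This pair-count is a standard result and is stated here as an assumption about the equation the slide refers to; the function names are hypothetical.

```python
from itertools import combinations


def dedicated_paths(n):
    """Number of paths when every pair of n elements has its own link."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def enumerate_paths(n):
    """List every (i, j) pair with i < j, one per dedicated path."""
    return list(combinations(range(n), 2))
```

For four processing elements this gives six paths, and the count grows quadratically, which is why complete interconnection is costly for large systems.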
-
Regular Topology
The processors can be arranged in any of the regular structures, such as linear array, square, hexagonal, or cubical configurations.
Each of the processors (nodes) has a local memory that can be accessed only by that processor.
Each of the processors can communicate with a fixed number of neighbours in the specific regular structure.
-
Irregular Topology
The processors in this scheme do not follow any uniform or regular connection pattern.
The number of neighbouring processors with which a processor can communicate is not fixed and may even be programmable.
-
Operating System Used in a Multiprocessor System
Once the microprocessors are arranged in a particular topology, an appropriate operating system and system software are required that can work in coordination with the new system resources.
An operating system is a program that resides in the computer memory and acts as an interface between the user or application program and the computer resources.
-
Operating System Used in a Multiprocessor System
The success of a multiprocessor system relies on a suitable operating system.
An operating system designed for a single processor cannot be used for a multiprocessor system.
-
Distributed Operating System
Distributed operating systems are designed to run parallel processes.
Hence, it is essential that a proper environment exists for concurrent processes to communicate and cooperate in order to complete the allotted task.
The features expected from a distributed operating system used in a multiprocessor system are listed below.
-
Features of Distributed OS
A distributed operating system should provide a mechanism for inter-process and inter-processor communication.
A distributed operating system must be capable of handling structural or architectural changes in the system that arise for expected or unexpected reasons, such as faults or modifications in the configuration.
The distributed operating system should also guard against unauthorized data access and provide data protection, as the data sets in these systems are referenced by more than one processor.
-
Features of Distributed OS
The distributed operating system must have a mechanism to split the given tasks into concurrent subtasks that can be executed in parallel on different processors, and to collect the results of the subtasks and process them further to obtain the final result.
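The split, execute, and collect pattern can be sketched with Python's standard `concurrent.futures` module. This is only an analogy: worker threads stand in for separate processors, and the task (a sum of squares) and the function names are chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(items, parts):
    """Split a list of work items into at most `parts` contiguous chunks."""
    items = list(items)
    if not items:
        return []
    parts = max(1, min(parts, len(items)))
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The subtask: each worker handles one chunk independently."""
    return sum(x * x for x in chunk)


def run_in_parallel(items, workers=3):
    """Split the task, run the subtasks concurrently, then combine the results."""
    subtasks = split_into_subtasks(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(sum_of_squares, subtasks))
    # Further processing of the collected partial results gives the final result.
    return sum(partial_results)


total = run_in_parallel(range(10), workers=3)
```

The final combination step is kept separate from the subtasks, matching the requirement that the results be collected and then processed further.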
-
Multiprocessor System Having 8086 and 8087
Typical multiprocessor system consisting of the 8086 and the 8087 (numeric coprocessor)
The 8087 is a coprocessor designed to work under the control of the 8086, and it gives additional numeric processing capabilities to the 8086.
The 8087 is a 40-pin IC and is available in 5, 8, and 10 MHz versions compatible with different versions of the 8086.
-
Multiprocessor System Having 8086 and 8087
When the 8086 is interfaced with the 8087, the instructions of the 8087 can be included in the program to be executed by the 8086. The 8086 performs the opcode fetch cycles and identifies the instructions for the 8087.
-
Architecture of 8087
The 8087 has two internal sections: the Control Unit (CU) and the Numeric Extension Unit (NEU).
The NEU executes all the numeric processor instructions, while the CU receives and decodes instructions and reads or writes memory operands.
The control unit is also responsible for establishing communication between the CPU (8086) and memory, and for coordinating the internal coprocessor execution.
-
Architecture of 8087
The internal data bus in the 8087 is 84 bits wide, comprising a 68-bit fraction, a 15-bit exponent, and a sign bit.
The microcode control unit in the 8087 generates the control signals required for execution of the 8087 instructions.
The 8087 contains a programmable shifter, which is responsible for shifting the operands during the execution of instructions such as FMUL and FDIV.
The data bus interface in the 8087 connects the internal data bus of the 8087 with the 8086's system data bus.
-
Pin Details of 8087
-
Interconnection of 8087 with 8086
The 8087 can be connected with the 8086 only when the 8086 is operating in maximum mode.
In maximum mode, all control signals are derived using a separate chip known as the Bus Controller (8288).
The BUSY pin of the 8087 is connected to the TEST pin of the 8086. The QS0 and QS1 lines of the 8087 are directly connected to the corresponding pins in the 8086-based system.
The pins AD15-AD0, A19/S6-A16/S3, BHE/S7, RESET, and READY of the 8087 are connected to the corresponding pins of the 8086.
-
Multiprocessor System Having 8086 and 8089
When I/O devices such as the serial port and parallel port in a personal computer are accessed by non-DMA data transfer, the CPU (such as the 8086) is required to set up the interfacing chips used to access the I/O devices and to perform the actual data transfer.
For high-speed devices, data are transferred using DMA, but the CPU still has to set up the device controller, initiate the DMA operation, and check the post-transfer status after the completion of each DMA operation.
-
Multiprocessor System Having 8086 and 8089
The 8089 I/O processor (IOP) is designed to handle the tasks involved in I/O processing. Unlike a DMA controller, an IOP can fetch and execute its own instructions.
-
8089
The instruction set of the 8089 is specifically designed for I/O operations; in addition to data transfer, its instructions can perform arithmetic and logic operations, branches, searching, and translation.
The CPU communicates with the 8089 through memory-based control blocks. The CPU prepares control blocks that describe the task to be performed and then dispatches the task to the 8089 through an interrupt-like signal. The 8089 reads the control blocks to locate a program called a channel program, which is written using the 8089's instruction set.
-
8089
The 8089 then performs the assigned task by fetching and executing instructions from the channel program. When the 8089 has finished the task, it informs the CPU either through an interrupt or by updating a status location in memory.
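The handshake between the CPU and the 8089 can be pictured with the toy model below. It is a deliberately simplified sketch: the dictionary used as shared memory, the field names `program` and `status`, and the representation of channel-program instructions as Python callables are all hypothetical and do not reflect the actual 8089 control block layout.

```python
def cpu_dispatch(memory, control_block_addr, channel_program):
    """CPU side: describe the task in shared memory, then signal the IOP.

    The returned address stands in for the interrupt-like attention signal.
    """
    memory[control_block_addr] = {"program": channel_program, "status": "pending"}
    return control_block_addr


def iop_service(memory, control_block_addr):
    """IOP side: locate the channel program, execute it, and report status."""
    block = memory[control_block_addr]
    # Fetch and execute each instruction of the channel program in turn.
    results = [instruction() for instruction in block["program"]]
    # Update the status location so the CPU can see that the task is finished.
    block["status"] = "done"
    return results


shared_memory = {}
attention = cpu_dispatch(shared_memory, 0x0500,
                         [lambda: "set up device", lambda: "transfer block"])
outcome = iop_service(shared_memory, attention)
```

After the call, the CPU can check `shared_memory[0x0500]["status"]` instead of supervising the transfer itself, which is the point of delegating the work to an I/O processor.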
-
Pin Details of 8089
-
Local and Remote Operation of 8089
The 8089 assumes all of the work involved in an I/O transfer, including device setup, DMA operation, and programmed I/O, thereby relieving the CPU of the burden of I/O processing.
This allows the CPU to concentrate on higher-level tasks while the 8089 takes care of the I/O processing.
This distributed processing approach greatly simplifies system software and hardware efforts and improves system performance and flexibility.
The 8089 may be operated in a local (closely coupled) configuration or a remote (loosely coupled) configuration.
-
Local Configuration
In a local configuration, the 8089 shares the bus interface with the host (8086) by using its RQ/GT pins. All resources are accessed through the system bus.
-
Remote Configuration
In a remote configuration, the 8089 may have its own local I/O bus, and it requires a bus arbiter and controller, address latches, and data transceivers for accessing the shared system bus.
-
8089 (IOP) Architecture
Each of the two channels can be programmed and operated independently while sharing the common control logic and ALU.
The channel control pointer (CCP) cannot be manipulated by the user.
It stores the address of the control block (CB) for channel 1 during the initialization sequence.
The control block (CB) for channel 2 starts at the address obtained by adding 8 to the contents of the CCP.
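The address arithmetic for the two control blocks is small enough to state as code. The function name `control_block_address` and the example CCP value are assumptions for illustration; only the offset of 8 for channel 2 comes from the text.

```python
CHANNEL_2_OFFSET = 8  # channel 2's CB starts 8 locations after the CCP address


def control_block_address(ccp, channel):
    """Return the starting address of the control block for channel 1 or 2."""
    if channel == 1:
        return ccp
    if channel == 2:
        return ccp + CHANNEL_2_OFFSET
    raise ValueError("the 8089 has only channels 1 and 2")


# With a hypothetical CCP value of 0x0500:
cb1 = control_block_address(0x0500, 1)
cb2 = control_block_address(0x0500, 2)
```

With the CCP at 0x0500, channel 1's block starts at 0x0500 and channel 2's at 0x0508.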
-
8089 (IOP) Architecture
To dispatch a task to either channel, the CPU (8086) sends out a channel attention (CA) signal along with the select (SEL) signal, which selects channel 1 (if SEL=0) or channel 2 (if SEL=1).
Since the channels occupy two consecutive I/O port addresses, the A0 address line of the 8086 is connected to the SEL pin, so that one channel is selected when A0=0 and the other when A0=1.
Each channel has an identical set of registers, each set being divided into two groups according to size.
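Because SEL is driven by address line A0, the channel can be read off the least significant bit of the port address. The sketch below assumes the mapping given in the text (SEL=0 selects channel 1, SEL=1 selects channel 2); the function names and the example port addresses are hypothetical.

```python
def sel_from_port(port_address):
    """SEL follows A0, the least significant bit of the port address."""
    return port_address & 1


def channel_selected(port_address):
    """Channel 1 when SEL=0, channel 2 when SEL=1."""
    return 1 if sel_from_port(port_address) == 0 else 2


# Two consecutive (hypothetical) port addresses reach the two channels.
even_port_channel = channel_selected(0x00F0)
odd_port_channel = channel_selected(0x00F1)
```

An access to the even port therefore raises channel attention for channel 1, and an access to the next (odd) port does so for channel 2.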
-
8089 (IOP) Architecture
The pointer group consists of the registers having 20 bits, and the register group consists of the registers having 16 bits.
Each pointer, with the exception of PP, has an associated tag bit. When the pointer is used to access a memory operand, the tag bit indicates whether the contents of that pointer represent a 20-bit system (i.e., memory) space address (if tag=0) or a 16-bit local (i.e., I/O) space address (if tag=1).
In accessing the local space, only the low-order 16 bits of the pointer are used as the address. Register PP always points to an address in the system space.
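How the tag bit changes the interpretation of a pointer can be shown with two bit masks. The function name `resolve_pointer` and the string labels for the two address spaces are assumptions of this sketch.

```python
SYSTEM_SPACE_MASK = 0xFFFFF  # 20-bit system (memory) space address
LOCAL_SPACE_MASK = 0xFFFF    # 16-bit local (I/O) space address


def resolve_pointer(pointer_value, tag):
    """Return (space, address) for a pointer register and its tag bit."""
    if tag == 0:
        return "system", pointer_value & SYSTEM_SPACE_MASK
    if tag == 1:
        # Only the low-order 16 bits are used in the local space.
        return "local", pointer_value & LOCAL_SPACE_MASK
    raise ValueError("the tag is a single bit: 0 or 1")
```

For instance, the pointer value 0x12345 refers to system address 0x12345 when the tag is 0, but to local address 0x2345 when the tag is 1.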
-
Registers in 8089 IOP
The registers GA, GB, GC, IX, BC, and MC can be used as general-purpose registers for arithmetic and logic operations in a channel program. In addition, they perform special functions when addressing memory operands and executing DMA operations.
A memory operand can only be addressed by using one of the pointers GA, GB, GC, or PP as a base register.
During a DMA operation, GA and GB are used as the source and destination pointers. If GA points to the source, then GB points to the destination, and vice versa.
-
Registers in 8089 IOP
When a translation operation is performed along with the DMA transfer, the contents of GC are used as the base address of a 256-byte translation table.
Register BC is used as the byte counter during a DMA transfer; it is decremented by 1 after each byte transfer and by 2 after each word transfer.
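The roles of BC and the translation table can be illustrated with a small software model of a DMA transfer. This is a sketch under assumptions, not a description of the 8089 hardware: the function name `dma_transfer` and its parameters are hypothetical, and the sketch applies translation to byte transfers only.

```python
def dma_transfer(source, byte_count, width=1, table=None):
    """Copy data while counting BC down; return (destination, remaining BC).

    source     -- bytes read through the source pointer.
    byte_count -- initial value of BC.
    width      -- 1 for byte transfers, 2 for word transfers.
    table      -- optional 256-byte translation table (byte transfers only
                  in this sketch).
    """
    if width not in (1, 2):
        raise ValueError("width must be 1 (byte) or 2 (word)")
    if table is not None and (width != 1 or len(table) != 256):
        raise ValueError("translation needs byte transfers and a 256-byte table")
    destination = bytearray()
    position = 0
    while byte_count >= width and position + width <= len(source):
        chunk = source[position:position + width]
        if table is not None:
            # Each source byte is used as an index into the translation table.
            chunk = bytes(table[b] for b in chunk)
        destination += chunk
        position += width
        byte_count -= width  # BC drops by 1 per byte or by 2 per word
    return bytes(destination), byte_count


invert = bytes(255 - i for i in range(256))  # example table: bitwise complement
translated, bc_left = dma_transfer(b"\x00\x01\x02", 3, table=invert)
```

In the example, three byte transfers bring BC from 3 down to 0, and each byte is replaced by its entry in the table.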