354 33 powerpoint-slides ch17

8/12/2019 354 33 Powerpoint-slides CH17

N. Senthil Kumar, M. Saravanan & S. Jeevananthan



Chapter 17

MULTIPROCESSOR CONFIGURATION

Oxford University Press 2013. All rights reserved.


Introduction

The speed of any microprocessor-based system depends upon the clock frequency at which the connected processors and peripherals work.

When bulk I/O data transfer is done under the control of the microprocessor, the processor has to spend most of its time idle.

Enhancing the speed of a system that involves several connected processors requires a certain topology. Such a system is known as a multiprocessor system.



Introduction

The simplest type of multiprocessor system is one containing a CPU (such as the 8086) and a numeric data processor (NDP) and/or an input/output processor (IOP).

The NDP (such as the 8087) and the IOP work in synchronism with the main processor to complete specific tasks and are known as coprocessors.

Additional hardware circuits, such as a bus arbiter and a bus controller, may be needed to coordinate the activities of the processors working at the same time in the system.





Multiprocessor Configuration Vs. Single-chip Microprocessor

Several processors may be combined to fit the needs of an application while avoiding the expense of the unneeded capabilities of a single complex multiple-chip processor.

The modularity of a multiprocessor system provides a means for expansion, because it is easy to add more processors as the need arises.



Multiprocessor Configuration Vs. Single-chip Microprocessor

In a multiprocessor system, tasks are divided among the modules. If a failure occurs, it is easier and cheaper to find and replace the malfunctioning processor than it is to find and replace the failing part in a complex processor.



Different Configurations of Multiprocessor System

The maximum mode operation of the 8086 is specifically designed to implement multiprocessor systems.

There are three basic configurations: coprocessor, closely coupled and loosely coupled.

The first two configurations are very similar in that the CPU (i.e. the 8086) and the external or supporting processor share not only the entire memory and I/O subsystem, but also the same bus control logic and clock generator, as shown in the figure.



Closely Coupled Multiprocessor Configuration



Closely Coupled

In a closely coupled configuration the supporting processor may act independently of the CPU, but in a coprocessor design it is dependent on the CPU and must interact directly with the CPU.


Loosely Coupled Multiprocessor Configuration



Loosely Coupled

In a loosely coupled configuration, several modules may share the system resources, and the system bus control logic must resolve the bus contention problem.

Each potential bus master runs independently and there are no direct connections between them, as shown in the figure.

In addition to the shared resources, each module may include its own memory and I/O devices.


Loosely Coupled

The processors in the separate modules can simultaneously access their private subsystems through the local buses and perform their local data references and instruction fetches independently, thus improving the degree of concurrent processing.

In a loosely coupled multiprocessor system, two 8086 processors cannot be tied directly together.

Each CPU has its own bus control logic, and bus arbitration is resolved by extending this logic and adding external logic that is common to all master modules.


Loosely Coupled

Therefore, several CPUs can form a very large system, and each CPU may have independent processors and/or a coprocessor attached to it.


Advantages of Loosely Coupled Configuration

The system can be expanded in a modular form. Each bus master module is an independent unit and normally resides on a separate PC board; hence, a bus master module can be added or removed without affecting the other modules in the system.

High system throughput can be achieved by having more than one CPU.

A failure in one module does not cause a breakdown of the entire system, and the faulty module can be easily detected and replaced.


Advantages of Loosely Coupled Configuration

Each bus master may have a local bus to access dedicated memory or I/O devices, so that a greater degree of parallel processing can be achieved.


Daisy Chaining

There are three schemes for establishing priority, namely daisy chaining, polling and independent requesting.


Daisy Chaining

In this simple and low-cost method, all the masters use the same line for making bus requests.

In response to a bus request (BR) signal, the controller sends a bus grant (BG) signal if the bus busy signal is inactive.

The grant signal propagates serially through each master until it encounters the first one that is requesting access to the bus.


Who Gets Priority in Daisy Chaining?

The first requesting module blocks the propagation of the bus grant signal, activates the bus busy line and gains control of the bus.

Any other requesting module located after the master that has gained control of the bus will not receive the grant signal; therefore, the priority is determined by the physical location of the modules.

The requesting module located closest to the controller has the highest priority.
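The grant propagation described above can be sketched in a few lines. This is only an illustrative software model, not hardware logic: the function name `daisy_chain_grant` and the list-of-booleans representation of the request lines are assumptions made for the example.

```python
def daisy_chain_grant(requests):
    """Return the position of the module that captures the bus grant, or None.

    `requests` is ordered by physical position (an assumption of this model):
    index 0 is the module closest to the controller. The grant passes from
    module to module until the first requesting module blocks it.
    """
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position  # this module blocks the grant and takes the bus
    return None  # no module requested the bus


# Modules 1 and 3 both request the bus; module 1 is closer to the controller.
print(daisy_chain_grant([False, True, False, True]))  # 1
```

Module 3 never sees the grant in this run, which is the fixed, position-based priority the slide describes.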


Merits and Demerits of Daisy Chaining

Compared to the other two methods, the daisy chain scheme requires the least number of control lines, and this number is independent of the number of modules in the system.

However, arbitration is slow due to the propagation delay of the bus grant signal through the different masters.


Merits and Demerits of Daisy Chaining

This delay is proportional to the number of modules; therefore, a daisy-chain based system is limited to multiprocessor systems having only a few modules.

Further, the priority of each module is fixed by its physical location, and failure of a module in the system causes the whole system to fail.


Polling

It uses a set of lines sufficient to address each module.

In response to a bus request (BR), the controller generates and sends out a sequence of module addresses to the requesting modules.


Polling

When a requesting module recognizes its address, it activates the busy line and begins to use the bus.

The major advantage of polling is that the priority can be dynamically changed by altering the polling sequence (i.e. the order in which the module addresses are sent) stored in the controller.
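A small model can show how changing the polling sequence changes the priority. The function name `poll_for_bus`, the integer module addresses and the set of requesting modules are all hypothetical choices made for this sketch.

```python
def poll_for_bus(polling_sequence, requesting):
    """Return the first module address, in polling order, that is requesting.

    `polling_sequence` stands for the order stored in the controller and
    `requesting` for the set of modules currently asserting a bus request
    (both are assumptions of this model).
    """
    for address in polling_sequence:
        if address in requesting:
            return address  # the module recognizes its address and takes the bus
    return None


requesting = {1, 3}
print(poll_for_bus([0, 1, 2, 3], requesting))  # 1
print(poll_for_bus([3, 2, 1, 0], requesting))  # 3
```

The same two requests produce different winners once the stored sequence is reversed, with no change to the wiring.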


Independent Requesting

This scheme is the fastest but needs the most BR and BG lines (2m lines for m modules). It resolves the priority in a parallel fashion.


Independent Requesting

Each module has a separate pair of bus request (BR) and bus grant (BG) lines, and each pair has a priority assigned to it.

The controller includes a priority decoder, which selects the request with the highest priority and returns the corresponding bus grant signal.
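As a rough illustration, the priority decoder can be modelled as picking the highest-priority module among those with an active request. The names `priority_decoder` and `control_line_count`, the dictionary representation, and the convention that a smaller number means a higher priority are all assumptions of this sketch.

```python
def priority_decoder(request_lines, priorities):
    """Return the module whose bus grant line is asserted, or None.

    request_lines: module name -> True if its BR line is active.
    priorities:    module name -> rank (smaller rank = higher priority;
                   this convention is an assumption of the sketch).
    """
    active = [module for module, asserted in request_lines.items() if asserted]
    if not active:
        return None
    return min(active, key=lambda module: priorities[module])


def control_line_count(modules):
    """One BR and one BG line per module, i.e. 2m lines for m modules."""
    return 2 * modules


requests = {"A": False, "B": True, "C": True}
ranks = {"A": 0, "B": 2, "C": 1}
print(priority_decoder(requests, ranks))  # C
print(control_line_count(3))              # 6
```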




Interconnection Topologies in a Multiprocessor System

A microprocessor with its external bus connections needs memory to form a minimum workable processing system.

In a multiprocessor system, a number of microprocessors are connected with each other using a single bus.

The bus is also used to address a multi-port memory or a shared single I/O port.

The method of communication among the microprocessors in a multiprocessor system



Shared Bus Architecture



Shared Bus Architecture

The shared bus architecture uses a common memory, which may be partitioned into local memory banks for different processors.

Only one processor at a time performs a bus cycle to fetch instructions or data from the memory.



Multi-port Memory



Multi-port Memory

The processors P1 and P2 address a multi-port memory, which can be accessed by both processors at the same time.

Each processor also has a local memory, which it uses to store the instructions and data for the execution of its individual task.

The multi-port memory may be used for storing the instructions, data and results to be shared by more than one processor.



Linked Input/Output



Linked Input/Output

Linked input/output interconnection utilizes the input/output capabilities of a microprocessor-based system to communicate with other systems.

Direct access to common instructions and data that are available in a local system memory is not possible in this method.



Crossbar Switching



Crossbar Switching

It uses an extension of the concept of shared memory for a number of processors.

In this method, more than one processor can simultaneously access different shared memory modules as long as there is no conflict.

The total memory is divided into modules. While one processor is accessing a memory module, another processor is denied access to the same module until it is relinquished by the former processor.


Crossbar Switching

The crossbar switch provides the interconnection paths between the memory modules and the processors.

In crossbar switch interconnection, several parallel data paths are possible. Each node of the crossbar represents a bus switch.

All these nodes may be controlled by one of these processors or by a separate one.
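The conflict rule for crossbar switching can be sketched as a simple allocation pass. The function name `crossbar_schedule`, the labels P1..P3 and M1..M2, and the first-come-first-served order are assumptions of this example; a real crossbar controller may arbitrate differently.

```python
def crossbar_schedule(requests):
    """Split access requests into granted and denied lists.

    `requests` is a list of (processor, memory_module) pairs in arrival
    order, with each processor appearing at most once. A module that is
    already connected to one processor is refused to any later requester
    until it is released.
    """
    busy_modules = set()
    granted, denied = [], []
    for processor, module in requests:
        if module in busy_modules:
            denied.append((processor, module))  # conflict: module in use
        else:
            busy_modules.add(module)  # close the switch at this node
            granted.append((processor, module))
    return granted, denied


granted, denied = crossbar_schedule([("P1", "M1"), ("P2", "M2"), ("P3", "M1")])
print(granted)  # [('P1', 'M1'), ('P2', 'M2')]
print(denied)   # [('P3', 'M1')]
```

P1 and P2 proceed in parallel because they use different modules, while P3 must wait for M1.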


Physical Interconnections between Processors in a Multiprocessor System


Star configuration

Loop configuration

Complete interconnection

Regular topologies and

Irregular topologies



Star Configuration



Star Configuration

All the processors are connected to a central switching element via dedicated paths.

The central switching element may be an independent processor.

The switching element controls the interconnections between the processing elements.



Ring or loop Configuration



Completely Connected Configuration



Completely Connected Configuration

Every processing element can communicate directly with any other processing element.

For n processing elements, the required number of dedicated interconnection paths is n(n - 1)/2.
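In a completely connected system every pair of processors needs its own path, so n processors need n(n - 1)/2 paths. The short check below compares that closed form with a direct count of pairs; the function name `interconnection_paths` is an assumption of the example.

```python
from itertools import combinations


def interconnection_paths(n):
    """Number of dedicated paths needed to connect every pair of n processors."""
    return n * (n - 1) // 2


# Cross-check the formula against an explicit enumeration of processor pairs.
for n in range(1, 9):
    assert interconnection_paths(n) == len(list(combinations(range(n), 2)))

print(interconnection_paths(4))  # 6
print(interconnection_paths(8))  # 28
```

The count grows roughly with the square of n, which is why this configuration becomes costly for large systems.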



Regular Topology



Regular Topology

The processors can be arranged in any of the regular structures, such as linear array, square, hexagonal or cubical configurations.

Each of the processors (nodes) has a local memory to be accessed only by that processor.

Each of the processors can communicate with a fixed number of neighbours in the specific regular structure.
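As one possible illustration of a fixed neighbour count, consider the cubical arrangement. If the eight nodes of a cube are numbered 0 to 7 in binary (a numbering chosen for this sketch, not taken from the slides), the neighbours of a node are the nodes whose numbers differ from it in exactly one bit. The function name `cube_neighbours` is hypothetical.

```python
def cube_neighbours(node, dimensions=3):
    """Return the neighbours of `node` in a cube with 2**dimensions nodes.

    Flipping one bit of the node number moves along one edge of the cube,
    so every node has exactly `dimensions` neighbours.
    """
    return [node ^ (1 << bit) for bit in range(dimensions)]


print(cube_neighbours(0))  # [1, 2, 4]
print(cube_neighbours(5))  # [4, 7, 1]
```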


Irregular Topology

The processors in this scheme do not follow any uniform or regular connection pattern.

The number of neighbouring processors with which a processor can communicate is not fixed and may even be programmable.


Operating System Used in a Multiprocessor System

Once the microprocessors are arranged in a particular topology, an appropriate operating system and system software are required that can work in coordination with the new system resources.

An operating system is a program that resides in the computer memory and acts as an interface between the user or application program and the computer resources.


Operating System Used in a Multiprocessor System

The success of a multiprocessor system relies on a suitable operating system.

An operating system designed for a single processor cannot be used for a multiprocessor system.


Distributed Operating System

Distributed operating systems are designed to run parallel processes.

Hence it is essential that a proper environment exists for concurrent processes to communicate and cooperate in order to complete the allotted task.

The features expected from a distributed operating system used in a multiprocessor system are listed.


Features of Distributed OS

A distributed operating system should provide a mechanism for inter-process and inter-processor communication.

A distributed operating system must be capable of handling structural or architectural changes in the system due to expected or unexpected reasons, such as faults or modifications in the configuration.

The distributed operating system should also guard against unauthorized data access and provide data protection, as the data sets in these systems are referenced by more than one processor.


Features of Distributed OS

The distributed operating system must have a mechanism to split the given tasks into concurrent subtasks which can be executed in parallel on different processors, and to collect the results of the subtasks and further process these to obtain the final result.
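The split, execute and collect pattern described above can be sketched with Python's standard library. This is only an analogy: worker threads stand in for the separate processors, and the function names `split` and `parallel_sum` as well as the choice of summation as the task are assumptions of the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut `data` into at most `parts` contiguous chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers=4):
    """Split a summation into subtasks, run them concurrently, then combine."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(sum, chunks))  # one subtask per chunk
    return sum(partial_results)  # final processing of the collected results


print(parallel_sum(list(range(1, 101))))  # 5050
```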


Multiprocessor System Having 8086 and 8087

Typical multiprocessor system consisting of the 8086 and the 8087 (numeric coprocessor)

The 8087 is a coprocessor designed to work under the control of the 8086, and it gives additional numeric processing capabilities to the 8086.

The 8087 is a 40-pin IC and is available in 5, 8 and 10 MHz versions compatible with different versions of the 8086.

Multiprocessor System Having 8086 and 8087

When the 8086 is interfaced with the 8087, the instructions of the 8087 can be included in the program to be executed by the 8086. The 8086 performs the opcode fetch cycles and identifies the instructions for the 8087.



Architecture of 8087



The 8087 has two internal sections, namely the Control Unit (CU) and the Numeric Extension Unit (NEU).

The NEU executes all the numeric processor instructions, while the CU receives and decodes instructions, and reads or writes memory operands.

The control unit is also responsible for establishing communication between the CPU (8086) and memory, and for coordinating the internal coprocessor execution.


The internal data bus in the 8087 is 84 bits wide, including a 68-bit fraction, a 15-bit exponent and a sign bit.

The microcode control unit in the 8087 generates the control signals required for execution of the 8087 instructions.

The 8087 contains a programmable shifter, which is responsible for shifting the operands during the execution of instructions like FMUL and FDIV.

The data bus interface in the 8087 connects the internal data bus of the 8087 with the 8086's system data bus.



Pin Details of 8087



Interconnection of 8087 with 8086



Interconnection of 8087 with 8086

The 8087 can be connected with the 8086 only when the 8086 is operating in maximum mode.

In maximum mode, all control signals are derived using a separate chip known as the bus controller (8288).

The BUSY pin of the 8087 is connected to the TEST pin of the 8086.

The QS0 and QS1 lines of the 8087 are connected directly to the corresponding pins in the 8086-based system.

The pins AD15-AD0, A19/S6-A16/S3, BHE/S7, Reset and Ready of the 8087 are connected to the corresponding pins of the 8086.


Multiprocessor System Having 8086 and 8089

When I/O devices such as the serial port and parallel port of a personal computer are accessed by non-DMA data transfer, the CPU (such as the 8086) is required to set up the interfacing chips used to access the I/O devices and to perform the actual data transfer.

For high-speed devices, data are transferred using DMA, but the CPU still has to set up the device controller, initiate the DMA operation, and check the post-transfer status after the completion of each DMA operation.


Multiprocessor System Having 8086 and 8089

The 8089 I/O processor (IOP) is designed to handle the tasks involved in I/O processing. Unlike a DMA controller, an IOP can fetch and execute its own instructions.


8089

The instruction set of the 8089 is specifically designed for I/O operations, but in addition to data transfer, its instructions can perform arithmetic and logic operations, branches, searching and translation.

The CPU communicates with the 8089 through memory-based control blocks. The CPU prepares control blocks that describe the task to be performed, and then dispatches the task to the 8089 through an interrupt-like signal. The 8089 reads the control blocks to locate a program called a channel program, which is written using the 8089's instruction set.


8089

The 8089 then performs the assigned task by fetching and executing instructions from the channel program. When the 8089 has finished the task, it informs the CPU either through an interrupt or by updating a status location in memory.



Pin details of 8089



Local and Remote Operation of 8089

The 8089 assumes all of the work involved in an I/O transfer, including device setup, DMA operation and programmed I/O, thereby relieving the CPU of the burden of I/O processing.

This allows the CPU to concentrate on higher-level tasks while the 8089 takes care of the I/O processing.

This distributed processing approach greatly simplifies system software and hardware efforts, and improves system performance and flexibility.

The 8089 may be operated in a local (closely coupled) configuration or a remote (loosely coupled) configuration.



Local Configuration



Local Configuration

In a local configuration, the 8089 shares the bus interface with the host (8086) by using its request/grant (RQ/GT) pins. All resources are accessed through the system bus.



Remote Configuration



Remote Configuration

In a remote configuration, the 8089 may have its own local I/O bus, and it requires a bus arbiter and controller, address latches and data transceivers for accessing the shared system bus.



8089 (IOP) Architecture



Each of the two channels can be programmed and operated independently while sharing the common control logic and ALU.

The channel control pointer (CCP) cannot be manipulated by the user.

It stores the address of the control block (CB) for channel 1 during the initialization sequence.

For channel 2, the control block (CB) starts at the address obtained by adding 8 to the contents of the CCP.


To dispatch a task to either channel, the CPU (8086) sends out a channel attention (CA) signal along with the select (SEL) signal, which selects channel 1 (if SEL=0) or channel 2 (if SEL=1).

Since the channels occupy two consecutive I/O port addresses, the A0 address line of the 8086 is connected to the SEL pin, so that one channel is selected when A0=0 and the other when A0=1.

Each channel has an identical set of registers, each set being divided into two groups according to size.
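The channel selection and control block addressing described above come down to simple arithmetic, sketched below. The function names, the example CCP value 0x00500 and the port addresses 0xFFE4/0xFFE5 are hypothetical values chosen for illustration; only the offset of 8 and the SEL/A0 relationship come from the slides.

```python
CB_OFFSET = 8  # channel 2's control block starts 8 bytes above the CCP


def sel_from_port(port_address):
    """SEL follows address line A0, the least significant address bit."""
    return port_address & 1


def control_block_address(ccp, sel):
    """Return the control block address for channel 1 (SEL=0) or 2 (SEL=1)."""
    if sel not in (0, 1):
        raise ValueError("SEL must be 0 or 1")
    return ccp + CB_OFFSET * sel


ccp = 0x00500  # hypothetical CCP contents
for port in (0xFFE4, 0xFFE5):  # two consecutive, hypothetical port addresses
    sel = sel_from_port(port)
    print(sel + 1, hex(control_block_address(ccp, sel)))
# prints:
# 1 0x500
# 2 0x508
```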


The pointer group consists of those registers having 20 bits, and the register group consists of those registers having 16 bits.

Each pointer, with the exception of PP, has an associated tag bit. When the pointer is used to access a memory operand, the tag bit indicates whether the contents of that pointer represent a 20-bit system (i.e. memory) space address (if tag=0) or a 16-bit local (i.e. I/O) space address (if tag=1).

In accessing the local space, only the low-order 16 bits of the pointer are used as the address. Register PP always points to an address in the system space.
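The effect of the tag bit can be modelled with two bit masks. The function name `effective_address` and the returned (space, address) tuple are assumptions of this sketch; the 20-bit and 16-bit widths come from the text above.

```python
def effective_address(pointer, tag):
    """Interpret a pointer register's contents according to its tag bit.

    tag = 0: 20-bit address in the system (memory) space.
    tag = 1: 16-bit address in the local (I/O) space; only the low-order
             16 bits of the pointer are used.
    """
    if tag == 0:
        return ("system", pointer & 0xFFFFF)
    if tag == 1:
        return ("local", pointer & 0xFFFF)
    raise ValueError("tag must be 0 or 1")


space, address = effective_address(0x12345, 0)
print(space, hex(address))  # system 0x12345
space, address = effective_address(0x12345, 1)
print(space, hex(address))  # local 0x2345
```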



Registers in 8089 IOP



Registers in 8089 IOP

The registers GA, GB, GC, IX, BC and MC can be used as general-purpose registers for arithmetic and logic operations in a channel program. In addition, they perform special functions when addressing memory operands and executing DMA operations.

A memory operand can only be addressed by using one of the pointers GA, GB, GC or PP as a base register.

During a DMA operation, GA and GB are used as the source and destination pointers. If GA points to the source, then GB points to the destination, and vice versa.


Registers in 8089 IOP

When a translation operation is performed along with the DMA transfer, the contents of GC are used as the base address of a 256-byte translation table.

Register BC is used as the byte counter during a DMA transfer; it is decremented by 1 after each byte transfer and by 2 after each word transfer.
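A byte-wide DMA transfer with translation can be modelled roughly as follows. This is a software analogy only: Python byte strings stand in for the buffers addressed by GA and GB and for the table addressed by GC, the function names are hypothetical, and termination conditions of the real device are not modelled.

```python
def dma_bytes_with_translation(source, table):
    """Model a byte-wide DMA transfer that translates each byte.

    `source` stands for the data at the source pointer and `table` for the
    256-byte translation table whose base address GC would hold. Returns
    the destination data and the final value of the byte counter BC.
    """
    if len(table) != 256:
        raise ValueError("translation table must hold 256 bytes")
    bc = len(source)  # BC starts at the number of bytes to move
    destination = bytearray()
    for byte in source:
        destination.append(table[byte])  # the source byte indexes the table
        bc -= 1  # BC is decremented by 1 after each byte transfer
    return bytes(destination), bc


def bc_after_word_transfers(bc, words):
    """BC is decremented by 2 after each word transfer."""
    return bc - 2 * words


# Hypothetical table that maps lower-case ASCII letters to upper case.
to_upper = bytes(b - 32 if 97 <= b <= 122 else b for b in range(256))
print(dma_bytes_with_translation(b"dma ok", to_upper))  # (b'DMA OK', 0)
print(bc_after_word_transfers(8, 3))                    # 2
```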
