csc317l13

Upload: mohammad-hamid

Post on 14-Apr-2018

  • 7/30/2019 csc317l13

    1/14

    Computer Organization & Architecture

    Lecture #13

    Computer Evolution and Performance

    The evolution of computers has been characterized by increasing processor speed, decreasing component size, increasing memory size, and increasing I/O capacity and speed.

    One factor responsible for the great increase in processor speed is the shrinking size of microprocessor components; this reduces the distance between components and hence increases speed. However, the true gains in speed in recent years have come from the organization of the processor, including heavy use of pipelining and parallel execution techniques and the use of speculative execution techniques, which result in the tentative execution of future instructions that might be needed. All of these techniques are designed to keep the processor busy as much of the time as possible.

    A critical issue in computer system design is balancing the performance of the various elements, so that gains in performance in one area are not handicapped by a lag in other areas. In particular, processor speed has increased more rapidly than memory access time. A variety of techniques are used to compensate for this mismatch, including caches, wider data paths from memory to processor, and more intelligent memory chips.

    A Brief History of Computers

    The First Generation: Vacuum Tubes

    ENIAC (Electronic Numerical Integrator And Computer)

    • Designed by John Mauchly and John Presper Eckert
    • University of Pennsylvania
    • 1943 to 1946
    • Developed for calculating artillery firing tables
    • Generally regarded as the first electronic computer
    • Enormous!
        o 30 tons
        o 1500 square feet of floor space
        o 18,000 tubes
        o 140 kW of power
    • 5000 additions per second
    • Decimal number system
    • 20 accumulators of 10 digits each
    • Programmed manually with switches and cables
    • Disassembled in 1955

    The von Neumann Machine

    The task of entering and altering programs for the ENIAC was extremely tedious. The programming process could be facilitated if the program could be represented in a form suitable for storing in memory alongside the data. Then, a computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory.

    • Developed by John von Neumann
    • Princeton Institute for Advanced Studies
    • 1945 to 1952
    • Prototype of all subsequent general-purpose computers
    • IAS computer
        o Stored-program concept
        o Main memory stores both data and instructions
        o Arithmetic and logic unit (ALU) capable of operating on binary data
        o Control unit, which interprets and executes the instructions in memory
        o Input and output (I/O) equipment operated by the control unit

    Shown below is the general structure of the IAS computer:

    With rare exceptions, all of today's computers have this same general structure and function and are referred to as von Neumann machines.

    IAS details

    • 1000 x 40-bit words of storage, holding both data and instructions
    • Each word holds either one number or 2 x 20-bit instructions

    Shown below is the number word format:

    Shown below is the instruction word format:

    The control unit operates the IAS by fetching instructions from memory and executing them one at a time. The control unit and the ALU contain storage locations, called registers.

    • Memory buffer register (MBR): contains a word to be stored in memory, or is used to receive a word from memory.
    • Memory address register (MAR): specifies the address in memory of the word to be written from or read into the MBR.
    • Instruction register (IR): contains the 8-bit opcode of the instruction being executed.
    • Instruction buffer register (IBR): employed to hold temporarily the right-hand instruction from a word in memory.
    • Program counter (PC): contains the address of the next instruction pair to be fetched from memory.
    • Accumulator (AC) and multiplier quotient (MQ): employed to hold temporarily operands and results of ALU operations.

    Shown below is the expanded structure of IAS Computer:

    Shown below is the IAS instruction cycle:

    The IAS operates by repetitively performing an instruction cycle. Each instruction cycle consists of two subcycles.

    • Fetch cycle: the opcode of the next instruction is loaded into the IR and the address portion is loaded into the MAR. This instruction may be taken from the IBR, or it can be obtained from memory by loading a word into the MBR, and then down to the IBR, IR, and MAR.
    • Execute cycle: the control circuitry interprets the opcode and executes the instruction by sending out the appropriate control signals to cause data to be moved or an operation to be performed by the ALU.
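    The two subcycles can be made concrete with a small simulation. What follows is a minimal sketch, not a faithful IAS emulator: the class name ToyIAS, the helper functions, the four opcode numbers, and the HALT instruction are all assumptions introduced for illustration, and branches (which would have to discard the IBR) are not modeled. It assumes the 8-bit opcode and 12-bit address layout of a 20-bit instruction, two instructions per 40-bit word.

```python
# Toy model of the IAS fetch and execute subcycles (illustrative sketch only).
OPCODE_BITS, ADDR_BITS = 8, 12           # one instruction = 8-bit opcode + 12-bit address
HALF_BITS = OPCODE_BITS + ADDR_BITS      # 20 bits; two instructions per 40-bit word

# Hypothetical opcode numbers for this sketch, not the real IAS encoding.
HALT, LOAD, ADD, STOR = 0, 1, 2, 3


def instruction(opcode, address=0):
    """Build one 20-bit instruction."""
    return (opcode << ADDR_BITS) | address


def word(left, right):
    """Pack a left and a right instruction into one 40-bit memory word."""
    return (left << HALF_BITS) | right


def split(instr):
    """Return (opcode, address) of a 20-bit instruction."""
    return instr >> ADDR_BITS, instr & ((1 << ADDR_BITS) - 1)


class ToyIAS:
    def __init__(self, memory):
        self.mem = dict(memory)          # address -> 40-bit word
        self.pc = self.ac = self.mar = self.mbr = self.ir = 0
        self.ibr = None                  # buffered right-hand instruction, if any

    def fetch(self):
        if self.ibr is not None:
            # Right-hand instruction is already buffered: no memory access.
            self.ir, self.mar = split(self.ibr)
            self.ibr = None
            self.pc += 1                 # both halves used; move to the next word
        else:
            # Read the word at PC into the MBR, then distribute its fields.
            self.mar = self.pc
            self.mbr = self.mem[self.mar]
            self.ibr = self.mbr & ((1 << HALF_BITS) - 1)
            self.ir, self.mar = split(self.mbr >> HALF_BITS)

    def execute(self):
        """Carry out the instruction in IR; return False on HALT."""
        if self.ir == LOAD:
            self.mbr = self.mem[self.mar]
            self.ac = self.mbr
        elif self.ir == ADD:
            self.mbr = self.mem[self.mar]
            self.ac += self.mbr          # overflow of the 40-bit word is ignored here
        elif self.ir == STOR:
            self.mbr = self.ac
            self.mem[self.mar] = self.mbr
        return self.ir != HALT

    def run(self):
        while True:
            self.fetch()
            if not self.execute():
                return self


# Two instruction words followed by two data words.
program = {
    0: word(instruction(LOAD, 100), instruction(ADD, 101)),
    1: word(instruction(STOR, 102), instruction(HALT)),
    100: 7,
    101: 35,
}
machine = ToyIAS(program).run()
print(machine.mem[102])  # prints 42
```

    Note that only every other fetch touches memory: the left-hand instruction comes from the word just read, while the right-hand one is served from the IBR, which is why the PC advances once per instruction pair.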

    The IAS computer had 21 instructions, which can be grouped as follows:

    • Data transfer: move data between memory and ALU registers or between two ALU registers.
    • Unconditional branch: used to facilitate repetitive operations.
    • Conditional branch: branch can be made dependent on a condition, thus allowing decision points.
    • Arithmetic: operations performed by the ALU.
    • Address modify: permits addresses to be computed in the ALU and then inserted into instructions stored in memory.

    Commercial Computers

    • 1947: Eckert-Mauchly Computer Corporation formed to manufacture computers commercially
    • 1950: UNIVAC I (Universal Automatic Computer) commissioned by the Bureau of the Census
        o First successful commercial computer
        o Scientific and commercial applications
    • Eckert-Mauchly Computer Corporation became part of the UNIVAC division of Sperry-Rand Corporation
    • Late 1950s: UNIVAC II released with greater memory capacity and higher performance than UNIVAC I; upward compatible
    • IBM: major manufacturer of punched-card processing equipment
    • 1953: IBM 701, IBM's first electronic stored-program computer
        o Scientific applications
    • 1955: IBM 702 introduced
        o Business applications
    • IBM 700/7000 series established IBM as the overwhelmingly dominant computer manufacturer

    The Second Generation: Transistors

    • Transistors replaced vacuum tubes
        o Smaller
        o Cheaper
        o Less heat
        o Same functionality
        o Solid-state device made from silicon (sand)
    • Invented at Bell Labs in 1947
    • Fully transistorized computers commercially available in the late 1950s
    • NCR and RCA were the first to produce small transistor machines
    • IBM 7000 series
    • Digital Equipment Corporation (DEC) PDP-1
    • High-level programming languages
    • Provision of system software with computers

    Third Generation: Integrated Circuits

    • A single, self-contained transistor is called a discrete component
    • Manufacturing with discrete components was very expensive and cumbersome
    • Early second-generation computers contained about 10,000 transistors, expanding to hundreds of thousands in newer machines
    • 1958: integrated circuit invented
    • IBM System/360
    • DEC PDP-8

    Microelectronics

    • Means "small electronics"
    • A computer consists of logic gates, memory cells, and interconnections
    • These can be manufactured on a semiconductor such as silicon
    • Many transistors can be produced on a single wafer of silicon

    Shown below is the relationship between wafer, chip, and gate:

    The table below shows a summary of technology generations:

    Generation   Dates       Technology    Speed (ops per sec)
    1            1946-1957   Vacuum tube                40,000
    2            1958-1964   Transistor                200,000
    3            1965-1971   SSI and MSI             1,000,000
    4            1972-1977   LSI                    10,000,000
    5            1978-       VLSI                  100,000,000

    Moore's Law

    • Gordon Moore, cofounder of Intel, 1965
    • Predicted that the number of transistors on a chip would double every year
    • Since the 1970s the number of transistors has doubled every 18 months
    • The cost of a chip has remained virtually unchanged, so the cost of computer logic and memory circuitry has fallen at a dramatic rate
    • Higher packing density means shorter electrical paths and increased operating speed
    • Computers become smaller and can be used in more environments
    • Reduced power and cooling requirements
    • Fewer interconnections increase reliability

    Shown below is the growth in CPU transistor count:
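    The two doubling periods quoted for Moore's Law can also be compared numerically. The sketch below is only an arithmetic illustration: the function name and the starting count of 1,000 transistors are hypothetical, and real chips do not follow the curve exactly.

```python
def transistors(start_count, years, doubling_period_years):
    """Projected transistor count after `years` of steady doubling."""
    return start_count * 2 ** (years / doubling_period_years)


start = 1_000  # hypothetical starting count, chosen only for illustration
for years in (3, 6, 9):
    yearly = transistors(start, years, 1.0)   # original 1965 prediction
    slower = transistors(start, years, 1.5)   # 18-month doubling
    print(f"after {years} years: x{yearly / start:.0f} vs x{slower / start:.0f}")
```

    Doubling every 18 months works out to a factor of four every three years, which is the same rate the notes quote for each new chip generation under Microprocessor Speed.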

    IBM System/360 Series (see Table 2.4)

    • 1964
    • Replaced the 7000 series; not compatible with it
    • Industry's first planned family of computers
        o Similar or identical instruction sets
        o Similar or identical operating systems (O/S)
        o Increasing speed
        o Increasing number of I/O ports (more terminal connections)
        o Increasing memory size
        o Increasing cost
    • Multiplexed switch structure (see Figure 2.5)

    DEC PDP-8 (see Table 2.5)

    • 1964
    • First minicomputer, named after the miniskirt
    • Did not need an air-conditioned room
    • Small enough to sit on a lab bench
    • Could not do everything that a mainframe computer could
        o $16,000 versus $100,000+ for an IBM System/360
    • Original equipment manufacturers (OEMs) would integrate the PDP-8 as part of an integrated system package
    • Introduced the bus structure that is virtually universal for minicomputers and microcomputers
        o Omnibus: 96 signal paths carrying control, address, and data signals

    Shown below is the Omnibus:

    Semiconductor Memory

    • 1950s and 1960s: core memory
    • Tiny rings of ferromagnetic material strung on grids of fine wire suspended on small screens inside the computer
    • Magnetized one way for a one and the other way for a zero
    • Relatively fast: 1 millionth of a second to read a stored bit
    • Expensive and bulky
    • Destructive read
        o Data erased during read
        o Extra circuits required to restore data after read
    • 1970: Fairchild semiconductor memory
    • A chip the size of a single core held 256 bits of memory
    • Nondestructive read
    • Much faster than core: 70 billionths of a second to read a stored bit
    • Cost initially much higher than core; this changed in 1974
    • 11 generations, each providing four times the storage density of the previous one
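    The figures in this list can be combined into two quick calculations: how much faster the 1970 semiconductor chip was than core, and what eleven fourfold density steps add up to. This is a back-of-the-envelope sketch; the function name is hypothetical, and counting the eleven generations from the 256-bit chip is an assumption made here, not something the notes state.

```python
CORE_READ_S = 1e-6            # 1 millionth of a second per bit (core)
SEMICONDUCTOR_READ_S = 70e-9  # 70 billionths of a second per bit (1970 chip)

speedup = CORE_READ_S / SEMICONDUCTOR_READ_S
print(f"semiconductor read is about {speedup:.1f}x faster than core")


def bits_after(generations, start_bits=256, factor=4):
    """Chip capacity after a number of fourfold density generations."""
    return start_bits * factor ** generations


# Assumption: the 256-bit chip is taken as the starting point.
print(bits_after(11), "bits =", bits_after(11) // 2**30, "Gbit")
```

    Under that assumption, eleven quadruplings take a 256-bit chip to 2^30 bits, or one gigabit per chip.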

    Microprocessors

    • 1971: Intel 4004 (4 bit)
        o First microprocessor
        o All CPU components on a single chip
        o Designed for specific applications
    • 1972: Intel 8008 (8 bit)
        o Twice as complex as the 4004
        o Designed for specific applications
    • 1974: Intel 8080 (8 bit)
        o First general-purpose microprocessor

    Table 2.6 shows the evolution of the Intel Microprocessors.

    Designing for Performance

    Microprocessor Speed

    • Chipmakers release a new generation of chips every three years, each with four times as many transistors
    • Memory chips have quadrupled the capacity of dynamic random-access memory (DRAM) every three years
    • Microprocessor speed boosts that come from reducing the distance between circuits have improved performance four- or fivefold every three years since Intel launched the x86 family in 1978

    The raw speed of the microprocessor will not achieve its potential unless it is fed a constant stream of work to do in the form of computer instructions. While the chipmakers have been busy learning how to fabricate chips of greater and greater density, the processor designers must come up with ever more elaborate techniques for feeding the monster.

    • Branch prediction: the processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next.
    • Data flow analysis: the processor analyzes which instructions are dependent on each other's results, or data, to create an optimized schedule of instructions.
    • Speculative execution: using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations.
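    Data flow analysis in particular lends itself to a small example. The sketch below groups instructions into issue steps based only on which results each instruction reads. It is a simplified software illustration, not how any particular processor implements it: the function name, the instruction tuples, and the register names are hypothetical, every instruction is assumed to take one step, and each destination is assumed to be written only once (as if registers had already been renamed).

```python
def schedule(instructions):
    """Group instructions into issue steps using read-after-write dependencies.

    `instructions` is a list of (name, destination, sources) in program order.
    An instruction can issue one step after the last of its sources is produced.
    """
    ready_at = {}   # value name -> first step at which it can be read
    steps = {}      # step number -> names of instructions issued in that step
    for name, dest, sources in instructions:
        step = max((ready_at.get(src, 0) for src in sources), default=0)
        steps.setdefault(step, []).append(name)
        ready_at[dest] = step + 1
    return [steps[k] for k in sorted(steps)]


program = [
    ("i1", "a", ["x", "y"]),   # a = x + y
    ("i2", "b", ["a", "z"]),   # b = a * z   (needs the result of i1)
    ("i3", "c", ["x", "w"]),   # c = x - w   (independent of i1 and i2)
    ("i4", "d", ["b", "c"]),   # d = b + c   (needs i2 and i3)
]
print(schedule(program))  # prints [['i1', 'i3'], ['i2'], ['i4']]
```

    Although i3 appears third in program order, it depends on nothing computed earlier, so it can be issued alongside i1; this is the kind of reordering the notes describe as an optimized schedule.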

    Performance Balance

    While processor power has raced ahead at breakneck speed, other critical components of the computer have not kept up. The result is a need to look for performance balance: an adjusting of the organization and architecture to compensate for the mismatch among the capabilities of the various components.

    Nowhere is the problem created by such mismatches more critical than in the interface between processor and main memory.

    Shown below is the evolution of DRAM and Processor Characteristics:

    While processor speed and memory capacity have grown rapidly, the speed with which data can be transferred between main memory and the processor has not.

    The interface between processor and main memory is the most critical pathway in the entire computer, because it is responsible for carrying a constant flow of program instructions and data between memory chips and the processor.

    Shown below are the trends in DRAM use:

    The amount of main memory is going up, but DRAM density is going up faster. The net result is that, on average, the number of DRAMs per system is going down. This has an effect on transfer rates, because there is less opportunity for parallel transfer of data.
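    A worked example makes the trend easier to see. The numbers below are hypothetical and chosen only so the arithmetic is clean; the function names are likewise assumptions, and the one-bit-per-chip access width is a simplification.

```python
MBIT = 2 ** 20          # bits in one megabit
MBYTE = 8 * MBIT        # bits in one megabyte


def chips_needed(system_bits, chip_bits):
    """Number of DRAM chips needed to build the given main memory."""
    return -(-system_bits // chip_bits)   # ceiling division


def parallel_width(chips, bits_per_chip=1):
    """Bits delivered per access if every chip is read in parallel."""
    return chips * bits_per_chip


# Hypothetical systems: memory grows 4x while chip density grows 16x.
older = chips_needed(4 * MBYTE, 1 * MBIT)     # 4 MB built from 1 Mbit chips
newer = chips_needed(16 * MBYTE, 16 * MBIT)   # 16 MB built from 16 Mbit chips
print(older, parallel_width(older))  # prints 32 32
print(newer, parallel_width(newer))  # prints 8 8
```

    In this illustration the newer system has four times the memory but only a quarter of the chips, so with the same per-chip width there are fewer bits that can be fetched side by side on each access.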

    There are a number of ways that a system architect can attack this problem:

    • Increase the number of bits that are retrieved at one time: wider data paths
    • Change the DRAM interface: include cache or other buffering techniques
    • Reduce the frequency of memory access: include one or more levels of cache, both on- and off-chip, between the processor and main memory
    • Increase the interconnect bandwidth between processors and memory: higher-speed buses and a hierarchy of buses to buffer and structure data flow
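    Two of these remedies can be quantified with simple formulas. The sketch below uses the standard textbook expression for average memory access time (hit time plus miss rate times miss penalty), which is not given in these notes, and a plain count of bus transfers per block. All timings and sizes are hypothetical values chosen for illustration.

```python
import math


def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average access time with one cache level in front of main memory."""
    return hit_time_ns + miss_rate * miss_penalty_ns


def transfers_per_block(block_bits, path_bits):
    """Bus transfers needed to move one block over a data path of given width."""
    return math.ceil(block_bits / path_bits)


# Hypothetical timings: 2 ns cache hit, 60 ns extra for a main-memory access.
for miss_rate in (0.10, 0.02):
    t = average_access_time(2, miss_rate, 60)
    print(f"miss rate {miss_rate:.0%}: {t:.1f} ns on average")

# Doubling the data path halves the transfers for a 256-bit block.
print(transfers_per_block(256, 32), transfers_per_block(256, 64))  # prints 8 4
```

    The cache does not make main memory faster; it reduces how often the slow path is taken, which is why even a modest hit rate changes the average so much.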

    Pentium and PowerPC Evolution

    Pentium Evolution

    • 8080
        o First general-purpose microprocessor
        o 8-bit data path
        o Used in the first personal computer, the Altair
    • 8086
        o Much more powerful
        o 16 bit
        o Instruction cache prefetches a few instructions
        o 8088 (8-bit external bus) used in the first IBM PC
    • 80286
        o 16 Mbyte of addressable memory, up from 1 Mbyte
    • 80386
        o 32 bit
        o Support for multitasking
    • 80486
        o Sophisticated, powerful cache and instruction pipelining
        o Built-in math co-processor
    • Pentium
        o Introduced superscalar techniques
        o Multiple instructions executed in parallel
    • Pentium Pro
        o Increased superscalar organization
        o Aggressive register renaming
        o Branch prediction
        o Data flow analysis
        o Speculative execution
    • Pentium II
        o MMX technology for video, audio, and graphics processing
    • Pentium III
        o Additional floating-point instructions for graphics
    • Pentium 4
        o Named with Arabic rather than Roman numerals
        o Additional floating-point and multimedia enhancements
    • Itanium
        o 64 bit

    PowerPC Evolution

    • 601
        o Introduced the PowerPC architecture to the market
        o 32 bit
    • 603
        o Used for low-end desktop and portable computers
        o 32 bit
        o Lower cost and a more efficient implementation
    • 604
        o Used for desktop computers and low-end servers
        o 32 bit
        o Used advanced superscalar design techniques
    • 620
        o Used in high-end servers
        o 64 bit
    • 740/750 (G3)
        o Two levels of cache in the main processor: significant performance improvement over machines with off-chip cache
    • G4
        o Increased the parallelism and internal speed of the processor