blueeyes presentation

Upload: pranithasubramani

Post on 08-Apr-2018

TRANSCRIPT

  • 8/7/2019 BLUEEYES presentation

    1/30

    VLIW

    ARCHITECTURE

    VLIW . 2

    Increasing Processor Performance

    Semiconductor Technology

    Parallel Processing

    Multiprocessors, Multicomputers

    Parallelism within the Processor

    Pipelining

    ILP

    VLIW . 3

    ILP (Instruction Level Parallelism)

    Parallel execution of instructions.

    Overlapping of instructions

    ILP processors

    Superscalar processors

    VLIW processors.

    VLIW . 5

    Execution in a Scalar Processor

    Fetch

    Decode

    Execute

    Write Back

    VLIW . 6

    Superscalar processors

    More than one instruction at a time

    Decision about operations by H/W

    Dynamic scheduling

    VLIW . 7

    Basic Superscalar Approach

    Block diagram components:

    Instruction cache

    Instruction buffers, decoders, dispatcher

    Execution units #1 to #4

    Reorder buffer

    Register file

    Data cache

    VLIW . 8

    Execution in Superscalar

    Fetch

    Decode

    Execute

    Write Back

    With degree 4
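
    As a rough illustration of the scalar and degree-4 execution patterns, the sketch below counts cycles on an ideal four-stage pipeline (Fetch, Decode, Execute, Write Back). It assumes one cycle per stage, no stalls, and fully independent instructions; the function name and these idealizations are assumptions made for the example, not part of the slides.

```python
import math

STAGES = ("Fetch", "Decode", "Execute", "Write Back")


def ideal_cycles(num_instructions: int, degree: int = 1) -> int:
    """Cycles needed to finish num_instructions on an ideal pipeline.

    Assumes one cycle per stage, no hazards, and `degree` instructions
    entering the pipeline together each cycle (degree=1 is scalar).
    """
    if num_instructions <= 0:
        return 0
    groups = math.ceil(num_instructions / degree)
    # The first group needs len(STAGES) cycles; every later group
    # completes one cycle after the group before it.
    return len(STAGES) + groups - 1


if __name__ == "__main__":
    for n in (4, 12, 100):
        print(n, "instructions:",
              ideal_cycles(n, degree=1), "cycles scalar,",
              ideal_cycles(n, degree=4), "cycles with degree 4")
```

    Under these assumptions, 12 instructions take 15 cycles on the scalar pipeline and 6 cycles with degree 4. Real programs fall short of this because of dependencies between instructions.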

    VLIW 9

    Disadvantages of Superscalar

    Complexity of hardware.

    Window size is constrained, which limits the capacity to detect independent instructions.

    More power consumption.

    VLIW . 10

    VLIW

    Very Long Instruction Word.

    Instructions are hundreds of bits in length.

    Uses a long instruction called a MultiOp.

    Multiple functional units are concurrently used

    Functional units share a common register file.

    Code compaction by compiler.
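
    To make the idea of code compaction concrete, here is a minimal sketch of a compiler pass that packs independent operations into MultiOps. The slot layout (two ALU slots, one memory slot, one branch slot), the operation names, and the one-cycle latency are all hypothetical; only true data dependencies are modelled, and this is a simple greedy scheme rather than any particular production algorithm.

```python
from dataclasses import dataclass

# Hypothetical machine: slots available in each MultiOp, by unit type.
SLOTS = {"alu": 2, "mem": 1, "branch": 1}


@dataclass
class Op:
    name: str
    unit: str          # key into SLOTS
    deps: tuple = ()   # names of ops whose results this op needs


def pack_multiops(ops, slots=SLOTS):
    """Greedily place ops (given in program order) into MultiOps.

    Each op goes into the earliest MultiOp that comes after all of its
    dependencies and still has a free slot for its unit type.
    Assumes every operation has a latency of one cycle.
    """
    cycle_of = {}   # op name -> index of the MultiOp holding it
    schedule = []   # one dict per MultiOp: unit type -> list of op names
    for op in ops:
        cycle = max((cycle_of[d] + 1 for d in op.deps), default=0)
        while True:
            while len(schedule) <= cycle:
                schedule.append({unit: [] for unit in slots})
            if len(schedule[cycle][op.unit]) < slots[op.unit]:
                break
            cycle += 1
        schedule[cycle][op.unit].append(op.name)
        cycle_of[op.name] = cycle
    return schedule


def render(schedule, slots=SLOTS):
    """Show each MultiOp as a fixed list of slots, padded with NOPs."""
    rows = []
    for word in schedule:
        row = []
        for unit, count in slots.items():
            row += word[unit] + ["nop"] * (count - len(word[unit]))
        rows.append(row)
    return rows


if __name__ == "__main__":
    program = [
        Op("load_a", "mem"),
        Op("load_b", "mem"),
        Op("add", "alu", deps=("load_a", "load_b")),
        Op("inc_i", "alu"),
        Op("cmp", "alu", deps=("inc_i",)),
        Op("br", "branch", deps=("cmp",)),
    ]
    for index, row in enumerate(render(pack_multiops(program))):
        print(index, row)
```

    With this made-up program, the six operations fit into three MultiOps, and the six remaining slots are filled with NOPs. The hardware needs no dependency checking because the compiler has already fixed which operations issue together.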

    VLIW 11

    A Brief History

    Joseph Fisher, trace scheduling, 1979.

    He coined the acronym VLIW.

    In 1984, two companies were started

    Multiflow, started by Joseph Fisher

    Cydrome, founded by Bob Rau.

    VLIW . 12

    In 1987, Cydrome delivered the first machine, the 256-bit Cydra 5.

    Multiflow delivered Trace/200 - 1987

    Trace/300 - 1988

    Trace/500 - 1990

    VLIW . 13

    Since then, VLIW machines have seen a revival and some degree of success.

    Multiflow closed in 1990

    Cydrome closed in 1988

    VLIW . 14

    Basic VLIW Approach

    Block diagram components:

    Instruction cache

    Instruction register

    Execution units #1 to #4

    Register file

    Data cache

    VLIW . 17

    Case Studies

    Defoe.

    Intel Itanium Processor.

    Transmeta Crusoe Processor.

    VLIW . 18

    Defoe Architecture

    Block diagram components:

    Scoreboard and fetch

    Paths to and from the L2 cache

    Dispersal unit

    Functional units: 2 simple integer, 1 complex integer, 2 load/store, 1 branch/compare

    64-entry register file

    16 predicate registers

    D-cache

    VLIW Abhilash.P.K. 19

    Instruction Encoding

    64-bit compressed VLIW architecture.

    Uses variable-length MultiOps.

    Individual operations are encoded as 32-bit words.

    A special stop bit indicates the end of an instruction word.

    Operation format (32 bits in total):

    Stop bit: 1 bit

    Predicate: 4 bits

    OPCODE: 9 bits

    RDEST: 6 bits

    RSRC 1: 6 bits

    RSRC 2: 6 bits
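
    The field widths listed for a Defoe operation add up to 32 bits, so they can be packed into a single word. The sketch below shows one way to pack and unpack such a word and to split a stream of operations into MultiOps at each stop bit. The most-significant-first field order and the opcode numbers used in the example are assumptions, since the slide gives only the widths.

```python
# Field widths come from the slide; the MSB-to-LSB order is an assumption.
FIELDS = (
    ("stop", 1),
    ("predicate", 4),
    ("opcode", 9),
    ("rdest", 6),
    ("rsrc1", 6),
    ("rsrc2", 6),
)
WORD_BITS = sum(width for _, width in FIELDS)  # 32


def encode(**values: int) -> int:
    """Pack named field values into one 32-bit operation word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word: int) -> dict:
    """Unpack a 32-bit operation word into its named fields."""
    fields = {}
    shift = WORD_BITS
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


def split_multiops(words):
    """Group a stream of operations into MultiOps using the stop bit."""
    multiops, current = [], []
    for word in words:
        current.append(word)
        if decode(word)["stop"]:
            multiops.append(current)
            current = []
    if current:
        # Trailing operations with no stop bit seen yet.
        multiops.append(current)
    return multiops


if __name__ == "__main__":
    stream = [
        encode(opcode=1, rdest=3, rsrc1=1, rsrc2=2),  # made-up opcode
        encode(opcode=2, rdest=4, rsrc1=3, rsrc2=3, stop=1),
        encode(opcode=3, predicate=5, rdest=6, stop=1),
    ]
    for multiop in split_multiops(stream):
        print([decode(word)["opcode"] for word in multiop])
```

    Because the stop bit marks where each MultiOp ends, a MultiOp only needs to store the operations it actually uses, which is what makes the encoding variable-length.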

    VLIW . 20

    Intel Itanium Processor

    Intel's first implementation of IA-64.

    IA-64 is an ISA for the EPIC (Explicitly Parallel Instruction Computing) style of VLIW, developed jointly by Intel and HP.

    VLIW . 21

    64-bit processor, with

    4 integer units

    4 multimedia units

    2 load/store units

    2 extended-precision floating-point units

    2 single-precision floating-point units

    VLIW . 22

    Transmeta Crusoe Processor

    Designed to reduce power consumption.

    Dynamic scheduling consumes more power.

    VLIW replaces the complex ways of gaining ILP

    with simpler and more power efficient ways.

    VLIW . 23

    Instruction Format

    Instructions are either 64 or 128 bits long.

    Instructions are called molecules; the operations they contain are called atoms.

    64 GPRs

    VLIW . 24

    Compiler Support

    Instruction scheduling algorithms are critical.

    Three important scheduling algorithms

    Trace scheduling

    Trace scheduling-2

    Superblock scheduling
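
    Trace scheduling starts by picking a likely execution path (a trace) through the control-flow graph and then schedules that path as if it were one large block. The sketch below covers only the trace-selection step, in a simplified form: starting at an entry block, it keeps following the most probable successor. The graph, block names, and branch probabilities are invented for the example, and the later steps (compaction and compensation code) are not shown.

```python
def select_trace(cfg, entry):
    """Follow the most probable successor from `entry` to form a trace.

    cfg maps a block name to a list of (successor, probability) pairs.
    The trace stops at a block with no successors or when the most
    probable successor is already in the trace (a loop back edge).
    """
    trace = []
    seen = set()
    block = entry
    while block is not None and block not in seen:
        trace.append(block)
        seen.add(block)
        successors = cfg.get(block, [])
        if successors:
            block = max(successors, key=lambda edge: edge[1])[0]
        else:
            block = None
    return trace


if __name__ == "__main__":
    # Hypothetical control-flow graph with profile-based probabilities.
    cfg = {
        "A": [("B", 0.9), ("C", 0.1)],
        "B": [("D", 1.0)],
        "C": [("D", 1.0)],
        "D": [("A", 0.8), ("E", 0.2)],
        "E": [],
    }
    print(select_trace(cfg, "A"))
```

    In this made-up graph the selected trace is A, B, D. Blocks C and E lie off the trace and would be handled by later traces.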

    VLIW . 25

    Advantages

    Less hardware complexity.

    Static scheduling: much more hardware can be devoted to useful computation.

    Software has a larger window to look at.

    Can find more ILP.

    VLIW . 26

    Shortcomings

    Wasteful encoding with NOPs.

    Hard to maintain code compatibility between generations.

    Increased program size.

    Compiler has to explicitly add NOPs.

    New versions of the architecture can force major rewriting of the compiler.
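
    The cost of NOP padding can be estimated with a little arithmetic. The sketch below compares a fixed-width encoding, where every slot of every MultiOp is stored, with a compressed encoding that stores only the useful operations and ends each MultiOp with a stop bit. The 32-bit operation size matches the Defoe case study; the slot count and the schedule in the example are assumptions.

```python
def code_size_bits(ops_per_multiop, slots_per_multiop, op_bits=32):
    """Return (fixed_bits, compressed_bits, nop_count) for a schedule.

    ops_per_multiop lists the number of useful operations in each MultiOp.
    Fixed encoding stores every slot and fills the empty ones with NOPs.
    Compressed encoding stores only the useful operations.
    """
    fixed_ops = 0
    compressed_ops = 0
    nops = 0
    for useful in ops_per_multiop:
        if not 0 <= useful <= slots_per_multiop:
            raise ValueError("operation count must fit the available slots")
        fixed_ops += slots_per_multiop
        nops += slots_per_multiop - useful
        # An empty MultiOp still needs one operation to carry the stop bit.
        compressed_ops += max(useful, 1)
    return fixed_ops * op_bits, compressed_ops * op_bits, nops


if __name__ == "__main__":
    fixed, compressed, nops = code_size_bits([2, 2, 2], slots_per_multiop=4)
    print("fixed:", fixed, "bits; compressed:", compressed,
          "bits; NOPs in fixed encoding:", nops)
```

    For three MultiOps that each use two of four slots, the fixed encoding takes 384 bits, half of them NOPs, while the compressed encoding takes 192 bits.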

    VLIW . 27

    Future of VLIW

    Newer processors are mainly used for:

    Stream and image processing, e.g. Philips TriMedia

    Digital signal processing, e.g. TMS320C62x from Texas Instruments

    Mobile computing, e.g. Transmeta Crusoe

    High-end server applications, e.g. Intel Itanium

    VLIW 28

    Stream and media processing lend themselves to the VLIW style, with large amounts of ILP.

    Superscalars will be forced to use simpler structures and seek help from software.
