jeanlucmmix

Upload: andrei-nicolae

Post on 07-Apr-2018


  • 8/3/2019 JeanLucMMIX

    1/21

    The MMIX 2009 Computer

    Jean-Luc [email protected]

    University of Victoria

    February 13, 2009

    Outline

    History

    Overview

    Architecture

    ISA

    MMIXAL

    Example

    Knuth is prolific...

    The Art of Computer Programming (TAOCP)

    Volume 1 (1st edition 1967)

    MIX 1009 Computer

    Volume 2 (1st edition 1968)

    Volume 3 (1st edition 1973)

    Volume 1 (2nd edition 1973)

    TeX and METAFONT 1977-1985

    Volume 2 (2nd edition 1981)

    Volume 1 (3rd edition 1997)

    Volume 2 (3rd edition 1998)

    Volume 3 (2nd edition 1998)

    MMIX 2009 Computer

    NNIX Operating System (not!) ...

    Volume 4 ...

    ...and more.

    Why define an architecture?

    Expose the machine

    Educational value

    TAOCP programs are short

    TAOCP was initially thought to be a book about compilers

    Effect of hardware on algorithm implementations

    Which high-level language to use?

    1960s: Algol

    1970s: Pascal

    1980s: C

    1990s: C++ or Java

    2000s: ???

    "My books focus on timeless truths" (Knuth, TAOCP, Vol. 1, Fascicle 1)

    MIX 1009

    Hypothetical computer from 1960s era

    Binary-decimal machine

    Each byte has 6 bits or 2 decimal digits

    Word: 5 bytes and sign

    9 special-purpose registers

    Address space: 4000 words (20K bytes)

    Standard subroutine calling convention based on self-modifying code

    I/O devices:

    tape units

    disk or drum units

    card reader

    card punch

    line printer

    typewriter terminal

    paper tape

    MIX 1009

    Clearly MIX is archaic by today's standards

    New architecture defined as MMIX 2009

    To be used from now on for TAOCP

    The plan is to convert TAOCP Vol. 1-3 programs to MMIX for the next edition

    Effort underway, volunteers needed

    http://mmixmasters.sourceforge.net

    MMIX 2009

    Designed by Knuth, with significant contributions by John Hennessy (MIPS) and Dick Sites (Alpha)

    64-bit RISC architecture

    256 general-purpose registers ($0 ... $255)

    32 special-purpose registers ($rA ... $rZ, $rBB, $rTT, $rWW, $rXX, $rYY, $rZZ)

    Register stack

    Big-endian machine

    32-bit instructions

    64-bit virtual address space

    Integers stored in 2's complement

    IEEE 754 floats

    Unicode UTF-16 characters

    MMIX

    MMIX is an architecture, not an implementation

    MMIX meta-simulator can configure a machine:

    Instruction pipelining

    Number of ALUs

    Bus width

    Memory access times

    I$

    D$

    L2$

    Branch prediction

    MMIX

    References:

    Knuth's MMIXware book: http://www.springerlink.com/content/v8qtq27c84t8/

    TAOCP Volume 1, Fascicle 1 (sections 1.3 and 1.4): http://www-cs-faculty.stanford.edu/~knuth/fasc1.ps.gz

    Wikipedia: http://en.wikipedia.org/wiki/MMIX

    Tools:

    MMIXware assembler, simulator, ...

    gcc has an MMIX backend

    and a lot more...
    Memory and Addressing

    1 byte = 8 bits

    1 wyde = 2 bytes (word)

    1 tetra = 4 bytes (double word)

    1 octa = 8 bytes (quad word)

    64-bit addresses represented by an octa

    RISC architecture: explicit loads and stores

    Memory: M[x] = y (x is an octa, y is a byte)

    Registers: $x = y (x is a byte, y is an octa)

    M2[x] for wyde addressing

    M4[x] for tetra addressing

    M8[x] for octa addressing
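The M[x], M2[x], M4[x] and M8[x] views can be modelled in a few lines of Python. This is only a sketch: the names `m_read` and `m_write` and the dictionary-backed memory are illustrative assumptions, not part of MMIXware. It follows Knuth's definition in which multi-byte accesses are big-endian and the low-order address bits are ignored.

```python
def m_read(mem, x, size=1):
    """Return M_size[x] as an unsigned integer (size is 1, 2, 4 or 8)."""
    base = (x % 2**64) & ~(size - 1)     # ignore the low-order address bits
    value = 0
    for i in range(size):                # big-endian: most significant byte first
        value = (value << 8) | mem.get(base + i, 0)
    return value


def m_write(mem, x, value, size=1):
    """Store the low 8*size bits of value into M_size[x]."""
    base = (x % 2**64) & ~(size - 1)
    for i in range(size):
        shift = 8 * (size - 1 - i)
        mem[base + i] = (value >> shift) & 0xFF


memory = {}                              # hypothetical sparse byte memory
m_write(memory, 0x2000, 0x0123456789ABCDEF, size=8)
print(hex(m_read(memory, 0x2003, size=2)))   # wyde containing byte 0x2003
```

Using a dictionary keeps the 64-bit address space sparse; unwritten bytes read as zero in this model.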

    Encoding

    32-bit instructions (1 tetra)

    OP X, Y, Z: ternary operations, e.g.: ADD $X, $Y, $Z

    OP X, YZ: binary operations, e.g.: INCL $X, YZ

    OP XYZ: unary operations, e.g.: JMP @+4*XYZ

    JMP @+1000000

    JMP is opcode #f0 (# denotes hex)

    The offset is counted in tetras: 1000000/4 = 250000 is #03d090

    Instruction encodes to #f003d090

    There are 256 opcodes
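The encoding rules on this slide can be checked with a small Python sketch. The function names are hypothetical, and the ADD opcode value #20 is taken from Knuth's opcode table rather than from the slide, so treat it as an assumption.

```python
def encode_xyz(op, x, y, z):
    """Pack OP X, Y, Z into one tetra: four bytes, opcode first."""
    for field in (op, x, y, z):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return (op << 24) | (x << 16) | (y << 8) | z


def encode_jmp(byte_offset):
    """Encode a forward JMP @+byte_offset; XYZ holds the offset in tetras."""
    JMP = 0xF0                                   # opcode given on the slide
    if byte_offset % 4 != 0 or not 0 <= byte_offset < 4 * 2**24:
        raise ValueError("offset must be a multiple of 4 that fits in 24 bits")
    return (JMP << 24) | (byte_offset // 4)


print(f"#{encode_jmp(1000000):08x}")             # the slide's JMP @+1000000
print(f"#{encode_xyz(0x20, 1, 2, 3):08x}")       # ADD $1,$2,$3, assuming ADD = #20
```

Backward jumps use a neighbouring opcode in the real instruction set and are left out of this sketch.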

    Instruction Timing

    Instruction time is measured in oops (υ)

    Memory latency is measured in mems (μ)

    Time value of υ decreases as technology advances

    μ/υ increases over time

    LDO takes μ+υ time to complete

    Most instructions take 1υ

    Branches: 1υ (but 3υ if mispredicted)

    Floating point ops: mostly 4υ

    FDIV and FSQRT take 40υ

    MUL 10υ, DIV 60υ
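As a rough illustration of how these costs add up, the following Python sketch totals mems and oops over an instruction trace. The `COST` table only covers a hand-picked subset of mnemonics, and `running_time` is a hypothetical helper, not something provided by the MMIX simulators.

```python
# (mems, oops) per instruction, using the figures from the slide.
COST = {
    "LDO": (1, 1),                       # one memory reference plus one oops
    "ADD": (0, 1), "SUB": (0, 1), "CMP": (0, 1), "SET": (0, 1), "SR": (0, 1),
    "PBNP": (0, 1), "PBP": (0, 1),       # branches, when predicted correctly
    "FADD": (0, 4), "FMUL": (0, 4),
    "FDIV": (0, 40), "FSQRT": (0, 40),
    "MUL": (0, 10), "DIV": (0, 60),
}


def running_time(trace, mispredicted=0):
    """Return (mems, oops) for a list of executed mnemonics."""
    mems = sum(COST[op][0] for op in trace)
    oops = sum(COST[op][1] for op in trace)
    oops += 2 * mispredicted             # a mispredicted branch costs 3 instead of 1
    return mems, oops


# One pass through a load/compare/branch loop body with no misprediction.
print(running_time(["LDO", "CMP", "PBNP", "SUB", "PBP"]))
```

Keeping mems and oops as separate counts, rather than converting to seconds, matches the slide's point that the ratio between them changes with technology.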

    Loads and Stores

    u() returns an unsigned integer, s() a signed integer

    Address calculation: A = (u($Y) + u($Z)) mod 2^64

    Load byte: LDB $X, $Y, $Z

    Load wyde: LDW $X, $Y, $Z

    Load tetra: LDT $X, $Y, $Z

    Load octa: LDO $X, $Y, $Z

    Append U to avoid sign extending: LDBU, LDWU, ...

    Unaligned addresses are accepted: the low-order bits of A are ignored, so A is rounded down to a multiple of the operand size

    Examples of Load

    $2 = 1000, $3 = 2, $4 = 5

    M[1000] ... M[1007] = #0123 4567 89ab cdef

    LDB $1,$2,$3 sets $1 = #0000 0000 0000 0045

    LDW $1,$2,$3 sets $1 = #0000 0000 0000 4567

    LDB $1,$2,$4 sets $1 = #ffff ffff ffff ffab

    LDBU $1,$2,$4 sets $1 = #0000 0000 0000 00ab
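The four results above can be reproduced with a short Python sketch of the load rules. The helper names `load` and `show` are made up for illustration, and registers are modelled simply as 64-bit unsigned integers.

```python
# Memory contents from the slide: M[1000] ... M[1007] = #0123 4567 89ab cdef
MEMORY = {1000 + i: b for i, b in enumerate(bytes.fromhex("0123456789abcdef"))}


def load(size, signed, y, z, mem=MEMORY):
    """Value placed in $X by LDB/LDW/LDT/LDO (signed) or their U variants."""
    a = (y + z) % 2**64                  # A = (u($Y) + u($Z)) mod 2^64
    a -= a % size                        # low-order address bits are ignored
    raw = bytes(mem.get(a + i, 0) for i in range(size))
    value = int.from_bytes(raw, "big", signed=signed)
    return value % 2**64                 # sign extension as a 64-bit pattern


def show(value):
    """Format an octa the way the slide does: # followed by four hex groups."""
    digits = f"{value:016x}"
    return "#" + " ".join(digits[i:i + 4] for i in range(0, 16, 4))


r2, r3, r4 = 1000, 2, 5
print("LDB  $1,$2,$3", show(load(1, True, r2, r3)))
print("LDW  $1,$2,$3", show(load(2, True, r2, r3)))
print("LDB  $1,$2,$4", show(load(1, True, r2, r4)))
print("LDBU $1,$2,$4", show(load(1, False, r2, r4)))
```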

    Instructions Overview

    Load and Store: LDB, STB, ...

    Arithmetic: ADD, SUB, MUL, DIV, NEG, ...

    Shifts: SL, SR

    Comparisons: CMP, ...

    Conditionals: CSN, CSZ, CSP, ...

    Bitwise: AND, OR, XOR, ANDN, ORN, NAND, NOR, ...

    Bytewise: BDIF, WDIF, ...

    Floating point: FADD, FSUB, FMUL, FDIV, ...

    Immediates: SETH, SETMH, SETML, SETL, ...

    Jumps and branches: BN, PBN, BZ, BP, ...

    Subroutines: PUSHJ, POP, ...

    Cache & System: LDUNC, SYNCID, CSWAP, ...

    Interrupts: TRIP, TRAP, RESUME

    NNIX

    NNIX is even more virtual than MMIX

    It defines a UNIX-like environment

    And the address space of a program:

    Text          #0000000000000000
    Data          #2000000000000000
    Pool          #4000000000000000
    Stack         #6000000000000000
    Kernel Space  #8000000000000000 ... #FFFFFFFFFFFFFFFF
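A minimal Python sketch of this layout, assuming each segment runs from its base address up to the next base. The `segment_of` helper is hypothetical and only classifies an address; it does not model any protection rules.

```python
# Segment base addresses from the slide, in increasing order.
SEGMENTS = [
    (0x0000000000000000, "Text"),
    (0x2000000000000000, "Data"),
    (0x4000000000000000, "Pool"),
    (0x6000000000000000, "Stack"),
    (0x8000000000000000, "Kernel Space"),
]


def segment_of(address):
    """Return the name of the segment that contains a 64-bit address."""
    if not 0 <= address < 2**64:
        raise ValueError("address must fit in an octa")
    name = SEGMENTS[0][1]
    for base, label in SEGMENTS:
        if address >= base:              # keep the last base not above the address
            name = label
    return name


print(segment_of(0x100))                 # a typical program start, LOC #100
print(segment_of(0x2000000000000008))
```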

    MMIXAL

    We've seen $ (register) and # (hex)

    Variable declarations: j IS $0

    Constant declarations: FIVE IS 5

    Segment markers: LOC #100

    Global Registers: n GREG 0

    Defining data: BYTE, WYDE, TETRA, OCTA

    Current location: @, aligned

    Register Stack

    256 registers are partitioned

    Special-purpose registers $rL and $rG

    With 0 ≤ L ≤ G < 256

    Local registers: $0, ..., $(L-1)

    Marginal registers: $L, ..., $(G-1)

    Global registers: $G, ..., $255

    $rG set at program startup, $rL=0

    Writing to a marginal register increases $rL, and intermediate registers are set to zero.

    Registers are pushed to stack S[0], ..., S[τ-1], which is big enough. τ is the current stack insertion point.
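The partition into local, marginal and global registers can be sketched as a toy Python class. `RegisterFile` is an illustrative name, the value of G is left to the caller, and the rule that a marginal register reads as zero comes from Knuth's specification rather than from the slide.

```python
class RegisterFile:
    """Toy model of the local / marginal / global register partition."""

    def __init__(self, g):
        if not 0 <= g < 256:
            raise ValueError("G must satisfy 0 <= G < 256")
        self.G = g                                   # $rG, fixed at startup
        self.local = []                              # $0 .. $(L-1); L starts at 0
        self.global_ = {n: 0 for n in range(g, 256)}

    @property
    def L(self):                                     # $rL
        return len(self.local)

    def kind(self, n):
        if not 0 <= n < 256:
            raise ValueError("register number must fit in a byte")
        if n < self.L:
            return "local"
        return "marginal" if n < self.G else "global"

    def read(self, n):
        kind = self.kind(n)
        if kind == "local":
            return self.local[n]
        return self.global_[n] if kind == "global" else 0   # marginal reads as 0

    def write(self, n, value):
        if self.kind(n) == "global":
            self.global_[n] = value
            return
        while self.L <= n:                           # marginal: grow L, zero-fill
            self.local.append(0)
        self.local[n] = value


regs = RegisterFile(g=250)                           # 250 is an arbitrary choice
regs.write(3, 7)                                     # $3 was marginal, so L becomes 4
print(regs.L, regs.kind(2), regs.read(2), regs.kind(4))
```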

    Subroutine calls

    PUSHJ $X,@+4*YZ[-262144]

    $X ← X, the number of registers in use, X < L

    Push $0, ..., $X on the stack S[τ], ..., S[τ+X]

    τ = τ + X + 1

    Move registers: $0 = $(X+1), ..., $(L-X-2) = $(L-1)

    $rL = L - X - 1

    $rJ = @ + 4 (return address)

    Control jumps to @ + 4*YZ or @ + 4*YZ - 262144 (4 * 2^16 = 262144)

    POP X,YZ

    Preserve X registers, and undo previous PUSHJ

    Control jumps to $rJ + 4*YZ
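The register shuffling done by PUSHJ is easier to follow on a concrete example. The Python sketch below models only the case X < L described on the slide; `pushj` and `pushj_target` are hypothetical helpers, the local registers and the stack are plain lists, and POP is not modelled.

```python
def pushj(local, stack, x):
    """Apply the register effects of PUSHJ $X (case X < L), in place."""
    if not 0 <= x < len(local):
        raise ValueError("this sketch only covers X < L")
    local[x] = x                     # $X <- X, remembered for the matching POP
    stack.extend(local[:x + 1])      # S[tau..tau+X] <- $0..$X, tau grows by X+1
    del local[:x + 1]                # $0 <- $(X+1), ...; L becomes L - X - 1


def pushj_target(at, yz, backward=False):
    """Address that control jumps to; the return address $rJ is at + 4."""
    offset = 4 * yz - (262144 if backward else 0)
    return (at + offset) % 2**64


local = [10, 11, 12, 13, 14]         # $0..$4, so L = 5
stack = []
pushj(local, stack, 2)
print(stack, local)                  # pushed registers, then the callee's view
print(hex(pushj_target(0x100, 3)))
```

After the call the callee sees the old $3 and $4 as its own $0 and $1, which is how arguments are passed without copying.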

    Program 1.2.10M

    Given n elements X[1] ... X[n], find m and j such that m = X[j] = max_{1≤i≤n} X[i], where j is the largest index that satisfies this relation.

    M1. [Initialize.] Set j ← n, k ← n - 1, m ← X[n].

    M2. [All tested?] If k = 0, the algorithm terminates.

    M3. [Compare.] If X[k] ≤ m, go to M5.

    M4. [Change m.] Set j ← k, m ← X[k].

    M5. [Decrease k.] Decrease k by one, return to M2.
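For comparison with the MMIX version, here is a direct Python transcription of steps M1 to M5. The function name `algorithm_m` is an illustrative choice, and a dummy element is placed at index 0 so that the indices match the 1-based X[1] ... X[n] of the text.

```python
def algorithm_m(values):
    """Return (m, j): the maximum and the largest 1-based index attaining it."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one element")
    X = [None] + list(values)        # X[1..n], index 0 unused
    j, k, m = n, n - 1, X[n]         # M1. Initialize
    while k != 0:                    # M2. All tested?
        if X[k] > m:                 # M3. Compare (skip to M5 when X[k] <= m)
            j, m = k, X[k]           # M4. Change m
        k -= 1                       # M5. Decrease k
    return m, j


print(algorithm_m([3, 9, 2, 9, 1]))
```

Because the scan runs from n down to 1 and only replaces m on a strictly larger element, ties are resolved in favour of the larger index, as the problem statement requires.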

    Example Program

    Maximum of X[1..100] (Algorithm 1.2.10M)

    Caller: set x0 to the address of X[0], so that X[k] is at x0 + 8k

            PUSHJ $1,Max100

    j IS $0; m IS $1; kk IS $2; xk IS $3;

    Max100  SETL  kk,100*8    M1. Initialize
            LDO   m,x0,kk
            JMP   1F
    3H      LDO   xk,x0,kk    M3. Compare
            CMP   t,xk,m
            PBNP  t,5F
    4H      SET   m,xk        M4. Change m
    1H      SR    j,kk,3
    5H      SUB   kk,kk,8     M5. Decrease k
            PBP   kk,3B       M2. All tested?
    6H      POP   2,0         Return to main program
