8/10/2019 Midterm SolHints
McMaster University
Department of Computing and Software
Dr. W. Kahl
COMP SCI 2GA3, SFWR ENG 3GA3
Midterm Solution Hints
2013-10-30
Computer Architecture
30 October 2013
1 CPU Speed Calculations 10%
Consider two different implementations, M1 and M2, of the same instruction set. There are three classes (A, B and C) of instructions in the instruction set.
M1 has a clock rate of 1500 MHz, and M2 has a clock rate of 2000 MHz. The average number of cycles for each instruction class on the two machines are as follows:

Instruction Class   CPI on M1   CPI on M2
A                   2           3
B                   4           4
C                   5           6
(a) If the number of instructions executed in a certain program is divided equally among the classes of instructions, how much faster is M2 than M1?
Solution Hints: Find the CPI of each machine first. CPI for M1 is \( \frac{11}{3} \); CPI for M2 is \( \frac{13}{3} \).
CPU time for M1 is \( \text{InstructionCount} \cdot \frac{11/3}{1500\,\text{MHz}} \)
CPU time for M2 is \( \text{InstructionCount} \cdot \frac{13/3}{2000\,\text{MHz}} \)
M2 has a smaller execution time, and is faster by the inverse ratio of the execution times, or \( \frac{11 \cdot 2000}{13 \cdot 1500} \approx 1.1282 \).
(One could also say that M2 is about 12.82% faster than M1.)
(b) Assuming the instruction distribution from (a), at what clock rate would M1 have the same performance as the 2000 MHz version of M2?
Solution Hints: M1 would be as fast if the clock rate were higher by a factor of 1.1282:
\( 1500\,\text{MHz} \cdot 1.1282 \approx 1692\,\text{MHz} \)
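The arithmetic for both parts can be checked with a short script. This is only a sketch: the helper names (`average_cpi`, `time_per_instruction`) are made up for illustration, and exact fractions are used so that no rounding enters before the final conversion.

```python
from fractions import Fraction

# CPI per instruction class (A, B, C), taken from the table in the question.
cpi_m1 = [2, 4, 5]
cpi_m2 = [3, 4, 6]
clock_m1_mhz = 1500
clock_m2_mhz = 2000

def average_cpi(cpis):
    """Average CPI when every class contributes the same number of instructions."""
    return Fraction(sum(cpis), len(cpis))

def time_per_instruction(cpis, clock_mhz):
    """Average time per instruction in microseconds (cycles divided by MHz)."""
    return average_cpi(cpis) / clock_mhz

# (a) M2 is faster by the inverse ratio of the execution times.
speedup = (time_per_instruction(cpi_m1, clock_m1_mhz)
           / time_per_instruction(cpi_m2, clock_m2_mhz))

# (b) M1 needs its clock rate raised by the same factor to match M2.
matching_clock_m1_mhz = clock_m1_mhz * speedup

print(average_cpi(cpi_m1), average_cpi(cpi_m2))  # 11/3 13/3
print(float(speedup))                            # roughly 1.1282
print(float(matching_clock_m1_mhz))              # roughly 1692.3
```

The instruction count cancels in the ratio, which is why the script never needs a value for it.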
2 Amdahl's Law 10%
You are going to enhance a machine, and there are two possible improvements: you could make memory write instructions run twice as fast as before, or you could speed multiplication instructions up by a factor of three. You repeatedly run a program that takes 100 seconds to execute.
Of this time, 10% is used for memory writes, 20% for multiplication, and 70% for other tasks.
(a) What will the speedup be if you improve only memory writes?
(b) What will the speedup be if you improve only multiplication?
(c) What will the speedup be if both improvements are made?
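Solution Hints: Using Amdahl's Law:
(a) Speedup for memory writes \( = \frac{100}{\frac{10}{2} + 20 + 70} \approx 1.0526 \)
(b) Speedup for multiplication \( = \frac{100}{10 + \frac{20}{3} + 70} \approx 1.1538 \)
(c) Speedup for both \( = \frac{100}{\frac{10}{2} + \frac{20}{3} + 70} \approx 1.2245 \)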
3 MIPS Assembly Programming: signum 15%
The mathematical sign function is defined as follows:
\[ \mathrm{signum}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases} \]
4 Adding an Addressing Mode 15%
In this question, we examine quantitatively the pros and cons of adding an addressing mode to MIPS that allows offsets to come from registers; for example
sw $s1, $s2($s3)
is now legal, and stores the contents of register $s1 into memory at the address obtained from adding the register contents of $s2 and $s3.
Since this instruction reads from three registers, instead of from at most two like all conventional MIPS instructions, it needs significantly more wiring and logic circuitry in and around the register file.
The primary benefit is that fewer instructions will be executed because we won't have to calculate variable-offset addresses via an addu instruction before issuing the lw or sw instruction.
For simplicity, we assume that the primary disadvantage is that the cycle time will have to increase to account for the additional time to perform register access.
Assume that the new instruction will cause the cycle time to increase by 10%. Use the instruction frequencies for the P1 benchmark from the table below. Assume that the new addressing mode affects only the clock speed, not the CPI. What percentage of data transfer instructions must be transformed into the new instructions, assuming each such transformation saves one add, to have at least the same performance?

Instruction class    MIPS examples         HLL correspondence                                     Frequency P1   Frequency P2
Arithmetic           add, sub, addi        operations in expressions                              50%            45%
Data transfer        lw, sw, lb, sb        references to data structures, such as arrays          30%            47%
Conditional branch   beq, bne, slt, slti   if statements and loops                                15%            7%
Jump                 j, jr, jal            procedure calls, returns, and case/switch statements   5%             1%

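Solution Hints: Let the program have n instructions, the original clock cycle time be t, and X be the fraction of load and store instructions that are transformed, each eliminating an add at the same time.
Then we have:
The original program had \( 0.3 \cdot n \) data transfer instructions and \( 0.5 \cdot n \) arithmetic instructions.
The transformed program also has \( 0.3 \cdot n \) data transfer instructions, but only \( (0.5 - 0.3 \cdot X) \cdot n \) arithmetic instructions.
\[ \mathit{exec}_{old} = n \cdot \mathit{CPI} \cdot t \]
\[ \mathit{exec}_{new} = (1 - X \cdot 0.3) \cdot n \cdot \mathit{CPI} \cdot 1.1 \cdot t \]
Therefore:
\begin{align*}
 & \mathit{exec}_{new} \le \mathit{exec}_{old} \\
\Leftrightarrow{} & \quad \langle\ \text{Def. } \mathit{exec}_{old}, \mathit{exec}_{new}\ \rangle \\
 & (1 - X \cdot 0.3) \cdot n \cdot \mathit{CPI} \cdot 1.1 \cdot t \le n \cdot \mathit{CPI} \cdot t \\
\Leftrightarrow{} & \quad \langle\ \text{Isotony of multiplication, with } n \cdot \mathit{CPI} \cdot t > 0\ \rangle \\
 & (1 - X \cdot 0.3) \cdot 1.1 \le 1 \\
\Leftrightarrow{} & \quad \langle\ \text{Isotony of division, with } 1.1 > 0\ \rangle \\
 & 1 - X \cdot 0.3 \le \frac{1}{1.1} \\
\Leftrightarrow{} & \quad \langle\ \text{Isotony of addition}\ \rangle \\
 & 1 - \frac{10}{11} \le X \cdot 0.3 \\
\Leftrightarrow{} & \quad \langle\ \text{Isotony of division, with } 0.3 > 0\ \rangle \\
 & \frac{1/11}{0.3} \le X \\
\Leftrightarrow{} & \quad \langle\ \text{Arithmetic: } \tfrac{10}{33} = 0.\overline{30}\ \rangle \\
 & 0.\overline{30} \le X
\end{align*}
We need to transform at least 30.4% of the data transfers. (If we transform exactly 30.3%, the new program will still be slower on the new machine than the old program on the old machine.)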
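The break-even point and the remark about 30.3% versus 30.4% can be confirmed numerically. The sketch below uses exact fractions; `relative_exec_time` is a hypothetical helper that returns the ratio of new to old execution time, in which n, CPI and t cancel.

```python
from fractions import Fraction

data_transfer_share = Fraction(30, 100)  # P1 data transfer frequency
cycle_time_factor = Fraction(11, 10)     # cycle time grows by 10%

def relative_exec_time(x):
    """exec_new / exec_old when a fraction x of the data transfers is transformed."""
    return (1 - data_transfer_share * x) * cycle_time_factor

# Solve (1 - 0.3 * X) * 1.1 = 1 for X.
break_even = (1 - 1 / cycle_time_factor) / data_transfer_share

print(break_even, float(break_even))               # 10/33, roughly 0.3030
print(relative_exec_time(Fraction(303, 1000)) > 1)   # True: 30.3% is still slower
print(relative_exec_time(Fraction(304, 1000)) <= 1)  # True: 30.4% is enough
```

The arithmetic share does not appear in the calculation: only the number of eliminated instructions, \( 0.3 \cdot X \cdot n \), affects the total instruction count.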
5 Floating-Point Representation 15%
Assume that $s3 contains the base address of array a. Consider the following assembly fragment:
lui $t0, 0x64CE
srl $t1, $t0, 24
addu $t2, $t1, $s3
sw $t0, 0($t2)
This can be understood as implementing the following pseudocode, with an int constant i and a float constant f:
a[i] := f;
Determine the decimal values of the index i and (possibly using decimal fractions d1/d2, so no calculator is necessary) of the floating-point number f.
Document the intermediate states and the bit pattern of the floating point representation of f.
Solution Hints:
lui $t0, 0x64CE      # t0 = 0x64CE0000
srl $t1, $t0, 24     # t0 = 0x64CE0000, t1 = 0x64
addu $t2, $t1, $s3   # t0 = 0x64CE0000, t1 = 0x64, t2 = &a[25]
sw $t0, 0($t2)       # t0 = 0x64CE0000, t1 = 0x64, t2 = &a[25], a[25] = 0 11001001 10011100000000000000000
i = 25 and \( f = (-1)^0 \cdot (1 + \frac{39}{64}) \cdot 2^{201-127} = (1 + \frac{39}{64}) \cdot 2^{74} = 1.609375 \cdot 2^{74} \approx 3.0400 \cdot 10^{22} \)
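The decoding can be cross-checked by splitting the word into its IEEE 754 single-precision fields and comparing against a direct reinterpretation of the same 32 bits. This is a sketch only; it assumes 4-byte array elements, as the a[25] in the solution implies, and the variable names are illustrative.

```python
import struct

word = 0x64CE << 16        # effect of lui $t0, 0x64CE
byte_offset = word >> 24   # effect of srl $t1, $t0, 24
index = byte_offset // 4   # assumes 4-byte array elements

# Split the 32-bit pattern into sign (1 bit), exponent (8 bits), fraction (23 bits).
sign = word >> 31
exponent = (word >> 23) & 0xFF
fraction = word & 0x7FFFFF

# Rebuild the value by hand, then let struct reinterpret the same 32 bits.
by_hand = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
by_struct = struct.unpack(">f", struct.pack(">I", word))[0]

print(hex(word), byte_offset, index)             # 0x64ce0000 100 25
print(sign, exponent, fraction / 2**23)          # 0 201 0.609375
print(by_hand, by_struct)
```

The fraction field 1001110 followed by zeros reads as 1/2 + 1/16 + 1/32 + 1/64 = 39/64, which is the 0.609375 printed above.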
6 C to MIPS Assembly 45%
The following C function definition is to be translated to MIPS assembly code, following the standard MIPS conventions for subroutine memory allocation and argument and result passing:
int f(int k, int A[]) {
    int i = 0;
    int s = 0;
    while (i < k) {
        A[i] = 2 * i + 1;
        if (2 * i > k)
            A[i - 1] = s - i;
        else
            s = s + A[i];
        i = i + 1;
    }
    return (s + 7);
}
(a) Document which variables will be stored in which registers.
(b) For the C function definition above, produce equivalent MIPS assembly code. Strive to use a minimal number of instructions and a minimal number of registers.
(c) How many registers did you use?
Solution Hints:
(a) We can store s in the return value register $v0, but a temporary would of course be possible, too.
k: $a0
A: $a1
s: $v0
i: $t0
If any variables are stored in $s* registers, these need to be saved to the stack first!
(b) One possible solution:
f:      addi  $t0, $zero, 0     # i := 0
        addi  $v0, $zero, 0     # s := 0
While:  slt   $t1, $t0, $a0     # t1 := (i < k)
        beq   $t1, $zero, Done
        sll   $t1, $t0, 1       # t1 := 2 * i
        slt   $t4, $a0, $t1     # t4 := (2 * i > k), for later use
        addiu $t1, $t1, 1       # t1 := 2 * i + 1
        sll   $t2, $t0, 2       # t2 := 4 * i
        addu  $t3, $a1, $t2     # t3 := &A[i]
        sw    $t1, 0($t3)       # A[i] := 2 * i + 1
        beq   $t4, $zero, Else  # if (2 * i > k) ...
        subu  $t2, $v0, $t0     # t2 := s - i
        sw    $t2, -4($t3)      # A[i - 1] := s - i
        j     Incr
Else:   addu  $v0, $v0, $t1     # s := s + A[i], i.e. s := s + 2 * i + 1
Incr:   addiu $t0, $t0, 1       # i := i + 1
        j     While
Done:   addiu $v0, $v0, 7       # result := s + 7
        jr    $ra
(c) Beyond the argument and result registers $a0, $a1, and $v0, the solution here uses five more: $t0 to $t4.
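One way to gain confidence in a translation like this is to model it at register level and compare it with a direct transcription of the C function. The sketch below does that under simplifying assumptions: registers are plain Python integers (no 32-bit wraparound), the array is a Python list indexed by word rather than by byte address, and the function names are invented for illustration.

```python
def f_reference(k, A):
    """Direct transcription of the C function; A is modified in place."""
    i = 0
    s = 0
    while i < k:
        A[i] = 2 * i + 1
        if 2 * i > k:
            A[i - 1] = s - i
        else:
            s = s + A[i]
        i = i + 1
    return s + 7

def f_register_trace(k, A):
    """Follows the loop of the assembly solution, one register update per step."""
    t0 = 0                        # addi $t0, $zero, 0
    v0 = 0                        # addi $v0, $zero, 0
    while t0 < k:                 # slt $t1, $t0, $a0 / beq $t1, $zero, Done
        t1 = t0 << 1              # sll $t1, $t0, 1
        t4 = 1 if k < t1 else 0   # slt $t4, $a0, $t1
        t1 = t1 + 1               # addiu $t1, $t1, 1
        A[t0] = t1                # sll, addu, sw $t1, 0($t3)
        if t4 != 0:               # beq $t4, $zero, Else
            A[t0 - 1] = v0 - t0   # subu, sw $t2, -4($t3)
        else:
            v0 = v0 + t1          # addu $v0, $v0, $t1
        t0 = t0 + 1               # addiu $t0, $t0, 1
    return v0 + 7                 # addiu $v0, $v0, 7

# Compare both versions on small inputs, including the resulting array contents.
for k in range(0, 12):
    a_ref, a_asm = [0] * k, [0] * k
    assert f_reference(k, a_ref) == f_register_trace(k, a_asm)
    assert a_ref == a_asm

example = [0, 0, 0]
print(f_reference(3, example), example)  # 11 [1, 2, 5]
```

The branch writing A[i - 1] is only taken when 2 * i > k, which cannot happen for i = 0 inside the loop, so neither version ever touches the element before the start of the array.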