Post on 21-Dec-2015
Gordon Moore, cofounder of Intel
1965: 2x transistors per chip every year
After 1970: 2x transistors per chip every 1.5 years
Moore's law
Growth in CPU transistor count
Consequences of Moore’s law
Cost of a chip remains roughly unchanged as density grows => cost per transistor goes down
Electrical path length is shortened => operating speed increases
Computers become smaller
Power requirements are reduced
More circuitry on each chip => fewer inter-chip connections => more reliable
Chap.4 The Role of Performance
Jen-Chang Liu, Spring 2005
Hardware performance is often key to the effectiveness of an entire system of hardware and software.
What do we mean by saying one computer has better performance than another?
Example: performance of airplanes
Performance of a hardware system
What do we mean by better performance? Higher speed?
Response time (execution time): the time between the start and completion of a task
Throughput: the total amount of work done in a given time
  Ex. multi-user system
Performance measure
Quantitative relation of performance and execution time on machine X:
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
* Relative performance: machine A is n times faster than B
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$
Ex. machine A runs a program in 10 sec., machine B runs the same program in 15 sec.
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{15}{10} = 1.5$$
Problem with previous definition of performance
The definition of execution time
  What if multiple tasks run concurrently?
Which programs should be used to evaluate the performance of a computer?
Execution time?
The total time to complete a task: response time, elapsed time
In a timeshared system, such as Unix, a processor works on several programs
Includes disk access, memory access, I/O, OS overhead…
Definition of execution time (user's point of view)
[Figure: timeline of Program A, swap, Prog. B, I/O, Program A; the whole span is the response time for A]
CPU time (CPU execution time)
Does not include time spent waiting for I/O or running other programs
CPU exec. time = user CPU time + system CPU time
  user CPU time: CPU time spent in the program
  system CPU time: CPU time spent in the OS on behalf of our program
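Example: CPU time
Unix command: time
90.7u 12.9s 2:39 65%
  90.7u: user CPU time (sec)
  12.9s: system CPU time (sec)
  2:39: elapsed time (159 sec)
  65%: share of elapsed time that was CPU time
$$\frac{90.7 + 12.9}{159} \approx 0.65$$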
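The 65% figure can be checked with a few lines of Python. This is only a sketch: the function name `cpu_fraction` is made up for illustration, and it assumes the elapsed time is given in the `m:ss` form shown in the example.

```python
def cpu_fraction(user_s, system_s, elapsed):
    """Return (user + system CPU time) / elapsed time.

    `elapsed` is assumed to be a string in 'm:ss' form, e.g. '2:39'.
    """
    minutes, seconds = elapsed.split(":")
    elapsed_s = int(minutes) * 60 + float(seconds)
    return (user_s + system_s) / elapsed_s


# Values taken from the example above: 90.7u 12.9s 2:39
fraction = cpu_fraction(90.7, 12.9, "2:39")
print(f"{fraction:.2f}")  # prints 0.65
```

The remaining 35% of the elapsed time went to I/O waits and to other programs sharing the processor.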
We focus on CPU performance, i.e. user CPU time, in the following discussion
Unit of time
  Seconds
  Clock cycles
Ex. Clock cycle time = 2 ns
$$\text{Clock rate} = \frac{1}{2 \times 10^{-9}\ \text{s}} = 500\ \text{MHz}$$
$$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
$$= \text{Instructions for a program} \times \text{Average clock cycles per instruction (CPI)} \times \text{Clock cycle time}$$
Example 1
Machines A and B have the same ISA and run the same program
  Machine A: clock cycle = 1 ns, CPI = 2
  Machine B: clock cycle = 2 ns, CPI = 1.2
CPU time A = Inst. count x CPI x clock cycle time = I x 2 x 1 ns = 2I ns
CPU time B = I x 1.2 x 2 ns = 2.4I ns
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I}{2I} = 1.2$$
A is 1.2 times faster than B
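A small Python sketch of Example 1, under stated assumptions: the function name `cpu_time` is made up, and the instruction count is an arbitrary placeholder, since it cancels out of the ratio.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical instruction count; any positive value gives the same ratio.
I = 1_000_000

time_a = cpu_time(I, cpi=2.0, clock_cycle_time_s=1e-9)  # machine A
time_b = cpu_time(I, cpi=1.2, clock_cycle_time_s=2e-9)  # machine B

# Performance_A / Performance_B = Execution time_B / Execution time_A
speedup = time_b / time_a
print(f"A is {speedup:.1f} times faster than B")
```

Machine A wins despite its higher CPI because its clock cycle is half as long.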
Example 2
A compiler generates 2 different code sequences

Instruction class   CPI
A                   1
B                   2
C                   3

         Instruction count per class
         A    B    C    Total inst.
Code 1   2    1    2    5
Code 2   4    1    1    6

CPU clock cycles 1 = 2x1 + 1x2 + 2x3 = 10 cycles
CPU clock cycles 2 = 4x1 + 1x2 + 1x3 = 9 cycles
Code 1 executes fewer instructions: faster?
Code 2 needs fewer clock cycles: faster
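The cycle counts of Example 2 can be reproduced with a short Python sketch. The names `CPI_BY_CLASS` and `total_cycles` are made up for illustration; the numbers come from the example.

```python
# CPI of each instruction class, from the example.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts):
    """Sum of (instruction count x CPI) over all instruction classes."""
    return sum(n * CPI_BY_CLASS[cls] for cls, n in counts.items())


code1 = {"A": 2, "B": 1, "C": 2}
code2 = {"A": 4, "B": 1, "C": 1}

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    instructions = sum(counts.values())
    cycles = total_cycles(counts)
    # Average CPI = total cycles / total instructions
    print(name, instructions, cycles, cycles / instructions)
```

Code 2 has the lower average CPI (1.5 against 2.0), which more than makes up for its extra instruction.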
Short conclusion
Computer performance (software, hardware)
  Response time
    CPU time
      Instruction count
      CPI
      Clock cycle length
    I/O, other programs
How to optimize them in a hardware design?
Choose programs to evaluate performance
Benchmarks: programs chosen to measure performance
SPEC (System Performance Evaluation Cooperative) suite of benchmarks
  Started in 1989
  http://open.specbench.org/
  SPEC95 in textbook is retired…
  SPECx contains a set of benchmark programs
SPEC – money…
SPEC95 benchmarks
Integer benchmarks, written in C
Floating-point benchmarks, written in Fortran 77
Summarize performance
Which is faster?

                    Computer A   Computer B
Program 1 (sec)     1            10
Program 2 (sec)     1000         100
Total time (sec)    1001         110

$$\frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution time}_A}{\text{Execution time}_B} = \frac{1001}{110} = 9.1$$
* Assume the programs occur with equal probability.
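The total-time comparison can be written as a short Python sketch. The dictionary layout and variable names are made up; the times are those in the table.

```python
# Execution times in seconds: [Program 1, Program 2]
times = {
    "Computer A": [1, 1000],
    "Computer B": [10, 100],
}

# Total execution time per machine, weighting both programs equally.
totals = {machine: sum(t) for machine, t in times.items()}

# Performance_B / Performance_A = Execution time_A / Execution time_B
ratio = totals["Computer A"] / totals["Computer B"]
print(totals)
print(f"B is {ratio:.1f} times faster than A")  # prints 9.1
```

The per-program picture differs: A is 10 times faster on Program 1 and B is 10 times faster on Program 2, so the summary depends on how the programs are weighted.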
SPEC ratio
The execution time of a benchmark program is normalized (compared to a baseline system)
SPECint95, SPECfp95
$$\text{SPEC ratio} = \frac{\text{Exec. time on Sun SPARCstation 10/40}}{\text{Exec. time on the measured machine}}$$
SPECint95 = geometric mean of SPEC ratios
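A minimal Python sketch of how a geometric mean of SPEC ratios could be computed. The function names are made up and the timings are hypothetical placeholders, not measured SPEC95 results.

```python
import math


def spec_ratio(baseline_time_s, measured_time_s):
    """Baseline execution time divided by time on the measured machine."""
    return baseline_time_s / measured_time_s


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (baseline, measured) times in seconds for two benchmarks.
timings = [(100.0, 50.0), (200.0, 25.0)]

ratios = [spec_ratio(b, m) for b, m in timings]  # 2.0 and 8.0
score = geometric_mean(ratios)                   # sqrt(2 x 8), about 4
print(ratios, round(score, 3))
```

Unlike the arithmetic mean, the geometric mean ranks machines the same way whichever machine is chosen as the baseline.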
Example: SPECint95 for Pentium and Pentium Pro
[Figure: SPECint (0 to 10) versus clock rate (50 to 250 MHz) for Pentium and Pentium Pro]
1. Performance improvement
2. Clock rate x2 => SPECint x1.7?
Amdahl's law in computing
$$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \frac{1}{\text{Clock rate}}$$
Clock rate x2 => CPU time x1/2
* Improvement of one aspect of a machine does not increase performance by the same ratio (only part of the machine is improved)
* Ex. The bottleneck in the memory system does not improve, as in the previous example
$$\text{Exec. time after improvement} = \frac{\text{Exec. time affected by improvement}}{\text{Amount of improvement}} + \text{Exec. time unaffected}$$
Example: Amdahl's law
A program takes 100 s to run: 20% multiplication, 50% memory operations, 30% others
What's the speedup for
  Multiply speed x4
  $$\frac{100}{20/4 + 50 + 30} = \frac{100}{85} \approx 1.18$$
  Memory access x2
  $$\frac{100}{20 + 50/2 + 30} = \frac{100}{75} \approx 1.33$$
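The same calculation as a Python sketch. The function name `amdahl_speedup` and its parameter names are made up for illustration; the numbers are those of the example.

```python
def amdahl_speedup(total_time, affected_time, improvement):
    """Speedup when only `affected_time` is made `improvement` times faster."""
    unaffected_time = total_time - affected_time
    new_time = affected_time / improvement + unaffected_time
    return total_time / new_time


# 100 s program: 20 s multiplication, 50 s memory operations, 30 s others.
multiply_x4 = amdahl_speedup(100, affected_time=20, improvement=4)
memory_x2 = amdahl_speedup(100, affected_time=50, improvement=2)

print(round(multiply_x4, 2))  # prints 1.18
print(round(memory_x2, 2))    # prints 1.33
```

The smaller improvement (x2) gives the larger speedup because it applies to a larger share of the execution time.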
MIPS as a measurement (not good…)
MIPS = Million Instructions Per Second
High MIPS => faster?
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6}$$
Pitfalls:
  MIPS cannot be used to compare computers with different instruction sets => inst. count differs
  MIPS varies between programs on the same computer => no single MIPS for a machine
Example: MIPS?
Example: 500 MHz machine
2 compilers for the same source program:

Instruction class   CPI
A                   1
B                   2
C                   3

         Inst. count (x10^9) for each inst. class
         A     B     C
Code 1   5     1     1
Code 2   10    1     1
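Example: MIPS?
$$\text{Exec. time}_1 = \frac{(5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9\ \text{cycles}}{500 \times 10^6\ \text{cycles/sec}} = 20\ \text{sec}$$
$$\text{Exec. time}_2 = \frac{(10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9\ \text{cycles}}{500 \times 10^6\ \text{cycles/sec}} = 30\ \text{sec}$$
$$\text{MIPS}_1 = \frac{\text{Inst. count}}{\text{Exec. time} \times 10^6} = \frac{(5 + 1 + 1) \times 10^9}{20 \times 10^6} = 350$$
$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^9}{30 \times 10^6} = 400$$
Exec. time 1 < Exec. time 2, yet MIPS 1 < MIPS 2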
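The MIPS example can be reproduced in Python. This is a sketch with made-up names (`exec_time`, `mips`, `CPI_BY_CLASS`); the clock rate, CPI values, and instruction counts come from the example.

```python
CLOCK_RATE_HZ = 500e6                    # 500 MHz machine
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}  # CPI of each instruction class


def exec_time(counts):
    """Execution time in seconds = total clock cycles / clock rate."""
    cycles = sum(n * CPI_BY_CLASS[cls] for cls, n in counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (exec_time(counts) * 1e6)


code1 = {"A": 5e9, "B": 1e9, "C": 1e9}   # output of compiler 1
code2 = {"A": 10e9, "B": 1e9, "C": 1e9}  # output of compiler 2

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    print(name, exec_time(counts), "sec,", mips(counts), "MIPS")
```

Code 2 has the higher MIPS rating but the longer execution time, which is why MIPS alone is a misleading measure.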