
Post on 21-Dec-2015


Page 1:

Moore's law

Gordon Moore, cofounder of Intel

1965: 2x transistors per chip every year

After 1970: 2x transistors per chip every 1.5 years

Page 2:

Growth in CPU transistor count

Page 3:

Consequences of Moore’s law

The cost of a chip stays roughly unchanged as density grows => the cost of logic and memory circuitry goes down

Electrical path length is shortened => operating speed increases

Computers become smaller

Power consumption is reduced

More circuitry on each chip => fewer inter-chip connections => more reliable

Page 4:

Chap.4 The Role of Performance

Jen-Chang Liu, Spring 2005

Page 5:

Hardware performance is often key to the effectiveness of an entire system of hardware and software.

What do we mean by saying one computer has better performance than another?

Page 6:

Example: performance of airplanes

Page 7:

Performance of a hardware system

What do we mean by better performance? Higher speed?

Response time (execution time): the time between the start and completion of a task, i.e. the time needed to finish the job

Throughput: the total amount of work done in a given time, i.e. the work completed per unit time

Ex. multi-user system

Page 8:

Performance measure

Quantitative relation of performance and execution time on machine X:

$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

* Relative performance:

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$

Machine A is n times faster than B

Ex. Machine A runs a program in 10 sec. and machine B runs the same program in 15 sec.:

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{15}{10} = 1.5$$
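The relative-performance definition can be checked with a few lines of Python. This is only a minimal sketch: the function name `relative_performance` is made up for illustration, and the inputs are the 10 sec. and 15 sec. figures from the example.

```python
def relative_performance(exec_time_a: float, exec_time_b: float) -> float:
    """Return Performance_A / Performance_B, which equals exec_time_b / exec_time_a."""
    if exec_time_a <= 0 or exec_time_b <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_b / exec_time_a


# Machine A: 10 sec., machine B: 15 sec. (values from the slide example).
n = relative_performance(10.0, 15.0)
print(f"Machine A is {n} times faster than B")
```

Swapping the arguments gives the reciprocal, i.e. how much slower B is than A.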

Page 9:

Problem with previous definition of performance

The definition of execution time

What if multiple tasks run concurrently?

Which programs should be used to evaluate the performance of a computer?

Page 10:

Execution time? (The definition of execution time)

The total time to complete a task: response time, elapsed time (the user's point of view)

In a timeshared system, such as Unix, a processor works on several programs

Includes disk access, memory access, I/O, OS overhead…

[Figure: timeline of Program A, swap, Program B, I/O, Program A; the whole span is the response time for A]

Page 11:

CPU time: CPU execution time

Does not include time spent waiting for I/O or running other programs

CPU execution time = user CPU time + system CPU time

User CPU time: CPU time spent in the program

System CPU time: CPU time spent in the OS on behalf of our program

Page 12:

Example: CPU time

Unix command: time

90.7u 12.9s 2:39 65%

User CPU time: 90.7 sec.

System CPU time: 12.9 sec.

Elapsed time: 2:39 (159 sec.)

$$\frac{90.7 + 12.9}{159} = 0.65$$

In the following discussion, CPU performance means user CPU time
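To make the 65% figure concrete, the sketch below recomputes it from the three numbers that `time` reported. The helper names `parse_elapsed` and `cpu_share` are hypothetical and only serve this illustration; the input values are the ones on the slide.

```python
def parse_elapsed(text: str) -> float:
    """Convert an elapsed time written as 'minutes:seconds' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def cpu_share(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of the elapsed time that was CPU time (user + system)."""
    return (user_s + system_s) / elapsed_s


elapsed = parse_elapsed("2:39")           # 159 seconds
share = cpu_share(90.7, 12.9, elapsed)    # (90.7 + 12.9) / 159
print(f"CPU time is {share:.0%} of the elapsed time")
```

The remaining share of the elapsed time went to I/O waits and other programs, which is why elapsed time and CPU time have to be kept apart.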

Page 13:

Unit of time

Seconds

Clock cycles

Ex. Clock cycle time = 2 ns

$$\text{Clock rate} = \frac{1}{2 \times 10^{-9}\ \text{s}} = 500\ \text{MHz}$$

$$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

$$= \text{Instructions for a program} \times \text{Average clock cycles per instruction (CPI)} \times \text{Clock cycle time}$$
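The CPU time relation translates directly into code. The following is a sketch under stated assumptions: the function names are invented here, the 2 ns cycle time comes from the slide, and the instruction count and CPI in the second call are hypothetical values chosen only to exercise the formula.

```python
def clock_rate_hz(clock_cycle_time_s: float) -> float:
    """Clock rate is the reciprocal of the clock cycle time."""
    return 1.0 / clock_cycle_time_s


def cpu_time_s(instruction_count: float, cpi: float, clock_cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# 2 ns cycle time from the slide.
rate = clock_rate_hz(2e-9)
print(f"Clock rate: {rate / 1e6:.0f} MHz")

# Hypothetical workload: 10^9 instructions at an average CPI of 1.5.
t = cpu_time_s(1_000_000_000, 1.5, 2e-9)
print(f"CPU time: {t:.2f} s")
```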

Page 14:

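Example 1

Machines A and B have the same ISA and run the same program

Machine A: clock cycle = 1 ns, CPI = 2

Machine B: clock cycle = 2 ns, CPI = 1.2

$$\text{CPU time}_A = \text{Inst. count} \times \text{CPI} \times \text{Clock cycle time} = I \times 2 \times 1\ \text{ns} = 2I\ \text{ns}$$

$$\text{CPU time}_B = I \times 1.2 \times 2\ \text{ns} = 2.4I\ \text{ns}$$

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I}{2I} = 1.2$$

A is 1.2 times faster than B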
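Because both machines run the same instruction count, the count cancels in the ratio. The sketch below illustrates this with a hypothetical count of one million instructions; any positive value would give the same ratio. The function name `cpu_time_ns` is an assumption made for this example.

```python
def cpu_time_ns(instruction_count: float, cpi: float, clock_cycle_ns: float) -> float:
    """CPU time in nanoseconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_ns


inst_count = 1_000_000  # hypothetical; cancels out in the ratio

time_a = cpu_time_ns(inst_count, cpi=2.0, clock_cycle_ns=1.0)  # 2 I
time_b = cpu_time_ns(inst_count, cpi=1.2, clock_cycle_ns=2.0)  # 2.4 I

ratio = time_b / time_a  # Performance_A / Performance_B
print(f"A is {ratio:.1f} times faster than B")
```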

Page 15:

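Example 2

The compiler generates 2 different code sequences

Instruction class    A    B    C
CPI                  1    2    3

Inst. count          A    B    C    Total inst.
Code 1               2    1    2    5
Code 2               4    1    1    6

$$\text{CPU clock cycles}_1 = 2 \times 1 + 1 \times 2 + 2 \times 3 = 10\ \text{cycles}$$

$$\text{CPU clock cycles}_2 = 4 \times 1 + 1 \times 2 + 1 \times 3 = 9\ \text{cycles}$$

Code 1 executes fewer instructions (faster?), but code 2 needs fewer clock cycles (faster)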
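The cycle counts in Example 2 can be reproduced with a short script. This is a minimal sketch: the dictionaries hold the CPI values and instruction counts from the slide, while the names `clock_cycles` and `average_cpi` are introduced here for illustration. The average CPI is an extra quantity derived from the same numbers.

```python
# CPI per instruction class and instruction counts per code sequence (from the slide).
CPI = {"A": 1, "B": 2, "C": 3}
CODE_1 = {"A": 2, "B": 1, "C": 2}
CODE_2 = {"A": 4, "B": 1, "C": 1}


def clock_cycles(counts: dict, cpi: dict = CPI) -> int:
    """Total clock cycles = sum over classes of (instruction count x class CPI)."""
    return sum(counts[cls] * cpi[cls] for cls in counts)


def average_cpi(counts: dict, cpi: dict = CPI) -> float:
    """Average CPI = total clock cycles / total instruction count."""
    return clock_cycles(counts, cpi) / sum(counts.values())


for name, counts in (("Code 1", CODE_1), ("Code 2", CODE_2)):
    print(name, sum(counts.values()), "instructions,",
          clock_cycles(counts), "cycles, average CPI", average_cpi(counts))
```

Code 2 runs more instructions but has the lower average CPI, so on the same clock it finishes first.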

Page 16:

Short conclusion

Computer performance (software, hardware)

Response time: CPU time; I/O, other programs

CPU time: instruction count; CPI; clock cycle length

How to optimize them in a hardware design?

Page 17:

Problem with previous definition of performance

The definition of execution time

What if multiple tasks run concurrently?

Which programs should be used to evaluate the performance of a computer?

Page 18:

Choose programs to evaluate performance

Benchmarks: programs chosen to measure performance

SPEC (System Performance Evaluation Cooperative) suite of benchmarks

Started in 1989

http://open.specbench.org/

SPEC95, used in the textbook, is retired…

SPECx contains a set of benchmark programs

Page 19:

SPEC – money…

Page 20:

SPEC95 benchmarks

Integer benchmarks: written in C

Floating-point benchmarks: written in Fortran 77

Page 21:

Summarize performance

Which is faster?

                    Computer A    Computer B
Program 1 (sec)              1            10
Program 2 (sec)           1000           100
Total time (sec)          1001           110

$$\frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution time}_A}{\text{Execution time}_B} = \frac{1001}{110} = 9.1$$

* Assume the programs occur with equal probability.
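The comparison by total execution time is easy to reproduce. The sketch below uses the four measurements from the table; the variable names are chosen for this illustration only. It also prints the per-program ratios to show why a single summary number is needed: each machine wins on one program.

```python
# Execution times in seconds, from the table: [Program 1, Program 2].
TIMES = {"A": [1, 1000], "B": [10, 100]}

# Total time per machine, assuming both programs run equally often.
total = {machine: sum(times) for machine, times in TIMES.items()}

# Performance_B / Performance_A = total time on A / total time on B.
ratio_b_over_a = total["A"] / total["B"]
print(f"Totals: {total}, B is {ratio_b_over_a:.1f} times faster than A")

# Per-program view: a ratio above 1 means B is faster on that program.
per_program = [a / b for a, b in zip(TIMES["A"], TIMES["B"])]
print("Time on A / time on B per program:", per_program)
```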

Page 22:

SPEC ratio

The execution time of a benchmark program is normalized (compared to a baseline system)

SPECint95, SPECfp95

$$\text{SPEC ratio} = \frac{\text{Exec. time on Sun SPARCstation 10/40}}{\text{Exec. time on the measured machine}}$$

SPECint95 = geometric mean of SPEC ratios
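A SPEC-style summary can be sketched as follows. The benchmark timings below are made-up numbers, not real SPEC95 results, and the function names are assumptions for this example; only the two formulas (the ratio against a reference machine and the geometric mean of the ratios) come from the slide.

```python
import math


def spec_ratio(reference_time_s: float, measured_time_s: float) -> float:
    """Execution time on the reference machine divided by time on the measured machine."""
    return reference_time_s / measured_time_s


def geometric_mean(values: list) -> float:
    """n-th root of the product of n positive values, computed via logarithms."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical timings (seconds) for three benchmark programs.
reference_times = [100.0, 400.0, 900.0]
measured_times = [50.0, 100.0, 100.0]

ratios = [spec_ratio(r, m) for r, m in zip(reference_times, measured_times)]
print("SPEC ratios:", ratios)
print(f"Geometric mean: {geometric_mean(ratios):.2f}")
```

Using the geometric mean means the ranking of two machines does not depend on which machine is chosen as the reference.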

Page 23:

Example: SPECint95 for Pentium and Pentium Pro

[Figure: SPECint versus clock rate (MHz, 50 to 250) for the Pentium and the Pentium Pro]

1. Performance improvement

2. Clock rate x2 => SPECint x1.7?

Page 24:

Amdahl's law in computing

$$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \frac{1}{\text{Clock rate}}$$

Clock rate x2 => CPU time x1/2?

* Improving one aspect of a machine (a partial improvement) does not increase performance by the same ratio

* Ex. The bottleneck in the memory system does not improve, as in the previous example

$$\text{Exec. time after improvement} = \frac{\text{Exec. time affected by improvement}}{\text{Amount of improvement}} + \text{Exec. time unaffected}$$

Page 25:

Example: Amdahl's law

A program takes 100 s to run: 20% multiplication, 50% memory operations, 30% others

What is the speedup if:

Multiplication is 4 times faster?

$$\frac{100}{20/4 + 50 + 30} = 1.18$$

Memory access is 2 times faster?

$$\frac{100}{20 + 50/2 + 30} = 1.33$$
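Both speedups follow from the same formula, so a small helper makes the comparison explicit. This is a sketch only: the function names are invented for this example, and the inputs are the 100 s program with its 20 s of multiplication and 50 s of memory operations.

```python
def time_after_improvement(affected_s: float, unaffected_s: float, improvement: float) -> float:
    """Amdahl's law: affected time shrinks by the improvement factor, the rest stays."""
    return affected_s / improvement + unaffected_s


def speedup(total_s: float, affected_s: float, improvement: float) -> float:
    """Overall speedup = old execution time / new execution time."""
    unaffected_s = total_s - affected_s
    return total_s / time_after_improvement(affected_s, unaffected_s, improvement)


# 100 s program: 20 s multiplication, 50 s memory operations, 30 s others.
print(f"Multiply 4x faster: speedup {speedup(100, 20, 4):.2f}")
print(f"Memory 2x faster:   speedup {speedup(100, 50, 2):.2f}")
```

The smaller improvement factor wins here because it applies to the larger share of the execution time.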

Page 26:

MIPS as a measurement (not good…)

MIPS = Million Instructions Per Second

High MIPS => faster?

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$$

Pitfalls:

MIPS cannot be used to compare computers with different instruction sets => inst. count differs

MIPS varies between programs on the same computer => no single MIPS for a machine

Page 27:

Example: MIPS?

Example: 500 MHz machine

Instruction class    A    B    C
CPI                  1    2    3

2 compilers for the same source program:

Inst. count (x10^9) for each inst. class
                     A    B    C
Code 1               5    1    1
Code 2              10    1    1

Page 28:

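Example: MIPS?

$$\text{Exec. time}_1 = \frac{(5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^{9}\ \text{cycles}}{500 \times 10^{6}\ \text{cycles/sec}} = 20\ \text{sec.}$$

$$\text{Exec. time}_2 = \frac{(10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^{9}\ \text{cycles}}{500 \times 10^{6}\ \text{cycles/sec}} = 30\ \text{sec.}$$

$$\text{MIPS}_1 = \frac{\text{Inst. count}}{\text{Exec. time} \times 10^{6}} = \frac{(5 + 1 + 1) \times 10^{9}}{20 \times 10^{6}} = 350$$

$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^{9}}{30 \times 10^{6}} = 400$$

Exec. time1 < Exec. time2, yet MIPS1 < MIPS2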
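The MIPS pitfall can be reproduced numerically. The sketch below uses the clock rate, CPI values, and instruction counts from the example; the function names `exec_time_s` and `mips` are assumptions introduced for illustration.

```python
CLOCK_RATE_HZ = 500 * 10**6            # 500 MHz machine
CPI = {"A": 1, "B": 2, "C": 3}         # cycles per instruction, by class

# Instruction counts per class produced by the two compilers.
CODE_1 = {"A": 5 * 10**9, "B": 1 * 10**9, "C": 1 * 10**9}
CODE_2 = {"A": 10 * 10**9, "B": 1 * 10**9, "C": 1 * 10**9}


def exec_time_s(counts: dict) -> float:
    """Execution time = total clock cycles / clock rate."""
    cycles = sum(counts[cls] * CPI[cls] for cls in counts)
    return cycles / CLOCK_RATE_HZ


def mips(counts: dict) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (exec_time_s(counts) * 10**6)


for name, counts in (("Code 1", CODE_1), ("Code 2", CODE_2)):
    print(f"{name}: {exec_time_s(counts):.0f} s, {mips(counts):.0f} MIPS")
```

Code 2 reports the higher MIPS rating even though it takes 10 seconds longer, which is the reason MIPS is a poor measure on its own.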