Computer Architecture I
Lecture 1: Welcome and Introduction
Instructor: David Black-Schaffer
TAs: Muneeb Khan and Andreas Sembrant
Today, Part 1
• About the Course
– What is computer architecture?
– Why should you care?
• Getting to Know One Another
• Administrative Details
– Registration
– Labs
– Grading
– Schedule
About the Course
• Introductory course to computer architecture (Not for IT/DV students; they take the 7.5hp one)
• How a computer is built
– Logic -> circuits -> datapath
• How a computer is controlled
– Basic operations -> microarchitecture -> instructions (ISA) -> assembly
• Contents (in order)
– MIPS assembly
– Logic design (adders, ALU, control)
– Performance analysis
– Data path and pipelining
– Input/Output
– Caches
– Virtual memory
After This Course, You Should…
• Understand the functionality and operation of the basic elements of a computer system, including processor, memory, and input/output
• Reason about first-order performance
• Understand the hardware/software interface
• Understand and be able to write programs in assembly language
Credits
• Slides and material adapted from
– Karl Marklund
– Justin Pearson
– Stefanos Kaxiras (Slides originally developed by Profs. Hill, Falsafi, Marculescu, Patterson, Rutenbar and Vijaykumar of CMU, Purdue, UCB, UW, Copyright 2003)
– Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006 Pearson Education, Inc.
Questions You Should Be Asking
• Why MIPS? (None of us has a MIPS computer…)
– It’s clean and easy to understand
– x86 is not
• Why should I study computer architecture?
Why Should You Study Computer Architecture?
• Press release from last week…
• ARM introduced the “big.LITTLE” processor
• Huh?
From The Register
Backup: Who Knows What ARM is?
• Q: How many of you have an ARM computer?
– A: All of you
From The Economist
ARM
What Exactly Are They Doing?
[Figure: panels labeled LITTLE and BIG]
• What is big.LITTLE?
• Big cores for high performance
• Little cores for low power
From ARM
Why is ARM Doing This?
[Figure: panels labeled LITTLE and BIG]
• Power Efficiency = calculations/energy
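As a minimal sketch of the definition above, the snippet below computes power efficiency as calculations per joule for two cores. The function name `power_efficiency` and every number in it are hypothetical, chosen only for illustration; they are not measurements from ARM.

```python
def power_efficiency(calculations: float, energy_joules: float) -> float:
    """Return power efficiency in calculations per joule."""
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return calculations / energy_joules


# Hypothetical workloads (assumed values, not ARM data):
# the big core does more work but spends disproportionately more energy.
big = power_efficiency(calculations=4e9, energy_joules=2.0)
little = power_efficiency(calculations=1e9, energy_joules=0.25)

if __name__ == "__main__":
    print(f"big:    {big:.2e} calculations/J")
    print(f"LITTLE: {little:.2e} calculations/J")
```

With these assumed inputs the little core completes fewer calculations in total but delivers more calculations per joule, which is the sense of "efficiency" used on this slide.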
The Details
LITTLE
• Simple (fewer functional units)
• Short pipeline (slower clock)
BIG
• Complex
• More functional units
• Out-of-order execution
• Long pipeline
• Faster clock
• Bigger branch penalty
From ARM
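One first-order way to reason about the pipeline trade-off listed above is to combine clock rate with the cost of mispredicted branches. The sketch below assumes the common textbook model, effective CPI = base CPI + branch fraction × misprediction rate × penalty (in cycles). The function names and all parameter values are hypothetical and are not taken from ARM's material.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Cycles per instruction including the average branch-misprediction cost."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def ns_per_instruction(cpi, clock_ghz):
    """Average time per instruction in nanoseconds (1 GHz = 1 cycle per ns)."""
    return cpi / clock_ghz


# Assumed values for illustration only.
# LITTLE: short pipeline, so a small branch penalty, but a slower clock.
little_cpi = effective_cpi(base_cpi=1.0, branch_fraction=0.2,
                           mispredict_rate=0.1, penalty_cycles=3)
# BIG: long pipeline, so a larger branch penalty, but a faster clock
# and a lower base CPI from its extra functional units.
big_cpi = effective_cpi(base_cpi=0.5, branch_fraction=0.2,
                        mispredict_rate=0.1, penalty_cycles=15)

little_ns = ns_per_instruction(little_cpi, clock_ghz=1.0)
big_ns = ns_per_instruction(big_cpi, clock_ghz=2.0)

if __name__ == "__main__":
    print(f"LITTLE: CPI {little_cpi:.2f}, {little_ns:.2f} ns/instruction")
    print(f"BIG:    CPI {big_cpi:.2f}, {big_ns:.2f} ns/instruction")
```

With these made-up inputs the big core pays the larger branch-penalty term (0.3 versus 0.06 cycles per instruction) yet still finishes an instruction sooner. The model says nothing about energy, which is the other half of the comparison.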
Why Can They Do This?
• Scaling: We can build more transistors than we know how to use
[Figure labels: 6/1995, 12/1996, 6/1998, 12/1999, 6/2001, 12/2002, 6/2004, 12/2005]
A whole 1995 processor fits in this much of a 2005 processor.
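The figure's date labels run from 6/1995 to 12/2005 in 18-month steps. As a rough sketch, assuming (this is an assumption, not something the slide states) that transistor density doubles at each step, the arithmetic below estimates what fraction of a 2005 chip a whole 1995 processor would occupy. All names are illustrative.

```python
FIRST = (1995, 6)    # (year, month) of the earliest label in the figure
LAST = (2005, 12)    # (year, month) of the latest label in the figure
STEP_MONTHS = 18     # spacing between consecutive labels


def months_between(start, end):
    """Number of months from start to end, each given as (year, month)."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


months = months_between(FIRST, LAST)
steps = months // STEP_MONTHS

# ASSUMPTION: density doubles once per 18-month step.
density_factor = 2 ** steps
area_fraction = 1 / density_factor

if __name__ == "__main__":
    print(f"{months} months = {steps} steps of {STEP_MONTHS} months")
    print(f"density factor: {density_factor}x")
    print(f"1995 design occupies about 1/{density_factor} of the 2005 area")
```

Under that doubling assumption, 126 months is 7 steps, so the same design would need roughly 1/128 of the area.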
Why Should They Do This?
• Can’t increase power: Need to improve power efficiency
Figure 2. Historical growth in single-processor performance and a forecast of processor performance to 2020, based on the ITRS roadmap. A dashed line represents expectations if single-processor performance had continued its historical trend.
[Figure 2 axes: vertical axis labeled "Clock frequency (MHz)", logarithmic scale from 10 to 1,000,000; horizontal axis "Year of introduction", 1985 to 2020]
tion of the International Technology Roadmap for Semiconductors (www.itrs.net/Links/2009ITRS/Home2009.htm) predicts this growth continuing through the next decade, but we will probably be unable to continue increasing transistor density for CMOS circuits at the current pace for more than the next 10 years.
Figure 2 shows this expectation gap using a logarithmic vertical scale. In 2010, this gap for single-processor performance is approximately a factor of 10; by 2020, the gap will have grown to about a factor of 1,000. Most economic or societal sectors implicitly or explicitly expect computing to deliver steady, exponentially increasing performance, but these graphs show traditional single-processor computing systems will not match expectations.
By 2020, we will see a large “expectation gap” for single processors. After many decades of dramatic exponential growth, single-processor performance is slowing and not expected to improve in the foreseeable future. Energy and power constraints play an important and growing role in computing performance. Computer systems require energy to operate, and, as with any device, the more energy needed, the more expensive the system is to operate and maintain. Moreover, the energy consumed by the system ends up as heat that must be removed. Even with new parallel models and solutions, the performance of most future computing systems will be limited by power or energy in ways the computer industry and researchers have yet to confront.
For example, the benefits of replacing a single, highly complex processor with increasing numbers of simpler processors will eventually reach a limit when further simplification costs more in performance than it saves in power. Power constraints are thus inevitable for systems ranging from handheld devices to the largest computing datacenters, even as the transition is made to parallel systems.
Total energy consumed by computing systems is already substantial and continues to grow rapidly in the US and elsewhere around the world. As is the case in other economic sectors, the total energy consumed by computing will come under increasing pressure.
Even if we succeed in sidestepping the limits on single-processor performance, total energy consumption will remain an important concern, and growth in performance will become limited by power consumption within a decade.
[Figure 1 axes and legend: logarithmic vertical scale from 0 to 10,000,000; horizontal axis "Year of introduction", 1985 to 2010; series: relative performance, number of transistors (thousands), clock speed (MHz), power type (W), number of cores/chip; annotation: "Transistors per chip"]
Figure 1. Transistors, frequency, power, performance, and processor cores over time. The original Moore’s law projection of increasing transistors per chip remains unabated even as performance has stalled.
Power Wall
From Fuller 2011
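The excerpt above puts the single-processor expectation gap at about a factor of 10 in 2010 and about a factor of 1,000 by 2020. As a small worked example, and assuming the gap widens at a constant yearly rate (an assumption made here, not a claim from the article), the implied growth per year can be computed as follows. The function name is illustrative.

```python
def implied_annual_growth(gap_start: float, gap_end: float, years: int) -> float:
    """Constant yearly factor that turns gap_start into gap_end over `years`."""
    if years <= 0 or gap_start <= 0 or gap_end <= 0:
        raise ValueError("years and gaps must be positive")
    return (gap_end / gap_start) ** (1 / years)


# Figures quoted in the excerpt: ~10x in 2010, ~1,000x by 2020.
years = 2020 - 2010
annual_growth = implied_annual_growth(10.0, 1000.0, years)

if __name__ == "__main__":
    print(f"gap widens by about {annual_growth:.3f}x per year")
```

Under that constant-rate assumption the gap widens by a factor of roughly 1.58 every year, since 100 to the power 1/10 is about 1.585.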
What are Others Doing?
• GPUs: Lots and lots of very small cores
Nvidia Fermi AMD Barts
This is Computer Architecture
• Understanding performance and efficiency
• Design tradeoffs in executing instructions
• Building the hardware
• Making it programmable
So, Why Should You Care?
• Computers are evolving very fast
• Need to understand how they work to understand why they are changing
• Computer Architecture is critical to performance and efficiency
• Not just about designing hardware:
– How does big.LITTLE affect software?
– How easy is it to program a GPU?
Questions?
GETTING TO KNOW ONE ANOTHER
About Me
• I’m American (as if you haven’t noticed…)
– From a different system (e.g., graded homework)
– May speak too quickly
– Tell me if I’m doing something wrong (and how to make it better)
• Background
– Power-efficient computer architecture
– Taught intro/advanced digital design courses
– Worked in industry (Apple) on CPU/GPU programming systems (OpenCL)
– Speak/understand Swedish pretty well
About You
• What program(s) are you in?
• What year(s) are you in?
• Why are you taking this course?
• Will you do the assigned reading before class? (Would in-class short quizzes help?)
• Would you do suggested practice problems?
About Your Background
• How many have taken an imperative programming course?
– Java, C, C++, C#, FORTRAN (I hope not…), MATLAB
– Not functional languages such as ML or Erlang
• How many have seen digital design?
– AND/OR gates, mux/demux, Karnaugh maps
– Flip-flops, finite state machines
• How many care about the speed of your code?
Prerequisites
• Basic programming (programmeringsteknik II)
for (int i = 0; i < 10; i++) {
    a[i] = calculate(size, b[i]);
}
• Interest in how computers work and why they are fast or slow
Feedback from Students Last Year
• Overall evaluation: 3.4
• What was good?
– The lab assignments (but they were a lot of work)
– The textbook
• What could be improved?
– Administration of the course
– Assignment details
– Fit in all the lectures
• What are we changing?
– Clear schedule, deadlines, and office hours
– Extra TA-run tutorial for second lab
– Extra review sessions at the end
Questions?
ADMINISTRATIVE DETAILS
Registering for the Course
• Admitted students:
– Register online in studentportalen
• Unregistered students:
– Register at www.antagning.se (before 7 Nov)
• MS students: contact your program coordinator
• Exchange students: contact Ulrika Jaresund
• Students with older registrations: contact IT dept.
Reading
“Computer Organization & Design: The Hardware/Software Interface”, Patterson and Hennessy. 4th Edition, Morgan Kaufmann 2007. (Third edition is fine.)
• This is a great book and the lectures will largely follow the flow of the book.
• You should read the book. (Really, please read it.)
Labs
• MIPS Assembly (Programming)
– 2-Nov to 9-Nov (1 week)
– Tutorial 1-Nov
• 32-bit Adder (Logic Design)
– 9-Nov to 16-Nov (1 week)
– Tutorial 11-Nov
• Data Path (Logic Design)
– 16-Nov to 28-Nov (~2 weeks)
• Interrupt-driven I/O (Programming)
– 30-Nov to 7-Dec (1 week)
• Memory-mapped I/O (Programming)
– 7-Dec to 14-Dec (1 week)
• Labs will be done in pairs. If you can’t find a partner, let us know and we’ll arrange one for you.
Grading
• 5hp = 3hp (exam) + 2hp (labs), UG/3/4/5
– All labs must be submitted on time for any lab credit
– Missed labs may be turned in after the course, but will limit the total lab grade to 3
– Labs will be graded within 1 week on a scale of 1-5
– You have 1 week to request a lab re-grade
• Why so harsh?
– Labs are the best way for you to learn the material
– We want you to take them seriously and get them done on time
• 2 late days to use during the course
– You must tell us when you use them
– Note that you still have a lab due the next week
How to Get Help
• Office Hours
– David: Thursday 13.00-15.00 (office 1240)
– Muneeb: Monday 9.00-11.00 (P1549)
– Andreas: Wednesday 15.00-17.00 (P1549)
(Other times by appointment, but no guarantees.)
• Email
– Preface all email subjects with “dark:”
– Lab/grading questions to Muneeb/Andreas
– Course administration questions to David
– General questions to any of us
– We will try to respond promptly, and at worst by our next scheduled office hours.
Schedule
Week | Lecture | Reading | Lab | Session
43 | Intro | | |
43 | Instruction Set Architecture I | 2.1-2.6 | |
44 | Instruction Set Architecture II | 2.7-2.10 (stop at the Java bit), 2.13 (good summary), 2.17, 2.18 | MIPS Assembly | SPIM/LogicSim
44 | Arithmetic and Integer Numbers | 3.1-3.4, 3.5 (optional), 3.6 (to the MIPS operands), 3.8-3.9 | |
45 | Logic | B.1-B.3, B.5 | 32-bit Adder | Logic
45 | Performance | 4.1-4.6 | |
46 | Data path 1 | 5.1-5.4 | Data path |
46 | Data path 2 (multicycle) | 5.5-5.6, 5.10-5.11 | |
47 | Data path 3 (pipelining) | 6.1-6.3 | |
47 | Data path 4 (hazards) | 6.4-6.6, (6.9 optional), 6.11-6.12 | |
48 | I/O | 8.1-8.5, 8.9-8.10 | Interrupt-driven I/O |
48 | Memories | 7.1, B.9 | |
49 | Caches | 7.2-7.3 | Memory-mapped I/O |
49 | Virtual Memory 1 | 7.4 | |
50 | Virtual Memory 2 | 7.5, 7.7-7.8 | |
50 | Review | | | Review
51 | Exam | | |
TIME TO GET STARTED!