Computer Architecture: Computer Organization and Structure
Instructor: 左瑞麟 [email protected]
Office: 200314, ext. 62328
Course web page: http://www.cs.nccu.edu.tw/~raylin/UndergraduateCourse/ComputerArchitecture/2009.htm
2009/9/15 2
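Course Objectives
This course introduces the basic concepts of computer hardware and how it is built. Using a variety of examples, it explains the relevant topics in an accessible way, so that students come to understand the organization of a computer and its key technologies.
Course Outline
1. Learn the design principles of electronic computer systems.
2. Become familiar with the structure and operation of the central processing unit.
3. Become familiar with the design of instruction set architectures and the trade-offs involved.
4. Understand the relationship between the CPU and its peripheral devices and how they operate together.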
Course Schedule
Week  Date   Topic
1     9/15   Syllabus; Chapter 1: Computer Abstractions and Technology
2     9/22   Chapter 2: Instructions: Language of the Computer
3     9/29   Chapter 2: Instructions: Language of the Computer
4     10/6   Chapter 3: Arithmetic for Computers
5     10/13  Chapter 3: Arithmetic for Computers
6     10/20  Chapter 3: Arithmetic for Computers; Chapter 4: The Processor
7     10/27  Chapter 4: The Processor
8     11/3   Chapter 4: The Processor
9     11/10  Chapter 4: The Processor
10    11/17  Mid-term Exam
11    11/24  Chapter 4: The Processor; Chapter 5: Large and Fast: Exploiting Memory Hierarchy
12    12/1   Chapter 5: Large and Fast: Exploiting Memory Hierarchy
13    12/8   Chapter 5: Large and Fast: Exploiting Memory Hierarchy
14    12/15  Chapter 5: Large and Fast: Exploiting Memory Hierarchy
15    12/22  Chapter 6: Storage and Other I/O Topics
16    12/29  Chapter 6: Storage and Other I/O Topics
17    1/5    Chapter 7: Multicores, Multiprocessors, and Clusters
18    1/12   Final Exam
Teaching Method
- Lectures (slides)
- Homework and quizzes
Prerequisite
- Some background in assembly language
- Boolean algebra
- Logic design
Grading
- Homework, reports, and class participation: 30%
- Mid-term exam: 30%
- Final exam: 40%
Textbook
- D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3rd ed., Morgan Kaufmann, 2004
- D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 4th ed., Morgan Kaufmann, 2009
Chapter 1: Computer Abstractions and Technology
Introduction
Applications that were computer science fiction in the recent past:
- Automatic teller machines (ATMs)
- Computers in automobiles
- Laptop computers
- Human genome project
- World Wide Web (WWW)
Ubiquitous computing
Computer classification
- Desktop computers
- Servers
  - Minicomputers
  - Mainframes
  - Supercomputers
- Embedded computers
Characteristics of embedded computers
- A computer inside another device
  - Microprocessors found in a car, a cell phone, a PDA, etc.
- Used for running one predetermined application
- Have unique application requirements that combine a minimum performance with stringent limitations on cost or power
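[Figure: y-axis in millions of computers]
Figure 1.1 The number of processors of each type sold between 1998 and 2002. These counts are obtained somewhat differently, so some caution is required in interpreting the results. For example, the totals for desktops and servers count complete computer systems; because some of these contain multiple processors, the number of processors sold is somewhat higher, but probably by only 10 to 20% in total (servers may average more than one processor per system, but they amount to only about 3% of desktop sales, which are predominantly single-processor systems). The totals for embedded computers actually count processors. In some embedded systems the processor is not even visible, and some single devices contain multiple processors.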
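[Figure: y-axis in millions of processors]
Figure 1.2 Sales of processors between 1998 and 2002, by instruction set architecture. The "other" category refers to application-specific or customized processors. In the case of ARM, roughly 80% of sales are for cell phones, which combine an ARM core with application-specific logic on a single chip.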
Things you'll be learning
- How are programs written in a high-level language translated into the language of the hardware?
- What is the interface between the software and the hardware?
- How does software instruct the hardware to perform needed functions?
- What determines the performance of a program?
- How can the performance be analyzed and improved?
Why learn this stuff
- You want to call yourself a "computer scientist"
- You want to build software people use (which needs performance)
- You need to make a purchasing decision or offer "expert" advice
Understanding program performance
Hardware or software component: how this component affects performance
- Algorithm: determines both the number of source-level statements and the number of I/O operations executed
- Programming language, compiler, and architecture: determines the number of machine instructions for each source-level statement
- Processor and memory system: determines how fast instructions can be executed
- I/O system (hardware and operating system): determines how fast I/O operations may be executed
Below your program
- The hardware in a computer can only execute low-level instructions
- Several layers of software are needed to interpret or translate high-level operations into simple computer instructions
- These layers of software are organized primarily in a hierarchical fashion
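[Figure labels: Hardware, Systems software, Applications software]
Figure 1.3 A hierarchical view of hardware and software, drawn as concentric circles with hardware at the center and applications software as the outermost ring.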
Types of systems software
- Operating systems
  - Handling basic input and output operations
  - Allocating storage and memory
  - Providing for sharing the computer among multiple applications using it simultaneously
- Compilers
  - Translating high-level language statements into instructions that the hardware can execute, which is a complex task
From a high-level language to the language of hardware
- The machine alphabet has just two letters: "on" and "off"
- 1 and 0 are the two symbols for these two letters
- We refer to each letter as a binary digit, or bit
- Instructions are just collections of bits that the computer understands
  - For example: 1000110010100000
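The idea that an instruction is just a collection of bits can be made concrete with a short sketch. It takes the bit pattern quoted on the slide and reads it as a count of bits, an unsigned integer, and a hexadecimal value. The variable names are illustrative, and nothing here assumes how a real machine would decode the pattern.

```python
# The 16-bit pattern quoted on the slide, treated purely as a string of bits.
instruction_bits = "1000110010100000"

# Each character is one binary digit (bit).
bit_count = len(instruction_bits)

# The same bits read as an unsigned integer and as four hexadecimal digits.
as_integer = int(instruction_bits, 2)
as_hex = format(as_integer, "04X")

print(bit_count, as_integer, as_hex)
```

The hardware sees only the pattern itself; integer and hexadecimal forms are simply more compact ways for people to write it down.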
- The first programmers communicated with computers in binary numbers
- They quickly invented new notations that were closer to the way humans think
  - These notations were translated to binary by hand
- Finally, they used the machine to help program the machine
  - The first of these programs was named an assembler
- Assembler: a program that translates a symbolic version of instructions into the binary version
- For example, the assembler translates add A, B into 1000110010100000
- This symbolic language is called assembly language
- High-level programming language: a portable language such as C, Fortran, or Java, composed of words and algebraic notation, that a compiler can translate into assembly language
  - A + B -> add A, B -> 1000110010100000
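The two translation steps described above (compiler, then assembler) can be sketched as two table lookups. This is a toy illustration only: the tables hold just the single example from the slide, the function names are hypothetical, and the bit pattern is not claimed to be a real instruction encoding.

```python
# Toy translation tables. Each holds only the slide's example and is
# illustrative; neither describes a real compiler or instruction format.
COMPILE_TABLE = {"A + B": "add A, B"}               # high-level -> assembly
ASSEMBLE_TABLE = {"add A, B": "1000110010100000"}   # assembly -> binary


def compile_statement(statement: str) -> str:
    """Translate a high-level statement into assembly language (toy lookup)."""
    return COMPILE_TABLE[statement.strip()]


def assemble(instruction: str) -> str:
    """Translate an assembly instruction into its binary form (toy lookup)."""
    return ASSEMBLE_TABLE[instruction.strip()]


assembly = compile_statement("A + B")
machine_code = assemble(assembly)
print("A + B", "->", assembly, "->", machine_code)
```

A real compiler and assembler work from grammar rules and an instruction format rather than a fixed table, but the division of labor between the two stages is the same.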
Benefits of high-level programming languages:
- Allow the programmer to think in a more natural language
- Improve programmer productivity
- Allow programs to be independent of the computer on which they were developed
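[Figure labels: High-level language (C), Compiler, Assembly language (MIPS), Assembler, Binary machine language (MIPS)]
Figure 1.4 A C program compiled into assembly language and then assembled into binary machine language. Although the translation from a high-level language to binary machine language is shown in two steps, some compilers cut out the intermediate step and produce binary machine language directly. These languages and this program are examined in more detail in Chapter 2.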
Under the covers
- The five classic components of a computer are input, output, memory, datapath, and control
- The last two are sometimes combined and called the processor
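[Figure labels: Compiler, Interface, Computer, Input, Output, Control, Datapath, Processor, Memory, Evaluating performance]
Figure 1.5 The organization of a computer, showing the five classic components. The processor gets instructions and data from memory. Input devices write data to memory, and output devices read data from memory. Control sends the signals that determine the operations of the datapath, memory, input, and output.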
- Input device: a mechanism through which the computer is fed information
  - Keyboard, mouse
- Output device: a mechanism that conveys the result of a computation to a user or another computer
  - Screen
- Some devices provide both input and output to the computer
  - Networks, disks
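Figure 1.6 A desktop computer. The liquid crystal display (LCD) screen is the primary output device, and the keyboard and mouse are the primary input devices. The box contains the processor as well as additional I/O devices. The system shown is a Dell Optiplex GX260.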
Anatomy of a mouse
- The original mouse was electromechanical
- It used a large ball that, when rolled across a surface, would cause an x counter and a y counter to be incremented
- The amount of increase in each counter told how far the mouse had been moved
All-optical mouse
- A miniature optical processor including
  - An LED to provide lighting
  - A tiny black-and-white camera
  - A simple optical processor
- It has largely replaced the electromechanical mouse
Through the looking glass
- Cathode ray tube (CRT) display
- Flat-panel liquid crystal displays (LCDs)
- The computer hardware support for graphics consists mainly of a raster refresh buffer, or frame buffer, that stores the bit map, a matrix of bits representing the picture elements (pixels)
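[Figure labels: Frame buffer, Raster scan CRT display]
Figure 1.7 Each coordinate in the frame buffer on the left determines the image at the corresponding coordinate of the raster scan CRT display on the right. Pixel (X0, Y0) holds the bit pattern 0011, which represents a lighter shade of gray, while pixel (X1, Y1) holds the bit pattern 1101, which is darker.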
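A frame buffer of the kind shown in Figure 1.7 can be sketched as a two-dimensional array holding one small bit pattern per pixel. The buffer size and the two coordinates below are hypothetical stand-ins for (X0, Y0) and (X1, Y1); only the 4-bit patterns 0011 and 1101 come from the caption. How a pattern maps to a shade is left to the display hardware and is not modeled.

```python
BITS_PER_PIXEL = 4  # matches the 4-bit patterns in the caption


def make_frame_buffer(width, height):
    """Create a frame buffer with every pixel set to the pattern 0000."""
    return [[0] * width for _ in range(height)]


def set_pixel(frame_buffer, x, y, pattern):
    """Store a bit pattern at column x, row y."""
    if not 0 <= pattern < 2 ** BITS_PER_PIXEL:
        raise ValueError("pattern does not fit in one pixel")
    frame_buffer[y][x] = pattern


def get_pixel(frame_buffer, x, y):
    """Read back the bit pattern stored at column x, row y."""
    return frame_buffer[y][x]


# Hypothetical 8 x 4 display; the coordinates are chosen only for illustration.
WIDTH, HEIGHT = 8, 4
frame_buffer = make_frame_buffer(WIDTH, HEIGHT)
set_pixel(frame_buffer, 1, 0, 0b0011)  # stands in for (X0, Y0)
set_pixel(frame_buffer, 5, 2, 0b1101)  # stands in for (X1, Y1)

# Storage needed for the whole bit map.
total_bits = WIDTH * HEIGHT * BITS_PER_PIXEL

print(format(get_pixel(frame_buffer, 1, 0), "04b"),
      format(get_pixel(frame_buffer, 5, 2), "04b"),
      total_bits)
```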
Opening the box
- Motherboard
  - A plastic board containing packages of integrated circuits, or chips
- Integrated circuit
  - Also called a chip: a device combining dozens to millions of transistors
- The motherboard is composed of
  - The piece connecting to the I/O devices
  - The memory
    - A storage area in which programs are kept when they are running and that contains the data needed by the running programs
  - The processor
    - The active part of the computer, which contains the datapath and control and adds numbers, tests numbers, signals I/O devices to activate, and so on
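[Figure labels: Power supply, Fan with cover, Motherboard, DVD drive, Zip drive, Hard disk]
Figure 1.8 Inside the personal computer of Figure 1.6. This packaging is sometimes called a clamshell because of the way it opens, with hinges on one side. To see what is inside, start at the top left. The metal box at the top left is the power supply, and just below it is the fan with its cover. To the lower right of the fan is a printed circuit (PC) board, called the motherboard in a computer, which contains most of the computer's electronics. Figure 1.10 is a close-up of such a board. The processor is the large raised rectangle just to the right of the fan. On the right-hand side are the bays that hold the various drives: a DVD drive at the top, a Zip drive in the middle, and the hard disk at the bottom.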
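[Figure labels: Control (several blocks), Other interface logic, I/O interface, Instruction cache, Data cache, Enhanced floating point and multimedia, Secondary cache and memory interface, Advanced pipelining and multithreading support]
Figure 1.9 Inside the processor chip used on the board shown in Figure 1.8. The left-hand side is a microphotograph of the Pentium 4 processor chip, and the right-hand side shows the major blocks in the processor.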
The processor (CPU) comprises two main components:
- Datapath
  - Performs arithmetic operations
- Control
  - Commands the datapath, memory, and I/O devices according to the instructions of the program
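The split between datapath and control can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of any real processor: the instruction names (load, add, sub, store, out), the three registers, and the tuple format are all invented for the example. The Datapath class only does arithmetic, while the run function plays the role of control by reading each instruction and telling the datapath, memory, and output what to do.

```python
class Datapath:
    """Holds a few registers and performs arithmetic on them."""

    def __init__(self):
        self.registers = {"r0": 0, "r1": 0, "r2": 0}

    def add(self, dest, a, b):
        self.registers[dest] = self.registers[a] + self.registers[b]

    def sub(self, dest, a, b):
        self.registers[dest] = self.registers[a] - self.registers[b]


def run(program, memory):
    """Control: step through the program and command the other components."""
    datapath = Datapath()
    output = []          # stands in for an output device
    pc = 0               # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]          # fetch and decode
        if op == "load":                 # memory -> register
            dest, address = args
            datapath.registers[dest] = memory[address]
        elif op == "add":                # arithmetic is done by the datapath
            datapath.add(*args)
        elif op == "sub":
            datapath.sub(*args)
        elif op == "store":              # register -> memory
            source, address = args
            memory[address] = datapath.registers[source]
        elif op == "out":                # register -> output device
            output.append(datapath.registers[args[0]])
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return output


memory = [7, 5, 0]
program = [
    ("load", "r0", 0),
    ("load", "r1", 1),
    ("add", "r2", "r0", "r1"),
    ("store", "r2", 2),
    ("out", "r2"),
]
result = run(program, memory)
print(result, memory)
```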
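[Figure labels: Processor, Memory, Processor interface, I/O bus slots, Graphics card, Disk and USB interfaces]
Figure 1.10 Close-up of a PC motherboard. This board uses the Intel Pentium 4 processor, located in the upper left of the board. It is covered by a finned metal heat sink, which helps the chip shed heat. The main memory is contained on one or more small boards that stand perpendicular to the motherboard near the middle. The DRAM chips are mounted on these small boards, called dual inline memory modules (DIMMs), which are then plugged into the connectors. Much of the rest of the board is used to connect external I/O devices: audio/MIDI and the parallel/serial ports at the right, two peripheral component interconnect (PCI) card slots near the bottom, and an advanced technology attachment (ATA) connector for attaching hard disks.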
- Dynamic random access memory (DRAM)
  - Several DRAMs are used together to contain the instructions and data of a program
  - It provides random access to any location
- Cache memory
  - A small, fast memory that acts as a buffer for the DRAM memory
  - Built using a different memory technology, static random access memory (SRAM)
  - SRAM is more expensive than DRAM
A safe place for data
- Primary memory (main memory): volatile memory
  - DRAM
- Secondary memory: nonvolatile memory
  - Magnetic disk (hard disk)
  - CD
  - DVD
  - Flash
- Access times for magnetic disks are much slower than for DRAMs
  - DRAM is about 100,000 times faster
- The cost per megabyte of disk is about 100 times lower than that of DRAM
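The two ratios on this slide can be turned into a small worked calculation. The DRAM access time and DRAM cost below are hypothetical placeholder values chosen only to make the arithmetic visible; the ratios of 100,000 and 100 are the ones stated on the slide.

```python
# Hypothetical baseline values, chosen only for illustration.
DRAM_ACCESS_NS = 50.0      # assumed DRAM access time in nanoseconds
DRAM_COST_PER_MB = 1.0     # assumed DRAM cost per megabyte, arbitrary units

# Ratios stated on the slide.
SPEED_RATIO = 100_000      # DRAM is about 100,000 times faster than disk
COST_RATIO = 100           # disk is about 100 times cheaper per megabyte

disk_access_ns = DRAM_ACCESS_NS * SPEED_RATIO
disk_access_ms = disk_access_ns / 1_000_000   # nanoseconds -> milliseconds
disk_cost_per_mb = DRAM_COST_PER_MB / COST_RATIO

print(f"disk access: {disk_access_ms} ms, disk cost per MB: {disk_cost_per_mb}")
```

Under these assumed numbers, a nanosecond-scale DRAM access corresponds to a millisecond-scale disk access, which is why the two are used at different levels of the memory hierarchy.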
Figure 1.11 A hard disk showing 10 platters and the read/write heads.
Communicating with other computers
- Several major advantages of networked computers:
  - Communication
  - Resource sharing
  - Nonlocal access
- Ethernet is the most popular type of network
- Local area network (LAN)
- Wide area network (WAN)
Technologies for building processors and memories
- Vacuum tube
  - An electronic component, predecessor of the transistor
- Transistor
  - An on/off switch controlled by an electric signal
- Integrated circuit (IC)
  - Combines dozens to hundreds of transistors on a single chip
- Very large scale integrated (VLSI) circuit
  - A device containing hundreds of thousands to millions of transistors
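Year  Technology used in computers          Relative performance/unit cost
1951  Vacuum tube                           1
1965  Transistor                            35
1975  Integrated circuit                    900
1995  Very large scale integrated circuit   2,400,000
2005  Ultra large scale integrated circuit  6,200,000,000
Figure 1.12 Relative performance per unit cost of technologies used in computers over time. Source: Computer Museum, Boston, with the 2005 figure extrapolated by the authors.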
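[Figure axes: Year of introduction (x), Kbit capacity (y)]
Figure 1.13 Growth of capacity per DRAM chip over time. The y-axis is measured in Kbits, where K means 1024 (2^10). For twenty years the DRAM industry quadrupled capacity almost every three years, which is about 60% per year. This "four times every three years" estimate was called the DRAM growth rule. In recent years the growth rate has slowed and is somewhat closer to doubling every two years, or quadrupling every four years.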
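The growth rates quoted in the caption can be checked with a line of arithmetic: growing by a factor f over n years corresponds to an annual rate of f^(1/n) - 1. The helper name below is illustrative.

```python
def annual_growth_rate(factor, years):
    """Annual growth rate equivalent to growing by `factor` over `years` years."""
    return factor ** (1.0 / years) - 1.0


quadruple_every_3 = annual_growth_rate(4, 3)   # the DRAM growth rule
double_every_2 = annual_growth_rate(2, 2)      # the slower recent rate
quadruple_every_4 = annual_growth_rate(4, 4)   # same rate, stated differently

print(f"{quadruple_every_3:.1%} {double_every_2:.1%} {quadruple_every_4:.1%}")
```

Quadrupling every three years works out to a little under 59% per year, which the caption rounds to 60%. Doubling every two years and quadrupling every four years are the same rate, roughly 41% per year.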
Moore’s law Transistor capacity doubles every 18 to 24 months
Processor              Year of introduction  Transistors
4004                   1971                  2,250
8008                   1972                  2,500
8080                   1974                  5,000
8086                   1978                  29,000
286                    1982                  120,000
386 processor          1985                  275,000
486 DX processor       1989                  1,180,000
Pentium processor      1993                  3,100,000
Pentium II processor   1997                  7,500,000
Pentium III processor  1999                  24,000,000
Pentium 4 processor    2000                  42,000,000
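As a rough check of Moore's law against the table above, the sketch below computes the average doubling time between the first and last rows (the 4004 in 1971 and the Pentium 4 in 2000). It uses only those two data points, so it is an average over the whole period and not a fit to every row.

```python
import math

# First and last rows of the table above.
first_year, first_count = 1971, 2_250        # 4004
last_year, last_count = 2000, 42_000_000     # Pentium 4

# Number of times the transistor count doubled over the period.
doublings = math.log2(last_count / first_count)

# Average time per doubling, in months.
months_per_doubling = (last_year - first_year) * 12 / doublings

print(f"{doublings:.1f} doublings, one every {months_per_doubling:.1f} months")
```

The result is a little over 14 doublings in 29 years, or roughly one doubling every two years, which sits at the slow end of the 18-to-24-month range quoted for Moore's law.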
The chip manufacturing process
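[Figure labels: Silicon ingot, Slicer, Blank wafers, 20 to 40 processing steps, Patterned wafers, Wafer tester, Tested wafer, Dicer, Tested dies, Bond die to package, Packaged dies, Part tester, Tested packaged dies, Ship to customers]
Figure 1.14 The chip manufacturing process. After the silicon ingot is sliced, the blank wafers go through 20 to 40 processing steps to create patterned wafers. The patterned wafers are then tested with a wafer tester, which produces a map of the good parts. The wafers are then diced into individual dies. In this figure, one wafer yields 20 dies, of which 17 pass testing (an X marks a bad die). The yield in this example is 17/20, or 85%. The good dies are then packaged and tested once more before being shipped to customers. In this example, one packaged part turns out to be bad.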
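The yield arithmetic in the Figure 1.14 caption can be written out directly. All three input numbers come from the caption; the variable names and the "overall yield" quantity (good packaged parts divided by dies on the wafer) are introduced here for illustration.

```python
# Numbers taken from the Figure 1.14 caption.
dies_per_wafer = 20
dies_passing_wafer_test = 17
bad_after_packaging = 1

# Yield at wafer test: good dies divided by total dies.
wafer_yield = dies_passing_wafer_test / dies_per_wafer

# Parts that survive the final test after packaging.
good_packaged_parts = dies_passing_wafer_test - bad_after_packaging

# Illustrative overall figure: shippable parts per die on the wafer.
overall_yield = good_packaged_parts / dies_per_wafer

print(f"wafer yield {wafer_yield:.0%}, "
      f"{good_packaged_parts} shippable parts, overall {overall_yield:.0%}")
```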
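Figure 1.15 An 8-inch (200 mm) wafer containing Intel Pentium 4 chips. At 100% yield, the wafer holds 165 Pentium dies. Each die has an area of 250 mm² and contains 55 million transistors. It is made in a 0.18 micron process, meaning that the smallest transistors are about 0.18 micron in size. The Pentium 4 is also manufactured in a more advanced 0.13 micron process. The several dozen partially formed dies around the edge of the wafer are useless; they are included because doing so makes it easier to create the masks used to pattern the wafer.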
Figure 1.16 An Intel Pentium 4 (3.06 GHz) chip mounted on its heat sink, which must remove the 82 watts of heat generated by the chip.
Instruction set architecture
- A very important abstraction
  - The interface between the hardware and low-level software
- Allows computer designers to talk about functions independently of the hardware that performs them
- Advantage: allows different implementations of the same architecture
- Disadvantage: sometimes prevents the use of new innovations