Hw/Sw Co-design Development Trends, Jun-Dong Cho, VLSI Algorithmic Design Automation Lab., School of...
Post on 19-Dec-2015
Hw/Sw Co-design Development Trends
Jun-Dong Cho, VLSI Algorithmic Design Automation Lab. (http://vada.skku.ac.kr), School of Information and Telecommunication, Sungkyunkwan Univ.
Copyrightⓒ2005 J.D.Cho,
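Module 1: The Need for and Overview of an Integrated Design Methodology
Learning objectives
As chip integration density rises, the concurrency of processing components also grows, as in MPSoC and NoC. System design and verification then become harder problems, so this module surveys the outlook for integrated design technology that prepares for this trend.
Prerequisites: logic design, computer architecture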
Contents
Motivation
Multiprocessor platform
On-chip communication (Network on Chip)
Low power design
HW/SW codesign methodology
MOTIVATION
Post-PC = mobile computing + intelligent environment: 10^9 times the bandwidth and 10^6 times the power consumption
About 3 GOPS is needed to find a song within 0.5 s by humming against a database of 2000 songs; 3D TV also requires several GOPS.
According to the National Technology Roadmap for Semiconductors, by 2010 a single chip would integrate 4 billion transistors at 50 nm, with a clock speed of 10 GHz.
A new design methodology is required to handle wiring delay and intrinsic electrical noise.
Ultra-low energy (10-100 MOPS/mW), ultra-low cost
S/W and H/W co-design, S/W-driven design reuse (e.g., Software-Defined Radio)
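The figures on this slide can be combined into a rough power budget. The sketch below only does that arithmetic: it assumes the 3 GOPS workload is sustained and that the 10-100 MOPS/mW range applies to it directly. The function names are illustrative.

```python
# Figures taken from the slide; everything else is plain arithmetic.
GOPS_NEEDED = 3.0                        # humming-based song search
SEARCH_TIME_S = 0.5
SONGS_IN_DB = 2000
EFFICIENCY_MOPS_PER_MW = (10.0, 100.0)   # "ultra-low energy" target range


def power_mw(gops: float, mops_per_mw: float) -> float:
    """Power needed to sustain `gops` at a given energy efficiency."""
    return gops * 1000.0 / mops_per_mw


def ops_per_song(gops: float, seconds: float, songs: int) -> float:
    """Average operation budget per database entry for one query."""
    return gops * 1e9 * seconds / songs


if __name__ == "__main__":
    low_eff, high_eff = EFFICIENCY_MOPS_PER_MW
    print(power_mw(GOPS_NEEDED, low_eff))    # 300.0 (mW at 10 MOPS/mW)
    print(power_mw(GOPS_NEEDED, high_eff))   # 30.0 (mW at 100 MOPS/mW)
    print(ops_per_song(GOPS_NEEDED, SEARCH_TIME_S, SONGS_IN_DB))  # 750000.0
```

Under these assumptions the search fits in roughly 30 to 300 mW, which is why the efficiency target matters for a portable device.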
Predicting the future
1899: Charles H. Duell, U.S. Patent Office: "Everything that can be invented has been invented."
1943: Thomas J. Watson, Chairman of the Board, IBM: "I think there is a world market of about five computers."
1948: IBM: "The computer has no commercial value."
1981: Bill Gates, Chairman, Microsoft: "640 kilobytes of RAM ought to be enough for anybody."
McKinsey Curve: dynamics of R&D disciplines
[Figure: maturity of a discipline vs. year, passing through fundamental issues, consolidation, and saturation (limitations met); a new discipline then starts on top of it by innovation]
EDA Industry Revolutions
EDA industry paradigm switching every 7 years (courtesy [Keutzer / Newton])
1978: Transistor entry: Applicon, Calma, CV ...
1985: Schematics entry: Daisy, Mentor, Valid ...
1992: Synthesis: Cadence, Synopsys ...
1999: HLLs, (Co-)Compilation
Data-stream-based DPU arrays
2006: closer to programmers' mind set
Embedded Systems and Portable Computing
92% of market
Knowledge base needed
Hardware/Software Codesign
A Multimedia Embedded Chip
What are the properties of these Ambient Intelligence architectures?
Silicon technology roadmap
Development directions
Wireless processing systems need high throughput and heavy computation, yet operate under strict power constraints.
Reconfigurable SoC implementations seek performance gains through parallelism and rely on IP reuse.
Algorithm partitioning is guided by performance prediction of hot-spot bottlenecks.
Communication bandwidth is being widened to meet the large burst data transfers demanded by a growing range of multimedia applications.
Dual-core architecture (ARM+DSP) -> multiprocessor SoC
Key Challenges With Chip Design
MULTIPROCESSOR PLATFORM
HIERARCHY OF PLATFORMS
Recent research trends
Intel's Reconfigurable Radio Architecture (mesh + nearest neighbor)
Reconfigurable Baseband Processing, Picochip
Portable Components using Containers for Heterogeneous Platforms, Mercury Computer Systems, Inc.
A configurable platform: Altera Excalibur, Xilinx Virtex FPGA
Adaptive Computing Machine, Quicksilver Tech.
Mercury, Sky, Galileo, Tundra (crossbars, bridges)
Virginia Tech's reconfigurable hardware
Full application platform
Users design full applications on top of hardware and software architectures
Nexperia
Texas Instruments' OMAP multimedia platform
Infineon's M-Gold 3G wireless platform
Parthus' Bluetooth platforms
ARM's PrimeXsys wireless platform
OMAP™ (Open Multimedia Application Platform)
The OMAP architecture includes SW/OS support that controls clocking and idle modes across the whole platform.
The dual-core architecture makes it possible to assign each task to the processor best suited to it.
Processor-centric platform
Focuses on access to a configurable processor but doesn't model complete applications
Program-in Chip-out (PICO), HP Lab.
UC Berkeley, GARP
Improv Systems
ARC
Tensilica
Triscend
Fully programmable platform
Consists of FPGA logic and a processor core
System on a programmable chip (SOPC)
Altera's Excalibur, Xilinx's Virtex-II Pro, and QuickLogic's QuickMIPS
Xilinx-IBM XBlue architecture
Coarse-grain Reconfigurable Computing, Reiner Hartenstein, TU Kaiserslautern
The new machine paradigm: Configware is going mainstream
Hardware / Configware / Software co-design is the new mind set for digital systems engineering
Fine-grain Morphware: Drawbacks
FPGA architectures: SRAM-based look-up tables (LUTs)
Problems:
Routing: reduces performance
Bad ratio: active / passive elements
[Figure: configurable logic blocks (CLBs) with LUTs and reconfigurable interconnect (switching boxes). Source: R. Hartenstein]
Reconfigurability Overhead
[Figure: array of cells labeled L and S, contrasting the resources needed for reconfigurability (partly for configuration code storage; "hidden RAM" not shown) with the area used by the application]
Merit of Coarse-grain Approach
[Figure: MOPS/mW (log scale, 0.001 to 1000) vs. feature size in µm (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07) for hardwired logic, rDPAs (reconfigurable computing)*, FPGAs (reconfigurable logic), DSP, instruction set processors, and standard microprocessors. T. Claasen et al.: ISSCC 1999]
Wiring by abutment: a 32-bit KressArray example
If coarse-grain cells are full custom and mesh-connected, and 2nd-level interconnect resources are laid out over the cells, the array is almost as area-efficient as hardwired.
*) R. Hartenstein: ISIS 1997
Communication-centric platform
Provides an interconnect architecture but doesn't typically provide a processor or a full application
Sonics' SiliconBackplane
PalmChip's CoreFrame architectures
What's coming next? The History of Paradigm Shifts
"Mainstream silicon application is switching every 10 years"
"The programmable System-on-a-Chip is the next wave"
[Figure: Makimoto's Wave (published in 1989), swinging between standard and custom at 1957, 1967, 1977, 1987, 1997, and 2007. Labels: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable; 1st Design Crisis; 2nd Design Crisis; ?]
The anti universe
Paul Dirac predicted a complete anti universe consisting of antimatter
"There are regions in the universe, which consist of antimatter .....
We are not aware that there is a new area in computing sciences which consists of the antimatter of computing
.... But there are asymmetries"
Reconfigurable Computing is made from this antimatter: data-stream-based computing
When a particle hits its antiparticle, both are converted into energy: annihilation
[Figure: hydrogen and anti-hydrogen]
Machine and Anti Machine
Machine paradigm ("von Neumann"): instruction stream, CPU
1936: 1st electronic computer (Konrad Zuse)
1946: v. N. machine paradigm
1971: 1st microprocessor (Ted Hoff)
Anti machine paradigm: data stream, DPU
1979: "data streams" (systolic array: Kung / Leiserson)
1990: anti machine paradigm published
1995: rDPA / DPSS (supersystolic: Rainer Kress)
Novel compilation techniques
IBM's CoreConnect
Started at 32 bits and has since extended its bandwidth up to 128 bits
Sonics Smart Interconnect IP
SMART (Sonics Methodology and Architecture for Rapid Time-to-Market)
Plug-and-play on-chip communications network
Packet-based
50 employees in a year
Provides IP and a design environment, and supports SoC design
Allied with Cadence
SiliconBackplane III targets communications + media
Nexperia Digital Video Platform
Designing the initial platform, along with the PNX8500, wasn't quick and easy. It involved about 300 hardware, software, and systems people working between 1999 and 2001, of whom 60 were involved with hardware.
Microprocessor Architecture Research
• Wave Pipelining, Prof. Mike Flynn at Stanford
• Multithreaded Processors
• Single-Chip Multiprocessors, Prof. Kunle Olukotun at Stanford
• Vector/Stream Processors, Prof. Bill Dally at Stanford
• Intelligent RAM, Prof. Dave Patterson at U.C. Berkeley
• Reconfigurable Computing, DARPA program
Don Alpert
Single-Chip Multiprocessors
Hydra Project, Prof. Kunle Olukotun at Stanford
Targets thread-level parallelism
4 CPUs on a chip
3-level cache hierarchy
Parallelizing compiler technology
Don Alpert
Advantages of multi-processors:
Performance: possibility to exploit thread-level parallelism combined with ILP
Energy: low energy cost per instruction by customizing the nodes (ASIPs) + effective memory hierarchy and distributed customisable organisation
Flexibility: programmable nodes
Scalability: memory bandwidth is scalable (if a good memory hierarchy is used)
Special processor accelerators
SoC designs with special-purpose processor accelerators attached to the common bus have been used
IBM PowerNP
Operate in parallel with the processor
Dedicated to specific tasks
Programmable & flexible
[Figure: SoC block diagram with a Processor Local Bus (PLB) and an On-chip Peripheral Bus (OPB) joined by an OPB bridge. Blocks shown: 440 PowerPC, DMA controller, DDR SDRAM controller, SRAM, PCI-X bridge, 10/100/1G Ethernet MAC, special processor accelerators, UART (2), I2C (2), GPIO, GPT, timers, interrupt controller, RAM/ROM/peripheral controller, external bus master, arbiter]
C.J. Georgiou, V. Salapura, M. Denneau, "A Programmable Scalable Platform for Next Generation Networking," Network Processor Design, Issues and Practices, Vol. 2, Morgan-Kaufmann Publishers, 2004
Advantages of multiprocessor subsystems in SoC design
The multiprocessor subsystem is connected to the SoC bus via a bridge
This separation accommodates different speeds, bus widths, signals, and signaling protocols between the SoC bus and the multiprocessor interconnect
The subsystem interconnect fabric (i.e., switch) is optimized for multiprocessor operation
Only data traffic flows between the multiprocessor and the rest of the SoC
The computational capacity of the multiprocessor subsystem is parameterized
The number of processor clusters, embedded memories, and memory sizes can be optimized for the particular application
Software development is simplified
Basic communication and system management primitives are already available to the designer of the application
Samy Mefitali
ON CHIP COMMUNICATION (NETWORK ON CHIP)
NoC (Network on Chip), U.C. Berkeley
Ports a communication-network structure onto a single semiconductor chip
Transfer protocols are defined according to the OSI model
DSPs, microprocessors, memories, etc. are connected within a single chip using H/W-S/W co-design
Code optimization and construction of a low-power software IP library
Bus structure for connecting modules
Components
Region: an area that permits a special topology / network structure
Backbone
Wrapper: converts transmitted messages into the appropriate form
Complex; suited to complex, large-scale systems
Adaptive System on Chip
Scheduled Communication
A tiled architecture: each tile is a computational core, and the core interfaces together form the network
The core interface allows heterogeneous processing to be used in one or more tiles
The system is connected using a statically scheduled mesh of interconnect
Data moves between neighboring tiles through a communication pipeline, which allows a fast clock rate and time-division sharing of interconnect resources
The ability to reconfigure cores and interconnect at run time enables dynamic power management.
Communication Interface
- Stream data that passes through a communication interface is scheduled for a specific communication clock cycle based on data link availability.
- The result of scheduling for each interface is a set of instructions for its associated interconnect memory.
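The scheduling idea can be sketched as a small greedy scheduler. This is an illustration under stated assumptions, not the aSOC tool flow: routes are dimension-ordered (x first, then y), each hop takes one cycle, and every stream gets the earliest start cycle at which all links on its route are free. The names `xy_route` and `schedule` are hypothetical.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]          # (x, y) position in the mesh
Link = Tuple[Tile, Tile]        # directed link between neighboring tiles


def xy_route(src: Tile, dst: Tile) -> List[Link]:
    """Dimension-ordered route: move along x first, then along y."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def schedule(streams: Dict[str, Tuple[Tile, Tile]]):
    """Give each stream the earliest start cycle with all its links free.

    Data is pipelined: hop k of a stream uses its link at cycle start + k.
    Returns per-stream start cycles and, per sending tile, a list of
    (cycle, stream, next_tile) entries standing in for interconnect memory.
    """
    busy = set()                                  # reserved (link, cycle) pairs
    start_cycle: Dict[str, int] = {}
    instructions: Dict[Tile, List[Tuple[int, str, Tile]]] = {}
    for name, (src, dst) in streams.items():
        route = xy_route(src, dst)
        start = 0
        while any((link, start + k) in busy for k, link in enumerate(route)):
            start += 1
        start_cycle[name] = start
        for k, link in enumerate(route):
            busy.add((link, start + k))
            instructions.setdefault(link[0], []).append((start + k, name, link[1]))
    return start_cycle, instructions


if __name__ == "__main__":
    demo = {"a": ((0, 0), (2, 0)), "b": ((0, 0), (1, 1)), "c": ((1, 0), (2, 0))}
    cycles, program = schedule(demo)
    print(cycles)            # {'a': 0, 'b': 1, 'c': 0}
    print(program[(0, 0)])   # [(0, 'a', (1, 0)), (1, 'b', (1, 0))]
```

Stream b is delayed by one cycle because it shares its first link with stream a, which is the "data link availability" constraint in miniature.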
9-core and 16-core Mode
Evaluation Methodology
From buses to networks on chip?
Low-power wireless networks
Uncertain knowledge of physical medium
Communication is the dominant energy consumer
Can we adapt design and optimization techniques from (wireless) networking to SoC communication?
Packetized communication on chip
Requires an overhaul of architectures, CAD, and software to be communication-centric
Protocol stack: simple (3-layer) vs. more complex (7-layer ISO/OSI)
Physical, Data link, Network
What is NoC?
Fewer but 'programmable' wires, by introducing switches (routers)
Shared bus: communication bottleneck
Point-to-point connection: many under-utilized long wires
Structured approach to interconnect: wires are short, running either onto the network or from router to router
Separation of computation (IPs) and communication (NoC)
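A quick count makes the wiring argument concrete. The sketch compares a full point-to-point interconnect with a 2D mesh of routers; the mesh topology and the helper names are assumptions chosen for illustration, and links are counted as bidirectional.

```python
def point_to_point_links(n_cores: int) -> int:
    """One dedicated bidirectional link between every pair of cores."""
    return n_cores * (n_cores - 1) // 2


def mesh_links(side: int) -> int:
    """Router-to-router links in a side x side 2D mesh."""
    return 2 * side * (side - 1)


def mesh_max_hops(side: int) -> int:
    """Longest shortest path in the mesh, corner to opposite corner."""
    return 2 * (side - 1)


if __name__ == "__main__":
    for side in (3, 4, 8):
        n = side * side
        print(n, point_to_point_links(n), mesh_links(side), mesh_max_hops(side))
    # 9 36 12 4
    # 16 120 24 6
    # 64 2016 112 14
```

The point-to-point count grows quadratically with the number of cores, while the mesh grows linearly and every mesh link spans only one tile; the price is multi-hop latency.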
Future SoC Interconnect Challenges
Network Architectures and control
Giovanni De Micheli
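Application areas
- Supports protocols that guarantee selective QoS, making it suitable for real-time applications and for applications that demand high data bandwidth
- SoC design for high-volume multimedia applications such as high-frame-rate video and 3D graphics
- A platform-based design environment is built by combining the core on-chip network IP and design support tools into a single platform, which is then used across a variety of SoC designs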
What next from networking?
[Figure: techniques carried over from computer networking to System on Chip design: on-demand wakeup, packet switching, rumor routing, CDMA, error correction, ???]
On-chip communications are evolving from deterministic baseband signaling interconnects to on-chip networking and communications
Indeed, complete integration of all layers of a networked node on a single chip
physical: transceiver, modem
link/MAC: packet scheduling
routing: routing protocols
transport: TCP
application: adaptive buffering
The IC designer is also a networked system designer.
[Figure: two nodes, each with Application, OS & Middleware, Transport, Network, MAC/Link, and Physical layers plus CODEC, actuator, sensor, and peripherals; the nodes are related by PROTOCOLS and by a flow of bits]
Technology challenges: Global interconnect wires
Global communication structures become performance and power bottlenecks [ITRS Roadmap 2001]
• Gate delay decreasing 25% per generation
• Wire delay increasing 100% per generation
• Communicating across a chip:
  - 1 clock at 400 MHz in 0.35 μm
  - 12.4 clocks at 1 GHz in 0.1 μm
Global wires violate scaling laws
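The numbers above can be turned into absolute times and a compounding trend. The sketch assumes that the per-generation percentages compound multiplicatively; the function names are illustrative.

```python
def crossing_time_ns(clocks: float, freq_mhz: float) -> float:
    """Time to cross the chip, from the clock count and clock frequency."""
    return clocks * 1000.0 / freq_mhz


def relative_delay(per_generation_change: float, generations: int) -> float:
    """Delay relative to the starting point after compounding a change."""
    return (1.0 + per_generation_change) ** generations


if __name__ == "__main__":
    old = crossing_time_ns(1, 400)        # 0.35 um process: 2.5 ns
    new = crossing_time_ns(12.4, 1000)    # 0.1 um process: about 12.4 ns
    print(old, new, new / old)
    for g in range(1, 4):
        gate = relative_delay(-0.25, g)   # gate delay shrinks 25% per generation
        wire = relative_delay(+1.00, g)   # wire delay doubles per generation
        print(g, gate, wire, wire / gate)
```

Although the clock is faster, crossing the chip takes roughly five times longer in absolute terms, and after three generations the wire-to-gate delay ratio has grown by a factor of about 19.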
Architecture: Bus-based systems
Advantages: simple, extensible, area efficient
Disadvantages: communication bottleneck (poor scaling), arbitration overhead
Widely used: AMBA, IBM CoreConnect, Wishbone
Techniques to increase efficiency: bus splitting, burst mode transfers, split transactions
[IBM CoreConnect Spec.] [AMBA Spec.]
Case study: Bus Splitting
Used in several SoC buses
Reduced capacitive load
Smaller sized drivers
16% to 50% energy savings
Depends on communication patterns
Architectural implications
More concurrency in transactions
Split can be vertical or horizontal
Tools to guide the splitting
Related to floor planning
[Hsieh, TCAD02]
[Figure: modules M1 to M5 on a bus split across bus width, and on a bus split along bus length (multi-bus system)]
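Why the saving depends on communication patterns can be seen in a toy model. This is an assumption-heavy sketch, not the model from [Hsieh, TCAD02]: segments are equal in length, energy is proportional to the capacitance driven, a local transfer drives only its own segment, a non-local transfer drives the whole bus, and the segment switches cost nothing.

```python
def split_bus_saving(local_fraction: float, segments: int = 2) -> float:
    """Fractional energy saving of a bus split along its length.

    local_fraction: share of transfers whose source and destination sit
    on the same segment (hypothetical traffic parameter).
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be in [0, 1]")
    if segments < 1:
        raise ValueError("segments must be at least 1")
    unsplit = 1.0                                    # normalized energy per transfer
    split = local_fraction / segments + (1.0 - local_fraction)
    return (unsplit - split) / unsplit


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, split_bus_saving(p))
    # 0.0 0.0
    # 0.5 0.25
    # 1.0 0.5
```

In this model a two-way split saves nothing when all traffic crosses the split and at most half the energy when all traffic is local, so module placement (floor planning) decides how much is gained.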
Physical design
Limitations come from interconnect physics
Delay on global wires and delay uncertainty
Crosstalk due to capacitive coupling among wires
Electrical signalling techniques
Trade-off: noise immunity vs. energy vs. speed
Sensing small swings -> low energy and fast transitions
Synchronization across large chips
Is synchronization possible at high clock rates?
What is the probability of synchronization failure?
Giovanni De Micheli
Reliability of information
Information transfer is inherently unreliable at the electrical level, due to:
Timing errors
Cross-talk
Electro-magnetic interference (EMI)
Soft errors
The problem will get increasingly acute as technology scales down
Giovanni De Micheli
Systems on chips: a communication-centric view
Design component interconnection under:
Uncertain knowledge of physical medium
Incomplete knowledge of data traffic
Design interconnection as a micro-network
Leverage network design technology
Manage information flow
To provide for performance
Power-manage components based on activity
To reduce energy consumption
Giovanni De Micheli
Network design objectives
Low communication latency: streamlined control protocols
High communication bandwidth: to support demanding SW applications
Low energy consumption: wiring switched capacitance dominates
High system-level reliability: correct communication errors, data loss
Giovanni De Micheli
Framework for NoC modelling
Three types of basic components for a system-level model:
Tasks
RTOS services: task scheduling, resource allocation, execution synchronization
Communication network
Compiler Research Issues
Synthesis of RTOS elements in the compiler
On the application side: generation of an efficient application-specific static/run-time scheduler and synchronization
On the hardware side: generation of device drivers, memory management primitives, etc. using hardware specifications
Automatic retargetability for a family of target architectures
Automatic application partitioning
Mapping of process/task-level concurrency onto multiple PEs using programmer guidance in the programmer's model
Memory vs Reused-IP
Embedded Multiprocessor SoC Memory Management
LOW POWER DESIGN
Dynamic Power Management
Dynamic power management saves power by lowering frequency, using different clock domains that follow the run-time variation of the data content
Pre-computation removes repetitive switching
Connections are made only for valid stream data, which eliminates unnecessary switching
Reconfigurable clock-based system balancing creates an environment of just-in-time computing, which can reduce overall power usage.
Prefetch many frames into an optimal-sized buffer [[email protected]]
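The frequency-lowering idea can be sketched with the standard CMOS switching-power model, P = alpha * C * Vdd^2 * f. The clock choices, cycle counts, and electrical parameters below are hypothetical, and supply voltage is held constant, so power scales only linearly with frequency here.

```python
def dynamic_power(alpha: float, cap_f: float, vdd: float, freq_hz: float) -> float:
    """Switching power of a clock domain: P = alpha * C * Vdd^2 * f."""
    return alpha * cap_f * vdd ** 2 * freq_hz


def pick_frequency(cycles_needed: float, deadline_s: float, available_hz):
    """Lowest available clock that still finishes the work by the deadline."""
    feasible = [f for f in sorted(available_hz) if cycles_needed / f <= deadline_s]
    if not feasible:
        raise ValueError("no available clock meets the deadline")
    return feasible[0]


if __name__ == "__main__":
    clocks = [50e6, 100e6, 200e6]       # hypothetical clock choices for a domain
    light, heavy = 1.5e6, 5.0e6         # cycles per frame, hypothetical workloads
    frame_time = 1 / 30                 # one frame at 30 frames per second
    for work in (light, heavy):
        f = pick_frequency(work, frame_time, clocks)
        print(f, dynamic_power(0.2, 1e-9, 1.2, f))
```

With these numbers the light frame runs at 50 MHz and the heavy one at 200 MHz, so the domain draws a quarter of the switching power whenever the data content allows the slower clock.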
Power Metric
Based on network activity and HSPICE circuit simulation of the interconnect, the network power consumption (P_int) is:
T: the number of tiles
P_IF/D: overhead of the instruction memory fetch and decode
s: the number of streams
N_vs and N_ivs: the number of valid and invalid transfers for stream s, while P_s is the power consumed in transferring 1 bit through stream s
Dynamic Power Management in On-Chip Communication?
Encoding/decoding relationship
E.g., bus-invert coding, …
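Bus-invert coding is easy to show in a few lines. The sketch below follows the usual formulation: if sending a word as is would toggle more than half of the data lines, the inverted word is sent instead and one extra invert line is raised. The bus width, the all-zero starting state, and the function names are assumptions made for the example.

```python
def bus_invert_encode(words, width=8):
    """Encode a word sequence as (bus_value, invert_bit) pairs."""
    mask = (1 << width) - 1
    prev_bus = 0                          # assume the bus starts at all zeros
    encoded = []
    for word in words:
        word &= mask
        toggles = bin(prev_bus ^ word).count("1")
        if toggles > width // 2:
            bus, invert = (~word) & mask, 1   # cheaper to send the complement
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
        prev_bus = bus
    return encoded


def bus_invert_decode(encoded, width=8):
    """Recover the original words from (bus_value, invert_bit) pairs."""
    mask = (1 << width) - 1
    return [(~bus) & mask if invert else bus for bus, invert in encoded]


def count_transitions(values):
    """Total number of line toggles when `values` are driven in order."""
    prev, total = 0, 0
    for v in values:
        total += bin(prev ^ v).count("1")
        prev = v
    return total


if __name__ == "__main__":
    words = [0x00, 0xFF, 0x0F, 0xF0]
    enc = bus_invert_encode(words)
    assert bus_invert_decode(enc) == words
    raw = count_transitions(words)
    coded = count_transitions([b for b, _ in enc]) + count_transitions([i for _, i in enc])
    print(raw, coded)   # 20 7
```

The encoder spends one extra wire and a little logic to cap the data-line toggles per transfer at half the bus width, which is the encoding/decoding relationship the slide refers to.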
Advanced Bus Architecture: Error-resilient Coding
Error-detection code or error-correction code
Energy trade-off between
Retransmission
Error-correction coder/decoder
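The trade-off can be explored with a simple expected-energy model. Everything numeric here is hypothetical (energy units, codec costs, bit error rates), and the model assumes independent bit errors, a detection code that catches every error, and a correction code that always succeeds.

```python
def word_error_prob(bit_error_rate: float, bits: int) -> float:
    """Probability that at least one of `bits` independent bits is wrong."""
    return 1.0 - (1.0 - bit_error_rate) ** bits


def energy_detect_retransmit(bits, e_bit, e_codec, bit_error_rate):
    """Expected energy per delivered word with detection + retransmission.

    Each attempt costs wire plus codec energy, and attempts repeat until a
    word arrives clean, so the expected attempt count is 1 / (1 - p_word).
    """
    p_word = word_error_prob(bit_error_rate, bits)
    return (bits * e_bit + e_codec) / (1.0 - p_word)


def energy_correct(bits, e_bit, e_codec):
    """Energy per word with an error-correcting code, sent exactly once."""
    return bits * e_bit + e_codec


if __name__ == "__main__":
    # 32 data bits: parity adds 1 bit, a single-error-correcting code adds 6.
    for ber in (1e-6, 1e-2):
        detect = energy_detect_retransmit(33, 1.0, 0.5, ber)
        correct = energy_correct(38, 1.0, 4.0)
        print(ber, detect, correct, "detect" if detect < correct else "correct")
```

With these made-up costs, detection plus retransmission wins at the low error rate, and the heavier correcting code wins once errors are frequent enough that retransmissions dominate.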
Energy Issue in On-chip Bus Arbitration
Centralized bus arbitration
Becomes energy-inefficient as the bus scales up
The energy cost of communicating with the arbiter and the arbiter complexity grow more than linearly
Distributed bus arbitration
Code division multiple access (ISSCC'00)
Work on this problem has only just begun
HW/SW Codesign Methodology
Hardware/Software Co-design: Definition
The systematic and efficient design of systems that combine hardware (ASIC, FPGA) and software (DSP, MCU)
Meeting system-level objectives by exploiting the synergism of hardware and software through their concurrent design
To Hardware if speed, power, area and special
Use software as a means of differentiating products based on the same hardware platform.
HW/SW Co-Synthesis: Pareto Point
Time-Space Exploration
Enumerate all trade-offs and select the one with the most benefit.
Branch and Bound method for estimating SoC metric.
Jiang Xu and Wayne Wolf, Princeton University
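Enumerating trade-offs can be illustrated with a tiny HW/SW mapping problem. The sketch uses plain exhaustive enumeration followed by a Pareto filter rather than branch and bound, which would prune this same search. The task table, its numbers, and the sequential-execution assumption are all hypothetical.

```python
from itertools import product

# Hypothetical task table: name -> (sw_time, hw_time, hw_area)
TASKS = {
    "fft":    (40, 8, 30),
    "filter": (25, 5, 20),
    "ctrl":   (10, 9, 15),
}


def evaluate(mapping):
    """Total (time, area) of one mapping; tasks are assumed to run in sequence."""
    time = sum(TASKS[t][1] if hw else TASKS[t][0] for t, hw in mapping.items())
    area = sum(TASKS[t][2] for t, hw in mapping.items() if hw)
    return time, area


def pareto_points(points):
    """Keep the points no other point beats in both time and area."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))


def explore():
    """Evaluate every HW/SW assignment and return the Pareto-optimal ones."""
    names = list(TASKS)
    results = []
    for choice in product((False, True), repeat=len(names)):
        results.append(evaluate(dict(zip(names, choice))))
    return pareto_points(results)


if __name__ == "__main__":
    for time, area in explore():
        print(time, area)
```

Of the eight mappings, seven survive; putting only `filter` and `ctrl` in hardware (time 54, area 35) is dropped because moving `fft` alone (time 43, area 30) is both faster and smaller.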
66% of chips are not OK on first silicon (2004)
Mid-90s: 6 months late => 31% earnings loss
Today: 3 months late => $500M loss
Methodology Requirement: Need for revolutionary design methods enabling:
Faster 'Time To Market' through IP reuse, standard communication interfaces, and scalable interconnect topology (NoC)
Increased flexibility through SW programmability and configurable HW
Mapping an application to a platform, to increase the productivity of a platform user
Integrated H/W and S/W Low-Power Design
[Figure: design flow split into S/W and H/W paths. Steps shown: algorithm selection, clustering, HW energy-efficiency calculation, SW energy-efficiency calculation, cluster selection, cluster scheduling, system-level energy estimation, H/W synthesis and energy estimation, S/W core energy estimation, HW/SW integration. Tools shown: Matlab/SPW, Cossap (Synopsys), ORINOCO, DSP Station, Seamless, Co-centric, Signal-master, Synopsys]
Reconfigurable Platform-Based Design Method
Real-time reconfiguration architecture with minimum configuration time
Design space exploration
Dynamic Memory and Power management On a Chip (MPoC)
Why ASIPs? The Energy-Flexibility Gap
Hw/Sw Partitioning on Single-Chip Platforms
Numerous single-chip commercial devices with uP and FPGA
Triscend E5 (shown)
Triscend A7
Atmel FPSLIC
Xilinx Virtex II Pro
Altera Excalibur
More sure to come…
Make hw/sw partitioning even more attractive
[Figure: uP and peripherals, cache/memory, configurable logic]
iSoC
iSoC is an on-chip communication architecture for improving the scalability and flexibility of SoC design
Dynamic configuration
Its regular, flexible structure provides a predictable framework for modeling the traffic, power, speed, and area requirements of global communication
iSOC Compiler
Divides applications into parts, each of which fits into a specific core.
Determines data communications between the cores in a space-time fashion
Generates interconnect memory contents for each individual interface.
Application-specific multiprocessor SoC design flow
Cont.
Mission Statement
To carry out R&D programs which are 3 to 10 years ahead of today's industrial needs in the field of ..
Design Technology for Integrated Information and Communication Systems for Human Well-Being
Reconfigurable SoC, multi-media multi-mode terminals, BAN for health monitoring
Conclusion
New computing architecture paradigm
Architectural exploration tools
Dynamic real-time reconfiguration architecture with minimum configuration time
Dynamic Memory and Power management On a Chip (MPoC)
References
aSOC: A Scalable, Single-Chip Communications Architecture. Jian Liang, Sriram Swaminathan, and Russell Tessier, Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003. {jliang, tessier}@ecs.umass.edu
Configurable Platforms With Dynamic Platform Management: An Efficient Alternative to Application-Specific System-on-Chips. Krishna Sekar, Kanishka Lahiri, Sujit Dey. [email protected] [email protected] [email protected] Dept. of ECE, UC San Diego, La Jolla, CA; NEC Laboratories America, Princeton, NJ
References
• Ackland et al., A Single Chip, 1.6 Billion, 16b MAC/s Multiprocessor DSP, IEEE JSSC, March 2000
• Agrawal, Raw Computation, Scientific American, August 1999
• Benini and De Micheli, Networks on Chip: A New SoC Paradigm, IEEE Computer, January 2002
• Benini and De Micheli, Powering Networks on Chip, Proceedings ISSS, October 2001
• Bertozzi, Benini and De Micheli, Low-Power Error-Resilient Codes for On-Chip Data Busses, DATE 2002
• Dally and Towles, Route Packets, not Wires, DAC 2001
• Guerrier and Grenier, A Generic Architecture for On-Chip Packet Switched Interconnections, DATE 2000
• Ho, Mai and Horowitz, The Future of Wires, IEEE Proceedings, January 2001
• Hu and Marculescu, Energy Aware Mapping for Tile-Based NoC Architectures, ASPDAC 2003
• Rijpkema et al., Trade off in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip, DATE 2003
• Yoshimura et al., DS-CDMA Wired Bus with Simple Interconnection Topology for Parallel Processing System LSIs, ISSCC 2000
• Worm, Ienne, Thiran and De Micheli, An Adaptive, Low-Power Transmission Scheme for On-Chip Networks, ISSS 2002
• Ye, De Micheli, Benini, Packetized On-chip Interconnect Communication Analysis for MPSoCs, DATE 2003
• Zhang et al., A 1V Heterogeneous Reconfigurable DSP IC for Wireless Baseband Digital Signal Processing, JSSC, November 2000
Co-design On-line Sites
IMEC ftp reports (Cathedral): ftp://ftp.imec.be/pub/vsdm/reports/
Stanford Tech Reports: http://elib.stanford.edu/
Synopsys Research Publications: http://www.synopsys.com/news/pubs/research/ATG_index.html
URLs to Hardware/Software Co-Design Research: http://www.ece.cmu.edu/~thomas/hsURL.html
Bibliography of Hardware/Software Codesign: http://www-ti.informatik.uni-tuebingen.de/~buchen/
Ralf Niemann's Codesign Links and Literature: http://ls12-www.informatik.uni-dortmund.de/~niemann/codesign/codesign_links.html
http://hartenstein.de/Ph-D-Theses.html