sistemi embedded - i parte (2011-2012).pdf
Embedded Systems Design: A Unified Hardware/Software Introduction
Chapter 1: Introduction
Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis

Outline
Embedded systems overview
  What are they?
Design challenge: optimizing design metrics
Technologies
  Processor technologies
  IC technologies
  Design technologies

Embedded systems overview
Computing systems are everywhere
Most of us think of "desktop" computers
  PCs
  Laptops
  Mainframes
  Servers
But there's another type of computing system
  Far more common...

Embedded systems overview
Embedded computing systems
  Computing systems embedded within electronic devices
  Hard to define: nearly any computing system other than a desktop computer
  Billions of units produced yearly, versus millions of desktop units
  Perhaps 50 per household and per automobile
[Figure: everyday devices, captioned "Computers are in here... and here... and even here... Lots more of these, though they cost a lot less each."]

A short list of embedded systems
Anti-lock brakes, auto-focus cameras, automatic teller machines, automatic toll systems, automatic transmission, avionic systems, battery chargers, camcorders, cell phones, cell-phone base stations, cordless phones, cruise control, curbside check-in systems, digital cameras, disk drives, electronic card readers, electronic instruments, electronic toys/games, factory control, fax machines, fingerprint identifiers, home security systems, life-support systems, medical testing systems,
modems, MPEG decoders, network cards, network switches/routers, on-board navigation, pagers, photocopiers, point-of-sale systems, portable video games, printers, satellite phones, scanners, smart ovens/dishwashers, speech recognizers, stereo systems, teleconferencing systems, televisions, temperature controllers, theft tracking systems, TV set-top boxes, VCRs, DVD players, video game consoles, video phones, washers and dryers
And the list goes on and on

Some common characteristics of embedded systems
Single-functioned
  Executes a single program, repeatedly
Tightly-constrained
  Low cost, low power, small, fast, etc.
Reactive and real-time
  Continually reacts to changes in the system's environment
  Must compute certain results in real-time without delay

An embedded system example -- a digital camera
[Figure: block diagram of a digital camera chip fed by a lens and CCD. Blocks: microcontroller, CCD preprocessor, pixel coprocessor, A2D, D2A, JPEG codec, DMA controller, memory controller, ISA bus interface, UART, LCD ctrl, display ctrl, multiplier/accum.]
Single-functioned -- always a digital camera
Tightly-constrained -- low cost, low power, small, fast
Reactive and real-time -- only to a small extent

Design challenge: optimizing design metrics
Obvious design goal
  Construct an implementation with desired functionality
Key design challenge
  Simultaneously optimize numerous design metrics
Design metric
  A measurable feature of a system's implementation
  Optimizing design metrics is a key challenge

Design challenge: optimizing design metrics
Common metrics
  Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  Size: the physical space required by the system
  Performance: the execution time or throughput of the system
  Power: the amount of power consumed by the system
  Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost

Design challenge: optimizing design metrics
Common metrics (continued)
  Time-to-prototype: the time needed to build a working version of the system
  Time-to-market: the time required to develop a system to the point that it can be released and sold to customers
  Maintainability: the ability to modify the system after its initial release
  Correctness, safety, many more

Design metric competition -- improving one may worsen others
Expertise with both software and hardware is needed to optimize design metrics
  Not just a hardware or software expert, as is common
  A designer must be comfortable with various technologies in order to choose the best for a given application and constraints
[Figure: competing design metrics (size, performance, power, NRE cost) shown next to the digital camera chip block diagram, with hardware and software labels.]

Time-to-market: a demanding design metric
Time required to develop a product to the point it can be sold to customers
Market window
  Period during which the product would have highest sales
Average time-to-market constraint is about 8 months
Delays can be costly
[Figure: revenues ($) versus time (months).]

Losses due to delayed market entry
Simplified revenue model
  Product life = 2W, peak at W
  Time of market entry defines a triangle, representing market penetration
  Triangle area equals revenue
Loss
  The difference between the on-time and delayed triangle areas
[Figure: revenues ($) versus time. On-time and delayed entry triangles over the market rise (0 to W) and market fall (W to 2W); the delayed entry starts at time D and reaches a lower peak revenue.]

Losses due to delayed market entry (cont.)
Area = 1/2 * base * height
  On-time = 1/2 * 2W * W
  Delayed = 1/2 * (W-D+W) * (W-D)
Percentage revenue loss = (D(3W-D) / (2W^2)) * 100%
Try some examples
  Lifetime 2W = 52 wks, delay D = 4 wks: (4*(3*26 - 4)) / (2*26^2) = 22%
  Lifetime 2W = 52 wks, delay D = 10 wks: (10*(3*26 - 10)) / (2*26^2) = 50%
  Delays are costly!
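The triangle model can be checked numerically. The sketch below is illustrative only: the function name revenue_loss_percent is hypothetical, and it assumes W and D are given in the same time unit with 0 <= D <= W. It computes the loss directly from the two triangle areas rather than from the closed form, so it also serves as a check on the formula.

```c
#include <assert.h>

/* Percentage of revenue lost when market entry is delayed by D,
 * under the simplified triangle model: product life 2W, peak at W.
 * W and D must use the same time unit (weeks in the slide examples). */
double revenue_loss_percent(double W, double D)
{
    assert(W > 0.0 && D >= 0.0 && D <= W);
    double on_time = 0.5 * (2.0 * W) * W;          /* 1/2 * base * height */
    double delayed = 0.5 * (W - D + W) * (W - D);  /* smaller triangle */
    return (on_time - delayed) / on_time * 100.0;
}
```

Calling revenue_loss_percent(26.0, 4.0) gives roughly 21.9 and revenue_loss_percent(26.0, 10.0) gives roughly 50.3, which round to the 22% and 50% in the examples.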

NRE and unit cost metrics
Costs
  Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  total cost = NRE cost + unit cost * # of units
  per-product cost = total cost / # of units = (NRE cost / # of units) + unit cost
Example
  NRE = $2000, unit = $100
  For 10 units
    total cost = $2000 + 10*$100 = $3000
    per-product cost = $2000/10 + $100 = $300
Amortizing NRE cost over the units results in an additional $200 per unit
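The two cost formulas translate directly into code. This is a minimal sketch; the function names total_cost and per_product_cost are hypothetical, and costs are taken to be in dollars.

```c
#include <assert.h>

/* total cost = NRE cost + unit cost * number of units */
double total_cost(double nre, double unit, long units)
{
    return nre + unit * (double)units;
}

/* per-product cost = total cost / number of units
 *                  = (NRE cost / number of units) + unit cost */
double per_product_cost(double nre, double unit, long units)
{
    assert(units > 0);  /* the amortized cost is undefined for zero units */
    return total_cost(nre, unit, units) / (double)units;
}
```

With NRE = 2000, unit = 100 and 10 units, total_cost returns 3000 and per_product_cost returns 300, matching the example.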

NRE and unit cost metrics
[Figure: two plots for technologies A, B, and C against number of units (volume, 0 to 2400): total cost (x1000) and per product cost.]
Compare technologies by costs -- best depends on quantity
  Technology A: NRE=$2,000, unit=$100
  Technology B: NRE=$30,000, unit=$30
  Technology C: NRE=$100,000, unit=$2
But, must also consider time-to-market
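To see how the best technology depends on quantity, the comparison can be sketched as a search for the lowest total cost at a given volume. The Technology struct, the TECHS table, and cheapest_technology are hypothetical names introduced for illustration; the figures are the ones listed for technologies A, B, and C, and time-to-market is ignored.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char   name;  /* 'A', 'B', or 'C' */
    double nre;   /* one-time design cost in dollars */
    double unit;  /* manufacturing cost per copy in dollars */
} Technology;

const Technology TECHS[3] = {
    { 'A',   2000.0, 100.0 },
    { 'B',  30000.0,  30.0 },
    { 'C', 100000.0,   2.0 },
};

/* Index of the technology with the lowest total cost for `units` copies.
 * Ties go to the earlier entry in the table. */
size_t cheapest_technology(const Technology *techs, size_t count, long units)
{
    assert(count > 0 && units >= 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        double cost_i    = techs[i].nre + techs[i].unit * (double)units;
        double cost_best = techs[best].nre + techs[best].unit * (double)units;
        if (cost_i < cost_best)
            best = i;
    }
    return best;
}
```

Setting the total costs equal gives break-even volumes of 400 units between A and B and 2,500 units between B and C, so under these numbers A is cheapest at low volume, B in between, and C at high volume.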

The performance design metric
Widely-used measure of system, widely-abused
  Clock frequency, instructions per second: not good measures
  Digital camera example: a user cares about how fast it processes images, not clock speed or instructions per second
Latency (response time)
  Time between task start and end
  e.g., Cameras A and B process images in 0.25 seconds
Throughput
  Tasks per second, e.g. Camera A processes 4 images per second
  Throughput can be more than latency seems to imply due to concurrency, e.g. Camera B may process 8 images per second (by capturing a new image while previous image is being stored)
Speedup of B over A = B's performance / A's performance
  Throughput speedup = 8/4 = 2

Three key embedded system technologies
Technology
  A manner of accomplishing a task, especially using technical processes, methods, or knowledge
Three key technologies for embedded systems
  Processor technology
  IC technology
  Design technology

Processor technology
The architecture of the computation engine used to implement a system's desired functionality
Processor does not have to be programmable
  "Processor" not equal to general-purpose processor
[Figure: three processor architectures. General-purpose: controller (control logic and state register, IR, PC, program memory holding assembly code for "total = 0, for i = 1 to ...") and datapath (register file, general ALU, data memory). Application-specific: the same controller with a datapath of registers, custom ALU, and data memory. Single-purpose (hardware): controller (control logic, state register) and datapath (index, total, adder, data memory).]

Processor technology
Processors vary in their customization for the problem at hand
Desired functionality
  total = 0
  for i = 1 to N loop
    total += M[i]
  end loop
[Figure: the desired functionality mapped onto a general-purpose processor, a single-purpose processor, and an application-specific processor.]

General-purpose processors
Programmable device used in a variety of applications
  Also known as "microprocessor"
Features
  Program memory
  General datapath with large register file and general ALU
User benefits
  Low time-to-market and NRE costs
  High flexibility
"Pentium" the most well-known, but there are hundreds of others
[Figure: general-purpose processor: controller (control logic and state register, IR, PC, program memory holding assembly code for "total = 0, for i = 1 to ...") and datapath (register file, general ALU, data memory).]

Single-purpose processors
Digital circuit designed to execute exactly one program
  a.k.a. coprocessor, accelerator or peripheral
Features
  Contains only the components needed to execute a single program
  No program memory
Benefits
  Fast
  Low power
  Small size
[Figure: single-purpose processor: controller (control logic, state register) and datapath (index, total, adder, data memory).]

Application-specific processors
Programmable processor optimized for a particular class of applications having common characteristics
  Compromise between general-purpose and single-purpose processors
Features
  Program memory
  Optimized datapath
  Special functional units
Benefits
  Some flexibility, good performance, size and power
[Figure: application-specific processor: controller (control logic and state register, IR, PC, program memory holding assembly code for "total = 0, for i = 1 to ...") and datapath (registers, custom ALU, data memory).]

IC technology
The manner in which a digital (gate-level) implementation is mapped onto an IC
  IC: integrated circuit, or "chip"
  IC technologies differ in their customization to a design
  ICs consist of numerous layers (perhaps 10 or more)
    IC technologies differ with respect to who builds each layer and when
[Figure: an IC inside its IC package, and a transistor cross-section showing source, drain, channel, oxide, and gate on a silicon substrate.]

IC technology
Three types of IC technologies
  Full-custom/VLSI
  Semi-custom ASIC (gate array and standard cell)
  PLD (Programmable Logic Device)

Full-custom/VLSI
All layers are optimized for an embedded system's particular digital implementation
  Placing transistors
  Sizing transistors
  Routing wires
Benefits
  Excellent performance, small size, low power
Drawbacks
  High NRE cost (e.g., $300k), long time-to-market

Semi-custom
Lower layers are fully or partially built
  Designers are left with routing of wires and maybe placing some blocks
Benefits
  Good performance, good size, less NRE cost than a full-custom implementation (perhaps $10k to $100k)
Drawbacks
  Still require weeks to months to develop

PLD (Programmable Logic Device)
All layers already exist
  Designers can purchase an IC
  Connections on the IC are either created or destroyed to implement desired functionality
  Field-Programmable Gate Array (FPGA) very popular
Benefits
  Low NRE costs, almost instant IC availability
Drawbacks
  Bigger, expensive (perhaps $30 per unit), power hungry, slower

Moore's law
The most important trend in embedded systems
  Predicted in 1965 by Intel co-founder Gordon Moore
  IC transistor capacity has doubled roughly every 18 months for the past several decades
[Figure: logic transistors per chip (in millions) over time; logarithmic scale from 0.001 to 10,000.]

Moore's law
Wow
  This growth rate is hard to imagine, most people underestimate
  How many ancestors do you have from 20 generations ago?
    i.e., roughly how many people alive in the 1500s did it take to make you?
    2^20 = more than 1 million people
  (This underestimation is the key to pyramid schemes!)

Graphical illustration of Moore's law
[Figure: timeline from 1981 to 2002 in three-year steps. Leading edge chip in 1981: 10,000 transistors. Leading edge chip in 2002: 150,000,000 transistors.]
Something that doubles frequently grows more quickly than most people realize!
  A 2002 chip can hold about 15,000 1981 chips inside itself
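The doubling arithmetic behind these figures can be sketched in a few lines. The function name projected_transistors is hypothetical, and the sketch assumes a fixed doubling period and ignores partial doublings, which is a simplification of the trend rather than a model of real chips.

```c
#include <assert.h>

/* Transistor count after `years`, assuming capacity doubles once every
 * `months_per_doubling` months (18 in the statement of Moore's law used
 * here). Whole doublings only; any partial doubling is dropped. */
unsigned long long projected_transistors(unsigned long long start,
                                         unsigned years,
                                         unsigned months_per_doubling)
{
    assert(months_per_doubling > 0);
    unsigned doublings = (years * 12u) / months_per_doubling;
    assert(doublings < 64);  /* keep the shift within 64 bits */
    return start << doublings;
}
```

From 1981 to 2002 is 21 years, or 14 doublings at 18 months each, so projected_transistors(10000, 21, 18) returns 163,840,000, the same order of magnitude as the 150,000,000 quoted for the 2002 chip. The ancestor count works the same way: 20 doublings starting from 1 gives 1,048,576.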

Design technology
The manner in which we convert our concept of desired system functionality into an implementation
Compilation/Synthesis: automates exploration and insertion of implementation details for lower level.
Libraries/IP: incorporates pre-designed implementation from lower abstraction level into higher level.
Test/Verification: ensures correct functionality at each level, thus reducing costly iterations between levels.

Level                      Compilation/Synthesis   Libraries/IP    Test/Verification
System specification       System synthesis        Hw/Sw/OS        Model simulators/checkers
Behavioral specification   Behavior synthesis      Cores           Hw-Sw cosimulators
RT specification           RT synthesis            RT components   HDL simulators
Logic specification        Logic synthesis         Gates/Cells     Gate simulators
To final implementation

Design productivity exponential increase
Exponential increase over the past few decades
[Figure: productivity ((K) trans./staff-mo.) versus year, 1983 to 2009; logarithmic scale from 0.01 to 100,000.]

The co-design ladder
In the past:
  Hardware and software design technologies were very different
  Recent maturation of synthesis enables a unified view of hardware and software
Hardware/software "codesign"
[Figure: the co-design ladder. Both sides start from sequential program code (e.g., C, VHDL). Software side: compilers (1960's, 1970's) produce assembly instructions; assemblers and linkers (1950's, 1960's) produce machine instructions; the implementation is a microprocessor plus program bits ("software"). Hardware side: behavioral synthesis (1990's) produces register transfers; RT synthesis (1980's, 1990's) produces logic equations / FSMs; logic synthesis (1970's, 1980's) produces logic gates; the implementation is VLSI, ASIC, or PLD ("hardware").]
The choice of hardware versus software for a particular function is simply a tradeoff among various design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no fundamental difference between what hardware or software can implement.

Independence of processor and IC technologies
Basic tradeoff
  General vs. custom
  With respect to processor technology or IC technology
  The two technologies are independent
[Figure: processor technologies (general-purpose processor, ASIP, single-purpose processor) against IC technologies (PLD, semi-custom, full-custom). General, providing improved: flexibility, maintainability, NRE cost, time-to-prototype, time-to-market, cost (low volume). Customized, providing improved: power efficiency, performance, size, cost (high volume).]

Design productivity gap
While designer productivity has grown at an impressive rate over the past decades, the rate of improvement has not kept pace with chip capacity
[Figure: IC capacity (logic transistors per chip, in millions; 0.001 to 10,000) and productivity ((K) trans./staff-mo.; 0.01 to 100,000) on logarithmic scales, with a widening gap between the two curves.]

Design productivity gap
1981 leading edge chip required 100 designer months
  10,000 transistors / 100 transistors/month
2002 leading edge chip requires 30,000 designer months
  150,000,000 / 5000 transistors/month
Designer cost increase from $1M to $300M
[Figure: IC capacity (logic transistors per chip, in millions) and productivity ((K) trans./staff-mo.) on logarithmic scales, with the gap between them.]
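The designer-month figures follow from dividing chip size by per-designer productivity. In the sketch below the function names are hypothetical, and the rate of $10,000 per designer-month is not stated in the text: it is an assumption inferred from $1M for 100 designer months.

```c
#include <assert.h>

/* Designer months needed = transistors / (transistors per designer-month). */
double designer_months(double transistors, double trans_per_designer_month)
{
    assert(trans_per_designer_month > 0.0);
    return transistors / trans_per_designer_month;
}

/* Design cost in dollars for a given effort.
 * The rate is an assumption of this sketch (e.g. 10000.0, inferred
 * from $1M / 100 designer months). */
double design_cost(double months, double dollars_per_designer_month)
{
    return months * dollars_per_designer_month;
}
```

designer_months(10000, 100) gives 100 and designer_months(150000000, 5000) gives 30,000; at the assumed rate, design_cost(30000, 10000) is $300M.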

The mythical man-month
The situation is even worse than the productivity gap indicates
In theory, adding designers to team reduces project completion time
In reality, productivity per designer decreases due to complexities of team management and communication
In the software community, known as "the mythical man-month" (Brooks 1975)
At some point, can actually lengthen project completion time! ("Too many cooks")
Example
  1M transistors, 1 designer = 5000 trans/month
  Each additional designer reduces every designer's productivity by 100 trans/month
  So 2 designers produce 4900 trans/month each
[Figure: months until completion (43, 24, 19, 16, 15, 16, 18, 23) and team versus individual productivity, plotted against number of designers (0 to 40).]
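The example's numbers can be reproduced with a small model. This is a sketch under the stated assumptions only: a lone designer produces 5000 transistors/month and each added designer lowers everyone's rate by 100. The function names and the two constants are introduced here for illustration.

```c
#include <assert.h>

#define BASE_RATE      5000.0  /* trans/month for a single designer */
#define LOSS_PER_ADDED  100.0  /* per-designer loss for each extra team member */

/* Whole-team output in transistors per month. */
double team_rate(int designers)
{
    assert(designers > 0);
    double per_designer = BASE_RATE - LOSS_PER_ADDED * (designers - 1);
    if (per_designer < 0.0)
        per_designer = 0.0;
    return designers * per_designer;
}

/* Months needed for the team to complete a design of the given size. */
double months_to_complete(double transistors, int designers)
{
    double rate = team_rate(designers);
    assert(rate > 0.0);
    return transistors / rate;
}

/* Smallest team size in 1..max_designers that minimizes completion time. */
int best_team_size(double transistors, int max_designers)
{
    int best = 1;
    for (int n = 2; n <= max_designers; n++) {
        if (team_rate(n) > 0.0 &&
            months_to_complete(transistors, n) < months_to_complete(transistors, best))
            best = n;
    }
    return best;
}
```

For a 1M-transistor design, teams of 10 and 40 designers need about 24 and 23 months respectively, and the minimum of about 15 months is reached at 25 designers, after which adding people lengthens the project.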

Summary
Embedded systems are everywhere
Key challenge: optimization of design metrics
  Design metrics compete with one another
A unified view of hardware and software is necessary to improve productivity
Three key technologies
  Processor: general-purpose, application-specific, single-purpose
  IC: full-custom, semi-custom, PLD
  Design: compilation/synthesis, libraries/IP, test/verification
Chapter 10: IC Technology
Outline
Anatomy of integrated circuits
Full-Custom (VLSI) IC Technology
Semi-Custom (ASIC) IC Technology
Programmable Logic Device (PLD) IC Technology

CMOS transistor
Source, Drain
  Diffusion area where electrons can flow
  Can be connected to metal contacts (vias)
Gate
  Polysilicon area where control voltage is applied
Oxide
  SiO2 insulator so the gate voltage can't leak

End of Moore's Law?
Every dimension of the MOSFET has to scale (PMOS)
  Gate oxide has to scale down to
    Increase gate capacitance
    Reduce leakage current from S to D
    Pinch off current from source to drain
Current gate oxide thickness is about 2.5-3nm
  That's about 25 atoms!!!
[Figure: an IC inside its IC package, and a transistor cross-section showing source, drain, channel, oxide, and gate on a silicon substrate.]

20 GHz +
FinFET has been manufactured to 18nm
  Still acts as a very good transistor
Simulation has shown that it can be scaled to 10nm
  Quantum effects start to kick in
    Reduce mobility by ~10%
  Ballistic transport becomes significant
    Increases current by about 20%

NAND
Metal layers for routing (~10)
PMOS don't like 0
NMOS don't like 1
A stick diagram forms the basis for mask sets

Silicon manufacturing steps
Tape out
  Send design to manufacturing
Spin
  One time through the manufacturing process
Photolithography
  Drawing patterns by using photoresist to form barriers for deposition

Full Custom
Very Large Scale Integration (VLSI)
Placement
  Place and orient transistors
Routing
  Connect transistors
Sizing
  Make fat, fast wires or thin, slow wires
  May also need to size buffer
Design Rules
  "Simple" rules for correct circuit function
    Metal/metal spacing, min poly width

Full Custom
Best size, power, performance
Hand design
  Horrible time-to-market/flexibility/NRE cost
  Reserve for the most important units in a processor
    ALU, instruction fetch
Physical design tools
  Less optimal, but faster

Semi-Custom
Gate Array
  Array of prefabricated gates
  "Place" and route
  Higher density, faster time-to-market
  Does not integrate as well with full-custom
Standard Cell
  A library of pre-designed cells
  Place and route
  Lower density, higher complexity
  Integrates well with full-custom

Semi-Custom
Most popular design style
Jack of all trades
  Good power, time-to-market, performance, NRE cost, per-unit cost, area
Master of none
  Integrate with full custom for critical regions of design

Programmable Logic Device
Programmable Logic Device
  Programmable Logic Array, Programmable Array Logic, Field Programmable Gate Array
All layers already exist
  Designers can purchase an IC
To implement desired functionality
  Connections on the IC are either created or destroyed to implement
Benefits
  Very low NRE costs
  Great time to market
Drawback
  High unit cost, bad for large volume
  Power
    Except special PLA
  Slower
[Figure caption: 800-6400 usable gates; 5-15 ns delay, up to 125 MHz; few $s price (2004).]

Xilinx FPGA

Configurable Logic Block (CLB)

I/O Block

Chapter 8: State Machine and Concurrent Process Model

Outline
Models vs. Languages
State Machine Model
  FSM/FSMD
  HCFSM and Statecharts Language
  Program-State Machine (PSM) Model
Concurrent Process Model
  Communication
  Synchronization
  Implementation
Dataflow Model
Real-Time Systems

Introduction
Describing embedded system's processing behavior
  Can be extremely difficult
    Complexity increasing with increasing IC capacity
      Past: washing machines, small games, etc.
        Hundreds of lines of code
      Today: TV set-top boxes, cell phones, etc.
        Hundreds of thousands of lines of code
    Desired behavior often not fully understood in beginning
      Many implementation bugs due to description mistakes/omissions
  English (or other natural language) common starting point
    Precise description difficult to impossible
    Example: Motor Vehicle Code, thousands of pages long...
An example of trying to be precise in English
California Vehicle Code Right-of-way of crosswalks
21950. (a) The driver of a vehicle shall yield the right-of-way to a pedestrian crossing the roadway within any marked crosswalk or within any unmarked crosswalk at an intersection, except as otherwise provided in this chapter.
(b) The provisions of this section shall not relieve a pedestrian from the duty of using due care for his or her safety. No pedestrian shall suddenly leave a curb or other place of safety and walk or run into the path of a vehicle which is so close as to constitute an immediate hazard. No pedestrian shall unnecessarily stop or delay traffic while in a marked or unmarked crosswalk.
(c) The provisions of subdivision (b) shall not relieve a driver of a vehicle from the duty of exercising due care for the safety of any pedestrian within any marked crosswalk or within any unmarked crosswalk at an intersection.
All that just for crossing the street (and there's much more)!

Models and languages
How can we (precisely) capture behavior?
  We may think of languages (C, C++), but computation model is the key
Common computation models:
  Sequential program model
    Statements, rules for composing statements, semantics for executing them
  Communicating process model
    Multiple sequential programs running concurrently
  State machine model
    For control dominated systems, monitors control inputs, sets control outputs
  Dataflow model
    For data dominated systems, transforms input data streams into output streams
  Object-oriented model
    For breaking complex software into simpler, well-defined pieces

Models vs. languages
Computation models describe system behavior
  Conceptual notion, e.g., recipe, sequential program
Languages capture models
  Concrete form, e.g., English, C
Variety of languages can capture one model
  E.g., sequential program model: C, C++, Java
One language can capture variety of models
  E.g., C++: object-oriented model, state machine model
Certain languages better at capturing certain computation models
[Figure: models (recipe, poetry, story; sequential program, state machine, dataflow) paired with the languages that can capture them (English, Spanish, Japanese; C, C++, Java). Recipes vs. English; sequential programs vs. C.]

Text versus Graphics
Models versus languages not to be confused with text versus graphics
  Text and graphics are just two types of languages
    Text: letters, numbers
    Graphics: circles, arrows (plus some letters, numbers)
[Figure: the statements X = 1; Y = X + 1; shown once as text and once as a graphical diagram.]

Introductory example: An elevator controller
Simple elevator controller
  Request Resolver resolves various floor requests into single requested floor
  Unit Control moves elevator to this requested floor
Try capturing in C...
Partial English description
  "Move the elevator either up or down to reach the requested floor. Once at the requested floor, open the door for at least 10 seconds, and keep it open until the requested floor changes. Ensure the door is never open while moving. Don't change directions unless there are no higher requests when moving up or no lower requests when moving down..."
System interface
  [Figure: RequestResolver reads the buttons inside the elevator (b1, b2, ..., bN) and the up/down buttons on each floor (up1, up2, dn2, up3, dn3, ..., dnN) and outputs req. UnitControl reads req and floor and outputs up, down, and open.]

Elevator controller using a sequential program model
Sequential program model
  Inputs: int floor; bit b1..bN; up1..upN-1; dn2..dnN;
  Outputs: bit up, down, open;
  Global variables: int req;

  void UnitControl()
  {
    up = down = 0; open = 1;
    while (1) {
      while (req == floor);
      open = 0;
      if (req > floor) { up = 1; }
      else { down = 1; }
      while (req != floor);
      up = down = 0;
      open = 1;
      delay(10);
    }
  }

  void RequestResolver()
  {
    while (1)
      ...
      req = ...
      ...
  }

  void main()
  {
    Call concurrently:
      UnitControl() and
      RequestResolver()
  }

You might have come up with something having even more if statements.

Finite-state machine (FSM) model
Trying to capture this behavior as sequential program is a bit awkward
Instead, we might consider an FSM model, describing the system as:
  Possible states
    E.g., Idle, GoingUp, GoingDn, DoorOpen
  Possible transitions from one state to another based on input
    E.g., req > floor
  Actions that occur in each state
    E.g., In the GoingUp state, u,d,o,t = 1,0,0,0 (up = 1, down, open, and timer_start = 0)
Try it...

Finite-state machine (FSM) model
[Figure: state diagram of the UnitControl process with states Idle, GoingUp, GoingDn, and DoorOpen. Transition labels: req > floor, req < floor, req == floor, !(req > floor), !(timer < 10). State actions: u,d,o,t = 1,0,0,0; 0,0,1,0; 0,1,0,0; 0,0,1,1, where u is up, d is down, o is open.]
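One way to capture such a state machine in C is a next-state function plus an action function, each written as a switch over the current state. The sketch below is not taken from the slides. The type and function names are hypothetical, the pairing of the output vectors with Idle, GoingDn, and DoorOpen is inferred from their values (only GoingUp is stated explicitly), and two transitions are assumptions: leaving GoingDn when !(req < floor), and staying in DoorOpen while timer < 10.

```c
/* States of the UnitControl FSM. */
typedef enum { IDLE, GOING_UP, GOING_DN, DOOR_OPEN } State;

/* Output signals: u (up), d (down), o (open), t (timer_start). */
typedef struct { int up, down, open, timer_start; } Outputs;

/* Next-state function: one case per state, one branch per transition. */
State next_state(State s, int req, int floor, int timer)
{
    switch (s) {
    case IDLE:
        if (req > floor) return GOING_UP;
        if (req < floor) return GOING_DN;
        return IDLE;                                   /* req == floor */
    case GOING_UP:
        return (req > floor) ? GOING_UP : DOOR_OPEN;   /* !(req > floor) */
    case GOING_DN:
        return (req < floor) ? GOING_DN : DOOR_OPEN;   /* exit condition assumed */
    case DOOR_OPEN:
        return (timer < 10) ? DOOR_OPEN : IDLE;        /* self-loop assumed */
    }
    return IDLE;  /* not reached for valid states */
}

/* Action function: the outputs depend only on the current state. */
Outputs state_outputs(State s)
{
    switch (s) {
    case GOING_UP:  return (Outputs){1, 0, 0, 0};
    case GOING_DN:  return (Outputs){0, 1, 0, 0};
    case DOOR_OPEN: return (Outputs){0, 0, 1, 1};
    case IDLE:
    default:        return (Outputs){0, 0, 1, 0};
    }
}
```

A controller loop would repeatedly sample req, floor, and timer, call next_state, and drive the outputs from state_outputs. Keeping the two functions separate mirrors the split between transitions and per-state actions in the FSM description.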

Finite-state machine with datapath model (FSMD)
FSMD extends FSM: complex data types and variables for storing data
  FSMs use only Boolean data types and operations, no variables
FSMD: 7-tuple (S, I, O, V, F, H, s0)
  S is a set of states {s0, s1, ..., sl}
  I is a set of inputs {i0, i1, ..., im}
  O is a set of outputs {o0, o1, ..., on}
  V is a set of variables {v0, v1, ..., vn}
  F is a next-state function (S x I x V -> S)
  H is an action function (S -> O + V)
  s0 is an initial state
I, O, V may represent complex data types (e.g., integers, floating point, etc.)
F, H may include arithmetic operations
H is an action function, not just an output function
  Describes variable updates as well as outputs
Complete system state now consists of current state, si, and values of all variables
[Figure: the elevator UnitControl state diagram (states Idle, GoingUp, GoingDn, DoorOpen) with the same transition labels and u,d,o,t actions as in the FSM model.]

State machine vs. sequential program model
Different thought process used with each model
State machine:
  Encourages designer to think of all possible states and transitions among states based on all possible input conditions
Sequential program model:
  Designed to transform data through series of instructions that may be iterated and conditionally executed
State machine description excels in many cases
  More "natural" means of computing in those cases
  Not due to graphical representation (state diagram)
    Would still have same benefits if textual language used (e.g., state table)
    Besides, sequential program model could use graphical representation (e.g., flowchart)
Try Capturing Other Behaviors with an FSM
E.g., answering machine blinking light when there are messages
E.g., a simple telephone answering machine that answers after 4 rings when activated
E.g., a simple crosswalk traffic control light
Others
-
Capturing state machines in sequential programming language
Despite benefits of state machine model, most popular development tools use sequential programming language
  C, C++, Java, Ada, VHDL, Verilog, etc.
  Development tools are complex and expensive, therefore not easy to adapt or replace
    Must protect investment
Two approaches to capturing state machine model with sequential programming language
  Front-end tool approach
    Additional tool installed to support state machine language
      Graphical and/or textual state machine languages
      May support graphical simulation
      Automatically generate code in sequential programming language that is input to main development tool
    Drawback: must support additional tool (licensing costs, upgrades, training, etc.)
  Language subset approach
    Most common approach...
Language subset approach
Follow rules (template) for capturing state machine constructs in equivalent sequential language constructs
Used with software (e.g., C) and hardware languages (e.g., VHDL)
Capturing UnitControl state machine in C
  Enumerate all states (#define)
  Declare state variable initialized to initial state (IDLE)
  Single switch statement branches to current state's case
  Each case has actions
    up, down, open, timer_start
  Each case checks transition conditions to determine next state
    if(...) {state = ...;}

#define IDLE     0
#define GOINGUP  1
#define GOINGDN  2
#define DOOROPEN 3
void UnitControl() {
   int state = IDLE;
   while (1) {
      switch (state) {
         case IDLE: up=0; down=0; open=1; timer_start=0;
            if (req == floor) {state = IDLE;}
            if (req > floor) {state = GOINGUP;}
            if (req < floor) {state = GOINGDN;}
            break;
         case GOINGUP: up=1; down=0; open=0; timer_start=0;
            if (req > floor) {state = GOINGUP;}
            if (!(req > floor)) {state = DOOROPEN;}
            break;
         case GOINGDN: up=0; down=1; open=0; timer_start=0;
            if (req < floor) {state = GOINGDN;}
            if (!(req ...
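The listing above is cut off in this transcript, so the following is only a sketch of how the complete state machine might look when the switch body is factored into two pure functions. The GOINGDN exit and the DOOROPEN case are assumptions read off the state diagram labels (!(timer < 10) returning to Idle); the names unit_control_outputs and unit_control_next and the timer parameter are hypothetical, not from the slides.

```c
/* States of the UnitControl FSM, as enumerated on the slide. */
typedef enum { IDLE, GOINGUP, GOINGDN, DOOROPEN } State;

/* Output signals driven in each state. */
typedef struct { int up, down, open, timer_start; } Outputs;

/* Actions: outputs depend only on the current state. */
Outputs unit_control_outputs(State s) {
    switch (s) {
        case GOINGUP:  return (Outputs){1, 0, 0, 0};
        case GOINGDN:  return (Outputs){0, 1, 0, 0};
        case DOOROPEN: return (Outputs){0, 0, 1, 1};
        case IDLE:
        default:       return (Outputs){0, 0, 1, 0};
    }
}

/* Transitions: one pass through the switch, returning the next state.
   timer is the elapsed door-open time (assumed input, not on the slide). */
State unit_control_next(State s, int req, int floor, int timer) {
    switch (s) {
        case IDLE:
            if (req > floor) return GOINGUP;
            if (req < floor) return GOINGDN;
            return IDLE;
        case GOINGUP:
            return (req > floor) ? GOINGUP : DOOROPEN;
        case GOINGDN:
            return (req < floor) ? GOINGDN : DOOROPEN;   /* assumed from diagram */
        case DOOROPEN:
            return (timer < 10) ? DOOROPEN : IDLE;       /* assumed from diagram */
    }
    return IDLE;
}
```

A controller loop would call unit_control_outputs(state) to drive the signals and then state = unit_control_next(state, req, floor, timer) on every iteration. Separating actions from transitions is a design choice made here so each part can be checked in isolation.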
-
General template
#define S0 0
#define S1 1
...
#define SN N
void StateMachine() {
   int state = S0; // or whatever is the initial state.
   while (1) {
      switch (state) {
         case S0:
            // Insert S0's actions here & insert transitions Ti leaving S0:
            if( T0's condition is true ) {state = T0's next state; /*actions*/ }
            if( T1's condition is true ) {state = T1's next state; /*actions*/ }
            ...
            if( Tm's condition is true ) {state = Tm's next state; /*actions*/ }
            break;
         case S1:
            // Insert S1's actions here
            // Insert transitions Ti leaving S1
            break;
         ...
         case SN:
            // Insert SN's actions here
            // Insert transitions Ti leaving SN
            break;
      }
   }
}
HCFSM and the Statecharts language
Hierarchical/concurrent state machine model (HCFSM)
  Extension to state machine model to support hierarchy and concurrency
  States can be decomposed into another state machine
    "With hierarchy" has identical functionality as "Without hierarchy", but has one less transition (z)
    Known as OR-decomposition
  States can execute concurrently
    Known as AND-decomposition
Statecharts
  Graphical language to capture HCFSM
  timeout: transition with time limit as condition
  history: remember last substate OR-decomposed state A was in before transitioning to another state B
    Return to saved substate of A when returning from B instead of initial state
[Figures: "Without hierarchy" shows states A1, A2, B with transitions x, y, w and two z transitions; "With hierarchy" shows A1 and A2 grouped inside state A with transitions x, y, w and one z transition; "Concurrency" shows state B containing C (substates C1, C2; transitions x, y) and D (substates D1, D2; transitions u, v).]
-
UnitControl with FireMode
FireMode
  When fire is true, move elevator to 1st floor and open door
[Figures: "Without hierarchy" shows UnitControl with the normal states plus FireGoingDn (floor>1; u,d,o = 0,1,0) and FireDrOpen (floor==1; u,d,o = 0,0,1), with fire and !fire transitions drawn between individual states. "With hierarchy" shows UnitControl decomposed into NormalMode and FireMode, joined by one fire and one !fire transition; the DoorOpen timing uses timeout(10). "With concurrent RequestResolver" shows ElevatorController containing UnitControl and RequestResolver.]
w/o hierarchy: Getting messy!
w/ hierarchy: Simple!
Program-state machine model (PSM): HCFSM plus sequential program model
Program-state's actions can be FSM or sequential program
  Designer can choose most appropriate
Stricter hierarchy than HCFSM used in Statecharts
  Transition between sibling states only, single entry
  Program-state may "complete"
    Reaches end of sequential program code, OR
    FSM transition to special complete substate
  PSM has 2 types of transitions
    Transition-immediately (TI): taken regardless of source program-state
    Transition-on-completion (TOC): taken only if condition is true AND source program-state is complete
  SpecCharts: extension of VHDL to capture PSM model
  SpecC: extension of C to capture PSM model
[Figure: ElevatorController containing UnitControl and RequestResolver (... req = ... ...), with shared variable int req; UnitControl contains NormalMode and FireMode joined by fire and !fire transitions.]
  NormalMode:
     up = down = 0; open = 1;
     while (1) {
        while (req == floor);
        open = 0;
        if (req > floor) { up = 1; }
        else { down = 1; }
        while (req != floor);
        open = 1;
        delay(10);
     }
  FireMode:
     up = 0; down = 1; open = 0;
     while (floor > 1);
     up = 0; down = 0; open = 1;
NormalMode and FireMode described as sequential programs
Black square originating within FireMode indicates !fire is a TOC transition
  Transition from FireMode to NormalMode only after FireMode completed
-
Role of appropriate model and language
Finding appropriate model to capture embedded system is an important step
  Model shapes the way we think of the system
    Originally thought of sequence of actions, wrote sequential program
      First wait for requested floor to differ from target floor
      Then, we close the door
      Then, we move up or down to the desired floor
      Then, we open the door
      Then, we repeat this sequence
    To create state machine, we thought in terms of states and transitions among states
      When system must react to changing inputs, state machine might be best model
      HCFSM described FireMode easily, clearly
Language should capture model easily
  Ideally should have features that directly capture constructs of model
  FireMode would be very complex in sequential program
    Checks inserted throughout code
  Other factors may force choice of different model
    Structured techniques can be used instead
      E.g., template for state machine capture in sequential program language
Concurrent process model
Describes functionality of system in terms of two or more concurrently executing subtasks
Many systems easier to describe with concurrent process model because inherently multitasking
E.g., simple example:
  Read two numbers X and Y
  Display "Hello world." every X seconds
  Display "How are you?" every Y seconds
More effort would be required with sequential program or state machine model
[Figure: subroutine execution over time. ReadX and ReadY run first, then PrintHelloWorld and PrintHowAreYou execute concurrently.]

Simple concurrent process example
ConcurrentProcessExample() {
   x = ReadX()
   y = ReadY()
   Call concurrently:
      PrintHelloWorld(x) and PrintHowAreYou(y)
}
PrintHelloWorld(x) {
   while( 1 ) {
      print "Hello world."
      delay(x);
   }
}
PrintHowAreYou(y) {
   while( 1 ) {
      print "How are you?"
      delay(y);
   }
}

Sample input and output
Enter X: 1
Enter Y: 2
Hello world. (Time = 1 s)
Hello world. (Time = 2 s)
How are you? (Time = 2 s)
Hello world. (Time = 3 s)
How are you? (Time = 4 s)
Hello world. (Time = 4 s)
...
-
Dataflow model
Derivative of concurrent process model
Nodes represent transformations
  May execute concurrently
Edges represent flow of tokens (data) from one node to another
  May or may not have token at any given time
When all of node's input edges have at least one token, node may fire
When node fires, it consumes input tokens, processes transformation, and generates output token
Nodes may fire simultaneously
Several commercial tools support graphical languages for capture of dataflow model
  Can automatically translate to concurrent process model for implementation
  Each node becomes a process
[Figure: nodes with more complex transformations. Inputs A, B feed modulate and inputs C, D feed convolve; their outputs t1 and t2 feed transform, which produces Z.]
[Figure: nodes with arithmetic transformations. Inputs A, B feed a + node and inputs C, D feed a - node; their outputs t1 and t2 feed a * node, which produces Z.]
Z = (A + B) * (C - D)
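As a rough illustration of the arithmetic graph for Z = (A + B) * (C - D), each node can be written as a small function that "fires" once its input tokens are available. The function names are made up for this sketch, and the firing order is fixed by hand here rather than discovered by a dataflow scheduler.

```c
/* Each function is one node of the dataflow graph. */
static int node_add(int a, int b)   { return a + b; }    /* consumes A, B; produces t1 */
static int node_sub(int c, int d)   { return c - d; }    /* consumes C, D; produces t2 */
static int node_mul(int t1, int t2) { return t1 * t2; }  /* consumes t1, t2; produces Z */

/* Fires the three nodes in an order that respects the edges:
   the * node can only fire after both t1 and t2 exist. */
int dataflow_z(int a, int b, int c, int d) {
    int t1 = node_add(a, b);  /* could run concurrently with node_sub */
    int t2 = node_sub(c, d);
    return node_mul(t1, t2);
}
```

The + and - nodes share no edge, which is why a concurrent implementation would be free to run them at the same time.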
Synchronous dataflow
With digital signal-processors (DSPs), data flows at fixed rate
Multiple tokens consumed and produced per firing
Synchronous dataflow model takes advantage of this
  Each edge labeled with number of tokens consumed/produced each firing
  Can statically schedule nodes, so can easily use sequential program model
    Don't need real-time operating system and its overhead
How would you map this model to a sequential programming language? Try it...
Algorithms developed for scheduling nodes into "single-appearance" schedules
  Only one statement needed to call each node's associated procedure
    Allows procedure inlining without code explosion, thus reducing overhead even more
[Figure: synchronous dataflow. The modulate/convolve/transform graph with inputs A, B, C, D, intermediate edges t1, t2, and output Z; edges are annotated with token rates mA, mB, mC, mD, mt1, ct2, tt1, tt2, tZ.]
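One ingredient of static scheduling is working out how often each node must fire so that token counts on an edge balance. The sketch below handles a single edge only; the rates used in the usage note are invented for illustration (the slide's rates are symbolic), and sdf_balance is a hypothetical name.

```c
/* Greatest common divisor (Euclid's algorithm); expects positive inputs. */
static int gcd(int a, int b) {
    while (b != 0) { int t = a % b; a = b; b = t; }
    return a;
}

typedef struct { int producer_firings, consumer_firings; } Repetitions;

/* Smallest firing counts satisfying the balance equation for one edge:
   producer_firings * produced == consumer_firings * consumed. */
Repetitions sdf_balance(int produced, int consumed) {
    int g = gcd(produced, consumed);
    Repetitions r = { consumed / g, produced / g };
    return r;
}
```

For example, if a node produces 2 tokens per firing and its successor consumes 3, sdf_balance(2, 3) gives 3 producer firings and 2 consumer firings per schedule period, which a sequential program could express as two simple loops.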
-
Concurrent processes and real-time systems
Concurrent processes
Consider two examples having separate tasks running independently but sharing data
  Difficult to write system using sequential program model
  Concurrent process model easier
    Separate sequential programs (processes) for each task
    Programs communicate with each other

Heartbeat Monitoring System (inputs: buttons B[1..4], heart-beat pulse)
  Task 1:
    Read pulse
    If pulse < Lo then Activate Siren
    If pulse > Hi then Activate Siren
    Sleep 1 second
    Repeat
  Task 2:
    If B1/B2 pressed then Lo = Lo +/- 1
    If B3/B4 pressed then Hi = Hi +/- 1
    Sleep 500 ms
    Repeat

Set-top Box (input: signal; outputs: video, audio)
  Task 1:
    Read Signal
    Separate Audio/Video
    Send Audio to Task 2
    Send Video to Task 3
    Repeat
  Task 2:
    Wait on Task 1
    Decode/output Audio
    Repeat
  Task 3:
    Wait on Task 1
    Decode/output Video
    Repeat
-
Process
A sequential program, typically an infinite loop
  Executes concurrently with other processes
  We are about to enter the world of "concurrent programming"
Basic operations on processes
  Create and terminate
    Create is like a procedure call but caller doesn't wait
      Created process can itself create new processes
    Terminate kills a process, destroying all data
    In HelloWorld/HowAreYou example, we only created processes
  Suspend and resume
    Suspend puts a process on hold, saving state for later execution
    Resume starts the process again where it left off
  Join
    A process suspends until a particular child process finishes execution
Communication among processes
Processes need to communicate data and signals to solve their computation problem
  Processes that don't communicate are just independent programs solving separate problems
Basic example: producer/consumer
  Process A produces data items, Process B consumes them
  E.g., A decodes video packets, B displays decoded packets on a screen
How do we achieve this communication?
  Two basic methods
    Shared memory
    Message passing

void processA() {
   // Decode packet
   // Communicate packet to B
}
void processB() {
   // Get packet from A
   // Display packet
}
[Figure: encoded video packets enter processA; decoded video packets pass from processA to processB; processB output goes to display.]
-
Shared Memory
Processes read and write shared variables
  No time overhead, easy to implement
  But, hard to use: mistakes are common
Example: Producer/consumer with a mistake
  Share buffer[N], count
    count = # of valid data items in buffer
  processA produces data items and stores in buffer
    If buffer is full, must wait
  processB consumes data items from buffer
    If buffer is empty, must wait
  Error when both processes try to update count concurrently (lines 10 and 19) and the following execution sequence occurs. Say "count" is 3.
    A loads count (count = 3) from memory into register R1 (R1 = 3)
    A increments R1 (R1 = 4)
    B loads count (count = 3) from memory into register R2 (R2 = 3)
    B decrements R2 (R2 = 2)
    A stores R1 back to count in memory (count = 4)
    B stores R2 back to count in memory (count = 2)
  count now has incorrect value of 2

01: data_type buffer[N];
02: int count = 0;
03: void processA() {
04:   int i = 0;
05:   while( 1 ) {
06:     produce(&data);
07:     while( count == N );/*loop*/
08:     buffer[i] = data;
09:     i = (i + 1) % N;
10:     count = count + 1;
11:   }
12: }
13: void processB() {
14:   int i = 0;
15:   while( 1 ) {
16:     while( count == 0 );/*loop*/
17:     data = buffer[i];
18:     i = (i + 1) % N;
19:     count = count - 1;
20:     consume(&data);
21:   }
22: }
23: void main() {
24:   create_process(processA);
25:   create_process(processB);
26: }
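The faulty interleaving described above can be replayed step by step on a single thread, which makes the lost update reproducible. This is only an illustration of the sequence from the slide, not a real concurrent program; r1 and r2 stand in for registers R1 and R2, and replay_lost_update is a hypothetical name.

```c
/* Replays the interleaving from the slide in a fixed order and
   returns the final value of count. */
int replay_lost_update(int count) {
    int r1, r2;
    r1 = count;     /* A loads count into R1 */
    r1 = r1 + 1;    /* A increments R1 */
    r2 = count;     /* B loads count into R2 (still the old value) */
    r2 = r2 - 1;    /* B decrements R2 */
    count = r1;     /* A stores R1 back to count */
    count = r2;     /* B stores R2 back, overwriting A's update */
    return count;
}
```

Starting from count = 3, one increment and one decrement should leave 3, but the replay returns 2 because B's store is based on a stale load.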
Message Passing
Message passing
  Data explicitly sent from one process to another
    Sending process performs special operation, send
    Receiving process must perform special operation, receive, to receive the data
    Both operations must explicitly specify which process it is sending to or receiving from
    Receive is blocking, send may or may not be blocking
  Safer model, but less flexible

void processA() {
   while( 1 ) {
      produce(&data);
      send(B, &data);
      /* region 1 */
      receive(B, &data);
      consume(&data);
   }
}
void processB() {
   while( 1 ) {
      receive(A, &data);
      transform(&data);
      send(A, &data);
      /* region 2 */
   }
}
-
Back to Shared Memory: Mutual Exclusion
Certain sections of code should not be performed concurrently
  Critical section
    Possibly noncontiguous section of code where simultaneous updates, by multiple processes to a shared memory location, can occur
When a process enters the critical section, all other processes must be locked out until it leaves the critical section
  Mutex
    A shared object used for locking and unlocking segment of shared data
    Disallows read/write access to memory it guards
    Multiple processes can perform lock operation simultaneously, but only one process will acquire lock
    All other processes trying to obtain lock will be put in blocked state until unlock operation performed by acquiring process when it exits critical section
    These processes will then be placed in runnable state and will compete for lock again
Correct Shared Memory Solution to the Consumer-Producer Problem
The primitive mutex is used to ensure critical sections are executed in mutual exclusion of each other
Following the same execution sequence as before:
  A/B execute lock operation on count_mutex
  Either A or B will acquire lock
    Say B acquires it
    A will be put in blocked state
  B loads count (count = 3) from memory into register R2 (R2 = 3)
  B decrements R2 (R2 = 2)
  B stores R2 back to count in memory (count = 2)
  B executes unlock operation
    A is placed in runnable state again
  A loads count (count = 2) from memory into register R1 (R1 = 2)
  A increments R1 (R1 = 3)
  A stores R1 back to count in memory (count = 3)
count now has correct value of 3

data_type buffer[N];
int count = 0;
mutex count_mutex;
void processA() {
   int i = 0;
   while( 1 ) {
      produce(&data);
      while( count == N );/*loop*/
      buffer[i] = data;
      i = (i + 1) % N;
      count_mutex.lock();
      count = count + 1;
      count_mutex.unlock();
   }
}
void processB() {
   int i = 0;
   while( 1 ) {
      while( count == 0 );/*loop*/
      data = buffer[i];
      i = (i + 1) % N;
      count_mutex.lock();
      count = count - 1;
      count_mutex.unlock();
      consume(&data);
   }
}
void main() {
   create_process(processA);
   create_process(processB);
}
-
Process Communication
Try modeling "req" value of our elevator controller
  Using shared memory
  Using shared memory and mutexes
  Using message passing
[Figure: system interface. Buttons inside the elevator (b1, b2, ..., bN) and up/down buttons on each floor (up1, up2, dn2, up3, dn3, ..., dnN) feed RequestResolver, which supplies req to UnitControl; UnitControl also reads floor and drives up, down, open.]
A Common Problem in Concurrent Programming: Deadlock
Deadlock: A condition where 2 or more processes are blocked, each waiting for another to unlock critical sections of code
  Both processes are then in blocked state
  Cannot execute unlock operation so will wait forever
Example code has 2 different critical sections of code that can be accessed simultaneously
  2 locks needed (mutex1, mutex2)
  Following execution sequence produces deadlock
    A executes lock operation on mutex1 (and acquires it)
    B executes lock operation on mutex2 (and acquires it)
    A/B both execute in critical sections 1 and 2, respectively
    A executes lock operation on mutex2
      A blocked until B unlocks mutex2
    B executes lock operation on mutex1
      B blocked until A unlocks mutex1
    DEADLOCK!
One deadlock elimination protocol requires locking of numbered mutexes in increasing order and two-phase locking (2PL)
  Acquire locks in 1st phase only, release locks in 2nd phase

mutex mutex1, mutex2;
void processA() {
   while( 1 ) {
      mutex1.lock();
      /* critical section 1 */
      mutex2.lock();
      /* critical section 2 */
      mutex2.unlock();
      /* critical section 1 */
      mutex1.unlock();
   }
}
void processB() {
   while( 1 ) {
      mutex2.lock();
      /* critical section 2 */
      mutex1.lock();
      /* critical section 1 */
      mutex1.unlock();
      /* critical section 2 */
      mutex2.unlock();
   }
}
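The increasing-order rule can be sketched with POSIX mutexes as below. This assumes a pthreads environment; the locks array and the helpers lock_both and unlock_both are hypothetical names for this sketch, with index 0 standing in for mutex1 and index 1 for mutex2.

```c
#include <pthread.h>

/* Two numbered mutexes: index 0 plays mutex1, index 1 plays mutex2. */
static pthread_mutex_t locks[2] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* First phase: acquire both locks in increasing index order,
   whatever order the caller names them in. */
void lock_both(int i, int j) {
    int lo = (i < j) ? i : j;
    int hi = (i < j) ? j : i;
    pthread_mutex_lock(&locks[lo]);
    pthread_mutex_lock(&locks[hi]);
}

/* Second phase: release both locks, highest index first. */
void unlock_both(int i, int j) {
    int lo = (i < j) ? i : j;
    int hi = (i < j) ? j : i;
    pthread_mutex_unlock(&locks[hi]);
    pthread_mutex_unlock(&locks[lo]);
}
```

If processA called lock_both(0, 1) and processB called lock_both(1, 0), both would try index 0 first, so neither could end up holding one lock while waiting for the other.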
-
Synchronization among processes
Sometimes concurrently running processes must synchronize their execution
  When a process must wait for:
    another process to compute some value
    reach a known point in their execution
    signal some condition
Recall producer-consumer problem
  processA must wait if buffer is full
  processB must wait if buffer is empty
  This is called busy-waiting
    Process executing loops instead of being blocked
    CPU time wasted
More efficient methods
  Join operation, and blocking send and receive discussed earlier
    Both block the process so it doesn't waste CPU time
  Condition variables and monitors
Condition variables
Condition variable is an object that has 2 operations, signal and wait
When process performs a wait on a condition variable, the process is blocked until another process performs a signal on the same condition variable
How is this done?
  Process A acquires lock on a mutex
  Process A performs wait, passing this mutex
    Causes mutex to be unlocked
  Process B can now acquire lock on same mutex
  Process B enters critical section
    Computes some value and/or makes condition true
  Process B performs signal when condition true
    Causes process A to implicitly reacquire mutex lock
    Process A becomes runnable
-
Condition variable example: consumer-producer
2 condition variables
  buffer_empty
    Signals at least 1 free location available in buffer
  buffer_full
    Signals at least 1 valid data item in buffer
processA:
  produces data item
  acquires lock (cs_mutex) for critical section
  checks value of count
  if count = N, buffer is full
    performs wait operation on buffer_empty
    this releases the lock on cs_mutex, allowing processB to enter critical section, consume data item and free location in buffer
    processB then performs signal
  if count < N, buffer is not full
    processA inserts data into buffer
    increments count
    signals processB, making it runnable if it has performed a wait operation on buffer_full

data_type buffer[N];
int count = 0;
mutex cs_mutex;
condition buffer_empty, buffer_full;
void processA() {
   int i = 0;
   while( 1 ) {
      produce(&data);
      cs_mutex.lock();
      if( count == N ) buffer_empty.wait(cs_mutex);
      buffer[i] = data;
      i = (i + 1) % N;
      count = count + 1;
      cs_mutex.unlock();
      buffer_full.signal();
   }
}
void processB() {
   int i = 0;
   while( 1 ) {
      cs_mutex.lock();
      if( count == 0 ) buffer_full.wait(cs_mutex);
      data = buffer[i];
      i = (i + 1) % N;
      count = count - 1;
      cs_mutex.unlock();
      buffer_empty.signal();
      consume(&data);
   }
}
void main() {
   create_process(processA);
   create_process(processB);
}
Consumer-producer using condition variables
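For comparison, the same scheme might look as follows with POSIX threads. This is a sketch under assumptions, not the slide's API: the buffer holds ints, N = 4 and ITEMS = 1000 are arbitrary, the loops are finite so the demo terminates, and run_producer_consumer is a hypothetical helper. One deliberate difference: the waits sit in while loops rather than if statements, because pthread_cond_wait is allowed to wake up spuriously.

```c
#include <pthread.h>
#include <stddef.h>

#define N 4          /* buffer capacity (assumed value) */
#define ITEMS 1000   /* items exchanged in this demo (assumed value) */

static int buffer[N];
static int count = 0, in = 0, out = 0;
static pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t buffer_empty = PTHREAD_COND_INITIALIZER; /* a free slot exists */
static pthread_cond_t buffer_full = PTHREAD_COND_INITIALIZER;  /* a valid item exists */

static void *producer(void *arg) {
    (void)arg;
    for (int v = 1; v <= ITEMS; v++) {
        pthread_mutex_lock(&cs_mutex);
        while (count == N)                        /* re-check after each wakeup */
            pthread_cond_wait(&buffer_empty, &cs_mutex);
        buffer[in] = v;
        in = (in + 1) % N;
        count++;
        pthread_cond_signal(&buffer_full);
        pthread_mutex_unlock(&cs_mutex);
    }
    return NULL;
}

static void *consumer(void *arg) {
    long *sum = arg;
    for (int k = 0; k < ITEMS; k++) {
        pthread_mutex_lock(&cs_mutex);
        while (count == 0)
            pthread_cond_wait(&buffer_full, &cs_mutex);
        *sum += buffer[out];
        out = (out + 1) % N;
        count--;
        pthread_cond_signal(&buffer_empty);
        pthread_mutex_unlock(&cs_mutex);
    }
    return NULL;
}

/* Runs one producer and one consumer to completion and
   returns the sum of all consumed items. */
long run_producer_consumer(void) {
    pthread_t a, b;
    long sum = 0;
    pthread_create(&a, NULL, producer, NULL);
    pthread_create(&b, NULL, consumer, &sum);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return sum;
}
```

Since the producer sends 1 through ITEMS exactly once each, the returned sum should equal ITEMS * (ITEMS + 1) / 2 if no item is lost or duplicated.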
Monitors
Collection of data and methods or subroutines that operate on data, similar to an object-oriented paradigm
Monitor guarantees only 1 process can execute inside monitor at a time
  (a) Process X executes while Process Y has to wait
  (b) Process X performs wait on a condition
      Process Y allowed to enter and execute
  (c) Process Y signals condition Process X is waiting on
      Process Y blocked
      Process X allowed to continue executing
  (d) Process X finishes executing in monitor or waits on a condition again
      Process Y made runnable again
[Figure: panels (a) to (d), each showing Process X and Process Y relative to a Monitor containing DATA and CODE, with the blocked process marked "Waiting".]
-
Monitor example: consumer-producer
Single monitor encapsulates both processes along with buffer and count
One process will be allowed to begin executing first
If processB allowed to execute first
  Will execute until it finds count = 0
  Will perform wait on buffer_full condition variable
  processA now allowed to enter monitor and execute
  processA produces data item
  finds count < N so writes to buffer and increments count
  processA performs signal on buffer_full condition variable
  processA blocked
  processB reenters monitor and continues execution, consumes data, etc.

Monitor {
   data_type buffer[N];
   int count = 0;
   condition buffer_full, buffer_empty;
   void processA() {
      int i = 0;
      while( 1 ) {
         produce(&data);
         if( count == N ) buffer_empty.wait();
         buffer[i] = data;
         i = (i + 1) % N;
         count = count + 1;
         buffer_full.signal();
      }
   }
   void processB() {
      int i = 0;
      while( 1 ) {
         if( count == 0 ) buffer_full.wait();
         data = buffer[i];
         i = (i + 1) % N;
         count = count - 1;
         buffer_empty.signal();
         consume(&data);
         buffer_full.signal();
      }
   }
} /* end monitor */
void main() {
   create_process(processA);
   create_process(processB);
}
Implementation
Mapping of system's functionality onto hardware processors:
  captured using computational model(s)
  written in some language(s)
Implementation choice independent from language(s) choice
Implementation choice based on power, size, performance, timing and cost requirements
Final implementation tested for feasibility
  Also serves as blueprint/prototype for mass manufacturing of final product
[Figure: three layers. Computational models (sequential program, state machine, dataflow, concurrent processes) map to languages (Pascal, C/C++, Java, VHDL), which map to Implementation A, Implementation B, Implementation C. Annotations:]
  The choice of computational model(s) is based on whether it allows the designer to describe the system.
  The choice of language(s) is based on whether it captures the computational model(s) used by the designer.
  The choice of implementation is based on whether it meets power, size, performance and cost requirements.
-
Concurrent process model: implementation
Can use single-purpose and/or general-purpose processors
(a) Multiple processors, each executing one process
  True multitasking (parallel processing)
  General-purpose processors
    Use programming language like C and compile to instructions of processor
    Expensive and in most cases not necessary
  Custom single-purpose processors
    More common
(b) One general-purpose processor running all processes
  Most processes don't use 100% of processor time
  Can share processor time and still achieve necessary execution rates
(c) Combination of (a) and (b)
  Multiple processes run on one general-purpose processor while one or more processes run on own single-purpose processor
[Figure: (a) Process1 to Process4 each mapped to its own processor (Processor A to Processor D) on a communication bus; (b) all four processes on one general-purpose processor; (c) processes split between Processor A and a general-purpose processor connected by a communication bus.]
Implementation: multiple processes sharing single processor
Can manually rewrite processes as a single sequential program
  Ok for simple examples, but extremely difficult for complex examples
  Automated techniques have evolved but not common
  E.g., simple Hello World concurrent program from before would look like:
     I = 1; T = 0;
     while (1) {
        Delay(I); T = T + 1;
        if T modulo X is 0 then call PrintHelloWorld
        if T modulo Y is 0 then call PrintHowAreYou
     }
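The single-loop rewrite can be tried out in C with the delay and printing replaced by writes to a text log, so that the result is easy to inspect. This is a sketch only: simulate, its parameters, and the fixed tick count are inventions for the example, and x and y are assumed to be positive.

```c
#include <stdio.h>
#include <string.h>

/* Runs the merged loop for `ticks` one-second steps. Instead of delaying
   and printing, each message is appended to `log` (capacity `cap`). */
void simulate(int x, int y, int ticks, char *log, size_t cap) {
    char line[64];
    log[0] = '\0';
    for (int t = 1; t <= ticks; t++) {
        if (t % x == 0) {   /* every X seconds */
            snprintf(line, sizeof line, "Hello world. (Time = %d s)\n", t);
            strncat(log, line, cap - strlen(log) - 1);
        }
        if (t % y == 0) {   /* every Y seconds */
            snprintf(line, sizeof line, "How are you? (Time = %d s)\n", t);
            strncat(log, line, cap - strlen(log) - 1);
        }
    }
}
```

Unlike the truly concurrent version, this loop always emits "Hello world." before "How are you?" when both fall on the same second, because the order of the two checks is fixed in the code.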
Can use multitasking operating system
  Much more common
  Operating system schedules processes, allocates storage, and interfaces to peripherals, etc.
  Real-time operating system (RTOS) can guarantee execution rate constraints are met
  Describe concurrent processes with languages having built-in processes (Java, Ada, etc.) or a sequential programming language with library support for concurrent processes (C, C++, etc. using POSIX threads for example)
Can convert processes to sequential program with process scheduling right in code
  Less overhead (no operating system)
  More complex/harder to maintain
-
Processes vs. threads
Different meanings in operating system terminology
Regular processes
  Heavyweight process
  Own virtual address space (stack, data, code)
  System resources (e.g., open files)
Threads
  Lightweight process
  Subprocess within process
  Only program counter, stack, and registers
  Shares address space, system resources with other threads
    Allows quicker communication between threads
  Small compared to heavyweight processes
    Can be created quickly
    Low cost switching between threads
Implementation: suspending, resuming, and joining
Multiple processes mapped to single-purpose processors
  Built into processor's implementation
  Could be extra input signal that is asserted when process suspended
  Additional logic needed for determining process completion
    Extra output signals indicating process done
Multiple processes mapped to single general-purpose processor
  Built into programming language or special multitasking library like POSIX
  Language or library may rely on operating system to handle
-
Implementation: process scheduling
Must meet timing requirements when multiple concurrent processes implemented on single general-purpose processor
  Not true multitasking
Scheduler
  Special process that decides when and for how long each process is executed
  Implemented as preemptive or nonpreemptive scheduler
  Preemptive
    Determines how long a process executes before preempting to allow another process to execute
      Time quantum: predetermined amount of execution time preemptive scheduler allows each process (may be 10 to 100s of milliseconds long)
    Determines which process will be next to run
  Nonpreemptive
    Only determines which process is next after current process finishes execution
Scheduling: priority
Process with highest priority always selected first by scheduler
  Typically determined statically during creation and dynamically during execution
FIFO
  Runnable processes added to end of FIFO as created or become runnable
  Front process removed from FIFO when time quantum of current process is up or process is blocked
Priority queue
  Runnable processes again added as created or become runnable
  Process with highest priority chosen when new process needed
  If multiple processes with same highest priority value then selects from them using first-come first-served
  Called priority scheduling when nonpreemptive
  Called round-robin when preemptive
-
Priority assignment
Period of process
  Repeating time interval the process must complete one execution within
    E.g., period = 100 ms
    Process must execute once every 100 ms
  Usually determined by the description of the system
    E.g., refresh rate of display is 27 times/sec
    Period = 37 ms
Execution deadline
  Amount of time process must be completed by after it has started
    E.g., execution time = 5 ms, deadline = 20 ms, period = 100 ms
    Process must complete execution within 20 ms after it has begun regardless of its period
    Process begins at start of period, runs for 4 ms then is preempted
    Process suspended for 14 ms, then runs for the remaining 1 ms
    Completed within 4 + 14 + 1 = 19 ms which meets deadline of 20 ms
    Without deadline process could be suspended for much longer
Rate monotonic scheduling
  Processes with shorter periods have higher priority
  Typically used when execution deadline = period
Deadline monotonic scheduling
  Processes with shorter deadlines have higher priority
  Typically used when execution deadline < period

Rate monotonic
  Process   Period    Priority
  A         25 ms     5
  B         50 ms     3
  C         12 ms     6
  D         100 ms    1
  E         40 ms     4
  F         75 ms     2

Deadline monotonic
  Process   Deadline  Priority
  G         17 ms     5
  H         50 ms     2
  I         32 ms     3
  J         10 ms     6
  K         140 ms    1
  L         32 ms     4
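Both tables follow the same rule, so one small routine can reproduce them: the shorter the interval (period or deadline), the larger the priority number. The sketch below assumes that a larger number means higher priority, as the tables suggest, and breaks ties by position in the array, which is an arbitrary choice made here; monotonic_priorities is a hypothetical name.

```c
/* Assigns priorities 1..n so that a shorter interval gets a larger number.
   interval_ms holds periods (rate monotonic) or deadlines (deadline monotonic). */
void monotonic_priorities(const int *interval_ms, int n, int *priority) {
    for (int i = 0; i < n; i++) {
        int p = 1;
        for (int j = 0; j < n; j++) {
            if (interval_ms[j] > interval_ms[i]) {
                p++;   /* process j is less urgent than process i */
            } else if (interval_ms[j] == interval_ms[i] && j < i) {
                p++;   /* tie broken by array position (assumption) */
            }
        }
        priority[i] = p;
    }
}
```

Called with the periods of processes A to F, this yields the priorities in the rate monotonic table; with the deadlines of G to L it yields the deadline monotonic table, including the tie between I and L at 32 ms.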
Real-time systems
Systems composed of 2 or more cooperating, concurrent processes with stringent execution time constraints
  E.g., set-top boxes have separate processes that read or decode video and/or sound concurrently and must decode 20 frames/sec for output to appear continuous
  Other examples with stringent time constraints are:
    digital cell phones
    navigation and process control systems
    assembly line monitoring systems
    multimedia and networking systems
    etc.
  Communication and synchronization between processes for these systems is critical
  Therefore, concurrent process model best suited for describing these systems
-
Real-time operating systems (RTOS)
Provide mechanisms, primitives, and guidelines for building real-time embedded systems
Windows CE
  Built specifically for embedded systems and appliance market
  Scalable real-time 32-bit platform
  Supports Windows API
  Perfect for systems designed to interface with Internet
  Preemptive priority scheduling with 256 priority levels per process
  Kernel is 400 Kbytes
QNX
  Real-time microkernel surrounded by optional processes (resource managers) that provide POSIX and UNIX compatibility
  Microkernels typically support only the most basic services
  Optional resource managers allow scalability from small ROM-based systems to huge multiprocessor systems connected by various networking and communication technologies
  Preemptive process scheduling using FIFO, round-robin, adaptive, or priority-driven scheduling
  32 priority levels per process
  Microkernel < 10 Kbytes and complies with POSIX real-time standard
Summary
Computation models are distinct from languages
Sequential program model is popular
  Most common languages like C support it directly
State machine models good for control
  Extensions like HCFSM provide additional power
  PSM combines state machines and sequential programs
Concurrent process model for multi-task systems
  Communication and synchronization methods exist
  Scheduling is critical
Dataflow model good for signal processing
-
Chapter 2: Custom single-purpose processors
Outline
Introduction
Combinational logic
Sequential logic
Custom single-purpose processor design
RT-level custom single-purpose processor design
-
Introduction
Processor
  Digital circuit that performs computation tasks
  Controller and datapath
  General-purpose: variety of computation tasks
  Single-purpose: one particular computation task
  Custom single-purpose: non-standard task
A custom single-purpose processor may be
  Fast, small, low power
  But, high NRE, longer time-to-market, less flexible
[Figure: digital camera chip. Labels: lens, CCD, CCD preprocessor, Pixel coprocessor, A2D, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Memory controller, ISA bus interface, UART, LCD ctrl, Display ctrl]
CMOS transistor on silicon
Transistor
  The basic electrical component in digital systems
  Acts as an on/off switch
  Voltage at "gate" controls whether current flows from source to drain
  Don't confuse this "gate" with a logic gate
[Figure: nMOS transistor on silicon inside an IC package. Labels: source, drain, gate, oxide, channel, silicon substrate. The nMOS transistor conducts if gate = 1]
-
CMOS transistor implementations
Complementary Metal Oxide Semiconductor
We refer to logic levels
  Typically 0 is 0 V, 1 is 5 V or less
Two basic CMOS types
  nMOS conducts if gate = 1
  pMOS conducts if gate = 0
  Hence "complementary"
Basic gates
  Inverter, NAND, NOR
[Figure: transistor symbols (nMOS conducts if gate = 1, pMOS conducts if gate = 0) and transistor-level implementations of the inverter, F = x'; the NAND gate, F = (xy)'; and the NOR gate, F = (x+y)']
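The complementary behavior can be illustrated with a switch-level model in C. This is a sketch under stated assumptions: the slide's circuit figure did not survive extraction, so the code assumes the standard CMOS topology (pMOS transistors form the pull-up network to 1, nMOS transistors form the pull-down network to 0), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool nmos_on(bool gate) { return gate; }   /* conducts if gate = 1 */
static bool pmos_on(bool gate) { return !gate; }  /* conducts if gate = 0 */

/* Resolve the output node from its pull-up and pull-down networks.
 * In a well-formed CMOS gate exactly one of the two conducts. */
static bool cmos_output(bool pull_up, bool pull_down)
{
    assert(pull_up != pull_down);
    return pull_up;
}

/* Inverter: one pMOS to 1, one nMOS to 0. F = x' */
bool cmos_inverter(bool x)
{
    return cmos_output(pmos_on(x), nmos_on(x));
}

/* NAND: pMOS in parallel, nMOS in series. F = (xy)' */
bool cmos_nand(bool x, bool y)
{
    bool pull_up = pmos_on(x) || pmos_on(y);
    bool pull_down = nmos_on(x) && nmos_on(y);
    return cmos_output(pull_up, pull_down);
}

/* NOR: pMOS in series, nMOS in parallel. F = (x+y)' */
bool cmos_nor(bool x, bool y)
{
    bool pull_up = pmos_on(x) && pmos_on(y);
    bool pull_down = nmos_on(x) || nmos_on(y);
    return cmos_output(pull_up, pull_down);
}
```

The assertion in cmos_output documents why the two transistor types are paired: for every input combination one network conducts and the other does not, so the output is never floating and never shorted.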
Basic logic gates
  Gate       Function
  Driver     F = x
  Inverter   F = x'
  AND        F = x y
  OR         F = x + y
  XOR        F = x ⊕ y
  NAND       F = (x y)'
  NOR        F = (x + y)'
  XNOR       F = (x ⊕ y)'

  Truth tables (one input)
  x    Driver  Inverter
  0    0       1
  1    1       0

  Truth tables (two inputs)
  x y    AND  OR  XOR  NAND  NOR  XNOR
  0 0    0    0   0    1     1    1
  0 1    0    1   1    1     0    0
  1 0    0    1   1    1     0    0
  1 1    1    1   0    0     0    1
-
Combinational logic design
A) Problem description
  y is 1 if a is 1, or b and c are 1.
  z is 1 if b or c is 1, but not both, or if all are 1.
B) Truth table
  Inputs    Outputs
  a b c     y z
  0 0 0     0 0
  0 0 1     0 1
  0 1 0     0 1
  0 1 1     1 0
  1 0 0     1 0
  1 0 1     1 1
  1 1 0     1 1
  1 1 1     1 1
C) Output equations
  y = a'bc + ab'c' + ab'c + abc' + abc
  z = a'b'c + a'bc' + ab'c + abc' + abc
D) Minimized output equations
  K-map for y     bc = 00  01  11  10
  a = 0                0   0   1   0
  a = 1                1   1   1   1
  y = a + bc
  K-map for z     bc = 00  01  11  10
  a = 0                0   1   0   1
  a = 1                0   1   1   1
  z = ab + b'c + bc'
E) Logic gates (random logic)
  [Figure: gate-level circuit with inputs a, b, c and outputs y, z]
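One way to gain confidence in a minimization is to compare the minimized equations against the unminimized ones for every input row. The sketch below does that in C; the function names are hypothetical and the code is an illustration, not part of the original slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Unminimized equations from step C: one product term per truth-table
 * row whose output is 1. */
static bool y_full(bool a, bool b, bool c)
{
    return (!a && b && c) || (a && !b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}

static bool z_full(bool a, bool b, bool c)
{
    return (!a && !b && c) || (!a && b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}

/* Minimized equations from step D. */
bool y_min(bool a, bool b, bool c) { return a || (b && c); }
bool z_min(bool a, bool b, bool c) { return (a && b) || (!b && c) || (b && !c); }

/* Exhaustively compare both forms; returns the number of rows checked. */
int check_minimization(void)
{
    int rows = 0;
    for (int i = 0; i < 8; i++) {
        bool a = (i >> 2) & 1, b = (i >> 1) & 1, c = i & 1;
        assert(y_min(a, b, c) == y_full(a, b, c));
        assert(z_min(a, b, c) == z_full(a, b, c));
        rows++;
    }
    return rows;
}
```

With only three inputs the check is exhaustive, so a passing run means the minimized and unminimized equations describe the same function.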
Combinational components
Multiplexor (n-bit, m x 1)
  Inputs I0, I1, ..., I(m-1) (n bits each), select input S (log m bits); output O (n bits)
  O = I0     if S = 0..00
      I1     if S = 0..01
      ...
      I(m-1) if S = 1..11
Decoder (log n x n)
  Inputs I0 ... I(log n - 1); outputs O0, O1, ..., O(n-1)
  O0 = 1     if I = 0..00
  O1 = 1     if I = 0..01
  ...
  O(n-1) = 1 if I = 1..11
  With enable input e: all O's are 0 if e = 0
Adder (n-bit)
  Inputs A, B (n bits each); outputs sum (n bits), carry
  sum = A + B (first n bits)
  carry = (n+1)th bit of A + B
  With carry-in input Ci: sum = A + B + Ci
Comparator (n-bit)
  Inputs A, B (n bits each); outputs less, equal, greater
  less = 1 if A < B
  equal = 1 if A = B
  greater = 1 if A > B
ALU (n bit, m function)
  Inputs A, B (n bits each), select input S (log m bits); output O (n bits)
  O = A op B, op determined by S
  May have status outputs carry, zero, etc.
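The behavior of these components can be written down as small C functions. This is a sketch with fixed sizes chosen for illustration (8-bit data, a 4 x 1 multiplexor, a 2 x 4 decoder); the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit, 4 x 1 multiplexor: O = I[S]. */
uint8_t mux4(const uint8_t in[4], unsigned sel)
{
    assert(sel < 4);
    return in[sel];
}

/* 2 x 4 decoder with enable: bit `in` of the result is 1,
 * and all outputs are 0 if enable = 0. */
uint8_t decoder2x4(unsigned in, int enable)
{
    assert(in < 4);
    return enable ? (uint8_t)(1u << in) : 0;
}

/* 8-bit adder with carry-in: sum is the low 8 bits, carry is the 9th bit. */
typedef struct { uint8_t sum; int carry; } adder_out;

adder_out adder8(uint8_t a, uint8_t b, int carry_in)
{
    unsigned full = (unsigned)a + (unsigned)b + (carry_in ? 1u : 0u);
    adder_out r = { (uint8_t)(full & 0xFFu), (int)((full >> 8) & 1u) };
    return r;
}

/* 8-bit comparator: exactly one of the three outputs is 1. */
typedef struct { int less, equal, greater; } cmp_out;

cmp_out comparator8(uint8_t a, uint8_t b)
{
    cmp_out r = { a < b, a == b, a > b };
    return r;
}
```

For example, adder8(200, 100, 0) yields sum = 44 and carry = 1, since 300 does not fit in 8 bits.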
-
Sequential components
Register (n-bit, storage)
  Inputs I (n bits), load, clear; output Q (n bits)
  Q = 0           if clear = 1,
      I           if load = 1 and clock = 1,
      Q(previous) otherwise.
Shift register (n-bit)
  Inputs I, shift; output Q
  Q = lsb
  Content shifted
  I stored in msb
Counter (n-bit)
  Output Q (n bits)
  Q = 0           if clear = 1,
      Q(prev) + 1 if count = 1 and clock = 1.
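Each sequential component can be modeled as a function that computes the stored value after one clock edge. The sketch below assumes 8-bit components and a right-shifting shift register (consistent with "Q = lsb, I stored in msb"); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register: value held after one clock edge. */
uint8_t register_step(uint8_t q, uint8_t in, int load, int clear)
{
    if (clear) return 0;
    return load ? in : q;
}

/* Counter: value held after one clock edge; wraps from 255 to 0. */
uint8_t counter_step(uint8_t q, int count, int clear)
{
    if (clear) return 0;
    return count ? (uint8_t)(q + 1u) : q;
}

/* Shift register: *q_out receives the lsb of the content before the edge;
 * on a shift the content moves right and serial_in is stored in the msb.
 * Returns the content after the edge. */
uint8_t shift_step(uint8_t content, int serial_in, int shift, int *q_out)
{
    assert(q_out != NULL);
    *q_out = content & 1;
    if (!shift) return content;
    return (uint8_t)((content >> 1) | (serial_in ? 0x80u : 0u));
}
```

Calling a step function once per simulated clock cycle and feeding its result back in as the current value mimics how the hardware holds state between edges.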
Sequential logic design
A) Problem description
  You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.
B) State diagram
  States 0, 1, 2, 3
  Output x = 0 in states 0, 1, and 2; x = 1 in state 3
  Each state advances to the next when a = 1 (state 3 returns to state 0) and holds when a = 0
C) Implementation model
  Combinational logic with inputs a, Q1, Q0 and outputs x, I1, I0
  State register with inputs I1, I0 and outputs Q1, Q0
D) State table (Moore-type)
  Inputs        Outputs
  Q1 Q0 a       I1 I0    x
  0  0  0       0  0     0
  0  0  1       0  1     0
  0  1  0       0  1     0
  0  1  1       1  0     0
  1  0  0       1  0     0
  1  0  1       1  1     0
  1  1  0       1  1     1
  1  1  1       0  0     1
Given this implementation model, sequential logic design quickly reduces to combinational logic design
-
Sequential logic design (cont.)
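E) Minimized output equations
  K-map for I1    Q1Q0 = 00  01  11  10
  a = 0                  0   0   1   1
  a = 1                  0   1   0   1
  I1 = Q1'Q0a + Q1a' + Q1Q0'
  K-map for I0    Q1Q0 = 00  01  11  10
  a = 0                  0   1   1   0
  a = 1                  1   0   0   1
  I0 = Q0a' + Q0'a
  K-map for x     Q1Q0 = 00  01  11  10
  a = 0                  0   0   1   0
  a = 1                  0   0   1   0
  x = Q1Q0
F) Combinational logic (random logic)
  [Figure: gate-level circuit with inputs a, Q1, Q0 and outputs I1, I0, x]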
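The minimized equations can be exercised with a short C simulation of the state register and combinational logic. This is a sketch, not the authors' code: the type and function names are hypothetical, and the input a is assumed to be held at 1 when counting pulses.

```c
#include <assert.h>

typedef struct { int q1, q0; } divider_state;

/* Moore output: x = Q1Q0 depends only on the current state. */
int divider_output(divider_state s)
{
    return s.q1 && s.q0;
}

/* Next-state logic using the minimized equations:
 *   I1 = Q1'Q0a + Q1a' + Q1Q0'
 *   I0 = Q0a' + Q0'a */
divider_state divider_next(divider_state s, int a)
{
    assert(a == 0 || a == 1);
    divider_state n;
    n.q1 = (!s.q1 && s.q0 && a) || (s.q1 && !a) || (s.q1 && !s.q0);
    n.q0 = (s.q0 && !a) || (!s.q0 && a);
    return n;
}

/* Run `cycles` clock cycles from state 0 with a = 1 and count
 * the cycles in which x = 1. */
int divider_count_pulses(int cycles)
{
    assert(cycles >= 0);
    divider_state s = {0, 0};
    int pulses = 0;
    for (int i = 0; i < cycles; i++) {
        pulses += divider_output(s);
        s = divider_next(s, 1);
    }
    return pulses;
}
```

Over 16 cycles the output is 1 in four of them, i.e. once every four clock cycles, which is the behavior asked for in the problem description.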
Custom single-purpose processor basic model
[Figure: controller and datapath. The controller has external control inputs and external control outputs; the datapath has external data inputs and external data outputs; datapath control inputs and datapath control outputs connect the two]
[Figure: a view inside the controller and datapath. Controller: state register, next-state and control logic. Datapath: registers, functional units]
-
Example: greatest common divisor
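(a) black-box view
  GCD block with inputs go_i, x_i, y_i and output d_o
(b) alg. specification
  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }
[Figure: state diagram derived from the algorithm; only fragments survive in the source: state 5 branching on x!=y and !(x!=y), state 7: y = y - x, state 8: x = x - y, join state 6-J]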
-
Creating the datapath
Create a register for any declared variable
Create a functional unit for each arithmetic operation
Connect the ports, registers and functional units
  Based on reads and writes
  Use multiplexors for multiple sources
Create unique identifier for each datapath component control input and output
[Figure: GCD state diagram and datapath; only fragments of the state diagram survive in the source]
-
Splitting into a controller and datapath
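[Figure: controller FSM for the GCD example]
Controller states, with their outputs and branch conditions
  1:   loop condition (1 / !1)
  2:   branch on !go_i / !(!go_i)
  3:   x_sel = 0, x_ld = 1
  4:   y_sel = 0, y_ld = 1
  5:   branch on x_neq_y = 1 / x_neq_y = 0
  6:   branch on x_lt_y = 1 / x_lt_y = 0
  7:   y_sel = 1, y_ld = 1
  8:   x_sel = 1, x_ld = 1
  9:   d_ld = 1
  Join states: 1-J, 2-J, 5-J, 6-J
  The 13 states are encoded 0000 through 1100
Controller implementation model
  Combinational logic with inputs go_i, x_neq_y, x_lt_y, Q3..Q0 and outputs x_sel, y_sel, x_ld, y_ld, d_ld, I3..I0
  State register with inputs I3..I0 and outputs Q3..Q0
Datapath functional units
  Subtractor for 7: y-x
  Subtractor for 8: x-y
  Comparators for 5: x!=y and 6: x<y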
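The split can be sketched as a C simulation in which the controller only produces the control signals named on the slide (x_sel, x_ld, y_sel, y_ld, d_ld) and only reads the status signals (x_neq_y, x_lt_y), while the datapath owns the registers and subtractors. This is an illustration under assumptions: the state names and function names are hypothetical, the join states are folded away, and go_i is assumed to be asserted.

```c
#include <assert.h>

/* Hypothetical state names; comments give the slide's line labels. */
typedef enum {
    S_WAIT,     /* 2: wait for go_i */
    S_LOAD_X,   /* 3: */
    S_LOAD_Y,   /* 4: */
    S_TEST,     /* 5: */
    S_COMPARE,  /* 6: */
    S_SUB_Y,    /* 7: */
    S_SUB_X,    /* 8: */
    S_OUT       /* 9: */
} ctrl_state;

typedef struct { unsigned x, y, d; } datapath;
typedef struct { int x_sel, x_ld, y_sel, y_ld, d_ld; } ctrl_signals;

/* Datapath: one clock edge. Multiplexors pick each register's source. */
static void datapath_clock(datapath *dp, ctrl_signals c,
                           unsigned x_i, unsigned y_i)
{
    unsigned x_minus_y = dp->x - dp->y;  /* subtractor used by state 8 */
    unsigned y_minus_x = dp->y - dp->x;  /* subtractor used by state 7 */
    if (c.d_ld) dp->d = dp->x;
    if (c.x_ld) dp->x = c.x_sel ? x_minus_y : x_i;
    if (c.y_ld) dp->y = c.y_sel ? y_minus_x : y_i;
}

/* Runs controller and datapath together until d_o has been loaded. */
unsigned gcd_fsmd(unsigned x_i, unsigned y_i)
{
    assert(x_i > 0 && y_i > 0);  /* subtraction GCD needs positive inputs */
    datapath dp = {0, 0, 0};
    ctrl_state s = S_WAIT;
    const int go_i = 1;          /* assumption: start is requested */
    for (;;) {
        ctrl_signals c = {0, 0, 0, 0, 0};
        ctrl_state next = s;
        int x_neq_y = dp.x != dp.y;  /* status signals from datapath */
        int x_lt_y = dp.x < dp.y;
        switch (s) {
        case S_WAIT:    next = go_i ? S_LOAD_X : S_WAIT; break;
        case S_LOAD_X:  c.x_sel = 0; c.x_ld = 1; next = S_LOAD_Y; break;
        case S_LOAD_Y:  c.y_sel = 0; c.y_ld = 1; next = S_TEST; break;
        case S_TEST:    next = x_neq_y ? S_COMPARE : S_OUT; break;
        case S_COMPARE: next = x_lt_y ? S_SUB_Y : S_SUB_X; break;
        case S_SUB_Y:   c.y_sel = 1; c.y_ld = 1; next = S_TEST; break;
        case S_SUB_X:   c.x_sel = 1; c.x_ld = 1; next = S_TEST; break;
        case S_OUT:     c.d_ld = 1; break;
        }
        datapath_clock(&dp, c, x_i, y_i);
        if (s == S_OUT) return dp.d;
        s = next;
    }
}
```

Note that the controller never touches x or y directly; it sees the data only through x_neq_y and x_lt_y, which is the point of the split.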
-
Completing the GCD custom single-purpose processor design
We finished the datapath
We have a state table for the next state and control logic
  All that's left is combinational logic design
This is not an optimized design, but we see the basic steps
RT-level custom single-purpose processor design
We often start with a state machine
  Rather than algorithm
  Cycle timing often too central to functionality
Example
  Bus bridge that converts 4-bit bus to 8-bit bus
  Start with FSMD
  Known as register-transfer (RT) level
  Exercise: complete the design
Problem specification
  Bridge: a single-purpose processor that converts two 4-bit inputs, arriving one at a time over data_in along with a rdy_in pulse, into one 8-bit output on data_out along with a rdy_out pulse.
  [Figure: Sender drives rdy_in and data_in(4) into the Bridge; the Bridge drives rdy_out and data_out(8) into the Receiver; clock]
FSMD
  Bridge
  Inputs: rdy_in: bit; data_in: bit[4];
  Outputs: rdy_out: bit; data_out: bit[8]
  Variables: data_lo, data_hi: bit[4];
  States, actions, and transitions
    WaitFirst4         rdy_in=0: stay; rdy_in=1: go to RecFirst4Start
    RecFirst4Start     data_lo = data_in
    RecFirst4End       rdy_in=1: stay; rdy_in=0: go to WaitSecond4
    WaitSecond4        rdy_in=0: stay; rdy_in=1: go to RecSecond4Start
    RecSecond4Start    data_hi = data_in
    RecSecond4End      rdy_in=1: stay; rdy_in=0: go to Send8Start
    Send8Start         data_out = data_hi & data_lo; rdy_out = 1
    Send8End           rdy_out = 0; return to WaitFirst4
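The bridge FSMD can be prototyped in C before any hardware is designed. The sketch below is one possible reading of the state machine, with hypothetical type and function names; it interprets "data_hi & data_lo" as concatenation (high nibble followed by low nibble), and assumes the sender holds data_in stable while rdy_in is 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WAIT_FIRST4, REC_FIRST4_START, REC_FIRST4_END,
    WAIT_SECOND4, REC_SECOND4_START, REC_SECOND4_END,
    SEND8_START, SEND8_END
} bridge_state;

typedef struct {
    bridge_state state;
    uint8_t data_lo, data_hi;  /* 4-bit variables */
    uint8_t data_out;          /* 8-bit output */
    int rdy_out;
} bridge;

void bridge_init(bridge *b)
{
    assert(b != NULL);
    b->state = WAIT_FIRST4;
    b->data_lo = b->data_hi = b->data_out = 0;
    b->rdy_out = 0;
}

/* One clock cycle: perform the current state's action, then move on. */
void bridge_clock(bridge *b, int rdy_in, uint8_t data_in)
{
    assert(b != NULL && data_in < 16);
    switch (b->state) {
    case WAIT_FIRST4:
        if (rdy_in) b->state = REC_FIRST4_START;
        break;
    case REC_FIRST4_START:
        b->data_lo = data_in;
        b->state = REC_FIRST4_END;
        break;
    case REC_FIRST4_END:
        if (!rdy_in) b->state = WAIT_SECOND4;
        break;
    case WAIT_SECOND4:
        if (rdy_in) b->state = REC_SECOND4_START;
        break;
    case REC_SECOND4_START:
        b->data_hi = data_in;
        b->state = REC_SECOND4_END;
        break;
    case REC_SECOND4_END:
        if (!rdy_in) b->state = SEND8_START;
        break;
    case SEND8_START:
        /* Concatenate: data_hi is the upper nibble, data_lo the lower. */
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = 1;
        b->state = SEND8_END;
        break;
    case SEND8_END:
        b->rdy_out = 0;
        b->state = WAIT_FIRST4;
        break;
    }
}
```

The paired Start/End states are what make the cycle timing explicit: the End state waits for rdy_in to drop, so one long rdy_in pulse is not mistaken for two nibbles.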
-
RT-level custom single-purpose processor design (cont)
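(a) Controller
  WaitFirst4         rdy_in=0: stay; rdy_in=1: go to RecFirst4Start
  RecFirst4Start     data_lo_ld = 1
  RecFirst4End       rdy_in=1: stay; rdy_in=0: go to WaitSecond4
  WaitSecond4        rdy_in=0: stay; rdy_in=1: go to RecSecond4Start
  RecSecond4Start    data_hi_ld = 1
  RecSecond4End      rdy_in=1: stay; rdy_in=0: go to Send8Start
  Send8Start         data_out_ld = 1; rdy_out = 1
  Send8End           rdy_out = 0
(b) Datapath
  [Figure: Bridge datapath. Registers data_lo and data_hi are loaded from data_in(4) under data_lo_ld and data_hi_ld; register data_out is loaded under data_out_ld and drives data_out; clk goes to all registers]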
Optimizing custom single-purpose processors
Optimization is the task of making design metric values the best possible
Optimization opportunities
  original program
  FSMD
  datapath
  FSM
-
Optimizing the original program
Analyze program attributes and look for areas of possible improvement
  number of computations
  size of variable
  time and space complexity
  operations used
    multiplication and division very expensive
Optimizing the original program (cont)
original program
  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }
optimized program
  0: int x, y, r;
  1: while (1) {
  2:   while (!go_i);
       // x must be the larger number
  3:   if (x_i >= y_i) {
  4:     x = x_i;
  5:     y = y_i;
       }
  6:   else {
  7:     x = y_i;
  8:     y = x_i;
       }
  9:   while (y != 0) {
  10:    r = x % y;
  11:    x = y;
  12:    y = r;
       }
  13:  d_o = x;
     }
Replace the subtraction operation(s) with the modulo operation in order to speed up the program
Original program: GCD(42, 8) takes 9 iterations to complete the loop
  x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2)
Optimized program: GCD(42, 8) takes 3 iterations to complete the loop
  x and y values evaluated as follows: (42, 8), (8, 2), (2, 0)
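The iteration counts can be checked with a small C program. This sketch uses hypothetical function names, drops the go_i handshake, and counts evaluations of the inner loop condition, which is the counting that yields the slide's 9 and 3 (one count per listed (x, y) pair).

```c
#include <assert.h>
#include <stddef.h>

/* Original program: repeated subtraction.
 * *tests receives the number of times the loop condition is evaluated. */
unsigned gcd_subtract(unsigned x, unsigned y, int *tests)
{
    assert(x > 0 && y > 0 && tests != NULL);
    *tests = 0;
    for (;;) {
        (*tests)++;
        if (x == y) break;
        if (x < y)
            y = y - x;
        else
            x = x - y;
    }
    return x;
}

/* Optimized program: modulo. x is made the larger number first. */
unsigned gcd_modulo(unsigned x_i, unsigned y_i, int *tests)
{
    assert(x_i > 0 && y_i > 0 && tests != NULL);
    unsigned x = (x_i >= y_i) ? x_i : y_i;
    unsigned y = (x_i >= y_i) ? y_i : x_i;
    *tests = 0;
    for (;;) {
        (*tests)++;
        if (y == 0) break;
        unsigned r = x % y;
        x = y;
        y = r;
    }
    return x;
}
```

For inputs 42 and 8 both functions return 2; the subtraction version evaluates its loop condition 9 times and the modulo version 3 times. Whether this is a net win in hardware depends on the cost of the modulo unit, which the slide on operations flags as expensive.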
-
Optimizing t