Top-down Design Methodology


This lecture was produced with the support of C & S Technology. It carries no copyright, so anyone may copy and distribute it for non-profit purposes. The laboratory homepage hosts many lectures on high-performance microprocessors, all of which can be downloaded free of charge.

Top-down Design Methodology

2002. 12.
Yonsei University, Department of Electrical and Electronic Engineering
Processor Laboratory
Ph.D. candidate 정우경 (Jeong Woo-kyeong)
E-mail: [email protected]
Homepage: http://mpu.yonsei.ac.kr
Phone: 02-2123-2872


Advances in Semiconductor

Moore's Law: the number of transistors doubles every 18 months.
  - UltraSPARC III: 87.5M transistors, Pentium 4: 55M transistors, Itanium 2: 221M transistors
Top-down Design Methodology
  - Short time to market
  - Reduced NRE cost
  - Design reuse
  - Increased flexibility
  - Alternative technology libraries
  - Alternative architectures


Design Methodology

Bottom-up
  - Full custom
  - Small area, high performance
Top-down
  - HDL based design
  - Synthesis using automatic CAD tools
  - Easy development and verification

[Figure: levels of abstraction: System concept, Algorithm, Architecture, RTL, Gate, Transistor. Behavioral abstraction increases toward the system concept; detailed realization and complexity increase toward the transistor level.]

Top-down Design Methodology

[Figure: hierarchical decomposition. System: PCB1, PCB2, PCB3. Board: uP, ROM, RAM, ASIC, Peri, FPGA. Chip: RTL code for blocks A and B. RTL synthesis produces gates; layout synthesis produces the layout.]

Design Automation Tools

HDL simulation:
  - Verilog-XL, NC-Verilog, NC-VHDL (Cadence), VSS, VCS (Synopsys), ModelSim (Mentor)
Synthesis:
  - Design Compiler (Synopsys), BuildGates (Cadence), Leonardo (Mentor)
Verification:
  - PrimeTime, EPIC (Synopsys), Calibre (Mentor), Star-Sim, Hercules (Avant!), Diva, Dracula (Cadence)
Layout:
  - Apollo (Avant!), Silicon Ensemble, Virtuoso (Cadence), IC Station (Mentor)

Design Flow

[Figure: design flow, top to bottom, with the verification step and tool attached to each stage]
  - Behavioral HDL Model: algorithm verification (NC-Verilog)
  - RTL HDL Model: functional verification (NC-Verilog)
  - Synthesis (Design Compiler)
  - Gate Level Netlist: dynamic/static timing verification (NC-Verilog, PrimeTime)
  - Place & Route (Apollo)
  - Layout: post-layout timing verification (NC-Verilog, PrimeTime)
  - Fabrication

HDL (Hardware Description Language)

Description aspects
  - Abstract behavior modeling
  - Hardware structure modeling
VHDL
  - 1980 USA Department of Defense
  - 1987 IEEE Standard 1076
Verilog
  - 1981 Gateway Design Automation
  - 1995 IEEE Standard 1364

HDL Description

Behavioral model: an abstraction of how the design works, with little regard to implementation; similar to a programming language
Structural model: describes the constituent modules and their interconnections; hierarchical design
RTL (Register Transfer Level) model: specifies the registers that store data and interconnects them through logic equations
  - Register => Combinational Logic => Register
  - Reveals hardware structure
  - Synthesizable
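The register => combinational logic => register pattern can be sketched in a few lines of Verilog. This is only an illustration: the module name `rtl_stage` and its signals are hypothetical and do not come from the lecture, and the ANSI-style port list assumes a Verilog-2001 tool.

```verilog
// Minimal RTL sketch: register => combinational logic => register.
// Module and signal names are illustrative assumptions.
module rtl_stage (
    input  wire       clk,
    input  wire [3:0] a_in,
    input  wire [3:0] b_in,
    output reg  [4:0] sum_out
);
    reg  [3:0] a_reg, b_reg;   // first register stage
    wire [4:0] sum_comb;       // combinational logic between the registers

    // Logic equation; the 5-bit result keeps the carry.
    assign sum_comb = a_reg + b_reg;

    always @(posedge clk) begin
        a_reg   <= a_in;       // capture inputs
        b_reg   <= b_in;
        sum_out <= sum_comb;   // second register stage
    end
endmodule
```

Every storage element is visible as a `reg` assigned under `posedge clk`, which is what makes the hardware structure apparent and the description synthesizable.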


HDL Guidelines

Technology independence
Clock logic
  - Clock logic (clock gating logic, reset generation) should be kept in one block.
  - Avoid multiple clocks per block.
  - Use meaningful names for clocks.
  - For DFT scan, clocks should be controllable from primary inputs.
No glue logic at the top
Module name same as file name
Pads separate from core logic
Minimize unnecessary hierarchy
Register all outputs

Memory Element Inference

Incomplete sensitivity lists: cause a simulation mismatch or infer a latch
Latch vs flip-flop
  - Latch: level-sensitive, small area, more troublesome
      always @(enable)
  - Flip-flop: edge-sensitive
      Synchronous reset
        always @(posedge clk)
      Asynchronous reset
        always @(posedge clk or negedge reset)
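The three sensitivity lists above can be fleshed out into complete modules. The following is a sketch under assumptions: the module names, the data port `d`, and the reset polarities (active-high synchronous, active-low asynchronous) are illustrative choices, not taken from the lecture.

```verilog
// Level-sensitive latch: q follows d while enable is high.
module latch_example (input wire enable, input wire d, output reg q);
    always @(enable or d)          // complete sensitivity list
        if (enable)
            q <= d;                // no else branch: q holds its value, so a latch is inferred
endmodule

// Edge-sensitive flip-flop with synchronous reset (assumed active-high).
module dff_sync_reset (input wire clk, input wire reset, input wire d, output reg q);
    always @(posedge clk)
        if (reset) q <= 1'b0;      // reset is sampled only at the clock edge
        else       q <= d;
endmodule

// Edge-sensitive flip-flop with asynchronous reset (assumed active-low).
module dff_async_reset (input wire clk, input wire reset, input wire d, output reg q);
    always @(posedge clk or negedge reset)
        if (!reset) q <= 1'b0;     // reset takes effect without waiting for clk
        else        q <= d;
endmodule
```

Listing both `enable` and `d` in the latch keeps simulation consistent with the synthesized latch, which stays transparent to `d` for as long as `enable` is high.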

Synthesis of if statements

if without else infers a latch
if-else implies a multiplexer
if-else if implies priority

[Figure: priority logic block with inputs int0, int1, int2, int3 and outputs int0_active, int1_active, int2_active, int3_active]
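A possible coding of the priority logic in the figure is sketched below. The port names follow the figure; the module name `priority_logic` and the choice of `int0` as the highest-priority input are assumptions made for illustration.

```verilog
// if-else if chain: int0 has the highest priority, int3 the lowest (assumed ordering).
module priority_logic (
    input  wire int0, int1, int2, int3,
    output reg  int0_active, int1_active, int2_active, int3_active
);
    always @(int0 or int1 or int2 or int3) begin
        // Default assignments: every output is driven on every pass,
        // so no latch is inferred.
        int0_active = 1'b0;
        int1_active = 1'b0;
        int2_active = 1'b0;
        int3_active = 1'b0;

        if      (int0) int0_active = 1'b1;
        else if (int1) int1_active = 1'b1;
        else if (int2) int2_active = 1'b1;
        else if (int3) int3_active = 1'b1;
    end
endmodule
```

Without the default assignments, each output would be written in only one branch and a latch would be inferred for it, which is the "if without else" case.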

Synthesis of case statement

Synopsys synthesis directives: parallel_case, full_case
  - remove priority

    always @(int0 or int1 or int2 or int3)
    begin
      case (1'b1) // synopsys full_case
        int0: int0_active = 1'b1;
        int1: int1_active = 1'b1;
        int2: int2_active = 1'b1;
        int3: int3_active = 1'b1;
      endcase
    end
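When the case items are mutually exclusive and cover every value of the case expression, the statement is parallel and full by construction, and no directive is needed. A minimal sketch, assuming a hypothetical 2-bit encoded select `sel` and a 4-bit output `active` in place of the four separate interrupt lines:

```verilog
// Case statement that is full and parallel without synthesis directives.
// The names int_decode, sel and active are illustrative assumptions.
module int_decode (
    input  wire [1:0] sel,
    output reg  [3:0] active
);
    always @(sel) begin
        case (sel)
            2'b00:   active = 4'b0001;
            2'b01:   active = 4'b0010;
            2'b10:   active = 4'b0100;
            2'b11:   active = 4'b1000;
            default: active = 4'b0000;  // reached only for x/z values in simulation
        endcase
    end
endmodule
```

Because `active` is assigned in every branch, simulation and synthesis see the same combinational function, with no priority chain and no latch.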

Procedural Assignment

Blocking versus non-blocking
  - Blocking assignment (=): order dependent, may cause a simulation mismatch
  - Non-blocking assignment (<=): order independent, simulation behaves the same as the synthesized result

    always @(posedge clk)
    begin
      firstReg  <= data;
      secondReg <= firstReg;
      thirdReg  <= secondReg;
    end
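The order dependence of blocking assignments can be seen by writing the same three statements both ways. The module names below are hypothetical; the register names follow the slide.

```verilog
// Blocking version: the statements execute in order within one clock edge,
// so data reaches thirdReg immediately and the design behaves like a
// single register rather than a three-stage shift register.
module shift_blocking (input wire clk, input wire data, output reg thirdReg);
    reg firstReg, secondReg;
    always @(posedge clk) begin
        firstReg  = data;
        secondReg = firstReg;    // already holds the new data
        thirdReg  = secondReg;   // also the new data
    end
endmodule

// Non-blocking version: all right-hand sides are sampled before any update,
// so data needs three clock edges to reach thirdReg, in any statement order.
module shift_nonblocking (input wire clk, input wire data, output reg thirdReg);
    reg firstReg, secondReg;
    always @(posedge clk) begin
        firstReg  <= data;
        secondReg <= firstReg;
        thirdReg  <= secondReg;
    end
endmodule
```

Reversing the three statements in the blocking version would restore the shift register, which is exactly the order dependence the slide warns about.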

HDL Verification

[Figure: dynamic functional test. An HDL test bench surrounds the model under test. Waveform generation applies stimulus vectors (test vectors file); the output vectors are compared against reference vectors; the comparison produces a results file and a pass/fail indication.]


Test Bench

Objective
  - instantiate the hardware model under test
  - generate stimulus waveforms and apply them
  - generate waveforms of reference vectors and compare
  - automatically provide a pass/fail indication
Writing in the same HDL as the hardware model
  - no need to learn a special tool
  - transportable across different design tools
  - wide variety in coding the test bench
  - also used for functional verification of synthesis results
Waveform Viewer: Signalscan
  - $shm_open, $shm_probe
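The four objectives map onto a small self-checking test bench. The sketch below is an assumption-laden illustration: the model under test `adder4_dut` is a hypothetical 4-bit adder written for this example, and the reference values are computed inline rather than read from a vector file.

```verilog
`timescale 1ns/1ps

// Hypothetical model under test: a 4-bit adder with carry out.
module adder4_dut (input wire [3:0] a, input wire [3:0] b, output wire [4:0] sum);
    assign sum = a + b;
endmodule

module tb_adder4;
    reg  [3:0] a, b;
    wire [4:0] sum;
    integer    i, errors;

    // 1. Instantiate the hardware model under test.
    adder4_dut U0 (.a(a), .b(b), .sum(sum));

    initial begin
        errors = 0;
        for (i = 0; i < 256; i = i + 1) begin
            {b, a} = i;              // 2. generate and apply stimulus (low 8 bits of i)
            #10;                     //    let the outputs settle
            if (sum !== a + b) begin // 3. compare with the reference value
                errors = errors + 1;
                $display("MISMATCH: a=%0d b=%0d sum=%0d", a, b, sum);
            end
        end
        // 4. Automatic pass/fail indication.
        if (errors == 0) $display("PASS");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Since the test bench is plain Verilog, the same file can later be pointed at the synthesized netlist instead of the RTL model.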

Signalscan

[Figure: Signalscan waveform display]

Synthesis

Convert RTL level HDL models into gate level netlists
  - Translation + Optimization + Mapping
Utilize standard cell libraries
Physical macro cells: PLL, memory
Synopsys Design Compiler

Design Analyzer & Design Compiler

[Figure: two interfaces to Design Compiler]
  - Design Analyzer: menu-driven interface (new user, debug)
  - dc_shell: command line interface (experienced user)

Initial SetupInitial Setup

..synopsys_dc.setupsynopsys_dc.setup–– Path informationPath information–– Specify libraries: target library, link Specify libraries: target library, link

library, symbol librarylibrary, symbol library–– Conditions: worst case condition (Conditions: worst case condition (--10% 10%

VDD, worst case process, 85~125VDD, worst case process, 85~125℃℃))–– Naming ruleNaming rule–– AliasesAliases

23

Technology Library

A set of primitive cells
  - Timing and electrical characteristics
  - Net delay and net parasitic information
  - Definition of capacitance, time, resistance units
Provided by the silicon vendor
Synopsys setting
  - target_library: cells to be mapped (.db)
  - link_library: instanced cells, wire load or operating condition models (.db)
  - symbol_library: symbols for the GUI schematic viewer (.sdb)


Synthesizing a Design

1. Bring in the design
   - Translate using read
2. Constrain the design
   - Timing, area, environmental
3. Synthesize the design
   - Optimize and map to gates with compile
4. Inspect the design
   - View synthesized schematic
   - Area, timing, and constraint reports
5. Save the design
   - Write the netlist to a file

DC Shell Script

A command file for Design Compiler that can be run iteratively or in batch mode
Contains:
  - Setup information (.synopsys_dc.setup)
  - Attribute and constraint information
  - Synthesis commands (read, compile, write, ..)
  - Control flow commands

Constraining the Design

Area Goal
  - set_max_area: specify area target for current_design
Timing Goal
  - Define constraints for all paths
      All input paths
      Internal paths
      All output paths

Defining a Clock

Clock
  - Source (port or pin)
  - Period
  - Duty cycle
  - Offset/skew
Creating a clock constrains the timing paths between registers
Preserve clock tree: set_dont_touch_network

Constraining Timing Paths

[Figure: timing paths through the block TO_BE_SYNTHESIZED, drawn with flip-flops FF1 to FF4 (D, Q, QB), logic blocks X, N, M, S, T, and the clock clk. Labels: set_input_delay, create_clock (period), set_output_delay.]

Environmental Attributes

[Figure: block TO_BE_SYNTHESIZED with flip-flops FF2 and FF3, logic blocks X, N, S, and the clock CLK, annotated with the commands below]
  - set_operating_conditions: defines operating conditions for the current design
  - set_load: sets load value on ports and nets
  - set_driving_cell: models a library cell driving input ports
  - set_wire_load: sets wire load model for the current design


Operating Conditions

[Figure: three plots of delay versus temperature, voltage, and process, each marking the best, nominal, and worst cases]

Design Rule Constraints

Maximum transition time
  - set_max_transition
  - Ports or designs
Maximum fanout
  - set_max_fanout
  - Input ports or designs
Maximum capacitance
  - set_max_capacitance
Minimum capacitance
  - set_min_capacitance

Report out

Area
  - report_area: hardware area in equivalent gate number (2-input NAND gate)
Timing
  - report_timing: reports the path with the worst slack
Power
  - report_power: estimated power consumption
Constraints
  - report_constraint: displays constraint violators

allsum.scr:

    active_design = allsum
    read -format verilog active_design + ".v"
    current_design active_design
    link
    check_design
    set_wire_load_model "enclosed"
    set_operating_conditions "V270WTP0850" -library "std90"

    create_clock -name clk -period 2 -waveform {0 1} find(port, "clk")

    set_dont_touch_network {clk resetb}
    set_clock_skew -plus_uncertainty 0.2 -minus_uncertainty 0.2 clk
    set_fix_hold find (clock, "clk")
    set_input_delay 0.5 -clock clk -max all_inputs()
    set_output_delay 0.5 -clock clk -max all_outputs()
    set_max_area 0

    set_max_fanout 1 all_inputs()
    set_max_transition 3 current_design
    set_drive 1 all_inputs()
    set_drive 0 {clk resetb}
    set_fix_multiple_port_nets -feedthroughs -constants

    compile -map medium

    remove_unconnected_ports find(-hierarchy cell, "*")
    change_names -h rules sec_verilog
    set_dont_touch current_design

    report_constraint -all_violators -verbose > active_design + ".cons"
    report_timing > active_design + ".time"
    report_area > active_design + ".area"
    report_power > active_design + ".pow"

    write -format db -hierarchy -output active_design + ".db"
    write -format verilog -hierarchy -output active_design + ".vnet"
    quit

Run in batch mode:

    dc_shell -f allsum.scr > allsum.log

    ****************************************
    Report : area
    Design : allsum
    Version: 2000.05
    Date   : Tue Nov 26 15:28:58 2002
    ****************************************

    Library(s) Used:
        std90 (File: /user3/samsung_design_kit/secstd90_synopsys/syn/STD90/std90.db)

    Number of ports:            15
    Number of nets:             26
    Number of cells:             9
    Number of references:        4

    Combinational area:        104.666656
    Noncombinational area:      65.000031
    Net Interconnect area:       7.268500

    Total cell area:           169.666687
    Total area:                176.935181
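The totals in the area report can be checked by hand from the component figures. The symbols below are introduced here for illustration and do not appear in the report.

```latex
\begin{align*}
A_{\text{cell}}  &= A_{\text{comb}} + A_{\text{noncomb}} = 104.666656 + 65.000031 = 169.666687 \\
A_{\text{total}} &= A_{\text{cell}} + A_{\text{net}} = 169.666687 + 7.268500 = 176.935187
\end{align*}
```

The cell area matches the report exactly. The hand-computed total differs from the reported 176.935181 only in the last printed digit, which is presumably floating-point rounding inside the tool (an assumption, not something the report states).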


    ****************************************
    Report : timing
            -path full
            -delay max
            -max_paths 1
    Design : allsum
    Version: 2000.05
    Date   : Tue Nov 26 15:28:58 2002
    ****************************************
    Operating Conditions: V270WTP0850   Library: std90
    Wire Load Model Mode: enclosed

      Startpoint: U2/a_reg_1A
                  (rising edge-triggered flip-flop clocked by clk)
      Endpoint: sumout_reg_5A
                (rising edge-triggered flip-flop clocked by clk)
      Path Group: clk
      Path Type: max

      Des/Clust/Port     Wire Load Model       Library
      ------------------------------------------------
      allsum             std90_5000_t          std90
      adder4             std90_5000_t          std90

      Point                                 Incr       Path
      --------------------------------------------------------
      clock clk (rise edge)                 0.00       0.00
      clock network delay (ideal)           0.00       0.00
      U2/a_reg_1A/CK (fd2qd2)               0.00       0.00 r
      U2/a_reg_1A/Q (fd2qd2)                0.85       0.85 r
      U2/a[1] (allsum_cnt)                  0.00       0.85 r
      ..........
      U1/c[5] (adder4)                      0.00       2.50 r
      sumout_reg_5A/D (fd2q)                0.00       2.50 r
      data arrival time                                2.50

      clock clk (rise edge)                 2.00       2.00
      clock network delay (ideal)           0.00       2.00
      clock uncertainty                    -0.20       1.80
      sumout_reg_5A/CK (fd2q)               0.00       1.80 r
      library setup time                   -0.43       1.37
      data required time                               1.37
      --------------------------------------------------------
      data required time                               1.37
      data arrival time                               -2.50
      --------------------------------------------------------
      slack (VIOLATED)                                -1.12
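The slack at the bottom of the timing report follows from the figures printed above it. The symbol names are introduced here for illustration; the numbers are the rounded values shown in the report.

```latex
\begin{align*}
t_{\text{required}} &= T_{\text{clk}} - t_{\text{uncertainty}} - t_{\text{setup}}
                     = 2.00 - 0.20 - 0.43 = 1.37 \\
\text{slack}        &= t_{\text{required}} - t_{\text{arrival}}
                     = 1.37 - 2.50 = -1.13
\end{align*}
```

The rounded figures give -1.13 while the report prints -1.12; the likely explanation, stated here as an assumption, is that the tool subtracts unrounded internal values before printing. Either way the slack is negative: the path needs roughly 1.1 ns more than the 2 ns clock period allows, so the constraint is violated.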

    ****************************************
    Report : power
            -analysis_effort low
    Design : allsum
    Version: 2000.05
    Date   : Tue Nov 26 15:28:59 2002
    ****************************************
    Library(s) Used:
        std90 (File: /user3/samsung_design_kit/secstd90_synopsys/syn/STD90/std90.db)
    Operating Conditions: V270WTP0850   Library: std90
    Wire Load Model Mode: enclosed
    Global Operating Voltage = 2.7
    Power-specific unit information :
        Voltage Units = 1V
        Capacitance Units = 1.000000pf
        Time Units = 1ns
        Dynamic Power Units = 1mW    (derived from V,C,T units)
        Leakage Power Units = 1mW

      Cell Internal Power  =   0.0000 mW     (0%)
      Net Switching Power  =   2.8393 mW   (100%)
                               ---------
    Total Dynamic Power    =   2.8393 mW   (100%)
    Cell Leakage Power     =   0.0000 mW

Synthesis Results

[Figure: synthesis results]

Partitioning

Objectives
  - Separate distinct functions
  - Achieve workable size and complexity
  - Manage project in team environment
  - Design reuse
Advantages
  - Better results: smaller and faster
  - Easier synthesis: simplified constraints and scripts
  - Faster compiles: quicker turnaround

Group and Ungroup

Group
  - Creates a new hierarchical block
Ungroup
  - Removes unnecessary hierarchy
  - Logic optimization cannot cross block boundaries
  - Combinational logic in separate blocks cannot be merged
No combinational path crossing hierarchy boundaries


Partitioning Strategies

No hierarchy in combinational paths
Place hierarchy boundaries at register outputs
Limit block size for reasonable runtimes (20K~100K gates)
Related combinational logic in the same module
Partition for design reuse
Separate structural logic from random logic
Separate core logic, pads, clocks, and JTAG
Remove glue logic
Isolate state machines from other logic
Think of layout style

Compile a Hierarchical Design

Top-down hierarchical compile: small designs
  - Only top level constraints
  - Optimization across entire design
  - Long compile times (memory intensive)
  - Changes to sub-blocks require complete re-synthesis
  - Does not perform well for multiple clocks
Time-budgeting compile (bottom-up): medium to large designs
  - Specify timing requirements for each block
  - Easier to manage
  - Changes to sub-blocks do not require re-synthesis of the whole design
  - Does not suffer from design style
  - Good quality results in general
  - Tedious to update and maintain multiple scripts
  - Critical paths at the top level may not be seen as critical at the lower level

Multiple Instances

Resolve to optimize
  - uniquify: creates unique definitions of multiple instances, maps each instance to its specific environment
  - compile + dont_touch: prevents modification of the design object, identical copy of the design in N places
Uniquify is recommended
  - Better optimization results
  - Clock tree insertion

Dynamic timing simulation

SDF: Standard Delay Format
  - Timing information of each cell in the design
  - Provides timing information for simulating the gate-level netlist
  - Used for pre-layout and post-layout simulation
Verilog netlist + SDF: dynamic timing simulation
  - Use Verilog simulation tools
  - Use the same test vectors as for the functional test

SDF File

Timing data
  - IOPATH delay
  - INTERCONNECT delay
  - SETUP timing check
  - HOLD timing check
Generating pre-layout SDF file
  - Approximate post-route clock tree: clock delay, skew, transition time

    write_timing -format sdf-v2.1 -output <filename>

SDF Generation Example

allsum_sdf.scr:

    active_design = allsum
    read "db/" + active_design + ".db"
    current_design active_design
    link

    write_timing -format sdf-v2.1 -output active_design + ".sdf"
    quit

    dc_shell -f allsum_sdf.scr
    => allsum.sdf is generated


Timing simulation Example

Modify the Verilog test bench
  - `include "std90.v" - library simulation file
  - `include "allsum.vnet" - synthesized Verilog netlist
  - $sdf_annotate("allsum.sdf", U0, , , "MAXIMUM", , "FROM_MAXIMUM");
Execute simulations in the same way as the Verilog RTL functional simulations

Timing Simulation [figure]

Static timing analysis

Analyzing gate-level designs using dynamic simulation
– Uses input vectors and a logic simulator
– No false-path problem; broad support for design styles
– Long run times: a bottleneck for large, complex designs
– Relies on the quality and coverage of the test bench

Static timing analysis
– Exhaustive method of analyzing, debugging, and validating the timing performance of a design
– Identifies critical paths
– Much faster than dynamic simulation, since no input vectors are simulated
– Verifies all parts of the gate-level design for timing
– False paths can be reported as violations
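The exhaustive, vector-free nature of static timing analysis can be illustrated with a small Python sketch: it walks a timing graph in topological order and keeps the latest arrival time at each node, so every path is covered without simulating any input vectors. This is only a simplified model, not how PrimeTime is implemented; the node names and delays are hypothetical.

```python
from collections import defaultdict, deque


def arrival_times(edges, sources):
    """Return the latest arrival time at every node of a combinational timing graph.

    edges:   list of (from_node, to_node, delay); the graph must be acyclic.
    sources: dict mapping start nodes to their launch time.
    """
    fanout = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set(sources)
    for src, dst, delay in edges:
        fanout[src].append((dst, delay))
        indegree[dst] += 1
        nodes.update((src, dst))

    # Nodes that are not start points begin at -inf until a path reaches them.
    arrival = {node: sources.get(node, float("-inf")) for node in nodes}
    ready = deque(node for node in nodes if indegree[node] == 0)
    while ready:  # topological order: every edge is visited exactly once
        node = ready.popleft()
        for dst, delay in fanout[node]:
            arrival[dst] = max(arrival[dst], arrival[node] + delay)
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return arrival


# Hypothetical netlist: two paths from register output Q to register input D.
edges = [("Q", "n1", 0.4), ("n1", "D", 0.3), ("Q", "n2", 0.2), ("n2", "D", 0.9)]
arrival = arrival_times(edges, {"Q": 0.0})
```

Here the path through `n2` (0.2 + 0.9) dominates the path through `n1` (0.4 + 0.3), so it is the critical path into `D`. Because the traversal ignores logic values, it would also report a path that can never be sensitized, which is why false paths show up as violations.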


PrimeTime

Synopsys stand-alone full-chip analyzer for gate-level static timing
– Analyzes timing of modules in the context of the full chip
– Identifies intermodule timing problems
– Analyzes the entire chip, including non-synthesized blocks
– Creates block-level constraints for reoptimization

Interface
– primetime: GUI, showing details of a single timing path
– pt_shell: command-line interface, for scripts and batch mode

PT shell script

PrimeTime flow
– Read in and link the design and libraries
– Specify attributes, environment, constraints, timing exceptions
– Perform analysis: check_timing, reports, visual analysis
– Characterize context and write a script for Design Compiler, perform mode analysis and case analysis (optional)

.synopsys_pt.setup: PrimeTime initial setup file
pt_shell script: script file for static timing analysis
Transcript: automatically converts a dc_shell script file into a pt_shell script file

allsum_sta.scr:
  read_verilog allsum.vnet
  current_design allsum
  link_design allsum
  set_wire_load_mode enclosed
  set_operating_conditions -library {std90} -min V360BTP0000 -max V270WTP0850
  create_clock -period 2 -waveform {0 1} {clk}
  set_clock_uncertainty -setup -0.2 clk
  set_clock_uncertainty -hold 0.2 clk
  set_input_delay -clock clk -max 0.5 [all_inputs]
  set_output_delay -clock clk -max 0.5 [all_outputs]
  set_max_fanout 1 [all_inputs]
  set_max_transition 3 [current_design]
  set_drive 1 [all_inputs]
  set_drive 0 [list clk resetb]
  report_timing > allsum.time
  report_constraint -all_violators -verbose > allsum.cons
  quit


****************************************
Report : timing
        -path full
        -delay max
        -max_paths 1
Design : allsum
Version: 1999.10-4
Date   : Wed Nov 27 18:04:52 2002
****************************************

Startpoint: U2/a_reg_1A (rising edge-triggered flip-flop clocked by clk)
Endpoint: sumout_reg_5A (rising edge-triggered flip-flop clocked by clk)
Path Group: clk
Path Type: max

Point                                Incr       Path
------------------------------------------------------
clock clk (rise edge)                0.00       0.00
clock network delay (ideal)          0.00       0.00
U2/a_reg_1A/CK (fd2qd2)              0.00       0.00 r
...
U1/U27/Y (ivd2)                      0.10       1.66 f
U1/U28/Y (ao21d2)                    0.25       1.91 r
U1/U31/Y (oa21d2)                    0.20       2.11 f
U1/U25/Y (xn2)                       0.39       2.50 r
U1/c[5] (adder4)                     0.00       2.50 r
sumout_reg_5A/D (fd2q)               0.00       2.50 r
data arrival time                               2.50

clock clk (rise edge)                2.00       2.00
clock network delay (ideal)          0.00       2.00
clock uncertainty                   -0.20       1.80
sumout_reg_5A/CK (fd2q)                         1.80 r
library setup time                  -0.43       1.37
data required time                              1.37
------------------------------------------------------
data required time                              1.37
data arrival time                              -2.50
------------------------------------------------------
slack (VIOLATED)                               -1.12
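The arithmetic behind the report can be reproduced in a few lines of Python. This is a sketch of the standard setup check, using the two-decimal figures printed above; the function and parameter names are illustrative only.

```python
def setup_slack(period, uncertainty, setup_time, arrival, capture_latency=0.0):
    """Return (data required time, slack) for a single-cycle setup check.

    required = capture clock edge + clock network delay - uncertainty - setup time
    slack    = required - data arrival time (negative means VIOLATED)
    """
    required = period + capture_latency - uncertainty - setup_time
    return required, required - arrival


# Values as printed in the report (ideal clock, so capture latency is 0).
required, slack = setup_slack(period=2.00, uncertainty=0.20,
                              setup_time=0.43, arrival=2.50)
```

With the printed values the required time comes out at 1.37 and the slack at about -1.13. The report shows -1.12; the difference in the last digit is presumably because the tool computes with unrounded internal values and rounds each printed number separately.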

Critical Path [figure]

Waveform [figure]

Path Profiler [figure]

Layout

Place & route
– Floorplanning
– Clock tree insertion
– Routing the database


Floorplanning

Most critical step
Place cells and macros in proper locations: reduce net RC delays and routing capacitances
Minimum possible area while meeting timing requirements
Divide the design into manageable blocks: hierarchical placement and routing
TDL: Timing Driven Layout
– Forward-annotate timing information to the layout tool
– Place cells with timing priority so that path constraints are not violated

Clock Tree Insertion

CTS: Clock Tree Synthesis
– Controls clock latency and skew
– Performed after cell placement, before routing

Recommendations
– Use a balanced tree structure with the minimum number of levels
– Use high drive strength buffers (inverters)
– First level: a single buffer driven by the pad, placed near the center, connected to the next level through equal interconnect wires
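As a small illustration of the two quantities CTS controls, the Python sketch below computes latency and skew from per-flip-flop clock arrival times. The sink names and arrival values are hypothetical; in practice they would come from the layout tool's clock tree report.

```python
def clock_latency_and_skew(sink_arrivals):
    """Return (worst latency, skew) for a clock tree.

    sink_arrivals: dict mapping each clock sink pin to the delay from the
                   clock source to that pin.
    latency: the largest source-to-sink delay.
    skew:    the difference between the latest and earliest sink arrival.
    """
    latest = max(sink_arrivals.values())
    earliest = min(sink_arrivals.values())
    return latest, latest - earliest


# Hypothetical arrival times in ns at three flip-flop clock pins.
arrivals = {"ff_a/CK": 1.20, "ff_b/CK": 1.35, "ff_c/CK": 1.28}
latency, skew = clock_latency_and_skew(arrivals)
```

A balanced tree with few levels keeps the arrival times close together, which is what drives the skew term toward zero.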


Routing

Global routing
– Assigns a general pathway
– Divides the layout surface into several regions

Detailed routing
– Makes use of information gathered by the global router
– Routes geometric wires within each region of the layout surface

Extraction

Wire-load model
– Statistical estimate: inaccurate

Extraction
– Produces delay values
– Back-annotated to PT for static timing analysis
    Net RC delays in SDF format
    Capacitive net loading values in set_load format
    Parasitic information for clock and other critical nets
– To DC for further optimization
    Net RC delays in SDF format
    Capacitive net loading values in set_load format

Routing & Extraction Flow

[Flowchart: Synthesis and Optimization -> Floorplanning, Placement and Clock Tree Insertion -> Global Routing -> Extract Estimated Delays -> Timing OK? -> (Yes) Detailed Routing -> Extract Real Delays -> Timing OK? -> (Yes) done. The "No" branches, labeled "Major Timing Violations" and "Minor Timing Violations", loop back to earlier steps.]

Post-Layout Optimization

Major violations: full synthesis
Minor violations: IPO (In-Place Optimization)
Back annotation to DC
– Net RC delays in SDF format
– Capacitive net loading in set_load file
– Physical placement information in PDEF

Fixing hold-time violations


DFT (Design-For-Test)

Merging testability features early in the design cycle
Fault models: stuck-at fault model
– High fault coverage correlates to high defect coverage

$$\text{Fault coverage} = \frac{\text{number of detectable faults}}{\text{total number of possible faults}}$$
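The fault coverage ratio is simple enough to check by hand, but a short Python sketch makes the bookkeeping explicit. The fault counts are hypothetical; the total assumes the single stuck-at model, in which each net contributes two faults (stuck-at-0 and stuck-at-1).

```python
def fault_coverage(detectable_faults, total_faults):
    """Fault coverage = number of detectable faults / total number of possible faults."""
    if total_faults <= 0:
        raise ValueError("total_faults must be positive")
    if not 0 <= detectable_faults <= total_faults:
        raise ValueError("detectable_faults must lie between 0 and total_faults")
    return detectable_faults / total_faults


# Hypothetical design: 500 nets, each with a stuck-at-0 and a stuck-at-1 fault.
nets = 500
total = 2 * nets
coverage = fault_coverage(detectable_faults=950, total_faults=total)
```

In this made-up case 950 of the 1000 possible stuck-at faults are detected by the test patterns, giving a coverage of 95%.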


DFT Methods

Types
– Scan insertion
    Link multiplexed flip-flops (scan-flops) to form a scan chain
– Memory BIST (Built-In Self-Test) insertion
– Boundary-scan insertion (to test board connections)

Synopsys Test Compiler (TC)
– Scan insertion, test pattern generation, JTAG or boundary-scan insertion, JTAG controller and surrounding logic generation
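The scan insertion idea, multiplexed flip-flops linked into a chain, can be modeled in a few lines of Python. This is a behavioral sketch only, not what Test Compiler generates; the class and function names are hypothetical.

```python
class ScanFlop:
    """Multiplexed flip-flop: loads scan_in when scan_enable is set, otherwise d."""

    def __init__(self):
        self.q = 0

    def clock(self, d, scan_in, scan_enable):
        self.q = scan_in if scan_enable else d


def shift(chain, pattern):
    """Shift a pattern into the chain, one bit per clock; return the bits shifted out."""
    shifted_out = []
    for bit in pattern:
        shifted_out.append(chain[-1].q)  # scan-out is the last flop's output
        # Update from the end of the chain so each flop sees its neighbour's old value.
        for i in range(len(chain) - 1, 0, -1):
            chain[i].clock(d=0, scan_in=chain[i - 1].q, scan_enable=True)
        chain[0].clock(d=0, scan_in=bit, scan_enable=True)
    return shifted_out


chain = [ScanFlop() for _ in range(4)]
shift(chain, [1, 0, 1, 1])               # load a test pattern
state = [flop.q for flop in chain]       # first bit shifted in ends up in the last flop
unloaded = shift(chain, [0, 0, 0, 0])    # unload it again through scan-out
```

In scan mode the flip-flops behave as one long shift register, which is what lets a tester set and observe every register through a single scan-in and scan-out pin.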
