EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel


Page 1: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

EDA Court: Hierarchical Construction and Timing Sign-off of SoCs

TAU 2013 Panel

Page 2: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

The good side of hierarchy

[Figure: hierarchy tree in which each node fans out to k children at the next level]

Chip (h=0)

Chiplet (h=1)

Core (h=2)

Unit (h=3)

Macro (h=4)

Page 3: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Impact of pruning

Sweet spot: 50B objects; 2M per macro; fraction at top = 4e-5; 4 levels of hierarchy; pruning = 93%

[Figure: fraction of chip at top (vertical axis) versus unpruned fraction (horizontal axis), with one curve per number of hierarchy levels, h = 0 to h = 5]
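The sweet-spot numbers can be cross-checked with a few lines of arithmetic. The sketch below only reproduces the ratio of 2M to 50B; the per-level step assumes, purely for illustration, that the same unpruned fraction applies at each of the 4 levels, which is not stated on the slide. Under that assumption the implied pruning is near, but not identical to, the 93% quoted, so the slide's own model likely differs in detail.

```python
# Figures quoted on the slide.
total_objects = 50e9       # 50B objects on the chip
objects_per_macro = 2e6    # 2M objects per macro
levels = 4                 # levels of hierarchy

# Ratio quoted on the slide as "fraction at top".
fraction_at_top = objects_per_macro / total_objects

# ASSUMPTION (not from the slide): the same unpruned fraction u applies at
# every level, so that fraction_at_top == u ** levels.
unpruned_per_level = fraction_at_top ** (1.0 / levels)
pruning_per_level = 1.0 - unpruned_per_level

print(f"fraction at top    = {fraction_at_top:.1e}")
print(f"unpruned per level = {unpruned_per_level:.3f}")
print(f"pruning per level  = {pruning_per_level:.1%}")
```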

Page 4: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

The bad side of hierarchy

Accuracy? Pessimism? Coupling noise? Functional noise? Multiple interacting clocks? Parasitics on boundary nets?

Is “context” required? If so, we cannot “shelve and re-use” macros

Construction flow? Draconian methodology restrictions?

Page 5: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Chandu Visweswariah, Distinguished Engineer, IBM East Fishkill

Larry Brown, Design Center Engineer, IBM San Jose

Alex Rubin, Senior Engineer, IBM San Jose

Amit Shaligram, Principal Engineer, STMicroelectronics Scottsdale

Oleg Levitsky, Solutions Architect, Cadence San Jose

Qiuyang Wu, Senior Staff Engineer, Synopsys Hillsboro

Igor Keller, Senior Architect, Cadence San Jose

Alexander Skourikhin, EDA Engineer, Intel Haifa

Guntram Wolski, Principal Engineer, Cisco San Jose

Page 6: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Panel plan

10 min Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future

Plaintiff: Oleg Levitsky, Cadence; Defendant: Qiuyang Wu, Synopsys

10 min Charge 2: EDA tools and flows are inadequate for a construction flow: budgeting, IP models and hierarchical constraint development are lacking

Plaintiff: Amit Shaligram, STMicro.; Defendant: Alex Rubin, IBM

10 min Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software

Plaintiff: Guntram Wolski, Cisco; Defendant: Alexander Skourikhin, Intel

10 min Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks

Plaintiff: Larry Brown, IBM; Defendant: Igor Keller, Cadence

30 min Discussion and audience questions

5 min Verdicts and “damages”

Page 7: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future

Plaintiff: Oleg Levitsky, Cadence
Defendant: Qiuyang Wu, Synopsys

Page 8: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Evolution of design flow

Prototype

Implement

Sign Off

Page 9: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Evolution of design flow

Implement

Prototype

Sign Off

Page 10: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Evolution of design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Page 11: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Evolution of design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Blk1 Blk2 … Blkn

Quiz: Why hierarchical flow?

Create more work for managers

Contribute to real estate bubble

Control time to market schedule?

Page 12: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Blk1 Blk2 … Blkn

Complexity

Hierarchical scalability

Page 13: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Blk1 Blk2 … Blkn

[Figure: flow steps Step 1, Step 2, …, Step n leading to tapeout]

Flow convergence is key

Page 14: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Blk1 Blk2 … Blkn

Technical challenges:
• SI
• Over-the-block routing
• Useful skew distribution
• CPPR modeling
• Power budgeting
• Channelless designs
• …

Human factor:
• Level of expertise
• Human error
• Lack of sleep

Convergence

Page 15: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical design flow

Implement

Prototype

Sign Off

Blk1 Blk2 … Blkn

Blk1 Blk2 … Blkn

Convergence

Complexity

Hierarchical scalability

Failed to control TTM

Page 16: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

What is the alternative?

Page 17: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future

Plaintiff: Oleg Levitsky, Cadence
Defendant: Qiuyang Wu, Synopsys

Page 18: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

© Synopsys 2013

Hierarchical Design and Timing Closure is the Only Way to Have a Future

Qiuyang Wu, Sr. Staff Engineer, Synopsys Inc.
March 2013

Page 19: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical Implementation is Proven

• Way back when, in the last century
– Designs grew beyond the reach of flat implementation
– Established hierarchical methodologies, tried and true

• The success will continue because
– naturally an iterative and gradual refinement process
– relatively larger error margins and tolerances for tradeoff
– more about reuse and integration, less about from scratch
– …

+1M Gates

+100M Gates

Page 20: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

But, “Classic” Hierarchical Timing is Inadequate for Signoff

Gap #1 - Burden is on the users: “Garbage in, garbage out”
– Block designers do not have quality constraints
  Can’t close block timing with confidence: pessimism, optimism
  Can’t create quality models: pessimism, optimism

Gap #2 - Language limitations: critical details can’t be elaborated
– Chip-level designers do not have the means to express design intent
  Can’t describe I/O timing context accurately and completely
  Can’t cover different reuse scenarios

The rescue: flat signoff.

However, hierarchical signoff is the only way to stay on top of the technology curve.

[Figure: flat STA (golden) versus hierarchical STA. Labels: full-chip golden constraints, chip netlist, chip parasitics; block constraints (ad-hoc), block netlist, block parasitics; block models (ILM, ETM, glass-box, black-box); TOP, Block, Inst]

Page 21: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

And Here is How to do Hierarchical Signoff

• The Recipe on Top of Signoff Quality Engine
  • Provide hierarchical constraint management
    – Check and highlight inconsistencies
  • Provide context feedback and allow refinement
    – Produce accurate and elaborate timing environment
  • Provide Ease-of-Use through data / flow automation
    – Minimize/prevent user errors by construction

• The Benefits Go Beyond Signoff
  – Design faster: throughput and interoperation with implementation
  – Design better: accuracy enables further optimization for power, leakage, robustness, area, etc.

Page 22: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 2: EDA tools/flows are inadequate for a construction flow: budgeting, IP models, hierarchical constraint development are lacking

Plaintiff: Amit Shaligram, STMicroelectronics
Defendant: Alex Rubin, IBM

Page 23: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical Constraints & Budgeting

Amit Shaligram, Principal Engineer

STMicroelectronics

Page 24: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel


Models – Accuracy, speed and compatibility

• Which model to use?
  • ETM or .lib – Reasonable for use before clock tree.
  • ILM – Required after clock tree insertion
• Model accuracy
  • Different modes at block and top level, block/top constraint mismatches
  • Handling of high fanout and static nets
• Model compatibility
  • Models between different vendors/tools are not compatible.
  • Some tools create “physical ILMs”, others only “timing ILMs”
• It takes time...
  • For a ~2M instance block, 1 scenario (1 mode/1 corner) takes ~6-8 hours
  • Quickly becomes impractical with 25 blocks, ~5 modes and ~16 corners
• Can someone create models on the fly? Just use the DEF!
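The "quickly becomes impractical" point can be made concrete by multiplying out the figures quoted on the slide. This is a rough sketch that assumes every block costs as much as the ~2M instance block and that every mode/corner combination needs its own model; the farm size passed to `wall_clock_days` is a hypothetical parameter, not a number from the slide.

```python
# Figures quoted on the slide.
BLOCKS = 25
MODES = 5
CORNERS = 16
HOURS_PER_SCENARIO = (6, 8)  # ~6-8 hours per (mode, corner) for a ~2M instance block


def model_generation_hours(blocks, modes, corners, hours_per_scenario):
    """Return (scenario_count, low_hours, high_hours) of total compute."""
    scenarios = blocks * modes * corners
    low, high = hours_per_scenario
    return scenarios, scenarios * low, scenarios * high


def wall_clock_days(total_hours, machines):
    """Elapsed days if scenarios are spread evenly over `machines` hosts.

    `machines` is a hypothetical farm size, not a number from the slide.
    """
    return total_hours / machines / 24


scenarios, low_hours, high_hours = model_generation_hours(
    BLOCKS, MODES, CORNERS, HOURS_PER_SCENARIO
)
print(f"{scenarios} scenarios, {low_hours}-{high_hours} machine-hours")
print(f"on 100 machines: {wall_clock_days(low_hours, 100):.1f}-"
      f"{wall_clock_days(high_hours, 100):.1f} days")
```

Under these assumptions the total is 2,000 model-generation runs, or 12,000 to 16,000 machine-hours per full refresh of the models.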


Page 25: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel


Budgeting

• Floorplan and constraints – a chicken and egg problem!
• Estimation of feedthru delays can be challenging.
  • Consider crosstalk effect!
• Best practices not easy to follow all the time (FF at the boundary)
  • Critical path from a macro, legacy design, cannot tolerate extra latency
• Managing hold violations with FF at the boundary
  • Uncommon clock path creates hold violations due to OCV impact.
• SDC format limitations after clock tree insertion
  • Input/Output delay is specified with respect to virtual clock
  • Latency of virtual clock changes with every step of the flow (postCTS, postRoute, postRouteSI)
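The hold-violation bullet above can be illustrated with a simplified OCV hold check. This is a sketch under stated assumptions, not a sign-off formula: the early/late derates of 0.95/1.05, all delays, and the function name `hold_slack_ps` are illustrative, the data delay is taken as already derated, and common-path pessimism is credited back in full. After that credit, only the uncommon (non-shared) clock branches contribute to the OCV penalty, which is why a long uncommon path to a boundary flop erodes hold slack.

```python
def hold_slack_ps(common, launch_branch, capture_branch, data_min, t_hold,
                  early=0.95, late=1.05):
    """Hold slack in ps for a launch/capture flop pair under simple OCV derating.

    common         -- clock delay shared by the launch and capture paths
    launch_branch  -- uncommon clock delay to the launch flop
    capture_branch -- uncommon clock delay to the capture flop
    data_min       -- minimum data path delay (assumed already derated)
    early, late    -- illustrative OCV derate factors (assumptions)
    """
    launch_early = early * (common + launch_branch)
    capture_late = late * (common + capture_branch)
    # Common-path pessimism removal: the shared segment cannot be both
    # early and late at once, so its derate difference is credited back.
    cppr_credit = (late - early) * common
    return launch_early + data_min - capture_late - t_hold + cppr_credit


# Same 500 ps total clock insertion delay in both cases.
mostly_common = hold_slack_ps(common=400, launch_branch=100, capture_branch=100,
                              data_min=60, t_hold=20)
fully_uncommon = hold_slack_ps(common=0, launch_branch=500, capture_branch=500,
                               data_min=60, t_hold=20)
print(f"mostly common clock path:  {mostly_common:+.1f} ps")
print(f"fully uncommon clock path: {fully_uncommon:+.1f} ps")
```

With these illustrative numbers the first case passes hold and the second fails, even though the nominal clock delays are identical.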


Page 26: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel


Hierarchical Constraints

• Top-down or bottom-up constraints development flow?
• How to ensure that block and top constraints are aligned?
• Constraint modifications required when using .lib or ILMs in top level
  • Generated clock definitions inside blocks create “new internal” clocks/pins
  • Handling large constraint files created within ILM generation flow(s)
• Boundary conditions for hold?
  • How to estimate set_min_delay accurately?
• Crosstalk effects of top level clock tree
  • How much margin is too much margin inside the blocks?
  • Using infinite timing windows inside the blocks is overkill


Page 27: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 2: EDA tools/flows are inadequate for a construction flow: budgeting, IP models, hierarchical constraint development are lacking

Plaintiff: Amit Shaligram, STMicroelectronics
Defendant: Alex Rubin, IBM

Page 28: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Living in a flat world?

March 27, 2013

Page 29: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Long list of charges that simply don’t stick…

Many teams have used hierarchy successfully to tape out designs!

– Large problems require the use of “divide and conquer”.

Vast amount of design experience, understanding and overcoming practical challenges.

Tools help establish hand-shake across hierarchical levels.
– Verification of boundary conditions and assumptions.
– Automatic constraint generation and management.
– Enforcement of best design practices.

Significant body of “do’s and don’ts” to help provide guidance, improve efficiency and reduce pessimism.

Page 30: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Follow best hierarchical design practices

Flop bound the design!

Use single macro clock input!

Simple rules can make hierarchy easy(er)!

[Figure: Macro A containing Flop 1 and Macro B containing Flop 2; each flop has D, Q and CLK pins]

Avoid critical paths crossing boundaries!

Isolate output loading from internal paths!

Page 31: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

[Figure: Object count per unit. Millions of objects (axis 0 to 3) for the top level and units Ax20, Bx14, Cx20, Dx14, Ex1, Fx1, Gx1, Hx1, Ix1 (unit name x number of reused instances); series: Full Instance, Abstracted Instance.]

[Figure: Run time (hours, axis 0 to 300) for Hierarchical Timing versus Full Chip Timing, with categories labeled PASTA and SAUCE, in two panels: Statistical Timing and Deterministic Timing. Annotations: 5X Speedup (twice), 10+ days.]

Hierarchy is a “must have”!

Parallelizes timing and optimization of independent paths to improve over-all efficiency.

Better supports timing closure when different macros / top level are at different “stages” of completeness.

Fosters un-interrupted design fix-up loop.

More resilient to failure.

44M Objects!

Page 32: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software

Plaintiff: Guntram Wolski, Cisco
Defendant: Alexander Skourikhin, Intel

Page 33: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical Timing
Felonies or Misdemeanors?

Guntram Wolski – Cisco Systems

Principal Engineer

Enterprise Networking Group

Page 34: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

• You can come close, but that only counts in …..
  Or if you start worst-casing things, you’ve overdesigned…

• You can set goals/targets for blocks, but then reality sets in.
  You end up opening the block as it is the “right thing to do” in order to close.

• Multiple instances of same core
  How do you wire over/through the cores?
  Wiring bays – what if you don’t have enough in some areas?
  Wire over the top == create new extraction/unique timing problems.
  Noise issues
  Not every instance has the same IR drop/noise profile

Page 35: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

• Requires strict PD requirements to be effective
  Very strict methodology to be effective
  Need flopped boundaries
  Long distance routes/fly-overs need extra handling or must be pushed down
  Legacy designs/IP integration cause immediate loss of benefit
  Integration/adoption complexity seems greater than with other tools
  Logic designers have very little interest in helping PD
    It’s good enough, live with it.
    I’m not paid to improve your problems, I just meet timing.
    I have to work on something else, you have to fix it.

• Are we leaving performance on the table?
  Subchips need to be designed to guardbanded conditions on I/Os and IR drop

Page 36: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

• Why are we not looking at taking advantage of parallelism?

Are these not many individual paths?

If DRC can run on 120 cpus and benefit, why can’t timing?

Break up the problem and distribute to my farm….

Page 37: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software

Plaintiff: Guntram Wolski, Cisco
Defendant: Alexander Skourikhin, Intel

Page 38: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Defense

• Timing closure is an iterative process
  • Controllability is the key for success
  • Start from initial spec
  • Once design is getting mature, gradually refine environmental requirements and increase model accuracy
  • Finally, you see the “real” timing requirements, avoiding overdesign
• Non-overdesigned multi-instantiated blocks are reality
  • Must see all the requirements (timing, parasitics) w/o worst casing
  • Clocks handling is the real challenge
  • Noise is never an issue (at most – make worst case between instances)
• Reusable IPs are feasible
  • Have to use accurate block models (adjustable to a new env.)
  • Have to apply design restrictions on interfaces

Page 39: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Defense (cont.)

• Have to apply methodological restrictions to block interfaces
  • Driver size, wire length, ports, etc.
  • All of them are manageable and ease integration on top level
  • Doesn’t necessarily lead to overdesign, due to accurate block models
  • Applicable to both flop and latch based designs
• Timing analysis is highly parallelizable
  • Individual block analysis is naturally done in parallel
  • Top level analysis might
    • leverage multi-threading technologies in STA algorithms
    • be divided in clusters and every cluster is analyzed in parallel
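The "divided in clusters" idea can be sketched as a simple map/reduce. This is a toy illustration, not how any STA engine is implemented: it assumes the clusters are truly independent (no timing path crosses a cluster boundary), and `worst_slack` is a hypothetical stand-in for timing one cluster. A thread pool keeps the example self-contained; CPU-bound work in CPython would more plausibly use `ProcessPoolExecutor` or a compute farm.

```python
from concurrent.futures import ThreadPoolExecutor


def worst_slack(cluster):
    """Stand-in for timing one cluster: return its worst (minimum) slack in ps.

    Each path is an (arrival_ps, required_ps) pair; slack = required - arrival.
    """
    return min(required - arrival for arrival, required in cluster)


def analyze_in_parallel(clusters, max_workers=4):
    """Time every cluster concurrently, then reduce to the chip-level worst slack."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_cluster = list(pool.map(worst_slack, clusters))
    return per_cluster, min(per_cluster)


# Illustrative data: three independent clusters of paths.
clusters = [
    [(800, 1000), (950, 1000)],
    [(1100, 1000)],
    [(400, 1000), (700, 1000)],
]
per_cluster, chip_worst = analyze_in_parallel(clusters)
print(per_cluster, chip_worst)
```

The reduction step is what makes the independence assumption matter: a path that spans two clusters would need to be timed at a level that sees both.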

Page 40: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Summary

• Efficient and Reliable Hierarchical Flow requires two essential factors:
  • A robust project methodology, which
    • Enforces design restrictions
    • Takes advantage of IP Reuse
    • Provides continuous timing picture throughout all project phases
    • Allows productive ECO work
  • Advanced EDA tools, which
    • Are flexible and allow controllability between accuracy and simplicity
    • Can efficiently handle Multi-X environments (X=system, corner, clocks, etc.)
    • Utilize parallel computing techniques
    • Support batch and ECO modes

Page 41: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks

Plaintiff: Larry Brown, IBM
Defendant: Igor Keller, Cadence

Page 42: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Hierarchical timing cannot handle multiple interacting synchronous clocks

Define the problem:

Page 43: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Definition continued

If clk1X is later than clk2X, we reduce our setup margin. If clk1X is earlier than clk2X, we reduce our hold margin.

We don’t know the real relationship between the two clocks until we have our top level established. This makes it difficult to close timing on the logic macro and “put it on the shelf.”

The problem is magnified if the logic macro is re-used. In that case, the setup and hold margins of the logic macro must span all existing clk1X-clk2X relationships.
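The definition above can be written out as a small calculation. This sketch assumes a path launched by clk1X and captured by clk2X, which is consistent with the statements on the slide but is an assumption since the problem figure is not reproduced here. It also assumes margins move one-for-one with skew; the function names and all numbers are illustrative.

```python
def margins_ps(setup_at_zero_skew, hold_at_zero_skew, skew):
    """Setup and hold margins for one macro instance.

    skew = arrival(clk1X) - arrival(clk2X) at the macro pins, in ps.
    ASSUMPTION: clk1X launches and clk2X captures, so a later clk1X
    (positive skew) eats setup margin and an earlier clk1X eats hold margin.
    """
    return setup_at_zero_skew - skew, hold_at_zero_skew + skew


def reused_macro_margins_ps(setup_at_zero_skew, hold_at_zero_skew, skews):
    """Margins a shelved macro must survive across all of its instances."""
    worst_setup = setup_at_zero_skew - max(skews)
    worst_hold = hold_at_zero_skew + min(skews)
    return worst_setup, worst_hold


# Illustrative numbers: one macro closed with 50 ps setup and 30 ps hold margin.
print(margins_ps(50, 30, skew=35))
print(reused_macro_margins_ps(50, 30, skews=[-20, 5, 35]))
print(reused_macro_margins_ps(50, 30, skews=[-40, 60]))
```

The last call shows the re-use problem: a wider spread of clk1X-clk2X relationships across instances drives both worst-case margins negative at once.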

Page 44: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Fixes from timing methodology

Option 1: Assert an uncertainty between clk1X and clk2X in macro timing, and validate this uncertainty when running top level timing.

Problems with this:

– Leaves performance/area on the table by lowering cycle time and/or over-padding hold fails.

– If top level can’t meet this requirement, we must open up the logic macro for further work.

Option 2: ???
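The top-level validation step of Option 1 amounts to checking each instance's measured clk1X-clk2X skew against the bound asserted during macro timing. A minimal sketch, with hypothetical instance names and illustrative numbers:

```python
def instances_exceeding_uncertainty(asserted_ps, measured_skews_ps):
    """Return the instances whose clk1X-clk2X skew breaks the asserted bound.

    asserted_ps       -- uncertainty assumed when the macro was timed
    measured_skews_ps -- {instance name: skew measured at top level, in ps}
    """
    return [name for name, skew in measured_skews_ps.items()
            if abs(skew) > asserted_ps]


# Hypothetical instance names and skews.
measured = {"macro_u0": -20, "macro_u1": 5, "macro_u2": 35}
violators = instances_exceeding_uncertainty(30, measured)
print(violators)
```

Any instance returned here corresponds to the second problem listed for Option 1: either the top-level clocks are reworked, or the macro is reopened and re-timed with a larger uncertainty.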

Page 45: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

The best solution: Fix the design

Update the design so we do not have multiple synchronous clock inputs in the first place.

Page 46: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Conclusion

Perhaps it’s more accurate to say that hierarchical timing can handle multiple synchronous clock inputs, but cannot do this without leaving performance and/or area on the table. In other words, it does not lead to the most efficient design.

Page 47: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks

Plaintiff: Larry Brown, IBM
Defendant: Igor Keller, Cadence

Page 48: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Defense:

First and foremost, defendant pleads not guilty

The charge from plaintiff only means that there is no free lunch

For Hierarchical Timing to work, designers must follow certain rules

They are well described in Alex Rubin’s defense

Specifically, one should have a single clock pin in a block to avoid extra pessimism in hold/setup timing

In the case of multiple clock pins, plaintiff himself exonerated the defendant by proposing a solution: it is possible to remove some of the pessimism by describing the relationship between the two clocks

Page 49: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Defense (cont.)

Advanced SI analysis today reduces pessimism if victim and aggressor share the same clock

SI analysis also becomes more problematic with multiple clock pins

With multiple clock pins one assumes the clocks are different, leading to
  Pessimism if uncertainty is assigned to both pins
  Optimism if no uncertainty is assigned

As is often true, the best way to resolve a problem is to avoid creating it: stick to the rules of a hierarchy-friendly design methodology

Page 50: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Ways to Remove the Limitation

There are ways to define the relationship between two internal clocks:
  Through the parent external clock
  Explicitly define ranges of skews

Parameterization of timing models with skew on two clocks is possible

These enhancements are feasible but need to be driven by real commercial interest
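One way to picture a skew-parameterized model is a block model that reports the skew window it can tolerate, which the top level then compares against an explicitly declared skew range. This is a hypothetical sketch, not a description of any existing tool or model format: the class name, the linear dependence of margin on skew, and the numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewParameterizedModel:
    """Hypothetical block model parameterized by clk1X-clk2X skew.

    Margins are given at zero skew, in ps. ASSUMPTION: margins move
    one-for-one with skew = arrival(clk1X) - arrival(clk2X).
    """
    setup_margin_ps: int
    hold_margin_ps: int

    def allowed_skew_range(self):
        """Skews for which both setup and hold margins stay non-negative."""
        return (-self.hold_margin_ps, self.setup_margin_ps)

    def accepts(self, skew_lo_ps, skew_hi_ps):
        """True if the declared top-level skew range fits inside the window."""
        lo, hi = self.allowed_skew_range()
        return lo <= skew_lo_ps and skew_hi_ps <= hi


model = SkewParameterizedModel(setup_margin_ps=50, hold_margin_ps=30)
print(model.allowed_skew_range())
print(model.accepts(-20, 35), model.accepts(-40, 35))
```

Compared with a single asserted uncertainty, the window is asymmetric and derived from the block itself, so a block with spare setup margin does not have to be padded for hold it does not need.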


Page 51: EDA Court: Hierarchical Construction and Timing Sign-off of SoCs TAU 2013 Panel

Q & A Verdicts Damages!!!

10 min Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future

Plaintiff: Oleg Levitsky, Cadence; Defendant: Qiuyang Wu, Synopsys

10 min Charge 2: EDA tools and flows are inadequate for a construction flow: budgeting, IP models and hierarchical constraint development are lacking

Plaintiff: Amit Shaligram, STMicro.; Defendant: Alex Rubin, IBM

10 min Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software

Plaintiff: Guntram Wolski, Cisco; Defendant: Alexander Skourikhin, Intel

10 min Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks

Plaintiff: Larry Brown, IBM; Defendant: Igor Keller, Cadence

30 min Discussion and audience questions

5 min Verdicts and “damages”