ccopt gen clk conf

Upload: nguyen-hung

Post on 04-Nov-2015


  • Controlling balancing of generated and master clocks in CCOpt Product Version 14.2 January 19, 2015

  • Learn more at Cadence Online Support - http://support.cadence.com 2015 Cadence Design Systems, Inc. All rights reserved worldwide. Page 2

    Copyright Statement

    2015 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence and the Cadence logo are registered trademarks of Cadence Design Systems, Inc. All others are the property of their respective holders.

    Contents

    Purpose

    Scope

    Overview

    A Simple Design Example with a Divided Clock

    Initial Configuration: Single Mode SDC constraints

    Using Multi-mode Constraints

    How To Remedy This

    Method 1: Additional SDC Constraint

    Method 2: User Modification of Skew Groups

    Final Clock Structure with Modified Skew Groups

    Summary

    Support

    Feedback

    Purpose

    This Application Note explains how native CCOpt handles generated clocks and the balancing of generated clocks with their master clocks and how to precisely control this where the default behavior is not adequate.

    Scope

    This is most relevant to designers and legacy EDI-CTS users who expect, or require, CTS to unbalance generated clock trees from their master clock trees in order to avoid over-balancing some clock trees.

    Overview

    In configuring a clock tree synthesis (CTS) tool, it's important to understand and be able to define which clock domains should be balanced together.

    EDI native (integrated) CCOpt CTS includes an automatic clock specification generation command (create_ccopt_clock_tree_spec) that is recommended for automating the setup of CCOpt.

    This Application Note examines how create_ccopt_clock_tree_spec treats designs with generated clocks (clock dividers) and how to direct the balancing of generated and master clocks where the default behavior is not sufficient.

    In particular, it explains how to get CCOpt to unbalance generated clock trees from their master clock trees.

    A Simple Design Example with a Divided Clock

    Most digital designs these days have multiple clocks running at different frequencies. It's common for a high frequency clock to be the main source input to a design to drive the highest performance logic. For logic that doesn't need to run this fast, the high speed clock is commonly divided down to a lower frequency to reduce power.

    Let's look at a small example circuit with this configuration.

    The clock structure for this circuit consists of a high speed clock input, CLK1, that drives some high-speed logic (abstracted in the bottom left oval labeled Hi Speed). There is also some logic abstracted as Low Power in the bottom right that is driven by a low frequency clock derived from a divide-by-4 clock divider circuit.

    The SDC timing constraint that defines this divided clock is also shown where it is applied, at the output of the clock divider. In the middle, there is some logic abstracted as Dual-mode that talks to the Hi Speed logic when the clock mux selects CLK1. When the clock mux selects the divided clock, div4, the Dual-mode logic talks to the Low Power logic at the lower frequency.

    Since we are focusing only on clock tree structure here, the functionality of the abstracted logic is not important, only that it contains clock tree sinks (register clock pins) that demonstrate clock tree skew balancing. So, for this circuit, each of the three abstracted logic blocks consists of a simple 8-bit multiplier with 32 total flops each (mult_a/mult_b/mult_c in the Verilog RTL).

    For this circuit, we can say that there are two different clock domains. The CLK1 clock domain consists of the Hi Speed logic and the Dual-mode logic when the clock mux selects CLK1.

    The Div4 clock domain consists of the Low Power logic and the Dual-mode logic when the clock mux selects the div4 clock input.

    For optimal performance and power, we'd like the clock tree to balance the sinks in the Div4 domain, but we'd also like to minimize insertion delay to all the sinks in the CLK1 domain in order to minimize clock tree power.

    The Div4 domain clock tree will normally have a larger insertion delay (this depends on the sink fan-out), since the clock propagates through the divide-by-4 circuitry before fanning out to the sinks.

    Since our Hi Speed logic doesn't interact with the Low Power logic directly, we don't want the Hi Speed sinks balanced with the Low Power sinks. We only want the Hi Speed sinks balanced with the Dual-mode sinks when the CLK1 input is selected to drive the Dual-mode sinks, a path that doesn't suffer the delay penalty of propagating through the clock divider.

    So it should be possible in our example circuit to achieve much shorter insertion delay on the CLK1 domain tree than on the Div4 domain clock tree.

    Initial Configuration: Single Mode SDC constraints

    To synthesize the RTL for our example circuit, we only need SDC constraints that define our master clock and our generated clock (plus some I/O constraints, etc., for data path timing, which we are ignoring in this exercise focused on clock trees):

    create_clock -name CLK1 -period 3.0 [get_ports {clk}]

    create_generated_clock -name clk_div4 -source [get_ports {clk}] -divide_by 4 [get_pins {clk_divider/clkout}]

    With these SDC constraints and our RTL, we synthesize this design and load it into Encounter Digital Implementation System (EDI) to place it and run CCOpt-CTS to build a balanced clock tree. The EDI command sequence for this default flow is:

    source init.globals

    init_design

    floorPlan -site CoreSite -r 1 0.35

    placeDesign -noPrePlaceOpt

    optDesign -preCts

    create_ccopt_clock_tree_spec -immediate

    ccopt_design -cts

    After building the clock tree, we can bring up the CCOpt Clock Tree Debugger to see the structure of the clock tree, which looks like this:

    Here is the same view, with annotations added on it to help you see the parts of our design that are being displayed in this clock tree diagram.

    The clock tree debugger draws the clock tree with the clock root at the top (green home plate shape) and the other clock elements placed vertically according to their insertion delay, with the scale on the left of the screen. The red squares at the bottom of the tree are clock tree sinks (flops in our case). On the right side of the diagram is the clock divider (two flops, since it's a divide-by-4) and the div4 clock domain, with the branch to the right driving the Low Power mult_c logic and the branch to the middle driving through the dark blue clock mux to the Dual-mode logic. We can see that CTS has balanced these two sets of sinks.

    The left side of the diagram shows the CLK1 domain with some buffering from the CLK1 root down to the Hi Speed mult_a logic, and a branch to the far left that goes through the other input of the clock mux (which is replicated in this diagram) and on to drive the Dual-mode mult_b logic. That logic is collapsed in this view into a small gray point labeled 32 to indicate how many sinks are collapsed there.

    We can see by the vertical location of the Hi Speed logic sinks and the collapsed Dual-mode logic that CCOpt-CTS has balanced these sinks with the div4 domain on the right side of the diagram. The buffer chain that it has inserted to increase the CLK1 domain delay for balancing is circled in red. For our design, this added buffering is just wasting power and increasing insertion delay for no reason. We want the insertion delay for the CLK1 domain on the left to be minimal, only balancing mult_a with mult_b.

    The Clock Tree Synthesis section of the EDI System User Guide describes the Automatic Clock Tree Spec Creation function as follows: "The create_ccopt_clock_tree_spec command is used to automatically create clock tree and skew group definitions from analysis of the active timing constraints, typically those loaded from SDC."

    Below that is a description of how spec creation works with multi-mode SDC:

    Multi-Mode Example

    The diagram below shows a simple multi-mode example annotated with SDC constraints and skew group information.

    [Figure: Multi-Mode Example]

    The resulting specification contains the following:

    Two clock trees, ck and gck. The clock tree definitions tell CTS which parts of the circuit are included in the clock tree graph and are not mode specific.

    Two skew groups, ck/mode0 and ck/mode1. The skew groups tell CTS how to perform balancing.

    Each skew group has an ignore pin defined at the appropriate multiplexer input. This represents the fact that there is no need to balance the direct clock path with the divided clock path.

    Using Multi-mode Constraints

    This seems to indicate that if we use multi-mode SDC constraints to separate the two inputs of the mux by specifying the state of the mux select signal, then the CCOpt spec generation will not balance the insertion delays through the two mux inputs against each other.

    Let's try adding another constraint mode to our design as follows:

    main.sdc:

    create_clock -name CLK1 -period 3.0 [get_ports {clk}]

    create_generated_clock -name clk_div4 -source [get_ports {clk}] -divide_by 4 [get_pins {clk_divider/clkout}]

    set_case_analysis 1 [get_ports {mode}]

    div4.sdc:

    create_clock -name CLK1 -period 3.0 [get_ports {clk}]

    create_generated_clock -name clk_div4 -source [get_ports {clk}] -divide_by 4 [get_pins {clk_divider/clkout}]

    set_case_analysis 0 [get_ports {mode}]
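
    The note does not show how these two SDC files are attached to constraint modes. As a rough sketch only, in a multi-mode multi-corner view definition this is typically done with create_constraint_mode. The mode names main and div4 below are an assumption, inferred from the skew group names (CLK1/main, CLK1/div4) that appear in the generated spec; the library sets, delay corners and analysis views that a complete view definition also needs are omitted here.

```tcl
# Assumed mode names: "main" and "div4" (inferred from the skew group names
# CLK1/main and CLK1/div4, not stated explicitly in the note).
create_constraint_mode -name main -sdc_files {main.sdc}
create_constraint_mode -name div4 -sdc_files {div4.sdc}
```

    Each constraint mode must still be referenced by an analysis view before the tool will use it.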

    After running through synthesis, EDI and CCOpt-CTS again with the multi-mode constraints, the clock tree debugger diagram now looks like this:

    This time the diagram is drawn with the Div4 domain on the left and the CLK1 domain on the right, so the structure is mirrored from our first run. Again, we see that CCOpt is inserting buffer chains (circled in red) to balance the Hi Speed and Dual-mode logic in the CLK1 domain with the logic in the div4 generated clock domain, despite the presence of multi-mode SDCs.

    So, why is CCOpt still balancing all three multipliers in our example?

    The default clock tree extraction tracing includes the low power module (mult_c) in the skew group for hi speed clock CLK1.

    It has to trace through the clock divider in order to do this.

    How To Remedy This

    Method 1: Additional SDC Constraint

    As we just saw, the default clock tree spec generation will still trace through the clock divider and include the Low Power logic sinks in the CLK1 skew group in order to balance them.

    It is possible to explicitly define in SDC that the CLK1 clock and the div4 generated clock do not interact with each other.

    This constraint:

    set_clock_groups -logically_exclusive -group {CLK1} -group {clk_div4}

    defines the two clocks as logically isolated from each other, so any paths between them are not valid timing paths.

    That should direct CTS that there is no need to balance the two clock domains together.

    Unfortunately, this constraint is not currently respected by CCOpt (it is not supported in EDI 14.2), so this does not help us today.

    This method might become feasible in future EDI releases.

    Method 2: User Modification of Skew Groups

    Since the SDC method above does not work, there is currently no way to steer the clock tree spec generation itself toward the balancing we want.

    However, it is fairly simple to intercept the CTS configuration after the clock tree extraction and modify the CCOpt skew groups to achieve the required behavior.

    Let's start by writing the CCOpt spec to a file so we can see what we are starting from:

    create_ccopt_clock_tree_spec -file ccopt_multi.spec

    This shows the clock trees and skew groups that are being created for our design. Among other statements in the ccopt_multi.spec file are the following clock_tree and skew_group definitions:

    Clock trees:

    create_ccopt_clock_tree -name CLK1 -source clk -no_skew_group

    create_ccopt_generated_clock_tree -name clk_div4 -source clk_divider/ff2/Q -generated_by clk_divider/ff2/CK

    Skew Groups:

    create_ccopt_skew_group -name CLK1/main -sources clk -auto_sinks

    create_ccopt_skew_group -name clk_div4/main -sources clk_divider/ff2/Q -auto_sinks

    create_ccopt_skew_group -name CLK1/div4 -sources clk -auto_sinks

    create_ccopt_skew_group -name clk_div4/div4 -sources clk_divider/ff2/Q -auto_sinks

    Remember that in CCOpt:

    Clock trees define what is built as clock and DRV fixed

    Skew groups define what is balanced and control insertion delay optimization

    Our clock tree network is already correct; we do not need to change anything here.

    We just need to fine-tune the skew groups to get the desired balancing behavior.

    To list all the sinks of each skew group, we can use a Tcl procedure like the following to print them out to the screen. This could easily be modified to dump them out to a file for larger designs if necessary:

    # Print the active, non-ignored sinks of every CCOpt skew group.
    proc show_sinks {} {
        foreach sg [get_ccopt_skew_groups *] {
            puts "Skew group: $sg"
            foreach sink [get_ccopt_property sinks_active -skew_group ${sg}] {
                # Skip pins that are already marked as ignore pins
                if {[get_ccopt_property sink_type -pin $sink] != "ignore"} {
                    puts $sink
                }
            }
        }
    }
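
    As mentioned above, the listing can be redirected to a file for larger designs. The following is a minimal sketch of that variant, assuming it is run inside EDI after the clock tree spec has been created. It uses the same two get_ccopt_property queries as show_sinks; the proc name dump_sinks and the default file name are hypothetical.

```tcl
# Sketch: write the non-ignored sinks of every skew group to a text file.
# dump_sinks and skew_group_sinks.txt are illustrative names, not tool built-ins.
proc dump_sinks {{filename skew_group_sinks.txt}} {
    set fh [open $filename w]
    foreach sg [get_ccopt_skew_groups *] {
        puts $fh "Skew group: $sg"
        foreach sink [get_ccopt_property sinks_active -skew_group $sg] {
            # Same filter as show_sinks: leave out pins already ignored
            if {[get_ccopt_property sink_type -pin $sink] ne "ignore"} {
                puts $fh "  $sink"
            }
        }
    }
    close $fh
}
```

    Calling dump_sinks with no argument writes to the default file; passing a path, for example dump_sinks sinks_before.txt, makes it easy to compare the sink lists before and after the skew groups are modified.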

    When we list out all the skew_group sinks we see that the skew_group for CLK1 includes all the sinks for mult_a, mult_b and mult_c. So, to keep the CLK1 domain clock tree from being balanced with the mult_c logic, we need to define the mult_c sinks as ignore pins using this command sequence:

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[7]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[6]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[5]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[3]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[2]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[1]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/b_reg_reg[0]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/a_reg_reg[7]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/a_reg_reg[6]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/c_reg[12]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/c_reg[14]/CK}

    modify_ccopt_skew_group -skew_group CLK1/main -add_ignore_pins {mult_c/c_reg[15]/CK}

    (etc. There are 32 of these, one for each mult_c sink pin. The pin names are enclosed in braces so that Tcl does not treat the bus index in square brackets as a command substitution.)
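
    Rather than listing all 32 pins by hand, the same result can probably be reached with a short loop. This is a sketch under two assumptions: that the sinks_active property returns hierarchical pin names as plain strings, and that every sink to be ignored lives under the mult_c/ instance. It uses only the commands already shown in this note.

```tcl
# Mark every active sink of CLK1/main under mult_c/ as an ignore pin.
# Assumes sinks_active returns pin names such as mult_c/c_reg[15]/CK.
set sg CLK1/main
foreach sink [get_ccopt_property sinks_active -skew_group $sg] {
    # The glob pattern contains no brackets, so bus indices in the pin
    # name need no special escaping here.
    if {[string match "mult_c/*" $sink]} {
        modify_ccopt_skew_group -skew_group $sg -add_ignore_pins $sink
    }
}
```

    Because the pin name is passed through the variable $sink, the square brackets in bus indices are not re-evaluated by Tcl. Re-running the sink listing afterwards is a simple way to confirm that no mult_c pins remain active in CLK1/main.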

    In the multi-mode SDC case, the CCOpt clock spec generation creates a skew_group for each clock domain in each constraint mode, so we also need to ignore the mult_c sinks in the CLK1/div4 skew_group. Alternatively, for our example, we could simply delete the CLK1/div4 skew group altogether with this command:

    delete_ccopt_skew_groups CLK1/div4

    Final Clock Structure with Modified Skew Groups

    After adding these commands to our run script just before the ccopt_design -cts command, we can now run through the EDI flow again and see the resulting clock tree structure and how it is impacted by our modification of the skew_groups.
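
    For reference, the placement of the skew group modifications in the run script can be sketched as follows. The flow commands are the ones from the default flow earlier in this note; the file name skew_group_edits.tcl is hypothetical and stands for wherever the modify_ccopt_skew_group and delete_ccopt_skew_groups commands are kept.

```tcl
source init.globals
init_design
floorPlan -site CoreSite -r 1 0.35
placeDesign -noPrePlaceOpt
optDesign -preCts

# Create the default clock trees and skew groups from the SDC constraints
create_ccopt_clock_tree_spec -immediate

# Hypothetical file: adjust the skew groups after spec creation
# and before CTS runs, so the changes are not overwritten
source skew_group_edits.tcl

ccopt_design -cts
```

    The ordering is the important part: the skew groups must exist before they can be modified, and the modifications must be in place before ccopt_design builds the tree.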

    We now have the desired clock tree balancing (the CLK1 domain is drawn on the right side of this diagram).

    Both Hi Speed logic and the Dual-mode logic in the CLK1 domain have minimal insertion delay compared to the div4 clock domain logic.

    The Dual-mode logic is reasonably balanced with the Low Power logic when the clock select line of the mux is selecting the divided clock, at more than 1 ns insertion delay.

    And the Dual-mode logic is also balanced with the Hi Speed logic when the clock select line of the mux is selecting the high speed clock, at less than 0.5 ns insertion delay.

    Summary

    So, we have demonstrated that the CCOpt clock spec generation will trace through clock generators and force the balancing of generated clocks with their master clocks by default.

    We have also shown that this behavior is not impacted for most designs by the use of multi-mode constraints with case_analysis statements on the clock mux select lines.

    We also saw that the presence of SDC set_clock_groups constraints that isolate the clocks is not currently effective in decoupling the balancing constraints for generated and master clocks.

    Finally, we demonstrated a method to display skew_group sinks so we could determine exactly what the tool was trying to balance. We then showed how a sequence of commands could be used to modify the skew_groups to remove sinks that were causing the master clock domain to be balanced with the generated clock domain. The result is an optimal clock tree structure with minimal insertion delay to the sinks of the high speed master clock, while maintaining balancing of the generated clock domain.

    Support

    Cadence Online Support provides access to support resources, including an extensive knowledge base, access to software updates for Cadence products, and the ability to interact with Cadence Customer Support. Visit http://support.cadence.com.

    Feedback

    Email comments, questions, and suggestions to [email protected].