

Abstract: This paper describes a method of converting floating-point expressions into equivalent fixed-point code in DSP software. Replacing floating-point expressions with specialized integer operations can greatly improve the performance of embedded applications. The method is developed for Direct-Form I filters with constant coefficients and input variables whose low/high bounds are known. Two conflicting objectives are considered simultaneously: computational complexity and accuracy loss. The algorithm presented here can construct multiple fixed-point solutions for the same floating-point code, from high-complexity-high-accuracy to low-complexity-low-accuracy. A cost function drives the data flow transformation decisions; by changing its coefficients, different fixed-point forms can be obtained. The data flow transformation takes very little time: less than 100 milliseconds for a 32-tap FIR filter. The generated fixed-point code is tested on 8-bit (AVR ATmega), 16-bit (MSP430), and 32-bit (ARM Cortex-M3) microcontrollers. In all cases it executes faster than the equivalent floating-point code.

I. INTRODUCTION

Floating-point code is often inappropriate for embedded applications. The computing capabilities of microcontrollers are generally limited and, in most cases, no hardware support for floating-point operations is provided. To overcome this problem, the mathematical function contained in the floating-point code must be expressed with fixed-point code. Doing this manually, that is, rewriting a floating-point function by hand as a sequence of integer operations, can be a difficult task.

II. RELATED WORK

There has been a significant effort to develop frameworks that automate the conversion of floating-point code to integer code [1]-[4]. Two distinct approaches can be identified: statistical (simulation-based) and analytical. The difference between them lies in the way the dynamic intervals of variables are computed. A statistical method performs a series of simulations and may require a significant amount of time. An analytical method is necessarily based on a concrete data model (for example, propagation rules) and can give precise information in a very short time.

One of the first floating-point to fixed-point converters, AUTOSCALER for C, is described in [1]. It is able to optimize the number of shift operations by equalizing the word length of specific variables or constants. In [2] a method is presented that performs CDFG optimizations under accuracy constraints; it makes extensive use of the equations representing the system. In [4] a genetic algorithm is described that is employed to find the optimal trade-off between signal quality and implementation complexity.

This paper is based on previous work detailed in [5]. Paper [5] addresses the same task as this paper, but is primarily focused on generating ANSI C compliant code.

III. METHOD OVERVIEW

The method presented in this paper is designed to transform dot products with constant coefficients (floating-point literals) and integer variables with known intervals:

\sum_{i=0}^{N} a_i x_i \qquad (1)

    Failing to state the correct intervals of the integer

    variables can lead to erroneous results. The manipulation of

    intervals [6] is central to the optimization procedure.
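The starting point of the method, expression (1), can be written in C as a plain floating-point dot product. The coefficients and the input bound below are hypothetical, chosen only to illustrate the kind of expression being converted:

```c
#include <assert.h>

/* A dot product of the form (1): constant coefficients a[i] and
   integer inputs x[i] with known bounds. The coefficients and the
   input interval are hypothetical, for illustration only. */
#define N 4

static const float a[N] = { 0.25f, -0.5f, 0.125f, 0.75f };

/* x[i] is assumed to lie in [0; 1023]. */
float dot_float(const int x[N]) {
    float acc = 0.0f;
    for (int i = 0; i < N; i++)
        acc += a[i] * (float)x[i];
    return acc;
}
```

This is the floating-point reference form; the method replaces it with an equivalent sequence of integer operations.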

The following types of nodes are used to represent the data flow:

Stand-alone nodes: nodes whose values do not depend on other nodes. Stand-alone nodes are used to represent constants and parameters: a_i, x_i, etc.

Operators: add, multiply, shift, and change sign. An operator has one or more operands (child nodes). These can be stand-alone nodes or other operators.

A node has an associated interval and fractional word length (FWL). The interval represents the extreme values of the node's run-time integer (the memory or register variable). Scaling a node is a frequent operation encountered in the optimization process. Note: the FWL of a node is an integer value. A scaling operation does not change the fixed-point (or real) value of a node. The node interval is always altered together with the node FWL.

A node can be realized in code as a 16-bit or 32-bit integer. The values that pass through a node should carry as many significant bits as possible, to preserve precise information, but, on the other hand, must be limited to a specific interval. The FWL is not necessarily the same for every node.
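The relationship between a node's raw integer, its interval, and its FWL can be sketched as follows. The struct layout and names are illustrative, not the paper's Java implementation; they show how a right shift changes the interval and the FWL together while preserving the represented real value:

```c
#include <assert.h>
#include <stdint.h>

/* A node value in this sketch is a raw integer plus a fractional
   word length (FWL): real value = raw / 2^fwl. */
typedef struct {
    int32_t raw;  /* run-time integer */
    int fwl;      /* fractional word length */
} node_t;

/* Scale a node down by s bits: the raw integer (and hence the node
   interval) shrinks, the FWL decreases with it, and the represented
   real value stays the same up to rounding. */
node_t scale_down(node_t n, int s) {
    node_t r = { n.raw >> s, n.fwl - s };
    return r;
}

/* Real (fixed-point) value represented by a node. */
double node_value(node_t n) {
    return (double)n.raw / (double)(1L << n.fwl);
}
```

For example, the value 0.75 stored as 3072 with FWL 12 remains 0.75 after scaling down by 4 bits (raw 192, FWL 8), while the integer interval has shrunk by a factor of 16.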

    The data flow structure is modified in steps. A step can

    be viewed as an inference operation:

Floating-point to fixed-point code conversion with variable trade-off between computational complexity and accuracy loss

Alexandru Bârleanu, Vadim Bitoiu and Andrei Stan, Member, IEEE


1. Effect. The integer interval of a node must be decreased or increased.

2. List of possible causes. A list of candidate data flow transformations is constructed.

3. Best cause selection. The optimal data flow transformation is selected with the help of a cost function whose coefficients represent, in essence, the importance given to the computational effort and to the accuracy loss.

The method described in this paper is implemented in Java (mostly because of Java's support for object-oriented programming and the advanced IDEs available).

IV. DATA FLOW TRANSFORMATION

A. Problem Difficulty

The initial form of the data flow is a true image of the floating-point dot product expression. There is one add operator with N child nodes: a multiply operator for each a_i x_i term, as in (1).

Node a_i has a very long fractional part (24 bits) if the floating-point literals of the dot product expression are parsed as single-precision values. If node x_i has, for example, a run-time interval equal to [0; 1023], then the multiply node overflows. Four data types are permitted for a node: signed/unsigned 16-bit and 32-bit integers. To prevent the multiply node from overflowing, its integer interval must be made smaller (the fractional part must be decreased). There are two possibilities: shift a_i to the right at design time or shift x_i to the right at run time. Each solution has its own impact on the computational complexity and accuracy of the data flow. In this case it is simple to decide which solution to select: discarding some least significant bits from the very precise constant at design time incurs no run-time overhead and causes less accuracy loss than the right shift of x_i. But this is a rare case.
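The design-time shift can be sketched as follows, assuming a hypothetical coefficient of 0.9 parsed with a 24-bit fractional part and an input interval of [0; 1023]:

```c
#include <assert.h>
#include <stdint.h>

/* Design-time right shift of a constant coefficient so that the
   product a_i * x_i fits a signed 32-bit integer for x_i in
   [0; 1023]. The coefficient 0.9 is an illustrative value. */

/* 0.9 with a 24-bit fractional part: round(0.9 * 2^24). */
#define A_Q24 15099494L

/* 15099494 * 1023 would exceed INT32_MAX, so 3 least significant
   bits of the constant are discarded at design time (FWL 24 -> 21).
   This costs nothing at run time. */
#define A_Q21 (A_Q24 >> 3)

/* Run-time multiply; the result has a 21-bit fractional part. */
int32_t mul_q21(int32_t x) {
    return (int32_t)(A_Q21 * x);
}
```

With the worst-case input x = 1023 the product is 1930847028, which fits a signed 32-bit integer, whereas the unshifted constant would have overflowed.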

In the general case, a solution is either low-complexity-low-accuracy or high-complexity-high-accuracy (not low-complexity-high-accuracy). This makes it difficult to compare candidate solutions. It is necessary to determine the complexity and accuracy of a particular solution quantitatively. There is no other way, because the number of alternative possibilities grows very quickly with the size of the data flow area below the node whose integer interval must be modified.

B. Computational Complexity

A data flow node has an associated computational complexity. This is an estimator of the computational effort required to obtain the node value at run time. At design time the exact computational complexity is difficult to evaluate. The Java application simply counts the operators contained in the data flow area below the target node. This is a sufficiently good approximation.

C. Node Error Interval (Drift)

Every data flow node stands for a fixed-point value which can vary within a specific interval. This interval refers to the node value at run time, which is in essence an integer very close to the ideal infinite-precision value. Thus, every node has a specific error. The interval of the infinite-precision value that can pass through a node, and the corresponding interval of the run-time integer, can be calculated at design time, which means that, for each node, the interval of the error can be obtained without actually running the code.

The error interval of an operator node can be calculated using the integer interval and the error interval of every child node [6]. For example, the error interval of an add node can be calculated by adding the error intervals of its child nodes.

The low/high values of an error interval are considered to be absolute values, not relative values or, in other words, units in the last place (ulps) [7].
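The propagation rule for add nodes can be sketched as follows; the interval type and the function name are illustrative, not taken from the paper's implementation:

```c
#include <assert.h>

/* Error (drift) interval of a node: absolute low/high error bounds. */
typedef struct {
    double lo, hi;
} interval_t;

/* Error interval propagation for an add node: the error interval
   of the sum is the sum of the child error intervals. */
interval_t add_drift(interval_t a, interval_t b) {
    interval_t r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}
```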

D. Multi-Objective Search

The simplest way to decrease or increase the integer interval of a node is to perform a shift operation. This can be done at design time if the node represents a constant value, or at run time if the node represents an operator and (very important) the integer interval is valid (does not overflow). But these are not frequent cases. The most usual situation is that an operator overflows and its integer interval must be decreased. The problem is that the integer interval of an operator cannot be changed directly; the integer intervals of the child nodes must be altered. To force the integer interval of an add node it is necessary to force the integer interval of every child node (logical AND). To force the integer interval of a multiply node it is necessary to force one or more child nodes (logical OR).

In the general case, there are multiple ways to increase or decrease the integer interval of a node (logical OR). One possible way is called a solution. A solution involves a number of data flow changes (logical AND). A change can be viewed, in the simplest way, as a node switch: the child node of an operator is replaced with another child node. A change is invertible: a change can be applied and can be undone. This is a very important feature. Because a change is always part of a solution, it makes sense to say that a solution is applied or undone (meaning that all the changes included are applied or undone).

Multiple solutions can be viewed as concurrent if all of them are built with the same purpose, for example, to make the integer interval of a specific node smaller. But


each solution consists of a number of particular (different) changes. Thus each solution has its own computational complexity and influence on accuracy. The Java application compares concurrent solutions by these two metrics. To evaluate the complexity and error interval of a solution, the solution is applied (some child nodes are disconnected and others are connected).

This algorithm step (switching between different solutions) is essentially a search. The method described in this paper resembles other methods with regard to the way the data flow is implemented (types of nodes) and the usage of operator properties (value propagation). From this point of view, the method described here can be considered analytical. But it still performs a search. It can be regarded as search-based, yet it is very different from other search-based methods: the method described in this paper connects and disconnects various data flow fragments, while other methods scan very large multi-dimensional spaces that represent fractional word lengths.

    To compare several concurrent solutions it is necessary to

    combine, for each solution, the complexity and error

    interval into a single indicator. In order to do this, a linear

    function is used:

\mathrm{cost} = k_1 \cdot \mathrm{complexity} + k_2 \cdot \mathrm{error} \qquad (2)

Varying the cost function coefficients can, for example, favor solutions which introduce considerable computational overhead but give high-accuracy results, in place of low-complexity-low-accuracy solutions.

    Although the cost function has two parameters, the

    variation space is one-dimensional. The cost function can be

    represented geometrically as a line which passes through the

    origin point in a two-dimensional space (Fig. 1).

    Fig. 1: Solution space

In Fig. 1 the cost of one solution is directly proportional to the shortest distance to the cost function line. The complexity coefficient (k1) and the error coefficient (k2) together determine the slope of the cost function line. Two coefficients are used because otherwise it would be impossible to represent the vertical line (+INF slope). For simplicity, the sum of the cost coefficients is kept constant:

k_1 + k_2 = 1 \qquad (3)

The cost of one solution has no meaning if considered separately. It makes sense only in comparison with the costs of other solutions.
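A minimal sketch of the cost function (2), with complementary coefficients as in (3). The solution fields and function names are illustrative:

```c
#include <assert.h>

/* A candidate solution, summarized by its two metrics. */
typedef struct {
    double complexity;  /* operator count below the node */
    double error;       /* magnitude of the drift interval */
} solution_t;

/* Linear cost function (2); the coefficients are complementary
   as in (3), so only k1 needs to be given. */
double cost(solution_t s, double k1) {
    double k2 = 1.0 - k1;
    return k1 * s.complexity + k2 * s.error;
}

/* Costs are only meaningful relative to each other: return 1 if
   the first of two concurrent solutions is cheaper. */
int prefer_first(solution_t a, solution_t b, double k1) {
    return cost(a, k1) < cost(b, k1);
}
```

With k1 = 1 the comparison looks only at complexity, so a low-complexity-low-accuracy solution wins; with k1 = 0 only the error matters and the high-accuracy solution wins.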

E. Transformation Example

Fig. 2 shows two extreme data flow structures obtained for a dot product with 12 terms. (Such images can be created with Graphviz software.)

Fig. 2: High-level view of two extreme data flow structures obtained for a dot product expression with 12 terms: low-complexity-low-accuracy (left) and high-complexity-high-accuracy (right).

In Fig. 2, the data flow on the left-hand side has a specific pattern. The fractional word length is the same for most of the nodes. The number of operators is minimal (complexity = 26). In contrast, the data flow on the right-hand side is highly developed and does not have a specific pattern (at the global level). Some nodes have very long fractional parts (which is not visible in the figure). The number of operators is maximal (complexity = 53).

V. DESIGN-TIME TECHNIQUES

A. Node Cache Information

The optimization process makes extensive use of node attributes like the integer interval and the drift. For operator nodes this information depends on the child nodes (operands) and must be computed. The time required for this can become significant for large data flows: the high nodes generate a lot of subsequent calls to nodes located below to get the necessary information. This traffic can be reduced. The data flow structure is not itself very dynamic: a change that is applied in the optimization process has a limited impact area. In many cases the integer interval and drift information can be reused. For this


purpose, each operator node is designed with its own cache. In this way, obtaining the integer interval or the drift information can be very fast, unless the cache of the target node has been invalidated. The invalidation of the cache is crucial. Doing this for fewer nodes than required may lead to erroneous results, and doing it for more nodes than required can lower the cache hit rate.

The cache invalidation is triggered in the following manner: whenever a node N is connected to an operator F, a message of change is propagated along the chain of operators from F to the root operator to invalidate the corresponding cache data.
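The invalidation walk can be sketched in C as follows; the node structure is illustrative (the paper's implementation is in Java):

```c
#include <assert.h>
#include <stddef.h>

/* An operator node with a parent link and a cache flag for its
   computed interval/drift information. */
typedef struct op_node {
    struct op_node *parent;  /* NULL at the root operator */
    int cache_valid;         /* cached interval/drift usable? */
} op_node_t;

/* Propagate a change message from operator f up to the root,
   marking every cache on the way as stale. */
void invalidate_up(op_node_t *f) {
    for (op_node_t *n = f; n != NULL; n = n->parent)
        n->cache_valid = 0;
}

/* Small self-check: a three-operator chain, invalidated from the
   bottom operator; returns how many caches were invalidated. */
int demo_invalidated_count(void) {
    op_node_t root = { NULL, 1 };
    op_node_t mid  = { &root, 1 };
    op_node_t leaf = { &mid, 1 };
    invalidate_up(&leaf);
    return (!root.cache_valid) + (!mid.cache_valid) + (!leaf.cache_valid);
}
```

Only the chain from the changed operator to the root is touched, which is what keeps the impact area of a change limited.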

The design time is reduced considerably with node cache information. This is easy to observe as the filter length is increased. Without caching, transforming the data flow of a dot product with 16 terms can take more than 10 seconds. With the cache mechanism turned on, the data flow is optimized in tens of milliseconds.

Fig. 3 shows the execution time of the optimization procedure for dot products with different lengths (node cache information is used).

    Fig. 3: Average data flow transformation time

B. Automatic Search of Data Flows

Varying the coefficients of the cost function leads to different data flow structures. The coefficients can be set by hand, but this is not very practical, because the coefficients themselves do not carry very much information (except for the extreme cases). In a concrete situation it might be more desirable to generate all the possible data flows, create code for all of them, and later select the most convenient function.

From a high-level point of view, the search method, whatever it is, should traverse the one-dimensional search space from 0 to 1, generate various data flows, and pick up the unique ones. The ideal search method should generate as few equivalent data flows as possible. Two data flows are considered equivalent if, while traversing both structures depth-first in parallel, every node that is encountered has the same type (add, multiply), the same integer interval (low/high values, fractional length) and the same number of child nodes as its mirror node.

Performing a sequential search can be very time-consuming. Varying a coefficient from 0 to 1 using a constant step and generating all the corresponding data flows can be very inefficient. A large number of the data flow structures are equivalent, and the generation of a single one requires a significant amount of time: for example, a dot product with 10 terms requires 10-15 milliseconds. On the other hand, the increment step has to be small enough to capture all the possible data flow structures.

Fortunately, it is possible to perform a more selective search. It is just necessary to use the following context information: if the end-points of a segment inside the search space generate the same data flow structure, then it does not make sense to sweep this particular segment; no new data flow structures can be discovered in this area. But if the end-points of a segment generate different data flows, then the segment should be halved and the same procedure should be applied further to the resulting segments. This method is very efficient, because the number of discarded data flow structures is minimal.
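The segment-halving search can be sketched as follows. The signature() function below is a hypothetical stand-in for "transform the data flow with this coefficient and summarize the resulting structure"; here it simply models three distinct data flows over [0, 1]:

```c
#include <assert.h>

/* Hypothetical stand-in: maps a complexity coefficient to a data
   flow signature (three distinct structures over [0, 1]). */
static int signature(double k1) {
    if (k1 < 0.3) return 0;
    if (k1 < 0.7) return 1;
    return 2;
}

/* Count new signatures strictly inside (lo, hi). If both end-points
   yield the same signature, the segment is skipped; otherwise it is
   halved and the procedure is applied to the resulting segments. */
static int search(double lo, double hi, double eps) {
    int slo = signature(lo), shi = signature(hi);
    if (slo == shi || hi - lo < eps)
        return 0;                       /* nothing new inside */
    double mid = 0.5 * (lo + hi);
    int extra = (signature(mid) != slo && signature(mid) != shi);
    return extra + search(lo, mid, eps) + search(mid, hi, eps);
}

/* Distinct data flows over [0, 1]: here the two end-points differ,
   plus whatever is discovered inside. */
int distinct_flows(void) {
    return 2 + search(0.0, 1.0, 1e-6);
}
```

Segments whose end-points agree are never subdivided, so the recursion only refines around the few coefficient values where the structure actually changes.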

Fig. 4: Complexity of the data flows found for a dot product with 20 terms (partial view). The complexity coefficient is swept from 0 to 1, while the drift coefficient is set to the complementary value, as in (3). A horizontal segment represents one or more data flows with the same complexity.

Given an arbitrary filter, the number of non-equivalent data flows that can be found by varying the cost coefficients is proportional to the filter length. As a rule, if N is the filter length, then the selective search method yields 0.5N to 1.5N non-equivalent data flows.

VI. CODE GENERATION

Generating fixed-point C code for a particular data flow is, in essence, a straightforward process. However, there are two important aspects: the declaration of the intermediary variables and the explicit data type casts [8].

The C code can be generated in two very different forms: as a long sequence of short assignments (one operator in every right-hand side) and a lot of intermediary variables, or as a single, very long arithmetic expression and a lot of


parentheses. Although both forms of code are hard to read, the first variant can be used for debugging purposes, because all the intermediary variables are declared and can be watched step by step. The second variant is preferable in case no compiler optimizations are applied.

Generating a very long line with arithmetic operators poses some problems, because the compiler must deduce the data type of some subexpressions. (In case the intermediary variables are declared, their data type is clearly stated.) Examples:

Short multiplication. The compiler might consider that the result of a multiplication between two 16-bit integers is a 16-bit integer. This is in general not desirable, because most multiply nodes produce 32-bit values; so, when the code is generated, short integers that must be multiplied are explicitly cast to long integers.
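A minimal illustration of the widening cast; the values are arbitrary, and the truncated variant shows what a 16-bit int type would keep of the same product on a 16-bit target:

```c
#include <assert.h>
#include <stdint.h>

/* With an explicit cast to long, the full 32-bit product of two
   16-bit operands is kept. */
int32_t mul_wide(int16_t a, int16_t b) {
    return (int32_t)((long)a * (long)b);
}

/* Truncated to 16 bits, as a 16-bit int would store it: only the
   low 16 bits of the product survive. */
int16_t mul_short(int16_t a, int16_t b) {
    return (int16_t)((long)a * (long)b);
}
```

300 * 300 = 90000 does not fit in 16 bits; the truncated result keeps only 90000 mod 65536 = 24464.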

Signed/unsigned arithmetic. There are cases when a signed integer is added to an unsigned integer and the result is known to be nonnegative, but the compiler assumes that it is signed. If such an integer must be shifted to the right, the compiler might perform an arithmetic (not logical) shift, which is wrong, because the most significant bit would be interpreted differently. To avoid this, additional casts are inserted when generating the code.
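The difference between the two shifts can be demonstrated with a cast to an unsigned type:

```c
#include <assert.h>
#include <stdint.h>

/* Right shift of a signed value: for negative operands this is
   implementation-defined in C, and most compilers replicate the
   sign bit (arithmetic shift). */
int32_t shift_arith(int32_t v, int s) {
    return v >> s;
}

/* Casting to unsigned first forces a logical shift: zero bits are
   shifted in, so the most significant bit is treated as a value
   bit, which is what the generated code needs. */
uint32_t shift_logical(int32_t v, int s) {
    return (uint32_t)v >> s;
}
```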

VII. RESULTS

A. Accuracy

When a fixed-point C function is generated, the error interval of its result is already known. This is the worst-case indicator computed at design time: the drift of the data flow root node.

Note: The error is considered as the difference between the floating-point value obtained with the original floating-point expression (the reference value) and the integer value obtained with the generated fixed-point code.

A more relevant accuracy indicator is the signal-to-quantization-noise ratio (SQNR), computed with the mean of the absolute reference values (S) and the mean of the absolute error values (N):

\mathrm{SQNR} = 10 \log_{10}(S / N) \qquad (4)

The SQNR values are computed on a high-speed computer (not on microcontrollers). It is important to run the fixed-point code with as many different input parameters as possible.

Fig. 5 illustrates the accuracy and the complexity of the solutions found (automatically) for a dot product with 24 terms. The accuracy is represented as the difference between the highest possible (ideal) SQNR and the SQNR of the generated code. The highest possible SQNR is defined as the SQNR of a function that would return the integer nearest to the ideal floating-point value.

    Fig. 5: Solutions found for a dot product with 24 terms, random coefficients

    within the interval [-1, 1] and variables within the interval [0, 4095]

The SQNR degrades as the number of dot product terms grows and, especially, as the complexity cost coefficient is increased.

B. Speed

The execution time of the generated fixed-point code depends on many factors:

The filter. The number of data flow nodes is directly proportional to the number of filter taps. This holds true before and after the data flow is optimized.

The cost function. Varying the cost coefficients leads to specific data flow transformation decisions (as discussed in the section Multi-Objective Search).

The code generation. If the fixed-point code is generated as one very long expression (everything inline), then most of the intermediary values are allocated in registers and, in effect, the number of load/store operations is decreased. This is especially important when no compiler optimizations are applied.

The compiler. Turning the compiler optimizations on can greatly accelerate the fixed-point code. This is worth considering especially when the intermediary variables are declared.

The microprocessor. The microprocessor capabilities are not regarded in detail, because the main purpose is to generate platform-independent code, not assembler. The only thing assumed is that there is no floating-point unit, which is characteristic of embedded microprocessors. The microprocessors used for testing are shown in Table I. Some instruction sets include integer division (something that can be used instead of a bitwise shift), but this is not a general feature and is not considered.


TABLE I
MICROPROCESSORS USED FOR TESTING

Microprocessor   Register width   Compiler
ATmega16         8-bit            IAR
MSP430F149       16-bit           IAR
STM32F           32-bit           gcc
LPC1768          32-bit           IAR

The speed factor between the fixed-point code and the floating-point code can vary within a wide range. One very important cause is the cost function used throughout the data flow optimization. For low-complexity-low-accuracy solutions the speed can be increased by 15 times or more. For high-complexity-high-accuracy solutions the speed can be increased by at least 3 times. (These results are obtained with floating-point dot products with 4-32 terms, generated randomly.)

C. Memory Usage (Flash and SRAM)

The fixed-point code in general takes slightly more Flash space (code memory) than the floating-point code.

The SRAM (data memory) usage is determined mainly by the stack requirements. The fixed-point code, if generated as a single arithmetic expression (no intermediary variables), occupies almost no stack space. The floating-point code needs a specific amount of stack, because it calls low-level functions.

VIII. CONCLUSIONS

A method for transforming floating-point expressions into integer C code for embedded processors has been described. Direct-Form I non-adaptive filters with predefined input bounds are targeted. The algorithm presented uses a parameterizable cost function and is able to produce multiple solutions for the same given floating-point expression.

    The method can be applied for FIR filters, as well as for

    IIR filters if the intervals of the output variables can be

    specified. (There is work in progress for recursive filters.)

    The generated code is tested on 8-bit, 16-bit, and 32-bit

    microprocessors, using different compilers.

There can be two major realizations of the presented algorithm: as a stand-alone application for code conversion (as it is currently implemented) or as a separate type of compiler IR optimization (which requires integration into a compiler system).

    ANNEX

    A floating-point expression is converted to fixed-point

    code, for illustrative purposes:

    0.023159746f*x[0]+0.007362494f*x[1]+0.109808266f*x[2]-

    0.8996903f*x[3]-0.52352905f*x[4]+0.34677517f*x[5]+

    0.50765723f*x[6]+0.9989124f*x[7]+0.5545187f*x[8]-

    0.73752284f*x[9]

This expression can be viewed as a FIR filter. The interval of the input variables x[0-9] is set to [0; 4095]. The coefficients are generated randomly in the interval [-1; 1]. The conversion to fixed-point takes 106 milliseconds and yields 11 non-equivalent data flows. ANSI C integer code is generated. Here is the compact form of one solution (code without intermediary variables):

((unsigned long)(1159921664L +
  (((((unsigned long)(((((unsigned long)(((((unsigned long)7720 * (unsigned long)x[1])
    + ((unsigned long)24284 * (unsigned long)x[0]))
    + (115142L * (unsigned long)x[2]))
    + (532317L * (unsigned long)x[6])) >> 2)
    + ((unsigned long)(1047435L * (unsigned long)x[7]) >> 2))
    + ((unsigned long)(581455L * (unsigned long)x[8]) >> 2))
    + (90905L * (unsigned long)x[5])) >> 1)
    - ((unsigned long)(193337L * (unsigned long)x[9]) >> 1))
    + ((-117924L) * (signed long)x[3]))
    + ((-68620L) * (signed long)x[4]))) >> 17) - 8849

The accuracy of this fixed-point code is estimated by running 1.9e+6 random test cases.

The SQNR of the fixed-point code is 38.694524 dB. This value is 0.000098 dB less than the ideal SQNR. The error distribution is as follows: in 99.80% of the cases the result of the fixed-point code is the same as the integer nearest to the floating-point expression, in 0.07% of the cases the error is 1, and in 0.13% of the cases the error is -1.

IAR Embedded Workbench for ARM is used to measure the performance of the integer code. The LPC1768, an ARM Cortex-M3 microprocessor, is selected as the target architecture. Without compiler optimizations, in the simulator, the floating-point code takes 737-754 cycles and the integer code takes 50 cycles. Thus, the execution time is decreased by approximately 15 times.

REFERENCES

[1] K. I. Kum, J. Kang, W. Sung, "AUTOSCALER for C: An Optimizing Floating-Point to Integer C Program Converter for Fixed-Point Digital Signal Processors," IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 47, issue 9, pp. 840-848, Sep. 2000.

[2] D. Menard, D. Chillet, F. Charot, O. Sentieys, "Automatic Floating-point to Fixed-point Conversion for DSP Code Generation," in Proc. of the 2002 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, Oct. 2002.

[3] C. Shi, R. W. Brodersen, "An Automated Floating-point to Fixed-point Conversion Methodology," in Proc. of IEEE International Conf. on Acoustics, Speech, and Signal Processing, vol. II, pp. 529-532, 2003.

[4] K. Han, "Automating transformations from floating-point to fixed-point," Ph.D. dissertation, Faculty of the Graduate School of the University of Texas at Austin, 1996.

[5] A. Bârleanu, V. Bitoiu, A. Stan, "Digital filter optimization for C language," in Advances in Electrical and Computer Engineering, to be published.

[6] R. B. Kearfott, "Interval Computations: Introduction, Uses, and Resources," Euromath Bulletin, vol. 2, no. 1, pp. 95-112, 1996.

[7] D. Goldberg, "What Every Computer Scientist Should Know About Floating-Point Arithmetic," ACM Computing Surveys (CSUR), vol. 23, issue 1, 1991.

[8] Programming languages - C, International Standard, ISO/IEC 9899:TC2.

[9] R. J. Mitchell and P. R. Minchinton, "A Note on Dividing Integers by Two," The Computer Journal, vol. 32, no. 4, Aug. 1989, p. 380.