risc_cc_gcc_ranjan_draft_v0.003
7/31/2019 RISC_CC_GCC_Ranjan_draft_v0.003
GCC COMPILER
Optimization Levels
------------
-O1 optimize for speed and size, but skip optimizations that take a great deal of compilation time
-O2 optimize further: enable nearly all supported optimizations that do not involve a space-speed tradeoff
-O3 enable more aggressive optimizations that may not improve performance on some programs
-O same as -O1
-Os optimize for size: enable the -O2 optimizations that do not typically increase code size
-O0 disable most optimizations (DEFAULT)
1. Introduction:-
The GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project that supports various programming languages. At the time of writing, the stable release of GCC is 4.6.2. It is written in C and C++; its platform is GNU and it is cross-platform (multi-platform).
GCC was originally written as the compiler for the GNU operating system. It is a compiler generation framework which generates production-quality optimizing compilers from descriptions of target platforms.
=> GNU is a recursive acronym for "GNU's Not Unix". It is a Unix-like OS developed by the GNU Project, aiming to be a complete Unix-compatible software system composed wholly of free software. GNU development was initiated by Richard Stallman in 1983. The latest alpha release of the GNU system is GNU 0.401, released on 1 April 2011, featuring GNU Hurd as the system's kernel.
Figure: Compiler flow of GCC compiler:-
C source code -> Parser -> C tree -> Genericizer -> GENERIC -> Gimplifier -> GIMPLE -> Tree-SSA Optimizer -> RTL Generator -> RTL -> RTL Optimizer -> optimized RTL -> Code Generator -> Assembly Code
GCC Front-end:-
The front end has to produce trees that can be handled by the back end, but initially the meaning of a tree was somewhat different for different language front ends. This was simplified with the introduction of GENERIC and GIMPLE, two new forms of language-independent trees that were introduced with the advent of GCC 4.0.
=> GENERIC simply provides a language-independent way of representing an entire function in trees.
GIMPLE is a simplified GENERIC, in which various constructs are lowered to multiple GIMPLE instructions. The C, C++ and Java front ends produce GENERIC directly in the front end. GENERIC is an intermediate representation language used by the middle end while compiling source code into executable binaries.
The middle stage of GCC does all the code analysis and optimization, working independently of both the compiled language and the target architecture, starting from the GENERIC representation and expanding it to Register Transfer Language (RTL, an intermediate representation (IR) that is very close to assembly language). In transforming the source code to GIMPLE, complex expressions are split into three-address code using temporary variables.
Optimization:-
Optimization can occur during any phase of compilation; however, the bulk of optimizations are performed after the syntax and semantic analysis of the front end and before the code generation of the back end. This part of the compiler is therefore commonly, if somewhat contradictorily, called the middle end.
Back-end:-
The behavior of the GCC back end is partly specified by preprocessor macros and functions specific to a target architecture, for instance to define the endianness, word size and calling conventions. The front part of the back end uses these to help decide RTL generation, although GCC RTL is nominally processor-independent. The actual RTL instructions forming the program representation have to comply with the machine description of the target architecture.
The machine description file contains the RTL patterns, along with operand constraints and code snippets to output the final assembly.
Figure: Architecture of GCC compiler:-
Front End: C, C++, Java -> AST -> GENERIC
Middle End: GIMPLE -> SSA -> Opt Pass 1 ... Opt Pass N -> un-SSA -> RTL
Back End: RTL -> ... -> Machine code
Debugging GCC:-
The GNU Debugger (gdb) is the primary tool used to debug GCC code. More specialized tools are Valgrind, for finding memory errors and leaks, and the graph profiler (gprof), which can determine how much time is spent in which routines and how often they are called; this requires the program to be compiled with profiling options.
2. Register Transfer Language (RTL):-
RTL is a low-level intermediate representation in which the instructions to be output are described one by one, in an algebraic form that describes what each instruction does. It is used in the later stages of compilation.
In GCC, RTL is generated from the GIMPLE representation, transformed by various passes in the GCC 'middle-end', and converted to assembly language. The name echoes register-transfer notation from hardware design, a convenient tool for describing the internal organization of digital computers in a concise and precise manner.
RTL is inspired by Lisp S-expressions:
e.g. register(a) = register(b) + register(c) is expressed in RTL as
    (set (reg:SI a)
         (plus:SI (reg:SI b) (reg:SI c)))
where SI specifies the access mode of each register. In this sense RTL resembles a register-transfer description in hardware design, which models a synchronous digital circuit in terms of the flow of digital signals (data) between hardware registers and the logical operations performed on those signals.
Example RTL representation of a 32-bit integer plus operation:
    (insn UID PREV NEXT (set (reg:SI 1)
                             (plus:SI (reg:SI 2) (reg:SI 3))))
where the insn node is a container whose UID field holds the unique identifier of the instruction and whose PREV and NEXT fields link it into the list of instructions. The set node represents a store: its first operand (pseudo register 1) is the destination and its second operand is the expression being stored. SI is the mode representing a 32-bit integer.
RTL advantages:-
It has only some dependency on the characteristics of the processor for which GCC generates code.
Knowledge of the target processor is not necessary to understand RTL code.
Its meaning doesn't depend on the high-level language (source language) of the program.
RTL semantics is target independent, making it possible to write common optimizers for all targets; however, the syntax (the set of allowed instructions) is target dependent. For instance, the i386 conditional jump is described as:
    (insn 56 13 57 (set (reg:CCGC 17 flags)
            (compare:CCGC (reg:SI 61)
                (reg:SI 62))))
    (jump_insn 57 56 33 (set (pc)
            (if_then_else (ge (reg:CCGC 17 flags) (const_int 0))
                (label_ref 22)
                (pc))))
The insns above show the presence of the flags register (register 17) on the i386 architecture and the split between the compare and the conditional jump.
Consider the example:
RTL for the i386 translation of a=a+1;
Dump file: test.c.141r.expand
    (insn 12 11 13 4 t.c:24 (parallel [
                (set (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars)
                            (const_int -4 [0xfffffffc])) [0 a+0 S4 A32])
                    (plus:SI (mem/c/i:SI (plus:SI (reg/f:SI 54 virtual-stack-vars)
                                (const_int -4 [0xfffffffc])) [0 a+0 S4 A32])
                        (const_int 1 [0x1])))
                (clobber (reg:CC 17 flags))
            ]) -1 (nil))
In this dump, 12 is the current instruction, 4 is the basic block, t.c:24 is the file name and line number, mem is a memory reference, /i marks a scalar that is not part of an aggregate, /f marks a register that holds a pointer, and SI is single integer mode.
Here the plus modifies the condition code register non-deterministically, so the pattern marks that register as clobbered.
=> Clobber register:- In inline assembly we use the clobber list to inform gcc that the values stored in these registers are used and modified by ourselves, so gcc will not assume that the values it loaded into these registers are still valid. We shouldn't list the input and output registers in this list.
If our instruction can alter the condition code register, we have to add "cc" to the list of clobbered registers; in RTL this appears as (clobber (reg:CC 17 flags)).
=> Control Flow Graph:- The CFG is a data structure built on top of the intermediate code representation (the RTL instruction chain or trees), abstracting the control flow behavior of the compiled function.
References:-
1. http://www.ucw.cz/~hubicka/papers/proj/node6.html
2. http://kcchao.wikidot.com/gcc-internals
3. http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
4. http://en.wikipedia.org/wiki/Intel_80386
3. Gray Box Probing of GCC:-
Black Box probing:- The user is only aware of what the software is supposed to do, but not how, i.e. only the input and output relationship of a system is examined.
White Box probing:- Examining the internal structures or workings of an application for a given input, as opposed to its functionality as in black box probing.
Gray Box probing:- A combination of both of the above: the internal structure and the algorithms of the application are used when inspecting the input and output relationship. It includes:
Overview of translation sequence in GCC
Overview of intermediate representations
Intermediate representations of programs across important phases
Basic Translation in GCC
Transformation from high level language to low level language
Target Independent        Target Dependent
GIMPLE ---> RTL           RTL ---> ASM
GIMPLE Passes             RTL Passes
Transformation Passes in GCC
The Middle-End of GCC performs SSA-based optimizations on GIMPLE, then converts the GIMPLE to RTL and does more optimizations. Finally it hands off the optimized RTL to the Back-End.
There are a total of 203 unique pass names initialized in ${SOURCE}/gcc/passes.c, but the total number of passes is 239, because some passes are run more than once.
Some passes are listed below:-
Parsing pass--
This pass reads the entire text of a function definition, constructing partial syntax trees.
The tree representation does not entirely follow C syntax, because it is intended to support other languages as well.
Language-specific data type analysis is also done in this pass; every node that represents an expression has a data type attached. Variables are represented as declaration nodes.
Constant folding and some arithmetic simplification are also done here.
RTL generation pass--
This pass converts the syntax tree into RTL code. It is actually done statement-by-statement during parsing, but for most purposes it is considered a separate pass.
Optimization is also done in this pass, and decisions are made about how best to arrange loops and how to output 'switch' statements.
The decision of whether the function can and should be expanded inline in its subsequent callers is made at the end of RTL generation.
The option '-dr' causes a debugging dump of the RTL code to be made after this pass. The dump file's name is made by appending '.rtl' to the input file name.
Figure: Basic translation in GCC:- Parse -> Gimplify -> Tree SSA Optimize -> Generate RTL -> Optimize RTL -> Generate ASM
Jump Optimization pass--
This pass simplifies jumps to the following instruction, jumps across jumps, and jumps to jumps.
It converts some code originally written with jumps into sequences of instructions that directly set values from the results of comparisons, if the machine has such instructions.
It deletes unreferenced labels and unreachable code (with some restrictions).
This pass is performed two or three times. The first time is immediately following RTL generation; the second is after common subexpression elimination (CSE), but only if CSE indicates that repeated jump optimization is needed.
The option '-dj' causes a debugging dump of the RTL code after this pass is run for the first time. The dump file's name is made by appending '.jump' to the input file name.
Register Scan pass--
This pass finds the first and the last use of each register, as a guide for CSE.
Considering all the passes together, they are broadly divided into two parts:
1. Passes on GIMPLE:- Nearly everything passes through here at least once. These passes also check, among other things, whether an expression is a language-specific construct.
2. Passes on RTL:- Optimizations are done here, along with generation of exception landing pads etc.
Passes on GIMPLE:-
Pass Group                                 Example                                                               Number of Passes
Interprocedural Optimization               Conditional Constant Propagation, Inlining, SSA Construction, LTO     49
Intraprocedural Optimization               Constant Propagation, Dead Code Elimination, PRE                      42
Loop Optimizations                         Vectorization, Parallelization                                        27
Remaining Intraprocedural Optimizations    Value Range Propagation, Rename SSA                                   23
Generating RTL                                                                                                   01
Lowering                                   GIMPLE IR, CFG Construction                                           12
Total Number of passes on GIMPLE                                                                                 154
Passes on RTL:-
Pass Group                         Example                                                                 Number of Passes
Intraprocedural Optimizations      CSE, Jump Optimization                                                  21
Loop Optimizations                 Loop Invariant Movement, Peeling, Unswitching                           7
Machine Dependent Optimizations    Register Allocation, Instruction Scheduling, Peephole Optimizations     54
Assembly Emission and Finishing                                                                            03
Total number of Passes on RTL                                                                              85
Total number of dumps at different optimization levels:-
Optimization Level    Number of Dumps    Goals
Default               47                 Fast compilation
O1                    134
O2                    156
O3                    165
Os                    154                Optimize for space
Command Line Commands for optimizations and output passes:-
List of optimizations with brief descriptions:
$ gcc -c --help=optimizers
Optimizations enabled at level 2 (the other levels are 0, 1, 3 and s):
$ gcc -c -O2 --help=optimizers -Q
The dump option format is -fdump-<ir>-<pass>, where <ir> can be tree, rtl or ipa (interprocedural passes on GIMPLE).
To see all dumps:
$ gcc -fdump-tree-all -fdump-rtl-all test.c
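Figure: Diagrammatic representation of passes for First Level Gray-box Probing of GCC:-
C Source Code -> Parser -> AST -> Gimplifier -> GIMPLE -> CFG Generator -> CFG -> RTL Generator -> RTL Expand -> Reg Allocator -> IRA -> pro_epilogue generation -> Prologue-epilogue -> Pattern Matcher -> ASM Program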
==>> Inline assembly with GCC:-
If your assembler instruction can alter the condition code register, add 'cc' to the list of clobbered registers; 'cc' serves to name this register. The input operands are guaranteed not to use any of the clobbered registers, and neither will the output operands' addresses, so you can read and write the clobbered registers as many times as you like.
Conventions:-
Register naming-- register names are prefixed with %, like %eax.
Source/Destination ordering-- the source is always on the left and the destination is always on the right. For example, to load ebx with the value in eax:
    movl %eax, %ebx
Constant/immediate value format-- constants must be prefixed with $. For example, to load eax with 0xd00d:
    movl $0xd00d, %eax
Operator size specification-- suffix the instruction with one of b, w or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (the GNU assembler) will attempt to guess.
    movw %ax, %bx
Referencing memory-- in 386 protected mode, the canonical format for 32-bit addressing is immed32(basepointer, indexpointer, indexscale), which refers to the address immed32 + basepointer + indexpointer * indexscale.
Addressing a variable offset by a value in a register: _variable(%eax), where the underscore (_) is how you get at a static (global) C variable from assembler on targets that prefix C symbol names with an underscore.
Addressing a value in an array of integers (scaling up by 4): _array(,%eax,4)
Basic inline assembly:- It's very simple, like asm ("statement"); You can even push your registers onto the stack, use them, and put them back, like
    asm ("pushl %eax\n\t"
         "movl $0, %eax\n\t"
         "popl %eax");
Extended inline assembly:- The basic format is:
    asm ("statement" : output_registers : input_registers : clobbered_registers);
==> Load Effective Address, written lea, does an address calculation without affecting any flags.
Types of GAS (GNU Assembler) instructions:
opcode (e.g. pushal)
opcode operand (e.g. pushl %edx)
opcode source, dest (e.g. movl %edx, %eax) (e.g. addl %edx, %eax)
Important Processor Register set:
EAX, EBX, ECX, EDX- general purpose, more or less interchangeable
EBP- used to access data on stack
ESI, EDI- index registers
SS (stack segment), DS (data segment), CS (code segment), ES, FS, GS- segmentation registers
EIP- program counter (instruction pointer)
ESP- stack pointer
EFLAGS- condition codes
Example:-
CODE
    void function()
    {
        int A = 10;
        A += 66;
    }
ASSEMBLY
    pushl %ebp              # push ebp
    movl %esp, %ebp         # copy stack pointer to ebp
    subl $4, %esp           # make space on stack for local data
    movl $10, -4(%ebp)      # put value 10 in A
    leal -4(%ebp), %eax     # load address of A into EAX
    addl $66, (%eax)        # add 66 to A
References:-
http://www.ibm.com/developerworks/linux/library/l-ia/index.html
http://stackoverflow.com/questions/4003894/leal-assembler-instruction
http://www.hep.wisc.edu/~pinghc/x86AssmTutorial.htm
GCC Configuration and Building
GCC Native Compiler:-
Today I went through topics related to GCC configuration and building. The first half of my session went on reading the workshop slides and searching the internet; after that I was able to build the needed libraries and configure GCC. Installing GCC is a tedious task that tests our patience, since the build takes approximately 40 minutes.
First of all we need three other libraries for a successful build of gcc: MPC-0.9, MPFR-3.1.0 and GMP-5.0.3. Use the links below to download the latest version of each of them, and then download the latest version of GCC, which is GCC-4.6.2, from the net.
Now follow the steps below to build all three libraries and gcc, one after another:-
Build GMP-5.0.3:
$ mkdir ~/gmp-5.0.3 ~/build
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/gmp-5.0.3.tar.bz2
$ cd gmp-5.0.3
$ ./configure --prefix=/home/rfs54/gmp-5.0.3 --enable-cxx
$ nice -n 19 time make -j8
$ make install
Build MPFR-3.1.0:
$ mkdir ~/mpfr-3.1.0
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/mpfr-3.1.0.tar.bz2
$ cd mpfr-3.1.0
$ ./configure --prefix=/home/rfs54/mpfr-3.1.0 --with-gmp=/home/rfs54/gmp-5.0.3
$ nice -n 19 time make -j8
$ make install
Build MPC-0.9:
$ mkdir ~/mpc-0.9
$ cd ~/build
$ tar xzf /home/rfs54/Downloads/mpc-0.9.tar.gz
$ cd mpc-0.9
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib ./configure --prefix=/home/rfs54/mpc-0.9 --with-gmp=/home/rfs54/gmp-5.0.3 --with-mpfr=/home/rfs54/mpfr-3.1.0
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib nice -n 19 time make -j8
$ make install
Build GCC-4.6.2:
$ mkdir ~/gcc-4.6.2
$ cd ~/build
$ tar xjf /home/rfs54/Downloads/gcc-4.6.2.tar.bz2
$ cd gcc-4.6.2
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib:/home/rfs54/mpc-0.9/lib ./configure --prefix=/home/rfs54/gcc-4.6.2 --with-gmp=/home/rfs54/gmp-5.0.3 --with-mpfr=/home/rfs54/mpfr-3.1.0 --with-mpc=/home/rfs54/mpc-0.9 --disable-multilib
$ LD_LIBRARY_PATH=/home/rfs54/gmp-5.0.3/lib:/home/rfs54/mpfr-3.1.0/lib:/home/rfs54/mpc-0.9/lib nice -n 19 time make -j8
$ make install
Conclusion:- All the other libraries are built, but while running the make command for gcc it shows one error, given below. I will get back to it tomorrow and try to resolve the problem.
Error:-
1. error while loading shared libraries: libgmp.so.10: cannot open shared object file: No such file or directory
This message means that the dynamic loader could not find the GMP shared library when a program linked against it was run.
References:-
http://studystuff.in/content/steps-install-and-configure-gcc-462
http://gcc.gnu.org/install/build.html
http://solarianprogrammer.com/2011/12/01/compiling-gcc-4-6-2-on-mac-osx-lion/
http://www.multiprecision.org/index.php?prog=mpc&page=download
http://www.mpfr.org/mpfr-current/#download
http://openwall.info/wiki/internal/gcc-local-build