Intermediate language: compiler model; front-end - language-dependent part; back-end - machine...
Intermediate Language
Compiler Model
[Figure: Source Program -> Front-End (Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generator; edge labels: tokens, AST) -> IL -> Back-End (Code Optimizer, Target Code Generator; edge label: IC) -> Object Program]
Front-End --- language-dependent part
Back-End --- machine-dependent part
[1/34]
Why an IL is needed
  - Modular Construction
  - Automatic Construction
  - Easy Translation
  - Portability
  - Optimization
  - Bootstrapping
Classification of ILs
  - Polish Notation --- Postfix, IR
  - Three Address Code --- Quadruple, Triple, Indirect triple
  - Tree Structured Code --- PT, AST, TCOL
  - Abstract Machine Code --- P-code, EM-code, U-code, Byte-code
[2/34]
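Two level Code Generation
Source -> Front-End -> ILS -> ILS-ILT -> ILT -> Back-End -> Target
ILS --- a form that can be derived from the source automatically; it depends on the source language and is high level.
ILT --- a form that the automated back end can translate to the target machine very easily; it depends on the target machine and is low level.
ILS to ILT --- translating ILS into ILT is the main task.
[3/34]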
Postfix (Suffix) Polish Notation
The Polish mathematician Łukasiewicz invented the parenthesis-free notation.
  - earliest IL
  - popular for interpreted languages --- SNOBOL, BASIC
general form: e1 e2 ... ek OP (k ≥ 1)
  where OP: k-ary operator
        ei: any postfix expression (1 ≤ i ≤ k)
Source -> Translator -> Postfix -> Evaluator -> Result
[4/34]
example:
  if a then if c-d then a+c else a*c else a+b
  ⇒ a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:
note
  1) high level: source to IL is a fast and easy translation; IL to target is difficult
  2) easy evaluation --- operand stack
  3) unsuitable for optimization --- translation into another IL is needed
  4) parentheses-free notation --- arithmetic expression
  suited to interpretive languages
[5/34]
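Note 2 above (easy evaluation with an operand stack) can be illustrated with a minimal sketch. The operator set, the `env` dictionary of variable values, and the sample numbers are assumptions made for illustration; control operators such as BZ and BR from the example are not handled.

```python
import operator

# Binary operators supported by this sketch (an assumed, minimal set).
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_postfix(tokens, env):
    """Evaluate a list of postfix tokens using an operand stack."""
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            right = stack.pop()          # operands come off in reverse order
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        elif tok in env:
            stack.append(env[tok])       # variable: push its current value
        else:
            stack.append(float(tok))     # otherwise treat as a numeric literal
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

# b + c * d / e  in postfix is  b c d * e / +
values = {"b": 1, "c": 6, "d": 4, "e": 3}   # hypothetical values
print(eval_postfix("b c d * e / +".split(), values))   # 9.0
```

No parentheses or precedence rules are needed at evaluation time: the order of the tokens already fixes the order of the operations.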
Three Address Code
  - most popular IL, used in optimizing compilers
General form: A := B op C
  where A: result address
        B, C: operand addresses
        op: operator
(1) Quadruple --- 4-tuple notation: <operator>,<operand1>,<operand2>,<result>
(2) Triple --- 3-tuple notation: <operator>,<operand1>,<operand2>
(3) Indirect triple --- execution order table & triples
[6/34]
example: a = b + c * d / e; f = c * d;
quadruple
  (*, c, d, t1)
  (/, t1, e, t2)
  (+, b, t2, t3)
  (=, t3, , a)
  (*, c, d, t4)
  (=, t4, , f)
triple
  1. (*, c, d)
  2. (/, (1), e)
  3. (+, b, (2))
  4. (=, a, (3))
  5. (*, c, d)
  6. (=, f, (5))
Indirect triple
  operations: 1. (1)  2. (2)  3. (3)  4. (4)  5. (1)  6. (5)
  triples:
    (1) (*, c, d)
    (2) (/, (1), e)
    (3) (+, b, (2))
    (4) (=, a, (3))
    (5) (=, f, (1))
[7/34]
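The three forms for this example can be reproduced with a short sketch. The nested-tuple expression format and the function names `gen_quads`, `gen_triples`, and `to_indirect` are assumptions for illustration, not part of the lecture. Sharing identical triples, as the indirect form does here, is only safe when the operands are not reassigned in between, which holds for this example.

```python
import itertools

def gen_quads(stmts):
    """stmts: list of (target, expr); expr is a name or (op, left, right)."""
    quads = []
    temps = (f"t{i}" for i in itertools.count(1))

    def walk(expr):
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        l, r = walk(left), walk(right)
        t = next(temps)                   # fresh temporary for the result
        quads.append((op, l, r, t))
        return t

    for target, expr in stmts:
        quads.append(("=", walk(expr), "", target))
    return quads

def gen_triples(stmts):
    """Same traversal, but results are referenced by position: "(n)"."""
    triples = []

    def walk(expr):
        if isinstance(expr, str):
            return expr
        op, left, right = expr
        l, r = walk(left), walk(right)
        triples.append((op, l, r))
        return f"({len(triples)})"

    for target, expr in stmts:
        triples.append(("=", target, walk(expr)))
    return triples

def to_indirect(triples):
    """Store each distinct triple once; return (operations, table)."""
    table, index, renumber, operations = [], {}, {}, []
    for n, (op, a, b) in enumerate(triples, start=1):
        t = (op, renumber.get(a, a), renumber.get(b, b))
        if t not in index:
            table.append(t)
            index[t] = len(table)
        renumber[f"({n})"] = f"({index[t]})"   # old position -> table position
        operations.append(index[t])
    return operations, table

# a = b + c * d / e;  f = c * d;
program = [
    ("a", ("+", "b", ("/", ("*", "c", "d"), "e"))),
    ("f", ("*", "c", "d")),
]
quads = gen_quads(program)
triples = gen_triples(program)
operations, table = to_indirect(triples)
print(operations)   # [1, 2, 3, 4, 1, 5]
```

The quadruple list needs four temporaries, while the triple forms refer to earlier results by position and need none.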
Note
Quadruple vs. Triple
  - quadruple --- easy to optimize
  - triple --- removal of temporary addresses ⇒ Indirect Triple
  - extensive code optimization is easy: the IL can be rearranged (except for triples)
  - easy translation --- source to IL
  - difficult to generate good code
      quadruple to two-address machine
      triple to three-address machine
[8/34]
Text p.386
Abstract Syntax Tree --- a parse tree with the redundant information removed.
  Leaf node --- variable name, constant
  Internal node --- operator
[Example 9.8]
  {
    x = 0;
    y = z + 2 * y;
    while ((x<n) && (v[x] != z)) x = x+1;
    return x;
  }
[9/34]
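As a rough illustration of the leaf/internal-node distinction, the sketch below builds an AST for one statement of the example, `y = z + 2 * y`, and prints it in a parenthesized linear form. The `Node` class, the `linear` helper, and the operator label `mul` are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                    # operator, variable name, or constant
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children

def linear(node):
    """Render the tree as a parenthesized prefix string."""
    if node.is_leaf():
        return node.label
    return "(" + node.label + " " + " ".join(linear(c) for c in node.children) + ")"

def leaves(node):
    """Collect leaf labels from left to right."""
    if node.is_leaf():
        return [node.label]
    return [label for child in node.children for label in leaves(child)]

# y = z + 2 * y
tree = Node("assign", [
    Node("y"),
    Node("add", [Node("z"), Node("mul", [Node("2"), Node("y")])]),
])
print(linear(tree))   # (assign y (add z (mul 2 y)))
print(leaves(tree))   # ['y', 'z', '2', 'y']
```

Punctuation such as `;` and `=` does not appear as a node: the tree shape itself carries that information.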
Tree Structured Common Language (TCOL)
  - Variant of the AST --- contains the result of semantic analysis.
  - TCOL operator --- type & context specific operator
  - Context
      value --- rhs of assignment statement
      location --- lhs of assignment statement
      boolean --- conditional control statement
      statement --- statement
  ex) .     : operand - location; result - value
      while : operand - boolean, statement; result - statement
[10/34]
Example) int a; float b; ... b = a + 1;
Representation --- graph orientation
  internal notation --- efficient
  external notation --- debug, interface
  linear graph notation
AST:
  assign
    b
    add
      a
      1
TCOL:
  assign
    b
    float
      addi
        a
        1.
[11/34]
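The step from the AST to the TCOL tree can be sketched as a small type-annotation pass. This is a simplified illustration under stated assumptions: only `int` and `float` are handled, the suffix `f` for float operators (for example `addf`) is a hypothetical name, and the `.` dereference operator is left out for brevity.

```python
def annotate(expr, types):
    """Return (tree, type), rewriting generic operators into type-specific ones."""
    if isinstance(expr, int):
        return str(expr), "int"
    if isinstance(expr, float):
        return str(expr), "float"
    if isinstance(expr, str):
        return expr, types[expr]              # variable: look up declared type

    op, left, right = expr
    if op == "assign":
        rhs, rtype = annotate(right, types)
        ltype = types[left]
        if ltype == "float" and rtype == "int":
            rhs = ("float", rhs)              # explicit int-to-float conversion
        return ("assign", left, rhs), ltype

    ltree, ltype = annotate(left, types)
    rtree, rtype = annotate(right, types)
    if ltype == "int" and rtype == "int":
        return (op + "i", ltree, rtree), "int"    # e.g. add -> addi
    if ltype == "int":
        ltree = ("float", ltree)
    if rtype == "int":
        rtree = ("float", rtree)
    return (op + "f", ltree, rtree), "float"      # hypothetical float variant

# int a; float b; b = a + 1;
declared = {"a": "int", "b": "float"}
tree, _ = annotate(("assign", "b", ("add", "a", 1)), declared)
print(tree)   # ('assign', 'b', ('float', ('addi', 'a', '1')))
```

The generic `add` becomes the integer-specific `addi`, and the conversion that the source program left implicit shows up as an explicit `float` node.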
Pascal P Compiler --- a portable compiler producing P_CODE for an abstract machine (P_Machine).
P_Machine --- a hypothetical stack machine designed for the Pascal language.
(1) Instruction --- closely related to the Pascal language.
(2) Registers
      PC --- program counter
      NP --- new pointer
      SP --- stack pointer
      MP --- mark pointer
(3) Memory
      CODE --- instruction part
      STORE --- data part (constant area, stack, heap)
[12/34]
[Figure: P_Machine memory layout. CODE with PC; STORE divided into stack, heap, and constant area, with MP at the current activation record and the pointers SP and NP.]
[13/34]
Ucode
  - the intermediate form used by the Stanford Portable Pascal compiler.
  - stack-based, defined in terms of a hypothetical stack machine.
  - Ucode Interpreter: Appendix B.
Addressing
  - stack addressing ===> a tuple: (B, O)
      B: the block number containing the address
      O: the offset in words from the beginning of the block; offsets start at 1.
label
  - Any Ucode instruction can be labeled through its label field.
  - All targets of jumps and procedures must be labeled.
  - All labels must be unique for the entire program.
[14/34]
Example: Consider the following skeleton:
  int x;
  void main()
  {
    int i;
    int j;
    // .....
  }
block number
  - global variables: 1
  - local variables in a function: 2
variable addressing
  - x: (1,1)
  - i: (2,1)
  - j: (2,2)
[15/34]
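The (block, offset) assignment for this skeleton can be sketched as follows. The function name `assign_addresses` is hypothetical, and the sketch assumes every variable occupies exactly one word and that local names do not shadow global ones.

```python
def assign_addresses(global_vars, local_vars):
    """Map each variable name to a (block, offset) pair; offsets start at 1."""
    table = {}
    for block, names in ((1, global_vars), (2, local_vars)):
        for offset, name in enumerate(names, start=1):
            table[name] = (block, offset)
    return table

# int x; void main() { int i; int j; }
print(assign_addresses(["x"], ["i", "j"]))
# {'x': (1, 1), 'i': (2, 1), 'j': (2, 2)}
```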
Ucode Operations (39)
  Unary --- notop, neg, inc, dec, dup
  Binary --- add, sub, mult, div, mod, swp, and, or, gt, lt, ge, le, eq, ne
  Stack Operations --- lod, str, ldc, lda
  Control Flow --- ujp, tjp, fjp
  Range Checking --- chkh, chkl
  Indirect Addressing --- ldi (load indirect), sti (store indirect)
  Procedure --- cal, ret, retv, ldp, proc, end
  Etc. --- nop, bgn, sym
[16/34]
Example:
  x = a + b * c;
      lod 1 1    /* a */
      lod 1 2    /* b */
      lod 1 3    /* c */
      mult
      add
      str 1 4    /* x */
  if (a>b) a = a + b;
      lod 1 1    /* a */
      lod 1 2    /* b */
      gt
      fjp next
      lod 1 1    /* a */
      lod 1 2    /* b */
      add
      str 1 1    /* a */
  next ...
[17/34]
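To see how the two instruction sequences above behave, here is a minimal interpreter sketch covering only the opcodes they use. It is not the Ucode Interpreter from Appendix B: the tuple instruction format, the dictionary memory keyed by (block, offset), the use of `nop` to carry the `next` label, and the sample values are all assumptions for illustration.

```python
def run(program, memory):
    """program: list of (label, opcode, *operands); label is None if absent."""
    labels = {ins[0]: pc for pc, ins in enumerate(program) if ins[0]}
    stack = []
    pc = 0
    while pc < len(program):
        _, op, *args = program[pc]
        pc += 1
        if op == "lod":                       # push the value at (block, offset)
            stack.append(memory[tuple(args)])
        elif op == "str":                     # pop and store at (block, offset)
            memory[tuple(args)] = stack.pop()
        elif op in ("add", "mult", "gt"):
            right = stack.pop()
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "mult":
                stack.append(left * right)
            else:
                stack.append(left > right)
        elif op == "fjp":                     # jump if the popped value is false
            if not stack.pop():
                pc = labels[args[0]]
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return memory

# x = a + b * c;   with a, b, c, x at (1,1) .. (1,4)
assign = [
    (None, "lod", 1, 1), (None, "lod", 1, 2), (None, "lod", 1, 3),
    (None, "mult"), (None, "add"), (None, "str", 1, 4),
]
print(run(assign, {(1, 1): 2, (1, 2): 3, (1, 3): 4})[(1, 4)])   # 14

# if (a>b) a = a + b;
cond = [
    (None, "lod", 1, 1), (None, "lod", 1, 2), (None, "gt"),
    (None, "fjp", "next"),
    (None, "lod", 1, 1), (None, "lod", 1, 2), (None, "add"), (None, "str", 1, 1),
    ("next", "nop"),
]
print(run(cond, {(1, 1): 5, (1, 2): 3})[(1, 1)])   # 8
```

When `a` is not greater than `b`, `fjp` skips the assignment and `a` keeps its value.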
Indirect addressing is used to access array elements.
ldi --- indirect load
  replaces stacktop with the value of the item at location stacktop.
to retrieve A[i]:
    lod i    // actually (Bi, Oi)
    lda A    // also (block number, offset)
    add      // effective address
    ldi      // indirect load gets contents of A[i]
[18/34]
sti --- indirect store
  sti stores stacktop into the address at stack[stacktop-1]; both items are popped.
A[i] = j;
    lod i
    lda A
    add
    lod j
    sti
[19/34]
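The stack effects of `ldi` and `sti` can be traced with a small sketch. To keep it short, memory is modeled as a flat Python list with integer addresses instead of (block, offset) pairs; that simplification, the base address `A_BASE`, and the sample values are assumptions for illustration.

```python
def ldi(stack, memory):
    """Indirect load: replace the address on top of the stack by its contents."""
    addr = stack.pop()
    stack.append(memory[addr])

def sti(stack, memory):
    """Indirect store: pop the value, pop the address, store value at address."""
    value = stack.pop()
    addr = stack.pop()
    memory[addr] = value

def add(stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)

memory = [0] * 16
A_BASE = 4                                  # hypothetical address of A[0]
memory[A_BASE:A_BASE + 3] = [10, 20, 30]
i, j = 2, 99

# A[i] = j;   lod i, lda A, add, lod j, sti
stack = [i, A_BASE]                         # after lod i, lda A
add(stack)                                  # effective address of A[i]
stack.append(j)                             # lod j
sti(stack, memory)
print(memory[A_BASE + i], stack)            # 99 []

# retrieve A[i]:   lod i, lda A, add, ldi
stack = [i, A_BASE]
add(stack)
ldi(stack, memory)
print(stack)                                # [99]
```

Note the asymmetry: `ldi` leaves one item on the stack, while `sti` consumes both the address and the value.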
Procedure Calling Sequence
function definition:
    void func(int x, int array[]) { }
function call:
    func(a, list);
calling sequence:
    ldp         // load parameter
    lod a       // load the value of actual parameter
    lda list    // load the address of actual parameter
    call func   // call func
[20/34]