abstract syntax mooly sagiv html://msagiv/courses/wcc04.html

21
Abstract Syntax Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/ courses/wcc04.html

Post on 22-Dec-2015

221 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Abstract Syntax

Mooly Sagiv

html://www.cs.tau.ac.il/~msagiv/courses/wcc04.html

Page 2: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Outline• The general idea

• Bison

• Motivating example Interpreter for arithmetic expressions

• The need for abstract syntax

• Abstract syntax for arithmetic expressions

Page 3: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Semantic Analysis during Recursive Descent Parsing

• Scanner returns “semantic values” for some tokens

• The function of every non-terminal returns the “corresponding subtree value”

• When A B C D is appliedthe function for A can use the values returned by B, C, and D– The function can also pass parameters, e.g., to

D(), reflecting left contexts

Page 4: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

E num E’

E’

E’ + num E’

int E() {

switch (tok) {

case num : temp=tok.val; eat(num);

return EP(temp);

default: error(…); }}

int EP(int left) {

switch (tok) {

case $: return left;

case + : eat(+);

temp=tok.val; eat(num);

return EP(left + temp);

default: error(…) ;}}

Page 5: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Semantic Analysis during Bottom-Up Parsing

• Scanner returns “semantic values” for some tokens

• Use parser stack to store the “corresponding subtree values”

• When A B C D is reducedthe function for A can use the values returned by B, C, and D

• No action in the middle of the rule

Page 6: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

E E + num

E num

Example

E 7

+

num 5

E 12

Page 7: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Bison Specification

Declarations

%%

Productions

%%

C -Routines

Page 8: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Interpreter (in Bison)%{ declarations of yylex()

and yyeror() %}%union {

int num; string id;}%token <id> ID%token <num> NUM%type <num> e f t%start e%%

e : e ‘+’ t {$$ = $1 + $3 ; } | e ‘-’ t { $$ = $1 - $3 ; } | t { $$ = $1; } ; t : t ‘*’ f { $$ = $1 * $3; } | t ‘/’ f { $$= $1 / $3; } | f { $$ = $1; } ; f : NUM { $$ = $1; } | ID { $$ = lookup($1); } | ‘-’ e { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 9: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Interpreter (compact spec.)%{ declarations of yylex()

and yyeror()%}%union {

int num; string id;}%token <id> ID%token <num> NUM%type <num> e%start e%left PLUS MINUS%left MUL DIV%right UMINUS%%

e : e PLUS e {$$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e { $$= $1 / $3; } | NUM | ID { $$ = lookup($1); } | MINUS e % prec UMINUS { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 10: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

stack input action

$ 7+11+17$ shift

num 7

$

+11+17$ reduce e num

e 7

$

+11+17$ shift

+

e 7

$

11+17$ shift

num 11

+

e 7

$

+17$ reduce e num

Page 11: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

stack input action

e 11

+

e 7

$

+17$ reduce e e+e

e 18

$

+17$ shift

+

e 18

$

17$ shift

num 17

+

e 18

$

$ reduce e num

Page 12: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

stack input action

e 17

+

e 18

$

$ reduce e::= e+e

e 35

$

$ accept

Page 13: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

So why can’t we write

all the compiler code

in Bison/CUP?

Page 14: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Historical Perspective• Originally parsers were written w/o tools

• yacc, bison, ... make tools acceptable

• But it is still difficult to write compilers in parser actions (top-down and bottom-up)– Natural grammars are ambiguous– No modularity principle– Many useful programming language features

prevent code generation while parsing• Use before declaration

• gotos

Page 15: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Abstract Syntax• Intermediate program representation

• Defines a tree - Preserves program hierarchy

• Generated by the parser

• Declared using an (ambiguous) context free grammar (relatively flat)– Not meant for parsing

• Keywords and punctuation symbols are not stored (Not relevant once the tree exists)

• Big programs can be also handled (possibly via virtual memory)

Page 16: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Issues

• Concrete vs. Abstract syntax tree

• Need to store concrete source position

• Abstract syntax can be defined by:– Ambiguous context free grammar– C recursive data type – “Constructor” functions

• Debugging routines linearize the tree

Page 17: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Abstract Syntax for Arithmetic Expressions

Exp id (IdExp)

Exp num (NumExp)

Exp Exp Binop Exp (BinOpExp)

Binop + (Plus)

Binop - (Minus)

Binop *

Binop /

(Times)

(Div)

Exp Unop Exp (UnOpExp)

Unop - (UnMin)

Page 18: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

/* absyn.h */

typedef enum {A_Plus, A_Minus, A_Times, A_Div} Binop;typedef enum {A_UnMin} Unop;typedef enum {A_IdExp, A_NumExp, A_BinOpExp, A_UnopExp} ExpKind;typedef struct exp { ExpKind kind ; union { struct {string repr ;} IdExp; struct {int number ; } NumExp; struct {Binop op; struct exp *left; struct exp *right;} BinExp; struct {Unop op; struct exp *arg;} UnExp;

} exp;} *Exp;

extern Exp idExp(string repr);extern Exp numExp(int number);extern Exp binOp(Binop, struct exp *, struct exp*);extern Exp unOp(Unop, struct exp*);

Page 19: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

%{ declarations of yylex() and yyeror()

#include “absyn.h”%}%union {

int num; string id; Exp exp;}%token <id> ID%token <num> NUM%type <exp> e f t%start e%%

e : e ‘+’ t {$$ = binop(A_Plus, $1, $3) ; } | e ‘-’ t { $$ = binop(A_Minus, $1, $3) ; } | t { $$ = $1; } ; t : t ‘*’ f { $$ = binop(A_Times,$1, $3); } | t ‘/’ f { $$= binop(A_Div, $1, $3); } | f { $$ = $1; } ; f : NUM { $$ = numExp($1); } | ID { $$ = idExp($1); } | ‘-’ e { $$ = unOp(A_UnMinus, $2); } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 20: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

%{ declarations of yylex() and yyeror()

#include “absyn.h”%}%union {

int num; string id; Exp exp; }%token <id> ID%token <num> NUM%type <exp> e%start e%left PLUS MINUS%left MUL DIV%right UMINUS%%

e : e PLUS e {$$ = binop(A_Plus, $1, $3) ; } | e MINUS e { $$ = binop(A_Minus, $1, $3) ; } | e MUL e { $$ = binop(A_Times, $1, $3); } | e DIV e { $$= binop(A_Div, $1, $3); } | NUM {$$ = numExp($1); } | ID { $$ = idExp($1); } | MINUS e % prec UMINUS { $$ = unOp(A_UnMin,$2); } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 21: Abstract Syntax Mooly Sagiv html://msagiv/courses/wcc04.html

Summary

• Flex and Bison simplify the task of writing compiler/interpreter front-ends

• Abstract syntax provides a clear interface with other compiler phases– Supports general programming languages

• But the design of an abstract syntax for a given PL may take some time