Winter 2003/4 Pls – syntax – Catriel Beeri 1
SYNTAX
Syntax: form, structure
The syntax of a PL:
• The set of its well-formed programs
• The rules that define these programs
Two views:
• Concrete syntax: program as text
• Abstract syntax: program as composite structure, a tree
Concrete syntax
The common view – program as text (a string)
Common practice in compilers: divide into two levels
• Lexical structure: the words
– Lexical specification / analysis
• Phrase structure: the sentences
– Phrase structure specification / parsing
Lexical
A word: lexeme
A class of words: token
For example: int, ident, real, leftpar, if, …
2.3 → (real, 2.3)
(4+5) → leftpar (int, 4) plus (int, 5) rightpar
Lexical analysis: convert text to a stream of (token, lexeme) pairs
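The conversion from text to a token stream can be sketched in OCaml as follows. This is a minimal hand-written sketch, not the course's lexer: the constructor names (Int, Real, Ident, LeftPar, RightPar, Plus) and the function tokenize are assumptions chosen to mirror the token names on the slide, and each token carries its lexeme's value where one is needed.

```ocaml
(* Token classes; names are illustrative, modeled on the slide's examples. *)
type token =
  | Int of int
  | Real of float
  | Ident of string
  | LeftPar
  | RightPar
  | Plus

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Convert a string to a list of tokens, skipping blanks. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  (* scan_while p i: first index j >= i where s.[j] fails p (or j = n) *)
  let rec scan_while p i =
    if i < n && p s.[i] then scan_while p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc
      | '(' -> go (i + 1) (LeftPar :: acc)
      | ')' -> go (i + 1) (RightPar :: acc)
      | '+' -> go (i + 1) (Plus :: acc)
      | c when is_digit c ->
          let j = scan_while is_digit i in
          (* digits '.' digit ... is a real; plain digits are an int *)
          if j + 1 < n && s.[j] = '.' && is_digit s.[j + 1] then begin
            let k = scan_while is_digit (j + 1) in
            go k (Real (float_of_string (String.sub s i (k - i))) :: acc)
          end else
            go j (Int (int_of_string (String.sub s i (j - i))) :: acc)
      | c when is_alpha c ->
          let j = scan_while (fun c -> is_alpha c || is_digit c) i in
          go j (Ident (String.sub s i (j - i)) :: acc)
      | c -> failwith (Printf.sprintf "unexpected character %c" c)
  in
  go 0 []
```

For instance, tokenize "(4+5)" yields [LeftPar; Int 4; Plus; Int 5; RightPar], matching the slide's example.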
Lexical analysis: implementation
A token is specified by a regular expression
Regular expression → (nondeterministic) finite automaton → (deterministic) finite automaton → a program: a lexical analyzer
Issues:
• Many tokens
• Where to stop
• …
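One common answer to "where to stop" is the longest-match rule: run the automaton as far as it can go and return the last accepting position seen. The sketch below hand-codes a small deterministic automaton for two assumed token definitions, int = digit+ and real = digit+ . digit+; the state numbering and the names step, accepting and longest_match are illustrative, not taken from the slides.

```ocaml
type tok = IntTok | RealTok

(* DFA states: 0 start, 1 digits, 2 digits followed by '.', 3 fraction digits *)
let step (state : int) (c : char) : int option =
  let digit = c >= '0' && c <= '9' in
  match state with
  | 0 when digit -> Some 1
  | 1 when digit -> Some 1
  | 1 when c = '.' -> Some 2
  | 2 when digit -> Some 3
  | 3 when digit -> Some 3
  | _ -> None

(* Which token, if any, a state accepts *)
let accepting = function
  | 1 -> Some IntTok
  | 3 -> Some RealTok
  | _ -> None

(* Longest match starting at index i: the token and the index just past
   its lexeme, or None if no token starts at i. *)
let longest_match (s : string) (i : int) : (tok * int) option =
  let n = String.length s in
  let rec run state j best =
    (* remember the most recent accepting position *)
    let best =
      match accepting state with
      | Some t -> Some (t, j)
      | None -> best in
    if j >= n then best
    else
      match step state s.[j] with
      | Some state' -> run state' (j + 1) best
      | None -> best
  in
  run 0 i None
```

On "12.+3" the automaton reads "12." but falls back to the int lexeme "12", since state 2 is not accepting.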
Phrase structure/analysis
Specified by a context-free grammar (CFG), written in BNF (Backus-Naur form)
• T: terminals (here, tokens, i.e. sets of lexemes)
• N: non-terminals = names of syntactic categories
• P: production rules
Rule: A → w (w is a string over N ∪ T)
• S: start non-terminal
A CFG as a generative device:
• Start from S
• Replace non-terminals by strings, using rules
Example: CFG for simple arithmetic expressions
T = {int, op}, N = {E}, S = E
Rules:
E ::= int | E op E (2 rules; | means "or")
Generation by a derivation:
E => E op E => int op E => int op E op E => int op int op E => int op int op int
Could represent the expression 2 - 3 - 4
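The generative reading of this grammar can be made concrete with a small sketch. The representation below (symbols as T/N, rules as a list, rule choice by 0-based index) and the names step_leftmost and derive are assumptions for illustration; each step rewrites the leftmost non-terminal, so it reproduces leftmost derivations only.

```ocaml
type symbol = T of string | N of string   (* terminal / non-terminal *)

(* Production rules of the example grammar: E ::= int | E op E *)
let rules : (string * symbol list) list =
  [ ("E", [T "int"]);                  (* rule 0 *)
    ("E", [N "E"; T "op"; N "E"]) ]    (* rule 1 *)

(* Rewrite the leftmost non-terminal of a sentential form with rule k.
   Returns None if there is no non-terminal or the rule does not apply. *)
let rec step_leftmost (k : int) (form : symbol list) : symbol list option =
  match form with
  | [] -> None
  | T t :: rest -> Option.map (fun r -> T t :: r) (step_leftmost k rest)
  | N a :: rest ->
      let (lhs, rhs) = List.nth rules k in
      if lhs = a then Some (rhs @ rest) else None

(* Start from S = E and apply the chosen rules in order. *)
let derive (choices : int list) : symbol list option =
  List.fold_left
    (fun form k -> Option.bind form (step_leftmost k))
    (Some [N "E"]) choices
```

The derivation shown above corresponds to the rule choices [1; 0; 1; 0; 0], and derive returns the sentential form int op int op int for them.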
Cont'd: here are two derivations:
E => E op E => int op E => int op E op E => int op int op E => int op int op int
E => E op E => E op E op E => int op E op E => int op int op E => int op int op int
Are they really different?
And from both:
A derivation corresponds to a derivation tree:
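(*) E => E op E => E op E op E => int op E op E => int op int op E => int op int op int
[Figure: derivation tree with root E and children E, op, E; one E child derives int, the other expands to E op E, each of whose E's derives int.]
E => E op E => int op E => int op E op E => int op int op E => int op int op int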
Derivations vs. derivation trees
A derivation tree represents many derivations
If there is a word with several derivation trees, the CFG is ambiguous.
Example:
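[Figure: two derivation trees for int op int op int. In one, the left E child of the root expands to E op E, grouping the string as (int op int) op int; in the other, the right E child expands, grouping it as int op (int op int).]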
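The ambiguity can be checked mechanically by enumerating derivation trees. The sketch below assumes a simplified tree type (Leaf for the rule E ::= int, Node for E ::= E op E, with the op symbol left implicit); the names tree and trees are illustrative.

```ocaml
(* Derivation trees for E ::= int | E op E, with op left implicit *)
type tree =
  | Leaf                  (* E ::= int *)
  | Node of tree * tree   (* E ::= E op E *)

(* All derivation trees whose yield is int op int ... op int with n ints *)
let rec trees (n : int) : tree list =
  if n <= 0 then []
  else if n = 1 then [Leaf]
  else
    List.concat_map
      (fun k ->
        (* k ints go to the left subtree, n - k to the right *)
        List.concat_map
          (fun l -> List.map (fun r -> Node (l, r)) (trees (n - k)))
          (trees k))
      (List.init (n - 1) (fun i -> i + 1))
```

For three ints there are exactly two trees, which is what makes the grammar ambiguous. Read as 2 - 3 - 4, the two groupings give different values: (2 - 3) - 4 = -5 and 2 - (3 - 4) = 3.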
The problem is addressed by:
• Adopting left associativity
• Allowing parentheses in expressions
• Changing the CFG:
– New non-terminal T (for term)
– New rules:
E ::= E op T | T
T ::= int | (E)
This CFG is unambiguous, and reflects left associativity
E => E op T => E op T op T => T op T op T => int op T op T => int op int op T => int op int op int
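A parser for this grammar can be sketched by recursive descent. Since the rule E ::= E op T is left-recursive, the sketch below reads E as one T followed by zero or more "op T" and folds to the left, which builds the same left-associative structure. This is a swapped-in technique, not something the slides prescribe, and the token and tree types and the names parse_t, parse_e and eval are assumptions.

```ocaml
type token = Int of int | Op of char | LPar | RPar

type expr = Num of int | BinOp of expr * char * expr

(* T ::= int | ( E ) *)
let rec parse_t (toks : token list) : expr * token list =
  match toks with
  | Int n :: rest -> (Num n, rest)
  | LPar :: rest ->
      (match parse_e rest with
       | (e, RPar :: rest') -> (e, rest')
       | _ -> failwith "expected )")
  | _ -> failwith "expected int or ("

(* E ::= E op T | T, read as a T followed by zero or more "op T";
   folding to the left makes the resulting tree left-associative. *)
and parse_e (toks : token list) : expr * token list =
  let (first, rest) = parse_t toks in
  let rec loop left toks =
    match toks with
    | Op o :: rest ->
        let (right, rest') = parse_t rest in
        loop (BinOp (left, o, right)) rest'
    | _ -> (left, toks)
  in
  loop first rest

(* Evaluate, supporting only + and - in this sketch *)
let rec eval = function
  | Num n -> n
  | BinOp (l, '-', r) -> eval l - eval r
  | BinOp (l, '+', r) -> eval l + eval r
  | BinOp (_, o, _) -> failwith (Printf.sprintf "unknown operator %c" o)
```

With this parser, the tokens of 2 - 3 - 4 produce BinOp (BinOp (Num 2, '-', Num 3), '-', Num 4), which evaluates to -5, while 2 - (3 - 4) evaluates to 3.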
A derivation tree
More complex than expression tree
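[Figure: derivation tree, under the grammar with E and T, for an expression with three int operands, two op's and one parenthesized subexpression; it contains E, T, ( and ) nodes in addition to the operands and operators.]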
Phrase structure: summary
• A language is specifiable by many CFG’s
• A CFG needs to address:
– Ambiguity (avoid)
– Associativity (express)
– Precedence (express)
– Efficient parsing (ensure)
Methodologies for transforming CFG’s to account for the above are known
The resulting CFG’s are complex; so are the derivation trees.
Abstract Syntax
Consider:
• (if (< x 3) 4 7) (Scheme) vs. x < 3 ? 4 : 7 (C)
• (let ((x 5)) (+ x 3)) (Scheme) vs. let x = 5 in x + 3 (OCaml)
Each pair is the "same" expression, with the same components.
The meaning is explained in the same way. E.g., for the conditional: evaluate the test; if its value is true, evaluate the 1st branch, else evaluate the 2nd.
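The point that both concrete forms share one structure and one meaning can be sketched as a single OCaml datatype with an evaluator. The constructor names, the value type and the association-list environment below are assumptions made for this illustration.

```ocaml
type expr =
  | Int of int
  | Var of string
  | Lt of expr * expr
  | Add of expr * expr
  | If of expr * expr * expr      (* test, 1st branch, 2nd branch *)
  | Let of string * expr * expr   (* variable, bound expression, body *)

type value = IntV of int | BoolV of bool

(* env maps variable names to values *)
let rec eval (env : (string * value) list) (e : expr) : value =
  match e with
  | Int n -> IntV n
  | Var x -> List.assoc x env
  | Lt (a, b) ->
      (match eval env a, eval env b with
       | IntV m, IntV n -> BoolV (m < n)
       | _ -> failwith "< expects integers")
  | Add (a, b) ->
      (match eval env a, eval env b with
       | IntV m, IntV n -> IntV (m + n)
       | _ -> failwith "+ expects integers")
  | If (test, e1, e2) ->
      (* evaluate the test, then exactly one of the branches *)
      (match eval env test with
       | BoolV true -> eval env e1
       | BoolV false -> eval env e2
       | IntV _ -> failwith "test must be a boolean")
  | Let (x, e1, e2) -> eval ((x, eval env e1) :: env) e2

(* One tree for both (if (< x 3) 4 7) and x < 3 ? 4 : 7 *)
let cond = If (Lt (Var "x", Int 3), Int 4, Int 7)

(* One tree for both Scheme's and OCaml's let x = 5 in x + 3 *)
let binding = Let ("x", Int 5, Add (Var "x", Int 3))
```

Whichever concrete syntax the program was written in, a parser would produce the same cond or binding value, and eval explains its meaning once.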
In abstract syntax: a program/expression is viewed as a labeled tree/ a compound structure
• A labeled leaf represents an atomic phrase; the label represents the category. Example: int (3)
• A larger tree represents a compound phrase
– The root label is its category
– The children are its components
Typical building blocks:
• Record:
[Figure: node IfExpr with three children labeled test, branch1, branch2, holding E1, E2, E3 respectively]
IfExpr : {test = E1, branch1 = E2, branch2 = E3}
The type can be expressed as an OCaml datatype:
type ifexpr = IfExpr of {test : expr; branch1 : expr; branch2 : expr}
• Tuple:
[Figure: node IfExpr with three ordered, unlabeled children E1, E2, E3]
IfExpr : (E1, E2, E3)
type ifexpr = IfExpr of expr * expr * expr
Tuple vs. Record: field name vs. ordering
• Sequence:
CmpdStmt : (S1, S2, … , Sn)
Tuple vs. sequence: In a tuple type, number of fields is known & fixed
type cmpd_stmt = CmpdStmt of stmt list
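The three building blocks can be put side by side in one self-contained sketch. The slides leave expr and stmt undefined, so the small definitions of expr and stmt here, and the helper names to_tuple and stmt_count, are assumptions for illustration only.

```ocaml
(* A small expression type, assumed here so the example is self-contained *)
type expr = Int of int | Var of string | Lt of expr * expr

(* Record: components are identified by field name *)
type if_record = IfRec of { test : expr; branch1 : expr; branch2 : expr }

(* Tuple: components are identified by position; their number is fixed *)
type if_tuple = IfTup of expr * expr * expr

(* Sequence: any number of components of the same category *)
type stmt =
  | Assign of string * expr
  | Cmpd of stmt list

(* The record and tuple forms carry the same information *)
let to_tuple (IfRec { test; branch1; branch2 }) =
  IfTup (test, branch1, branch2)

(* Number of immediate sub-statements; only meaningful for a sequence *)
let stmt_count = function
  | Cmpd ss -> List.length ss
  | Assign _ -> 1
```

Converting between the record and the tuple form only changes how components are addressed, by name or by position, which is the trade-off noted on the slide.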
Summary of abstract syntax
Abstract syntax is the structure of the program
• Keywords, separators, conventions: not included
• Associativity, precedence, unambiguity: non-issues
Parsing: convert from concrete to abstract syntax
Type-checking, semantics, compiler translation use abstract syntax
In rest of course: abstract syntax
Q:
Can a CFG derivation tree serve as an abstract syntax tree?
Syntax (concrete/abstract) is an inductive definition
Example : E ::= int | id | E op E
As rules:
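\[
\frac{i \in \mathit{int}}{i \in \mathit{expr}}
\qquad
\frac{x \in \mathit{id}}{x \in \mathit{expr}}
\qquad
\frac{e_1 \in \mathit{expr} \quad e_2 \in \mathit{expr} \quad o \in \mathit{op}}{e_1 \; o \; e_2 \in \mathit{expr}}
\]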
What will the rules look like for type expr = Int of int | Id of string | Expr of expr * expr ?
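One practical consequence of syntax being an inductive definition is that functions on syntax can be written by structural recursion, with one case per rule. The sketch below uses the datatype from the question above; the functions size and ids are hypothetical examples, not part of the slides.

```ocaml
type expr = Int of int | Id of string | Expr of expr * expr

(* Number of nodes: one case per constructor of the inductive definition *)
let rec size = function
  | Int _ -> 1
  | Id _ -> 1
  | Expr (e1, e2) -> 1 + size e1 + size e2

(* Identifiers occurring in an expression, left to right *)
let rec ids = function
  | Int _ -> []
  | Id x -> [x]
  | Expr (e1, e2) -> ids e1 @ ids e2
```

Each recursive call is made on a component of the compound phrase, so the recursion terminates on every finite tree.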
Common informal approach to abstract syntax specification
Use a string CFG, interpret it as a tree grammar
• Ignore keywords
• Labels and structures: left to the reader to decide
This shows the category and the components, which is sufficient for semantics
Example : If-Expr ::= if Expr then Expr else Expr
This is the approach in the course
A convention for abstract syntax
Use variables, declare them before rules, omit indices
Example:
i ∈ int, x ∈ id, o ∈ op, e ∈ expr
e ::= i | x | e o e
A similar convention is often used for inductive definitions