COMP3190: Principle of Programming Languages
DFA and its equivalent, scanner
- 2 -
Outline
DFA & NFA
» DFA
» NFA
» NFA → DFA
» Minimize DFA
Regular expression
Regular languages
Scanner
- 3 -
Example of DFA
[Figure: state diagram with two states, q1 and q2, and edges labeled 0 and 1]
Transition table:
δ    0    1
q1   q1   q2
q2   q1   q2
- 4 -
Deterministic Finite Automata (DFA)
5-tuple:
» Q: finite set of states
» Σ: finite set of “letters” (alphabet)
» δ: Q × Σ → Q (transition function)
» q0: start state (in Q)
» F: set of accept states (subset of Q)
Acceptance: an input string is accepted if, after the whole string has been consumed, the automaton is in an accept (final) state.
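The 5-tuple maps directly onto a small table-driven simulator. The sketch below is an illustration, not code from the slides: it encodes the two-state transition table from the DFA example above, and it assumes that q1 is the start state and q2 is the only accept state, because the transcript does not preserve those markings. The class and method names are hypothetical.

```java
import java.util.Set;

// Minimal DFA simulator (illustrative sketch).
// States q1, q2 are encoded as 0, 1; input symbols '0', '1' select columns 0, 1.
final class DfaSketch {
    // DELTA[state][symbol] = next state, copied from the transition table.
    static final int[][] DELTA = {
        {0, 1},  // q1: on 0 -> q1, on 1 -> q2
        {0, 1},  // q2: on 0 -> q1, on 1 -> q2
    };
    static final int START = 0;                    // assumption: q1 is the start state
    static final Set<Integer> ACCEPT = Set.of(1);  // assumption: q2 is the accept state

    // Consumes the whole input, then checks whether the final state is in F.
    static boolean accepts(String input) {
        int state = START;
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("symbol not in alphabet: " + c);
            }
            state = DELTA[state][c - '0'];
        }
        return ACCEPT.contains(state);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "0", "1", "0101", "110"}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Under these assumptions the automaton accepts exactly the strings that end in 1, since every transition on 1 leads to q2 and every transition on 0 leads to q1.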
- 5 -
Another Example of a DFA
[Figure: state diagram with states S, q1, q2, r1, and r2, and edges labeled a and b]
- 6 -
Outline
DFA & NFA
» DFA
» NFA
» NFA → DFA
» Minimize DFA
Regular expression
Regular languages
Context free languages & PDA
Scanner
Parser
- 7 -
Non-deterministic Finite Automata (NFA)
The transition function is different: δ: Q × Σε → P(Q)
» P(Q) is the power set of Q (the set of all subsets of Q)
» Σε is the union of Σ and the special symbol ε (denoting the empty string)
A string is accepted if at least one path consumes the entire input and ends in an accept state.
- 8 -
Example of an NFA
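[Figure: state diagram q1 → q2 → q3 → q4. q1 loops on 0, 1; q1 → q2 on 1; q2 → q3 on 0, ε; q3 → q4 on 1; q4 loops on 0, 1]
Transition table (∅ marks an empty cell):
δ    0      1          ε
q1   {q1}   {q1, q2}   ∅
q2   {q3}   ∅          {q3}
q3   ∅      {q4}       ∅
q4   {q4}   {q4}       ∅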
What strings does this NFA accept?
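One way to explore this question is to simulate the NFA by tracking the set of states it could be in after each symbol. The sketch below is not from the slides. It encodes the edges of the example NFA and assumes that q1 is the start state and q4 is the only accept state, which the transcript does not state explicitly. All identifiers are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

// NFA simulation sketch: track every state the NFA could currently be in.
final class NfaSketch {
    static final int EPS = 2;     // column index used for ε-moves
    static final int START = 0;   // assumption: q1 is the start state
    static final int ACCEPT = 3;  // assumption: q4 is the only accept state

    // DELTA[state][column] lists successor states; columns are 0, 1, ε.
    static final int[][][] DELTA = {
        {{0}, {0, 1}, {}},   // q1
        {{2}, {},     {2}},  // q2
        {{},  {3},    {}},   // q3
        {{3}, {3},    {}},   // q4
    };

    // All states reachable from `states` using only ε-moves.
    static Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : DELTA[s][EPS]) {
                if (closure.add(t)) {
                    work.push(t);
                }
            }
        }
        return closure;
    }

    // All states reachable from `states` by consuming one input symbol.
    static Set<Integer> move(Set<Integer> states, int symbol) {
        Set<Integer> out = new TreeSet<>();
        for (int s : states) {
            for (int t : DELTA[s][symbol]) {
                out.add(t);
            }
        }
        return out;
    }

    static boolean accepts(String input) {
        Set<Integer> current = epsClosure(Set.of(START));
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("symbol not in alphabet: " + c);
            }
            current = epsClosure(move(current, c - '0'));
        }
        return current.contains(ACCEPT);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "0", "11", "101", "100", "0110"}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Trying a range of inputs with `accepts` is a quick way to form a conjecture about the language before proving it from the diagram.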
- 9 -
Outline
DFA & NFA
» DFA
» NFA
» NFA → DFA
» Minimize DFA
Regular expression
Regular languages
Context free languages & PDA
Scanner
Parser
- 10 -
Converting an NFA to a DFA
For a set of states S, ε-closure(S) is the set of states that can be reached from S without consuming any input.
For a set of states S, Sc is the set of states that can be reached from S by consuming input symbol c.
Each set of NFA states corresponds to one DFA state (hence at most 2^n DFA states for an NFA with n states).
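These definitions are enough to sketch the subset construction as a worklist algorithm. The code below is an illustrative sketch rather than the course's implementation. It reuses the four-state NFA from the "Example of an NFA" slide, assumes q1 (encoded as 0) is the start state, and uses hypothetical names throughout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Subset construction sketch: each DFA state is a set of NFA states.
final class SubsetConstruction {
    static final int EPS = 2;  // column index used for ε-moves

    // NFA[state][column] lists successor states; columns are 0, 1, ε.
    static final int[][][] NFA = {
        {{0}, {0, 1}, {}},   // q1
        {{2}, {},     {2}},  // q2
        {{},  {3},    {}},   // q3
        {{3}, {3},    {}},   // q4
    };

    static Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : NFA[s][EPS]) {
                if (closure.add(t)) {
                    work.push(t);
                }
            }
        }
        return closure;
    }

    static Set<Integer> move(Set<Integer> states, int symbol) {
        Set<Integer> out = new TreeSet<>();
        for (int s : states) {
            for (int t : NFA[s][symbol]) {
                out.add(t);
            }
        }
        return out;
    }

    // Maps each reachable DFA state I to the row [I on 0, I on 1].
    static Map<Set<Integer>, List<Set<Integer>>> build(int start) {
        Map<Set<Integer>, List<Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        work.add(epsClosure(Set.of(start)));
        while (!work.isEmpty()) {
            Set<Integer> current = work.remove();
            if (dfa.containsKey(current)) {
                continue;  // this set of NFA states was already expanded
            }
            List<Set<Integer>> row = new ArrayList<>();
            for (int symbol = 0; symbol < 2; symbol++) {
                Set<Integer> next = epsClosure(move(current, symbol));
                row.add(next);
                work.add(next);
            }
            dfa.put(current, row);
        }
        return dfa;
    }

    public static void main(String[] args) {
        build(0).forEach((state, row) ->
            System.out.println(state + "  on 0: " + row.get(0) + "  on 1: " + row.get(1)));
    }
}
```

Only sets reachable from the start set are generated, so the result is usually much smaller than the 2^n worst case.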
- 11 -
ε-closure({1}) = {1, 2} = I
J = {5, 4, 3}
Ia = ε-closure(J) = ε-closure({5, 4, 3}) = {5, 4, 3, 6, 2, 7, 8}
Ja = {3}
[Figure: NFA with states 1 to 8 and edges labeled a]
- 12 -
I                 Ia                Ib
{X,5,1}           {5,3,1}           {5,4,1}
{5,3,1}           {5,2,3,1,6,Y}     {5,4,1}
{5,4,1}           {5,3,1}           {5,2,4,1,6,Y}
{5,2,3,1,6,Y}     {5,2,3,1,6,Y}     {5,4,6,1,Y}
{5,4,6,1,Y}       {5,3,6,1,Y}       {5,2,4,1,6,Y}
{5,2,4,1,6,Y}     {5,3,6,1,Y}       {5,2,4,1,6,Y}
{5,3,6,1,Y}       {5,2,3,1,6,Y}     {5,4,6,1,Y}
[Figure: automaton with states X, Y, and 1 to 6, and edges labeled a and b]
- 13 -
I   a   b
0   1   2
1   3   2
2   1   5
3   3   4
4   6   5
5   6   5
6   3   4
[Figure: DFA with states 0 to 6 and edges labeled a and b]
- 14 -
Exercise
[Figure: automata with numbered states 1 to 6 (start state 1) and lettered states A, B, and C; edges labeled a, b, and ε]
- 15 -
Class Problem
[Figure: NFA with states 0 to 9 and edges labeled a, b, and ε]
Convert this NFA to a DFA
- 16 -
Outline
DFA & NFA
» DFA
» NFA
» NFA → DFA
» Minimize DFA
Regular expression
Regular languages
Scanner
- 17 -
State Minimization
The resulting DFA can be quite large
» It may contain redundant or equivalent states
[Figure: a five-state DFA (states 1 to 5, start state 1) and a three-state DFA (states 1 to 3, start state 1), with edges labeled a and b]
Both DFAs accept b*ab*a
- 18 -
Obtaining the minimal equivalent DFA
Initially there are two equivalence classes: accept states and nonaccept states.
Search for an equivalence class C and an input letter a such that, with a as input, the states in C make transitions to states in k > 1 different equivalence classes.
Partition C into k classes accordingly.
Repeat until no class can be partitioned.
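The refinement loop can be sketched in a few lines. The version below is a simplified variant, not the exact procedure on the slide: instead of searching for one class and one letter at a time, each round splits every class on every letter at once, which reaches the same final partition. The four-state DFA over {a, b} is a made-up example (it accepts strings ending in "ab" and contains one redundant state), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// DFA minimization sketch by repeated partition refinement.
final class MinimizeSketch {
    // Hypothetical DFA: DELTA[state] = {next on a, next on b}.
    static final int[][] DELTA = {
        {1, 3},  // state 0 (start)
        {1, 2},  // state 1
        {1, 3},  // state 2 (accept)
        {1, 0},  // state 3 (behaves exactly like state 0)
    };
    static final boolean[] ACCEPT = {false, false, true, false};

    // Returns cls[], where cls[s] is the equivalence class of state s.
    static int[] minimize(int[][] delta, boolean[] accept) {
        int n = delta.length;
        int[] cls = new int[n];
        for (int s = 0; s < n; s++) {
            cls[s] = accept[s] ? 1 : 0;  // initial split: accept vs. nonaccept
        }
        int previousCount = 0;
        while (true) {
            // Two states stay together only if they are in the same class
            // and their successors on every letter are in the same classes.
            Map<List<Integer>, Integer> ids = new HashMap<>();
            int[] next = new int[n];
            for (int s = 0; s < n; s++) {
                List<Integer> signature = new ArrayList<>();
                signature.add(cls[s]);
                for (int t : delta[s]) {
                    signature.add(cls[t]);
                }
                Integer id = ids.get(signature);
                if (id == null) {
                    id = ids.size();
                    ids.put(signature, id);
                }
                next[s] = id;
            }
            if (ids.size() == previousCount) {
                return next;  // no class was split in this round
            }
            previousCount = ids.size();
            cls = next;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(minimize(DELTA, ACCEPT)));
    }
}
```

In this example states 0 and 3 end up in the same class, so the minimal DFA has three states. The letter b is what separates state 1 from states 0 and 3, because only state 1 moves to the accept state on b.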
- 19 -
Minimization Example
Split into two teams: ACCEPT vs. NONACCEPT
- 20 -
Minimization Example
The 0-label doesn’t split up any teams
- 21 -
Minimization Example
The 1-label splits up the NONACCEPT team
- 22 -
Minimization Example
No further splits. HALT!
The start team is the one that contains the original start state
- 23 -
Minimization Example: End Result
The states of the minimal automaton are the remaining teams. Edges are consolidated across each team. Accept states are break-offs from the original ACCEPT team.
- 24 -
Minimization Example: Compare
100100101
10000
- 25 -
Class Exercise
[Figure: DFA with states a to h and edges labeled 0 and 1]
- 26 -
Exercise
How can the following DFA be minimized?
[Figure: a five-state DFA (states 1 to 5, start state 1) and a three-state DFA (states 1 to 3, start state 1), with edges labeled a and b]
Both DFAs accept b*ab*a
- 27 -
Outline
DFA & NFA
Regular expression
Regular languages
Scanner
- 28 -
Regular Expressions
R is a regular expression if R is
» “a” for some a in Σ
» ε (the empty string)
» ∅ (the empty language)
» R1 | R2, the union of two regular expressions
» R1 R2, the concatenation of two regular expressions
» R1* (Kleene closure: zero or more repetitions of R1)
- 29 -
Examples of Regular Expressions
{0, 1}* 0: all strings that end in 0
{0, 1} 0*: strings that start with 1 or 0, followed by zero or more 0s
{0, 1}*: all strings
{0^n 1^n, n >= 0}: not a regular language, so no regular expression describes it!
- 30 -
Regular Expressions in Java
Ex: pattern match. Is text in the set described by the pattern?
public class RE {
    public static void main(String[] args) {
        String pattern = args[0];
        String text = args[1];
        System.out.println(text.matches(pattern));
    }
}
% java RE "..oo..oo." bloodroot
true
% java RE "[$_A-Za-z][$_A-Za-z0-9]*" ident123
true
% java RE "[a-z]+@([a-z]+\.)+(edu|com)" [email protected]
- 31 -
Regular Expression Notation in Java
a: an ordinary letter
ε: the empty string
M | N: choosing from M or N
MN: concatenation of M and N
M*: zero or more times (Kleene star)
M+: one or more times
M?: zero or one occurrence
[a-zA-Z]: character set alternation (choice)
.: period stands for any single character except newline
- 32 -
Converting a regular expression to an NFA
Empty string
Single character
Union operator
Concatenation
Kleene closure
- 33 -
Regular expression→NFA
Language: Strings of 0s and 1s in which the number of 0s is even
Regular expression: (1*01*0)*1*
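A quick way to gain confidence that this expression describes the stated language is to compare it with a direct count of 0s on every short binary string. The sketch below does that with `String.matches`. The class and method names are hypothetical, and the check is a sanity test up to a chosen length, not a proof.

```java
// Cross-checks the regular expression (1*01*0)*1* against a direct count of 0s.
final class EvenZeros {
    static final String PATTERN = "(1*01*0)*1*";

    static boolean matchesPattern(String s) {
        return s.matches(PATTERN);
    }

    static boolean hasEvenZeros(String s) {
        long zeros = s.chars().filter(c -> c == '0').count();
        return zeros % 2 == 0;
    }

    // Compares both checks on every binary string of length <= maxLen.
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < len; i++) {
                    sb.append((bits >> i) & 1);  // appends the digit 0 or 1
                }
                String s = sb.toString();
                if (matchesPattern(s) != hasEvenZeros(s)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("agree on all strings up to length 10: " + agreeUpTo(10));
    }
}
```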
- 34 -
NFA → DFA
Initial classes: {A, B, E}, {C, D}
No class requires partitioning!
Hence a two-state DFA is obtained.
- 35 -
Minimize DFA
- 36 -
Outline
DFA & NFA
Regular expression
Regular languages
Scanner
- 37 -
Regular language
A formal language
» a set of finite sequences of symbols from a finite alphabet
It can be generated by a regular grammar
- 38 -
Regular Grammar
Later definitions build on earlier ones
Nothing is defined in terms of itself (no recursion)
Regular grammar for numeric literals in Pascal:
digit → 0|1|2|...|8|9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer (( . unsigned_integer) | ε ) (( e (+ | - | ε ) unsigned_integer ) | ε )
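Because no rule refers to itself, each definition can be expanded by plain substitution into a single regular expression. The sketch below mirrors the three rules as Java string constants and compiles the result with `java.util.regex.Pattern`. The constant and method names are hypothetical, and the optional parts written with "| ε" in the grammar are expressed with the `?` operator.

```java
import java.util.regex.Pattern;

// Builds one regular expression for Pascal numeric literals by substitution.
final class PascalNumber {
    // digit -> 0|1|2|...|8|9
    static final String DIGIT = "[0-9]";
    // unsigned_integer -> digit digit*
    static final String UNSIGNED_INTEGER = DIGIT + DIGIT + "*";
    // unsigned_number -> unsigned_integer ((. unsigned_integer) | ε)
    //                    ((e (+ | - | ε) unsigned_integer) | ε)
    static final String UNSIGNED_NUMBER =
        UNSIGNED_INTEGER
            + "(\\." + UNSIGNED_INTEGER + ")?"
            + "(e[+-]?" + UNSIGNED_INTEGER + ")?";

    static final Pattern PATTERN = Pattern.compile(UNSIGNED_NUMBER);

    // True only if the whole string is an unsigned_number.
    static boolean isUnsignedNumber(String s) {
        return PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.14", "6e23", "1.5e-10", "1.", ".5", "1e"}) {
            System.out.println(s + " -> " + isUnsignedNumber(s));
        }
    }
}
```

The grammar requires digits on both sides of the decimal point, so inputs such as "1." and ".5" are rejected.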
- 39 -
Important Theorems
A language is regular if and only if a regular expression describes it.
A language is regular if and only if a finite automaton recognizes it.
DFAs and NFAs are equally powerful.
- 40 -
Outline
DFA & NFA
Regular expression
Regular languages
Scanner
- 41 -
Scanning
Accept the longest possible token in each invocation of the scanner.
Implementation: capture the finite automaton in code.
» Case (switch) statements
» Table and driver
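The table-and-driver style and the longest-match rule can be sketched together. The example below is a toy scanner, not the Pascal scanner from the course. It assumes a tiny hypothetical token set (identifiers, integers, reals, and a lone dot), and every name in it is made up. The driver remembers the last accepting state it passed through and backs up to it when the automaton gets stuck.

```java
import java.util.ArrayList;
import java.util.List;

// Table-driven scanner sketch with the longest-match rule.
final class ScannerSketch {
    // Character classes (columns): 0 = letter, 1 = digit, 2 = dot.
    static int classOf(char c) {
        if (c >= 'a' && c <= 'z') return 0;
        if (c >= '0' && c <= '9') return 1;
        if (c == '.') return 2;
        return -1;  // not part of any token
    }

    // TABLE[state][class] = next state, or -1 if there is no transition.
    static final int[][] TABLE = {
        { 1,  2,  5},  // 0: start
        { 1,  1, -1},  // 1: identifier
        {-1,  2,  3},  // 2: integer
        {-1,  4, -1},  // 3: integer followed by '.', not accepting
        {-1,  4, -1},  // 4: real
        {-1, -1, -1},  // 5: dot
    };
    // Token kind recognized in each state; null means not accepting.
    static final String[] KIND = {null, "ID", "INT", null, "REAL", "DOT"};

    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < src.length()) {
            if (src.charAt(pos) == ' ') {  // skip blanks between tokens
                pos++;
                continue;
            }
            int state = 0;
            int lastAcceptEnd = -1;
            String lastKind = null;
            for (int i = pos; i < src.length(); i++) {
                int cls = classOf(src.charAt(i));
                if (cls < 0) break;
                state = TABLE[state][cls];
                if (state < 0) break;
                if (KIND[state] != null) {  // remember the longest match so far
                    lastKind = KIND[state];
                    lastAcceptEnd = i + 1;
                }
            }
            if (lastKind == null) {
                throw new IllegalArgumentException("no token at position " + pos);
            }
            tokens.add(lastKind + ":" + src.substring(pos, lastAcceptEnd));
            pos = lastAcceptEnd;  // back up to the end of the longest match
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("x1 12.5 12.x"));
    }
}
```

On the input "12.x" the automaton reads "12." and then fails on "x", so the driver backs up and emits the integer 12 before rescanning from the dot. A case-statement scanner hard-codes the same transitions in nested switch statements instead of a table.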
- 42 -
Scanner for Pascal
- 43 -
Scanner for Pascal (case statements)
- 44 -
Scanner Generators
Start with a regular expression.
Construct an NFA from it.
Use the subset construction to obtain an equivalent DFA.
Construct the minimal equivalent DFA.