Lecture 17
Naveen Z Quazilbash
Simplification of Grammars
Overview
• Attendance
• Motivation
• Simplification of Grammars
  1. Eliminating useless variables
  2. Eliminating null productions
  3. Eliminating unit productions
• Quiz result
Motivation for grammar simplification
Parsing Problem
• Given a CFG G and a string w, determine whether w ∈ L(G).
• This is a fundamental problem in compiler design and natural language processing.
• If G is in general form, the procedure may be very inefficient. So the grammar is “transformed” into a simpler form to make the parsing problem easier.
Simplification of Grammars
It involves the removal of:
1. Useless variables
2. ε-productions
3. Unit productions

Useless variables:
There are two types of useless variables:
1. Variables that cannot be reached
2. Variables that do not derive any strings

3. ε-productions
E.g.: A → ε
• Note that if we remove these productions, the language no longer includes the empty string.
4. Unit productions:
They are of the form A → B or A → A.
1) Unreachable Variables
E.g.:
S → BS | B | E
A → DA | D | S
B → CB | C
C → aC | a
D → bD | b
E → cE | c
To find unreachable variables, draw a dependency graph
Dependency Graph:
• Vertices of the graph are variables.
• The graph doesn’t include alphabet symbols, such as “a” or “b”.
• If there is a production A → …B…, i.e., the left side is A and the right side includes B, then there is an edge A → B.
A variable is reachable if there is a path from S to this variable
S itself is always reachable
After identifying unreachable variables, remove all productions with unreachable left side.
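S → BS | B | E
A → DA | D | S
B → CB | C
C → aC | a
D → bD | b
E → cE | c

Drawing its dependency graph:
[Figure: dependency graph on the vertices S, A, B, C, D, E, with edges S → B, S → E, A → D, A → S, B → C; self-loops omitted]
Reachable: S, B, C, E

Grammar without unreachable variables:
S → BS | B | E
B → CB | C
C → aC | a
E → cE | c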
Ex: Determine its language!!
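The reachability check above can be sketched in Python. This is a minimal illustration, not part of the lecture: it assumes a grammar is stored as a dictionary from a variable to its list of right-hand sides, that variables are single uppercase letters, and that terminals are lowercase. The names `reachable_variables` and `remove_unreachable` are hypothetical.

```python
def reachable_variables(grammar, start="S"):
    """Return the set of variables reachable from the start variable."""
    reachable = {start}          # S itself is always reachable
    stack = [start]
    while stack:
        var = stack.pop()
        for rhs in grammar.get(var, []):
            for symbol in rhs:
                # An edge var -> symbol exists when symbol is a variable
                # (assumed here: an uppercase letter) on the right side.
                if symbol.isupper() and symbol not in reachable:
                    reachable.add(symbol)
                    stack.append(symbol)
    return reachable


def remove_unreachable(grammar, start="S"):
    """Drop every production whose left side is unreachable."""
    keep = reachable_variables(grammar, start)
    return {var: rhss for var, rhss in grammar.items() if var in keep}


grammar = {
    "S": ["BS", "B", "E"],
    "A": ["DA", "D", "S"],
    "B": ["CB", "C"],
    "C": ["aC", "a"],
    "D": ["bD", "b"],
    "E": ["cE", "c"],
}
print(sorted(reachable_variables(grammar)))  # ['B', 'C', 'E', 'S']
print(remove_unreachable(grammar))
```

The traversal follows paths in the dependency graph without ever building the graph explicitly; each production's right side is scanned for variables as it is visited.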
2) Variables that don’t terminate
A variable A terminates if either:
• There is a production A → … with no variables on the right, e.g. A → aabc,
OR
• There is a production A → … where all variables on the right terminate; e.g. A → aBbaC, where B and C terminate.
Note: to find all variables that terminate, keep looking for such productions until you cannot find any new ones.
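TASK
Example:
S → A | BC | DE
A → aA | bA
B → bB | b
C → EF
D → dD | BD | BA
E → aE | a
F → cFc | c
Remove all productions that include a variable that doesn’t terminate.
Note: We remove a production if it has such a variable on either side.

Solution
S → A | BC | DE (terminates)
A → aA | bA (does not terminate)
B → bB | b (terminates)
C → EF (terminates)
D → dD | BD | BA (does not terminate)
E → aE | a (terminates)
F → cFc | c (terminates)

Resulting grammar:
S → BC
B → bB | b
C → EF
E → aE | a
F → cFc | c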
Ex: Determine its language.
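The "keep looking until you cannot find any new ones" rule is a fixed-point iteration. The sketch below is one possible way to write it, under the same assumed representation (dictionary of right-hand-side strings, uppercase variables, lowercase terminals). The function names are hypothetical.

```python
def terminating_variables(grammar):
    """Repeatedly add variables that have a production whose right side
    contains only terminals and already-terminating variables."""
    terminating = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            if var in terminating:
                continue
            for rhs in rhss:
                if all(not s.isupper() or s in terminating for s in rhs):
                    terminating.add(var)
                    changed = True
                    break
    return terminating


def remove_nonterminating(grammar):
    """Drop productions with a non-terminating variable on either side."""
    ok = terminating_variables(grammar)
    return {
        var: [rhs for rhs in rhss
              if all(not s.isupper() or s in ok for s in rhs)]
        for var, rhss in grammar.items()
        if var in ok
    }


grammar = {
    "S": ["A", "BC", "DE"],
    "A": ["aA", "bA"],
    "B": ["bB", "b"],
    "C": ["EF"],
    "D": ["dD", "BD", "BA"],
    "E": ["aE", "a"],
    "F": ["cFc", "c"],
}
print(sorted(terminating_variables(grammar)))  # ['B', 'C', 'E', 'F', 'S']
print(remove_nonterminating(grammar))
```

On the task grammar this keeps S → BC, B → bB | b, C → EF, E → aE | a and F → cFc | c, matching the solution above.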
3) Eliminating ε-Productions
Nullable variables:
A variable is nullable if either:
• There is a production A → ε, or
• There is a production A → B1B2…Bn (only variables, no symbols), where all variables on the right side are nullable.
Note: to find all nullable variables, keep looking for such productions, until you cannot find any new ones.
TASK
S → SAB | SBC | BC
A → aA | a
B → bB | bC | C
C → cC | ε

First we find variables that can lead to the empty string:
C => ε
B => C => ε
S => BC => B => C => ε

S → SAB | SBC | BC (nullable)
A → aA | a
B → bB | bC | C (nullable)
C → cC | ε (nullable)
Thus, S, B, and C can lead to ε; they are called nullable variables
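Finding the nullable variables follows the same fixed-point pattern. A minimal sketch, assuming the grammar is a dictionary of right-hand-side strings and that ε is written as the empty string `""` (both are conventions chosen for this illustration). `nullable_variables` is a hypothetical name.

```python
def nullable_variables(grammar):
    """Return the variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            if var in nullable:
                continue
            for rhs in rhss:
                # True for "" (an epsilon production) and for right sides
                # made up only of variables already known to be nullable.
                # Any terminal makes the test fail, since terminals are
                # never in the nullable set.
                if all(symbol in nullable for symbol in rhs):
                    nullable.add(var)
                    changed = True
                    break
    return nullable


grammar = {
    "S": ["SAB", "SBC", "BC"],
    "A": ["aA", "a"],
    "B": ["bB", "bC", "C"],
    "C": ["cC", ""],   # "" stands for epsilon
}
print(sorted(nullable_variables(grammar)))  # ['B', 'C', 'S']
```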
For each production that has nullable variables, consider all possible ways to skip some of these variables and add the corresponding productions.
E.g. W → aWXaYZb; suppose that X, Y and Z are nullable; then there are 8 ways to skip some of them:
W → aWab | aWXab | aWaYb | aWaZb | aWXaYb | aWXaZb | aWaYZb | aWXaYZb
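The "all possible ways to skip" step can be sketched as follows: each nullable variable on the right side is either kept or dropped, and every combination is generated. This is an illustrative sketch under the assumption that right sides are strings of single-character symbols; `expand_rhs` is a hypothetical helper name.

```python
from itertools import product


def expand_rhs(rhs, nullable):
    """Return every right side obtained by keeping or skipping each
    nullable variable in rhs (the empty result is left out)."""
    # Each symbol contributes either itself, or (if nullable) itself or "".
    options = [(s, "") if s in nullable else (s,) for s in rhs]
    results = []
    for choice in product(*options):
        candidate = "".join(choice)
        if candidate and candidate not in results:
            results.append(candidate)
    return results


alternatives = expand_rhs("aWXaYZb", {"X", "Y", "Z"})
print(len(alternatives))  # 8
print(alternatives)
```

With three nullable variables there are 2 × 2 × 2 = 8 keep-or-skip combinations, which is where the count of 8 in the example comes from.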
Back to our grammar, where S, B and C are nullable:
S → A | AB | SA | SAB | S | B | C | SB | SC | BC | SBC
A → aA | a
B → b | bB | bC | C
C → c | cC | ε
Now we can remove the ε-productions without changing the language.
The only possible change is losing the empty string, if it is in the original language.
So our grammar without null productions becomes:
S → A | AB | SA | SAB | S | B | C | SB | SC | BC | SBC
A → aA | a
B → b | bB | bC | C
C → c | cC
4) Eliminating Unit Productions
S → Aa | B
A → a | bc | B
B → A | bb | C | cC
C → a | C
First, for every variable, we find all single variables that can be reached from it:
For S: S => B => A, S => B => C
For A: A => B => C
For B: B => A, B => C
For C: NONE (C itself doesn’t count)
For finding reachable single variables, what should we do?
Use a Dependency Graph!
Drawing the Dependency Graph:
• Vertices of the graph are variables.
• If there is a unit production A → B, then there is an edge A → B.
• A single variable B is reachable from A if there is a path from A to B.
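Dependency Graph:
[Figure: dependency graph on the vertices S, A, B, C, with edges S → B, A → B, B → A, B → C and a self-loop C → C]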
To construct an equivalent grammar without unit productions:
• Remove all unit productions.
• For each pair A =>* B, where B is a single variable reachable from A, consider all non-unit productions B → p1 | p2 | … | pn, and add the corresponding productions A → p1 | p2 | … | pn.
For example, since A =>* B and B → bb | cC, add the productions A → bb | cC.
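S → Aa | B
A → a | bc | B
B → A | bb | C | cC
C → a | C

Old non-unit productions:
S → Aa
B → bb | cC
A → a | bc
C → a

New productions:
S → bb | cC | a | bc | a
B → a | bc | a
A → bb | cC | a
C: no new productions (no single variable is reachable from C)

Note that the variable B has become useless (it can no longer be reached from S) and we need to remove it!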
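The whole unit-production step can be sketched in Python. This is an illustration under stated assumptions rather than the lecture's own procedure in code: the grammar is a dictionary of right-hand-side strings, a unit production is a right side consisting of exactly one uppercase letter, and repeated alternatives (such as the second a for S) are dropped. The names `is_unit`, `unit_reachable` and `remove_unit_productions` are hypothetical.

```python
def is_unit(rhs):
    """A right side is a unit production if it is a single variable."""
    return len(rhs) == 1 and rhs.isupper()


def unit_reachable(grammar, var):
    """Single variables reachable from var via unit productions
    (var itself doesn't count)."""
    seen = set()
    stack = [var]
    while stack:
        current = stack.pop()
        for rhs in grammar.get(current, []):
            if is_unit(rhs) and rhs != var and rhs not in seen:
                seen.add(rhs)
                stack.append(rhs)
    return seen


def remove_unit_productions(grammar):
    """Keep the old non-unit productions and add, for each reachable
    single variable, that variable's non-unit productions."""
    result = {}
    for var, rhss in grammar.items():
        new_rhss = [rhs for rhs in rhss if not is_unit(rhs)]
        for other in sorted(unit_reachable(grammar, var)):
            for rhs in grammar.get(other, []):
                if not is_unit(rhs) and rhs not in new_rhss:
                    new_rhss.append(rhs)
        result[var] = new_rhss
    return result


grammar = {
    "S": ["Aa", "B"],
    "A": ["a", "bc", "B"],
    "B": ["A", "bb", "C", "cC"],
    "C": ["a", "C"],
}
for var, rhss in remove_unit_productions(grammar).items():
    print(var, "->", " | ".join(rhss))
```

In the result no right side mentions B any more, so a subsequent unreachable-variable pass would remove B, as the note above says.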
Summary
Main steps of simplifying a grammar:
1. Remove useless variables, which cannot be reached or do not terminate.
2. Remove ε-productions.
3. Remove unit productions.
4. Remove useless variables again!