meljun cortes automata theory 9

Upload: meljun-cortes-mbampa

Post on 04-Jun-2018

  • 8/13/2019 MELJUN CORTES Automata Theory 9

    1/22

    CSC 3130: Automata theory and formal languages

    Normal forms and parsing

    Fall 2008

    MELJUN P. CORTES, MBA, MPA, BSCS, ACS

    Testing membership and parsing

    Given a grammar

    S → 0S1 | 1S0S1 | T
    T → S | ε

    How can we know if a string x is in its language?

    If so, can we reconstruct a parse tree for x?

    First attempt

    Maybe we can try all possible derivations:

    S → 0S1 | 1S0S1 | T
    T → S | ε

    x = 00111

    S
      ⇒ 0S1
          ⇒ 00S11
          ⇒ 01S0S11
          ⇒ 0T1
      ⇒ 1S0S1
          ⇒ 10S10S1
          ⇒ ...
      ⇒ T
          ⇒ S

    When do we stop?

    Problems

    How do we know when to stop?

    S → 0S1 | 1S0S1 | T
    T → S | ε

    x = 00111

    S
      ⇒ 0S1
          ⇒ 00S11
          ⇒ 01S0S11
          ⇒ 0T1
      ⇒ 1S0S1
          ⇒ 10S10S1
          ⇒ ...

    Problems

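    Idea: Stop a derivation when its length exceeds |x|

    Not right because of ε-productions: a sentential form can grow longer than x and shrink back later

    We might want to eliminate ε-productions too

    S → 0S1 | 1S0S1 | T
    T → S | ε

    x = 01011

    S ⇒ 0S1        (length 3)
      ⇒ 01S0S11    (length 7, longer than |x| = 5)
      ⇒* 01S011    (length 6, erasing S via S ⇒ T ⇒ ε)
      ⇒* 01011     (length 5)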

    Problems

    Loops among the variables (S → T → S) might make us go forever

    We might want to eliminate such loops

    S → 0S1 | 1S0S1 | T
    T → S | ε

    x = 00111

    Unit productions

    A unit production is a production of the form

    A1 → A2

    where A1 and A2 are both variables

    Example

    grammar:
    S → 0S1 | 1S0S1 | T
    T → S | R | ε
    R → 0SR

    unit productions:
    S → T
    T → S
    T → R

    Removal of unit productions

    If there is a cycle of unit productions

    A1 → A2 → ... → Ak → A1

    delete it and replace everything with A1

    Example

    S → 0S1 | 1S0S1 | T
    T → S | R | ε
    R → 0SR

    unit productions: S → T, T → S, T → R

    becomes

    S → 0S1 | 1S0S1
    S → R | ε
    R → 0SR

    T is replaced by S in the {S, T} cycle

    Removal of unit productions

    For other unit productions, replace every chain

    A1 → A2 → ... → Ak → α

    by productions A1 → α, ..., Ak → α

    Example

    S → R → 0SR is replaced by S → 0SR, R → 0SR

    S → 0S1 | 1S0S1 | R | ε
    R → 0SR

    becomes

    S → 0S1 | 1S0S1 | 0SR | ε
    R → 0SR
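
    A rough sketch of unit-production removal follows. It does not collapse cycles and then rewrite chains as the slides do; instead it uses the unit-closure technique, where each variable inherits the non-unit bodies of every variable it reaches through unit productions. The dict-of-tuples representation (with () for ε) and the function name are assumptions made for illustration.

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar without productions of the form A -> B.

    grammar maps each variable to a list of bodies; a body is a tuple of
    symbols and () stands for ε (assumed representation).
    """
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for var in grammar:
        # Collect every variable reachable from var through unit productions.
        reachable, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body[0] not in reachable:
                    reachable.add(body[0])
                    stack.append(body[0])
        # var inherits every non-unit body of the variables it reaches.
        bodies = {b for r in reachable for b in grammar[r] if not is_unit(b)}
        result[var] = sorted(bodies)
    return result


example = {
    "S": [tuple("0S1"), tuple("1S0S1"), ("T",)],
    "T": [("S",), ("R",), ()],
    "R": [tuple("0SR")],
}
result = remove_unit_productions(example)
```

    On the example grammar, S ends up with the bodies 0S1, 1S0S1, 0SR and ε, matching the slides. Unlike the slides' method, T is kept as a separate variable with the same bodies as S rather than being merged into S.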

    Removal of ε-productions

    A variable N is nullable if there is a derivation

    N ⇒* ε

    How to remove ε-productions (except from S)

    Find all nullable variables N1, ..., Nk

    For i = 1 to k
      For every production of the form A → αNiβ, add another production A → αβ
      If Ni → ε is a production, remove it

    If S is nullable, add the special production S → ε

    Example

    Find the nullable variables

    grammar:
    S → ACD
    A → a
    B → ε
    C → ED | ε
    D → BC | b
    E → b

    nullable variables: B, C, D

    Find all nullable variables N1, ..., Nk

    Finding nullable variables

    To find nullable variables, we work backwards

    First, mark all variables A such that A → ε as nullable

    Then, as long as there are productions of the form

    A → A1 … Ak

    where all of A1, …, Ak are marked as nullable, mark A as nullable
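
    The marking procedure above is a fixed-point computation. A minimal sketch, assuming a hypothetical representation where the grammar is a dict from variable to a list of bodies, each body a tuple of symbols and () standing for ε:

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for variable, bodies in grammar.items():
            if variable in nullable:
                continue
            # A body made only of nullable variables can vanish entirely;
            # the empty body () satisfies this trivially.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(variable)
                changed = True
    return nullable


example = {
    "S": [("A", "C", "D")],
    "A": [("a",)],
    "B": [()],
    "C": [("E", "D"), ()],
    "D": [("B", "C"), ("b",)],
    "E": [("b",)],
}
print(sorted(nullable_variables(example)))  # ['B', 'C', 'D']
```

    The loop repeats until a full pass marks nothing new, so the order in which productions are visited does not affect the result.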

    Eliminating ε-productions

    S → ACD
    A → a
    B → ε
    C → ED | ε
    D → BC | b
    E → b

    nullable variables: B, C, D

    For i = 1 to k
      For every production of the form A → αNiβ, add another production A → αβ
      If Ni → ε is a production, remove it

    new productions:
    D → C
    S → AD
    D → B
    D → ε
    S → AC
    S → A
    C → E
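
    The elimination step can be sketched as follows. Instead of looping over N1, ..., Nk one at a time, this version drops every subset of nullable occurrences from each body in a single pass, which should yield the same productions. The dict-of-tuples representation (with () for ε) and the function name are assumptions for illustration.

```python
from itertools import product


def eliminate_epsilon(grammar, start):
    """Return a grammar without ε-productions, except possibly start -> ε."""
    # Step 1: find the nullable variables by fixed-point iteration.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(var)
                changed = True

    # Step 2: for each body, keep or drop every nullable occurrence.
    result = {}
    for var, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            choices = [
                ((sym,), ()) if sym in nullable else ((sym,),) for sym in body
            ]
            for pick in product(*choices):
                candidate = tuple(s for part in pick for s in part)
                if candidate:  # never keep an empty body here
                    new_bodies.add(candidate)
        result[var] = sorted(new_bodies)

    # Step 3: the special production for a nullable start variable.
    if start in nullable:
        result[start].append(())
    return result


example = {
    "S": [("A", "C", "D")],
    "A": [("a",)],
    "B": [()],
    "C": [("E", "D"), ()],
    "D": [("B", "C"), ("b",)],
    "E": [("b",)],
}
result = eliminate_epsilon(example, "S")
```

    On the example, S gains AD, AC and A, D gains C and B, and C gains E, as in the list of new productions above. B is left with no productions at all, so B and the bodies that mention it could be pruned in a later cleanup.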

    Recap

    After eliminating ε-productions and unit productions, we know that every derivation

    S ⇒* a1…ak where a1, …, ak are terminals

    doesn't shrink in length and doesn't go into cycles

    Exception: S → ε
    We will not use this rule at all, except to check if ε ∈ L

    Note

    ε-productions must be eliminated before unit productions, because eliminating ε-productions can create new unit productions

    Algorithm 1 for testing membership

    We can now use the following algorithm to check if a string x is in the language of G

    Eliminate all ε-productions and unit productions

    If x = ε and S → ε, accept; else delete S → ε

    Let X := S

    While some new production P can be applied to X
      Apply P to X
      If X = x, accept
      If |X| > |x|, backtrack

    If no more productions can be applied to X, reject
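
    Algorithm 1 can be sketched as a backtracking search over sentential forms. The sketch below always expands the leftmost variable and remembers forms it has already seen; both choices, the dict-of-tuples representation, and the name derives are assumptions rather than details from the slides. The example grammar is the slides' grammar S → 0S1 | 1S0S1 | T, T → S | ε after eliminating ε-productions and unit productions, worked out by hand.

```python
def derives(grammar, start, x):
    """Exhaustive derivation search with length pruning.

    Assumes ε-productions and unit productions are already eliminated,
    apart from an optional start -> () rule.
    """
    variables = set(grammar)
    target = tuple(x)
    if not target:
        return () in grammar[start]
    seen = set()
    stack = [(start,)]
    while stack:
        form = stack.pop()
        if form == target:
            return True
        # Find the leftmost variable; a form with none is a dead end.
        index = next((i for i, s in enumerate(form) if s in variables), None)
        if index is None:
            continue
        for body in grammar[form[index]]:
            if not body:
                continue  # the special start -> ε rule is never applied
            new_form = form[:index] + body + form[index + 1:]
            # Backtrack (do not explore) once the form is longer than x.
            if len(new_form) <= len(target) and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return False


cleaned = {
    "S": [tuple("0S1"), tuple("01"), tuple("1S0S1"),
          tuple("10S1"), tuple("1S01"), tuple("101"), ()],
}
print(derives(cleaned, "S", "01011"))  # True
print(derives(cleaned, "S", "00111"))  # False
```

    The search terminates because only finitely many forms have length at most |x|, but the number of such forms can still grow exponentially with |x|.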

    Practical limitations of Algorithm 1

    Previous algorithm can be very slow if x is long

    G = CFG of the Java programming language
    x = code for a 200-line Java program
    algorithm might take about 10^200 steps!

    There is a faster algorithm, but it requires that we do some more transformations on the grammar

    Chomsky Normal Form

    A grammar is in Chomsky Normal Form if every production (except possibly S → ε) is of the type

    A → BC or A → a

    Conversion to Chomsky Normal Form is easy:

    A → BcDE

    replace terminals with new variables:

    A → BCDE
    C → c

    break up sequences with new variables:

    A → BX1
    X1 → CX2
    X2 → DE
    C → c
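
    The two conversion steps can be sketched as below, assuming ε-productions and unit productions have already been removed. The representation (dict of variable to list of tuple bodies) and the generated names T_c and X1, X2, ... are hypothetical; the slides name the terminal variable C instead.

```python
def to_cnf(grammar):
    """Convert a grammar without ε- or unit productions to CNF (sketch)."""
    variables = set(grammar)
    result = {var: [] for var in grammar}
    terminal_vars = {}  # terminal -> fresh variable that derives it
    counter = 0

    def fresh_name(base):
        # Append primes until the name clashes with no existing variable.
        name = base
        while name in variables:
            name += "'"
        variables.add(name)
        return name

    for var, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                result[var].append(body)  # already of the form A -> a
                continue
            # Step 1: replace each terminal by a variable that derives it.
            symbols = []
            for sym in body:
                if sym in grammar:
                    symbols.append(sym)
                    continue
                if sym not in terminal_vars:
                    name = fresh_name(f"T_{sym}")
                    terminal_vars[sym] = name
                    result[name] = [(sym,)]
                symbols.append(terminal_vars[sym])
            # Step 2: break bodies longer than two with helper variables.
            head = var
            while len(symbols) > 2:
                counter += 1
                helper = fresh_name(f"X{counter}")
                result.setdefault(head, []).append((symbols[0], helper))
                head, symbols = helper, symbols[1:]
            result.setdefault(head, []).append(tuple(symbols))
    return result


example = {
    "A": [("B", "c", "D", "E")],
    "B": [("b",)],
    "D": [("d",)],
    "E": [("e",)],
}
result = to_cnf(example)
```

    For the production A → BcDE this yields A → B X1, X1 → T_c X2, X2 → D E and T_c → c, the same shape as the slides' result with T_c playing the role of C.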

    Exercise

    Convert this CFG into Chomsky Normal Form:

    S → ε | ADDA

    A → a

    C → c

    D → bCb

    Parse tree reconstruction

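    S → AB | BC
    A → BA | a
    B → CC | b
    C → AB | a

    x = baaba

    Table of variables deriving each substring (one row per substring length, cells ordered by start position):

    symbols:    b      a      a      b      a
    length 1:   B      A,C    A,C    B      A,C
    length 2:   S,A    B      S,C    S,A
    length 3:   ∅      B      B
    length 4:   ∅      S,A,C
    length 5:   S,A,C

    Tracing back the derivations, we obtain the parse tree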
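
    Tracing back can be sketched by storing, for each variable in a cell, one split point and production that put it there. This is an illustrative sketch only: the names cyk_parse and leaves, the 0-indexed cells, and the nested-tuple tree format are assumptions. When a string has several parse trees, the sketch returns just one of them.

```python
def cyk_parse(grammar, start, x):
    """Return one parse tree for x as nested tuples, or None."""
    k = len(x)
    if k == 0:
        return None
    # back[(s, t)][A] records one way to derive x[s..t] (inclusive) from A.
    back = {(s, t): {} for s in range(k) for t in range(s, k)}
    for i, symbol in enumerate(x):
        for var, bodies in grammar.items():
            if (symbol,) in bodies:
                back[(i, i)][var] = symbol
    for length in range(2, k + 1):
        for s in range(k - length + 1):
            t = s + length - 1
            for j in range(s, t):  # split into x[s..j] and x[j+1..t]
                for var, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in back[(s, j)]
                                and body[1] in back[(j + 1, t)]):
                            back[(s, t)].setdefault(var, (j, body[0], body[1]))

    def build(var, s, t):
        entry = back[(s, t)][var]
        if s == t:
            return (var, entry)  # leaf: (variable, terminal)
        j, left, right = entry
        return (var, build(left, s, j), build(right, j + 1, t))

    if start not in back[(0, k - 1)]:
        return None
    return build(start, 0, k - 1)


def leaves(tree):
    """Concatenate the terminals at the leaves of a tree, left to right."""
    if isinstance(tree[1], str):
        return tree[1]
    return leaves(tree[1]) + leaves(tree[2])


grammar = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}
tree = cyk_parse(grammar, "S", "baaba")
```

    Reading the leaves of the returned tree from left to right gives back x, which is a quick sanity check on the reconstruction.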

    Cocke-Younger-Kasami algorithm

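    Input: Grammar G in CNF, string x = x1…xk

    Cell ij remembers all variables that can derive the substring xi…xj

    For i = 1 to k
      If there is a production A → xi
        Put A in table cell ii

    For b = 2 to k
      For s = 1 to k - b + 1
        Set t = s + b - 1
        For j = s to t - 1
          If there is a production A → BC
          where B is in cell sj and C is in cell (j+1)t
            Put A in cell st

    Table layout: the first row holds cells 11, 22, …, kk, one per symbol x1, x2, …, xk; the next row holds 12, 23, …; the last row is the single cell 1k. Row b holds the cells for substrings of length b, each running from s to t and split at j.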
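
    The loops above translate almost directly into code. A minimal sketch, keeping the slides' 1-indexed, inclusive cell numbering and assuming a hypothetical dict-of-tuples grammar representation; the function names are illustrative.

```python
def cyk_table(grammar, x):
    """Fill the CYK table; cell[(s, t)] holds variables deriving x_s..x_t."""
    k = len(x)
    cell = {(s, t): set() for s in range(1, k + 1) for t in range(s, k + 1)}
    # Substrings of length 1: productions A -> x_i.
    for i in range(1, k + 1):
        for var, bodies in grammar.items():
            if (x[i - 1],) in bodies:
                cell[(i, i)].add(var)
    # Substrings of length b, starting at s and ending at t.
    for b in range(2, k + 1):
        for s in range(1, k - b + 2):
            t = s + b - 1
            for j in range(s, t):  # left part x_s..x_j, right part x_{j+1}..x_t
                for var, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in cell[(s, j)]
                                and body[1] in cell[(j + 1, t)]):
                            cell[(s, t)].add(var)
    return cell


def cyk_accepts(grammar, start, x):
    """Membership test: is x in the language of the CNF grammar?"""
    if not x:
        return () in grammar[start]
    return start in cyk_table(grammar, x)[(1, len(x))]


grammar = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}
print(cyk_accepts(grammar, "S", "baaba"))  # True
```

    The three nested loops over b, s and j give on the order of k^3 cell combinations, each checked against every production, in contrast to the exhaustive search of Algorithm 1.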