CD M2 Lect 4 - Bottom Up Parsers (1).pptx

Compiler Design
Bottom Up Parsers

LR Parser
• LR parsers are used to parse a large class of context-free grammars.
• The technique is called LR(k) parsing:
  ◦ L denotes that the input sequence is processed from left to right.
  ◦ R denotes that a rightmost derivation is constructed, in reverse.
  ◦ k denotes that at most k lookahead input symbols are used to make a parsing decision.
• Reasons for the attractiveness of LR parsers:
  ◦ LR parsers can handle a large class of context-free grammars.
  ◦ An LR parser detects a syntax error as soon as it is possible to do so on a left-to-right scan of the input.
  ◦ LR parsing is the most general non-backtracking shift-reduce parsing method.
  ◦ The class of grammars that LR parsers can handle is a proper superset of the class that LL(1) predictive parsers can handle.

LR Parser
• Drawbacks of the method:
  ◦ The parsing tables are too complicated to construct by hand, so an automated parser generator is needed.
  ◦ Ambiguous grammars cannot be handled without special techniques.

LR Parser Algorithm
[Figure: model of an LR parser, with its input and output]

LR Parser
An LR parser consists of:
1. Input
2. Output
3. Stack
4. Driver (parser) program
5. Parse table, in two parts:
   1. Action
   2. Goto

LR Parser
• The driver program is the same for all LR parsers.
• Only the parsing table changes from one parser to another.
• The parsing program reads input symbols from an input buffer one at a time.
• The program uses a stack to store a string of the form
  S0 X1 S1 X2 S2 X3 . . . Xm Sm
  where Sm is on top of the stack,
  each Si is a symbol called a state, and
  each Xi is a grammar symbol.

action and goto
• The function action takes a state and an input symbol as arguments and produces one of four values:
  ◦ Shift S, where S is a state
  ◦ Reduce by a grammar production A → β
  ◦ Accept
  ◦ Error
• The function goto takes a state and a grammar symbol as arguments and produces a state.
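
The driver loop built on action and goto can be sketched in Python as follows. This is only an illustrative sketch: the ACTION and GOTO entries are not given on these slides; they are assumed to be the SLR(1) table for the expression grammar E → E + T | T, T → T * F | F, F → ( E ) | id used later in the deck, with productions numbered 1 to 6 in that order. The names lr_parse, ACTION, GOTO and PRODUCTIONS are hypothetical. The stack here keeps only states, a common simplification, since every state is entered on exactly one grammar symbol.

```python
# Productions of the expression grammar: number -> (head, length of body).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}


def _row(spec):
    """Turn 'id:s5 (:s4' into {'id': 's5', '(': 's4'}."""
    return dict(entry.split(":") for entry in spec.split())


# Assumed SLR(1) table: sN = shift to state N, rN = reduce by rule N.
ACTION = {
    0: _row("id:s5 (:s4"),
    1: _row("+:s6 $:acc"),
    2: _row("+:r2 *:s7 ):r2 $:r2"),
    3: _row("+:r4 *:r4 ):r4 $:r4"),
    4: _row("id:s5 (:s4"),
    5: _row("+:r6 *:r6 ):r6 $:r6"),
    6: _row("id:s5 (:s4"),
    7: _row("id:s5 (:s4"),
    8: _row("+:s6 ):s11"),
    9: _row("+:r1 *:s7 ):r1 $:r1"),
    10: _row("+:r3 *:r3 ):r3 $:r3"),
    11: _row("+:r5 *:r5 ):r5 $:r5"),
}
GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
    (6, "T"): 9, (6, "F"): 3,
    (7, "F"): 10,
}


def lr_parse(tokens):
    """Return the rule numbers in the order they are reduced.

    Raises SyntaxError when the action table has no entry (the Error case).
    """
    stack = [0]                      # state stack, initial state on the bottom
    tokens = list(tokens) + ["$"]    # append the end-of-input marker
    pos = 0
    output = []
    while True:
        state, symbol = stack[-1], tokens[pos]
        act = ACTION[state].get(symbol)
        if act is None:
            raise SyntaxError(f"unexpected {symbol!r} at position {pos}")
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push the new state, advance input
            stack.append(int(act[1:]))
            pos += 1
        else:                        # reduce by A -> beta
            rule = int(act[1:])
            head, length = PRODUCTIONS[rule]
            del stack[-length:]      # pop one state per symbol of beta
            stack.append(GOTO[(stack[-1], head)])
            output.append(rule)
```

For the token sequence id * id + id, lr_parse returns the rule numbers 6, 4, 6, 3, 2, 6, 4, 1, which is a rightmost derivation read in reverse.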

Construction of LR Parsing Table
• There are three techniques for constructing an LR parsing table for a grammar:
  ◦ SLR (Simple LR)
  ◦ Canonical LR
  ◦ LALR (Lookahead LR)

Construction of SLR Parsing Table
• The central idea is the construction of a DFA from the grammar.
• An LR(k) parser uses the stack contents and the next k symbols of the input to decide what is to be done.
• An LR(0) parser uses only the stack contents.
• Let G = (N, T, P, S) be a CFG.
  [A → w1 . w2 , u] is called an LR(k) item if A → w1 w2 is a production from P and u is a sequence of terminals whose length is less than or equal to k.
• LR(0) items contain no sequence of terminals, i.e., they have the form [A → w1 . w2].

LR(0) item
• A production A → XYZ generates four LR(0) items:
  A → .XYZ    A → X.YZ    A → XY.Z    A → XYZ.
• One collection of sets of LR(0) items, called the canonical LR(0) collection, is the basis for constructing the SLR parsing table.
• To construct the canonical LR(0) collection for a grammar, we need to define an augmented grammar and two functions, namely
  ◦ closure
  ◦ goto
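
As a small illustration of how one production yields its LR(0) items, the sketch below places the dot in every possible position of the body. The function name lr0_items and the string format of an item are assumptions made for this example only.

```python
def lr0_items(head, body):
    """Yield every LR(0) item of head -> body as a printable string.

    body is a sequence of grammar symbols; the dot may sit in any of the
    len(body) + 1 positions.
    """
    for dot in range(len(body) + 1):
        dotted = list(body[:dot]) + ["."] + list(body[dot:])
        yield f"{head} -> {' '.join(dotted)}"


# The production A -> XYZ from the slide gives four items.
for item in lr0_items("A", ["X", "Y", "Z"]):
    print(item)
```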

Augmented grammar
• If G is a grammar with start symbol S, then G’, the augmented grammar for G, is G with a new start symbol S’ and the production S’ → S.
  If G = (N, T, P, S), then G’ = (N ∪ {S’}, T, P ∪ {S’ → S}, S’).
• The purpose of this new starting production is to indicate to the parser when it should stop parsing and announce acceptance of the input.
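
Augmentation can be sketched in a few lines of Python. The representation is an assumption for illustration: a grammar is a dict from each nonterminal to a list of production bodies (tuples of symbols), and augment is a hypothetical helper name. The expression grammar below is the one used in the examples of this deck.

```python
def augment(grammar, start):
    """Return (augmented_grammar, new_start) with the production S' -> S added.

    grammar maps each nonterminal to a list of bodies (tuples of symbols).
    The input grammar is left unmodified.
    """
    new_start = start + "'"
    while new_start in grammar:      # make sure the new start symbol is fresh
        new_start += "'"
    augmented = {new_start: [(start,)]}
    augmented.update(grammar)
    return augmented, new_start


G = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
G_aug, new_start = augment(G, "E")   # adds E' -> E
```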

Closure
• If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the following rules:
1. Every item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add the item B → .γ to closure(I), if it is not already there.
3. Repeat step (2) until no more new items can be added to closure(I).

Example
Consider the augmented grammar
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
If I is the set of one item {[E’ → .E]}, then closure(I) contains the items
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id
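
The closure computation for this example can be sketched in Python. The item representation (head, body, dot position) and the names GRAMMAR, closure and show are assumptions made for illustration, not notation from the slides.

```python
# Augmented expression grammar: nonterminal -> list of bodies.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar=GRAMMAR):
    """Return closure(items) as a frozenset of (head, body, dot) triples."""
    result = set(items)              # rule 1: every item of I is in closure(I)
    worklist = list(items)
    while worklist:                  # rule 3: repeat until nothing new appears
        head, body, dot = worklist.pop()
        # rule 2: the dot stands in front of a nonterminal B
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)   # B -> .gamma
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def show(item):
    """Format an item such as ('E', ('E', '+', 'T'), 1) as 'E -> E . + T'."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"


I0 = closure({("E'", ("E",), 0)})
for item in sorted(I0):
    print(show(item))
```

Starting from the single item E’ → .E, the result contains the seven items listed above.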

goto
• goto(I, X), where I is a set of items and X is a grammar symbol, is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I.
• The goto function moves the dot past the symbol X. That is, it describes the transition from one state to another on the symbol X.

Example
If I is the set of two items {[E’ → E.], [E → E. + T]}, then goto(I, +) consists of the closure of {[E → E + .T]}:
E → E + .T
T → .T * F
T → .F
F → .( E )
F → .id
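
The same example can be reproduced with a short Python sketch of goto. Items are represented as (head, body, dot position) triples and a compact closure helper is included so that the block runs on its own; all names here are assumptions made for illustration.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Add B -> .gamma for every nonterminal B that follows a dot."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot past `symbol` wherever possible, then take the closure."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}   # E' -> E.  and  E -> E. + T
J = goto(I, "+")                                      # five items, as listed above
```

A symbol on which no item of I can advance gives the empty set, for example goto(I, "*") here.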

Collection of Canonical LR(0) Sets of Items
I0 : E’ → .E
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I0 , E) = I1
I1 : E’ → E.
     E → E. + T
goto(I0 , T) = I2
I2 : E → T.
     T → T. * F
goto(I0 , F) = I3
I3 : T → F.
goto(I0 , ( ) = I4
I4 : F → ( .E )
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I0 , id) = I5
I5 : F → id.

goto(I1 , + ) = I6
I6 : E → E + .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I2 , * ) = I7
I7 : T → T * .F
     F → .( E )
     F → .id
goto(I4 , E ) = I8
I8 : F → ( E. )
     E → E. + T
goto(I4 , T ) = I2
I2 : E → T.
     T → T. * F
goto(I4 , F ) = I3
I3 : T → F.
goto(I4 , ( ) = I4
I4 : F → ( .E )
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I4 , id ) = I5
I5 : F → id.

goto(I6 , T ) = I9
I9 : E → E + T.
     T → T. * F
goto(I6 , F ) = I3
I3 : T → F.
goto(I6 , ( ) = I4
I4 : F → ( .E )
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I6 , id ) = I5
I5 : F → id.
goto(I7 , F ) = I10
I10 : T → T * F.
goto(I7 , ( ) = I4
I4 : F → ( .E )
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I7 , id ) = I5
I5 : F → id.

goto(I8 , + ) = I6
I6 : E → E + .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
goto(I9 , * ) = I7
I7 : T → T * .F
     F → .( E )
     F → .id
goto(I8 , ) ) = I11
I11 : F → ( E ).
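
The whole collection can be generated mechanically by starting from closure({E’ → .E}) and applying goto until no new set appears. The Python sketch below is one way to do this, under stated assumptions: items are (head, body, dot position) triples, the helper names are hypothetical, and the order in SYMBOLS is chosen so that the sets come out numbered I0 to I11 as on these slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# Order in which goto is tried from each state (assumed, to match the slides).
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]


def closure(items):
    """Add B -> .gamma for every nonterminal B that follows a dot."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot past `symbol` wherever possible, then take the closure."""
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})


def canonical_collection():
    """Return (states, transitions) for the canonical LR(0) collection."""
    states = [closure({("E'", ("E",), 0)})]   # I0
    transitions = {}                          # (state index, symbol) -> index
    i = 0
    while i < len(states):                    # states grows as sets are found
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if not target:                    # no item can advance on symbol
                continue
            if target not in states:
                states.append(target)
            transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


states, transitions = canonical_collection()
print(len(states), "sets of items")
```

The transitions dictionary is the DFA: for instance transitions[(8, ")")] is the index of the set containing only F → ( E ).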

E → E + T ------(1)
E → T ------(2)
T → T * F ------(3)
T → F ------(4)
F → ( E ) ------(5)
F → id ------(6)
FOLLOW(E) = { +, ), $ }
FOLLOW(T) = { +, *, ), $ }
FOLLOW(F) = { +, *, ), $ }

FOLLOW(E’) = { $ }
FOLLOW(E) = { +, ), $ }
FOLLOW(T) = { *, +, ), $ }
FOLLOW(F) = { *, +, ), $ }
E → E + T (rule 1)
E → T (rule 2)
T → T * F (rule 3)
T → F (rule 4)
F → ( E ) (rule 5)
F → id (rule 6)
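
The FOLLOW sets above can be checked with a short fixed-point computation. The sketch below is deliberately simplified: it assumes the grammar has no ε-productions (true for this expression grammar), so FIRST of a body is just FIRST of its first symbol. The function names are hypothetical.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def first_sets():
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in GRAMMAR else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(start="E'"):
    """FOLLOW for each nonterminal, assuming no epsilon-productions."""
    first = first_sets()
    follow = {a: set() for a in GRAMMAR}
    follow[start].add("$")                    # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, x in enumerate(body):
                    if x not in GRAMMAR:      # only nonterminals have FOLLOW
                        continue
                    if i + 1 < len(body):     # A -> alpha X beta: add FIRST(beta)
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:                     # A -> alpha X: add FOLLOW(A)
                        new = follow[head]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


FOLLOW = follow_sets()
```

In the SLR method these sets decide where reductions are entered: a completed item A → α. in a state calls for a reduce action on exactly the terminals in FOLLOW(A).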
