2. LR Parser
LR parsers are used to parse a large class of context-free grammars.
The technique is called LR(k) parsing:
L denotes that the input sequence is processed from left to right.
R denotes that a rightmost derivation is constructed (in reverse).
k denotes that at most k lookahead symbols of the input are used to make a decision.
Reasons for the attractiveness of LR parsers
LR parsers can handle a large class of CF grammars.
An LR parser can detect a syntax error as soon as it is possible to do so
on a left-to-right scan of the input.
The LR parsing method is the most general nonbacktracking
shift-reduce parsing method.
LR parsers can handle every language that an LL(1) parser can recognize.
3. LR Parser
Drawbacks of the method
Parsing tables are too complicated to be generated by
hand; an automated parser generator is needed.
LR parsers cannot handle ambiguous grammars without special
tricks.
5. LR Parser
An LR parser consists of:
1. Input
2. Output
3. Stack
4. Driver (parser) program
5. Parse table, which has two parts:
   1. Action
   2. Goto
6. LR Parser
The driver program is the same for all LR parsers.
Only the parsing table changes from one parser to another.
The parsing program reads input symbols from an input buffer one
at a time.
The program uses a stack to store a string of the form
S0 X1 S1 X2 S2 X3 . . . Xm Sm
where Sm is on top of the stack,
each Si is a symbol called a state, and
each Xi is a grammar symbol.
7. action and goto
The function action takes a state and an input symbol as
arguments and produces one of four values:
Shift S, where S is a state
Reduce by a grammar production A → β
Accept
Error
The function goto takes a state and a grammar symbol as
arguments and produces a state.
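The driver loop described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the toy grammar S → ( S ) | x and its hand-built action and goto tables are assumptions introduced only for this example, and the stack stores states only, since each state already implies the grammar symbol beneath it.

```python
# Hypothetical toy grammar (not from the slides):
#   (1) S -> ( S )      (2) S -> x
# rule number -> (head, length of the right-hand side)
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}

# ACTION[state][terminal] is ("s", state), ("r", rule) or ("acc",).
# A missing entry means Error.
ACTION = {
    0: {"(": ("s", 2), "x": ("s", 3)},
    1: {"$": ("acc",)},
    2: {"(": ("s", 2), "x": ("s", 3)},
    3: {")": ("r", 2), "$": ("r", 2)},
    4: {")": ("s", 5)},
    5: {")": ("r", 1), "$": ("r", 1)},
}
# GOTO[state][nonterminal] is the state entered after a reduction.
GOTO = {0: {"S": 1}, 2: {"S": 4}}


def lr_parse(tokens):
    """Return the rule numbers used for reductions, or raise SyntaxError."""
    stack = [0]                       # state stack, start state on the bottom
    tokens = list(tokens) + ["$"]     # "$" marks the end of the input
    pos = 0
    output = []
    while True:
        state, tok = stack[-1], tokens[pos]
        entry = ACTION[state].get(tok)
        if entry is None:             # Error
            raise SyntaxError(f"unexpected {tok!r} at position {pos}")
        if entry[0] == "s":           # Shift: push the new state, advance input
            stack.append(entry[1])
            pos += 1
        elif entry[0] == "r":         # Reduce by A -> beta
            head, length = PRODUCTIONS[entry[1]]
            del stack[len(stack) - length:]        # pop |beta| states
            stack.append(GOTO[stack[-1]][head])    # push goto(top, A)
            output.append(entry[1])
        else:                         # Accept
            return output
```

Calling `lr_parse(["(", "x", ")"])` reduces by rule 2 and then by rule 1. Only `ACTION`, `GOTO` and `PRODUCTIONS` would change for another grammar; the loop itself stays the same.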
8. Construction of LR Parsing Table
There are three techniques for constructing an LR parsing table for
a grammar.
SLR (Simple LR)
Canonical LR
LALR (Lookahead LR)
9. Construction of SLR Parsing Table
The central idea is the construction of a DFA from the
grammar.
An LR(k) parser uses the stack contents and the next k symbols of the
input to decide what is to be done.
An LR(0) parser uses only the stack contents.
Let G = (N, T, P, S) be a CFG.
[A → w1 . w2, u] is called an LR(k) item if A → w1 w2 is a
production from P and u is a sequence of terminals whose length is
less than or equal to k.
LR(0) items do not contain a sequence of terminals, i.e., they have the form
[A → w1 . w2]
10. LR(0) item
A production A → XYZ generates four LR(0) items:
A → .XYZ    A → X.YZ    A → XY.Z    A → XYZ.
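As a small illustration, the items of a production can be enumerated by placing the dot at every position of the right-hand side. The tuple representation `(head, body, dot)` and the helper names `lr0_items` and `show` are assumptions made for this sketch.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as (head, body, dot_position)."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item with an explicit dot, e.g. 'A -> X . Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return f"{head} -> {' '.join(symbols)}"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
```

A right-hand side of n symbols always yields n + 1 items, which is why A → XYZ gives four.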
One particular collection of sets of LR(0) items, called the canonical LR(0)
collection, is the basis for constructing SLR parsers.
To construct the canonical LR(0) collection for a grammar, we
need to define an augmented grammar and two functions, namely
closure
goto
11. Augmented grammar
If G is a grammar with start symbol S, then G', the augmented
grammar for G, is G with a new start symbol S' and production S' → S.
If G = (N, T, P, S), then G' = (N ∪ {S'}, T, P ∪ {S' → S}, S')
The purpose of this new starting production is to indicate to the
parser when it should stop parsing and announce acceptance of the
input.
12. Closure
If I is a set of items for a grammar G, then closure(I) is the set of items
constructed from I by the following rules:
1. Every item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add the item
B → .γ to closure(I), if it is not already there.
3. Repeat step (2) until no more new items can be added to
closure(I).
13. Example
Consider the augmented grammar
E' → E
E → E + T | T
T → T * F | F
F → ( E ) | id
If I is the set of one item {[E' → .E]}, then closure(I) contains the items
E' → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id
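The closure rules can be sketched in Python with a worklist. This is a minimal sketch under stated assumptions: the dictionary `GRAMMAR` and the item representation `(head, body, dot)` are choices made for this example, not part of the slides.

```python
# The augmented expression grammar; each body is a tuple of symbols.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Return closure(I) for a set of items (head, body, dot)."""
    result = set(items)               # rule 1: every item of I is in closure(I)
    worklist = list(result)
    while worklist:                   # rule 3: repeat until nothing new is added
        head, body, dot = worklist.pop()
        # rule 2: the dot stands before a nonterminal B
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)       # B -> .gamma
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


start = closure({("E'", ("E",), 0)})
print(len(start))
```

For the item set {[E' → .E]} this yields the seven items listed in the example.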
14. goto
goto(I, X), where I is a set of items and X is a grammar symbol, is
defined to be the closure of the set of all items [A → αX.β] such
that
[A → α.Xβ] is in I.
The goto function moves the dot past the symbol X. That means a
transition is made from one state to another on
the symbol X.
15. Example
If I is the set of two items {[E' → E.], [E → E. + T]}, then goto(I, +)
consists of the closure of {[E → E + .T]}:
E → E + .T
T → .T * F
T → .F
F → .( E )
F → .id
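The definition of goto translates almost directly into code: advance the dot over X in every item that allows it, then take the closure. The sketch below repeats a closure helper so that it is self-contained; `GRAMMAR` and the item tuples `(head, body, dot)` are assumptions of this example.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Add B -> .gamma for every nonterminal B that follows a dot."""
    result = set(items)
    worklist = list(result)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot past `symbol` wherever possible, then close the result."""
    moved = {
        (head, body, dot + 1)
        for (head, body, dot) in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
for head, body, dot in sorted(goto(I, "+")):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]))
```

If no item in I has the dot before X, `goto` returns the empty set, which corresponds to the absence of a transition.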
17. Collection of Canonical LR(0) Sets of Items
I0 : E' → .E
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
     goto(I0, E) = I1    goto(I0, T) = I2    goto(I0, F) = I3
     goto(I0, ( ) = I4   goto(I0, id) = I5
I1 : E' → E.
     E → E. + T
     goto(I1, +) = I6
I2 : E → T.
     T → T. * F
     goto(I2, *) = I7
I3 : T → F.
I4 : F → ( .E )
     E → .E + T
     E → .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
     goto(I4, E) = I8    goto(I4, T) = I2    goto(I4, F) = I3
     goto(I4, ( ) = I4   goto(I4, id) = I5
I5 : F → id.
I6 : E → E + .T
     T → .T * F
     T → .F
     F → .( E )
     F → .id
     goto(I6, T) = I9    goto(I6, F) = I3
     goto(I6, ( ) = I4   goto(I6, id) = I5
I7 : T → T * .F
     F → .( E )
     F → .id
     goto(I7, F) = I10   goto(I7, ( ) = I4   goto(I7, id) = I5
I8 : F → ( E. )
     E → E. + T
     goto(I8, ) ) = I11  goto(I8, +) = I6
I9 : E → E + T.
     T → T. * F
     goto(I9, *) = I7
I10 : T → T * F.
I11 : F → ( E ).
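The whole collection can be generated mechanically by starting from closure({[E' → .E]}) and applying goto to every set and every grammar symbol until no new set appears. The following is a sketch under the same assumptions as before (`GRAMMAR` dictionary, items as `(head, body, dot)` tuples). The state numbers it assigns depend on the order in which symbols are tried, so they need not match the names I0 to I11 used above.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    result = set(items)
    worklist = list(result)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection(start_item):
    """Return (states, transitions) of the LR(0) automaton."""
    symbols = set(GRAMMAR) | {s for bodies in GRAMMAR.values() for b in bodies for s in b}
    states = [closure({start_item})]
    transitions = {}
    for state in states:                  # the list grows while it is traversed
        for x in sorted(symbols):
            target = goto(state, x)
            if not target:                # no item has the dot before x
                continue
            if target not in states:
                states.append(target)
            transitions[(states.index(state), x)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection(("E'", ("E",), 0))
print(len(states), "sets of items,", len(transitions), "goto transitions")
```

For the expression grammar this produces twelve sets of items, in agreement with I0 through I11.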
21. E → E + T ------(1)
E → T ------(2)
T → T * F ------(3)
T → F ------(4)
F → ( E ) ------(5)
F → id ------(6)
FOLLOW(E) = { +, ), $ }
FOLLOW(T) = { +, *, ), $ }
FOLLOW(F) = { +, *, ), $ }
28. FOLLOW(E') = { $ }
FOLLOW(E) = { +, ), $ }
FOLLOW(T) = { *, +, ), $ }
FOLLOW(F) = { *, +, ), $ }
E → E + T (rule 1)
E → T (rule 2)
T → T * F (rule 3)
T → F (rule 4)
F → ( E ) (rule 5)
F → id (rule 6)
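The FOLLOW sets listed above can be recomputed with a simple fixed-point iteration. The sketch below assumes that the grammar has no ε-productions, which holds for this expression grammar, so only the first symbol of a body contributes to FIRST. The names `GRAMMAR`, `first_sets` and `follow_sets` are illustrative.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def first_sets():
    """FIRST for each nonterminal, assuming no epsilon-productions."""
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in GRAMMAR else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(start="E'"):
    """FOLLOW for each nonterminal; '$' marks the end of the input."""
    first = first_sets()
    follow = {a: set() for a in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, b in enumerate(body):
                    if b not in GRAMMAR:
                        continue              # FOLLOW is defined for nonterminals only
                    if i + 1 < len(body):     # A -> alpha B X beta: add FIRST(X)
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:                     # A -> alpha B: add FOLLOW(A)
                        new = follow[head]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


for nonterminal, symbols in follow_sets().items():
    print(f"FOLLOW({nonterminal}) = {sorted(symbols)}")
```

In SLR table construction, a completed item A → α. in a state contributes a reduce entry for exactly the terminals in FOLLOW(A), which is why these sets are listed next to the numbered rules.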