OPTIMIZATION OF DFA
BASED PATTERN
MATCHERS
Important States of an NFA
• An NFA state is important if it has a non-ε out-transition.
• During subset construction, ε-closure(move(T, a)) depends only on the important states in T.
• The direct construction relates the important states of the NFA to the symbols in the RE.
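
As a small illustration of the definition above, the following sketch marks the important states of a hand-written NFA for a|b. The dictionary layout, the state numbers, and the use of an empty string as the ε label are assumptions made for this sketch; they do not come from the slides.

```python
EPS = ""  # label used in this sketch for ε-transitions (an assumed convention)

# Hypothetical Thompson-style NFA for a|b: state -> list of (label, target)
NFA = {
    0: [(EPS, 1), (EPS, 3)],
    1: [("a", 2)],
    2: [(EPS, 5)],
    3: [("b", 4)],
    4: [(EPS, 5)],
    5: [],  # accepting state: no out-transitions at all
}

def important_states(nfa):
    """Return the states that have at least one non-ε out-transition."""
    return {
        state
        for state, edges in nfa.items()
        if any(label != EPS for label, _ in edges)
    }

print(sorted(important_states(NFA)))  # [1, 3]
```

Only states 1 and 3 qualify, and they correspond to the two symbols a and b of the expression. The accepting state 5 is not important here, which is what the end marker # on the next slide addresses.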
Augmented RE
• The accepting state is not important, because it has no out-transitions.
• Concatenate a unique right end marker # to the RE.
• This adds a transition on # out of the accepting state, which makes it an important state.
Converting Regular Expression to DFA
• A regular expression can be converted into a DFA directly (without creating an NFA first).
• First, the given regular expression is augmented by concatenating it with a special symbol #:
    r → (r)#    (augmented regular expression)
• Then a syntax tree is created for this augmented regular expression.
• In this syntax tree, all alphabet symbols, #, and the empty string ε in the augmented regular expression are at the leaves.
• All inner nodes are operators.
• Then each alphabet symbol and # is numbered (position numbers).
Types of Interior Nodes
• Cat-node (concatenation)
• Star-node (closure, *)
• Or-node (union, |)
Regular Expression → DFA (cont.)
(a|b)* a → (a|b)* a #    (augmented regular expression)
Syntax tree of (a|b)* a # (position numbers in parentheses):
    cat
    ├── cat
    │   ├── star (*)
    │   │   └── or (|)
    │   │       ├── a (1)
    │   │       └── b (2)
    │   └── a (3)
    └── # (4)
• each symbol is numbered
• each symbol is at a leaf
• inner nodes are operators
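
The augmentation and numbering steps can be sketched as follows. The nested-tuple tree encoding and the helper names `augment` and `number_leaves` are assumptions made for this illustration; the slides do not prescribe a data structure.

```python
# Assumed tree encoding for this sketch:
#   ("leaf", symbol), ("eps",), ("or", l, r), ("cat", l, r), ("star", child)

def augment(tree):
    """Build the syntax tree of (r)# from the tree of r."""
    return ("cat", tree, ("leaf", "#"))

def number_leaves(node, symbol_at):
    """Copy the tree, tagging every non-ε leaf with its position (left to right)."""
    kind = node[0]
    if kind == "eps":
        return node                      # ε leaves get no position
    if kind == "leaf":
        pos = len(symbol_at) + 1
        symbol_at[pos] = node[1]
        return ("leaf", node[1], pos)
    if kind == "star":
        return ("star", number_leaves(node[1], symbol_at))
    # "or" and "cat": number the left subtree before the right one
    left = number_leaves(node[1], symbol_at)
    right = number_leaves(node[2], symbol_at)
    return (kind, left, right)

# (a|b)* a
regex = ("cat", ("star", ("or", ("leaf", "a"), ("leaf", "b"))), ("leaf", "a"))
symbol_at = {}
tree = number_leaves(augment(regex), symbol_at)
print(symbol_at)  # {1: 'a', 2: 'b', 3: 'a', 4: '#'}
```

The resulting numbering matches the tree above: a = 1, b = 2, a = 3, # = 4.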
followpos
followpos is defined for positions (the numbers assigned to the leaves).
followpos(i): the set of positions that can follow position i in the strings generated by the augmented regular expression.
For example, for ( a | b )* a # with positions a = 1, b = 2, a = 3, # = 4:
followpos(1) = {1,2,3}
followpos(2) = {1,2,3}
followpos(3) = {4}
followpos(4) = {}
followpos is defined only for leaves; it is not defined for inner nodes.
firstpos, lastpos, nullable
To evaluate followpos, three more functions are defined for all nodes of the syntax tree (not just for leaves).
• firstpos(n) -- the set of positions of the first symbols of the strings generated by the sub-expression rooted at n.
• lastpos(n) -- the set of positions of the last symbols of the strings generated by the sub-expression rooted at n.
• nullable(n) -- true if the empty string is among the strings generated by the sub-expression rooted at n, false otherwise.
Evaluation of firstpos, lastpos, nullable
n | nullable(n) | firstpos(n) | lastpos(n)
leaf labeled ε | true | ∅ | ∅
leaf labeled with position i | false | {i} | {i}
or-node c1 | c2 | nullable(c1) or nullable(c2) | firstpos(c1) ∪ firstpos(c2) | lastpos(c1) ∪ lastpos(c2)
cat-node c1 c2 | nullable(c1) and nullable(c2) | if (nullable(c1)) firstpos(c1) ∪ firstpos(c2) else firstpos(c1) | if (nullable(c2)) lastpos(c1) ∪ lastpos(c2) else lastpos(c2)
star-node c1* | true | firstpos(c1) | lastpos(c1)
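
The table translates almost line for line into three recursive functions. This is a minimal sketch that assumes a nested-tuple tree whose leaves already carry their positions; the encoding and the constant names are illustrative only.

```python
# Assumed tree encoding for this sketch:
#   ("leaf", symbol, position), ("eps",), ("or", l, r), ("cat", l, r), ("star", child)

def nullable(n):
    kind = n[0]
    if kind == "eps":
        return True
    if kind == "leaf":
        return False
    if kind == "or":
        return nullable(n[1]) or nullable(n[2])
    if kind == "cat":
        return nullable(n[1]) and nullable(n[2])
    return True  # star-node

def firstpos(n):
    kind = n[0]
    if kind == "eps":
        return frozenset()
    if kind == "leaf":
        return frozenset({n[2]})
    if kind == "or":
        return firstpos(n[1]) | firstpos(n[2])
    if kind == "cat":
        if nullable(n[1]):
            return firstpos(n[1]) | firstpos(n[2])
        return firstpos(n[1])
    return firstpos(n[1])  # star-node

def lastpos(n):
    kind = n[0]
    if kind == "eps":
        return frozenset()
    if kind == "leaf":
        return frozenset({n[2]})
    if kind == "or":
        return lastpos(n[1]) | lastpos(n[2])
    if kind == "cat":
        if nullable(n[2]):
            return lastpos(n[1]) | lastpos(n[2])
        return lastpos(n[2])
    return lastpos(n[1])  # star-node

# Syntax tree of (a|b)* a # with positions 1..4
STAR = ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2)))
INNER = ("cat", STAR, ("leaf", "a", 3))
ROOT = ("cat", INNER, ("leaf", "#", 4))

print(nullable(ROOT))          # False
print(sorted(firstpos(ROOT)))  # [1, 2, 3]
print(sorted(lastpos(ROOT)))   # [4]
```

The plain recursion recomputes values for shared subtrees; a single bottom-up pass that stores the three values per node would avoid that, at the cost of a slightly longer sketch.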
Evaluation of followpos
Two rules define the function followpos:
1. If n is a concatenation node with left child c1 and right child c2, and i is a position in lastpos(c1), then all positions in firstpos(c2) are in followpos(i).
2. If n is a star-node and i is a position in lastpos(n), then all positions in firstpos(n) are in followpos(i).
Once firstpos and lastpos have been computed for each node, followpos of each position can be computed in one depth-first traversal of the syntax tree.
Followpos
Cat-node n with children c1 and c2: for each i ∈ lastpos(c1), add all of firstpos(c2) to followpos(i).
Star-node n with child c1: for each i ∈ lastpos(n), add all of firstpos(n) to followpos(i).
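
The two rules can be applied during a single depth-first traversal that also returns nullable, firstpos, and lastpos for each node. The sketch below assumes a nested-tuple tree with positions stored in the leaves; the function name `analyze` and the encoding are hypothetical.

```python
from collections import defaultdict

# Assumed tree encoding for this sketch:
#   ("leaf", symbol, position), ("eps",), ("or", l, r), ("cat", l, r), ("star", child)

def analyze(n, follow):
    """Return (nullable, firstpos, lastpos) of n and add to follow on the way."""
    kind = n[0]
    if kind == "eps":
        return True, frozenset(), frozenset()
    if kind == "leaf":
        follow.setdefault(n[2], set())       # every position gets an entry
        return False, frozenset({n[2]}), frozenset({n[2]})
    if kind == "star":
        _, first, last = analyze(n[1], follow)
        for i in last:                       # rule 2: star-node
            follow[i] |= first
        return True, first, last
    n1, f1, l1 = analyze(n[1], follow)
    n2, f2, l2 = analyze(n[2], follow)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    for i in l1:                             # rule 1: cat-node
        follow[i] |= f2
    first = (f1 | f2) if n1 else f1
    last = (l1 | l2) if n2 else l2
    return n1 and n2, first, last

# Syntax tree of (a|b)* a # with positions 1..4
ROOT = ("cat",
        ("cat",
         ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2))),
         ("leaf", "a", 3)),
        ("leaf", "#", 4))

follow = defaultdict(set)
analyze(ROOT, follow)
for pos in sorted(follow):
    print(pos, sorted(follow[pos]))
# 1 [1, 2, 3]
# 2 [1, 2, 3]
# 3 [4]
# 4 []
```

Because followpos accumulates contributions from several nodes, the sketch uses set union rather than assignment.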
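Example -- ( a | b )* a #
firstpos and lastpos at each node of the syntax tree:
node | firstpos | lastpos
leaf a (1) | {1} | {1}
leaf b (2) | {2} | {2}
or-node a | b | {1,2} | {1,2}
star-node (a|b)* | {1,2} | {1,2}
leaf a (3) | {3} | {3}
cat-node (a|b)* a | {1,2,3} | {3}
leaf # (4) | {4} | {4}
cat-node (a|b)* a # (root) | {1,2,3} | {4}
followpos(1) = {1,2,3}
followpos(2) = {1,2,3}
followpos(3) = {4}
followpos(4) = {}
• The DFA can now be constructed for the regular expression.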
Algorithm (RE → DFA)
• Create the syntax tree of (r)#
• Compute the functions nullable, firstpos, lastpos, and followpos
• Put firstpos(root) into the states of the DFA as an unmarked state
• while (there is an unmarked state S in the states of the DFA) do
    • mark S
    • for each input symbol a do
        • let s1,...,sn be the positions in S whose symbol is a
        • S' ← followpos(s1) ∪ ... ∪ followpos(sn)
        • move(S,a) ← S'
        • if (S' is not empty and not in the states of the DFA)
            • put S' into the states of the DFA as an unmarked state
• the start state of the DFA is firstpos(root)
• the accepting states of the DFA are all states containing the position of #
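
The marking loop above can be sketched as a short worklist procedure. The function `build_dfa` and its parameters are hypothetical names chosen for this illustration. It takes firstpos(root), the followpos table, and a map from positions to symbols as already computed inputs, and it records no transition when S' is empty.

```python
def build_dfa(start, followpos, symbol_at, end_marker="#"):
    """Construct DFA states and transitions from followpos (worklist version)."""
    start = frozenset(start)
    states = [start]        # discovery order: index 0 is S1, index 1 is S2, ...
    unmarked = [start]
    move = {}
    alphabet = sorted(set(symbol_at.values()) - {end_marker})
    while unmarked:
        S = unmarked.pop()  # taking S off the worklist marks it
        for a in alphabet:
            # union of followpos(p) over the positions p in S labeled a
            target = frozenset().union(
                *(followpos[p] for p in S if symbol_at[p] == a)
            )
            if not target:
                continue    # no transition on a from S
            move[S, a] = target
            if target not in states:
                states.append(target)
                unmarked.append(target)
    end_positions = {p for p, sym in symbol_at.items() if sym == end_marker}
    accepting = [S for S in states if S & end_positions]
    return states, move, accepting

# Inputs taken from the (a|b)* a # example
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: set()}
symbol_at = {1: "a", 2: "b", 3: "a", 4: "#"}
states, move, accepting = build_dfa({1, 2, 3}, followpos, symbol_at)

print([sorted(S) for S in states])     # [[1, 2, 3], [1, 2, 3, 4]]
print([sorted(S) for S in accepting])  # [[1, 2, 3, 4]]
```

States are represented as frozensets of positions so that they can serve as dictionary keys. Naming them S1, S2, ... is left to the caller.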
Example -- ( a | b )* a #
positions: a = 1, b = 2, a = 3, # = 4
followpos(1)={1,2,3} followpos(2)={1,2,3} followpos(3)={4} followpos(4)={}
S1=firstpos(root)={1,2,3}
• mark S1
    a: followpos(1) ∪ followpos(3)={1,2,3,4}=S2    move(S1,a)=S2
    b: followpos(2)={1,2,3}=S1    move(S1,b)=S1
• mark S2
    a: followpos(1) ∪ followpos(3)={1,2,3,4}=S2    move(S2,a)=S2
    b: followpos(2)={1,2,3}=S1    move(S2,b)=S1
start state: S1
accepting states: {S2}
Resulting DFA transitions:
    S1 --a--> S2
    S1 --b--> S1
    S2 --a--> S2
    S2 --b--> S1
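
To check the resulting automaton, the transition table from this example can be run over a few input strings. The table-driven matcher below is a minimal sketch; the names `MOVE` and `accepts` are assumptions made for the illustration.

```python
# Transition table of the DFA for (a|b)* a, copied from the example above
MOVE = {
    ("S1", "a"): "S2",
    ("S1", "b"): "S1",
    ("S2", "a"): "S2",
    ("S2", "b"): "S1",
}
START = "S1"
ACCEPTING = {"S2"}

def accepts(text):
    """Return True if the DFA ends in an accepting state after reading text."""
    state = START
    for ch in text:
        state = MOVE.get((state, ch))
        if state is None:      # no transition: the input is rejected
            return False
    return state in ACCEPTING

for sample in ["a", "abba", "", "ab", "abc"]:
    print(repr(sample), accepts(sample))
# 'a' True
# 'abba' True
# '' False
# 'ab' False
# 'abc' False
```

The automaton accepts exactly the strings over {a, b} that end in a, which is the language of (a|b)* a.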
Example -- ( a | ε ) b c* #
positions: a = 1, b = 2, c = 3, # = 4
followpos(1)={2} followpos(2)={3,4} followpos(3)={3,4} followpos(4)={}
S1=firstpos(root)={1,2}
• mark S1
    a: followpos(1)={2}=S2    move(S1,a)=S2
    b: followpos(2)={3,4}=S3    move(S1,b)=S3
• mark S2
    b: followpos(2)={3,4}=S3    move(S2,b)=S3
• mark S3
    c: followpos(3)={3,4}=S3    move(S3,c)=S3
start state: S1
accepting states: {S3}
Resulting DFA transitions:
    S1 --a--> S2
    S1 --b--> S3
    S2 --b--> S3
    S3 --c--> S3
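
The whole pipeline can be checked end to end on this second example. The sketch below numbers the leaves, computes followpos, and builds the DFA in one go. The tuple encoding and the names `analyze` and `build_dfa` are assumptions made for this illustration, not something the slides define.

```python
from collections import defaultdict

# Assumed tree encoding for this sketch:
#   ("leaf", symbol), ("eps",), ("or", l, r), ("cat", l, r), ("star", child)

def analyze(n, follow, symbol_at):
    """Return (nullable, firstpos, lastpos); number leaves and fill follow."""
    kind = n[0]
    if kind == "eps":
        return True, frozenset(), frozenset()
    if kind == "leaf":
        pos = len(symbol_at) + 1             # leaves are numbered left to right
        symbol_at[pos] = n[1]
        follow.setdefault(pos, set())
        return False, frozenset({pos}), frozenset({pos})
    if kind == "star":
        _, first, last = analyze(n[1], follow, symbol_at)
        for i in last:                       # rule 2: star-node
            follow[i] |= first
        return True, first, last
    n1, f1, l1 = analyze(n[1], follow, symbol_at)
    n2, f2, l2 = analyze(n[2], follow, symbol_at)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    for i in l1:                             # rule 1: cat-node
        follow[i] |= f2
    return n1 and n2, ((f1 | f2) if n1 else f1), ((l1 | l2) if n2 else l2)

def build_dfa(tree, end_marker="#"):
    """Return (transition table, accepting state names, followpos)."""
    follow, symbol_at = defaultdict(set), {}
    _, start, _ = analyze(tree, follow, symbol_at)
    states, unmarked, move = [start], [start], {}
    alphabet = sorted(set(symbol_at.values()) - {end_marker})
    while unmarked:
        S = unmarked.pop()
        for a in alphabet:
            target = frozenset().union(
                *(follow[p] for p in S if symbol_at[p] == a)
            )
            if not target:
                continue
            if target not in states:
                states.append(target)
                unmarked.append(target)
            move[S, a] = target
    name = {S: f"S{k}" for k, S in enumerate(states, start=1)}
    table = {(name[S], a): name[T] for (S, a), T in move.items()}
    end_pos = next(p for p, sym in symbol_at.items() if sym == end_marker)
    accepting = {name[S] for S in states if end_pos in S}
    return table, accepting, dict(follow)

# (a|ε) b c* #
TREE = ("cat",
        ("cat",
         ("cat", ("or", ("leaf", "a"), ("eps",)), ("leaf", "b")),
         ("star", ("leaf", "c"))),
        ("leaf", "#"))

table, accepting, follow = build_dfa(TREE)
print(accepting)  # {'S3'}
```

The computed followpos sets and transitions agree with the hand-worked values in this example.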

2_6 Optimization of DFA Based Pattern Matchers.ppt
