String-Matching Algorithms (UNIT-5)
1. String Matching :
Let the text be an array T[1..n] of length n.
Let the pattern be an array P[1..m] of length m, with m ≤ n.
The elements of T and P are characters drawn from a finite alphabet Σ.
T and P are therefore called ‘Strings of Characters’.
The pattern P occurs with shift s in the text T
if 0 ≤ s ≤ n – m
and T[s+1..s+m] = P[1..m],
i.e., T[s+j] = P[j] for 1 ≤ j ≤ m.
If P occurs with shift s in T, then s is a VALID SHIFT.
Otherwise, s is an INVALID SHIFT.
The String-matching Problem is the problem of finding
all valid shifts with which a given pattern P occurs in a
given text T.
Ex-1 : Let text T : a b c a b a a b c a b a c
Let pattern P : a b a a
Find the number of valid shifts and the corresponding values of s.
Answer : There is only one valid shift, s = 3, since T[4..7] = a b a a.
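The definition of a valid shift can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the helper name is_valid_shift is hypothetical. Python strings are 0-indexed, so the slice text[s : s + m] stands for T[s+1..s+m].

```python
def is_valid_shift(text: str, pattern: str, s: int) -> bool:
    """Return True if pattern occurs in text with shift s.

    Python strings are 0-indexed, so T[s+1..s+m] in the slides
    corresponds to text[s : s + m] here.
    """
    n, m = len(text), len(pattern)
    return 0 <= s <= n - m and text[s:s + m] == pattern


T = "abcabaabcabac"   # text of Ex-1
P = "abaa"            # pattern of Ex-1
valid = [s for s in range(len(T) - len(P) + 1) if is_valid_shift(T, P, s)]
print(valid)          # [3]
```

Testing every candidate shift in this way is exactly what the naive matcher of the next section does.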
The symbol Σ* (read as ‘sigma-star’) is the set of all
finite-length strings formed using characters from
the alphabet Σ.
The zero-length string, called the ‘Empty String’ and
denoted by ‘ɛ’, also belongs to Σ*.
The length of a string x is denoted |x|.
The concatenation of two strings x and y, denoted xy,
has length |x| + |y|.
A string ω is a prefix of a string x, denoted ω ⊏ x,
if x = ωy for some string y ∊ Σ*.
Note that if ω ⊏ x, then |ω| ≤ |x|.
Similarly, a string ω is a suffix of a string x, denoted
ω ⊐ x, if x = yω for some string y ∊ Σ*.
Note that if ω ⊐ x, then |ω| ≤ |x|.
Ex-2 : Let x = abcca.
Here, ab ⊏ abcca and cca ⊐ abcca.
Note-1 : The empty string ɛ is both a suffix and a
prefix of every string.
Note-2 : Both ⊏ (prefix) and ⊐ (suffix) are transitive
relations.
Lemma : Suppose that x, y, and z are strings
such that x ⊐ z and y ⊐ z.
If |x| ≤ |y|, then x ⊐ y.
If |x| ≥ |y|, then y ⊐ x.
If |x| = |y|, then x = y.
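The prefix and suffix relations map directly onto Python's str.startswith and str.endswith. The sketch below is an illustration only; the helper names is_prefix, is_suffix, lemma_holds and all_strings are hypothetical. It re-checks Ex-2 and Note-1 and then tests the lemma exhaustively on all short strings over the assumed alphabet {a, b}. This is a sanity check, not a proof.

```python
from itertools import product


def is_prefix(w: str, x: str) -> bool:
    """w ⊏ x : x = w y for some string y."""
    return x.startswith(w)


def is_suffix(w: str, x: str) -> bool:
    """w ⊐ x : x = y w for some string y."""
    return x.endswith(w)


# Ex-2 and Note-1
assert is_prefix("ab", "abcca") and is_suffix("cca", "abcca")
assert is_prefix("", "abcca") and is_suffix("", "abcca")


def lemma_holds(x: str, y: str, z: str) -> bool:
    """Check the three conclusions of the lemma for one triple (x, y, z)."""
    if not (is_suffix(x, z) and is_suffix(y, z)):
        return True          # hypothesis not met, nothing to check
    if len(x) <= len(y) and not is_suffix(x, y):
        return False
    if len(x) >= len(y) and not is_suffix(y, x):
        return False
    if len(x) == len(y) and x != y:
        return False
    return True


def all_strings(alphabet: str, max_len: int):
    """Yield every string over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


# z ranges over all strings of length <= 5; x and y over all suffixes of z.
lemma_ok = all(
    lemma_holds(z[i:], z[j:], z)
    for z in all_strings("ab", 5)
    for i in range(len(z) + 1)
    for j in range(len(z) + 1)
)
print(lemma_ok)   # True
```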
2. The Naïve String-matching Algorithm :
This algorithm finds all valid shifts using a loop that
checks the condition P[1..m] = T[s+1..s+m] for each
of the n – m + 1 possible values of s.
NAÏVE-STRING-MATCHER(T, P)
    n = T.length
    m = P.length
    for s = 0 to n – m
        if P[1..m] == T[s+1..s+m]
            print “Pattern occurs with shift” s
Ex-3 : Let T = acaabc and P = aab.
Find the value of s.
Answer : s = 2
Ex-4 : Let T = 000010001010001 and
P = 0001.
Find the values of s.
Answer : s = 1, 5 and 11
Ex-5 : Let T = a^n and P = a^m (the character a repeated n and m times).
Answer : s = 0, 1, …, n – m,
i.e., there are n – m + 1 valid shifts.
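The naive matcher translates almost line for line into Python. This is a sketch under the assumption that the text and pattern are ordinary Python strings; the function name is hypothetical. It returns the valid shifts as a list instead of printing them. Each of the n – m + 1 comparisons may inspect up to m characters, so the worst case is O((n – m + 1) m) character comparisons.

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Return every valid shift s, trying all n - m + 1 candidates."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        # text[s:s + m] plays the role of T[s+1..s+m] (0-based slicing)
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts


print(naive_string_matcher("acaabc", "aab"))             # Ex-3: [2]
print(naive_string_matcher("000010001010001", "0001"))   # Ex-4: [1, 5, 11]
print(naive_string_matcher("a" * 6, "a" * 4))            # Ex-5 with n = 6, m = 4: [0, 1, 2]
```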
3. The Rabin-Karp Algorithm :
Let Σ = {0, 1, 2, … , 9},
so that each character is a decimal digit and
d = |Σ| = 10.
A string of digits is read as a number in radix-d notation; e.g., the string 31415 represents the number 31,415.
Let the text be T[1..n].
Let the pattern be P[1..m].
Let p denote the decimal value of P[1..m].
Let t_s denote the decimal value of the length-m substring
T[s+1..s+m], for s = 0, 1, 2, …, n – m.
⇒ t_s = p iff T[s+1..s+m] = P[1..m]
⇒ s is a valid shift iff t_s = p
Now, the value of p can be computed using
Horner’s rule as follows :
p is the number whose decimal digits are P[1] P[2] P[3] … P[m].
So, p = P[m] + 10 (P[m-1] + 10 (P[m-2] + … +
10 (P[2] + 10 P[1]) … )).
Similarly, one can compute t_0 from T[1..m] as follows :
t_0 = T[m] + 10 (T[m-1] + 10 (T[m-2] + … +
10 (T[2] + 10 T[1]) … )).
Each value t_{s+1} can then be computed from t_s as follows :
t_{s+1} = 10 (t_s – 10^{m-1} T[s+1]) + T[s+m+1].
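Horner's rule and the recurrence for t_{s+1} can be tried out with exact integer arithmetic, before any modulus is introduced. The following Python sketch assumes the text is a string of decimal digit characters; the function names horner_value and window_values, and the sample text 314152, are hypothetical.

```python
def horner_value(digits: str, d: int = 10) -> int:
    """Value of a digit string in radix d, computed by Horner's rule."""
    value = 0
    for ch in digits:
        value = d * value + int(ch)
    return value


def window_values(text: str, m: int, d: int = 10) -> list[int]:
    """Exact values t_0, t_1, ..., t_{n-m} via the recurrence (no modulus)."""
    n = len(text)
    high = d ** (m - 1)               # weight of the leading digit, d^(m-1)
    t = horner_value(text[:m], d)     # t_0
    values = [t]
    for s in range(n - m):
        # drop the leading digit T[s+1], shift left, append T[s+m+1]
        t = d * (t - high * int(text[s])) + int(text[s + m])
        values.append(t)
    return values


print(horner_value("31415"))         # 31415
print(window_values("314152", 5))    # [31415, 14152]
```

The exact values grow with m, which is why the algorithm goes on to work with values reduced modulo q.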
Let q be a modulus chosen so that d·q fits in one computer word (q is usually taken to be a prime).
Working modulo q, the above recurrence becomes :
t_{s+1} = (d (t_s – T[s+1] h) + T[s+m+1]) mod q.
Here, h ≡ d^{m-1} (mod q),
i.e., h is the value (mod q) of the digit 1 in the high-order position of an m-digit text window.
Ex-6 : Let m = 5 and t_s = 31415, so that T[s+1] = 3.
Let T[s+m+1] = 2.
So, before reduction mod q, RHS = 10 (t_s – 10^{m-1} T[s+1]) + T[s+m+1]
= 10 (31415 – 10^4 · 3) + 2 = 14150 + 2 = 14152
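The same step can be repeated with every quantity reduced modulo q. The Python sketch below uses q = 13 purely as an assumed modulus for illustration, and the helper name roll is hypothetical. It shows that the rolled value agrees with 14152 mod q even though an intermediate difference is negative.

```python
def roll(t_s: int, leading: int, incoming: int, h: int, d: int, q: int) -> int:
    """One step of t_{s+1} = (d (t_s - T[s+1] h) + T[s+m+1]) mod q."""
    # Python's % always returns a value in [0, q) for positive q, even when
    # the intermediate difference t_s - leading * h is negative.
    return (d * (t_s - leading * h) + incoming) % q


d, q, m = 10, 13, 5          # q = 13 is an assumed modulus for illustration
h = pow(d, m - 1, q)         # 10^4 mod 13 = 3
t_s = 31415 % q              # 7
t_next = roll(t_s, leading=3, incoming=2, h=h, d=d, q=q)
print(h, t_s, t_next, 14152 % q)   # 3 7 8 8
```

In languages where the remainder of a negative number can be negative (C and Java, for example), q would have to be added back before the result is used.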
The test t_s ≡ p (mod q) is a fast heuristic
test to rule out invalid shifts s.
For any value of s,
if t_s ≡ p (mod q) is TRUE
and P[1..m] = T[s+1..s+m] is FALSE,
then s is called a SPURIOUS HIT.
Note : a) If t_s ≡ p (mod q) is TRUE,
then t_s = p may or may not be TRUE, so the characters must still be compared.
b) If t_s ≡ p (mod q) is FALSE,
then t_s ≠ p is definitely TRUE, so s is an invalid shift.
RABIN-KARP-MATCHER(T, P, d, q)
    n = T.length
    m = P.length
    h = d^{m-1} mod q
    p = 0
    t_0 = 0
    for i = 1 to m                  // preprocessing
        p = (d p + P[i]) mod q
        t_0 = (d t_0 + T[i]) mod q
    for s = 0 to n – m              // matching
        if p == t_s
            if P[1..m] == T[s+1..s+m]
                print “Pattern occurs with shift” s
        if s < n – m
            t_{s+1} = (d (t_s – T[s+1] h) + T[s+m+1]) mod q
Ex-7 : Let T = 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1
Let P = 3 1 4 1 5
Here n = 19, m = 5, d = 10,
q = 13 and h = 10^4 mod 13 = 3.
Initially, p = 0 and t_0 = 0.
First for statement (preprocessing) :
i = 1 : p = 3   t_0 = 2
i = 2 : p = 5   t_0 = 10
i = 3 : p = 2   t_0 = 1
i = 4 : p = 8   t_0 = 6
i = 5 : p = 7   t_0 = 8
Second for statement (matching) :
s    p   t_s   T[s+1..s+m]   p == t_s   Result         s < n – m   t_{s+1}
0    7   8     23590         FALSE      ---            TRUE        9
1    7   9     35902         FALSE      ---            TRUE        3
2    7   3     59023         FALSE      ---            TRUE        11
3    7   11    90231         FALSE      ---            TRUE        0
4    7   0     02314         FALSE      ---            TRUE        1
5    7   1     23141         FALSE      ---            TRUE        7
6    7   7     31415         TRUE       valid match    TRUE        8
7    7   8     14152         FALSE      ---            TRUE        4
8    7   4     41526         FALSE      ---            TRUE        5
9    7   5     15267         FALSE      ---            TRUE        10
10   7   10    52673         FALSE      ---            TRUE        11
11   7   11    26739         FALSE      ---            TRUE        7
12   7   7     67399         TRUE       spurious hit   TRUE        9
13   7   9     73992         FALSE      ---            TRUE        11
14   7   11    39921         FALSE      ---            FALSE       ---
Hence, there is only ONE VALID MATCH, at s = 6, and
only ONE SPURIOUS HIT, at s = 12.
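The whole of Ex-7 can be reproduced with a short Python sketch of RABIN-KARP-MATCHER. It assumes the text and pattern are strings of decimal digit characters, and it returns the spurious hits alongside the valid shifts, which the pseudocode itself does not do. The function name and the return format are choices made for this illustration.

```python
def rabin_karp_matcher(text: str, pattern: str, d: int = 10, q: int = 13):
    """Return (valid_shifts, spurious_hits) for digit strings text and pattern."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], []
    h = pow(d, m - 1, q)                     # d^(m-1) mod q
    p = t = 0
    for i in range(m):                       # preprocessing
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    valid, spurious = [], []
    for s in range(n - m + 1):               # matching
        if p == t:
            if text[s:s + m] == pattern:
                valid.append(s)
            else:
                spurious.append(s)           # values agree mod q, characters do not
        if s < n - m:
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return valid, spurious


print(rabin_karp_matcher("2359023141526739921", "31415"))   # ([6], [12])
```

With a larger modulus, spurious hits become rarer. For instance, 31415 mod 1009 = 136 while 67399 mod 1009 = 805, so q = 1009 (an arbitrary choice) no longer reports a hit at s = 12.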
4. The Knuth-Morris-Pratt Algorithm :
This algorithm performs pattern matching in a single left-to-right pass over the text; the text index never moves backwards.
Here, the prefix function π for a pattern
encapsulates knowledge about how the pattern
matches against shifts of itself :
π[q] is the length of the longest prefix of P that is a proper suffix of P[1..q].
Ex-8 : Let the text T and the pattern P be :
i    : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
T[i] : b a c b a b a b a c  a  c  a  c  a
q    : 1 2 3 4 5 6 7
P[q] : a b a b a c a
COMPUTE-PREFIX-FUNCTION(P)
    m = P.length
    let π[1..m] be a new array
    π[1] = 0
    k = 0
    for q = 2 to m
        while k > 0 and P[k+1] ≠ P[q]
            k = π[k]
        if P[k+1] == P[q]
            k = k + 1
        π[q] = k
    return π
Ex-8 (contd…)
q    : 1 2 3 4 5 6 7
P[q] : a b a b a c a
INIT : m = 7, π[1] = 0, k = 0
Step : q = 2 :
Here, k = 0, P[k+1] = a and P[q] = b.
So, while : FALSE and if : FALSE.
Hence, π[2] = 0
Step : q = 3 :
Here, k = 0, P[k+1] = a and P[q] = a.
So, while : FALSE and if : TRUE → k = 1
Hence, π[3] = 1
Step : q = 4 :
Here, k = 1, P[k+1] = b and P[q] = b.
So, while : FALSE and if : TRUE → k = 2
Hence, π[4] = 2
Step : q = 5 :
Here, k = 2, P[k+1] = a and P[q] = a.
So, while : FALSE and if : TRUE → k = 3
Hence, π[5] = 3
Step : q = 6 :
Here, k = 3, P[k+1] = b and P[q] = c.
So, while : TRUE → k = 1 ( = π[3] )
Now, k = 1, P[k+1] = b and P[q] = c.
So, while : TRUE → k = 0 ( = π[1] )
Now, k = 0, so the while loop ends.
if : FALSE (P[1] ≠ P[6])
Hence, π[6] = 0
Step : q = 7 :
Here, k = 0, P[k+1] = a and P[q] = a.
So, while : FALSE and
if : TRUE (P[1] == P[7]) → k = 1
Hence, π[7] = 1
Hence the π array is as follows :
q    : 1 2 3 4 5 6 7
π[q] : 0 0 1 2 3 0 1
Hence, COMPUTE-PREFIX-FUNCTION returns the array π = [0, 0, 1, 2, 3, 0, 1].
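The prefix-function computation can be sketched in Python as follows. Python lists are 0-indexed, so in this sketch pi[q - 1] holds the value the slides call π[q]; that index shift is the only change from the pseudocode, and the function name is simply a Python spelling of COMPUTE-PREFIX-FUNCTION.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """Return pi as a 0-based list: pi[q - 1] holds the slides' π[q]."""
    m = len(pattern)
    pi = [0] * m
    k = 0                                   # length of the currently matched prefix
    for q in range(1, m):                   # 0-based q (slides: q = 2..m)
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]                   # fall back to the next shorter candidate
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababaca"))   # [0, 0, 1, 2, 3, 0, 1]
```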
KMP-MATCHER(T, P)
    n = T.length
    m = P.length
    π = COMPUTE-PREFIX-FUNCTION(P)
    q = 0                               // number of characters matched
    for i = 1 to n                      // scan the text from left to right
        while q > 0 and P[q+1] ≠ T[i]
            q = π[q]                    // next character does not match
        if P[q+1] == T[i]
            q = q + 1                   // next character matches
        if q == m                       // is all of P matched?
            print “Pattern occurs with shift” i – m
            q = π[q]                    // look for the next match
Ex-8 (contd…)
KMP-MATCHER(T, P) :
INIT : n = 15, m = 7, π = [0, 0, 1, 2, 3, 0, 1], q = 0
In the table, q is the value at the start of iteration i, C1 is the test q > 0, C2 is the test P[q+1] ≠ T[i], and wh = C1 and C2 is the while condition. The first ‘if’ is P[q+1] == T[i] and the second ‘if’ is q == m.
-----------------------------------------------------------------------------------------------
i    q   C1   C2   wh   q = π[q]   if   q++     if   print     q = π[q]
-----------------------------------------------------------------------------------------------
1    0   F    T    F    ---        F    ---     F    ---       ---
2    0   F    F    F    ---        T    q = 1   F    ---       ---
3    1   T    T    T    q = 0      F    ---     F    ---       ---
4    0   F    T    F    ---        F    ---     F    ---       ---
5    0   F    F    F    ---        T    q = 1   F    ---       ---
6    1   T    F    F    ---        T    q = 2   F    ---       ---
7    2   T    F    F    ---        T    q = 3   F    ---       ---
8    3   T    F    F    ---        T    q = 4   F    ---       ---
9    4   T    F    F    ---        T    q = 5   F    ---       ---
10   5   T    F    F    ---        T    q = 6   F    ---       ---
11   6   T    F    F    ---        T    q = 7   T    shift 4   q = 1
12   1   T    T    T    q = 0      F    ---     F    ---       ---
13   0   F    F    F    ---        T    q = 1   F    ---       ---
14   1   T    T    T    q = 0      F    ---     F    ---       ---
15   0   F    F    F    ---        T    q = 1   F    ---       ---
-----------------------------------------------------------------------------------------------
Hence, P occurs in T exactly once, with shift s = 4 (T[5..11] = a b a b a c a).
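The matcher can be sketched in Python in the same way. The block repeats the prefix-function helper so that it runs on its own. Indices are 0-based, so a match that ends at 0-based position i has shift i - m + 1 (the slides' i – m with 1-based i). Returning a list of shifts and returning an empty list for an empty pattern are choices made for this illustration.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """pi[q - 1] holds the slides' π[q] (0-based list)."""
    pi = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text: str, pattern: str) -> list[int]:
    """Return all valid shifts of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []                        # assumption: an empty pattern reports nothing
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0                                # number of pattern characters matched
    for i in range(n):                   # the text index never moves backwards
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]                # next character does not match
        if pattern[q] == text[i]:
            q += 1                       # next character matches
        if q == m:                       # all of the pattern is matched
            shifts.append(i - m + 1)
            q = pi[q - 1]                # look for the next match
    return shifts


print(kmp_matcher("bacbababacacaca", "ababaca"))   # Ex-8: [4]
print(kmp_matcher("aaaa", "aa"))                   # overlapping matches: [0, 1, 2]
```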
