Combinatorial Algorithms
String Matching & Applications
Introduction
● String matching algorithms are fundamental in computer science, allowing us to search for a specific pattern
within a larger text efficiently. These algorithms play a crucial role in various real-world applications, from
text processing to security systems.
Why Are String Matching Algorithms Important?
● Helps in fast searching of text or patterns within a large dataset.
● Improves efficiency in data retrieval and pattern recognition.
● Used in various fields like bioinformatics, cybersecurity, search engines, and plagiarism detection.
Problem & Terminology
Types of String Matching Algorithms
A. Exact String Matching
These algorithms find occurrences where the pattern exactly matches a part of the text.
Examples of Exact Matching Algorithms:
1. Brute Force Algorithm:
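○ Compares the pattern with every substring of the same length in the text, in sequence.
○ Simple but inefficient for large texts: up to O(nm) character comparisons for a text of length n and a pattern of length m.
○ Slides the pattern one character at a time, checking for a match at each position.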
2. Knuth-Morris-Pratt (KMP) Algorithm:
○ Uses a preprocessing step (prefix function) to avoid unnecessary comparisons.
○ Efficient for large-scale text searching: runs in O(n + m) time.
○ Instead of sliding the pattern one step at a time, it jumps based on previous matches.
3. Boyer-Moore Algorithm:
○ Compares the pattern with the text from right to left, so a mismatch can often skip several text positions at once.
○ Uses two heuristics: bad-character heuristic (shifts based on mismatched character) and good-suffix heuristic (shifts based on
matched suffixes).
○ Works well for long patterns and large texts.
4. Rabin-Karp Algorithm:
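○ Uses hashing (typically a rolling hash) to quickly compare substrings; a hash match is then verified character by character to rule out collisions.
○ Well suited to searching for multiple patterns at once, particularly patterns of equal length.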
5. Aho-Corasick Algorithm:
○ Uses a Trie data structure for searching multiple patterns simultaneously.
○ Commonly used in network security and bioinformatics.
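The hashing idea behind Rabin-Karp can be sketched as follows. This is a minimal illustration, not a tuned implementation: the function name rabin_karp, the base 256, and the modulus 1,000,000,007 are assumptions chosen for the example.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return all 0-based shifts at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    shifts = []
    for s in range(n - m + 1):
        # Equal hashes are only a hint; compare the strings to rule out collisions.
        if p_hash == t_hash and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # Roll the window: drop text[s], append text[s + m].
            t_hash = ((t_hash - ord(text[s]) * high) * base + ord(text[s + m])) % mod
    return shifts


print(rabin_karp("abracadabra", "abra"))
```

Updating the hash takes constant time per shift, so the expensive character-by-character comparison only runs when the hashes agree.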
B. Approximate String Matching Algorithms
These algorithms find matches even when there are slight differences (e.g., typos, mutations in DNA sequences).
Examples of Approximate Matching Algorithms:
1. Naive Approach:
○ Similar to the exact matching naive approach but allows minor differences.
2. Sellers Algorithm:
○ Uses dynamic programming to compute the edit distance between the pattern and substrings of the text.
3. Shift-Or Algorithm:
○ Uses bitwise operations to speed up searching; bit-parallel variants extend it to texts with errors.
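The dynamic-programming approach can be sketched as below. This is a minimal version in the style of Sellers' algorithm, assuming unit costs for insertion, deletion, and substitution; the function name sellers and the error bound k are illustrative choices, not taken from the slides.

```python
def sellers(text, pattern, k):
    """Return (end_index, distance) pairs for every 0-based text position at
    which some substring ending there is within edit distance k of pattern."""
    m = len(pattern)
    # col[i] holds the smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # value for (i - 1, j - 1)
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(prev_diag + cost,  # match or substitution
                      col[i] + 1,        # extra character in the text
                      col[i - 1] + 1)    # missing character in the text
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            hits.append((j, col[m]))
    return hits


print(sellers("abcdef", "bcd", 1))
```

Setting the first entry of every column to 0 is what distinguishes this from a plain edit-distance computation: it lets an approximate match begin anywhere in the text.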
Real-World Applications of String Matching Algorithms
A. Plagiarism Detection
● Compares documents to find similarities.
● Used in academic institutions and research publications.
● Example: Turnitin, Grammarly.
B. Bioinformatics and DNA Sequencing
● Finds patterns in genetic sequences.
● Helps in identifying mutations, gene mapping, and disease research.
● Example: BLAST (Basic Local Alignment Search Tool).
C. Digital Forensics
● Locates specific keywords in large datasets during investigations.
● Used in crime detection and cybersecurity.
● Example: Searching for illegal keywords in emails or chat logs.
D. Spell Checking and Auto-correction
● Uses Trie structures and approximate matching to detect misspellings.
● Example: Microsoft Word spell checker, Google Keyboard auto-correct.
E. Spam Filters
● Detects spam emails by searching for common spam phrases.
● Example: Gmail's spam filtering system.
F. Search Engines and Database Searching
● Indexes and retrieves relevant information based on search keywords.
● Example: Google Search, SQL full-text search.
G. Intrusion Detection Systems (IDS)
● Identifies malicious network packets by matching with known attack signatures.
● Example: Snort, an open-source IDS.
String Matching Problem and Terminology
● A string w is a prefix of x if x = wy for some string y.
● Similarly, a string w is a suffix of x if x = yw for some string y.
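The two definitions can be checked directly in code. The helper names is_prefix and is_suffix are hypothetical and exist only for this illustration; note that the string y may be empty, so every string is a prefix and a suffix of itself.

```python
def is_prefix(w, x):
    """True if x = w + y for some (possibly empty) string y."""
    return x[:len(w)] == w


def is_suffix(w, x):
    """True if x = y + w for some (possibly empty) string y."""
    return len(w) <= len(x) and x[len(x) - len(w):] == w


print(is_prefix("ab", "abc"), is_suffix("bc", "abc"), is_prefix("bc", "abc"))
```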
Algorithms
Brute Force Algorithm
Initially, the pattern P is aligned with the text T at the first index position. P is then compared with T from
left to right. If a mismatch occurs, "slide" P to the right by 1 position and start the
comparison again.
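BF_StringMatcher(T, P) {
    // T and P are indexed from 1; s is the number of positions P has been shifted
    n = length(T); m = length(P);
    for (s = 0; s <= n - m; s++) {
        j = 1;
        // advance while the characters of P and T agree
        while (j <= m && T[s+j] == P[j]) {
            j++;
        }
        // all m characters matched
        if (j == m + 1) print("Pattern occurs with shift=", s);
    }
}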
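A runnable counterpart of the brute-force pseudocode might look like this. It is a sketch that assumes 0-based indexing and returns the list of shifts instead of printing them; the function name brute_force_match is an illustrative choice.

```python
def brute_force_match(text, pattern):
    """Return every 0-based shift s such that text[s:s + m] equals pattern."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        j = 0
        # Compare left to right until a mismatch or the end of the pattern.
        while j < m and text[s + j] == pattern[j]:
            j += 1
        if j == m:
            shifts.append(s)
    return shifts


print(brute_force_match("abababa", "aba"))
```

In the worst case (for example, a text and a pattern made of one repeated character) the inner loop runs m times for almost every shift, which gives the O(nm) behaviour.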
The Knuth-Morris-Pratt (KMP) Algorithm
In the Brute-Force algorithm, if a mismatch occurs at P[j] (j > 1), it only slides P to the right
by 1 step. This throws away a piece of information that we already know. What is that
piece of information?
Let s be the current shift value. Since the mismatch is
at P[j], we know that T[s+1..s+j-1] = P[1..j-1].
How can we make use of this information to make the next shift? In general, P should
slide to a shift s' > s such that P[1..k] = T[s'+1..s'+k]. We then compare
P[k+1] with T[s'+k+1].
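The shifting rule can be sketched in code as follows. This is a minimal version of KMP assuming 0-based indexing; the names prefix_function and kmp_match are illustrative. The table pi records, for each prefix of the pattern, the length k of its longest proper prefix that is also a suffix, which is exactly the number of characters already known to match after the slide.

```python
def prefix_function(pattern):
    """pi[j] = length of the longest proper prefix of pattern[:j + 1]
    that is also a suffix of it."""
    pi = [0] * len(pattern)
    k = 0
    for j in range(1, len(pattern)):
        while k > 0 and pattern[j] != pattern[k]:
            k = pi[k - 1]  # fall back to the next shorter border
        if pattern[j] == pattern[k]:
            k += 1
        pi[j] = k
    return pi


def kmp_match(text, pattern):
    """Return all 0-based shifts at which pattern occurs in text."""
    if not pattern:
        return []
    pi = prefix_function(pattern)
    shifts = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = pi[k - 1]  # slide the pattern, keeping k matched characters
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            shifts.append(i - k + 1)
            k = pi[k - 1]  # keep going to find overlapping matches
    return shifts


print(prefix_function("ababaca"))
print(kmp_match("abababa", "aba"))
```

The text index i never moves backwards, which is why the search phase takes O(n) time after O(m) preprocessing.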
References
https://www.geeksforgeeks.org/applications-of-string-matching-algorithms/
