Akhil Kaushik
Asstt. Prof., CE Deptt.,
TIT Bhiwani
Lexical Analysis - Implementation
Lexical Analysis
Tokens
Tokens -> RE
• Specification of Tokens: The patterns corresponding
to a token are generally specified using a compact
notation known as a regular expression.
• The regular expressions of a language are built by
combining the symbols of its alphabet.
• A regular expression r denotes a set of strings L(r),
called a regular set or regular language, which may
be infinite.
RegEx
A regular expression is defined as follows:-
• A basic regular expression a denotes the set {a},
where a ∈ Σ; that is, L(a) = {a}.
• The regular expression ε denotes the set {ε}, which
contains only the empty string.
• If r and s are two regular expressions denoting the
sets L(r) and L(s), then larger regular expressions are
built from them by a small set of rules.
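The rules themselves are not reproduced in this text. Assuming they are the three standard ones, union (r|s), concatenation (rs) and Kleene closure (r*), the sketch below models each as an operation on Python sets of strings. The function names and the max_len cut-off are illustrative assumptions; the cut-off is needed only because L(r*) is infinite.

```python
from itertools import product

def union(A, B):
    """L(r|s): strings that are in L(r) or in L(s)."""
    return A | B

def concat(A, B):
    """L(rs): every string of L(r) followed by every string of L(s)."""
    return {x + y for x, y in product(A, B)}

def star(A, max_len):
    """Finite slice of L(r*): all its strings of length <= max_len."""
    result = {""}      # the empty string is always in L(r*)
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string from A.
        frontier = {w for w in concat(frontier, A)
                    if len(w) <= max_len and w not in result}
        result |= frontier
    return result

a, b = {"a"}, {"b"}
print(sorted(union(a, b)))    # L(a|b)
print(sorted(concat(a, b)))   # L(ab)
print(sorted(star(union(a, b), 2), key=lambda w: (len(w), w)))  # part of L((a|b)*)
```

Raising max_len keeps producing new strings, which is the sense in which a regular language may be infinite.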
Token Recognition
• Finite automata are recognizers that can identify the
tokens occurring in the input stream.
• A finite automaton (FA) consists of:
– A finite set of states
– A set of transitions (or moves) between states, each
labeled with an input symbol
– A special start state
– A set of final or accepting states
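As a minimal sketch of these four components, the deterministic automaton below recognizes identifiers of the form letter (letter | digit)*. The state names q0 and q1 and the helper char_class are assumptions made for illustration.

```python
def char_class(ch):
    """Map a character to the input symbol used on the transitions."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

STATES = {"q0", "q1"}      # finite set of states
START = "q0"               # special start state
ACCEPTING = {"q1"}         # final (accepting) states
TRANSITIONS = {            # moves between states
    ("q0", "letter"): "q1",
    ("q1", "letter"): "q1",
    ("q1", "digit"): "q1",
}

def accepts(text):
    """Run the automaton over text; True if it ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no move defined: reject
            return False
    return state in ACCEPTING

print(accepts("count1"))   # True
print(accepts("1count"))   # False
```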
Transition Diagrams
Input Buffering
• Buffer Pairs: Reading the source program takes a
significant amount of time because of the large
number of characters that must be processed.
• Specialized buffering techniques have therefore been
developed to reduce the overhead required to process
a single input character.
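The slides do not spell out the buffer-pair scheme, so the following is only a sketch under the usual textbook assumption: two fixed-size halves are refilled alternately with one block read each, so the cost of a read is shared by many characters. The class name BufferPair, the half_size parameter and the reads counter are hypothetical.

```python
import io

class BufferPair:
    """Two fixed-size halves that are refilled alternately from the source."""

    def __init__(self, stream, half_size=4):
        self.stream = stream
        self.half_size = half_size
        self.halves = ["", ""]
        self.active = 0    # index of the half currently being scanned
        self.forward = 0   # offset of the next character in that half
        self.reads = 0     # number of block reads issued so far
        self._fill(0)

    def _fill(self, index):
        # One block read brings in up to half_size characters at once.
        self.halves[index] = self.stream.read(self.half_size)
        self.reads += 1

    def next_char(self):
        """Return the next character, or None at end of input."""
        current = self.halves[self.active]
        if self.forward == len(current):
            if len(current) < self.half_size:
                return None            # a short half means the source is exhausted
            self.active = 1 - self.active
            self._fill(self.active)    # reload the other half
            self.forward = 0
            if not self.halves[self.active]:
                return None
        ch = self.halves[self.active][self.forward]
        self.forward += 1
        return ch

buf = BufferPair(io.StringIO("int x=10;"), half_size=4)
text = "".join(iter(buf.next_char, None))
print(text, buf.reads)
```

A production scanner would use a much larger half size (typically a disk block) and would also track where the current lexeme begins; both are left out here to keep the sketch short.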
Language to Specify LA
• Lex allows one to specify a lexical analyzer by
writing regular expressions that describe the patterns
for tokens.
• The input notation for the Lex tool is referred to as the
Lex language and the tool itself is the Lex compiler.
• Behind the scenes, the Lex compiler transforms the
input patterns into a transition diagram and
generates code, in a file called lex.yy.c, that simulates
this transition diagram.
Language to Specify LA
• Creating an LA with Lex, in simple words:-
– Lex converts a Lex source program into a lexical
analyzer.
– The lexical analyzer converts the input stream into
tokens.
Lex Source Program - Structure
A Lex program has the following form:-
• Auxiliary Definitions – these give names to REs, in
the form
D1 = R1 // Di is a shorthand name for an RE
D2 = R2 // Ri is the RE itself
• Translation Rules – rules that tell the LA which
action to take upon recognizing each token.
Lex Source Program - Structure
Auxiliary Definitions:-
• D1 (letter) = A | B | C | … | Z | a | b | … | z (R1)
• D2 (digit) = 0 | 1 | 2 | … | 9 (R2)
• D3 (identifier) = letter (letter | digit)*
• D4 (integer) = digit+
• D5 (sign) = + | -
• D6 (signed-integer) = sign integer
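The same chain of definitions can be written with Python's re module, with each name reused by later definitions just as D3 to D6 reuse D1 and D2. This is an illustrative translation, not Lex syntax; the helper matches is an assumption.

```python
import re

# Each definition is a pattern string; later ones embed earlier ones.
letter = r"[A-Za-z]"
digit = r"[0-9]"
identifier = rf"{letter}(?:{letter}|{digit})*"
integer = rf"{digit}+"
sign = r"[+-]"
signed_integer = rf"{sign}{integer}"

def matches(definition, text):
    """True if the whole of text belongs to the language of definition."""
    return re.fullmatch(definition, text) is not None

print(matches(identifier, "x25"))        # True
print(matches(identifier, "9lives"))     # False
print(matches(signed_integer, "-42"))    # True
print(matches(signed_integer, "42"))     # False: D6 requires a sign
```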
Lex Source Program - Structure
Translation Rules:-
• Each translation rule has the form: Pi {Actioni}
• Each pattern Pi is a regular expression, which may
use the regular definitions of the declaration section.
• The actions are fragments of code, typically written
in C.
• Ex: for a keyword -> begin {return 1;}
• Ex: for a variable -> identifier {install(); return 6;}
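The pattern/action pairing can be sketched as a list of (pattern, action) pairs that is tried in order. The return codes 1 and 6 come from the two examples above; the symbol_table list, the body of install, and the lookahead that stops begin from matching a prefix of a longer name are assumptions made so that this simplified first-match loop behaves sensibly.

```python
import re

symbol_table = []   # hypothetical store filled by install()

def install(lexeme):
    """Stand-in for install(): record the lexeme if it is new."""
    if lexeme not in symbol_table:
        symbol_table.append(lexeme)

def on_begin(lexeme):
    return 1                     # action for: begin {return 1;}

def on_identifier(lexeme):
    install(lexeme)              # action for: identifier {install(); return 6;}
    return 6

RULES = [
    (re.compile(r"begin(?![A-Za-z0-9])"), on_begin),
    (re.compile(r"[A-Za-z][A-Za-z0-9]*"), on_identifier),
]

def tokenize(text):
    """Return the token codes produced by running the rules over text."""
    codes, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():  # skip blanks between lexemes
            pos += 1
            continue
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            if m:
                codes.append(action(m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
    return codes
```

For instance, tokenize("begin x begin1 x") yields [1, 6, 6, 6] and leaves symbol_table as ["x", "begin1"].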
Implementation of LA
• Lex generates an LA as its output, taking a Lex
program as input.
• A Lex program is a collection of patterns (REs) and
their corresponding actions.
• The patterns represent the tokens to be recognized by
the generated LA.
• For each pattern, a corresponding NFA is
constructed.
Implementation of LA
• For n patterns there are n NFAs.
• A new start state is added and joined by ε-transitions
to the start state of every NFA, combining them all.
• The final state of each NFA shows that it has found
its own token Pi.
• The combined NFA is converted into a DFA.
• The final state reached in the DFA shows which token
has been found.
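The steps above can be sketched with the subset construction. The three patterns used here (P1 = a, P2 = abb, P3 = a*b+) and the state numbering are a common textbook example chosen for illustration, not taken from the slides; all function names are assumptions.

```python
from collections import deque

EPS = None  # label used for ε-transitions

# Combined NFA: new start state 0 with ε-moves into each pattern's NFA.
NFA = {
    (0, EPS): {1, 3, 7},
    (1, "a"): {2},                                  # P1 = a
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},    # P2 = abb
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},    # P3 = a*b+
}
NFA_FINAL = {2: 1, 6: 2, 8: 3}   # NFA final state -> pattern number
ALPHABET = "ab"

def eps_closure(states):
    """All NFA states reachable from states using ε-moves only."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def move(states, symbol):
    """NFA states reachable from states on one input symbol."""
    return {t for s in states for t in NFA.get((s, symbol), ())}

def subset_construction():
    """Build the DFA; each DFA state is a frozenset of NFA states."""
    start = eps_closure({0})
    dfa, seen, queue = {}, {start}, deque([start])
    while queue:
        S = queue.popleft()
        for sym in ALPHABET:
            T = eps_closure(move(S, sym))
            if not T:
                continue             # no transition on this symbol
            dfa[(S, sym)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return start, dfa, seen

def patterns_found(text):
    """Pattern numbers whose NFA final state lies in the DFA state reached."""
    start, dfa, _ = subset_construction()
    state = start
    for ch in text:
        state = dfa.get((state, ch))
        if state is None:
            return []
    return sorted(NFA_FINAL[s] for s in state if s in NFA_FINAL)

print(patterns_found("a"))     # [1]
print(patterns_found("aab"))   # [3]
print(patterns_found("abb"))   # [2, 3]
```

The input abb ends in a DFA state that contains the final states of both P2 and P3, so a further rule is needed to choose between them.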
Implementation of LA
• If none of the DFA states reached includes a final
state of the NFA, an error is reported.
• If a final state of the DFA includes more than one
final state of the NFA, the pattern that comes first in
the translation rules has priority.
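Both cases can be shown with a small scanner. It uses Python's re module in place of a generated DFA, and it also prefers the longest match, which is how Lex behaves but is not stated in the bullets above. The token names and the functions next_token and scan are assumptions for illustration.

```python
import re

# Rules in priority order: earlier rules win when match lengths are equal.
RULES = [
    ("KEYWORD", re.compile(r"begin")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("INTEGER", re.compile(r"[0-9]+")),
]

def next_token(text, pos):
    """Pick the longest match at pos; on a tie the earliest rule is kept."""
    best = None   # (length, rule name)
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Strict '>' means a later rule never replaces an equally long match.
        if m and (best is None or len(m.group()) > best[0]):
            best = (len(m.group()), name)
    if best is None:
        raise ValueError(f"lexical error at position {pos}: {text[pos]!r}")
    length, name = best
    return name, text[pos:pos + length]

def scan(text):
    """Split text into (token name, lexeme) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        name, lexeme = next_token(text, pos)
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

print(scan("begin beginner 42"))
```

Here begin matches both the keyword rule and the identifier rule with the same length, so the keyword rule wins because it is listed first, while beginner is an identifier because that match is longer. An input such as "x = 1" raises ValueError at the "=" because no rule matches there.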
Akhil Kaushik
akhilkaushik05@gmail.com
9416910303
