Compiler Design
Lex & Flex
Lexical Analyzer Generator
• Lex is a program generator designed for lexical
processing of character input streams.
• Lex source is a table of regular expressions and
corresponding program fragments.
• The table is translated to a program which reads an
input stream, copying it to an output stream and
partitioning the input into strings which match the
given expressions.
• The compiler writer uses specialized tools that
produce components that can easily be integrated
in the compiler and help implement various phases
of a compiler.
1/31/2017
ANKUR SRIVASTAVA ASSISTANT
PROFESSOR JIT
2
Contd…..
• The program fragments written by the user are executed in the
order in which the corresponding regular expressions occur in
the input stream.
• Lex can generate analyzers in either “C” or “Ratfor”, a language
which can be translated automatically to portable Fortran.
• Lex is a program designed to generate scanners, also known as
tokenizers, which recognize lexical patterns in text.
• Lex is an acronym that stands for "lexical analyzer generator".
• It is intended primarily for Unix-based systems.
• The code for Lex was originally developed by Eric Schmidt and
Mike Lesk.
Contd……
• Lex can be used with a parser generator to perform lexical
analysis.
• It is easy, for example, to interface Lex and Yacc, an open
source program that generates code for the parser in the C
programming language.
• Lex can perform simple transformations by itself, but its main
purpose is to facilitate lexical analysis: the processing of character
sequences, such as source code, to produce symbol sequences called
tokens for use as input to other programs such as parsers.
Contd……
Processes in lexical analyzers
– Scanning
• Pre-processing
– Strip out comments and white space
– Macro functions
– Correlating error messages from compiler with
source program
• A line number can be associated with an error
message
– Lexical analysis
Contd……
Terms of the lexical analyzer
– Token
• Types of words in source program
• Keywords, operators, identifiers, constants,
literal strings, punctuation symbols (such as
commas, semicolons)
– Lexeme
• Actual words in source program
Contd…..
– Pattern
• A rule describing the set of lexemes that can
represent a particular token in source program
• Example: the token relation can stand for any lexeme in { <, <=, >, >=, ==, <> }
Lexical Errors
– Deleting an extraneous character
– Inserting a missing character
– Replacing an incorrect character by a correct character
– Pre-scanning
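The repair actions above all amount to asking whether a misspelled lexeme is one character away from a valid one. A minimal sketch of that test in plain C (illustrative only; `one_edit_away` is a hypothetical helper, not part of Lex):

```c
#include <string.h>

/* Report whether two strings are within one single-character edit of
 * each other -- one deletion, one insertion, or one replacement.
 * A recovering scanner could use such a test to repair a misspelled
 * keyword. */
int one_edit_away(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    if (la < lb) {                      /* ensure a is the longer string */
        const char *tp = a; a = b; b = tp;
        size_t tl = la; la = lb; lb = tl;
    }
    if (la - lb > 1) return 0;          /* lengths differ by 2+: impossible */
    size_t i = 0, j = 0;
    int edits = 0;
    while (i < la && j < lb) {
        if (a[i] == b[j]) { i++; j++; continue; }
        if (++edits > 1) return 0;
        if (la == lb) { i++; j++; }     /* count as a replacement */
        else i++;                       /* skip the extra character */
    }
    if (i < la || j < lb) edits++;      /* one trailing unmatched character */
    return edits <= 1;
}
```

With this test, `els` (missing character), `elsse` (extraneous character), and `elze` (incorrect character) would all be repaired to the keyword `else`.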
Compilation Sequence
What is Lex?
• The main job of a lexical analyzer (scanner) is
to break up an input stream into more usable
elements (tokens)
a = b + c * d;
ID ASSIGN ID PLUS ID MULT ID SEMI
• Lex is a utility to help you rapidly generate
your scanners
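The token stream above can be sketched with a tiny hand-written classifier in plain C (an illustration of what a scanner produces, not Lex-generated code; `classify` and `tokenize` are hypothetical names). Since every identifier in the example is a single letter, each non-space character maps to exactly one token:

```c
#include <ctype.h>
#include <string.h>

/* Map one character of the example statement to its token name. */
const char *classify(char c) {
    if (isalpha((unsigned char)c)) return "ID";
    switch (c) {
    case '=': return "ASSIGN";
    case '+': return "PLUS";
    case '*': return "MULT";
    case ';': return "SEMI";
    default:  return "UNKNOWN";
    }
}

/* Write the space-separated token stream for `input` into `out`. */
void tokenize(const char *input, char *out) {
    out[0] = '\0';
    for (const char *p = input; *p; p++) {
        if (*p == ' ') continue;        /* whitespace separates tokens */
        if (out[0]) strcat(out, " ");
        strcat(out, classify(*p));
    }
}
```

Running `tokenize` on `"a = b + c * d;"` yields the stream shown on the slide: `ID ASSIGN ID PLUS ID MULT ID SEMI`.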
Lex – Lexical Analyzer
• Lexical analyzers tokenize input streams
• Tokens are the terminals of a language
– English
• words, punctuation marks, …
– Programming language
• Identifiers, operators, keywords, …
• Regular expressions define terminals/tokens
• Some widely used lexer generators are
• Flex lexical analyser
• Yacc
• Ragel
• PLY (Python Lex-Yacc)
Contd…..
• Lex turns the user's expressions and actions (called source in this memo)
into the host general-purpose language.
• The generated program is named yylex.
• The yylex program will recognize expressions in a stream (called input in
this memo) and perform the specified actions for each expression as it is
detected, as shown in the figure below.
Source -> [ Lex ] -> yylex
Input -> [ yylex ] -> Output
Fig. An overview of Lex
yylex()
• It is the main entry point for Lex.
• It reads the input stream, generates tokens, and returns
0 at the end of the input stream.
• It is called to invoke the lexer or scanner.
• Each time yylex() is called, the scanner continues
processing the input from where it last left off.
• Related names:
• yyout (the output stream)
• yyin (the input stream)
• yywrap (called when the scanner reaches end of input)
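The calling contract above can be sketched with a hand-rolled stand-in (illustrative only; a real flex-generated yylex() is table-driven, and `next_token` and the token codes here are hypothetical names):

```c
#include <ctype.h>

/* Stand-in for a generated yylex(): each call returns the code of the
 * next token from a fixed buffer, continuing where the previous call
 * left off, and returns 0 at end of input. */
enum { END = 0, NUMBER = 1, WORD = 2 };

static const char *cursor = "abc 123 xy";       /* plays the role of yyin */

int next_token(void) {
    for (;;) {
        while (*cursor == ' ') cursor++;        /* skip whitespace */
        if (*cursor == '\0') return END;        /* like yylex(): 0 at EOF */
        if (isdigit((unsigned char)*cursor)) {  /* maximal run of digits */
            while (isdigit((unsigned char)*cursor)) cursor++;
            return NUMBER;
        }
        if (isalpha((unsigned char)*cursor)) {  /* maximal run of letters */
            while (isalpha((unsigned char)*cursor)) cursor++;
            return WORD;
        }
        cursor++;                               /* ignore anything else */
    }
}
```

Successive calls on the buffer above return WORD, NUMBER, WORD, and then 0, mirroring how a parser repeatedly calls yylex() until the input is exhausted.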
Lexical Analyzer Generator - Lex

lex.l (Lex source program) -> [ Lex compiler ] -> lex.yy.c
lex.yy.c -> [ C compiler ] -> a.out
Input stream -> [ a.out ] -> Sequence of tokens
Structure of a Lex file
• The structure of a Lex file is intentionally similar to
that of a yacc file.
• Lex files are divided into three sections, separated by
lines that contain only two percent signs, as follows:
• Definition section
%%
• Rules section
%%
• C code section
Contd…..
• The definition section defines macros and imports header files
written in C. It is also possible to write any C code here, which
will be copied verbatim into the generated source file.
• The rules section associates regular expression patterns with
C statements. When the lexer sees text in the input matching a
given pattern, it will execute the associated C code.
• The C code section contains C statements and functions that are
copied verbatim to the generated source file. These statements
presumably contain code called by the rules in the rules section.
In large programs it is more convenient to place this code in a
separate file linked in at compile time.
Example
• The following is an example Lex file for the flex version of Lex.
• It recognizes strings of numbers (positive integers) in the input,
and simply prints them out.
/*** Definition section ***/
%{
/* C code to be copied */
#include <stdio.h>
%}
/* This tells flex to read only one input file
*/
%option noyywrap
%%
Contd……
/*** Rules section ***/
/* [0-9]+ matches a string of one or more digits */
[0-9]+ {
/* yytext is a string containing the matched text.
*/
printf("Saw an integer: %s\n", yytext);
}
.|\n { /* Ignore all other characters. */
}
%%
/*** C Code section ***/
Contd…..
int main(void)
{
/* Call the lexer, then quit. */
yylex();
return 0;
}
If this input is given to flex, it will be converted into a C file, lex.yy.c.
This can be compiled into an executable which matches and
outputs strings of integers. For example, given the input
abc123z.!&*2gj6
the program will print:
Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
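The generated scanner's behaviour on this input can be mirrored by a small hand-written C function (an illustration, not flex output; `scan_integers` is a hypothetical name):

```c
#include <ctype.h>
#include <stdio.h>

/* Hand-written mirror of the flex rules above: find every maximal run
 * of digits in `s`, append one "Saw an integer: ..." line per run into
 * `out`, and ignore all other characters. */
void scan_integers(const char *s, char *out, size_t outsz) {
    size_t used = 0;
    out[0] = '\0';
    while (*s && used < outsz) {
        if (isdigit((unsigned char)*s)) {
            const char *start = s;
            while (isdigit((unsigned char)*s)) s++;   /* longest match */
            used += snprintf(out + used, outsz - used,
                             "Saw an integer: %.*s\n",
                             (int)(s - start), start);
        } else {
            s++;                                      /* the .|\n rule */
        }
    }
}
```

Applied to "abc123z.!&*2gj6" it produces exactly the three report lines shown above, because the `[0-9]+` pattern always takes the longest run of digits available.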
Flex, A fast scanner generator
• flex is a tool for generating scanners: programs which
recognize lexical patterns in text.
• flex reads the given input files, or its standard input if no file
names are given, for a description of a scanner to generate.
• The description is in the form of pairs of regular expressions and
C code, called rules.
• flex generates as output a C source file, `lex.yy.c', which defines
a routine `yylex()'.
• This file is compiled and linked with the `-lfl' library to produce
an executable.
• When the executable is run, it analyzes its input for occurrences
of the regular expressions.
• Whenever it finds one, it executes the corresponding C code.
Contd…
• Flex (fast lexical analyzer generator) is a free and
open-source software alternative to lex.
• It is a computer program that generates lexical
analyzers (also known as "scanners" or "lexers").
• Flex was written in C by Vern Paxson around 1987; he was
translating an earlier Ratfor generator, whose development
had been led by Jef Poskanzer.
• In a typical example, the tokens recognized are: '+', '-', '*', '/', '=', '(', ')', ',',
';', '.', ':=', '<', '<=', '<>', '>', '>=';
• numbers: 0-9 {0-9} (one or more digits); identifiers: a-zA-Z {a-zA-Z0-9};
and keywords: begin, call, const, do, end, if, odd, procedure,
then, var, while.
Thanks