Lexical Analyzers and Parsers
Heshan T. Suriyaarachchi
 
Lexical analysis is the process of converting a sequence of characters into a sequence of tokens. Programs performing lexical analysis are called lexical analyzers or lexers. A lexer consists of a scanner and a tokenizer.
 
A lexical analyzer breaks an input stream of characters into tokens. Writing lexical analyzers by hand can be a tedious process, so software tools have been developed to ease this task. Perhaps the best-known such utility is Lex, a lexical analyzer generator for the UNIX operating system, targeted at the C programming language.
Lex takes a specially formatted specification file containing the details of a lexical analyzer and creates a C source file for the associated table-driven lexer. The JLex utility is based on the Lex lexical analyzer generator model: JLex takes a specification file similar to that accepted by Lex, then creates a Java source file for the corresponding lexical analyzer.
A JLex input file is organized into three sections, separated by double-percent directives ("%%"). A proper JLex specification has the following format.
    user code
    %%
    JLex directives
    %%
    regular expression rules

The "%%" directives separate the sections of the input file and must be placed at the beginning of their line.
The user code section, the first section of the specification file, is copied directly into the resulting output file. This area of the specification provides space for the implementation of utility classes or return types. The JLex directives section is the second part of the input file; here, macro definitions are given and state names are declared. The third section contains the rules of lexical analysis, each of which consists of three parts: an optional state list, a regular expression, and an action.
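A minimal sketch pulling the three sections together (the Yytoken class, the %type value, and the rules below are illustrative assumptions, not part of the original slides):

    class Yytoken {
        String type;
        String text;
        Yytoken(String type, String text) { this.type = type; this.text = text; }
    }
    %%
    %type Yytoken
    DIGIT = [0-9]
    %%
    {DIGIT}+     { return new Yytoken("NUMBER", yytext()); }
    [ \t\r\n]    { /* skip whitespace */ }
    .            { return new Yytoken("OTHER", yytext()); }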
This code is copied verbatim into the lexical analyzer source file that JLex outputs, at the top of the file. Therefore, if the lexer source file needs to begin with a package declaration or with the import of an external class, the user code section should begin with the corresponding declaration, which will then be copied to the top of the generated source file.
The JLex directive section begins after the first ``%%'' and continues until the second ``%%'' delimiter. Each JLex directive should be contained on a single line and should begin that line.
The third part of the JLex specification consists of a series of rules for breaking the input stream into tokens. These rules specify regular expressions, then associate these expressions with actions consisting of Java source code. The rules have three distinct parts: the optional state list, the regular expression, and the associated action. This format is represented as follows.

    [<states>] <expression> { <action> }
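For example, a rule that applies only inside a lexical state might look like this (assuming a COMMENT state declared with %state COMMENT in the directives section; yybegin() and the default YYINITIAL state are standard JLex facilities):

    <COMMENT> "*/"    { yybegin(YYINITIAL); }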
If more than one rule matches strings from its input, the generated lexer resolves conflicts by greedily choosing the rule that matches the longest string. When two rules match strings of equal length, the rule appearing earlier in the specification is given higher priority. If the generated lexical analyzer receives input that does not match any of its rules, an error is raised.
Therefore, all input should be matched by at least one rule. This can be guaranteed by placing the following rule at the bottom of a JLex specification:

    .    { java.lang.System.out.println("Unmatched input: " + yytext()); }

The dot (.) will match any input character except the newline.
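The longest-match and priority behavior can be seen with a pair of overlapping rules. In the sketch below, the input "if" matches both expressions at equal length, so the earlier rule wins and the lexeme is reported as a keyword rather than an identifier (the Yytoken class and token names are illustrative assumptions):

    "if"         { return new Yytoken("KEYWORD_IF", yytext()); }
    [a-zA-Z]+    { return new Yytoken("IDENTIFIER", yytext()); }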
JLex takes a properly formed specification and transforms it into a Java source file for the corresponding lexical analyzer. A benchmark experiment was conducted to compare the performance of a lexical analyzer generated by JLex with that of a hand-written lexical analyzer.
The comparison was made for lexical analyzers of a simple "toy" programming language. The hand-written lexical analyzer was written in Java. The experiment consisted of running each lexical analyzer on two source files written in the toy language and measuring the time required to process them.
The generated lexical analyzer proved to be quite quick, as the following results show.

    Source File    JLex-Generated Lexical Analyzer    Hand-Written Lexical Analyzer
    177 lines      0.42 seconds                       0.53 seconds
    897 lines      0.98 seconds                       1.28 seconds

The JLex lexical analyzer soundly outperformed the hand-written lexer.
One of the biggest complaints about table-driven lexical analyzers generated by programs like JLex is that these lexical analyzers do not perform as well as hand-written ones. Therefore, this experiment is particularly important in demonstrating the relative speed of JLex lexical analyzers.
The following is a (possibly incomplete) list of unimplemented features of JLex.

1) The regular expression lookahead operator is unimplemented and is not included in the list of special regular expression metacharacters.

2) The start-of-line operator (^) exhibits the following nonstandard behavior: a match on a regular expression that uses this operator causes the newline that precedes the match to be discarded.
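As an illustration of the second point, a hypothetical rule for line-initial comments would also consume the newline that precedes the match:

    ^"#".*    { /* nonstandard: the preceding newline is discarded, not returned */ }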
To build and run a JLex-generated lexer, the sequence of commands is:

    javac Main.java              (compile JLex itself)
    java JLex.Main sample.lex    (generate sample.lex.java from the specification)
    javac sample.lex.java        (compile the generated lexer)
    java Sample                  (run the lexer)
Java CUP is a parser generator for Java. Java CUP compatibility is turned off by default, but can be activated with the following JLex directive.

    %cup
When given, this directive makes the generated scanner conform to the java_cup.runtime.Scanner interface. It has the same effect as the following three directives:

    %implements java_cup.runtime.Scanner
    %function next_token
    %type java_cup.runtime.Symbol
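With %cup in effect, rule actions typically return java_cup.runtime.Symbol objects. A minimal sketch (the sym.NUMBER constant is assumed to come from a CUP-generated symbol table):

    %%
    %cup
    %%
    [0-9]+    { return new java_cup.runtime.Symbol(sym.NUMBER, Integer.valueOf(yytext())); }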
 
 
 
Parsing (syntactic analysis) is the process of analyzing a sequence of tokens to determine their grammatical structure with respect to a given (more or less) formal grammar.
The Document Object Model (DOM) is a platform- and language-independent standard object model for representing HTML or XML and related formats. It is a tree-structure-based API: a DOM parser implements the DOM API and creates a DOM tree in memory for an XML document.
DOM supports navigation in any direction (e.g., to a parent or previous sibling) and allows arbitrary modifications. An implementation must at least buffer the document that has been read so far (or some parsed form of it). DOM is best suited for applications where the document must be accessed repeatedly or out of sequence.
DOM parsers must have the entire tree in memory before any processing can begin, so the amount of memory used by a DOM parser depends entirely on the size of the input data.
When to use a DOM parser:
- Manipulate the document
- Traverse the document back and forth
- Small XML files

Drawbacks of a DOM parser:
- Consumes a lot of memory
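A minimal sketch of DOM-style parsing in Java using the standard javax.xml.parsers API (the class name and the input file sample.xml are assumptions for illustration):

    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class DomExample {
        public static void main(String[] args) throws Exception {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is parsed into an in-memory tree up front.
            Document doc = builder.parse(new File("sample.xml"));
            // The tree can then be traversed (and modified) in any order.
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println(n.getNodeName());
                }
            }
        }
    }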
SAX (Simple API for XML) is a serial-access parser API for XML. It provides a mechanism for reading data from an XML document and is a popular alternative to the DOM. The quantity of memory that a SAX parser must use in order to function is typically much smaller than that of a DOM parser.
Because of the event-driven nature of SAX, processing documents can often be faster than with DOM-style parsers.
When to use a SAX parser:
- No structural modification
- Huge XML files

Drawbacks of a SAX parser:
- Certain kinds of XML validation require access to the document in full
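For contrast, a minimal SAX sketch using javax.xml.parsers (class name and input file are again assumptions). The parser pushes events to the handler as it reads, so only the handler's own state stays in memory:

    import java.io.File;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class SaxExample {
        public static void main(String[] args) throws Exception {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // The parser drives the processing, invoking our callbacks per event.
            parser.parse(new File("sample.xml"), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    System.out.println("start: " + qName);
                }
            });
        }
    }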
OM stands for Object Model (also known as AXIOM, the AXis Object Model) and refers to the XML infoset model that was initially developed for Apache Axis2. For an object-oriented language, the obvious choice is a model made up of objects; DOM and JDOM are two such XML models.
OM is conceptually similar to such XML models in its external behavior, but underneath it is very different: OM is based on pull parsing instead of push parsing.
Pull parsing is a recent trend in XML processing. The previously popular XML processing frameworks, such as SAX and DOM, were "push-based", meaning that control of the parsing was in the hands of the parser itself.
The push-based approach is fine and easy to use, but it is not efficient for handling large XML documents, since a complete model is generated in memory. Pull parsing inverts the control, so the parser proceeds only at the user's command. The user can decide to store or discard the events generated by the parser.
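In Java, pull parsing is exposed through the StAX API (javax.xml.stream), on which AXIOM is built. A minimal sketch (class name and input file assumed) where the caller drives the parser one event at a time:

    import java.io.FileInputStream;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class PullExample {
        public static void main(String[] args) throws Exception {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new FileInputStream("sample.xml"));
            // The caller asks for each event; nothing is parsed until requested.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    System.out.println(reader.getLocalName());
                }
            }
        }
    }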
Credit goes to Mr. Elliot Joel Berk, who wrote JLex; to the Department of Computer Science, Princeton University, for maintaining JLex; and to all the others who contributed to these projects. A special thanks goes to Dr. Damith Karunaratne for giving me this opportunity.
 
