General Structure of a Compiler
Lecture 2
Conceptual Structure
[Figure: Source code → Front-End → Intermediate Representation → Back-End → Target code]
• Front-end performs the analysis of the source language:
– Breaks up the source code into pieces and imposes a grammatical structure.
– Using this structure, creates a generic Intermediate Representation (IR) of the source code.
– Checks syntax and semantics, and reports informative messages to the user in case of errors.
– Builds a symbol table to collect and store information about the source program.
9-Jan-15 CS 346 Lecture 2
• Back-end does the target language synthesis:
– Chooses instructions to implement each IR operation.
– Translates IR into target code.
– Needs to conform with system interfaces.
– Automation has been less successful than for the front-end.
What is the implication of this separation (front-end: analysis; back-end: synthesis) in building a compiler for, say, a new language?
m×n compilers with m+n components!
[Figure: front-ends for Fortran, Smalltalk, C and Java all produce a shared I.R., which is consumed by back-ends for target 1, target 2, target 3 and target 4]
With m source languages and n targets sharing one IR, m front-ends and n back-ends suffice to build all m×n compilers.
• All language-specific knowledge must be encoded in the front-end.
• All target-specific knowledge must be encoded in the back-end.
But: in practice, this strict separation is not trivial.
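The counting argument can be sketched in a few lines of Python. This is only an illustration: the component names come from the figure labels, while `make_compiler` and the string-based "IR" are hypothetical stand-ins, not part of the lecture.

```python
from itertools import product

# Component names taken from the figure; everything else is hypothetical.
front_ends = ["Fortran", "Smalltalk", "C", "Java"]            # m = 4
back_ends = ["target 1", "target 2", "target 3", "target 4"]  # n = 4


def make_compiler(language, target):
    """Pair one front-end with one back-end through the shared IR."""
    def compile_source(source):
        ir = f"IR[{language}]({source})"      # front-end: source -> IR
        return f"{target} code for {ir}"      # back-end: IR -> target code
    return compile_source


# Every (language, target) pair gives a distinct compiler.
compilers = {
    (language, target): make_compiler(language, target)
    for language, target in product(front_ends, back_ends)
}
components = len(front_ends) + len(back_ends)

print(components, "components give", len(compilers), "compilers")
```

Adding a fifth language to this sketch means writing one new front-end, after which four more compilers become available without touching any back-end.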
General Structure of a Compiler
[Figure: compiler pipeline from Source to Target, divided into front-end and back-end. Phases shown: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate code generation, I.C. Optimisation, Code Generation, Target code Optimisation, Target code Generation. Data passed between phases: tokens, Abstract Syntax Tree (AST), Annotated AST, I.R., symbolic instructions, optimised symbolic instructions.]
Lexical Analysis (Scanning)
• Reads characters in the source program and groups them into meaningful sequences called lexemes.
• Produces as output a token of the form <Token-class, Attribute> and passes it to the next phase, syntax analysis.
• Token-class: the symbol for this token, to be used during syntax analysis.
• Attribute: points to the symbol table entry for this token.
e.g.: The tokens for: fin = ini + rate * 60 are:
<id, 1> <=> <id, 2> <+> <id, 3> <*> <const, 60>
Symbol Table
1 fin …
2 ini …
3 rate …
… … …
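A minimal scanner for this one statement can be sketched as follows. The regular expressions, the tuple encoding of tokens and the function name `tokenize` are assumptions made for illustration; the lecture only fixes the <Token-class, Attribute> form and the symbol table numbering.

```python
import re

# Assumed token classes for this tiny example; order matters for matching.
TOKEN_SPEC = [
    ("const", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for one source string."""
    symbol_table = []  # entry i (1-based) holds the i-th distinct identifier
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # whitespace produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # The attribute is the identifier's symbol table index.
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "const":
            tokens.append(("const", int(lexeme)))
        else:
            tokens.append((lexeme,))  # operators carry no attribute
    return tokens, symbol_table


tokens, table = tokenize("fin = ini + rate * 60")
print(tokens)
print(table)
```

Here `("id", 1)` plays the role of <id, 1> and `("=",)` the role of <=>; the returned list `table` holds the names behind entries 1 to 3.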
Syntax Analysis (Parsing)
• Imposes a hierarchical structure on the token stream.
• This hierarchical structure is usually expressed by recursive rules.
• Context-free grammars formalise these recursive rules and guide syntax analysis, which flags a syntax error on any mismatch.
• The IR is usually represented as a syntax tree:
– Interior nodes represent operations.
– Leaves represent the arguments of the operations.
• Parse tree for: fin = ini + rate * 60
• Tokens: <id, 1> <=> <id, 2> <+> <id, 3> <*> <const, 60>
=
├── <id,1>
└── +
    ├── <id,2>
    └── *
        ├── <id,3>
        └── 60
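One way to build such a tree is recursive descent, with one function per grammar rule. The sketch below assumes a small grammar (assign → id = expr; expr → term {+ term}; term → factor {* factor}; factor → id | const) and a tuple encoding of tokens and tree nodes; neither is given in the lecture.

```python
def parse(tokens):
    """Parse tokens such as ("id", 1), ("=",), ("const", 60) into a tree.

    Interior nodes are (operator, left, right); leaves are the id and
    const tokens themselves.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expect(kind):
        tok = peek()
        if tok is None or tok[0] != kind:
            raise SyntaxError(f"expected {kind!r}, found {tok!r}")
        return advance()

    def factor():  # factor -> id | const
        tok = peek()
        if tok is not None and tok[0] in ("id", "const"):
            return advance()
        raise SyntaxError(f"expected id or const, found {tok!r}")

    def term():  # term -> factor { * factor }
        node = factor()
        while peek() == ("*",):
            advance()
            node = ("*", node, factor())
        return node

    def expr():  # expr -> term { + term }
        node = term()
        while peek() == ("+",):
            advance()
            node = ("+", node, term())
        return node

    # assign -> id = expr
    target = expect("id")
    expect("=")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


tokens = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("const", 60)]
print(parse(tokens))
```

Because `term` is called from inside `expr`, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes that * binds tighter than +.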
Semantic Analysis (context handling)
• Collects context (semantic) information from the syntax tree and symbol table, and checks for semantic consistency with the language definition.
• Annotates nodes of the tree with the results.
• Semantic errors:
– Type mismatches, incompatible operands, function called with improper arguments, undeclared variable, etc.
– e.g.: int ary[10], x; x = ary * 20;
• Type checkers in the semantic analyzer may also do automatic type conversions (coercions) if permitted by the language specification.
• Annotated AST for the expression fin = ini + rate * 60
• Tokens: <id, 1> <=> <id, 2> <+> <id, 3> <*> <const, 60>
= : float
├── <id,1> : float
└── + : float
    ├── <id,2> : float
    └── * : float
        ├── <id,3> : float
        └── inttofloat : float
            └── 60
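The annotation step can be sketched as a bottom-up walk that computes a type for each node and inserts an inttofloat node where an int operand meets a float one. The tuple encoding, the function names and the rule "int widens to float, nothing else converts" are assumptions for illustration; the declared types are passed in as a hypothetical `symbol_types` mapping.

```python
def coerce(node, wanted):
    """Return node unchanged, or wrapped in an int-to-float conversion."""
    if node[1] == wanted:
        return node
    if node[1] == "int" and wanted == "float":
        return ("inttofloat", "float", node)
    raise TypeError(f"cannot convert {node[1]} to {wanted}")


def annotate(node, symbol_types):
    """Turn (op, left, right) trees into (op, type, children...) trees."""
    kind = node[0]
    if kind == "id":
        return ("id", symbol_types[node[1]], node[1])
    if kind == "const":
        const_type = "int" if isinstance(node[1], int) else "float"
        return ("const", const_type, node[1])
    left = annotate(node[1], symbol_types)
    right = annotate(node[2], symbol_types)
    if kind == "=":
        # The right-hand side must match the type of the assignment target.
        return ("=", left[1], left, coerce(right, left[1]))
    # Arithmetic: the result is float as soon as one operand is float.
    result = "float" if "float" in (left[1], right[1]) else "int"
    return (kind, result, coerce(left, result), coerce(right, result))


tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("const", 60))))
types = {1: "float", 2: "float", 3: "float"}  # assumed declarations
print(annotate(tree, types))
```

With all three identifiers declared float, only the constant 60 needs a conversion. Assigning a float value to an identifier declared int raises `TypeError` in this sketch, standing in for a reported semantic error.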
Intermediate code generation
• Translate language-specific constructs in the AST into more general constructs.
• A criterion for the level of “generality”: it should be straightforward to generate the target code from the intermediate representation chosen.
• Example of a form of IR (3-address code):
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
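Three-address code of this shape falls out of a post-order walk of the tree: each interior node emits one instruction into a fresh temporary. The sketch below assumes a tuple encoding of the tree with an explicit inttofloat node; `gen_tac` is a hypothetical name.

```python
def gen_tac(tree):
    """Emit 3-address code for an assignment tree as a list of strings."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        """Emit code for node and return the name that holds its value."""
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "const":
            return str(node[1])
        if kind == "inttofloat":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({arg})")
            return temp
        # Binary operator: operands first, then the operation itself.
        left = walk(node[1])
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} = {left} {kind} {right}")
        return temp

    if tree[0] != "=":
        raise ValueError("expected an assignment at the root")
    target = walk(tree[1])
    value = walk(tree[2])
    code.append(f"{target} = {value}")
    return code


tree = ("=", ("id", 1),
        ("+", ("id", 2), ("*", ("id", 3), ("inttofloat", ("const", 60)))))
print("\n".join(gen_tac(tree)))
```

For the tree of fin = ini + rate * 60 this prints the same four instructions as the example above.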
Code Optimisation
• Generates streamlined code, still in intermediate representation.
• A range of optimization techniques may be applied. For example,
– Removing unused variables
– Suppressing generation of unreachable code segments
– Constant Folding
– Common Sub-expression Elimination
– Loop optimization (moving statements whose result does not change between iterations out of the loop), etc.
• Example:
Before:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
After:
t1 = id3 * 60.0
id1 = id2 + t1
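The transformation in this example can be reproduced with three small passes: fold inttofloat of an integer literal at compile time, merge a trailing copy into the instruction that produced its source, and renumber the remaining temporaries. This is a sketch under stated assumptions, not the lecture's algorithm: instructions are encoded as (dest, op, arg1, arg2) tuples, temporaries are named t followed by digits, and each temporary is assumed to be used exactly once.

```python
def is_temp(name):
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()


def optimise(code):
    # Pass 1: constant folding of inttofloat(<int literal>), with the
    # resulting float propagated into later uses of the temporary.
    consts = {}
    folded = []
    for dest, op, a, b in code:
        a, b = consts.get(a, a), consts.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            consts[dest] = float(a)  # evaluated at compile time
            continue
        folded.append((dest, op, a, b))

    # Pass 2: "x = t" right after "t = ..." becomes "x = ...".
    # Assumes t is not used anywhere else.
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and is_temp(a) and merged[-1][0] == a:
            _, prev_op, prev_a, prev_b = merged[-1]
            merged[-1] = (dest, prev_op, prev_a, prev_b)
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber surviving temporaries t1, t2, ... in definition order.
    names = {}
    final = []
    for dest, op, a, b in merged:
        a = names.get(a, a) if isinstance(a, str) else a
        b = names.get(b, b) if isinstance(b, str) else b
        if is_temp(dest):
            names[dest] = f"t{len(names) + 1}"
            dest = names[dest]
        final.append((dest, op, a, b))
    return final


before = [
    ("t1", "inttofloat", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
for dest, op, a, b in optimise(before):
    print(f"{dest} = {a} {op} {b}")
```

A production optimiser would establish the single-use assumption with a liveness or use-count analysis instead of taking it for granted.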
Code Generation
• Map 3-address code into assembly code of the target architecture
– Instruction selection: a pattern matching problem.
– Register allocation: each value should be in a register when it is used (but there is only a limited number of registers).
– Instruction scheduling: take advantage of multiple functional units.
• Example: a possible translation using two registers:
3-address code:
t1 = id3 * 60.0
id1 = id2 + t1
Assembly code:
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
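A very small code generator is enough to reproduce this translation. The sketch below is an assumption-laden illustration: instructions are (dest, op, arg1, arg2) tuples, only + and * are supported, the mnemonics are the ones used in the example, and registers are handed out from a free list with no spilling. Instruction scheduling is not modelled at all.

```python
OPCODES = {"+": "ADDF", "*": "MULF"}  # instruction selection by table lookup


def is_temp(name):
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()


def gen_code(tac, registers=("R1", "R2")):
    """Translate 3-address tuples into two-operand assembly strings."""
    free = list(registers)  # registers not currently holding a value
    where = {}              # temporary name -> register holding it
    asm = []

    def operand(x):
        if isinstance(x, (int, float)):
            return f"#{x}"          # immediate constant
        return where.get(x, x)      # register if loaded, else memory name

    for dest, op, a, b in tac:
        if a in where:
            reg = where.pop(a)      # reuse the register that holds a
        else:
            if not free:
                raise RuntimeError("out of registers (no spilling here)")
            reg = free.pop()
            asm.append(f"MOVF {operand(a)}, {reg}")
        asm.append(f"{OPCODES[op]} {operand(b)}, {reg}")
        if b in where:
            free.append(where.pop(b))  # b's temporary is dead after this use
        if is_temp(dest):
            where[dest] = reg          # keep the result in its register
        else:
            asm.append(f"MOVF {reg}, {dest}")  # store to the named variable
            free.append(reg)
    return asm


tac = [("t1", "*", "id3", 60.0), ("id1", "+", "id2", "t1")]
print("\n".join(gen_code(tac)))
```

The free list is popped from the end, so R2 is handed out first and R1 second, which is why the output lines up with the example. Freeing a register as soon as its temporary has been read relies on each temporary being used only once.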