UNIT-5
Code Optimization
• Code optimization is the phase that follows intermediate code
generation.
• Code optimization can be done at two levels: machine-independent
and machine-dependent code optimization.
• A graph representation of intermediate code is helpful for
discussing how to generate optimized code.
• Code generation benefits from this context:
• We can do a better job of register allocation if we know how
values are defined and used.
• We can do a better job of instruction selection by looking at
sequences of three-address statements.
• Transformations on flow graphs turn the original intermediate code
into "optimized" intermediate code from which better target code can
be generated.
• The "optimized" intermediate code is turned into machine code
using the code-generation techniques.
The representation is constructed as follows:
1. Partition the intermediate code into basic blocks, which are maximal sequences
of consecutive three-address instructions with the properties that
(a)The flow of control can only enter the basic block through the first instruction
in the block. That is, there are no jumps into the middle of the block.
(b) Control will leave the block without halting or branching, except
possibly at the last instruction in the block.
2. The basic blocks become the nodes of a flow graph, whose edges indicate
which blocks can follow which other blocks.
•We begin a new basic block with the first instruction and keep adding
instructions until we meet either a jump, a conditional jump, or a label on the
following instruction.
Basic blocks and flow graphs
Algorithm : Partitioning three-address instructions into basic
blocks.
INPUT: A sequence of three-address instructions.
OUTPUT: A list of the basic blocks for that sequence in which each
instruction is assigned to exactly one basic block.
METHOD: First, we determine those instructions in the
intermediate code that are leaders, that is, the first instructions in
some basic block.
The instruction just past the end of the intermediate program is not
included as a leader.
Rules for finding leaders
1. The first three-address instruction in the intermediate code is a
leader.
2. Any instruction that is the target of a conditional or unconditional
jump is a leader.
3. Any instruction that immediately follows a conditional or
unconditional jump is a leader.
 Then, for each leader, its basic block consists of itself and all
instructions up to but not including the next leader or the end of the
intermediate program.
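The leader-finding rules and the block-forming step above can be sketched in Python. The instruction encoding (dicts with an "op" field and, for jumps, a numeric "target" index) is an assumption for illustration, not part of the algorithm itself.

```python
# A minimal sketch of basic-block partitioning, assuming each
# instruction is a dict with an "op" and, for jumps, a "target"
# index into the instruction list.
def partition_into_blocks(instructions):
    jumps = {"goto", "if_goto"}          # unconditional and conditional jumps
    leaders = {0}                        # rule 1: first instruction is a leader
    for i, ins in enumerate(instructions):
        if ins["op"] in jumps:
            leaders.add(ins["target"])   # rule 2: jump targets are leaders
            if i + 1 < len(instructions):
                leaders.add(i + 1)       # rule 3: instruction after a jump
    starts = sorted(leaders)
    # each block runs from its leader up to (not including) the next leader
    return [instructions[s:e] for s, e in zip(starts, starts[1:] + [len(instructions)])]
```

Each instruction lands in exactly one block, matching the OUTPUT condition of the algorithm.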
Intermediate code to set a 10*10
matrix to an identity matrix
• In generating the intermediate
code, we have assumed that the
real-valued array elements take 8
bytes each, and that the matrix a is
stored in row-major form.
Flow Graph
•Once an intermediate-code program is partitioned into basic
blocks, we represent the flow of control between them by a flow
graph.
•The nodes of the flow graph are the basic blocks.
•There is an edge from block B to block C if and only if it is
possible for the first instruction in block C to immediately follow
the last instruction in block B.
•There are two ways that such an edge could be justified:
1.There is a conditional or unconditional jump from the end of B to
the beginning of C.
2. C immediately follows B in the original order of the three-
address instructions, and B does not end in an unconditional jump.
•We say that B is a predecessor of C, and C is a successor of B.
Flow graph based on
Basic Blocks
• The entry point is basic block B1, since B1 contains the first
instruction of the program.
• The only successor of B1 is B2, because B1 does not end in an
unconditional jump, and the leader of B2 immediately follows
the end of B1.
• Block B3 has two successors. One is itself, because the leader of
B3, instruction 3, is the target of the conditional jump at the end
of B3, instruction 9.
• The other successor is B4, because control can fall through the
conditional jump at the end of B3 and next enter the leader of
B4.
• Only B6 points to the exit of the flow graph, since the only way
to get to code that follows the program from which we
constructed the flow graph is to fall through the conditional
jump that ends B6.
Representation of Flow Graphs
•Flow graphs, being quite ordinary graphs, can be represented by
any of the data structures appropriate for graphs.
•The content of nodes (basic blocks) need their own
representation.
•We might represent the content of a node by a pointer to the
leader in the array of three-address instructions, together with a
count of the number of instructions or a second pointer to the last
instruction.
•Hence, a linked list of instructions is a likely representation for each basic block.
Next-use information
• Knowing when the value of a variable will be used next is
essential for generating good code.
• If the value of a variable that is currently in a register will never
be referenced subsequently, then that register can be assigned to
another variable.
• Suppose three-address statement i assigns a value to x.
• If statement j has x as an operand, and control can flow from
statement i to j along a path that has no intervening assignments to
x, then we say statement j uses the value of x computed at
statement i.
• We further say that x is live at statement i.
Liveness and next-use information
• We wish to determine, for each three-address statement i: x = y + z,
what the next uses of x, y, and z are.
Algorithm (applied to each statement i, scanning the block backwards
from its last statement):
1. Attach to statement i the information currently found in the symbol table
regarding the next use and liveness of x, y, and z.
2. In the symbol table, set x to "not live" and "no next use."
3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
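The three steps are applied while scanning the block backwards from its last statement. A minimal Python sketch, assuming statements are (x, y, z) triples for "x = y op z" and that all variables are dead with no next use on exit from the block:

```python
# Backward scan attaching next-use/liveness information to each statement.
def next_use_info(block):
    table = {}   # symbol table: var -> (live?, next-use statement index or None)
    info = []
    for i in reversed(range(len(block))):
        x, y, z = block[i]
        # step 1: record what the table currently says for x, y, z
        info.append((i, {v: table.get(v, (False, None)) for v in (x, y, z)}))
        table[x] = (False, None)   # step 2: x is redefined here
        table[y] = (True, i)       # step 3: y and z are used here
        table[z] = (True, i)
    return list(reversed(info))
```

For a block t1 = a - b; t2 = t1 - c, the scan records that t1's next use after statement 0 is statement 1, while a and b have no further uses.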
Loops
•Since virtually every program spends most of its time in
executing its loops, it is especially important for a compiler to
generate good code for loops.
•Many code transformations depend upon the identification of
"loops" in a flow graph.
•We say that a set of nodes L in a flow graph is a loop if
1.There is a node in L called the loop entry with the property that
no other node in L has a predecessor outside L.
That is, every path from the entry of the entire flow
graph to any node in L goes through the loop entry.
2. Every node in L has a nonempty path, completely within L, to
the entry of L.
According to the above flow graph there are three loops
1. B3 by itself
2. B6 by itself
3. {B2,B3,B4}
Optimization of Basic Blocks
•We can often obtain a substantial improvement in the running
time of code merely by performing
 - local optimization within each basic block by itself, or
 - global optimization, which looks at how information flows
among the basic blocks of a program.
•Many important techniques for local optimization begin by
transforming a basic block into a DAG (directed acyclic graph).
DAG representation of basic blocks
We construct a DAG for a basic block as follows:
•There is a node in the DAG for each of the initial values of the variables
appearing in the basic block.
•There is a node N associated with each statement s within the block. The
children of N are those nodes corresponding to statements that are the
last definitions, prior to s, of the operands used by s.
•Node N is labeled by the operator applied at s, and also attached to N is
the list of variables for which it is the last definition within the block.
•Certain nodes are designated output nodes. These are the nodes whose
variables are live on exit from the block.
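The construction above can be sketched as follows, assuming statements are (dst, op, arg1, arg2) tuples. Sharing a node when the same operator is applied to the same children is exactly what exposes local common subexpressions:

```python
# A minimal DAG builder for a basic block.
def build_dag(block):
    nodes = {}    # (op, children...) -> node id
    current = {}  # variable -> node id of its last definition
    labels = {}   # node id -> variables for which it is the last definition
    counter = [0]

    def leaf(var):
        # node for the initial value of var, unless var was redefined
        if var not in current:
            key = ("leaf", var)
            if key not in nodes:
                nodes[key] = counter[0]; counter[0] += 1
            current[var] = nodes[key]
        return current[var]

    for dst, op, a, b in block:
        key = (op, leaf(a), leaf(b))
        if key not in nodes:            # reuse an existing node if possible
            nodes[key] = counter[0]; counter[0] += 1
        n = nodes[key]
        current[dst] = n                # n is now the last definition of dst
        labels.setdefault(n, []).append(dst)
    return nodes, labels
```

Two statements computing b + c end up labeling the same node, so the second computation can be eliminated.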
Code improving transformations
• We can eliminate local common subexpressions, that is, instructions
that compute a value that has already been computed.
• We can eliminate dead code, that is, instructions that compute a
value that is never used.
• We can reorder statements that do not depend on one another; such
reordering may reduce the time a temporary value needs to be
preserved in a register.
• We can apply algebraic laws to reorder operands of three-address
instructions, and sometimes thereby simplify the computation.
DAG for basic block
Since there are only three non-leaf nodes in the DAG, the basic
block can be rewritten with only three statements:
a = b + c
d = a - d
c = d + c
This rewriting assumes that b is not live on exit from the block,
so there is no need to compute a value for b,
i.e.
DAG for basic block
array accesses in a DAG
• An assignment from an array, like x = a [i], is represented by
creating a node with operator =[] and two children representing
the initial value of the array, a0 in this case, and the index i.
Variable x becomes a label of this new node.
• An assignment to an array, like a [j] = y, is represented by a new
node with operator []= and three children representing a0, j and y.
There is no variable labeling this node. What is different is that the
creation of this node kills all currently constructed nodes whose
value depends on a0. A node that has been killed cannot receive
any more labels; that is, it cannot become a common
subexpression.
DAG for a sequence of array assignments
Dead Code Elimination
We delete from a DAG any root (node with no ancestors) that has no
live variables attached.
In the previous figure a & b are live but c and e are not, we can
immediately remove the root labelled e . Then the node c becomes
a root and can be removed. The roots labelled a &b remain , since
they each have live variables attached
Use of Algebraic Identities
x + 0 = 0 + x = x        x - 0 = x
x * 1 = 1 * x = x        x / 1 = x
These identities can be applied to eliminate computations
from a basic block.
Local Reduction in Strength
Replacing a more expensive operator
by a cheaper one:
x^2  →  x * x
2 * x  →  x + x
x / 2  →  x * 0.5
Pointer Assignments & Procedure calls
x = *p
*q = y
The statement x = *p must be treated as a use of every variable, since p
could point to any of them. The indirect assignment *q = y kills all nodes
in the DAG, since q could point to any variable; a procedure call has the
same effect, because the called procedure may use or change any variable.
Rules for reconstructing the basic block from a
DAG
• The order of instructions must respect the order of nodes in the DAG.
That is, we cannot compute a node's value until we have computed a
value for each of its children.
• Assignments to an array must follow all previous assignments to, or
evaluations from, the same array, according to the order of these
instructions in the original basic block.
• Evaluations of array elements must follow any previous (according to
the original block) assignments to the same array. The only permutation
allowed is that two evaluations from the same array may be done in
either order, as long as neither crosses over an assignment to that array.
• Any use of a variable must follow all previous (according to the original
block) procedure calls or indirect assignments through a pointer.
• Any procedure call or indirect assignment through a pointer must follow
all previous (according to the original block) evaluations of any variable.
Reassembling basic blocks from DAGs
A Simple Code Generator
• Generates target code for a sequence of 3-address statements
• For each operator in a statement, there is a corresponding target
language operator.
Register & Address descriptors:
Register descriptor: It keeps track of what is currently in each register.
Initially, all registers are empty.
Address descriptor: It keeps track of the location where the current
value of the name can be found.
The location may be a register, a stack location, or a memory address.
Principal uses of registers
• In most machine architectures, some or all of the operands of an
operation must be in registers in order to perform the operation.
• Registers make good temporaries - places to hold the result of a
subexpression while a larger expression is being evaluated, or more
generally, a place to hold a variable that is used only within a single
basic block.
• Registers are often used to help with run-time storage management,
for example, to manage the run-time stack, including the
maintenance of stack pointers and possibly the top elements of the
stack itself.
A Code Generation Algorithm
For each three-address statement of the form x = y op z:
1. Invoke a function getreg to determine the location L where the result
of y op z should be stored.
2. Consult the address descriptor for y to determine y', the current
location of y. If y is not already in L, generate MOV y', L.
3. Generate the instruction OP z', L, where z' is the current location
of z. Update the address descriptor of x to indicate that x is in L.
If L is a register, update its descriptor to indicate that it contains
the value of x.
4. If y and z have no next uses and are not live on exit, update the
descriptors to remove y and z.
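A toy rendering of the four steps, assuming a two-address target where "OP src, dst" computes dst = dst OP src, and a deliberately naive getreg that reuses the register already holding y when there is one and otherwise hands out a fresh register (no spilling, no descriptor cleanup for step 4):

```python
# Simplified code generator for x = y op z statements.
def gen(statements):
    code, regs, addr = [], {}, {}   # register and address descriptors
    def getreg():
        return f"R{len(regs)}"      # hypothetical getreg: always a fresh register
    for x, op, y, z in statements:
        L = addr.get(y)
        if L is None or not L.startswith("R"):
            L = getreg()
            code.append(f"MOV {y}, {L}")            # step 2: bring y into L
            regs[L] = y
        code.append(f"{op.upper()} {addr.get(z, z)}, {L}")  # step 3
        regs[L] = x
        addr[x] = L                                 # x now lives in L
    return code
```

On the example that follows (t1 = a - b; t2 = a - c; t3 = t1 + t2; d = t3 + t2) this emits the same instruction sequence as the table, apart from the final MOV R0, d store, which a real generator would issue for values live on block exit.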
Example
d=(a-b)+(a-c)+ (a-c)
• Three address statements
t1=a-b
t2=a-c
t3=t1+t2
d=t3+t2
Example
Statements     Code Generated   Register Descriptor       Address Descriptor
t1 = a - b     MOV a, R0        Registers are empty       t1 in R0
               SUB b, R0        R0 contains t1
t2 = a - c     MOV a, R1        R0 contains t1            t1 in R0
               SUB c, R1        R1 contains t2            t2 in R1
t3 = t1 + t2   ADD R1, R0       R0 contains t3            t2 in R1
                                R1 contains t2            t3 in R0
d = t3 + t2    ADD R1, R0       R0 contains d             d in R0
               MOV R0, d                                  d in R0 and memory
Descriptors for data structure
• For each available register, a register descriptor keeps track of the
variable names whose current value is in that register. Since we
shall use only those registers that are available for local use within
a basic block, we assume that initially, all register descriptors are
empty. As the code generation progresses, each register will hold
the value of zero or more names.
• For each program variable, an address descriptor keeps track of
the location or locations where the current value of that variable
can be found. The location might be a register, a memory address,
a stack location, or some set of more than one of these. The
information can be stored in the symbol-table entry for that
variable name.
Machine Instructions for Operations
• Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry
and Rz.
• If y is not in Ry (according to the register descriptor for Ry), then issue
an instruction LD Ry, y', where y' is one of the memory locations for y
(according to the address descriptor for y).
• Similarly, if z is not in Rz, issue an instruction LD Rz, z', where z' is a
location for z.
• Issue the instruction ADD Rx , Ry, Rz.
Rules for updating the register and address descriptors
• For the instruction LD R, x
• Change the register descriptor for register R so it holds only x.
• Change the address descriptor for x by adding register R as an additional
location.
• For the instruction ST x, R, change the address descriptor for x to include
its own memory location.
• For an operation such as ADD Rx, Ry, Rz implementing a three-address
instruction x = y + z
• Change the register descriptor for Rx so that it holds only x.
• Change the address descriptor for x so that its only location is Rx. Note
that the memory location for x is not now in the address descriptor for x.
• Remove Rx from the address descriptor of any variable other than x.
• When we process a copy statement x = y, after generating the load for y
into register Ry, if needed, and after managing descriptors as for all load
statements (per rule 1):
• Add x to the register descriptor for Ry.
• Change the address descriptor for x so that its only location is Ry .
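The descriptor rules above can be sketched as small update functions, assuming descriptors are kept as sets: registers map to the variables they hold, variables map to their current locations, with the string "mem" standing for the variable's own memory cell.

```python
# Descriptor updates for LD, ST, and an arithmetic operation.
def on_load(reg_desc, addr_desc, R, x):
    reg_desc[R] = {x}                      # R now holds only x
    addr_desc.setdefault(x, set()).add(R)  # R is an additional location for x

def on_store(reg_desc, addr_desc, x, R):
    addr_desc.setdefault(x, set()).add("mem")  # x's memory copy is now current

def on_op(reg_desc, addr_desc, Rx, x):
    reg_desc[Rx] = {x}          # Rx holds only x
    addr_desc[x] = {Rx}         # x's only location is Rx (memory copy is stale)
    for v, locs in addr_desc.items():
        if v != x:
            locs.discard(Rx)    # Rx no longer holds any other variable
```

After an operation into Rx, any variable that previously claimed Rx as a location loses it, which is the "remove Rx from the address descriptor of any variable other than x" rule.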
Characteristic of peephole optimizations
• Redundant-instruction elimination(loads & Stores)
• Eliminating Unreachable code
• Flow-of-control optimizations
• Algebraic simplifications & Reduction in Strength
• Use of machine idioms
Redundant-instruction elimination
• LD a, R0
ST R0, a
The store is redundant: it stores back the value just loaded, so it can be
deleted (unless the store carries a label, in which case control might
reach it without passing through the load).
Eliminating Unreachable Code
• if debug == 1 goto L1
goto L2
L1: print debugging information
L2:
Can be replaced by:
if debug != 1 goto L2
print debugging information
L2:
Flow-of-control optimizations
Simple intermediate code-generation algorithms frequently produce
jumps to jumps, jumps to conditional jumps, and conditional jumps to jumps.
goto L1
...
L1: goto L2
Can be replaced by:
goto L2
...
L1: goto L2
if a<b goto L1
...
L1: goto L2
Can be replaced by:
if a<b goto L2
...
L1: goto L2
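The jump-to-jump case can be sketched as a target-chasing routine, assuming the unconditional jumps have been collected into a map from each label to the label it immediately jumps to:

```python
# Follow chains of "Lk: goto Lm" until a label that is not itself a jump.
def final_target(label, jumps_to):
    seen = set()
    while label in jumps_to and label not in seen:
        seen.add(label)          # guard against goto cycles
        label = jumps_to[label]
    return label
```

Every jump in the program can then be retargeted to final_target of its label, collapsing chains like L1 → L2 → L3 into a direct jump to L3.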
Algebraic simplifications& Reduction in
strength
• Statements such as x = x + 0 or x = x * 1 can be eliminated.
Use of Machine Idioms
• The target machine may have hardware instructions to implement
certain specific operations efficiently.
• Ex:Some machines have auto increment and auto decrement
addressing modes.
• Hence these add or subtract one from an operand before or after
using its value.
• The use of these modes greatly improves the quality of code when
pushing or popping a stack
Machine –Independent Optimization
• Elimination of unnecessary instructions in object code, or the
replacement of one sequence of instructions by a faster sequence of
instructions that does the same thing, is called "code improvement"
or "code optimization".
• Local code optimization (code improvement within a basic block)
• Global code optimization, where the improvements are taken into
account across basic blocks.
• Most global code optimizations are based on data-flow analyses,
which are algorithms to gather information about a program. The
results of data-flow analyses all have the same form: for each
instruction in the program, they specify some property that must hold
every time that instruction is executed.
Principal sources of code Optimization
• A compiler must preserve the semantics of the original program.
• A compiler knows only how to apply relatively low-level
transformations, using general facts such as algebraic identities like
i = i + 0, that are guaranteed to leave the result of the program
unchanged.
Causes of Redundancy
• There are many redundant operations in a program.
• Sometimes redundancy is visible at the source level.
• A programmer may find it more direct and convenient to recalculate
some result, leaving it to the compiler to recognize that only one such
calculation is necessary.
• More often, redundancy is a side effect of having written the program
in a high-level language.
ex: In C/C++, where pointer arithmetic is allowed, the elements of an
array or the fields of a structure are referred to using a[i][j] or x->s1.
As a program is compiled, each of these expands into a number of low-
level arithmetic operations, such as the computation of the location of
the (i,j)-th element of a matrix. Programmers are not aware of these
low-level operations and so cannot eliminate the redundancies.
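As a concrete instance of those hidden low-level operations, the address arithmetic behind a[i][j] for a row-major array with n columns and w-byte elements is base + (i*n + j)*w; every reference to a[i][j] repeats this computation, and that repetition is the redundancy an optimizer can remove.

```python
# Byte offset of element (i, j) in a row-major array with
# n columns and w bytes per element.
def element_offset(i, j, n, w):
    return (i * n + j) * w
```

For the 10*10 matrix of 8-byte reals used earlier, a[2][3] sits at offset (2*10 + 3)*8 = 184 bytes from the base of a.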
Flow graph for quick sort
Semantics-Preserving Transformations
There are a number of ways in which a compiler can improve a program
without changing the function it computes. Common-subexpression
elimination, copy propagation, dead-code elimination, and constant folding
are common examples of such function-preserving (or semantics-preserving)
transformations; we shall consider each in turn.
Global Common-subexpression
elimination
For example, block B5 shown in Fig. 9.4(a) recalculates 4 * i and 4* j, although none of these
calculations were requested explicitly by the programmer.
Global Common-subexpression
elimination
After local common subexpressions are
eliminated, B5 still evaluates 4*i and 4* j, as
shown in Fig. 9.4(b). Both are common
subexpressions; in particular, the three
statements
Copy Propagation
In order to eliminate the common subexpression from
the statement c = d+e in Fig. 9.6(a), we must use a new
variable t to hold the value of d + e. The value of
variable t, instead of that of the expression d + e, is
assigned to c in Fig. 9.6(b). Since control may reach c =
d+e either after the assignment to a or after the
assignment to b, it would be incorrect to replace c =
d+e by either c = a or by c = b.
The idea behind the copy-propagation transformation
is to use v for u, wherever possible after the copy
statement u = v. For example, the assignment x = t3 in
block B5 of Fig. 9.5 is a copy. Copy propagation
applied to B5 yields the code in Fig. 9.7. This change
may not appear to be an improvement, but, as we shall
see in Section 9.1.6, it gives us the opportunity to
eliminate the assignment to x.
Dead-Code Elimination
• A variable is live at a point in a program if its value can be used
subsequently; otherwise, it is dead at that point. A related idea is
dead (or useless) code — statements that compute values that never
get used.
Suppose debug is set to TRUE or FALSE at various points in the program, and used in statements like
    if (debug) print ...
It may be possible for the compiler to deduce that each time the program reaches this statement, the value
of debug is FALSE. Usually, it is because there is one particular statement
    debug = FALSE
that must be the last assignment to debug prior to any tests of the value of debug, no matter what sequence
of branches the program actually takes. If copy propagation replaces debug by FALSE, then the print
statement is dead because it cannot be reached. We can eliminate both the test and the print operation from
the object code.
Constant Folding
• At compile time deducing the value of an expression is a constant and using
the constant instead is known as constant folding.
• One advantage of copy propagation is that it often turns the copy statement
into dead code. For example, copy propagation followed by dead-code
elimination removes the assignment to x and transforms the code in Fig 9.7
into
a[t2] = t5
a[t4] = t3
goto B2
• This code is a further improvement of block B5 in Fig. 9.5.
Code Motion
• Loops are a very important place for optimizations, especially the inner loops where
programs tend to spend the bulk of their time. The running time of a program may be
improved if we decrease the number of instructions in an inner loop, even if we increase
the amount of code outside that loop.
• An important modification that decreases the amount of code in a loop is code
motion. This transformation takes an expression that yields the same result independent of
the number of times a loop is executed (a loop-invariant computation) and evaluates the
expression before the loop. Note that the notion "before the loop" assumes the existence
of an entry for the loop, that is, one basic block to which all jumps from outside the loop go
while (i <= limit-2) /* statement does not change limit */
Code motion will result in the equivalent code
t = limit-2
while (i <= t) /* statement does not change limit or t */
Induction variables and Reduction in strength
• A variable x is said to be an "induction variable" if there is a positive or
negative constant c such that each time x is assigned, its value increases by
c.
• i and t1 are induction variables in the loop containing B2 of Fig. 9.5. Induction
variables can be computed with a single increment (addition or subtraction)
per loop iteration. The transformation of replacing an expensive operation,
such as multiplication, by a cheaper one, such as addition, is known
as strength reduction.
• But induction variables not only allow us sometimes to perform a strength
reduction; often it is possible to eliminate all but one of a group of induction
variables whose values remain in lock step as we go around the loop.
Strength Reduction
Register Allocation and Assignment
• Register Allocation-what values should reside in registers.
• Register Assignment- In which register each value should
reside
Register Allocation and Assignment
• Global Register Allocation
• Usage Counts
• Register Assignment for Outer Loops
• Register Allocation by Graph Coloring
Global register allocation
• The algorithm explained previously does local (block-based) register
allocation
• As a result, all live variables must be stored to memory at the end of
each block
• To save some of these stores and their corresponding loads, we
might arrange to assign registers to frequently used variables and
keep these registers consistent across block boundaries (globally)
• Some options are:
• Keep the values of variables used in loops inside registers
• Use a graph-coloring approach for more global allocation
Usage counts
• The value of x computed in a block will remain in a register if there are
subsequent uses of x in that block. Thus we count a savings of one for
each use of x in loop L not preceded by an assignment to x in the same
block.
• We save two units if we can avoid a store of x at the end of a block.
• Thus, if x is allocated a register, we count a savings of two for each
block in loop L for which x is live on exit and in which x is assigned a
value.
Usage counts
• For a loop L we can approximate the savings from allocating x a
register as follows:
• Sum over all blocks B in the loop L:
• For each use of x before any definition in the block, we add one unit of
saving.
• If x is live on exit from B and is assigned a value in B, then we add 2 units
of saving.
Σ over B in L of [ use(x,B) + 2 * live(x,B) ]
use(x,B) is the number of times x is used in B prior to any definition of x.
live(x,B) is 1 if x is live on exit from B and is assigned a value in B;
live(x,B) is 0 otherwise.
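The savings formula can be evaluated directly, assuming use(x,B) counts and the live(x,B) condition ("live on exit from B and assigned a value in B") have already been computed per block:

```python
# Savings from keeping x in a register across loop L:
# sum over blocks B in L of use(x, B) + 2 * live(x, B).
def savings(x, blocks):
    # each block is a dict with per-variable "use" counts and a "live" set
    return sum(b["use"].get(x, 0) + 2 * (1 if x in b["live"] else 0)
               for b in blocks)
```

Applied to variable a in the four-block example that follows, the terms (0+2) + (1+0) + (1+0) + (0+0) sum to 4, matching the table.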
Flow graph of an inner loop
B1 B2 B3 B4
a= (0+2*1) + (1+2*0) + (1+2*0) + (0+2*0) = 4
b= (1+2*0) + (0+2*0) + (0+2*1) + (0+2*1) = 5
c= (1+2*0) + (0+2*0) + (1+2*0) + (1+2*0) = 3
d= (1+2*1) + (1+2*0) + (1+2*0) + (1+2*0) = 6
e= (0+2*1) + (0+2*0) + (0+2*1) + (0+2*0) = 4
f= (1+2*0) + (0+2*1) + (1+2*0) + (0+2*0) = 4
Code sequence using global register
assignment
Register Assignment for Outer Loops
If an outer loop L1 contains an inner loop L2, the names
allocated registers in L2 need not be allocated registers
in L1 - L2. However, if we choose to allocate x a
register in L2 but not in L1, we must load x on entrance
to L2 and store x on exit from L2.
Register allocation by Graph coloring
• Two passes are used
• Target-machine instructions are selected as though there are an infinite
number of symbolic registers
• Assign physical registers to symbolic ones
• Create a register-interference graph
• Nodes are symbolic registers, and an edge connects two nodes if one is live
at a point where the other is defined.
• For example in the previous example an edge connects a and d in the graph
• Use a graph coloring algorithm to assign registers.
• A graph is said to be colored if each node has been assigned a color in such a way that
no two adjacent nodes have the same color.
• A color represents a register, and the color makes sure that no two symbolic registers
that can interfere with each other are assigned the same physical register.
• The problem of determining whether a graph is k-colorable is NP-complete.
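Because exact k-coloring is intractable, compilers use heuristics. A greedy coloring sketch (not Chaitin's full algorithm, which also chooses spill candidates by graph simplification): assuming the interference graph is given as an adjacency map, nodes that cannot receive one of the k colors are marked for spilling.

```python
# Greedy graph coloring: colors 0..k-1 stand for physical registers,
# None marks a symbolic register that must be spilled to memory.
def color(graph, k):
    assignment = {}
    for node in graph:
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [c for c in range(k) if c not in used]
        assignment[node] = free[0] if free else None   # spill if no color left
    return assignment
```

Interfering nodes (such as a and d in the earlier example) always receive distinct colors, so they end up in different physical registers.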

More Related Content

PPTX
Compiler Design theory and various phases of compiler.pptx
PDF
Code optimization lecture
PPTX
PPT
PRESENTATION ON DATA STRUCTURE AND THEIR TYPE
PPT
456589.-Compiler-Design-Code-Generation (1).ppt
PPT
Code Generations - 1 compiler design.ppt
PPT
456589.-Compiler-Design-Code-Generation (1).ppt
PDF
Wondershare UniConverter Crack Download Latest 2025
Compiler Design theory and various phases of compiler.pptx
Code optimization lecture
PRESENTATION ON DATA STRUCTURE AND THEIR TYPE
456589.-Compiler-Design-Code-Generation (1).ppt
Code Generations - 1 compiler design.ppt
456589.-Compiler-Design-Code-Generation (1).ppt
Wondershare UniConverter Crack Download Latest 2025

Similar to unit-5.pptvshvshshhshsjjsjshhshshshhshsj (20)

PDF
Enscape 3D 3.6.6 License Key Crack Full Version
PDF
Wondershare Filmora Crack 12.0.10 With Latest 2025
PPT
Code_generatio.lk,jhgfdcxzcvgfhjkmnjhgfcxvfghjmh
PDF
Skype 125.0.201 Crack key Free Download
PPTX
Basic blocks and control flow graphs
PPTX
UNIT V - Compiler Design notes power point presentation
PPTX
Principal Sources of Optimization in compiler design
PPTX
Compiler Design_Code generation techniques.pptx
PPTX
COMPILER_DESIGN_CLASS 1.pptx
PPT
COMPILER_DESIGN_CLASS 2.ppt
PPT
457418.-Compiler-Design-Code-optimization.ppt
PPT
lect23_optimization.ppt
DOC
Compiler notes--unit-iii
PDF
Code optimization in compiler design
PDF
Compiler unit 5
PDF
Module-4 Program Design and Anyalysis.pdf
PPT
ERTS UNIT 3.ppt
PPTX
Machine_Learning_JNTUH_R18_UNIT5_CONCEPTS.pptx
PPTX
Bp150513(compiler)
Enscape 3D 3.6.6 License Key Crack Full Version
Wondershare Filmora Crack 12.0.10 With Latest 2025
Code_generatio.lk,jhgfdcxzcvgfhjkmnjhgfcxvfghjmh
Skype 125.0.201 Crack key Free Download
Basic blocks and control flow graphs
UNIT V - Compiler Design notes power point presentation
Principal Sources of Optimization in compiler design
Compiler Design_Code generation techniques.pptx
COMPILER_DESIGN_CLASS 1.pptx
COMPILER_DESIGN_CLASS 2.ppt
457418.-Compiler-Design-Code-optimization.ppt
lect23_optimization.ppt
Compiler notes--unit-iii
Code optimization in compiler design
Compiler unit 5
Module-4 Program Design and Anyalysis.pdf
ERTS UNIT 3.ppt
Machine_Learning_JNTUH_R18_UNIT5_CONCEPTS.pptx
Bp150513(compiler)
Ad

Recently uploaded (20)

PDF
POCSO ACT in India and its implications.
PDF
The Ways The Abhay Bhutada Foundation Is Helping Indian STEM Education
PPTX
InnoTech Mahamba Presentation yearly.pptx
PDF
To dialogue with the “fringes”, from the “fringes”
PPTX
IMPLEMENTING GUIDELINES OF SUSTAINABLE LIVELIHOOD PROGRAM -SLP MC 22 ORIENTAT...
PPTX
Human_Population_Growth and demographic crisis.pptx
PPTX
c. b. 3 Basics of BDP geared towards public service.pptx
PPTX
Quiz Night Game Questions and Questions for interactive games
PPTX
A quiz and riddle collection for intellctual stimulation
PDF
Global Peace Index - 2025 - Ghana slips on 2025 Global Peace Index; drops out...
PPTX
一比一原版(MHL毕业证)德国吕贝克音乐学院毕业证文凭学历认证
PDF
A Comparative Analysis of Digital Transformation in Public Administration.pdf
PPTX
ISO 9001 awarness for government offices 2015
PDF
Europe's Political and Economic Clouds- August 2025.pdf
PPTX
PER Resp Dte Mar - Ops Wing 20 Mar 27.pptx
PDF
Item # 1b - August 12, 2025 Special Meeting Minutes
PDF
Firefighter Safety Skills training older version
PDF
Covid-19 Immigration Effects - Key Slides - June 2025
PDF
Oil Industry Ethics Evolution Report (1).pdf
PDF
rs_9fsfssdgdgdgdgdgdgdgsdgdgdgdconverted.pdf
POCSO ACT in India and its implications.
The Ways The Abhay Bhutada Foundation Is Helping Indian STEM Education
InnoTech Mahamba Presentation yearly.pptx
To dialogue with the “fringes”, from the “fringes”
IMPLEMENTING GUIDELINES OF SUSTAINABLE LIVELIHOOD PROGRAM -SLP MC 22 ORIENTAT...
Human_Population_Growth and demographic crisis.pptx
c. b. 3 Basics of BDP geared towards public service.pptx
Quiz Night Game Questions and Questions for interactive games
A quiz and riddle collection for intellctual stimulation
Global Peace Index - 2025 - Ghana slips on 2025 Global Peace Index; drops out...
一比一原版(MHL毕业证)德国吕贝克音乐学院毕业证文凭学历认证
A Comparative Analysis of Digital Transformation in Public Administration.pdf
ISO 9001 awarness for government offices 2015
Europe's Political and Economic Clouds- August 2025.pdf
PER Resp Dte Mar - Ops Wing 20 Mar 27.pptx
Item # 1b - August 12, 2025 Special Meeting Minutes
Firefighter Safety Skills training older version
Covid-19 Immigration Effects - Key Slides - June 2025
Oil Industry Ethics Evolution Report (1).pdf
rs_9fsfssdgdgdgdgdgdgdgsdgdgdgdconverted.pdf
Ad

unit-5.pptvshvshshhshsjjsjshhshshshhshsj

  • 2. • Code optimization is next phase after intermediate code generation. • Code optimization can be done at two levels. Machine independent and Machine dependent code optimization. • A graph representation of intermediate code is helpful for discussing how to generate optimized code. • Code generation benefits from this context. • We can do a better job of register allocation if we know how values are defined and used. • We can do a better job of instruction selection by looking at sequences of three-address statements transformations on flow graphs that turn the original intermediate code into "optimized" intermediate code from which better target code can be generated. • The "optimized" intermediate code is turned into machine code using the code-generation techniques
  • 3. The representation is constructed as follows: 1. Partition the intermediate code into basic blocks, which are maximal sequences of consecutive three-address instructions with the properties that (a)The flow of control can only enter the basic block through the first instruction in the block. That is, there are no jumps into the middle of the block. (b) Control will leave the block without halting or branching, except possibly at the last instruction in the block. 2. The basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks. •We begin a new basic block with the first instruction and keep adding instructions until we meet either a jump, a conditional jump, or a label on the following instruction. Basic blocks and flow graphs
  • 4. Algorithm: Partitioning three-address instructions into basic blocks. INPUT: A sequence of three-address instructions. OUTPUT: A list of the basic blocks for that sequence in which each instruction is assigned to exactly one basic block. METHOD: First, we determine those instructions in the intermediate code that are leaders, that is, the first instructions in some basic block. The instruction just past the end of the intermediate program is not included as a leader.
  • 5. Rules for finding leaders 1. The first three-address instruction in the intermediate code is a leader. 2. Any instruction that is the target of a conditional or unconditional jump is a leader. 3. Any instruction that immediately follows a conditional or unconditional jump is a leader. • Then, for each leader, its basic block consists of itself and all instructions up to but not including the next leader or the end of the intermediate program.
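The three leader rules and the partitioning step above can be sketched in a few lines. This is a minimal illustration, not a production pass: instructions are modeled as dicts, and the `"target"` field carrying an instruction index for jumps is an assumed shape, not from the slides.

```python
def partition_into_blocks(instrs):
    """Split a list of three-address instructions into basic blocks."""
    leaders = {0}                                  # rule 1: first instruction
    for i, ins in enumerate(instrs):
        if ins.get("op") in ("goto", "if"):
            leaders.add(ins["target"])             # rule 2: jump target
            if i + 1 < len(instrs):
                leaders.add(i + 1)                 # rule 3: instruction after a jump
    leaders = sorted(leaders)
    # each block runs from a leader up to (not including) the next leader
    return [instrs[a:b] for a, b in zip(leaders, leaders[1:] + [len(instrs)])]

code = [
    {"op": "=",  "dst": "i", "src": "1"},          # 0: leader (rule 1)
    {"op": "+",  "dst": "t", "args": ("i", "1")},  # 1: leader (rule 2, target of 3)
    {"op": "=",  "dst": "i", "src": "t"},          # 2
    {"op": "if", "cond": "i<10", "target": 1},     # 3
    {"op": "=",  "dst": "x", "src": "i"},          # 4: leader (rule 3)
]
blocks = partition_into_blocks(code)
print(len(blocks))   # 3 basic blocks: [0], [1..3], [4]
```

Each instruction lands in exactly one block, as the algorithm's OUTPUT clause requires, because the blocks are consecutive half-open slices of the instruction sequence.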
  • 6. Intermediate code to set a 10*10 matrix to an identity matrix • In generating the intermediate code, we have assumed that the real-valued array elements take 8 bytes each, and that the matrix a is stored in row-major form.
  • 7. Flow Graph •Once an intermediate-code program is partitioned into basic blocks, we represent the flow of control between them by a flow graph. •The nodes of the flow graph are the basic blocks. •There is an edge from block B to block C if and only if it is possible for the first instruction in block C to immediately follow the last instruction in block B. •There are two ways that such an edge could be justified: 1.There is a conditional or unconditional jump from the end of B to the beginning of C. 2. C immediately follows B in the original order of the three- address instructions, and B does not end in an unconditional jump. •We say that B is a predecessor of C, and C is a successor of B.
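The two edge justifications above (a jump from the end of B to the leader of C, or fall-through when B does not end in an unconditional jump) can be sketched directly. The instruction shapes (`"goto"`, `"if"`, `"target"`) are assumptions for illustration only.

```python
def build_flow_graph(blocks, leaders):
    """Edges between basic blocks, per the two justifications above.
    blocks[i] starts at instruction index leaders[i]."""
    leader_of = {addr: idx for idx, addr in enumerate(leaders)}
    edges = set()
    for b, blk in enumerate(blocks):
        last = blk[-1]
        if last.get("op") in ("goto", "if"):
            edges.add((b, leader_of[last["target"]]))   # jump edge
        if last.get("op") != "goto" and b + 1 < len(blocks):
            edges.add((b, b + 1))                       # fall-through edge
    return edges

# blocks B0:[0], B1:[1..3], B2:[4] from the leader-finding example
blocks = [
    [{"op": "="}],
    [{"op": "="}, {"op": "="}, {"op": "if", "target": 1}],
    [{"op": "="}],
]
leaders = [0, 1, 4]
print(sorted(build_flow_graph(blocks, leaders)))   # [(0, 1), (1, 1), (1, 2)]
```

Note the self-edge (1, 1): a conditional jump whose target is its own block's leader makes the block its own successor, exactly as described for B3 on the next slide.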
  • 8. Flow graph based on Basic Blocks
  • 9. • The entry point is basic block B1, since B1 contains the first instruction of the program. • The only successor of B1 is B2, because B1 does not end in an unconditional jump, and the leader of B2 immediately follows the end of B1. • Block B3 has two successors. One is itself, because the leader of B3, instruction 3, is the target of the conditional jump at the end of B3, instruction 9. • The other successor is B4, because control can fall through the conditional jump at the end of B3 and next enter the leader of B4. • Only B6 points to the exit of the flow graph, since the only way to reach the code that follows the program from which we constructed the flow graph is to fall through the conditional jump that ends B6.
  • 10. Representation of Flow Graphs • Flow graphs, being quite ordinary graphs, can be represented by any of the data structures appropriate for graphs. • The content of the nodes (basic blocks) needs its own representation. • We might represent the content of a node by a pointer to the leader in the array of three-address instructions, together with a count of the number of instructions or a second pointer to the last instruction. • Hence a linked list is commonly used to represent each basic block.
  • 11. Next-use information • Knowing when the value of a variable will be used next is essential for generating good code. • If the value of a variable that is currently in a register will never be referenced subsequently, then that register can be assigned to another variable. • Suppose three-address statement i assigns a value to x. • If statement j has x as an operand, and control can flow from statement i to j along a path that has no intervening assignments to x, then we say statement j uses the value of x computed at statement i. • We further say that x is live at statement i.
  • 12. Liveness and next-use information • We wish to determine for each three-address statement i: x=y+z what the next uses of x, y and z are. Algorithm (applied to each statement in turn, scanning the block backwards from its last statement): 1. Attach to statement i the information currently found in the symbol table regarding the next use and liveness of x, y, and z. 2. In the symbol table, set x to "not live" and "no next use." 3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
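The backward scan above is easy to mechanize. A hedged sketch, assuming statements are `(dst, src1, src2)` triples and that variables not mentioned after the block are dead on exit:

```python
def next_use_info(block):
    """block: list of (x, y, z) triples for statements x = y op z.
    Returns, per statement index, the (live, next_use) status of x, y, z
    as attached in step 1 of the algorithm."""
    table = {}   # var -> (live, next_use); absent names are dead, no next use
    info = {}
    for i in range(len(block) - 1, -1, -1):        # scan backwards
        x, y, z = block[i]
        # step 1: record the current table entries for x, y, z at statement i
        info[i] = {v: table.get(v, (False, None)) for v in (x, y, z)}
        table[x] = (False, None)                   # step 2: x not live, no next use
        table[y] = (True, i)                       # step 3: y and z live,
        table[z] = (True, i)                       #         next used at i
    return info

block = [("t", "a", "b"), ("u", "a", "t"), ("v", "u", "t")]
info = next_use_info(block)
print(info[0]["a"])   # (True, 1): a is live at statement 0, next used at 1
```

The order of steps 2 and 3 matters: performing step 2 before step 3 gives the right answer for statements like x = x + y, where x is both defined and used.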
  • 13. Loops •Since virtually every program spends most of its time in executing its loops, it is especially important for a compiler to generate good code for loops. •Many code transformations depend upon the identification of "loops" in a flow graph. •We say that a set of nodes L in a flow graph is a loop if 1.There is a node in L called the loop entry with the property that no other node in L has a predecessor outside L. That is, every path from the entry of the entire flow graph to any node in L goes through the loop entry. 2. Every node in L has a nonempty path, completely within L, to the entry of L. According to the above flow graph there are three loops 1. B3 by itself 2. B6 by itself 3. {B2,B3,B4}
  • 14. Optimization of Basic Blocks • We can often obtain a substantial improvement in the running time of code merely by performing local optimization within each basic block by itself. This is in contrast to global optimization, which looks at how information flows among the basic blocks of a program. • Many important techniques for local optimization begin by transforming a basic block into a DAG (directed acyclic graph).
  • 15. DAG representation of basic blocks We construct a DAG for a basic block as follows: •There is a node in the DAG for each of the initial values of the variables appearing in the basic block. •There is a node N associated with each statement s within the block. The children of N are those nodes corresponding to statements that are the last definitions, prior to s, of the operands used by s. •Node N is labeled by the operator applied at s, and also attached to N is the list of variables for which it is the last definition within the block. •Certain nodes are designated output nodes. These are the nodes whose variables are live on exit from the block.
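The construction rules above can be sketched as follows. Sharing a node when the same operator is applied to the same children is exactly how local common subexpressions surface. The four-tuple statement shape is an assumption for illustration.

```python
def build_dag(block):
    """block: list of (dst, op, arg1, arg2) four-tuples.
    Returns (nodes, current): current maps each variable to the node index
    holding its last definition within the block."""
    nodes = []       # each node: {'op', 'children', 'labels'}
    current = {}     # variable -> node index of its last definition

    def node_for(var):
        # node for the last definition of var, or a leaf for its initial value
        if var not in current:
            nodes.append({"op": None, "children": (), "labels": [var + "0"]})
            current[var] = len(nodes) - 1
        return current[var]

    for dst, op, a, b in block:
        kids = (node_for(a), node_for(b))
        # reuse an existing node with the same operator and children (CSE)
        for i, n in enumerate(nodes):
            if n["op"] == op and n["children"] == kids:
                node = i
                break
        else:
            nodes.append({"op": op, "children": kids, "labels": []})
            node = len(nodes) - 1
        nodes[node]["labels"].append(dst)   # dst's last definition is this node
        current[dst] = node
    return nodes, current

# b+c is computed twice, so a and d share one node
nodes, cur = build_dag([("a", "+", "b", "c"), ("d", "+", "b", "c")])
print(cur["a"] == cur["d"])   # True
```

Output nodes would be those of the variables live on exit; everything else is a candidate for dead-code elimination, as the later slides show.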
  • 16. Code improving transformations • We can eliminate local common subexpressions, that is, instructions that compute a value that has already been computed. • We can eliminate dead code, that is, instructions that compute a value that is never used. • We can reorder statements that do not depend on one another; such reordering may reduce the time a temporary value needs to be preserved in a register. • We can apply algebraic laws to reorder operands of three-address instructions, and sometimes thereby simplify the computation.
  • 17. DAG for basic block Since there are only three non-leaf nodes in the DAG, the basic block can be rewritten with only three statements: a=b+c d=a-d c=d+c If b is not live on exit from the block, then there is no need to compute that variable.
  • 18. DAG for basic block
  • 19. array accesses in a DAG • An assignment from an array, like x = a [i], is represented by creating a node with operator =[] and two children representing the initial value of the array, a0 in this case, and the index i. Variable x becomes a label of this new node. • An assignment to an array, like a [j] = y, is represented by a new node with operator []= and three children representing a0, j and y. There is no variable labeling this node. What is different is that the creation of this node kills all currently constructed nodes whose value depends on a0. A node that has been killed cannot receive any more labels; that is, it cannot become a common subexpression.
  • 20. DAG for a sequence of array assignments
  • 21. Dead Code Elimination We delete from a DAG any root (node with no ancestors) that has no live variables attached. In the previous figure, a and b are live but c and e are not, so we can immediately remove the root labelled e. The node labelled c then becomes a root and can be removed. The roots labelled a and b remain, since they each have live variables attached.
  • 22. Use of Algebraic Identities x+0 = 0+x = x, x-0 = x, x*1 = 1*x = x, x/1 = x. These identities can be used to eliminate computations from a basic block. Local Reduction in Strength: replacing a more expensive operator by a cheaper one, e.g. x^2 → x*x, 2*x → x+x, x/2 → x*0.5.
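The identities and strength reductions listed above act on single three-address statements, so a peephole over `(dst, op, arg1, arg2)` tuples suffices. A minimal sketch (the tuple shape and string-encoded constants are assumptions):

```python
def simplify(stmt):
    """Apply one algebraic identity or strength reduction, if any matches."""
    dst, op, a, b = stmt
    if op == "+" and b == "0": return (dst, "=", a, None)   # x + 0 = x
    if op == "+" and a == "0": return (dst, "=", b, None)   # 0 + x = x
    if op == "-" and b == "0": return (dst, "=", a, None)   # x - 0 = x
    if op == "*" and b == "1": return (dst, "=", a, None)   # x * 1 = x
    if op == "*" and a == "1": return (dst, "=", b, None)   # 1 * x = x
    if op == "/" and b == "1": return (dst, "=", a, None)   # x / 1 = x
    if op == "^" and b == "2": return (dst, "*", a, a)      # x^2  -> x * x
    if op == "*" and b == "2": return (dst, "+", a, a)      # 2*x  -> x + x
    if op == "*" and a == "2": return (dst, "+", b, b)
    if op == "/" and b == "2": return (dst, "*", a, "0.5")  # x/2  -> x * 0.5
    return stmt

print(simplify(("t", "*", "x", "2")))   # ('t', '+', 'x', 'x')
```

The resulting copy statements (dst, "=", a, None) can then be removed by copy propagation, which the later slides cover.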
  • 23. Pointer Assignments & Procedure calls x = *p *q = y Since p and q may point to any variable, x = *p must be treated as a use of every variable, and *q = y as a possible assignment to every variable; such statements kill all nodes in the DAG. Procedure calls are treated similarly, since a called procedure may use or change any variable accessible to the block.
  • 24. Rules for reconstructing the basic block from a DAG • The order of instructions must respect the order of nodes in the DAG. That is, we cannot compute a node's value until we have computed a value for each of its children. • Assignments to an array must follow all previous assignments to, or evaluations from, the same array, according to the order of these instructions in the original basic block. • Evaluations of array elements must follow any previous (according to the original block) assignments to the same array. The only permutation allowed is that two evaluations from the same array may be done in either order, as long as neither crosses over an assignment to that array. • Any use of a variable must follow all previous (according to the original block) procedure calls or indirect assignments through a pointer. • Any procedure call or indirect assignment through a pointer must follow all previous (according to the original block) evaluations of any variable. Reassembling basic blocks from DAGs
  • 25. A Simple Code Generator • Generates target code for a sequence of three-address statements • For each operator in a statement, there is a corresponding target-language operator. Register & Address descriptors: Register descriptor: keeps track of what is currently in each register. Initially all registers are empty. Address descriptor: keeps track of the location where the current value of a name can be found. The location may be a register, a stack location or a memory address.
  • 26. principal uses of registers • In most machine architectures, some or all of the operands of an operation must be in registers in order to perform the operation. • Registers make good temporaries - places to hold the result of a subexpression while a larger expression is being evaluated, or more generally, a place to hold a variable that is used only within a single basic block. • Registers are often used to help with run-time storage management, for example, to manage the run-time stack, including the maintenance of stack pointers and possibly the top elements of the stack itself.
  • 27. A Code Generation Algorithm For each three-address statement of the form x=y op z: 1. Invoke a function getreg to determine the location L where the result of y op z should be stored. 2. Consult the address descriptor for y to determine y', the current location of y. If y is not already in L, generate MOV y', L. 3. Generate the instruction OP z', L and update the address descriptor of x to indicate that x is in L. If L is a register, update its descriptor to indicate that it contains the value of x. 4. If y and z have no next uses and are not live on exit, update the descriptors to remove y and z.
  • 28. Example d=(a-b)+(a-c)+ (a-c) • Three address statements t1=a-b t2=a-c t3=t1+t2 d=t3+t2
  • 29. Example (registers initially empty)
Statement: t1=a-b    Code: MOV a, R0; SUB b, R0    Register descriptor: R0 contains t1                    Address descriptor: t1 in R0
Statement: t2=a-c    Code: MOV a, R1; SUB c, R1    Register descriptor: R0 contains t1, R1 contains t2    Address descriptor: t1 in R0, t2 in R1
Statement: t3=t1+t2  Code: ADD R1, R0              Register descriptor: R0 contains t3, R1 contains t2    Address descriptor: t3 in R0, t2 in R1
Statement: d=t3+t2   Code: ADD R1, R0; MOV R0, d   Register descriptor: R0 contains d                     Address descriptor: d in R0 and memory
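The instruction sequence in the trace above can be reproduced by a small simulator. This is a hedged sketch: getreg is replaced by a deliberately naive policy (reuse an operand's register when that operand has no later use, otherwise take a free register), the two-register limit and `last_use` table are assumptions, and the final store of d to memory is omitted.

```python
def codegen(stmts, last_use):
    """stmts: list of (dst, op, y, z); last_use[var] = index of its last use.
    Returns MOV/op target instructions for a two-register machine."""
    code, reg_of, free = [], {}, ["R0", "R1"]
    for i, (dst, op, y, z) in enumerate(stmts):
        if y in reg_of and last_use.get(y, -1) <= i:
            r = reg_of.pop(y)                       # reuse y's register
        else:
            r = free.pop(0)                         # naive: assumes one is free
            code.append(f"MOV {reg_of.get(y, y)}, {r}")
        code.append(f"{op.upper()} {reg_of.get(z, z)}, {r}")
        if z in reg_of and last_use.get(z, -1) <= i:
            free.append(reg_of.pop(z))              # z's register becomes free
        reg_of[dst] = r
    return code

stmts = [("t1", "sub", "a", "b"), ("t2", "sub", "a", "c"),
         ("t3", "add", "t1", "t2"), ("d", "add", "t3", "t2")]
last_use = {"a": 1, "b": 0, "c": 1, "t1": 2, "t2": 3, "t3": 3}
for ins in codegen(stmts, last_use):
    print(ins)
```

This emits the same six MOV/SUB/ADD instructions as the table; a real generator would also spill to memory when no register is free and store live values on block exit.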
  • 30. Descriptors for data structure • For each available register, a register descriptor keeps track of the variable names whose current value is in that register. Since we shall use only those registers that are available for local use within a basic block, we assume that initially, all register descriptors are empty. As the code generation progresses, each register will hold the value of zero or more names. • For each program variable, an address descriptor keeps track of the location or locations where the current value of that variable can be found. The location might be a register, a memory address, a stack location, or some set of more than one of these. The information can be stored in the symbol-table entry for that variable name.
  • 31. Machine Instructions for Operations • Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry and Rz. • If y is not in Ry (according to the register descriptor for Ry), then issue an instruction LD Ry, y', where y' is one of the memory locations for y (according to the address descriptor for y). • Similarly, if z is not in Rz, issue an instruction LD Rz, z', where z' is a location for z. • Issue the instruction ADD Rx, Ry, Rz.
  • 32. Rules for updating the register and address descriptors • For the instruction LD R, x: • Change the register descriptor for register R so it holds only x. • Change the address descriptor for x by adding register R as an additional location. • For the instruction ST x, R, change the address descriptor for x to include its own memory location. • For an operation such as ADD Rx, Ry, Rz implementing a three-address instruction x = y + z: • Change the register descriptor for Rx so that it holds only x. • Change the address descriptor for x so that its only location is Rx. Note that the memory location for x is not now in the address descriptor for x. • Remove Rx from the address descriptor of any variable other than x. • When we process a copy statement x = y, after generating the load of y into register Ry, if needed, and after managing descriptors as for all load statements (per rule 1): • Add x to the register descriptor for Ry. • Change the address descriptor for x so that its only location is Ry.
  • 33. Characteristic of peephole optimizations • Redundant-instruction elimination(loads & Stores) • Eliminating Unreachable code • Flow-of-control optimizations • Algebraic simplifications & Reduction in Strength • Use of machine idioms
  • 34. Redundant-instruction elimination • LD a, R0 ST R0, a — the store is redundant, since a's value is already in R0; it can be deleted provided the ST carries no label. • if debug == 1 goto L1 goto L2 L1: print debugging information L2:
  • 35. Eliminating Unreachable Code • if debug == 1 goto L1 goto L2 L1: print debugging information L2: can be rewritten as: if debug != 1 goto L2 print debugging information L2: If debug is known to never equal 1, the print statement becomes unreachable and can be eliminated.
  • 36. Flow-of-control optimizations Simple intermediate code-generation algorithms frequently produce jumps to jumps, jumps to conditional jumps, and conditional jumps to jumps. goto L1 ... L1: goto L2 can be replaced by: goto L2 ... L1: goto L2 Likewise, if a<b goto L1 ... L1: goto L2 can be replaced by: if a<b goto L2 ... L1: goto L2
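The jump-to-jump collapse above amounts to following each jump chain to its final target. A hedged sketch, with the instruction tuples and label table as assumed encodings:

```python
def collapse_jumps(code, labels):
    """code: list of ('goto', L), ('if', cond, L), or other tuples;
    labels[L] = index of the instruction that label L marks."""
    def final_target(label, seen=()):
        ins = code[labels[label]]
        if ins[0] == "goto" and ins[1] != label and ins[1] not in seen:
            return final_target(ins[1], seen + (label,))   # follow the chain
        return label                                       # stop on non-jump or cycle
    out = []
    for ins in code:
        if ins[0] == "goto":
            out.append(("goto", final_target(ins[1])))
        elif ins[0] == "if":
            out.append(("if", ins[1], final_target(ins[2])))
        else:
            out.append(ins)
    return out

code = [("goto", "L1"), ("op",), ("goto", "L2"), ("op",)]
labels = {"L1": 2, "L2": 3}   # L1 marks the 'goto L2' at index 2
print(collapse_jumps(code, labels)[0])   # ('goto', 'L2')
```

The `seen` tuple guards against infinite loops of jumps; the now-bypassed `L1: goto L2` may itself become unreachable and be removed by the previous transformation.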
  • 37. Algebraic simplifications & Reduction in strength • x=x+0 • x=x*1
  • 38. Use of Machine Idioms • The target machine may have hardware instructions to implement certain specific operations efficiently. • Ex: Some machines have auto-increment and auto-decrement addressing modes. • These modes add or subtract one from an operand before or after using its value. • The use of these modes greatly improves the quality of code when pushing or popping a stack.
  • 39. Machine-Independent Optimization • Elimination of unnecessary instructions in object code, or the replacement of one sequence of instructions by a faster sequence that does the same thing, is called "code improvement" or "code optimization". • Local code optimization: code improvement within a basic block. • Global code optimization: improvements that take information across basic blocks into account. • Most global optimizations are based on data-flow analyses, which are algorithms that gather information about a program. The results of data-flow analyses all have the same form: for each instruction in the program, they specify some property that must hold every time that instruction is executed.
  • 40. Principal sources of code Optimization • A compiler must preserve the semantics of the original program. • A compiler knows only how to apply relatively low-level transformations, using general facts such as algebraic identities like i = i + 0, so that performing such transformations yields the same result.
  • 41. Causes of Redundancy • There are many redundant operations in a program. • Sometimes redundancy is available at the source level. • A programmer may find it more direct and convenient to recalculate some result, leaving it to the compiler to recognize that only one such calculation is necessary. • Redundancy can also be a side effect of having written the program in a high-level language. Ex: In C/C++, where pointer arithmetic is allowed, the elements of an array or the fields of a structure are referenced using a[i][j] or x->s1. As a program is compiled, each of these expands into a number of low-level arithmetic operations, such as the computation of the location of the (i,j)th element of a matrix. Programmers are not aware of these operations and cannot eliminate the redundancies themselves.
  • 43. Flow graph for quick sort
  • 44. Semantics-Preserving Transformations There are a number of ways in which a compiler can improve a program without changing the function it computes. Common-subexpression elimination, copy propagation, dead-code elimination, and constant folding are common examples of such function-preserving (or semantics-preserving) transformations; we shall consider each in turn.
  • 45. Global Common-subexpression elimination For example, block B5 shown in Fig. 9.4(a) recalculates 4 * i and 4* j, although none of these calculations were requested explicitly by the programmer.
  • 46. Global Common-subexpression elimination After local common subexpressions are eliminated, B5 still evaluates 4*i and 4* j, as shown in Fig. 9.4(b). Both are common subexpressions; in particular, the three statements
  • 47. Copy Propagation In order to eliminate the common subexpression from the statement c = d+e in Fig. 9.6(a), we must use a new variable t to hold the value of d + e. The value of variable t, instead of that of the expression d + e, is assigned to c in Fig. 9.6(b). Since control may reach c = d+e either after the assignment to a or after the assignment to b, it would be incorrect to replace c = d+e by either c = a or by c = b. The idea behind the copy-propagation transformation is to use v for u, wherever possible after the copy statement u = v. For example, the assignment x = t3 in block B5 of Fig. 9.5 is a copy. Copy propagation applied to B5 yields the code in Fig. 9.7. This change may not appear to be an improvement, but, as we shall see in Section 9.1.6, it gives us the opportunity to eliminate the assignment to x.
  • 48. Dead-Code Elimination • A variable is live at a point in a program if its value can be used subsequently; otherwise, it is dead at that point. A related idea is dead (or useless) code — statements that compute values that never get used. Suppose debug is set to TRUE or FALSE at various points in the program, and used in statements like if (debug) print ... It may be possible for the compiler to deduce that each time the program reaches this statement, the value of debug is FALSE. Usually, it is because there is one particular statement debug = FALSE that must be the last assignment to debug prior to any tests of the value of debug, no matter what sequence of branches the program actually takes. If copy propagation replaces debug by FALSE, then the print statement is dead because it cannot be reached. We can eliminate both the test and the print operation from the object code.
  • 49. Constant Folding • At compile time deducing the value of an expression is a constant and using the constant instead is known as constant folding. • One advantage of copy propagation is that it often turns the copy statement into dead code. For example, copy propagation followed by dead-code elimination removes the assignment to x and transforms the code in Fig 9.7 into a[t2] = t5 a[t4] = t3 goto B2 • This code is a further improvement of block B5 in Fig. 9.5.
  • 50. Code Motion • Loops are a very important place for optimizations, especially the inner loops where programs tend to spend the bulk of their time. The running time of a program may be improved if we decrease the number of instructions in an inner loop, even if we increase the amount of code outside that loop. • An important modification that decreases the amount of code in a loop is code motion. This transformation takes an expression that yields the same result independent of the number of times a loop is executed (a loop-invariant computation) and evaluates the expression before the loop. Note that the notion "before the loop" assumes the existence of an entry for the loop, that is, one basic block to which all jumps from outside the loop go while (i <= limit-2) /* statement does not change limit */ Code motion will result in the equivalent code t = limit-2 while (i <= t) /* statement does not change limit or t */
  • 51. Induction variables and Reduction in strength • A variable x is said to be an "induction variable" if there is a positive or negative constant c such that each time x is assigned, its value increases by c. • i and t1 are induction variables in the loop containing B2 of Fig. 9.5. Induction variables can be computed with a single increment (addition or subtraction) per loop iteration. The transformation of replacing an expensive operation, such as multiplication, by a cheaper one, such as addition, is known as strength reduction. • Induction variables not only sometimes allow us to perform a strength reduction; often it is possible to eliminate all but one of a group of induction variables whose values remain in lock step as we go around the loop.
  • 53. Register Allocation and Assignment • Register Allocation-what values should reside in registers. • Register Assignment- In which register each value should reside
  • 54. Register Allocation and Assignment • Global Register Allocation • Usage Counts • Register Assignment for Outer Loops • Register Allocation by Graph Coloring
  • 55. Global register allocation • The previously explained algorithm does local (block-based) register allocation • This requires that all live variables be stored at the end of each block • To save some of these stores and their corresponding loads, we might arrange to assign registers to frequently used variables and keep these registers consistent across block boundaries (globally) • Some options are: • Keep the values of variables used in loops inside registers • Use a graph-coloring approach for more global allocation
  • 56. Usage counts • A value of x computed in a block will remain in a register if there are subsequent uses of x in that block. Thus we count a savings of one for each use of x in loop L not preceded by an assignment to x in the same block. • We save two units if we can avoid a store of x at the end of a block. • Thus if x is allocated a register, we count a savings of two for each block in loop L for which x is live on exit and in which x is assigned a value.
  • 57. Usage counts • For loops we can approximate the savings from register allocation as: • Sum over all blocks (B) in a loop (L) • For each use of x before any definition in the block we add one unit of saving • If x is live on exit from B and is assigned a value in B, then we add 2 units of saving Σ use(x,B) + 2 * live(x,B) Use(x,B) is the number of times x is used in B prior to any definition of x. Live(x,B) is 1 if x is live on exit from B and is assigned a value in B. Live(x,B) is 0 otherwise.
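The estimate Σ use(x,B) + 2·live(x,B) is a one-line sum once use and live are known per block. A minimal sketch, with the per-block (use, live) pairs supplied as assumed input data rather than computed from a flow graph:

```python
def savings(blocks):
    """blocks: list of (use_count, live_and_assigned) pairs for one variable,
    one pair per block B of the loop L. Returns the estimated savings from
    keeping that variable in a register across the loop."""
    return sum(use + 2 * live for use, live in blocks)

# the (use, live) pairs for variables a and d over blocks B1..B4,
# matching two rows of the savings table on the slide after next
print(savings([(0, 1), (1, 0), (1, 0), (0, 0)]))   # a: 4
print(savings([(1, 1), (1, 0), (1, 0), (1, 0)]))   # d: 6
```

With a fixed register budget, the variables with the highest savings (d, then b, then a ties) would be the ones allocated registers across the loop.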
  • 58. Flow graph of an inner loop
  • 59. B1 B2 B3 B4 a= (0+2*1) + (1+2*0) + (1+2*0) + (0+2*0) = 4 b= (1+2*0) + (0+2*0) + (0+2*1) + (0+2*1) = 5 c= (1+2*0) + (0+2*0) + (1+2*0) + (1+2*0) = 3 d= (1+2*1) + (1+2*0) + (1+2*0) + (1+2*0) = 6 e= (0+2*1) + (0+2*0) + (0+2*1) + (0+2*0) = 4 f= (1+2*0) + (0+2*1) + (1+2*0) + (0+2*0) = 4
  • 60. Code sequence using global register assignment
  • 61. Register Assignment for Outer Loops If an outer loop L1 contains an inner loop L2, the names allocated registers in L2 need not be allocated registers in L1 - L2. However, if we choose to allocate x a register in L2 but not in L1, we must load x on entrance to L2 and store x on exit from L2.
  • 62. Register allocation by Graph coloring • Two passes are used • Target-machine instructions are selected as though there were an infinite number of symbolic registers • Physical registers are then assigned to the symbolic ones • Create a register-interference graph • Nodes are symbolic registers, and an edge connects two nodes if one is live at a point where the other is defined • For example, in the previous example an edge connects a and d in the graph • Use a graph-coloring algorithm to assign registers. • A graph is said to be colored if each node has been assigned a color in such a way that no two adjacent nodes have the same color. • A color represents a register, and the coloring ensures that no two symbolic registers that can interfere with each other are assigned the same physical register. • The problem of determining whether a graph is k-colorable is NP-complete
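The coloring pass above can be sketched with a simple greedy strategy. This is a hedged illustration, not the Chaitin/Briggs allocator: nodes are colored in increasing-degree order, and instead of spilling a node to memory and retrying (as a real allocator would), the sketch just reports failure.

```python
def color(interference, k):
    """interference: {node: set of neighbors}; k: number of physical registers.
    Returns {node: color} with adjacent nodes colored differently,
    or None if this greedy order would need more than k colors."""
    assignment = {}
    for node in sorted(interference, key=lambda n: len(interference[n])):
        used = {assignment[n] for n in interference[node] if n in assignment}
        free = [c for c in range(k) if c not in used]
        if not free:
            return None          # a real allocator would spill here and retry
        assignment[node] = free[0]
    return assignment

# a interferes with d, as in the slide's example; e interferes with nothing
graph = {"a": {"d"}, "d": {"a"}, "e": set()}
print(color(graph, 2))
```

Since k-colorability is NP-complete, production allocators rely on such heuristics rather than exact search; the colors in the result map directly to physical registers.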