Assembly p1
 Pipelining is a technique used in advanced microprocessors in which
the microprocessor begins executing a second instruction
before the first has been completed.
- A Pipeline is a series of stages, where some work is done at
each stage. The work is not finished until it has passed
through all stages.
 With pipelining, the computer architecture allows the next
instructions to be fetched while the processor is
performing arithmetic operations, holding them in a buffer
close to the processor until each instruction's operation can
be performed.
 The pipeline is divided into segments, and each
segment can execute its operation concurrently with
the other segments. Once a segment completes an
operation, it passes the result to the next segment in
the pipeline and fetches the next operation from the
preceding segment.
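[Figure: Four Pipelined Instructions. Each instruction passes through the IF, ID, EX, M and W stages, and each instruction starts one clock cycle after the one before it. The first instruction completes after 5 cycles and each of the remaining three completes 1 cycle later.]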
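The overlap of the four pipelined instructions can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes an ideal pipeline with no stalls, in which n instructions on a k-stage pipeline finish in k + n - 1 cycles.

```python
STAGES = ["IF", "ID", "EX", "M", "W"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage occupied in cycle c+1."""
    cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * cycles
        # Instruction i enters the pipeline one cycle after instruction i-1.
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed by an ideal (stall-free) pipeline."""
    return num_stages + num_instructions - 1


for number, row in enumerate(pipeline_schedule(4), start=1):
    print(f"I{number}: " + " ".join(f"{cell:>2}" for cell in row))
print("total cycles:", total_cycles(4))
```

For four instructions this gives 5 + 1 + 1 + 1 = 8 cycles, instead of the 20 cycles needed if each instruction had to finish before the next one started.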
Instruction Fetch
 The Instruction Fetch (IF) stage is responsible for obtaining
the requested instruction from memory. The instruction
and the program counter (which is incremented to the next
instruction) are stored in the IF/ID pipeline register as
temporary storage so that they may be used in the next stage at
the start of the next clock cycle.
Instruction Decode
 The Instruction Decode (ID) stage is responsible for
decoding the instruction and sending out the various
control lines to the other parts of the processor. The
instruction is sent to the control unit where it is decoded
and the registers are fetched from the register file.
Execution
 The Execution (EX) stage is where any calculations are
performed. The main component in this stage is the ALU.
The ALU provides arithmetic and logic capabilities.
Memory and IO
 The Memory and IO (MEM) stage is responsible for storing
and loading values to and from memory. It is also responsible
for input or output from the processor. If the current
instruction is not of Memory or IO type, then the result
from the ALU is passed through to the write back stage.
Write Back
 The Write Back (WB) stage is responsible for writing
the result of a calculation, memory access or input into
the register file.
[Figure: Flowchart of the instruction pipeline. Boxes: Fetch Instruction From Memory; Decode Instruction and Calculate Effective Address; Fetch Operand From Memory; Execute Instruction; Update PC; Empty Pipe; Interrupt Handling. Decision points: Branch? (YES/NO) and Interrupt? (YES/NO).]
INTRODUCTION
Pipelining is a technique of decomposing a
sequential process into suboperations, with each
subprocess being executed in a special dedicated
segment that operates concurrently with all other
segments.
The name “pipeline” implies a flow of
information analogous to an industrial assembly
line.
It is characteristic of pipelines that several
computations can be in progress in distinct segments at the
same time.
Each subtask can be processed independently
on a different machine.
The pipelining design provides a way to start a
new task before an old one has been completed.
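[Figure: Fetch + Execution. (a) Sequential execution: instructions I1, I2, I3 run one after another as F1 E1, F2 E2, F3 E3. (b) Hardware organization: an instruction fetch unit and an execution unit connected by interstage buffer B1. (c) Pipelined execution: over clock cycles 1 to 4, the fetch of each instruction overlaps the execution of the previous one (I1: F1 E1; I2: F2 E2; I3: F3 E3).]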
 Pipeline processing example:
 Perform the arithmetic operation (Ai*Bi)+(Ci*Di) on a
stream of numbers. Specify a pipeline configuration to
carry out the task, and list the contents of the registers
in the pipeline for i=1 through 6.
The pipeline consists of seven registers that receive new data with
every clock pulse, two multipliers and one adder circuit.
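[Figure: Arithmetic pipeline for (Ai*Bi)+(Ci*Di). Inputs Ai, Bi, Ci, Di feed registers R1 to R4; two multipliers feed R5 and R6; an adder feeds R7. The three levels are labeled Stage 1, Stage 2 and Stage 3.]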
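The register contents of this arithmetic pipeline can be traced with a small simulation. The sketch below is an assumption-laden illustration, not part of the original slides: the function name is hypothetical, every register is assumed to load on the same clock pulse, and None marks a register that holds no valid data.

```python
def simulate_arithmetic_pipeline(A, B, C, D):
    """Clock the three-stage pipeline and return (trace, results).

    trace holds the contents of (R1, ..., R7) after each clock pulse.
    results holds the values that appear in R7, in order.
    """
    n = len(A)
    r1 = r2 = r3 = r4 = r5 = r6 = r7 = None
    trace, results = [], []
    for clock in range(n + 2):  # two extra pulses drain the pipeline
        # All registers load on the same pulse, so update the last
        # stage first to use the values from the previous pulse.
        r7 = r5 + r6 if r5 is not None else None
        r5 = r1 * r2 if r1 is not None else None
        r6 = r3 * r4 if r3 is not None else None
        if clock < n:
            r1, r2, r3, r4 = A[clock], B[clock], C[clock], D[clock]
        else:
            r1 = r2 = r3 = r4 = None
        trace.append((r1, r2, r3, r4, r5, r6, r7))
        if r7 is not None:
            results.append(r7)
    return trace, results


# Hypothetical input streams for i = 1 through 6.
A = [1, 2, 3, 4, 5, 6]
B = [6, 5, 4, 3, 2, 1]
C = [2, 2, 2, 2, 2, 2]
D = [1, 3, 5, 7, 9, 11]
trace, results = simulate_arithmetic_pipeline(A, B, C, D)
for pulse, registers in enumerate(trace, start=1):
    print(pulse, registers)
```

With six input sets the first result reaches R7 on the third clock pulse, and all six results are available after 6 + 2 = 8 pulses.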
 The performance gain from using pipelining occurs
because we can start the execution of a new
instruction each clock cycle. In a real implementation
this is not always possible.
 Another important note is that in a pipelined
processor, a particular instruction still takes at least as
long to execute as it would on a non-pipelined processor.
 Pipeline hazards prevent the execution of the next
instruction during the appropriate clock cycle.
 There are three types of hazards in a pipeline:
 Structural Hazards: created when the data path hardware
in the pipeline cannot support all of the overlapped
instructions in the pipeline.
 Data Hazards: occur when an instruction in the pipeline
affects the result of another instruction in the pipeline.
 Control Hazards: caused by the pipelining
of branches and other instructions that change the PC.
 Structural hazards result from the CPU data path
not having enough resources to service all of the
overlapping instructions.
 Suppose the register file can perform either a read or a
write, but not both, in one clock cycle. This would cause a
conflict between instructions in the ID and WB stages.
 Assume that there are no separate instruction and
data caches, and only one memory access can occur
during one clock cycle. A hazard would occur between
instructions in the IF and MEM stages.
 A structural hazard is dealt with by inserting a stall or
pipeline bubble into the pipeline. This means that for that
clock cycle, nothing happens for that instruction. This
effectively “slides” that instruction, and subsequent
instructions, by one clock cycle.
 This effectively increases the average CPI.
 EX: Assume that you need to compare two processors, one
with a structural hazard that occurs 40% of the time,
causing a stall. Assume that the processor with the hazard
has a clock rate 1.05 times faster than the processor without
the hazard. How fast is the processor with the hazard
compared to the one without the hazard?
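One way to work this example is sketched below. Two things are assumed here that the slide does not state: the ideal CPI is 1, and each occurrence of the hazard costs exactly one stall cycle. The function name is hypothetical, and the clock cycle time of the processor without the hazard is normalized to 1.

```python
def average_instruction_time(ideal_cpi, stall_fraction, stall_cycles, cycle_time):
    """Average time per instruction = (ideal CPI + stall cycles per instruction) * cycle time."""
    cpi = ideal_cpi + stall_fraction * stall_cycles
    return cpi * cycle_time


# Processor without the hazard: CPI of 1, cycle time normalized to 1.
time_without_hazard = average_instruction_time(1.0, 0.0, 1, 1.0)

# Processor with the hazard: stalls on 40% of instructions, but its
# clock rate is 1.05 times higher, so its cycle time is 1/1.05.
time_with_hazard = average_instruction_time(1.0, 0.40, 1, 1.0 / 1.05)

speedup = time_without_hazard / time_with_hazard
print(f"time with hazard:    {time_with_hazard:.3f}")
print(f"speedup with hazard: {speedup:.3f}")
```

Under these assumptions the average instruction time with the hazard is 1.4 / 1.05, about 1.33 times that of the processor without it, which corresponds to a speedup of 0.75.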
 We can see that even though the clock speed of the
processor with the hazard is a little faster, the
speedup is still less than 1.
 Therefore the hazard has quite an effect on the
performance.
 Sometimes computer architects will opt to design a
processor that exhibits a structural hazard. Why?
 We haven’t looked at assembly programming in
detail at this point.
 Consider the following operations:
DADD R1, R2, R3
DSUB R4, R1, R5
AND R6, R1, R7
OR R8, R1, R9
XOR R10, R1, R11
Pipeline Registers
What are the problems?
 In this trivial example, assuming this is the only code we
want to execute, we cannot expect the programmer to
reorder his/her operations.
 Data forwarding can be used to solve this problem.
 To implement data forwarding we need to bypass the
pipeline register flow when an
instruction depends on the write of a previous instruction.
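The dependences in the DADD sequence can be listed mechanically. The following sketch is illustrative only. It assumes the 5 stage pipeline described earlier, with registers read in ID and written in WB, and it further assumes that the register file is written in the first half of a clock cycle and read in the second half. The tuple layout and function names are hypothetical.

```python
# (opcode, destination register, source registers)
PROGRAM = [
    ("DADD", "R1", ["R2", "R3"]),
    ("DSUB", "R4", ["R1", "R5"]),
    ("AND", "R6", ["R1", "R7"]),
    ("OR", "R8", ["R1", "R9"]),
    ("XOR", "R10", ["R1", "R11"]),
]


def raw_dependences(program):
    """Return (producer index, consumer index, register) for each read-after-write pair."""
    found = []
    for j, (_, _, sources) in enumerate(program):
        for source in sources:
            # The most recent earlier writer of this register is the producer.
            for i in range(j - 1, -1, -1):
                if program[i][1] == source:
                    found.append((i, j, source))
                    break
    return found


def needs_forwarding(producer, consumer):
    """True if the consumer reaches ID before the producer has written back.

    Instruction k is in ID in cycle k+2 and in WB in cycle k+5, so a
    consumer three or more instructions later can read the register file.
    """
    return consumer - producer < 3


for i, j, register in raw_dependences(PROGRAM):
    action = "forward" if needs_forwarding(i, j) else "read register file"
    print(f"{PROGRAM[j][0]} depends on {PROGRAM[i][0]} through {register}: {action}")
```

Under these assumptions DSUB and AND need the forwarded value of R1, while OR and XOR can read it from the register file.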
 It is easy to see how data forwarding can be used by
drawing out the pipelined execution of each
instruction.
 Now consider the following instructions:
DADD R1, R2, R3
LD R4, 0(R1)
SD R4, 12(R1)
ENGR9861 Winter 2007 RV
 Can data forwarding prevent all data hazards?
 NO!
 The following operations will still cause a data
hazard. This happens because the further down the
pipeline we get, the less we can use forwarding.
LD R1, 0(R2)
DSUB R4, R1, R5
AND R6, R1, R7
OR R8, R1, R9
 We can avoid the hazard by using a pipeline
interlock.
 The pipeline interlock will detect when data
forwarding will not be able to get the data to the
next instruction in time.
 A stall is introduced until the instruction can get
the appropriate data from the previous instruction.
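The check an interlock performs can be sketched as follows. This is a simplified model, not the actual hardware logic: it assumes full forwarding, so the only remaining stall is a load immediately followed by an instruction that needs the loaded register in its EX stage. The tuple layout and function name are hypothetical.

```python
# (opcode, destination register, registers needed in EX)
LOAD_EXAMPLE = [
    ("LD", "R1", ["R2"]),
    ("DSUB", "R4", ["R1", "R5"]),
    ("AND", "R6", ["R1", "R7"]),
    ("OR", "R8", ["R1", "R9"]),
]


def load_use_stalls(program):
    """Count the stall cycles an interlock would insert, assuming full forwarding.

    A loaded value is only available at the end of MEM, one cycle too
    late for the EX stage of the instruction that directly follows.
    """
    stalls = 0
    for previous, current in zip(program, program[1:]):
        opcode, destination, _ = previous
        if opcode == "LD" and destination in current[2]:
            stalls += 1
    return stalls


print("stall cycles:", load_use_stalls(LOAD_EXAMPLE))
```

For the LD sequence above, the model reports one stall, between LD and DSUB. AND and OR are far enough behind the load to receive R1 without stalling.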
 Control hazards are caused by branches in the
code.
 During the IF stage remember that the PC is
incremented by 4 in preparation for the next IF
cycle of the next instruction.
 What happens if a branch is taken and
we aren’t simply incrementing the PC by 4?
 The easiest way to deal with the occurrence of a
branch is to perform the IF stage again once the
branch occurs.
 The following solutions assume that we are
dealing with static branches, meaning that the
actions taken during a branch do not change.
 We already saw the first example: we stall the
pipeline until the branch is resolved (in our case we
repeated the IF stage until the branch resolved and
modified the PC).
 The next two examples will always make an
assumption about the branch instruction.
 What if we treat every branch as “not taken”?
Remember that not only do we read the registers
during ID, but we also perform an equality test to
decide whether we need to branch or not.
 We can improve performance by assuming that the
branch will not be taken.
 In this case we can simply load the next
instruction (PC+4) and continue. The complexity
arises when the branch evaluates and we end up
needing to actually take the branch.
 The “branch not taken” scheme is the same as performing
the IF stage a second time in our 5 stage pipeline if the
branch is taken.
 If the branch is not taken, there is no performance degradation.
 The “branch taken” scheme is of no benefit in our case because
we evaluate the branch target address in the ID stage.
 The fourth method for dealing with a control hazard is to
implement a “delayed” branch scheme.
 In this scheme an instruction is inserted into the pipeline
that is useful and not dependent on whether the branch is
taken or not. It is the job of the compiler to determine the
delayed branch instruction.
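The cost of these schemes can be compared with a rough model. Everything numeric below is an assumption made for illustration: an ideal CPI of 1, a one cycle penalty whenever the IF stage has to be repeated, a delay slot that the compiler always fills with a useful instruction, and a hypothetical workload in which 20% of instructions are branches and 60% of those are taken.

```python
def effective_cpi(branch_fraction, taken_fraction, scheme):
    """Estimate CPI for a branch handling scheme in the 5 stage pipeline."""
    if scheme == "stall":
        penalty_per_branch = 1.0             # IF is repeated for every branch
    elif scheme == "predict_not_taken":
        penalty_per_branch = taken_fraction  # IF is repeated only when the branch is taken
    elif scheme == "delayed":
        penalty_per_branch = 0.0             # assumes the delay slot always does useful work
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return 1.0 + branch_fraction * penalty_per_branch


# Hypothetical workload: 20% branches, 60% of them taken.
for scheme in ("stall", "predict_not_taken", "delayed"):
    print(scheme, round(effective_cpi(0.20, 0.60, scheme), 3))
```

In practice a compiler cannot always fill the delay slot, so the delayed branch figure should be read as a best case.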
 Sometimes operations require more than one clock
cycle to complete. Examples are:
 Floating Point Multiply
 Floating Point Divide
 Floating Point Add
 We can assume that there is hardware available on
the processor for performing the operations.
 Assume that the FP Mul and Add are fully
pipelined, and the divide is un-pipelined.
 The multiplier and the adder are fully pipelined.
The divider is not pipelined at all.
 Take a look at figure A.34 for a good example of
how pipelining will function in the case of longer
instruction execution. The author assumes a single
floating point register port.
 Structural hazards are avoided in the ID stage by
assigning a memory bit in a shift register. Incoming
instructions can then check to see if they should
stall.
 Data Dependence:
 Instruction j is data dependent on instruction i if
instruction i produces a result that instruction j will
use, or if instruction j is data dependent on an instruction
that is itself data dependent on instruction i.
 Name Dependence:
 Occurs when two instructions use the same register
or memory location but there is no flow of data
between the instructions. Instruction order must be
preserved.
 Types of data hazards:
 RAW: read after write
 WAW: write after write
 WAR: write after read
 We have already seen a RAW hazard. WAW hazards
occur due to output dependence.
 WAR hazards do not usually occur because of the
amount of time between the read cycle and write
cycle in a pipeline.
Read After Write (RAW)
InstrJ tries to read operand before InstrI writes it
• Caused by a “Dependence” (in compiler nomenclature).
This hazard results from an actual need for
communication.
Execution Order is:
InstrI
InstrJ
I: add r1,r2,r3
J: sub r4,r1,r3
Write After Read (WAR)
InstrJ tries to write operand before InstrI reads it
– Gets wrong operand
– Called an “anti-dependence” by compiler writers.
This results from reuse of the name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Reads are always in stage 2, and
– Writes are always in stage 5
Execution Order is:
InstrI
InstrJ
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
Write After Write (WAW)
InstrJ tries to write operand before InstrI writes it
– Leaves wrong result ( InstrI not InstrJ )
• Called an “output dependence” by compiler writers
This also results from the reuse of name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Writes are always in stage 5
• Will see WAR and WAW in later more complicated pipes
Execution Order is:
InstrI
InstrJ
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
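The three hazard types can be told apart by comparing the destination and source registers of an instruction pair. The sketch below uses the I/J pairs from the RAW, WAR and WAW examples. The function names are hypothetical, and the parser assumes the simple three-operand form "op dest,src1,src2" used in those examples.

```python
def parse(instruction):
    """Split 'add r1,r2,r3' into (destination, [sources])."""
    _, operands = instruction.split(None, 1)
    destination, *sources = [register.strip() for register in operands.split(",")]
    return destination, sources


def classify(instr_i, instr_j):
    """Return the hazards that are possible when instr_i precedes instr_j."""
    dest_i, sources_i = parse(instr_i)
    dest_j, sources_j = parse(instr_j)
    hazards = set()
    if dest_i in sources_j:
        hazards.add("RAW")  # J reads what I writes
    if dest_j in sources_i:
        hazards.add("WAR")  # J writes what I reads
    if dest_i == dest_j:
        hazards.add("WAW")  # J writes what I writes
    return hazards


print(classify("add r1,r2,r3", "sub r4,r1,r3"))
print(classify("sub r4,r1,r3", "add r1,r2,r3"))
print(classify("sub r1,r4,r3", "add r1,r2,r3"))
```

This only identifies the dependence between the two instructions. Whether it becomes an actual hazard depends on the pipeline, as the notes on the MIPS 5 stage pipeline above point out.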
