Lecture6_Datapath_muceuok40lti_cycle.pdf
1. Department of Computer Engineering
University of Kurdistan
Computer Architecture
MIPS Multi-cycle Implementation
By: Dr. Alireza Abdollahpouri
2. A Multi-cycle MIPS processor
Any instruction set can be implemented in
many different ways
Micro-architectures for the MIPS ISA:
Single Cycle: short CPI, long CCT
Multi-Cycle: long CPI, short CCT
Pipelined: short CPI, short CCT
3. A Multicycle Implementation
Single-cycle versus multicycle instruction execution.
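[Figure: timing of four instructions. Single-cycle: every instruction is allotted the same clock period, so the time allotted exceeds the time needed for the faster instructions. Multicycle: the instructions take 3, 4, or 5 shorter cycles each, and the difference is the time saved.]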
5. Multicycle Datapath Approach
In this architecture, values that are needed in later cycles of an instruction are stored in registers.
As a result, the following components must be added to the architecture:
IR – Instruction Register
MDR – Memory Data Register
A, B – regfile read data registers
ALUout – ALU output register
[Figure: multicycle datapath. The PC supplies the address to a single memory (instructions or data); the memory's read data feeds IR and MDR; the register file's two read outputs feed A and B; the ALU result feeds ALUout.]
6. Our new adder setup
We can eliminate both extra adders in a multicycle datapath, and instead use
just one ALU, with multiplexers to select the proper inputs.
A 2-to-1 mux ALUSrcA sets the first ALU input to be the PC or a register.
A 4-to-1 mux ALUSrcB selects the second ALU input from among:
— the register file (for arithmetic operations),
— a constant 4 (to increment the PC),
— a sign-extended constant (for effective addresses), and
— a sign-extended and shifted constant (for branch targets).
This permits a single ALU to perform all of the necessary functions.
— Arithmetic operations on two register operands.
— Incrementing the PC.
— Computing effective addresses for lw and sw.
— Adding a sign-extended, shifted offset to (PC + 4) for branches.
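The operand selection described above can be sketched in a few lines of Python. This is an illustrative model, not part of the lecture: the function names are hypothetical, ALUSrcA = 0 for the PC and ALUSrcB = 01 for the constant 4 follow the stage 1 control settings used later in these slides, and the remaining ALUSrcB codes are assumed to follow the order of the list above.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_operands(alu_src_a, alu_src_b, pc, reg_a, reg_b, imm16):
    """Return the two ALU inputs chosen by the ALUSrcA and ALUSrcB muxes."""
    # 2-to-1 mux: 0 selects the PC, 1 selects the first register operand.
    first = reg_a if alu_src_a else pc
    # 4-to-1 mux (codes 00 and 10 are assumed to follow the list order).
    second = (
        reg_b,                      # 00: register file (arithmetic)
        4,                          # 01: constant 4 (PC increment)
        sign_extend16(imm16),       # 10: sign-extended constant (lw/sw address)
        sign_extend16(imm16) << 2,  # 11: sign-extended, shifted (branch target)
    )[alu_src_b]
    return first, second


# PC increment: the ALU sees the PC and the constant 4.
print(alu_operands(0, 0b01, 0x100, 7, 9, 0xFFFC))
# Branch target: the ALU sees the PC and the shifted offset (-4 words = -16 bytes).
print(alu_operands(0, 0b11, 0x104, 7, 9, 0xFFFC))
```

The same ALU therefore serves all four uses; only the two select signals change from cycle to cycle.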
7. The multicycle adder setup highlighted
[Figure: datapath with the single ALU highlighted. The ALUSrcA mux selects the PC or Read data 1; the ALUSrcB mux selects Read data 2, the constant 4, the sign-extended immediate, or the sign-extended immediate shifted left 2. Control signals shown: ALUOp, ALUSrcA, ALUSrcB, RegWrite, RegDst, MemToReg, IorD, MemRead, MemWrite, PCWrite.]
8. Eliminating a memory
Similarly, we can get by with one unified memory, which
will store both program instructions and data. (a
Princeton architecture)
This memory is used in both the instruction fetch and
data access stages, and the address could come from
either:
the PC register (when we’re fetching an instruction), or
the ALU output (for the effective address of a lw or
sw).
We add another 2-to-1 mux, IorD, to decide whether the
memory is being accessed for instructions or for data.
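As a rough illustration of the unified memory, the sketch below models the IorD mux in Python. The function name and the example memory contents are hypothetical; IorD = 0 selecting the PC matches the stage 1 control settings given later in these slides.

```python
# One memory holds both an instruction word and a data word.
# 0x8C090020 encodes lw $t1, 0x20($zero); 1234 is an arbitrary data value.
memory = {0x0000: 0x8C090020, 0x0020: 1234}


def memory_read(memory, ior_d, pc, alu_out):
    """Read the unified memory at the address chosen by the IorD mux."""
    # IorD = 0: instruction access, address comes from the PC.
    # IorD = 1: data access, address comes from the ALU output.
    address = alu_out if ior_d else pc
    return memory[address]


instruction = memory_read(memory, 0, pc=0x0000, alu_out=0x0020)
data = memory_read(memory, 1, pc=0x0000, alu_out=0x0020)
print(hex(instruction), data)
```

Because the two accesses happen in different cycles, one memory port is enough.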
9. The new memory setup highlighted
[Figure: the same datapath with the single memory highlighted. The IorD mux selects the memory address from the PC or the ALU output; MemRead and MemWrite control the access.]
10. Intermediate registers
Sometimes we need the output of a functional unit in a later clock
cycle during the execution of one instruction.
The instruction word fetched in stage 1 determines the
destination of the register write in stage 5.
The ALU result for an address computation in stage 3 is needed
as the memory address for lw or sw in stage 4.
These outputs will have to be stored in intermediate registers for
future use. Otherwise they would probably be lost by the next clock
cycle.
The instruction read in stage 1 is saved in the instruction register (IR).
Register file outputs from stage 2 are saved in registers A and B.
The ALU output will be stored in a register ALUOut.
Any data fetched from memory in stage 4 is kept in the Memory
data register, also called MDR.
11. The final multicycle datapath
[Figure: final multicycle datapath. Added to the previous diagram: the instruction register (fields [31-26], [25-21], [20-16], [15-11], [15-0]) with IRWrite, the memory data register, registers A and B, ALUOut, and a PCSource mux that selects the value written to the PC.]
13. Multicycle Datapath with Control
[Figure: multicycle datapath with the control unit. The control block takes Op (Instruction [31-26]) and outputs IorD, MemRead, MemWrite, MemtoReg, PCWriteCond, PCWrite, IRWrite, ALUOp, ALUSrcB, ALUSrcA, RegDst, PCSource, and RegWrite. ALU control also uses Instruction [5-0]. The jump address [31-0] is formed from PC [31-28] and Instruction [25-0] shifted left 2 (26 bits to 28 bits); the PCSource mux selects among the ALU result, ALUOut, and the jump address.]
14. Multicycle control unit
The control unit is responsible for producing all of the control signals.
Each instruction requires a sequence of control signals, generated
over multiple clock cycles.
This implies that we need a state machine.
The datapath control signals will be outputs of the state machine.
Different instructions require different sequences of steps.
This implies the instruction word is an input to the state machine.
The next state depends upon the exact instruction being
executed.
After we finish executing one instruction, we’ll have to repeat the
entire process again to execute the next instruction.
15. Multicycle Control Unit
In the multicycle architecture, the control signals cannot be derived from the instruction bits alone.
For this reason, a finite-state machine (FSM) is used to design the control unit.
The processor is assumed to have a finite number of states, which are held in the state register.
The next state is determined by the current state and the input values.
[Figure: combinational control logic takes the instruction opcode and the current state from the state register, and produces the datapath control points and the next state.]
16. Finite-state machine for the control unit
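States and transitions:
Instruction fetch and PC increment -> Register fetch and branch computation
Register fetch and branch computation -> Effective address computation (Op = LW/SW)
Effective address computation -> Memory read (Op = LW) -> Register write
Effective address computation -> Memory write (Op = SW)
Register fetch and branch computation -> R-type execution (Op = R-type) -> R-type writeback
Register fetch and branch computation -> Branch completion (Op = BEQ)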
Each bubble is a state
– Holds the control signals for a single cycle
– Note: All instructions do the same things during the first two cycles
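The state diagram can be written down as a small transition table. The Python sketch below is only an illustration: the state names are shortened, hypothetical labels for the bubbles, and the return from each final state to instruction fetch is assumed from the statement that the process repeats for the next instruction.

```python
# "*" means the transition is taken for any opcode.
TRANSITIONS = {
    "InstrFetch": {"*": "RegFetch"},
    "RegFetch": {"LW": "EffAddr", "SW": "EffAddr",
                 "R-type": "RExecute", "BEQ": "BranchCompletion"},
    "EffAddr": {"LW": "MemRead", "SW": "MemWrite"},
    "MemRead": {"*": "RegWrite"},
    "RegWrite": {"*": "InstrFetch"},
    "MemWrite": {"*": "InstrFetch"},
    "RExecute": {"*": "RWriteback"},
    "RWriteback": {"*": "InstrFetch"},
    "BranchCompletion": {"*": "InstrFetch"},
}


def next_state(state, op):
    """Next state as a function of the current state and the opcode."""
    row = TRANSITIONS[state]
    return row[op] if op in row else row["*"]


def state_sequence(op):
    """List the states visited while executing one instruction."""
    states = ["InstrFetch"]
    while True:
        following = next_state(states[-1], op)
        if following == "InstrFetch":  # instruction finished
            return states
        states.append(following)


for op in ("LW", "SW", "R-type", "BEQ"):
    print(op, state_sequence(op))
```

Every sequence starts with the same two states, which matches the note that all instructions do the same things during the first two cycles.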
17. Stage 1: Instruction Fetch
Stage 1 includes two actions which use two
separate functional units: the memory and the
ALU.
Fetch the instruction from memory and store it in IR.
IR = Mem[PC]
Use the ALU to increment the PC by 4.
PC = PC + 4
18. Stage 1: Instruction fetch and PC increment
[Figure: final multicycle datapath with the stage 1 paths highlighted: IR = Mem[PC] through the memory and instruction register, and PC = PC + 4 through the ALU.]
19. Stage 1 control signals
Instruction fetch: IR = Mem[PC]
Signal     Value  Description
MemRead    1      Read from memory
IorD       0      Use PC as the memory read address
IRWrite    1      Save memory contents to instruction register
Increment the PC: PC = PC + 4
Signal     Value  Description
ALUSrcA    0      Use PC as the first ALU operand
ALUSrcB    01     Use constant 4 as the second ALU operand
ALUOp      ADD    Perform addition
PCWrite    1      Change PC
PCSource   0      Update PC from the ALU output
We’ll assume that all control signals not listed are implicitly set to 0.
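To make the effect of these settings concrete, here is a minimal Python sketch of one stage 1 cycle. The function and dictionary names are hypothetical, and the model only understands the signal values listed for this stage; it is not a general datapath simulator.

```python
STAGE1_SIGNALS = {
    "MemRead": 1, "IorD": 0, "IRWrite": 1,            # IR = Mem[PC]
    "ALUSrcA": 0, "ALUSrcB": 0b01, "ALUOp": "ADD",    # ALU computes PC + 4
    "PCWrite": 1, "PCSource": 0,                      # PC takes the ALU output
}


def instruction_fetch(pc, memory, signals=STAGE1_SIGNALS):
    """Return (IR, new PC) after one fetch cycle under the given signals."""
    ir = None
    # Memory side: read at the PC (IorD = 0) and latch the word into IR.
    if signals["MemRead"] and signals["IRWrite"] and signals["IorD"] == 0:
        ir = memory[pc]
    new_pc = pc
    # ALU side: operands are the PC and the constant 4, operation is ADD.
    if (signals["ALUSrcA"] == 0 and signals["ALUSrcB"] == 0b01
            and signals["ALUOp"] == "ADD"):
        alu_result = (pc + 4) & 0xFFFFFFFF  # 32-bit wraparound
        if signals["PCWrite"] and signals["PCSource"] == 0:
            new_pc = alu_result
    return ir, new_pc


# 0x012A4020 encodes add $t0, $t1, $t2; the address is an arbitrary example.
ir, pc = instruction_fetch(0x00400000, {0x00400000: 0x012A4020})
print(hex(ir), hex(pc))
```

Both actions happen in the same cycle because they use different functional units, the memory and the ALU.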
20. Summary of Instruction Execution
Step 1 (IF), instruction fetch, all instructions:
    IR = Memory[PC]
    PC = PC + 4
Step 2 (ID), instruction decode/register fetch, all instructions:
    A = Reg[IR[25-21]]
    B = Reg[IR[20-16]]
    ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Step 3 (EX), execution, address computation, branch/jump completion:
    R-type: ALUOut = A op B
    Memory reference: ALUOut = A + sign-extend(IR[15-0])
    Branch: if (A == B) then PC = ALUOut
    Jump: PC = PC[31-28] || (IR[25-0] << 2)
Step 4 (MEM), memory access or R-type completion:
    R-type: Reg[IR[15-11]] = ALUOut
    Load: MDR = Memory[ALUOut]
    Store: Memory[ALUOut] = B
Step 5 (WB), memory read completion:
    Load: Reg[IR[20-16]] = MDR
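The summary implies a cycle count for each instruction class: the number of steps it uses. The short Python sketch below tabulates this and totals the cycles for a small instruction sequence. The dictionary, the function name, and the example sequence are illustrative assumptions, not taken from the lecture.

```python
# Steps used by each instruction class, read off the summary above.
STEPS = {
    "R-type": ["IF", "ID", "EX", "MEM"],        # completes in step 4
    "lw":     ["IF", "ID", "EX", "MEM", "WB"],  # needs the extra write-back
    "sw":     ["IF", "ID", "EX", "MEM"],
    "beq":    ["IF", "ID", "EX"],
    "j":      ["IF", "ID", "EX"],
}


def total_cycles(program):
    """Total clock cycles for a sequence of instruction classes."""
    return sum(len(STEPS[instr]) for instr in program)


program = ["lw", "R-type", "sw", "beq"]  # hypothetical instruction sequence
cycles = total_cycles(program)
print(cycles, cycles / len(program))  # total cycles and average CPI
```

For this sequence the total is 5 + 4 + 4 + 3 = 16 cycles, an average CPI of 4.0.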
22. Implementing the FSM
This can be translated into a state table; here are the first two states.
Current state  Input (Op)  Next state        PCWrite  IorD  MemRead  MemWrite  IRWrite  RegDst  MemToReg  RegWrite  ALUSrcA  ALUSrcB  ALUOp  PCSource
Instr Fetch    X           Reg Fetch         1        0     1        0         1        X       X         0         0        01       010    0
Reg Fetch      BEQ         Branch compl      0        X     0        0         0        X       X         0         0        11       010    X
Reg Fetch      R-type      R-type execute    0        X     0        0         0        X       X         0         0        11       010    X
Reg Fetch      LW/SW       Compute eff addr  0        X     0        0         0        X       X         0         0        11       010    X
You can implement this the hard way (hardwired control).
Represent the current state using flip-flops or a register.
Find equations for the next state and the control signal outputs in terms of the current state and the input (instruction word).
Or you can use the easy way.
Store all of the control signals in a memory, such as a ROM.
This is much easier, since you don’t have to derive equations.
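The ROM approach amounts to a table lookup indexed by the current state and the opcode. The Python sketch below stores only the four rows of the state table for the first two states; the dictionary layout and function name are hypothetical, and a real ROM would be addressed by encoded state and opcode bits rather than by strings.

```python
SIGNAL_NAMES = ("PCWrite", "IorD", "MemRead", "MemWrite", "IRWrite", "RegDst",
                "MemToReg", "RegWrite", "ALUSrcA", "ALUSrcB", "ALUOp", "PCSource")

# (current state, opcode) -> (next state, control word); "X" is a don't-care.
CONTROL_ROM = {
    ("InstrFetch", "X"): ("RegFetch", "1 0 1 0 1 X X 0 0 01 010 0"),
    ("RegFetch", "BEQ"): ("BranchCompl", "0 X 0 0 0 X X 0 0 11 010 X"),
    ("RegFetch", "R-type"): ("RTypeExecute", "0 X 0 0 0 X X 0 0 11 010 X"),
    ("RegFetch", "LW/SW"): ("ComputeEffAddr", "0 X 0 0 0 X X 0 0 11 010 X"),
}


def lookup(state, op):
    """Return (next state, {signal: value}) for the current state and opcode."""
    entry = CONTROL_ROM.get((state, op))
    if entry is None:
        entry = CONTROL_ROM[(state, "X")]  # row that ignores the opcode
    following, word = entry
    return following, dict(zip(SIGNAL_NAMES, word.split()))


state, signals = lookup("InstrFetch", "BEQ")
print(state, signals["PCWrite"], signals["ALUSrcB"])
state, signals = lookup(state, "BEQ")
print(state, signals["PCWrite"], signals["ALUSrcB"])
```

No equations are derived here: changing the control behavior means editing table entries, which is the appeal of the ROM-based design.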
23. Control Unit (micro-program)
In microprogrammed control, the control information is stored in a memory called the control memory.
CAR: Control Address Register
[Figure: the CAR addresses the control memory; the control word read from it goes to the datapath.]
25. Control Unit (micro-program)
Microprogram containing 10 microinstructions
Label     ALU control  SRC1  SRC2     Register control  Memory     PCWrite control  Sequencing
Fetch     Add          PC    4                          Read PC    ALU              Seq
          Add          PC    Extshft  Read                                          Dispatch 1
Mem1      Add          A     Extend                                                 Dispatch 2
LW2                                                     Read ALU                    Seq
                                      Write MDR                                     Fetch
SW2                                                     Write ALU                   Fetch
Rformat1  Func code    A     B                                                      Seq
                                      Write ALU                                     Fetch
BEQ1      Subt         A     B                                     ALUOut-cond      Fetch
JUMP1                                                              Jump address     Fetch
Dispatch ROM 1 (Dispatch Table 1)
Op      Opcode name  Value
000000  R-format     Rformat1
000010  jmp          JUMP1
000100  beq          BEQ1
100011  lw           Mem1
101011  sw           Mem1
Dispatch ROM 2 (Dispatch Table 2)
Op      Opcode name  Value
100011  lw           LW2
101011  sw           SW2
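The sequencing column and the two dispatch tables are enough to trace which microinstructions run for a given opcode. The Python sketch below models only the sequencing, not the datapath fields; the list layout, label map, and function name are hypothetical.

```python
# (label, sequencing) for the 10 microinstructions, in control-memory order.
MICROPROGRAM = [
    ("Fetch", "Seq"), ("", "Dispatch 1"),
    ("Mem1", "Dispatch 2"),
    ("LW2", "Seq"), ("", "Fetch"),
    ("SW2", "Fetch"),
    ("Rformat1", "Seq"), ("", "Fetch"),
    ("BEQ1", "Fetch"),
    ("JUMP1", "Fetch"),
]
LABELS = {label: addr for addr, (label, _) in enumerate(MICROPROGRAM) if label}

DISPATCH_1 = {0b000000: "Rformat1", 0b000010: "JUMP1", 0b000100: "BEQ1",
              0b100011: "Mem1", 0b101011: "Mem1"}
DISPATCH_2 = {0b100011: "LW2", 0b101011: "SW2"}


def trace(opcode):
    """Control-memory addresses visited while executing one instruction."""
    address, visited = LABELS["Fetch"], []
    while True:
        visited.append(address)
        sequencing = MICROPROGRAM[address][1]
        if sequencing == "Seq":             # fall through to the next word
            address += 1
        elif sequencing == "Dispatch 1":    # branch on opcode via table 1
            address = LABELS[DISPATCH_1[opcode]]
        elif sequencing == "Dispatch 2":    # branch on opcode via table 2
            address = LABELS[DISPATCH_2[opcode]]
        else:                               # "Fetch": instruction is done
            return visited


print(trace(0b100011))  # lw
print(trace(0b000100))  # beq
```

The trace lengths match the cycle counts of the multicycle design: five microinstructions for lw, four for sw and R-format, three for beq and jump.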
30. Summary
A single-cycle CPU has two main disadvantages.
The cycle time is limited by the worst-case latency.
It requires more hardware than necessary.
A multicycle processor splits instruction execution into several stages.
Instructions only execute as many stages as required.
Each stage is relatively simple, so the clock cycle time is reduced.
Functional units can be reused on different cycles.
We made several modifications to the single-cycle datapath.
The two extra adders and one memory were removed.
Multiplexers were inserted so the ALU and memory can be used for
different purposes in different execution stages.
New registers are needed to store intermediate results.