UCSD - ECE 260A: VLSI DESIGN PROJECT
Page 1 of 9 ECE 260A PROJECT – Fall 2015
Design and Simulation of an 8-Bit Processor and Its
Associated 32-Byte SRAM
Fanyu Yang, A53102865
Haoran Pu, A53104427
1. Design Challenge and Significance
Designing a microprocessor takes entry-level engineering students
through a reasonably complex circuit design process that involves
static circuits, dynamic circuits, sequential logic, combinational
logic, and memory arrays. The process requires a comprehensive
understanding of the entire microprocessor as well as precise and
efficient operation of every subsystem, such as the SRAM and the
register file. The microprocessor consists of five major subsystems:
register file, control logic unit, arithmetic logic unit (ALU), SRAM,
and program counter.
The register file block (see Figure 1.2) stores operands. The key
challenge is to block unselected registers while data is written into
the selected register or while the selected register sends data out
during a read. Sixty-four 8-to-1 multiplexers could block the
unselected registers, at the expense of higher power consumption.
Transmission gates implement the same function with lower power
consumption, but their outputs float when the gates switch off, which
causes serious problems. An energy-efficient way to block unselected
registers is therefore to use transmission gates with one
optimization: a pull-down network at the output of each transmission
gate pulls the output to ground when the complementary enable signal
is asserted.
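The gating behaviour described above can be sketched as a small behavioural model in Python. This is only an illustration under stated assumptions: a floating node is represented by `None`, the pull-down forces a logic 0, and the function names (`transmission_gate`, `gated_output`, `read_bus`) are hypothetical. Combining the gated outputs with an OR in `read_bus` is also an assumption, since the report does not say how the outputs are merged.

```python
def transmission_gate(data, enable):
    """Pass data when enabled; otherwise the output floats (modelled as None)."""
    return data if enable else None


def gated_output(data, enable):
    """Transmission gate followed by a pull-down on the complementary enable."""
    out = transmission_gate(data, enable)
    # When enable is low, the pull-down network forces a defined logic 0
    # instead of leaving the node floating.
    return out if enable else 0


def read_bus(registers, select):
    """Assumed merge: OR all gated outputs; unselected registers contribute 0."""
    bus = 0
    for index, value in enumerate(registers):
        bus |= gated_output(value, index == select)
    return bus
```

With the pull-down in place every unselected path has a defined value, which is what makes a simple merge of the outputs possible in this model.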
For the control logic unit, it is vital to send the correct control
signals to the corresponding subsystems, which requires extensive and
accurate bus notation.
The ALU block (see Figure 1.3) works as a calculator that implements
arithmetic operations such as addition, subtraction, and
multiplication. The major challenge is to turn off unwanted operations
in the ALU. The solution is to use transmission gates to select the
required calculation result. Enable signals generated by the 4-16
decoder turn on the transmission gate for the required operation and
turn off the gates of the unselected operations. Turning off the
unselected arithmetic sub-blocks significantly decreases the power
consumption of the ALU and increases its operating speed.
The SRAM (see Figure 1.4) works as the memory element of the
microprocessor. The design uses 256 6T SRAM cells (32 bytes of 8 bits
each). A 6T SRAM cell is a ratioed circuit, so the primary challenge
for the SRAM is transistor sizing, which has considerable influence on
operating speed, power consumption, and even the correctness of the
result.
The program counter counts the number of operations executed since the
reset signal was asserted. The major challenge is caused by the reset
signal. The reset of the D flip-flops is synchronous, meaning that it
takes effect at the rising edge of the clock. If the output Q of one D
flip-flop feeds the clock input of the next D flip-flop, the reset of
the next flip-flop can be delayed relative to the previous one, which
causes severe problems. Adding five half adders allows the D
flip-flops to reset at the same time, which solves this problem.
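One way to read this fix is as a synchronous counter whose next state is computed by a chain of half adders, so that every flip-flop shares the same clock edge. The Python sketch below is a behavioural model under that assumption. The 5-bit width is inferred from the five half adders and is not stated in the report, and the class and method names are hypothetical.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


class ProgramCounter:
    WIDTH = 5  # assumption: one half adder per counter bit

    def __init__(self):
        self.bits = [0] * self.WIDTH  # least significant bit first

    def next_state(self):
        """Increment by one using a ripple chain of half adders."""
        carry = 1  # a carry-in of 1 adds one to the current count
        nxt = []
        for bit in self.bits:
            total, carry = half_adder(bit, carry)
            nxt.append(total)
        return nxt

    def clock(self, reset=False):
        """Model one rising clock edge; all bits update together."""
        self.bits = [0] * self.WIDTH if reset else self.next_state()
        return self.value

    @property
    def value(self):
        return sum(bit << i for i, bit in enumerate(self.bits))
```

Because every bit is updated in the same `clock` call, a reset clears all bits on the same edge, which is the property the paragraph above asks for.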
2. Architecture Description
The microprocessor (see Figure 1.1) has five building blocks, of which
three are the most important: the register file, the ALU, and the SRAM.
The register file block (see Figure 1.2) writes and reads data by
address. The 3-8 decoder generates the address for the WS (write
selection) and RS (read selection) blocks. Data feeds into WS, which
writes it to the address selected by the decoder, and the data then
passes from WS into the 8*8 register array. In the next operation
cycle, the stored data feeds from the 8*8 register array to RS and is
exported from RS as 8-bit output data. The 8-bit output data can also
be exported directly from WS.
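The write and read paths described above can be sketched as a behavioural model. This is an illustration only: the `RegisterFile` class and its method names are hypothetical, and the model ignores clocking and the transmission-gate implementation.

```python
def decode_3to8(address):
    """3-8 decoder: return a one-hot list of eight enable lines."""
    if not 0 <= address < 8:
        raise ValueError("address must fit in 3 bits")
    return [1 if i == address else 0 for i in range(8)]


class RegisterFile:
    """Eight 8-bit registers with a write-select and a read-select path."""

    def __init__(self):
        self.registers = [0] * 8

    def write(self, address, data):
        """WS path: store data in the decoded register and pass it through."""
        data &= 0xFF  # keep 8 bits
        for index, enable in enumerate(decode_3to8(address)):
            if enable:
                self.registers[index] = data
        return data  # the report notes WS can also drive the output directly

    def read(self, address):
        """RS path: only the decoded register reaches the output."""
        enables = decode_3to8(address)
        return sum(value for value, en in zip(self.registers, enables) if en)
```

For example, `RegisterFile().write(3, 0xAB)` stores the byte in register 3, and a later `read(3)` on the same object returns it.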
The ALU (see Figure 1.3) is the main block that carries out arithmetic
operations. Control signals feeding into the 3-8 decoder generate
operation selection signals, which feed into the multiplexer to choose
one of eight arithmetic operations. The 8-bit output of register Y
from WS and the 8-bit output of register X from RS are sent as inputs
to the arithmetic operation blocks; however, only the selected
operation executes. The 8-bit output of the ALU is sent back to the
control logic unit to wait for the next operation cycle.
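The selection scheme above can be sketched as a one-hot decoder choosing among operation blocks. Only addition, subtraction, and multiplication are named in the report, so the sketch leaves the other five slots empty. The opcode assignment (0, 1, 2) and the function names are assumptions for illustration, and the 4-bit multiplier operands follow the A<0:3> and B<0:3> labels of the multiplier schematic.

```python
def decode_3to8(select):
    """3-8 decoder: return a one-hot list of eight enable lines."""
    if not 0 <= select < 8:
        raise ValueError("select must fit in 3 bits")
    return [1 if i == select else 0 for i in range(8)]


# Hypothetical opcode assignment; slots left as None are not specified here.
OPERATIONS = [
    lambda x, y: x + y,                  # addition
    lambda x, y: x - y,                  # subtraction
    lambda x, y: (x & 0xF) * (y & 0xF),  # 4-bit by 4-bit multiplication
    None, None, None, None, None,
]


def alu(select, x, y):
    """Return the 8-bit result of the operation enabled by the decoder."""
    for enable, operation in zip(decode_3to8(select), OPERATIONS):
        if enable:
            if operation is None:
                raise NotImplementedError("operation not specified")
            return operation(x, y) & 0xFF  # keep 8 bits
    raise AssertionError("decoder output was not one-hot")
```

In this model only the enabled operation is evaluated, which mirrors the statement that only the selected operation executes.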
The SRAM (see Figure 1.4) stores data that is not currently in use.
Control signals are fed into a 5-32 decoder to generate the 32
wordline enable signals. Because a wordline should not be enabled
while the 6T SRAM cells are in the precharge stage, each wordline
enable signal is ANDed with the complementary clock signal before it
turns the SRAM cells on. For a write operation, Write Enable is
asserted and the 8-bit input data is written into the SRAM cells
through the bitlines. For a read operation, the data in the SRAM cells
is sent to the 8-bit SRAM output and then fed back to the control
logic unit.
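The wordline gating and the read and write behaviour can be sketched as follows. This is a behavioural model under assumptions: since the wordline is ANDed with the complementary clock, the model treats the clock-high phase as precharge and returns `None` for it. The class name `Sram32x8` and its methods are hypothetical, and bitline conditioning is not modelled.

```python
class Sram32x8:
    """32 words of 8 bits with clock-gated wordlines."""

    def __init__(self):
        self.cells = [0] * 32

    def wordlines(self, address, clk):
        """5-32 decode, then AND each enable with the complementary clock."""
        if not 0 <= address < 32:
            raise ValueError("address must fit in 5 bits")
        return [int(i == address and not clk) for i in range(32)]

    def access(self, address, clk, write_enable=False, data_in=0):
        """Return the addressed word, or None while precharging (clk high)."""
        lines = self.wordlines(address, clk)
        if 1 not in lines:
            return None  # no wordline is on, so no cell is accessed
        row = lines.index(1)
        if write_enable:
            self.cells[row] = data_in & 0xFF  # write through the bitlines
        return self.cells[row]
```

Writing 4 to address 4 and 13 to address 13 and reading them back corresponds to the SRAM(4)=4 and SRAM(13)=13 cases named in the figure captions.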
3. Innovation
Transmission gate logic is applied to almost all multiplexers to
increase speed and reduce power consumption. In the register file
block, transmission gates select the required registers and block the
others. In the ALU block, transmission gates control which arithmetic
operation is carried out.
4. Remaining Problem
The major problem is that the 8-bit output of register X is read with
some delay after the rising edge of the clock, with the result that
the 8-bit output of register Y is read at the next rising edge of the
clock. This happens when the ALU has a large delay. Reducing the ALU
delay can eliminate this mismatch.
5. Future Iterations
The microprocessor design could be improved by using 10T SRAM cells
instead of 6T SRAM cells.
Figure 1 | Schematic and Building Blocks
Figure 2 | Schematic of Multiplier
[Schematic labels: inputs A<0:3> and B<0:3>, output Z<0:7>, supplies VDD and GND; eight full-adder (FA) blocks and four half-adder (HALF) blocks.]
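The labels that survive from the multiplier schematic show 4-bit inputs, an 8-bit output, eight full adders, and four half adders. The exact wiring is not recoverable from the text, so the Python sketch below assumes one standard arrangement, a ripple-carry array multiplier, which uses the same adder counts. The function names are hypothetical.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry) for three input bits."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def multiply_4x4(a, b):
    """Multiply two 4-bit unsigned values; return (product, adder counts)."""
    a_bits = [(a >> j) & 1 for j in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    # Partial products: pp[i][j] = A<j> AND B<i>.
    pp = [[a_bits[j] & b_bits[i] for j in range(4)] for i in range(4)]
    counts = {"HALF": 0, "FA": 0}
    z = [pp[0][0]]               # Z<0> needs no adder
    upper = pp[0][1:] + [None]   # None marks a position with no bit yet
    for i in range(1, 4):
        carry, sums = None, []
        for x, y in zip(upper, pp[i]):
            inputs = [bit for bit in (x, y, carry) if bit is not None]
            if len(inputs) == 2:
                total, carry = half_adder(*inputs)
                counts["HALF"] += 1
            else:
                total, carry = full_adder(*inputs)
                counts["FA"] += 1
            sums.append(total)
        z.append(sums[0])        # lowest sum bit of each row is a product bit
        upper = sums[1:] + [carry]
    z.extend(upper)              # Z<4> through Z<7>
    product = sum(bit << k for k, bit in enumerate(z))
    return product, counts
```

Each of the three adder rows adds one partial product to the running sum shifted right by one bit. The counts returned by `multiply_4x4` can be compared against the FA and HALF blocks in the schematic.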
Figure 3 | SRAM Schematic
[Schematic blocks: 6T SRAM cells, bitline conditioning circuits, read and write drivers, 5-32 decoder, 8-bit output signals.]
Figure 4 | Schematic of Test Bench
Figure 5 | SRAM(4)=4
Figure 6 | SRAM (13)=13
Optional Figure 7 | Energy
Optional Figure 8 | Result Table
