Advanced Processor Power Point Presentation
INDEX
1. INTRODUCTION
2. DESIGN SPACE
3. INSTRUCTION SET ARCHITECTURE
4. CISC SCALAR PROCESSOR
5. RISC SCALAR PROCESSOR
6. VLIW ARCHITECTURE
7. VECTOR PROCESSOR
8. SYMBOLIC PROCESSOR
INTRODUCTION
A Processor is an integrated electronic circuit that performs the calculations that run
a computer. A processor performs arithmetical, logical, input/output (I/O) and other
basic instructions that are passed from an operating system (OS). Most other processes
are dependent on the operations of a processor.
Similarly, advanced processors support more sophisticated styles of processing. The advanced processor technologies covered here are the CISC, RISC, VLIW, vector and symbolic processors.
Scalar and vector processors are aimed at numerical computation, while symbolic processors have been developed for AI applications.
Design Space
Processor families can be mapped onto a coordinate space of clock rate versus cycles per
instruction (CPI).
Clock rate (also known as clock speed or clock frequency) is a measure of how fast a computer's
central processing unit (CPU) executes instructions. It is typically measured in gigahertz (GHz).
Higher clock speeds generally mean that a CPU can process more instructions per second, and thus
can perform better on tasks that require fast processing.
Cycles per instruction (CPI), also known as clock cycles per instruction, is one aspect of a
processor's performance: the average number of clock cycles each instruction takes for a given program.
As implementation technology evolves rapidly, the clock rates of various processors have moved
from low to higher speeds toward the right of the design space (i.e. the clock rate increases).
Similarly, processor manufacturers have been trying to lower the CPI (the number of cycles taken
to execute an instruction) using innovative hardware approaches.
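To make the relationship concrete, here is a minimal sketch (not from the original slides) of the classic CPU-time equation; the instruction count, CPI values and clock rates are invented purely for illustration:

```python
# A minimal sketch of the performance equation relating clock rate and CPI.
# All numbers below are made-up illustrative values, not measurements.

def execution_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# One billion instructions on two hypothetical designs:
design_a = execution_time(1_000_000_000, cpi=1.2, clock_rate_hz=2.0e9)  # low CPI, modest clock
design_b = execution_time(1_000_000_000, cpi=3.5, clock_rate_hz=4.0e9)  # high clock, high CPI

print(f"Design A: {design_a:.2f} s")   # 0.60 s
print(f"Design B: {design_b:.2f} s")   # 0.88 s
```

Either lowering the CPI or raising the clock rate shortens execution time, which is exactly the two directions of movement in the design space.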
Instruction Set Architecture (ISA)
• An instruction is a set of codes that the computer processor can understand. The
code is usually in 1s and 0s, or machine language.
Examples of some instructions −
ADD − Add two numbers together.
JUMP − Jump to designated RAM address.
LOAD − Load information from RAM to the CPU.
• Instruction Set Architecture (ISA) is defined as the design of a computer from the
programmer's perspective. This basically means that an ISA describes the design of a
computer in terms of the basic operations it must support. The ISA is not concerned
with the implementation-specific details of a computer; it is only concerned with the
set of basic operations the computer must support.
• The ISA acts as an interface between the hardware and the software.
• The ISA describes
(1) memory model,
(2) instruction format, types and modes, and
(3) operand registers, types, and data addressing.
Instruction types include arithmetic, logical, data transfer, and flow control.
Instruction modes include kernel and user instructions.
The ISA is implemented in hardware as the fetch-decode-execute cycle.
In the fetch step, the next instruction (and any operands it needs) is retrieved from memory.
The decode step puts the operands into a format that the ALU can manipulate.
The execute step performs the selected operation within the ALU.
Control facilitates the orderly routing of data, including I/O to the ALU's external
environment (e.g., peripheral devices such as a disk or keyboard).
Fetch-Decode-Execute Cycle
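As an illustration only, the following toy loop sketches the fetch-decode-execute idea for an imaginary accumulator machine; the LOAD/ADD/HALT opcodes, addresses and data are invented and do not belong to any real ISA:

```python
# A toy fetch-decode-execute loop for an imaginary accumulator machine.
# The opcodes, memory layout and data values are hypothetical.

memory = {
    0: ("LOAD", 100),   # program
    1: ("ADD", 101),
    2: ("HALT", None),
    100: 7,             # data
    101: 5,
}

acc, pc, running = 0, 0, True
while running:
    opcode, operand = memory[pc]   # fetch: read the instruction the PC points at
    pc += 1
    if opcode == "LOAD":           # execute: copy a memory word into the accumulator
        acc = memory[operand]
    elif opcode == "ADD":          # execute: the ALU adds the memory operand to the accumulator
        acc += memory[operand]
    elif opcode == "HALT":
        running = False

print("accumulator =", acc)        # accumulator = 12
```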
Objective of an ISA
Let us try to understand the objectives of an ISA by taking the MIPS ISA as an example.
(Here MIPS refers to the MIPS processor architecture, not the unrelated "million instructions
per second" performance metric.)
MIPS is one of the most widely used ISAs due to its simplicity.
The ISA defines the types of instructions to be supported by the processor.
• Based on the type of operations they perform, MIPS instructions are classified into 3 types:
1. Arithmetic/Logic Instructions: these instructions perform various arithmetic and logical
operations on one or more operands.
2. Data Transfer Instructions: these instructions are responsible for the transfer of data
between memory and the processor registers, and vice versa.
3. Branch and Jump Instructions: these instructions break the sequential flow of execution
and jump to instructions at other locations.
• The ISA defines the maximum length of each type of instruction. Since MIPS is a 32-bit ISA,
each instruction must be accommodated within 32 bits.
• The ISA defines the instruction format of each type of instruction.
The instruction format determines how the entire instruction is encoded within 32 bits.
There are 3 types of instruction formats in the MIPS ISA:
1. R-Instruction Format
2. I-Instruction Format
3. J-Instruction Format
R-Instruction Format
The R instruction format has fields for three registers (typically two sources and a destination), as
well as a shift amount (5 bits) and a function code (6 bits). It is used for arithmetic/bitwise instructions
that do not have an immediate operand.
Opcode = 000000 (6 bits) | RS (5) | RT (5) | RD (5) | Shift Amount (5) | Function (6)
The Function field specifies the actual arithmetic function to be applied to the operands given by
the RS, RT (sources) and RD (destination) fields. For example, function 32 (100000b) is addition.
The Left/Right Shift instructions use the shift amount field to specify the amount to shift.
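A small sketch of how these fields could be packed into a 32-bit word, assuming the field widths in the layout above (the register numbers and example instruction are illustrative):

```python
# Encode a MIPS R-type instruction: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6).

def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $8, $9, $10  ->  rd = $8, rs = $9, rt = $10, funct = 32 (100000b, addition)
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=32)
print(f"{word:032b}")   # 00000001001010100100000000100000
```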
I-Instruction Format
The I instruction format contains fields for two registers (typically source and destination) and
for a 16-bit immediate value. The I format is used for arithmetic operations with an immediate
operand.
Opcode (6 bits) | RS (5) | RD (5) | Immediate/Address (16)
The RS and RD fields encode the source and destination registers (MIPS has 32 registers,
addressable with 5 bits since 2^5 = 32), while the Immediate field encodes an immediate value
or address offset.
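A similar sketch for the I format, assuming the standard MIPS opcode 8 for addi (the register numbers are illustrative; the slide labels the second register RD, which MIPS manuals usually call rt):

```python
# Encode a MIPS I-type instruction: opcode(6) rs(5) rt(5) immediate(16).

def encode_i_type(opcode, rs, rt, immediate):
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

# addi $8, $9, 100  ->  opcode 8 (addi), rs = $9, rt/RD = $8, immediate = 100
word = encode_i_type(opcode=8, rs=9, rt=8, immediate=100)
print(f"{word:032b}")   # 00100001001010000000000001100100
```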
J-Instruction Format
The J format is used for the Jump instruction, which jumps to an absolute address. Because
instructions are 32 bits wide and must be word-aligned, the low 2 bits of every valid instruction
address are always 0. The hardware forms the target by shifting the 26-bit Address field left by 2
and taking the upper 4 bits from the current program counter, so a jump can reach any word
within the current 256 MB (2^28-byte) region.
Opcode (6 bits) | Address (26 bits)
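The following sketch shows how a jump target can be formed from the 26-bit field; the PC value and address field used here are arbitrary example numbers:

```python
# Form a MIPS jump target from the 26-bit address field.

def jump_target(pc, address_field_26):
    # Shift the 26-bit field left by 2 (instructions are word-aligned, low 2 bits = 0)
    # and take the upper 4 bits from the address of the instruction after the jump.
    return ((pc + 4) & 0xF0000000) | (address_field_26 << 2)

pc = 0x0040_0000                        # hypothetical current instruction address
target = jump_target(pc, 0x0010_0000)   # hypothetical 26-bit address field
print(hex(target))                      # 0x400000
```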
The two main categories of processors are:
1. CISC
2. RISC
Under both the CISC and RISC categories, products designed for multi-core chips, embedded
applications, or for low cost and/or low power consumption tend to have lower clock speeds.
High-performance processors must necessarily be designed to operate at high clock speeds.
The category of vector processors has been marked VP; vector processing features may be
associated with CISC or RISC main processors.
CISC Scalar Processor
Scalar processors are a class of Computer Processors that process only one data item at a time.
Typical data items include integers and floating-point numbers.
A scalar processor is classified as a single instruction, single data (SISD) processor in Flynn's
taxonomy. The Intel i486 is an example of a scalar processor .
(Single instruction stream, single data stream (SISD)) ( Intel i486 )
• CISC stands for Complex Instruction Set Computer. It comprises a complex instruction set and
uses a variable-length instruction format.
• The CISC approach attempts to minimize the number of instructions per program, but at the
cost of an increase in the number of cycles per instruction.
• It emphasizes building complex instructions directly into the hardware, on the premise that
hardware is faster than software. CISC chips are relatively slower per instruction than RISC
chips, but a program needs fewer instructions than on RISC. Examples of CISC processors are the
VAX, the IBM System/360, and Intel/AMD x86 processors.
• It has a large collection of instructions, ranging from simple to very complex and specialized
at the assembly-language level; the complex ones take a long time to execute.
• CISC architectures range from complex mainframe computers to simple microcontrollers, and
memory load and store operations are not separated from arithmetic instructions.
( CISC Architecture )
• The CISC architecture helps reduce program code size by embedding multiple operations in each
program instruction, which makes the CISC processor more complex.
• CISC architecture-based computers were designed to decrease memory costs: large programs
require large memory space to store their instructions, and a large amount of memory was
expensive, so denser code kept the memory requirement (and hence the cost) down.
Characteristics of CISC
1. The code is short, so it requires very little RAM.
2. CISC (complex) instructions may take longer than a single clock cycle to execute.
3. Fewer instructions are needed to write an application.
4. It provides easier programming in assembly language.
5. Support for complex data structures and easy compilation of high-level languages.
6. It is composed of fewer registers and more addressing modes, typically 5 to 20.
7. Instructions can be larger than a single word.
8. It emphasizes building instructions in hardware, on the premise that hardware is faster than
software.
RISC Scalar Processor
• RISC stands for Reduced Instruction Set Computer, a microprocessor architecture with a small,
highly optimized set of instructions. It is built to minimize instruction execution time by
simplifying and limiting the number of instructions.
• Ideally each instruction requires only one clock cycle, and each cycle passes through three
stages: fetch, decode and execute.
• Complex operations are performed by combining simpler instructions. RISC chips require fewer
transistors, which makes them cheaper to design and reduces instruction execution time.
• Examples of RISC processors are SUN's SPARC, PowerPC (601), Microchip PIC processors, and
RISC-V.
( RISC Architecture )
Characteristics:
One-cycle execution time: RISC processors aim for a CPI (cycles per instruction) of one; each
instruction passes through the fetch, decode and execute stages.
Pipelining technique: the pipelining technique is used in RISC processors to overlap the stages
of multiple instructions so that they execute more efficiently.
A large number of registers: RISC processors provide many registers that hold operands and
intermediate results, which lets the processor respond quickly and minimizes interaction with
the slower main memory.
It uses LOAD and STORE instructions to access memory locations.
RISC Instruction Set Addressing
RISC instructions operate on processor registers only: an arithmetic or logic instruction must
take its operands either from processor registers or directly from the instruction itself.
For example, in both instructions below the operands are in registers:
Add R2, R3
Add R2, R3, R4
An operand can also be given directly in the instruction:
Add R2, 100
Initially, at the start of a program's execution, all operands are in memory. To access memory
operands, the RISC instruction set provides Load and Store instructions.
The Load instruction brings an operand from memory into a processor register and has the form:
Load destination, source
Example: Load R2, A loads register R2 with the contents of memory location A.
The corresponding Store instruction (e.g. Store R2, A) stores the contents of register R2 into
memory location A.
RISC Instruction Addressing Types
1. Immediate addressing mode: the operand is explicitly specified in the instruction itself.
Add R4, R2, #200 — add 200 to the contents of R2 and store the result in R4.
2. Register addressing mode: the instruction names the registers holding the operands.
Add R3, R3, R4 — add the contents of register R4 to the contents of register R3 and store the result in R3.
3. Absolute addressing mode: the instruction names a memory location. It is used to declare
global variables in the program.
Integer A, B, SUM — this declaration allocates memory for the variables A, B and SUM.
4. Register indirect addressing mode: the instruction names a register that holds the address
of the actual operand.
Load R2, (R3) — load register R2 with the contents of the memory location whose address is in register R3.
5. Index addressing mode: the instruction names a register to whose contents a constant is added
to obtain the address of the actual operand.
Load R2, 4(R3) — load register R2 with the contents of the location obtained by adding 4 to the
contents of register R3.
(A small sketch of these modes follows.)
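Here is a toy model of the addressing modes listed above; the register contents, memory contents and addresses are invented purely for illustration:

```python
# A toy Python model of the RISC addressing modes. All values are hypothetical.

regs = {"R2": 0, "R3": 500, "R4": 0}
mem  = {500: 11, 504: 22, 508: 0}

regs["R4"] = regs["R2"] + 200          # 1. Immediate:         Add R4, R2, #200  -> R4 = 200
regs["R2"] = regs["R3"] + regs["R4"]   # 2. Register:          Add R2, R3, R4    -> R2 = 700
A = 508                                # 3. Absolute: A names a fixed (global) memory location
mem[A] = regs["R4"]                    #    e.g. a store to the global variable A
regs["R2"] = mem[regs["R3"]]           # 4. Register indirect: Load R2, (R3)     -> R2 = 11
regs["R2"] = mem[regs["R3"] + 4]       # 5. Index:             Load R2, 4(R3)    -> R2 = 22

print(regs, mem)
```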
CISC vs. RISC: a CISC program needs fewer instructions, but each instruction is complex and may
take several cycles; a RISC program needs more instructions, but each one is simple, optimized,
and typically completes in one cycle.
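As a rough, made-up numerical illustration of that trade-off (the instruction counts and CPI values are not measurements of any real processor):

```python
# Invented numbers only: one hypothetical program compiled for a CISC-style and
# a RISC-style machine, both clocked at 2 GHz.
clock_hz = 2.0e9

cisc_time = ( 50_000_000 * 4.0) / clock_hz   # fewer instructions, but ~4 cycles each
risc_time = (150_000_000 * 1.1) / clock_hz   # ~3x the instructions, but ~1 cycle each

print(f"CISC-style: {cisc_time * 1e3:.1f} ms")   # 100.0 ms
print(f"RISC-style: {risc_time * 1e3:.1f} ms")   #  82.5 ms
```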
Superscalar Processor
A superscalar processor is a CPU that implements a form of parallelism called instruction-level
parallelism within a single processor. The concept of the superscalar issue was first
developed as early as 1970 (Tjaden and Flynn, 1970). It was later reformulated more
precisely in the 1980s (Torng, 1982, Acosta et al, 1986).
In contrast to a scalar processor, which can execute at most one single instruction per clock
cycle, a superscalar processor can execute more than one instruction during a clock cycle by
simultaneously dispatching multiple instructions to different execution units on the processor.
It therefore allows more throughput (the number of instructions that can be executed in a unit
of time) than would otherwise be possible at a given clock rate.
Superscalar design techniques involve parallel instruction decoding, parallel register renaming,
speculative execution, and out-of-order execution. Each execution unit is not a separate processor
(or a core if the processor is a multi-core processor), but an execution resource within a single CPU
such as an arithmetic logic unit.
In Flynn's taxonomy,
Single-core superscalar processor is classified as an SISD processor (single instruction
stream, single data stream),
Multi-core superscalar processor is classified as an MIMD processor (multiple instruction
streams, multiple data streams).
Remember that superscalar execution and pipelining are different performance-enhancement
techniques. The former executes multiple instructions in parallel by using multiple execution
units, whereas the latter overlaps multiple instructions within the same execution unit by
dividing execution into stages.
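A minimal sketch of the throughput difference, ignoring dependencies and hazards (the instruction labels are placeholders):

```python
# Ideal-case cycle counts for scalar (1-wide) versus 2-wide superscalar issue.

instructions = ["i1", "i2", "i3", "i4", "i5", "i6"]

def cycles_needed(n_instructions, issue_width):
    # Every cycle, up to issue_width independent instructions are issued.
    return -(-n_instructions // issue_width)   # ceiling division

print("scalar     :", cycles_needed(len(instructions), 1), "cycles")  # 6
print("superscalar:", cycles_needed(len(instructions), 2), "cycles")  # 3
```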
Processor board of a CRAY T3e supercomputer
with four superscalar Alpha 21164 processors
Basic Superscalar Processor Architecture
Superscalar Advantages:
The compiler can avoid many hazards through judicious selection and ordering of instructions.
In general, high performance is achieved if the compiler is able to arrange program instructions
to take maximum advantage of the available hardware units.
The compiler should strive to interleave floating-point and integer instructions; this enables
the dispatch unit to keep both the integer and floating-point units busy most of the time.

Superscalar Disadvantages:
In a superscalar processor, the detrimental effect of the various hazards on performance
becomes even more pronounced.
Due to this type of architecture, scheduling problems can occur.
VLIW Architecture
The limitations of the superscalar processor become prominent as instruction scheduling grows
more complex. The problems of extracting the intrinsic parallelism in the instruction stream, of
hardware complexity and cost, and of branch instructions are addressed by a different instruction
set architecture: the Very Long Instruction Word (VLIW) machine.
VLIW exploits instruction-level parallelism: the program itself controls the parallel execution
of the instructions.
In other architectures, processor performance is improved by one of the following methods:
pipelining (break each instruction into sub-steps) or a superscalar organization (independently
execute instructions in different parts of the processor).
The VLIW architecture instead relies on the compiler: the compiler decides the parallel flow of
the instructions and resolves conflicts. This increases compiler complexity but greatly decreases
hardware complexity.
( Block Diagram of VLIW Architecture)
• The processors in this architecture have multiple functional units and fetch, from the
instruction cache, a Very Long Instruction Word.
• Multiple independent operations are grouped together in a single VLIW instruction and are
initiated in the same clock cycle.
• Each operation is assigned an independent functional unit.
• All the functional units share a common register file.
• Instruction scheduling and parallel dispatch of the word are done statically by the compiler.
• The compiler checks for dependencies before scheduling parallel execution of the instructions
(see the sketch below).
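As an illustration of the idea only (the slot names and operations are invented, and real VLIW encodings are binary words, not Python dictionaries), a bundle of independent operations issued in one cycle might be modelled like this:

```python
# A sketch of a VLIW bundle: the "compiler" statically packs independent
# operations into one long word; the hardware issues each slot to its unit.

bundle = {                      # one very long instruction word, three slots
    "integer_alu": ("ADD",  "R1", "R2", "R3"),
    "float_unit":  ("FMUL", "F0", "F1", "F2"),
    "load_store":  ("LOAD", "R4", "0(R5)"),
}

# At run time every slot is dispatched in the same clock cycle, one per unit.
# There is no hardware dependency checking, because the compiler has already
# guaranteed that the three operations are independent.
for unit, operation in bundle.items():
    print(f"cycle 0: {unit:<11} executes {operation}")
```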
VLIW Advantages:
Reduces hardware complexity, and reduces power consumption because of that reduced hardware
complexity.
Since the compiler takes care of data-dependency checking, decoding and instruction issue, the
hardware becomes a lot simpler.
Increases the potential clock rate.

VLIW Disadvantages:
Complex compilers are required, which are hard to design.
Program code size increases.
Larger memory bandwidth and register-file bandwidth are needed.
An unscheduled event, for example a cache miss, can lead to a stall that stalls the entire
processor.
Vector Processors
• A vector processor is a central processing unit that can execute an operation on a complete
vector of input data with a single instruction. In other words, it is a unit of hardware
resources that operates on a sequence of data items stored at successive memory addresses.
Architecture:
• The IPU (Instruction Processing Unit) fetches the instruction from memory.
• If the instruction is scalar in nature, it is transferred to the scalar register and scalar
processing is performed. If it is vector in nature, it is fed to the vector instruction register.
• The vector instruction controller first decodes the vector instruction and then determines the
addresses of the vector operands in memory.
• It then signals the vector access controller with the demand for the respective operands. The
vector access controller fetches the desired operands from memory; once fetched, they are
provided to the vector processor for processing.
(Block Diagram of Vector Processors Computing)
Classification of Vector Processors
1. Register to Register Architecture
• In register-to-register architecture, operands and results are accessed through a large number
of vector registers and scalar registers rather than directly from main memory.
• Examples: processors such as the Cray-1 and the Fujitsu VP-200.
The main points about register-to-register architecture are:
1. The vector registers are of limited size.
2. Speed is very high compared to the memory-to-memory architecture.
3. The hardware cost is higher in this architecture.
2. Memory to Memory Architecture
• Here the operands and results are fetched directly from memory rather than through registers.
• This architecture enables data of size 512 bits to be fetched from memory into the pipeline.
However, due to the high memory access time, the pipelines of such vector computers require a
longer startup time, as more time is needed to initiate a vector instruction. Example: the CDC
Cyber 205.
Advantages
• Vector instructions improve the code density of a program.
• The sequential arrangement of data helps the hardware handle the data more efficiently.
• Vector processing reduces the required instruction bandwidth.
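A toy comparison of the scalar and vector styles of execution, using plain Python lists to stand in for vector registers (the data values are arbitrary):

```python
# Scalar execution issues one add per element; vector execution conceptually
# applies a single vector add (e.g. VADD V3, V1, V2) to whole vector registers.

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]

# Scalar processor: one instruction per element, issued one at a time.
c_scalar = []
for i in range(len(a)):
    c_scalar.append(a[i] + b[i])

# Vector processor: one vector instruction operating on all elements at once.
c_vector = [x + y for x, y in zip(a, b)]

print(c_scalar == c_vector)   # True
```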
Symbolic Processors
Symbolic processors are designed for expert systems, machine intelligence, knowledge-based
systems, pattern recognition, text retrieval, etc. Symbolic processors are also called LISP
processors or PROLOG processors.
Attribute — Characteristics
Common operations — Search, sort, pattern matching, unification
Memory requirement — Large memory with intensive access patterns
Properties of algorithms — Parallel and distributed, irregular in pattern
Input/output requirements — Graphical/audio/keyboard; user-guided programs; machine interface
Architecture features — Parallel update, dynamic load balancing and memory allocation
Knowledge representation — Lists, relational databases, semantic nets, frames, production systems
• For example, a Lisp program can be viewed as a set of functions in which data are passed from
function to function. The concurrent execution of these functions forms the basis for
parallelism.
• The applicative and recursive nature of Lisp requires an environment that efficiently supports
stack computations and function calling. The use of linked lists as the basic data structure
makes it possible to implement an automatic garbage collection mechanism.
• Instead of dealing with numerical data, symbolic processing deals with logic programs,
symbolic lists, objects, scripts, blackboards, production systems, semantic networks, frames and
artificial neural networks. Primitive operations for artificial intelligence include search,
logic inference, pattern matching and unification.
• Example: The Symbolics 3600 Lisp processor
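To give a flavour of symbolic (as opposed to numeric) processing, here is a tiny, hypothetical pattern-matching/unification sketch over list-structured facts, loosely in the spirit of Lisp/Prolog; the fact format and helper function are invented:

```python
# Symbolic processing example: match a pattern with variables against facts.

facts = [("parent", "alice", "bob"),
         ("parent", "bob", "carol")]

def match(pattern, fact, bindings):
    """Unify a pattern like ('parent', '?x', 'bob') with a fact, extending bindings."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):              # a variable: bind it, or check consistency
            if bindings.get(p, f) != f:
                return None
            bindings[p] = f
        elif p != f:                       # a constant: must match exactly
            return None
    return bindings

# Query: who is a parent of bob?
for fact in facts:
    result = match(("parent", "?x", "bob"), fact, {})
    if result:
        print(result)                      # {'?x': 'alice'}
```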
Architecture of the Symbolics 3600 Lisp Processor
• This was a stack-oriented machine. The division of the overall architecture into layers
allowed the use of a simplified instruction-set design, while the implementation was carried out
with a stack-oriented machine.
• Since most operands were fetched from the stack, the stack buffer and scratch-pad memories
were implemented as fast caches to main memory.
• The Symbolics 3600 executed most Lisp instructions in one machine cycle. Integer instructions
fetched operands from the stack buffer and from the duplicated top of the stack held in the
scratch-pad memory.
THANK YOU
  • 39. Architecture of Symbolic 3600 Lisp processor • This was a stack-oriented machine. The division of the overall architecture into layers allowed the use of a simplified instruction-set design, while implementation was carried out with a stack-oriented machine. • Since most operands were fetched from the stack, the stack buffer and scratch-pad memories were implemented as fast caches to main memory. • The Symbolic 3600 executed most Lisp instructions in one machine cycle. Integer instructions fetched operands form the stack buffer and the duplicate top of the stack in the scratch-pad memory.