Processors
SYSTEM ON CHIP ARCHITECTURE
UNIT --- II
Introduction to the processor
 Processors come in many types and with many intended uses.
 This chapter contains details about processor design issues, especially for
advanced processors in high-performance applications.
 Controllers, embedded controllers, and digital signal processors (DSPs) are the
dominant processor types, providing the focus for much of the processor design
effort.
 The market for SOCs and larger microcontrollers is growing at almost three times
the rate of that for microprocessor units (MPUs in Figure 3.2).
 Especially in SOC-type applications, the processor itself is a small component
occupying just a few percent of the die. SOC designs often use many different
types of processors to suit the application.
PROCESSOR SELECTION FOR SOC
 For many SOC design situations, the selection of the processor is the most obvious task
and, in some ways, the most restricted.
 The processor must run specific system software, so at least a core processor,
usually a general-purpose processor (GPP), must be selected for this function.
 Figure 3.3 shows the processor model used in the initial design process.
 1. Define the Application Requirements
 Before selecting a processor, determine:
 Target Device: Smartphone, IoT device, automotive system, embedded
system, etc.
 Performance Needs: High performance (gaming, AI), balanced
(smartphones, laptops), or low-power (IoT, wearables).
 Power Constraints: Battery-powered (low power) vs. plug-in devices (higher
power acceptable).
 Connectivity: 5G, Wi-Fi, Bluetooth, or wired interfaces.
 Security: Secure Boot, TPM, or dedicated security cores.
 Soft Processors
 The term “soft core” refers to an instruction processor design in bitstream
format that can be used to program a field-programmable gate array (FPGA)
device. The four main reasons for using such designs, despite their large
area–power–time cost, are
 1. cost reduction in terms of system-level integration,
 2. design reuse in cases where multiple designs are really just variations on one,
 3. creating an exact fit for a microcontroller/peripheral combination, and
 4. providing future protection against discontinued microcontroller variants.
BASIC CONCEPTS IN PROCESSOR ARCHITECTURE
 The processor architecture consists of the instruction set of the processor.
 While the instruction set implies many implementation (microarchitecture) details, the resulting
implementation is a great deal more than the instruction set.
 It is the synthesis of the physical device limitations with area–time–power trade-offs to
optimize specified user requirements.
 Instruction Set
 The instruction set for most processors is based upon a register set to hold operands and
addresses.
 The register set size varies from 8 to 64 words or more, each word consisting of 32 to 64 bits.
 An additional set of floating-point registers (32 to 128 bits) is usually also available.
 Common instruction sets can be classified by format differences into two basic types, the
load–store (L/S) architecture and the register–memory (R/M) architecture.
 The L/S class of instruction sets includes the RISC microprocessors. Arguments must be
in registers before execution.
 A Load/Store (L/S) architecture is a type of Reduced Instruction Set Computing
(RISC) design where memory access is restricted to specific Load (L) and Store
(S) instructions. Unlike Complex Instruction Set Computing (CISC) architectures
(e.g., x86), where data can be processed directly from memory, L/S
architectures require data to be first loaded into registers before processing.
 A Register-Memory (R/M) instruction set refers to a type of CPU architecture where
arithmetic and logic instructions can take an operand directly from memory without
requiring a separate load instruction. This contrasts with Load/Store (RISC)
architectures, where operands must first be loaded into registers.
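The difference between the two styles can be sketched with a toy machine. This is only an illustration: the opcodes LOAD, STORE, ADD, and ADDM, the register names, and the dictionary-based memory are all hypothetical and do not correspond to any real ISA. Both programs compute C = A + B; the L/S version needs an explicit load for each operand, while the R/M version folds one memory access into the add.

```python
# Toy machine: memory and registers are plain dicts (an illustrative assumption,
# not a model of any real ISA). Both programs compute mem["C"] = mem["A"] + mem["B"].

LOAD_STORE_PROGRAM = [
    ("LOAD", "r1", "A"),        # r1 <- mem[A]
    ("LOAD", "r2", "B"),        # r2 <- mem[B]
    ("ADD", "r3", "r1", "r2"),  # r3 <- r1 + r2 (register operands only)
    ("STORE", "r3", "C"),       # mem[C] <- r3
]

REGISTER_MEMORY_PROGRAM = [
    ("LOAD", "r1", "A"),        # r1 <- mem[A]
    ("ADDM", "r1", "B"),        # r1 <- r1 + mem[B] (memory operand allowed)
    ("STORE", "r1", "C"),       # mem[C] <- r1
]


def run(program, mem):
    """Execute a toy program; return the number of instructions executed."""
    reg = {}
    for op, *args in program:
        if op == "LOAD":
            reg[args[0]] = mem[args[1]]
        elif op == "STORE":
            mem[args[1]] = reg[args[0]]
        elif op == "ADD":
            reg[args[0]] = reg[args[1]] + reg[args[2]]
        elif op == "ADDM":
            reg[args[0]] = reg[args[0]] + mem[args[1]]
        else:
            raise ValueError(f"unknown opcode {op}")
    return len(program)


mem_ls = {"A": 2, "B": 3, "C": 0}
mem_rm = {"A": 2, "B": 3, "C": 0}
print(run(LOAD_STORE_PROGRAM, mem_ls), mem_ls["C"])       # 4 instructions, C == 5
print(run(REGISTER_MEMORY_PROGRAM, mem_rm), mem_rm["C"])  # 3 instructions, C == 5
```

The R/M program is shorter, but its ADDM instruction does more work per instruction, which is the kind of trade-off the two styles make.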
Trade-offs in Instruction Set Architecture (ISA)
 The Instruction Set Architecture (ISA) defines how a processor executes
instructions, impacting performance, power efficiency, and complexity.
Different ISAs, such as RISC (Reduced Instruction Set Computing) and CISC
(Complex Instruction Set Computing), have trade-offs based on design
goals.
Interrupts and Exceptions
 Interrupts and exceptions allow a processor to respond to events such as hardware
signals, errors, and system calls. These mechanisms help in efficient multitasking, error
handling, and real-time processing.
 Types of Interrupts
 A. Hardware Interrupts (Triggered by external devices)
 Maskable Interrupts (IRQ) → Can be ignored or delayed by disabling interrupts.
 Non-Maskable Interrupts (NMI) → Cannot be ignored (e.g., power failure).
 Interrupt Requests (IRQs) → Devices send requests via the Interrupt Controller (PIC/APIC).
How Interrupts Are Handled?
 Interrupt Occurs → Device or software sends an interrupt signal.
 Processor Saves State → Stores registers and program counter (PC).
 Interrupt Vector Table (IVT) Lookup → Finds the correct handler for the
interrupt.
 Interrupt Service Routine (ISR) Executes → Handles the event (e.g., reading
from a device).
 Processor Restores State → Resumes execution of the interrupted program.
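The five steps above can be sketched in software. This is a minimal model under stated assumptions: the ToyCPU class, its fields, and the timer_isr handler are hypothetical names invented for illustration, and a real processor performs the save, lookup, and restore in hardware or low-level firmware rather than in Python.

```python
class ToyCPU:
    """Hypothetical CPU model: only the state needed to show interrupt entry and exit."""

    def __init__(self):
        self.pc = 0
        self.regs = {"r0": 0, "r1": 0}
        self.saved_state = []      # stack of (pc, registers) snapshots
        self.vector_table = {}     # interrupt number -> handler (ISR)
        self.log = []

    def register_isr(self, irq, handler):
        self.vector_table[irq] = handler

    def raise_interrupt(self, irq):
        # Steps 1-2: the interrupt occurs; save the program counter and registers.
        self.saved_state.append((self.pc, dict(self.regs)))
        # Step 3: look up the handler in the interrupt vector table.
        isr = self.vector_table[irq]
        # Step 4: run the interrupt service routine.
        isr(self)
        # Step 5: restore state so the interrupted program resumes where it left off.
        self.pc, self.regs = self.saved_state.pop()


def timer_isr(cpu):
    cpu.regs["r0"] = 999           # the ISR may overwrite registers freely
    cpu.log.append("timer tick handled")


cpu = ToyCPU()
cpu.register_isr(0, timer_isr)
cpu.pc, cpu.regs["r0"] = 0x40, 7
cpu.raise_interrupt(0)
print(hex(cpu.pc), cpu.regs["r0"], cpu.log)
```

Because the state is saved before the handler runs and restored afterwards, the interrupted program sees the same program counter and register values it had before the interrupt.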
Types of Exceptions
 A. Faults (recoverable; the faulting instruction is restarted after handling)
 Page Fault (accessing a page that is not currently mapped in memory).
 Divide-by-Zero Error (an arithmetic error).
 B. Traps (handled immediately; execution continues with the next instruction)
 System Calls (used by programs to request OS services, switching from user mode to kernel mode).
 C. Aborts (serious errors that terminate execution)
 Hardware failures (e.g., memory corruption).
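The practical difference between the three classes is where execution resumes. The sketch below assumes a made-up instruction stream with three hypothetical operations ("touch" a memory page, "syscall", and "hw_error"); it is not a model of any real processor, only a way to show the restart, continue, and terminate behaviors side by side.

```python
def run(program, mapped_pages):
    """Run a toy instruction list and return a trace of what happened.

    program      -- list of (operation, argument) tuples (hypothetical operations)
    mapped_pages -- set of page numbers that are currently accessible
    """
    pc = 0
    trace = []
    while pc < len(program):
        op, arg = program[pc]
        if op == "touch":
            if arg not in mapped_pages:
                # Fault: the handler fixes the cause, then the SAME instruction restarts.
                trace.append(f"fault@{pc}")
                mapped_pages.add(arg)
                continue
            trace.append(f"touch {arg}")
        elif op == "syscall":
            # Trap: handled at once, then execution continues with the NEXT instruction.
            trace.append(f"trap@{pc}")
        elif op == "hw_error":
            # Abort: unrecoverable, so execution terminates here.
            trace.append(f"abort@{pc}")
            break
        pc += 1
    return trace


program = [("touch", 3), ("syscall", "write"), ("hw_error", None), ("touch", 4)]
print(run(program, set()))
# ['fault@0', 'touch 3', 'trap@1', 'abort@2']
```

The page access at index 0 appears twice in the trace (once as a fault, once succeeding), the system call appears once, and the final instruction never runs because the abort ends execution.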
How Exceptions Are Handled?
 CPU detects an exception (e.g., invalid instruction, divide-by-zero).
 Exception Vector Table Lookup → Finds the correct handler.
 Exception Handler Executes → Fixes the issue or terminates the program.
 Interrupts allow devices and software to interact with the CPU asynchronously.
 Exceptions handle errors and system events synchronously.
 Efficient handling using interrupt controllers and prioritization is crucial for
performance.
 Modern CPUs optimize interrupt handling using vectored and fast interrupts.
 Interrupts and Exceptions Using Condition Codes
 Condition codes (also called status flags) are special bits in a processor's status register
(flag register) that indicate the result of arithmetic and logical operations. These flags help
determine whether an interrupt or exception should be triggered.
 Common Condition Codes in a Processor
 Zero Flag (ZF) – Set when the result of an operation is zero.
 Carry Flag (CF) – Set when an arithmetic operation results in a carry (for unsigned numbers).
 Overflow Flag (OF) – Set when an arithmetic operation results in an overflow (for signed
numbers).
 Sign Flag (SF) – Set if the result of an operation is negative.
 Parity Flag (PF) – Set if the result has an even number of 1s in binary.
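As a worked illustration of how these flags are derived, the sketch below adds two 8-bit values and computes each flag from the result. The function name add8_flags and the 8-bit width are assumptions chosen to keep the example small; the flag definitions follow the list above.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (8-bit result, dict of condition codes)."""
    a &= 0xFF
    b &= 0xFF
    full = a + b                 # up to 9 bits wide
    result = full & 0xFF         # what fits in the 8-bit destination
    flags = {
        "ZF": result == 0,                        # result is zero
        "CF": full > 0xFF,                        # carry out of bit 7 (unsigned overflow)
        # Signed overflow: both operands have the same sign bit,
        # and the result's sign bit differs from it.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),
        "SF": bool(result & 0x80),                # sign bit of the result
        "PF": bin(result).count("1") % 2 == 0,    # even number of 1 bits
    }
    return result, flags


print(add8_flags(0x7F, 0x01))  # 127 + 1: signed overflow, result looks negative
print(add8_flags(0xFF, 0x01))  # 255 + 1: unsigned carry, result wraps to zero
```

The two calls show why CF and OF are separate flags: 0x7F + 0x01 overflows as a signed addition without producing a carry, while 0xFF + 0x01 produces a carry without a signed overflow.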
BASIC CONCEPTS IN PROCESSOR MICROARCHITECTURE
 Almost all modern processors use an instruction execution pipeline design.
Simple processors issue only one instruction for each cycle; others issue many.
 Many embedded and some signal processors use a simple
issue-one-instruction-per-cycle design approach.
 But the bulk of modern desktop, laptop, and server systems issue multiple
instructions for each cycle.
 Every processor (Figure 3.7) has a memory system, execution unit (data
paths), and instruction unit.
 The pipeline mechanism or control has many possibilities. Potentially, it can
execute one or more instructions for each cycle. Instructions may or may not be
decoded and/or executed in program order.
 Regardless of the type of pipeline, “breaks” or delays are the major limit on
performance.
 1. Pipeline
 A technique used to improve instruction throughput by breaking execution into
stages.
 Common pipeline stages: Fetch, Decode, Execute, Memory Access, Write back.
 Example: A 5-stage pipeline in RISC processors.
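The throughput benefit can be made concrete with a small timing model. The sketch assumes an ideal pipeline with no stalls, in which each instruction enters one cycle after the previous one; the stage abbreviations (IF, ID, EX, MEM, WB) and the function names are illustrative choices. Under that assumption, n instructions on a k-stage pipeline finish in k + n - 1 cycles, compared with n × k cycles if each instruction ran to completion before the next began.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # Fetch, Decode, Execute, Memory Access, Write back


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return {instruction index: {cycle: stage}} for an ideal, stall-free pipeline."""
    return {
        i: {i + s + 1: name for s, name in enumerate(stages)}
        for i in range(n_instructions)
    }


def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to finish all instructions when one enters per cycle."""
    return n_stages + n_instructions - 1 if n_instructions else 0


schedule = pipeline_schedule(4)
for i, timing in schedule.items():
    row = ["    "] * total_cycles(4)
    for cycle, stage in timing.items():
        row[cycle - 1] = stage.ljust(4)
    print(f"I{i}: " + " ".join(row))

print("pipelined cycles:  ", total_cycles(4))   # 5 + 4 - 1 = 8
print("unpipelined cycles:", 4 * len(STAGES))   # 4 * 5 = 20
```

The printed diagram shows each instruction shifted one cycle to the right of its predecessor, so that up to five instructions are in flight at once.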
 Instruction unit
 The Instruction Register (IR) is a special-purpose register in a computer's central
processing unit (CPU) that holds the instruction currently being executed or
decoded. It is a crucial part of the instruction cycle in a computer.
 Functions of the Instruction Register:
 Holds the Current Instruction – The IR temporarily stores the machine instruction
fetched from memory.
 Instruction Buffer
 An Instruction Buffer is a temporary storage unit in a CPU that holds multiple
instructions before they are executed. It is mainly used to improve instruction
processing speed by reducing delays in fetching instructions from memory.
 Execution Unit (EU) in a CPU
 The Execution Unit (EU) is the part of the CPU responsible for processing and
executing instructions. It works in conjunction with the Control Unit (CU), which
fetches and decodes instructions before passing them to the Execution Unit.
 Components of the Execution Unit
 Arithmetic Logic Unit (ALU)
 Performs arithmetic (addition, subtraction, multiplication, division).
 Executes logical operations (AND, OR, XOR, NOT).
 Handles bitwise shifts, comparisons, and Boolean logic.
 Floating-Point Unit (FPU)
 Specializes in floating-point arithmetic (operations on real numbers with fractional parts).
 Typically follows the IEEE 754 standard for floating-point arithmetic.
BASIC ELEMENTS IN INSTRUCTION HANDLING
 An instruction unit consists of the state registers as defined by the instruction set
— the instruction register — plus the instruction buffer, decoder, and an interlock
unit.
 The instruction buffer's function is to fetch instructions into registers so that
instructions can be rapidly brought into a position to be decoded.
 The decoder has the responsibility for controlling the cache, ALU, registers, and
so on.
 Interlock in a Processor
 In a processor, an interlock is a hardware or control mechanism that prevents a
subsequent instruction from executing until a previous instruction has completed,
thereby avoiding data hazards or structural conflicts.
Instruction decoder and interlocks
 Instruction Decoder in a Processor
 The Instruction Decoder is a key component of the Control Unit (CU) in a CPU. It
translates the machine code instructions fetched from memory into signals that control
other parts of the processor, such as the Arithmetic Logic Unit (ALU), registers, and
memory access units.
 1. Role of the Instruction Decoder
 The Instruction Decoder is responsible for:
 Decoding the instruction opcode – Identifies the type of operation (ADD, SUB, LOAD, etc.).
 Extracting operands – Determines source and destination registers or memory
addresses.
 Generating control signals – Activates ALU, memory, and register operations.
 Determining instruction format – Identifies whether it is R-type, I-type, etc., and
whether the instruction requires immediate values, registers, or memory access.
 How the Instruction Decoder Works
 Instruction Fetch – The instruction is fetched from memory by the Instruction
Fetch Unit (IFU).
 Instruction Decode – The Instruction Decoder analyzes the opcode and
operands.
 Control Signal Generation – The decoder sends signals to activate ALU,
registers, or memory units.
 Execution – The decoded instruction is executed in the Execution Unit (EU).
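The field-extraction part of decoding can be sketched as follows. The slides do not name a specific instruction set, so this example assumes the classic 32-bit MIPS field layout as a familiar instance of R-type and I-type formats; the decode function and its dictionary output are hypothetical, J-type instructions are ignored, and the immediate is left unsigned for brevity.

```python
def decode(word):
    """Split a 32-bit instruction word into fields (MIPS-style layout, assumed here)."""
    opcode = (word >> 26) & 0x3F   # bits 31..26 select the operation class
    rs = (word >> 21) & 0x1F       # bits 25..21: first source register
    rt = (word >> 16) & 0x1F       # bits 20..16: second source or destination
    if opcode == 0:
        # R-type: all operands are registers; 'funct' picks the ALU operation.
        return {
            "format": "R",
            "rs": rs,
            "rt": rt,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    # I-type: one operand is a 16-bit immediate held in the instruction itself.
    return {"format": "I", "opcode": opcode, "rs": rs, "rt": rt, "imm": word & 0xFFFF}


print(decode(0x012A4020))  # add  $t0, $t1, $t2  (registers 8, 9, 10)
print(decode(0x21280005))  # addi $t0, $t1, 5
```

A hardware decoder does the same job with wiring rather than shifts and masks: each field is a fixed slice of the instruction word, which is what makes fixed-format instruction sets quick to decode.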
 Interlock in Instruction Decoder
 An interlock in an instruction decoder is a control mechanism that
prevents the execution of an instruction until all necessary conditions (like
data availability, resource availability, or hazard resolution) are met. This
helps avoid errors due to pipeline hazards or data dependencies.
 1. Why Interlocks Are Needed in Instruction Decoding
 The instruction decoder translates binary machine code into control signals
for execution. However, certain conditions may require an instruction to
pause (stall) before proceeding:
 Data Dependency (RAW Hazard) – An instruction requires the result of a
previous instruction, but the result isn't ready.
 Structural Hazard – The required hardware (ALU, register file, etc.) is already
in use.
 Control Hazard (Branching Issue) – The processor does not yet know which instruction
to execute next because a branch has not been resolved (branch prediction can reduce this cost).
 Memory Read/Write Delays – Load/store instructions may take multiple
cycles to complete.
 To handle these issues, the instruction decoder uses interlocks to insert
stalls or delay execution until safe.
How Interlocks Work in the Instruction Decoder
 Instruction Fetch – The CPU fetches an instruction from memory.
 Instruction Decode – The decoder identifies the instruction type and operands.
 Dependency Check (Hazard Detection Unit) – The decoder checks if the
instruction can execute immediately or needs to wait.
 Interlock Activation (if needed)
 If dependencies exist, an interlock mechanism inserts a stall (delay cycle).
 If no dependency exists, the instruction proceeds to execution.
 Execution / Stall Handling – If stalled, the CPU waits; otherwise, execution
proceeds.
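The dependency check and stall insertion can be sketched with a simple scheduling model. The assumptions are deliberately strong: instructions issue in order, there is no forwarding, and a result becomes readable a fixed number of cycles after its producer issues. The function name, the result_latency parameter, and the register names are hypothetical.

```python
def schedule_with_interlock(program, result_latency=3):
    """Compute issue cycles for an in-order pipeline with a RAW interlock.

    program        -- list of (destination register or None, [source registers])
    result_latency -- cycles from issue until the result can be read (assumed fixed)
    Returns (list of issue cycles, total stall cycles inserted).
    """
    ready_at = {}        # register -> first cycle at which its new value is readable
    issue_cycles = []
    cycle = 0
    stalls = 0
    for dest, sources in program:
        # Dependency check: when is the latest-arriving source operand ready?
        earliest = max([ready_at.get(r, 0) for r in sources], default=0)
        if earliest > cycle:
            # Interlock: hold the instruction in decode until its operands are ready.
            stalls += earliest - cycle
            cycle = earliest
        issue_cycles.append(cycle)
        if dest is not None:
            ready_at[dest] = cycle + result_latency
        cycle += 1
    return issue_cycles, stalls


program = [
    ("r1", ["r2", "r3"]),   # r1 <- r2 op r3
    ("r4", ["r1", "r5"]),   # needs r1: read-after-write dependency on the line above
    ("r6", ["r7"]),         # independent of both
]
print(schedule_with_interlock(program))   # ([0, 3, 4], 2)
```

The second instruction is held for two cycles until r1 is ready, and the independent third instruction also waits behind it because issue is in order. Forwarding or out-of-order issue, which this sketch leaves out, would reduce those stalls.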
Buffers minimizing pipeline delays
 Why Buffers Are Needed in Pipelining?
 When an instruction moves through the pipeline, different stages (Fetch,
Decode, Execute, Memory Access, Write Back) require time and
resources. If an instruction needs data that is not yet available, it creates a
stall (pipeline delay).
 Buffers store intermediate values between stages to prevent stalling.
 They reduce dependency issues by keeping temporary results available for
the next stage.
 They improve instruction throughput, making execution faster.
MORE ROBUST PROCESSORS: VECTOR, VERY LONG INSTRUCTION WORD
(VLIW), AND SUPERSCALAR
 To go below one cycle per instruction (CPI), the processor must be able
to execute multiple instructions at the same time.
 Concurrent processors must be able to make simultaneous accesses to
instruction and data memory and to simultaneously execute multiple
operations.
 Processors that achieve a higher degree of concurrency are called
concurrent processors, short for processors with instruction-level
concurrency.
VECTOR PROCESSORS AND VECTOR INSTRUCTION EXTENSIONS
 Vector instructions boost performance by
 reducing the number of instructions required to execute a program (they
reduce the I-bandwidth); and
 organizing data into regular sequences that can be efficiently handled by
the hardware.
 Vector processing requires extensions to the instruction set, together with
(for best performance) extensions to the functional units, the register sets,
and particularly to the memory of the system.
 Vector processors usually include vector register (VR) hardware to decouple
arithmetic processing from memory.
 Vector Functional Units
 The VRs typically consist of eight or more register sets, each consisting of 16 to 64
vector elements, where each vector element is a floating-point word.
 The VRs access memory with special load and store instructions.
 The vector execution units are usually arranged as an independent
functional unit for each instruction class. These might include
 add/subtract,
 multiplication,
 division or reciprocal, and
 logical operations, including compare.
 Since the purpose of the vector vocabulary is to manage operations over a
vector of operands, once the vector operation is begun, it can continue at
the cycle rate of the system.
 The advantage of vector processing is that fewer instructions are required to
execute the vector operations.
 Vector Registers: What They Are & How They Work
 A vector register is a special type of CPU register that holds entire vectors
(arrays of data) instead of single scalar values. These registers are used in
vector processors and SIMD (Single Instruction, Multiple Data) architectures to
enable parallel processing of multiple data elements in a single instruction.
 Example: Scalar vs. Vector Registers
 Let’s say we want to add two arrays:
 A = [1, 2, 3, 4], B = [5, 6, 7, 8]
 Scalar Processor (Using Scalar Registers)
 Load 1 into a scalar register.
 Load 5 into another scalar register.
 Perform addition (1+5) and store the result.
 Repeat for the remaining elements.
 Takes four separate add instructions (one per element pair), plus the loads and stores for each.
 Vector Processor (Using Vector Registers)
 Load entire A array into a vector register.
 Load entire B array into another vector register.
 Perform vector addition on all elements at once.
 Takes a single vector add instruction; with enough parallel lanes, all 4 additions can proceed together.
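The comparison above can be sketched by counting add instructions rather than cycles, since actual cycle counts depend on the hardware. The function names and the vector_length parameter are illustrative assumptions; the Python loop inside vector_add stands in for work that vector hardware would perform across its lanes.

```python
A = [1, 2, 3, 4]
B = [5, 6, 7, 8]


def scalar_add(a, b):
    """One add instruction per element pair."""
    out, add_instructions = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        add_instructions += 1
    return out, add_instructions


def vector_add(a, b, vector_length=4):
    """One add instruction per vector register's worth of elements."""
    out, add_instructions = [], 0
    for i in range(0, len(a), vector_length):
        chunk_a = a[i:i + vector_length]
        chunk_b = b[i:i + vector_length]
        # A single vector instruction covers every element in the chunk.
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        add_instructions += 1
    return out, add_instructions


print(scalar_add(A, B))   # ([6, 8, 10, 12], 4)
print(vector_add(A, B))   # ([6, 8, 10, 12], 1)
```

Both versions produce the same sums; the vector version issues one add instruction instead of four, which is the reduction in instruction bandwidth that vector instructions provide.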
 Features of Vector Registers
 Store Multiple Data Elements – Instead of a single value, they store an array.
 Optimized for Parallel Execution – Allow SIMD processing.
 Reduce Memory Access Time – Fewer loads and stores compared to scalar registers.
 Accelerate Computation – Common in AI, graphics, and scientific computing.
Vector processor and functionality
Processors topic in system on chip architecture

  • 1. Processors SYSTEM ON CHIP ARCHITECTURE UNIT --- II
  • 2. Introduction to the processor  Processors come in many types and with many intended uses.  This chapter contains details about processor design issues, especially for advanced processors in high - performance applications  Clearly, controllers, embedded controllers, digital signal processors (DSPs), are the dominant processor types, providing the focus for much of the processor design effort.  SOC and larger microcontrollers is growing at almost three times that of microprocessor units (MPUs in Figure 3.2 ).  Especially in SOC type applications, the processor itself is a small component occupying just a few percent of the die. SOC designs often use many different types of processors suiting the application.
  • 8. PROCESSOR SELECTION FOR SOC  For many SOC design situations, the selection of the processor is the most obvious task and, in some ways, the most restricted.  The processor must run a specific system software, so at least a core processor — usually a general purpose processor (GPP) — must be selected for this function.
  • 9. Processor selection for soc  Figure 3.3 shows the processor model used in the initial design process.  1. Define the Application Requirements  Before selecting a processor, determine:  Target Device: Smartphone, IoT device, automotive system, embedded system, etc.  Performance Needs: High performance (gaming, AI), balanced (smartphones, laptops), or low-power (IoT, wearables).  Power Constraints: Battery-powered (low power) vs. plug-in devices (higher power acceptable).  Connectivity: 5G, Wi-Fi, Bluetooth, or wired interfaces.  Security: Secure Boot, TPM, or dedicated security cores.
  • 11. Processor selection for soc  Soft Processors  The term “ soft core ” refers to an instruction processor design in bit stream format that can be used to program a field programmable gate array (FPGA)  device. The 4 main reasons for using such designs, despite their large area power – time cost, are  1. cost reduction in terms of system - level integration,  2. design reuse in cases where multiple designs are really just variations on one,  3. creating an exact fi t for a microcontroller/peripheral combination, and  4. providing future protection against discontinued microcontroller variants.
  • 13. BASIC CONCEPTS IN PROCESSOR ARCHITECTURE  The processor architecture consists of the instruction set of the processor.  While the instruction set implies many implementation (microarchitecture) details, the resulting implementation is a great deal more than the instruction set.  It is the synthesis of the physical device limitations with area – time – power trade - offs to optimize specified user requirements.  Instruction Set  The instruction set for most processors is based upon a register set to hold operands and addresses  The register set size varies from 8 to 64 words or more, each word consisting of 32 – 64 bits.  An additional set of floating - point registers (32 – 128 bits) is usually also available.  Common instruction sets can be classified by format differences into two basic types, the load – store ( L/S ) architecture and the register – memory ( R/M ) architecture:
  • 16. BASIC CONCEPTS IN PROCESSOR ARCHITECTURE  The L/S instruction set includes the RISC microprocessors. Arguments must be in registers before execution.  A Load/Store (L/S) architecture is a type of Reduced Instruction Set Computing (RISC) design where memory access is restricted to specific Load (L) and Store (S) instructions. Unlike Complex Instruction Set Computing (CISC) architectures (e.g., x86), where data can be processed directly from memory, L/S architectures require data to be first loaded into registers before processing.  A Register-Memory Instruction Set refers to a type of CPU architecture where instructions can operate directly on data stored in memory without requiring explicit load/store operations. This contrasts with Load/Store (RISC) architectures, where all operations must first load data into registers.
  • 17. Trade-offs in Instruction Set Architecture (ISA)  The Instruction Set Architecture (ISA) defines how a processor executes instructions, impacting performance, power efficiency, and complexity. Different ISAs, such as RISC (Reduced Instruction Set Computing) and CISC (Complex Instruction Set Computing), have trade-offs based on design goals.
  • 18. Trade-offs in Instruction Set Architecture (ISA)
  • 21. Interrupts and Exceptions  Interrupts and exceptions allow a processor to respond to events such as hardware signals, errors, and system calls. These mechanisms help in efficient multitasking, error handling, and real-time processing.  Types of Interrupts  A. Hardware Interrupts (Triggered by external devices)  Maskable Interrupts (IRQ) → Can be ignored or delayed by disabling interrupts.  Non-Maskable Interrupts (NMI) → Cannot be ignored (e.g., power failure).  Interrupt Requests (IRQs) → Devices send requests via the Interrupt Controller (PIC/APIC).
  • 22. How Interrupts Are Handled?  Interrupt Occurs → Device or software sends an interrupt signal.  Processor Saves State → Stores registers and program counter (PC).  Interrupt Vector Table (IVT) Lookup → Finds the correct handler for the interrupt.  Interrupt Service Routine (ISR) Executes → Handles the event (e.g., reading from a device).  processor Restores State → Resumes execution of the interrupted program.
  • 23. Types of Exceptions  A. Faults (Can be recovered; program restarts)  Page Fault (Accessing invalid memory).  Divide-by-Zero Error (Mathematical errors).  Traps (Handled immediately and continue execution)  System Calls (Used by OS to switch from user mode to kernel mode).  Aborts (Serious errors that terminate execution)  Hardware failures (e.g., memory corruption).
  • 24. How Exceptions Are Handled?  CPU detects an exception (e.g., invalid instruction, divide-by-zero).  Exception Vector Table Lookup → Finds the correct handler  Exception Handler Executes → Fixes the issue or terminates the program Interrupts allow devices and software to interact with the CPU asynchronously. Exceptions handle errors and system events synchronously. Efficient handling using interrupt controllers and prioritization is crucial for performance. Modern CPUs optimize interrupt handling using vectored and fast interrupts.
  • 25.  Interrupts and Exceptions Using Condition Codes  Condition codes (also called status flags) are special bits in a processor's status register (flag register) that indicate the result of arithmetic and logical operations. These flags help determine whether an interrupt or exception should be triggered.  Common Condition Codes in a Processor  Zero Flag (ZF) – Set when the result of an operation is zero.  Carry Flag (CF) – Set when an arithmetic operation results in a carry (for unsigned numbers).  Overflow Flag (OF) – Set when an arithmetic operation results in an overflow (for signed numbers).  Sign Flag (SF) – Set if the result of an operation is negative.  Parity Flag (PF) – Set if the result has an even number of 1s in binary.
  • 26. BASIC CONCEPTS IN PROCESSOR MICROARCHITECTURE  Almost all modern processors use an instruction execution pipeline design. Simple processors issue only one instruction for each cycle;  Many embedded and some signal processors use a simple issue - one - instruction per - cycle design approach.  others issue many. Many embedded and some signal processors use a simple issue - one - instruction per - cycle design approach.  But the bulk of modern desktop, laptop, and server systems issue multiple instructions for each cycle.  Every processor (Figure 3.7 ) has a memory system, execution unit (data paths), and instruction unit.
  • 27. BASIC CONCEPTS IN PROCESSOR MICROARCHITECTURE  The pipeline mechanism or control has many possibilities. Potentially, it can execute one or more instructions for each cycle. Instructions may or may not be decoded and/or executed in program order  Regardless of the type of pipeline, “ breaks ” or delays are the major limit on performance.  1. Pipeline  A technique used to improve instruction throughput by breaking execution into stages.  Common pipeline stages: Fetch, Decode, Execute, Memory Access, Write back.  Example: A 5-stage pipeline in RISC processors.
  • 30. BASIC CONCEPTS IN PROCESSOR MICROARCHITECTURE  The Instruction Register (IR) is a special-purpose register in a computer's central processing unit (CPU) that holds the instruction currently being executed or decoded. It is a crucial part of the instruction cycle in a computer.  Functions of the Instruction Register:  Holds the Current Instruction – The IR temporarily stores the machine instruction fetched from memory.  Instruction Buffer  An Instruction Buffer is a temporary storage unit in a CPU that holds multiple instructions before they are executed. It is mainly used to improve instruction processing speed by reducing delays in fetching instructions from memory.
  • 33.  Execution Unit (EU) in a CPU  The Execution Unit (EU) is the part of the CPU responsible for processing and executing instructions. It works in conjunction with the Control Unit (CU), which fetches and decodes instructions before passing them to the Execution Unit.  Components of the Execution Unit  Arithmetic Logic Unit (ALU)  Performs arithmetic (addition, subtraction, multiplication, division).  Executes logical operations (AND, OR, XOR, NOT).  Handles bitwise shifts, comparisons, and Boolean logic.
  • 34.  Floating-Point Unit (FPU)  Specializes in floating-point arithmetic (operations on real numbers that have fractional parts).  Follows the IEEE 754 standard, which defines floating-point formats and arithmetic behavior.
  • 35. BASIC ELEMENTS IN INSTRUCTION HANDLING  An instruction unit consists of the state registers as defined by the instruction set, the instruction register, and also the instruction buffer, decoder, and an interlock unit.  The instruction buffer's function is to fetch instructions into registers so that instructions can be rapidly brought into a position to be decoded.  The decoder has the responsibility for controlling the cache, ALU, registers, and so on.  Interlock in a Processor  In a processor, an interlock is a hardware or control mechanism that prevents a subsequent instruction from executing until a previous instruction has completed, thereby avoiding data hazards or structural conflicts.
  • 38. Instruction decoder and interlocks  Instruction Decoder in a Processor  The Instruction Decoder is a key component of the Control Unit (CU) in a CPU. It translates the machine code instructions fetched from memory into signals that control other parts of the processor, such as the Arithmetic Logic Unit (ALU), registers, and memory access units.  1. Role of the Instruction Decoder  The Instruction Decoder is responsible for:  Decoding the instruction opcode – Identifies the type of operation (ADD, SUB, LOAD, etc.).  Extracting operands – Determines source and destination registers or memory addresses.  Generating control signals – Activates ALU, memory, and register operations.  Determining instruction format – Identifies whether it's R-type, I-type, etc., and whether the instruction requires immediate values, registers, or memory access.
  • 39.  How the Instruction Decoder Works  Instruction Fetch – The instruction is fetched from memory by the Instruction Fetch Unit (IFU).  Instruction Decode – The Instruction Decoder analyzes the opcode and operands.  Control Signal Generation – The decoder sends signals to activate ALU, registers, or memory units.  Execution – The decoded instruction is executed in the Execution Unit (EU).
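The decode step can be sketched as bit-field extraction followed by control signal generation. The slides do not name an instruction set, so this example assumes the MIPS32 field layout (6-bit opcode, 5-bit register fields, 16-bit immediate) only because R-type and I-type formats are mentioned. The `signals` dictionary and the three I-type opcodes handled are simplifications chosen for illustration.

```python
# MIPS32 opcodes used in this sketch (assumed ISA, see the note above).
ADDI, LW, SW = 0x08, 0x23, 0x2B

def decode(word):
    """Split a 32-bit instruction word into fields and simplified control signals."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if opcode == 0:
        # R-type: the operation is selected by the funct field.
        # Simplification: every R-type is treated as writing a register.
        return {
            "format": "R", "rs": rs, "rt": rt,
            "rd": (word >> 11) & 0x1F,
            "funct": word & 0x3F,
            "signals": {"reg_write": True, "mem_read": False, "mem_write": False},
        }
    if opcode not in (ADDI, LW, SW):
        raise ValueError(f"opcode {opcode:#x} is not covered by this sketch")
    imm = word & 0xFFFF
    if imm & 0x8000:      # sign-extend the 16-bit immediate
        imm -= 0x10000
    return {
        "format": "I", "rs": rs, "rt": rt, "imm": imm,
        "signals": {
            "reg_write": opcode != SW,
            "mem_read": opcode == LW,
            "mem_write": opcode == SW,
        },
    }
```

For instance, `decode(0x012A4020)` (the encoding of `add $t0, $t1, $t2`) yields an R-type with `rs=9`, `rt=10`, `rd=8`, and `decode(0x8D28FFFC)` (`lw $t0, -4($t1)`) yields an I-type with `imm=-4` and `mem_read` asserted.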
  • 40. Instruction decoder and interlocks  Interlock in Instruction Decoder  An interlock in an instruction decoder is a control mechanism that prevents the execution of an instruction until all necessary conditions (like data availability, resource availability, or hazard resolution) are met. This helps avoid errors due to pipeline hazards or data dependencies.  1. Why Interlocks Are Needed in Instruction Decoding  The instruction decoder translates binary machine code into control signals for execution. However, certain conditions may require an instruction to pause (stall) before proceeding:
  • 41.  Data Dependency (RAW Hazard) – An instruction requires the result of a previous instruction, but the result isn't ready.  Structural Hazard – The required hardware (ALU, register file, etc.) is already in use.  Control Hazard (Branching Issue) – The processor is unsure which instruction to execute next (branch prediction).  Memory Read/Write Delays – Load/store instructions may take multiple cycles to complete.  To handle these issues, the instruction decoder uses interlocks to insert stalls or delay execution until safe.
  • 42. How Interlocks Work in the Instruction Decoder  Instruction Fetch – The CPU fetches an instruction from memory.  Instruction Decode – The decoder identifies the instruction type and operands.  Dependency Check (Hazard Detection Unit) – The decoder checks if the instruction can execute immediately or needs to wait.  Interlock Activation (if needed)  If dependencies exist, an interlock mechanism inserts a stall (delay cycle).  If no dependency exists, the instruction proceeds to execution.  Execution / Stall Handling – If stalled, the CPU waits; otherwise, execution proceeds.
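The dependency check and stall insertion can be modeled with a small scheduler. This is a sketch under stated assumptions: in-order, single-issue, no forwarding, and a fixed result `latency` for every instruction. The `(dest, sources)` instruction representation and the function name are hypothetical. Only read-after-write (RAW) hazards are checked.

```python
def schedule_with_interlocks(program, latency=3):
    """Return (issue_cycles, stalls) for a list of (dest, sources) instructions.

    An instruction may issue only when all of its source registers are ready;
    otherwise the interlock holds it back and stall cycles accumulate.
    """
    ready_at = {}       # register name -> first cycle its value can be read
    issue_cycles = []
    stalls = 0
    next_cycle = 0      # earliest cycle the next instruction could issue
    for dest, sources in program:
        # Dependency check: wait for the latest-arriving source operand.
        earliest = max([next_cycle] + [ready_at.get(r, 0) for r in sources])
        stalls += earliest - next_cycle  # interlock inserts these stall cycles
        issue_cycles.append(earliest)
        if dest is not None:
            ready_at[dest] = earliest + latency
        next_cycle = earliest + 1
    return issue_cycles, stalls

program = [
    ("r1", ("r2", "r3")),  # r1 = r2 op r3
    ("r4", ("r1", "r5")),  # needs r1 -> RAW hazard on the previous instruction
    ("r6", ("r7", "r8")),  # independent
]
```

With `latency=3`, the second instruction is held for two stall cycles until `r1` is ready, while the independent third instruction issues immediately after it. With `latency=1`, no stalls are needed.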
  • 44. Buffers minimizing pipeline delays  Why Are Buffers Needed in Pipelining?  When an instruction moves through the pipeline, different stages (Fetch, Decode, Execute, Memory Access, Write Back) require time and resources. If an instruction needs data that is not yet available, it creates a stall (pipeline delay).  Buffers store intermediate values between stages to reduce stalling.  They reduce dependency issues by keeping temporary results available for the next stage.  They improve instruction throughput, making execution faster.
  • 45. MORE ROBUST PROCESSORS: VECTOR, VERY LONG INSTRUCTION WORD (VLIW), AND SUPERSCALAR  To go below one cycle per instruction (CPI), that is, to complete more than one instruction per cycle, the processor must be able to execute multiple instructions at the same time.  Processors that achieve a higher degree of concurrency are called concurrent processors, short for processors with instruction-level concurrency.  Concurrent processors must be able to make simultaneous accesses to instruction and data memory and to simultaneously execute multiple operations.
  • 46. VECTOR PROCESSORS AND VECTOR INSTRUCTION EXTENSIONS  Vector instructions boost performance by  reducing the number of instructions required to execute a program (they reduce the I-bandwidth); and  organizing data into regular sequences that can be efficiently handled by the hardware.  Vector processing requires extensions to the instruction set, together with (for best performance) extensions to the functional units, the register sets, and particularly to the memory of the system.
  • 48.  Vector processors usually include vector register (VR) hardware to decouple arithmetic processing from memory.  Vector Functional Units  The VRs typically consist of eight or more register sets, each consisting of 16–64 vector elements, where each vector element is a floating-point word.  The VRs access memory with special load and store instructions.  The vector execution units are usually arranged as an independent functional unit for each instruction class. These might include  add/subtract,  multiplication,  division or reciprocal, and  logical operations, including compare.
  • 49.  Since the purpose of the vector vocabulary is to manage operations over a vector of operands, once the vector operation is begun, it can continue at the cycle rate of the system.  The advantage of vector processing is that fewer instructions are required to execute the vector operations.  Vector Registers: What They Are & How They Work  A vector register is a special type of CPU register that holds entire vectors (arrays of data) instead of single scalar values. These registers are used in vector processors and SIMD (Single Instruction, Multiple Data) architectures to enable parallel processing of multiple data elements in a single instruction.
  • 50.  Example: Scalar vs. Vector Registers  Let's say we want to add two arrays:  A = [1, 2, 3, 4], B = [5, 6, 7, 8]  Scalar Processor (Using Scalar Registers)  Load 1 into a scalar register.  Load 5 into another scalar register.  Perform addition (1 + 5) and store the result.  Repeat for the remaining elements.  Takes at least 4 cycles (one add per element pair), not counting the loads and stores.
  • 51.  Vector Processor (Using Vector Registers)  Load entire A array into a vector register.  Load entire B array into another vector register.  Perform vector addition on all elements at once.  Ideally takes a single vector add (all 4 additions happen in parallel when the hardware has at least 4 lanes).  Features of Vector Registers  Store Multiple Data Elements – Instead of a single value, they store an array.  Optimized for Parallel Execution – Allow SIMD processing.  Reduce Memory Access Time – Fewer loads and stores compared to scalar registers.  Accelerate Computation – Common in AI, graphics, and scientific computing.
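The scalar and vector versions of the array addition can be compared by counting add instructions. This is a plain Python model, not real SIMD code: the `vector_length` parameter stands in for the number of elements a vector register holds, and the function names are made up for the example.

```python
def scalar_add(a, b):
    """Add element by element, counting one add instruction per pair."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions

def vector_add(a, b, vector_length=4):
    """Add in chunks of `vector_length`, counting one vector add per chunk."""
    result, instructions = [], 0
    for i in range(0, len(a), vector_length):
        chunk_a = a[i:i + vector_length]  # models a vector register load
        chunk_b = b[i:i + vector_length]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1                 # one instruction covers the whole chunk
    return result, instructions

A = [1, 2, 3, 4]
B = [5, 6, 7, 8]
```

Both functions produce `[6, 8, 10, 12]`, but `scalar_add(A, B)` counts 4 add instructions while `vector_add(A, B)` counts 1. Arrays longer than the vector register need one vector instruction per chunk, so 8 elements with 4 lanes take 2.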
  • 52. Vector processor and functionality