Computer Architecture
Instruction-Level Parallel
Processors
Improve CPU performance by
• increasing clock rates
• (CPU running at even higher frequencies.)
• increasing the number of instructions to be executed in parallel
• (executing 6-10 instructions at the same time)
• What is the limit for these?
Pipeline (assembly line)
Result of pipeline (e.g.)
• Very long instruction word (VLIW) refers to processor
architectures designed to exploit instruction-level
parallelism (ILP).
• A VLIW processor lets programs explicitly specify which
instructions execute in parallel.
VLIW (very long instruction word, 1024 bits!)
Superscalar (sequential stream of
instructions)
A superscalar processor is a CPU that
implements instruction-level parallelism
within a single processor. It therefore achieves
higher throughput (the number of instructions
executed per unit of time) than would
otherwise be possible at a given clock rate.
Scalar vs. Superscalar
• A scalar processor can execute at most one
instruction per clock cycle.
• A superscalar processor can execute more than one
instruction during a clock cycle by simultaneously
dispatching multiple instructions to different
execution units on the processor.
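As a rough illustration of the difference, the sketch below counts cycles on an idealized machine that issues up to `width` instructions per cycle. It assumes every instruction takes one cycle and that no dependencies exist between instructions, which real code rarely satisfies. The names `cycles_needed` and `width` are chosen for this example only.

```python
import math

def cycles_needed(num_instructions: int, width: int) -> int:
    """Cycles to issue all instructions on an idealized machine.

    Assumes single-cycle instructions and no dependencies of any kind.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    return math.ceil(num_instructions / width)

scalar = cycles_needed(12, width=1)        # at most one instruction per cycle
superscalar = cycles_needed(12, width=4)   # up to four instructions per cycle
print(scalar, superscalar)  # prints: 12 3
```

The dependencies discussed in the following slides are what keep a real processor from reaching this ideal bound.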
Flynn's taxonomy
From Sequential instructions
to parallel execution
• Dependencies between instructions
• Instruction scheduling
• Preserving sequential consistency
Dependencies between instructions
Instructions often depend on each other in such a way that a particular
instruction cannot be executed until a preceding instruction or even two or
three preceding instructions have been executed.
1 Data dependencies
2 Control dependencies
3 Resource dependencies
Data dependencies
• Read after Write (RAW)
• Write after Read (WAR)
• Write after Write (WAW)
• Recurrences
Data dependencies in straight-line code
(RAW)
• RAW dependencies
• i1: load r1, a
• i2: add r2, r1, r1
• also called flow dependencies
• true dependencies
• cannot be eliminated
Data dependencies in straight-line code
(WAR)
• WAR dependencies
• i1: mul r1, r2, r3
• i2: add r2, r4, r5
• also called anti-dependencies
• false dependencies
• can be eliminated through register renaming
• i1: mul r1, r2, r3
• i2: add r6, r4, r5
• renaming is done by the compiler or by the ILP processor
Data dependencies in straight-line code
(WAW)
• WAW dependencies
• i1: mul r1, r2, r3
• i2: add r1, r4, r5
• also called output dependencies
• false dependencies
• can be eliminated through register renaming
• i1: mul r1, r2, r3
• i2: add r6, r4, r5
• renaming is done by the compiler or by the ILP processor
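Register renaming can be sketched in a few lines. The version below is a minimal illustration, not how any particular compiler or processor does it: it gives every written register a fresh name (`p0`, `p1`, ... are hypothetical physical register names) rather than renaming only the conflicting destination as on the slides, and it assumes each instruction has the form (op, dest, src1, src2).

```python
from itertools import count

def rename_registers(program):
    """Give every written register a fresh name, removing WAR and WAW.

    `program` is a list of (op, dest, src1, src2) tuples. Sources are
    redirected to the most recent name of the register they read, so
    true (RAW) dependencies are preserved.
    """
    fresh = (f"p{n}" for n in count())   # hypothetical physical register names
    current = {}                         # architectural name -> latest fresh name
    renamed = []
    for op, dest, src1, src2 in program:
        s1 = current.get(src1, src1)     # read the newest version of each source
        s2 = current.get(src2, src2)
        current[dest] = next(fresh)      # a new name for every write
        renamed.append((op, current[dest], s1, s2))
    return renamed

war = [("mul", "r1", "r2", "r3"), ("add", "r2", "r4", "r5")]
waw = [("mul", "r1", "r2", "r3"), ("add", "r1", "r4", "r5")]
print(rename_registers(war))
print(rename_registers(waw))
```

After renaming, the two instructions in each example no longer share a destination or overwrite a register the other reads, so they can be reordered or issued together.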
Data dependencies in loops
(recurrences)
for (int i=2; i<10; i++) {
    x[i] = a*x[i-1] + b;
}
• iterations cannot be executed in parallel: each one reads x[i-1], which the previous iteration writes
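To see why the recurrence resists parallel execution, the sketch below runs the same loop two ways: in program order, and with every iteration evaluated from the original array, as a naive parallel execution would. The function names and the sample values for `a`, `b` and `x` are assumptions made for this illustration.

```python
def sequential(x, a, b):
    """Run the loop in program order; each step reads the value just written."""
    x = list(x)
    for i in range(2, 10):
        x[i] = a * x[i - 1] + b
    return x

def all_at_once(x, a, b):
    """Evaluate every iteration from the ORIGINAL x, as if run in parallel."""
    old = list(x)
    new = list(x)
    for i in range(2, 10):
        new[i] = a * old[i - 1] + b   # x[i-1] has not been updated yet
    return new

x0 = [1] * 10
print(sequential(x0, a=2, b=1))   # [1, 1, 3, 7, 15, 31, 63, 127, 255, 511]
print(all_at_once(x0, a=2, b=1))  # [1, 1, 3, 3, 3, 3, 3, 3, 3, 3]
```

The two results differ from index 3 onward, because every iteration after the first depends on a value produced by its predecessor.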
Data dependencies in conditional
statement
S1. if (a == b)
S2. a = a + b
S3. b = a + b
Data dependency graphs
• i1: load r1, a
• i2: load r2, b
• i3: add r3, r1, r2 (RAW -> δt)
• i4: mul r1, r2, r4 (WAR -> δa)
• i5: div r1, r2, r4 (WAW -> δo)
Graph edges: i1 -> i3 (δt), i2 -> i3 (δt), i3 -> i4 (δa), i4 -> i5 (δo)
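A dependency graph like this one can be derived mechanically by comparing the registers each instruction reads and writes. The sketch below is a naive pairwise check, under the assumption that each instruction is described by its destination register and a list of sources. `find_dependencies` and the dictionary layout are made up for this example. Because it compares every pair and ignores intervening writes, it reports more edges than the simplified graph on the slide.

```python
def find_dependencies(program):
    """Return (earlier, later, kind) for every dependent instruction pair.

    `program` maps a label to (dest, sources). Every pair is compared,
    so the result can contain more edges than a hand-drawn graph.
    """
    labels = list(program)
    edges = []
    for n, first in enumerate(labels):
        d1, s1 = program[first]
        for second in labels[n + 1:]:
            d2, s2 = program[second]
            if d1 in s2:
                edges.append((first, second, "RAW"))  # true dependency
            if d2 in s1:
                edges.append((first, second, "WAR"))  # anti-dependency
            if d1 == d2:
                edges.append((first, second, "WAW"))  # output dependency
    return edges

program = {
    "i1": ("r1", ["a"]),          # load r1, a
    "i2": ("r2", ["b"]),          # load r2, b
    "i3": ("r3", ["r1", "r2"]),   # add r3, r1, r2
    "i4": ("r1", ["r2", "r4"]),   # mul r1, r2, r4
    "i5": ("r1", ["r2", "r4"]),   # div r1, r2, r4
}
for edge in find_dependencies(program):
    print(edge)
```

The output includes the four edges drawn on the slide, plus additional ones such as the RAW dependency of i4 on i2 through r2.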
Control dependencies
mul r1, r2, r3
jz zproc
:
zproc: load r1, x
:
• the actual path of execution depends on the outcome
of the multiplication
• the branch imposes dependencies on the logically subsequent
instructions
Control Dependency Graph
Resource dependencies
•An instruction is resource-dependent on a
previously issued instruction if it requires a
hardware resource which is still being used by a
previously issued instruction.
• e.g., two divisions that need the same divide unit:
• div r1, r2, r3
• div r4, r2, r5
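The effect of a resource dependency can be sketched with a toy timing model. The code below assumes a single, non-pipelined divide unit and an illustrative latency of 10 cycles. Both the latency and the `issue_times` helper are assumptions made for this example, not figures from the slides.

```python
def issue_times(ops, latency):
    """Cycle at which each operation can start on a single shared unit.

    Assumes one non-pipelined unit: a new operation may start only
    after the previous one has finished.
    """
    starts = []
    free_at = 0                      # first cycle in which the unit is idle
    for _ in ops:
        starts.append(free_at)
        free_at += latency
    return starts

divs = ["div r1, r2, r3", "div r4, r2, r5"]
print(issue_times(divs, latency=10))  # [0, 10]
```

The two divisions share no data dependency, yet the second cannot start until the divider is free again.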
