Threads
Cesarano Antonio
Del Monte Bonaventura
Università degli studi di Salerno
7th April 2014
Operating Systems II
Agenda
• Introduction
• Thread models
• Multithreading: single-core vs multi-core
• Implementation
• A case study
• Conclusions
Introduction
CPU Trends
What’s a Thread?
Memory: Heavy vs Light processes
Why should I care about Threads?
Pros:
• Responsiveness
• Resource sharing
• Economy
• Scalability
Cons:
• Harder to implement
• Synchronization
• Critical sections, deadlock, livelock…
Thread Models
Two kinds of threads:
• User threads
• Kernel threads
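User-level Threads
Implemented by a software library in user space. The slide names two thread library APIs:
• Pthreads
• Win32 API
(On current Linux and Windows these two APIs are backed by kernel threads; they appear here as examples of thread libraries.)
Pros:
• Easy to manage
• Fast context switch
• Transparent to the OS
• No new address space, so no address-space switch
Cons:
• Cannot benefit from hardware multithreading or multiprocessing
• A blocked thread blocks the whole process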
Kernel-level Threads
Executed only in kernel mode and managed by the OS
• Example: the children of kthreadd (Linux)
Pros:
• Resource aware
• No need for a new address space
• When a thread blocks, another thread is scheduled
Cons:
• Slower than user-level threads
Thread implementation models:
• Many-to-one
• One-to-one
• Many-to-many
Many-to-one
• The whole process blocks if one thread blocks
• Cannot exploit multi-core architectures
One-to-one
• Works well on multi-core architectures
• Many kernel threads mean high overhead
Many-to-many
• Works well on multi-core architectures
• Less overhead than the one-to-one model
Multithreading
[Figures: multitasking, single core, symmetric multi-processor, multithreading, HyperThreading]
How can we use multithreading architectures?
• Thread Level Parallelism
• Data Level Parallelism
[Figure: Thread Level Parallelism]
[Figure: Data Level Parallelism]
Granularity
Coarse-grained multithreading
• Context switch on a high-latency event
• Very fast thread switching; threads are not slowed down
• Throughput is lost on short stalls because of the pipeline start-up cost
Fine-grained multithreading
• Context switch on every cycle
• Interleaved execution of multiple threads: it can hide both short and long stalls
• Rarely-stalling threads are slowed down
Context Switching
Single-core vs Multi-core
Before the switch thread 1 is Running and thread 2 is Ready; each TCB stores the saved stack pointer (SP) of its thread. The listing assumes that eax points to the SP field of thread 1's TCB and that edx holds the SP saved in thread 2's TCB.
Xthread_ctxtswitch:
    pusha                # 1. push the old context (thread 1 registers) onto thread 1's stack
    movl %esp, (%eax)    # 2. save the old stack pointer into thread 1's TCB
    movl %edx, %esp      # 3. change the stack pointer: thread 1 becomes Ready, thread 2 Running
    popa                 # 4. pop thread 2's old context off its stack
    ret                  # 5. done: return
RET pops the return address off the stack and assigns its value to the PC register.
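The same save-and-restore idea can be tried from user space without writing assembly. The sketch below is an illustration only, not the code behind the slides: it assumes a system that still provides the `<ucontext.h>` calls `getcontext`, `makecontext` and `swapcontext` (glibc on Linux does), and the names `worker`, `trace` and `run_context_switch_demo` are made up for the example.

```cpp
#include <cassert>
#include <ucontext.h>
#include <vector>

static ucontext_t main_ctx;    // saved context of the caller
static ucontext_t worker_ctx;  // saved context of the user-level "thread"
static std::vector<int> trace; // records the order in which code runs

// Body of the user-level thread: runs, yields once, then finishes.
static void worker() {
    trace.push_back(1);
    swapcontext(&worker_ctx, &main_ctx);  // save our registers, resume the caller
    trace.push_back(3);
}                                         // on return, uc_link (main_ctx) is resumed

// Switches back and forth between the caller and the worker context.
std::vector<int> run_context_switch_demo() {
    static char stack[64 * 1024];         // private stack for the worker
    trace.clear();
    int rc = getcontext(&worker_ctx);     // initialise the structure
    assert(rc == 0);
    (void)rc;
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = sizeof stack;
    worker_ctx.uc_link = &main_ctx;       // where to go when worker() returns
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);  // first switch: worker runs until it yields
    trace.push_back(2);
    swapcontext(&main_ctx, &worker_ctx);  // second switch: worker runs to completion
    trace.push_back(4);
    return trace;
}
```

Each `swapcontext` call plays the role of the five instructions above: it stores the current registers and stack pointer in the first argument and loads them from the second.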
Problems
• Critical section: a piece of code in which a thread A accesses a shared variable that a thread B may access at the same time
• Deadlock: a process A waits for a resource held by B, while B waits for a resource held by A
• Race condition: the result of an execution depends on the order in which different threads execute
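A short sketch may help to make the critical section concrete. It assumes a shared counter incremented by several Pthreads; the names `Counter`, `add_many`, `kIncrements` and `count_with_mutex` are hypothetical. Without the mutex the increments of different threads could interleave and the final value would depend on scheduling, which is exactly a race condition.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Hypothetical shared state used only for this illustration.
struct Counter {
    long value;
    pthread_mutex_t lock;
};

static const int kIncrements = 100000;  // increments performed by each thread

static void* add_many(void* arg) {
    Counter* c = static_cast<Counter*>(arg);
    for (int i = 0; i < kIncrements; ++i) {
        pthread_mutex_lock(&c->lock);    // enter the critical section
        c->value += 1;                   // read-modify-write on shared data
        pthread_mutex_unlock(&c->lock);  // leave the critical section
    }
    return nullptr;
}

// Runs `nthreads` workers and returns the final counter value.
long count_with_mutex(int nthreads) {
    Counter c;
    c.value = 0;
    pthread_mutex_init(&c.lock, nullptr);
    std::vector<pthread_t> ids(nthreads);
    for (int i = 0; i < nthreads; ++i) {
        int rc = pthread_create(&ids[i], nullptr, add_many, &c);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; ++i)
        pthread_join(ids[i], nullptr);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

With the lock in place, `count_with_mutex(4)` always yields 4 * `kIncrements`, whatever the interleaving.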
More Issues
• fork() and exec() system calls: should all threads be duplicated or not?
• Signal handling in multithreaded applications
• Scheduler activations: the kernel has to communicate with the user-level threads, i.e. through upcalls
• Thread cancellation: terminating a thread before it has completed
o Deferred cancellation
o Asynchronous cancellation: immediate
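As an illustration of deferred cancellation in Pthreads terms, the sketch below starts a thread that only honours a cancellation request at an explicit cancellation point. `pthread_setcanceltype`, `pthread_testcancel`, `pthread_cancel` and `PTHREAD_CANCELED` are standard POSIX names; `spin_until_cancelled` and `cancel_worker` are made up for the example.

```cpp
#include <cassert>
#include <pthread.h>

// Loops forever, but checks for a pending cancellation on every iteration.
static void* spin_until_cancelled(void*) {
    int old_type = 0;
    // Deferred is the POSIX default; it is set here only to make it visible.
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_type);
    for (;;) {
        // ... one unit of work would go here ...
        pthread_testcancel();  // cancellation point: the thread ends here if requested
    }
    return nullptr;            // never reached
}

// Returns true if the worker terminated because it was cancelled.
bool cancel_worker() {
    pthread_t id;
    int rc = pthread_create(&id, nullptr, spin_until_cancelled, nullptr);
    assert(rc == 0);
    (void)rc;
    pthread_cancel(id);               // only posts the request
    void* result = nullptr;
    pthread_join(id, &result);        // wait until the thread has really ended
    return result == PTHREAD_CANCELED;
}
```

With `PTHREAD_CANCEL_ASYNCHRONOUS` the thread could instead be stopped at any instruction, which makes it much harder to release locks and memory safely.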
Designing a thread library
• Multiprocessor support
• Virtual processor
• Real-time support
• Memory management
• Provide a function library rather than a module
• Portability
• No kernel mode
Implementation
POSIX Threads
• POSIX standard for threads: IEEE POSIX 1003.1c
• A library made up of a set of types and procedure calls written in C, for UNIX platforms
• It supports:
a) Thread management
b) Mutexes
c) Condition variables
d) Synchronization between threads using R/W locks and barriers
Thread Pool
• Several threads are available in a pool
• When a task arrives, it is assigned to a free thread
• Once a thread completes its task, it returns to the pool and awaits more work
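The pool described above could be sketched with Pthreads roughly as follows. This is a minimal illustration under stated assumptions, not a production design: tasks are plain function pointers, the queue is unbounded, and the names `Task`, `ThreadPool`, `square_slot` and `sum_of_squares` are invented for the example.

```cpp
#include <cassert>
#include <pthread.h>
#include <queue>
#include <vector>

// A task is a function pointer plus its argument (an illustrative choice).
struct Task {
    void (*fn)(void*);
    void* arg;
};

class ThreadPool {
public:
    explicit ThreadPool(int nthreads) : stopping_(false) {
        pthread_mutex_init(&lock_, nullptr);
        pthread_cond_init(&has_work_, nullptr);
        workers_.resize(nthreads);
        for (int i = 0; i < nthreads; ++i) {
            int rc = pthread_create(&workers_[i], nullptr, &ThreadPool::worker_loop, this);
            assert(rc == 0);
            (void)rc;
        }
    }

    // Lets the workers drain the queue, then stops and joins them.
    ~ThreadPool() {
        pthread_mutex_lock(&lock_);
        stopping_ = true;
        pthread_cond_broadcast(&has_work_);
        pthread_mutex_unlock(&lock_);
        for (size_t i = 0; i < workers_.size(); ++i)
            pthread_join(workers_[i], nullptr);
        pthread_cond_destroy(&has_work_);
        pthread_mutex_destroy(&lock_);
    }

    // A task arrives: queue it and wake one free thread.
    void submit(void (*fn)(void*), void* arg) {
        pthread_mutex_lock(&lock_);
        tasks_.push(Task{fn, arg});
        pthread_cond_signal(&has_work_);
        pthread_mutex_unlock(&lock_);
    }

private:
    static void* worker_loop(void* self) {
        ThreadPool* pool = static_cast<ThreadPool*>(self);
        for (;;) {
            pthread_mutex_lock(&pool->lock_);
            while (pool->tasks_.empty() && !pool->stopping_)
                pthread_cond_wait(&pool->has_work_, &pool->lock_);  // idle in the pool
            if (pool->tasks_.empty()) {          // stopping and nothing left to do
                pthread_mutex_unlock(&pool->lock_);
                return nullptr;
            }
            Task t = pool->tasks_.front();
            pool->tasks_.pop();
            pthread_mutex_unlock(&pool->lock_);
            t.fn(t.arg);                         // run the task outside the lock
        }
    }

    pthread_mutex_t lock_;
    pthread_cond_t has_work_;
    std::queue<Task> tasks_;
    std::vector<pthread_t> workers_;
    bool stopping_;
};

// Usage sketch: square every slot of an array on the pool, then sum the slots.
static void square_slot(void* arg) {
    int* slot = static_cast<int*>(arg);
    *slot = (*slot) * (*slot);
}

long sum_of_squares(int n, int nthreads) {
    std::vector<int> slots(n);
    for (int i = 0; i < n; ++i) slots[i] = i + 1;
    {
        ThreadPool pool(nthreads);
        for (int i = 0; i < n; ++i) pool.submit(square_slot, &slots[i]);
    }                                            // the destructor waits for all tasks
    long total = 0;
    for (int i = 0; i < n; ++i) total += slots[i];
    return total;
}
```

Reusing the same workers avoids paying thread creation and destruction for every task, which is the main reason to prefer a pool over one thread per request.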
Pthreads library basic operations
• pthread_create(): creates and launches a new thread
• pthread_exit(): terminates the calling thread
• pthread_attr_init(): initializes a thread attributes object with default values
• pthread_join(): the calling thread blocks and waits for another thread to finish
• pthread_self(): returns the id of the calling thread
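The five calls listed above can be seen together in a small sketch. The Pthreads functions are the standard ones; `SquareJob`, `square` and `square_in_thread` are hypothetical names, and squaring a number is only a stand-in for real work.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical argument and result record shared with the new thread.
struct SquareJob {
    int input;
    int output;
    bool ran_in_other_thread;
    pthread_t creator;  // id of the thread that created the job
};

static void* square(void* arg) {
    SquareJob* job = static_cast<SquareJob*>(arg);
    job->output = job->input * job->input;
    // pthread_self() gives the id of the calling thread; pthread_equal() compares ids.
    job->ran_in_other_thread = !pthread_equal(pthread_self(), job->creator);
    pthread_exit(job);  // terminates this thread; `job` is what pthread_join receives
}

// Computes x * x in a separate thread, or returns -1 if that did not happen.
int square_in_thread(int x) {
    SquareJob job = {x, 0, false, pthread_self()};
    pthread_attr_t attr;
    pthread_attr_init(&attr);                    // attributes at their default values
    pthread_t id;
    int rc = pthread_create(&id, &attr, square, &job);
    pthread_attr_destroy(&attr);
    assert(rc == 0);
    if (rc != 0) return -1;
    void* result = nullptr;
    pthread_join(id, &result);                   // blocks until the thread has finished
    SquareJob* done = static_cast<SquareJob*>(result);
    return done->ran_in_other_thread ? done->output : -1;
}
```

Passing `nullptr` instead of `&attr` to `pthread_create` would request the same default attributes; the explicit object is shown only because the slide lists `pthread_attr_init()`.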
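Implementation Example
N x N Matrix Multiplication
A simple algorithm
// A, B and C are row-major arrays of MATRIX_ELEMENTS floats,
// with MATRIX_ELEMENTS = MATRIX_LINE * MATRIX_LINE.
// i is the offset of the current row, j the column, k the dot-product index.
for (int i = 0; i < MATRIX_ELEMENTS; i += MATRIX_LINE)
{
    for (int j = 0; j < MATRIX_LINE; ++j)
    {
        float tmp = 0;
        for (int k = 0; k < MATRIX_LINE; k++)
        {
            tmp += A[i + k] * B[(MATRIX_LINE * k) + j];
        }
        C[i + j] = tmp;
    }
}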
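SIMD Approach
#include <pmmintrin.h>   // SSE3, needed for _mm_hadd_ps

// Assumes MATRIX_LINE is a multiple of 4 and that A and B are
// 16-byte aligned, as _mm_load_ps requires.
transpose(B);            // rows of B now hold its original columns
for (int i = 0; i < MATRIX_LINE; i++) {
    for (int j = 0; j < MATRIX_LINE; j++) {
        __m128 tmp = _mm_setzero_ps();
        for (int k = 0; k < MATRIX_LINE; k += 4) {
            // multiply four pairs of floats at once and accumulate
            tmp = _mm_add_ps(tmp,
                _mm_mul_ps(_mm_load_ps(&A[MATRIX_LINE * i + k]),
                           _mm_load_ps(&B[MATRIX_LINE * j + k])));
        }
        tmp = _mm_hadd_ps(tmp, tmp);   // two horizontal adds sum the four lanes
        tmp = _mm_hadd_ps(tmp, tmp);
        _mm_store_ss(&C[MATRIX_LINE * i + j], tmp);
    }
}
transpose(B);            // restore B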
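The slide calls `transpose(B)` without showing it. One possible helper, written here as an assumption about what it does, is an in-place transpose of a square row-major matrix; the name `transpose_square` and the explicit size parameter `n` are not from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// In-place transpose of an n x n row-major matrix:
// swaps every element above the diagonal with its mirror below it.
void transpose_square(float* m, std::size_t n) {
    assert(m != nullptr);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            std::swap(m[n * i + j], m[n * j + i]);
}
```

Transposing B first means that row i of A and the former column j of B are both contiguous in memory, so each `_mm_load_ps` can fetch four consecutive operands; transposing again afterwards puts B back in its original layout.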
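TLP Approach
#include <pthread.h>
#include <unistd.h>
#include <atomic>

struct thread_params
{
    pthread_t id;
    float* a;
    float* b;
    float* c;
    int low;                  // first row computed by this thread
    int high;                 // one past the last row
    std::atomic<bool> flag;   // set by the thread when its rows are done
};

void* runner(void* p);        // Pthreads entry points take and return void*
………
int main(int argc, char** argv){
    int ncores = (int) sysconf(_SC_NPROCESSORS_ONLN);
    int stride = MATRIX_LINE / ncores;   // assumes MATRIX_LINE is a multiple of ncores
    thread_params** p = new thread_params*[ncores];
    for (int j = 0; j < ncores; ++j){
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        thread_params* par = new thread_params;
        par->low = j * stride; par->high = j * stride + stride;
        par->a = A; par->b = B; par->c = C;
        par->flag = false;
        p[j] = par;
        pthread_create(&(par->id), &attr, runner, par);
        pthread_attr_destroy(&attr);
        // set cpu affinity for thread
        // sched_setaffinity
    }
    ….
    // poll every 100 ms until all threads have raised their flag
    int completed = 0;
    while (true) {
        if (completed >= ncores)
            break;
        completed = 0;
        usleep(100000);
        for (int j = 0; j < ncores; ++j){
            if (p[j]->flag)
                completed++;
        }
    }
    ….
}

void* runner(void* p){
    thread_params* params = (thread_params*) p;
    int low = params->low; int high = params->high;
    float* A = params->a; float* B = params->b; float* C = params->c;
    for (int i = low; i < high; i++) {
        for (int j = 0; j < MATRIX_LINE; j++) {
            float tmp = 0;
            for (int k = 0; k < MATRIX_LINE; k++){
                tmp += A[MATRIX_LINE * i + k] * B[(MATRIX_LINE * k) + j];
            }
            C[MATRIX_LINE * i + j] = tmp;   // row i, column j
        }
    }
    params->flag = true;      // tell main that this block of rows is done
    return NULL;
}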
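As a variation on the slides' approach, the waiting loop can be replaced by `pthread_join`, which blocks until each thread has finished and needs no flag or sleep interval. The sketch below is an alternative written for illustration, not the authors' code: `RowRange`, `multiply_rows` and `parallel_matmul` are hypothetical names, and the matrix size and thread count are passed as parameters instead of using `MATRIX_LINE` and `sysconf`.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Rows [low, high) of the n x n product c = a * b, all row-major.
struct RowRange {
    const float* a;
    const float* b;
    float* c;
    int n;
    int low;
    int high;
};

static void* multiply_rows(void* p) {
    RowRange* r = static_cast<RowRange*>(p);
    for (int i = r->low; i < r->high; ++i)
        for (int j = 0; j < r->n; ++j) {
            float tmp = 0;
            for (int k = 0; k < r->n; ++k)
                tmp += r->a[r->n * i + k] * r->b[r->n * k + j];
            r->c[r->n * i + j] = tmp;
        }
    return nullptr;
}

void parallel_matmul(const float* a, const float* b, float* c, int n, int nthreads) {
    std::vector<pthread_t> ids(nthreads);
    std::vector<RowRange> ranges(nthreads);
    for (int t = 0; t < nthreads; ++t) {
        // Split the rows as evenly as possible, also when n is not a multiple of nthreads.
        int low = static_cast<int>(static_cast<long>(n) * t / nthreads);
        int high = static_cast<int>(static_cast<long>(n) * (t + 1) / nthreads);
        ranges[t] = RowRange{a, b, c, n, low, high};
        int rc = pthread_create(&ids[t], nullptr, multiply_rows, &ranges[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; ++t)
        pthread_join(ids[t], nullptr);   // returns once thread t has written its rows
}
```

Each thread writes a disjoint set of rows of `c`, so no mutex is needed; the joins are the only synchronization.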
Implementation Performance
[Figure: bar chart comparing the Simple, SIMD, TLP and SIMD&TLP versions on 4 cores and 8 cores; vertical axis from 0 to 9000, unit not stated]
A case study
Using threads in Interactive Systems
• Research by Xerox PARC, Palo Alto
• Analysis of two large interactive systems: Cedar and GVX
• Goals:
i. Identifying paradigms of thread usage
ii. Analyzing the architecture of thread-based environments
iii. Pointing out the most important properties of an interactive system
Thread model
• Mesa language
• Multiple, lightweight, pre-emptively scheduled threads in a shared address space; threads may have different priorities
• FORK, JOIN, DETACH
• Support for condition variables and monitors: critical sections and mutexes
• Finer-grained locks: directly on data structures
Three types of thread
1. Eternal: run forever, waiting on a condition variable
2. Worker: perform some computation
3. Transient: short-lived threads, forked off by long-lived threads
Dynamic analysis
[Figure: bar chart comparing Cedar and GVX on number of idle threads, maximum fork rate and maximum number of threads; axis from 0 to 45]
Switching intervals: (130/sec, 270/sec) vs. (33/sec, 60/sec)
Paradigms of thread usage
• Defer work: forking to reduce latency
o Printing documents
• Pumps or slack processes: components of a pipeline
o Preprocessing user input
o Requests to the X server
• Sleepers and one-shots: wait for some event and then execute
o Blinking the cursor
o Double click
• Deadlock avoiders: avoid violating lock-order constraints
o Window repainting
• Task rejuvenation: recover a service from a bad state, either forking a new thread or reporting the error
o Avoid fork overhead in the input event dispatcher of Cedar
• Serializers: a thread processing a queue
o A window system with input events from many sources
• Concurrency exploiters: for using multiple processors
• Encapsulated forks: a mix of the previous paradigms, for code modularity
Common Mistakes and Issues
o Timeout hacks to compensate for a missing NOTIFY
o IF instead of WHILE when waiting in monitors
o Handling resource consumption
o Slack processes may need the YieldButNotToMe hack
o Using libraries designed for a single thread in a multithreaded environment: Xlib and XI
o Spurious lock conflicts
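The "IF instead of WHILE" mistake carries over directly to Pthreads condition variables. The sketch below is a translation into Pthreads terms made for illustration (the case study itself concerns Mesa monitors); `Mailbox`, `consumer` and `send_and_receive` are hypothetical names.

```cpp
#include <cassert>
#include <pthread.h>

// One-slot mailbox protected by a mutex and a condition variable.
struct Mailbox {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    bool has_value;
    int value;
};

static void* consumer(void* arg) {
    Mailbox* box = static_cast<Mailbox*>(arg);
    pthread_mutex_lock(&box->lock);
    // WHILE, not IF: a wakeup is only a hint, so the predicate is
    // re-checked every time pthread_cond_wait returns.
    while (!box->has_value)
        pthread_cond_wait(&box->ready, &box->lock);
    int* out = new int(box->value);
    box->has_value = false;
    pthread_mutex_unlock(&box->lock);
    return out;
}

// Hands `v` to a consumer thread and returns what the consumer received.
int send_and_receive(int v) {
    Mailbox box;
    pthread_mutex_init(&box.lock, nullptr);
    pthread_cond_init(&box.ready, nullptr);
    box.has_value = false;
    box.value = 0;

    pthread_t id;
    int rc = pthread_create(&id, nullptr, consumer, &box);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&box.lock);
    box.value = v;
    box.has_value = true;
    pthread_cond_signal(&box.ready);   // the counterpart of NOTIFY
    pthread_mutex_unlock(&box.lock);

    void* result = nullptr;
    pthread_join(id, &result);
    int* received = static_cast<int*>(result);
    int got = *received;
    delete received;
    pthread_cond_destroy(&box.ready);
    pthread_mutex_destroy(&box.lock);
    return got;
}
```

Because the consumer tests `has_value` under the lock before waiting, the exchange also works when the signal is sent before the consumer starts to wait, so no timeout is needed to recover from a missed notification.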
Xerox scientists’ conclusions
• Interesting difficulties were discovered in both the use and the implementation of multithreading environments
• A starting point for new studies
Conclusion
