Introduction to Perf
Hsiangkai
What is Perf?
• Perf is a profiler tool for Linux 2.6+ based systems
that abstracts away CPU hardware differences in
Linux performance measurements and presents a
simple command line interface.
• Perf is based on the perf_events interface exported
by recent versions of the Linux kernel.
Events
• software events - pure kernel counters, for example context-switches
• hardware events - Performance Monitoring Unit (PMU); these measure micro-architectural events such as the number of cycles, instructions retired, L1 cache misses and so on
• hardware cache events - events provided by the CPU
• tracepoint events - kernel ftrace infrastructure
perf stat
• For any of the supported events, perf can keep a
running count during process execution.
• Events are designated using their symbolic names
followed by optional unit masks and modifiers.
• perf stat -e cycles <command>
• perf stat -e cycles:u <command>
• perf stat -e cycles,instructions,cache-misses <command>
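The event syntax above (a symbolic name, an optional `:modifier` suffix, and several events joined by commas) can be sketched as a small helper that assembles a perf stat command line. The helper names `event_spec` and `perf_stat_argv` are hypothetical and exist only for this illustration; nothing here runs perf, it only builds the argument list.

```python
from typing import List, Sequence, Tuple


def event_spec(name: str, modifiers: str = "") -> str:
    """Return 'name' or 'name:modifiers', e.g. 'cycles:u'."""
    return f"{name}:{modifiers}" if modifiers else name


def perf_stat_argv(events: Sequence[Tuple[str, str]],
                   command: Sequence[str]) -> List[str]:
    """Build the argv for: perf stat -e <ev1>,<ev2>,... <command>."""
    spec = ",".join(event_spec(name, mods) for name, mods in events)
    return ["perf", "stat", "-e", spec, *command]


# Count user-level cycles and all instructions for a hypothetical command.
argv = perf_stat_argv([("cycles", "u"), ("instructions", "")], ["sleep", "1"])
print(" ".join(argv))  # perf stat -e cycles:u,instructions sleep 1
```

The resulting list could be handed to `subprocess.run` on a machine where perf is installed and the user is allowed to read the counters.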
Modifiers
Multiplexing and Scaling Events
• If there are more events than counters, the kernel
uses time multiplexing (switch frequency = HZ,
generally 100 or 1000) to give each event a chance
to access the monitoring hardware.
• Multiplexing only applies to PMU events.
• At the end of the run, the tool scales the count
based on total time enabled vs time running.
• final_count = raw_count * time_enabled/time_running
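The scaling formula above can be worked through in a few lines. This is a minimal sketch rather than perf's own code: the function name `scale_count` and the sample numbers are made up for illustration.

```python
def scale_count(raw_count: int, time_enabled: int, time_running: int) -> float:
    """Estimate the full-run count of a multiplexed event.

    time_enabled and time_running must use the same unit (e.g. nanoseconds).
    """
    if time_running == 0:
        # The event never got a hardware counter, so nothing can be estimated.
        raise ValueError("event was never scheduled on the PMU")
    return raw_count * time_enabled / time_running


# Hypothetical numbers: the event held a counter for 25% of the run.
estimate = scale_count(raw_count=1_000_000, time_enabled=4_000, time_running=1_000)
print(estimate)  # 4000000.0
```

The result is an estimate, not an exact count: it assumes the event occurred at the same rate while it was off the counter as while it was on.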
• The perf tool can be used to count events on a per-thread, per-process, per-cpu or system-wide basis.
• per-thread
• the counter only monitors the execution of a
designated thread.
• When the thread is scheduled out, monitoring stops.
• By default, perf stat counts in per-thread mode.
• Attaching to a running thread (a kernel-visible thread, identified by its thread ID)
• perf stat -e cycles -t <thread-id>
• per-process
• all threads of the process are monitored
• Counts and samples are aggregated at the process
level.
• The perf_events interface allows for automatic
inheritance on fork() and pthread_create().
• Attaching to a running process
• perf stat -e cycles -p <pid>
• per-cpu
• all threads running on the designated processors
are monitored.
• perf stat -e cycles:u,instructions:u -a <command>
• perf stat -e cycles:u,instructions:u -a -C 0,2-3 <command>
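The CPU list passed to -C mixes single CPU numbers and inclusive ranges. As a rough illustration of how a list such as `0,2-3` expands, here is a small parser; `parse_cpu_list` is a hypothetical name and this is not how perf itself is implemented.

```python
from typing import List


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a CPU list such as '0,2-3' into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range like '2-3' includes both endpoints.
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0,2-3"))  # [0, 2, 3]
```

So the second command above monitors CPUs 0, 2 and 3 and leaves CPU 1 out.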
perf record
• perf record collects profiles on a per-thread, per-process or per-cpu basis.
• This generates an output file called perf.data.
Event-Based Sampling
• By default, perf record uses the cycles event as the sampling event.
• The perf_events interface allows two modes to express the sampling
period:
• the number of occurrences of the event (period)
• perf record -e retired_instructions:u -c 2000 <command>
• the average rate of samples/sec (frequency)
• The perf tool defaults to the average rate. It is set to 1000Hz, or 1000 samples/sec, in the version described here; newer perf releases use a higher default (4000Hz).
• perf record -e instructions:u -F 250 <command>
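The difference between the two modes can be made concrete with some arithmetic. The sketch below is illustrative only: the function names and the workload numbers are assumptions, and in frequency mode the kernel adjusts the period dynamically, so the second function gives just the long-run average.

```python
def samples_with_period(event_count: int, period: int) -> int:
    """Period mode (-c): one sample every `period` occurrences of the event."""
    return event_count // period


def average_period_for_frequency(event_count: int, seconds: float,
                                 frequency_hz: int) -> float:
    """Frequency mode (-F): average period needed to reach the target rate."""
    return event_count / (seconds * frequency_hz)


# Hypothetical run: 3 billion instructions retired in 2 seconds.
print(samples_with_period(3_000_000_000, 2000))                # 1500000
print(average_period_for_frequency(3_000_000_000, 2.0, 250))   # 6000000.0
```

With a fixed period the number of samples grows with the event count; with a fixed frequency the number of samples grows with the run time instead (here about 250 x 2 = 500 samples).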
perf report
• Samples collected by perf record are saved into a
binary file called, by default, perf.data. The perf
report command reads this file and generates a
concise execution profile.
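Conceptually, an execution profile is a histogram of samples. The sketch below shows that idea on a made-up list of sampled symbol names; it does not read the binary perf.data format, and `build_profile` is a hypothetical helper, not part of perf.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def build_profile(sample_symbols: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (symbol, sample count, percent of all samples), hottest first."""
    counts = Counter(sample_symbols)
    total = sum(counts.values())
    return [(sym, n, 100.0 * n / total) for sym, n in counts.most_common()]


# Made-up samples: the symbol that was executing when each sample was taken.
samples = ["foo", "bar", "foo", "foo"]
for symbol, count, percent in build_profile(samples):
    print(f"{percent:6.2f}%  {symbol}")
```

The real perf report output carries more columns (such as command and shared object), but the percentage it shows per symbol rests on the same kind of aggregation.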
