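Two kinds of problems that appear after a program has been running for a while are the most painful in software development:
1.Sudden crashes (a debugging problem)
- Windows: analyze the crash dump file with WinDbg, and run a full check of the program with Application Verifier (appverif.exe)
- Linux: analyze the core dump file with gdb, and run a full check of the program with the valgrind tools (the most commonly used one is memcheck, which detects memory leaks and other memory errors)
2.Slowdowns, or CPU (or other resource) usage that stays high (a performance problem)
- Static code analysis: too inefficient in general; it is only practical when the problem was introduced by merging a particular git branch, so that the change to review is small
- Sampling analysis: on Windows use Xperf; on Linux use perf plus flame graphs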
Note: the notes below summarize perf for my own later review, so they list many commonly used commands and run a bit long
1.perf-Overview
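- Introduction
perf is the official profiler for Linux (2.6+). It is a lightweight, kernel-level sampling and analysis tool whose source lives under tools/perf in the Linux kernel tree, and it is built on the kernel's perf_events subsystem. It is a multi-tool that combines profiling, tracing and scripting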
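- Installation
Use lsb_release -a to print the distribution information, then install with the matching command below
CentOS/RHEL: yum install perf
Fedora: dnf install perf
SUSE: zypper install perf
Ubuntu: apt install linux-tools-common linux-tools-$(uname -r) (linux-tools-common only provides the perf wrapper; the binary itself ships in the kernel-specific linux-tools package)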
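The distribution-to-command mapping above can be sketched as a small shell helper. This is only an illustration: the function name `perf_install_cmd` is hypothetical, the distro IDs are assumed to match the `ID` field of `/etc/os-release`, and the Ubuntu line assumes the perf binary comes from the kernel-specific `linux-tools-$(uname -r)` package. The helper prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: map a distro ID (as found in /etc/os-release) to the
# install command listed above. It only prints the command.
perf_install_cmd() {
    case "$1" in
        centos|rhel)    echo "yum install perf" ;;
        fedora)         echo "dnf install perf" ;;
        sles|opensuse*) echo "zypper install perf" ;;
        ubuntu)         echo "apt install linux-tools-common linux-tools-$(uname -r)" ;;
        *)              echo "unknown distro: $1" >&2; return 1 ;;
    esac
}

# Read ID from /etc/os-release when the file exists (assumed to be present
# on most modern distributions; lsb_release -a is the alternative).
if [ -r /etc/os-release ]; then
    distro_id=$(. /etc/os-release && echo "$ID")
    perf_install_cmd "$distro_id" || true
fi
```

Run the printed command with root privileges to actually install perf.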
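- Common use cases
perf is particularly well suited to CPU analysis (it can profile CPU call paths): sampling CPU stack traces, tracing CPU scheduler behavior, tracing disk I/O, and so on. A few sampling runs are usually enough to find clues to what is hurting performance
Tip: perf subcommands work much like git's, so learning perf means learning how to use its subcommands. The most common ones are stat, record, report and script. Let's start with an overview of the commonly used subcommands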
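2.Subcommands-Overview
part 1.Subcommand framework
The figure below shows the most commonly used perf subcommands, including each one's input source and output format, together with the workflow for generating a flame graph with stackcollapse-perf.pl and flamegraph.pl
Tip: once you understand every detail of this figure, you have basically learned perf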
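The flame graph workflow mentioned above (perf record, then perf script, then stackcollapse-perf.pl, then flamegraph.pl) can be sketched roughly as follows. The function name `make_flamegraph` and the `FLAMEGRAPH_DIR` default are assumptions made for illustration; the two Perl scripts are expected to come from a local copy of the FlameGraph scripts.

```shell
#!/bin/sh
# Assumption: FLAMEGRAPH_DIR points at a directory containing
# stackcollapse-perf.pl and flamegraph.pl; the default path is hypothetical.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"

# make_flamegraph [perf.data] [out.svg]
# Turns an existing perf.data file into an SVG flame graph.
make_flamegraph() {
    data="${1:-perf.data}"
    svg="${2:-flame.svg}"

    if ! command -v perf >/dev/null 2>&1; then
        echo "perf is not installed" >&2
        return 1
    fi
    if [ ! -f "$data" ]; then
        echo "no such file: $data (run 'perf record' first)" >&2
        return 1
    fi

    # perf script dumps the samples as text, stackcollapse-perf.pl folds each
    # stack into a single line, flamegraph.pl renders the folded stacks as SVG.
    perf script -i "$data" \
        | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
        | "$FLAMEGRAPH_DIR/flamegraph.pl" > "$svg"
}

# Typical use (left commented out: system-wide sampling usually needs root):
#   perf record -F 99 -a -g -- sleep 10
#   make_flamegraph perf.data flame.svg
```

Open the resulting SVG in a browser to inspect the stacks interactively.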
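part 2.Supported commands
The listing below shows the commonly used subcommands that perf supports, with a short description of each. Note: perf list lists the supported events, not the subcommands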
# perf
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
The most commonly used perf commands are:
annotate Read perf.data (created by perf record) and display annotated code
archive Create archive with object files with build-ids found in perf.data file
bench General framework for benchmark suites
buildid-cache Manage build-id cache.
buildid-list List the buildids in a perf.data file
c2c Shared Data C2C/HITM Analyzer.
config Get and set variables in a configuration file.
data Data file related processing
diff Read perf.data files and display the differential profile
evlist List the event names in a perf.data file
ftrace simple wrapper for kernel's ftrace functionality
inject Filter to augment the events stream with additional information
kallsyms Searches running kernel for symbols
kmem Tool to trace/measure kernel memory properties
kvm Tool to trace/measure kvm guest os
list List all symbolic event types
lock Analyze lock events
mem Profile memory accesses
record Run a command and record its profile into perf.data
report Read perf.data (created by perf record) and display the profile
sched Tool to trace/measure scheduler properties (latencies)
script Read perf.data (created by perf record) and display trace output
stat Run a command and gather performance counter statistics
test Runs sanity tests.
timechart Tool to visualize total system behavior during a workload
top System profiling tool.
version display the version of perf binary
probe Define new dynamic tracepoints
trace strace inspired tool
See 'perf help COMMAND' for more information on a specific command.
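part 3.Options
The -h output below lists the options of the perf stat subcommand; other subcommands work the same way
Tip: there is no need to memorize them all, and it would not be realistic anyway. Remember the common ones and look up the rest when needed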
# perf stat -h
Usage: perf stat [<options>] [<command>]
-a, --all-cpus system-wide collection from all CPUs
-A, --no-aggr disable CPU count aggregation
-B, --big-num print large numbers with thousands' separators
-C, --cpu <cpu> list of cpus to monitor in system-wide
-c, --scale scale/normalize counters
-D, --delay <n> ms to wait before starting measurement after program start
-d, --detailed detailed run - start a lot of events
-e, --event <event> event selector. use 'perf list' to list available events
-G, --cgroup <name> monitor event in cgroup name only
-g, --group put the counters into a counter group
-I, --interval-print <n>
print counts at regular interval in ms (overhead is possible for values <= 100ms)
-i, --no-inherit child tasks do not inherit counters
-M, --metrics <metric/metric group list>
monitor specified metrics or metric groups (separated by ,)
-n, --null null run - dont start any counters
-o, --output <file> output file name
-p, --pid <pid> stat events on existing process id
-r, --repeat <n> repeat command and print average + stddev (max: 100, forever: 0)
-S, --sync call sync() before starting a run
-t, --tid <tid> stat events on existing thread id
-T, --transaction hardware transaction statistics
-v, --verbose be more verbose (show counter open errors, etc)
-x, --field-separator <separator>
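As an illustration of how a few of the options listed above combine, here is a minimal sketch. The wrapper name `stat_cmd` and the output file name `stat.txt` are hypothetical, and the `cycles` and `instructions` events are assumed to be available on the machine (inside some virtual machines they are reported as not supported).

```shell
#!/bin/sh
# stat_cmd COMMAND [ARGS...]
# Hypothetical wrapper: run COMMAND three times under perf stat.
stat_cmd() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf is not installed" >&2
        return 127
    fi
    # -B          print large numbers with thousands separators
    # -e ...      count only the listed events
    # -r 3        repeat the command 3 times, print average and stddev
    # -o stat.txt write the report to a file instead of stderr
    perf stat -B -e cycles,instructions -r 3 -o stat.txt -- "$@"
}

# Example (commented out because it needs perf and suitable permissions):
#   stat_cmd sleep 1 && cat stat.txt
```

To attach to an already running process instead of launching a command, the -p option from the listing takes its PID.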
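part 4.Supported versions
The following lists the kernel versions in which the common commands were introduced, which also shows that the subcommands perf supports depend on the Linux kernel version; use cat /proc/version to check the kernel version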
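A kernel version check like the one described above can be scripted. The sketch below uses `uname -r` as an alternative to `cat /proc/version`; the helper names are hypothetical, and the `required` value is only a placeholder, not the real minimum version of any particular subcommand.

```shell
#!/bin/sh
# kernel_major_minor RELEASE
# Reduce a release string such as 5.15.0-91-generic to major.minor (5.15).
kernel_major_minor() {
    echo "$1" | sed -E 's/^([0-9]+)\.([0-9]+).*/\1.\2/'
}

# version_ge HAVE NEED
# Succeeds when major.minor version HAVE is greater than or equal to NEED.
version_ge() {
    have_major=${1%%.*}; have_minor=${1#*.}
    need_major=${2%%.*}; need_minor=${2#*.}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

running=$(kernel_major_minor "$(uname -r)")
required="4.1"   # placeholder threshold, chosen only for illustration

if version_ge "$running" "$required"; then
    echo "kernel $running is at least $required"
else
    echo "kernel $running is older than $required"
fi
```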
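part 5.Quick preview of subcommands
Below is a quick preview of the most commonly used commands, which covers roughly 80% of perf usage
#1.Listing Events
#List events whose names contain the string "block":
perf list block
#2.Counting Events
#Count system calls by type for the specified PID:
perf stat -e 'syscalls:sys_enter_*' -p PID
#Count block device I/O events for the entire system, for 10 seconds:
perf stat -e 'block:*' -a sleep 10
#3.Profiling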
#Sample on-CPU functions for the specified command, at 99 Hertz:
perf record -F 99 command
#Sample CPU stack traces (via frame pointers) system-wide ,at 99 Hertz,for 10 seconds:
perf record -F 99 -a -g sleep 10
#Sample CPU stack traces for the PID, using dwarf (debuginfo) to unwind stacks:
perf record -F 99 -p PID --call-graph dwarf sleep 10
#Record new process events via exec:
perf record -e sched:sched_process_exec -a
#4.Static Tracing
#Trace all context switches with stack traces for 1 second:
perf record -e sched:sched_switch -a -g sleep 1
#Trace all block requests, of size at least 64 Kbytes, until Ctrl-C:
perf record -e block:block_rq_issue --filter 'bytes >= 65536'
#5.Dynamic Tracing
#Add a probe for the kernel tcp_sendmsg() function entry (--add optional):
perf probe --add tcp_sendmsg
#Remove the tcp_sendmsg() tracepoint (or -d):
perf probe --del tcp_sendmsg
#6.Reporting
#Show perf.data as a text report, with data coalesced and counts and percentages:
perf report -n --stdio
#List all perf.data events, with data header (recommended):
perf script --header
#List all perf.data events, w