Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- The default for callchains is back to 'callee' when --children is not used.
(Namhyung Kim)
- Move the 'use_offset' option to the place where the annotate code
expects to find it, so that it is handled properly. (Namhyung Kim)
- Don't die when an unknown 'annotate' option is found in the perf config
file (usually ~/.perfconfig), just warn the user. (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Support %ps/%pS in libtraceevent. (Scott Wood)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The --call-graph option is complex, so we should provide a better guide for
users. Also change the help message to be consistent with the config option
names. Now perf top will show help like below:
$ perf top --call-graph
Error: option `call-graph' requires a value
Usage: perf top [<options>]
--call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
setup and enables call-graph (stack chain/backtrace):
record_mode: call graph recording mode (fp|dwarf|lbr)
record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>)
default: 8192 (bytes)
print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold (<percent>)
print_limit: maximum number of call graph entry (<number>)
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)
Default: fp,graph,0.5,caller,function
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the 'perf top --call-graph' option is the same as in 'perf record'.
But 'perf top' also needs to receive the display options of 'perf report'. To
do that, change parse_callchain_report_opt() to allow record options too.
Now perf top can receive display options like below:
$ perf top --call-graph
Error: option `call-graph' requires a value
Usage: perf top [<options>]
--call-graph
<mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]>
setup and enables call-graph (stack chain/backtrace)
recording: fp dwarf lbr, output_type (graph, flat,
fractal, or none), min percent threshold, optional
print limit, callchain order, key (function or
address), add branches
$ perf top --call-graph callee,graph,fp
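To illustrate how one comma-separated argument can carry both record and display settings in any order, here is a minimal sketch. It is not perf's parse_callchain_report_opt(): the struct cg_opts type and the in_list() and parse_call_graph() helpers are hypothetical, and only a subset of the fields (record mode, print type, order, threshold) is handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; perf's real callchain parameters differ. */
struct cg_opts {
	char record_mode[8];	/* fp, dwarf or lbr */
	char print_type[8];	/* graph, flat, fractal or none */
	char order[8];		/* caller or callee */
	double threshold;	/* minimum percent */
};

/* Return 1 if tok equals one of the strings in the NULL-terminated list. */
static int in_list(const char *tok, const char *const *list)
{
	for (; *list; list++)
		if (!strcmp(tok, *list))
			return 1;
	return 0;
}

/* Classify each comma-separated token by value, not position. 0 on success. */
static int parse_call_graph(const char *arg, struct cg_opts *o)
{
	static const char *const modes[] = { "fp", "dwarf", "lbr", NULL };
	static const char *const types[] = { "graph", "flat", "fractal", "none", NULL };
	static const char *const orders[] = { "caller", "callee", NULL };
	char tok[16];

	while (*arg) {
		size_t n = strcspn(arg, ",");
		char *end;

		if (n == 0 || n >= sizeof(tok))
			return -1;	/* empty or oversized token */
		memcpy(tok, arg, n);
		tok[n] = '\0';

		if (in_list(tok, modes)) {
			strcpy(o->record_mode, tok);
		} else if (in_list(tok, types)) {
			strcpy(o->print_type, tok);
		} else if (in_list(tok, orders)) {
			strcpy(o->order, tok);
		} else {
			/* Anything else must be a number: the threshold. */
			o->threshold = strtod(tok, &end);
			if (*end)
				return -1;
		}

		arg += n;
		if (*arg == ',')
			arg++;
	}
	return 0;
}
```

With this sketch, the string "callee,graph,fp" fills in order, print_type and record_mode regardless of the order in which the tokens appear.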
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445495330-25416-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commits such as 65dd297ac2 ("xfs: %pF is only for function
pointers") caused a regression because pretty_print() didn't support
%ps/%pS. The current %pf/%pF implementation in pretty_print() is what
%ps/%pS is supposed to do, so use the same code for %ps/%pS.
Addressing the incorrect %pf/%pF implementation is beyond the scope of
this patch.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Chinner <david@fromorbit.com>
Link: http://lkml.kernel.org/r/20150831211637.GA12848@home.buserror.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat dumps core when --per-socket or --per-core is combined with -a.
The root cause is that cpu_map__build_map() sets the refcnt of the evlist's
cpu_map to 1. It should set the refcnt of the newly created cpu_map, not that
of the evlist's cpu_map.
Here is the example:
# perf stat -e cycles --per-socket -a sleep 1
Performance counter stats for 'system wide':
S0 36 30,196,257 cycles
S1 28 15,823,536 cycles
1.001126828 seconds time elapsed
*** Error in `./perf': corrupted double-linked list: 0x00000000021f9090 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3002e7bbe7]
/lib64/libc.so.6[0x3002e7d2b5]
./perf(perf_evsel__delete+0x28)[0x485bdd]
./perf[0x4800e8]
./perf(perf_evlist__delete+0x5e)[0x482cd5]
./perf(cmd_stat+0xf25)[0x432328]
./perf[0x4768e0]
./perf[0x476ad6]
./perf[0x476b41]
./perf(main+0x1d0)[0x476db2]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45]
./perf[0x4202c5]
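The ownership mistake described above can be illustrated with a small sketch. The names used here (struct map, map_new(), map_put(), build_map()) are hypothetical and are not perf's cpu_map API; the only point is that the freshly allocated object is the one whose reference count gets initialised, while the source map's count is left alone.

```c
#include <stdlib.h>

/* Hypothetical reference-counted map; not perf's struct cpu_map. */
struct map {
	int refcnt;
	int nr;
};

/* Allocate a map holding a single reference owned by the caller. */
static struct map *map_new(int nr)
{
	struct map *m = calloc(1, sizeof(*m));

	if (m) {
		m->refcnt = 1;
		m->nr = nr;
	}
	return m;
}

/* Drop one reference and free on the last one. Returns 1 if freed. */
static int map_put(struct map *m)
{
	if (--m->refcnt == 0) {
		free(m);
		return 1;
	}
	return 0;
}

/*
 * Build a derived map from src. The buggy pattern would reset
 * src->refcnt to 1 here, so src would be freed while other holders
 * still reference it. Taking src as const rules that out.
 */
static struct map *build_map(const struct map *src)
{
	struct map *res = calloc(1, sizeof(*res));

	if (!res)
		return NULL;
	res->nr = src->nr;
	res->refcnt = 1;	/* initialise the new object's count only */
	return res;
}
```

Resetting the source's count while it has more than one holder leads to an early free followed by a second free, which is the kind of heap corruption glibc reports in the backtrace above.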
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1444388363-35936-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes:
User visible changes:
- 'perf bench mem' now prefaults unconditionally, no sense in
providing modes where page faults are measured. (Ingo Molnar)
- Harmonize -l/--nr_loops across 'perf bench'. (Ingo Molnar)
- Various 'perf bench' consistency improvements. (Ingo Molnar)
- Suppress libtraceevent warnings in non-verbose 'perf test' mode.
(Namhyung Kim)
- Move some tracepoint event test error messages to the verbose mode
of 'perf test'. (Namhyung Kim)
- Make 'perf help' usage message consistent with other tools. (Yunlong Song)
Build fixes:
- Fix 'perf bench' build with gcc 4.4.7. (Arnaldo Carvalho de Melo)
Infrastructure changes:
- 'perf stat' prep work for the 'perf stat scripting' patchkit. (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have three benchmarking subsystems that specify some sort of 'number
of loops' parameter - but all of them do it inconsistently:
numa: -l/--nr_loops
sched messaging: -l/--loops
mem memset/memcpy: -i/--iterations
Harmonize them to -l/--nr_loops by picking the numa variant - which is
also the most likely one to have existing scripting which we don't want
to break.
Plus improve the parameter help texts to indicate the default value for
the nr_loops variable to keep users from guessing ...
Also propagate the naming to internal variables.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-13-git-send-email-mingo@kernel.org
[ Let the harmonisation reach the perf-bench man page as well ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So 'perf bench mem memcpy/memset' has elaborate code to measure
memcpy()/memset() performance both with freshly allocated buffers (which
includes initial page fault overhead) and with preallocated buffers.
But the thing is, the resulting bandwidth results are mostly
meaningless, because page faults dominate so much of the cost.
It might make sense to measure cache cold vs. cache hot performance, but
the code does not do this.
So remove this complication, and always prefault the ranges before using
them.
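As a rough sketch of what "always prefault" means for such a benchmark, the buffers can be written once before the timed region so that page faults are not part of the measurement. The timed_memcpy() helper below is hypothetical and is not the 'perf bench mem' implementation; it uses clock() only to stay within standard C.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper, not perf's code: time `loops` memcpy() calls of
 * `len` bytes. Returns elapsed CPU seconds, or -1.0 on failure.
 */
static double timed_memcpy(size_t len, int loops)
{
	char *src = malloc(len);
	char *dst = malloc(len);
	double secs = -1.0;
	clock_t t0;
	int i;

	if (len && src && dst) {
		volatile char sink;

		/*
		 * Prefault: write to both buffers so their pages are mapped
		 * before timing starts. A non-zero fill is used so that the
		 * compiler cannot fold malloc() + memset(0) into calloc().
		 */
		memset(src, 0xA5, len);
		memset(dst, 0x5A, len);

		t0 = clock();
		for (i = 0; i < loops; i++)
			memcpy(dst, src, len);
		secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

		/* Read the result so the copies are not entirely dead code. */
		sink = dst[len - 1];
		(void)sink;
	}

	free(src);
	free(dst);
	return secs;
}
```

A real benchmark needs more care than this (an optimizing compiler may still merge the repeated copies), but the ordering is the relevant part: touch first, then measure.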
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-6-git-send-email-mingo@kernel.org
[ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently libtraceevent emits warnings on unsupported event formats.
However, it would be better to see them only when the -v option is given. To
do that, the warning() function used in libtraceevent needs to be overridden.
Thus add set_warning_routine(), analogous to set_die_routine(), and check the
verbose flag in our warning routine.
Before:
# perf test 5
5: parse events tests :
Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
...
Ok
After:
# perf test 5
5: parse events tests : Ok
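The override mechanism can be sketched as a function pointer that a library-style warning helper calls, with the tool installing a routine that consults its verbose flag. The identifiers below (warn_fn, set_warn_routine(), emit_warning(), quiet_unless_verbose()) are hypothetical stand-ins and do not reproduce the actual set_warning_routine() or warning() signatures.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for the library hook and the tool's state. */
typedef void (*warn_fn)(const char *fmt, va_list ap);

static int verbose;		/* stand-in for the tool's verbose flag */
static int warnings_printed;	/* counter kept only for demonstration */

/* Default behaviour: always print the warning to stderr. */
static void default_warn(const char *fmt, va_list ap)
{
	fputs("  Warning: ", stderr);
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
	warnings_printed++;
}

static warn_fn warn_routine = default_warn;

/* Let the tool replace the routine used by emit_warning(). */
static void set_warn_routine(warn_fn fn)
{
	warn_routine = fn;
}

/* Tool-side routine: forward to the default only in verbose mode. */
static void quiet_unless_verbose(const char *fmt, va_list ap)
{
	if (verbose)
		default_warn(fmt, ap);
}

/* Library-side entry point: dispatch to whatever routine is installed. */
static void emit_warning(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	warn_routine(fmt, ap);
	va_end(ap);
}
```

After set_warn_routine(quiet_unless_verbose), calls to emit_warning() stay silent unless verbose is non-zero, which mirrors the Before/After output shown above.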
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, when 'perf test' is run by a normal user, it'll fail to
access tracepoint events. The output becomes somewhat messy because it
tries to be nice with long error messages and hints.
IMHO this is not needed for 'perf test' by default and AFAIK 'perf test'
uses pr_debug() rather than pr_err() for such messages so that one can
use -v option to see further details on failed testcases if needed.
Before:
$ perf test
1: vmlinux symtab matches kallsyms : FAILED!
2: detect openat syscall event :Error:
No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
FAILED!
3: detect openat syscall event on all cpus :Error:
No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
FAILED!
...
After:
$ perf test
1: vmlinux symtab matches kallsyms : FAILED!
2: detect openat syscall event : FAILED!
3: detect openat syscall event on all cpus : FAILED!
...
$ perf test -v 2
2: detect openat syscall event :
--- start ---
test child forked, pid 30575
Error: No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
test child finished with -1
---- end ----
detect openat syscall event: FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Use the alternative with the most descriptive filename containing
a vmlinux file for a given build-id, providing a better title line
for tools such as 'annotate'. (Arnaldo Carvalho de Melo)
- Remove help messages about previous right and left arrow keybindings, that
were repurposed for horizontal scrolling. (Arnaldo Carvalho de Melo)
- Inform how to reset the symbol filter in the hists browser. (top & report)
(Arnaldo Carvalho de Melo)
- Add 'm' key for context menu display in the hists browser, that became
inaccessible with the repurposing of the right arrow key for horizontal
scrolling. (Namhyung Kim)
- Use debug_frame for callchains if eh_frame is unusable. (Rabin Vincent)
Build fixes:
- Fix strict-aliasing breakage with gcc 4.4 in the READ_ONCE/WRITE_ONCE code
adopted from the kernel tree, that builds with -fno-strict-aliasing while
tools/perf/ uses -Wstrict-aliasing=3. (Jiri Olsa)
- Fix unw_word_t pointer casts in code using libunwind for callchains,
fixing the build in at least 32-bit MIPS systems. (Rabin Vincent)
- Work around cross compile build problems related to fixdep. (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vinson reported build breakage with gcc 4.4 due to strict-aliasing.
CC util/annotate.o
cc1: warnings being treated as errors
util/annotate.c: In function ‘disasm__purge’:
linux-next/tools/include/linux/compiler.h:66: error: dereferencing
pointer ‘res.41’ does break strict-aliasing rules
The reason is the READ_ONCE/WRITE_ONCE code we took from the kernel sources.
It intentionally breaks aliasing rules. While this is ok for the kernel
because it's built with -fno-strict-aliasing, it breaks perf, which is built
with -Wstrict-aliasing=3.
Use an extra __may_alias__ type to allow aliasing in this case.
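A minimal sketch of the technique, assuming GCC or Clang (which provide the __may_alias__ type attribute): a typedef carrying the attribute may be used to access an object of another type without violating strict-aliasing rules. The u32_alias_t typedef and read_once_u32() helper are hypothetical names chosen for this example, not the definitions in tools/include/linux/compiler.h.

```c
#include <stdint.h>

/* Accesses through this type are allowed to alias objects of any type. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias_t;

/*
 * Read 32 bits from p exactly once. The volatile qualifier keeps the
 * compiler from merging or repeating the load; the alias type keeps
 * -Wstrict-aliasing quiet even when p points to a non-uint32_t object.
 */
static uint32_t read_once_u32(const volatile void *p)
{
	return *(const volatile u32_alias_t *)p;
}
```

For example, reading a float holding 1.0f through this helper yields 0x3f800000 on platforms with 32-bit IEEE 754 floats.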
Reported-and-tested-by: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Martin Liska <mliska@suse.cz>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabin@rab.in>
Cc: linux-next@vger.kernel.org
Link: http://lkml.kernel.org/r/20151013085214.GB2705@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
unw_word_t is uint64_t even on 32-bit MIPS. Cast it to uintptr_t before
the cast to void *p to get rid of the following errors:
util/unwind-libunwind.c: In function 'access_mem':
util/unwind-libunwind.c:464:4: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
util/unwind-libunwind.c:475:2: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
cc1: all warnings being treated as errors
make[3]: *** [util/unwind-libunwind.o] Error 1
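The double cast can be shown in isolation. In this sketch word_t is a hypothetical stand-in for a 64-bit unw_word_t, and word_to_ptr() is an illustrative helper rather than code from util/unwind-libunwind.c.

```c
#include <stdint.h>

/* Stand-in for an unw_word_t that is 64 bits wide on every target. */
typedef uint64_t word_t;

/*
 * Convert an address held in a 64-bit word to a pointer. Going through
 * uintptr_t first narrows the value to pointer width explicitly, so
 * -Wint-to-pointer-cast does not fire on 32-bit targets.
 */
static void *word_to_ptr(word_t addr)
{
	return (void *)(uintptr_t)addr;
}
```

On 64-bit targets the intermediate cast is a no-op; on 32-bit targets it makes the truncation explicit instead of leaving it implicit in the pointer cast.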
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabinv@axis.com>
Link: http://lkml.kernel.org/r/1443379079-29133-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>