Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
. Fix some leaks in exit paths.
. Use memdup where applicable
. Remove some die() calls, allowing callers to handle exit paths
gracefully.
. Correct typo in tools Makefile, fix from Borislav Petkov.
. Add 'perf bench numa mem' NUMA performance measurement suite, from Ingo Molnar.
. Handle dynamic array's element size properly, fix from Jiri Olsa.
. Fix memory leaks on evsel->counts, from Namhyung Kim.
. Make numa benchmark optional, allowing the build in machines where required
numa libraries are not present, fix from Peter Hurley.
. Add interval printing in 'perf stat', from Stephane Eranian.
. Fix compile warnings in tests/attr.c, from Sukadev Bhattiprolu.
. Fix double free, pclose instead of fclose, leaks and double fclose errors
found with the cppcheck tool, from Thomas Jarosch.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We print several '__u64' quantities using '%llu'. On powerpc, we by
default include '<asm-generic/int-l64.h> which results in __u64 being an
unsigned long. This causes compile warnings which are treated as errors
due to '-Werror'.
By defining __SANE_USERSPACE_TYPES__ we include <asm-generic/int-ll64.h>
and define __u64 as unsigned long long.
Changelog[v2]:
[Michael Ellerman] Use __SANE_USERSPACE_TYPES__ and avoid PRIu64
format specifier - which as Jiri Olsa pointed out, breaks on x86-64.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Anton Blanchard <anton@au1.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <ellerman@au1.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20130124054439.GA31588@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds a new printing mode for perf stat. It allows interval
printing. That means perf stat can now print event deltas at regular
time interval. This is useful to detect phases in programs.
The -I option enables interval printing. It expects an interval duration
in milliseconds. Minimum is 100ms. Once, activated perf stat prints
events deltas since last printout. All modes are supported.
$ perf stat -I 1000 -e cycles noploop 10
noploop for 10 seconds
# time counts events
1.000109853 2,388,560,546 cycles
2.000262846 2,393,332,358 cycles
3.000354131 2,393,176,537 cycles
4.000439503 2,393,203,790 cycles
5.000527075 2,393,167,675 cycles
6.000609052 2,393,203,670 cycles
7.000691082 2,393,175,678 cycles
The output format makes it easy to feed into a plotting program such as
gnuplot when the -I option is used in combination with the -x option:
$ perf stat -x, -I 1000 -e cycles noploop 10
noploop for 10 seconds
1.000084113,2378775498,cycles
2.000245798,2391056897,cycles
3.000354445,2392089414,cycles
4.000459115,2390936603,cycles
5.000565341,2392108173,cycles
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359460064-3060-3-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a suite of NUMA performance benchmarks.
The goal was simulate the behavior and access patterns of real NUMA
workloads, via a wide range of parameters, so this tool goes well
beyond simple bzero() measurements that most NUMA micro-benchmarks use:
- It processes the data and creates a chain of data dependencies,
like a real workload would. Neither the compiler, nor the
kernel (via KSM and other optimizations) nor the CPU can
eliminate parts of the workload.
- It randomizes the initial state and also randomizes the target
addresses of the processing - it's not a simple forward scan
of addresses.
- It provides flexible options to set process, thread and memory
relationship information: -G sets "global" memory shared between
all test processes, -P sets "process" memory shared by all
threads of a process and -T sets "thread" private memory.
- There's a NUMA convergence monitoring and convergence latency
measurement option via -c and -m.
- Micro-sleeps and synchronization can be injected to provoke lock
contention and scheduling, via the -u and -S options. This simulates
IO and contention.
- The -x option instructs the workload to 'perturb' itself artificially
every N seconds, by moving to the first and last CPU of the system
periodically. This way the stability of convergence equilibrium and
the number of steps taken for the scheduler to reach equilibrium again
can be measured.
- The amount of work can be specified via the -l loop count, and/or
via a -s seconds-timeout value.
- CPU and node memory binding options, to test hard binding scenarios.
THP can be turned on and off via madvise() calls.
- Live reporting of convergence progress in an 'at glance' output format.
Printing of convergence and deconvergence events.
The 'perf bench numa mem -a' option will start an array of about 30
individual tests that will each output such measurements:
# Running 5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp 1"
5x5-bw-thread, 20.276, secs, runtime-max/thread
5x5-bw-thread, 20.004, secs, runtime-min/thread
5x5-bw-thread, 20.155, secs, runtime-avg/thread
5x5-bw-thread, 0.671, %, spread-runtime/thread
5x5-bw-thread, 21.153, GB, data/thread
5x5-bw-thread, 528.818, GB, data-total
5x5-bw-thread, 0.959, nsecs, runtime/byte/thread
5x5-bw-thread, 1.043, GB/sec, thread-speed
5x5-bw-thread, 26.081, GB/sec, total-speed
See the help text and the code for more details.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
. Allow skipping problematic entries in 'perf test'.
. Fix some namespace problems in the event parsing routines.
. Add 'perf test' entry to make sure the python binding doesn't have
linking problems.
. Adjust 'perf test' attr tests verbosity levels.
. Make tools/perf build with GNU make v3.80, fix from Al Cooper.
. Do missing feature fallbacks in just one place, removing duplicated
code in multiple tools.
. Fix some memory leaks, from David Ahern.
. Fix segfault when drawing out-of-bounds jumps, from Frederik Deweerdt.
. Allow of casting an array of char to string in 'perf probe', from
Hyeoncheol Lee.
. Add support for wildcard in tracepoint system name, from Jiri Olsa.
. Update FSF postal address to be URL's, from Jon Stanley.
. Add anonymous huge page recognition, from Joshua Zhu.
. Remove some needless feature test checks, from Namhyung Kim.
. Multiple improvements to the sort routines, from Namhyung Kim.
. Fix warning on '>=' operator in libtraceevent, from Namhyung Kim.
. Use ARRAY_SIZE instead of reinventing it in 'perf script' and 'perf kmem',
from Sasha Levin.
. Remove some redundant checks, from Sasha Levin.
. Test correct variable after allocation in libtraceevent, fix from Sasha Levin.
. Mark branch_info maps as referenced, fix from Stephane Eranian.
. Fix PMU format parsing test failure, from Sukadev Bhattiprolu.
. Fix possible (unlikely) buffer overflow, from Thomas Jarosch.
. Multiple 'perf script' fixes, from Tom Zanussi.
. Add missing field in PERF_RECORD_SAMPLE documentation, from Vince Weaver.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sometimes a test is problematic for some reason and one wants to skip it,
for instance:
[root@sandy ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect open syscall event : Ok
3: detect open syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Warning: bad op token {
Warning: bad op token {
Warning: bad op token {
Warning: bad op token {
Warning: bad op token {
Warning: function is_writable_pte not defined
Segmentation fault (core dumped)
So now we can use -s/--skip while the problematic tests are being fixed,
allowing us to test all the other entries:
[root@sandy ~]# perf test -s 5
1: vmlinux symtab matches kallsyms : Ok
2: detect open syscall event : Ok
3: detect open syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Skip (user override)
6: x86 rdpmc test : Ok
7: Validate PERF_RECORD_* events & perf_sample fields : Ok
8: Test perf pmu format parsing : Ok
9: Test dso data interface : Ok
10: roundtrip evsel->name check : Ok
11: Check parsing of sched tracepoints fields : Ok
12: Generate and check syscalls:sys_enter_open event fields: Ok
13: struct perf_event_attr setup : Ok
14: Test matching and linking mutliple hists : Ok
15: Try 'use perf' in python, checking link problems : Ok
[root@sandy ~]#
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-klzd8p57jzdryafqkmlppcb1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Running the check-perf-trace scripts causes segfaults in both the Perl
and Python cases:
# perf script record check-perf-trace
# perf script -s libexec/perf-core/scripts/python/check-perf-trace.py
trace_begin
Segmentation fault (core dumped)
The reason is that the 'pevent' field was added to
perf_scripting_context but it wasn't hooked up with an actual pevent in
either case, so when one of the 'common' fields is accessed (in
util/trace-event-parse.c:get_common_fields()), pevent->events tries to
dereference a NULL pointer.
This sets the pevent field when the scripting context is set up.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lkml.kernel.org/r/d2b1b8166a6ca0a36e1f5255b88a8289058ba236.1358527965.git.tom.zanussi@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some reason the libtraceevent tracepoint-parsing code is missing
the FIELD_IS_SIGNED flag-setting code, which causes problems for the
Perl trace event binding at least, since it ends up unable to
recognize negative numbers.
Things like checking for negative return values therefore fail, causing
scripts like rwtop to instead interpret the negative return value as a
large positive value, which in turn get added to e.g. read totals with
insanely invalid results.
So set the FIELD_IS_SIGNED flag for tracepoint events that specify
"signed:1".
Before:
# perf script record rw-by-pid
# perf script report rw-by-pid
read counts by pid:
pid comm # reads bytes_requested bytes_read
------ -------------------- ----------- ---------- ----------
753 Xorg 88 512000 7.74763251095801e+20
1619 firefox 42 462 2.58254417031934e+20
1232 gnome-shell 11 176 1.10680464442257e+20
1471 gnome-terminal 3 16366 18446744073709551615
1408 libsocialweb-co 2 32 18446744073709551613
After:
# perf script report rw-by-pid
read counts by pid:
pid comm # reads bytes_requested bytes_read
------ -------------------- ----------- ---------- ----------
753 Xorg 88 512000 2764
1619 firefox 42 462 126
1232 gnome-shell 11 176 40
1471 gnome-terminal 3 16366 10
1408 libsocialweb-co 2 32 8
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lkml.kernel.org/r/1471b5968821a455cf5168bb4567964e74ecf530.1358527965.git.tom.zanussi@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>