Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
$ perf record -e intel_bts:// usleep 1
Is a valid event name.
But the code printing such PMUs was not honouring the 'event_glob'
parameter, so the following line was always appearing:
$ intel_bts// [Kernel PMU event]
Fix it:
$ [acme@felicio linux]$ perf list data
List of pre-defined events (to be used in -e):
uncore_imc/data_reads/ [Kernel PMU event]
uncore_imc/data_writes/ [Kernel PMU event]
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The patch f9db0d0f1b ("perf callchain: Allow disabling call graphs
per event") added an ability to enable/disable callchain recording per
event. But it had a problem when the enablement setting is changed at
'perf report' time using -g/--call-graph option.
For example, the following scenario will get a segfault.
$ perf record -ag sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.500 MB perf.data (2555 samples) ]
$ perf report -g none
perf: Segmentation fault
-------- backtrace --------
perf[0x53a98a]
/usr/lib/libc.so.6(+0x335af)[0x7f4e91df95af]
This is because callchain_param.sort() callback was not set but it
tried to call the function as it had the PERF_SAMPLE_CALLCHAIN bit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: f9db0d0f1b ("perf callchain: Allow disabling call graphs per event")
Link: http://lkml.kernel.org/r/1443587640-24242-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf top uses 'dso,symbol' sort keys by default so it overlooked a
problem in task's comm resolving. When the sort key contains 'comm',
some task's comm is not shown properly. This is because the
perf_top__mmap_read_idx() checks the cpumode value improperly.
The cpumode value of non-sample events are 0 (PERF_RECORD_MISC_CPUMODE_
UNKNOWN) so the events will be ignored by the switch statement. This patch
allows it for non-sample events.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A previous patch added a synthesized comm event for forked child process
but it missed that the event should contain area for sample_id_hdr at
the end. It worked by accident since the perf_event union contains
bigger event structs like mmap_events.
This patch fixes it by dynamically allocating event struct including
those area like in perf_event__synthesize_thread_map().
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes:
User visible changes:
- By default use the most precise "cycles" hw counter available, i.e.
when the user doesn't specify any event, it will try using cycles:ppp,
cycles:pp, etc. (Arnaldo Carvalho de Melo)
- Remove blank lines, headers when piping output in 'perf list', so that it can
be sanely used with 'wc -l', etc. (Arnaldo Carvalho de Melo)
- Amend documentation about max_stack and synthesized callchains. (Adrian Hunter)
- Fix 'perf probe -l' for probes added to kernel module functions. (Masami Hiramatsu)
Build fixes:
- Fix shadowed declarations that break the build on older distros. (Jiri Olsa)
- Fix build break on powerpc due to sample_reg_masks. (Sukadev Bhattiprolu)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the user doesn't specify any event, try the most precise "cycles"
available, i.e. start by "cycles:ppp" and go on removing "p" till it
works.
E.g.
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ]
$ perf evlist
cycles:pp
$ perf evlist -v
cycles:pp: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
$ grep 'model name' /proc/cpuinfo | head -1
model name : Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
$
When 'cycles' appears explicitely is specified this will not be tried,
i.e. the user has full control of the level of precision to be used:
$ perf record -e cycles usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (9 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
1, comm_exec: 1
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://www.youtube.com/watch?v=nXaxk27zwlk
Link: http://lkml.kernel.org/n/tip-b1ywebmt22pi78vjxau01wth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf probe shows more precisely message when it finds given
%return target function is inlined.
Without this fix:
----
# ./perf probe -V getname_flags%return
Return probe must be on the head of a real function.
Debuginfo analysis failed.
Error: Failed to show vars.
----
With this fix:
----
# ./perf probe -V getname_flags%return
Failed to find "getname_flags%return",
because getname_flags is an inlined function and has no return point.
Debuginfo analysis failed.
Error: Failed to show vars.
----
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164137.3733.55055.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf probe --list will get a segfault if the first kprobe event is on a
module and the second or latter one is on the kernel.
e.g.
----
# ./perf probe -q -m pcspkr pcspkr_event
# ./perf probe -q vfs_read
# ./perf probe -l
Segmentation fault (core dumped)
----
This is because the debuginfo_cache fails to handle NULL module name,
which causes segfault on strcmp. (Note that strcmp("something", NULL)
always causes segfault)
To fix this debuginfo_cache__open always translates the NULL module name
to "kernel" (this is correct, because NULL module name means opening the
debuginfo for the kernel)
----
# ./perf probe -l
probe:pcspkr_event (on pcspkr_event@drivers/input/misc/pcspkr.c
in pcspkr)
probe:vfs_read (on vfs_read@ksrc/linux-3/fs/read_write.c)
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164135.3733.23993.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf probe always failed to find appropriate line numbers because of
failing to find .text start address offset from debuginfo.
e.g.
----
# ./perf probe -m pcspkr pcspkr_event:5
Added new events:
probe:pcspkr_event (on pcspkr_event:5 in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event:5 in pcspkr)
You can now use it in all perf tools, such as:
perf record -e probe:pcspkr_event_1 -aR sleep 1
# ./perf probe -l
Failed to find debug information for address ffffffffa031f006
Failed to find debug information for address ffffffffa031f016
probe:pcspkr_event (on pcspkr_event+6 in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event+22 in pcspkr)
----
This fixes the above issue as below.
1. Get the relative address of the symbol in .text by using
map->start.
2. Adjust the address by adding the offset of .text section
in the kernel module binary.
With this fix, perf probe -l shows lines correctly.
----
# ./perf probe -l
probe:pcspkr_event (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
----
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164132.3733.24643.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix to remove dot suffix (e.g. .const, .isra) from the second or latter
events which has suffix numbers.
Since the previous commit 35a23ff928 ("perf probe: Cut off the gcc
optimization postfixes from function name") didn't care about the suffix
numbered events, therefore we'll have an error when we add additional
events on the same dot suffix functions.
e.g.
----
# ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
-a get_sigframe.isra.2.constprop.3
Failed to write event: Invalid argument
Error: Failed to add events.
----
This fixes above issue as below:
----
# ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
-a get_sigframe.isra.2.constprop.3
Added new events:
probe:get_sigframe (on get_sigframe.isra.2.constprop.3)
probe:get_sigframe_1 (on get_sigframe.isra.2.constprop.3)
You can now use it in all perf tools, such as:
perf record -e probe:get_sigframe_1 -aR sleep 1
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164130.3733.26573.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --max_stack option was added as an optimization to reduce processing time,
so people specifying --max-stack might get a increased processing time if
combined with synthesized callchains, but otherwise no real harm.
A warning about setting both --max_stack and the synthesized callchains max
depth seems like overkill. Amend the documentation.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/560A5155.4060105@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The error variable breaks build on CentOS 6.7, due to collision with
global error symbol:
CC util/evlist.o
cc1: warnings being treated as errors
In file included from util/evlist.c:28:
tools/include/linux/err.h: In function ‘ERR_PTR’:
tools/include/linux/err.h:34: error: declaration of ‘error’ shadows a global declaration
util/util.h:135: error: shadowed declaration is here
Using 'error_' name instead to fix it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-i9mdgdbrgauy3fe76s9rd125@git.kernel.org
Reported-by: Vinson Lee <vlee@twopensource.com>
[ Use 'error_' instead of 'err' to, visually, not diverge too much from include/linux/err.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Accept a zero --itrace period, meaning "as often as possible". In the case
of Intel PT that is the same as a period of 1 and a unit of 'instructions'
(i.e. --itrace=i1i). (Adrian Hunter)
- Harmonize itrace's synthesized callchains with the existing --max-stack
tool option. (Adrian Hunter)
- Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter)
- Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter)
- Slighly improve Intel PT debug logging. (Adrian Hunter)
- Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST.
(Adrian Hunter)
- Further document export-to-postgresql.py script. (Adrian Hunter)
- Add option to synthesize branch stack from auxtrace data. (Adrian Hunter)
- Use equivalent logic to avoid using dso->kernel. (Arnaldo Carvalho de Melo)
- Show proper error messages when parsing bad terms for hw/sw events. (He Kuang)
- Tracepoint event parsing improvements. (He Kuang)
- Store tracing mountpoint for better error message. (Jiri Olsa)
- Add fixdep to tools/build, bringing it closer to the kernel counterpart, from
where it is being lifted. (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Show proper error message and show valid terms when wrong config terms
is specified for hw/sw type perf events.
This patch makes the original error format function formats_error_string()
more generic, which only outputs the static config terms for hw/sw perf
events, and prepends pmu formats for pmu events.
Before this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
invalid or unsupported event: 'cpu-clock/freqx=200/'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
After this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
event syntax error: 'cpu-clock/freqx=200/'
\___ unknown term
valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
autofdo incorrectly expects branch flags to include either mispred or
predicted. In fact mispred = predicted = 0 is valid and means the flags
are not supported, which they aren't by Intel PT.
To make autofdo work, add a config option which will cause Intel PT
decoder to set the mispred flag on all branches.
Below is an example of using Intel PT with autofdo. The example is
also added to the Intel PT documentation. It requires autofdo
(https://github.com/google/autofdo) and gcc version 5. The bubble
sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
amended to take the number of elements as a parameter.
$ gcc-5 -O3 sort.c -o sort_optimized
$ ./sort_optimized 30000
Bubble sorting array of 30000 elements
2254 ms
$ cat ~/.perfconfig
[intel-pt]
mispred-all
$ perf record -e intel_pt//u ./sort 3000
Bubble sorting array of 3000 elements
58 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 3.939 MB perf.data ]
$ perf inject -i perf.data -o inj --itrace=i100usle --strip
$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
$ ./sort_autofdo 30000
Bubble sorting array of 30000 elements
2155 ms
Note there is currently no advantage to using Intel PT instead of LBR,
but that may change in the future if greater use is made of the data.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new option --strip which is used with --itrace to strip out
non-synthesized events. This results in a perf.data file that is
simpler for external tools to parse. In particular, this can be used to
prepare a perf.data file for consumption by autofdo.
A subsequent patch makes a change to Intel PT also to enable use with
autofdo and gives an example of that use.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-25-git-send-email-adrian.hunter@intel.com
[ Made it use perf_evlist__remove() + perf_evsel__delete() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf inject can process instruction traces (using the --itrace option)
which removes aux-related events and replaces them with the requested
synthesized events.
However there are still some leftovers, namely PERF_RECORD_ITRACE_START
events and the original evsel (selected event) e.g. intel_pt//
For the sake of completeness, remove them too.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-24-git-send-email-adrian.hunter@intel.com
[ Made it use perf_evlist__remove() + perf_evsel__delete() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf script has a setting to set the maximum stack depth when processing
callchains. The setting defaults to the hard-coded maximum definition
PERF_MAX_STACK_DEPTH which is 127.
It is possible, when processing instruction traces, to synthesize
callchains. Synthesized callchains do not have the kernel size
limitation and are whatever size the user requests, although validation
presently prevents the user requested a value greater that 1024. The
default value is 16.
To allow for synthesized callchains, make the scripting_max_stack value
at least the same size as the synthesized callchain size.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-21-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf report has an option (--max-stack) to set the maximum stack depth
when processing callchains. The option defaults to the hard-coded
maximum definition PERF_MAX_STACK_DEPTH which is 127. The intention of
the option is to allow the user to reduce the processing time by
reducing the amount of the callchain that is processed.
It is also possible, when processing instruction traces, to synthesize
callchains. Synthesized callchains do not have the kernel size
limitation and are whatever size the user requests, although validation
presently prevents the user requested a value greater that 1024. The
default value is 16.
To allow for synthesized callchains, make the max_stack value at least
the same size as the synthesized callchain size.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for generating branch stack context for PT samples. The
decoder reports a configurable number of branches as branch context for
each sample. Internally it keeps track of them by using a simple sliding
window. We also flush the last branch buffer on each sample to avoid
overlapping intervals.
This is useful for:
- Reporting accurate basic block edge frequencies through the perf
report branch view
- Using with --branch-history to get the wider context of samples
- Other users of LBRs
Also the Documentation is updated.
Examples:
Record with Intel PT:
perf record -e intel_pt//u ls
Branch stacks are used by default if synthesized so:
perf report --itrace=ile
is the same as:
perf report --itrace=ile -b
Branch history can be requested also:
perf report --itrace=igle --branch-history
Based-on-patch-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A non-synthesized event might not have a branch stack if branch stacks
have been synthesized (using itrace options).
An example of that is when Intel PT records sched_switch events for
decoding purposes. Those sched_switch events do not have branch stacks
even though the Intel PT decoder may be synthesizing other events that
do due to the itrace options.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf report looks at event sample types to determine if branch stacks
have been sampled. Adjust the validation to know about instruction
tracing options.
This change allows the use of the -b option which otherwise would
complain with an error like:
Error:
Selected -b but no branch data. Did you call perf record without -b?
# To display the perf.data header info,
# please use --header/--header-only options.
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1443186956-18718-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>