The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:
# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)
While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.
Applies to the following commands:
perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart
Also fixes char const* -> const char* type declaration for filename
strings.
v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the meaning of -C varies by perf command: for perf-top,
perf-stat, perf-record it means cpu list. For perf-report it means comm
list. Then perf-annotate, perf-report and perf-script use -c for cpu
list.
Fix annotate, report and script to use -C for cpu list to be consistent
with top, stat and record. This means report needs to use -c for comm
list which does introduce a backward compatibility change.
v1 -> v2
- update perf-script.txt too
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1321209008-7004-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds two new options to perf annotate:
- --no-asm-raw : Do not display raw instruction encodings
- --no-source : Do not interleave source code with assembly code
We believe those options make the output of annotate more readable.
Systematically displaying source can make it hard to follow code and
especially optimized code.
Raw encodings are not useful in most cases.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110517153207.GA9834@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[committer note: Use the 'no-' option inverting logic]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.
This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.
The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.
v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation
v3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Resolving the sample->id to an evsel since the most advanced tools,
report and annotate, and the others will too when they evolve to
properly support multi-event perf.data files.
Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By creating an perf_evlist out of the attributes in the perf.data file
header, so that we can use evlists and evsels when reading recorded
sessions in addition to when we record sessions.
More work is needed to allow tools to allow the user to select which
events are wanted when browsing sessions, be it just one or a subset of
them, aggregated or showed at the same time but with different
indications on the UI to allow seeing workloads thru different views at
the same time.
But the overall goal/trend is to more uniformly use evsels and evlists.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we'll need it when implementing the live annotate TUI browser.
This also simplifies things a bit by having the list head for the source
code to be in the dynamicly allocated part of struct annotation, that
way we don't have to pass it around, it can be found from the struct
symbol that is passed everywhere.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf annotate tool continues aggregating everything on just one
histograms, but to support the top model add support for one histogram
perf evsel in the evlist.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They will be used by perf top, so that we have just one set of routines
to do annotation.
Rename "struct sym_priv" to "struct annotation", etc, to clarify this
code a bit.
Rename "struct sym_ext" to "struct source_line", to give it a meaningful
name, that clarifies that it is a the result of an addr2line call, that
is sorted by percentage one particular source code line appeared in the
annotation.
And since we're moving things around also rename 'sym_hist->ip' to
'sym_hist->addr' as we want to do data structure annotation at some
point.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't than we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.
While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.
This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fallback, this will print
out a warning if report -D was invoked.
This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in it's place.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.
This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.
Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.
There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Relying just on ~/.perfconfig or rebuilding the tool disabling support
for the TUI is too cumbersome, so allow specifying which UI to use and
make the command line switch override whatever is in ~/.perfconfig.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make all browsers return the exit key uniformly and remove the
newtExitStruct parameter, removing one more newt specific thing from the
ui API.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Right now it will just sort and position at the hottest line, i.e.
the one where more samples were taken.
It will be at the center of the screen and later TAB/shift-TAB will
cycle thru the hottest lines.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The newt initialization routines weren't being called because the output
was a file (perf annotate > /tmp/bla) but use_browser was still 1,
because ~/.perfconfig had it as 'on', so, later on newt routines
segfaulted.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When annotating multiple entries, for instance, when running simply as:
$ perf annotate
the right and left keys, as well as TAB can be used to cycle thru the
multiple symbols being annotated.
If one doesn't like TUI annotate, disable it by editing ~/.perfconfig
and adding:
[tui]
annotate = off
Just like it is possible for report.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we don't anymore use popen to run 'perf annotate' for the selected
symbol, instead we collect per address samplings when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.
Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.
The new ui_browser class will be reused for the main, callchain aware,
histogram browser, when it will be made generic and don't assume that
the objects are always instances of the objdump_line class maintained
using list_heads.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.
While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.
Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.
The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And with that fix at least one bug:
The first hit for an entry, the one that calls malloc to create a new
instance in __perf_session__add_hist_entry, wasn't adding the count to
the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.
What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit. Doing
that in perf-record itself, however, is off-limits.
This patch introduces perf-inject, which does the same job while
leaving perf-record untouched. Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:
perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -
perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.
Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.
There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.
Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That will be in both struct hist_entry and struct
callchain_list, so that the TUI can store a pointer to the pair
(map, symbol) in the trees where hist_entries and
callchain_lists are present, to allow precise annotation instead
of looking for the first symbol with the selected name.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1269459619-982-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Before this patch we would not find a vmlinux, then try to pass
objdump "[kernel.kallsyms]" as the filename, it would get
confused and produce no output:
[root@doppio ~]# perf annotate n_tty_write
------------------------------------------------
Percent | Source code & Disassembly of [kernel.kallsyms]
------------------------------------------------
Now we check that and emit meaningful warning:
[root@doppio ~]# perf annotate n_tty_write
Can't annotate n_tty_write: No vmlinux file was found in the
path: [0] vmlinux
[1] /boot/vmlinux
[2] /boot/vmlinux-2.6.34-rc1-tip+
[3] /lib/modules/2.6.34-rc1-tip+/build/vmlinux
[4] /usr/lib/debug/lib/modules/2.6.34-rc1-tip+/vmlinux
[root@doppio ~]#
This bug was introduced when we added automatic search for
vmlinux, before that time the user had to specify a vmlinux
file.
v2: Print the warning just for the first symbol found when no
symbol name is specified, otherwise it will spam the screen
repeating the warning for each symbol.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268669073-6856-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>