The perf_event_open() system call returns EACCES if the user is
not root which results in a very confusing error message:
$ perf record -A -a -f
Error: perfcounter syscall returned with -1 (Permission denied)
Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?
It turns out that's because perf tools are checking only for
EPERM. Fix that up to get a much better error message:
$ perf record -A -a -f
Fatal: Permission error - are you root?
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1257696066-4046-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The present use of -Wcast-align causes the build to blow up on
SH due to generating a "cast increases required alignment of
target type" error on each invocation of list_for_each_entry().
It seems that this was previously reported and killed off in the
ia64 support patch, but nothing seems to have happened with
that. Presumably the same problem still remains there, too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <20091026054000.GA13517@linux-sh.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[from KS feedback]
Currently, scheduler delays are shown in a mostly transparent,
light yellow color. This color is rather hard to see on several
screens, especially projectors.
This patch changes the color of the scheduler delays to be a
much more "hard" yellow that survived the kernel summit
projector.
Reported-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064731.20ae126a@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The timechart wakeup arrows currently show no process
information when the waker/wakee are processes that are not
actually chosen to be shown on the timechart.
This patch fixes this oversight, by looking through all
processes (after giving preference to visible processes) as well
as falling back to just showing the PID if no name for the
process can be resolved.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064649.0e4959b2@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We released the first version of perf with 0.0.1 in v2.6.31,
time to double our version number to 0.0.2 ;-)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Randy Dunlap reported that 'make NO_64BIT=1' fails to build
a pure 32-b it binary on 64-bit/64-bit x86 systems.
The reason is that we dont pass in the -m32 and GCC defaults
to -m64.
So pass it in - and also extend the warning message about libelf
dependencies - glibc-dev[el] is needed as well beyond the libelf
library.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: Message-Id: <20091005131729.78444bfb.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The following perf build warnings/errors in function
argument types:
builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type
... trigger because older GCC is not able to prove that
sort_dimension__add() does not change the string.
Some goes for test_type_token().
Fix this by improving type consistency.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com>
[ Also remove ugly type cast now unnecessary. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some architectures such as Sparc, ARM and MIPS (basically
everything with flush_dcache_page()) need to deal with dcache
aliases by carefully placing pages in both kernel and user maps.
These architectures typically have to use vmalloc_user() for this.
However, on other architectures, vmalloc() is not needed and has
the downsides of being more restricted and slower than regular
allocations.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1254830228.21044.272.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Asm routines that end up having size equal to zero are not really
zero sized, and as now we do kernel_maps__fixup_sym_end, at least
for kernel routines this gets fixed.
A similar fixup needs to be done for the userspace bits as well,
but as this fixup started only because in /proc/kallsyms we don't
have the end address nor the function size, it appeared here first.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254796503-27203-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For doing work on the Linux power management components, I need to
make long (30+ seconds) traces. Currently, this then results in a
HUGE svg file, with mostly process data that isn't interesting.
This patch adds a --power-only mode to perf timechart that only
outputs the CPU power section of the SVG; this significantly
reduces the size of the SVG file, making even 30+ second traces
viewable with inkscape.
As a minor tweak for the same effect, the minimum text size is
decreased; current inkscape cannot zoom in deep enough to show text
this small, but it reduces inkscape compute time.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: peterz@infradead.org
LKML-Reference: <20090924154013.0675ab71@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
openat() is still a young glibc facility, better to not use it in a
non performance critical program (perf list)
Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36
on my dev machine for example).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ulrich Drepper <drepper@redhat.com>
LKML-Reference: <4ABB767D.6080004@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
"perf top" cores dump on my dev machine, if run from a directory
where vmlinux is present:
*** glibc detected *** malloc(): memory corruption: 0x085670d0 ***
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@kernel.org>
LKML-Reference: <4ABB6EB7.7000002@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Avi Kivity reported 'perf annotate' failures with modules, the
requested function was not annotated.
If there are no modules currently loaded, or the last module
scanned is not loaded, dso__load_modules() steps on the value from
dso__load_vmlinux(), so we happily load the kallsyms symbols on top
of what we've already loaded.
Fix that such that the total count of symbols loaded is returned.
Should module symbol load fail after parsing of vmlinux, is's a
hard failure, so do not silently fall-back to kallsyms.
Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1253697658.11461.36.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tweak the output SVG to increase performance in SVG viewers by
limiting the different types of font sizes and by smarter
transformations on the text.
At least with Inkscape this gives a notable performance improvement
during zoom and scrolling.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181438.3a49cb93@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds a command line option for timechart that allows the
user to specify the width of the SVG file.
This patch also makes sure that each second of recording has at
least 200 units (pixels at 96 DPI) of width. This impacts
recordings longer than 5 seconds; recordings shorter than 5 second
will scale up to have a width of 1000 units for the whole recording
(as before).
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181416.69570c5d@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Given that scheduler latencies are the hot thing nowadays, show the
duration of said latencies in the SVG in text form.
In addition, if the latency is more than 10 msec, pick a brighter
yellow color as a way to point these long delays out.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181353.796f4509@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
timechart is a tool to visualize what is going on in the system.
The user makes a trace of what is going on with
> perf record --timechart /usr/bin/some_command
and then can turn the output of this into an svg file
> perf timechart
which then can be viewed with any SVG view; inkscape works well
enough for me.
The idea behind timechart is to create a "infinitely zoomable"
picture; something that has high level information on a 1:1 zoom
level, but which exposes more details every time you zoom into a
specific area.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130713.6a77bbc0@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The trace event name<->id mapping is dynamic for each kernel
compile. In order for perf.data to be useable outside the actual
system, we thus need to store a table of this mapping for later
use.
This patch adds this table to perf.data, and provides helper
functions for lookup up fields from this table.
To avoid mistakes, lookup-from-table is kept completely seprate
from lookup-from-local-debugfs.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130405.6960d099@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
perf sched record passes unparsed args on to perf record, so
specifying an output file via perf sched record -o FILE (cmd) just
works. Ergo, provide an option to specify input file as well.
Also add the missing 'map' command to help.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1253254944.20589.11.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The name length of some trace events is longer than 30, like
sys_enter_sched_get_priority_max and
ext4_mb_discard_preallocations.
Passing those events to perf-record will fail, try:
# ./perf record -f -e syscalls:sys_enter_sched_get_priority_max -F 1 -a
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4AB1F4AB.7050205@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I noticed that perf-record continues profiling itself after the
child terminated and we're draining the buffer.
This can cause a _lot_ of overhead with --all recording - we keep
and keep recording, which produces new and new events.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter noticed that we have 3 ways of referring to the idle thread:
[idle]:0
swapper:0
swapper-0
Standardize on 'swapper:0'.
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use 'perf sched latency' to track the current task based on
context-switch events, and flag the cases where there's some
impossible transition: such as a PID being switched out that
was not switched in.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Output such lost event and state machine weirdness stats:
TOTAL: | 14974.910 ms | 46384 |
---------------------------------------------------
INFO: 8.865% lost events (19132 out of 215819, in 8 chunks)
INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?)
And increase buffering to -m 1024 (4 MB) by default. Since we
use output multiplexing that kind of space is needed.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>