Commit Graph

2336 Commits

Author SHA1 Message Date
Linus Torvalds
e0972916e8 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Features:

   - Add "uretprobes" - an optimization to uprobes, like kretprobes are
     an optimization to kprobes.  "perf probe -x file sym%return" now
     works like kretprobes.  By Oleg Nesterov.

   - Introduce per core aggregation in 'perf stat', from Stephane
     Eranian.

   - Add memory profiling via PEBS, from Stephane Eranian.

   - Event group view for 'annotate' in --stdio, --tui and --gtk, from
     Namhyung Kim.

   - Add support for AMD NB and L2I "uncore" counters, by Jacob Shin.

   - Add Ivy Bridge-EP uncore support, by Zheng Yan

   - IBM zEnterprise EC12 oprofile support patchlet from Robert Richter.

   - Add perf test entries for checking breakpoint overflow signal
     handler issues, from Jiri Olsa.

   - Add perf test entry for for checking number of EXIT events, from
     Namhyung Kim.

   - Add perf test entries for checking --cpu in record and stat, from
     Jiri Olsa.

   - Introduce perf stat --repeat forever, from Frederik Deweerdt.

   - Add --no-demangle to report/top, from Namhyung Kim.

   - PowerPC fixes plus a couple of cleanups/optimizations in uprobes
     and trace_uprobes, by Oleg Nesterov.

  Various fixes and refactorings:

   - Fix dependency of the python binding wrt libtraceevent, from
     Naohiro Aota.

   - Simplify some perf_evlist methods and to allow 'stat' to share code
     with 'record' and 'trace', by Arnaldo Carvalho de Melo.

   - Remove dead code in related to libtraceevent integration, from
     Namhyung Kim.

   - Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf
     sched lat' back working, by Arnaldo Carvalho de Melo

   - We don't use Newt anymore, just plain libslang, by Arnaldo Carvalho
     de Melo.

   - Kill a bunch of die() calls, from Namhyung Kim.

   - Fix build on non-glibc systems due to libio.h absence, from Cody P
     Schafer.

   - Remove some perf_session and tracing dead code, from David Ahern.

   - Honor parallel jobs, fix from Borislav Petkov

   - Introduce tools/lib/lk library, initially just removing duplication
     among tools/perf and tools/vm.  from Borislav Petkov

  ... and many more I missed to list, see the shortlog and git log for
  more details."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (136 commits)
  perf/x86/intel/P4: Robistify P4 PMU types
  perf/x86/amd: Fix AMD NB and L2I "uncore" support
  perf/x86/amd: Remove old-style NB counter support from perf_event_amd.c
  perf/x86: Check all MSRs before passing hw check
  perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
  perf/x86/intel: Add Ivy Bridge-EP uncore support
  perf/x86/intel: Fix SNB-EP CBO and PCU uncore PMU filter management
  perf/x86: Avoid kfree() in CPU_{STARTING,DYING}
  uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty
  uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit()
  uprobes/tracing: Change create_trace_uprobe() to support uretprobes
  uprobes/tracing: Make seq_printf() code uretprobe-friendly
  uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
  uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
  uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
  uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
  uprobes/tracing: Generalize struct uprobe_trace_entry_head
  uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
  uprobes/tracing: Kill the pointless seq_print_ip_sym() call
  uprobes/tracing: Kill the pointless task_pt_regs() calls
  ...
2013-04-30 07:41:01 -07:00
Steven Rostedt
6c24499f40 tracing: Fix small merge bug
During the 3.10 merge, a conflict happened and the resolution was
almost, but not quite, correct. An if statement was reversed.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[ Duh. That was just silly of me  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 07:23:08 -07:00
Linus Torvalds
9e8529afc4 Tracing updates for Linux 3.10
Along with the usual minor fixes and clean ups there are a few major
 changes with this pull request.
 
 1) Multiple buffers for the ftrace facility
 
 This feature has been requested by many people over the last few years.
 I even heard that Google was about to implement it themselves. I finally
 had time and cleaned up the code such that you can now create multiple
 instances of the ftrace buffer and have different events go to different
 buffers. This way, a low frequency event will not be lost in the noise
 of a high frequency event.
 
 Note, currently only events can go to different buffers, the tracers
 (ie. function, function_graph and the latency tracers) still can only
 be written to the main buffer.
 
 2) The function tracer triggers have now been extended.
 
 The function tracer had two triggers. One to enable tracing when a
 function is hit, and one to disable tracing. Now you can record a
 stack trace on a single (or many) function(s), take a snapshot of the
 buffer (copy it to the snapshot buffer), and you can enable or disable
 an event to be traced when a function is hit.
 
 3) A perf clock has been added.
 
 A "perf" clock can be chosen to be used when tracing. This will cause
 ftrace to use the same clock as perf uses, and hopefully this will make
 it easier to interleave the perf and ftrace data for analysis.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJRfnTPAAoJEOdOSU1xswtMqYYH/1WIdrwXmxHflErnYkCIr3sU
 QtYae2K5A1HcgiqOvRJrdWMOt016iMx5CaQQyBFM1vvMiPY0sTWRmwNxDfZzz9LN
 10jRvWEzZSLtzl+a9mkFWLEpr5nR/QODOxkWFCnRWscp46sp04LSTxGDYsOnPQZB
 sam/AQ1h4xA+DqDBChm9BDEUEPorGleTlN54LBaCGgSFGvrbF+eAg2s4vHNAQAvQ
 8d5xjSE9zC7J+FqbVxvJTbKI3+EqKL6hMsJKsKfi0SI+FuxBaFMSltXck5zKyTI4
 HpNJzXCmw+v90Tju7oMkPHh6RTbESPCHoGU+wqE52fM6m7oScVeuI/kfc6USwU4=
 =W1n+
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Along with the usual minor fixes and clean ups there are a few major
  changes with this pull request.

   1) Multiple buffers for the ftrace facility

  This feature has been requested by many people over the last few
  years.  I even heard that Google was about to implement it themselves.
  I finally had time and cleaned up the code such that you can now
  create multiple instances of the ftrace buffer and have different
  events go to different buffers.  This way, a low frequency event will
  not be lost in the noise of a high frequency event.

  Note, currently only events can go to different buffers, the tracers
  (ie function, function_graph and the latency tracers) still can only
  be written to the main buffer.

   2) The function tracer triggers have now been extended.

  The function tracer had two triggers.  One to enable tracing when a
  function is hit, and one to disable tracing.  Now you can record a
  stack trace on a single (or many) function(s), take a snapshot of the
  buffer (copy it to the snapshot buffer), and you can enable or disable
  an event to be traced when a function is hit.

   3) A perf clock has been added.

  A "perf" clock can be chosen to be used when tracing.  This will cause
  ftrace to use the same clock as perf uses, and hopefully this will
  make it easier to interleave the perf and ftrace data for analysis."

* tag 'trace-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (82 commits)
  tracepoints: Prevent null probe from being added
  tracing: Compare to 1 instead of zero for is_signed_type()
  tracing: Remove obsolete macro guard _TRACE_PROFILE_INIT
  ftrace: Get rid of ftrace_profile_bits
  tracing: Check return value of tracing_init_dentry()
  tracing: Get rid of unneeded key calculation in ftrace_hash_move()
  tracing: Reset ftrace_graph_filter_enabled if count is zero
  tracing: Fix off-by-one on allocating stat->pages
  kernel: tracing: Use strlcpy instead of strncpy
  tracing: Update debugfs README file
  tracing: Fix ftrace_dump()
  tracing: Rename trace_event_mutex to trace_event_sem
  tracing: Fix comment about prefix in arch_syscall_match_sym_name()
  tracing: Convert trace_destroy_fields() to static
  tracing: Move find_event_field() into trace_events.c
  tracing: Use TRACE_MAX_PRINT instead of constant
  tracing: Use pr_warn_once instead of open coded implementation
  ring-buffer: Add ring buffer startup selftest
  tracing: Bring Documentation/trace/ftrace.txt up to date
  tracing: Add "perf" trace_clock
  ...

Conflicts:
	kernel/trace/ftrace.c
	kernel/trace/trace.c
2013-04-29 13:55:38 -07:00
Ingo Molnar
73e21ce28d Merge branch 'perf/urgent' into perf/core
Conflicts:
	arch/x86/kernel/cpu/perf_event_intel.c

Merge in the latest fixes before applying new patches, resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-21 10:57:33 +02:00
Linus Torvalds
0a82a8d132 Revert "block: add missing block_bio_complete() tracepoint"
This reverts commit 3a366e614d.

Wanlong Gao reports that it causes a kernel panic on his machine several
minutes after boot. Reverting it removes the panic.

Jens says:
 "It's not quite clear why that is yet, so I think we should just revert
  the commit for 3.9 final (which I'm assuming is pretty close).

  The wifi is crap at the LSF hotel, so sending this email instead of
  queueing up a revert and pull request."

Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Requested-by: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-18 09:00:26 -07:00
Oleg Nesterov
515619f209 uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty
perf_trace_buf_prepare() + perf_trace_buf_submit() make no sense
if this task/CPU has no active counters. Change uprobe_perf_print()
to return if hlist_empty(call->perf_events).

Note: this is not uprobe-specific, we can change other users too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-04-15 17:39:52 +02:00
Linus Torvalds
ae9f4939ba Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixlets"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix error return code
  ftrace: Fix strncpy() use, use strlcpy() instead of strncpy()
  perf: Fix strncpy() use, use strlcpy() instead of strncpy()
  perf: Fix strncpy() use, always make sure it's NUL terminated
  perf: Fix ring_buffer perf_output_space() boundary calculation
  perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer()
2013-04-14 11:10:44 -07:00
Oleg Nesterov
32520b2c69 uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit()
uprobe_perf_print() passes addr=ip to perf_trace_buf_submit() for
no reason. This sets perf_sample_data->addr for PERF_SAMPLE_ADDR,
we already have perf_sample_data->ip initialized if PERF_SAMPLE_IP.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-04-13 15:32:04 +02:00
Oleg Nesterov
4ee5a52ed6 uprobes/tracing: Change create_trace_uprobe() to support uretprobes
Finally change create_trace_uprobe() to check if argv[0][0] == 'r'
and pass the correct "is_ret" to alloc_trace_uprobe().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:03 +02:00
Oleg Nesterov
3ede82dd3e uprobes/tracing: Make seq_printf() code uretprobe-friendly
Change probes_seq_show() and print_uprobe_event() to check
is_ret_probe() and print the correct data.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:03 +02:00
Oleg Nesterov
4d1298e212 uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
Change uprobe_event_define_fields(), and __set_print_fmt() to check
is_ret_probe() and use the appropriate format/fields.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:03 +02:00
Oleg Nesterov
393a736c28 uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
Change uprobe_trace_print() and uprobe_perf_print() to check
is_ret_probe() and fill ring_buffer_event accordingly.

Also change uprobe_trace_func() and uprobe_perf_func() to not
_print() if is_ret_probe() is true. Note that we keep ->handler()
nontrivial even for uretprobe, we need this for filtering and for
other potential extensions.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:03 +02:00
Oleg Nesterov
c1ae5c75e1 uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
Create the new functions we need to support uretprobes, and change
alloc_trace_uprobe() to initialize consumer.ret_handler if the new
"is_ret" argument is true. Curently this argument is always false,
so the new code is never called and is_ret_probe(tu) is false too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:02 +02:00
Oleg Nesterov
a51cc60417 uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
Extract the output code from uprobe_trace_func() and uprobe_perf_func()
into the new helpers, they will be used by ->ret_handler() too. We also
add the unused "unsigned long func" argument in advance, to simplify the
next changes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:01 +02:00
Oleg Nesterov
457d1772f1 uprobes/tracing: Generalize struct uprobe_trace_entry_head
struct uprobe_trace_entry_head has a single member for reporting,
"unsigned long ip". If we want to support uretprobes we need to
create another struct which has "func" and "ret_ip" and duplicate
a lot of functions, like trace_kprobe.c does.

To avoid this copy-and-paste horror we turn ->ip into ->vaddr[]
and add couple of trivial helpers to calculate sizeof/data. This
uglifies the code a bit, but this allows us to avoid a lot more
complications later, when we add the support for ret-probes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:01 +02:00
Oleg Nesterov
0e3853d202 uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
uprobe_trace_func() is never called with irqs or preemption
disabled, no need to ask preempt_count() or local_save_flags().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:32:00 +02:00
Oleg Nesterov
456fdbcb86 uprobes/tracing: Kill the pointless seq_print_ip_sym() call
seq_print_ip_sym(ip) in print_uprobe_event() is pointless,
kallsyms_lookup(ip) can not resolve a user-space address.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:31:59 +02:00
Oleg Nesterov
07720b63a9 uprobes/tracing: Kill the pointless task_pt_regs() calls
uprobe_trace_func() and uprobe_perf_func() do not need task_pt_regs(),
we already have "struct pt_regs *regs".

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Anton Arapov <anton@redhat.com>
2013-04-13 15:31:59 +02:00
Namhyung Kim
20079ebe73 ftrace: Get rid of ftrace_profile_bits
It seems that function profiler's hash size is fixed at 1024.  Add and
use FTRACE_PROFILE_HASH_BITS instead and update hash size macro.

Link: http://lkml.kernel.org/r/1365551750-4504-1-git-send-email-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 23:02:33 -04:00
Namhyung Kim
ed6f1c996b tracing: Check return value of tracing_init_dentry()
Check return value and bail out if it's NULL.

Link: http://lkml.kernel.org/r/1365553093-10180-2-git-send-email-namhyung@kernel.org

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 23:02:32 -04:00
Namhyung Kim
f1943977e6 tracing: Get rid of unneeded key calculation in ftrace_hash_move()
It's not used anywhere in the function.

Link: http://lkml.kernel.org/r/1365553093-10180-1-git-send-email-namhyung@kernel.org

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 23:02:31 -04:00
Namhyung Kim
9f50afccfd tracing: Reset ftrace_graph_filter_enabled if count is zero
The ftrace_graph_count can be decreased with a "!" pattern, so that
the enabled flag should be updated too.

Link: http://lkml.kernel.org/r/1365663698-2413-1-git-send-email-namhyung@kernel.org

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 23:02:30 -04:00
Steven Rostedt (Red Hat)
7f49ef69db ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section
As ftrace_filter_lseek is now used with ftrace_pid_fops, it needs to
be moved out of the #ifdef CONFIG_DYNAMIC_FTRACE section as the
ftrace_pid_fops is defined when DYNAMIC_FTRACE is not.

Cc: stable@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 17:12:41 -04:00
Namhyung Kim
6a76f8c0ab tracing: Fix possible NULL pointer dereferences
Currently set_ftrace_pid and set_graph_function files use seq_lseek
for their fops.  However seq_open() is called only for FMODE_READ in
the fops->open() so that if an user tries to seek one of those file
when she open it for writing, it sees NULL seq_file and then panic.

It can be easily reproduced with following command:

  $ cd /sys/kernel/debug/tracing
  $ echo 1234 | sudo tee -a set_ftrace_pid

In this example, GNU coreutils' tee opens the file with fopen(, "a")
and then the fopen() internally calls lseek().

Link: http://lkml.kernel.org/r/1365663302-2170-1-git-send-email-namhyung@kernel.org

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12 14:43:34 -04:00
Namhyung Kim
39e30cd153 tracing: Fix off-by-one on allocating stat->pages
The first page was allocated separately, so no need to start from 0.

Link: http://lkml.kernel.org/r/1364820385-32027-2-git-send-email-namhyung@kernel.org

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-09 19:00:49 -04:00
Namhyung Kim
83e03b3fe4 tracing: Fix double free when function profile init failed
On the failure path, stat->start and stat->pages will refer same page.
So it'll attempt to free the same page again and get kernel panic.

Link: http://lkml.kernel.org/r/1364820385-32027-1-git-send-email-namhyung@kernel.org

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-09 18:54:04 -04:00
Chen Gang
9607a869ee kernel: tracing: Use strlcpy instead of strncpy
Use strlcpy() instead of strncpy() as it will always add a '\0'
to the end of the string even if the buffer is smaller than what
is being copied.

Link: http://lkml.kernel.org/r/51624254.30301@asianux.com

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-09 11:25:08 -04:00
Steven Rostedt (Red Hat)
395b97a3ae ftrace: Do not call stub functions in control loop
The function tracing control loop used by perf spits out a warning
if the called function is not a control function. This is because
the control function references a per cpu allocated data structure
on struct ftrace_ops that is not allocated for other types of
functions.

commit 0a016409e4 "ftrace: Optimize the function tracer list loop"

Had an optimization done to all function tracing loops to optimize
for a single registered ops. Unfortunately, this allows for a slight
race when tracing starts or ends, where the stub function might be
called after the current registered ops is removed. In this case we
get the following dump:

root# perf stat -e ftrace:function sleep 1
[   74.339105] WARNING: at include/linux/ftrace.h:209 ftrace_ops_control_func+0xde/0xf0()
[   74.349522] Hardware name: PRIMERGY RX200 S6
[   74.357149] Modules linked in: sg igb iTCO_wdt ptp pps_core iTCO_vendor_support i7core_edac dca lpc_ich i2c_i801 coretemp edac_core crc32c_intel mfd_core ghash_clmulni_intel dm_multipath acpi_power_meter pcspk
r microcode vhost_net tun macvtap macvlan nfsd kvm_intel kvm auth_rpcgss nfs_acl lockd sunrpc uinput xfs libcrc32c sd_mod crc_t10dif sr_mod cdrom mgag200 i2c_algo_bit drm_kms_helper ttm qla2xxx mptsas ahci drm li
bahci scsi_transport_sas mptscsih libata scsi_transport_fc i2c_core mptbase scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
[   74.446233] Pid: 1377, comm: perf Tainted: G        W    3.9.0-rc1 #1
[   74.453458] Call Trace:
[   74.456233]  [<ffffffff81062e3f>] warn_slowpath_common+0x7f/0xc0
[   74.462997]  [<ffffffff810fbc60>] ? rcu_note_context_switch+0xa0/0xa0
[   74.470272]  [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
[   74.478117]  [<ffffffff81062e9a>] warn_slowpath_null+0x1a/0x20
[   74.484681]  [<ffffffff81102ede>] ftrace_ops_control_func+0xde/0xf0
[   74.491760]  [<ffffffff8162f400>] ftrace_call+0x5/0x2f
[   74.497511]  [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
[   74.503486]  [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
[   74.509500]  [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
[   74.516088]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[   74.522268]  [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
[   74.528837]  [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
[   74.536696]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[   74.542878]  [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
[   74.548869]  [<ffffffff81105c67>] unregister_ftrace_function+0x27/0x50
[   74.556243]  [<ffffffff8111eadf>] perf_ftrace_event_register+0x9f/0x140
[   74.563709]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[   74.569887]  [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
[   74.575898]  [<ffffffff8111e94e>] perf_trace_destroy+0x2e/0x50
[   74.582505]  [<ffffffff81127ba9>] tp_perf_event_destroy+0x9/0x10
[   74.589298]  [<ffffffff811295d0>] free_event+0x70/0x1a0
[   74.595208]  [<ffffffff8112a579>] perf_event_release_kernel+0x69/0xa0
[   74.602460]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
[   74.608667]  [<ffffffff8112a640>] put_event+0x90/0xc0
[   74.614373]  [<ffffffff8112a740>] perf_release+0x10/0x20
[   74.620367]  [<ffffffff811a3044>] __fput+0xf4/0x280
[   74.625894]  [<ffffffff811a31de>] ____fput+0xe/0x10
[   74.631387]  [<ffffffff81083697>] task_work_run+0xa7/0xe0
[   74.637452]  [<ffffffff81014981>] do_notify_resume+0x71/0xb0
[   74.643843]  [<ffffffff8162fa92>] int_signal+0x12/0x17

To fix this a new ftrace_ops flag is added that denotes the ftrace_list_end
ftrace_ops stub as just that, a stub. This flag is now checked in the
control loop and the function is not called if the flag is set.

Thanks to Jovi for not just reporting the bug, but also pointing out
where the bug was in the code.

Link: http://lkml.kernel.org/r/514A8855.7090402@redhat.com
Link: http://lkml.kernel.org/r/1364377499-1900-15-git-send-email-jovi.zhangwei@huawei.com

Tested-by: WANG Chao <chaowang@redhat.com>
Reported-by: WANG Chao <chaowang@redhat.com>
Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08 12:24:23 -04:00
Jan Kiszka
5000c41884 ftrace: Consistently restore trace function on sysctl enabling
If we reenable ftrace via syctl, we currently set ftrace_trace_function
based on the previous simplistic algorithm. This is inconsistent with
what update_ftrace_function does. So better call that helper instead.

Link: http://lkml.kernel.org/r/5151D26F.1070702@siemens.com

Cc: stable@vger.kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08 12:24:22 -04:00
Steven Rostedt (Red Hat)
2930e04d00 tracing: Fix race with update_max_tr_single and changing tracers
The commit 34600f0e9 "tracing: Fix race with max_tr and changing tracers"
fixed the updating of the main buffers with the race of changing
tracers, but left out the fix to the updating of just a per cpu buffer.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-08 12:24:22 -04:00
Chen Gang
75761cc158 ftrace: Fix strncpy() use, use strlcpy() instead of strncpy()
For NUL terminated string we always need to set '\0' at the end.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: rostedt@goodmis.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/516243B7.9020405@asianux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-08 13:26:56 +02:00
Chen Gang
67012ab1d2 perf: Fix strncpy() use, use strlcpy() instead of strncpy()
For NUL terminated string we always need to set '\0' at the end.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: rostedt@goodmis.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/51624254.30301@asianux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-08 13:26:56 +02:00
Steven Rostedt (Red Hat)
22f45649ce tracing: Update debugfs README file
Update the README file in debugfs/tracing to something more useful.
What's currently in the file is very old and what it shows doesn't
have much use. Heck, it tells you how to mount debugfs! But to read
this file you would have already needed to mount it.

Replace the file with current up-to-date information. It's rather
limited, but what do you expect from a pseudo README file.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-20 21:55:02 -04:00
Ingo Molnar
1f1b396758 Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull tracing fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 09:48:29 +01:00
Steven Rostedt (Red Hat)
7fe70b579c tracing: Fix ftrace_dump()
ftrace_dump() had a lot of issues. What ftrace_dump() does, is when
ftrace_dump_on_oops is set (via a kernel parameter or sysctl), it
will dump out the ftrace buffers to the console when either a oops,
panic, or a sysrq-z occurs.

This was written a long time ago when ftrace was fragile to recursion.
But it wasn't written well even for that.

There's a possible deadlock that can occur if a ftrace_dump() is happening
and an NMI triggers another dump. This is because it grabs a lock
before checking if the dump ran.

It also totally disables ftrace, and tracing for no good reasons.

As the ring_buffer now checks if it is read via a oops or NMI, where
there's a chance that the buffer gets corrupted, it will disable
itself. No need to have ftrace_dump() do the same.

ftrace_dump() is now cleaned up where it uses an atomic counter to
make sure only one dump happens at a time. A simple atomic_inc_return()
is enough that is needed for both other CPUs and NMIs. No need for
a spinlock, as if one CPU is running the dump, no other CPU needs
to do it too.

The tracing_on variable is turned off and not turned on. The original
code did this, but it wasn't pretty. By just disabling this variable
we get the result of not seeing traces that happen between crashes.

For sysrq-z, it doesn't get turned on, but the user can always write
a '1' to the tracing_on file. If they are using sysrq-z, then they should
know about tracing_on.

The new code is much easier to read and less error prone. No more
deadlock possibility when an NMI triggers here.

Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 19:24:56 -04:00
zhangwei(Jovi)
52f6ad6dc3 tracing: Rename trace_event_mutex to trace_event_sem
trace_event_mutex is an rw semaphore now, not a mutex, change the name.

Link: http://lkml.kernel.org/r/513D843B.40109@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
[ Forward ported to my new code ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:10 -04:00
zhangwei(Jovi)
36a78e9e87 tracing: Fix comment about prefix in arch_syscall_match_sym_name()
ppc64 has its own syscall prefix like ".SyS" or ".sys". Make the
comment in arch_syscall_match_sym_name() more understandable.

Link: http://lkml.kernel.org/r/513D842F.40205@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:09 -04:00
zhangwei(Jovi)
ad7067cebf tracing: Convert trace_destroy_fields() to static
trace_destroy_fields() is not used outside of the file. It can be
a static function.

Link: http://lkml.kernel.org/r/513D842A.2000907@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:08 -04:00
zhangwei(Jovi)
b3a8c6fd7b tracing: Move find_event_field() into trace_events.c
By moving find_event_field() and trace_find_field() into trace_events.c,
the ftrace_common_fields list and trace_get_fields() can become local to
the trace_events.c file.

find_event_field() is renamed to trace_find_event_field() to conform to
the tracing global function names.

Link: http://lkml.kernel.org/r/513D8426.9070109@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
[ rostedt: Modified trace_find_field() to trace_find_event_field() ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:07 -04:00
zhangwei(Jovi)
bd6df18716 tracing: Use TRACE_MAX_PRINT instead of constant
TRACE_MAX_PRINT macro is defined, but is not used.

Link: http://lkml.kernel.org/r/513D8421.4070404@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:06 -04:00
zhangwei(Jovi)
687c878afb tracing: Use pr_warn_once instead of open coded implementation
Use pr_warn_once, instead of making an open coded implementation.

Link: http://lkml.kernel.org/r/513D8419.20400@huawei.com

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:22:05 -04:00
Steven Rostedt (Red Hat)
6c43e554a2 ring-buffer: Add ring buffer startup selftest
When testing my large changes to the ftrace system, there was
a bug that looked like the ring buffer was dropping events.
I wrote up a quick integrity checker of the ring buffer to
see if it was.

Although the bug ended up being something stupid I did in ftrace,
and had nothing to do with the ring buffer, I figured if I spent
the time to write up this test, I might as well include it in the
kernel.

I cleaned it up a bit, as the original version was rather ugly.
Not saying this version is pretty, but it's a beauty queen
compared to what I original wrote.

To enable the start up test, set CONFIG_RING_BUFFER_STARTUP_TEST.

Note, it runs for 10 seconds, so it will slow your boot time
by at least 10 more seconds.

What it does is documented in both the comments and the Kconfig
help.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 13:21:16 -04:00
Steven Rostedt (Red Hat)
76f119179b tracing: Add "perf" trace_clock
The function trace_clock() calls "local_clock()" which is exactly
the same clock that perf uses. I'm not sure why perf doesn't call
trace_clock(), as trace_clock() doesn't have any users.

But now it does. As trace_clock() calls local_clock() like perf does,
I added the trace_clock "perf" option that uses trace_clock().

Now the ftrace buffers can use the same clock as perf uses. This
will be useful when perf starts reading the ftrace buffers, and will
be able to interleave them with the same clock data.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:10 -04:00
Steven Rostedt (Red Hat)
8aacf017b0 tracing: Add "uptime" trace clock that uses jiffies
Add a simple trace clock called "uptime" for those that are
interested in the uptime of the trace. It uses jiffies as that's
the safest method, as other uptime clocks grab seq locks, which could
cause a deadlock if taken from an event or function tracer.

Requested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:09 -04:00
Steven Rostedt (Red Hat)
328df4759c tracing: Add function-trace option to disable function tracing of latency tracers
Currently, the only way to stop the latency tracers from doing function
tracing is to fully disable the function tracer from the proc file
system:

  echo 0 > /proc/sys/kernel/ftrace_enabled

This is a big hammer approach as it disables function tracing for
all users. This includes kprobes, perf, stack tracer, etc.

Instead, create a function-trace option that the latency tracers can
check to determine if it should enable function tracing or not.
This option can be set or cleared even while the tracer is active
and the tracers will disable or enable function tracing depending
on how the option was set.

Instead of using the proc file, disable latency function tracing with

  echo 0 > /debug/tracing/options/function-trace

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:08 -04:00
Steven Rostedt (Red Hat)
4df297129f tracing: Remove most or all of stack tracer stack size from stack_max_size
Currently, the depth reported in the stack tracer stack_trace file
does not match the stack_max_size file. This is because the stack_max_size
includes the overhead of stack tracer itself while the depth does not.

The first time a max is triggered, a calculation is not performed that
figures out the overhead of the stack tracer and subtracts it from
the stack_max_size variable. The overhead is stored and is subtracted
from the reported stack size for comparing for a new max.

Now the stack_max_size corresponds to the reported depth:

 # cat stack_max_size
4640

 # cat stack_trace
        Depth    Size   Location    (48 entries)
        -----    ----   --------
  0)     4640      32   _raw_spin_lock+0x18/0x24
  1)     4608     112   ____cache_alloc+0xb7/0x22d
  2)     4496      80   kmem_cache_alloc+0x63/0x12f
  3)     4416      16   mempool_alloc_slab+0x15/0x17
[...]

While testing against and older gcc on x86 that uses mcount instead
of fentry, I found that pasing in ip + MCOUNT_INSN_SIZE let the
stack trace show one more function deep which was missing before.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:07 -04:00
Steven Rostedt (Red Hat)
d4ecbfc49b tracing: Fix stack tracer with fentry use
When gcc 4.6 on x86 is used, the function tracer will use the new
option -mfentry which does a call to "fentry" at every function
instead of "mcount". The significance of this is that fentry is
called as the first operation of the function instead of the mcount
usage of being called after the stack.

This causes the stack tracer to show some bogus results for the size
of the last function traced, as well as showing "ftrace_call" instead
of the function. This is due to the stack frame not being set up
by the function that is about to be traced.

 # cat stack_trace
        Depth    Size   Location    (48 entries)
        -----    ----   --------
  0)     4824     216   ftrace_call+0x5/0x2f
  1)     4608     112   ____cache_alloc+0xb7/0x22d
  2)     4496      80   kmem_cache_alloc+0x63/0x12f

The 216 size for ftrace_call includes both the ftrace_call stack
(which includes the saving of registers it does), as well as the
stack size of the parent.

To fix this, if CC_USING_FENTRY is defined, then the stack_tracer
will reserve the first item in stack_dump_trace[] array when
calling save_stack_trace(), and it will fill it in with the parent ip.
Then the code will look for the parent pointer on the stack and
give the real size of the parent's stack pointer:

 # cat stack_trace
        Depth    Size   Location    (14 entries)
        -----    ----   --------
  0)     2640      48   update_group_power+0x26/0x187
  1)     2592     224   update_sd_lb_stats+0x2a5/0x4ac
  2)     2368     160   find_busiest_group+0x31/0x1f1
  3)     2208     256   load_balance+0xd9/0x662

I'm Cc'ing stable, although it's not urgent, as it only shows bogus
size for item #0, the rest of the trace is legit. It should still be
corrected in previous stable releases.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:07 -04:00
Steven Rostedt (Red Hat)
87889501d0 tracing: Use stack of calling function for stack tracer
Use the stack of stack_trace_call() instead of check_stack() as
the test pointer for max stack size. It makes it a bit cleaner
and a little more accurate.

Adding stable, as a later fix depends on this patch.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:06 -04:00
Steven Rostedt (Red Hat)
dd42cd3ea9 tracing: Add function probe to trigger stack traces
Add a function probe that will cause a stack trace to be traced in
the ring buffer when the given function(s) are called.

format is:

 <function>:stacktrace[:<count>]

 echo 'schedule:stacktrace' > /debug/tracing/set_ftrace_filter
 cat /debug/tracing/trace_pipe
     kworker/2:0-4329  [002] ...2  2933.558007: <stack trace>
 => kthread
 => ret_from_fork
          <idle>-0     [000] .N.2  2933.558019: <stack trace>
 => rest_init
 => start_kernel
 => x86_64_start_reservations
 => x86_64_start_kernel
     kworker/2:0-4329  [002] ...2  2933.558109: <stack trace>
 => kthread
 => ret_from_fork
[...]

This can be set to only trace a specific amount of times:

 echo 'schedule:stacktrace:3' > /debug/tracing/set_ftrace_filter
 cat /debug/tracing/trace_pipe
           <...>-58    [003] ...2   841.801694: <stack trace>
 => kthread
 => ret_from_fork
          <idle>-0     [001] .N.2   841.801697: <stack trace>
 => start_secondary
           <...>-2059  [001] ...2   841.801736: <stack trace>
 => wait_for_common
 => wait_for_completion
 => flush_work
 => tty_flush_to_ldisc
 => input_available_p
 => n_tty_poll
 => tty_poll
 => do_select
 => core_sys_select
 => sys_select
 => system_call_fastpath

To remove these:

 echo '!schedule:stacktrace' > /debug/tracing/set_ftrace_filter
 echo '!schedule:stacktrace:0' > /debug/tracing/set_ftrace_filter

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:05 -04:00
Steven Rostedt (Red Hat)
c142be8ebe tracing: Add skip argument to trace_dump_stack()
Altough the trace_dump_stack() already skips three functions in
the call to stack trace, which gets the stack trace to start
at the caller of the function, the caller may want to skip some
more too (as it may have helper functions).

Add a skip argument to the trace_dump_stack() that lets the caller
skip back tracing functions that it doesn't care about.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:36:05 -04:00