Message-ID: <557A06D7.7030304@plumgrid.com>
Date: Thu, 11 Jun 2015 15:08:23 -0700
From: Alexei Starovoitov <ast@...mgrid.com>
To: Daniel Wagner <daniel.wagner@...-carit.de>,
Steven Rostedt <rostedt@...dmis.org>,
Tom Zanussi <tom.zanussi@...ux.intel.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Wang Nan <wangnan0@...wei.com>
Subject: Re: latency histogram with BPF
On 6/11/15 12:25 AM, Daniel Wagner wrote:
>
> In both cases, BPF or Tom's 'hist' triggers patches, there is
> some trickery necessary to get it working. While the first approach
> has more flexibility in what you want to measure or how you want to
> present it, I suspect it will be harder to get it working/accepted.
Why? Out of your patch only two lines are for the kernel, and those I
wanted to add anyway. The sample code also looks good.
> Obviously, adding kprobes to trace_preempt_[on|off] is
> asking for problems:
Thanks for reporting.
I reproduced it with hackbench, but my stack trace looks completely
different. Looks like some memory corruption is happening.
Not clear where. I'm still debugging.
> CPU 3
> latency : count distribution
> 1 -> 1 : 0 | |
> 2 -> 3 : 0 | |
> 4 -> 7 : 0 | |
> 8 -> 15 : 25182 |************************************************* |
> 16 -> 31 : 1675 |** |
> 32 -> 63 : 0 | |
> 64 -> 127 : 0 | |
> 128 -> 255 : 0 | |
> 256 -> 511 : 0 | |
> 512 -> 1023 : 0 | |
>
> The numbers look a bit too nice and low. I suspect something is going
> wrong.
I see similar numbers. 25k over 5 sec = 5k preempt on/off events per
second, which sounds about right for an idle system.
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 2d56ce5..e4d3e76 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -172,6 +172,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
> return &bpf_probe_read_proto;
> case BPF_FUNC_ktime_get_ns:
> return &bpf_ktime_get_ns_proto;
> + case BPF_FUNC_get_smp_processor_id:
> + return &bpf_get_smp_processor_id_proto;
This part I'll take into my set of patches.
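(For context, the proto being returned here is the helper that already
exists in kernel/bpf/helpers.c; from memory it is roughly:

static u64 bpf_get_smp_processor_id(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
{
	return raw_smp_processor_id();
}

const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
	.func		= bpf_get_smp_processor_id,
	.gpl_only	= false,
	.ret_type	= RET_INTEGER,
};

so the hunk above only makes the existing helper reachable from kprobe
programs.)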
> --- a/samples/bpf/bpf_helpers.h
> +++ b/samples/bpf/bpf_helpers.h
> @@ -21,6 +21,8 @@ static unsigned long long (*bpf_ktime_get_ns)(void) =
> (void *) BPF_FUNC_ktime_get_ns;
> static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
> (void *) BPF_FUNC_trace_printk;
> +static unsigned int (*bpf_get_smp_processor_id)(void) =
> + (void *) BPF_FUNC_get_smp_processor_id;
This part is already there in net-next.
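To show how the pieces fit together, here is a minimal sketch of a
kprobe program using these helpers. Map, function and section names
are mine, not from your hwlathist patch, and I fold all CPUs into one
histogram to keep it short, while your sample keys per cpu. It is of
course the same construct that is currently crashing under hackbench:

#include <linux/ptrace.h>
#include <linux/version.h>
#include <uapi/linux/bpf.h>
#include "bpf_helpers.h"

struct bpf_map_def SEC("maps") start_ts = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u64),
	.max_entries = 64,	/* one slot per CPU, assumes <= 64 CPUs */
};

struct bpf_map_def SEC("maps") lat_hist = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u64),
	.max_entries = 64,	/* one slot per log2 bucket */
};

/* same branchless log2 as sketched earlier, extended to 64 bit */
static unsigned int log2(unsigned int v)
{
	unsigned int r, shift;

	r = (v > 0xFFFF) << 4; v >>= r;
	shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
	shift = (v > 0xF) << 2; v >>= shift; r |= shift;
	shift = (v > 0x3) << 1; v >>= shift; r |= shift;
	r |= (v > 1);
	return r;
}

static unsigned int log2l(u64 v)
{
	unsigned int hi = v >> 32;

	return hi ? log2(hi) + 32 : log2(v);
}

SEC("kprobe/trace_preempt_off")
int bpf_prog1(struct pt_regs *ctx)
{
	int cpu = bpf_get_smp_processor_id();
	u64 ts = bpf_ktime_get_ns();

	/* remember when preemption was disabled on this CPU */
	bpf_map_update_elem(&start_ts, &cpu, &ts, BPF_ANY);
	return 0;
}

SEC("kprobe/trace_preempt_on")
int bpf_prog2(struct pt_regs *ctx)
{
	int cpu = bpf_get_smp_processor_id();
	u64 *tsp, *cnt, delta;
	int key;

	tsp = bpf_map_lookup_elem(&start_ts, &cpu);
	if (!tsp || *tsp == 0)
		return 0;

	/* nanoseconds with preemption disabled -> histogram bucket */
	delta = bpf_ktime_get_ns() - *tsp;
	key = log2l(delta);
	cnt = bpf_map_lookup_elem(&lat_hist, &key);
	if (cnt)
		__sync_fetch_and_add(cnt, 1);
	return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;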
> diff --git a/samples/bpf/hwlathist_kern.c b/samples/bpf/hwlathist_kern.c
> new file mode 100644
> index 0000000..7a97e7e
> --- /dev/null
> +++ b/samples/bpf/hwlathist_kern.c
Looks good and useful, but I would like to find and fix the bug first
before exposing this bit as a 'sample', since right now it's a
'crash me if hackbench is running' test :)
If you have any suggestions on where to look, I'm all ears.
My stack traces look like:
Running with 10*40 (== 400) tasks.
[ 12.032571] kernel BUG at ../mm/slub.c:3413!
[ 12.036004] [<ffffffff810c1913>] rcu_process_callbacks+0x283/0x680
[ 12.036004] [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
[ 12.036004] [<ffffffff8107dff9>] ? kthread+0xc9/0xe0
[ 12.036004] [<ffffffff810634b1>] run_ksoftirqd+0x21/0x50
or
[ 25.788074] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 25.788801] BUG: unable to handle kernel paging request at ffff88000b1ea2a0
[ 25.800085] [<ffffffff810c18ba>] ? rcu_process_callbacks+0x22a/0x680
[ 25.800085] [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
[ 25.800085] [<ffffffff816d6d7c>] do_softirq_own_stack+0x1c/0x30
[ 25.800085] <EOI>
[ 25.800085] [<ffffffff8106351d>] do_softirq+0x3d/0x40
[ 25.800085] [<ffffffff810635ba>] __local_bh_enable_ip+0x9a/0xb0
[ 25.800085] [<ffffffff816d4c1b>] _raw_spin_unlock_bh+0x1b/0x20
[ 25.800085] [<ffffffff811f3e42>] bdi_writeback_workfn+0x1b2/0x470