[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56CD2B98.4040009@huawei.com>
Date: Wed, 24 Feb 2016 12:03:36 +0800
From: "Wangnan (F)" <wangnan0@...wei.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
CC: Alexei Starovoitov <ast@...nel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Brendan Gregg <brendan.d.gregg@...il.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Cody P Schafer <dev@...yps.com>,
"David S. Miller" <davem@...emloft.net>,
He Kuang <hekuang@...wei.com>,
Jérémie Galarneau
<jeremie.galarneau@...icios.com>, Jiri Olsa <jolsa@...nel.org>,
Kirill Smelkov <kirr@...edi.com>,
Li Zefan <lizefan@...wei.com>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Namhyung Kim <namhyung@...nel.org>, <pi3orama@....com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 10/48] perf tools: Introduce bpf-output event
On 2016/2/24 10:04, Wangnan (F) wrote:
>
>
> On 2016/2/24 9:58, Wangnan (F) wrote:
>>
>>
>> On 2016/2/24 1:45, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Feb 22, 2016 at 09:10:37AM +0000, Wang Nan escreveu:
>>>> Commit a43eec304259a6c637f4014a6d4767159b6a3aa3 (bpf: introduce
>>>> bpf_perf_event_output() helper) add a helper to enable BPF program
>>>> output data to perf ring buffer through a new type of perf event
>>>> PERF_COUNT_SW_BPF_OUTPUT. This patch enable perf to create perf
>>>> event of that type. Now perf user can use following cmdline to
>>>> receive output data from BPF programs:
>>>>
>>>> # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
>>>> -e ./test_bpf_output.c/map:channel.event=evt/
>>>> ls /
>>>> # ./perf script
>>>> perf 1560 [004] 347747.086295:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> perf 1560 [004] 347747.086300:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> perf 1560 [004] 347747.086315:
>>>> evt: ffffffff811fd201 sys_write ...
>>>> ...
>>>>
>>>> Test result:
>>>> # cat ./test_bpf_output.c
>>>> /************************ BEGIN **************************/
>>>> #include <uapi/linux/bpf.h>
>>>> struct bpf_map_def {
>>>> unsigned int type;
>>>> unsigned int key_size;
>>>> unsigned int value_size;
>>>> unsigned int max_entries;
>>>> };
>>>>
>>>> #define SEC(NAME) __attribute__((section(NAME), used))
>>>> static u64 (*ktime_get_ns)(void) =
>>>> (void *)BPF_FUNC_ktime_get_ns;
>>>> static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
>>>> (void *)BPF_FUNC_trace_printk;
>>>> static int (*get_smp_processor_id)(void) =
>>>> (void *)BPF_FUNC_get_smp_processor_id;
>>>> static int (*perf_event_output)(void *, struct bpf_map_def *,
>>>> int, void *, unsigned long) =
>>>> (void *)BPF_FUNC_perf_event_output;
>>>>
>>>> struct bpf_map_def SEC("maps") channel = {
>>>> .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
>>>> .key_size = sizeof(int),
>>>> .value_size = sizeof(u32),
>>>> .max_entries = __NR_CPUS__,
>>>> };
>>>>
>>>> SEC("func_write=sys_write")
>>>> int func_write(void *ctx)
>>>> {
>>>> struct {
>>>> u64 ktime;
>>>> int cpuid;
>>>> } __attribute__((packed)) output_data;
>>>> char error_data[] = "Error: failed to output: %d\n";
>>>>
>>>> output_data.cpuid = get_smp_processor_id();
>>>> output_data.ktime = ktime_get_ns(); supr
>>>> int err = perf_event_output(ctx, &channel,
>>>> get_smp_processor_id(),
>>>> &output_data, sizeof(output_data));
>>>> if (err)
>>>> trace_printk(error_data, sizeof(error_data), err);
>>>> return 0;
>>>> }
>>>> char _license[] SEC("license") = "GPL";
>>>> int _version SEC("version") = LINUX_VERSION_CODE;
>>>> /************************ END ***************************/
>>>>
>>>> # ./perf record -a -e bpf-output/no-inherit,name=evt/ \
>>>> -e ./test_bpf_output.c/map:channel.event=evt/
>>>> ls /
>>>> # ./perf script | grep ls
>>>> ls 2242 [003] 347851.557563: evt: ffffffff811fd201
>>>> sys_write ...
>>>> ls 2242 [003] 347851.557571: evt: ffffffff811fd201
>>>> sys_write ...
>>> So, there is something strange here:
>>>
>>> if (unlikely(event->oncpu != smp_processor_id()))
>>> return -EOPNOTSUPP;
>>>
>>
>
> All failures have 'event->oncpu == -1' here. I guess we should
> suppress warning in
> this case. But why event->oncpu becomes -1?
>
For this specific test it is not surprising to see these error messages.
In this test
we create bpf-output channel on 'ls' process only, but the BPF script is
triggered
on all procs (BPF triggering is not related to perf event scheduling).
Trying to
output data through 'ls' specific bpf-output channel should fail if this
'sys_write'
is not issued by 'ls' or its children. So it is a correct behavior.
However, I also see them in system wide channel:
# echo "" > /sys/kernel/debug/tracing/trace
# ./perf record -a -e bpf-output/no-inherit,name=evt/ \
-e ./test_bpf_output.c/map:channel.event=evt/
-a
^C[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 17.534 MB perf.data (264326 samples) ]
# cat /sys/kernel/debug/tracing/trace | tail
rs:main Q:Reg-582 [000] d..2 4858.711225: : Error: failed to
output: -95
rs:main Q:Reg-582 [000] d..2 4858.711241: : Error: failed to
output: -95
gmain-1858 [003] d..2 4858.711436: : Error: failed to
output: -95
gmain-1858 [003] d..2 4858.711441: : Error: failed to
output: -95
gmain-1858 [003] d..2 4858.711473: : Error: failed to
output: -95
rs:main Q:Reg-582 [002] d..2 4858.712215: : Error: failed to
output: -95
rs:main Q:Reg-582 [002] d..2 4858.712224: : Error: failed to
output: -95
gmain-1858 [003] d..2 4858.712230: : Error: failed to
output: -95
rs:main Q:Reg-582 [002] d..2 4858.712235: : Error: failed to
output: -95
rs:main Q:Reg-582 [002] d..2 4858.712239: : Error: failed to
output: -95
System wide events can also be scheduled in and out. If the bpf-output
events
are scheduled out, trying to output data through it causes the above
failure.
I don't think it is a problem.
Peter, Could you please give some infomation? In which case a system wide
bpf output channel would be scheduled out?
Thank you.
Powered by blists - more mailing lists