Message-ID: <CAGdX0WEVxGDNi-YMVeAtEV9bZpMbnnSBXxrUrn_oh0sWRouSaQ@mail.gmail.com>
Date:	Sun, 8 Dec 2013 00:21:24 +0800
From:	Jovi Zhangwei <jovi.zhangwei@...il.com>
To:	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
Cc:	Alexei Starovoitov <ast@...mgrid.com>,
	Andi Kleen <andi@...stfloor.org>,
	Ingo Molnar <mingo@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tom Zanussi <tom.zanussi@...ux.intel.com>,
	Eric Dumazet <edumazet@...gle.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: Re: [RFC PATCH tip 0/5] tracing filters with BPF

On Sat, Dec 7, 2013 at 7:58 AM, Masami Hiramatsu
<masami.hiramatsu.pt@...achi.com> wrote:
> (2013/12/06 14:19), Jovi Zhangwei wrote:
>> Hi Alexei,
>>
>> On Thu, Dec 5, 2013 at 12:40 PM, Alexei Starovoitov <ast@...mgrid.com> wrote:
>>>> On Tue, Dec 3, 2013 at 4:01 PM, Andi Kleen <andi@...stfloor.org> wrote:
>>>>>
>>>>> Can you do some performance comparison compared to e.g. ktap?
>>>>> How much faster is it?
>>>
>>> Did simple ktap test with 1M alloc_skb/kfree_skb toy test from earlier email:
>>> trace skb:kfree_skb {
>>>         if (arg2 == 0x100) {
>>>                 printf("%x %x\n", arg1, arg2)
>>>         }
>>> }
>>> 1M skb alloc/free 350315 (usecs)
>>>
>>> baseline without any tracing:
>>> 1M skb alloc/free 145400 (usecs)
>>>
>>> then equivalent bpf test:
>>> void filter(struct bpf_context *ctx)
>>> {
>>>         void *loc = (void *)ctx->regs.dx;
>>>         if (loc == 0x100) {
>>>                 struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
>>>                 char fmt[] = "skb %p loc %p\n";
>>>                 bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
>>>         }
>>> }
>>> 1M skb alloc/free 183214 (usecs)
>>>
>>> so with one 'if' condition the difference ktap vs bpf is 350-145 vs 183-145
>>>
>>> obviously ktap is an interpreter, so it's not really fair.
>>>
>>> To make it really unfair I did:
>>> trace skb:kfree_skb {
>>>         if (arg2 == 0x100 || arg2 == 0x200 || arg2 == 0x300 || arg2 == 0x400 ||
>>>             arg2 == 0x500 || arg2 == 0x600 || arg2 == 0x700 || arg2 == 0x800 ||
>>>             arg2 == 0x900 || arg2 == 0x1000) {
>>>                 printf("%x %x\n", arg1, arg2)
>>>         }
>>> }
>>> 1M skb alloc/free 484280 (usecs)
>>>
>>> and corresponding bpf:
>>> void filter(struct bpf_context *ctx)
>>> {
>>>         void *loc = (void *)ctx->regs.dx;
>>>         if (loc == 0x100 || loc == 0x200 || loc == 0x300 || loc == 0x400 ||
>>>             loc == 0x500 || loc == 0x600 || loc == 0x700 || loc == 0x800 ||
>>>             loc == 0x900 || loc == 0x1000) {
>>>                 struct sk_buff *skb = (struct sk_buff *)ctx->regs.si;
>>>                 char fmt[] = "skb %p loc %p\n";
>>>                 bpf_trace_printk(fmt, sizeof(fmt), (long)skb, (long)loc, 0);
>>>         }
>>> }
>>> 1M skb alloc/free 185660 (usecs)
>>>
>>> the difference is bigger now: 484-145 vs 185-145
>>>
>> There is a big difference between comparing arg2 (in ktap) and direct
>> register access (ctx->regs.dx).
>>
>> The current argument fetching (arg2 in the test case above) in ktap is very
>> inefficient; see ktap/interpreter/lib_kdebug.c:kp_event_getarg.
>> The only way to speed it up is to change the kernel tracing code so that an
>> external tracing module can access event fields without a list lookup. That
>> work has not started yet. :)
>
> I'm not sure why you can't access it directly from the ftrace-event buffer.
> There is just a packed data structure, and it is exposed via debugfs.
> You can decode it and get an offset/size by using libtraceevent.
>
Then it would mean passing the event field info into the kernel from userspace,
which looks strange because kernel structures are the source of that field info
in the first place; it is like a loop-back, and it would also require engaging
libtraceevent in userspace. (The side effects are slower compilation and higher
memory consumption; sometimes one script processes 20K events, like 'trace
probe:big_dso:*')

So "the only way" I mentioned was wrong; your approach is indeed another way.
I just think that using an array instead of a list for event fields might be
more efficient, if a list is not strictly needed. We can look into this more
in the future.

Thanks.

Jovi.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
