[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20150122010316.GA15871@sejong>
Date: Thu, 22 Jan 2015 10:03:16 +0900
From: Namhyung Kim <namhyung@...nel.org>
To: Alexei Starovoitov <ast@...mgrid.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Jiri Olsa <jolsa@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Daniel Borkmann <dborkman@...hat.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Brendan Gregg <brendan.d.gregg@...il.com>,
Linux API <linux-api@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH tip 0/9] tracing: attach eBPF programs to
tracepoints/syscalls/kprobe
Hi Alexei,
On Fri, Jan 16, 2015 at 10:57:15AM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 16, 2015 at 7:02 AM, Steven Rostedt <rostedt@...dmis.org> wrote:
> > One last thing. If the ebpf is used for anything but filtering, it
> > should go into the trigger file. The filtering is only a way to say if
> > the event should be recorded or not. But the trigger could do something
> > else (a printk, a stacktrace, etc).
>
> it does way more than just filtering, but
> invoking program as a trigger is too slow.
> When program is called as soon as tracepoint fires,
> it can fetch other fields, evaluate them, printk some of them,
> optionally dump stack, aggregate into maps.
> We can let it call triggers too, so that user program will
> be able to enable/disable other events.
> I'm not against invoking programs as a trigger, but I don't
> see a use case for it. It's just too slow for production
> analytics that needs to act on huge number of events
> per second.
AFAIK a trigger can be fired before allocating a ring buffer if it
doesn't use the event record (i.e. has filter) or ->post_trigger bit
set (stacktrace). Please see ftrace_trigger_soft_disabled().
This also makes it keeping events in the soft-disabled state.
Thanks,
Namhyung
> We must minimize the overhead between tracepoint
> firing and program executing, so that programs can
> be used on events like packet receive which will be
> in millions per second. Every nsec counts.
> For example:
> - raw dd if=/dev/zero of=/dev/null
> does 760 MB/s (on my debug kernel)
> - echo 1 > events/syscalls/sys_enter_write/enable
> drops it to 400 MB/s
> - echo "echo "count == 123 " > events/syscalls/sys_enter_write/filter
> drops it even further down to 388 MB/s
> This slowdown is too high for this to be used on a live system.
> - tracex4 that computes histogram of sys_write sizes
> and stores log2(count) into a map does 580 MB/s
> This is still not great, but this slowdown is now usable
> and we can work further on minimizing the overhead.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists