netdev - Re: [RFC][PATCH 00/10] Add trace event support to eBPF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20160219041622.GA88036@ast-mbp.thefacebook.com>
Date:	Thu, 18 Feb 2016 20:16:23 -0800
From:	Alexei Starovoitov <alexei.starovoitov@...il.com>
To:	Tom Zanussi <tom.zanussi@...ux.intel.com>
Cc:	rostedt@...dmis.org, masami.hiramatsu.pt@...achi.com,
	namhyung@...nel.org, peterz@...radead.org,
	linux-kernel@...r.kernel.org,
	Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Subject: Re: [RFC][PATCH 00/10] Add trace event support to eBPF

On Thu, Feb 18, 2016 at 03:27:18PM -0600, Tom Zanussi wrote:
> On Tue, 2016-02-16 at 20:51 -0800, Alexei Starovoitov wrote:
> > On Tue, Feb 16, 2016 at 04:35:27PM -0600, Tom Zanussi wrote:
> > > On Sun, 2016-02-14 at 01:02 +0100, Alexei Starovoitov wrote:
> > > > On Fri, Feb 12, 2016 at 10:11:18AM -0600, Tom Zanussi wrote:
> > 
> 
>   # ./funccount.py '*spin*'
> 
> Which on my machine resulted in a hard lockup on all CPUs.  I'm not set

thanks for the report. looks like something got broken. After:
# ./funccount.par '*spin*'
Tracing 12 functions for "*spin*"... Hit Ctrl-C to end.
^C
ADDR             FUNC                          COUNT
ffffffff810aeb91 mutex_spin_on_owner             530
ffffffff8177f241 _raw_spin_unlock_bh            1325
ffffffff810aebe1 mutex_optimistic_spin          1696
ffffffff8177f581 _raw_spin_lock_bh              1985
ffffffff8177f511 _raw_spin_trylock             55337
ffffffff8177f3c1 _raw_spin_lock_irq           787875
ffffffff8177f551 _raw_spin_lock              2211324
ffffffff8177f331 _raw_spin_lock_irqsave      3556740
ffffffff8177f1c1 __lock_text_start           3593983
[  275.175524] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 11
it seems kprobe cleanup is racing with bpf cleanup...

> > > surrounding that even in the comments.  I guess I'd have to spend a few
> > > hours reading the BPF code and the verifier even, to understand that.
> > 
> > not sure what is your goal. Runtime lookup via field name is not acceptable
> > whether it's cached or not. There is no place for strcmp in the critical path.
> 
> Exactly - that's why I was asking about a 'begin probe', in order to do
> the lookup once, in an non-critical path.

It is critical path. When program is called million times for_each{strcmp}
at the beginning of every program is unacceptable overhead.
In the crash above, Ctrl-C was pressed in a split second, yet bpf
already processed 2.2M + 3.5M + 3.5M events and then hung while unloading.
In the upcoming tracepoint+bpf patches the programs will have
direct access to tracepoint data without wasting time on strcmp.
The steps to do that were already outlined in the previous email.