[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+ZOOTONwNt2xpwUc=tqhP=m31KZTzij-rrAHVh6g3hVYgKaEw@mail.gmail.com>
Date: Fri, 20 Jun 2014 14:56:21 -0700
From: Chema Gonzalez <chema@...gle.com>
To: Daniel Borkmann <dborkman@...hat.com>
Cc: Alexei Starovoitov <ast@...mgrid.com>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Jiri Olsa <jolsa@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Kees Cook <keescook@...omium.org>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Network Development <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 net-next 1/4] net: flow_dissector: avoid multiple calls
in eBPF
I'll try to revive the discussion for this patch, in case I can
convince you about its implementation. I rebased it to the latest
HEAD, and I'm ready to re-submit.
On Wed, Jun 4, 2014 at 1:51 AM, Daniel Borkmann <dborkman@...hat.com> wrote:
> On 06/03/2014 11:12 PM, Chema Gonzalez wrote:
> ...
>
>> Your approach needs it too. Citing from your pseudo-code:
>>
>>> ld #5 <-- indicates to fill the first 5 slots of M[], so M[0] to M[4]
>>> ld #keys <-- triggers the extension to fill the M[] slots
>>> ld M[0] <-- loads nhoff from M[0] into accu
>>
>>
>> How does the "ld M[0]" know that the actual flow dissector has already
>> been called? What if the insn just before the "ld #5" was "jmp +2" ?
>> In that case, the "ld #keys" would have never been called.
>
>
> But that case would be no different from doing something like ...
>
> [...]
> jmp foo
> ldi #42
> st M[0]
> foo:
> ld M[0]
> [...]
>
> ... and would then not pass the checker in check_load_and_stores(),
> which, as others have already stated, would need to be extended,
> of course. It's one possible approach.
As Alexei/you have noted, this is moving the complexity to
check_load_and_stores(). It's more complicated than the current
ensure-there's-a-store-preceding-each-load-for-each-M[]-position
check. In particular, your approach requires 2 insns before a memory
position can be considered filled, namely the "ld #5" and the "ld
#keys". That means adding intra-instruction state to the FSM in
check_load_and_stores().
A second issue is that you need to ensure that M[0:5] does not get
polluted between the moment you call "ld #5; ld #keys" and the moment
the code uses the flow-dissector values. This is more complex than
just ensuring the code doesn't access uninit'ed M[].
There's also some concerns about effectively adding a prologue. We
already have that:
$ cat net/core/filter.c
...
int sk_convert_filter(struct sock_filter *prog, int len,
struct sock_filter_int *new_prog, int *new_len)
{
...
if (new_insn)
*new_insn = BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1);
new_insn++;
for (i = 0; i < len; fp++, i++) {
...
The current prologue is embedded in sk_convert_filter(). The patch
just moves it to its own function, and (potentially) adds another
instruction (which zeroes flow_inited iff there's any call to the flow
dissector).
>>> Anyway as I said before I'm not excited about either.
>>> I don't think we should be adding classic BPF extensions any more.
>>> The long term headache of supporting classic BPF extensions
>>> outweighs the short term benefits.
>> I see a couple of issues with (effectively) freezing classic BPF
>> development while waiting for direct eBPF access to happen. The first
>> one is that the kernel has to accept it. I can see many questions
>> about this, especially security and usability (I'll send an email
>> about the "split BPF out of core later"). Now, the main issue is
>> whether/when the tools will support it. IMO, this is useful iff I can
>> quickly write/reuse filters and run tcpdump filters based on them. I'm
>> trying to get upstream libpcap to accept support for raw (classic) BPF
>> filters, and it's taking a long time. I can imagine how they may be
>> less receptive about supporting a Linux-only eBPF mechanism. Tools do
>> matter.
This is a high-level decision, more than a technical one. Do we want
to freeze classic BPF development in linux, even before we have a
complete eBPF replacement, and zero eBPF tool (libpcap) support?
> Grepping through libpcap code, which tries to be platform independent,
> it seems after all the years, the only thing where you can see support
> for in their code is SKF_AD_PKTTYPE and SKF_AD_PROTOCOL. Perhaps they
Actually they recently added MOD/XOR support. Woo-hoo!
> just don't care, perhaps they do, who knows, but it looks to me a bit
> that they are reluctant to these improvements, maybe for one reason
> that other OSes don't support it.
>From the comments in the MOD/XOR patch, the latter seem to be the issue.
> That was also one of the reasons that
> led me to start writing bpf_asm (net/tools/) for having a small DSL
> for more easily trying out BPF code while having _full_ control over it.
>
> Maybe someone should start a binary-compatible Linux-only version of
> libpcap, where tcpdump will transparently make use of these low level
> improvements eventually. </rant> ;)
There's too much code dependent on libpcap to make a replacement possible.
Thanks,
-Chema
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists