netdev - Re: [PATCH v2 net-next 0/2] split BPF out of core networking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+ZOOTOfq705CbUjx9ppw4gLxJ8JFCaFsphwhAS+NskHVjd86Q@mail.gmail.com>
Date:	Tue, 3 Jun 2014 14:40:39 -0700
From:	Chema Gonzalez <chema@...gle.com>
To:	Alexei Starovoitov <ast@...mgrid.com>
Cc:	Daniel Borkmann <dborkman@...hat.com>,
	"David S. Miller" <davem@...emloft.net>,
	Ingo Molnar <mingo@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Eric Dumazet <edumazet@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Jiri Olsa <jolsa@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Network Development <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 net-next 0/2] split BPF out of core networking

First of all, and just to join the crowd, kernel/bpf/ FTW.

Now, I have some suggestions about eBPF. IMO classic BPF is an ISA
oriented to filter (meaning returning a single integer that states how
many bytes of the packet must be captured) packets (e.g. consider the
6 load modes, where 3 provide access the packet -- abs, ind, msh --,
one to an skb field -- len--, the 5th one to the memory itself -- mem
--, and the 6th is an immediate set mode --imm-- ) that has been used
in other environments (seccomp, tracing, etc.) by (a) extending the
idea of a "packet" into a "buffer", and (b) adding ancillary loads.

eBPF should be a generic ISA that can be used by many environments,
including those served today by classic BPF. IMO, we should get a
nicely-defined ISA (MIPS anyone?) and check what should go into eBPF
and what should not.

- 1. we should considering separating the eBPF ISA farther from classic BPF
  - eBPF still uses a_reg and x_reg as the names of the 2 op
registers. This is very confusing, especially when dealing with
translated filters that do move data between A and X. I've had a_reg
being X, and x_reg being A. We should rename them d_reg and s_reg.
  - BPF_LD vs. BPF_LDX: this made sense in classic BPF, as there was
only one register, and d_reg was implicit in the name of the insn
code. Now, why are we keeping both in eBPF, when the register we're
writing to is made explicit in d_reg (I already forgot if d_reg was
a_reg or x_reg ;) ? Removing one of them will save us 1/8th of the
insns.
  - BPF_ST vs. BPF_STX: same here. Note that the current
sk_convert_filter() just converts all stores to BPF_STX.

- 2. there are other insn that we should consider adding:
  - lui: AFAICT, there is no clean way to build a 64-bit number (you
can LD_IMM the upper part, lsh 32, and then add the lower part).
  - nop: I'd like to have a nop. Do I know why? Nope.


On Tue, Jun 3, 2014 at 1:58 PM, Alexei Starovoitov <ast@...mgrid.com> wrote:
> On Tue, Jun 3, 2014 at 1:35 PM, Daniel Borkmann <dborkman@...hat.com> wrote:
>> On 06/03/2014 05:44 PM, Alexei Starovoitov wrote:
>> ...
>>>
>>> All of your points are valid. They are right questions to ask. I just
>>>
>>> don't see why you're still arguing about first step of filter.c split,
>>> whereas your concerns are about steps 2, 3, 4.
>>
>>
>> Fair enough, lets keep them in mind though for future work. Btw,
>
> Ok :)
>
>> are other files planned for kernel/bpf/ or should it instead just
>> simply be kernel/bpf.c?
>
> The most obvious one is eBPF verifier in separate file (kernel/bpf/verifier.c)
> bpf maps is yet another thing, but that's different topic.
> Probably a set of bpf-callable functions in another file. Like right now
> for sockets these helpers are __skb_get_pay_offset(), __skb_get_nlattr()
> For tracing there will be a different set of helper functions and eventually
> some will be common. Like __get_raw_cpu_id() from filter.c could
> eventually move to kernel/bpf/helpers.c
LGTM.

I like the idea of every user (packet filter, seccomp, etc.) providing
a map of the bpf calls that are ok, as in the packet filter stating
that {1->__skb_get_pay_offset(), 2->__skb_get_nlattr(), ...}, but
seccomp providing a completely different (or even empty) map.

-Chema


> I'm not a fan of squeezing different logic into one file.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html