[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20140310.152242.340501323969517196.davem@davemloft.net>
Date: Mon, 10 Mar 2014 15:22:42 -0400 (EDT)
From: David Miller <davem@...emloft.net>
To: ast@...mgrid.com
Cc: dborkman@...hat.com, mingo@...nel.org, wad@...omium.org,
rostedt@...dmis.org, a.p.zijlstra@...llo.nl, hpa@...or.com,
hagen@...u.net, jesse@...ira.com, tglx@...utronix.de,
masami.hiramatsu.pt@...achi.com, tom.zanussi@...ux.intel.com,
jovi.zhangwei@...il.com, edumazet@...gle.com,
torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
fweisbec@...il.com, acme@...radead.org, penberg@....fi,
arjan@...radead.org, hch@...radead.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v6 net-next 1/3] filter: add Extended BPF interpreter
and converter
From: Alexei Starovoitov <ast@...mgrid.com>
Date: Fri, 7 Mar 2014 14:19:39 -0800
> On Fri, Mar 7, 2014 at 12:38 PM, David Miller <davem@...emloft.net> wrote:
> 2.
> Another alternative is to do
> struct sk_filter {
> ..
> union {
> unsigned int (*bpf_func)(const struct sk_buff *skb,const struct
> sock_filter *filter);
> unsigned int (*bpf_func_ext)(void*, const struct sock_filter_ext *filter);
> unsigned int (*bpf_func_generic)(void *, void*);
> };
>
> assignments into sk_filter will use correct field like:
> fp->bpf_func_ext = sk_run_filter_ext;
> and the hot call macro will look like:
> #define SK_RUN_FILTER(F, CTX) (*F->bpf_func_generic)(CTX, F->insns)
>
> I think current typecasts are equally ugly, but 'union' style
> will be cleaner from gcc point of view.
The main point is that we should only bypass the type protection provided
by the C language as an absolute last resort.
But in some cases, we can consider making an exception.
The problem with letting sk_check_filter() do the verification is that the
type is also determined by the call site. That requires some kind of type
based or run time verification.
Although it doesn't use C typing, one thing you could do is encoded the
pointer into an unsigned long. Then you encode the context in the lowest
bits of that value, masking them out when derferencing at run time.
So that way we can make sure seccomp _only_ runs seccomd filters.
And sockets only run socket filters with SKBs in the second argument.
> btw, ebpf jit coming in the next diff (it was also previewed in V1 series).
>
> In the 1/3 commit log of this patch I explain the current relation
> between bpf_ext_enable and bpf_jit_enable flags.
>
> In the next patch with ebpf jit, single bpf_jit_enable flag will act for both:
> if (bpf_ext_enable) {
> convert to new
> sk_chk_filter() - check old bpf
> if (bpf_jit_enable)
> use new jit
> else
> use new interpreter
> } else {
> sk_chk_filter() - check old bpf
> if (bpf_jit_enable)
> use old jit
> else
> use old interpreter
> }
>
> I believe Daniel oked this approach, but if you object, I can do it differently.
> Are you saying 'bpf_jit_enable' flag should mean: do old jit no matter what?
> Then we would need another flag 'bpf_ext_jit_enable' as well?
> Seems overkill to me.
It is not OK to create a temporary regression in between patch sets.
You can make JIT higher priority than EBPF in this series, then once every
single JIT is converted to EBPF, you can just adjust things accordingly.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists