[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180629100133.16286f24@cakuba.netronome.com>
Date: Fri, 29 Jun 2018 10:01:33 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Daniel Borkmann <borkmann@...earbox.net>
Cc: Jiri Benc <jbenc@...hat.com>, davem@...emloft.net,
Roopa Prabhu <roopa@...ulusnetworks.com>, jiri@...nulli.us,
jhs@...atatu.com, xiyou.wangcong@...il.com,
oss-drivers@...ronome.com, netdev@...r.kernel.org,
Pieter Jansen van Vuuren
<pieter.jansenvanvuuren@...ronome.com>
Subject: Re: [PATCH net-next v2 3/4] net: check tunnel option type in tunnel
flags
On Fri, 29 Jun 2018 09:04:15 +0200, Daniel Borkmann wrote:
> On 06/28/2018 06:54 PM, Jakub Kicinski wrote:
> > On Thu, 28 Jun 2018 09:42:06 +0200, Jiri Benc wrote:
> >> On Wed, 27 Jun 2018 11:49:49 +0200, Daniel Borkmann wrote:
> >>> Looks good to me, and yes in BPF case a mask like TUNNEL_OPTIONS_PRESENT is
> >>> right approach since this is opaque info and solely defined by the BPF prog
> >>> that is using the generic helper.
> >>
> >> Wouldn't it make sense to introduce some safeguards here (in a backward
> >> compatible way, of course)? It's easy to mistakenly set data for a
> >> different tunnel type in a BPF program and then be surprised by the
> >> result. It might help users if such usage was detected by the kernel,
> >> one way or another.
> >
> > Well, that's how it works today ;)
>
> Well, it was designed like that on purpose, to be i) agnostic of the underlying
> device, ii) to not clutter BPF API with tens of different APIs effectively doing
> the same thing, and at the same time to avoid adding protocol specifics. E.g. at
> least core bits of bpf_skb_{set,get}_tunnel_key() will work whether I use vxlan
> or geneve underneath (we are actually using it this way) and I could use things
> like tun_id to encode custom meta data from BPF for either of them depending on flavor
> picked by orchestration system. For the tunnel options in bpf_skb_{set,get}_tunnel_opt()
> it's similar although here there needs to be awareness of the underlying dev depending
> on whether you encode data into e.g. gbp or tlvs, etc. However, downside right now I
> can see with a patch like below is that:
>
> i) People might still just keep using 'TUNNEL_OPTIONS_PRESENT path' since available
> and backwards compatible with current/older kernels, ii) we cut bits away from
> size over time for each new tunnel proto added in future that would support tunnel
> options, iii) that extension is one-sided (at least below) and same would be needed
> in getter part, and iv) there needs to be a way for the case when folks add new
> tunnel options where we don't need to worry that we forget updating BPF_F_TUN_*
> each time otherwise this will easily slip through and again people will just rely
> on using TUNNEL_OPTIONS_PRESENT catchall. Given latter and in particular point i)
> I wouldn't think it's worth the pain, the APIs were added to BPF in v4.6 so this
> would buy them 2 more years wrt kernel compatibility with same functionality level.
> And point v), I just noticed the patch is actually buggy: size is ARG_CONST_SIZE and
> verifier will attempt to check the value whether the buffer passed in argument 2 is
> valid or not, so using flags here in upper bits would let verification fail, you'd
> really have to make a new helper just for this.
Ah, indeed. I'd rather avoid a new helper, if we reuse an old one
people can always write a program like:
err = helper(all_flags);
if (err == -EINVAL)
err = helper(fallback_flags);
With a new helper the program will not even load on old kernels :(
Could we add the flags new to bpf_skb_set_tunnel_key(), maybe? It is a
bit ugly because only options care about the flags and in theory
bpf_skb_set_tunnel_key() doesn't have to be called before
bpf_skb_set_tunnel_opt() ...
Alternatively we could do this:
diff --git a/net/core/filter.c b/net/core/filter.c
index dade922678f6..9edcc2947be5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3582,7 +3582,9 @@ BPF_CALL_3(bpf_skb_set_tunnel_opt, struct sk_buff *, skb,
if (unlikely(size > IP_TUNNEL_OPTS_MAX))
return -ENOMEM;
- ip_tunnel_info_opts_set(info, from, size, TUNNEL_OPTIONS_PRESENT);
+ ip_tunnel_info_opts_set(info, from, size, TUNNEL_VXLAN_OPT |
+ TUNNEL_GENEVE_OPT |
+ TUNNEL_ERSPAN_OPT);
return 0;
}
to force ourselves to solve the problem when a next protocol is added ;)
Or should we just leave things as they are? As you explain the new
helper would really be of marginal use given the backward compatibility
requirement and availability of the old one...
Powered by blists - more mailing lists