[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOftzPgviq+J6kCCXEiWjtpR25ZPhZYFaWpNOFJ3i9pOoDg6cg@mail.gmail.com>
Date: Mon, 19 Nov 2018 13:59:23 -0800
From: Joe Stringer <joe@...d.net.nz>
To: Joe Stringer <joe@...d.net.nz>
Cc: nicolas.dichtel@...nd.com, David Ahern <dsahern@...il.com>,
netdev <netdev@...r.kernel.org>, daniel@...earbox.net
Subject: Re: netns_id in bpf_sk_lookup_{tcp,udp}
On Mon, 19 Nov 2018 at 12:54, Joe Stringer <joe@...d.net.nz> wrote:
>
> On Mon, 19 Nov 2018 at 12:29, Nicolas Dichtel <nicolas.dichtel@...nd.com> wrote:
> >
> > Le 19/11/2018 à 20:54, David Ahern a écrit :
> > > On 11/19/18 12:47 PM, Joe Stringer wrote:
> > >> On Mon, 19 Nov 2018 at 10:39, David Ahern <dsahern@...il.com> wrote:
> > >>>
> > >>> On 11/19/18 11:36 AM, Joe Stringer wrote:
> > >>>> Hi David, thanks for pointing this out.
> > >>>>
> > >>>> This is more of an oversight through iterations, the runtime lookup
> > >>>> will fail to find a socket if the netns value is greater than the
> > >>>> range of a uint32 so I think it would actually make more sense to drop
> > >>>> the parameter size to u32 rather than u64 so that this would be
> > >>>> validated at load time rather than silently returning NULL because of
> > >>>> a bad parameter.
> > >>>
> > >>> ok. I was wondering if it was a u64 to handle nsid of 0 which as I
> > >>> understand it is a legal nsid. If you drop to u32, how do you know when
> > >>> nsid has been set?
> > >>
> > >> I was operating under the assumption that 0 represents the root netns
> > >> id, and cannot be assigned to another non-root netns.
> > >>
> > >> Looking at __peernet2id_alloc(), it seems to me like it attempts to
> > >> find a netns and if it cannot find one, returns 0, which then leads to
> > >> a scroll over the idr starting from 0 to INT_MAX to find a legitimate
> > >> id for the netns, so I think this is a fair assumption?
> > The NET_ID_ZERO trick is used to manage nsid 0 in net_eq_idr() (idr_for_each()
> > stops when the callback returns != 0).
> >
> > >>
> > >
> > > Maybe Nicolas can give a definitive answer; as I recall he added the
> > > NSID option. I have not had time to walk the code. But I do recall
> > > seeing an id of 0. e.g, on my dev box:
> > > $ ip netns
> > > vms (id: 0)
> > >
> > > And include/uapi/linux/net_namespace.h shows -1 as not assigned.
> > Yes, 0 is a valid value and can be assigned to any netns.
> > nsid are signed 32 bit values. Note that -1 (NETNSA_NSID_NOT_ASSIGNED) is used
> > by the kernel to express that the nsid is not assigned. It can also be used by
> > the user to let the kernel chooses a nsid.
> >
> > $ ip netns add foo
> > $ ip netns add bar
> > $ ip netns
> > bar
> > foo
> > $ ip netns set foo 0
> > $ ip netns set bar auto
> > $ ip netns
> > bar (id: 1)
> > foo (id: 0)
>
> OK, I'll fix this up then.
Here's what I have in mind:
@@ -2221,12 +2222,13 @@ union bpf_attr {
* **sizeof**\ (*tuple*\ **->ipv6**)
* Look for an IPv6 socket.
*
- * If the *netns* is zero, then the socket lookup table in the
- * netns associated with the *ctx* will be used. For the TC hooks,
- * this in the netns of the device in the skb. For socket hooks,
- * this in the netns of the socket. If *netns* is non-zero, then
- * it specifies the ID of the netns relative to the netns
- * associated with the *ctx*.
+ * If the *netns* is **BPF_F_SK_CURRENT_NS** or greater, then the
+ * socket lookup table in the netns associated with the *ctx* will
+ * will be used. For the TC hooks, this is the netns of the device
+ * in the skb. For socket hooks, this is the netns of the socket.
+ * If *netns* is less than **BPF_F_SK_CURRENT_NS**, then it
+ * specifies the ID of the netns relative to the netns associated
+ * with the *ctx*.
*
* All values for *flags* are reserved for future usage, and must
* be left at zero.
@@ -2409,6 +2411,9 @@ enum bpf_func_id {
/* BPF_FUNC_perf_event_output for sk_buff input context. */
#define BPF_F_CTXLEN_MASK (0xfffffULL << 32)
+/* BPF_FUNC_sk_lookup_tcp and BPF_FUNC_sk_lookup_udp flags. */
+#define BPF_F_SK_CURRENT_NS 0x80000000 /* For netns argument */
+
/* Mode for BPF_FUNC_skb_adjust_room helper. */
enum bpf_adj_room_mode {
BPF_ADJ_ROOM_NET,
Plus adjusting all of the internal types and the helper headers to use
u32. With the highest bit used to specify that the netns should be the
current netns, all other netns IDs should be available.
Powered by blists - more mailing lists