Message-ID: <CAM_iQpUqp1PRKfS6WcsZ16yjF4jjOrkTHX7Zdhrqo0nrE2VH1Q@mail.gmail.com>
Date: Sun, 6 Jun 2021 17:37:40 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
Changbin Du <changbin.du@...il.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
stable <stable@...r.kernel.org>,
David Laight <David.Laight@...lab.com>
Subject: Re: [PATCH] nsfs: fix oops when ns->ops is not provided
On Fri, Jun 4, 2021 at 2:54 AM Christian Brauner
<christian.brauner@...ntu.com> wrote:
>
> On Thu, Jun 03, 2021 at 03:52:29PM -0700, Cong Wang wrote:
> > On Wed, Jun 2, 2021 at 2:14 AM Christian Brauner
> > <christian.brauner@...ntu.com> wrote:
> > > But the point is that ns->ops should never be accessed when that
> > > namespace type is disabled. Or in other words, the bug is that something
> > > in netns makes use of namespace features when they are disabled. If we
> > > handle ->ops being NULL we might be papering over a real bug somewhere.
> >
> > It is merely a protocol between fs/nsfs.c and other namespace users,
> > so there is certainly no right or wrong here, the only question is which
> > one is better.
> >
> > >
> > > Jakub's proposal in the other mail makes sense and falls in line with
> > > how the rest of the netns getters are implemented. For example
> > > get_net_ns_by_fd():
> >
> > It does not make any sense to me. get_net_ns() merely increases
> > the netns refcount, which is certainly fine for init_net too, no matter
> > whether CONFIG_NET_NS is enabled or disabled. Returning EOPNOTSUPP
> > there is literally saying we do not support increasing init_net refcount,
> > which is of course false.
> >
> > > struct net *get_net_ns_by_fd(int fd)
> > > {
> > > return ERR_PTR(-EINVAL);
> > > }
> >
> > There is a huge difference between just increasing netns refcount
> > and retrieving it by fd, right? I have no idea why you bring this up,
> > calling them getters is missing their difference.
>
> This argument doesn't hold up. All netns helpers ultimately increase the
> reference count of the net namespace they find. And if any of them
> perform operations where they are called in environments where they
> need CONFIG_NET_NS they handle this case at compile time.
Let me put it more plainly: what is the protocol here for indicating
!CONFIG_XXX_NS? Clearly it must be ns->ops==NULL, because all
namespaces follow a similar pattern:
#ifdef CONFIG_NET_NS
net->ns.ops = &netns_operations;
#endif
Now you are arguing that this is not the protocol, and that the protocol
is instead that the getter passed to open_related_ns() returns an error
pointer.
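To make the ops==NULL convention concrete, here is a minimal userspace sketch. It is not kernel code: every mock_-prefixed name is hypothetical, and a runtime flag stands in for the CONFIG_NET_NS #ifdef so that both configurations can be exercised in one binary.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock only; all mock_* names are hypothetical stand-ins. */
struct mock_ns_operations {
    const char *name;
};

struct mock_ns_common {
    /* Left NULL when the namespace type is "compiled out". */
    const struct mock_ns_operations *ops;
};

static const struct mock_ns_operations mock_netns_operations = { "net" };

/*
 * Mirrors the "#ifdef CONFIG_NET_NS / ns.ops = ... / #endif" pattern,
 * with config_enabled standing in for the compile-time option.
 */
static void mock_ns_init(struct mock_ns_common *ns, int config_enabled)
{
    assert(ns != NULL);
    ns->ops = NULL;
    if (config_enabled)
        ns->ops = &mock_netns_operations;
}

/* Under this convention, "disabled" is signalled purely by ops == NULL. */
static int mock_ns_type_enabled(const struct mock_ns_common *ns)
{
    assert(ns != NULL);
    return ns->ops != NULL;
}
```

A caller would run mock_ns_init() once and then consult mock_ns_type_enabled() before touching ns->ops.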
>
> (Plus they are defined in a central place in net/net_namespace.{c,h}.
> That includes the low-level get_net() function and all the others.
> get_net_ns() is the only one that's defined out of band. So get_net_ns()
> currently is arguably also misplaced.)
Of course they are; only struct ns_common is generic. What's your
point? Each ns.ops is defined by its own namespace too.
>
> The problem I have with fixing this in nsfs is that it gives the
> impression that this is a bug in nsfs whereas it isn't and it
> potentially helps papering over other bugs.
Like I keep saying, this is just a protocol; there is no right or
wrong here. If the protocol is just ops==NULL, then there is nothing
wrong with using it.
(BTW, we have a lot of places that use ops==NULL as a protocol,
and they work really well.)
>
> get_net_ns() is only called for codepaths that call into nsfs via
> open_related_ns() and it's the only namespace that does this. But
I am pretty sure userns does the same:
        case NS_GET_USERNS:
                return open_related_ns(ns, ns_get_owner);
> open_related_ns() is only well defined if CONFIG_<NAMESPACE_TYPE> is
> set. For example, none of the procfs namespace f_ops will be set for
> !CONFIG_NET_NS. So clearly the socket specific getter here is buggy as
> it doesn't account for !CONFIG_NET_NS and it should be fixed.
If the protocol is just ops==NULL, then the core code should just check
ops==NULL. Pure and simple. I have no idea why you do not acknowledge
that every namespace intentionally leaves ops as NULL when its
config is disabled.
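As a rough illustration of what "the core checks ops==NULL" could look like, here is a userspace sketch. It is not the patch under discussion and not the real open_related_ns(): the mock_ names, the plain int refcount, and the choice of -EOPNOTSUPP are all assumptions made for the example. It also drops the reference taken by the getter on the error path, since leaking it was raised as a concern earlier in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace mock only; all mock_* names are hypothetical stand-ins. */
struct mock_ns_operations {
    const char *name;
};

struct mock_ns_common {
    const struct mock_ns_operations *ops; /* NULL when the type is disabled */
    int refcount;                         /* simplified, non-atomic counter */
};

/* Getter type: takes a reference on a related namespace and returns it. */
typedef struct mock_ns_common *mock_ns_get_t(struct mock_ns_common *ns);

static struct mock_ns_common *mock_get_ns(struct mock_ns_common *ns)
{
    assert(ns != NULL);
    ns->refcount++;
    return ns;
}

/*
 * Core-side check: treat ops == NULL as "namespace type disabled".
 * Returns 0 on success (the reference stays held by the caller),
 * or -EOPNOTSUPP after dropping the reference the getter took.
 */
static int mock_open_related_ns(struct mock_ns_common *ns, mock_ns_get_t *getter)
{
    struct mock_ns_common *relative = getter(ns);

    if (relative->ops == NULL) {
        relative->refcount--; /* do not leak the getter's reference */
        return -EOPNOTSUPP;
    }
    /* A real implementation would go on to use relative->ops here. */
    return 0;
}
```

With this shape, individual getters need no per-config special casing; whether that is preferable to failing in the getter itself is exactly the question being argued here.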
>
> Plus your fix leaks references to init netns without fixing get_net_ns()
> too.
I thought it was 100% clear that this patch is not from me.
Plus, the PoC patch from me actually suggests changing
open_related_ns(), not __ns_get_path(). I have no idea why you
both missed it.
Thanks.