Message-ID: <20231206-gutmenschen-freie-5da710dfa4ab@brauner>
Date: Wed, 6 Dec 2023 17:50:25 +0100
From: Christian Brauner <brauner@...nel.org>
To: Jann Horn <jannh@...gle.com>
Cc: Pablo Neira Ayuso <pablo@...filter.org>,
Jozsef Kadlecsik <kadlec@...filter.org>,
Florian Westphal <fw@...len.de>,
netfilter-devel <netfilter-devel@...r.kernel.org>,
coreteam@...filter.org, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Network Development <netdev@...r.kernel.org>,
kernel list <linux-kernel@...r.kernel.org>
Subject: Re: Is xt_owner's owner_mt() racy with sock_orphan()? [worse with
new TYPESAFE_BY_RCU file lifetime?]
On Wed, Dec 06, 2023 at 03:38:50PM +0100, Jann Horn wrote:
> On Wed, Dec 6, 2023 at 2:58 PM Christian Brauner <brauner@...nel.org> wrote:
> >
> > On Tue, Dec 05, 2023 at 06:08:29PM +0100, Jann Horn wrote:
> > > On Tue, Dec 5, 2023 at 5:40 PM Jann Horn <jannh@...gle.com> wrote:
> > > >
> > > > Hi!
> > > >
> > > > I think this code is racy, but testing that seems like a pain...
> > > >
> > > > owner_mt() in xt_owner runs in context of a NF_INET_LOCAL_OUT or
> > > > NF_INET_POST_ROUTING hook. It first checks that sk->sk_socket is
> > > > non-NULL, then checks that sk->sk_socket->file is non-NULL, then
> > > > accesses the ->f_cred of that file.
> > > >
> > > > I don't see anything that protects this against a concurrent
> > > > sock_orphan(), which NULLs out the sk->sk_socket pointer, if we're in
> > >
> > > Ah, and all the other users of ->sk_socket in net/netfilter/ do it
> > > under the sk_callback_lock... so I guess the fix would be to add the
> > > same in owner_mt?
> >
> > In your other mail you wrote:
> >
> > > I also think we have no guarantee here that the socket's ->file won't
> > > go away due to a concurrent __sock_release(), which could cause us to
> > > continue reading file credentials out of a file whose refcount has
> > > already dropped to zero?
> >
> > Is this an independent worry or can the concurrent __sock_release()
> > issue only happen due to a sock_orphan() having happened first? I think
> > that it requires a sock_orphan() having happened, presumably because the
> > socket gets marked SOCK_DEAD and can thus be released via
> > __sock_release() asynchronously?
> >
> > If so then taking sk_callback_lock() in owner_mt() should fix this.
> > (Otherwise we might need an additional get_active_file() on
> > sk->sk_socket->file in owner_mt() in addition to the other fix.)
>
> My understanding is that it could only happen due to a sock_orphan()
> having happened first, and so just sk_callback_lock() should probably
> be a sufficient fix. (I'm not an expert on net subsystem locking rules
> though.)
Ok, so as I suspected. That's good.