Message-ID: <20251110-elastisch-endeffekt-747abc5a614a@brauner>
Date: Mon, 10 Nov 2025 09:41:56 +0100
From: Christian Brauner <brauner@...nel.org>
To: Hillf Danton <hdanton@...a.com>
Cc: linux-fsdevel@...r.kernel.org, Jann Horn <jannh@...gle.com>, 
	Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org, 
	syzbot+1957b26299cf3ff7890c@...kaller.appspotmail.com
Subject: Re: [PATCH 0/8] ns: fixes for namespace iteration and active
 reference counting

On Mon, Nov 10, 2025 at 06:55:26AM +0800, Hillf Danton wrote:
> On Sun, 09 Nov 2025 22:11:21 +0100 Christian Brauner wrote:
> > * Make sure to initialize the active reference count for the initial
> >   network namespace and prevent __ns_common_init() from returning too
> >   early.
> > 
> > * Make sure that passive reference counts are dropped outside of rcu
> >   read locks as some namespaces such as the mount namespace do in fact
> >   sleep when putting the last reference.
> > 
> > * The setns() system call supports:
> > 
> >   (1) namespace file descriptors (nsfd)
> >   (2) process file descriptors (pidfd)
> > 
> >   When using nsfds the namespaces will remain active because they are
> >   pinned by the vfs. However, when pidfds are used things are more
> >   complicated.
> > 
> >   If the target task exits and passes through exit_nsproxy_namespaces()
> >   or is reaped and thus also passes through exit_cred_namespaces() after
> >   the setns()'ing task has called prepare_nsset() but before it calls
> >   commit_nsset(), the active reference count of the set of namespaces it
> >   wants to setns() to might have been dropped already:
> > 
> >     P1                                                              P2
> > 
> >     pid_p1 = clone(CLONE_NEWUSER | CLONE_NEWNET | CLONE_NEWNS)
> >                                                                     pidfd = pidfd_open(pid_p1)
> >                                                                     setns(pidfd, CLONE_NEWUSER | CLONE_NEWNET | CLONE_NEWNS)
> >                                                                     prepare_nsset()
> > 
> >     exit(0)
> >     // ns->__ns_active_ref        == 1
> >     // parent_ns->__ns_active_ref == 1
> >     -> exit_nsproxy_namespaces()
> >     -> exit_cred_namespaces()
> > 
> >     // ns_active_ref_put() will also put
> >     // the reference on the owner of the
> >     // namespace. If the only reason the
> >     // owning namespace was alive was
> >     // because it was a parent of @ns
> >     // its active reference count now goes
> >     // to zero... --------------------------------
> >     //                                           |
> >     // ns->__ns_active_ref        == 0           |
> >     // parent_ns->__ns_active_ref == 0           |
> >                                                  |                  commit_nsset()
> >                                                  -----------------> // If setns()
> >                                                                     // now manages to install the namespaces
> >                                                                     // it will call ns_active_ref_get()
> >                                                                     // on them thus bumping the active reference
> >                                                                     // count from zero again but without also
> >                                                                     // taking the required reference on the owner.
> >                                                                     // Thus we get:
> >                                                                     //
> >                                                                     // ns->__ns_active_ref        == 1
> >                                                                     // parent_ns->__ns_active_ref == 0
> > 
> >     When later someone does ns_active_ref_put() on @ns it will underflow
> >     parent_ns->__ns_active_ref leading to a splat from our asserts
> >     thinking there are still active references when in fact the counter
> >     just underflowed.
> > 
> >   So resurrect the ownership chain if necessary as well. If the caller
> >   succeeded in grabbing passive references to the set of namespaces, the
> >   setns() should simply succeed even if the target task exits or gets
> >   reaped in the meantime.
> > 
> >   The race is rare and can only be triggered when using pidfds to setns()
> >   to namespaces. Also note that active references on initial namespaces
> >   are nops.
> > 
> >   Since we now always handle parent references directly we can drop
> >   ns_ref_active_get_owner() when adding a namespace to a namespace tree.
> >   This is now all handled uniformly in the places where the new namespaces
> >   actually become active.
> > 
> > Signed-off-by: Christian Brauner <brauner@...nel.org>
> > ---
> >
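
For readers following the quoted timeline, the underflow can be reproduced in a small userspace model. This is only a sketch under stated assumptions: the Namespace class, the three helper functions and simulate() are hypothetical names made up for illustration, not kernel code, and the model ignores passive references, locking and the nop behaviour of initial namespaces. It assumes only what the quoted text says: putting the last active reference also puts the reference on the owner, and the fix takes the owner reference again when a count is bumped from zero.

```python
class Underflow(Exception):
    """Raised when a put would drive an active count below zero."""


class Namespace:
    # Hypothetical stand-in for a namespace with an owner and an active count.
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner
        self.active = 0


def active_get_naive(ns):
    # Models the racy behaviour: bump @ns only, never the owner.
    ns.active += 1


def active_get_resurrect(ns):
    # Models the fix: whenever a count goes 0 -> 1, the owner needs a
    # reference as well, so walk up the ownership chain.
    while ns is not None:
        ns.active += 1
        if ns.active != 1:
            break
        ns = ns.owner


def active_put(ns):
    # Dropping the last active reference also drops the one held on the owner.
    while ns is not None:
        if ns.active == 0:
            raise Underflow(ns.name)
        ns.active -= 1
        if ns.active != 0:
            break
        ns = ns.owner


def simulate(get_on_commit):
    parent = Namespace("parent_ns")
    ns = Namespace("ns", owner=parent)
    active_get_resurrect(ns)            # P1: clone(), ns and parent_ns == 1
    active_put(ns)                      # P1: exit(0), both counts drop to 0
    after_exit = (ns.active, parent.active)
    get_on_commit(ns)                   # P2: commit_nsset() installs @ns
    after_commit = (ns.active, parent.active)
    try:
        active_put(ns)                  # someone later puts @ns
        underflowed = False
    except Underflow:
        underflowed = True
    return after_exit, after_commit, underflowed


print(simulate(active_get_naive))
print(simulate(active_get_resurrect))
```

With the naive get, the model ends up in the state shown in the diagram (ns at 1, parent_ns at 0) and the later put trips the underflow check. With the resurrecting get, the parent is pinned again and the final put unwinds cleanly.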
> FYI namespace-6.19.fixes failed to survive the syzbot test [1].
> 
> [1] Subject: Re: [syzbot] [lsm?] WARNING in put_cred_rcu
> https://lore.kernel.org/lkml/690eedba.a70a0220.22f260.0075.GAE@google.com/

This used a stale branch that existed for testing:

Tested on:

commit:         00f5a3b5 DO NOT MERGE - This is purely for testing a b..

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

git tree:       https://github.com/brauner/linux.git namespace-6.19.fixes
console output: https://syzkaller.appspot.com/x/log.txt?x=17a46a58580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e31f5f45f87b6763
dashboard link: https://syzkaller.appspot.com/bug?extid=553c4078ab14e3cf3358
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8

Note: no patches were applied.
