lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAAVpQUDbYnzyBuN5SS9bSTfJDvVH18QNXrXGTYzZRb3Cgk=mQg@mail.gmail.com>
Date: Wed, 28 Jan 2026 09:13:59 -0800
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Qiliang Yuan <realwujing@...il.com>
Cc: brauner@...nel.org, davem@...emloft.net, edumazet@...gle.com, 
	horms@...nel.org, jlayton@...nel.org, kuba@...nel.org, 
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com, 
	sd@...asysnail.net, yuanql9@...natelecom.cn
Subject: Re: [PATCH net-next v3] netns: optimize netns cleaning by batching
 unhash_nsid calls

On Wed, Jan 28, 2026 at 4:19 AM Qiliang Yuan <realwujing@...il.com> wrote:
>
> Hi Kuniyuki,
>
> On Tue, Jan 27, 2026 at 6:05 PM Kuniyuki Iwashima <kuniyu@...gle.com> wrote:
> >
> > On Tue, Jan 27, 2026 at 5:22 PM Qiliang Yuan <realwujing@...il.com> wrote:
> > >
> > > Currently, unhash_nsid() scans the entire net_namespace_list for each
> > > netns in a destruction batch during cleanup_net(). This leads to
> > > O(M_batch * N_system * M_nsids) complexity, where M_batch is the
> > > destruction batch size, N_system is the total number of namespaces,
> > > and M_nsids is the number of IDs in each IDR.
> > >
> > > Reduce the complexity to O(N_system * M_nsids) by introducing an
> > > 'is_dying' flag to mark namespaces being destroyed. This allows
> > > unhash_nsid() to perform a single-pass traversal over the system's
> > > namespaces. In this pass, for each survivor namespace, iterate
> > > through its netns_ids and remove any mappings that point to a marked
> > > namespace, effectively eliminating the M_batch multiplier.
> > >
> > > Signed-off-by: Qiliang Yuan <realwujing@...il.com>
> > > Signed-off-by: Qiliang Yuan <yuanql9@...natelecom.cn>
> >
> > Why two SOBs with the same person ?
>
> - Signed-off-by: Qiliang Yuan <realwujing@...il.com> (Personal email)
> - Signed-off-by: Qiliang Yuan <yuanql9@...natelecom.cn> (Work email)
>
> My work email often has trouble receiving external mailing list replies,
> so I've included both to ensure I don't miss any feedback and to
> properly attribute the work. The v8 version should have everything
> matching correctly now.

You can just CC two of them and keep one SOB.


>
> > > diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> > > index a6e6a964a287..50fdd4f9bb3b 100644
> > > --- a/net/core/net_namespace.c
> > > +++ b/net/core/net_namespace.c
> > > @@ -624,9 +624,10 @@ void net_ns_get_ownership(const struct net *net, kuid_t *uid, kgid_t *gid)
> > >  }
> > >  EXPORT_SYMBOL_GPL(net_ns_get_ownership);
> > >
> > > -static void unhash_nsid(struct net *net, struct net *last)
> > > +static void unhash_nsid(struct net *last)
> > >  {
> > >         struct net *tmp;
> > > +
> > >         /* This function is only called from cleanup_net() work,
> > >          * and this work is the only process, that may delete
> > >          * a net from net_namespace_list. So, when the below
> > > @@ -636,20 +637,34 @@ static void unhash_nsid(struct net *net, struct net *last)
> > >         for_each_net(tmp) {
> > >                 int id;
> > >
> > > -               spin_lock(&tmp->nsid_lock);
> > > -               id = __peernet2id(tmp, net);
> > > -               if (id >= 0)
> > > -                       idr_remove(&tmp->netns_ids, id);
> > > -               spin_unlock(&tmp->nsid_lock);
> > > -               if (id >= 0)
> > > -                       rtnl_net_notifyid(tmp, RTM_DELNSID, id, 0, NULL,
> > > -                                         GFP_KERNEL);
> > > +               for (id = 0; ; id++) {
> >
> > Doesn't this rather slow down in a common case where
> > init_net has ids for other netns since it is never dismantled ?
>
> Yes, you're right. In the original code, we only scanned 'tmp' for specific 'net'
> which was being killed. Now we are scanning all IDs in 'tmp' to find any dying
> peers.
>
> If 'tmp' (like init_net) has many long-lived netns IDs, we end up iterating through
> them even if none of them are dying.
>
> To address this and avoid the overhead, I can use idr_for_each() with a callback
> to find and collect dying IDs,

idr_for_each() sounds better to me.

If we replace list_del_rcu(&net->list); with
list_del_init_rcu(&net->list);, we can check net->list.pprev
instead of adding dying_net, which is a bit racy since
idr_for_each() could return a net which would have been
processed in the next cleanup_net() invocation.


> or keep the O(M_batch) outer loop but optimize the
> inner part if it's truly problematic.
>
> However, given that this is the cleanup path, I thought the batching benefit
> (N_system vs M_batch * N_system) would outweigh the per-netns IDR scan.
>
> I'll revert to a more efficient iteration or use idr_for_each() to handle this
> gracefully in v4.
>
> Thanks,
> Qiliang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ