Message-ID: <Z-Q81rFZ2BW_7fYY@mini-arch>
Date: Wed, 26 Mar 2025 10:43:50 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Cosmin Ratiu <cratiu@...dia.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"sdf@...ichev.me" <sdf@...ichev.me>,
	"edumazet@...gle.com" <edumazet@...gle.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>
Subject: Re: [PATCH net-next 2/9] net: hold instance lock during
 NETDEV_REGISTER/UP/UNREGISTER

On 03/26, Cosmin Ratiu wrote:
> On Wed, 2025-03-26 at 15:03 +0000, Cosmin Ratiu wrote:
> > On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> > > @@ -2072,8 +2087,8 @@ static void __move_netdevice_notifier_net(struct net *src_net,
> > >  					  struct net *dst_net,
> > >  					  struct notifier_block *nb)
> > >  {
> > > -	__unregister_netdevice_notifier_net(src_net, nb);
> > > -	__register_netdevice_notifier_net(dst_net, nb, true);
> > > +	__unregister_netdevice_notifier_net(src_net, nb, false);
> > > +	__register_netdevice_notifier_net(dst_net, nb, true, false);
> > >  }
> > 
> > I tested with your (and the rest of Jakub's) patches.
> > The problem with this approach is that when a netdev's net is changed,
> > its lock will be acquired, but the notifiers for ALL netdevs in the old
> > and the new namespace will be called, which will result in correct
> > behavior for that device and lockdep_assert_held failure for all others.
> 
> But a thing I learned many years ago about locking is that locks
> should protect data, not code. Shouldn't we avoid locking deep call
> hierarchies (like notifiers) with the instance lock and instead focus
> on 1) which fields need to be protected by the lock and 2) reducing
> the critical section length for those fields?
> 
> That plus reference counting usually does the trick and should avoid
> these ugly deadlocks.

We want the operations to look atomic from userspace if possible:
the whole device is either moved or not, and some other thread should
not be able to change, say, the MTU midway through the move.
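
To illustrate the atomicity point, here is a minimal userspace sketch. It
assumes a pthread mutex standing in for the instance lock; struct toy_dev
and the toy_dev_* helpers are hypothetical names for illustration, not
kernel API. The move holds the lock across both the unregister and the
register step, so a concurrent MTU change can only run before or after
the whole move:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a net device; not the kernel's struct net_device. */
struct toy_dev {
	pthread_mutex_t lock;	/* plays the role of the per-device instance lock */
	int netns_id;		/* which "namespace" the device currently lives in */
	unsigned int mtu;
	bool registered;
};

static void toy_dev_init(struct toy_dev *dev, int netns_id, unsigned int mtu)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->netns_id = netns_id;
	dev->mtu = mtu;
	dev->registered = true;
}

/* Move the device between namespaces in a single critical section. */
static void toy_dev_move(struct toy_dev *dev, int new_netns_id)
{
	pthread_mutex_lock(&dev->lock);
	dev->registered = false;	/* "unregister" step in the old namespace */
	dev->netns_id = new_netns_id;
	dev->registered = true;		/* "register" step in the new namespace */
	pthread_mutex_unlock(&dev->lock);
}

/* Change the MTU; takes the same lock, so it never runs in the middle of a move. */
static void toy_dev_set_mtu(struct toy_dev *dev, unsigned int mtu)
{
	pthread_mutex_lock(&dev->lock);
	/* A half-moved (unregistered) device must never be visible here. */
	assert(dev->registered);
	dev->mtu = mtu;
	pthread_mutex_unlock(&dev->lock);
}
```

If toy_dev_move() dropped the lock between its two steps, the assertion
in toy_dev_set_mtu() could fire from another thread, which is the kind of
mid-way visibility we are trying to rule out.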

And we do try to clarify what's specifically protected in terms of data:
https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/include/linux/netdevice.h#n2494

But the notifiers are super tricky. There are years of natural growth
with the assumption of a single rtnl lock :-(
