Message-ID: <cc1597b12b617cbb62d325285c3a50bfb2b1ce1a.camel@nvidia.com>
Date: Wed, 26 Mar 2025 15:24:03 +0000
From: Cosmin Ratiu <cratiu@...dia.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "sdf@...ichev.me"
<sdf@...ichev.me>
CC: "edumazet@...gle.com" <edumazet@...gle.com>, "davem@...emloft.net"
<davem@...emloft.net>, "kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>
Subject: Re: [PATCH net-next 2/9] net: hold instance lock during
NETDEV_REGISTER/UP/UNREGISTER
On Wed, 2025-03-26 at 15:03 +0000, Cosmin Ratiu wrote:
> On Tue, 2025-03-25 at 14:30 -0700, Stanislav Fomichev wrote:
> > @@ -2072,8 +2087,8 @@ static void __move_netdevice_notifier_net(struct net *src_net,
> >  					  struct net *dst_net,
> >  					  struct notifier_block *nb)
> >  {
> > -	__unregister_netdevice_notifier_net(src_net, nb);
> > -	__register_netdevice_notifier_net(dst_net, nb, true);
> > +	__unregister_netdevice_notifier_net(src_net, nb, false);
> > +	__register_netdevice_notifier_net(dst_net, nb, true, false);
> >  }
>
> I tested with your (and the rest of Jakub's) patches.
> The problem with this approach is that when a netdev's net is changed,
> its lock will be acquired, but the notifiers for ALL netdevs in the old
> and the new namespace will be called, which will result in correct
> behavior for that device and lockdep_assert_held failure for all
> others.

But one thing I learned many years ago about locking is that locks
should protect data, not code. Shouldn't we avoid holding the instance
lock across deep call hierarchies (like notifiers) and instead focus
on 1) which fields need to be protected by the lock and 2) reducing
the critical section length for those fields?
That plus reference counting usually does the trick and should avoid
these ugly deadlocks.
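
To make the idea concrete, here is a rough userspace sketch (pthreads and
C11 atomics, not kernel code). Everything in it is hypothetical and only
for illustration: struct dev, dev_hold(), dev_put(), dev_set_mtu() and
record_mtu() are not kernel APIs. The lock covers a single field, the
critical section is just the store, and the callback runs unlocked while
a reference keeps the object alive.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct net_device. */
struct dev {
	pthread_mutex_t lock;	/* protects 'mtu' and nothing else */
	int mtu;
	atomic_int refcnt;	/* keeps the object alive outside the lock */
};

typedef void (*notifier_fn)(struct dev *d, int new_mtu);

static struct dev *dev_alloc(int mtu)
{
	struct dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	pthread_mutex_init(&d->lock, NULL);
	d->mtu = mtu;
	atomic_init(&d->refcnt, 1);	/* the caller owns the first reference */
	return d;
}

static void dev_hold(struct dev *d)
{
	atomic_fetch_add(&d->refcnt, 1);
}

/* Drops one reference; returns 1 if this call freed the object. */
static int dev_put(struct dev *d)
{
	if (atomic_fetch_sub(&d->refcnt, 1) != 1)
		return 0;
	pthread_mutex_destroy(&d->lock);
	free(d);
	return 1;
}

/* Lock only around the field update, then notify with the lock dropped. */
static void dev_set_mtu(struct dev *d, int mtu, notifier_fn notify)
{
	dev_hold(d);			/* pin the object across the callback */

	pthread_mutex_lock(&d->lock);
	d->mtu = mtu;			/* the entire critical section */
	pthread_mutex_unlock(&d->lock);

	notify(d, mtu);			/* deep call chain runs unlocked */
	dev_put(d);
}

static int notified_mtu;	/* value passed to the callback */
static int locked_mtu;		/* value the callback read under the lock */

/*
 * Example notifier that takes the lock itself. This would self-deadlock
 * if dev_set_mtu() called it with d->lock still held.
 */
static void record_mtu(struct dev *d, int new_mtu)
{
	notified_mtu = new_mtu;
	pthread_mutex_lock(&d->lock);
	locked_mtu = d->mtu;
	pthread_mutex_unlock(&d->lock);
}
```

A caller would do dev_set_mtu(d, 9000, record_mtu) and release its own
reference with dev_put(d) when done. Since the notifier is never entered
with the lock held, it is free to take whatever locks it needs, and
there is nothing for an assert-held check to trip over on other devices.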
Cosmin.