Message-ID: <Z8zYBUwQlQdDeLLC@mini-arch>
Date: Sat, 8 Mar 2025 15:51:33 -0800
From: Stanislav Fomichev <stfomichev@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Kohei Enju <enjuk@...zon.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...zon.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Ahmed Zaki <ahmed.zaki@...el.com>,
Stanislav Fomichev <sdf@...ichev.me>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Kohei Enju <kohei.enju@...il.com>
Subject: Re: [PATCH net-next v1] dev: remove netdev_lock() and
netdev_lock_ops() in register_netdevice().
On 03/08, Stanislav Fomichev wrote:
> On 03/08, Jakub Kicinski wrote:
> > On Sat, 8 Mar 2025 13:18:13 -0800 Jakub Kicinski wrote:
> > > On Sun, 9 Mar 2025 05:37:18 +0900 Kohei Enju wrote:
> > > > Both netdev_lock() and netdev_lock_ops() are called before
> > > > list_netdevice() in register_netdevice().
> > > > No other context can access the struct net_device, so we don't need these
> > > > locks in this context.
> > >
> > > Doesn't sysfs get registered earlier?
> > > I'm afraid not being able to take the lock from the registration
> > > path ties our hands too much. Maybe we need to make a more serious
> > > attempt at letting the caller take the lock?
> >
> > Looking closer at the report - we are violating the contract that only
> > drivers which opted in get their ops called under the instance lock.
> > iavf had a similar problem but it had to opt in. WiFi doesn't.
> >
> > Maybe we can bring the address semaphore back?
> > We just need to take it before the ops lock in do_setlink.
> > A bit ugly but would work?
>
> I remember I was having another lockdep circular report with the addr
> sema, but maybe moving it before the ops lock will fix it, not sure.
>
> But coming back to "No other context can access the struct net_device,
> so we don't need these locks in this context.". What if we move
> netdev_set_addr_lockdep_class() call down a bit? Right before list_netdevice
> happens. Will it help with the lockdep?
Hmm, netdev_set_addr_lockdep_class() does not touch the instance lock :-(
But basically: call lockdep_set_novalidate_class() on it early and undo
that before list_netdevice()...