Message-ID: <Z8zLwzMl1wU6va7d@mini-arch>
Date: Sat, 8 Mar 2025 14:59:15 -0800
From: Stanislav Fomichev <stfomichev@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Kohei Enju <enjuk@...zon.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...zon.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Ahmed Zaki <ahmed.zaki@...el.com>,
Stanislav Fomichev <sdf@...ichev.me>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Kohei Enju <kohei.enju@...il.com>
Subject: Re: [PATCH net-next v1] dev: remove netdev_lock() and
netdev_lock_ops() in register_netdevice().
On 03/08, Jakub Kicinski wrote:
> On Sat, 8 Mar 2025 13:18:13 -0800 Jakub Kicinski wrote:
> > On Sun, 9 Mar 2025 05:37:18 +0900 Kohei Enju wrote:
> > > Both netdev_lock() and netdev_lock_ops() are called before
> > > list_netdevice() in register_netdevice().
> > > No other context can access the struct net_device, so we don't need these
> > > locks in this context.
> >
> > Doesn't sysfs get registered earlier?
> > I'm afraid not being able to take the lock from the registration
> > path ties our hands too much. Maybe we need to make a more serious
> > attempt at letting the caller take the lock?
>
> Looking closer at the report - we are violating the contract that only
> drivers which opted in get their ops called under the instance lock.
> iavf had a similar problem but it had to opt in. WiFi doesn't.
>
> Maybe we can bring the address semaphore back?
> We just need to take it before the ops lock in do_setlink.
> A bit ugly but would work?
I remember hitting another lockdep circular dependency report with the addr
sema, but maybe taking it before the ops lock will fix it, not sure.
But coming back to "No other context can access the struct net_device,
so we don't need these locks in this context.": what if we move the
netdev_set_addr_lockdep_class() call down a bit, right before
list_netdevice() happens? Would that help with lockdep?