Message-ID: <20240620191148.26fc09ac@kernel.org>
Date: Thu, 20 Jun 2024 19:11:48 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Andrew Lunn <andrew@...n.ch>
Cc: Eric Dumazet <edumazet@...gle.com>, "David S . Miller"
<davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, Ziwei Xiao
<ziweixiao@...gle.com>, Praveen Kaligineedi <pkaligineedi@...gle.com>,
Harshitha Ramamurthy <hramamurthy@...gle.com>, Willem de Bruijn
<willemb@...gle.com>, Jeroen de Borst <jeroendb@...gle.com>, Shailend Chand
<shailend@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 3/6] net: ethtool: perform pm duties outside of
rtnl lock
On Fri, 21 Jun 2024 02:59:54 +0200 Andrew Lunn wrote:
> > I also keep wondering whether we shouldn't use this as an opportunity
> > to introduce a "netdev instance lock". I think you mentioned we should
> > move away from rtnl for locking ethtool and ndos since most drivers
> > don't care at all about global state. Doing that is a huge project,
> > but maybe this is where we start?
>
> Is there much benefit to the average system?
>
> Embedded systems typically have 1 or 2 netdevs. Laptops, desktops and
> the like have one, maybe two netdevs. VMs typically have one netdev.
> So we are talking about high end switches with lots of ports and
> servers hosting lots of VMs. So of the around 500 netdev drivers we
> have, only maybe a dozen drivers would benefit?
>
> It seems unlikely those 500 drivers will be reviewed and declared safe
> to not take RTNL. So maybe a better way forward is that struct
> ethtool_ops gains a flag indicating its ops can be called without
> first taking RTNL. Somebody can then look at those dozen drivers, and
> we leave the other 490 alone and don't need to worry about
> regressions.
Right, we still need an opt in.
My question is more whether we should offer only an opt-out from
rtnl_lock, leaving the driver on its own beyond that (which, judging by
what I see reviewing driver code, I believe will end pretty badly), or
also offer a per-netdev instance lock. Give the drivers a choice:
- rtnl
- netdev_lock(dev)
- (my least preferred) nothing.
The netdev lock would also be useful for things like napi and queue
stats, RSS contexts, and whatever else we add for drivers in the core.
For NAPI / queue info via netlink we currently require rtnl_lock;
taking a global lock to access a couple of per-netdev structs feels
quite wasteful :(
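
To make the three-way choice concrete, here is a rough userspace sketch of how a core helper might pick the lock before calling into a driver. It is not kernel code: pthread mutexes stand in for the real locks, and every name in it (ops_lock_mode, fake_netdev, fake_rtnl, ops_lock_for, call_dev_op, read_rx_packets) is hypothetical and does not exist in the tree. It only illustrates the opt-in idea, with rtnl remaining the default.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical opt-in values; none of these names exist in the kernel. */
enum ops_lock_mode {
	OPS_LOCK_RTNL,		/* default: global lock, as today */
	OPS_LOCK_INSTANCE,	/* per-device instance lock only */
	OPS_LOCK_NONE,		/* driver does all of its own locking */
};

/* Userspace stand-in for the global rtnl lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Userspace stand-in for a net_device with its own lock. */
struct fake_netdev {
	enum ops_lock_mode lock_mode;
	pthread_mutex_t instance_lock;
	unsigned long rx_packets;
};

static void fake_netdev_init(struct fake_netdev *dev, enum ops_lock_mode mode)
{
	dev->lock_mode = mode;
	dev->rx_packets = 0;
	pthread_mutex_init(&dev->instance_lock, NULL);
}

/* Pick the lock the core would hold around a driver op, or NULL for none. */
static pthread_mutex_t *ops_lock_for(struct fake_netdev *dev)
{
	switch (dev->lock_mode) {
	case OPS_LOCK_RTNL:
		return &fake_rtnl;
	case OPS_LOCK_INSTANCE:
		return &dev->instance_lock;
	case OPS_LOCK_NONE:
		return NULL;
	}
	return &fake_rtnl;	/* unknown value: stay conservative */
}

typedef unsigned long (*dev_op_t)(struct fake_netdev *dev);

/* Run one driver op under whichever lock the device opted into. */
static unsigned long call_dev_op(struct fake_netdev *dev, dev_op_t op)
{
	pthread_mutex_t *lock;
	unsigned long ret;

	assert(dev != NULL && op != NULL);
	lock = ops_lock_for(dev);
	if (lock)
		pthread_mutex_lock(lock);
	ret = op(dev);
	if (lock)
		pthread_mutex_unlock(lock);
	return ret;
}

/* Example op: read a per-device counter, touching no global state. */
static unsigned long read_rx_packets(struct fake_netdev *dev)
{
	return dev->rx_packets;
}
```

In this sketch, two devices that opted into OPS_LOCK_INSTANCE can have their stats read concurrently without contending on fake_rtnl, while a device left at the default still serializes on the global lock.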