[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3e7ae1f5-77e3-a561-2d6b-377026b1fd26@intel.com>
Date: Wed, 6 Dec 2023 09:46:18 +0100
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Johannes Berg <johannes@...solutions.net>, <netdev@...r.kernel.org>
CC: Marc MERLIN <marc@...lins.org>, Jesse Brandeburg
<jesse.brandeburg@...el.com>, Tony Nguyen <anthony.l.nguyen@...el.com>,
<intel-wired-lan@...ts.osuosl.org>, Heiner Kallweit <hkallweit1@...il.com>,
Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Subject: Re: [RFC PATCH] net: ethtool: do runtime PM outside RTNL
On 12/5/23 20:48, Johannes Berg wrote:
> On Tue, 2023-12-05 at 06:19 +0100, Przemek Kitszel wrote:
>> On 12/4/23 20:07, Johannes Berg wrote:
>>> From: Johannes Berg <johannes.berg@...el.com>
>>>
>>> As reported by Marc MERLIN in [1], at least one driver (igc)
>>
>> perhaps Reported-by tag? (I know this is RFC as of now)
>
> I guess.
>
>>> wants/needs to acquire the RTNL inside suspend/resume ops,
>>> which can be called from here in ethtool if runtime PM is
>>> enabled.
>>>
>>> [1] https://lore.kernel.org/r/20231202221402.GA11155@merlins.org
>>>
>>> Allow this by doing runtime PM transitions without the RTNL
>>> held. For the ioctl to have the same operations order, this
>>> required reworking the code to separately check validity and
>>> do the operation. For the netlink code, this now has to do
>>> the runtime_pm_put a bit later.
>>>
>>> Signed-off-by: Johannes Berg <johannes.berg@...el.com>
>>> ---
>>> net/ethtool/ioctl.c | 71 ++++++++++++++++++++++++++-----------------
>>> net/ethtool/netlink.c | 32 ++++++++-----------
>>> 2 files changed, 56 insertions(+), 47 deletions(-)
>>>
>> Thank you for the patch,
>>
>> I like the idea of split into validate + do for dev_ethtool(),
>> what minimizes unneeded PM touching. Moving pm_runtime_get_sync() out of
>> RTNL is also a great improvement per se. Also from the pure coding
>> perspective I see no obvious flaws in the patch. I think that igc code
>> was just accidental to the issue, in a way that it was not deliberate to
>> hold RTNL for extended periods.
>
> Well Jakub was arguing igc shouldn't be taking rtnl in suspend/resume,
> maybe, but dunno.
That sounds right too; one could argue if your fix is orthogonal to that
or not. I would say that your fix makes core net code more robust
against drivers from past millennia. :)
igc folks are notified, no idea how much time it would take to propose
a fix.
>
>> With your patch fixing the bug, there is
>> no point with waiting IMO, so
>>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
>
> Well, according to the checks, the patch really should use
> netdev_get_by_name() and netdev_put()? But I don't know how to do that
> on short-term stack thing ... maybe it doesn't have to?
Nice to have such checks :)
You need some &netdevice_tracker, perhaps one added into struct net
or other place that would allow to track it at ethtool level.
"short term stack thing" does not relieve us from good coding practices,
but perhaps "you just replaced __dev_get_by_name() call by
dev_get_by_name()" to fix a bug would ;) - with transition to tracked
alloc as a next series to be promised :)
anyway, I'm fresh here, and would love to know what others think about
>
> johannes
Powered by blists - more mailing lists