[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <76b863f1-1e5f-4d7c-9f39-aabb35865f69@gmail.com>
Date: Fri, 28 Jun 2024 09:48:34 +0100
From: Florian Fainelli <f.fainelli@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Youwan Wang <youwan@...china.com>, andrew@...n.ch, hkallweit1@...il.com,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to
suspend
On 6/28/2024 9:38 AM, Russell King (Oracle) wrote:
> On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
>>
>>
>> On 6/28/2024 9:17 AM, Russell King (Oracle) wrote:
>>> On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
>>>> If the PHY of the mido bus is enabled with Wake-on-LAN (WOL),
>>>> we cannot suspend the PHY. Although the WOL status has been
>>>> checked in phy_suspend(), returning -EBUSY(-16) would cause
>>>> the Power Management (PM) to fail to suspend. Since
>>>> phy_suspend() is an exported symbol (EXPORT_SYMBOL),
>>>> timely error reporting is needed. Therefore, an additional
>>>> check is performed here. If the PHY of the mido bus is enabled
>>>> with WOL, we skip calling phy_suspend() to avoid PM failure.
>>>>
>>>> log:
>>>> [ 322.631362] OOM killer disabled.
>>>> [ 322.631364] Freezing remaining freezable tasks
>>>> [ 322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>>>> [ 322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
>>>> [ 322.633052] YT8521 Gigabit Ethernet stmmac-0:01:
>>>> PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
>>>> [ 322.633071] YT8521 Gigabit Ethernet stmmac-0:01:
>>>> PM: failed to suspend: error -16
>>>> [ 322.669699] PM: Some devices failed to suspend, or early wake event detected
>>>> [ 322.669949] OOM killer enabled.
>>>> [ 322.669951] Restarting tasks ... done.
>>>> [ 322.671008] random: crng reseeded on system resumption
>>>> [ 322.671014] PM: suspend exit
>>>>
>>>> If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
>>>> WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
>>>> flag would cause the resume failure.
>>
>> Did you mean to write that if the YT8521 PHY driver entry set the
>> PHY_ALWAYS_CALL_SUSPEND flag, then it would cause an error during resume? If
>> so, why is that?
>
> It doesn't appear to do that - at least not in net-next, and not in
> mainline.
>
>>> I think the reason this is happening is because the PHY has WoL enabled
>>> on it without the kernel/netdev driver being aware that WoL is enabled.
>>> Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
>>> happen, but then we find unexpectedly that WoL is enabled on the PHY.
>>>
>>> However, whenever a user configures WoL, netdev->wol_enabled will be
>>> set when _any_ WoL mode is enabled and cleared only if all WoL modes
>>> are disabled.
>>>
>>> Thus, what we have is a de-sync between the kernel state and hardware
>>> state, leading to the suspend failing.
>>>
>>> I don't see anything in the motorcomm driver that requires suspend
>>> if WoL is enabled - yt8521_suspend() first checks to see whether WoL
>>> is enabled, and exits if it is.
>>>
>>> Andrew - how do you feel about reading the WoL state from the PHY and
>>> setting netdev->wol_enabled if any WoL is enabled on the PHY? That
>>> would mean that the netdev's WoL state is consistent with the PHY
>>> whether or not the user has configured WoL.
>>
>> Would not the situation described here be solved by having the Motorcomm PHY
>> driver set PHY_ALWAYS_CALL_SUSPEND since it deals with checking whether WoL
>> is enabled or not and will just return then.
>
> Is there a reason that netdev->wol_enabled shouldn't reflect the
> hardware configuration?
Unless there is some sort of Ethernet MAC driver bug that we are not
being made aware of, the only thing that I can of is happening here, is
that a SW/FW agent other than Linux would have enabled the PHY for
Wake-on-LAN, and there was no configuration done by the user via the
Ethernet MAC driver that attempted to enable Wake-on-LAN. In that case
only can I think of a disconnect between the HW and SW states?
>
> If netdev->wol_enabled is appropriately set, then it seems to me
> that there's little reason for motorcomm to be checking whether
> WoL is enabled in its suspend function - which means less driver
> specific code and driver specific behaviour.
>
That should work, too.
--
Florian
Powered by blists - more mailing lists