Message-ID: <76b863f1-1e5f-4d7c-9f39-aabb35865f69@gmail.com>
Date: Fri, 28 Jun 2024 09:48:34 +0100
From: Florian Fainelli <f.fainelli@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Youwan Wang <youwan@...china.com>, andrew@...n.ch, hkallweit1@...il.com,
 davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: phy: phy_device: fix PHY WOL enabled, PM failed to
 suspend



On 6/28/2024 9:38 AM, Russell King (Oracle) wrote:
> On Fri, Jun 28, 2024 at 09:25:54AM +0100, Florian Fainelli wrote:
>>
>>
>> On 6/28/2024 9:17 AM, Russell King (Oracle) wrote:
>>> On Fri, Jun 28, 2024 at 02:03:18PM +0800, Youwan Wang wrote:
>>>> If the PHY of the mdio bus is enabled with Wake-on-LAN (WOL),
>>>> we cannot suspend the PHY. Although the WOL status has been
>>>> checked in phy_suspend(), returning -EBUSY(-16) would cause
>>>> the Power Management (PM) to fail to suspend. Since
>>>> phy_suspend() is an exported symbol (EXPORT_SYMBOL),
>>>> timely error reporting is needed. Therefore, an additional
>>>> check is performed here. If the PHY of the mdio bus is enabled
>>>> with WOL, we skip calling phy_suspend() to avoid PM failure.
>>>>
>>>> log:
>>>> [  322.631362] OOM killer disabled.
>>>> [  322.631364] Freezing remaining freezable tasks
>>>> [  322.632536] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>>>> [  322.632540] printk: Suspending console(s) (use no_console_suspend to debug)
>>>> [  322.633052] YT8521 Gigabit Ethernet stmmac-0:01: PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x110 [libphy] returns -16
>>>> [  322.633071] YT8521 Gigabit Ethernet stmmac-0:01: PM: failed to suspend: error -16
>>>> [  322.669699] PM: Some devices failed to suspend, or early wake event detected
>>>> [  322.669949] OOM killer enabled.
>>>> [  322.669951] Restarting tasks ... done.
>>>> [  322.671008] random: crng reseeded on system resumption
>>>> [  322.671014] PM: suspend exit
>>>>
>>>> If the YT8521 driver adds phydrv->flags, ask the YT8521 driver to process
>>>> WOL at suspend and resume time, the phydev->suspended_by_mdio_bus=1
>>>> flag would cause the resume failure.
>>
>> Did you mean to write that if the YT8521 PHY driver entry set the
>> PHY_ALWAYS_CALL_SUSPEND flag, then it would cause an error during resume? If
>> so, why is that?
> 
> It doesn't appear to do that - at least not in net-next, and not in
> mainline.
> 
>>> I think the reason this is happening is because the PHY has WoL enabled
>>> on it without the kernel/netdev driver being aware that WoL is enabled.
>>> Thus, mdio_bus_phy_may_suspend() returns true, allowing the suspend to
>>> happen, but then we find unexpectedly that WoL is enabled on the PHY.
>>>
>>> However, whenever a user configures WoL, netdev->wol_enabled will be
>>> set when _any_ WoL mode is enabled and cleared only if all WoL modes
>>> are disabled.
>>>
>>> Thus, what we have is a de-sync between the kernel state and hardware
>>> state, leading to the suspend failing.
>>>
>>> I don't see anything in the motorcomm driver that requires suspend
>>> if WoL is enabled - yt8521_suspend() first checks to see whether WoL
>>> is enabled, and exits if it is.
>>>
>>> Andrew - how do you feel about reading the WoL state from the PHY and
>>> setting netdev->wol_enabled if any WoL is enabled on the PHY? That
>>> would mean that the netdev's WoL state is consistent with the PHY
>>> whether or not the user has configured WoL.
>>
>> Would not the situation described here be solved by having the Motorcomm PHY
>> driver set PHY_ALWAYS_CALL_SUSPEND, since it already checks whether WoL is
>> enabled and simply returns if so?
> 
> Is there a reason that netdev->wol_enabled shouldn't reflect the
> hardware configuration?

Unless there is some sort of Ethernet MAC driver bug that we are not
being made aware of, the only thing that I can think of happening here is
that a SW/FW agent other than Linux enabled the PHY for Wake-on-LAN, and
no configuration was done by the user via the Ethernet MAC driver to
enable Wake-on-LAN. Only in that case can I think of a disconnect between
the HW and SW states.
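
To make that failure path concrete, here is a minimal userspace sketch of the sequence as I understand it from this thread. It is not kernel code: the model_* types and functions are hypothetical stand-ins, and the checks are simplified assumptions about what mdio_bus_phy_may_suspend() and phy_suspend() do, not copies of them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the software (netdev) view of WoL. */
struct model_netdev {
	bool wol_enabled;
};

/* Hypothetical stand-in for a PHY; hw_wol models what the hardware
 * reports, which may have been set by firmware rather than by Linux. */
struct model_phy {
	struct model_netdev *netdev;
	bool hw_wol;
	bool always_call_suspend;	/* models PHY_ALWAYS_CALL_SUSPEND */
	bool suspended;
};

/* Assumed shape of the bus-level check: it only looks at software state. */
bool model_may_suspend(const struct model_phy *phy)
{
	if (phy->netdev && phy->netdev->wol_enabled)
		return false;
	return !phy->suspended;
}

/* Assumed shape of the PHY-level suspend: it also looks at the hardware. */
int model_phy_suspend(struct model_phy *phy)
{
	bool wol;

	if (phy->suspended)
		return 0;

	wol = phy->hw_wol || (phy->netdev && phy->netdev->wol_enabled);
	if (wol && !phy->always_call_suspend)
		return -EBUSY;

	/* Stand-in for the driver's suspend callback: per this thread,
	 * yt8521_suspend() exits early when WoL is enabled. */
	if (!wol)
		phy->suspended = true;
	return 0;
}

/* Models the PM callback: skip the PHY suspend unless the bus may suspend. */
int model_bus_suspend(struct model_phy *phy)
{
	assert(phy != NULL);
	if (!model_may_suspend(phy))
		return 0;
	return model_phy_suspend(phy);
}
```

In this model, with hw_wol set and the netdev flag clear, model_bus_suspend() returns -EBUSY, which corresponds to the -16 in the log. Setting either always_call_suspend or the netdev flag makes it return 0 instead.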

> 
> If netdev->wol_enabled is appropriately set, then it seems to me
> that there's little reason for motorcomm to be checking whether
> WoL is enabled in its suspend function - which means less driver
> specific code and driver specific behaviour.
> 

That should work, too.
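
As a rough illustration of that approach, the sketch below folds the PHY's hardware WoL state into the netdev flag before the may-suspend check runs. Everything named model_* is hypothetical, and where such a sync would actually live in the kernel (attach time, suspend time, or elsewhere) is left open here; this only shows the intended effect on the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not kernel structures. */
struct model_netdev {
	bool wol_enabled;
};

struct model_phy {
	struct model_netdev *netdev;
	bool hw_wol;		/* WoL state as read back from the PHY */
	bool suspended;
};

/* Hypothetical helper: make the netdev flag reflect the PHY hardware. */
void model_sync_wol(struct model_phy *phy)
{
	assert(phy != NULL);
	if (phy->netdev && phy->hw_wol)
		phy->netdev->wol_enabled = true;
}

/* Assumed shape of the bus-level check, looking only at software state. */
bool model_may_suspend(const struct model_phy *phy)
{
	assert(phy != NULL);
	if (phy->netdev && phy->netdev->wol_enabled)
		return false;
	return !phy->suspended;
}
```

Once the flag has been synced, model_may_suspend() returns false for a PHY with WoL enabled in hardware, so the PHY suspend is never attempted and no driver-specific WoL check is needed in this model.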
-- 
Florian
