lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 6 Apr 2021 11:47:09 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Heiner Kallweit <hkallweit1@...il.com>,
        Joakim Zhang <qiangqing.zhang@....com>,
        "christian.melki@...ata.com" <christian.melki@...ata.com>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "linux@...linux.org.uk" <linux@...linux.org.uk>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        dl-linux-imx <linux-imx@....com>
Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
 back



On 4/6/2021 11:43 AM, Heiner Kallweit wrote:
> On 06.04.2021 20:32, Florian Fainelli wrote:
>>
>>
>> On 4/6/2021 4:42 AM, Heiner Kallweit wrote:
>>>
>>> Waiting for ANEG_COMPLETE to be set wouldn't be a good option. Aneg may never
>>> complete for different reasons, e.g. no physical link. And even if we use a
>>> timeout this may add unwanted delays.
>>>
>>>> Do you have any other insights that can help me further locate the issue? Thanks.
>>>>
>>>
>>> I think current MAC/PHY PM handling isn't perfect. Often we have the following
>>> scenario:
>>>
>>> *suspend*
>>> 1. PHY is suspended (mdio_bus_phy_suspend)
>>> 2. MAC suspend callback (typically involving phy_stop())
>>>
>>> *resume*
>>> 1. MAC resume callback (typically involving phy_start())
>>> 2. PHY is resumed (mdio_bus_phy_resume), incl. calling phy_init_hw()
>>>
>>> Calling phy_init_hw() after phy_start() doesn't look right.
>>> It seems to work in most cases, but there's a certain risk
>>> that phy_init_hw() overwrites something, e.g. the advertised
>>> modes.
>>> I think we have two valid scenarios:
>>>
>>> 1. phylib PM callbacks are used, then the MAC driver shouldn't
>>>    touch the PHY in its PM callbacks, especially not call
>>>    phy_stop/phy_start.
>>>
>>> 2. MAC PM callbacks take care also of the PHY. Then I think we would
>>>    need a flag at the phy_device telling it to make the PHY PM
>>>    callbacks a no-op.
>>
>> Maybe part of the problem is that the FEC is calling phy_{stop,start} in
>> its suspend/resume callbacks instead of phy_{suspend,resume} which would
>> play nice and tell the MDIO bus PM callbacks that the PHY has already
>> been suspended.
>>
> This basically is what I just proposed to test.

What you suggested to be tested is to let the MDIO bus PM callbacks deal
with suspending the PHY, which is different from having the MAC do it
explicitly, both would be interesting to try out.

> 
>> I am also suspicious about whether Wake-on-LAN actually works with the
>> FEC, you cannot wake from LAN if the PHY is stopped and powered down.
>>
> phy_stop() calls phy_suspend() which checks for WoL. Therefore this
> should not be a problem.

Indeed, and I had missed that phy_suspend() checks netdev->wol_enabled,
thanks for reminding me.
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ