Date:   Fri, 31 Jan 2020 21:50:48 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     poros@...hat.com
Cc:     netdev@...r.kernel.org, andrew@...n.ch, f.fainelli@...il.com,
        ivecera@...hat.com
Subject: Re: [PATCH net v2] phy: avoid unnecessary link-up delay in polling
 mode

On 31.01.2020 16:09, Petr Oros wrote:
> Heiner Kallweit wrote on Wed, 29 Jan 2020 at 22:01 +0100:
>> On 29.01.2020 13:19, Petr Oros wrote:
>>> commit 93c0970493c71f ("net: phy: consider latched link-down status in
>>> polling mode") removed double-read of latched link-state register for
>>> polling mode from genphy_update_link(). This added extra ~1s delay into
>>> sequence link down->up.
>>> Following scenario:
>>>  - After boot link goes up
>>>  - phy_start() is called triggering an aneg restart, hence link goes
>>>    down and link-down info is latched.
>>>  - After aneg has finished link goes up. phy_state_machine() checks the
>>>    link state but reads the latched "link is down" value. The state
>>>    machine is rescheduled one second later and only then detects "link is
>>>    up". This extra delay can be avoided if we keep the double read of the
>>>    link-state register when the link was previously down.
>>>
>>> With this solution we don't miss a link-down event in polling mode and
>>> link-up is faster.
>>>
>>
>> I have some trouble understanding why it should be faster this way.
>> Let's take an example: aneg takes 3.5s
>> Current behavior:
>>
>> T0: aneg is started, link goes down, link-down status is latched
>>     (phydev->link is still 1)
>> T0+1s: state machine runs, latched link-down is read,
>>        phydev->link goes down, state change PHY_UP to PHY_NOLINK
>> T0+2s: state machine runs, up-to-date link-down is read
>> T0+3s: state machine runs, up-to-date link-down is read
>> T0+4s: state machine runs, aneg is finished, up-to-date link-up is read,
>>        phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
>>
>> Your patch changes the behavior of T0+1s only. So it should make a
>> difference only if aneg takes less than 1s.
>> Can you explain, based on the given example, how your change is
>> supposed to improve this?
>>
> 
> 
> I see this behavior on real hw:
> With patch:
> T0+3s: state machine runs, up-to-date link-down is read
> T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
>        first BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==0,
>        second BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==1,
>        phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
> 
> line: 1917 is first BMSR read
> line: 1921 is second BMSR read
> 
> [   24.124572] xgene-mii-rgmii:03: genphy_restart_aneg()
> [   24.132000] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [   24.139347] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7949
> [   24.146783] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
> [   24.154174] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
> 
> . suppressed 3 identical messages at T0+1,2,3s
> 
> [   28.609822] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [   28.629906] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7969
> ^^^ status 0x7969: detected BMSR_ANEGCOMPLETE but not BMSR_LSTATUS
> [   28.644590] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
> ^^^ status 0x796d: detected both BMSR_ANEGCOMPLETE and BMSR_LSTATUS
> [   28.658681] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1
> 

I see, thanks. Strange behavior of the PHY. Did you also test whether other
PHYs behave the same?
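
For anyone following the log, the status values can be decoded with a small
helper. This is only a user-space sketch: the two bit masks match the BMSR
definitions in the kernel's include/uapi/linux/mii.h, while the helper names
bmsr_aneg_done() and bmsr_link_up() are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* BMSR bit masks, same values as in include/uapi/linux/mii.h */
#define BMSR_LSTATUS      0x0004 /* link status (latched low) */
#define BMSR_ANEGCOMPLETE 0x0020 /* auto-negotiation complete */

/* Hypothetical helper: true if the aneg-complete bit is set */
static bool bmsr_aneg_done(uint16_t bmsr)
{
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}

/* Hypothetical helper: true if the link-status bit is set */
static bool bmsr_link_up(uint16_t bmsr)
{
	return (bmsr & BMSR_LSTATUS) != 0;
}
```

Applied to the three values in the log: 0x7949 has neither bit set, 0x7969
has BMSR_ANEGCOMPLETE only, and 0x796d has both.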

> -----------------------------------------------------------------------------------
> 
> Without patch:
> T0+3s: state machine runs, up-to-date link-down is read
> T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
>        here I read link-down (BMSR_LSTATUS==0),
> T0+5s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
>        up-to-date link-up is read (BMSR_LSTATUS==1),
>        phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
> 
> line: 1917 is first BMSR read (status is zero because without the patch the
> register is read only once)
> line: 1921 is second BMSR read
> 
> [   24.862702] xgene-mii-rgmii:03: 1768: genphy_restart_aneg
> [   24.869070] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [   24.876409] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [   24.885999] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
> [   24.893401] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
> 
> . suppressed 3 identical messages at T0+1,2,3s
> 
> [   29.319613] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [   29.326408] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [   29.333557] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7969
> ^^^ status 0x7969: detected BMSR_ANEGCOMPLETE but not BMSR_LSTATUS
> [   29.340923] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
> 
> [   30.359713] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [   30.366507] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [   30.373650] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
> ^^^ status 0x796d: detected both BMSR_ANEGCOMPLETE and BMSR_LSTATUS
> [   30.381016] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1
> 
> I tried many variants and the behavior is deterministic. Without the patch
> the delay is one second longer because link-up is detected one poll later
> after aneg finishes.
> 
> -Petr
> 
> 
>> And on a side note: I wouldn't consider this change a fix, therefore
>> it would be material for net-next that is closed at the moment.
>>
>> Heiner
>>
>>> Changes in v2:
>>> - Fixed typos in phy_polling_mode() argument
>>>
>>> Fixes: 93c0970493c71f ("net: phy: consider latched link-down status in polling mode")
>>> Signed-off-by: Petr Oros <poros@...hat.com>
>>> ---
>>>  drivers/net/phy/phy-c45.c    | 5 +++--
>>>  drivers/net/phy/phy_device.c | 5 +++--
>>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
>>> index a1caeee1223617..bceb0dcdecbd61 100644
>>> --- a/drivers/net/phy/phy-c45.c
>>> +++ b/drivers/net/phy/phy-c45.c
>>> @@ -239,9 +239,10 @@ int genphy_c45_read_link(struct phy_device *phydev)
>>>  
>>>  		/* The link state is latched low so that momentary link
>>>  		 * drops can be detected. Do not double-read the status
>>> -		 * in polling mode to detect such short link drops.
>>> +		 * in polling mode to detect such short link drops except
>>> +		 * the link was already down.
>>>  		 */
>>> -		if (!phy_polling_mode(phydev)) {
>>> +		if (!phy_polling_mode(phydev) || !phydev->link) {
>>>  			val = phy_read_mmd(phydev, devad, MDIO_STAT1);
>>>  			if (val < 0)
>>>  				return val;
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 6a5056e0ae7757..05417419c484fa 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -1930,9 +1930,10 @@ int genphy_update_link(struct phy_device *phydev)
>>>  
>>>  	/* The link state is latched low so that momentary link
>>>  	 * drops can be detected. Do not double-read the status
>>> -	 * in polling mode to detect such short link drops.
>>> +	 * in polling mode to detect such short link drops except
>>> +	 * the link was already down.
>>>  	 */
>>> -	if (!phy_polling_mode(phydev)) {
>>> +	if (!phy_polling_mode(phydev) || !phydev->link) {
>>>  		status = phy_read(phydev, MII_BMSR);
>>>  		if (status < 0)
>>>  			return status;
>>>
> 
> 
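
To make the timing difference concrete, here is a minimal user-space model of
the behavior seen in the logs above. It is not kernel code: struct
latched_link, set_link(), read_link(), poll_link() and polls_until_link_up()
are hypothetical names, and the model assumes the status bit stays low until
the first read after the link comes back up, which is what the logs suggest
for this PHY. Other PHYs may behave differently.

```c
#include <stdbool.h>

/* Hypothetical model of a latched-low link-status bit */
struct latched_link {
	bool live;   /* current physical link state */
	bool status; /* value the next register read returns */
};

/* Change the physical link state; a drop latches the status bit low */
static void set_link(struct latched_link *r, bool up)
{
	r->live = up;
	if (!up)
		r->status = false;
}

/* Read the status bit; the read releases the latch afterwards */
static bool read_link(struct latched_link *r)
{
	bool val = r->status;

	r->status = r->live;
	return val;
}

/* One poll: mirrors the double-read condition from the patch.
 * Unpatched: double read only when not polling.
 * Patched: also double read when the link was previously down.
 */
static bool poll_link(struct latched_link *r, bool polling,
		      bool prev_link, bool patched)
{
	if (!polling || (patched && !prev_link))
		(void)read_link(r); /* first read discards the latched value */
	return read_link(r);
}

/* Count polls needed to see link-up once aneg has finished */
static int polls_until_link_up(bool patched)
{
	struct latched_link r = { .live = false, .status = false };
	bool link = false;
	int polls = 0;

	set_link(&r, true); /* aneg done, link is physically up */
	while (!link) {
		link = poll_link(&r, true, link, patched);
		polls++;
	}
	return polls;
}
```

Under these assumptions the patched condition reports link-up on the first
poll after aneg completes, while the unpatched one needs a second poll, i.e.
one extra polling interval. A momentary drop while the link is up is still
reported in both cases, because only a single read is done when the previous
link state was up.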
