Message-ID: <7d2c26ac18d0ce7b76024fec86a9b1a084ad3fd3.camel@redhat.com>
Date:   Fri, 31 Jan 2020 16:09:31 +0100
From:   Petr Oros <poros@...hat.com>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     netdev@...r.kernel.org, andrew@...n.ch, f.fainelli@...il.com,
        ivecera@...hat.com
Subject: Re: [PATCH net v2] phy: avoid unnecessary link-up delay in polling
 mode

On Wed, 29 Jan 2020 at 22:01 +0100, Heiner Kallweit wrote:
> On 29.01.2020 13:19, Petr Oros wrote:
> > commit 93c0970493c71f ("net: phy: consider latched link-down status in
> > polling mode") removed double-read of latched link-state register for
> > polling mode from genphy_update_link(). This added extra ~1s delay into
> > sequence link down->up.
> > Following scenario:
> >  - After boot link goes up
> >  - phy_start() is called triggering an aneg restart, hence link goes
> >    down and link-down info is latched.
> >  - After aneg has finished link goes up. In phy_state_machine is checked
> >    link state but it is latched "link is down". The state machine is
> >    scheduled after one second and there is detected "link is up". This
> >    extra delay can be avoided when we keep link-state register double read
> >    in case when link was down previously.
> > 
> > With this solution we don't miss a link-down event in polling mode and
> > link-up is faster.
> > 
> 
> I have a little problem to understand why it should be faster this way.
> Let's take an example: aneg takes 3.5s
> Current behavior:
> 
> T0: aneg is started, link goes down, link-down status is latched
>     (phydev->link is still 1)
> T0+1s: state machine runs, latched link-down is read,
>        phydev->link goes down, state change PHY_UP to PHY_NOLINK
> T0+2s: state machine runs, up-to-date link-down is read
> T0+3s: state machine runs, up-to-date link-down is read
> T0+4s: state machine runs, aneg is finished, up-to-date link-up is read,
>        phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
> 
> Your patch changes the behavior of T0+1s only. So it should make a
> difference only if aneg takes less than 1s.
> Can you explain, based on the given example, how your change is
> supposed to improve this?
> 


I see this behavior on real hw:
With patch:
T0+3s: state machine runs, up-to-date link-down is read
T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
       first BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==0,
       second BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==1,
       phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING

In the log below, "line: 1917" is the first BMSR read and
"line: 1921" is the second BMSR read.

[   24.124572] xgene-mii-rgmii:03: genphy_restart_aneg()
[   24.132000] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
[   24.139347] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7949
[   24.146783] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
[   24.154174] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0

... suppressed 3 identical messages at T0+1s, T0+2s and T0+3s

[   28.609822] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
[   28.629906] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7969
^^^^^^^^^^^^^^^ BMSR_ANEGCOMPLETE is detected, but not BMSR_LSTATUS
[   28.644590] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
^^^^^^^^^^^^^^^ here both BMSR_ANEGCOMPLETE and BMSR_LSTATUS are detected
[   28.658681] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1
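
For reference, the status words in the log can be decoded with the two
BMSR bits named in the annotations. This is a minimal standalone sketch,
not kernel code: the mask values are the standard MII ones, while the
helper names bmsr_link_up() and bmsr_aneg_done() are made up here for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard MII basic mode status register bits */
#define BMSR_LSTATUS      0x0004  /* link status (latched low) */
#define BMSR_ANEGCOMPLETE 0x0020  /* auto-negotiation complete */

/* Hypothetical helper: true if the link-status bit is set */
static bool bmsr_link_up(uint16_t bmsr)
{
	return (bmsr & BMSR_LSTATUS) != 0;
}

/* Hypothetical helper: true if auto-negotiation has completed */
static bool bmsr_aneg_done(uint16_t bmsr)
{
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}
```

Applied to the values above: 0x7949 has neither bit set, 0x7969 has only
BMSR_ANEGCOMPLETE set, and 0x796d has both bits set.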

--------------------------------------------------------------------------------

Without patch:
T0+3s: state machine runs, up-to-date link-down is read
T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
       here I read link-down (BMSR_LSTATUS==0),
T0+5s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
       up-to-date link-up is read (BMSR_LSTATUS==1),
       phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING

In the log below, "line: 1917" is the first BMSR read (status is zero
because, without the patch, the register is read only once) and
"line: 1921" is the second BMSR read.

[   24.862702] xgene-mii-rgmii:03: 1768: genphy_restart_aneg
[   24.869070] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
[   24.876409] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
[   24.885999] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
[   24.893401] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0

... suppressed 3 identical messages at T0+1s, T0+2s and T0+3s

[   29.319613] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
[   29.326408] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
[   29.333557] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7969
^^^^^^^^^^^^^^^ BMSR_ANEGCOMPLETE is detected, but not BMSR_LSTATUS
[   29.340923] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0

[   30.359713] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
[   30.366507] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
[   30.373650] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
^^^^^^^^^^^^^^^ here both BMSR_ANEGCOMPLETE and BMSR_LSTATUS are detected
[   30.381016] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1

I tried many variants and the behavior is deterministic. Without the patch,
the delay is one second longer because link-up is detected later after aneg
finishes.
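
The one-second difference can be reproduced with a small user-space model.
This is only a sketch under stated assumptions, not kernel code: struct
phy_model and every function below are hypothetical, the latch is modelled
so that the first read after a link drop still returns "down" even once
the link is back (which is what the logs above show), aneg is assumed to
take 3.5s as in Heiner's example, and the state machine is assumed to poll
once per second.

```c
#include <stdbool.h>

/* Hypothetical model of a latched-low link-status bit */
struct phy_model {
	bool link_now;     /* instantaneous link state */
	bool latched_down; /* a link drop has been latched and not yet read */
};

static void phy_set_link(struct phy_model *p, bool up)
{
	if (!up)
		p->latched_down = true; /* a drop is latched until read */
	p->link_now = up;
}

/* A read returns the latched value, then re-arms with the current state */
static bool phy_read_link(struct phy_model *p)
{
	bool val = p->link_now && !p->latched_down;

	p->latched_down = !p->link_now;
	return val;
}

/* Polling mode only: the patched variant double-reads when link was down */
static bool model_update_link(struct phy_model *p, bool prev_link,
			      bool patched)
{
	if (patched && !prev_link)
		(void)phy_read_link(p); /* flush the stale latched value */
	return phy_read_link(p);
}

/* Returns the poll time in ms at which link-up is seen, or -1 */
static int link_up_detected_at_ms(bool patched)
{
	struct phy_model phy = { .link_now = true, .latched_down = false };
	bool link = true; /* software state is still "up" at T0 */
	int t;

	phy_set_link(&phy, false); /* T0: aneg restart, link goes down */
	for (t = 1000; t <= 10000; t += 1000) {
		if (t > 3500 && !phy.link_now)
			phy_set_link(&phy, true); /* aneg done at T0+3.5s */
		link = model_update_link(&phy, link, patched);
		if (link)
			return t;
	}
	return -1;
}
```

In this model link_up_detected_at_ms(true) reports the poll at T0+4s and
link_up_detected_at_ms(false) reports the poll at T0+5s, matching the
timelines above.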

-Petr


> And on a side note: I wouldn't consider this change a fix, therefore
> it would be material for net-next that is closed at the moment.
> 
> Heiner
> 
> > Changes in v2:
> > - Fixed typos in phy_polling_mode() argument
> > 
> > Fixes: 93c0970493c71f ("net: phy: consider latched link-down status in polling mode")
> > Signed-off-by: Petr Oros <poros@...hat.com>
> > ---
> >  drivers/net/phy/phy-c45.c    | 5 +++--
> >  drivers/net/phy/phy_device.c | 5 +++--
> >  2 files changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
> > index a1caeee1223617..bceb0dcdecbd61 100644
> > --- a/drivers/net/phy/phy-c45.c
> > +++ b/drivers/net/phy/phy-c45.c
> > @@ -239,9 +239,10 @@ int genphy_c45_read_link(struct phy_device *phydev)
> >  
> >  		/* The link state is latched low so that momentary link
> >  		 * drops can be detected. Do not double-read the status
> > -		 * in polling mode to detect such short link drops.
> > +		 * in polling mode to detect such short link drops except
> > +		 * the link was already down.
> >  		 */
> > -		if (!phy_polling_mode(phydev)) {
> > +		if (!phy_polling_mode(phydev) || !phydev->link) {
> >  			val = phy_read_mmd(phydev, devad, MDIO_STAT1);
> >  			if (val < 0)
> >  				return val;
> > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > index 6a5056e0ae7757..05417419c484fa 100644
> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -1930,9 +1930,10 @@ int genphy_update_link(struct phy_device *phydev)
> >  
> >  	/* The link state is latched low so that momentary link
> >  	 * drops can be detected. Do not double-read the status
> > -	 * in polling mode to detect such short link drops.
> > +	 * in polling mode to detect such short link drops except
> > +	 * the link was already down.
> >  	 */
> > -	if (!phy_polling_mode(phydev)) {
> > +	if (!phy_polling_mode(phydev) || !phydev->link) {
> >  		status = phy_read(phydev, MII_BMSR);
> >  		if (status < 0)
> >  			return status;
> > 

