Message-ID: <414b2dc1-2421-e4c8-ea81-1177545fb327@gmail.com>
Date: Fri, 31 Jan 2020 21:50:48 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: poros@...hat.com
Cc: netdev@...r.kernel.org, andrew@...n.ch, f.fainelli@...il.com,
ivecera@...hat.com
Subject: Re: [PATCH net v2] phy: avoid unnecessary link-up delay in polling mode
On 31.01.2020 16:09, Petr Oros wrote:
> Heiner Kallweit wrote on Wed, 29. 01. 2020 at 22:01 +0100:
>> On 29.01.2020 13:19, Petr Oros wrote:
>>> commit 93c0970493c71f ("net: phy: consider latched link-down status in
>>> polling mode") removed double-read of latched link-state register for
>>> polling mode from genphy_update_link(). This added an extra ~1s delay to
>>> the link down->up sequence.
>>> Following scenario:
>>> - After boot link goes up
>>> - phy_start() is called triggering an aneg restart, hence link goes
>>> down and link-down info is latched.
>>> - After aneg has finished, link goes up. phy_state_machine() checks the
>>> link state, but reads the latched "link is down" value. The state machine
>>> is rescheduled one second later and only then detects "link is up". This
>>> extra delay can be avoided by keeping the double read of the link-state
>>> register when the link was previously down.
>>>
>>> With this solution we don't miss a link-down event in polling mode and
>>> link-up is faster.
>>>
>>
>> I have a little trouble understanding why it should be faster this way.
>> Let's take an example: aneg takes 3.5s
>> Current behavior:
>>
>> T0: aneg is started, link goes down, link-down status is latched
>> (phydev->link is still 1)
>> T0+1s: state machine runs, latched link-down is read,
>> phydev->link goes down, state change PHY_UP to PHY_NOLINK
>> T0+2s: state machine runs, up-to-date link-down is read
>> T0+3s: state machine runs, up-to-date link-down is read
>> T0+4s: state machine runs, aneg is finished, up-to-date link-up is read,
>> phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
>>
>> Your patch changes the behavior of T0+1s only. So it should make a
>> difference only if aneg takes less than 1s.
>> Can you explain, based on the given example, how your change is
>> supposed to improve this?
>>
>
>
> I see this behavior on real hw:
> With patch:
> T0+3s: state machine runs, up-to-date link-down is read
> T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
> first BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==0,
> second BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==1,
> phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
>
> line: 1917 is first BMSR read
> line: 1921 is second BMSR read
>
> [ 24.124572] xgene-mii-rgmii:03: genphy_restart_aneg()
> [ 24.132000] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [ 24.139347] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7949
> [ 24.146783] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
> [ 24.154174] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
>
> . suppressed 3 identical messages at T0+1,2,3s
>
> [ 28.609822] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [ 28.629906] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x7969
> ^^^^^^^^^^^^^^^ detected BMSR_ANEGCOMPLETE but not BMSR_LSTATUS
> [ 28.644590] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
> ^^^^^^^^^^^^^^^ here is detected BMSR_ANEGCOMPLETE and BMSR_LSTATUS
> [ 28.658681] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1
>
I see, thanks. Strange behavior of the PHY. Did you also test whether other
PHYs behave the same way?
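
For readers following the log, the three status words can be checked against the two BMSR bits under discussion. The following is only a small standalone sketch, not kernel code: the mask values are the ones defined in include/uapi/linux/mii.h, while the helper names bmsr_link_up() and bmsr_aneg_done() are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values as defined in include/uapi/linux/mii.h. */
#define BMSR_LSTATUS      0x0004  /* link status (latched low) */
#define BMSR_ANEGCOMPLETE 0x0020  /* auto-negotiation complete */

/* Hypothetical helper: true if the status word reports link up. */
static bool bmsr_link_up(uint16_t status)
{
	return (status & BMSR_LSTATUS) != 0;
}

/* Hypothetical helper: true if the status word reports aneg complete. */
static bool bmsr_aneg_done(uint16_t status)
{
	return (status & BMSR_ANEGCOMPLETE) != 0;
}
```

Applied to the values in the log: 0x7949 has neither bit set, 0x7969 has BMSR_ANEGCOMPLETE only, and 0x796d has both, which matches the annotations under the log lines.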
> --------------------------------------------------------------------------------
>
> Without patch:
> T0+3s: state machine runs, up-to-date link-down is read
> T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
> here I read link-down (BMSR_LSTATUS==0),
> T0+5s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1),
> up-to-date link-up is read (BMSR_LSTATUS==1),
> phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING
>
> line: 1917 is first BMSR read (status is zero because without the patch the
> register is read only once)
> line: 1921 is second BMSR read
>
> [ 24.862702] xgene-mii-rgmii:03: 1768: genphy_restart_aneg
> [ 24.869070] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [ 24.876409] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [ 24.885999] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7949
> [ 24.893401] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
>
> . suppressed 3 identical messages at T0+1,2,3s
>
> [ 29.319613] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [ 29.326408] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [ 29.333557] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x7969
> ^^^^^^^^^^^^^^^ detected BMSR_ANEGCOMPLETE but not BMSR_LSTATUS
> [ 29.340923] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 0
>
> [ 30.359713] xgene-mii-rgmii:03: genphy_update_link(), line: 1895, link: 0
> [ 30.366507] xgene-mii-rgmii:03: genphy_update_link(), line: 1917, status: 0x0
> [ 30.373650] xgene-mii-rgmii:03: genphy_update_link(), line: 1921, status: 0x796d
> ^^^^^^^^^^^^^^^ here is detected BMSR_ANEGCOMPLETE and BMSR_LSTATUS
> [ 30.381016] xgene-mii-rgmii:03: genphy_update_link(), line: 1927, link: 1
>
> I tried many variants and the behavior is deterministic. Without the patch the
> delay is one second longer because link-up is detected later after aneg finishes.
>
> -Petr
>
>
>> And on a side note: I wouldn't consider this change a fix, therefore
>> it would be material for net-next, which is closed at the moment.
>>
>> Heiner
>>
>>> Changes in v2:
>>> - Fixed typos in phy_polling_mode() argument
>>>
>>> Fixes: 93c0970493c71f ("net: phy: consider latched link-down status in polling mode")
>>> Signed-off-by: Petr Oros <poros@...hat.com>
>>> ---
>>> drivers/net/phy/phy-c45.c | 5 +++--
>>> drivers/net/phy/phy_device.c | 5 +++--
>>> 2 files changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
>>> index a1caeee1223617..bceb0dcdecbd61 100644
>>> --- a/drivers/net/phy/phy-c45.c
>>> +++ b/drivers/net/phy/phy-c45.c
>>> @@ -239,9 +239,10 @@ int genphy_c45_read_link(struct phy_device *phydev)
>>>
>>> /* The link state is latched low so that momentary link
>>> * drops can be detected. Do not double-read the status
>>> - * in polling mode to detect such short link drops.
>>> + * in polling mode to detect such short link drops except
>>> + * the link was already down.
>>> */
>>> - if (!phy_polling_mode(phydev)) {
>>> + if (!phy_polling_mode(phydev) || !phydev->link) {
>>> val = phy_read_mmd(phydev, devad, MDIO_STAT1);
>>> if (val < 0)
>>> return val;
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 6a5056e0ae7757..05417419c484fa 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -1930,9 +1930,10 @@ int genphy_update_link(struct phy_device *phydev)
>>>
>>> /* The link state is latched low so that momentary link
>>> * drops can be detected. Do not double-read the status
>>> - * in polling mode to detect such short link drops.
>>> + * in polling mode to detect such short link drops except
>>> + * the link was already down.
>>> */
>>> - if (!phy_polling_mode(phydev)) {
>>> + if (!phy_polling_mode(phydev) || !phydev->link) {
>>> status = phy_read(phydev, MII_BMSR);
>>> if (status < 0)
>>> return status;
>>>
>
>
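
To make the timing argument easier to follow, here is a rough simulation sketch of the polling sequence. It is not kernel code and makes a simplifying assumption about the hardware: the link bit is latched low, a read returns the latched value and only then re-arms the latch to the current link state. That assumption is chosen to be consistent with the log above and may not hold for other PHYs. All names (struct phy_model, poll_link(), first_link_up_poll()) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a latched-low link bit (assumption, see lead-in). */
struct phy_model {
	bool link_now;  /* actual link state */
	bool latched;   /* value the next status read will return */
};

/* A link drop clears the latch at once; link-up alone does not set it. */
static void model_set_link(struct phy_model *p, bool up)
{
	p->link_now = up;
	if (!up)
		p->latched = false;
}

/* A read returns the latched value, then re-arms the latch. */
static bool model_read_link(struct phy_model *p)
{
	bool val = p->latched;

	p->latched = p->link_now;
	return val;
}

/* One poll; double_read_when_down mirrors the "|| !phydev->link" condition. */
static bool poll_link(struct phy_model *p, bool prev_link,
		      bool double_read_when_down)
{
	if (double_read_when_down && !prev_link)
		(void)model_read_link(p);  /* discard possibly stale value */
	return model_read_link(p);
}

/*
 * Poll once per second after an aneg restart at T0. Returns the number of
 * the first poll that reports link up, or -1 if none of 10 polls does.
 * up_ms is the time after T0 at which aneg finishes and link comes up.
 */
static int first_link_up_poll(int up_ms, bool double_read_when_down)
{
	struct phy_model p = { .link_now = true, .latched = true };
	bool link = true;  /* software view, like phydev->link */
	int sec;

	model_set_link(&p, false);  /* T0: aneg restart, link drops */
	for (sec = 1; sec <= 10; sec++) {
		if (!p.link_now && sec * 1000 >= up_ms)
			model_set_link(&p, true);
		link = poll_link(&p, link, double_read_when_down);
		if (link)
			return sec;
	}
	return -1;
}
```

With aneg taking 3.5s as in the example above, first_link_up_poll(3500, false) returns 5 and first_link_up_poll(3500, true) returns 4 under this model, i.e. the same one-second difference as in the two logs.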