lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 31 Jul 2019 13:58:41 +0800
From:   liuyonglong <liuyonglong@...wei.com>
To:     Heiner Kallweit <hkallweit1@...il.com>, <andrew@...n.ch>,
        <davem@...emloft.net>
CC:     <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linuxarm@...wei.com>, <salil.mehta@...wei.com>,
        <yisen.zhuang@...wei.com>, <shiju.jose@...wei.com>
Subject: Re: [RFC] net: phy: read link status twice when
 phy_check_link_status()



On 2019/7/31 13:44, Heiner Kallweit wrote:
> On 31.07.2019 05:33, liuyonglong wrote:
>>
>>
>> On 2019/7/31 3:04, Heiner Kallweit wrote:
>>> On 30.07.2019 08:35, liuyonglong wrote:
>>>> :/sys/kernel/debug/tracing$ cat trace
>>>> # tracer: nop
>>>> #
>>>> # entries-in-buffer/entries-written: 45/45   #P:128
>>>> #
>>>> #                              _-----=> irqs-off
>>>> #                             / _----=> need-resched
>>>> #                            | / _---=> hardirq/softirq
>>>> #                            || / _--=> preempt-depth
>>>> #                            ||| /     delay
>>>> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>>>> #              | |       |   ||||       |         |
>>>>     kworker/64:2-1028  [064] ....   172.295687: mdio_access: mii-0000:bd:00.0 read  phy:0x01 reg:0x02 val:0x001c
>>>>     kworker/64:2-1028  [064] ....   172.295726: mdio_access: mii-0000:bd:00.0 read  phy:0x01 reg:0x03 val:0xc916
>>>>     kworker/64:2-1028  [064] ....   172.296902: mdio_access: mii-0000:bd:00.0 read  phy:0x01 reg:0x01 val:0x79ad
>>>>     kworker/64:2-1028  [064] ....   172.296938: mdio_access: mii-0000:bd:00.0 read  phy:0x01 reg:0x0f val:0x2000
>>>>     kworker/64:2-1028  [064] ....   172.321213: mdio_access: mii-0000:bd:00.0 read  phy:0x01 reg:0x00 val:0x1040
>>>>     kworker/64:2-1028  [064] ....   172.343209: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x02 val:0x001c
>>>>     kworker/64:2-1028  [064] ....   172.343245: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x03 val:0xc916
>>>>     kworker/64:2-1028  [064] ....   172.343882: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x79ad
>>>>     kworker/64:2-1028  [064] ....   172.343918: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x0f val:0x2000
>>>>     kworker/64:2-1028  [064] ....   172.362658: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>>>>     kworker/64:2-1028  [064] ....   172.385961: mdio_access: mii-0000:bd:00.2 read  phy:0x05 reg:0x02 val:0x001c
>>>>     kworker/64:2-1028  [064] ....   172.385996: mdio_access: mii-0000:bd:00.2 read  phy:0x05 reg:0x03 val:0xc916
>>>>     kworker/64:2-1028  [064] ....   172.386646: mdio_access: mii-0000:bd:00.2 read  phy:0x05 reg:0x01 val:0x79ad
>>>>     kworker/64:2-1028  [064] ....   172.386681: mdio_access: mii-0000:bd:00.2 read  phy:0x05 reg:0x0f val:0x2000
>>>>     kworker/64:2-1028  [064] ....   172.411286: mdio_access: mii-0000:bd:00.2 read  phy:0x05 reg:0x00 val:0x1040
>>>>     kworker/64:2-1028  [064] ....   172.433225: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x02 val:0x001c
>>>>     kworker/64:2-1028  [064] ....   172.433260: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x03 val:0xc916
>>>>     kworker/64:2-1028  [064] ....   172.433887: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
>>>>     kworker/64:2-1028  [064] ....   172.433922: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x0f val:0x2000
>>>>     kworker/64:2-1028  [064] ....   172.452862: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
>>>>         ifconfig-1324  [011] ....   177.325585: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>>>>   kworker/u257:0-8     [012] ....   177.325642: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x04 val:0x01e1
>>>>   kworker/u257:0-8     [012] ....   177.325654: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x04 val:0x05e1
>>>>   kworker/u257:0-8     [012] ....   177.325708: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x79ad
>>>>   kworker/u257:0-8     [012] ....   177.325744: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x09 val:0x0200
>>>>   kworker/u257:0-8     [012] ....   177.325779: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x00 val:0x1040
>>>>   kworker/u257:0-8     [012] ....   177.325788: mdio_access: mii-0000:bd:00.1 write phy:0x03 reg:0x00 val:0x1240
>>>>   kworker/u257:0-8     [012] ....   177.325843: mdio_access: mii-0000:bd:00.1 read  phy:0x03 reg:0x01 val:0x798d
>>>
>>> What I think that happens here:
>>> Writing 0x1240 to BMCR starts aneg. When reading BMSR immediately after that then the PHY seems to have cleared
>>> the "aneg complete" bit already, but not yet the "link up" bit. This results in the false "link up" notification.
>>> The following patch is based on the fact that in case of enabled aneg we can't have a valid link if aneg isn't
>>> finished. Could you please test whether this works for you?
>>>
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 6b5cb87f3..7ddd91df9 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -1774,6 +1774,12 @@ int genphy_update_link(struct phy_device *phydev)
>>>  	phydev->link = status & BMSR_LSTATUS ? 1 : 0;
>>>  	phydev->autoneg_complete = status & BMSR_ANEGCOMPLETE ? 1 : 0;
>>>  
>>> +	/* Consider the case that autoneg was started and "aneg complete"
>>> +	 * bit has been reset, but "link up" bit not yet.
>>> +	 */
>>> +	if (phydev->autoneg == AUTONEG_ENABLE && !phydev->autoneg_complete)
>>> +		phydev->link = 0;
>>> +
>>>  	return 0;
>>>  }
>>>  EXPORT_SYMBOL(genphy_update_link);
>>>
>>
>> This patch can solve the issue! Will it be upstream?
>>
> I'll check for side effects, but in general: yes.
> 
>> So it's nothing to do with the bios, and just the PHY's own behavior,
>> the "link up" bit can not reset immediately,right?
>>
> Yes, it's the PHY's own behavior, and to a certain extent it may depend on speed
> of the MDIO bus. At least few network chips require a delay of several microseconds
> after each MDIO bus access. This may be sufficient for the PHY to reset the
> link-up bit in time.
> 
>> ps: It will take 1 second more to make the link up for RTL8211F when 0x798d happend.
>>
> In polling mode link-up is detected up to 1s after it happened.
> You could switch to interrupt mode to reduce the aneg time.
> 
>>
>>
> Heiner
> 
> .
> 

Thanks very much!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ