Message-ID: <3f996ac2-7920-008e-3b83-b8b82cc89b31@huawei.com>
Date:   Wed, 13 May 2020 11:04:20 +0800
From:   Yonglong Liu <liuyonglong@...wei.com>
To:     Andrew Lunn <andrew@...n.ch>
CC:     Heiner Kallweit <hkallweit1@...il.com>,
        "David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        <linuxarm@...wei.com>, Salil Mehta <salil.mehta@...wei.com>
Subject: Re: [question] net: phy: rtl8211f: link speed shows 1000Mb/s but
 actual link speed in phy is 100Mb/s

On 2020/5/13 9:59, Andrew Lunn wrote:
> On Wed, May 13, 2020 at 09:34:13AM +0800, Yonglong Liu wrote:
>> Hi, Andrew:
>> 	Thanks for your reply!
>>
>> On 2020/5/12 22:00, Andrew Lunn wrote:
>>> On Tue, May 12, 2020 at 08:48:21PM +0800, Yonglong Liu wrote:
>>>> I use two devices that both support 1000M speed, directly connected
>>>> with a network cable. Both devices enable autoneg, and I then run the
>>>> following test repeatedly:
>>>> 	ifconfig eth5 down
>>>> 	ifconfig eth5 up
>>>> 	sleep $((RANDOM%6))
>>>> 	ifconfig eth5 down
>>>> 	ifconfig eth5 up
>>>> 	sleep 10
>>>>
>>>> With low probability, device A links up at 100Mb/s while device B links up at
>>>> 1000Mb/s (the actual link speed read from the PHY is 100Mb/s), and the network
>>>> does not work.
>>>>
>>>> device A:
>>>> Settings for eth5:
>>>>         Supported ports: [ TP ]
>>>>         Supported link modes:   10baseT/Half 10baseT/Full
>>>>                                 100baseT/Half 100baseT/Full
>>>>                                 1000baseT/Full
>>>>         Supported pause frame use: Symmetric Receive-only
>>>>         Supports auto-negotiation: Yes
>>>>         Supported FEC modes: Not reported
>>>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>>>                                 100baseT/Half 100baseT/Full
>>>>                                 1000baseT/Full
>>>>         Advertised pause frame use: Symmetric
>>>>         Advertised auto-negotiation: Yes
>>>>         Advertised FEC modes: Not reported
>>>>         Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>>>                                              100baseT/Half 100baseT/Full
>>>>         Link partner advertised pause frame use: Symmetric
>>>>         Link partner advertised auto-negotiation: Yes
>>>>         Link partner advertised FEC modes: Not reported
>>>>         Speed: 100Mb/s
>>>>         Duplex: Full
>>>>         Port: MII
>>>>         PHYAD: 3
>>>>         Transceiver: internal
>>>>         Auto-negotiation: on
>>>>         Current message level: 0x00000036 (54)
>>>>                                probe link ifdown ifup
>>>>         Link detected: yes
>>>>
>>>> The register values read over MDIO are:
>>>> reg 9 = 0x200
>>>> reg a = 0
>>>>
>>>> device B:
>>>> Settings for eth5:
>>>>         Supported ports: [ TP ]
>>>>         Supported link modes:   10baseT/Half 10baseT/Full
>>>>                                 100baseT/Half 100baseT/Full
>>>>                                 1000baseT/Full
>>>>         Supported pause frame use: Symmetric Receive-only
>>>>         Supports auto-negotiation: Yes
>>>>         Supported FEC modes: Not reported
>>>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>>>                                 100baseT/Half 100baseT/Full
>>>>                                 1000baseT/Full
>>>>         Advertised pause frame use: Symmetric
>>>>         Advertised auto-negotiation: Yes
>>>>         Advertised FEC modes: Not reported
>>>>         Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>>>                                              100baseT/Half 100baseT/Full
>>>>                                              1000baseT/Full
>>>>         Link partner advertised pause frame use: Symmetric
>>>>         Link partner advertised auto-negotiation: Yes
>>>>         Link partner advertised FEC modes: Not reported
>>>>         Speed: 1000Mb/s
>>>>         Duplex: Full
>>>>         Port: MII
>>>>         PHYAD: 3
>>>>         Transceiver: internal
>>>>         Auto-negotiation: on
>>>>         Current message level: 0x00000036 (54)
>>>>                                probe link ifdown ifup
>>>>         Link detected: yes
>>>>
>>>> The register values read over MDIO are:
>>>> reg 9 = 0
>>>> reg a = 0x800
>>>>
>>>> I talked to the rtl8211f FAE, who said that if negotiation at 1000Mb/s fails,
>>>> the rtl8211f changes reg 9 to 0 and then tries to negotiate at 100Mb/s.
>>>>
>>>> The problem happened as:
>>>> ifconfig eth5 up -> phy_start -> phy_start_aneg -> phy_modify_changed(MII_CTRL1000)
>>>> (this time both A and B, reg 9 = 0x200) -> wait for link up -> (B: reg 9 changed to 0)
>>>> -> link up.
>>>
>>> This sounds like downshift, but not correctly working. 1Gbps requires
>>> that 4 pairs in the cable work. If a 1Gbps link is negotiated, but
>>> then does not establish because one of the pairs is broken, some PHYs
>>> will try to 'downshift'. They drop down to 100Mbps, which only
>>> requires two pairs of the cable to work. To do this, the PHY should
>>> change what it is advertising, to no longer advertise 1G, just 100M
>>> and 10M. The link partner should then try to use 100Mbps and
>>> hopefully, a link is established.
>>>
>>> Looking at the ethtool output, you can see device A is reporting device B is
>>> only advertising up to 100Mbps. Yet it is locally using 1G. That is
>>> broken. So I would say device A has the problem. Are both PHYs
>>> rtl8211f?
>>
>> Both PHYs are rtl8211f. I think Device B is broken. Device B advertises
>> that it supports 1G, but the PHY has actually downshifted to 100M, so
>> Device B links up at 1G on the driver side but at 100M in the PHY.
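
A side note on the register dumps quoted above, as a minimal sketch only.
It assumes the standard 802.3 clause 40 bit layout: reg 9 bit 9 (0x0200)
means "advertise 1000BASE-T full duplex", and reg 0xa bit 11 (0x0800) means
"link partner advertises 1000BASE-T full duplex". Linux names these values
ADVERTISE_1000FULL and LPA_1000FULL in linux/mii.h. The helper name
gig_full_common() is hypothetical and is not a phylib function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed from IEEE 802.3 clause 40 (same as linux/mii.h). */
#define ADVERTISE_1000FULL 0x0200 /* reg 9 (1000BASE-T control), bit 9  */
#define LPA_1000FULL       0x0800 /* reg 0xa (1000BASE-T status), bit 11 */

/*
 * Hypothetical helper: true only if the local advertisement and the link
 * partner's advertisement both contain 1000BASE-T full duplex.
 */
bool gig_full_common(uint16_t ctrl1000, uint16_t stat1000)
{
	return (ctrl1000 & ADVERTISE_1000FULL) && (stat1000 & LPA_1000FULL);
}
```

With the values quoted for device A (reg 9 = 0x200, reg a = 0) and for
device B as read from the hardware (reg 9 = 0, reg a = 0x800), the helper
returns false on both sides. It returns true only if device B is evaluated
with the advertisement that software still holds (0x200), which would be
consistent with device B reporting 1000Mb/s.
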
> 
> You have to be careful with the output of ethtool. Downshift is not
> part of 802.3. There is no standard register to indicate it has
> happened. Sometimes there is a vendor register. You should check the
> datasheet, and look at what other PHY drivers do for this, and
> phy_check_downshift().
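
To make the idea concrete, here is a rough user-space model of such a check.
It is a sketch under assumptions, not the kernel's phy_check_downshift()
implementation. It assumes the standard bit layout of regs 4, 5, 9 and 0xa,
and it takes the speed the PHY is really running at as a plain input,
standing in for whatever a vendor-specific status register reports. The
function names and the actual_speed parameter are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard advertisement bits in reg 4 / reg 5 (assumed from 802.3). */
#define ADV_10HALF   0x0020
#define ADV_10FULL   0x0040
#define ADV_100HALF  0x0080
#define ADV_100FULL  0x0100

/* Highest speed in Mb/s that both ends advertise, or 0 if there is none. */
int highest_common_speed(uint16_t adv, uint16_t lpa,
			 uint16_t ctrl1000, uint16_t stat1000)
{
	/*
	 * The partner's 1000BASE-T bits sit at 11:10 in reg 0xa; shift them
	 * down to line up with the local bits 9:8 in reg 9.
	 */
	uint16_t common1000 = ctrl1000 & (stat1000 >> 2) & 0x0300;
	uint16_t common = adv & lpa;

	if (common1000)
		return 1000;
	if (common & (ADV_100FULL | ADV_100HALF))
		return 100;
	if (common & (ADV_10FULL | ADV_10HALF))
		return 10;
	return 0;
}

/* Downshift is suspected when the PHY runs slower than negotiation implies. */
bool downshift_suspected(int actual_speed, int expected_speed)
{
	return actual_speed > 0 && actual_speed < expected_speed;
}
```

For example, if both ends advertise all 10/100 modes (0x01e0 in regs 4 and
5, a value assumed here from the ethtool output, not read from the
hardware) and software still believes reg 9 = 0x200 and reg a = 0x800, the
expected speed is 1000, so an actual speed of 100 would be flagged.
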
> 
>>> Are you 100% sure your cable and board layout is good? Is it
>>> trying downshift because something is broken? Fix the
>>> cable/connector and the reason to downshift goes away. But it does
>>> not solve the problem if a customer has a broken cable. So you might
>>> want to deliberately cut a pair in the cable so it becomes 100%
>>> reproducible and try to debug it further. See if you can find out
>>> why auto-neg is not working correctly.
>>
>> Will check the layout with the hardware engineer. This happens with a low
>> probability. When it happens, another down/up operation or restarting
>> autoneg solves it.
>>
>> So, your opinion is that maybe we should check whether the hardware layout
>> or the cable has a problem?
> 
> Well, there are a couple of issues here.
> 
> It could be a hardware problem. Best case, it is the cable. But if you
> can reproduce it with other boards, it is a board design issue, which
> you might want to get fixed. If it happens for you in the lab, it will
> probably happen out in the field.
> 
> You should also consider what you want to happen with a cable that
> really is broken. It would be nice if downshift worked. Slower
> networking is better than no networking. Unless you have a requirement
> that 100Mbps is too slow for your use case. So you might want to debug
> what is going wrong when downshift happens.
> 
>> By the way, is there some mechanism to handle this downshift on the software
>> side? If the PHY has downshifted its advertisement to 100M but software still
>> advertises 1G (just like Device B), the network will always be broken.
> 
> You might get some ideas from phy_check_downshift(). A lot will
> depend on what information you can get from the PHY.
> 
> 	 Andrew
> 

Hi, Andrew:
	Thanks very much! That's so helpful!

