Message-ID: <ef25a0a2-e13f-def1-5e91-ceae1bfaf333@huawei.com>
Date:   Wed, 13 May 2020 09:34:13 +0800
From:   Yonglong Liu <liuyonglong@...wei.com>
To:     Andrew Lunn <andrew@...n.ch>
CC:     Heiner Kallweit <hkallweit1@...il.com>,
        "David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        <linuxarm@...wei.com>, Salil Mehta <salil.mehta@...wei.com>
Subject: Re: [question] net: phy: rtl8211f: link speed shows 1000Mb/s but
 actual link speed in phy is 100Mb/s

Hi, Andrew:
	Thanks for your reply!

On 2020/5/12 22:00, Andrew Lunn wrote:
> On Tue, May 12, 2020 at 08:48:21PM +0800, Yonglong Liu wrote:
>> I use two devices that both support 1000M; they are directly connected
>> with a network cable. Both devices enable autoneg, and I then run the following
>> test repeatedly:
>> 	ifconfig eth5 down
>> 	ifconfig eth5 up
>> 	sleep $((RANDOM%6))
>> 	ifconfig eth5 down
>> 	ifconfig eth5 up
>> 	sleep 10
>>
>> With low probability, device A links up at 100Mb/s while device B links up at
>> 1000Mb/s (the actual link speed read from the phy is 100Mb/s), and the network
>> does not work.
>>
>> device A:
>> Settings for eth5:
>>         Supported ports: [ TP ]
>>         Supported link modes:   10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Supported pause frame use: Symmetric Receive-only
>>         Supports auto-negotiation: Yes
>>         Supported FEC modes: Not reported
>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Advertised pause frame use: Symmetric
>>         Advertised auto-negotiation: Yes
>>         Advertised FEC modes: Not reported
>>         Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>                                              100baseT/Half 100baseT/Full
>>         Link partner advertised pause frame use: Symmetric
>>         Link partner advertised auto-negotiation: Yes
>>         Link partner advertised FEC modes: Not reported
>>         Speed: 100Mb/s
>>         Duplex: Full
>>         Port: MII
>>         PHYAD: 3
>>         Transceiver: internal
>>         Auto-negotiation: on
>>         Current message level: 0x00000036 (54)
>>                                probe link ifdown ifup
>>         Link detected: yes
>>
>> The register values read over mdio are:
>> reg 9 = 0x200
>> reg a = 0
>>
>> device B:
>> Settings for eth5:
>>         Supported ports: [ TP ]
>>         Supported link modes:   10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Supported pause frame use: Symmetric Receive-only
>>         Supports auto-negotiation: Yes
>>         Supported FEC modes: Not reported
>>         Advertised link modes:  10baseT/Half 10baseT/Full
>>                                 100baseT/Half 100baseT/Full
>>                                 1000baseT/Full
>>         Advertised pause frame use: Symmetric
>>         Advertised auto-negotiation: Yes
>>         Advertised FEC modes: Not reported
>>         Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>                                              100baseT/Half 100baseT/Full
>>                                              1000baseT/Full
>>         Link partner advertised pause frame use: Symmetric
>>         Link partner advertised auto-negotiation: Yes
>>         Link partner advertised FEC modes: Not reported
>>         Speed: 1000Mb/s
>>         Duplex: Full
>>         Port: MII
>>         PHYAD: 3
>>         Transceiver: internal
>>         Auto-negotiation: on
>>         Current message level: 0x00000036 (54)
>>                                probe link ifdown ifup
>>         Link detected: yes
>>
>> The register values read over mdio are:
>> reg 9 = 0
>> reg a = 0x800
>>
>> I talked to the FAE for the rtl8211f; they said that if negotiation at 1000Mb/s
>> fails, the rtl8211f changes reg 9 to 0 and then tries to negotiate at 100Mb/s.
>>
>> The problem happened as:
>> ifconfig eth5 up -> phy_start -> phy_start_aneg -> phy_modify_changed(MII_CTRL1000)
>> (this time both A and B, reg 9 = 0x200) -> wait for link up -> (B: reg 9 changed to 0)
>> -> link up.
> 
> This sounds like downshift, but not correctly working. 1Gbps requires
> that 4 pairs in the cable work. If a 1Gbps link is negotiated, but
> then does not establish because one of the pairs is broken, some PHYs
> will try to 'downshift'. They drop down to 100Mbps, which only
> requires two pairs of the cable to work. To do this, the PHY should
> change what it is advertising, to no longer advertise 1G, just 100M
> and 10M. The link partner should then try to use 100Mbps and
> hopefully, a link is established.
> 
> Looking at the ethtool, you can see device A is reporting device B is
> only advertising up to 100Mbps. Yet it is locally using 1G. That is
> broken. So I would say device A has the problem. Are both PHYs
> rtl8211f?

Both PHYs are rtl8211f. I think Device B is the broken one. The driver on
Device B still advertises 1G, but the PHY itself has downshifted to 100M, so
Device B links up at 1G on the driver side while the PHY is actually at 100M.
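
For reference, the register values quoted above can be decoded with a small
helper. This is only a sketch: the bit masks follow the 1000BASE-T
control/status layout as named in the Linux mii.h header (reg 9 is
MII_CTRL1000, reg 0xa is MII_STAT1000), and the helper name decode_1000t is
made up for illustration.

```python
# Gigabit advertisement bits, as named in Linux include/uapi/linux/mii.h.
# reg 9 (MII_CTRL1000): what the local PHY advertises.
CTRL1000_BITS = {
    0x0200: "1000baseT/Full",  # ADVERTISE_1000FULL
    0x0100: "1000baseT/Half",  # ADVERTISE_1000HALF
}
# reg 0xa (MII_STAT1000): what the link partner advertises.
STAT1000_BITS = {
    0x0800: "1000baseT/Full",  # LPA_1000FULL
    0x0400: "1000baseT/Half",  # LPA_1000HALF
}


def decode_1000t(value, table):
    """Return the names of the gigabit modes set in a register value."""
    return [name for bit, name in table.items() if value & bit]


# Values reported in this thread.
for dev, reg9, rega in (("A", 0x200, 0x0), ("B", 0x0, 0x800)):
    print(dev,
          "local:", decode_1000t(reg9, CTRL1000_BITS),
          "partner:", decode_1000t(rega, STAT1000_BITS))
```

Read this way, the PHY on Device A still advertises 1000baseT/Full and sees no
gigabit ability from its partner, while the PHY on Device B advertises no
gigabit mode at all even though its partner does.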

> 
>> I think this is a bug in the rtl8211f itself. Does anyone have an idea
>> how to avoid it?
> 
> Are you 100% sure your cable and board layout is good? Is it trying to
> downshift because something is broken? Fix the cable/connector and the

I will check the layout with our hardware engineer. This happens with low
probability. When it happens, another down/up cycle or restarting autoneg
resolves it.

> reason to downshift goes away. But it does not solve the problem if a
> customer has a broken cable. So you might want to deliberately cut a
> pair in the cable so it becomes 100% reproducible and try to debug it
> further. See if you can find out why auto-neg is not working
> correctly.

So, your opinion is that we should check whether the hardware layout or the
cable has a problem?

By the way, do we have any mechanism to handle this downshift on the software
side? If the PHY has downshifted its advertisement to 100M but software still
advertises 1G (just like Device B), the network will stay broken.
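
To make the question concrete, here is a rough sketch of the kind of check I
have in mind. It is not how phylib works today and not a proposed patch: the
function names are hypothetical, and the register values are passed in
directly instead of being read over mdio. The idea is to compare what software
wrote to reg 9 with what is read back after link up, and to resolve the
gigabit result from the value the PHY really holds.

```python
ADVERTISE_1000FULL = 0x0200  # reg 9 (MII_CTRL1000)
ADVERTISE_1000HALF = 0x0100
LPA_1000FULL = 0x0800        # reg 0xa (MII_STAT1000)
LPA_1000HALF = 0x0400
GIGABIT_ADV = ADVERTISE_1000FULL | ADVERTISE_1000HALF


def phy_downshifted(written_ctrl1000, read_ctrl1000):
    """True if the PHY cleared gigabit bits that software had written."""
    dropped = written_ctrl1000 & ~read_ctrl1000 & GIGABIT_ADV
    return dropped != 0


def gigabit_agreed(ctrl1000, stat1000):
    """True if both sides advertise a common 1000BASE-T mode."""
    local = ctrl1000 & GIGABIT_ADV
    # The link partner bits sit two positions above the local ones.
    partner = (stat1000 & (LPA_1000FULL | LPA_1000HALF)) >> 2
    return bool(local & partner)


# Device B: software wrote 0x200, but after link up reg 9 = 0, reg a = 0x800.
written, reg9, rega = 0x200, 0x0, 0x800
print("downshift detected:", phy_downshifted(written, reg9))
print("1G using cached advertisement:", gigabit_agreed(written, rega))
print("1G using value read from PHY:", gigabit_agreed(reg9, rega))
```

With the cached advertisement the result is 1G, which matches what Device B
reports; with the value read back from the PHY it is not 1G. When the check
fires, software could either resolve the speed from the read-back value or
restart autoneg, which is what recovers the link in my test.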

> 
> 	Andrew
> 
> .
> 
