lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 20 Jul 2019 16:22:50 +0200 (CEST)
From:   Thomas Voegtle <tv@...96.de>
To:     Heiner Kallweit <hkallweit1@...il.com>
cc:     linux-kernel@...r.kernel.org,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: network problems with r8169

On Sat, 20 Jul 2019, Heiner Kallweit wrote:

> On 19.07.2019 23:12, Thomas Voegtle wrote:
>> On Fri, 19 Jul 2019, Heiner Kallweit wrote:
>>
>>> On 18.07.2019 20:50, Thomas Voegtle wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm having network problems with the commits on r8169 since v5.2. There are ping packet loss, sometimes 100%, sometimes 50%. In the end network is unusable.
>>>>
>>>> v5.2 is fine, I bisected it down to:
>>>>
>>>> a2928d28643e3c064ff41397281d20c445525032 is the first bad commit
>>>> commit a2928d28643e3c064ff41397281d20c445525032
>>>> Author: Heiner Kallweit <hkallweit1@...il.com>
>>>> Date:   Sun Jun 2 10:53:49 2019 +0200
>>>>
>>>>     r8169: use paged versions of phylib MDIO access functions
>>>>
>>>>     Use paged versions of phylib MDIO access functions to simplify
>>>>     the code.
>>>>
>>>>     Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
>>>>     Signed-off-by: David S. Miller <davem@...emloft.net>
>>>>
>>>>
>>>> Reverting that commit on top of v5.2-11564-g22051d9c4a57 fixes the problem
>>>> for me (had to adjust the renaming to r8169_main.c).
>>>>
>>>> I have a:
>>>> 04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev
>>>> 0c)
>>>>         Subsystem: Biostar Microtech Int'l Corp Device [1565:2400]
>>>>         Kernel driver in use: r8169
>>>>
>>>> on a BIOSTAR H81MG motherboard.
>>>>
>>> Interesting. I have the same chip version (RTL8168g) and can't reproduce
>>> the issue. Can you provide a full dmesg output and test the patch below
>>> on top of linux-next? I'd be interested in the WARN_ON stack traces
>>> (if any) and would like to know whether the experimental change to
>>> __phy_modify_changed helps.
>>>
>>>>
>>>> greetings,
>>>>
>>>>   Thomas
>>>>
>>>>
>>> Heiner
>>>
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>> index 8d7dd4c5f..26be73000 100644
>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>> @@ -1934,6 +1934,8 @@ static int rtl_get_eee_supp(struct rtl8169_private *tp)
>>>     struct phy_device *phydev = tp->phydev;
>>>     int ret;
>>>
>>> +    WARN_ON(phy_read(phydev, 0x1f));
>>> +
>>>     switch (tp->mac_version) {
>>>     case RTL_GIGA_MAC_VER_34:
>>>     case RTL_GIGA_MAC_VER_35:
>>> @@ -1957,6 +1959,8 @@ static int rtl_get_eee_lpadv(struct rtl8169_private *tp)
>>>     struct phy_device *phydev = tp->phydev;
>>>     int ret;
>>>
>>> +    WARN_ON(phy_read(phydev, 0x1f));
>>> +
>>>     switch (tp->mac_version) {
>>>     case RTL_GIGA_MAC_VER_34:
>>>     case RTL_GIGA_MAC_VER_35:
>>> @@ -1980,6 +1984,8 @@ static int rtl_get_eee_adv(struct rtl8169_private *tp)
>>>     struct phy_device *phydev = tp->phydev;
>>>     int ret;
>>>
>>> +    WARN_ON(phy_read(phydev, 0x1f));
>>> +
>>>     switch (tp->mac_version) {
>>>     case RTL_GIGA_MAC_VER_34:
>>>     case RTL_GIGA_MAC_VER_35:
>>> @@ -2003,6 +2009,8 @@ static int rtl_set_eee_adv(struct rtl8169_private *tp, int val)
>>>     struct phy_device *phydev = tp->phydev;
>>>     int ret = 0;
>>>
>>> +    WARN_ON(phy_read(phydev, 0x1f));
>>> +
>>>     switch (tp->mac_version) {
>>>     case RTL_GIGA_MAC_VER_34:
>>>     case RTL_GIGA_MAC_VER_35:
>>> diff --git a/drivers/net/phy/phy-core.c b/drivers/net/phy/phy-core.c
>>> index 16667fbac..1aa1142b8 100644
>>> --- a/drivers/net/phy/phy-core.c
>>> +++ b/drivers/net/phy/phy-core.c
>>> @@ -463,12 +463,10 @@ int __phy_modify_changed(struct phy_device *phydev, u32 regnum, u16 mask,
>>>         return ret;
>>>
>>>     new = (ret & ~mask) | set;
>>> -    if (new == ret)
>>> -        return 0;
>>>
>>> -    ret = __phy_write(phydev, regnum, new);
>>> +    __phy_write(phydev, regnum, new);
>>>
>>> -    return ret < 0 ? ret : 1;
>>> +    return new != ret;
>>> }
>>> EXPORT_SYMBOL_GPL(__phy_modify_changed);
>>>
>>>
>>
>> Took your patch on top of next-20190719.
>> See attached dmesg.
>> It didn't work. Same thing, lots of ping drops, no usable network.
>>
>> like that:
>> 44 packets transmitted, 2 received, 95% packet loss, time 44005ms
>>
>>
>> Maybe important:
>> I build a kernel with no modules.
>>
>> I have to power off when I booted a kernel which doesn't work, a (soft) reboot into a older kernel (e.g. 4.9.y)  doesn't
>> fix the problem. Powering off and on does.
>>
>
> Then, what you could do is reversing the hunks of the patch step by step.
> Or make them separate patches and bisect.
> Relevant are the hunks from point 1 and 2.
>
> 1. first 5 hunks (I don't think you have to reverse them individually)
>   EEE-related
>
> 2. rtl8168g_disable_aldps, rtl8168g_phy_adjust_10m_aldps, rtl8168g_1_hw_phy_config
>   all of these hunks are in the path for RTL8168g
>
> 3. rtl8168h_1_hw_phy_config, rtl8168h_2_hw_phy_config, rtl8168ep_1_hw_phy_config,
>   rtl8168ep_2_hw_phy_config
>   not in the path for RTL8168g
>

this is the minimal revert:

diff --git a/drivers/net/ethernet/realtek/r8169_main.c 
b/drivers/net/ethernet/realtek/r8169_main.c
index efef5453b94f..267995a614b5 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -3249,12 +3249,14 @@ static void rtl8168g_1_hw_phy_config(struct 
rtl8169_private *tp)
         else
                 phy_modify_paged(tp->phydev, 0x0bcc, 0x12, 0, BIT(15));

-       ret = phy_read_paged(tp->phydev, 0x0a46, 0x13);
-       if (ret & BIT(8))
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
-       else
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
-
+       rtl_writephy(tp, 0x1f, 0x0a46);
+       if (rtl_readphy(tp, 0x13) & 0x0100) {
+               rtl_writephy(tp, 0x1f, 0x0c41);
+               rtl_w0w1_phy(tp, 0x15, 0x0002, 0x0000);
+       } else {
+               rtl_writephy(tp, 0x1f, 0x0c41);
+               rtl_w0w1_phy(tp, 0x15, 0x0000, 0x0002);
+       }
         /* Enable PHY auto speed down */
         phy_modify_paged(tp->phydev, 0x0a44, 0x11, 0, BIT(3) | BIT(2));



Could it be, that there is just a typo?

         if (ret & BIT(8))
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
+               phy_modify_paged(tp->phydev, 0x0c41, 0x15, 0, BIT(1));
         else
-               phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
+               phy_modify_paged(tp->phydev, 0x0c41, 0x15, BIT(1), 0);




greetings,

       Thomas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ