[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <add2c1e3-a216-5082-e847-452adb0169f7@gmail.com>
Date: Sat, 20 Jul 2019 18:44:47 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Thomas Voegtle <tv@...96.de>
Cc: linux-kernel@...r.kernel.org,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: network problems with r8169
On 20.07.2019 16:22, Thomas Voegtle wrote:
> On Sat, 20 Jul 2019, Heiner Kallweit wrote:
>
>> On 19.07.2019 23:12, Thomas Voegtle wrote:
>>> On Fri, 19 Jul 2019, Heiner Kallweit wrote:
>>>
>>>> On 18.07.2019 20:50, Thomas Voegtle wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I'm having network problems with the commits on r8169 since v5.2. There are ping packet loss, sometimes 100%, sometimes 50%. In the end network is unusable.
>>>>>
>>>>> v5.2 is fine, I bisected it down to:
>>>>>
>>>>> a2928d28643e3c064ff41397281d20c445525032 is the first bad commit
>>>>> commit a2928d28643e3c064ff41397281d20c445525032
>>>>> Author: Heiner Kallweit <hkallweit1@...il.com>
>>>>> Date: Sun Jun 2 10:53:49 2019 +0200
>>>>>
>>>>> r8169: use paged versions of phylib MDIO access functions
>>>>>
>>>>> Use paged versions of phylib MDIO access functions to simplify
>>>>> the code.
>>>>>
>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
>>>>> Signed-off-by: David S. Miller <davem@...emloft.net>
>>>>>
>>>>>
>>>>> Reverting that commit on top of v5.2-11564-g22051d9c4a57 fixes the problem
>>>>> for me (had to adjust the renaming to r8169_main.c).
>>>>>
>>>>> I have a:
>>>>> 04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev
>>>>> 0c)
>>>>> Subsystem: Biostar Microtech Int'l Corp Device [1565:2400]
>>>>> Kernel driver in use: r8169
>>>>>
>>>>> on a BIOSTAR H81MG motherboard.
>>>>>
>>>> Interesting. I have the same chip version (RTL8168g) and can't reproduce
>>>> the issue. Can you provide a full dmesg output and test the patch below
>>>> on top of linux-next? I'd be interested in the WARN_ON stack traces
>>>> (if any) and would like to know whether the experimental change to
>>>> __phy_modify_changed helps.
>>>>
>>>>>
>>>>> greetings,
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>> Heiner
>>>>
>>>>
>>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>>> index 8d7dd4c5f..26be73000 100644
>>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>>> @@ -1934,6 +1934,8 @@ static int rtl_get_eee_supp(struct rtl8169_private *tp)
>>>> struct phy_device *phydev = tp->phydev;
>>>> int ret;
>>>>
>>>> + WARN_ON(phy_read(phydev, 0x1f));
>>>> +
>>>> switch (tp->mac_version) {
>>>> case RTL_GIGA_MAC_VER_34:
>>>> case RTL_GIGA_MAC_VER_35:
>>>> @@ -1957,6 +1959,8 @@ static int rtl_get_eee_lpadv(struct rtl8169_private *tp)
>>>> struct phy_device *phydev = tp->phydev;
>>>> int ret;
>>>>
>>>> + WARN_ON(phy_read(phydev, 0x1f));
>>>> +
>>>> switch (tp->mac_version) {
>>>> case RTL_GIGA_MAC_VER_34:
>>>> case RTL_GIGA_MAC_VER_35:
>>>> @@ -1980,6 +1984,8 @@ static int rtl_get_eee_adv(struct rtl8169_private *tp)
>>>> struct phy_device *phydev = tp->phydev;
>>>> int ret;
>>>>
>>>> + WARN_ON(phy_read(phydev, 0x1f));
>>>> +
>>>> switch (tp->mac_version) {
>>>> case RTL_GIGA_MAC_VER_34:
>>>> case RTL_GIGA_MAC_VER_35:
>>>> @@ -2003,6 +2009,8 @@ static int rtl_set_eee_adv(struct rtl8169_private *tp, int val)
>>>> struct phy_device *phydev = tp->phydev;
>>>> int ret = 0;
>>>>
>>>> + WARN_ON(phy_read(phydev, 0x1f));
>>>> +
>>>> switch (tp->mac_version) {
>>>> case RTL_GIGA_MAC_VER_34:
>>>> case RTL_GIGA_MAC_VER_35:
>>>> diff --git a/drivers/net/phy/phy-core.c b/drivers/net/phy/phy-core.c
>>>> index 16667fbac..1aa1142b8 100644
>>>> --- a/drivers/net/phy/phy-core.c
>>>> +++ b/drivers/net/phy/phy-core.c
>>>> @@ -463,12 +463,10 @@ int __phy_modify_changed(struct phy_device *phydev, u32 regnum, u16 mask,
>>>> return ret;
>>>>
>>>> new = (ret & ~mask) | set;
>>>> - if (new == ret)
>>>> - return 0;
>>>>
>>>> - ret = __phy_write(phydev, regnum, new);
>>>> + __phy_write(phydev, regnum, new);
>>>>
>>>> - return ret < 0 ? ret : 1;
>>>> + return new != ret;
>>>> }
>>>> EXPORT_SYMBOL_GPL(__phy_modify_changed);
>>>>
>>>>
>>>
>>> Took your patch on top of next-20190719.
>>> See attached dmesg.
>>> It didn't work. Same thing, lots of ping drops, no usable network.
>>>
>>> like that:
>>> 44 packets transmitted, 2 received, 95% packet loss, time 44005ms
>>>
>>>
>>> Maybe important:
>>> I build a kernel with no modules.
>>>
>>> I have to power off when I booted a kernel which doesn't work, a (soft) reboot into a older kernel (e.g. 4.9.y) doesn't
>>> fix the problem. Powering off and on does.
>>>
>>
>> Then, what you could do is reversing the hunks of the patch step by step.
>> Or make them separate patches and bisect.
>> Relevant are the hunks from point 1 and 2.
>>
>> 1. first 5 hunks (I don't think you have to reverse them individually)
>> EEE-related
>>
>> 2. rtl8168g_disable_aldps, rtl8168g_phy_adjust_10m_aldps, rtl8168g_1_hw_phy_config
>> all of these hunks are in the path for RTL8168g
>>
>> 3. rtl8168h_1_hw_phy_config, rtl8168h_2_hw_phy_config, rtl8168ep_1_hw_phy_config,
>> rtl8168ep_2_hw_phy_config
>> not in the path for RTL8168g
>>
>
> this is the minimal revert:
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index efef5453b94f..267995a614b5 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -3249,12 +3249,14 @@ static void rtl8168g_1_hw_phy_config(struct rtl8169_private *tp)
> else
> phy_modify_paged(tp->phydev, 0x0bcc, 0x12, 0, BIT(15));
>
> - ret = phy_read_paged(tp->phydev, 0x0a46, 0x13);
> - if (ret & BIT(8))
> - phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
> - else
> - phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
> -
> + rtl_writephy(tp, 0x1f, 0x0a46);
> + if (rtl_readphy(tp, 0x13) & 0x0100) {
> + rtl_writephy(tp, 0x1f, 0x0c41);
> + rtl_w0w1_phy(tp, 0x15, 0x0002, 0x0000);
> + } else {
> + rtl_writephy(tp, 0x1f, 0x0c41);
> + rtl_w0w1_phy(tp, 0x15, 0x0000, 0x0002);
> + }
> /* Enable PHY auto speed down */
> phy_modify_paged(tp->phydev, 0x0a44, 0x11, 0, BIT(3) | BIT(2));
>
>
>
> Could it be, that there is just a typo?
>
I looked a hundred times over this piece of code ..
Yes, it's simply a typo. I'll submit a patch for it.
Thanks a lot for your testing efforts!
> if (ret & BIT(8))
> - phy_modify_paged(tp->phydev, 0x0c41, 0x12, 0, BIT(1));
> + phy_modify_paged(tp->phydev, 0x0c41, 0x15, 0, BIT(1));
> else
> - phy_modify_paged(tp->phydev, 0x0c41, 0x12, BIT(1), 0);
> + phy_modify_paged(tp->phydev, 0x0c41, 0x15, BIT(1), 0);
>
>
>
>
> greetings,
>
> Thomas
Heiner
Powered by blists - more mailing lists