[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e11f8eae-2bca-037b-9a7e-43c4276a2be6@gmail.com>
Date: Tue, 9 Oct 2018 23:39:16 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Chris Clayton <chris2553@...glemail.com>,
"Maciej S. Szmigiero" <mail@...iej.szmigiero.name>
Cc: Azat Khuzhin <a3at.mail@...il.com>,
Realtek linux nic maintainers <nic_swsd@...ltek.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: R8169: Network lockups in 4.18.{8,9,10} (and 4.19 dev)
On 09.10.2018 16:40, Chris Clayton wrote:
> Thanks to Maciej and Heiner for their replies.
>
> On 09/10/2018 13:32, Maciej S. Szmigiero wrote:
>> On 07.10.2018 21:36, Chris Clayton wrote:
>>> Hi again,
>>>
>>> I didn't think there was anything in 4.19-rc7 to fix this regression, but tried it anyway. I can confirm that the
>>> regression is still present and my network still fails when, after a resume from suspend (to ram or disk), I open my
>>> browser or my mail client. In both those cases the failure is almost immediate - e.g. my home page doesn't get displayed
>>> in the browser. Pinging one of my ISPs name servers doesn't fail quite so quickly but the reported time increases from
>>> 14-15ms to more than 1000ms.
>>
>> You can try comparing chip registers (ethtool -d eth0) in the working
>> state (before a suspend) and in the broken state (after a resume).
>> Maybe there will be some obvious in the difference.
>>
>> The same goes for the PCI configuration (lspci -d :8168 -vv).
>>
> Maciej suggested comparing the output from lspci -vv for the ethernet device. They are identical.
>
> Both Maciej and Heiner suggested comparing the output from "ethtool -d" pre and post suspend. Again, they are identical.
> Heiner specifically suggested looking at the RxConfig. The value of that is 0x0002870e both pre and post suspend.
>
> I've attached files I redirected the outputs to.
>
> Please don't hesitate to ask for any other information needed to solve this problem. In the meantime, I've now got
> scripts that stop the network during suspend and restart it during resume. (Those scripts were removed whilst I gathered
> the diagnostics shown in the attachments.)
>
I'd like to check whether it may be a timing issue. The following experimental patch
adds a PCI commit after writing register ChipCmd. Could you please check whether
it changes anything?
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 7d3f671e1..f3c359492 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -4641,6 +4641,7 @@ static void rtl_hw_start(struct rtl8169_private *tp)
/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
RTL_R8(tp, IntrMask);
RTL_W8(tp, ChipCmd, CmdTxEnb | CmdRxEnb);
+ RTL_R8(tp, ChipCmd);
rtl_init_rxcfg(tp);
rtl_set_tx_config_registers(tp);
--
2.19.1
Powered by blists - more mailing lists