Message-ID: <172787aa-9ef5-091d-f70f-baf89fe0b1ee@gmail.com>
Date: Tue, 29 Jan 2019 07:44:28 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: Peter Ceiley <peter@...ley.net>
Cc: Realtek linux nic maintainers <nic_swsd@...ltek.com>,
netdev@...r.kernel.org
Subject: Re: r8169 Driver - Poor Network Performance Since Kernel 4.19
Hi Peter,
I think the vendor driver doesn't enable ASPM by default, so it's worth
a try to disable ASPM in the BIOS or via sysfs.
A few older systems seem to have issues with ASPM. What kind of
system / mainboard are you using? Is the RTL8168 the onboard
network chip?
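
A rough, untested sketch of the sysfs route (as root; the path comes
from the pcie_aspm module, revert by writing "default" again):

  cat /sys/module/pcie_aspm/parameters/policy       # show current policy
  echo performance > /sys/module/pcie_aspm/parameters/policy   # disables ASPM

Booting with pcie_aspm=off on the kernel command line should have the
same effect. Note the write can be refused if the BIOS doesn't hand
ASPM control to the OS.
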
Rgds, Heiner
On 29.01.2019 07:20, Peter Ceiley wrote:
> Hi Heiner,
>
> Thanks, I'll do some more testing. It might not be the driver - I
> assumed it was because using the r8168 driver 'resolves' the issue.
> I'll see if I can test the r8169.c from 4.18 on top of 4.19 - that's a
> good idea.
>
> Cheers,
>
> Peter.
>
> On Tue, 29 Jan 2019 at 17:16, Heiner Kallweit <hkallweit1@...il.com> wrote:
>>
>> Hi Peter,
>>
>> At first glance it doesn't look like a typical driver issue.
>> What you could do:
>>
>> - Test the r8169.c from 4.18 on top of 4.19 (see the sketch below).
>>
>> - Check whether disabling ASPM (/sys/module/pcie_aspm) has an effect.
>>
>> - Bisect between 4.18 and 4.19 to find the offending commit (sketch below).
>>
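>> Rough sketches for the first and third points, assuming a mainline git
>> tree with a working .config (untested, adjust as needed):
>>
>>   # 4.18 driver on top of a 4.19 tree (may need small fixups if the
>>   # surrounding API changed), then rebuild just the Realtek modules
>>   git checkout v4.19
>>   git checkout v4.18 -- drivers/net/ethernet/realtek/r8169.c
>>   make M=drivers/net/ethernet/realtek modules
>>
>>   # bisect between the two releases
>>   git bisect start v4.19 v4.18
>>   # build/boot/test each kernel git checks out, then mark it with
>>   git bisect good    # or: git bisect bad
>>   git bisect reset   # when done
>>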
>> Any specific reason why you think the root cause is in the driver and
>> not elsewhere in the network subsystem?
>>
>> Heiner
>>
>>
>> On 28.01.2019 23:10, Peter Ceiley wrote:
>>> Hi Heiner,
>>>
>>> Thanks for getting back to me.
>>>
>>> No, I don't use jumbo packets.
>>>
>>> Bandwidth is *generally* good, and iperf to my NAS shows over
>>> 900 Mbit/s in both cases. The issue seems to appear when
>>> establishing a connection and is most noticeable on my mounted NFS
>>> shares, where it takes seconds (up to tens of seconds on larger
>>> directories) to list the contents of each directory. Once a file
>>> transfer begins, I appear to get good bandwidth.
>>>
>>> I'm not sure what data would be most useful for troubleshooting this
>>> issue. Running
>>>
>>> netstat -s | grep retransmitted
>>>
>>> shows a steady increase in retransmitted segments each time I list the
>>> contents of a remote directory. For example, running 'ls' on a
>>> directory containing 345 media files on kernel 4.19.18 increased the
>>> retransmitted segment count by 21, and the 'time' command showed the
>>> following:
>>> real 0m19.867s
>>> user 0m0.012s
>>> sys 0m0.036s
>>>
>>> The same command shows no new retransmitted segments on kernel
>>> 4.18.16, and 'time' showed:
>>> real 0m0.300s
>>> user 0m0.004s
>>> sys 0m0.007s
>>>
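>>> In other words, each test was roughly (the directory path is just a
>>> placeholder for one of the NFS mounts):
>>>
>>>   netstat -s | grep retransmitted     # note the counter
>>>   time ls /path/to/nfs/directory
>>>   netstat -s | grep retransmitted     # counter again
>>>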
>>> ifconfig does not show any RX/TX errors nor dropped packets in either case.
>>>
>>> dmesg XID:
>>> [ 2.979984] r8169 0000:03:00.0 eth0: RTL8168g/8111g,
>>> f8:b1:56:fe:67:e0, XID 4c000800, IRQ 32
>>>
>>> # lspci -vv
>>> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
>>> Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>> Latency: 0, Cache Line Size: 64 bytes
>>> Interrupt: pin A routed to IRQ 19
>>> Region 0: I/O ports at d000 [size=256]
>>> Region 2: Memory at f7b00000 (64-bit, non-prefetchable) [size=4K]
>>> Region 4: Memory at f2100000 (64-bit, prefetchable) [size=16K]
>>> Capabilities: [40] Power Management version 3
>>> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
>>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>> Address: 0000000000000000 Data: 0000
>>> Capabilities: [70] Express (v2) Endpoint, MSI 01
>>> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s
>>> <512ns, L1 <64us
>>> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>> SlotPowerLimit 10.000W
>>> DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>>> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>> MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>> DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>>> LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
>>> Latency L0s unlimited, L1 <64us
>>> ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>> LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>> ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>> LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>>> TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+,
>>> OBFF Via message/WAKE#
>>> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+,
>>> OBFF Disabled
>>> AtomicOpsCtl: ReqEn-
>>> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>>> Transmit Margin: Normal Operating Range,
>>> EnterModifiedCompliance- ComplianceSOS-
>>> Compliance De-emphasis: -6dB
>>> LnkSta2: Current De-emphasis Level: -6dB,
>>> EqualizationComplete-, EqualizationPhase1-
>>> EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>> Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>> Vector table: BAR=4 offset=00000000
>>> PBA: BAR=4 offset=00000800
>>> Capabilities: [d0] Vital Product Data
>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>> Not readable
>>> Capabilities: [100 v1] Advanced Error Reporting
>>> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
>>> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>> CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr-
>>> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
>>> AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn-
>>> ECRCChkCap+ ECRCChkEn-
>>> MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
>>> HeaderLog: 00000000 00000000 00000000 00000000
>>> Capabilities: [140 v1] Virtual Channel
>>> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
>>> Arb: Fixed- WRR32- WRR64- WRR128-
>>> Ctrl: ArbSelect=Fixed
>>> Status: InProgress-
>>> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>>> Status: NegoPending- InProgress-
>>> Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>> Capabilities: [170 v1] Latency Tolerance Reporting
>>> Max snoop latency: 71680ns
>>> Max no snoop latency: 71680ns
>>> Kernel driver in use: r8169
>>> Kernel modules: r8169
>>>
>>> Please let me know if you have any other ideas in terms of testing.
>>>
>>> Thanks!
>>>
>>> Peter.
>>>
>>> On Tue, 29 Jan 2019 at 05:28, Heiner Kallweit <hkallweit1@...il.com> wrote:
>>>>
>>>> On 28.01.2019 12:13, Peter Ceiley wrote:
>>>>> Hi,
>>>>>
>>>>> I have been experiencing very poor network performance since Kernel
>>>>> 4.19 and I'm confident it's related to the r8169 driver.
>>>>>
>>>>> I have no issue with kernel versions 4.18 and prior. I am experiencing
>>>>> this issue in kernels 4.19 and 4.20 (currently running/testing with
>>>>> 4.20.4 & 4.19.18).
>>>>>
>>>>> If someone could guide me in the right direction, I'm happy to help
>>>>> troubleshoot this issue. Note that I have been keeping an eye on one
>>>>> issue related to loading of the PHY driver; however, my symptoms
>>>>> differ in that I still have a network connection. I have attempted to
>>>>> reload the driver on a running system, but this does not improve the
>>>>> situation.
>>>>>
>>>>> Using the proprietary r8168 driver returns my device to proper working order.
>>>>>
>>>>> lshw shows:
>>>>> description: Ethernet interface
>>>>> product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>> vendor: Realtek Semiconductor Co., Ltd.
>>>>> physical id: 0
>>>>> bus info: pci@...0:03:00.0
>>>>> logical name: enp3s0
>>>>> version: 0c
>>>>> serial:
>>>>> size: 1Gbit/s
>>>>> capacity: 1Gbit/s
>>>>> width: 64 bits
>>>>> clock: 33MHz
>>>>> capabilities: pm msi pciexpress msix vpd bus_master cap_list
>>>>> ethernet physical tp aui bnc mii fibre 10bt 10bt-fd 100bt 100bt-fd
>>>>> 1000bt-fd autonegotiation
>>>>> configuration: autonegotiation=on broadcast=yes driver=r8169
>>>>> duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=192.168.1.25
>>>>> latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
>>>>> resources: irq:19 ioport:d000(size=256)
>>>>> memory:f7b00000-f7b00fff memory:f2100000-f2103fff
>>>>>
>>>>> Kind Regards,
>>>>>
>>>>> Peter.
>>>>>
>>>> Hi Peter,
>>>>
>>>> The description "poor network performance" is quite vague, therefore:
>>>>
>>>> - Can you provide any measurements?
>>>> - iperf results before and after
>>>> - statistics about dropped packets (rx and/or tx)
>>>> - Do you use jumbo packets?
>>>>
>>>> Also helpful would be the "lspci -vv" output for the network card and
>>>> the dmesg output line with the chip XID.
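>>>>
>>>> Roughly, based on your lshw output (the iperf server address is a
>>>> placeholder):
>>>>
>>>>   iperf3 -c <server>             # or plain iperf, on both 4.18 and 4.19
>>>>   ip -s link show dev enp3s0     # rx/tx errors and drops
>>>>   lspci -vv -s 03:00.0
>>>>   dmesg | grep -i xid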
>>>>
>>>> Heiner
>>>
>>
>