Message-ID: <aead4da3-e1b0-ab6c-2842-634e175b33ab@gmail.com>
Date:   Wed, 30 Jan 2019 20:15:45 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Peter Ceiley <peter@...ley.net>
Cc:     Realtek linux nic maintainers <nic_swsd@...ltek.com>,
        netdev@...r.kernel.org
Subject: Re: r8169 Driver - Poor Network Performance Since Kernel 4.19

Hi Peter,

recently I had a case where pcie_aspm=off, for whatever reason, didn't
do the trick. Can you also check with pcie_aspm.policy=performance?
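
After booting with that parameter, the policy actually in effect could be confirmed with something like the rough sketch below. It assumes the kernel exposes /sys/module/pcie_aspm/parameters/policy and marks the active policy with square brackets; the function name is made up for illustration.

```shell
# Print the ASPM policy currently in effect.
# Assumption: the policy file lists all policies on one line and marks
# the active one with square brackets, e.g.
#   default performance [powersave] powersupersave
active_aspm_policy() {
    # Optional first argument overrides the sysfs path (handy for testing).
    policy_file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    # Keep only the text between the brackets.
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$policy_file"
}
```

Called without arguments it reads the sysfs file; with pcie_aspm.policy=performance on the kernel command line it should print "performance".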

And please check with "ethtool -S <if>" whether the chip statistics
show a significant number of errors.
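
To make non-zero error counters easier to spot, the output can be filtered. This is only a sketch: the function name is hypothetical, and the list of counter-name fragments (err, miss, abort, ...) is an assumption that may need adjusting for the counters the chip actually reports.

```shell
# Read "ethtool -S" output on stdin and print only counters that look
# error-related and have a non-zero value.
# Assumption: counters appear as "     name: value", one per line.
nonzero_error_counters() {
    awk -F': *' '
        $1 ~ /err|miss|abort|underrun|drop|collision/ && $2 + 0 > 0 {
            gsub(/^[ \t]+/, "", $1)   # strip the leading indentation
            print $1 ": " $2
        }'
}
```

Typical use would be: ethtool -S <if> | nonzero_error_counters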

If this doesn't help you may have to bisect to find the offending commit.
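
A bisect could be started roughly as sketched below. The function name is made up, the v4.18 and v4.19 tags are assumed to exist in the kernel tree, and restricting the search to drivers/net/ethernet/realtek is an assumption based on the old r8169.c fixing the problem; drop the path if the culprit might be elsewhere.

```shell
# Start a bisect between the last good and first bad release, limited
# to commits touching the Realtek driver directory.
# Assumption: the tree (first argument, default ".") has tags v4.18 and v4.19.
start_r8169_bisect() {
    tree="${1:-.}"
    git -C "$tree" bisect start v4.19 v4.18 -- drivers/net/ethernet/realtek
}

# For each commit git checks out: build, boot, test the NFS listing, then
#   git bisect good     (no slowdown)   or
#   git bisect bad      (slowdown reproduced)
# When git names the first bad commit, finish with:
#   git bisect reset
```

Limiting the path keeps the number of kernels to build small, at the cost of missing a cause outside the driver.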

Heiner


On 30.01.2019 10:59, Peter Ceiley wrote:
> Hi Heiner,
> 
> I tried disabling the ASPM using the pcie_aspm=off kernel parameter
> and this made no difference.
> 
> I tried compiling the 4.18.16 r8169.c with the 4.19.18 source and
> subsequently loaded the module in the running 4.19.18 kernel. I can
> confirm that this immediately resolved the issue and access to the NFS
> shares operated as expected.
> 
> I presume this means it is an issue with the r8169 driver included in
> 4.19 onwards?
> 
> To answer your last questions:
> 
> Base Board Information
>     Manufacturer: Alienware
>     Product Name: 0PGRP5
>     Version: A02
> 
> ... and yes, the RTL8168 is the onboard network chip.
> 
> Regards,
> 
> Peter.
> 
> On Tue, 29 Jan 2019 at 17:44, Heiner Kallweit <hkallweit1@...il.com> wrote:
>>
>> Hi Peter,
>>
>> I think the vendor driver doesn't enable ASPM by default.
>> So it's worth a try to disable ASPM in the BIOS or via sysfs.
>> A few older systems seem to have issues with ASPM. What kind of
>> system / mainboard are you using? Is the RTL8168 the onboard
>> network chip?
>>
>> Rgds, Heiner
>>
>>
>> On 29.01.2019 07:20, Peter Ceiley wrote:
>>> Hi Heiner,
>>>
>>> Thanks, I'll do some more testing. It might not be the driver - I
>>> assumed it was due to the fact that using the r8168 driver 'resolves'
>>> the issue. I'll see if I can test the r8169.c on top of 4.19 - this is
>>> a good idea.
>>>
>>> Cheers,
>>>
>>> Peter.
>>>
>>> On Tue, 29 Jan 2019 at 17:16, Heiner Kallweit <hkallweit1@...il.com> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> at first glance it doesn't look like a typical driver issue.
>>>> What you could do:
>>>>
>>>> - Test the r8169.c from 4.18 on top of 4.19.
>>>>
>>>> - Check whether disabling ASPM (/sys/module/pcie_aspm) has an effect.
>>>>
>>>> - Bisect between 4.18 and 4.19 to find the offending commit.
>>>>
>>>> Any specific reason why you think the root cause is in the driver and
>>>> not elsewhere in the network subsystem?
>>>>
>>>> Heiner
>>>>
>>>>
>>>> On 28.01.2019 23:10, Peter Ceiley wrote:
>>>>> Hi Heiner,
>>>>>
>>>>> Thanks for getting back to me.
>>>>>
>>>>> No, I don't use jumbo packets.
>>>>>
>>>>> Bandwidth is *generally* good, and iperf results to my NAS provide
>>>>> over 900 Mbits/s in both circumstances. The issue seems to appear when
>>>>> establishing a connection and is most notable, for example, on my
>>>>> mounted NFS shares where it takes seconds (up to 10's of seconds on
>>>>> larger directories) to list the contents of each directory. Once a
>>>>> transfer begins on a file, I appear to get good bandwidth.
>>>>>
>>>>> I'm unsure of the best scientific data to provide you in order to
>>>>> troubleshoot this issue. Running the following
>>>>>
>>>>>     netstat -s |grep retransmitted
>>>>>
>>>>> shows a steady increase in retransmitted segments each time I list the
>>>>> contents of a remote directory. For example, running 'ls' on a
>>>>> directory containing 345 media files under kernel 4.19.18 increased
>>>>> the retransmitted segments by 21, and the 'time' command showed
>>>>> the following:
>>>>>     real    0m19.867s
>>>>>     user    0m0.012s
>>>>>     sys    0m0.036s
>>>>>
>>>>> The same command shows no retransmitted segments running kernel
>>>>> 4.18.16 and 'time' showed:
>>>>>     real    0m0.300s
>>>>>     user    0m0.004s
>>>>>     sys    0m0.007s
>>>>>
>>>>> ifconfig does not show any RX/TX errors nor dropped packets in either case.
>>>>>
>>>>> dmesg XID:
>>>>> [    2.979984] r8169 0000:03:00.0 eth0: RTL8168g/8111g,
>>>>> f8:b1:56:fe:67:e0, XID 4c000800, IRQ 32
>>>>>
>>>>> # lspci -vv
>>>>> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
>>>>>     Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>>     Latency: 0, Cache Line Size: 64 bytes
>>>>>     Interrupt: pin A routed to IRQ 19
>>>>>     Region 0: I/O ports at d000 [size=256]
>>>>>     Region 2: Memory at f7b00000 (64-bit, non-prefetchable) [size=4K]
>>>>>     Region 4: Memory at f2100000 (64-bit, prefetchable) [size=16K]
>>>>>     Capabilities: [40] Power Management version 3
>>>>>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
>>>>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>>         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>>>>         Address: 0000000000000000  Data: 0000
>>>>>     Capabilities: [70] Express (v2) Endpoint, MSI 01
>>>>>         DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s
>>>>> <512ns, L1 <64us
>>>>>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>>>> SlotPowerLimit 10.000W
>>>>>         DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
>>>>>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>>             MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>>>>         DevSta:    CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
>>>>>         LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
>>>>> Latency L0s unlimited, L1 <64us
>>>>>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>>>>         LnkCtl:    ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>>>>             ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>>>>         LnkSta:    Speed 2.5GT/s (ok), Width x1 (ok)
>>>>>             TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>>>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+,
>>>>> OBFF Via message/WAKE#
>>>>>              AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>>>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+,
>>>>> OBFF Disabled
>>>>>              AtomicOpsCtl: ReqEn-
>>>>>         LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>>>>>              Transmit Margin: Normal Operating Range,
>>>>> EnterModifiedCompliance- ComplianceSOS-
>>>>>              Compliance De-emphasis: -6dB
>>>>>         LnkSta2: Current De-emphasis Level: -6dB,
>>>>> EqualizationComplete-, EqualizationPhase1-
>>>>>              EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>>>>     Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>>>>         Vector table: BAR=4 offset=00000000
>>>>>         PBA: BAR=4 offset=00000800
>>>>>     Capabilities: [d0] Vital Product Data
>>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>>>>         Not readable
>>>>>     Capabilities: [100 v1] Advanced Error Reporting
>>>>>         UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>         UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
>>>>> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>>         UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
>>>>> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>>>>         CESta:    RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr-
>>>>>         CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
>>>>>         AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn-
>>>>> ECRCChkCap+ ECRCChkEn-
>>>>>             MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
>>>>>         HeaderLog: 00000000 00000000 00000000 00000000
>>>>>     Capabilities: [140 v1] Virtual Channel
>>>>>         Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
>>>>>         Arb:    Fixed- WRR32- WRR64- WRR128-
>>>>>         Ctrl:    ArbSelect=Fixed
>>>>>         Status:    InProgress-
>>>>>         VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>>>>             Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>>>>             Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=01
>>>>>             Status:    NegoPending- InProgress-
>>>>>     Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>>>>     Capabilities: [170 v1] Latency Tolerance Reporting
>>>>>         Max snoop latency: 71680ns
>>>>>         Max no snoop latency: 71680ns
>>>>>     Kernel driver in use: r8169
>>>>>     Kernel modules: r8169
>>>>>
>>>>> Please let me know if you have any other ideas in terms of testing.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Peter.
>>>>>
>>>>> On Tue, 29 Jan 2019 at 05:28, Heiner Kallweit <hkallweit1@...il.com> wrote:
>>>>>>
>>>>>> On 28.01.2019 12:13, Peter Ceiley wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have been experiencing very poor network performance since Kernel
>>>>>>> 4.19 and I'm confident it's related to the r8169 driver.
>>>>>>>
>>>>>>> I have no issue with kernel versions 4.18 and prior. I am experiencing
>>>>>>> this issue in kernels 4.19 and 4.20 (currently running/testing with
>>>>>>> 4.20.4 & 4.19.18).
>>>>>>>
>>>>>>> If someone could guide me in the right direction, I'm happy to help
>>>>>>> troubleshoot this issue. Note that I have been keeping an eye on one
>>>>>>> issue related to loading of the PHY driver, however, my symptoms
>>>>>>> differ in that I still have a network connection. I have attempted to
>>>>>>> reload the driver on a running system, but this does not improve the
>>>>>>> situation.
>>>>>>>
>>>>>>> Using the proprietary r8168 driver returns my device to proper working order.
>>>>>>>
>>>>>>> lshw shows:
>>>>>>>        description: Ethernet interface
>>>>>>>        product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>>>>        vendor: Realtek Semiconductor Co., Ltd.
>>>>>>>        physical id: 0
>>>>>>>        bus info: pci@...0:03:00.0
>>>>>>>        logical name: enp3s0
>>>>>>>        version: 0c
>>>>>>>        serial:
>>>>>>>        size: 1Gbit/s
>>>>>>>        capacity: 1Gbit/s
>>>>>>>        width: 64 bits
>>>>>>>        clock: 33MHz
>>>>>>>        capabilities: pm msi pciexpress msix vpd bus_master cap_list
>>>>>>> ethernet physical tp aui bnc mii fibre 10bt 10bt-fd 100bt 100bt-fd
>>>>>>> 1000bt-fd autonegotiation
>>>>>>>        configuration: autonegotiation=on broadcast=yes driver=r8169
>>>>>>> duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=192.168.1.25
>>>>>>> latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
>>>>>>>        resources: irq:19 ioport:d000(size=256)
>>>>>>> memory:f7b00000-f7b00fff memory:f2100000-f2103fff
>>>>>>>
>>>>>>> Kind Regards,
>>>>>>>
>>>>>>> Peter.
>>>>>>>
>>>>>> Hi Peter,
>>>>>>
>>>>>> the description "poor network performance" is quite vague, therefore:
>>>>>>
>>>>>> - Can you provide any measurements?
>>>>>> - iperf results before and after
>>>>>> - statistics about dropped packets (rx and/or tx)
>>>>>> - Do you use jumbo packets?
>>>>>>
>>>>>> Also helpful would be the "lspci -vv" output for the network card
>>>>>> and the dmesg output line with the chip XID.
>>>>>>
>>>>>> Heiner
>>>>>
>>>>
>>>
>>
> 
