Message-ID: <a4e4902d-5534-6c66-63f5-d88059604c78@gmail.com>
Date:   Mon, 7 Jun 2021 22:39:06 +0200
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Johannes Brandstätter <jbrandst@....eu>
Cc:     netdev@...r.kernel.org
Subject: Re: Load on RTL8168g/8111g stalls network for multiple seconds

On 07.06.2021 15:11, Johannes Brandstätter wrote:
> Hi,
> 
> just the other day I wanted to set up a bridge between an external 2.5G
> RTL8156 USB Ethernet adapter (using r8152) and the built in dual
> RTL8168g/8111g Ethernet chip (using r8169).
> I compiled kernel 5.13.0-rc4 because its r8152 driver supports
> the RTL8156.
> I used the Debian kernel config of 5.10.0-4 as a base and
> left the rest at the defaults.
> 
> So this setup was working the way I wanted it to, but unfortunately
> when running iperf3 against the machine it would rather quickly stall
> all communications on the internal RTL8168g.
> I was still able to communicate fine over the external RTL8156 link
> with the machine.
> Even without the generated network load, it would occasionally become
> stalled.
> 
> The only information I could really gather was that the rx_missed
> counter was going up, and that this kernel message appeared some time
> after the stall happened:
> 
> [81853.129107] r8169 0000:02:00.0 enp2s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> 
> This apparently has to do with the wait for an empty FIFO within the
> r8169 driver.
> 
> Until then, the machine (an UP² board) using the RTL8168g had run
> without any issues for multiple years in different configurations.
> Only bridging showed the issue, and it did so immediately under enough
> network load.
> 
> After many hours of trying out different things, none of which
> made any difference whatsoever, I tried replacing the internal
> RTL8168g with an additional external USB Ethernet adapter which I had
> lying around, with an RTL8153 inside.
> 
> Once the RTL8168g was removed and the RTL8153 added to the bridge, I
> was unable to reproduce the issue.
> Of course, I'd rather make use of the two internal Ethernet
> ports if I somehow can.
> 
> So is there anything I could try to do?
> 
Do you have flow control enabled? As of 5.13-rc, r8169 supports adjusting
pause settings via ethtool. You could play with the settings to see
whether it makes a difference.
The next thing you could check is whether the issue persists when using
the r8168 vendor driver.
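
For experimenting with the pause settings, a minimal sketch of a helper
that builds the relevant ethtool invocations might look like the
following. The interface name enp2s0 is taken from the kernel log line
quoted above; the helper names and the dry-run default are assumptions
made purely for illustration. It relies only on the standard ethtool
options -a (show pause parameters), -A (set pause parameters) and -S
(dump NIC statistics, where a counter such as rx_missed would normally
show up).

```python
import shlex
import subprocess

IFACE = "enp2s0"  # assumption: interface name from the quoted kernel log


def show_pause_cmd(iface):
    """Command that prints the current pause (flow control) settings."""
    return ["ethtool", "-a", iface]


def set_pause_cmd(iface, rx, tx, autoneg=None):
    """Command that sets RX/TX pause; autoneg is left untouched if None."""
    def onoff(flag):
        return "on" if flag else "off"

    cmd = ["ethtool", "-A", iface]
    if autoneg is not None:
        cmd += ["autoneg", onoff(autoneg)]
    cmd += ["rx", onoff(rx), "tx", onoff(tx)]
    return cmd


def stats_cmd(iface):
    """Command that dumps the NIC statistics counters."""
    return ["ethtool", "-S", iface]


def run(cmd, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it (needs root)."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical test sequence: inspect, disable pause, inspect counters.
    run(show_pause_cmd(IFACE))
    run(set_pause_cmd(IFACE, rx=False, tx=False))
    run(stats_cmd(IFACE))
```

Running it as-is only prints the commands. Repeating the iperf3 test
after each pause combination (rx/tx on or off) and comparing the
statistics before and after should show whether flow control has any
influence on the stall.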

However, I'm not an expert in bridging and don't know what difference
it could make whether a NIC is operated standalone or as part of a bridge.

> I'm eyeing a regression test on the kernel's r8168 driver next,
> though I don't know whether there ever was a working version.
> As this is a rather large task and my time is limited, I wanted to seek
> out some help before I go down that route.
> 
> Maybe you could point me in the right direction as to what to try
> next.
> 
> Thanks and best regards,
> Johannes
> 
Heiner
