lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 17 Oct 2018 20:12:48 +0200
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Holger Hoffstätte <holger@...lied-asynchrony.com>,
        David Miller <davem@...emloft.net>,
        Realtek linux nic maintainers <nic_swsd@...ltek.com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net] r8169: fix NAPI handling under high load

On 16.10.2018 23:17, Holger Hoffstätte wrote:
> On 10/16/18 22:37, Heiner Kallweit wrote:
>> rtl_rx() and rtl_tx() are called only if the respective bits are set
>> in the interrupt status register. Under high load NAPI may not be
>> able to process all data (work_done == budget) and it will schedule
>> subsequent calls to the poll callback.
>> rtl_ack_events() however resets the bits in the interrupt status
>> register, therefore subsequent calls to rtl8169_poll() won't call
>> rtl_rx() and rtl_tx() - chip interrupts are still disabled.
> 
> Very interesting! Could this be the reason for the mysterious
> hangs & resets we experienced when enabling BQL for r8169?
> They happened more often with TSO/GSO enabled and several people
> attempted to fix those hangs unsuccessfully; it was later reverted
> and has been since then (#87cda7cb43).
> If this bug has been there "forever" it might be tempting to
> re-apply BQL and see what happens. Any chance you could give that
> a try? I'll gladly test patches, just like I'll run this one.
> 
After reading through the old mail threads regarding BQL on r8169
I don't think the fix here is related.
It seems that BQL on r8169 worked fine for most people, just one
had problems on one of his systems. I assume the issue was specific
to one chip version. From the ~ 50 chip versions supported by
r8169 more or less each one requires its own quirks.
If we're lucky the chip-version-specific issue has been fixed in
the meantime and we can simply apply the old BQL patch again.

If it turns out that certain chip versions can't be used with BQL,
then we can disable the feature for these chip versions instead
of removing the feature completely.

I will apply the old BQL patch and see how it's on my system
(with GRO and SG enabled).

> cheers
> Holger
> 
Heiner

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ