Message-ID: <a1aac418-bdd4-48f4-ad12-68ba82dafcd6@gmail.com>
Date: Tue, 14 May 2024 21:47:51 +0100
From: Ken Milmore <ken.milmore@...il.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: Heiner Kallweit <hkallweit1@...il.com>,
 Alexander Lobakin <aleksander.lobakin@...el.com>,
 Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 David Miller <davem@...emloft.net>,
 Realtek linux nic maintainers <nic_swsd@...ltek.com>,
 Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net 2/2] r8169: disable interrupts also for GRO-scheduled
 NAPI

These chips seem to be known for being badly documented and quirky, so maybe
an empirical approach is called for.

I have briefly surveyed various driver sources available from the vendor and
from the BSDs, and AFAICT they all follow the pattern of unconditionally
masking interrupts, then clearing the status bits, then processing the rings
and then re-enabling interrupts in that order. In this respect, the Linux
driver may have become an outlier in that it doesn't *always* mask interrupts
before acking them, and that it may process the rings while IRQs are unmasked.
I'm not saying that these are necessarily problems, but...
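To make the ordering concrete, here is a minimal sketch of the "mask first" pattern against a toy register model. Everything in it is hypothetical: `struct fake_nic`, its fields, and the helper names are invented for illustration and are not taken from any of the drivers above. The event bit values are illustrative, and the status register is assumed to be write-1-to-clear.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative event bits; hypothetical model, not a real register map. */
#define EV_RX_OK 0x0001
#define EV_TX_OK 0x0004
#define EV_ALL   (EV_RX_OK | EV_TX_OK)

/* Toy model of the two interrupt registers. */
struct fake_nic {
    uint16_t intr_mask;        /* enabled event sources */
    uint16_t intr_status;      /* pending events (assumed write-1-to-clear) */
    unsigned rings_processed;  /* counts calls to process_rings() */
};

/* Stand-in for RX/TX ring processing. */
static void process_rings(struct fake_nic *nic)
{
    /* In this pattern the rings are only touched while IRQs are masked. */
    assert(nic->intr_mask == 0);
    nic->rings_processed++;
}

/* Mask, ack, process, re-enable, in that order. */
static void service_mask_first(struct fake_nic *nic)
{
    uint16_t status;

    nic->intr_mask = 0;                                         /* 1. mask   */
    status = nic->intr_status;
    nic->intr_status = (uint16_t)(nic->intr_status & ~status);  /* 2. ack    */
    process_rings(nic);                                         /* 3. rings  */
    nic->intr_mask = EV_ALL;                                    /* 4. unmask */
}
```

The sketch only pins down the sequence; whether the steps run in the hard IRQ handler or in a bottom half is deliberately left out.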

There are some differences in how the drivers work: the FreeBSD one masks
interrupts straight away but defers writing the status register to the bottom
half, whereas the OpenBSD driver seems to do everything in the IRQ handler. These
drivers also tend to flip between using "hard IRQs" and the built-in timer,
which complicates things. But in terms of the "mask first" approach, I think
they all look equivalent.

https://sources.debian.org/src/r8168/8.053.00-1/src/r8168_n.c/
https://sources.debian.org/src/r8125/9.011.00-4/src/r8125_n.c/
https://github.com/openbsd/src/blob/master/sys/dev/ic/re.c
https://github.com/openbsd/src/blob/master/sys/dev/pci/if_rge.c
https://cgit.freebsd.org/src/tree/sys/dev/re/if_re.c


I also found this ancient netdev thread, which describes behaviour startlingly
similar to the present issue. It seems that people have been here before...

https://lore.kernel.org/netdev/1242328457.32579.12.camel@lap75545.ornl.gov/
"I added some code to print the irq status when it hangs, and it shows
0x0085, which is RxOK | TxOK | TxDescUnavail, which makes me think we've
lost an MSI-edge interrupt somehow."

https://lore.kernel.org/netdev/1243042174.3580.23.camel@obelisk.thedillows.org/
"The 8169 chip only generates MSI interrupts when all enabled event
sources are quiescent and one or more sources transition to active. If
not all of the active events are acknowledged, or a new event becomes
active while the existing ones are cleared in the handler, we will not
see a new interrupt."
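
The behaviour described in that second quote can be modelled in a few lines. This is only a sketch under stated assumptions: `struct msi_model` and all function names are hypothetical, every event source is assumed to be enabled, and the `late` parameter is an invented way of injecting an event that becomes active between the handler's status read and its ack. The bit values are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative event bits; hypothetical model, not a real register map. */
#define EV_RX_OK 0x0001
#define EV_TX_OK 0x0004

struct msi_model {
    uint16_t status;     /* currently active event sources */
    unsigned msi_count;  /* MSI messages generated so far */
};

/* An MSI is generated only on the quiescent -> active transition. */
static void raise_event(struct msi_model *m, uint16_t ev)
{
    int was_quiescent = (m->status == 0);

    assert(ev != 0);
    m->status |= ev;
    if (was_quiescent)
        m->msi_count++;
}

static void ack_events(struct msi_model *m, uint16_t ev)
{
    m->status = (uint16_t)(m->status & ~ev);
}

/* Acks only what it read; an event arriving in between stays active. */
static void handler_ack_snapshot(struct msi_model *m, uint16_t late)
{
    uint16_t snapshot = m->status;

    raise_event(m, late);      /* no MSI: sources are not quiescent */
    ack_events(m, snapshot);
}

/* Same race, but keeps acking until every source is quiescent. */
static void handler_ack_until_quiescent(struct msi_model *m, uint16_t late)
{
    uint16_t snapshot = m->status;

    raise_event(m, late);
    ack_events(m, snapshot);
    while (m->status != 0)
        ack_events(m, m->status);
}
```

In this model, after `handler_ack_snapshot()` loses the race the status never returns to zero, so later events generate no further MSI. That matches the hang described in the quoted thread, but it is a toy model and not a statement about the actual hardware.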
