Message-ID: <87mu06fppx.fsf@nanos.tec.linutronix.de>
Date: Wed, 28 Oct 2020 14:24:26 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Heiner Kallweit <hkallweit1@...il.com>,
Serge Belyshev <belyshev@...ni.sinp.msu.ru>
Cc: Jakub Kicinski <kuba@...nel.org>,
David Miller <davem@...emloft.net>,
Realtek linux nic maintainers <nic_swsd@...ltek.com>,
"netdev\@vger.kernel.org" <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH net] r8169: fix operation under forced interrupt threading
On Wed, Oct 28 2020 at 13:17, Heiner Kallweit wrote:
> On 28.10.2020 12:43, Serge Belyshev wrote:
>>> For several network drivers it was reported that using
>>> __napi_schedule_irqoff() is unsafe with forced threading. One way to
>>> fix this is switching back to __napi_schedule, but then we lose the
>>> benefit of the irqoff version in general. As stated by Eric it doesn't
>>> make sense to make the minimal hard irq handlers in drivers using NAPI
>>> a thread. Therefore ensure that the hard irq handler is never
>>> thread-ified.
>> Hi! This patch actually breaks r8169 with threadirqs on an old box
>> where it was working before:
>>
>> [ 0.000000] DMI: Gigabyte Technology Co., Ltd. GA-MA790FX-DQ6/GA-MA790FX-DQ6, BIOS F7g 07/19/2010
>> ...
>> [ 1.072676] r8169 0000:02:00.0 eth0: RTL8168b/8111b, 00:1a:4d:5d:6b:c3, XID 380, IRQ 18
>> ...
>> [ 8.850099] genirq: Flags mismatch irq 18. 00010080 (eth0) vs. 00002080 (ahci[0000:05:00.0])
>>
>> (error is reported to userspace, interface failed to bring up).
>> Reverting the patch fixes the problem.
>>
> Thanks for the report. On this old chip version MSI is unreliable,
> therefore r8169 falls back to a PCI legacy interrupt. On your system
> this PCI legacy interrupt seems to be shared between network and
> disk. Then the IRQ core tries to threadify the disk interrupt
> (setting IRQF_ONESHOT), whilst the network interrupt doesn't have
> this flag set. This results in the flag mismatch error.
>
> Maybe, if one source of a shared interrupt doesn't allow forced
> threading, this should be applied to the other sources too.
> But this would require a change in the IRQ core, therefore
> +Thomas to get his opinion on the issue.

It's pretty simple. There is no way to fix that at the core
level. Shared interrupts suck, and to make them work halfway correctly the
sharing devices must have matching and non-competing flags.

This matters especially for the threaded vs. non-threaded case. Shared
interrupts are level triggered, so you have a conflict of interests:

The threaded handler requires that the interrupt line stays masked until
the thread has completed, otherwise the system will suffer an interrupt
storm. The non-threaded handler wants it to be unmasked as soon as it has
finished.
Thanks,
tglx