linux-kernel - Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <m1k50r8u4p.fsf@fess.ebiederm.org>
Date:	Tue, 25 Aug 2009 14:24:06 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	David Dillow <dave@...dillows.org>
Cc:	Michael Riepe <michael.riepe@...glemail.com>,
	Michael Buesch <mb@...sch.de>,
	Francois Romieu <romieu@...zoreil.com>,
	Rui Santos <rsantos@...popie.com>,
	Michael Büker <m.bueker@...lin.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

David Dillow <dave@...dillows.org> writes:

> On Tue, 2009-08-25 at 13:22 -0700, Eric W. Biederman wrote:
>> David Dillow <dave@...dillows.org> writes:
>> > I'm not real happy with the interrupt handling in the driver; it makes a
>> > certain amount of sense to split the MSI vs non-MSI interrupt cases out.
>> > It also means another pass through re-auditing things against the vendor
>> > driver. That's more work than I'm able to commit to at the moment.
>> >
>> > I've not been able to reproduce it locally on my r8169d, running for ~30
>> > minutes straight at full speed. I've not tried running it in UP, though.
>> > Perhaps I can do that tomorrow.
>> >
>> > Here's a possible patch to mask the NAPI events while we're running in
>> > NAPI mode. I'm not sure it is going to help, since the intr_mask was
>> > 0xffff when you hit the loop guard, so I left it in for now.
>> 
>> Interesting.
>> 
>> If I understand this correctly the situation is that we have on the
>> chip there is correct logic for a level triggered interrupt and that
>> the msi logic sits on it and sends an event when the interrupt signal
>> goes high, but when we acknowledge some bits but not all it does not
>> send another interrupt.
>
> Correct, we have to acknowledge all current outstanding event sources
> before we get another MSI interrupt. It looks like the MSI interrupt is
> triggered on the edge transition of a logical OR of all irq sources.
>
>> Baring playing games with what version of the card has working logic
>> and which does not we seem to have to simple choices (if we don't want
>> to loop possibly forever).
>> - Don't use the msi logic on this card.
>> - Move all of the logic into rtl8169_poll and only come out of NAPI
>>   mode when we have caught up with all of the interrupt work.
>> 
>> Is that how you understand the hardware issue you are trying to work
>> around?
>
> That's how I understood the issue I was working around with the
> problematic patch, but I thought I had covered both issues fairly well
> without having to split the handling any further -- we ACK all existing
> sources each pass through the loop, so we'll get a new interrupt on the
> unmasked events, but not on ones we've masked out for NAPI until NAPI
> completes and unmasks them.


> I'm curious how you managed to receive an packet between us clearing the
> all current sources and reading the current source list continuously for
> 60+ seconds -- the loop is basically


> status = get IRQ events from chip
> while (status) {
> 	/* process events, start NAPI if needed */
> 	clear current events from chip
> 	status = get IRQ events from chip
> }
>
> That seems like a very small race window to consistently hit --
> especially for long enough to trigger soft lockups.

Interesting indeed.  When I hit the guard we had popped out of NAPI
mode while we were in the loop.  The only way to do that is if
poll and interrupt were running on different cpus.

I am a bit curious about TxDescUnavail.  Perhaps we had a temporary
memory shortage and that is what was screaming?  I don't think we do
anything at all with that state.

Perhaps the flaw here is simply not masking TxDescUnavail while we are
in NAPI mode?

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/