Date:	Tue, 25 Aug 2009 17:46:35 -0400
From:	David Dillow <dave@...dillows.org>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Michael Riepe <michael.riepe@...glemail.com>,
	Michael Buesch <mb@...sch.de>,
	Francois Romieu <romieu@...zoreil.com>,
	Rui Santos <rsantos@...popie.com>,
	Michael Büker <m.bueker@...lin.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

On Tue, 2009-08-25 at 14:24 -0700, Eric W. Biederman wrote:
> David Dillow <dave@...dillows.org> writes:
> > I'm curious how you managed to receive a packet between us clearing all
> > the current sources and reading the current source list continuously for
> > 60+ seconds -- the loop is basically
> >
> > status = get IRQ events from chip
> > while (status) {
> > 	/* process events, start NAPI if needed */
> > 	clear current events from chip
> > 	status = get IRQ events from chip
> > }
> >
> > That seems like a very small race window to consistently hit --
> > especially for long enough to trigger soft lockups.
> 
> Interesting indeed.  When I hit the guard we had popped out of NAPI
> mode while we were in the loop.  The only way to do that is if
> poll and interrupt were running on different cpus.

That is the normal case on an SMP machine, but again that race window
should be fairly small as well -- the path from __napi_schedule() to
the acking of the interrupt source is only a few lines of code, most of
which is in an error case that is skipped. Granted there may be a fair
number of instructions there if debugging or tracing is on -- I've not
checked -- but even then hitting that race consistently for 60+ seconds
doesn't seem likely.
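
For concreteness, that stretch of the handler is roughly the following
(a simplified sketch from memory, not the exact patched source; error
paths trimmed):

	status = RTL_R16(IntrStatus);
	while (status && status != 0xffff) {
		handled = 1;

		/* ... SYSErr / link-change error cases elided ... */

		if (status & tp->intr_mask & tp->napi_event) {
			/* Mask the NAPI events and hand them to poll. */
			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
			tp->intr_mask = ~tp->napi_event;

			if (likely(napi_schedule_prep(&tp->napi)))
				__napi_schedule(&tp->napi);
		}

		/* The window: a poll on another CPU can complete and
		 * restore tp->intr_mask between the schedule above and
		 * the ack below. */
		RTL_W16(IntrStatus, status);	/* clear current events */
		status = RTL_R16(IntrStatus);	/* pick up new ones */
	}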

Being out of NAPI in the guard may be a red herring -- it doesn't tell
us how long you were out of NAPI when you hit it. If there's a stuck bit
somewhere, then you could have been out of NAPI after the first cycle
and we'd have no way to tell. You could add some variables to keep track
of the status and mask values, and how long ago they last changed, to see
what is actually going on.
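
Something like this would do it -- an untested sketch, and the dbg_*
names are made up for illustration:

	/* hypothetical debug fields in struct rtl8169_private */
	u16		dbg_status;
	u16		dbg_mask;
	unsigned long	dbg_stamp;		/* jiffies at last change */

	/* on each pass of the interrupt loop */
	if (status != tp->dbg_status || tp->intr_mask != tp->dbg_mask) {
		tp->dbg_status = status;
		tp->dbg_mask = tp->intr_mask;
		tp->dbg_stamp = jiffies;
	}

	/* in the guard, when the loop trips */
	printk(KERN_WARNING "%s: status %04x mask %04x stable for %u ms\n",
	       dev->name, tp->dbg_status, tp->dbg_mask,
	       jiffies_to_msecs(jiffies - tp->dbg_stamp));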

> I am a bit curious about TxDescUnavail.  Perhaps we had a temporary
> memory shortage and that is what was screaming?  I don't think we do
> anything at all with that state.

TxDescUnavail is normal -- it means the chip finished sending everything
we asked it to.

> Perhaps the flaw here is simply not masking TxDescUnavail while we are
> in NAPI mode?

No, we never enable it on the chip, and it gets masked out when we
decide if we want to go to NAPI mode -- it is not set in tp->napi_event:

	if (status & tp->intr_mask & tp->napi_event) {
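
For reference, both masks come from the per-chip config table, which
from memory looks something like this (approximate -- check
rtl_cfg_infos[] in the source):

	/* TxDescUnavail is in neither mask: it never generates an
	 * interrupt and is filtered out of the NAPI check above. */
	.intr_event	= SYSErr | LinkChg | RxOverflow | RxFIFOOver |
			  TxErr | TxOK | RxOK | RxErr,
	.napi_event	= RxFIFOOver | TxErr | TxOK | RxOK | RxOverflow,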


