Message-id: <1251236795.9607.34.camel@lap75545.ornl.gov>
Date:	Tue, 25 Aug 2009 17:46:35 -0400
From:	David Dillow <dave@...dillows.org>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Michael Riepe <michael.riepe@...glemail.com>,
	Michael Buesch <mb@...sch.de>,
	Francois Romieu <romieu@...zoreil.com>,
	Rui Santos <rsantos@...popie.com>,
	Michael Büker <m.bueker@...lin.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

On Tue, 2009-08-25 at 14:24 -0700, Eric W. Biederman wrote:
> David Dillow <dave@...dillows.org> writes:
> > I'm curious how you managed to receive a packet between us clearing
> > all the current sources and reading the current source list
> > continuously for 60+ seconds -- the loop is basically
> 
> > status = get IRQ events from chip
> > while (status) {
> > 	/* process events, start NAPI if needed */
> > 	clear current events from chip
> > 	status = get IRQ events from chip
> > }
> >
> > That seems like a very small race window to consistently hit --
> > especially for long enough to trigger soft lockups.
> 
> Interesting indeed.  When I hit the guard, we had popped out of NAPI
> mode while we were in the loop.  The only way to do that is if
> poll and interrupt were running on different CPUs.

That is the normal case on an SMP machine, but again, that race window
should be fairly small as well -- the stretch from __napi_schedule() to
acking the interrupt source is only a few lines of code, most of which
is in an error case that is skipped. Granted, there may be a fair
number of instructions there if debugging or tracing is on -- I've not
checked -- but even then, hitting that race consistently for 60+
seconds doesn't seem likely.
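
For reference, the stretch in question looks roughly like this -- a
from-memory sketch of rtl8169_interrupt() trimmed to the NAPI hand-off,
not the literal driver source:

	status = RTL_R16(IntrStatus);
	while (status && status != 0xffff) {
		if (status & tp->intr_mask & tp->napi_event) {
			/* keep the chip from re-raising NAPI events */
			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
			tp->intr_mask = ~tp->napi_event;

			if (likely(napi_schedule_prep(&tp->napi)))
				__napi_schedule(&tp->napi);
			/* <-- race window: poll on another CPU can finish
			 * and re-enable events before the ack below */
		}
		RTL_W16(IntrStatus, status);	/* ack current events */
		status = RTL_R16(IntrStatus);
	}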

Being out of NAPI in the guard may be a red herring -- it doesn't tell
us how long you had been out of NAPI when you hit it. If there's a
stuck bit somewhere, then you could have been out of NAPI after the
first cycle and we'd have no way to tell. You could add some variables
to track the status and mask values, and how long ago they changed, to
find out.
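
Something along these lines would do -- purely hypothetical debug
instrumentation, with the variable names made up for illustration:

	/* in rtl8169_interrupt(): remember the last status/mask pair
	 * seen and when it last changed */
	static u16 last_status, last_mask;
	static unsigned long last_change;	/* jiffies */

	if (status != last_status || tp->intr_mask != last_mask) {
		last_status = status;
		last_mask = tp->intr_mask;
		last_change = jiffies;
	}

	/* then, when the guard trips: */
	printk(KERN_DEBUG "r8169: status %04x mask %04x unchanged for %u ms\n",
	       last_status, last_mask,
	       jiffies_to_msecs(jiffies - last_change));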

> I am a bit curious about TxDescUnavail.  Perhaps we had a temporary
> memory shortage and that is what was screaming?  I don't think we do
> anything at all with that state.

TxDescUnavail is normal -- it means the chip finished sending everything
we asked it to.
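
For orientation, the interrupt status bits in r8169.c look roughly like
this (values from memory -- double-check against the source):

	enum rtl_register_content {
		/* InterruptStatusBits (subset) */
		SYSErr        = 0x8000,
		TxDescUnavail = 0x0080,	/* no Tx descriptors left to fetch */
		RxFIFOOver    = 0x0040,
		LinkChg       = 0x0020,
		RxOverflow    = 0x0010,
		TxErr         = 0x0008,
		TxOK          = 0x0004,
		RxErr         = 0x0002,
		RxOK          = 0x0001,
	};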

> Perhaps the flaw here is simply not masking TxDescUnavail while we are
> in NAPI mode?

No, we never enable it on the chip, and it gets masked out when we
decide if we want to go to NAPI mode -- it is not set in tp->napi_event:

	if (status & tp->intr_mask & tp->napi_event) {
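
Both masks come from the per-chip config table at probe time; from
memory, the RTL_CFG_0 entry is roughly:

	/* rtl_cfg_infos[] -- note TxDescUnavail is absent from both */
	.intr_event = SYSErr | LinkChg | RxOverflow |
		      RxFIFOOver | TxErr | TxOK | RxOK | RxErr,
	.napi_event = RxFIFOOver | TxErr | TxOK | RxOK | RxOverflow,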


