Message-ID: <20080721140128.GA32245@elte.hu>
Date:	Mon, 21 Jul 2008 16:01:28 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Evgeniy Polyakov <johnpol@....mipt.ru>
Cc:	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Vegard Nossum <vegard.nossum@...il.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>, cl@...ux-foundation.org,
	davem@...emloft.net
Subject: Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison
	overwritten


* Evgeniy Polyakov <johnpol@....mipt.ru> wrote:

> On Mon, Jul 21, 2008 at 01:55:55PM +0200, Ingo Molnar (mingo@...e.hu) wrote:
> > > > I could try running tests with netconsole deactivated, if you think 
> > > > that's a worthwhile line of probing this problem. (although that 
> > > > would make me do blind tests in essence - having kernel log output 
> > > > is really essential.)
> > > 
> > > Let's try this way first. If the system continues to crash, we will 
> > > add some debug options in various paths. Existing reports do not 
> > > contain enough information unfortunately, so we will not lose too 
> > > much.
> > 
> > ok. I've turned off netconsole - 8 successful bootups in a row so far. 
> > The box is a slow booter/builder with an 8 kernels/hour test throughput, 
> > so if everything goes fine we should have meaningful results in about 10 
> > hours.
> > 
> > ( there are other, faster testboxes in -tip testing with 33 kernels/hour 
> >   build+boot throughput where we'd have to wait only 2 hours - but as 
> >   per Murphy's law they don't trigger this bug ;-)
> 
> Since 2.6.25 there has been only a single change in netpoll.c:
> f5184d267c1aedb9b7a8cc44e08ff6b8d382c3b5
> Which looks innocent.
> 
> Is your driver e1000 or e1000e? Can you check the other one?

I cannot check e1000 anymore due to this upstream commit:

| d03157babed7424f5391af43200593768ce69c9a is first bad commit
| commit d03157babed7424f5391af43200593768ce69c9a
| Author: Auke Kok <auke-jan.h.kok@...el.com>
| Date:   Sun Jun 22 15:21:29 2008 -0700
|
|    e1000: remove PCI Express device IDs
|
|    We do not want to prolong the situation much longer that e1000
|    and e1000e support these devices at the same time. As a result,
|    take out the bandage that was added for the interim period
|    and remove all the PCI Express device IDs from e1000.

but yes, this box was using e1000 for a long time, and recently migrated 
to e1000e. I'm not sure there's any connection, do you think there is?

	Ingo