[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080106145227.GB3117@ami.dom.local>
Date: Sun, 6 Jan 2008 15:52:27 +0100
From: Jarek Poplawski <jarkao2@...il.com>
To: Torsten Kaiser <just.for.lkml@...glemail.com>
Cc: Herbert Xu <herbert@...dor.apana.org.au>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Neil Brown <neilb@...e.de>,
"J. Bruce Fields" <bfields@...ldses.org>, netdev@...r.kernel.org,
Tom Tucker <tom@...ngridcomputing.com>
Subject: Re: 2.6.24-rc6-mm1
On Sun, Jan 06, 2008 at 11:30:48AM +0100, Torsten Kaiser wrote:
...
> I think this bug is highly timing dependent. Its not always the same
> package that dies and as this is a SMP system I would guess two CPUs
> using the same data will trigger this.
> And using the poison-option will definitily slow the system down and
> mess up the timings.
Of course it looks like using the same data, but it seems there is no
reason to think it needs the same time: e.g. some timer or workqueue
could retrigger after it's supposed to be killed. Any additional
debugging/poisonning might help to see it earlier, so this should be
safer for your system, but, most probably this would show data from
the damaged side, so not necessarily very helpful.
> What also speaks against the 'safer' offsets is, that after adding my
> notfreed-byte to skbuff the bug still triggered in the same way.
We are not even sure skbuffs were directly affected by this or they
were incorrectly freed because of other structures beeing damaged?
IMHO, e.g. starting your system with limited memory should cause
faster memory reclaiming, and thus more often triggering of these bugs,
but of course I can be wrong.
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists