Message-ID: <20090821191759.GA8607@elte.hu>
Date: Fri, 21 Aug 2009 21:17:59 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-tip-commits@...r.kernel.org,
Arjan van de Ven <arjan@...radead.org>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Dave Jones <davej@...hat.com>,
Kyle McMartin <kyle@...artin.ca>, Greg KH <gregkh@...e.de>,
linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...hat.com,
torvalds@...ux-foundation.org, catalin.marinas@....com,
a.p.zijlstra@...llo.nl, jens.axboe@...cle.com, fweisbec@...il.com,
stable@...nel.org, srostedt@...hat.com, tglx@...utronix.de
Subject: Re: [tip:tracing/urgent] tracing: Fix too large stack usage in
do_one_initcall()
* Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Fri, 21 Aug 2009 13:14:50 +0200 Ingo Molnar <mingo@...e.hu> wrote:
>
> > ...
> >
> > btw., it will just take two more features like kmemleak to
> > trigger hard to debug stack overflows again on 32-bit. We are
> > right at the edge and this situation is not really fixable in a
> > reliable way anymore.
> >
> > So i think we should be more drastic and solve the real problem:
> > we should drop 4K stacks and 8K combo-stacks on 32-bit, and go
> > exclusively to 8K split stacks on 32-bit.
>
> We seem to have overrun an 8k stack in
> http://bugzilla.kernel.org/show_bug.cgi?id=14029
Btw., that report does not apply here, because 8K stacks on 32-bit
are 'shared' - i.e. process, softirq, hardirq and NMI context all
nest on top of each other, on the same 8K stack.
What i suggested in my mail was to up the 4K stacks option to 8K,
and to keep the separate IRQ/softirq stacks. (similar to 64-bit)
This is quite different from the 8K shared-stack option on 32-bit.
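To make the shared vs. split difference concrete, here is a minimal sketch in Python. The per-context byte counts are hypothetical, picked only for illustration (they are not measurements), and the model assumes an NMI lands on whichever stack is active at the time.

```python
# Hypothetical worst-case stack usage per context, in bytes.
# Illustrative numbers only, not measured on any kernel.
USAGE = {"process": 3500, "softirq": 2000, "hardirq": 1800, "nmi": 900}


def shared_stack_worst_case(usage):
    """Shared stack: every context nests on the same stack, so depths add up."""
    return sum(usage.values())


def split_stack_worst_case(usage):
    """Split stacks: process, softirq and hardirq each get their own stack.

    Modeling assumption: an NMI runs on whichever stack is current,
    so its usage is charged to each of the three stacks.
    """
    nmi = usage["nmi"]
    return {ctx: usage[ctx] + nmi for ctx in ("process", "softirq", "hardirq")}


def overflows(depth, stack_size):
    """True if the worst-case depth does not fit in the given stack size."""
    return depth > stack_size


if __name__ == "__main__":
    shared = shared_stack_worst_case(USAGE)
    print("8K shared:", shared, "overflow" if overflows(shared, 8192) else "ok")
    for ctx, depth in split_stack_worst_case(USAGE).items():
        print("8K split", ctx, depth, "overflow" if overflows(depth, 8192) else "ok")
```

With these made-up numbers the shared 8K stack overflows once all four contexts nest, while each separate 8K stack has plenty of room left.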
Plus, i looked at the oops cited above, and it's in the idle thread.
The idle thread does no process-level work, and if there's a stack
overflow it generally does not trigger in the idle thread (because
it has much less stack pressure).
Still, the stack could have overrun in IRQ context - but a more
likely scenario would be some other memory corruption clobbering the
stack-overflow signature at the end of the idle task's stack.
> Do we have a max-stack-depth tracer widget btw?
we do have an ftrace plugin for it, yes. But it has a high cost (it
traces all the time to find the maximum), so i'm not sure how
realistic it would be to integrate it into the kerneloops daemon,
for example.
It could certainly be done - a sufficiently enabled kernel has to be
built (perhaps a kernel-debug package) and the
/debug/tracing/max_stack_trace value can be monitored for 'too
large' values.
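A rough sketch of what such monitoring could look like, in Python. It assumes the tracing file holds a single integer byte count; the path is passed in by the caller, since the exact file name depends on the kernel version and on where debugfs is mounted. The function names and the threshold are hypothetical.

```python
def read_max_stack(path):
    """Read the maximum observed stack depth (bytes) from a tracing file.

    Assumption: the file contains one integer, optionally followed by a newline.
    """
    with open(path) as f:
        return int(f.read().strip())


def check_max_stack(path, limit_bytes):
    """Return (depth, exceeded), where exceeded is True if depth > limit_bytes."""
    depth = read_max_stack(path)
    return depth, depth > limit_bytes
```

A daemon could call check_max_stack() periodically and report when the second value is True. The stack tracer still has to be enabled for the number to be updated, so the tracing cost mentioned above applies.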
> > I.e. the stack size will be 'unified' too between 64-bit and
> > 32-bit to a certain degree: process stacks will be 8K on both
> > 64-bit and 32-bit x86, IRQ stacks will be separate. (on 64-bit
> > we also have the IST stacks for certain exceptions that further
> > isolate things)
> >
> > This will simplify the 32-bit situation quite a bit, remove a
> > contentious config option and make the kernel more robust in
> > general. 8K combo stacks are not safe due to irq nesting and 4K
> > isolated stacks are not enough. 8K isolated stacks are the way
> > to go.
> >
> > Opinions?
>
> I wouldn't lose any sleep over it.
>
> I bet it would be sufficient to have 4k interrupt stacks though.
>
> My main concern would be maintenance. Over time we'll chew more
> and more stack space and eventually we'll get into trouble again.
> What means do we have for holding the line at 8k, and even
> improving things?
We already have the '8K line'. People who see stack overflows
_already_ go to the 8K stack kernel. The nasty thing about that is
that while it might solve their immediate problem, 8K shared stack
is still fragile in 'once in a blue moon' scenarios.
That's why i'm thinking about introducing similar parameters across
the board on x86: 8K process stack on both 32-bit and 64-bit, and 8K
(or larger) IRQ/softirq stacks.
This will have the added benefit of pushing the 'line of defense'
into the 64-bit space - which generally gets much more (and much
earlier) stress-testing in server shops - so we could find the
nastier bugs before they hit the 32-bit desktop en masse.
Ingo