Message-ID: <a5d9929e0907081248q51e1fcfekff4c1b814e512184@mail.gmail.com>
Date: Wed, 8 Jul 2009 20:48:26 +0100
From: Joao Correia <joaomiguelcorreia@...il.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: LKML <linux-kernel@...r.kernel.org>,
Américo Wang <xiyou.wangcong@...il.com>
Subject: Re: [PATCH 1/3] Increase lockdep limits: MAX_STACK_TRACE_ENTRIES
On Wed, Jul 8, 2009 at 7:36 PM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Wed, 2009-07-08 at 13:22 -0400, Dave Jones wrote:
>> On Tue, Jul 07, 2009 at 05:55:01PM +0200, Peter Zijlstra wrote:
>> > On Tue, 2009-07-07 at 16:50 +0100, Joao Correia wrote:
>> >
>> > > >> Yes. Anything 2.6.31 forward triggers this immediately during the init
>> > > >> process, at random places.
>> > > >
>> > > > Not on my machines it doesn't... so I suspect it's something weird in
>> > > > your .config or maybe due to some hardware you have that I don't that
>> > > > triggers different drivers or somesuch.
>> > >
>> > > I am not the only one reporting this, and it happens, for example,
>> > > with a stock .config from a Fedora 11 install.
>> > >
>> > > It may, of course, be a funny driver interaction, yes, but other than
>> > > stripping the box piece by piece, how would one go about debugging
>> > > this otherwise?
>> >
>> > One thing to do is stare (or share) at the output
>> > of /proc/lockdep_chains and see if there are some particularly large
>> > chains in there, or many of the same name or something.
>>
>> I don't see any long chains, just lots of them.
>> 29065 lines on my box that's hitting MAX_STACK_TRACE_ENTRIES
>>
>> > /proc/lockdep_stats might also be interesting, mine reads like:
>>
>> lock-classes: 1518 [max: 8191]
>> direct dependencies: 7142 [max: 16384]
>
> Since we have 7 states per class, and can take one trace per state, and
> also take one trace per dependency, this would yield a max of:
>
> 7*1518+7142 = 17768 stack traces
>
> With the current limit of 262144 stack-trace entries, that would leave
> us with an avg depth of:
>
> 262144/17768 = 14.75
>
> Now since we would not use all states for each class, we'd likely have a
> little more, but that would still suggest we have rather deep stack
> traces on avg.
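
(To make the arithmetic above explicit: a minimal user-space sketch using the
figures quoted from /proc/lockdep_stats; purely illustrative, not kernel code.)

#include <stdio.h>

int main(void)
{
	/* figures quoted above from /proc/lockdep_stats */
	unsigned long classes = 1518;		/* lock-classes */
	unsigned long deps    = 7142;		/* direct dependencies */

	/* one trace per state (7 states) per class, plus one per dependency */
	unsigned long traces  = 7 * classes + deps;	/* 17768 */

	/* current MAX_STACK_TRACE_ENTRIES */
	unsigned long limit   = 262144;

	printf("max stack traces:  %lu\n", traces);
	printf("avg entries/trace: %.2f\n", (double)limit / traces);	/* ~14.75 */

	return 0;
}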
>
> Looking at a lockdep dump hch gave me I can see that that is certainly
> possible; I see tons of very deep callchains.
>
> /me wonders if we're getting significantly deeper..
>
> OK, I guess we can raise this one; does doubling work? That would get us
> around 29 entries per trace.
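
(Concretely, the doubling would be a one-line change along these lines; the
file name and the UL suffix here are assumptions about where the current
262144 limit is defined, so treat this as a sketch rather than the actual
patch.)

 kernel/lockdep_internals.h (assumed location):

-#define MAX_STACK_TRACE_ENTRIES	262144UL
+#define MAX_STACK_TRACE_ENTRIES	524288UL	/* 524288 / 17768 ~= 29.5 per trace */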
>
> Also, Dave, do these distro init scripts still load every module on the
> planet, or are we more sensible these days?
>
> module load/unload cycles are really bad for lockdep resources.
>
> --
>
> As a side note, I see that each and every trace ends with a -1 entry:
>
> ...
> [ 1194.412158] [<c01f7990>] do_mount+0x3c0/0x7c0
> [ 1194.412158] [<c01f7e14>] sys_mount+0x84/0xb0
> [ 1194.412158] [<c01221b1>] syscall_call+0x7/0xb
> [ 1194.412158] [<ffffffff>] 0xffffffff
>
> Which seems to come from:
>
> void save_stack_trace(struct stack_trace *trace)
> {
> 	dump_trace(current, NULL, NULL, 0, &save_stack_ops, trace);
> 	if (trace->nr_entries < trace->max_entries)
> 		trace->entries[trace->nr_entries++] = ULONG_MAX;
> }
> EXPORT_SYMBOL_GPL(save_stack_trace);
>
> commit 006e84ee3a54e393ec6bef2a9bc891dc5bde2843 seems involved...
>
> Anybody got a clue?
>
>
I'm in no way pretending to understand all the details of this system,
but I do know there has been, as of late, an effort to make the init
process do more things at once, i.e. load more modules in parallel to
speed up the process. Can't that be at least partially responsible for
this, even if it's just making some other problem more obvious?
Joao Correia