Message-ID: <20080718103559.GA4368@elte.hu>
Date: Fri, 18 Jul 2008 12:35:59 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [git pull] tracing fixes
* Ingo Molnar <mingo@...e.hu> wrote:
> > > CFLAGS_REMOVE_sched_clock.o = -pg
> > > +CFLAGS_REMOVE_sched.o = -mno-spe -pg
> > > endif
> > >
> >
> > Ingo,
> >
> > Why not trace the scheduler functions? I found a lot of useful
> > information from seeing what functions are being called (namely the
> > latencies caused by the fair scheduler balancing). Not being able to
> > trace sched.c seems to keep a lot of useful data from being accessed.
>
> i agree in general, but it was causing lockups with:
>
> http://redhat.com/~mingo/misc/config-Thu_Jul_17_13_34_52_CEST_2008
>
> note the MAXSMP in the config which sets NR_CPUS to 4096:
>
> CONFIG_NR_CPUS=4096
>
> our randconfig testing stumbled on it. That is a debug helper to "tune
> up the kernel for as large systems as possible" and can bring in
> regressions not normally seen.

ok, figured it out today: the lockups were caused by the NMI watchdog and a
missing NMI protection in cpu_clock(). I've reactivated the topic that
solves this problem area and it all works fine now.

the sched.o change probably made a difference only because it reduced
the cross section between the NMI watchdog and the scheduler, making
lockups less likely during the ftrace self-test. I'll revert it once
tracing/nmisafe is upstream.

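For readers unfamiliar with this class of lockup, here is a minimal
user-space sketch of the general idea. It is not the tracing/nmisafe
change and not kernel code. Every name in it (demo_clock(),
simulate_nmi_while_locked(), the fake counter) is hypothetical. It assumes
a clock routine guarded by a non-recursive lock: if an NMI handler calls
that routine while the interrupted code on the same CPU holds the lock,
spinning would never finish. One way to avoid that is to never spin in the
nested path and to return the last published value instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
static atomic_flag clock_lock = ATOMIC_FLAG_INIT; /* non-recursive lock */
static _Atomic uint64_t last_clock;               /* last value handed out */
static uint64_t fake_counter;                     /* pretend hardware counter */

/* Pretend hardware read: advances by 10 on every call. */
static uint64_t read_raw_counter(void)
{
	fake_counter += 10;
	return fake_counter;
}

/*
 * Clock read that is safe to call from a context that may have
 * interrupted the lock holder. If the lock is already taken we do not
 * spin (that is where the lockup would come from); we fall back to the
 * last published value.
 */
static uint64_t demo_clock(void)
{
	uint64_t now;

	if (atomic_flag_test_and_set_explicit(&clock_lock, memory_order_acquire))
		return atomic_load(&last_clock);

	now = read_raw_counter();
	atomic_store(&last_clock, now);
	atomic_flag_clear_explicit(&clock_lock, memory_order_release);
	return now;
}

/*
 * Simulates an NMI arriving while the lock is held: the "handler" is a
 * plain nested call. A spinning clock routine would hang at this point.
 */
static uint64_t simulate_nmi_while_locked(void)
{
	uint64_t seen;

	while (atomic_flag_test_and_set_explicit(&clock_lock, memory_order_acquire))
		; /* uncontended in this single-threaded demo */

	seen = demo_clock(); /* simulated NMI handler reading the clock */

	atomic_flag_clear_explicit(&clock_lock, memory_order_release);
	return seen;
}
```

Calling demo_clock() once and then simulate_nmi_while_locked() returns the
same value twice, because the nested call sees the lock held and reuses
the published value rather than waiting. The trade-off is a slightly
stale timestamp in the nested path in exchange for never deadlocking.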
Ingo