Message-ID: <20090914151958.GA17791@elte.hu>
Date: Mon, 14 Sep 2009 17:19:58 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Martin Schwidefsky <schwidefsky@...ibm.com>
Cc: Jens Axboe <jens.axboe@...cle.com>,
John Stultz <johnstul@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Mike Galbraith <efault@....de>,
Con Kolivas <kernel@...ivas.org>, linux-kernel@...r.kernel.org
Subject: Re: [crash, bisected] Re: clocksource: Resolve cpu hotplug dead lock with TSC unstable
* Martin Schwidefsky <schwidefsky@...ibm.com> wrote:
> On Fri, 11 Sep 2009 09:37:47 +0200
> Ingo Molnar <mingo@...e.hu> wrote:
>
> >
> > * Ingo Molnar <mingo@...e.hu> wrote:
> >
> > >
> > > * Ingo Molnar <mingo@...e.hu> wrote:
> > >
> > > >
> > > > * Jens Axboe <jens.axboe@...cle.com> wrote:
> > > >
> > > > > I went to try -tip btw, but it crashes on boot. Here's the
> > > > > backtrace, typed manually, it's crashing in
> > > > > queue_work_on+0x28/0x60.
> > > > >
> > > > > Call Trace:
> > > > > queue_work
> > > > > schedule_work
> > > > > clocksource_mark_unstable
> > > > > mark_tsc_unstable
> > > > > check_tsc_sync_source
> > > > > native_cpu_up
> > > > > relay_hotcpu_callback
> > > > > > do_fork_idle
> > > > > _cpu_up
> > > > > cpu_up
> > > > > kernel_init
> > > > > kernel_thread_helper
> > > >
> > > > hm, that looks like an old bug i fixed days ago via:
> > > >
> > > > 00a3273: Revert "x86: Make tsc=reliable override boot time stability checks"
> > > >
> > > > Have you tested tip:master - do you still know which sha1?
> > >
> > > Ok, i reproduced it on a testbox and bisected it, the crash is
> > > caused by:
> > >
> > > 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16 is first bad commit
> > > commit 7285dd7fd375763bfb8ab1ac9cf3f1206f503c16
> > > Author: Thomas Gleixner <tglx@...utronix.de>
> > > Date: Fri Aug 28 20:25:24 2009 +0200
> > >
> > > clocksource: Resolve cpu hotplug dead lock with TSC unstable
> > >
> > > Martin Schwidefsky analyzed it:
> > >
> > > I've reverted it in tip/master for now.
> >
> > and that uncovers the circular locking bug that this commit was
> > supposed to fix ...
> >
> > Martin?
>
> This patch should fix the obvious problem that the watchdog_work
> structure is not yet initialized if the clocksource watchdog is not
> running yet.
> --
> Subject: [PATCH] clocksource: statically initialize watchdog workqueue
>
> From: Martin Schwidefsky <schwidefsky@...ibm.com>
>
> The watchdog timer is started after the watchdog clocksource and at least
> one watched clocksource have been registered. The clocksource work element
> watchdog_work is initialized just before the clocksource timer is started.
> This is too late for the clocksource_mark_unstable call from native_cpu_up.
> To fix this use a static initializer for watchdog_work.
>
> Signed-off-by: Martin Schwidefsky <schwidefsky@...ibm.com>
> ---
> kernel/time/clocksource.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/kernel/time/clocksource.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/clocksource.c
> +++ linux-2.6/kernel/time/clocksource.c
> @@ -123,10 +123,12 @@ static DEFINE_MUTEX(clocksource_mutex);
> static char override_name[32];
>
> #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
> +static void clocksource_watchdog_work(struct work_struct *work);
> +
> static LIST_HEAD(watchdog_list);
> static struct clocksource *watchdog;
> static struct timer_list watchdog_timer;
> -static struct work_struct watchdog_work;
> +static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
> static DEFINE_SPINLOCK(watchdog_lock);
> static cycle_t watchdog_last;
> static int watchdog_running;
> @@ -230,7 +232,6 @@ static inline void clocksource_start_wat
> {
> if (watchdog_running || !watchdog || list_empty(&watchdog_list))
> return;
> - INIT_WORK(&watchdog_work, clocksource_watchdog_work);
> init_timer(&watchdog_timer);
> watchdog_timer.function = clocksource_watchdog;
> watchdog_last = watchdog->read(watchdog);
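
As an illustration of the initialization-order problem the quoted patch describes, here is a minimal userspace sketch. Everything in it (struct work_item, WORK_ITEM_INIT, run_work, late_init) is a hypothetical stand-in and not the kernel's workqueue API; it only mirrors the idea that a statically initialized work item is usable before any setup function has run, while a run-time initialized one is not. The sketch returns an error where the quoted backtrace shows a crash.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct work_struct. */
struct work_item {
	void (*func)(struct work_item *w);	/* handler; NULL until set up */
};

/* Compile-time initializer, similar in spirit to DECLARE_WORK(). */
#define WORK_ITEM_INIT(fn) { .func = (fn) }

/* Forward declaration so the static initializer can name the handler,
 * as the quoted patch does for clocksource_watchdog_work(). */
static void watchdog_handler(struct work_item *w);

/* Run-time initialized: all-zero until late_init() is called. */
static struct work_item late_work;

/* Statically initialized: valid from program start. */
static struct work_item early_work = WORK_ITEM_INIT(watchdog_handler);

static int handled;	/* number of times the handler has run */

static void watchdog_handler(struct work_item *w)
{
	(void)w;
	handled++;
}

/* Stand-in for the late INIT_WORK() call that the patch removes. */
static void late_init(void)
{
	late_work.func = watchdog_handler;
}

/* Toy "queue" that runs the item immediately.
 * Returns 0 on success, -1 if the item was never initialized. */
static int run_work(struct work_item *w)
{
	if (w->func == NULL)
		return -1;
	w->func(w);
	return 0;
}
```

Calling run_work(&late_work) before late_init() fails, while run_work(&early_work) succeeds at any point. That is the ordering hazard on the early clocksource_mark_unstable path, reduced to a toy.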

Now another box crashes during bootup. Reverting these two commits:

 f79e025: clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash
 7285dd7: clocksource: Resolve cpu hotplug dead lock with TSC unstable

allows me to boot it.

The config is a plain 32-bit defconfig.

Ingo