Message-ID: <f19298770901071110vad914c1w364b5973b35c4c16@mail.gmail.com>
Date: Wed, 7 Jan 2009 22:10:18 +0300
From: "Alexey Zaytsev" <alexey.zaytsev@...il.com>
To: "Ingo Molnar" <mingo@...e.hu>
Cc: "Linus Torvalds" <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
"Nick Piggin" <nickpiggin@...oo.com.au>,
"Peter Zijlstra" <a.p.zijlstra@...llo.nl>
Subject: Re: linux-next: Tree for December 11
On Wed, Jan 7, 2009 at 21:47, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Alexey Zaytsev <alexey.zaytsev@...il.com> wrote:
>
>> And last time I bisected, it pointed to:
>>
>> commit 7317d7b87edb41a9135e30be1ec3f7ef817c53dd
>> Author: Nick Piggin <nickpiggin@...oo.com.au>
>> Date: Tue Sep 30 20:50:27 2008 +1000
>>
>> sched: improve preempt debugging
>>
>>
>> This patch helped me out with a problem I recently had....
>>
>> Basically, when the kernel lock is held, then preempt_count underflow does
>> not get detected until it is released which may be a long time (and
>> arbitrarily, eg at different points it may be rescheduled). If the bkl is
>> released at schedule, the resulting output is actually fairly cryptic...
>>
>> With any other lock that elevates preempt_count, it is illegal to schedule
>> under it (which would get found pretty quickly). bkl allows scheduling with
>> preempt_count elevated, which makes underflows hard to debug.
>>
>> Signed-off-by: Ingo Molnar <mingo@...e.hu>
>>
>> so at least a dumb bisection won't do here.
>
> ah, sorry for being a slow starter, i missed that bit - merge window
> attention span troubles ...
>
> I think the kernel_locked() check added here is plain buggy against IRQ
> contexts: we drop the BKL spinlock and reduce current->kernel_depth
> non-atomically.
>
> So kernel_locked() can become detached from the preempt_count().
>
> Nick, can you think of any better way of still saving this debug check, or
> should we revert it?
>
> Although it seems a bit weird how consistently you seem to be able to
> trigger it - as this seems to be a narrow race. Is there an IRQ storm
> there perhaps, or something widens things up for Qemu to inject an IRQ
> right there?

I'm not sure about the qemu case, but at least on my laptop it happens
somewhere along arch/x86/kernel/cpu/bugs.c:

	printk(KERN_INFO "Checking 'hlt' instruction... ");
	if (!boot_cpu_data.hlt_works_ok) {
		printk("disabled\n");
		return;
	}
	halt();
	halt();
	halt();
	halt();
	printk("OK.\n");

where an interrupt has to arrive in order to wake the cpu from hlt, so
it is no surprise that I'm seeing this on every single boot. ;)
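
To make the window Ingo describes concrete, here is a rough user-space
model of it. This is only a sketch and not kernel code: struct task_model,
the helper names, and the exact form of the underflow check are all
assumptions made for illustration. The release is split into two steps so
that a simulated interrupt can run between them, which is when
kernel_locked() and the preempt count disagree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state involved; not a kernel type. */
struct task_model {
	int preempt_count;	/* models preempt_count() */
	int lock_depth;		/* models the BKL depth, -1 means not held */
};

static bool kernel_locked(const struct task_model *t)
{
	return t->lock_depth >= 0;
}

/*
 * Assumed shape of the debug check: while the BKL is held, one unit of
 * the preempt count is treated as belonging to the lock.
 */
static bool underflow_suspected(const struct task_model *t, int val)
{
	return val > t->preempt_count - (kernel_locked(t) ? 1 : 0);
}

/* Simulated interrupt: raise the count, run the check, lower the count. */
static bool irq_reports_underflow(struct task_model *t)
{
	const int val = 1;
	bool flagged;

	t->preempt_count += val;
	flagged = underflow_suspected(t, val);
	t->preempt_count -= val;
	return flagged;
}

/* Take the modeled lock: both pieces of state are updated together. */
static void lock_model(struct task_model *t)
{
	assert(!kernel_locked(t));
	t->preempt_count++;
	t->lock_depth = 0;
}

/* Release, step 1: the count is dropped, the depth still says "held". */
static void release_drop_count(struct task_model *t)
{
	assert(kernel_locked(t));
	t->preempt_count--;
}

/* Release, step 2: the depth catches up and the state agrees again. */
static void release_clear_depth(struct task_model *t)
{
	t->lock_depth = -1;
}
```

Starting from { .preempt_count = 0, .lock_depth = -1 }, the simulated
interrupt is not flagged before lock_model(), nor after it, nor after both
release steps. It is flagged only between release_drop_count() and
release_clear_depth(), even though nothing has actually underflowed. An
interrupt that is guaranteed to arrive, as with hlt, would hit such a
window reliably if it is open at that point.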