[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f19298770901070935u6396a367j86937c9394518877@mail.gmail.com>
Date: Wed, 7 Jan 2009 20:35:16 +0300
From: "Alexey Zaytsev" <alexey.zaytsev@...il.com>
To: "Ingo Molnar" <mingo@...e.hu>
Cc: "Linus Torvalds" <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
"Nick Piggin" <nickpiggin@...oo.com.au>,
"Peter Zijlstra" <a.p.zijlstra@...llo.nl>
Subject: Re: linux-next: Tree for December 11
On Wed, Jan 7, 2009 at 20:17, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
>> On Wed, 7 Jan 2009, Alexey Zaytsev wrote:
>> >
>> > Almost a month later, the warning is still there, and now also in
>> > Linus' git. Am I the only one who sees it?
>>
>> Possibly. But that may be because most people don't have DEBUG_PREEMPT.
>
> hm, i never saw this warning and i run tons of different kernels (on
> different hw with different build environments) so if this was a more
> generic BKL init problem i'd expect to have seen it one way or another.
>
> But i think the bug that Alexey is seeing is genuine:
Not really. As I mentioned before, It is reproducible in qemu, but
happens a bit earlier:
[ 0.004000] Memory: 117972k/131008k available (3686k kernel code,
12484k reserved, 1567k data, 328k init, 0k highmem)
[ 0.004000] virtual kernel memory layout:
[ 0.004000] fixmap : 0xfff85000 - 0xfffff000 ( 488 kB)
[ 0.004000] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
[ 0.004000] vmalloc : 0xc87f0000 - 0xff7fe000 ( 880 MB)
[ 0.004000] lowmem : 0xc0000000 - 0xc7ff0000 ( 127 MB)
[ 0.004000] .init : 0xc0627000 - 0xc0679000 ( 328 kB)
[ 0.004000] .data : 0xc04998e1 - 0xc0621678 (1567 kB)
[ 0.004000] .text : 0xc0100000 - 0xc04998e1 (3686 kB)
[ 0.004000] Checking if this processor honours the WP bit even in
supervisor mode...Ok.
[ 0.004000] SLUB: Genslabs=12, HWalign=32, Order=0-3, MinObjects=0,
CPUs=1, Nodes=1
[ 0.004768] ------------[ cut here ]------------
[ 0.007851] WARNING: at kernel/sched.c:4435 sub_preempt_count+0xaf/0xc0()
[ 0.008000] Hardware name:
[ 0.008000] Modules linked in:
[ 0.008000] Pid: 0, comm: swapper Not tainted 2.6.28-next-20090107 #181
[ 0.008000] Call Trace:
[ 0.008000] [<c0133526>] warn_slowpath+0x86/0xa0
[ 0.008000] [<c015887b>] ? mark_lock+0x38b/0xe00
[ 0.008000] [<c015b2b5>] ? __lock_acquire+0x475/0xa60
[ 0.008000] [<c0494d72>] ? _spin_unlock_irq+0x22/0x50
[ 0.008000] [<c013cf53>] ? run_timer_softirq+0x193/0x1c0
[ 0.008000] [<c0494d7d>] ? _spin_unlock_irq+0x2d/0x50
[ 0.008000] [<c049776f>] sub_preempt_count+0xaf/0xc0
[ 0.008000] [<c01387a2>] _local_bh_enable+0x52/0xc0
[ 0.008000] [<c0138bdf>] __do_softirq+0x11f/0x170
[ 0.008000] [<c0138ac0>] ? __do_softirq+0x0/0x170
[ 0.008000] <IRQ> [<c0138a44>] ? irq_exit+0x84/0x90
[ 0.008000] [<c01053e3>] ? do_IRQ+0xa3/0x120
[ 0.008000] [<c01039ec>] ? common_interrupt+0x2c/0x34
[ 0.008000] [<c0494d37>] ? _spin_unlock_irqrestore+0x57/0x70
[ 0.008000] [<c01686b3>] ? do_irq_select_affinity+0x183/0x2a0
[ 0.008000] [<c01687f1>] ? setup_irq+0x21/0x30
[ 0.008000] [<c063a8bb>] ? time_init_hook+0x2b/0x30
[ 0.008000] [<c062b116>] ? hpet_time_init+0x16/0x20
[ 0.008000] [<c06279f5>] ? start_kernel+0x245/0x360
[ 0.008000] [<c0627270>] ? unknown_bootoption+0x0/0x210
[ 0.008000] [<c0627106>] ? reserve_ebda_region+0x56/0x70
[ 0.008000] [<c062707d>] ? __init_begin+0x7d/0xb0
[ 0.008000] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 0.008125] Calibrating delay loop (skipped), value calculated
using timer frequency.. 3724.22 BogoMIPS (lpj=7448448)
[ 0.016700] Mount-cache hash table entries: 512
[ 0.027953] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 0.029292] CPU: L2 cache: 2048K
[ 0.032826] Checking 'hlt' instruction... OK.
[ 0.054785] SMP alternatives: switching to UP code
And last time I bisected, it pointed to:
commit 7317d7b87edb41a9135e30be1ec3f7ef817c53dd
Author: Nick Piggin <nickpiggin@...oo.com.au>
Date: Tue Sep 30 20:50:27 2008 +1000
sched: improve preempt debugging
This patch helped me out with a problem I recently had....
Basically, when the kernel lock is held, then preempt_count
underflow does not
get detected until it is released which may be a long time (and arbitrarily,
eg at different points it may be rescheduled). If the bkl is released at
schedule, the resulting output is actually fairly cryptic...
With any other lock that elevates preempt_count, it is illegal to schedule
under it (which would get found pretty quickly). bkl allows scheduling with
preempt_count elevated, which makes underflows hard to debug.
Signed-off-by: Ingo Molnar <mingo@...e.hu>
so at least a dumb bisection won't do here.
.config attached for those who were not on cc last time I sent it.
Download attachment ".config" of type "application/octet-stream" (58108 bytes)
Powered by blists - more mailing lists