Message-Id: <20070328130234.c0a23572.akpm@linux-foundation.org>
Date: Wed, 28 Mar 2007 13:02:34 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Mariusz Kozłowski <m.kozlowski@...land.pl>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Adrian Bunk <bunk@...sta.de>,
john stultz <johnstul@...ibm.com>
Subject: Re: 2.6.21-rc5-mm1
On Wed, 28 Mar 2007 18:44:57 +0200
Mariusz Kozłowski <m.kozlowski@...land.pl> wrote:
> Hello,
>
> I ran 2.6.21-rc4-mm1 with no hangs for a week.
> Then, when 2.6.21-rc5-mm1 showed up, I switched to it. Unfortunately,
> today my laptop hung twice, in a way similar to the one described here:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/index.html#1165
It's not good that we went backwards between those two releases.
>
> The difference is that it happened when I closed the lid of my laptop.
> When I reopened it, the box was frozen (ACPI?). Again, disk I/O was dead,
> so nothing was found in syslog.
Adrian, does this look like any of the bugs which you're monitoring?
> I tried to reproduce it and capture something with netconsole.
> I tortured the box for a few hours but the system did not hang. I pushed
> the box real hard and what I got was only oom-killer firing etc ;-)
> Anyway I found something else you might be interested in:
>
>
>
> 1) This happened when I ran 'echo 3 > /proc/sys/vm/drop_caches' on a
> really busy system.
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.21-rc5-mm1 #1
> -------------------------------------------------------
> bash/20633 is trying to acquire lock:
> (&journal->j_list_lock){--..}, at: [<c01bb60e>] journal_try_to_free_buffers+0x151/0x1bc
>
> but task is already holding lock:
> (inode_lock){--..}, at: [<c0187d46>] drop_pagecache+0x58/0xf9
Yeah, that's a lock ranking error in the drop_caches code. I have a
super-long-term plan to fix it, but in the short term nothing suggests
itself apart from removing the drop_caches code, I'm afraid.
>
> 2) This was found a couple of minutes later, when the system was
> really busy and close to an OOM condition.
>
> INFO: lockdep is turned off.
> BUG: soft lockup detected on CPU#0!
> [<c0104614>] show_trace_log_lvl+0x1a/0x30
> [<c01052c9>] show_trace+0x12/0x14
> [<c0105355>] dump_stack+0x16/0x18
> [<c01467a0>] softlockup_tick+0x81/0xa8
> [<c011e4dc>] run_local_timers+0x12/0x14
> [<c011e8dd>] update_process_times+0x2b/0x63
> [<c012e4be>] tick_sched_timer+0x4d/0x9e
> [<c012af00>] hrtimer_interrupt+0x12e/0x1a6
> [<c0106f56>] timer_interrupt+0xe/0x15
> [<c0146af3>] handle_IRQ_event+0x28/0x59
> [<c01480a7>] handle_level_irq+0x6e/0xe7
> [<c0105d3e>] do_IRQ+0x3d/0x7f
> [<c01041b2>] common_interrupt+0x2e/0x34
> [<c011afef>] do_softirq+0x4d/0x50
> [<c011b263>] irq_exit+0x7e/0x80
> [<c0105d43>] do_IRQ+0x42/0x7f
> [<c01041b2>] common_interrupt+0x2e/0x34
> [<c0178bf2>] core_sys_select+0x1c6/0x310
> [<c0179101>] sys_select+0x39/0x18f
> [<c0103f44>] sysenter_past_esp+0x5d/0x99
> =======================
> Clocksource tsc unstable (delta = 9372804176 ns)
> Time: acpi_pm clocksource has been installed.
>
> Please find .config attached. Not sure who to CC on this (as usual ;-)).
>
I do ;)