Message-ID: <20090410150353.GL26366@ZenIV.linux.org.uk>
Date: Fri, 10 Apr 2009 16:03:53 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
linux-kernel@...r.kernel.org, hugh@...itas.com, jmorris@...ei.org,
akpm@...ux-foundation.org
Subject: Re: [2.6.30-rc1] RCU detected CPU 1 stall
On Fri, Apr 10, 2009 at 07:22:03AM -0700, Paul E. McKenney wrote:
> Hmmmm... This indicates that CPU 1 was spinning in the kernel for
> a long time. At 250 HZ, 32,565 jiffies is 130 seconds, or just over
> two -minutes-. Ouch!!!
>
> The interrupt happened on the stalled CPU, so we know that interrupts
> were enabled. Because we have CONFIG_PREEMPT_NONE=y, there is no
> preemption, so preemption need not be disabled. This could be due
> to lock contention, or even a simple infinite loop.
>
> The timer interrupt (apic_timer_interrupt) occurred in either
> __bprm_mm_init(), __get_user_4(), count(), or do_execve(). There
> have been some recent changes around check_unsafe_exec() -- any
> possibility that these introduced excessive lock contention or
> an infinite loop? Ditto for the recent security fixes?
Oh, joy... the loop in there is this:
	for (t = next_thread(p); t != p; t = next_thread(t)) {
		if (t->fs == p->fs)
			n_fs++;
	}
I find it hard to believe that it can take two minutes, though.
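
For anyone who wants to poke at the shape of that loop outside the kernel, here is a minimal userspace sketch. It is only a model: struct task_model, struct fs_model, make_ring() and count_fs_sharers() are hypothetical names made up for illustration, the ->next pointer stands in for next_thread(), and the max_steps guard is an addition that the quoted kernel loop does not have. It shows that on an intact ring the walk ends after n - 1 steps, and that the "t != p" exit condition can only be met if p itself is reachable from p->next. Whether anything like that is behind the stall reported here is not established in this thread.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for fs_struct and task_struct. */
struct fs_model {
	int unused;
};

struct task_model {
	struct task_model *next;	/* models next_thread(): circular ring */
	struct fs_model *fs;
};

/* Link tasks[0..n-1] into one ring, all sharing the same fs. */
static void make_ring(struct task_model *tasks, size_t n, struct fs_model *fs)
{
	size_t i;

	for (i = 0; i < n; i++) {
		tasks[i].next = &tasks[(i + 1) % n];
		tasks[i].fs = fs;
	}
}

/*
 * Same shape as the quoted loop, plus a step limit (an assumption of
 * this sketch) so the model cannot spin forever.  Returns the number
 * of other threads sharing p->fs, or -1 if the walk did not get back
 * to p within max_steps.
 */
static long count_fs_sharers(const struct task_model *p, long max_steps)
{
	const struct task_model *t;
	long n_fs = 0;
	long steps = 0;

	for (t = p->next; t != p; t = t->next) {
		if (++steps > max_steps)
			return -1;
		if (t->fs == p->fs)
			n_fs++;
	}
	return n_fs;
}
```

With four tasks on the ring the count for any of them is 3, and a task whose ->next points into a ring it is not a member of never satisfies the exit condition, so in this model only the step limit ends the walk.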