Message-ID: <20081009114449.GA6628@linux.vnet.ibm.com>
Date: Thu, 9 Oct 2008 04:44:49 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: mingo@...e.hu, linux-kernel@...r.kernel.org, rjw@...k.pl,
dipankar@...ibm.com, tglx@...utronix.de
Subject: Re: RCU hang on cpu re-hotplug with 2.6.27rc8

On Thu, Oct 09, 2008 at 06:56:46AM +0200, Andi Kleen wrote:
> [fix up Thomas' address to not bounce]
>
> On Wed, Oct 08, 2008 at 06:33:21PM -0700, Paul E. McKenney wrote:
> > The attached patch (similar to one in -tip, but set up for mainline and
> > tweaked to make stall-checking on by default) should get you a stack
> > trace of any CPUs holding up RCU grace periods for more than about
> > three seconds.
> >
> > On the off-chance that this helps.
>
> It actually does. The stall detector makes the online echo return
> after three seconds, although it's not 100% clear to me why.

Interesting. This behavior would be consistent with the CPU entering
dyntick-idle mode without RCU's being aware of this. Except that your
earlier .config file says "# CONFIG_NO_HZ is not set". And that would
mean that the CPU really should be invoking RCU's state machine every
scheduling tick.
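[Editor's sketch, not part of the original message.] The expectation above can be illustrated with a toy user-space model in C. Everything here is hypothetical: the names (toy_rcu_cpu, toy_tick, TOY_STALL_TICKS), the threshold of 3000 ticks (roughly three seconds only if the tick rate were 1000 Hz, which is an assumption), and the ordering of the stall check before the quiescent-state update. It is not the kernel's implementation; it only shows why a CPU that takes a tick regularly should not be reported as stalled, while one whose first tick arrives very late would be.

```c
#include <stdbool.h>

/* Toy model only: all names and constants are hypothetical, not kernel APIs. */
#define TOY_STALL_TICKS 3000UL  /* assumed threshold, ~3 s at an assumed 1000 Hz */

struct toy_rcu_cpu {
    bool passed_quiescent_state;  /* has this CPU reported since GP start? */
    unsigned long gp_start;       /* tick count when the grace period began */
};

/* Wraparound-safe "a is later than b" comparison on tick counters. */
static bool toy_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Called once per scheduling tick on a CPU.
 * First checks whether the CPU has held up the grace period for longer
 * than the threshold, then records a quiescent state if the tick
 * interrupted the idle loop. Returns true if a stall would be reported.
 */
static bool toy_tick(struct toy_rcu_cpu *cpu, unsigned long now, bool idle)
{
    bool stalled = !cpu->passed_quiescent_state &&
                   toy_time_after(now, cpu->gp_start + TOY_STALL_TICKS);

    if (idle)
        cpu->passed_quiescent_state = true;

    return stalled;
}
```

In this model, an idle CPU whose grace period started at tick 100 and which takes a tick at 101 is not reported, and is never reported afterwards. A CPU whose first tick only arrives at 4100 is reported once, which is the kind of outcome that would suggest ticks were not being delivered.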
I confess confusion.

Thanx, Paul

> here's the backtrace
>
> RCU detected CPU 14 stall (t=4295149800/5928 jiffies)
> Pid: 0, comm: swapper Not tainted 2.6.27-rc9 #5
>
> Call Trace:
> <IRQ> [<ffffffff8025d188>] __rcu_pending+0x6e/0x1d9
> [<ffffffff8025d329>] rcu_pending+0x36/0x6e
> [<ffffffff8023b480>] update_process_times+0x37/0x5b
> [<ffffffff8024be72>] tick_periodic+0x68/0x74
> [<ffffffff8024be9f>] tick_handle_periodic+0x21/0x66
> [<ffffffff8021bcd2>] smp_apic_timer_interrupt+0x8a/0xa8
> [<ffffffff8020bfe6>] apic_timer_interrupt+0x66/0x70
> <EOI> [<ffffffff803adb39>] ? acpi_safe_halt+0x2b/0x3e
> [<ffffffff803adbfa>] ? acpi_idle_enter_c1+0xae/0x102
> [<ffffffff804ffdd6>] ? cpuidle_idle_call+0x70/0xa2
> [<ffffffff8020a097>] ? cpu_idle+0x7e/0x9c
> [<ffffffff805bef4a>] ? start_secondary+0x157/0x15c
>
> Timer issue?
>
>
> -Andi