Message-ID: <20120620142459.GA2461@linux.vnet.ibm.com>
Date: Wed, 20 Jun 2012 07:24:59 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc: linux-kernel@...r.kernel.org, tglx@...utronix.de,
johnstul@...ibm.com, fweisbec@...il.com
Subject: Re: WARNING: at /home/konrad/ssd/linux/kernel/rcutree.c:1547
__rcu_process_callbacks+0x42e/0x440()
On Wed, Jun 20, 2012 at 09:58:33AM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Jun 19, 2012 at 11:47:18AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 19, 2012 at 02:22:16PM -0400, Konrad Rzeszutek Wilk wrote:
> > >
> > > I've been getting this when booting a Xen PV guest with 3 CPUs (of which two are
> > > online). Any thoughts?
> >
> > Maybe... I am assuming that your kernel/rcutree.c:1547 is this line of code:
> >
> > WARN_ON_ONCE(cpu_is_offline(smp_processor_id()));
> >
> > This is line 1549 in current mainline.
>
> <nods>
> [ 0.064998] ------------[ cut here ]------------
> [ 0.065004] WARNING: at /home/konrad/linux-linus/kernel/rcutree.c:1549 __rcu_process_callbacks+0x42e/0x440()
> [ 0.065005] Modules linked in:
> [ 0.065006] Pid: 12, comm: migration/2 Not tainted 3.5.0-rc3upstream-00111-gf40759e #1
> [ 0.065007] Call Trace:
> [ 0.065011] <IRQ> [<ffffffff810718ba>] warn_slowpath_common+0x7a/0xb0
> [ 0.065013] [<ffffffff81071905>] warn_slowpath_null+0x15/0x20
> [ 0.065022] [<ffffffff810edb7e>] __rcu_process_callbacks+0x42e/0x440
> [ 0.065026] [<ffffffff810edbb0>] rcu_process_callbacks+0x20/0x40
> [ 0.065029] [<ffffffff81079299>] __do_softirq+0xa9/0x160
> [ 0.065033] [<ffffffff810a1035>] ? sched_clock_local+0x25/0x90
> [ 0.065037] [<ffffffff810d7201>] ? queue_stop_cpus_work+0x61/0xf0
> [ 0.065042] [<ffffffff815c44dc>] call_softirq+0x1c/0x30
> [ 0.065044] [<ffffffff81039435>] do_softirq+0x65/0xa0
> [ 0.065047] [<ffffffff81079095>] irq_exit+0xd5/0xf0
Here is the interrupt. Why are we taking an interrupt on an offline
CPU? This is very very bad.
> [ 0.065050] [<ffffffff81322f2f>] xen_evtchn_do_upcall+0x2f/0x40
> [ 0.065054] [<ffffffff815c452e>] xen_do_hypervisor_callback+0x1e/0x30
> [ 0.065058] <EOI> [<ffffffff810d7201>] ? queue_stop_cpus_work+0x61/0xf0
>
>
> >
> > If my guess is correct, my question is "why on earth is a CPU that has
> > marked itself offline taking a timer interrupt???"
>
> So.. part of this is that I think the CPU hotplug code is a bit brain-dead.
>
> On the Xen side, when a guest starts, it boots all the available CPUs
> (in this case three), and then it brings down the ones it doesn't need.
> How many it brings down depends on two simple lines in the guest config:
>
> vcpus=2
> maxvcpus=3
>
> The "offline" CPU can be immediately brought back, and it's parked in the
> cpu_idle call. Which, looking at it, means that it also hits the schedule_bug
> when it gets to be onlined. Grrrr..
>
> But regardless of that, when a CPU is brought down it does call the CPU
> offline notifiers, and I am not sure why RCU isn't notified. Could
> it be a race perhaps?

RCU -is- being notified of the CPU going down, as near as I can tell.
As noted previously, the real question is "Why on earth is an offline
CPU taking an interrupt???" RCU is complaining that it is being asked
to do work while running on an offline CPU.

So, where is that interrupt coming from? It needs to not be happening.
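
To make the complaint concrete, here is a minimal userspace sketch of the check that fires in the trace above. It is only a model, not the code in kernel/rcutree.c: the names cpu_online_map, set_cpu_online(), warn_on_once(), warnings_emitted, and rcu_process_callbacks_model() are hypothetical stand-ins, and NR_CPUS is set to 3 only to mirror the maxvcpus=3 guest in the report. The one property borrowed from the kernel is that WARN_ON_ONCE() prints at most once but evaluates to its condition every time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 3 /* assumption: mirrors maxvcpus=3 from the report */

/* Hypothetical stand-ins for the kernel's online mask and warn-once state. */
static bool cpu_online_map[NR_CPUS];
static bool warned_once;
static int warnings_emitted;

/* Mark a CPU online or offline, as the hotplug path would. */
static void set_cpu_online(unsigned int cpu, bool online)
{
    assert(cpu < NR_CPUS);
    cpu_online_map[cpu] = online;
}

static bool cpu_is_offline(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return !cpu_online_map[cpu];
}

/* Print only the first time cond is true, but always return cond. */
static bool warn_on_once(bool cond, const char *what)
{
    if (cond && !warned_once) {
        warned_once = true;
        warnings_emitted++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/*
 * Model of RCU's softirq-time processing running on `cpu`.
 * Returns true when the work was requested on an offline CPU,
 * which is the situation the real warning reports.
 */
static bool rcu_process_callbacks_model(unsigned int cpu)
{
    return warn_on_once(cpu_is_offline(cpu),
                        "RCU asked to do work on an offline CPU");
}
```

With CPUs 0 and 1 marked online and CPU 2 marked offline, calling rcu_process_callbacks_model(2) reports the condition, while calling it for CPU 0 or 1 does not. The point of the model is that the check itself is trivial; the bug is whatever delivered the interrupt to CPU 2 in the first place.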
Thanx, Paul
> > I could provide a patch to make RCU work around this problem from its
> > viewpoint, but taking timer interrupts on an offline CPU is an extremely
> > bad idea. It would be good to fix the underlying problem instead of
>
> Right.
> > silencing RCU's warning.
>
> Of course.
> >
> > If my guess on what line is warning you is wrong, please do let me know
> > what the line really is -- or even better, the corresponding mainline
> > git commit ID.
>
> This is f40759e but I think earlier versions of v3.5 exhibited this too.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/