[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <505DDBE7.7010706@gmail.com>
Date: Sat, 22 Sep 2012 17:40:23 +0200
From: Sasha Levin <levinsasha928@...il.com>
To: paulmck@...ux.vnet.ibm.com
CC: Michael Wang <wangyun@...ux.vnet.ibm.com>,
Dave Jones <davej@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: RCU idle CPU detection is broken in linux-next
On 09/22/2012 05:09 PM, Paul E. McKenney wrote:
> On Sat, Sep 22, 2012 at 10:26:09AM +0200, Sasha Levin wrote:
>> On 09/21/2012 05:18 PM, Sasha Levin wrote:
>>> On 09/21/2012 05:12 PM, Paul E. McKenney wrote:
>>>> On Fri, Sep 21, 2012 at 03:26:27PM +0200, Sasha Levin wrote:
>>>>> On 09/21/2012 02:13 PM, Paul E. McKenney wrote:
>>>>>>> This might be unrelated, but I got the following dump as well when trinity
>>>>>>>> decided it's time to reboot my guest:
>>>>>> OK, sounds like we should hold off until you reproduce, then.
>>>>>
>>>>> I'm not sure what you mean.
>>>>>
>>>>> There are basically two issues I'm seeing now, which reproduce pretty much every
>>>>> time:
>>>>>
>>>>> 1. The "using when idle" warning.
>>>>> 2. The rcu related hangs during shutdown.
>>>>>
>>>>> The first one appears early on when I start fuzzing, the other one happens when
>>>>> shutting down - so both of them are reproducible in the same session.
>>>>
>>>> Ah, I misunderstood the "reboot my guest" -- I thought that you were
>>>> doing something like repeated modprobe/rmmod cycles on rcutorture while
>>>> running the guest for an extended time period. That will teach me not
>>>> to reply to email so soon after waking up. ;-)
>>>>
>>>> That said, #2 is expected behavior given the RCU CPU stall warnings in
>>>> your Sept. 20 dmesg. This is because rcutorture does rcu_barrier() on
>>>> the way out, which cannot complete if grace periods are not completing.
>>>> And the later soft lockup is also likely a consequence of the stall,
>>>> because CPU hotplug does a synchronize_sched() while holding the hotplug
>>>> lock, which will then cause get_online_cpus() to hang.
>>>>
>>>> Looking further down, there are hung tasks that are waiting for a
>>>> timeout, but this is also a consequence of the hang because they
>>>> are waiting for MAX_SCHEDULE_TIMEOUT -- in other words, they are
>>>> waiting to be killed at shutdown time. I could suppress this by using
>>>> schedule_timeout_interruptible() in a loop in order to reduce the noise
>>>> in this case.
>>>>
>>>> The remaining traces in that email are also consequences of the stall.
>>>>
>>>> So why the stall?
>>>>
>>>> Using RCU from a CPU that RCU believes to be idle can cause arbitrary
>>>> bad behavior (possibly including stalls), but with very low probability.
>>>> The reason that things can go arbitrarily bad is that RCU is ignoring
>>>> the CPU, and thus not waiting for any RCU read-side critical sections.
>>>> This could of course result in abitrary corruption of memory. The reason
>>>> for the low probability is that grace periods tend to be long and RCU
>>>> read-side critical sections tend to be short.
>>>>
>>>> It looks like you are running -next, which has RCU grace periods driven
>>>> by a kthread. Is it possible that this kthread is not getting a chance
>>>> to run (in fact, the "Stall ended before state dump start" is consistent
>>>> with that possibility), but in that case I would expect to see a soft
>>>> lockup from it. Furthermore, in that case, it would be expected to
>>>> start running again as soon as things started going idle during shutdown.
>>>>
>>>> Or did the system somehow manage to stay busy despite being in shutdown?
>>>> Or, for that matter, are you overcommitting the physical CPUs on your
>>>> trinity test setup?
>>>
>>> Nope, I originally had 4 vcpus in the guest with the host running 4 physical
>>> cpus, but I've also tested it with just 2 vcpus and still see the warnings.
>>
>> Some more info that might help, I'm also occasionally seeing:
>>
>> [ 42.389345] ------------[ cut here ]------------
>> [ 42.389348] WARNING: at kernel/rcutree.c:375 rcu_eqs_enter+0x5c/0xc0()
>> [ 42.389350] Pid: 0, comm: swapper/2 Tainted: G W
>> 3.6.0-rc6-next-20120921-sasha-00002-ge9c9495-dirty #378
>
> Hmmm... So either RCU is losing count or some CPU that is already
> marked as idle from RCU's perspective is trying to re-enter idle.
>
> This is helpful, thank you! It fits in nicely with the splat that
> you got after applying Michael Wang's patch. Could you please try the
> diagnostic patch below?
>
> Thanx, Paul
>
>> [ 42.389351] Call Trace:
>> [ 42.389354] [<ffffffff811c653c>] ? rcu_eqs_enter+0x5c/0xc0
>> [ 42.389356] [<ffffffff81106886>] warn_slowpath_common+0x86/0xb0
>> [ 42.389359] [<ffffffff81106975>] warn_slowpath_null+0x15/0x20
>> [ 42.389361] [<ffffffff811c653c>] rcu_eqs_enter+0x5c/0xc0
>> [ 42.389364] [<ffffffff811c66f3>] rcu_idle_enter+0x43/0xa0
>> [ 42.389366] [<ffffffff81078956>] cpu_idle+0x126/0x160
>> [ 42.389369] [<ffffffff83985a64>] start_secondary+0x26e/0x276
>> [ 42.389370] ---[ end trace 04c11301284d64ee ]---
>> [ 42.389394] ------------[ cut here ]------------
>> [ 42.389396] WARNING: at kernel/rcutree.c:350 rcu_eqs_enter_common+0x709/0x970()
>> [ 42.389398] Pid: 0, comm: swapper/2 Tainted: G W
>> 3.6.0-rc6-next-20120921-sasha-00002-ge9c9495-dirty #378
>> [ 42.389399] Call Trace:
>> [ 42.389402] [<ffffffff811c6019>] ? rcu_eqs_enter_common+0x709/0x970
>> [ 42.389405] [<ffffffff81106886>] warn_slowpath_common+0x86/0xb0
>> [ 42.389407] [<ffffffff81106975>] warn_slowpath_null+0x15/0x20
>> [ 42.389410] [<ffffffff811c6019>] rcu_eqs_enter_common+0x709/0x970
>> [ 42.389412] [<ffffffff811c658f>] rcu_eqs_enter+0xaf/0xc0
>> [ 42.389414] [<ffffffff811c66f3>] rcu_idle_enter+0x43/0xa0
>> [ 42.389417] [<ffffffff81078956>] cpu_idle+0x126/0x160
>> [ 42.389420] [<ffffffff83985a64>] start_secondary+0x26e/0x276
>> [ 42.389421] ---[ end trace 04c11301284d64ef ]---
>> [ 42.389424] ------------[ cut here ]------------
>> [ 42.389426] WARNING: at kernel/rcutree.c:527 rcu_eqs_exit+0x4f/0xb0()
>> [ 42.389427] Pid: 0, comm: swapper/2 Tainted: G W
>> 3.6.0-rc6-next-20120921-sasha-00002-ge9c9495-dirty #378
>> [ 42.389428] Call Trace:
>> [ 42.389431] [<ffffffff811c6d9f>] ? rcu_eqs_exit+0x4f/0xb0
>> [ 42.389433] [<ffffffff81106886>] warn_slowpath_common+0x86/0xb0
>> [ 42.389436] [<ffffffff81106975>] warn_slowpath_null+0x15/0x20
>> [ 42.389438] [<ffffffff811c6d9f>] rcu_eqs_exit+0x4f/0xb0
>> [ 42.389441] [<ffffffff811c6f13>] rcu_idle_exit+0x43/0xa0
>> [ 42.389443] [<ffffffff8107896d>] cpu_idle+0x13d/0x160
>> [ 42.389445] [<ffffffff83985a64>] start_secondary+0x26e/0x276
>> [ 42.389447] ---[ end trace 04c11301284d64f0 ]---
>> [ 42.389448] ------------[ cut here ]------------
>> [ 42.389450] WARNING: at kernel/rcutree.c:501 rcu_eqs_exit_common+0x4a/0x3a0()
>> [ 42.389451] Pid: 0, comm: swapper/2 Tainted: G W
>> 3.6.0-rc6-next-20120921-sasha-00002-ge9c9495-dirty #378
>> [ 42.389452] Call Trace:
>> [ 42.389455] [<ffffffff811c679a>] ? rcu_eqs_exit_common+0x4a/0x3a0
>> [ 42.389458] [<ffffffff81106886>] warn_slowpath_common+0x86/0xb0
>> [ 42.389460] [<ffffffff81106975>] warn_slowpath_null+0x15/0x20
>> [ 42.389462] [<ffffffff811c679a>] rcu_eqs_exit_common+0x4a/0x3a0
>> [ 42.389465] [<ffffffff811c6dec>] rcu_eqs_exit+0x9c/0xb0
>> [ 42.389467] [<ffffffff811c6f13>] rcu_idle_exit+0x43/0xa0
>> [ 42.389470] [<ffffffff8107896d>] cpu_idle+0x13d/0x160
>> [ 42.389472] [<ffffffff83985a64>] start_secondary+0x26e/0x276
>> [ 42.389474] ---[ end trace 04c11301284d64f1 ]---
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index da14e5c..39d0aec 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -418,16 +418,22 @@ void cpu_idle(void)
> pm_idle();
>
> rcu_idle_exit();
> + WARN_ON(rcu_is_cpu_idle());
> start_critical_timings();
>
> /* In many cases the interrupt that ended idle
> has already called exit_idle. But some idle
> loops can be woken up without interrupt. */
> + WARN_ON(rcu_is_cpu_idle());
> __exit_idle();
> + WARN_ON(rcu_is_cpu_idle());
> }
>
> + WARN_ON(rcu_is_cpu_idle());
> tick_nohz_idle_exit();
> + WARN_ON(rcu_is_cpu_idle());
> preempt_enable_no_resched();
> + WARN_ON(rcu_is_cpu_idle());
> schedule();
> preempt_disable();
> }
>
I don't see any new warnings using this patch. I've attached the config.
Thanks,
Sasha
View attachment "config-sasha" of type "text/plain" (134126 bytes)
Powered by blists - more mailing lists