[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120712141130.GA20195@localhost>
Date: Thu, 12 Jul 2012 22:11:30 +0800
From: Fengguang Wu <wfg@...ux.intel.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Josh Triplett <josh@...htriplett.org>,
Lai Jiangshan <laijs@...fujitsu.com>,
LKML <linux-kernel@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 08:55:59PM +0800, Fengguang Wu wrote:
> On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote:
> > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote:
> > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote:
> > > > Hi Paul,
> > > >
> > > > Fortunately this bug is bisectable and the first bad commit is:
> > > >
> > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0
> > > > Author: Paul E. McKenney <paul.mckenney@...aro.org>
> > > > Date: Fri Sep 30 12:10:22 2011 -0700
> > > >
> > > > rcu: Track idleness independent of idle tasks
> > >
> > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it
> > > traces while in idle. The confusion on my part was that in TREE_RCU,
> > > the nesting and dyntick-idle indication are different, while in
> > > TINY_RCU they are one and the same.
> > >
> > > Does the following patch help?
> >
> > Not exactly, but the error message is now changed to:
>
> Then I think it may help to test the linux-next tree that contains
> latest RCU fixes. And it somehow helped. There are no warnings in
> the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning
> in today's linux-next tree.
>
> [ 0.013241] Performance Events:
> [ 0.014222] ------------[ cut here ]------------
> [ 0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 worker_enter_idle+0x2fa/0x37c()
That's bisected to one of Tejun's workqueue commit..
> Both trees freeze after this point:
>
> modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory
> ^M
> modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory
> ^M
And that's bisected to Al's vfs work..
> So I'd better to base the test on one of your RCU branches..
> Which branch would your recommend?
And finally I tried the patch on top of rcu/next, and get the attached
dmesg and warnings:
[ 1.776415] Testing tracer preemptirqsoff: PASSED
[ 1.779383] Testing tracer wakeup: [ 2.091320] ftrace-test (15) used greatest stack depth: 7284 bytes left
PASSED
[ 2.094844] Testing tracer wakeup_rt: [ 2.179052] ------------[ cut here ]------------
[ 2.181087] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244()
[ 2.183023] Hardware name: Bochs
[ 2.183023] Modules linked in:
[ 2.183023]
[ 2.183023] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #59
[ 2.183023] Call Trace:
[ 2.183023] [<c103704c>] warn_slowpath_common+0xfc/0x13b
[ 2.183023] [<c1105d31>] ? rcu_idle_enter+0xf0/0x244
[ 2.183023] [<c1105d31>] ? rcu_idle_enter+0xf0/0x244
[ 2.183023] [<c10370c9>] warn_slowpath_null+0x3e/0x4e
[ 2.183023] [<c1105d31>] rcu_idle_enter+0xf0/0x244
[ 2.183023] [<c100fbdb>] cpu_idle+0x159/0x220
[ 2.183023] [<c14165a3>] rest_init+0x2db/0x2ef
[ 2.183023] [<c14162c8>] ? reciprocal_value+0x64/0x64
[ 2.183023] [<c171d456>] start_kernel+0x8eb/0x8ff
[ 2.183023] [<c171c62f>] ? repair_env_string+0xd3/0xd3
[ 2.183023] [<c171c342>] i386_start_kernel+0xf8/0x10b
[ 2.183023] ---[ end trace 44593438a59a9533 ]---
[ 2.195679] ------------[ cut here ]------------
[ 2.196428] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217()
[ 2.197633] Hardware name: Bochs
[ 2.198183] Modules linked in:
[ 2.198695] Pid: 0, comm: swapper Tainted: G W 3.5.0-rc5+ #59
[ 2.199671] Call Trace:
[ 2.199671] [<c103704c>] warn_slowpath_common+0xfc/0x13b
[ 2.199671] [<c1105b2a>] ? rcu_irq_exit+0x100/0x217
[ 2.199671] [<c1105b2a>] ? rcu_irq_exit+0x100/0x217
[ 2.199671] [<c10370c9>] warn_slowpath_null+0x3e/0x4e
[ 2.199671] [<c1105b2a>] rcu_irq_exit+0x100/0x217
[ 2.199671] [<c10469ae>] irq_exit+0x117/0x14d
[ 2.199671] [<c1005d31>] do_IRQ+0xe5/0x106
[ 2.199671] [<c10ba77e>] ? trace_hardirqs_off_caller+0x97/0x167
[ 2.199671] [<c1437975>] common_interrupt+0x35/0x3c
[ 2.199671] [<c10c00e0>] ? trace_hardirqs_on_caller+0x1b8/0x38c
[ 2.199671] [<c1025c7b>] ? native_safe_halt+0x13/0x23
The same rcu/next HEAD w/o patch, the warning is:
[ 4.456093] Testing event rcu_dyntick: [ 4.458251]
[ 4.458941] ===============================
[ 4.459352] [ INFO: suspicious RCU usage. ]
[ 4.459352] 3.5.0-rc5+ #60 Not tainted
[ 4.459352] -------------------------------
[ 4.459352] /c/wfg/linux/include/trace/events/rcu.h:275 suspicious rcu_dereference_check() usage!
[ 4.459352]
[ 4.459352] other info that might help us debug this:
[ 4.459352]
[ 4.459352]
[ 4.459352] RCU used illegally from idle CPU!
[ 4.459352] rcu_scheduler_active = 1, debug_locks = 0
[ 4.459352] RCU used illegally from extended quiescent state!
[ 4.459352] no locks held by swapper/0.
[ 4.459352]
[ 4.459352] stack backtrace:
[ 4.459352] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #60
[ 4.459352] Call Trace:
[ 4.459352]
[ 4.459352] [<c10bbb36>] lockdep_rcu_suspicious+0x1de/0x1f4
[ 4.459352] [<c1103171>] trace_rcu_dyntick+0xd9/0x167
[ 4.459352] [<c1105865>] rcu_idle_enter_common+0x84/0x22d
[ 4.459352] [<c10ba77e>] ? trace_hardirqs_off_caller+0x97/0x167
[ 4.459352] [<c1105d98>] rcu_idle_enter+0x16d/0x274
[ 4.459352] [<c100fbdb>] cpu_idle+0x159/0x220
[ 4.459352] [<c14165c3>] rest_init+0x2db/0x2ef
[ 4.459352] [<c14162e8>] ? reciprocal_value+0x64/0x64
[ 4.459352] [<c171d456>] start_kernel+0x8eb/0x8ff
[ 4.459352] [<c171c62f>] ? repair_env_string+0xd3/0xd3
[ 4.459352] [<c171c342>] i386_start_kernel+0xf8/0x10b
[ 4.481163] OK
[ 4.481437] Testing event rcu_prep_idle: OK
That's a great kconfig (from the testing respective)! :-P
Thanks,
Fengguang
View attachment "dmesg-kvm_bisect-waimea-5209-2012-07-12-22-02-55" of type "text/plain" (30986 bytes)
Powered by blists - more mailing lists