Message-ID: <20130611233725.GA30531@linux.vnet.ibm.com>
Date: Tue, 11 Jun 2013 16:37:25 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Fengguang Wu <fengguang.wu@...el.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: WARNING: at kernel/rcutorture.c:1243 rcu_torture_printk

On Tue, Jun 11, 2013 at 04:03:22PM -0700, Paul E. McKenney wrote:
> On Tue, Jun 11, 2013 at 10:14:55AM +0800, Fengguang Wu wrote:
> > Paul,
> >
> > On Mon, Jun 10, 2013 at 07:51:10AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jun 10, 2013 at 03:47:28PM +0800, Fengguang Wu wrote:
> > > > Greetings,
> > > >
> > > > I got the below dmesg and the first bad commit is
> > > >
> > > > commit 911af505ef407c2511106c224dd640f882f0f590
> > > > Author: Paul E. McKenney <paul.mckenney@...aro.org>
> > > > Date: Mon Feb 11 10:23:27 2013 -0800
> > > >
> > > > rcu: Provide compile-time control for no-CBs CPUs
> > > >
> > > > Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs
> > > > kernel command-line parameter. This is inconvenient in some cases,
> > > > particularly for randconfig testing, so this commit adds a new set of
> > > > kernel configuration parameters. CONFIG_RCU_NOCB_CPU_NONE (the default)
> > > > retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback
> > > > processing from CPU 0 (along with any other CPUs specified by the
> > > > rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads
> > > > callback processing from all CPUs.
> > > >
> > > > Signed-off-by: Paul E. McKenney <paul.mckenney@...aro.org>
> > > > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > >
> > > > However I guess it's a wrong bisect, the commit should be unmasking an
> > > > old bug because it merely provides more kconfig options.
> > > >
> > > > This warning happened only once in 74 boots of the kernel.
> > > >
> > > > I also attached the 2nd dmesg which shows boot hang for the same
> > > > kernel.
> > > >
> > > > Note that there are also 2 kernel hangs in the 74 boots. It should
> > > > not be an RCU problem, however for completeness, I also attach a dmesg
> > > > for your reference when considering this RCU warning.
> > >
> > > Either way, that is a nasty warning! You get this on recent kernels as
> > > well, I take it?
> >
> > Yes, the warning is still in recent kernels, indicated from the last
> > lines of the bisect log.
> >
> > 911af505ef407c2511106c224dd640f882f0f590 is bad and its parent
> > 34ed62461ae4970695974afb9a60ac3df0086830 is good:
> > > > git bisect bad 911af505ef407c2511106c224dd640f882f0f590 # 65 2013-06-07 16:14:41 rcu: Provide compile-time control for no-CBs CPUs
> > > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830 # 900 2013-06-08 04:57:34 rcu: Remove restrictions on no-CBs CPUs
> >
> > After bisect, the script tries to confirm that
> > 34ed62461ae4970695974afb9a60ac3df0086830 is really good by boot
> > testing it 2700 more times:
> >
> > > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830 # 2700 2013-06-10 14:33:56 rcu: Remove restrictions on no-CBs CPUs
> >
> > And continue to find out whether the linus/master and
> > linux-next/master are good/bad (here the results are both bad):
> >
> > > > git bisect bad f43e7a34255a387930bc2bb826ec17c052ce975e # 14:34 0 Merge branch 'for-next'
> > > > git bisect bad c04efed734409f5a44715b54a6ca1b54b0ccf215 # 14:41 11 Add linux-next specific files for 20130607
>
> Just to let you know that I am putting some time into this...
>
> Well, I got a panic, but it turned out to be the kernel being unable to
> locate init. Why your .config should have this effect, I have no idea --
> I don't see any Kconfig parameters that look to me like they should be
> specifying where init lives. Any hints?
>
> Left to myself, my next step will be to try just the RCU-related Kconfig
> parameters.

Which gets rid of the "unable to locate init" panic. One strange thing
in your .config:

CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y

Seems harmless, though.
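
For what it is worth, one quick way to spot this sort of combination is to
list which tick-related options a .config actually enables. Below is a
minimal Python sketch, assuming the usual .config syntax ("CONFIG_FOO=y" or
"# CONFIG_FOO is not set"). The names parse_config(), enabled_tick_options()
and TICK_OPTIONS are made up for illustration and are not part of any
kernel script:

```python
import re

# Tick-related options taken from the .config excerpt above (illustrative list).
TICK_OPTIONS = ["HZ_PERIODIC", "NO_HZ_IDLE", "NO_HZ_FULL", "NO_HZ"]

SET_RE = re.compile(r"^CONFIG_(\w+)=(.*)$")
UNSET_RE = re.compile(r"^# CONFIG_(\w+) is not set$")


def parse_config(text):
    """Map option name (without the CONFIG_ prefix) to its value.

    Options recorded as "# CONFIG_FOO is not set" are mapped to "n".
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def enabled_tick_options(text):
    """Return the tick-related options set to 'y', in TICK_OPTIONS order."""
    options = parse_config(text)
    return [name for name in TICK_OPTIONS if options.get(name) == "y"]


if __name__ == "__main__":
    sample = (
        "CONFIG_HZ_PERIODIC=y\n"
        "# CONFIG_NO_HZ_IDLE is not set\n"
        "# CONFIG_NO_HZ_FULL is not set\n"
        "CONFIG_NO_HZ=y\n"
    )
    # Reports both HZ_PERIODIC and NO_HZ as enabled for the excerpt above.
    print(enabled_tick_options(sample))
```

This only reports what the file says. Whether the combination matters is
up to the Kconfig rules of the kernel version in question.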

I will do repeated runs with this config fragment to see if I can reproduce
your warning:

CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_USER_QS=n
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
CONFIG_RCU_FANOUT_EXACT=n
CONFIG_RCU_BOOST=y
CONFIG_RCU_BOOST_PRIO=1
CONFIG_RCU_BOOST_DELAY=500
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_NONE=n
CONFIG_RCU_NOCB_CPU_ZERO=y
CONFIG_RCU_NOCB_CPU_ALL=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_SPARSE_RCU_POINTER=n
CONFIG_RCU_TORTURE_TEST=y
CONFIG_RCU_TORTURE_TEST_RUNNABLE=y
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_CPU_STALL_VERBOSE=y
CONFIG_RCU_CPU_STALL_INFO=y
CONFIG_RCU_TRACE=n
CONFIG_NR_CPUS=8
CONFIG_HZ_PERIODIC=y
CONFIG_NO_HZ_IDLE=n
CONFIG_NO_HZ_FULL=n
CONFIG_NO_HZ=y
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
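
Note that the fragment uses "=n" as shorthand, whereas a generated .config
records a disabled option as "# CONFIG_FOO is not set". Since Kconfig can
drop a requested setting whose dependencies are not met, it may be worth
checking that the fragment actually took effect before starting a long
series of runs. A minimal Python sketch of such a check follows, assuming
the two syntaxes just described. The helper names are hypothetical, and an
option missing from the .config is treated as disabled:

```python
import re

SET_RE = re.compile(r"^CONFIG_(\w+)=(.*)$")
UNSET_RE = re.compile(r"^# CONFIG_(\w+) is not set$")


def parse_config(text):
    """Map option name (without CONFIG_) to value; 'is not set' becomes 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def fragment_mismatches(fragment_text, config_text):
    """Return {option: (wanted, got)} for fragment settings the .config lacks."""
    wanted = parse_config(fragment_text)
    actual = parse_config(config_text)
    mismatches = {}
    for name, want in wanted.items():
        # Assumption: an option absent from the .config counts as disabled.
        got = actual.get(name, "n")
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


if __name__ == "__main__":
    fragment = (
        "CONFIG_RCU_NOCB_CPU_ZERO=y\n"
        "CONFIG_RCU_TRACE=n\n"
        "CONFIG_RCU_FANOUT=64\n"
    )
    # Made-up final .config in which RCU_FANOUT did not take the requested value.
    final_config = (
        "CONFIG_RCU_NOCB_CPU_ZERO=y\n"
        "# CONFIG_RCU_TRACE is not set\n"
        "CONFIG_RCU_FANOUT=32\n"
    )
    for name, (want, got) in fragment_mismatches(fragment, final_config).items():
        print("CONFIG_%s: wanted %s, got %s" % (name, want, got))
```

An empty result means every setting in the fragment survived into the
final .config.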

							Thanx, Paul