Message-ID: <20110524011824.GL7428@linux.vnet.ibm.com>
Date:	Mon, 23 May 2011 18:18:24 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Yinghai Lu <yinghai@...nel.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...hat.com, hpa@...or.com,
	tglx@...utronix.de, mingo@...e.hu
Subject: Re: [tip:core/rcu] Revert "rcu: Decrease memory-barrier usage
 based on semi-formal proof"

On Mon, May 23, 2011 at 03:58:45PM -0700, Yinghai Lu wrote:
> On 05/23/2011 03:55 PM, Yinghai Lu wrote:
> > On 05/23/2011 03:01 PM, Yinghai Lu wrote:
> >> On 05/23/2011 02:25 PM, Paul E. McKenney wrote:
> >>> On Mon, May 23, 2011 at 01:14:22PM -0700, Yinghai Lu wrote:
> >>>> On 05/21/2011 07:08 AM, Paul E. McKenney wrote:
> >>>>> On Sat, May 21, 2011 at 06:18:44AM -0700, Paul E. McKenney wrote:
> >>>>>> On Fri, May 20, 2011 at 05:02:40PM -0700, Yinghai Lu wrote:
> >>>>>>> On 05/20/2011 04:49 PM, Paul E. McKenney wrote:
> >>>>>>>> On Fri, May 20, 2011 at 04:16:28PM -0700, Yinghai Lu wrote:
> >>>>>>> ...
> >>>>>>>>>
> >>>>>>>>> the same one i sent out before, but let DEBUG_LOCKING_API_SELFTESTS disabled.
> >>>>>>>>
> >>>>>>>> OK, just to make sure I understand...  You are compiling exactly the
> >>>>>>>> same kernel source tree with exactly the same .config, just with two
> >>>>>>>> different versions of gcc, correct?
> >>>>>>> yes.
> >>>>>>>>
> >>>>>>>> If so, it is quite possible that the slow one is the correct one.  :-/
> >>>>>>> yeah, new version always have problem.
> >>>>>>>
> >>>>>>> looks like opensuse11.3 has 4.5.0 and fedora14 has 4.5.1
> >>>>>>
> >>>>>> OK, so fedora14 is the fast one (4.5.1) and opensuse11.3 is the slow
> >>>>>> one (4.5.0), correct?
> >>>>>
> >>>>> And does commit c7a3786030 help?  This commit (from Peter Zijlstra)
> >>>>> tidied up RCU kthreads' scheduler interactions.  The patch is below,
> >>>>> though it is probably more convenient to pull it from the rcu/next
> >>>>> branch of:
> >>>>>
> >>>>>   git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
> >>>>>
> >>>
> > gcc in Fedora 14 is fine with your tree.
> > 
> 
> sorry, I should wait for longer to see Fedora 14 is ok.
> 
> got same warning with the one compiled from fedora 14...
> 
> [  372.937251] INFO: task rcun0:8 blocked for more than 120 seconds.
> [  372.937618] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  372.956130] rcun0           D 0000000000000000     0     8      2 0x00000000
> [  372.956498]  ffff882070d65e90 0000000000000046 ffff882070d64000 0000000000004000
> [  372.956528]  00000000001d1f40 ffff882070d65fd8 00000000001d1f40 ffff882070d65fd8
> [  372.956555]  0000000000004000 00000000001d1f40 ffff882070d18000 ffff882070d6a2b0
> [  372.956581] Call Trace:
> [  372.956605]  [<ffffffff810afce3>] ? __lock_release+0x166/0x16f
> [  372.956624]  [<ffffffff81c229d1>] ? _raw_spin_unlock_irqrestore+0x3f/0x46
> [  372.956639]  [<ffffffff810ce941>] ? rcu_cpu_kthread_should_stop+0x137/0x137
> [  372.956650]  [<ffffffff810adfd5>] ? trace_hardirqs_on+0xd/0xf
> [  372.956661]  [<ffffffff810ce941>] ? rcu_cpu_kthread_should_stop+0x137/0x137
> [  372.956673]  [<ffffffff8109a0a5>] kthread+0x8c/0xa8
> [  372.956689]  [<ffffffff81c2a754>] kernel_thread_helper+0x4/0x10
> [  372.956701]  [<ffffffff81c22c80>] ? retint_restore_args+0xe/0xe
> [  372.956711]  [<ffffffff8109a019>] ? __init_kthread_worker+0x5b/0x5b
> [  372.956722]  [<ffffffff81c2a750>] ? gs_change+0xb/0xb
> [  372.956726] INFO: lockdep is turned off.
> [  492.750827] INFO: task rcun0:8 blocked for more than 120 seconds.
> [  492.751150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  492.762991] rcun0           D 0000000000000000     0     8      2 0x00000000
> [  492.763264]  ffff882070d65e90 0000000000000046 ffff882070d64000 0000000000004000
> [  492.763294]  00000000001d1f40 ffff882070d65fd8 00000000001d1f40 ffff882070d65fd8
> [  492.763320]  0000000000004000 00000000001d1f40 ffff882070d18000 ffff882070d6a2b0
> [  492.763346] Call Trace:
> [  492.763359]  [<ffffffff810afce3>] ? __lock_release+0x166/0x16f
> [  492.763371]  [<ffffffff81c229d1>] ? _raw_spin_unlock_irqrestore+0x3f/0x46
> [  492.763382]  [<ffffffff810ce941>] ? rcu_cpu_kthread_should_stop+0x137/0x137
> [  492.763393]  [<ffffffff810adfd5>] ? trace_hardirqs_on+0xd/0xf
> [  492.763404]  [<ffffffff810ce941>] ? rcu_cpu_kthread_should_stop+0x137/0x137
> [  492.763414]  [<ffffffff8109a0a5>] kthread+0x8c/0xa8
> [  492.763427]  [<ffffffff81c2a754>] kernel_thread_helper+0x4/0x10
> [  492.763439]  [<ffffffff81c22c80>] ? retint_restore_args+0xe/0xe
> [  492.763449]  [<ffffffff8109a019>] ? __init_kthread_worker+0x5b/0x5b
> [  492.763460]  [<ffffffff81c2a750>] ? gs_change+0xb/0xb
> [  492.763463] INFO: lockdep is turned off.
> 
> if reverting PeterZ's patch will not have that warning.

OK, so it looks like I need to get this out of the way in order to track
down the delays.  Or does reverting PeterZ's patch get you a stable
system, but with the longish delays in memory_dev_init()?  If the latter,
it might be more productive to handle the two problems separately.

For whatever it is worth, I do see about a 5% increase in grace-period
duration when switching to kthreads.  That is acceptable, but your
30x increase clearly is completely unacceptable and must be fixed.

Other than that, the main thing that affects grace-period duration is
the setting of CONFIG_HZ -- the smaller the HZ value, the longer the
grace-period duration.
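
To make the CONFIG_HZ point concrete, here is a rough sketch (Python,
purely illustrative, not kernel code).  It assumes only that the
scheduling-clock tick period is 1/HZ seconds.  The function name and the
list of HZ values are made up for this example, and it says nothing about
how many ticks a grace period actually takes.

```python
def tick_period_ms(hz):
    """Return the scheduling-clock tick period in milliseconds for a given HZ.

    Hypothetical helper for illustration only: the period is 1/HZ seconds.
    """
    if hz <= 0:
        raise ValueError("HZ must be positive")
    return 1000.0 / hz


if __name__ == "__main__":
    # Example HZ settings chosen for illustration.
    for hz in (100, 250, 300, 1000):
        print(f"HZ={hz:4d}: one tick every {tick_period_ms(hz):.2f} ms")
```

If grace-period processing advances at roughly tick granularity (an
assumption here), then going from HZ=1000 to HZ=100 stretches each step
by about a factor of ten.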
							Thanx, Paul