Message-ID: <20110513162646.GW2258@linux.vnet.ibm.com>
Date:	Fri, 13 May 2011 09:26:46 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Yinghai Lu <yinghai@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL rcu/next] rcu commits for 2.6.40

On Fri, May 13, 2011 at 05:07:44PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > On Fri, May 13, 2011 at 03:12:18PM +0200, Ingo Molnar wrote:
> > > 
> > > * Ingo Molnar <mingo@...e.hu> wrote:
> > > 
> > > > I started bisecting this, and the two relevant endpoints:
> > > > 
> > > >   bad: 11c476f: net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
> > > >  good: 0ee5623f: Linux 2.6.39-rc6
> > > > 
> > > > very clearly indicate that this is an RCU regression.
> > > 
> > > This might be the same one Yinghai found:
> > > 
> > >  e59fb3120bec: rcu: Decrease memory-barrier usage based on semi-formal proof
> > > 
> > > So with the config I sent it's definitely reproducible.
> > > 
> > > At first sight, couldn't this be related not to barriers, but to not setting
> > > need_resched() like we did before?
> > 
> > Thank you both!!!  I had inspected the commit, but missed the fact that
> > the new version refuses to call set_need_resched() if irqs are enabled.  :-(
> > The following (untested) patch restores the set_need_resched() operation.
> 
> Btw., in hindsight, e59fb3120bec was a tad big, which made analysis harder.
> 
> Would it have been possible to split it in two, one for the movement of the 
> notifiers, the other for the barrier changes?
> 
> That way the bisection would have fingered the movement commit. Or so.

In hindsight, that certainly would have been better.

> > Does this help?
> 
> No, unfortunately not, the long delay is still there:
> 
> device: 'ttyS0': device_add
> PM: Adding info for No Bus:ttyS0
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 0} (detected by 1, t=6002 jiffies)

I was afraid of that...

On the off-chance that moving the memory barriers was at fault,
the following patch restores all of them that don't have in situ
replacements.  Grasping at straws, admittedly.

						Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8c490ef..a4a2ef0 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1449,10 +1449,12 @@ __rcu_process_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
  */
 static void rcu_process_callbacks(void)
 {
+	smp_mb();
 	__rcu_process_callbacks(&rcu_sched_state,
 				&__get_cpu_var(rcu_sched_data));
 	__rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
 	rcu_preempt_process_callbacks();
+	smp_mb();
 
 	/* If we are last CPU on way to dyntick-idle mode, accelerate it. */
 	rcu_needs_cpu_flush();
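
For anyone reading along outside the kernel tree, the shape of the change
above can be sketched in userspace C11. This is only an analogy under
stated assumptions: full_barrier() is a hypothetical stand-in for smp_mb()
built on atomic_thread_fence(), and struct cb_queue, process_queue() and
process_all() are made-up names, not kernel code. It shows nothing more
than a full fence placed before and after a batch of per-flavor callback
processing, which is what the two added smp_mb() lines do.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's smp_mb(): a full C11 fence. */
static inline void full_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Hypothetical per-flavor state; NOT the kernel's struct rcu_data. */
struct cb_queue {
	int pending;	/* callbacks waiting to be invoked */
	int invoked;	/* callbacks already invoked */
};

/* Invoke everything pending on one queue; return how many were invoked. */
static int process_queue(struct cb_queue *q)
{
	int n = q->pending;

	q->invoked += n;
	q->pending = 0;
	return n;
}

/*
 * Same shape as the patched rcu_process_callbacks(): one full barrier
 * before the per-flavor processing and one after it, so that accesses
 * made while processing are ordered against what came before and what
 * comes after on this thread.
 */
static int process_all(struct cb_queue *queues, size_t nr)
{
	int total = 0;
	size_t i;

	full_barrier();
	for (i = 0; i < nr; i++)
		total += process_queue(&queues[i]);
	full_barrier();
	return total;
}
```

Calling process_all() on an array of two queues drains both and returns
the combined count. The fences do not change the single-threaded result;
they only constrain how other threads may observe the ordering.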
