Date:	Thu, 7 Jan 2010 09:55:32 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Josh Triplett <josh@...htriplett.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	akpm@...ux-foundation.org, tglx@...utronix.de,
	Valdis.Kletnieks@...edu, dhowells@...hat.com, laijs@...fujitsu.com,
	dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory
	barrier

On Thu, Jan 07, 2010 at 12:44:35PM -0500, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > On Thu, Jan 07, 2010 at 06:18:36PM +0100, Peter Zijlstra wrote:
> > > On Thu, 2010-01-07 at 08:52 -0800, Paul E. McKenney wrote:
> > > > On Thu, Jan 07, 2010 at 09:44:15AM +0100, Peter Zijlstra wrote:
> > > > > On Wed, 2010-01-06 at 22:35 -0800, Josh Triplett wrote:
> > > > > > 
> > > > > > The number of threads doesn't matter nearly as much as the number of
> > > > > > threads typically running at a time compared to the number of
> > > > > > processors.  Of course, we can't measure that as easily, but I don't
> > > > > > know that your proposed heuristic would approximate it well.
> > > > > 
> > > > > Quite agreed, and not disturbing RT tasks is even more important.
> > > > 
> > > > OK, so I stand un-Reviewed-by twice in one morning.  ;-)
> > > > 
> > > > > A simple:
> > > > > 
> > > > >   for_each_cpu(cpu, current->mm->cpu_vm_mask) {
> > > > >      if (cpu_curr(cpu)->mm == current->mm)
> > > > >         smp_call_function_single(cpu, func, NULL, 1);
> > > > >   }
> > > > > 
> > > > > seems far preferable over anything else, if you really want you can use
> > > > > a cpumask to copy cpu_vm_mask in and unset bits and use the mask with
> > > > > smp_call_function_any(), but that includes having to allocate the
> > > > > cpumask, which might or might not be too expensive for Mathieu.
> > > > 
> > > > This would be vulnerable to the sys_membarrier() CPU seeing an old value
> > > > of cpu_curr(cpu)->mm, and that other task seeing the old value of the
> > > > pointer we are trying to RCU-destroy, right?
> > > 
> > > Right, so my thinking was that you want a mb to be executed when
> > > calling sys_membarrier().  If you observe a matching ->mm but the cpu
> > > has since scheduled, we're good because it scheduled (though we'll
> > > still send the IPI anyway).  If we do not observe it because the task
> > > gets scheduled in after we do the iteration, we're still good because
> > > it scheduled.
> > 
> > Something like the following for sys_membarrier(), then?
> > 
> >   smp_mb();
> 
> This smp_mb() is redundant, as we issue it through the for_each_cpu loop
> on the local CPU already.

But we need to do the smp_mb() -before- checking the first cpu_curr(cpu)->mm.
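
To make the ordering concrete, here is a rough userspace model of it, not
kernel code.  All names (model_mm, model_cpu_mm, model_membarrier,
MODEL_NR_CPUS) are made up for illustration, a C11 seq_cst fence stands in
for smp_mb(), and incrementing a counter stands in for
smp_call_function_single().  It ignores cpu_vm_mask and simply scans every
modelled CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4   /* hypothetical fixed CPU count for the model */

struct model_mm { int id; };   /* stand-in for struct mm_struct */

/* Stand-in for cpu_curr(cpu)->mm: the mm currently running on each CPU. */
static _Atomic(struct model_mm *) model_cpu_mm[MODEL_NR_CPUS];

/* Stand-in for the IPI handler: the target CPU executes a full barrier. */
static void model_ipi_mb(int cpu, int *ipis_sent)
{
	(void)cpu;
	atomic_thread_fence(memory_order_seq_cst);
	(*ipis_sent)++;
}

/* Returns how many modelled CPUs would have been sent an IPI. */
static int model_membarrier(struct model_mm *mine)
{
	int cpu, ipis_sent = 0;

	assert(mine != NULL);   /* the caller is a user process with an mm */

	/* Full barrier -before- sampling the first CPU's mm (models smp_mb()). */
	atomic_thread_fence(memory_order_seq_cst);

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_mm *seen =
			atomic_load_explicit(&model_cpu_mm[cpu],
					     memory_order_relaxed);

		/* Only disturb CPUs currently running our address space. */
		if (seen == mine)
			model_ipi_mb(cpu, &ipis_sent);
	}
	return ipis_sent;
}
```

With two of the four slots pointing at the caller's model_mm, the function
reports two IPIs.  Slots holding another mm or NULL are skipped.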

> >   for_each_cpu(cpu, current->mm->cpu_vm_mask) {
> >      if (cpu_curr(cpu)->mm == current->mm)
> >         smp_call_function_single(cpu, func, NULL, 1);
> >   }
> > 
> > Then the code changing ->mm on the other CPU also needs to have a
> > full smp_mb() somewhere after the change to ->mm, but before starting
> > user-space execution.  Which it might well have, just due to overhead,
> > but we need to make sure that someone doesn't optimize us out of existence.
> 
> I believe we also need one between execution of the userspace task and
> change to ->mm. If we have these guarantees I think we are fine.

Agreed, in case an outgoing RCU read-side critical section does a store
into an RCU-protected data structure.  Unconventional, but definitely
permitted.
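
The two barriers on the context-switch side can be sketched the same way.
Again this is only a userspace model under stated assumptions: model_cpu,
model_switch_mm and model_full_mb are hypothetical names, the C11 fence
stands in for a full smp_mb(), and the barriers counter exists purely so the
pairing can be inspected.  Where the real scheduler would place (or already
has) these barriers is not something this sketch claims to know.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct model_mm { int id; };   /* stand-in for struct mm_struct */

struct model_cpu {
	_Atomic(struct model_mm *) curr_mm;   /* models cpu_curr(cpu)->mm */
	int barriers;                         /* fences issued, for inspection */
};

/* Models a full smp_mb() on the switching CPU. */
static void model_full_mb(struct model_cpu *c)
{
	atomic_thread_fence(memory_order_seq_cst);
	c->barriers++;
}

/* Models the ->mm change during a context switch. */
static void model_switch_mm(struct model_cpu *c, struct model_mm *next)
{
	assert(c != NULL);

	/*
	 * Barrier 1: order the outgoing task's user-space accesses
	 * (including any stores from a read-side critical section)
	 * before the change to ->mm.
	 */
	model_full_mb(c);

	atomic_store_explicit(&c->curr_mm, next, memory_order_relaxed);

	/*
	 * Barrier 2: order the change to ->mm before the incoming
	 * task starts executing in user space.
	 */
	model_full_mb(c);
}
```

Each call to model_switch_mm() issues exactly two fences, one on each side
of the ->mm store, which is the guarantee being asked for above.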

							Thanx, Paul

> Mathieu
> 
> > 
> > 							Thanx, Paul
> > 
> > > As to needing to keep rcu_read_lock() around the iteration, for sure we
> > > need that to ensure the remote task_struct reference we take is valid.
> > > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/