Open Source and information security mailing list archives
Date:	Mon, 1 Feb 2010 11:09:30 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	akpm@...ux-foundation.org, Ingo Molnar <mingo@...e.hu>,
	linux-kernel@...r.kernel.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
	dipankar@...ibm.com, josh@...htriplett.org, dvhltc@...ibm.com,
	niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
	Valdis.Kletnieks@...edu, dhowells@...hat.com
Subject: Re: [patch 2/3] scheduler: add full memory barriers upon task
	switch at runqueue lock/unlock

* Linus Torvalds (torvalds@...ux-foundation.org) wrote:
> 
> 
> On Sun, 31 Jan 2010, Mathieu Desnoyers wrote:
> > 
> > Adds no overhead on x86, because LOCK-prefixed atomic operations of the spin
> > lock/unlock already imply a full memory barrier.
> 
> .. and as Nick pointed out, you're fundamentally incorrect on this. 
> unlock on x86 is no memory barrier at all, since the x86 memory ordering 
> rules are such that a regular store always has release consistency.
> 
> But more importantly, you don't even explain why the added smp_mb() 
> helps.
> 
> Why does a smp_mb() at the lock/unlock even matter? Reading accesses by 
> the same CPU sure as hell do _not_ matter, so the whole concept seems 
> totally broken. There is no way in _hell_ that whatever unlocked thing 
> can ever write the variables protected by the lock, only read them. So a 
> full memory barrier makes zero sense to begin with.
> 
> So what are these magical memory barriers all about?
> 
> 		Linus

To show the ordering required by sys_membarrier, let's first express
sys_membarrier() succinctly:

  smp_mb();
  rcu_read_lock(); /* ensures validity of cpu_curr(cpu) tasks */
  for_each_cpu(cpu, mm_cpumask(current->mm))
    if (current->mm == cpu_curr(cpu)->mm)
      smp_call_function_single(cpu, membarrier_ipi, NULL, 1);
  rcu_read_unlock();
  smp_mb();

We need to guarantee that between the two memory barriers issued by
sys_membarrier(), we run a membarrier_ipi on every other running thread
belonging to the same process.

Where we have to be careful is that the context switch does not imply
memory barriers both before and after these two updates:

- the mm_cpumask update
- the rq->curr update

Not having memory barriers between user-space instruction execution and
these updates performed by the scheduler leaves room for a race between
sys_membarrier() and the scheduler, where we would not deliver a
membarrier_ipi to a thread which either starts running or is about to be
context switched.
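
Concretely, the race can be pictured like this (a sketch only; CPU 1 is
switching in a thread belonging to the same process):

  CPU 0 (sys_membarrier)           CPU 1 (scheduler)
  ----------------------           -----------------
  smp_mb();
  read mm_cpumask(current->mm)
    -> CPU 1's bit not yet set,
       so CPU 1 is skipped
                                   set_bit(cpu, mm_cpumask)  [no mb]
                                   rq->curr = next           [no mb]
                                   return to user-space; the new
                                   thread's loads/stores are not
                                   ordered w.r.t. CPU 0
  smp_mb();

No membarrier_ipi is ever sent to CPU 1, so the newly running thread
escapes the pair of barriers sys_membarrier() is supposed to provide.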

One option is to surround the context switch by memory barriers. If we
choose not to do that, we are left with these solutions:

We can deal with the rq->curr update by holding the rq lock in each
iteration of the for_each_cpu(cpu, mm_cpumask(current->mm)) loop. This
ensures that whenever rq->curr is updated, an associated memory barrier
is issued (on x86, for instance, implied by the write to cr3 performed
while the rq lock is held).
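
The loop would then look something along these lines (a sketch, not a
tested patch; the exact locking primitives are assumed from the
scheduler code of that era):

  smp_mb();
  rcu_read_lock(); /* ensures validity of cpu_curr(cpu) tasks */
  for_each_cpu(cpu, mm_cpumask(current->mm)) {
    raw_spin_lock_irqsave(&cpu_rq(cpu)->lock, flags);
    match = (current->mm == cpu_curr(cpu)->mm);
    raw_spin_unlock_irqrestore(&cpu_rq(cpu)->lock, flags);
    if (match)
      smp_call_function_single(cpu, membarrier_ipi, NULL, 1);
  }
  rcu_read_unlock();
  smp_mb();

Taking the remote CPU's rq lock serializes the rq->curr read against
the context switch in progress on that CPU.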

However, this does not deal with the mm_cpumask update, which we cannot
protect with the per-cpu rq lock: the cpumask is a process-wide data
structure updated with clear_bit/set_bit in switch_mm(). So at the very
least, we would have to add memory barriers around these updates in
switch_mm() on some architectures.
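
For instance, switch_mm() would need something along these lines
(purely illustrative, not a tested patch):

  /* in switch_mm(), on architectures where set_bit/clear_bit
     do not imply a full barrier: */
  smp_mb(); /* order prior user-space accesses before cpumask update */
  cpumask_set_cpu(cpu, mm_cpumask(next));
  /* ... switch page tables ... */
  smp_mb(); /* order cpumask update before the next thread's
               user-space accesses */

On x86 these would cost nothing extra around the set_bit itself, since
the LOCK-prefixed bit operations already imply full barriers.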

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68