lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150829092514.GA3240@fixme-laptop.cn.ibm.com>
Date:	Sat, 29 Aug 2015 17:25:52 +0800
From:	Boqun Feng <boqun.feng@...il.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Michal Hocko <mhocko@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	David Howells <dhowells@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Jonathan Corbet <corbet@....net>
Subject: Re: wake_up_process implied memory barrier clarification

Hi Oleg

On Fri, Aug 28, 2015 at 06:06:37PM +0200, Oleg Nesterov wrote:
> On 08/28, Michal Hocko wrote:
> >
> > On Thu 27-08-15 20:26:54, Oleg Nesterov wrote:
> > > On 08/27, Michal Hocko wrote:
> > > >
> > > > --- a/Documentation/memory-barriers.txt
> > > > +++ b/Documentation/memory-barriers.txt
> > > > @@ -2031,6 +2031,9 @@ something up.  The barrier occurs before the task state is cleared, and so sits
> > > >  	    <general barrier>		  STORE current->state
> > > >  	LOAD event_indicated
> > > >
> > > > +Please note that wake_up_process is an exception here because it implies
> > > > +the write memory barrier unconditionally.
> > > > +
> > >
> > > I simply can't understand (can't even parse) this part of memory-barriers.txt.
> >
> > Do you mean the added text or the example above it?
> 
> Both ;)
> 
> but note that "load from X might see 0" is true of course, and in this

By this, I think you actually means the example below the added text,
i.e. the example for "to repeat..", right?

I think that's not a good example, and actually that example explains
that the barriers in -wait_event()- rather than in -wake_up()- are
conditional.

I wrote a patch to make things more clear, hope that helps.

(Add Paul and Jonathan to CCed)

Regards,
Boqun

> sense wake_up_process() is not exception:
> 
> 	X = 1;
> 	wmb();	// unless I am totally confused this just adds more confusion
> 	Y = 1;
> 	wake_up_process(TASK);
> 
> vs TASK doing
> 
> 	for (;;) {
> 		set_current_state(...);
> 		if (Y)
> 			break;
> 		schedule();
> 	}
> 
> 	BUG_ON(X == 0)
> 
> is not correct, it can actually can hit the BUG_ON() above. However, if
> wake_up_process() actually wakes a sleeping TASK up, then it should also
> see X = 1. Even without wmb(), even if we do
> 
> 	Y = 1;
> 	X = 1;
> 	wake_up_process(TASK);
> 

----------->8
Subject: Documentation: call out conditional barriers of wait_*() and wake_up*()

The memory barriers in some sleep and wakeup functions are conditional,
there are several situations that there is no barriers:

1.	If wait_event() and co. actually don't prepare to sleep, there
	may be no barrier in them.

2.	No matter whether a sleep occurs or not, there may be no barrier
	between a successful wait-condition checking(the result of which
	is true) in wait_event() and the following instructions.

3.	If wake_up() and co. actually wake up no one, there may be no
	write barrier in them.

However, the current version of memory-barriers.txt doesn't call these
out. Further more, the example which wanted to explain that write
barriers in wake_up*() are conditional fails its job. To see that,
consider a similar example:

	CPU 1				CPU 2
	===============================	===============================
	X = 1;
	smp_mb();
	Y = 1;				wait_event(wq, Y == 1);
					  load from Y sees 1, no memory barrier
	smp_wmb();
	wake_up();
					load from X might see 0

CPU 2's load from X might still be 0 even if we add a write barrier
explicitly, which means that even if wake_up() guarantees a write
barrier, the original example still doesn't have the desired order
guarantee. In fact, the original example can explains the conditionality
of barriers in wait_*() rather than wake_up*().

This patch makes things clear, by calling out that the barriers in
wait_*() and wake_up*() are conditional and giving exact examples to
explain that.

Signed-off-by: Boqun Feng <boqun.feng@...il.com>
---

 Documentation/memory-barriers.txt | 81 +++++++++++++++++++++++++++++++--------
 1 file changed, 65 insertions(+), 16 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index eafa6a5..df69841 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1975,7 +1975,8 @@ set_current_state() may be wrapped by:
 
 which therefore also imply a general memory barrier after setting the state.
 The whole sequence above is available in various canned forms, all of which
-interpolate the memory barrier in the right place:
+imply a general barrier if and only if a sleep is at least about to happen,
+i.e. prepare_to_wait*() is called.
 
 	wait_event();
 	wait_event_interruptible();
@@ -1986,6 +1987,9 @@ interpolate the memory barrier in the right place:
 	wait_on_bit();
 	wait_on_bit_lock();
 
+Further more, no barrier is guaranteed after the successful wait condition
+checkings, whose results are true, in wait_*() and before the instructions
+following wait_*().
 
 Secondly, code that performs a wake up normally follows something like this:
 
@@ -2009,21 +2013,6 @@ between the STORE to indicate the event and the STORE to set TASK_RUNNING:
 	    <general barrier>		  STORE current->state
 	LOAD event_indicated
 
-To repeat, this write memory barrier is present if and only if something
-is actually awakened.  To see this, consider the following sequence of
-events, where X and Y are both initially zero:
-
-	CPU 1				CPU 2
-	===============================	===============================
-	X = 1;				STORE event_indicated
-	smp_mb();			wake_up();
-	Y = 1;				wait_event(wq, Y == 1);
-	wake_up();			  load from Y sees 1, no memory barrier
-					load from X might see 0
-
-In contrast, if a wakeup does occur, CPU 2's load from X would be guaranteed
-to see 1.
-
 The available waker functions include:
 
 	complete();
@@ -2042,6 +2031,66 @@ The available waker functions include:
 	wake_up_poll();
 	wake_up_process();
 
+To repeat, barriers in wait_event(), wake_up() and co. are conditional, meaning
+they are present if and only if a sleep or a wakeup actually occurs. To see
+this, consider the following three examples.
+
+The first example is for the conditional barriers in wait_*(), say X and Y
+are both initially zero:
+
+	CPU 1				CPU 2
+	===============================	===============================
+					X = 1;
+					smp_mb();
+	wait_event(wq, Y == 1);		Y = 1;
+	  load from Y sees 1		wake_up();
+	  <no memory barrier>
+	load from X might see 0
+
+And even if a sleep and a wakeup really occurs, there might be no barrier
+between the last load from Y(which the makes wait condition checking succeed)
+and load from X, so that load from X might still see 0. The second example shows
+that, the code running on CPU 1 & 2 is the same with the first example, however
+the sequence of events are different, say X, Y and Z are all initially zero,
+and another task may be waiting in wait_event(wq, Z == 1) on CPU 4:
+
+	CPU 1				CPU 2				CPU 3
+	===============================	=============================== ===============================
+	wait_event(wq, Y == 1);						Z = 1;
+	  load from Y sees 0
+	  prepare_to_wait_event();
+	    <general barrier>
+	  schedule();
+	    <general barrier>
+									wake_up();
+	  prepare_to_wait_event();
+	    <general barrier>
+					X = 1;
+					smp_mb();
+	  load from Y sees 1		Y = 1;
+	  <no memory barrier>		wake_up();
+	load from X might see 0
+
+So that, to guarantee that CPU 1's load from X is 1 in the two examples above,
+a read barrier must be added after wait_event() in the code running on CPU 1.
+
+The third example is for the conditional write barriers in wake_up*(), say
+X, Y and Z are all initially zero:
+
+	CPU 1				CPU 2
+	===============================	===============================
+					X = 1;
+	wait_event(wq, Y == 1);		Y = 1;
+	  load from Y sees 1		wake_up();
+					  <no memory barrier>
+					Z = 1;
+	load from Z sees 1
+	smp_rmb();
+	load from X might see 0
+
+In contrast, if a wakeup does occur, CPU 1's load from X would be guaranteed
+to see 1. To guarantee that CPU 1's load from X is 1, a write barrier must be
+added between store to X and store to Z in the code running on CPU 2.
 
 [!] Note that the memory barriers implied by the sleeper and the waker do _not_
 order multiple stores before the wake-up with respect to loads of those stored

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ