linux-kernel - Re: wake_up_process implied memory barrier clarification

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150901163923.GA20055@redhat.com>
Date:	Tue, 1 Sep 2015 18:39:23 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Boqun Feng <boqun.feng@...il.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Michal Hocko <mhocko@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	David Howells <dhowells@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jonathan Corbet <corbet@....net>
Subject: Re: wake_up_process implied memory barrier clarification

On 09/01, Boqun Feng wrote:
>
> On Tue, Sep 01, 2015 at 11:59:23AM +0200, Oleg Nesterov wrote:
> >
> > And just in case, wake_up() differs in a sense that it doesn't even need
> > that STORE-LOAD barrier in try_to_wake_up(), we can rely on
> > wait_queue_head_t->lock. Assuming that wake_up() pairs with the "normal"
> > wait_event()-like code.

Looks like, you have missed this part of my previous email. See below.

> I think maybe I have a misunderstanding of barrier pairing.

Or me. I can only say how it is supposed to work.

> think that a barrier pairing can only happen:

Well, no. See for example https://lkml.org/lkml/2014/7/16/310
Or, say, the comment in completion_done().

And please do not assume I can answer authoritatively the questions
in this area. Fortunately we have paulmck/peterz in CC, they can
correct me.

Plus I didn't sleep today, not sure I can understand you correctly
and/or answer your question ;)

> One example of #2 pairing is the following sequence of events:
>
> Initially X = 0, Y = 0
>
> CPU 1				CPU 2
> ===========================	================================
> WRITE_ONCE(Y, 1);
> smp_mb();
> r1 = READ_ONCE(X); // r1 == 0
> 				WRITE_ONCE(X, 1);
> 				smp_mb();
> 				r2 = READ_ONCE(Y);
>
> ----------------------------------------------------------------
> { assert(!(r1 == 0 && r2 == 0)); // means if r1 == 0 then r2 == 1}
>
> If CPU 1 _fails_ to read the value of X written by CPU 1, r2 is
> guaranteed to 1, which means assert(!(r1 == 0 && r2 == 0)) afterwards
> wouldn't be triggered in any case.
>
> And this is actually the case of wake_up/wait, assuming that
> prepare_to_wait() is called on CPU 1 and wake_up() is called on CPU 2,

See above, wake_up/wait_event are fine in any case because they use
the same lock.

But as for try_to_wake_up() you are right, we rely on STORE-MB-LOAD.
Now let me quote another previous email,

	So in fact try_to_wake_up() needs mb() before it does

		if (!(p->state & state))
			goto out;

	But mb() is heavy, we can use wmb() instead, but only in this particular
	case: before spin_lock(), which guarantees that LOAD(p->state) can't leak
	out of the critical section. And since spin_lock() itself is the STORE,
	this guarantees that STORE(CONDITION) can't leak _into_ the critical section
	and thus it can't be reordered with LOAD(p->state).

And I think that smp_mb__before_spinlock() + spin_lock() should pair
correctly with mb(). If not - we should redefine it.

> X is the condition and Y is the task state,

Yes,

> and replace smp_mb() with really necessary barriers, right?

Sorry, can't understand this part...

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/