
Message-ID: <20150218155904.GA27687@redhat.com>
Date:	Wed, 18 Feb 2015 16:59:04 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Manfred Spraul <manfred@...orfullife.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Kirill Tkhai <ktkhai@...allels.com>,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
	Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH 2/2] [PATCH] sched: Add smp_rmb() in task rq locking
	cycles

(Forgot to add Manfred, resending)

Thanks Paul and Peter, this was interesting reading ;)

This is almost off-topic (but see below), but perhaps memory-barriers.txt
could also mention spin_unlock_wait() to explain that _most probably_ it is
pointless without memory barrier(s), and that the barrier before-or-after
unlock_wait() pairs with release-or-acquire.

At the same time, code like

	spin_unlock_wait();
	STORE;

_can_ be correct because it implies a load-store control dependency: the
store is only executed after the load in spin_unlock_wait() has seen the
lock released.
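
To make the shape of that pattern concrete, here is a minimal userspace
sketch in C11. It is not the kernel implementation: model_lock_t,
model_unlock_wait() and wait_then_store() are hypothetical names invented
for illustration, and a plain atomic flag stands in for the spinlock. Note
also that the C11 memory model does not formally promise ordering from
control dependencies; the kernel relies on its own access primitives and on
hardware behaviour for that, so treat this only as an illustration of where
the dependency sits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a spinlock: true means "held". */
typedef struct { atomic_bool locked; } model_lock_t;

/* Spin until the lock is observed free. Relaxed loads only, no barrier. */
static void model_unlock_wait(model_lock_t *l)
{
	while (atomic_load_explicit(&l->locked, memory_order_relaxed))
		;	/* busy-wait */
}

static model_lock_t lock = { false };	/* starts out unlocked */
static atomic_int data;

static void wait_then_store(void)
{
	model_unlock_wait(&lock);
	/*
	 * This store is reached only after the loop above has loaded
	 * "unlocked": the load-store control dependency from the text.
	 */
	atomic_store_explicit(&data, 1, memory_order_relaxed);
}
```

With the lock initially free, wait_then_store() falls straight through the
loop and performs the store.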

On 02/17, Paul E. McKenney wrote:
>
>       |   mb  |  wmb  |  rmb  |  rbd  |  acq  |  rel  |  ctl  |
>  -----+-------+-------+-------+-------+-------+-------+-------+
>    mb |   Y   |       |   Y   |   y   |   Y   |       |   Y   +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   wmb |   Y   |       |   Y   |   y   |   Y   |       |   Y   +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   rmb |       |       |       |       |       |       |       +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   rbd |       |       |       |       |       |       |       +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   acq |       |       |       |       |       |       |       +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   rel |   Y   |       |   Y   |   y   |   Y   |       |   Y   +
>  -----+-------+-------+-------+-------+-------+-------+-------+
>   ctl |       |       |       |       |       |       |       +
>  -----+-------+-------+-------+-------+-------+-------+-------+

OK, so "acq" can't pair with "acq", and I am not sure I understand.

First of all, it is not clear to me how you can even try to pair them
unless you do something like spin_unlock_wait(). I would like to see
an example which is not "obviously wrong".

At the same time, if you play with spin_unlock_wait() or spin_is_locked()
then acq can pair with acq?

Let's look at sem_lock(). I have never looked at this code before, so I
could easily be wrong; Manfred will correct me. But at first glance we can
write the oversimplified pseudo-code:

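	spinlock_t local, global;

	/* returns true if only "local" is held, false if "global" is held */
	bool my_lock(bool try_local)
	{
		if (try_local) {
			spin_lock(&local);
			/* fast path: nobody holds "global" */
			if (!spin_is_locked(&global))
				return true;
			/* "global" is held, fall back to the slow path */
			spin_unlock(&local);
		}

		spin_lock(&global);
		/* wait for the current holder of "local", if any */
		spin_unlock_wait(&local);
		return false;
	}

	/* drop_local is the value returned by my_lock() */
	void my_unlock(bool drop_local)
	{
		if (drop_local)
			spin_unlock(&local);
		else
			spin_unlock(&global);
	}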

It assumes that the "local" lock is cheaper than "global"; the usage is

	bool xxx = my_lock(condition);
	/* CRITICAL SECTION */
	my_unlock(xxx);

Now. Unless I missed something, my_lock() does NOT need a barrier BEFORE
spin_unlock_wait() (or spin_is_locked()). Either my_lock(true) should see
spin_is_locked(global) == T, or my_lock(false)->spin_unlock_wait() should
see that "local" is locked and wait.

Doesn't this mean that acq can pair with acq, or am I totally confused?
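
For readers who want to experiment with the pseudo-code above, the
following is a compilable userspace sketch in C11. Everything prefixed
model_ is a hypothetical stand-in, not a kernel API, and the two locks are
renamed local_lock and global_lock. One loud assumption: every atomic
operation here uses the default sequentially consistent ordering, which is
stronger than what the kernel's spin_lock() provides. The sketch therefore
shows the structure of my_lock() and my_unlock() only; it does not settle
the question of whether acq can pair with acq.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a spinlock: true means "held". */
typedef struct { atomic_bool locked; } model_lock_t;

static void model_lock(model_lock_t *l)
{
	/* atomic_exchange() returns the old value; retry while it was held */
	while (atomic_exchange(&l->locked, true))
		;
}

static void model_unlock(model_lock_t *l)
{
	atomic_store(&l->locked, false);
}

static bool model_is_locked(model_lock_t *l)
{
	return atomic_load(&l->locked);
}

static void model_unlock_wait(model_lock_t *l)
{
	while (atomic_load(&l->locked))
		;	/* busy-wait until the holder drops the lock */
}

static model_lock_t local_lock = { false }, global_lock = { false };

/* Returns true if only local_lock is held, false if global_lock is held. */
static bool my_lock(bool try_local)
{
	if (try_local) {
		model_lock(&local_lock);
		if (!model_is_locked(&global_lock))
			return true;
		model_unlock(&local_lock);
	}

	model_lock(&global_lock);
	model_unlock_wait(&local_lock);
	return false;
}

static void my_unlock(bool drop_local)
{
	if (drop_local)
		model_unlock(&local_lock);
	else
		model_unlock(&global_lock);
}
```

Called from a single thread with both locks free, my_lock(true) takes the
fast path and leaves only local_lock held, while my_lock(false) takes
global_lock and passes straight through the wait on local_lock.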

Another question is whether we need a barrier AFTER spin_unlock_wait(). I
do not know what ipc/sem.c actually needs, but in general (I think) this
does need mb(). Otherwise my_lock / my_unlock itself does not have the
proper acq/rel semantics. For example, my_lock(false) can miss the changes
which were done under my_lock(true).

So I think that (in theory) sem_wait_array() needs smp_mb() at the end. But,
given that we have the control dependency, perhaps smp_rmb() is enough?
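
The "barrier AFTER the wait" idea can be sketched in userspace C11 as
follows. The names model_lock_t, holder_update_and_unlock() and
wait_then_read() are hypothetical, and the mapping from kernel barriers to
C11 fences is only approximate: a seq_cst fence is used here as a rough
analogue of smp_mb(), and an acquire fence would be the closest rough
analogue of the weaker smp_rmb() alternative. This is not what ipc/sem.c
does; it only shows where such a barrier would sit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a spinlock: true means "held". */
typedef struct { atomic_bool locked; } model_lock_t;

static model_lock_t local_lock = { true };	/* starts out held */
static int shared_value;			/* plain data guarded by the lock */

/* The lock holder: change the data, then release the lock. */
static void holder_update_and_unlock(void)
{
	shared_value = 42;
	atomic_store_explicit(&local_lock.locked, false, memory_order_release);
}

/* The waiter: wait for the lock to be free, then read the data. */
static int wait_then_read(void)
{
	while (atomic_load_explicit(&local_lock.locked, memory_order_relaxed))
		;	/* busy-wait, relaxed loads only */

	/*
	 * Without a fence here, nothing orders the read of shared_value
	 * after the load that saw "unlocked", so the holder's update
	 * could be missed.
	 */
	atomic_thread_fence(memory_order_seq_cst);
	return shared_value;
}
```

In this model the fence after the relaxed load synchronizes with the
holder's release store, which is what lets the waiter observe the update
made while the lock was held.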

Oleg.
