linux-kernel - Re: blk-throttle: Correct the placement of smp

Message-ID: <20101209014519.GO2094@linux.vnet.ibm.com>
Date:	Wed, 8 Dec 2010 17:45:19 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Vivek Goyal <vgoyal@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: blk-throttle: Correct the placement of smp_rmb()

On Wed, Dec 08, 2010 at 11:06:40PM +0100, Oleg Nesterov wrote:
> On 12/08, Oleg Nesterov wrote:
> >
> > Unfortunately, I can't prove this. You can ask
> > Paul McKenney if you want the authoritative answer.
> 
> Well. I think we should ask ;) This is interesting.
> 
> Paul, could you please shed some light?
> 
> Suppose we have 2 variables, A = 0 and B = 0.
> 
> 	CPU0 does:
> 
> 		A = 1;
> 		wmb();
> 		B = 1;
> 
> 	CPU1 does:
> 
> 		B = 0;
> 		mb();
> 		if (A)
> 			A = 2;
> 
> My understanding is: after that we can safely assume that
> 
> 	B == 1 || A == 2
> 
> IOW. Either CPU1 notices that A was changed, or CPU0 "wins"
> and sets B = 1 "after" CPU1. But, it is not possible that
> CPU1 clears B "after" it was set by CPU0 _and_ sees A == 0.
> 
> Is it true? I think it should be true, but I can't prove it.

I was afraid that a question like this might be coming...  ;-)

The question is whether you can rely on the modification order of the
stores to B to deduce anything useful about the order in which the
accesses to A occurred.  My current answer is that I believe you can
for a simple example such as the one above, but I am checking with
the hardware guys.  In addition, please note that I am not sure whether
all possible generalizations do what you want.  For example, imagine a
1024-CPU system in which the first 1023 CPUs do:

	A[smp_processor_id()] = 1;
	wmb();
	B = smp_processor_id();

where the elements of A are cache-line aligned and padded.  Suppose
that the remaining CPU does:

	i = random() % 1023;
	B = -1;
	mb();
	if (A[i])
		A[i] = 2;

Are we guaranteed that B != -1 || A[i] == 2?

In this case, it could take all of the CPUs quite some time to come to
agreement on the order of all 1024 assignments to B.  I am bugging some
hardware guys about this.  It has been a while, so they forgot to run
away when they saw me coming.  ;-)
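
For anyone who wants to experiment with the many-writers case from
userspace, here is a minimal sketch using C11 atomics and pthreads.
It rests on several assumptions that are not in the example above:
NWRITERS is a small hypothetical stand-in for the 1023 writer CPUs,
a 64-byte cache line is assumed for the padding, a release fence is
used as a rough analog of wmb(), a seq_cst fence as a rough analog of
mb(), and the names writer(), checker() and trial() are made up for
illustration.  It only observes outcomes.  It does not settle whether
the condition is guaranteed, and thread start-up skew makes the
interesting interleavings rare.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NWRITERS 4	/* hypothetical stand-in for the 1023 writer CPUs */

/* Each element gets its own (assumed 64-byte) cache line. */
struct padded_int {
	_Alignas(64) atomic_int v;
};

static struct padded_int A[NWRITERS];
static atomic_int B;

static void *writer(void *arg)
{
	int id = *(int *)arg;

	atomic_store_explicit(&A[id].v, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* assumed analog of wmb() */
	atomic_store_explicit(&B, id, memory_order_relaxed);
	return NULL;
}

static void *checker(void *arg)
{
	int i = *(int *)arg;

	atomic_store_explicit(&B, -1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed analog of mb() */
	if (atomic_load_explicit(&A[i].v, memory_order_relaxed))
		atomic_store_explicit(&A[i].v, 2, memory_order_relaxed);
	return NULL;
}

/*
 * Run one trial with the checker looking at A[i].  Stores the final
 * values of B and A[i] and returns 1 if B != -1 || A[i] == 2 held.
 */
int trial(int i, int *final_b, int *final_a)
{
	pthread_t w[NWRITERS], c;
	int ids[NWRITERS];
	int k, rc;

	/* Reset shared state before any thread of this trial exists. */
	for (k = 0; k < NWRITERS; k++)
		atomic_store(&A[k].v, 0);
	atomic_store(&B, 0);

	rc = pthread_create(&c, NULL, checker, &i);
	assert(rc == 0);
	for (k = 0; k < NWRITERS; k++) {
		ids[k] = k;
		rc = pthread_create(&w[k], NULL, writer, &ids[k]);
		assert(rc == 0);
	}
	pthread_join(c, NULL);
	for (k = 0; k < NWRITERS; k++)
		pthread_join(w[k], NULL);
	(void)rc;

	*final_b = atomic_load(&B);
	*final_a = atomic_load(&A[i].v);
	return *final_b != -1 || *final_a == 2;
}
```

Calling trial() in a loop with i chosen at random below NWRITERS and
counting the zero returns gives a rough feel for how a given machine
behaves, nothing more.  A run with no zero returns proves nothing.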

> This reminds me of the old (and long) discussion about STORE-MB-LOAD.
> IIRC, it was finally decided that
> 
> 	CPU0:				CPU1:
> 
> 	A = 1;				B = 1;
> 	mb();				mb();
> 	if (B)				if (A)
> 		printf("Yes");			printf("Yes");
> 
> should print "Yes" at least once. This looks very similar to
> the previous example.

From a hardware point of view, this example is very different from the
earlier one.  You are not using the order of independent CPUs' stores to a
single variable here, and in addition you are using mb() everywhere instead
of a combination of mb() and wmb().  So, yes, this one is guaranteed to work.
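
A userspace rendering of the STORE-MB-LOAD pattern, as a minimal sketch
with C11 atomics and pthreads.  Mapping mb() onto
atomic_thread_fence(memory_order_seq_cst) is an assumption made for
illustration, as are the names cpu0(), cpu1() and sb_trial().  This is
not kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int A, B;
static int yes0, yes1;	/* 1 if the corresponding CPU would print "Yes" */

static void *cpu0(void *unused)
{
	(void)unused;
	atomic_store_explicit(&A, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed analog of mb() */
	yes0 = atomic_load_explicit(&B, memory_order_relaxed) != 0;
	return NULL;
}

static void *cpu1(void *unused)
{
	(void)unused;
	atomic_store_explicit(&B, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed analog of mb() */
	yes1 = atomic_load_explicit(&A, memory_order_relaxed) != 0;
	return NULL;
}

/* Run the test once and return how many of the two threads saw "Yes". */
int sb_trial(void)
{
	pthread_t t0, t1;
	int rc;

	/* Reset shared state before either thread of this trial exists. */
	atomic_store(&A, 0);
	atomic_store(&B, 0);
	yes0 = yes1 = 0;

	rc = pthread_create(&t0, NULL, cpu0, NULL);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, cpu1, NULL);
	assert(rc == 0);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	(void)rc;

	return yes0 + yes1;
}
```

With full fences on both sides, sb_trial() should return 1 or 2 on every
call.  A return of 0 would correspond to neither CPU printing "Yes".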

But what the heck are you guys really trying to do, anyway?  ;-)

							Thanx, Paul