Message-ID: <20111221174057.GA32347@redhat.com>
Date:	Wed, 21 Dec 2011 18:40:58 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	David Howells <dhowells@...hat.com>
Subject: Re: [PATCH for-3.3] mempool: clean up and document synchronization
	and memory barrier usage

Hi,

On 12/21, Tejun Heo wrote:
>
> Hello, Oleg.
>
> On Wed, Dec 21, 2011 at 03:55:56PM +0100, Oleg Nesterov wrote:
> > > The wmb can be the implied one from unlocking pool->lock, and the smp_mb()
> > > in mempool_free() can be replaced with smp_rmb().
> >
> > OK... I do not know if this is a bug or not, but I agree this race
> > is possible.
>
> I'm not entirely sure whether this is a condition we need to
> guarantee, but if a scenario like the above ever happens, it's gonna
> be a hellish one to track down.

Yes, yes, I agree. In fact I was going to remove this sentence after
I actually read the patch, but forgot.

And in any case, it is very nice to document the barriers.

> > Not really afaics. unlock() doesn't imply wmb().
> >
> > So, if some thread does PTR = mempool_alloc(), the "final" store to PTR
> > can leak into the critical section, and it can be reordered inside this
> > section with --curr_nr.
> >
> > See the fat comment about set_current_state() in prepare_to_wait().
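
(Just to spell out the reordering I mean above; this is an illustrative
sketch only, simplified from mempool.c:

	spin_lock(&pool->lock);
	element = pool->elements[--pool->curr_nr];	/* take from the reserve */
	spin_unlock(&pool->lock);			/* one-way barrier only */
	...
	PTR = element;					/* the "final" store */

without a wmb in between, another CPU can see PTR != NULL while its
lockless curr_nr check in mempool_free() still sees the old, not yet
decremented value.)
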
>
> Hmmm.... I'm quoting part of it here.  (cc'ing David and Paul, hi guys!)
>
> /*
>  * Note: we use "set_current_state()" _after_ the wait-queue add,
>  * because we need a memory barrier there on SMP, so that any
>  * wake-function that tests for the wait-queue being active
>  * will be guaranteed to see waitqueue addition _or_ subsequent
>  * tests in this thread will see the wakeup having taken place.
>  *
>  * The spin_unlock() itself is semi-permeable and only protects
>  * one way (it only protects stuff inside the critical region and
>  * stops them from bleeding out - it would still allow subsequent
>  * loads to move into the critical region).
>  */
>
> The first paragraph is saying that a full barrier (for both stores
> and loads) is necessary at that point, and the second paragraph is a
> bit confusing, but the last sentence seems to say that only loads
> after the unlock can creep above the unlock,

Probably, this is because the comment tries to explain the possible
reordering with the subsequent "if (condition)" check, so it only
mentions loads.

> I think clearer explanation is in memory-barrier.txt.
>
>  (6) UNLOCK operations.
>
>      This also acts as a one-way permeable barrier.  It guarantees that all
>      memory operations before the UNLOCK operation will appear to happen before
>      the UNLOCK operation with respect to the other components of the system.
>
>      Memory operations that occur after an UNLOCK operation may appear to
>      happen before it completes.
>
>      LOCK and UNLOCK operations are guaranteed to appear with respect to each
>      other strictly in the order specified.
>
>      The use of LOCK and UNLOCK operations generally precludes the need for
>      other sorts of memory barrier (but note the exceptions mentioned in the
>      subsection "MMIO write barrier").
>
> So, in a nutshell, I think what it's saying w.r.t. stores is that
> stores before the unlock may not be deferred across the unlock, but
> stores after the unlock may creep upwards past it, so it acts as a
> one-way write barrier.  And, yeah, that would be the minimal
> requirement for proper lock operation.  Curious whether any processor
> implements such a fine-grained barrier tho, maybe alpha does that too?
>
> Paul, David, am I understanding it correctly?
>
> Anyways, yeah, you're right.  We need a smp_wmb() before returning but
> I think the comment on top of prepare_to_wait() is misleading.

Hmm. I am not sure I understand... Although almost everything written
in English looks misleading to me ;)

I think that "Memory operations that occur after" above means both
loads and stores, so the comment is probably fine. set_current_state()
inserts the barrier between store and load:


	__add_wait_queue();	// STORE: add ourselves to the wait queue

	mb();			// set_current_state(): full barrier

	unlock();		// one-way, can't serialize with LOAD(condition) below

	if (!condition)		// LOAD: must not move above the STORE
		schedule();

(let's forget about task->state to simplify)
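
For completeness, the waker side I have in mind is the usual pairing
(again just a sketch, assuming the waker does the waitqueue_active()
check the quoted comment talks about; wq and condition stand for
whatever the caller actually uses):

	condition = true;		// STORE
	smp_mb();			// pairs with mb() in set_current_state()
	if (waitqueue_active(&wq))	// LOAD
		wake_up(&wq);

If the waiter's mb() were missing, its LOAD of condition could be
reordered before its STORE into the wait queue, and the two sides
could miss each other.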

> Great, thanks.  I'll wait a bit for further comments and repost w/
> smp_wmb() added.

Well. This is almost off-topic, but perhaps we can add
smp_mb__after_unlock()? We already have smp_mb__after_lock().
Afaics prepare_to_wait() could use it.

I am not talking about performance issues, I just think the code
would be more understandable.
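
Something like this is what I mean; smp_mb__after_unlock() does not
exist, so this is purely a sketch of the idea (the mb() moves from
set_current_state() to after the unlock):

	/* hypothetical, by analogy with smp_mb__after_lock(): */
	#define smp_mb__after_unlock()	smp_mb()

	void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state)
	{
		unsigned long flags;

		wait->flags &= ~WQ_FLAG_EXCLUSIVE;
		spin_lock_irqsave(&q->lock, flags);
		if (list_empty(&wait->task_list))
			__add_wait_queue(q, wait);
		__set_current_state(state);		/* plain store, no mb() */
		spin_unlock_irqrestore(&q->lock, flags);
		smp_mb__after_unlock();	/* what the "if (condition)" check relies on */
	}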

Oleg.

