Message-ID: <20150918134630.GW3816@twins.programming.kicks-ass.net>
Date:	Fri, 18 Sep 2015 15:46:30 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Dmitry Vyukov <dvyukov@...gle.com>, ebiederm@...ssion.com,
	Al Viro <viro@...iv.linux.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...nel.org>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>, mhocko@...e.cz,
	LKML <linux-kernel@...r.kernel.org>, ktsan@...glegroups.com,
	Kostya Serebryany <kcc@...gle.com>,
	Andrey Konovalov <andreyknvl@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Hans Boehm <hboehm@...gle.com>
Subject: Re: [PATCH] kernel: fix data race in put_pid

On Fri, Sep 18, 2015 at 03:28:44PM +0200, Oleg Nesterov wrote:
> On 09/18, Peter Zijlstra wrote:
> >
> > On Thu, Sep 17, 2015 at 08:09:19PM +0200, Oleg Nesterov wrote:
> >
> > > I need to recheck, but afaics this is not possible. This optimization
> > > is fine, but probably needs a comment.
> >
> > For sure, this code doesn't make any sense to me.
> 
> So yes, after a sleep I am starting to agree that in theory this fast-path
> check is wrong. I'll write another email..

Will that other mail include a patch adding comments to pid.c? That
code refused to make sense to me this morning.

> > As an alternative patch, could we not do:
> >
> >   void put_pid(struct pid *pid)
> >   {
> > 	struct pid_namespace *ns;
> >
> > 	if (!pid)
> > 		return;
> >
> > 	ns = pid->numbers[pid->level].ns;
> > 	if ((atomic_read(&pid->count) == 1) ||
> > 	     atomic_dec_and_test(&pid->count)) {
> >
> > +		smp_read_barrier_depends(); /* ctrl-dep */
> 
> Not sure... Firstly it is not clear what this barrier pairs with. And I
> have to admit that I can not understand if _CTRL() logic applies here.
> The same for atomic_read_ctrl().

The control dependency barrier pairs with the full barrier of
atomic_dec_and_test.

So the two put_pid() instances:

	CPU0					CPU1

	pid->foo = 1;
	atomic_dec_and_test() == false		atomic_read_ctrl() == 1
						kmem_cache_free(pid)

CPU0 modifies a pid field and decrements the count, but does not reach 0.
CPU1 finds it holds the last reference, but must also be able to observe
CPU0's store to foo, so that we can rest assured it is complete before we
free the storage.

The freeing of pid, on CPU1, consists of stores; these must not happen
before we satisfy the freeing condition. In other words we need a
load-store barrier, which is what the control dependency provides.
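
A userspace sketch of the same fast-path pattern, for illustration only.
It assumes C11 atomics rather than the kernel primitives: struct obj,
obj_alloc(), obj_put() and freed_objects are hypothetical names, and the
acquire load is a deliberately stronger stand-in for atomic_read_ctrl(),
because C11 has no way to express a bare control dependency.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pid; not the kernel structure. */
struct obj {
	atomic_int count;
	int foo;
};

/* Illustration-only hook so a caller can see how many frees happened. */
static int freed_objects;

static struct obj *obj_alloc(int refs)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->count, refs);
	o->foo = 0;
	return o;
}

/* Drop one reference; returns true if this call freed the object. */
static bool obj_put(struct obj *o)
{
	if (!o)
		return false;

	/*
	 * Fast path: if we observe count == 1 we hold the last reference.
	 * The acquire load pairs with the release half of the other CPU's
	 * decrement, so its earlier stores (e.g. to ->foo) are complete
	 * before free() starts writing to the memory. This is stronger
	 * than the load-store ordering a control dependency provides.
	 */
	if (atomic_load_explicit(&o->count, memory_order_acquire) == 1 ||
	    atomic_fetch_sub_explicit(&o->count, 1,
				      memory_order_acq_rel) == 1) {
		free(o);
		freed_objects++;
		return true;
	}
	return false;
}
```

With two references, the first obj_put() only decrements and the second
takes the fast path and frees without touching the counter again.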

> OK, please forget about put_pid() for the moment. Suppose we have
> 
> 	X = 1;
> 	synchronize_sched();
> 	Y = 1;
> 
> Or
> 	X = 1;
> 	call_rcu_sched( func => { Y = 1; } );
> 
> 
> 
> Now. In theory this code is wrong:
> 
> 	if (Y) {
> 		BUG_ON(X == 0);
> 	}
> 
> But this is correct:
> 
> 	if (Y) {
> 		rcu_read_lock_sched();
> 		rcu_read_unlock_sched();
> 		BUG_ON(X == 0);
> 	}
> 
> So perhaps something like this
> 
> 	/*
> 	 * Comment to explain it is eq to read_lock + read_unlock,
> 	 * in a sense that this guarantees a full barrier wrt to
> 	 * the previous synchronize_sched().
> 	 */
> 	#define rcu_read_barrier_sched()	barrier()
> 
> make sense?
> 
> 
> And again, I simply can't tell whether this code is correct:
> 
> 	if (READ_ONCE_CTRL(Y))
> 		BUG_ON(X == 0);
> 
> To me it does _not_ look correct in theory.

So control dependencies provide a load-store barrier. Your examples
above rely on a load-load barrier; BUG_ON(X == 0) is a load.

kmem_cache_free(), OTOH, consists of stores (it must modify the free list).
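
To make the load-load case concrete, here is a minimal userspace sketch
assuming C11 atomics. It is not the RCU-based code quoted above: a
release store stands in for synchronize_sched(), and publish() and
consume() are hypothetical names. The point it shows is that a reader
whose dependent operation is a load (reading X) needs acquire ordering
on the load of Y; a control dependency alone would not order the two
loads.

```c
#include <stdatomic.h>

static int X;		/* plain data, as in the quoted example */
static atomic_int Y;	/* flag, zero-initialized */

/* Writer: make X visible before Y can be observed as set. */
static void publish(void)
{
	X = 1;
	atomic_store_explicit(&Y, 1, memory_order_release);
}

/*
 * Reader: returns -1 if Y has not been observed yet, otherwise X.
 * The acquire load orders the later load of X after the load of Y,
 * which is the load-load ordering the BUG_ON(X == 0) check relies on.
 */
static int consume(void)
{
	if (atomic_load_explicit(&Y, memory_order_acquire))
		return X;
	return -1;
}
```

Once consume() sees Y set, the value it returns for X is guaranteed to be
the one stored before the release.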