Message-ID: <20140610144830.GD3213@twins.programming.kicks-ass.net>
Date: Tue, 10 Jun 2014 16:48:30 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>,
Clark Williams <williams@...hat.com>
Subject: Re: safety of *mutex_unlock() (Was: [BUG] signal: sighand
unprotected when accessed by /proc)
On Tue, Jun 10, 2014 at 05:56:55AM -0700, Paul E. McKenney wrote:
> On Mon, Jun 09, 2014 at 11:51:09AM -0700, Linus Torvalds wrote:
> > This is subtle, and it is basically unavoidable. If a mutex (or
> > counting semaphore) has a fast-path - and a mutex/semaphore without a
> > fast-path is shit - then this issue will exist. Exactly because the
> > fast-path will depend on just one part of the whole big mutex
> > structure, and the slow-path will have other pieces to it.
> >
> > There might be reasonable ways to avoid this issue (having the
> > fastpath locking field share memory with the slow-path locking, for
> > example), but it's not how our semaphores and mutexes work, and I
> > suspect it cannot be the case in general (because it limits you too
> > badly in how to implement the mutex). As a result, this is all "by
> > design" as opposed to being a bug.
>
> So to safely free a structure containing a mutex, is there some better
> approach than the following?
>
> mutex_lock(&mem->mutex);
> kill_it = !--mem->refcount;
> rcu_read_lock();
> mutex_unlock(&mem->mutex);
> rcu_read_unlock();
> if (kill_it)
> kfree_rcu(mem, rh); /* rh is the rcu_head field in mem. */
>
> For example, is there some other way to know that all the prior lock
> releases have finished their post-release accesses?
So Thomas posted a patch curing rt_mutex, and there we really _have_ to
fix this, because rt_mutex needs to be able to replace a spinlock_t. But
for the regular mutex it's better (from a performance pov) to not do this.
By releasing early and checking for pending waiters later we allow
earlier lock stealing, which is good for throughput.
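
To illustrate the shape of the thing (purely a sketch, not the actual
kernel mutex code; sketch_mutex and friends are made-up names, and the
required release ordering is glossed over), the unlock path has roughly
two pieces:

  struct sketch_mutex {
          atomic_long_t           owner;          /* fast-path word: 0 == unlocked */
          spinlock_t              wait_lock;      /* slow-path state */
          struct list_head        wait_list;
  };

  static void sketch_mutex_unlock(struct sketch_mutex *lock)
  {
          /* 1) release the fast-path word; the lock is stealable from here on */
          atomic_long_set(&lock->owner, 0);

          /*
           * 2) only now look at the slow-path state to wake a waiter; these
           *    accesses happen after the lock already appears free, which is
           *    exactly the window Linus describes.
           */
          spin_lock(&lock->wait_lock);
          if (!list_empty(&lock->wait_list))
                  ; /* wake the first waiter here */
          spin_unlock(&lock->wait_lock);
  }

A spinner can acquire the lock the moment step 1) is done, while the
unlocking CPU is still about to touch wait_lock/wait_list in step 2).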
Back to your example: I think it is misleading in that it says 'a
structure containing a mutex'. The problem only arises when that mutex
is used as part of the life-time management of said structure.
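
That is, the problematic shape is something like the below (hypothetical
obj/obj_put names; it is basically your example without the
rcu_read_lock() trick), where the mutex itself protects the refcount
that decides when the object dies:

  struct obj {
          struct mutex    lock;
          int             refcount;       /* protected by ->lock */
  };

  void obj_put(struct obj *o)
  {
          bool kill_it;

          mutex_lock(&o->lock);
          kill_it = (--o->refcount == 0);
          mutex_unlock(&o->lock);
          if (kill_it)
                  kfree(o);       /* can run while a previous owner's
                                   * mutex_unlock() is still poking at
                                   * o->lock's waiter state */
  }

A task that stole the lock right after the previous owner's fast-path
release can reach the kfree() while that previous unlock is still in its
slow-path.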
If the structure has regular (atomic_t, atomic_long_t or spinlock_t
protected) reference counting, we know the mutex_unlock() must have
completed by the time we do the put_*(), and if that put was the last
one, there cannot have been another user still inside the lock,
otherwise your reference counting is broken.
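
With external reference counting the shape is more like the following
(again just a sketch, here using kref for the count and made-up
obj_do_work/obj_release names):

  struct obj {
          struct kref     ref;
          struct mutex    lock;
          /* ... payload ... */
  };

  static void obj_release(struct kref *ref)
  {
          kfree(container_of(ref, struct obj, ref));
  }

  void obj_do_work(struct obj *o)
  {
          /* the caller holds a reference across the whole thing */
          mutex_lock(&o->lock);
          /* ... touch the payload ... */
          mutex_unlock(&o->lock);         /* fully done with ->lock ... */
          kref_put(&o->ref, obj_release); /* ... before our reference drops */
  }

Each locker's own reference keeps the object alive until its
mutex_unlock() has returned, so the final kref_put() cannot race with
anybody's unlock slow-path.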