lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 5 Dec 2013 00:06:51 -0800
From:	Simon Kirby <sim@...tway.ca>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Waiman Long <Waiman.Long@...com>,
	Ian Applegate <ia@...udflare.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Christoph Lameter <cl@...two.org>,
	Pekka Enberg <penberg@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...ionio.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant
 to debug SMP races

On Wed, Dec 04, 2013 at 01:14:56PM -0800, Linus Torvalds wrote:

> The lock we're moving up isn't the lock that actually protects the
> whole allocation logic (it's the lock that then protects the pipe
> contents when a pipe is *used*). So it's a useless lock, and moving it
> up is a good idea regardless (because it makes the locks only protect
> the parts they are actually *supposed* to protect.
> 
> And while extraneous lock wouldn't normally hurt, the sleeping locks
> (both mutexes and semaphores) aren't actually safe wrt de-allocation -
> they protect anything *inside* the lock, but the lock data structure
> itself is accessed racily wrt other lockers (in a way that still
> leaves the locked region protected, but not the lock itself). If you
> care about details, you can walk through my example.

Yes, this makes sense now. It was spin_unlock_mutex() on the pipe lock
that itself was already already freed and poisoned by another cpu. This
explicit poison check also fires:

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index bf156de..ae425d0 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -159,6 +159,7 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 			__ticket_unlock_slowpath(lock, prev);
 	} else
 		__add(&lock->tickets.head, TICKET_LOCK_INC, UNLOCK_LOCK_PREFIX);
+	WARN_ON(*(unsigned int *)&lock->tickets.head == 0x6b6b6b6c);
 }
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)

It warns only as often as the poison checking already did, with a stack
of warn_*, __mutex_unlock_slowpath(), mutex_unlock(), pipe_release().

Trying to prove a negative, of course, but I tested with your first fix
overnight and got no errors. Current git (with b0d8d2292160bb63de) also
looks good. I will leave it running for a few days.

Thanks for getting stuck on this one. It was educational, at least!

Simon-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ