Date:	Sat, 18 Oct 2014 08:54:45 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	Catalin Marinas <catalin.marinas@....com>
Cc:	linux-kernel@...r.kernel.org,
	Matteo Franchin <Matteo.Franchin@....com>,
	Davidlohr Bueso <davidlohr@...com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Darren Hart <dvhart@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a
 barrier

On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote: 
> Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> nothing to wake up) changes the futex code to avoid taking a lock when
> there are no waiters. This code has been subsequently fixed in commit
> 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> Both the original commit and the fix-up rely on get_futex_key_refs() to
> always imply a barrier.
> 
> However, for private futexes, none of the cases in the switch statement
> of get_futex_key_refs() would be hit and the function completes without
> a memory barrier as required before checking the "waiters" in
> futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> thread waiting on a futex on another CPU, allowing the waker thread to
> read "waiters == 0" while the waiter thread to have read "futex_val ==
> locked" (in kernel).
> 
> Without this fix, the problem (user space deadlocks) can be seen with
> Android bionic's mutex implementation on an arm64 multi-cluster system.

How 'bout that, you just triggered my "watch this pot" alarm.

https://lkml.org/lkml/2014/10/8/406

The hang I encountered with stockfish only ever happened on one specific
box.  Linus/Thomas said it was likely a problem with the futex usage,
but it was suspiciously deterministic, so I put this on the "watch out
for further evidence" back burner.

The barrier fixing up my problematic box smells a lot like evidence.
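
For anyone following along, the ordering requirement described in the
quoted changelog can be sketched in userspace with C11 atomics.  This is
only an illustration under stated assumptions, not the kernel code:
futex_val, waiters, waiter_should_sleep() and waker_must_wake() are
hypothetical names, and atomic_thread_fence(memory_order_seq_cst) stands
in for smp_mb().  Each side writes its own variable, issues a full
barrier, then reads the other side's variable.  With both barriers in
place, the waker cannot read "waiters == 0" while the waiter reads
"futex_val == locked".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration, not kernel structures. */
static atomic_int futex_val = 1; /* 1 = locked, 0 = unlocked */
static atomic_int waiters = 0;   /* models the per-bucket waiter count */

/*
 * Waiter side: announce ourselves, full barrier, then re-check the
 * futex word.  Returns true if the caller should go to sleep.
 */
static bool waiter_should_sleep(void)
{
    atomic_fetch_add_explicit(&waiters, 1, memory_order_relaxed);
    /* waiter-side full barrier, pairs with the one in the waker */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&futex_val, memory_order_relaxed) == 1)
        return true; /* still locked: sleep and rely on a wakeup */
    /* lock was released in the meantime: back out, do not sleep */
    atomic_fetch_sub_explicit(&waiters, 1, memory_order_relaxed);
    return false;
}

/*
 * Waker side: release the futex word, full barrier, then check for
 * waiters.  Returns true if a wakeup has to be issued.
 */
static bool waker_must_wake(void)
{
    atomic_store_explicit(&futex_val, 0, memory_order_relaxed);
    /* models MB (B): the barrier the patch adds for private futexes */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiters, memory_order_relaxed) != 0;
}
```

If the fence in waker_must_wake() is dropped, a weakly ordered CPU may
let the load of waiters pass the store to futex_val, so the waker sees
no waiters while the waiter still sees the lock held and sleeps with
nobody left to wake it.  That is the lost-wakeup shape the changelog
describes.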

> Signed-off-by: Catalin Marinas <catalin.marinas@....com>
> Reported-by: Matteo Franchin <Matteo.Franchin@....com>
> Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> Cc: <stable@...r.kernel.org>
> Cc: Davidlohr Bueso <davidlohr@...com>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Darren Hart <dvhart@...ux.intel.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...nel.org>
> Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> ---
>  kernel/futex.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 815d7af2ffe8..f3a3a071283c 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
>  	case FUT_OFF_MMSHARED:
>  		futex_get_mm(key); /* implies MB (B) */
>  		break;
> +	default:
> +		smp_mb(); /* explicit MB (B) */
>  	}
>  }
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

