linux-kernel - Re: [patch V2 2/7] futex: Hash private futexes per process

Message-ID: <20160511210700.GD4225@f23x64.localdomain>
Date:	Wed, 11 May 2016 14:07:00 -0700
From:	Darren Hart <dvhart@...radead.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Darren Hart <darren@...art.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Michael Kerrisk <mtk.manpages@...glemail.com>,
	Davidlohr Bueso <dave@...olabs.net>, Chris Mason <clm@...com>,
	Carlos O'Donell <carlos@...hat.com>,
	Torvald Riegel <triegel@...hat.com>,
	Eric Dumazet <edumazet@...gle.com>
Subject: Re: [patch V2 2/7] futex: Hash private futexes per process

On Sat, May 07, 2016 at 10:44:39AM +0200, Thomas Gleixner wrote:
> On Fri, 6 May 2016, Darren Hart wrote:
> > On Thu, May 05, 2016 at 08:44:04PM -0000, Thomas Gleixner wrote:
> > > --- /dev/null
> > > +++ b/include/linux/futex_types.h
> > > @@ -0,0 +1,12 @@
> > > +#ifndef _LINUX_FUTEX_TYPES_H
> > > +#define _LINUX_FUTEX_TYPES_H
> > > +
> > > +struct futex_hash_bucket;
> > > +
> > > +struct futex_hash {
> > > +	struct raw_spinlock		lock;
> > 
> > As it isn't always obvious to everyone, it would be good to add a single line
> > comment stating why a *raw* spinlock is necessary.
> 
> Well. Necessary. It protects the hash pointer and the hash bits. So the scope
> is very limited and really does not need the heavy weight version of a
> sleeping spinlock in RT.
>  
> > In this case... I suppose this could lead to some nasty scenarios setting up IPC
> > mechanisms between threads if they weren't strictly serialized? Something else?
> 
> Sure, we need to serialize attempts to populate the hash. Especially in the
> non preallocated case. The thing with raw vs. non raw spinlocks is that the
> latter are expensive on RT and if there are just 5 instructions to protect it
> does not make any sense to choose the heavy version.
>  
> > > +config FUTEX_PRIVATE_HASH
> > > +	bool
> > > +	default FUTEX && SMP
> > > +
> > 
> > So no prompt, not user selectable. If you have SMP, you get this? I think
> > automatic is a good call... but is SMP the right criterion, or would NUMA be more
> > appropriate since I thought it was keeping the hash local to the NUMA node that
> > was the big win?
> 
> Yes, we can make it depend on NUMA. I even thought about making a run time
> decision for non preallocated ones when the machine is not numa. But for test
> coverage I wanted to have it as widely used as possible.

OK, understood to here.

> > > +	/*
> > > +	 * Futexes which use the per process hash have the lower bits cleared
> > > +	 */
> > > +	if (key->both.offset & (FUT_OFF_INODE | FUT_OFF_MMSHARED))
> > > +		return hash_global_futex(key);
> > > +
> > > +	slot = hash_long(key->private.address, mm->futex_hash.hash_bits);
> > > +	return &mm->futex_hash.hash[slot];
> > 
> > Don't we also need to check if the private hash exists? Per the commit
> > description, if we fail to allocate the private hash, we fall back to using the
> > global hash...
> 
> If we fall back to the global hash, then the lower bits in offset are not
> 0. So the hash is guaranteed to be available.
> 

Ah right. The bit positions of the two flags aren't obvious when reading the
test, so I didn't connect the comment about the lower bits being cleared with
the fallback case being implicitly covered by that same test.

Maybe make this explicit?

/*
 * Only private futexes use the per process hash and they never have
 * FUT_OFF_INODE or FUT_OFF_MMSHARED set.
 */
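
For illustration only, here is a minimal userspace sketch of why that single
test also covers the fallback case. It is not the patch code. The flag values
are what I recall from include/linux/futex.h (treat them as an assumption),
and struct toy_key, toy_hash() and private_slot() are hypothetical stand-ins
for union futex_key, hash_long() and the lookup in the hunk above. Futex words
are 4-byte aligned, so the two low bits of the offset are free to carry the
flags; any key that sets one of them goes to the global hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: the two flags live in the two low bits of the offset. */
#define FUT_OFF_INODE    1u	/* bit 0: key refers to an inode (shared) */
#define FUT_OFF_MMSHARED 2u	/* bit 1: key refers to an mm (shared) */

/* Hypothetical, simplified stand-in for union futex_key. */
struct toy_key {
	uint64_t address;	/* page part of the user address */
	uint32_t offset;	/* offset in page; low two bits carry flags */
};

/* True only when neither flag bit is set, i.e. a private futex key. */
static bool key_uses_private_hash(const struct toy_key *key)
{
	return (key->offset & (FUT_OFF_INODE | FUT_OFF_MMSHARED)) == 0;
}

/* Hypothetical multiplicative hash standing in for hash_long(). */
static unsigned int toy_hash(uint64_t val, unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (unsigned int)((val * 0x61C8864680B583EBull) >> (64 - bits));
}

/*
 * Returns the slot in the per process hash, or -1 when the key must be
 * looked up in the global hash instead (either flag bit set).
 */
static int private_slot(const struct toy_key *key, unsigned int hash_bits)
{
	if (!key_uses_private_hash(key))
		return -1;
	return (int)toy_hash(key->address, hash_bits);
}
```

A key whose offset has neither bit set yields a slot below 1 << hash_bits; a
key with either bit set yields -1, which is the "use the global hash" path.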


-- 
Darren Hart
Intel Open Source Technology Center