Date:	Fri, 6 Jun 2008 13:53:05 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...e.hu>, David Howells <dhowells@...hat.com>,
	Ulrich Drepper <drepper@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 0/3] 64-bit futexes: Intro

On Thu, Jun 05, 2008 at 08:37:19PM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 6 Jun 2008, Nick Piggin wrote:
> > 
> > What you *could* maybe do, to slightly speed up the reader fastpath, at
> > the expense of the writer fastpath, is to also have the active writer add
> > 4 to the count too, so your unlock can start with a lock xadd -4, count
> > in order to get the write-intent on the cacheline straight up.
> 
> Yes, nice idea. It avoids the possible unnecessary S->M transition, but 
> the downside is that it effectively slows down the write unlock by making 
> it do two atomic ops even for the fastpath. So if I were to _only_ care 
> about the reader path, I think it would be a great idea, but as it is, the 
> current non-contended write case is actually pretty close to optimal, and 
> doing the unconditional xaddl on the unlock path would slow that one down.

Yeah, it trades a large slowdown on the write path for a small
speedup on the read path (pity the API doesn't have explicit read and
write unlocks -- were they too lazy to type the last bit, or did they
expect people to lose track of whether they had a read or write
lock? :P).

Anyway, it's obviously a tradeoff you'd just have to carefully
benchmark in real situations.
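
To make the counting concrete, something like this is what I had in
mind (completely untested sketch, made-up names; the slow paths just
spin here, where the real thing would FUTEX_WAIT):

#include <stdatomic.h>

/*
 * One possible word layout for the variant being discussed:
 *
 *   bit 0      - writer holds the lock
 *   bits 2 up  - reader count, in units of 4
 *
 * The twist: the active writer *also* adds 4, so the common unlock
 * path can blindly do one "lock xadd $-4" -- taking the cacheline
 * straight to M -- and then inspect the old value to discover what
 * kind of lock it was dropping.
 */
#define RW_WRITER	1u
#define RW_READER	4u

static atomic_uint rw_count;

static void rw_read_lock(void)
{
	/* Fast path: a single xadd when no writer is active. */
	while (atomic_fetch_add(&rw_count, RW_READER) & RW_WRITER) {
		/* Writer active: back out and wait. */
		atomic_fetch_sub(&rw_count, RW_READER);
		while (atomic_load(&rw_count) & RW_WRITER)
			;	/* spin */
	}
}

static void rw_write_lock(void)
{
	unsigned int expected = 0;

	/* Take the writer bit and a reader unit in one shot. */
	while (!atomic_compare_exchange_weak(&rw_count, &expected,
					     RW_WRITER | RW_READER))
		expected = 0;	/* a failed CAS rewrites 'expected' */
}

static void rw_unlock(void)
{
	/* One unconditional xadd serves readers and writer alike... */
	unsigned int old = atomic_fetch_sub(&rw_count, RW_READER);

	/* ...and here is the cost you point out: the write unlock
	 * now needs a second atomic op to drop the writer bit. */
	if (old & RW_WRITER)
		atomic_fetch_and(&rw_count, ~RW_WRITER);
}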

> > I'd be more interested to know why this code can't be evolved into a full
> > rwlock implementation? This is a rather standard (though neat) looking rwlock
> > -- so my question is what can the patented 64-bit futex locks do that this
> > can't, or what can they do faster?
> 
> Quite frankly - and this was my argument the whole time - I do not believe 

> consider things like timeouts etc. Timeouts are "hard" to handle because 
> they mean that you cannot use any kind of trivially incrementing "ticket 
> locks" with sequence numbers (because we may have to just avoid a sequence 
> if it times out), so the sequence number approach that we now use for 
> kernel spinlocks was not an option. I didn't actually *write* the timeout 
> versions, of course, but given the structure of the locks they really 
> should be very straightforward.
> 
> [ Half-way subtle thing: a writer that times out needs to be very careful 
>   that it doesn't lose a wakeup event, but futexes actually make that part 
>   pretty easy - since FUTEX_WAIT returns whether you got woken up or not, 
>   you can just decide to wake up the next write-waiter if you cannot get 
>   the lock immediately and have to exit due to a timeout. ]
> 
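
Right -- and for concreteness, the timed write-waiter could look
something like this (sketch only: try_write_lock() is a hypothetical
fast-path helper, error handling is elided, and since glibc doesn't
wrap futex(2) it goes through syscall()):

#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static int try_write_lock(unsigned int *lock);	/* hypothetical */

static long futex(unsigned int *uaddr, int op, unsigned int val,
		  const struct timespec *timeout)
{
	return syscall(SYS_futex, uaddr, op, val, timeout, NULL, 0);
}

/* 'rel' is a relative timeout, as FUTEX_WAIT expects.
 * Returns 0 with the lock held, or -1 on timeout. */
static int write_lock_timed(unsigned int *lock, const struct timespec *rel)
{
	for (;;) {
		if (try_write_lock(lock))
			return 0;

		if (futex(lock, FUTEX_WAIT, *lock, rel) == -1 &&
		    errno == ETIMEDOUT) {
			if (try_write_lock(lock))	/* one last try */
				return 0;
			/*
			 * Give up -- but a wakeup aimed at us may
			 * already be in flight, so pass it on rather
			 * than lose it: a spurious wake is harmless,
			 * a lost one strands the next write-waiter.
			 */
			futex(lock, FUTEX_WAKE, 1, NULL);
			return -1;
		}
	}
}
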
> But I really haven't tested my rwlocks very exhaustively, and I did not 
> verify that they actually scale with lots of CPUs, for example. I
> literally only have dual-core CPU's in use at home, right now, nothing 
> fancier. Somebody with dual-socket quads would be a lot better off, and 
> the more the merrier, of course.

Well... a single lock is only ever going to be so scalable, and I
don't see how it could be done significantly better. Maybe a small
factor of improvement if you concentrated on the contended case (but
you wouldn't want to optimize for that anyway).


