linux-kernel - Re: [PATCH 3/3] 64-bit futexes: x86 support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4840D335.6040406@redhat.com>
Date:	Fri, 30 May 2008 21:25:25 -0700
From:	Ulrich Drepper <drepper@...hat.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	mtk.manpages@...il.com
Subject: Re: [PATCH 3/3] 64-bit futexes: x86 support

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> I'm suggesting that there are better approaches than forcing yet another 
> new interface onto this thing. As mentioned, the kernel has had 32-bit 
> rwsemaphores for a long time. Yes, there is "extra information" outside 
> the 32-bit field, but I suspect you simply haven't looked hard enough at 
> just using the existing futex code.

I guess have to expect to expect to be insulted by you.  What else is
new?  I think it's safe to say that nobody has looked more deeply at the
implementation possibilities with futexes.  I don't know how many
different primitives I've written over the years.  If you think you can
do it better, well, that's then a challenge to you.  Come up with
something better.

  The rwlocks cannot be handled as efficiently with 32-bit futexes.

I stand by that.  Proof me wrong.  You don't have to write code, just
describe it.  But don't publish it if

- - the uncontended reader *unconditionally* uses more than one memory
  access (atomic of course)

- - the contended readers uses the one access above plus one or two
  compare-and-exchange depending on the rwlock preferring readers or
  writers.  The last compare-and-exchange has to be repeated for
  spurious wakeups which are highly unlikely since we only wake what
  is ready to use.

- - the uncontended writer needs more than one single compare-and-exchange
  plus one memory read plus one write (both not atomic)

- - the contended writer needs more than the above plus one additional
  compare-and-exchange for every time we delay in futex_wait (again,
  highly unlikely to have spurious wakeups)

- - the unlock operation needs more than one read plus either one
  (repeated) compare-and-exchange or another atomic operation plus one
  memory write for writer locks

- - you can handle timedlock variants which don't manage to get the lock
  efficiently without waking up anyone else

That's it.

And just to point out some more fun: in a highly contended case this is
a performance killer (MESI protocol states):

               core #1                           core #2

         instr     cache state               instr    cache state

         load      -> S

         store     -> M

                   -> S                       load    -> S

                                              store   -> M


I.e., of the cache line with the rwlock is in one core's cache and you
first load a value from that cache line just to modify it right after
that, you're paying a terrible price.


Have fun coming up with something which is as good or better.  But shall
we say there is a deadline?  Otherwise we'd wait too long.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkhA0zUACgkQ2ijCOnn/RHRI2QCbBGCcjES6qMJYay1WRVLdSAAk
lHEAnA754SGDsjuzvCYg2ZBrEUIktxYV
=NYYc
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/