Message-ID: <55B61EF3.7080302@gmail.com>
Date:	Mon, 27 Jul 2015 14:07:15 +0200
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Darren Hart <dvhart@...radead.org>,
	Torvald Riegel <triegel@...hat.com>
CC:	mtk.manpages@...il.com, Carlos O'Donell <carlos@...hat.com>,
	Ingo Molnar <mingo@...e.hu>, Jakub Jelinek <jakub@...hat.com>,
	linux-man <linux-man@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Davidlohr Bueso <dave@...olabs.net>,
	Arnd Bergmann <arnd@...db.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Linux API <linux-api@...r.kernel.org>,
	Roland McGrath <roland@...k.frob.com>,
	Anton Blanchard <anton@...ba.org>,
	Eric Dumazet <edumazet@...gle.com>,
	bill o gallmeister <bgallmeister@...il.com>,
	Jan Kiszka <jan.kiszka@...mens.com>,
	Daniel Wagner <wagi@...om.org>, Rich Felker <dalias@...c.org>,
	Andy Lutomirski <luto@...capital.net>,
	bert hubert <bert.hubert@...herlabs.nl>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Heinrich Schuchardt <xypron.glpk@....de>
Subject: Next round: revised futex(2) man page for review

Hello all,

From a draft sent out in March, I got a few useful comments that
I've now incorporated into this draft. And I got some complaints
from people who did not want to read groff source. My point
was that there are a bunch of FIXMEs in the page source that I
wanted people to look at... Anyway, this time, I will take
a different tack, interspersing the FIXMEs in a rendered 
version of the page. I'd greatly appreciate help with those FIXMEs.

The current page source can be found in a branch at
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_futex

===

As becomes quickly obvious upon reading it, the current futex(2) 
man page is in a sorry state, lacking many important details, and
also the various additions that have been made to the interface
over the last years. I've been working on revising it, first
of all based on input I got in response to a request for help
last year (http://thread.gmane.org/gmane.linux.kernel/1703405), 
especially taking Thomas Gleixner's input 
(http://thread.gmane.org/gmane.linux.kernel/1703405/focus=2952) 
into account. I also got some further offlist input from Darren
Hart, Torvald Riegel, and Davidlohr Bueso that has been
incorporated into the revised draft. Other than that, I got
some useful info out of Ulrich Drepper's paper (cited at the
end of the page) and one or two web pages (cited in the page
source).

The page has now increased in size by a factor of about 5, but
is far from complete. In particular, as I reworked the page, 
there were many details that I was not 100% certain of, and I
have added FIXME markers to the page source. In addition,
Torvald added some text, and a few more FIXMEs. Some of
the FIXMEs are trivial, as in: I'd like confirmation that
I have correctly captured a technical detail. Others are more 
substantial, probably requiring the addition of further text.

I appreciate that there are probably other things that can be
improved in the page. (Torvald and Darren have some ideas.)
However, before growing the page any further, I would like to
resolve as many of the FIXMEs (and any other problems that people
see) as possible in the existing text. I need help with that. 
(And I know that dealing with that help, if I get it, will in 
itself be quite a task, which is why I have 
been delaying it for many weeks now, as my time has been 
rather limited recently.)

So, please take a look at the page below. At this point,
I would most especially appreciate help with the FIXMEs.

Cheers,

Michael



FUTEX(2)                Linux Programmer's Manual               FUTEX(2)

NAME
       futex - fast user-space locking

SYNOPSIS
       #include <linux/futex.h>
       #include <sys/time.h>

       int futex(int *uaddr, int futex_op, int val,
                 const struct timespec *timeout,   /* or: uint32_t val2 */
                 int *uaddr2, int val3);

       Note: There is no glibc wrapper for this system call; see NOTES.
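
       Because there is no glibc wrapper, programs usually reach this
       system call through syscall(2).  The following is a minimal
       sketch of such a wrapper; the name futex_call() and the use of
       uint32_t for the futex word are assumptions made for illustration
       and are not part of any library.

```c
#define _GNU_SOURCE
#include <linux/futex.h>   /* FUTEX_* operation constants */
#include <stdint.h>
#include <sys/syscall.h>   /* SYS_futex */
#include <time.h>          /* struct timespec */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: forwards its arguments unchanged to futex(2).
   On failure it returns -1 with errno set, like the raw system call. */
static long futex_call(uint32_t *uaddr, int futex_op, uint32_t val,
                       const struct timespec *timeout,
                       uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}
```

       With this helper, futex_call(&word, FUTEX_WAKE, 1, NULL, NULL, 0)
       wakes at most one waiter on word.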

DESCRIPTION
       The  futex()  system  call  provides a method for waiting until a
       certain condition becomes true.  It is typically used as a block‐
       ing  construct  in  the context of shared-memory synchronization:
       The program implements the majority  of  the  synchronization  in
       user  space,  and  uses  one of the operations of the system call
       when it is likely that it has to block for a  longer  time  until
       the  condition  becomes true.  The program uses another operation
       of the system call to wake anyone waiting for a particular condi‐
       tion.

       The  condition  is  represented  by  the  futex word, which is an
       address in memory supplied to the futex() system  call,  and  the
       32-bit  value  at  this  memory  location.   (While  the  virtual
       addresses for the same physical memory address in  separate  pro‐
       cesses  may be different, the same physical address may be shared
       by the processes using mmap(2).)
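
       As a sketch of the sharing mentioned above, a futex word meant
       for use by related processes can be placed in a shared anonymous
       mapping created before fork(2).  The helper name
       alloc_shared_futex_word() is hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: returns a zero-initialized futex word in memory
   that stays shared with children created by fork(2), or NULL on error.
   The mapping is page-aligned, so the four-byte alignment rule holds. */
static uint32_t *alloc_shared_futex_word(void)
{
    void *p = mmap(NULL, sizeof(uint32_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```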

       When executing a futex operation that requests to block a thread,
       the  kernel  will block only if the futex word has the value that
       the calling thread supplied as the expected value.  The load from the
       futex  word,  the  comparison  with  the  expected value, and the
       actual blocking will happen atomically and totally  ordered  with
       respect  to  concurrently  executing futex operations on the same
       futex word.  Thus, the futex word is used to connect the synchro‐
       nization in user space with the implementation of blocking by the
       kernel; similar to an atomic compare-and-exchange operation  that
       potentially  changes  shared  memory,  blocking via a futex is an
       atomic compare-and-block operation.

       One example use of futexes is implementing locks.  The  state  of
       the  lock  (i.e., acquired or not acquired) can be represented as
       an atomically accessed flag in shared memory.  In the uncontended
       case,  a  thread  can access or modify the lock state with atomic
       instructions,  for  example  atomically  changing  it  from   not
       acquired   to   acquired  using  an  atomic  compare-and-exchange
       instruction.  A thread may be unable to acquire a lock because it is
       already  acquired by another thread.  It then may pass the lock's
       flag as futex word and the value representing the acquired  state
       as  the  expected value to a futex() wait operation.  The call to
       futex() will block if and only if the  lock  is  still  acquired.
       When  releasing  the  lock,  a thread has to first reset the lock
       state to not acquired and then execute  a  futex  operation  that
       wakes  threads  blocked on the lock flag used as futex word (this
       can be further optimized to avoid unnecessary wake-ups).   See
       futex(7) for more detail on how to use futexes.
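
       The lock described above might be sketched as follows.  The
       sketch assumes C11 atomics, a flag value of 0 for "not acquired"
       and 1 for "acquired", and the hypothetical names flag_lock() and
       flag_unlock().  The unlock path always issues a wake-up, so this
       is the unoptimized variant.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { NOT_ACQUIRED = 0, ACQUIRED = 1 };   /* illustrative flag values */

static void flag_lock(_Atomic uint32_t *flag)
{
    for (;;) {
        uint32_t expected = NOT_ACQUIRED;

        /* Uncontended case: take the lock entirely in user space. */
        if (atomic_compare_exchange_strong(flag, &expected, ACQUIRED))
            return;

        /* Contended case: sleep only if the flag still reads ACQUIRED.
           If it changed in the meantime, the call fails with EAGAIN
           and the loop retries the compare-and-exchange. */
        syscall(SYS_futex, flag, FUTEX_WAIT, ACQUIRED, NULL, NULL, 0);
    }
}

static void flag_unlock(_Atomic uint32_t *flag)
{
    /* First reset the state, then wake one thread blocked on the flag. */
    atomic_store(flag, NOT_ACQUIRED);
    syscall(SYS_futex, flag, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```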

       Besides the basic wait and wake-up futex functionality, there are
       further futex operations aimed at  supporting  more  complex  use
       cases.   Also note that no explicit initialization or destruction
       is necessary to use futexes; the kernel maintains a futex (i.e.,
       the  kernel-internal  implementation  artifact) only while opera‐
       tions such as FUTEX_WAIT, described below, are being performed on
       a particular futex word.

   Arguments
       The  uaddr  argument points to the futex word.  On all platforms,
       futexes are four-byte integers that must be aligned  on  a  four-
       byte  boundary.   The operation to perform on the futex is speci‐
       fied in the futex_op argument; val is a value whose  meaning  and
       purpose depend on futex_op.

       The  remaining arguments (timeout, uaddr2, and val3) are required
       only for certain of the futex operations described below.   Where
       one of these arguments is not required, it is ignored.

       For  several  blocking  operations,  the  timeout  argument  is a
       pointer to a timespec structure that specifies a timeout for  the
       operation.   However,  notwithstanding the prototype shown above,
       for some operations, the least significant four bytes are used as
       an  integer  whose  meaning  is determined by the operation.  For
       these operations, the kernel casts the  timeout  value  first  to
       unsigned  long,  then  to  uint32_t, and in the remainder of this
       page, this argument is referred to as val2  when  interpreted  in
       this fashion.

       Where  it is required, the uaddr2 argument is a pointer to a sec‐
       ond futex word that is employed by the operation.  The  interpre‐
       tation of the final integer argument, val3, depends on the opera‐
       tion.

   Futex operations
       The futex_op argument consists of two parts: a command that spec‐
       ifies  the  operation to be performed, bit-wise ORed with zero or
       more options that modify the behaviour of the operation.   The
       options that may be included in futex_op are as follows:

       FUTEX_PRIVATE_FLAG (since Linux 2.6.22)
              This option bit can be employed with all futex operations.
              It tells the kernel that the futex is process-private  and
              not  shared  with  another process (i.e., it is being used
              for synchronization  only  between  threads  of  the  same
              process).   This allows the kernel to make some additional
              performance optimizations.

              As a convenience, <linux/futex.h> defines a  set  of  con‐
              stants  with  the  suffix _PRIVATE that are equivalents of
              all  of  the  operations  listed  below,  but   with   the
              FUTEX_PRIVATE_FLAG  ORed  into  the constant value.  Thus,
              there are FUTEX_WAIT_PRIVATE, FUTEX_WAKE_PRIVATE,  and  so
              on.
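
              The relationship between the base operations and their
              _PRIVATE equivalents can be checked at compile time, as in
              this small sketch.  The helper futex_op_private() is a
              hypothetical name; FUTEX_CMD_MASK comes from
              <linux/futex.h> and strips the option bits again.

```c
#include <linux/futex.h>

/* Each _PRIVATE constant is the base command with the flag ORed in. */
_Static_assert(FUTEX_WAIT_PRIVATE == (FUTEX_WAIT | FUTEX_PRIVATE_FLAG),
               "FUTEX_WAIT_PRIVATE mismatch");
_Static_assert(FUTEX_WAKE_PRIVATE == (FUTEX_WAKE | FUTEX_PRIVATE_FLAG),
               "FUTEX_WAKE_PRIVATE mismatch");

/* Hypothetical helper: mark any futex command as process-private. */
static int futex_op_private(int futex_op)
{
    return futex_op | FUTEX_PRIVATE_FLAG;
}
```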

       FUTEX_CLOCK_REALTIME (since Linux 2.6.28)
              This   option   bit   can   be   employed  only  with  the
              FUTEX_WAIT_BITSET and FUTEX_WAIT_REQUEUE_PI operations.

              If this option is set, the kernel  treats  timeout  as  an
              absolute time based on CLOCK_REALTIME.

.\" FIXME XXX I added CLOCK_MONOTONIC below. Okay?
              If  this  option  is not set, the kernel also treats timeout
              as an absolute time, but measured against the
              CLOCK_MONOTONIC clock.

       The operation specified in futex_op is one of the following:

       FUTEX_WAIT (since Linux 2.6.0)
              This operation tests that the  value  at  the  futex  word
              pointed  to  by  the  address  uaddr  still  contains  the
              expected value  val,  and  if  so,  then  sleeps  awaiting
              FUTEX_WAKE  on  the  futex word.  The load of the value of
              the futex word is an atomic  memory  access  (i.e.,  using
              atomic  machine  instructions  of the respective architec‐
              ture).  This load, the comparison with the expected value,
              and starting to sleep are performed atomically and totally
              ordered with respect to other futex operations on the same
              futex  word.  If the thread starts to sleep, it is consid‐
              ered a waiter on this futex word.  If the futex value does
              not  match  val,  then the call fails immediately with the
              error EAGAIN.

              The purpose of the comparison with the expected  value  is
              to  prevent  lost  wake-ups: If another thread changed the
              value of the futex word after the calling  thread  decided
              to block based on the prior value, and if the other thread
              executed a FUTEX_WAKE operation (or similar wake-up) after
              the  value  change  and  before this FUTEX_WAIT operation,
              then the latter will observe the value change and will not
              start to sleep.

              If  the timeout argument is non-NULL, its contents specify
              a relative timeout for the wait, measured according to the
.\" FIXME XXX I added CLOCK_MONOTONIC below. Okay?
              CLOCK_MONOTONIC  clock.  (This interval will be rounded up
              to the system clock  granularity,  and  kernel  scheduling
              delays  mean  that  the blocking interval may overrun by a
              small amount.)  If timeout is NULL, the call blocks indef‐
              initely.

              The arguments uaddr2 and val3 are ignored.
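
              A minimal sketch of a timed FUTEX_WAIT follows.  The
              helper name futex_wait_ms() and its millisecond parameter
              are assumptions made for illustration.  A mismatch
              between the futex word and the expected value yields
              EAGAIN; an expired timeout is assumed here to yield
              ETIMEDOUT.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sleep on *uaddr while it equals 'expected', for
   at most timeout_ms milliseconds (a relative timeout).  Returns 0 if
   woken, or -1 with errno set (for example EAGAIN or ETIMEDOUT). */
static int futex_wait_ms(uint32_t *uaddr, uint32_t expected, long timeout_ms)
{
    struct timespec ts = {
        .tv_sec  = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    return (int) syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                         &ts, NULL, 0);
}
```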


       FUTEX_WAKE (since Linux 2.6.0)
              This  operation  wakes at most val of the waiters that are
              waiting (e.g., inside FUTEX_WAIT) on the futex word at the
              address  uaddr.  Most commonly, val is specified as either
              1 (wake up a single waiter) or INT_MAX (wake up all  wait‐
              ers).   No  guarantee  is provided about which waiters are
              awoken (e.g., a waiter with a higher  scheduling  priority
              is  not  guaranteed to be awoken in preference to a waiter
              with a lower priority).

              The arguments timeout, uaddr2, and val3 are ignored.
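
              The two common values of val can be wrapped as shown in
              this sketch.  The helper names are hypothetical; the
              return value is the number of waiters that were woken.

```c
#define _GNU_SOURCE
#include <limits.h>        /* INT_MAX */
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake at most one waiter on the futex word. */
static long futex_wake_one(uint32_t *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* Hypothetical helper: wake every waiter on the futex word. */
static long futex_wake_all(uint32_t *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}
```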


       FUTEX_FD (from Linux 2.6.0 up to and including Linux 2.6.25)
              This operation creates a file descriptor that  is  associ‐
              ated  with  the futex at uaddr.  The caller must close the
              returned file descriptor after use.  When another  process
              or  thread  performs  a  FUTEX_WAKE on the futex word, the
              file  descriptor  is  reported  as  being   readable   by
              select(2), poll(2), and epoll(7).

              The  file  descriptor  can  be used to obtain asynchronous
              notifications:  if  val  is  nonzero,  then  when  another
              process  or  thread executes a FUTEX_WAKE, the caller will
              receive the signal number that was passed in val.

              The arguments timeout, uaddr2 and val3 are ignored.

.\" FIXME(Torvald) We never define "upped".  Maybe just remove the
.\"      following sentence?
              To prevent race conditions, the caller should test if  the
              futex has been upped after FUTEX_FD returns.

              Because  it was inherently racy, FUTEX_FD has been removed
              from Linux 2.6.26 onward.

       FUTEX_REQUEUE (since Linux 2.6.0)
.\" FIXME(Torvald) Is there some indication that FUTEX_REQUEUE is broken
.\"     in general, or is this comment implicitly speaking about the
.\"     condvar (?) use case? If the latter we might want to weaken the
.\"     advice below a little.
.\" [Anyone else have input on this?]
              Avoid using this operation.  It is broken for its intended
              purpose.  Use FUTEX_CMP_REQUEUE instead.

              This    operation    performs    the    same    task    as
              FUTEX_CMP_REQUEUE, except that no check is made using  the
              value in val3.  (The argument val3 is ignored.)

       FUTEX_CMP_REQUEUE (since Linux 2.6.7)
              This  operation  first  checks  whether the location uaddr
              still contains the value  val3.   If  not,  the  operation
              fails  with  the  error  EAGAIN.  Otherwise, the operation
              wakes up a maximum of val waiters that are waiting on  the
              futex  at uaddr.  If there are more than val waiters, then
              the remaining waiters are removed from the wait  queue  of
              the  source  futex at uaddr and added to the wait queue of
              the target futex at uaddr2.  The val2  argument  specifies
              an  upper limit on the number of waiters that are requeued
              to the futex at uaddr2.

.\" FIXME(Torvald) Is the following correct?  Or is just the decision
.\" which threads to wake or requeue part of the atomic operation?
              The load from uaddr is  an  atomic  memory  access  (i.e.,
              using atomic machine instructions of the respective archi‐
              tecture).  This load, the comparison with  val3,  and  the
              requeueing  of  any  waiters  are performed atomically and
              totally ordered with respect to other  operations  on  the
              same futex word.

              This  operation was added as a replacement for the earlier
              FUTEX_REQUEUE.  The difference is that the  check  of  the
              value  at uaddr can be used to ensure that requeueing hap‐
              pens only under certain conditions.  Both  operations  can
              be   used   to  avoid  a  "thundering  herd"  effect  when
              FUTEX_WAKE is used and all of the waiters that  are  woken
              need to acquire another futex.

.\" FIXME Please review the following new paragraph to see if it is
.\"       accurate.
              Typical  values to specify for val are 0 or 1.  (Speci‐
              fying INT_MAX is not useful, because  it  would  make  the
              FUTEX_CMP_REQUEUE  operation  equivalent  to  FUTEX_WAKE.)
              The limit value specified via val2 is typically  either  1
              or  INT_MAX.  (Specifying the argument as 0 is not useful,
              because it  would  make  the  FUTEX_CMP_REQUEUE  operation
              equivalent to FUTEX_WAKE.)
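
              The following sketch shows how such a call might be
              formed with the typical values, waking one waiter and
              requeueing the rest.  The helper name
              wake_one_requeue_rest() is hypothetical, and the sketch
              does not attempt to show a complete use case.  Note that
              val2 travels in the timeout slot of the system call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>        /* INT_MAX */
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: if *uaddr still equals 'expected', wake at most
   one waiter on uaddr and move any remaining waiters to uaddr2.
   Returns -1 with errno set to EAGAIN if the value check fails. */
static long wake_one_requeue_rest(uint32_t *uaddr, uint32_t *uaddr2,
                                  uint32_t expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE,
                   1,                                 /* val: wake limit */
                   (void *) (unsigned long) INT_MAX,  /* val2: requeue limit */
                   uaddr2,
                   expected);                         /* val3 */
}
```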
.\" FIXME Here, it would be helpful to have an example of how
.\"       FUTEX_CMP_REQUEUE might be used, at the same time illustrating
.\"       why FUTEX_WAKE is unsuitable for the same use case.


       FUTEX_WAKE_OP (since Linux 2.6.14)
.\" FIXME I added a lengthy piece of text on FUTEX_WAKE_OP text,
.\"       and I'd be happy if someone checked it.
.\"
.\" FIXME(Torvald) The glibc condvar implementation is currently being
.\"     revised (e.g., to not use an internal lock anymore).
.\"     It is probably more future-proof to remove this paragraph.
.\" [Torvald, do you have an update here?]
.\"
              This  operation  was  added to support some user-space use
              cases where more than one futex must  be  handled  at  the
              same time.  The most notable example is the implementation
              of pthread_cond_signal(3), which  requires  operations  on
              two  futexes,  the one used to implement the mutex and the
              one used in the implementation of the wait  queue  associ‐
              ated  with  the  condition variable.  FUTEX_WAKE_OP allows
              such cases to be implemented without leading to high rates
              of contention and context switching.

              The FUTEX_WAKE_OP operation is equivalent to executing the
              following code atomically and totally ordered with respect
              to other futex operations on either of  the  two  supplied
              futex words:

                  int oldval = *(int *) uaddr2;
                  *(int *) uaddr2 = oldval op oparg;
                  futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
                  if (oldval cmp cmparg)
                      futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);

              In other words, FUTEX_WAKE_OP does the following:

              *  saves the original value of the futex  word  at  uaddr2
                 and  performs  an  operation to modify the value of the
                 futex at uaddr2; this is  an  atomic  read-modify-write
                 memory  access (i.e., using atomic machine instructions
                 of the respective architecture)

              *  wakes up a maximum of val waiters on the futex for  the
                 futex word at uaddr; and

              *  dependent  on  the  results  of  a test of the original
                 value of the futex word at uaddr2, wakes up  a  maximum
                 of  val2  waiters  on  the  futex for the futex word at
                 uaddr2.

              The operation and comparison that are to be performed  are
              encoded  in  the  bits of the argument val3.  Pictorially,
              the encoding is:

                      +---+---+-----------+-----------+
                      |op |cmp|   oparg   |  cmparg   |
                      +---+---+-----------+-----------+
                        4   4       12          12    <== # of bits

              Expressed in code, the encoding is:

                  #define FUTEX_OP(op, oparg, cmp, cmparg) \
                                  (((op & 0xf) << 28) | \
                                  ((cmp & 0xf) << 24) | \
                                  ((oparg & 0xfff) << 12) | \
                                  (cmparg & 0xfff))

              In the above, op and cmp are each one of the codes  listed
              below.   The  oparg  and  cmparg  components  are  literal
              numeric values, except as noted below.

              The op component has one of the following values:

                  FUTEX_OP_SET        0  /* uaddr2 = oparg; */
                  FUTEX_OP_ADD        1  /* uaddr2 += oparg; */
                  FUTEX_OP_OR         2  /* uaddr2 |= oparg; */
                  FUTEX_OP_ANDN       3  /* uaddr2 &= ~oparg; */
                  FUTEX_OP_XOR        4  /* uaddr2 ^= oparg; */

              In addition, bit-wise ORing the following  value  into  op
              causes (1 << oparg) to be used as the operand:

                  FUTEX_OP_ARG_SHIFT  8  /* Use (1 << oparg) as operand */

              The cmp field is one of the following:

                  FUTEX_OP_CMP_EQ     0  /* if (oldval == cmparg) wake */
                  FUTEX_OP_CMP_NE     1  /* if (oldval != cmparg) wake */
                  FUTEX_OP_CMP_LT     2  /* if (oldval < cmparg) wake */
                  FUTEX_OP_CMP_LE     3  /* if (oldval <= cmparg) wake */
                  FUTEX_OP_CMP_GT     4  /* if (oldval > cmparg) wake */
                  FUTEX_OP_CMP_GE     5  /* if (oldval >= cmparg) wake */

              The return value of FUTEX_WAKE_OP is the sum of the number
              of waiters woken on the futex uaddr  plus  the  number  of
              waiters woken on the futex uaddr2.
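
              As a sketch of the encoding in use, the following helper
              issues a FUTEX_WAKE_OP call; val3 would be built with the
              FUTEX_OP() macro from <linux/futex.h>.  The helper name
              futex_wake_op() is hypothetical.  As with other
              operations that take val2, that value is passed in the
              timeout slot.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: apply the operation encoded in val3 to *uaddr2,
   wake up to nr_wake waiters on uaddr and, if the encoded comparison
   on the old value of *uaddr2 holds, up to nr_wake2 waiters on uaddr2.
   Returns the total number of waiters woken, or -1 on error. */
static long futex_wake_op(uint32_t *uaddr, uint32_t nr_wake,
                          uint32_t *uaddr2, uint32_t nr_wake2,
                          uint32_t val3)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP, nr_wake,
                   (void *) (unsigned long) nr_wake2, uaddr2, val3);
}
```

              For example, passing FUTEX_OP(FUTEX_OP_ADD, 3,
              FUTEX_OP_CMP_GT, 0) as val3 adds 3 to the word at uaddr2
              and performs the second wake-up only if the old value was
              greater than 0.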

       FUTEX_WAIT_BITSET (since Linux 2.6.25)
              This operation is like FUTEX_WAIT except that val3 is used
              to provide a 32-bit bitset to the kernel.  This bitset  is
              stored  in  the  kernel-internal state of the waiter.  See
              the description of FUTEX_WAKE_BITSET for further details.

              The FUTEX_WAIT_BITSET operation also interprets the  time‐
              out argument differently from FUTEX_WAIT.  See the discus‐
              sion of FUTEX_CLOCK_REALTIME, above.

              The uaddr2 argument is ignored.

       FUTEX_WAKE_BITSET (since Linux 2.6.25)
              This operation is the same as FUTEX_WAKE except  that  the
              val3  argument  is  used to provide a 32-bit bitset to the
              kernel.  This bitset  is  used  to  select  which  waiters
              should  be  woken up.  The selection is done by a bit-wise
              AND of the "wake" bitset (i.e., the value in val3) and the
              bitset which is stored in the kernel-internal state of the
              waiter   (the   "wait"   bitset   that   is   set    using
              FUTEX_WAIT_BITSET).   All  of  the  waiters  for which the
              result of the AND is nonzero are woken up;  the  remaining
              waiters are left sleeping.

.\" FIXME XXX Is this next paragraph that I added okay?
              The  effect  of FUTEX_WAIT_BITSET and FUTEX_WAKE_BITSET is
              to allow selective wake-ups among  multiple  waiters  that
              are  blocked on the same futex.  Note, however, that using
              this bitset multiplexing feature on a futex is less  effi‐
              cient  than simply using multiple futexes, because employ‐
              ing bitset multiplexing requires the kernel to  check  all
              waiters  on  a  futex, including those that are not inter‐
              ested in being woken up (i.e., they do not have the  rele‐
              vant bit set in their "wait" bitset).

              The uaddr2 and timeout arguments are ignored.

              The  FUTEX_WAIT  and  FUTEX_WAKE  operations correspond to
              FUTEX_WAIT_BITSET and FUTEX_WAKE_BITSET  operations  where
              the bitsets are all ones.
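
              A sketch of the two bitset operations follows.  The
              helper names are hypothetical, and the wait helper passes
              a NULL timeout, so it blocks indefinitely when the futex
              word matches.  The constant FUTEX_BITSET_MATCH_ANY from
              <linux/futex.h> is the all-ones bitset.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>        /* INT_MAX */
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wait on *uaddr while it equals 'expected',
   recording 'bitset' as this waiter's "wait" bitset.  No timeout. */
static long futex_wait_bitset(uint32_t *uaddr, uint32_t expected,
                              uint32_t bitset)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, expected,
                   NULL, NULL, bitset);
}

/* Hypothetical helper: wake up to nr_wake waiters whose "wait" bitset
   has at least one bit in common with 'bitset'. */
static long futex_wake_bitset(uint32_t *uaddr, uint32_t nr_wake,
                              uint32_t bitset)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_BITSET, nr_wake,
                   NULL, NULL, bitset);
}
```

              A waiter that blocked with bitset 0x1 would be woken by
              futex_wake_bitset(&word, INT_MAX, 0x1) but left sleeping
              by futex_wake_bitset(&word, INT_MAX, 0x2).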

   Priority-inheritance futexes
       Linux supports priority-inheritance (PI) futexes in order to han‐
       dle priority-inversion problems that can be encountered with nor‐
       mal  futex  locks.  Priority inversion is the problem that occurs
       when a high-priority task is blocked waiting to  acquire  a  lock
       held  by a low-priority task, while tasks at an intermediate pri‐
       ority continuously preempt the low-priority task  from  the  CPU.
       Consequently,  the  low-priority  task  makes  no progress toward
       releasing the lock, and the high-priority task remains blocked.

       Priority inheritance is a mechanism for dealing with  the  prior‐
       ity-inversion problem.  With this mechanism, when a high-priority
       task becomes blocked by a lock held by a low-priority  task,  the
       latter's priority is temporarily raised to that of the former, so
       that it is not preempted by any intermediate level tasks, and can
       thus  make  progress toward releasing the lock.  To be effective,
       priority inheritance must be transitive, meaning that if a  high-
       priority task blocks on a lock held by a lower-priority task that
       is itself blocked by a lock held by another  intermediate-priority
       task  (and  so  on, for chains of arbitrary length), then both of
       those tasks (or more generally, all of the tasks in a lock chain)
       have  their priorities raised to be the same as the high-priority
       task.
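
       The transitive boosting described above can be illustrated with
       a toy user-space model.  This is only a sketch of the idea, not
       how the kernel implements it; the struct task type, the
       convention that a larger number means a higher priority, and the
       name boost_chain() are all assumptions.  The model assumes the
       chain is free of cycles.

```c
#include <stddef.h>

/* Toy model of a task in a lock chain (illustrative only). */
struct task {
    int prio;                  /* effective priority; larger is higher */
    struct task *blocked_on;   /* owner of the lock this task waits
                                  for, or NULL if it is not blocked */
};

/* Raise every lock owner along the chain that starts at 'waiter' to at
   least the waiter's priority. */
static void boost_chain(struct task *waiter)
{
    for (struct task *t = waiter->blocked_on; t != NULL; t = t->blocked_on)
        if (t->prio < waiter->prio)
            t->prio = waiter->prio;
}
```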

.\" FIXME XXX The following is my attempt at a definition of PI futexes,
.\"       based on mail discussions with Darren Hart. Does it seem okay?

       From a user-space perspective, what makes a futex PI-aware  is  a
       policy  agreement  between  user  space  and the kernel about the
       value of the futex word (described in a moment), coupled with the
       use  of  the  PI futex operations described below (in particular,
       FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI, and FUTEX_CMP_REQUEUE_PI).

.\" FIXME XXX ===== Start of adapted Hart/Guniguntala text =====
.\"       The following text is drawn from the Hart/Guniguntala paper
.\"       (listed in SEE ALSO), but I have reworded some pieces
.\"       significantly. Please check it.

       The PI futex operations described below  differ  from  the  other
       futex  operations  in  that  they impose policy on the use of the
       value of the futex word:

       *  If the lock is not acquired, the futex word's value  shall  be
          0.

       *  If  the  lock is acquired, the futex word's value shall be the
          thread ID (TID; see gettid(2)) of the owning thread.

       *  If the lock is owned and there are threads contending for  the
          lock,  then  the  FUTEX_WAITERS  bit shall be set in the futex
          word's value; in other words, this value is:

              FUTEX_WAITERS | TID


       Note that a PI futex word never just has the value FUTEX_WAITERS,
       which is a permissible state for non-PI futexes.
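
       Under this policy, a PI futex word can be taken apart as in the
       following sketch.  The helper names are hypothetical;
       FUTEX_TID_MASK is the mask that <linux/futex.h> provides for the
       TID bits and is not otherwise discussed above.

```c
#include <linux/futex.h>   /* FUTEX_WAITERS, FUTEX_TID_MASK */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: TID of the lock owner, or 0 if not acquired. */
static uint32_t pi_owner_tid(uint32_t word)
{
    return word & FUTEX_TID_MASK;
}

/* Hypothetical helper: true if threads are contending for the lock. */
static bool pi_has_waiters(uint32_t word)
{
    return (word & FUTEX_WAITERS) != 0;
}
```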

       With this policy in place, a user-space application can acquire a
       not-acquired lock or release a lock that no other threads try  to
       acquire using atomic instructions executed in user space (e.g., a
       compare-and-swap operation such as cmpxchg on the  x86  architec‐
       ture).   Acquiring  a  lock simply consists of using compare-and-
       swap to atomically set the futex word's value to the caller's TID
       if  its  previous  value  was 0.  Releasing a lock requires using
       compare-and-swap to set the futex word's value to 0 if the previ‐
       ous value was the expected TID.

       If a futex is already acquired (i.e., has a nonzero value), wait‐
       ers must employ the FUTEX_LOCK_PI operation to acquire the  lock.
       If other threads are waiting for the lock, then the FUTEX_WAITERS
       bit is set in the futex value; in this case, the lock owner  must
       employ the FUTEX_UNLOCK_PI operation to release the lock.
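
       The two paths described above might be combined as in the
       following sketch, which assumes C11 atomics and uses the
       hypothetical names pi_lock() and pi_unlock().  The fast path
       stays in user space; the slow path hands the work to the kernel.
       Error handling is reduced to returning the result of the system
       call.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: acquire a PI futex lock.  Returns 0 on success,
   or -1 with errno set if the kernel operation fails. */
static int pi_lock(_Atomic uint32_t *word)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Fast path: 0 -> TID, entirely in user space. */
    if (atomic_compare_exchange_strong(word, &expected, tid))
        return 0;

    /* Slow path: the lock is already acquired; let the kernel block us. */
    return (int) syscall(SYS_futex, word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Hypothetical helper: release a PI futex lock held by the caller. */
static int pi_unlock(_Atomic uint32_t *word)
{
    uint32_t expected = (uint32_t) syscall(SYS_gettid);

    /* Fast path: TID -> 0 succeeds only if FUTEX_WAITERS is not set. */
    if (atomic_compare_exchange_strong(word, &expected, 0))
        return 0;

    /* Slow path: waiters exist, so the kernel must hand over the lock. */
    return (int) syscall(SYS_futex, word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```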

       In  the  cases  where  callers  are forced into the kernel (i.e.,
       required to perform a futex() call), they then deal directly with
       a so-called RT-mutex, a kernel locking mechanism which implements
       the required priority-inheritance semantics.  After the  RT-mutex
       is  acquired,  the futex value is updated accordingly, before the
       calling thread returns to user space.
.\" FIXME ===== End of adapted Hart/Guniguntala text =====



.\" FIXME We need some explanation in the following paragraph of *why*
.\"       it is important to note that "the kernel will update the
.\"       futex word's value prior
.\"       to returning to user space". Can someone explain?
       It is important to note that the kernel will  update  the  futex
       word's  value prior to returning to user space.  Unlike the other
       futex operations described above, the PI  futex  operations  are
       designed for the implementation of very specific IPC mechanisms.
.\"
.\" FIXME XXX In discussing errors for FUTEX_CMP_REQUEUE_PI, Darren Hart
.\"       made the observation that "EINVAL is returned if the non-pi 
.\"       to pi or op pairing semantics are violated."
.\"       Probably there needs to be a general statement about this
.\"       requirement, probably located at about this point in the page.
.\"       Darren (or someone else), care to take a shot at this?
.\"
.\" FIXME Somewhere on this page (I guess under the discussion of PI
.\"       futexes) we need a discussion of the FUTEX_OWNER_DIED bit.
.\"       Can someone propose a text?



       PI futexes are operated on by specifying  one  of  the  following
       values in futex_op:

       FUTEX_LOCK_PI (since Linux 2.6.18)
.\" FIXME I did some significant rewording of tglx's text to create
.\"       the text below.
.\"       Please check the following paragraph, in case I injected
.\"       errors.
              This  operation  is used after an attempt to acquire
              the lock  via  an  atomic  user-space  instruction  failed
              because  the  futex word has a nonzero value—specifically,
              because it contained the  namespace-specific  TID  of  the
              lock owner.
.\" FIXME In the preceding line, what does "namespace-specific" mean?
.\"       (I kept those words from tglx.)
.\"       That is, what kind of namespace are we talking about?
.\"       (I suppose we are talking PID namespaces here, but I want to
.\"       be sure.)


              The  operation  checks  the value of the futex word at the
              address uaddr.  If the value is 0, then the  kernel  tries
              to atomically set the futex value to the caller's TID.  
.\" FIXME What would be the cause(s) of failure referred to
.\"       in the following sentence?
              If
              that fails, or the futex word's value is nonzero, the ker‐
              nel  atomically  sets the FUTEX_WAITERS bit, which signals
              the futex owner that it cannot unlock the  futex  in  user
              space  atomically  by setting the futex value to 0.  After
              that, the kernel tries to find the thread which is associ‐
              ated with the owner TID, creates or reuses kernel state on
              behalf of the owner and attaches the waiter  to  it.   
.\" FIXME Could I get a bit more detail on the previous lines?
.\"       What is "creates or reuses kernel state" about?
.\"       (I think this needs to be clearer in the page)

.\" FIXME In the next line, what type of "priority" are we talking about?
.\"       Realtime priorities for SCHED_FIFO and SCHED_RR?
.\"       Or something else?

              The
              enqueueing  of  the waiter is in descending priority order
              if more than one waiter exists.  

.\" FIXME In the next sentence, what type of "priority" are we talking about?
.\"       Realtime priorities for SCHED_FIFO and SCHED_RR?
.\"       Or something else?
.\" FIXME What does "bandwidth" refer to in the next sentence?

              The owner inherits either
              the priority or the bandwidth of the waiter.  
.\" FIXME In the preceding sentence, what determines whether the
.\"       owner inherits the priority versus the bandwidth?

.\" FIXME Could I get some help translating the next sentence into
.\"       something that user-space developers (and I) can understand?
.\"       In particular, what are "nested locks" in this context?

              This inheri‐
              tance follows the lock chain in the case of nested locking
              and performs deadlock detection.

.\" FIXME tglx said "The timeout argument is handled as described in
.\"       FUTEX_WAIT." However, it appears to me that this is not right.
.\"       Is the following formulation correct?
              The  timeout  argument  provides  a  timeout  for the lock
              attempt.  It is interpreted as an absolute time,  measured
              against the CLOCK_REALTIME clock.  If timeout is NULL, the
              operation will block indefinitely.

              The uaddr2, val, and val3 arguments are ignored.

       FUTEX_TRYLOCK_PI (since Linux 2.6.18)
.\" FIXME I think it would be helpful here to say a few more words about
.\"       the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI.
.\"       Can someone propose something?
              This operation tries to acquire the futex  at  uaddr.   It
              deals  with  the situation where the TID value at uaddr is
              0, but the FUTEX_WAITERS bit is set.   User  space  cannot
              handle this condition in a race-free manner.
.\" FIXME How does the situation in the previous sentence come about?
.\"       Probably it would be helpful to say something about that in
.\"       the man page.
.\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?


              The uaddr2, val, timeout, and val3 arguments are ignored.

       FUTEX_UNLOCK_PI (since Linux 2.6.18)
              This operation wakes the top priority waiter that is wait‐
              ing in FUTEX_LOCK_PI on the futex address provided by  the
              uaddr argument.

              This operation is used when the user-space value at uaddr cannot
              be changed atomically from a TID (of the owner) to 0.

              The uaddr2, val, timeout, and val3 arguments are ignored.

       FUTEX_CMP_REQUEUE_PI (since Linux 2.6.31)
              This operation is a PI-aware variant of FUTEX_CMP_REQUEUE.
              It    requeues    waiters    that    are    blocked    via
              FUTEX_WAIT_REQUEUE_PI on uaddr from a non-PI source  futex
              (uaddr) to a PI target futex (uaddr2).

              As with FUTEX_CMP_REQUEUE, this operation wakes up a maxi‐
              mum of val waiters that are waiting on the futex at uaddr.
              However, for FUTEX_CMP_REQUEUE_PI, val is required to be 1
              (since the main point is to avoid a thundering herd).  The
              remaining  waiters  are removed from the wait queue of the
              source futex at uaddr and added to the wait queue  of  the
              target futex at uaddr2.

              The val2 and val3 arguments serve the same purposes as for
              FUTEX_CMP_REQUEUE.
.\" FIXME The page at http://locklessinc.com/articles/futex_cheat_sheet/
.\"       notes that "priority-inheritance Futex to priority-inheritance
.\"       Futex requeues are currently unsupported". Do we need to say
.\"       something in the man page about that?



       FUTEX_WAIT_REQUEUE_PI (since Linux 2.6.31)

.\" FIXME I find the next sentence (from tglx) pretty hard to grok.
.\"       Could someone explain it a bit more?

              This operation waits on a non-PI futex at uaddr; while
              waiting, the caller may be requeued onto a PI futex at
              uaddr2.  The wait operation on uaddr is the same as for
              FUTEX_WAIT.

.\" FIXME I'm not quite clear on the meaning of the following sentence.
.\"       Is this trying to say that while blocked in a
.\"       FUTEX_WAIT_REQUEUE_PI, it could happen that another
.\"       task does a FUTEX_WAKE on uaddr that simply causes
.\"       a normal wake, with the result that the FUTEX_WAIT_REQUEUE_PI
.\"       does not complete? What happens then to the FUTEX_WAIT_REQUEUE_PI
.\"       opertion? Does it remain blocked, or does it unblock
.\"       In which case, what does user space see?

              The
              waiter   can  be  removed  from  the  wait  on  uaddr  via
              FUTEX_WAKE without requeueing on uaddr2.

.\" FIXME Please check the following. tglx said "The timeout argument
.\"       is handled as described in FUTEX_WAIT.", but the truth is
.\"       as below, AFAICS

              If timeout is not NULL, it specifies  a  timeout  for  the
              wait  operation;  this  timeout is interpreted as outlined
              above  in  the  description  of  the  FUTEX_CLOCK_REALTIME
              option.   If  timeout  is  NULL,  the  operation can block
              indefinitely.

              The val3 argument is ignored.

.\" FIXME Re the preceding sentence... Actually 'val3' is internally set to
.\"       FUTEX_BITSET_MATCH_ANY before calling futex_wait_requeue_pi().
.\"       I'm not sure we need to say anything about this though.
.\"       Comments?


              The FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI operations
              were added to support a fairly specific use case:
              priority-inheritance-aware POSIX threads condition variables.
              The idea is that these operations should always be
              paired, in order to ensure that user space and the  kernel
              remain in sync.  Thus, in the FUTEX_WAIT_REQUEUE_PI opera‐
              tion, the user-space application pre-specifies the  target
              of    the    requeue    that    takes    place    in   the
              FUTEX_CMP_REQUEUE_PI operation.
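
       The lock and unlock paths described above for FUTEX_LOCK_PI and
       FUTEX_UNLOCK_PI can be sketched roughly as follows.  This is a
       minimal illustration, not the glibc implementation: the helper
       names (deadline_after_ms, pi_lock, pi_unlock) are hypothetical,
       the futex word is assumed to hold 0 when unlocked, and error
       handling (for example, retrying on EAGAIN) is left to the caller.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* All function names below are hypothetical; only the futex operations
   and their argument conventions come from the text above. */

/* Build an absolute CLOCK_REALTIME deadline 'ms' milliseconds from now
   ('ms' is assumed to be nonnegative).  Returns 0 on success, -1 on
   error. */

static int
deadline_after_ms(struct timespec *ts, long ms)
{
    if (clock_gettime(CLOCK_REALTIME, ts) == -1)
        return -1;

    ts->tv_sec += ms / 1000;
    ts->tv_nsec += (ms % 1000) * 1000000L;

    /* Normalize so that tv_nsec stays below 1,000,000,000 */

    if (ts->tv_nsec >= 1000000000L) {
        ts->tv_sec++;
        ts->tv_nsec -= 1000000000L;
    }
    return 0;
}

/* Acquire the PI futex.  'deadline' is an absolute CLOCK_REALTIME time,
   or NULL to block indefinitely.  Returns 0 on success; on error,
   returns -1 with errno set. */

static int
pi_lock(uint32_t *futex_word, const struct timespec *deadline)
{
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Fast path: change 0 to our TID without entering the kernel */

    if (__sync_bool_compare_and_swap(futex_word, 0, tid))
        return 0;

    /* Slow path: the futex word was nonzero, so block in the kernel.
       The uaddr2, val, and val3 arguments are ignored. */

    return (int) syscall(SYS_futex, futex_word, FUTEX_LOCK_PI, 0,
                         deadline, NULL, 0);
}

/* Release the PI futex.  Returns 0 on success; on error, returns -1
   with errno set. */

static int
pi_unlock(uint32_t *futex_word)
{
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Fast path: change our TID to 0.  This fails if the kernel has set
       FUTEX_WAITERS, since the word then no longer equals the bare TID. */

    if (__sync_bool_compare_and_swap(futex_word, tid, 0))
        return 0;

    /* Slow path: let the kernel hand the lock to the top waiter */

    return (int) syscall(SYS_futex, futex_word, FUTEX_UNLOCK_PI, 0,
                         NULL, NULL, 0);
}
```

       A caller might write "deadline_after_ms(&ts, 500); pi_lock(&word,
       &ts);" and later "pi_unlock(&word);".  In the uncontended case
       neither call enters the kernel for the futex operation itself.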

RETURN VALUE
       In the event of an error (and assuming that futex()  was  invoked
       via  syscall(2)), all operations return -1 and set errno to indi‐
       cate the cause of the error.  The return value on success depends
       on the operation, as described in the following list:

       FUTEX_WAIT
              Returns 0 if the caller was woken up.  Note that a wake-up
              can also be caused by common futex usage patterns in unre‐
              lated code that happened to have previously used the futex
              word's memory location (e.g., typical  futex-based  imple‐
              mentations  of  Pthreads mutexes can cause this under some
              conditions).  Therefore, callers should always conservatively
              assume that a return value of 0 can mean a spurious
              wake-up, and use the futex word's value (i.e., the
              user-space synchronization scheme) to decide whether to
              continue to block or not.

       FUTEX_WAKE
              Returns the number of waiters that were woken up.

       FUTEX_FD
              Returns the new file descriptor associated with the futex.

       FUTEX_REQUEUE
              Returns the number of waiters that were woken up.

       FUTEX_CMP_REQUEUE
              Returns  the total number of waiters that were woken up or
              requeued to the futex for the futex word  at  uaddr2.   If
              this value is greater than val, then the difference is the
              number of waiters requeued to the futex for the futex word
              at uaddr2.

       FUTEX_WAKE_OP
              Returns  the  total  number of waiters that were woken up.
              This is the sum of the woken waiters on  the  two  futexes
              for the futex words at uaddr and uaddr2.

       FUTEX_WAIT_BITSET
              Returns  0 if the caller was woken up.  See FUTEX_WAIT for
              how to interpret this correctly in practice.

       FUTEX_WAKE_BITSET
              Returns the number of waiters that were woken up.

       FUTEX_LOCK_PI
              Returns 0 if the futex was successfully locked.

       FUTEX_TRYLOCK_PI
              Returns 0 if the futex was successfully locked.

       FUTEX_UNLOCK_PI
              Returns 0 if the futex was successfully unlocked.

       FUTEX_CMP_REQUEUE_PI
              Returns the total number of waiters that were woken up  or
              requeued  to  the  futex for the futex word at uaddr2.  If
              this value is greater than val, then the difference is the
              number of waiters requeued to the futex for the futex word
              at uaddr2.

       FUTEX_WAIT_REQUEUE_PI
              Returns 0 if the caller was successfully requeued  to  the
              futex for the futex word at uaddr2.
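
       The advice given under FUTEX_WAIT, to treat a return value of 0
       as a possibly spurious wake-up and to recheck the futex word,
       might be followed with a loop along these lines.  This is only a
       sketch: the helper name wait_while_equal is hypothetical, and the
       choice to retry on EINTR is an assumption about what the caller
       wants.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: block until *futex_word no longer holds
   'blocked_val'.  Returns 0 once the value has changed; on an
   unexpected error, returns -1 with errno set. */

static int
wait_while_equal(uint32_t *futex_word, uint32_t blocked_val)
{
    long s;

    /* The futex word, not the return value of FUTEX_WAIT, decides
       whether we keep blocking */

    while (__atomic_load_n(futex_word, __ATOMIC_ACQUIRE) == blocked_val) {

        s = syscall(SYS_futex, futex_word, FUTEX_WAIT, blocked_val,
                    NULL, NULL, 0);

        /* EAGAIN: the word changed before we could sleep.
           EINTR: a signal handler ran.  Neither is fatal here. */

        if (s == -1 && errno != EAGAIN && errno != EINTR)
            return -1;

        /* s == 0 may be a spurious wake-up: loop and re-read the word */
    }

    return 0;
}
```

       With this shape, a wake-up that arrives for unrelated reasons
       costs one extra pass through the loop instead of a broken
       invariant.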

ERRORS
       EACCES No read access to the memory of a futex word.

       EAGAIN (FUTEX_WAIT, FUTEX_WAIT_BITSET, FUTEX_WAIT_REQUEUE_PI) The
              value pointed to by uaddr was not equal  to  the  expected
              value val at the time of the call.

              Note:  on Linux, the symbolic names EAGAIN and EWOULDBLOCK
              (both of which appear in different  parts  of  the  kernel
              futex code) have the same value.

       EAGAIN (FUTEX_CMP_REQUEUE,    FUTEX_CMP_REQUEUE_PI)   The   value
              pointed to by uaddr is not equal  to  the  expected  value
              val3.   (This  probably  indicates  a  race;  use the safe
              FUTEX_WAKE now.)
.\" FIXME: Is the preceding sentence "(This probably...") correct?
.\" [I would prefer to remove this sentence. --triegel@...hat.com]


       EAGAIN (FUTEX_LOCK_PI,  FUTEX_TRYLOCK_PI,   FUTEX_CMP_REQUEUE_PI)
              The    futex    owner    thread    ID    of   uaddr   (for
              FUTEX_CMP_REQUEUE_PI: uaddr2) is about to  exit,  but  has
              not yet handled the internal state cleanup.  Try again.

.\" FIXME XXX Should there be an EAGAIN case for FUTEX_TRYLOCK_PI?
.\"       It seems so, looking at the handling of the rt_mutex_trylock()
.\"       call in futex_lock_pi()
.\"       (Davidlohr also thinks so.)


       EDEADLK
              (FUTEX_LOCK_PI,   FUTEX_TRYLOCK_PI,  FUTEX_CMP_REQUEUE_PI)
              The futex word at uaddr is already locked by the caller.

       EDEADLK

.\" FIXME I reworded tglx's text somewhat; is the following okay?

              (FUTEX_CMP_REQUEUE_PI) While requeueing a waiter to the PI
              futex  for the futex word at uaddr2, the kernel detected a
              deadlock.

.\" FIXME XXX I see that kernel/locking/rtmutex.c uses EDEADLK in some
.\"       places, and EDEADLOCK in others. On almost all architectures
.\"       these constants are synonymous. Is there a reason that both
.\"       names are used?

       EFAULT A required pointer argument (i.e., uaddr, uaddr2, or time‐
              out) did not point to a valid user-space address.

       EINTR  A  FUTEX_WAIT  or  FUTEX_WAIT_BITSET  operation was inter‐
              rupted by a signal (see  signal(7)).   In  kernels  before
              Linux 2.6.22, this error could also be returned on a
              spurious wakeup; since Linux 2.6.22, this no  longer  hap‐
              pens.

       EINVAL The  operation  in futex_op is one of those that employs a
              timeout, but the supplied  timeout  argument  was  invalid
              (tv_sec  was  less than zero, or tv_nsec was not less than
              1,000,000,000).

       EINVAL The operation specified in futex_op employs one or both of
              the  pointers  uaddr and uaddr2, but one of these does not
              point to a valid object—that is, the address is not  four-
              byte-aligned.

       EINVAL (FUTEX_WAIT_BITSET, FUTEX_WAKE_BITSET) The bitset supplied
              in val3 is zero.

       EINVAL (FUTEX_CMP_REQUEUE_PI)  uaddr  equals  uaddr2  (i.e.,   an
              attempt was made to requeue to the same futex).

       EINVAL (FUTEX_FD) The signal number supplied in val is invalid.

       EINVAL (FUTEX_WAKE,       FUTEX_WAKE_OP,       FUTEX_WAKE_BITSET,
              FUTEX_REQUEUE, FUTEX_CMP_REQUEUE) The kernel  detected  an
              inconsistency  between  the  user-space state at uaddr and
              the kernel state—that is, it detected a waiter which waits
              in FUTEX_LOCK_PI on uaddr.

       EINVAL (FUTEX_LOCK_PI,   FUTEX_TRYLOCK_PI,  FUTEX_UNLOCK_PI)  The
              kernel detected an inconsistency  between  the  user-space
              state  at  uaddr  and  the  kernel  state.  This indicates
              either state corruption or that the kernel found a  waiter
              on    uaddr   which   is   waiting   via   FUTEX_WAIT   or
              FUTEX_WAIT_BITSET.
.\" FIXME Above, tglx did not mention the "state corruption" case for
.\"       FUTEX_UNLOCK_PI, but I have added it, since I'm estimating
.\"       that it also applied for FUTEX_UNLOCK_PI.
.\"       So, does that case also apply for FUTEX_UNLOCK_PI?


       EINVAL (FUTEX_CMP_REQUEUE_PI) The kernel  detected  an  inconsis‐
              tency  between the user-space state at uaddr2 and the ker‐
              nel state; that is, the kernel  detected  a  waiter  which
              waits via FUTEX_WAIT on uaddr2.
.\" FIXME In the preceding sentence, tglx did not mention FUTEX_WAIT_BITSET,
.\"       but should that not also be included here?


       EINVAL (FUTEX_CMP_REQUEUE_PI)  The  kernel  detected an inconsis‐
              tency between the user-space state at uaddr and the kernel
              state;  that  is, the kernel detected a waiter which waits
              via FUTEX_WAIT or FUTEX_WAIT_BITSET on uaddr.

       EINVAL (FUTEX_CMP_REQUEUE_PI) The kernel  detected  an  inconsis‐
              tency between the user-space state at uaddr and the kernel
              state; that is, the kernel detected a waiter  which  waits
              on     uaddr     via     FUTEX_LOCK_PI     (instead     of
              FUTEX_WAIT_REQUEUE_PI).

.\" FIXME XXX The following is a reworded version of Darren Hart's text.
.\"       Please check that I did not introduce any errors.
       EINVAL (FUTEX_CMP_REQUEUE_PI) An attempt was made  to  requeue  a
              waiter  to a futex other than that specified by the match‐
              ing FUTEX_WAIT_REQUEUE_PI call for that waiter.

       EINVAL (FUTEX_CMP_REQUEUE_PI) The val argument is not 1.

       EINVAL Invalid argument.

       ENOMEM (FUTEX_LOCK_PI,  FUTEX_TRYLOCK_PI,   FUTEX_CMP_REQUEUE_PI)
              The  kernel could not allocate memory to hold state infor‐
              mation.

       ENFILE (FUTEX_FD) The system limit on the total  number  of  open
              files has been reached.

       ENOSYS Invalid operation specified in futex_op.

       ENOSYS The FUTEX_CLOCK_REALTIME option was specified in futex_op,
              but the accompanying operation was neither FUTEX_WAIT_BIT‐
              SET nor FUTEX_WAIT_REQUEUE_PI.

       ENOSYS (FUTEX_LOCK_PI,     FUTEX_TRYLOCK_PI,     FUTEX_UNLOCK_PI,
              FUTEX_CMP_REQUEUE_PI,  FUTEX_WAIT_REQUEUE_PI)  A  run-time
              check determined that the operation is not available.  The
              PI futex operations are not implemented on  all  architec‐
              tures and are not supported on some CPU variants.

       EPERM  (FUTEX_LOCK_PI,   FUTEX_TRYLOCK_PI,  FUTEX_CMP_REQUEUE_PI)
              The caller is not allowed to attach itself to the futex at
              uaddr  (for  FUTEX_CMP_REQUEUE_PI:  the  futex at uaddr2).
              (This may be caused by a state corruption in user space.)

       EPERM  (FUTEX_UNLOCK_PI) The caller does not own the lock  repre‐
              sented by the futex word.

       ESRCH  (FUTEX_LOCK_PI,   FUTEX_TRYLOCK_PI,  FUTEX_CMP_REQUEUE_PI)

.\" FIXME I reworded the following sentence a bit differently from
.\"       tglx's formulation. Is it okay?

              The thread ID in the futex word at uaddr does not exist.

       ESRCH  (FUTEX_CMP_REQUEUE_PI) 

.\" FIXME I reworded the following sentence a bit differently from
.\"       tglx's formulation. Is it okay?

              The thread ID in the futex word  at
              uaddr2 does not exist.

       ETIMEDOUT
              The  operation  in futex_op employed the timeout specified
              in timeout, and the timeout expired before  the  operation
              completed.
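
       Two of the EINVAL cases listed above (an invalid timeout and a
       misaligned futex address) can be checked in user space before
       making the call.  The following is a small sketch of such checks;
       the function names are hypothetical, and the kernel remains the
       authority on what it accepts.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical pre-flight checks mirroring two EINVAL cases */

/* A timeout is rejected if tv_sec is less than zero or if tv_nsec is
   not less than 1,000,000,000.  A negative tv_nsec is rejected here as
   well. */

static bool
futex_timeout_is_valid(const struct timespec *ts)
{
    return ts->tv_sec >= 0 &&
           ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L;
}

/* A futex word must be located at a four-byte-aligned address */

static bool
futex_addr_is_aligned(const void *uaddr)
{
    return ((uintptr_t) uaddr % sizeof(uint32_t)) == 0;
}
```

       Checks like these are mainly useful in debugging builds, where a
       failed assertion is easier to trace than an EINVAL returned from
       deep inside a locking primitive.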

VERSIONS
       Futexes were first made available in a stable kernel release with
       Linux 2.6.0.

       Initial futex support was merged in Linux 2.5.7 but with  differ‐
       ent  semantics  from  what  was described above.  A four-argument
       system call with the semantics described in this page was  intro‐
       duced  in Linux 2.5.40.  In Linux 2.5.70, one argument was added.
       In Linux 2.6.7, a sixth argument was added—messy,  especially  on
       the s390 architecture.

CONFORMING TO
       This system call is Linux-specific.

NOTES
       Glibc  does  not  provide a wrapper for this system call; call it
       using syscall(2).

       Various higher-level programming abstractions are implemented via
       futexes, including POSIX threads mutexes and condition variables,
       as well as POSIX semaphores.

EXAMPLE

.\" FIXME Is it worth having an example program?
.\" FIXME Anything obviously broken in the example program?

       The program below demonstrates use of futexes in a program  where
       parent  and  child  use a pair of futexes located inside a shared
       anonymous mapping to synchronize access to a shared resource: the
       terminal.   The  two  processes each write nloops (a command-line
       argument that defaults to 5 if omitted) messages to the  terminal
       and  employ  a  synchronization  protocol  that ensures that they
       alternate in writing messages.  Upon running this program we  see
       output such as the following:

           $ ./futex_demo
           Parent (18534) 0
           Child  (18535) 0
           Parent (18534) 1
           Child  (18535) 1
           Parent (18534) 2
           Child  (18535) 2
           Parent (18534) 3
           Child  (18535) 3
           Parent (18534) 4
           Child  (18535) 4

   Program source

       /* futex_demo.c

          Usage: futex_demo [nloops]
                           (Default: 5)

          Demonstrate the use of futexes in a program where parent and child
          use a pair of futexes located inside a shared anonymous mapping to
          synchronize access to a shared resource: the terminal. The two
          processes each write 'nloops' messages to the terminal and employ
          a synchronization protocol that ensures that they alternate in
          writing messages.
       */
       #define _GNU_SOURCE
       #include <stdio.h>
       #include <errno.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <sys/wait.h>
       #include <sys/mman.h>
       #include <sys/syscall.h>
       #include <linux/futex.h>
       #include <sys/time.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                               } while (0)

       static int *futex1, *futex2, *iaddr;

       static int
       futex(int *uaddr, int futex_op, int val,
             const struct timespec *timeout, int *uaddr2, int val3)
       {
           return syscall(SYS_futex, uaddr, futex_op, val,
                          timeout, uaddr2, val3);
       }

       /* Acquire the futex pointed to by 'futexp': wait for its value to
          become 1, and then set the value to 0. */

       static void
       fwait(int *futexp)
       {
           int s;

           /* __sync_bool_compare_and_swap(ptr, oldval, newval) is a gcc
              built-in function.  It atomically performs the equivalent of:

                  if (*ptr == oldval)
                      *ptr = newval;

              It returns true if the test yielded true and *ptr was updated.
              The alternative here would be to employ the equivalent atomic
              machine-language instructions.  For further information, see
              the GCC Manual. */

           while (1) {

               /* Is the futex available? */

               if (__sync_bool_compare_and_swap(futexp, 1, 0))
                   break;      /* Yes */

               /* Futex is not available; wait */

               s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
               if (s == -1 && errno != EAGAIN)
                   errExit("futex-FUTEX_WAIT");
           }
       }

       /* Release the futex pointed to by 'futexp': if the futex currently
          has the value 0, set its value to 1 and then wake any futex waiters,
          so that if the peer is blocked in fwait(), it can proceed. */

       static void
       fpost(int *futexp)
       {
           int s;

           /* __sync_bool_compare_and_swap() was described in comments above */

           if (__sync_bool_compare_and_swap(futexp, 0, 1)) {

               s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
               if (s == -1)
                   errExit("futex-FUTEX_WAKE");
           }
       }

       int
       main(int argc, char *argv[])
       {
           pid_t childPid;
           int j, nloops;

           setbuf(stdout, NULL);

           nloops = (argc > 1) ? atoi(argv[1]) : 5;

           /* Create a shared anonymous mapping that will hold the futexes.
              Since the futexes are being shared between processes, we
              subsequently use the "shared" futex operations (i.e., not the
              ones suffixed "_PRIVATE") */

           iaddr = mmap(NULL, sizeof(int) * 2, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_SHARED, -1, 0);
           if (iaddr == MAP_FAILED)
               errExit("mmap");

           futex1 = &iaddr[0];
           futex2 = &iaddr[1];

           *futex1 = 0;        /* State: unavailable */
           *futex2 = 1;        /* State: available */

           /* Create a child process that inherits the shared anonymous
              mapping */

           childPid = fork();
           if (childPid == -1)
               errExit("fork");

           if (childPid == 0) {        /* Child */
               for (j = 0; j < nloops; j++) {
                   fwait(futex1);
                   printf("Child  (%ld) %d\n", (long) getpid(), j);
                   fpost(futex2);
               }

               exit(EXIT_SUCCESS);
           }

           /* Parent falls through to here */

           for (j = 0; j < nloops; j++) {
               fwait(futex2);
               printf("Parent (%ld) %d\n", (long) getpid(), j);
               fpost(futex1);
           }

           wait(NULL);

           exit(EXIT_SUCCESS);
       }

SEE ALSO
       get_robust_list(2), restart_syscall(2), futex(7)

       The following kernel source files:

       * Documentation/pi-futex.txt

       * Documentation/futex-requeue-pi.txt

       * Documentation/locking/rt-mutex.txt

       * Documentation/locking/rt-mutex-design.txt

       * Documentation/robust-futex-ABI.txt

       Franke, H., Russell, R., and Kirkwood, M., 2002.  Fuss, Futexes
       and Furwocks: Fast Userlevel Locking in Linux (from proceedings
       of the Ottawa Linux Symposium 2002),
       ⟨http://kernel.org/doc/ols/2002/ols2002-pages-479-495.pdf⟩

       Hart, D., 2009. A futex overview and update,
       ⟨http://lwn.net/Articles/360699/⟩

       Hart, D. and Guniguntala, D., 2009.  Requeue-PI: Making Glibc
       Condvars PI-Aware (from proceedings of the 2009 Real-Time Linux
       Workshop),
       ⟨http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf⟩

       Drepper, U., 2011. Futexes Are Tricky,
       ⟨http://www.akkadia.org/drepper/futex.pdf⟩

       Futex example library, futex-*.tar.bz2 at
       ⟨ftp://ftp.kernel.org/pub/linux/kernel/people/rusty/⟩

.\" FIXME Are there any other resources that should be listed
.\"       in the SEE ALSO section?

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/