Message-ID: <53752157.9070803@gmail.com>
Date: Thu, 15 May 2014 22:19:35 +0200
From: "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: mtk.manpages@...il.com, Carlos O'Donell <carlos@...hat.com>,
Darren Hart <dvhart@...ux.intel.com>,
Ingo Molnar <mingo@...e.hu>, Jakub Jelinek <jakub@...hat.com>,
"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
lkml <linux-kernel@...r.kernel.org>,
Davidlohr Bueso <davidlohr.bueso@...com>,
Arnd Bergmann <arnd@...db.de>,
Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <peterz@...radead.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: futex(2) man page update help request
On 05/15/2014 04:14 PM, Thomas Gleixner wrote:
> On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
>> And that universe would love to have your documentation of
>> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),
>
> I give you almost the full treatment, but I leave REQUEUE_PI to Darren
> and FUTEX_WAKE_OP to Jakub. :)
Thanks Thomas--that's fantastic! Hopefully, Darren and Jakub fill in those
missing pieces...
Cheers,
Michael
> FUTEX_WAIT
>
> < Existing blurb seems ok >
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The supplied timeout argument is not normalized.
>
> [EWOULDBLOCK] The atomic enqueueing failed. The user space value
> at uaddr is not equal to the val argument.
>
> [ETIMEDOUT] timeout expired
>
>
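
A minimal sketch of how the FUTEX_WAIT return values listed above show up in user space, assuming Linux and the raw syscall interface (glibc has no futex() wrapper). The helper names futex_wait() and futex_wait_reason() are made up for illustration; on Linux, EWOULDBLOCK and EAGAIN are the same value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block while *uaddr == expected.
 * Returns 0 when woken, or a negative errno value. */
static int futex_wait(uint32_t *uaddr, uint32_t expected,
                      const struct timespec *timeout)
{
    long rc = syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
                      timeout, NULL, 0);
    return rc == -1 ? -errno : 0;
}

/* Hypothetical helper: map a result to the cases listed above. */
static const char *futex_wait_reason(int err)
{
    switch (err) {
    case 0:          return "woken";
    case -EAGAIN:    return "value at uaddr != val"; /* EWOULDBLOCK */
    case -ETIMEDOUT: return "timeout expired";
    case -EINVAL:    return "misaligned uaddr or bad timeout";
    case -EFAULT:    return "uaddr not accessible";
    case -EINTR:     return "interrupted by a signal";
    default:         return "unexpected error";
    }
}
```

A timeout with tv_nsec >= 1000000000 is the "not normalized" case and is rejected before the value at uaddr is compared.
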
> FUTEX_WAKE
>
> < Existing blurb seems ok >
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI
>
> FUTEX_REQUEUE
>
> Existing blurb seems ok, except for this:
>
> The argument val contains the number of waiters on uaddr which
> are immediately woken up.
>
> The timeout argument is abused to transport the number of
> waiters which are requeued to the futex at uaddr2. The pointer
> is cast to u32.
>
>
> [EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2
>
> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> valid object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI on uaddr
>
> [EINVAL] uaddr equals uaddr2, i.e. a requeue to the same futex.
>
> FUTEX_CMP_REQUEUE
>
> Existing blurb seems ok, except for this:
>
> The argument val contains the number of waiters on uaddr
> which are immediately woken up.
>
> The timeout argument is abused to transport the number of
> waiters which are requeued to the futex at uaddr2. The pointer
> is cast to u32.
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2
>
> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> valid object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] uaddr equals uaddr2, i.e. a requeue to the same futex.
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI on uaddr
>
> [EAGAIN] The value read from uaddr is not equal to the compare
> value in argument val3
>
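
To illustrate how the requeue count travels in the timeout slot, here is a rough sketch of a FUTEX_CMP_REQUEUE call, assuming Linux and the raw syscall interface. The wrapper name futex_cmp_requeue() and its parameter names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wake up to nr_wake waiters on uaddr and requeue
 * up to nr_requeue of the remaining ones to uaddr2, provided that
 * *uaddr still equals expected (passed as val3).
 * Returns the kernel's result or a negative errno value. */
static long futex_cmp_requeue(uint32_t *uaddr, uint32_t nr_wake,
                              uint32_t nr_requeue, uint32_t *uaddr2,
                              uint32_t expected)
{
    /* The fourth argument slot is declared as a timeout pointer;
     * the count is passed in it as an integer converted to a pointer. */
    void *val2 = (void *)(uintptr_t)nr_requeue;
    long rc = syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                      val2, uaddr2, expected);

    return rc == -1 ? -errno : rc;
}
```

With no waiters queued on uaddr the call has nothing to wake or requeue; with a stale compare value it fails with EAGAIN, and the caller is expected to re-read the futex word and retry.
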
> FUTEX_WAKE_OP
>
>
> Jakub, can you please explain it? I'm lost :)
>
>
> The argument val contains the maximum number of waiters on
> uaddr which are immediately woken up.
>
> The timeout argument is abused to transport the maximum
> number of waiters on uaddr2 which are woken up. The pointer
> is cast to u32.
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex values at uaddr
> or uaddr2
>
> [EINVAL] The supplied uaddr or uaddr2 argument does not point
> to a valid object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI on uaddr
>
>
> FUTEX_WAIT_BITSET
>
> The same as FUTEX_WAIT except that val3 is used to provide a
> 32-bit bitset to the kernel. This bitset is stored in the
> kernel internal state of the waiter.
>
> This futex op also allows the option bit FUTEX_CLOCK_REALTIME
> to be set.
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The supplied bitset is zero.
>
> [EINVAL] The supplied timeout argument is not normalized.
>
> [ETIMEDOUT] timeout expired
>
>
> FUTEX_WAKE_BITSET
>
> The same as FUTEX_WAKE except that val3 is used to provide a
> 32-bit bitset to the kernel. This bitset is used to select
> waiters on the futex. The selection is done by a bitwise AND
> of the bitset supplied by the wake side and the bitset which
> is stored in the kernel internal state of each waiter. If the
> result is nonzero, the waiter is woken, otherwise it is left
> waiting.
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The supplied bitset is zero.
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI
>
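
The selection rule for FUTEX_WAKE_BITSET can be modelled in plain C. This is only a user space model of the bitwise AND described above, not kernel code; struct waiter and wake_bitset() are names invented for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a queued waiter: the bitset it passed to FUTEX_WAIT_BITSET. */
struct waiter {
    uint32_t bitset;
    int woken;
};

/* Wake at most nr_wake waiters whose stored bitset overlaps the bitset
 * supplied by the waker. Returns the number woken, or -EINVAL for a
 * zero bitset, mirroring the error list above. */
static int wake_bitset(struct waiter *q, size_t n, int nr_wake,
                       uint32_t bitset)
{
    int woken = 0;

    if (bitset == 0)
        return -EINVAL;

    for (size_t i = 0; i < n && woken < nr_wake; i++) {
        if (q[i].woken)
            continue;
        if (q[i].bitset & bitset) {   /* nonzero AND: selected */
            q[i].woken = 1;
            woken++;
        }
    }
    return woken;
}
```

For waiters with bitsets 0x1, 0x2 and 0x3, a wake with bitset 0x2 selects the second and third and leaves the first one waiting.
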
> FUTEX_LOCK_PI
>
> This operation reads from the futex address provided by the
> uaddr argument, which contains the namespace-specific TID of
> the lock owner. If the TID is 0, the kernel tries to set the
> waiter's TID atomically. If the TID is nonzero or the
> takeover fails, the kernel atomically sets the FUTEX_WAITERS
> bit, which signals to the owner that it cannot unlock the
> futex in user space atomically by transitioning from TID to 0.
> After that the kernel tries to find the task which is
> associated with the owner TID, creates or reuses kernel state
> on behalf of the owner, and attaches the waiter to it. Waiters
> are enqueued in descending priority order if more than one
> waiter exists. The owner inherits either the priority or the
> bandwidth of the waiter. This inheritance follows the lock
> chain in the case of nested locking and performs deadlock
> detection.
>
> The timeout argument is handled as described in FUTEX_WAIT.
> The arguments uaddr2, val, and val3 are ignored.
>
> Related return values
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [ENOMEM] Kernel could not allocate state
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The supplied timeout argument is not normalized.
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state. That is
> either state corruption or it found a waiter on uaddr
> which is waiting on FUTEX_WAIT[_BITSET]
>
> [EPERM] Caller is not allowed to attach itself to the futex.
> Can be a legitimate issue or a hint for state
> corruption in user space
>
> [ESRCH] The TID in the user space value does not exist
>
> [EAGAIN] The futex owner TID is about to exit, but has not yet
> handled the internal state cleanup. Try again.
>
> [ETIMEDOUT] timeout expired
>
> [EDEADLOCK] The futex is already locked by the caller or the kernel
> detected a deadlock scenario in a nested lock chain
>
> [EOWNERDIED] The owner of the futex died and the kernel made the
> caller the new owner. The kernel sets the
> FUTEX_OWNER_DIED bit in the futex userspace value.
> Caller is responsible for cleanup
>
> [ENOSYS] Not implemented on all architectures and not supported
> on some CPU variants (runtime detection)
>
> FUTEX_TRYLOCK_PI
>
> This operation tries to acquire the futex at uaddr. It deals
> with the situation where the TID value at uaddr is 0, but the
> FUTEX_WAITERS bit is set. User space cannot handle this
> in a race-free manner.
>
> The arguments uaddr2, val, timeout and val3 are ignored.
>
> Return values:
>
> [EFAULT] Kernel was unable to access the futex value at uaddr.
>
> [ENOMEM] Kernel could not allocate state
>
> [EINVAL] The supplied uaddr argument does not point to a valid
> object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The kernel detected inconsistent state between the user
> space state at uaddr and the kernel state
>
> [EPERM] Caller is not allowed to attach itself to the futex.
> Can be a legitimate issue or a hint for state
> corruption in user space
>
> [ESRCH] The TID in the user space value does not exist
>
> [EAGAIN] The futex owner TID is about to exit, but has not yet
> handled the internal state cleanup. Try again.
>
> [EDEADLOCK] The futex is already locked by the caller.
>
> [EOWNERDIED] The owner of the futex died and the kernel made the
> caller the new owner. The kernel sets the
> FUTEX_OWNER_DIED bit in the futex userspace value.
> Caller is responsible for cleanup
>
> [ENOSYS] Not implemented on all architectures and not supported
> on some CPU variants (runtime detection)
>
> FUTEX_UNLOCK_PI
>
> This operation wakes the top priority waiter which is waiting
> in FUTEX_LOCK_PI on the futex address provided by the uaddr
> argument.
>
> This is called when the user space value at uaddr cannot be
> changed atomically from TID (of the owner) to 0.
>
> The arguments uaddr2, val, timeout and val3 are ignored.
>
> Related return values:
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_WAIT[_BITSET].
>
> [EPERM] Caller does not own the futex.
>
> [ENOSYS] Not implemented on all architectures and not supported
> on some CPU variants (runtime detection)
>
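
A rough sketch of the user space side that the FUTEX_LOCK_PI and FUTEX_UNLOCK_PI descriptions above imply: try the 0 -> TID and TID -> 0 transitions with a compare-and-swap first, and enter the kernel only when that fails. It assumes Linux, C11 atomics and the raw syscall interface; pi_lock() and pi_unlock() are hypothetical names, and owner-died handling and timeouts are left out.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: acquire a PI futex. Returns 0 or -errno. */
static int pi_lock(_Atomic uint32_t *futex)
{
    uint32_t tid = (uint32_t)syscall(SYS_gettid);
    uint32_t expected = 0;

    /* Fast path: uncontended, 0 -> TID without a syscall. */
    if (atomic_compare_exchange_strong(futex, &expected, tid))
        return 0;

    /* Slow path: the kernel sets FUTEX_WAITERS and queues the caller. */
    if (syscall(SYS_futex, futex, FUTEX_LOCK_PI, 0, NULL, NULL, 0) == -1)
        return -errno;
    return 0;
}

/* Hypothetical helper: release a PI futex. Returns 0 or -errno. */
static int pi_unlock(_Atomic uint32_t *futex)
{
    uint32_t expected = (uint32_t)syscall(SYS_gettid);

    /* Fast path: TID -> 0 succeeds only while FUTEX_WAITERS is clear. */
    if (atomic_compare_exchange_strong(futex, &expected, 0))
        return 0;

    /* Slow path: let the kernel hand the futex to the top waiter. */
    if (syscall(SYS_futex, futex, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0) == -1)
        return -errno;
    return 0;
}
```

In the uncontended case neither function enters the kernel, which is the point of keeping the owner TID in the user space word.
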
> FUTEX_WAIT_REQUEUE_PI
>
> Wait operation to wait on a non pi futex at uaddr and
> potentially be requeued on a pi futex at uaddr2. The wait
> operation on uaddr is the same as FUTEX_WAIT. The waiter can
> be removed from the wait on uaddr via FUTEX_WAKE without
> requeuing on uaddr2.
>
> The timeout argument is handled as described in FUTEX_WAIT.
>
> Darren, can you fill in the missing details?
>
> Return values:
>
> [EFAULT] Kernel was unable to access the futex value at uaddr
> or uaddr2
>
> [EINVAL] The supplied uaddr or uaddr2 argument does not point
> to a valid object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] The supplied timeout argument is not normalized.
>
> [EINVAL] The supplied bitset is zero.
>
> [EWOULDBLOCK] The atomic enqueueing failed. The user space value
> at uaddr is not equal to the val argument.
>
> [ETIMEDOUT] timeout expired
>
> [EOWNERDIED] The owner of the PI futex at uaddr2 died and the
> kernel made the caller the new owner. The kernel
> sets the FUTEX_OWNER_DIED bit in the uaddr2 futex
> userspace value. Caller is responsible for
> cleanup
>
> [ENOSYS] Not implemented on all architectures and not supported
> on some CPU variants (runtime detection)
>
>
> FUTEX_CMP_REQUEUE_PI
>
> PI-aware variant of FUTEX_CMP_REQUEUE. The inner futex at uaddr
> is a non-PI futex. The outer futex, to which waiters are
> requeued, is a PI futex at uaddr2.
>
> The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.
>
> The argument val contains the number of waiters on uaddr
> which are immediately woken up. Must be 1 for this opcode.
>
> The timeout argument is abused to transport the number of
> waiters which are requeued on to the futex at uaddr2. The
> pointer is cast to u32.
>
> Darren, can you fill in the missing details?
>
> [EFAULT] Kernel was unable to access the futex value at uaddr
> or uaddr2
>
> [ENOMEM] Kernel could not allocate state
>
> [EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
> valid object, i.e. pointer is not 4 byte aligned
>
> [EINVAL] uaddr equals uaddr2, i.e. a requeue to the same futex.
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_LOCK_PI on uaddr
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_WAIT[_BITSET] on uaddr
>
> [EINVAL] The kernel detected inconsistent state between the
> user space state at uaddr2 and the kernel state,
> i.e. it detected a waiter which waits in
> FUTEX_WAIT on uaddr2.
>
> [EINVAL] The supplied bitset is zero.
>
> [EAGAIN] The value read from uaddr is not equal to the compare
> value in argument val3
>
> [EAGAIN] The futex owner TID of uaddr2 is about to exit, but
> has not yet handled the internal state cleanup. Try
> again.
>
> [EPERM] Caller is not allowed to attach the waiter to the
> futex at uaddr2. Can be a legitimate issue or a hint
> for state corruption in user space
>
> [ESRCH] The TID in the user space value at uaddr2 does not exist
>
> [EDEADLOCK] The requeuing of a waiter to the kernel representation
> of the PI futex at uaddr2 detected a deadlock scenario.
>
> [ENOSYS] Not implemented on all architectures and not supported
> on some CPU variants (runtime detection)
>
>
> The various option bits seem to be undocumented as well
>
> FUTEX_PRIVATE_FLAG
>
> This option bit can be ORed into all futex ops.
>
> It tells the kernel that the futex is process private and not
> shared with another process. That allows the kernel to choose
> the fast path for validating the user space address and avoids
> expensive VMA lookups, taking refcounts on file backing store,
> etc.
>
> FUTEX_CLOCK_REALTIME
>
> This option bit can be ORed into the futex ops FUTEX_WAIT_BITSET
> and FUTEX_WAIT_REQUEUE_PI.
>
> If set, the kernel treats the user space supplied timeout as
> absolute time based on CLOCK_REALTIME.
>
> If not set, the kernel treats the user space supplied timeout
> as relative time.
>
> If this is set on any op other than the supported ones, the
> kernel returns ENOSYS!
>
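
As a small illustration of how the option bits combine with an op, the sketch below uses the constants from <linux/futex.h>. The three helper functions are hypothetical, and futex_clock_realtime_ok() only models the rule as stated above (FUTEX_WAIT_BITSET and FUTEX_WAIT_REQUEUE_PI); the set of ops a given kernel version accepts may differ.

```c
#include <linux/futex.h>
#include <stdbool.h>

/* Strip the option bits to recover the bare command. */
static int futex_cmd(int op)
{
    return op & FUTEX_CMD_MASK;
}

/* True if the caller declared the futex process private. */
static bool futex_is_private(int op)
{
    return (op & FUTEX_PRIVATE_FLAG) != 0;
}

/* Models the rule described above, not the kernel's actual code:
 * FUTEX_CLOCK_REALTIME is accepted only on the two listed ops;
 * anything else would be answered with ENOSYS. */
static bool futex_clock_realtime_ok(int op)
{
    int cmd = futex_cmd(op);

    if (!(op & FUTEX_CLOCK_REALTIME))
        return true;
    return cmd == FUTEX_WAIT_BITSET || cmd == FUTEX_WAIT_REQUEUE_PI;
}
```

A caller wanting an absolute CLOCK_REALTIME timeout on a process-private futex would pass FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME as the op.
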
>
> Thanks,
>
> tglx
>
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/