Message-ID: <alpine.DEB.2.02.1405151144390.6261@ionos.tec.linutronix.de>
Date:	Thu, 15 May 2014 16:14:52 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
cc:	Carlos O'Donell <carlos@...hat.com>,
	Darren Hart <dvhart@...ux.intel.com>,
	Ingo Molnar <mingo@...e.hu>, Jakub Jelinek <jakub@...hat.com>,
	"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	Arnd Bergmann <arnd@...db.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Linux API <linux-api@...r.kernel.org>
Subject: Re: futex(2) man page update help request

On Thu, 15 May 2014, Michael Kerrisk (man-pages) wrote:
> And that universe would love to have your documentation of
> FUTEX_WAKE_BITSET and FUTEX_WAIT_BITSET ;-),

I give you almost the full treatment, but I leave REQUEUE_PI to Darren
and FUTEX_WAKE_OP to Jakub. :)


FUTEX_WAIT

	< Existing blurb seems ok >

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.

	[EWOULDBLOCK] The atomic enqueueing failed. The user space value
		      at uaddr is not equal to the val argument.

	[ETIMEDOUT] timeout expired 
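
To make the FUTEX_WAIT return values above concrete, here is a minimal sketch in C. It assumes Linux with the linux/futex.h header and goes through syscall(2), since glibc exports no futex() wrapper. The names demo_word and wait_errno are hypothetical helpers made up for this illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Futex word used only by this sketch; it must be 4-byte aligned. */
static uint32_t demo_word = 1;

/*
 * Hypothetical helper: block in FUTEX_WAIT while demo_word == expected,
 * for at most sec/nsec (relative). Returns 0 on a wakeup, otherwise the
 * errno value the kernel reported.
 */
static int wait_errno(uint32_t expected, long sec, long nsec)
{
	struct timespec ts = { .tv_sec = sec, .tv_nsec = nsec };

	if (syscall(SYS_futex, &demo_word, FUTEX_WAIT, expected, &ts, NULL, 0) == 0)
		return 0;
	return errno;
}
```

With demo_word holding 1, wait_errno(0, 0, 1000000) should fail with EWOULDBLOCK (the same value as EAGAIN on Linux), wait_errno(1, 0, 1000000) should time out with ETIMEDOUT after about a millisecond, and wait_errno(1, 0, 1000000000) should be rejected with EINVAL because tv_nsec is not normalized.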


FUTEX_WAKE

	< Existing blurb seems ok >

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI
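
A small sketch of the wake side, under the same assumptions (Linux, linux/futex.h, raw syscall(2)). The helper names are hypothetical. On success FUTEX_WAKE returns the number of waiters it woke, so with nobody waiting the result is 0; the second helper passes an address that is deliberately off by one byte to provoke the alignment check.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t demo_word;  /* futex word used only by this sketch */

/* Hypothetical wrapper: wake at most nr_wake waiters blocked on uaddr. */
static long futex_wake(void *uaddr, int nr_wake)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr_wake, NULL, NULL, 0);
}

/* No thread waits on demo_word, so nobody is woken. */
static long wake_without_waiters(void)
{
	return futex_wake(&demo_word, 1);
}

/* An address that is not 4-byte aligned; returns the reported errno. */
static int wake_misaligned_errno(void)
{
	char *misaligned = (char *)&demo_word + 1;

	return futex_wake(misaligned, 1) == -1 ? errno : 0;
}
```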

FUTEX_REQUEUE

	Existing blurb seems ok, except for this:

	The argument val contains the number of waiters on uaddr which
	are immediately woken up.

	The timeout argument is abused to transport the number of
	waiters which are requeued to the futex at uaddr2. The pointer
	is typecast to u32.


	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EINVAL] uaddr is equal to uaddr2. Requeue to the same futex.

FUTEX_CMP_REQUEUE

	Existing blurb seems ok, except for this:

	The argument val contains the number of waiters on uaddr
	which are immediately woken up.

	The timeout argument is abused to transport the number of
	waiters which are requeued to the futex at uaddr2. The pointer
	is typecast to u32.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr or uaddr2

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] uaddr is equal to uaddr2. Requeue to the same futex.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EAGAIN] The value read from uaddr is not equal to the compare
		 value in argument val3
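
The argument passing for the requeue ops is easy to get wrong, so here is a hedged sketch using FUTEX_CMP_REQUEUE (the constant name in linux/futex.h). It assumes Linux and raw syscall(2); queue_a, queue_b and both helpers are hypothetical names. The point is the cast in the fourth argument: the requeue count travels in the slot that other ops use for the timeout pointer.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t queue_a, queue_b;  /* two distinct futex words, both 0 */

/*
 * Hypothetical wrapper: wake nr_wake waiters on queue_a and move up to
 * nr_requeue further waiters to queue_b, but only if queue_a still
 * holds the value expected (passed as val3).
 */
static long cmp_requeue(uint32_t nr_wake, uint32_t nr_requeue, uint32_t expected)
{
	return syscall(SYS_futex, &queue_a, FUTEX_CMP_REQUEUE, nr_wake,
		       (void *)(uintptr_t)nr_requeue, &queue_b, expected);
}

/* Returns 0 if the call succeeded, otherwise the reported errno. */
static int cmp_requeue_errno(uint32_t expected)
{
	return cmp_requeue(1, 1, expected) == -1 ? errno : 0;
}
```

With no waiters queued, cmp_requeue(1, 1, 0) should return 0, while cmp_requeue_errno(42) should report EAGAIN because queue_a holds 0 rather than 42.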

FUTEX_WAKE_OP


Jakub, can you please explain it? I'm lost :)


	The argument val contains the maximum number of waiters on
	uaddr which are immediately woken up.

	The timeout argument is abused to transport the maximum
	number of waiters on uaddr2 which are woken up. The pointer
	is typecast to u32.

	Related return values

	[EFAULT] Kernel was unable to access the futex values at uaddr
		 or uaddr2

	[EINVAL] The supplied uaddr or uaddr2 argument does not point
		 to a valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr


FUTEX_WAIT_BITSET

	The same as FUTEX_WAIT except that val3 is used to provide a
	32bit bitset to the kernel. This bitset is stored in the
	kernel internal state of the waiter.

	This futex op also allows the option bit FUTEX_CLOCK_REALTIME
	to be set.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

  	[EINVAL] The supplied bitset is zero.

	[EINVAL] The supplied timeout argument is not normalized.

	[ETIMEDOUT] timeout expired 


FUTEX_WAKE_BITSET

	The same as FUTEX_WAKE except that val3 is used to provide a
	32bit bitset to the kernel. This bitset is used to select
	waiters on the futex. The selection is done by a bitwise AND
	of the wake side supplied bitset and the bitset which is
	stored in the kernel internal state of the waiters. If the
	result is non zero, the waiter is woken, otherwise left
	waiting.

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

  	[EINVAL] The supplied bitset is zero.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI
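
The bitset selection rule can be written down in a line of C. The sketch below pairs that rule with a real FUTEX_WAKE_BITSET call to show the zero-bitset check. It assumes Linux and raw syscall(2); bitset_selects and wake_bitset_errno are hypothetical names, and bitset_selects only restates the rule from the text, it is not the kernel code.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t demo_word;  /* futex word used only by this sketch */

/* The rule from the text: wake the waiter if the two bitsets intersect. */
static int bitset_selects(uint32_t waiter_bitset, uint32_t wake_bitset)
{
	return (waiter_bitset & wake_bitset) != 0;
}

/* Wake one waiter matching bitset (val3); 0 on success, else errno. */
static int wake_bitset_errno(uint32_t bitset)
{
	long rc = syscall(SYS_futex, &demo_word, FUTEX_WAKE_BITSET, 1,
			  NULL, NULL, bitset);

	return rc == -1 ? errno : 0;
}
```

A waiter stored with bitset 0x1 is selected by a wake with 0x3 but one stored with 0x4 is not. Passing 0 as the wake bitset should fail with EINVAL, and FUTEX_BITSET_MATCH_ANY (all bits set) matches every waiter.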

FUTEX_LOCK_PI

	This operation reads from the futex address provided by the
	uaddr argument, which contains the namespace specific TID of
	the lock owner. If the TID is 0, then the kernel tries to set
	the waiter's TID atomically. If the TID is nonzero or the
	takeover fails, the kernel atomically sets the FUTEX_WAITERS
	bit, which signals to the owner that it cannot unlock the
	futex in user space atomically by transitioning from TID to 0.
	After that the kernel tries to find the task which is
	associated with the owner TID, creates or reuses kernel state
	on behalf of the owner and attaches the waiter to it. The
	enqueueing of the waiter is in descending priority order if
	more than one waiter exists. The owner inherits either the
	priority or the bandwidth of the waiter. This inheritance
	follows the lock chain in the case of nested locking and
	performs deadlock detection.

	The timeout argument is handled as described in FUTEX_WAIT.
	The arguments uaddr2, val, and val3 are ignored.

	Related return values

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.
		 
	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state. That's
		 either state corruption or it found a waiter on uaddr
		 which is waiting on FUTEX_WAIT[_BITSET]

	[EPERM]  Caller is not allowed to attach itself to the futex.
		 Can be a legitimate issue or a hint for state
		 corruption in user space

	[ESRCH]	 The TID in the user space value does not exist

	[EAGAIN] The futex owner TID is about to exit, but has not yet
		 handled the internal state cleanup. Try again.	 

	[ETIMEDOUT] timeout expired 

	[EDEADLOCK] The futex is already locked by the caller or the kernel
		    detected a deadlock scenario in a nested lock chain

	[EOWNERDIED] The owner of the futex died and the kernel made the
		     caller the new owner. The kernel sets the
		     FUTEX_OWNER_DIED bit in the futex userspace value.
		     Caller is responsible for cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants  (runtime detection)
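
The TID protocol described for FUTEX_LOCK_PI implies a user space fast path: try to swing the word from 0 to the caller's TID, and enter the kernel only when that fails. A minimal sketch, assuming Linux, C11 atomics and raw syscall(2); lock_word, pi_lock and pi_lock_held_by_caller are hypothetical names, and a real lock would also need error handling and robust-list setup.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static _Atomic uint32_t lock_word;  /* 0 means unlocked */

/*
 * Hypothetical lock function. Fast path: swing the word from 0 to our
 * TID in user space. Slow path: let the kernel set FUTEX_WAITERS and
 * queue us behind the current owner.
 */
static long pi_lock(void)
{
	uint32_t unlocked = 0;
	uint32_t tid = (uint32_t)syscall(SYS_gettid);

	if (atomic_compare_exchange_strong(&lock_word, &unlocked, tid))
		return 0;
	return syscall(SYS_futex, &lock_word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Nonzero if the calling thread's TID is stored in the lock word. */
static int pi_lock_held_by_caller(void)
{
	uint32_t tid = (uint32_t)syscall(SYS_gettid);

	return (atomic_load(&lock_word) & FUTEX_TID_MASK) == tid;
}
```

In the uncontended case pi_lock() never enters the kernel, and the word ends up holding the caller's TID.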
		     
FUTEX_TRYLOCK_PI

	This operation tries to acquire the futex at uaddr. It deals
	with the situation where the TID value at uaddr is 0, but the
	FUTEX_WAITERS bit is set. User space cannot handle this in a
	race-free manner.

	The arguments uaddr2, val, timeout and val3 are ignored.

	Return values:

	[EFAULT] Kernel was unable to access the futex value at uaddr.

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr argument does not point to a valid
		 object, i.e. pointer is not 4 byte aligned

	[EINVAL] The kernel detected inconsistent state between the user
		 space state at uaddr and the kernel state

	[EPERM]  Caller is not allowed to attach itself to the futex.
		 Can be a legitimate issue or a hint for state
		 corruption in user space

	[ESRCH]	 The TID in the user space value does not exist

	[EAGAIN] The futex owner TID is about to exit, but has not yet
		 handled the internal state cleanup. Try again.	 

	[EDEADLOCK] The futex is already locked by the caller.

	[EOWNERDIED] The owner of the futex died and the kernel made the
		     caller the new owner. The kernel sets the
		     FUTEX_OWNER_DIED bit in the futex userspace value.
		     Caller is responsible for cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)
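
The situation FUTEX_TRYLOCK_PI is meant for can be reproduced in a single thread: a word whose TID part is 0 while the waiters bit (FUTEX_WAITERS in linux/futex.h) is set. A user space compare-and-swap from 0 cannot succeed on such a word, so the kernel has to take the lock. The sketch below assumes Linux with PI futex support; lock_word and trylock_with_waiters_bit are hypothetical names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static _Atomic uint32_t lock_word;

/*
 * Hypothetical demo: put the word into the "TID 0 but waiters bit set"
 * state, then ask the kernel to acquire it. Returns 0 if the caller
 * ends up as owner, -1 if the word looks wrong, else the reported errno.
 */
static int trylock_with_waiters_bit(void)
{
	uint32_t tid = (uint32_t)syscall(SYS_gettid);

	atomic_store(&lock_word, FUTEX_WAITERS);
	if (syscall(SYS_futex, &lock_word, FUTEX_TRYLOCK_PI, 0, NULL, NULL, 0) == -1)
		return errno;
	return (atomic_load(&lock_word) & FUTEX_TID_MASK) == tid ? 0 : -1;
}
```

The check masks with FUTEX_TID_MASK because whether the waiters bit survives the acquisition is left to the kernel.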

FUTEX_UNLOCK_PI

	This operation wakes the top priority waiter which is waiting
	in FUTEX_LOCK_PI on the futex address provided by the uaddr
	argument.

	This is called when the user space value at uaddr cannot be
	changed atomically from TID (of the owner) to 0.

	The arguments uaddr2, val, timeout and val3 are ignored.

	Related return values:
	
	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT[_BITSET].

	[EPERM]  Caller does not own the futex.

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)
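
The unlock side mirrors the TID protocol: user space first tries to swing the word from its own TID back to 0 and calls FUTEX_UNLOCK_PI only when that fails. A minimal sketch, assuming Linux with PI futex support and C11 atomics; all names here are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static _Atomic uint32_t lock_word;

/*
 * Hypothetical unlock function. Fast path: swing the word from our bare
 * TID back to 0. If that fails (for example because FUTEX_WAITERS is
 * set), the kernel has to hand the lock to the top priority waiter.
 */
static long pi_unlock(void)
{
	uint32_t owner = (uint32_t)syscall(SYS_gettid);

	if (atomic_compare_exchange_strong(&lock_word, &owner, 0))
		return 0;
	return syscall(SYS_futex, &lock_word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}

/* Store our TID as owner, unlock, and check the word went back to 0. */
static int unlock_owned_fast_path(void)
{
	atomic_store(&lock_word, (uint32_t)syscall(SYS_gettid));
	return pi_unlock() == 0 && atomic_load(&lock_word) == 0;
}

/* Unlock a word we do not own; returns the reported errno. */
static int unlock_unowned_errno(void)
{
	atomic_store(&lock_word, 0);
	return pi_unlock() == -1 ? errno : 0;
}
```

unlock_unowned_errno() falls through to the syscall because the word holds 0 rather than the caller's TID, and the kernel should answer with EPERM.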

FUTEX_WAIT_REQUEUE_PI

	Wait operation to wait on a non pi futex at uaddr and
	potentially be requeued on a pi futex at uaddr2. The wait
	operation on uaddr is the same as FUTEX_WAIT. The waiter can
	be removed from the wait on uaddr via FUTEX_WAKE without
	requeuing on uaddr2.

	The timeout argument is handled as described in FUTEX_WAIT.

Darren, can you fill in the missing details?

	Return values:

	[EFAULT] Kernel was unable to access the futex value at uaddr
		 or uaddr2

	[EINVAL] The supplied uaddr or uaddr2 argument does not point
		 to a valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] The supplied timeout argument is not normalized.

  	[EINVAL] The supplied bitset is zero.

	[EWOULDBLOCK] The atomic enqueueing failed. User space value
		      at uaddr is not equal val argument.

	[ETIMEDOUT] timeout expired 

	[EOWNERDIED] The owner of the PI futex at uaddr2 died and the
		     kernel made the caller the new owner. The kernel
		     sets the FUTEX_OWNER_DIED bit in the uaddr2 futex
		     userspace value.  Caller is responsible for
		     cleanup

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)


FUTEX_CMP_REQUEUE_PI

	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
	a non PI futex. Outer futex to which is requeued is a PI futex
	at uaddr2.

	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.

	The argument val contains the number of waiters on uaddr
	which are immediately woken up. Must be 1 for this opcode.

	The timeout argument is abused to transport the number of
	waiters which are requeued onto the futex at uaddr2. The
	pointer is typecast to u32.

Darren, can you fill in the missing details?

	[EFAULT] Kernel was unable to access the futex value at uaddr
		 or uaddr2

	[ENOMEM] Kernel could not allocate state

	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
		 valid object, i.e. pointer is not 4 byte aligned

	[EINVAL] uaddr is equal to uaddr2. Requeue to the same futex.

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_LOCK_PI on uaddr

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT[_BITSET] on uaddr

	[EINVAL] The kernel detected inconsistent state between the
		 user space state at uaddr2 and the kernel state,
		 i.e. it detected a waiter which waits in
		 FUTEX_WAIT on uaddr2.

  	[EINVAL] The supplied bitset is zero.

	[EAGAIN] The value read from uaddr is not equal to the compare
		 value in argument val3

	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
		 has not yet handled the internal state cleanup. Try
		 again.

	[EPERM]  Caller is not allowed to attach the waiter to the
		 futex at uaddr2. Can be a legitimate issue or a hint
		 for state corruption in user space

	[ESRCH]	 The TID in the user space value at uaddr2 does not exist

	[EDEADLOCK] The requeuing of a waiter to the kernel representation
		    of the PI futex at uaddr2 detected a deadlock scenario.

        [ENOSYS] Not implemented on all architectures and not supported
		 on some CPU variants (runtime detection)
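
A hedged sketch of the FUTEX_CMP_REQUEUE_PI call shape, assuming Linux with PI futex support and raw syscall(2). The names cond_word and mutex_word are hypothetical and reflect one plausible use (a condition variable word requeued onto a PI mutex word); the text above does not prescribe that pairing. The helper shows the "val must be 1" rule: any other wake count should be rejected.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t cond_word;   /* non-PI futex the waiters block on */
static uint32_t mutex_word;  /* PI futex the waiters are moved to */

/*
 * Hypothetical helper: requeue from cond_word to mutex_word, passing
 * the current value of cond_word as the compare value (val3).
 * Returns 0 on success, otherwise the reported errno.
 */
static int cmp_requeue_pi_errno(uint32_t nr_wake, uint32_t nr_requeue)
{
	long rc = syscall(SYS_futex, &cond_word, FUTEX_CMP_REQUEUE_PI, nr_wake,
			  (void *)(uintptr_t)nr_requeue, &mutex_word, cond_word);

	return rc == -1 ? errno : 0;
}
```

cmp_requeue_pi_errno(2, 1) should report EINVAL because val is not 1 (or ENOSYS where PI futexes are unavailable).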


The various option bits seem to be undocumented as well

FUTEX_PRIVATE_FLAG

	This option bit can be ORed into all futex ops.

	It tells the kernel that the futex is process private and not
	shared with another process. That allows the kernel to choose
	the fast path for validating the user space address and avoids
	expensive VMA lookup, taking refcounts on file backing store
	etc.
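
For illustration, ORing the flag into an op looks like this. The sketch assumes Linux and raw syscall(2); demo_word and wake_private are hypothetical names. linux/futex.h also provides pre-combined constants such as FUTEX_WAKE_PRIVATE, which equal the op ORed with FUTEX_PRIVATE_FLAG.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t demo_word;  /* only ever used by threads of this process */

/* Hypothetical wrapper: process-private wake of at most nr_wake waiters. */
static long wake_private(int nr_wake)
{
	return syscall(SYS_futex, &demo_word, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
		       nr_wake, NULL, NULL, 0);
}
```

The flag is only valid when every waiter and waker lives in the same process; a futex in memory shared between processes must be used without it.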

FUTEX_CLOCK_REALTIME

	This option bit can be ORed into the futex ops FUTEX_WAIT_BITSET
	and FUTEX_WAIT_REQUEUE_PI.

	If set, the kernel treats the user space supplied timeout as
	absolute time based on CLOCK_REALTIME.

	If not set, the kernel measures the user space supplied
	timeout against CLOCK_MONOTONIC.

	If this is set on any op other than the supported ones, the
	kernel returns ENOSYS!
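
A short sketch of both behaviours, assuming Linux and raw syscall(2); the helper names are hypothetical. The first helper builds an absolute CLOCK_REALTIME deadline a few milliseconds ahead and waits on it with FUTEX_WAIT_BITSET, the second sets the flag on FUTEX_WAKE, which is not one of the supported ops.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static uint32_t demo_word;  /* stays 0, so the wait below really blocks */

/* Wait until an absolute CLOCK_REALTIME deadline delay_ms from now. */
static int wait_until_realtime_errno(long delay_ms)
{
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += delay_ms / 1000;
	deadline.tv_nsec += (delay_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {  /* keep the value normalized */
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	if (syscall(SYS_futex, &demo_word, FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
		    0, &deadline, NULL, FUTEX_BITSET_MATCH_ANY) == 0)
		return 0;
	return errno;
}

/* The flag on an op that does not support it; returns the errno. */
static int wake_with_clock_flag_errno(void)
{
	long rc = syscall(SYS_futex, &demo_word, FUTEX_WAKE | FUTEX_CLOCK_REALTIME,
			  1, NULL, NULL, 0);

	return rc == -1 ? errno : 0;
}
```

With nobody waking demo_word, wait_until_realtime_errno(10) should return ETIMEDOUT once the deadline passes, and wake_with_clock_flag_errno() should return ENOSYS.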


Thanks,

	tglx
