Message-ID: <fs57mucg3z5ay5ga7gqr6kdhlddydtmspwfkbm3rjtpjp57b6y@opvhf34v5xq4>
Date: Fri, 30 May 2025 11:51:58 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-man@...r.kernel.org,
André Almeida <andrealmeid@...lia.com>, Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>
Subject: Re: [[PATCH v3] 1/4] man/man2/prctl.2,
man/man2const/PR_FUTEX_HASH.2const: Document PR_FUTEX_HASH
Hi Sebastian,
On Mon, May 26, 2025 at 05:55:20PM +0200, Sebastian Andrzej Siewior wrote:
> The prctl(PR_FUTEX_HASH) is queued for the v6.16 merge window.
> Add some documentation of the interface.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> man/man2/prctl.2 | 3 +
> man/man2const/PR_FUTEX_HASH.2const | 92 ++++++++++++++++++++++++++++++
> 2 files changed, 95 insertions(+)
> create mode 100644 man/man2const/PR_FUTEX_HASH.2const
>
> diff --git a/man/man2/prctl.2 b/man/man2/prctl.2
> index cb5e75bf79ab2..ddfd1d1f5b940 100644
> --- a/man/man2/prctl.2
> +++ b/man/man2/prctl.2
> @@ -150,6 +150,8 @@ with a significance depending on the first one.
> .B PR_GET_MDWE
> .TQ
> .B PR_RISCV_SET_ICACHE_FLUSH_CTX
> +.TQ
> +.B PR_FUTEX_HASH
> .SH RETURN VALUE
> On success,
> a nonnegative value is returned.
> @@ -262,4 +264,5 @@ so these operations should be used with care.
> .BR PR_SET_MDWE (2const),
> .BR PR_GET_MDWE (2const),
> .BR PR_RISCV_SET_ICACHE_FLUSH_CTX (2const),
> +.BR PR_FUTEX_HASH (2const),
> .BR core (5)
> diff --git a/man/man2const/PR_FUTEX_HASH.2const b/man/man2const/PR_FUTEX_HASH.2const
> new file mode 100644
> index 0000000000000..c27adcb73d079
> --- /dev/null
> +++ b/man/man2const/PR_FUTEX_HASH.2const
> @@ -0,0 +1,92 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH PR_FUTEX_HASH 2const (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +PR_FUTEX_HASH
> +\-
> +configure the private futex hash
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#include <linux/prctl.h>" " /* Definition of " PR_* " constants */"
> +.B #include <sys/prctl.h>
> +.P
> +.BI "int prctl(PR_FUTEX_HASH, unsigned long " op ", ...);"
> +.fi
> +.SH DESCRIPTION
> +Configure the attributes for the underlying hash used by the
> +.BR futex (2)
> +family of operations.
> +The Linux kernel uses a hash to distribute the unrelated
> +.BR futex (2)
> +requests to different data structures
> +in order to reduce the lock contention.
> +Unrelated requests are requests which are not related to one another
> +because they use a different
> +.I uaddr
> +value of the syscall or the requests are issued by different processes
I think 'use a different uaddr value of the syscall' is technically
incorrect: two processes may see the same futex word at different
virtual addresses, since their address spaces differ, right?

See futex(2):
$ MANWIDTH=72 man futex | grep -B7 -A5 different.v
A futex is a 32‐bit value——referred to below as a futex word——
whose address is supplied to the futex() system call. (Futexes
are 32 bits in size on all platforms, including 64‐bit systems.)
All futex operations are governed by this value. In order to
share a futex between processes, the futex is placed in a region
of shared memory, created using (for example) mmap(2) or shmat(2).
(Thus, the futex word may have different virtual addresses in dif‐
ferent processes, but these addresses all refer to the same loca‐
tion in physical memory.) In a multithreaded program, it is suf‐
ficient to place the futex word in a global variable shared by all
threads.
Maybe say 'use a different futex word'?
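
To make the distinction concrete, here is a minimal sketch (mine, not
part of the patch) that maps one shared page twice and gets two
different virtual addresses for the same 32-bit word. The helper name
map_futex_word_twice() and the temporary file under /tmp are
assumptions for illustration only; two processes sharing the mapping
would see the same effect.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: map the same shared page
 * twice and hand back both views of one 32-bit futex word.
 * Returns 0 on success, -1 on failure.
 */
int map_futex_word_twice(uint32_t **view_a, uint32_t **view_b)
{
	char path[] = "/tmp/futex-word-XXXXXX";	/* assumed scratch location */
	int fd = mkstemp(path);

	if (fd == -1)
		return -1;
	unlink(path);	/* the open fd keeps the backing file alive */

	/* Size the file to hold one futex word; new bytes read as zero. */
	if (ftruncate(fd, sizeof(uint32_t)) == -1) {
		close(fd);
		return -1;
	}

	/* Two MAP_SHARED mappings of the same file offset. */
	void *a = mmap(NULL, sizeof(uint32_t), PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	void *b = mmap(NULL, sizeof(uint32_t), PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	close(fd);	/* the mappings stay valid after close */

	if (a == MAP_FAILED || b == MAP_FAILED) {
		if (a != MAP_FAILED)
			munmap(a, sizeof(uint32_t));
		if (b != MAP_FAILED)
			munmap(b, sizeof(uint32_t));
		return -1;
	}

	*view_a = a;
	*view_b = b;
	return 0;
}
```

A store through one view is visible through the other even though the
two pointers differ, so 'a different uaddr' does not imply 'a
different futex word'.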
> +and the
> +.B FUTEX_PRIVATE_FLAG
> +option is set.
Once the text refers to a different futex word, this is already
implied, so we can drop it.
> +The data structure holds the in-kernel representation of the operation and
> +keeps track of the current users which are enqueued and wait for a wake up.
> +It also provides synchronisation of waiters against wakers.
> +The size of the global hash is determined at boot time
> +and is based on the number of CPUs in the system.
> +Due to hash collision two unrelated
s/ two/, two/
> +.BR futex (2)
> +requests can share the same hash bucket.
> +This in turn can lead to delays of the
> +.BR futex (2)
> +operation due to lock contention while accessing the data structure.
> +These delays can be problematic on a real-time system
> +since random processes can
> +share in-kernel locks
> +and it is not deterministic which process will be involved.
> +.P
> +Linux 6.16 implements a process-wide private hash which is used by all
> +.BR futex (2)
> +operations that specify the
> +.B FUTEX_PRIVATE_FLAG
> +option as part of the operation.
> +Without any configuration
> +the kernel will allocate 16 hash slots
> +once the first thread has been created.
> +If the process continues to create threads,
> +the kernel will try to resize the private hash based on the number of threads
> +and available CPUs in the system.
> +The kernel will only increase the size and will make sure it does not exceed
> +the size of the global hash.
> +.P
> +The user can configure the size of the private hash which will also disable the
s/hash/\nhash/
> +automatic resize provided by the kernel.
> +.P
> +The value in
> +.I op
> +is one of the options below.
> +.TP
> +.B PR_FUTEX_HASH_GET_IMMUTABLE
> +.TQ
> +.B PR_FUTEX_HASH_GET_SLOTS
> +.TQ
> +.B PR_FUTEX_HASH_SET_SLOTS
> +.SH RETURN VALUE
> +On success,
> +these calls return a nonnegative value.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 6.16.
> +.SH SEE ALSO
> +.BR prctl (2),
> +.BR futex (2),
> +.BR PR_FUTEX_HASH_GET_IMMUTABLE (2const),
> +.BR PR_FUTEX_HASH_GET_SLOTS (2const),
> +.BR PR_FUTEX_HASH_SET_SLOTS (2const)
> --
> 2.49.0
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es/>