Message-ID: <fzoy24xjr4w3kxt6suya5dfici6uxk6d3gzrjwruujlkca3zwh@3dqnmbah3bxg>
Date: Thu, 22 May 2025 14:22:15 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-man@...r.kernel.org, linux-kernel@...r.kernel.org,
André Almeida <andrealmeid@...lia.com>, Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v2] prctl: Add documentation for PR_FUTEX_HASH
Hi Sebastian,
> Subject: Re: [PATCH v2] prctl: Add documentation for PR_FUTEX_HASH
Please change the subject to
man/man2/prctl.2, man/man2const/PR_FUTEX_HASH.2const: Document PR_FUTEX_HASH
On Tue, May 20, 2025 at 12:42:47PM +0200, Sebastian Andrzej Siewior wrote:
> The prctl(PR_FUTEX_HASH) is queued for the v6.16 merge window.
> Add some documentation of the interface.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> v1…v2: https://lore.kernel.org/all/20250516161422.BqmdlxlF@linutronix.de/
> - Partly reword
> - Use "semantic newlines"
>
> man/man2/prctl.2 | 3 +
> man/man2const/PR_FUTEX_HASH.2const | 122 +++++++++++++++++++++++++++++
> 2 files changed, 125 insertions(+)
> create mode 100644 man/man2const/PR_FUTEX_HASH.2const
>
> diff --git a/man/man2/prctl.2 b/man/man2/prctl.2
> index f29b745b12578..a884064a40b7d 100644
> --- a/man/man2/prctl.2
> +++ b/man/man2/prctl.2
> @@ -150,6 +150,8 @@ with a significance depending on the first one.
> .B PR_GET_MDWE
> .TQ
> .B PR_RISCV_SET_ICACHE_FLUSH_CTX
> +.TQ
> +.B PR_FUTEX_HASH
> .SH RETURN VALUE
> On success,
> a nonnegative value is returned.
> @@ -262,4 +264,5 @@ so these operations should be used with care.
> .BR PR_SET_MDWE (2const),
> .BR PR_GET_MDWE (2const),
> .BR PR_RISCV_SET_ICACHE_FLUSH_CTX (2const),
> +.BR PR_FUTEX_HASH (2const),
> .BR core (5)
> diff --git a/man/man2const/PR_FUTEX_HASH.2const b/man/man2const/PR_FUTEX_HASH.2const
> new file mode 100644
> index 0000000000000..c7aa36064b79e
> --- /dev/null
> +++ b/man/man2const/PR_FUTEX_HASH.2const
> @@ -0,0 +1,122 @@
> +.\" Copyright, The authors of the Linux man-pages project
Please use lowercase 'the'.
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH PR_FUTEX_HASH 2const (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +PR_FUTEX_HASH
> +\-
> +configure the private futex hash
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#include <linux/prctl.h>" " /* Definition of " PR_* " constants */"
> +.B #include <sys/prctl.h>
> +.P
> +.BI "int prctl(PR_FUTEX_HASH, unsigned long " op ", ...);"
> +.fi
> +.SH DESCRIPTION
> +Configure the attributes for the underlying hash used by the
> +.BR futex (2)
> +family of operations.
> +The Linux kernel uses a hash to distributes the
s/distributes/distribute/
> +.BR futex (2)
> +users on different data structures.
> +The data structure holds the in-kernel representation of the operation and
> +keeps track of the current users which are enqueued and wait for a wake up.
> +It also provides synchronisation with users who perform a wake up.
> +The size of the global hash is determined at boot time and is based on the
Please don't break the line at a random point; I'd break after 'time'.
> +number of CPUs in the system.
> +Since the mapping from the provided
> +.I uaddr
This is the only reference to uaddr in the entire page, and it is not
mentioned in prctl(2) either. I guess it refers to the argument passed
to futex(2), but I think that should be stated explicitly.
> +value to the in-kernel representation is based on a hash, two unrelated tasks
Please break after ','. (Semantic newlines)
> +in the system can share the same hash bucket.
> +This in turn can lead to delays of the
> +.BR futex (2)
> +operation due to lock contention of the data structure.
> +These delays can be problematic on a real-time system since random tasks can
Break before 'since'.
> +share in-kernel locks and it is not deterministic which tasks will be involved.
Break before 'and'.
> +.P
> +Linux v6.16 implements a process wide private hash which is used by all
s/v6.16/6.16/
$ grep -rho 'Linux v\?6\.[0-9]*' man/ | sort | uniq -c
1 Linux 6.
2 Linux 6.0
6 Linux 6.1
7 Linux 6.10
10 Linux 6.11
1 Linux 6.12
8 Linux 6.13
6 Linux 6.14
2 Linux 6.15
5 Linux 6.2
6 Linux 6.3
3 Linux 6.4
3 Linux 6.5
3 Linux 6.6
9 Linux 6.7
6 Linux 6.8
7 Linux 6.9
1 Linux v6.14
s/process wide/process-wide/
> +.BR futex (2)
> +operations which specify the
> +.B FUTEX_PRIVATE_FLAG
> +as part of the operation.
> +Without any configuration the kernel will allocate 16 hash slots once the first
s/configuration/&\n/
s/slots/&\n/
> +thread has been created.
> +If the process continues to create threads, the kernel will try to resize the
s/,/&\n/
> +private hash based on the number of threads and available CPUs in the system.
s/hash/&\n/
> +The kernel will only increase the size and will make sure it does not exceed
s/size/&\n/
> +the size of the global hash.
> +.P
> +The user can configure the size of the private hash which will also disable the
s/hash/&\n/
> +automatic resize provided by the kernel.
> +.P
> +The following values for
> +.I op
> +can be specified:
> +.TP
> +.BI "int prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, unsigned long " hash_size ", unsigned long " hash_flags ");
I'd prefer the suboperations to get their own manual pages:
PR_FUTEX_HASH_SET_SLOTS(2const)
PR_FUTEX_HASH_GET_SLOTS(2const)
PR_FUTEX_HASH_GET_IMMUTABLE(2const)
You can use a single page for all the suboperations, or split them into
two or three pages, as you see fit, but I'd like this page to document
only the main operation and the things that are common to all of
them.
See for example:
PR_SET_MM(2const)
and its suboperations, which include PR_SET_MM_ARG_START(2const) among
others. You can base these pages on those.
Please use one patch for each of the pages. The first one should
document the main operation, and then you can send one patch per page
that you write for the subops.
> +Set the number of slots to use for the private hash.
> +.P
> +.RS
> +.TP
> +.I hash_size
> +Specifies the size of private hash to allocate. Possible values are:
> +.RS
> +.TP
> +.I 0
> +Use the global hash.
> +This is the behaviour used before v6.16.
> +The operation can not be undone.
> +.TP
> +.I >0
> +Specifies the number of slots to allocate.
> +The value must be power of two and the lowest possible value is 2.
> +The upper limit depends on the available memory in the system.
> +Each slot requires 64bytes of memory.
> +Kernels compiled with
> +.I CONFIG_PROVE_LOCKING
> +will consume more than that.
> +.RE
> +.TP
> +.I hash_flags
> +.RS
> +The following flags can be specified:
> +.TP
> +.I FH_FLAG_IMMUTABLE
> +The private hash can no longer be changed.
> +By using an immutable privat hash the kernel can avoid some accounting for the
s/privat/private/
s/hash/&\n/ (semantic newlines)
Have a lovely day!
Alex
> +data structure.
> +This accounting is visible in benchmarks if many
> +.BR futex (2)
> +operations are invoked in parallel on different CPUs.
> +.RE
> +.RE
> +.TP
> +.BI "int prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS);
> +Returns the current size of the the private hash.
> +A value of 0 means that a private hash has not been allocated and the global
> +hash is in use.
> +A value >0 specifies the size of the private hash.
> +.TP
> +.BI "int prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_IMMUTABLE);
> +Return 1 if the hash has been made immutable and not be changed.
> +Otherwise 0.
> +.\"
> +.SH RETURN VALUE
> +On success,
> +these calls return a nonnegative value.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 6.16.
> +.SH SEE ALSO
> +.BR prctl (2) ,
> +.BR futex (2) ,
> +.BR futex (7)
> --
> 2.49.0
>
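For readers following the thread, here is a minimal C sketch of how the interface described in the quoted patch might be called. It is only an illustration under stated assumptions, not part of the patch: the helper names (futex_hash_size_ok, futex_hash_set_slots, futex_hash_get_slots) are hypothetical, and the numeric fallback values for the PR_* constants are assumed and should be checked against <linux/prctl.h> on the target system. The size check mirrors the rule in the quoted text: 0 selects the global hash, otherwise the value must be a power of two and at least 2.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/prctl.h>

/*
 * Fallback definitions for older headers.
 * ASSUMPTION: these values match the uapi header of the kernel in use;
 * verify them against <linux/prctl.h> before relying on them.
 */
#ifndef PR_FUTEX_HASH
#define PR_FUTEX_HASH            78
#endif
#ifndef PR_FUTEX_HASH_SET_SLOTS
#define PR_FUTEX_HASH_SET_SLOTS  1
#endif
#ifndef PR_FUTEX_HASH_GET_SLOTS
#define PR_FUTEX_HASH_GET_SLOTS  2
#endif

/*
 * Hypothetical helper: true if n is acceptable as hash_size according to
 * the quoted text, i.e. 0 (use the global hash) or a power of two >= 2.
 */
bool futex_hash_size_ok(unsigned long n)
{
    if (n == 0)
        return true;
    return n >= 2 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: request a private hash with the given number of
 * slots and no flags. Returns 0 on success; on failure returns -1 with
 * errno set, either by the local check or by prctl() itself.
 */
int futex_hash_set_slots(unsigned long slots)
{
    if (!futex_hash_size_ok(slots)) {
        errno = EINVAL;
        return -1;
    }
    return prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, slots, 0UL);
}

/*
 * Hypothetical helper: query the current size of the private hash.
 * Per the quoted text, 0 means the global hash is in use.
 * Returns -1 with errno set if the call fails.
 */
int futex_hash_get_slots(void)
{
    return prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS);
}
```

A caller might invoke futex_hash_set_slots(64) early in main() and then read the size back with futex_hash_get_slots(). On a kernel that does not know PR_FUTEX_HASH, prctl() fails with -1 and errno set (typically EINVAL), so both return values should be checked.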
--
<https://www.alejandro-colomar.es/>