[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c8cb6136-6924-4fae-f4bb-981dec2e1952@efficios.com>
Date: Fri, 6 Jan 2023 15:57:21 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Alejandro Colomar <alx.manpages@...il.com>,
linux-man@...r.kernel.org, Alejandro Colomar <alx@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Boqun Feng <boqun.feng@...il.com>, paulmck <paulmck@...nel.org>
Subject: Re: rseq(2) man page
On 2023-01-06 12:50, Alejandro Colomar wrote:
> Hi Mathieu,
>
> See some comments below.
>
> Cheers,
>
> Alex
[...]
>
> We now use SPDX-License-Identifier.
OK
>
>> .\"
>> .TH RSEQ 2 2020-06-05 "Linux" "Linux Programmer's Manual"
>
> We use lowercase for the function name (or to be more precise, the same
> case that the function name uses.
>
> The date is specified with a placeholder that is replaced at the time of
> creation of the tarball.
>
> The 5th argument is unspecified.
>
> The 4th argument is now the project name and a placeholder for the version.
>
> See an example:
>
> $ cat man2/membarrier.2 | grep '^.TH'
> .TH membarrier 2 (date) "Linux man-pages (unreleased)"
OK
>
>
>> .SH NAME
>> rseq \- Restartable sequences system call
>
> We use lowercase here, so s/Restartable/restartable/
OK
>
>> .SH SYNOPSIS
>> .nf
>> .B #include <linux/rseq.h>
>
> Is there a glibc wrapper for this syscall? Do you expect that it will
> be added relatively soon? Or is it expected that this syscall will be
> called through syscall(2) for many years?
The system call is only for registering/unregistering the per-thread
area, so I expect it will only be used through syscall(2), and typically
only from libc.
User applications/libraries are expected to use the per-thread area to
communicate with the kernel, but that does not require use of rseq(2).
>
> If so, it may be better to document it directly as such, like for
> example membarrier:
>
> SYNOPSIS
> #include <linux/membarrier.h> /* Definition of MEMBARRIER_*
> constants */
> #include <sys/syscall.h> /* Definition of SYS_* constants */
> #include <unistd.h>
>
> int syscall(SYS_membarrier, int cmd, unsigned int flags, int
> cpu_id);
>
> Note: glibc provides no wrapper for membarrier(), necessitating
> the use
> of syscall(2).
>
OK
>> .sp
>
> s/sp/PP/
OK
>
>> .BI "int rseq(struct rseq * " rseq ", uint32_t " rseq_len ", int "
>> flags ",
>
> Is it valid to pass NULL in 'rseq'? If it is, we now document that
> using _Nullable. See for example recv(2):
>
> SYNOPSIS
> #include <sys/socket.h>
>
> ssize_t recv(int sockfd, void buf[.len], size_t len,
> int flags);
> ssize_t recvfrom(int sockfd, void buf[restrict .len], size_t len,
> int flags,
> struct sockaddr *_Nullable restrict src_addr,
> socklen_t *_Nullable restrict addrlen);
> ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
>
OK
>
>
>
>> uint32_t " sig ");
>> .sp
>
> .sp is unnecessary here.
OK
>
> .fi is missing (it's the "closing" pair to .nf).
OK
>
>> .SH DESCRIPTION
>>
>
> Use .PP instead of blank lines.
OK
>
>> The
>> .BR rseq ()
>> ABI accelerates specific user-space operations by registering a
>> per-thread data structure shared between kernel and user-space. This
>
> Use semantic newlines. See man-pages(7):
>
> Use semantic newlines
> In the source of a manual page, new sentences should be started
> on new
> lines, long sentences should be split into lines at clause
> breaks (com‐
> mas, semicolons, colons, and so on), and long clauses should
> be split
> at phrase boundaries. This convention, sometimes known as
> "semantic
> newlines", makes it easier to see the effect of patches,
> which often
> operate at the level of individual sentences, clauses, or phrases.
>
OK, done across the entire the manual page source.
[...]
>> .PP
>> Some examples of operations that can be accelerated or improved
>> by this ABI:
>> .IP \[bu] 2
>
> Use 3 instead of 2. See man-pages:
>
> Lists
> There are different kinds of lists:
>
> [...]
>
> Bullet lists
> Elements are preceded by bullet symbols (\(bu).
> Anything that
> doesn’t fit elsewhere is usually covered by this type of
> list.
>
> [...]
>
> There should always be exactly 2 spaces between the list symbol
> and the
> elements. This doesn’t apply to "tagged paragraphs", which use
> the de‐
> fault indentation rules.
>
>
> The rationale for that was that if you use 1 space, then the list
> introducer can be confused with the list contents. Two spaces makes the
> difference more clear.
OK
>
>
> Also, we use \(bu instead of \[bu]. I'm not particularly worried by
> using it, but I prefer being consistent at which one we use.
OK
[...]
>>
>> .TP
>> .in +4n
>
> I guess you're looking for .RS/.RE. It would wrap all the indented
> stuff, replacing .in.
OK.
>
>> .I cpu_id_start
>> Always-updated value of the CPU number on which the registered thread is
>> running. Its value is guaranteed to always be a possible CPU number,
>> even when rseq is not registered. Its value should always be confirmed by
>
> rseq (and maybe other cases around too) should be formatted in italics,
> since it's a variable name (.I).
"rseq" here does not refer to a variable name, but rather to the system
call. Should it be formatted in italics ? I format it as ".BR" elsewhere
in the man page.
I'll push a first update without the italics changes, please let me know
if there are still issues left.
[...]
>> .in +4n
>> .I flags
>> Flags indicating the restart behavior of this structure. Can be a
>> combination
>> of:
>> .IP \[bu]
>
> This list should be a tagged paragraph instead. See man-pages(7):
>
> Lists
> There are different kinds of lists:
>
> Tagged paragraphs
> These are used for a list of tags and their
> descriptions. When
> the tags are constants (either macros or numbers) they
> are in
> bold. Use the .TP macro.
>
> An example is this "Tagged paragraphs" subsection is itself.
>
> [...]
>
> Bullet lists
> Elements are preceded by bullet symbols (\(bu).
> Anything that
> doesn’t fit elsewhere is usually covered by this type of
> list.
>
> [...]
OK
[...]
>> .SH RETURN VALUE
>> A return value of 0 indicates success. On error, \-1 is returned, and
>> .I errno
>> is set appropriately.
>>
>> .SH ERRORS
>> .TP
>> .B EINVAL
>> Either
>> .I flags
>> contains an invalid value, or
>> .I rseq
>> contains an address which is not appropriately aligned, or
>> .I rseq_len
>> contains an incorrect size.
>> .TP
>> .B ENOSYS
>> The
>> .BR rseq ()
>> system call is not implemented by this kernel.
>> .TP
>> .B EFAULT
>> .I rseq
>> is an invalid address.
>
> Doesn't this result in a SEGV? It's trying to access invalid memory.
> We had some discussion about this in other syscalls, and concluded that
> that's undefined behavior, and a crash is valid behavior (and probably a
> good thing to do), right? I'm just curious about the view from the
> kernel point of view.
If the registered rseq pointer / size points to invalid memory on rseq
registration, the rseq registration system call will return -1,
errno=EFAULT. If at some point _after_ registration the mapping becomes
invalid (e.g. unmapped without prior unregistration), then a SIGSEGV can
be triggered.
I was not aware of this discussion a regarding returning EFAULT errno vs
SIGSEGV. If this is becoming a consensus across system calls to segfault
rather than return EFAULT errno, I'm open to improve sys_rseq accordingly.
>
>> .TP
>> .B EBUSY
>> Restartable sequence is already registered for this thread.
>> .TP
>> .B EPERM
>> The
>> .I sig
>> argument on unregistration does not match the signature received
>> on registration.
>>
>> .SH VERSIONS
>> The
>> .BR rseq ()
>> system call was added in Linux 4.18.
>>
>> .SH CONFORMING TO
>
> We call that section STANDARDS now.
OK
I've done a first pass applying all the relevant changes. I've pushed
the changes into the librseq master branch. I may have omitted
bold/italic for some identifiers. If it's the case, please let me know.
Thanks!
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists