linux-kernel - Re: [PATCH] vsprintf/doc: Document format flags including field width and precision

Message-ID: <1e424481-6428-068c-d58b-7a11e36c2cc6@rasmusvillemoes.dk>
Date:   Mon, 22 May 2023 23:04:44 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Petr Mladek <pmladek@...e.com>,
        John Ogness <john.ogness@...utronix.de>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
Cc:     Jonathan Corbet <corbet@....net>, phone-devel@...r.kernel.org,
        linux-doc@...r.kernel.org, Luca Weiss <luca.weiss@...rphone.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vsprintf/doc: Document format flags including field width
 and precision

On 22/05/2023 17.08, Petr Mladek wrote:
> The kernel implementation of vsprintf() tries to be as compatible with
> the user space variant as possible. Though it does not implement all
> features. On the other hand, it adds some special pointer printing
> modifiers.
> 
> Most differences are described in Documentation/core-api/printk-formats.rst
> Add the missing documentation of the supported flag characters
> '#', '0', '-', ' ', '+' together with field width and precision modifiers.
> 
> Suggested-by: Luca Weiss <luca.weiss@...rphone.com>
> Signed-off-by: Petr Mladek <pmladek@...e.com>
> ---
> What about something like this, please?
> 
>  Documentation/core-api/printk-formats.rst | 69 +++++++++++++++++++++++
>  1 file changed, 69 insertions(+)
> 
> diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
> index dfe7e75a71de..79655b319658 100644
> --- a/Documentation/core-api/printk-formats.rst
> +++ b/Documentation/core-api/printk-formats.rst
> @@ -8,6 +8,75 @@ How to get printk format specifiers right
>  :Author: Andrew Murray <amurray@...-data.co.uk>
>  
>  
> +Flag characters
> +===============
> +
> +The character '%' might be followed by the following flags that modify
> +the output:
> +
> +	- '#' - prepend '0', '0x', or 'OX for 'o', 'x', 'X' number conversions
> +	- '0' - zero pad number conversions on the field boundary
> +	- '-' - left adjust on the field boundary, blank pad on the right
> +	- ' ' - prepend space on positive numbers
> +	- '+' - prepend + for positive numbers when using signed formats

[I wonder if we have a single user of any of the latter two in the
entire tree.]

> +Examples::
> +
> +	|%x|	|1a|
> +	|%#x|	|0x1a|
> +	|%d|	|26|
> +	|% d|	| 26|
> +	|%+d|	|+26|
> +
> +
> +Field width
> +===========
> +
> +A field width may be defined when '%' is optionally followed by the above flag
> +characters and:
> +
> +	- 'number' - the decimal number defines the field width
> +	- '*' the field width is defined by an extra parameter
> +
> +Values are never truncated when the filed width is not big enough.

filed -> field (several places)

> +Spaces are used by default when a padding is needed.
> +
> +Examples::
> +
> +	|%6d|	|    26|
> +	|%-6d|	|26    |
> +	|%06d|	|000026|
> +
> +	printk("Dynamic table: |%*d|%*s|\n", id_width, id, max_name_len, name);
> +
> +The filed width value might have special meaning for some pointer formats.
> +For example, it limits the size of the bitmap handled by %*pb format.

It should also be noted that a negative field width passed as a *
argument is interpreted as if the '-' flag had been given, with the
absolute value then used as the field width.
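
To make that rule concrete, here is a small userspace sketch using libc's
snprintf(), which is required by the C standard to treat a negative '*'
width this way. I'm assuming, not demonstrating, that the kernel's
vsnprintf() matches. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): format value between '|'
 * markers, taking the field width at run time through '*'.
 */
static void fmt_star_width(char *buf, size_t size, int width, int value)
{
	assert(buf && size > 0);
	snprintf(buf, size, "|%*d|", width, value);
}

/* Returns 1 if passing -width through '*' matches the '-' flag with width. */
static int negative_width_is_left_adjust(int width, int value)
{
	char via_star[64], via_flag[64];

	fmt_star_width(via_star, sizeof(via_star), -width, value);
	snprintf(via_flag, sizeof(via_flag), "|%-*d|", width, value);
	return strcmp(via_star, via_flag) == 0;
}
```

With width -6 and value 26, fmt_star_width() yields |26    |, the same
string that "%-6d" produces.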

> +
> +
> +Field precision:
> +================
> +
> +A field width may be defined when '%' is optionally followed by the above flag
> +characters:
> +
> +	- '.number' - the decimal number defines the field precision
> +	- '.*' the field precision is defined by an extra parameter
> +
> +The precision defines:
> +
> +	- number of digits after the decimal point in float number conversions

No, don't mention floats; the kernel doesn't do those.

> +	- minimal number of digits in integer conversions
> +	- maximum number of characters in string conversions
> +
> +Examples::
> +
> +	|%.3f|	|12.300|

Remove.

> +	|%.6d|	|    26|

Nope, that actually produces 000026.
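
A quick userspace check of the difference between precision and field
width for integers, again with libc's snprintf() standing in for the
kernel implementation (an assumption on my part that the two agree
here). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Precision: at least 6 digits, filled with leading zeros. */
static void fmt_precision6(char *buf, size_t size, int value)
{
	assert(buf && size > 0);
	snprintf(buf, size, "|%.6d|", value);
}

/* Field width: at least 6 columns, filled with leading spaces. */
static void fmt_width6(char *buf, size_t size, int value)
{
	assert(buf && size > 0);
	snprintf(buf, size, "|%6d|", value);
}
```

For the value 26, fmt_precision6() gives |000026| while fmt_width6()
gives |    26|, which is what the example above presumably meant to show
for "%6d".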

---

So overall, I'm not sure this is a net win. I think it might be better
to emphasize that

- the kernel doesn't do floats, argument reordering via m$, wide
characters/strings, %m or %n (just so that's out of the equation)

- for string and integer conversions, the kernel's printf is very very
close to following POSIX/libc/whatever, in terms of flags, field width
etc. [There are a few exceptions, those I've found are documented in
test_printf.c, but nobody is ever likely to hit those.]

- for %p, the kernel has its own rules, starting with the fact that
modifying behaviour based on alphanumerics following the p is completely
non-standard.

and then spend the rest explaining those rules, and perhaps also give
some background on why the %p extensions exist and why they are
implemented the way they are. For example, "we want -Wformat to tell us
if something is wrong", which in turn means we can only use a field
width, and not a precision, to pass an extra argument to a %psomething.
Alphanumerics are chosen because nobody would usually follow a normal %p
by anything but whitespace or punctuation, and because the compiler's
format checking is happy as long as there's some pointer argument
corresponding to the %p; the remaining characters are, from the
compiler's POV, just literal characters.
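
The "just literal characters" point can be seen from userspace, where
libc has no %p extensions at all. This is only a sketch: the letter 'Z'
is picked arbitrarily, the helper names are made up, and nothing here
exercises the kernel's own %p handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format ptr with "%pZ". For libc and for the
 * compiler's format checking this is a plain %p followed by the
 * literal character 'Z'.
 */
static void fmt_p_suffix(char *buf, size_t size, void *ptr)
{
	assert(buf && size > 0);
	snprintf(buf, size, "%pZ", ptr);
}

/* Returns 1 if "%pZ" formats as "%p" with a 'Z' appended. */
static int suffix_after_p_is_literal(void *ptr)
{
	char with_suffix[64], plain[64];

	fmt_p_suffix(with_suffix, sizeof(with_suffix), ptr);
	snprintf(plain, sizeof(plain), "%p", ptr);
	strncat(plain, "Z", sizeof(plain) - strlen(plain) - 1);
	return strcmp(with_suffix, plain) == 0;
}
```

The format checker only requires one pointer argument for the %p, so a
kernel-side implementation is free to give the trailing letters a
meaning of its own.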

Rasmus