Message-ID: <aYdVCyJIT2oNnq9_@1wt.eu>
Date: Sat, 7 Feb 2026 16:06:51 +0100
From: Willy Tarreau <w@....eu>
To: david.laight.linux@...il.com
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
        linux-kernel@...r.kernel.org, Cheng Li <lechain@...il.com>
Subject: Re: [PATCH next] tools/nolibc: Optimise and common up number to
 ascii functions

Hi David,

On Tue, Feb 03, 2026 at 03:13:15PM +0000, david.laight.linux@...il.com wrote:
> From: David Laight <david.laight.linux@...il.com>
> 
> Implement u[64]to[ah]_r() using a common function that uses multiply
> by reciprocal to generate the least significant digit first and then
> reverses the string.
> 
> On 32bit this is five multiplies (with 64bit product) for each output
> digit. I think the old utoa_r() always did 36 multiplies and a lot
> of subtracts - so this is likely faster even for 32bit values.
> Definitely better for 64bit values (especially small ones).
> 
> Clearly shifts are faster for base 16, but reversing the output buffer
> makes a big difference.
> 
> Sharing the code reduces the footprint (unless gcc decides to constant
> fold the functions).
> Definitely helps vfprintf() where the constants get loaded and a single
> call is down.
> Also makes it cheap to add octal support to vfprintf for completeness.
> 
> Signed-off-by: David Laight <david.laight.linux@...il.com>

OK, I ran a long series of tests on it, including with older compilers
going back to gcc-4.7 and on various archs. Except for code that would
previously only use utoh(), the new code is slightly smaller in the vast
majority of cases. Combined with the added flexibility, this looks
like a good addition. The code is not trivial (as always when we're
dealing with number representation) but it's well documented, so I'm
personally fine with the change.

I just have a few comments below:

> -static __attribute__((unused))
> -int utoh_r(unsigned long in, char *buffer)
> +#define __U64TOA_RECIP(base) ((base) & 1 ? ~0ull / (base) : (1ull << 63) / ((base) / 2))

Please rename this macro to have _NOLIBC_ as a prefix.

> +#if defined(__SIZEOF_INT128__) && !defined(__mips__)

Out of curiosity, why !mips ? I tried with -mabi=64 and the function size
dropped from 0x120 to 0xc0 (lost 1/3 of its size).

> +		q = ((unsigned __int128)in * recip) >> 64;
> +#else
(...)
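
For anyone following along, here is a minimal sketch of the
multiply-by-reciprocal technique the commit message describes: take the
high half of in * recip as a quotient estimate, derive the digit, emit
least significant digit first, then reverse. This is not the patch's
code. The names SKETCH_RECIP, sketch_mulhi() and sketch_u64toa() are
hypothetical, the 32x32 fallback is one generic way to get the high
64 bits, and the single correction step assumes 2 <= base <= 16.

```c
#include <stdint.h>

/* Same expression as the quoted macro, under a hypothetical name:
 * floor(2^64 / base) computed without overflowing 64 bits. */
#define SKETCH_RECIP(base) \
	((base) & 1 ? ~0ull / (base) : (1ull << 63) / ((base) / 2))

/* High 64 bits of the 128-bit product a * b. */
static uint64_t sketch_mulhi(uint64_t a, uint64_t b)
{
#if defined(__SIZEOF_INT128__)
	return ((unsigned __int128)a * b) >> 64;
#else
	/* Generic fallback (an assumption, not the patch's code):
	 * four 32x32->64 partial products with carry propagation. */
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t mid1 = a_hi * b_lo + ((a_lo * b_lo) >> 32);
	uint64_t mid2 = a_lo * b_hi + (uint32_t)mid1;

	return a_hi * b_hi + (mid1 >> 32) + (mid2 >> 32);
#endif
}

/* Convert 'in' to ASCII in 'base' (2..16). 'buf' must hold up to
 * 64 digits plus the trailing NUL. Returns the number of digits. */
static int sketch_u64toa(uint64_t in, char *buf, unsigned int base)
{
	const uint64_t recip = SKETCH_RECIP(base);
	int len = 0, i, j;

	do {
		/* recip <= 2^64 / base, so q is either exact or one too low */
		uint64_t q = sketch_mulhi(in, recip);
		uint64_t digit = in - q * base;

		if (digit >= base) {
			digit -= base;
			q++;
		}
		buf[len++] = "0123456789abcdef"[digit];
		in = q;
	} while (in);

	/* digits were produced least significant first: reverse in place */
	for (i = 0, j = len - 1; i < j; i++, j--) {
		char tmp = buf[i];

		buf[i] = buf[j];
		buf[j] = tmp;
	}
	buf[len] = '\0';
	return len;
}
```

With a 65-byte buffer, sketch_u64toa(255, buf, 16) leaves "ff" in buf
and returns 2. The digit loop performs no division on either path; the
only divisions are in SKETCH_RECIP, and those fold to a constant when
base is a compile-time constant.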

Once the macro is renamed, feel free to add:

Acked-by: Willy Tarreau <w@....eu>

Thanks!
Willy
