Message-ID: <aYjCiIfTZdy3q16P@1wt.eu>
Date: Sun, 8 Feb 2026 18:06:16 +0100
From: Willy Tarreau <w@....eu>
To: David Laight <david.laight.linux@...il.com>
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
        linux-kernel@...r.kernel.org, Cheng Li <lechain@...il.com>
Subject: Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify
 __nolibc_printf()

On Sun, Feb 08, 2026 at 04:54:25PM +0000, David Laight wrote:
> > However here I finally found what inflates the code, when disassembling
> > the whole function: with the move of the multiple "if" statements,
> > recent compilers managed to turn it into a jump table, that considerably
> > inflates .rodata and the function as well. By passing -fno-jump-tables,
> > the size drops by ~500 bytes:
> 
> That is just insane...
> That might go away with the patch that changes it all to bit-masks.

Yes, as mentioned later, it does.
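
For readers of the archive, here is a minimal sketch of the two styles being
compared. It is not the nolibc code: the function names and the set of
conversion characters are made up for illustration. A switch (or a long run
of "if" tests) on the conversion character is the kind of construct a recent
compiler may lower to a table in .rodata, whereas a bit-mask indexed by
(c - 'a') only needs a subtraction, a range check and a shift.

```c
/* Illustrative only; names and specifier set are assumptions. */

/* Variant 1: switch on the conversion character. Depending on the
 * compiler version and options, a dense switch may be lowered to a
 * table placed in .rodata. */
static int is_int_conv_switch(char c)
{
	switch (c) {
	case 'c': case 'd': case 'i': case 'u': case 'x':
		return 1;
	default:
		return 0;
	}
}

/* Variant 2: one bit per lowercase letter, tested with a shift. */
#define CONV_BIT(c)  (1u << ((c) - 'a'))
#define INT_CONVS    (CONV_BIT('c') | CONV_BIT('d') | CONV_BIT('i') | \
		      CONV_BIT('u') | CONV_BIT('x'))

static int is_int_conv_mask(char c)
{
	/* Anything outside 'a'..'z' wraps to a large unsigned value. */
	unsigned int idx = (unsigned char)c - 'a';

	return idx < 26 && ((INT_CONVS >> idx) & 1);
}
```

Both variants return the same result for every input; whether the first one
really grows a table is worth checking in the disassembly for a given
compiler.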

> I'd done some full disassembly comparisons myself to see why changes
> made the code larger.
> I had an OPTIMIZER_HIDE_VAR(sign) in there to help, but the final
> version didn't need it.
> What this sort of code needs is something to force the compiler to
> only have one copy of something - I found a proposal for an attribute
> (or similar) for an asm block to do that, but nothing came of it.

Yeah I'm using similar hacks against the optimizer sometimes. That's
no big deal as there will always be variations between compilers; what
matters to me is that we can explain them (and indeed often when we
can, we're also able to prevent the compiler from acting against us).
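
As an illustration of this kind of hack, here is a sketch of an empty asm
statement that makes a variable opaque to the optimizer. The HIDE_VAR() macro
is a local stand-in modeled on the kernel's OPTIMIZER_HIDE_VAR(), and
abs_hidden() is a hypothetical function, not something taken from the patch.
It relies on GCC/Clang extended asm.

```c
/* Local stand-in modeled on the kernel's OPTIMIZER_HIDE_VAR().
 * The empty asm claims to read and rewrite 'var' in a register, so the
 * compiler can no longer assume anything about its value afterwards.
 * GCC/Clang extended asm only. */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Hypothetical example: absolute value with the sign test hidden. */
static unsigned int abs_hidden(int v)
{
	unsigned int u = v;
	int neg = v < 0;

	HIDE_VAR(neg);	/* 'neg' is now an unknown value to the optimizer */
	if (neg)
		u = -u;	/* unsigned negation, well defined even for INT_MIN */
	return u;
}
```

The asm emits no instruction; it only constrains what the compiler may assume,
which is sometimes enough to stop it from duplicating or restructuring the
code that depends on the variable.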

> > 
> >    text    data     bss     dec     hex filename
> >    2422      48      24    2494     9be hello-patch4
> >    1917      48      24    1989     7c5 hello-patch4-alt   <---
> > 
> > Building with gcc before 13 also avoids this table and explains why
> > you had better code with gcc-12.
> > 
> > I also noticed that we can reduce the loop by ~40 bytes by moving the
> > literal copy after the block that deals with format sequences,
> > because it eases comparisons, but that's no big deal for now since your
> > subsequent patches are going to change all that.
> 
> Some of the early patches are carefully arranged to reduce churn
> later on.

Yes I noticed that. But the whole function is changed in the end so
we cannot avoid a number of complicated changes anyway.

> I might add the 'if (v == 0)' clause much earlier to avoid the churn
> caused by the extra indent when it is added.
> 
> I'll add some extra comments as you suggested in the other patches.

Yes, that's what is most needed (and I don't deny that there are
already quite a bunch). When optimizing code, often the code ends up
being write-only. You're doing something while having the data flow in
your head and it turns into code (like size>=256), but when you don't
know the initial assumptions and you face this, you think "WTF?". Here
the comments need to indicate the developer's design choices (e.g.
"sign can hold up to two chars starting from LSB") and some of the
assumptions that become complicated to establish due to the long list
of if/else dealing with the multiple variants of specifiers.
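
To make the point concrete, here is a sketch of what such a commented design
choice could look like. Only the sentence "sign can hold up to two chars
starting from LSB" comes from the discussion; the put_prefix() helper, its
signature and the example values are assumptions for illustration and do not
reflect the actual patch.

```c
#include <stddef.h>

/* 'sign' packs up to two prefix chars, first char in the LSB:
 *   0                 -> no prefix
 *   '-'               -> "-"
 *   '0' | ('x' << 8)  -> "0x"
 * Hypothetical helper: copies the packed chars to 'out' and returns
 * how many were written (0, 1 or 2).
 */
static size_t put_prefix(char *out, unsigned int sign)
{
	size_t n = 0;

	/* At most two iterations since 'sign' never holds more than
	 * two chars by construction. */
	for (; sign; sign >>= 8)
		out[n++] = (char)(sign & 0xff);
	return n;
}
```

Without the header comment, the shift by 8 and the loop condition look
arbitrary; with it, the assumption the code depends on is stated once and can
be checked.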

> I do know all about optimising for size, and for the 'worst case path'.
> The latter was some embedded hdlc code that had to finish in 196 clocks.

Rest assured that it's quite visible, we're using the same tricks to save
every possible resource (making bitmaps from words etc), it's just that
doing this requires an amazing amount of comments. I'm used to saying
that each source or object byte saved offers more budget for comments :-)

Willy
