Message-ID: <aYihTXM3titVomKc@1wt.eu>
Date: Sun, 8 Feb 2026 15:44:29 +0100
From: Willy Tarreau <w@....eu>
To: David Laight <david.laight.linux@...il.com>
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
        linux-kernel@...r.kernel.org, Cheng Li <lechain@...il.com>
Subject: Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify
 __nolibc_printf()

Hi David,

On Sun, Feb 08, 2026 at 12:20:31PM +0000, David Laight wrote:
> On Sat, 7 Feb 2026 23:50:19 +0000
> David Laight <david.laight.linux@...il.com> wrote:
> 
> > On Sat, 7 Feb 2026 21:05:42 +0100
> > Willy Tarreau <w@....eu> wrote:
> > 
> > > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@...il.com wrote:  
> > > > From: David Laight <david.laight.linux@...il.com>
> > > > 
> > > > Move the check for the length modifiers into the format processing
> > > > between the field width and conversion specifier.
> > > > This lets the loop be simplified and a 'fast scan' for a format start
> > > > used.
> > > > 
> > > > If an error is detected (eg an invalid conversion specifier) then
> > > > copy the invalid format to the output buffer.
> > > > 
> > > > Reduces code size by about 10% on x86-64.    
> > > 
> > > I'm surprised, because for me it's the opposite:
> > > 
> > >   $ size hello-patch*
> > >      text    data     bss     dec     hex filename
> > >      1859      48      24    1931     78b hello-patch1
> > >      2071      48      24    2143     85f hello-patch2
> > >      2091      48      24    2163     873 hello-patch3
> > >      2422      48      24    2494     9be hello-patch4
> > > 
> > > The whole program grew by almost 16%, and that's a 30% increase since
> > > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > > 15 bytes since previous patch.
> > > 
> > > I have not figured out what causes this change yet; I'm still digging.
> > 
> > Running scripts/bloat-o-meter will give more detail.
> > 
> > > Willy  
> > 
> > I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> > The object looks like what I'd expect, so might be -O2.
> > 
> > Is it constant folding the #defines?
> > For me it is generating the (1 << (c & 31)) & 0xxxxx as you might hope.
> 
> Further thoughts:
> 
> On some of the builds I've done gcc duplicated the code following an 'if'
> into both the 'then' and 'else' clauses.
> This isn't good for code size.

That's common in loops for example. That's also one reason for avoiding
"else" statements in compact code.

However, I finally found what inflates the code here by disassembling
the whole function: after the move of the multiple "if" statements,
recent compilers manage to turn them into a jump table, which considerably
inflates both .rodata and the function. Passing -fno-jump-tables
drops the size by ~500 bytes:

   text    data     bss     dec     hex filename
   2422      48      24    2494     9be hello-patch4
   1917      48      24    1989     7c5 hello-patch4-alt   <---

Building with gcc older than 13 also avoids this table, which explains
why you got smaller code with gcc-12.
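
To illustrate the effect, here is a minimal sketch, not the actual nolibc
code: the function names, the CH_BIT() macro and the returned bases are
hypothetical. It classifies a conversion character once with a dense
switch, which a compiler may lower to a jump table, a lookup table or a
bit test, and once with the (1 << (c & 31)) mask test mentioned earlier in
the thread. Since (c & 31) aliases characters outside 'a'..'z', the mask
variant checks the range first (ASCII is assumed).

```c
#include <assert.h>

/* Hypothetical helper: one bit per lowercase letter, from its low 5 bits. */
#define CH_BIT(ch)  (1u << ((ch) & 31))

/* Hypothetical groups of conversion specifiers, for illustration only. */
#define MASK_DEC    (CH_BIT('d') | CH_BIT('i') | CH_BIT('u'))
#define MASK_HEX    (CH_BIT('x') | CH_BIT('p'))

/* Variant 1: dense switch; the compiler is free to emit a table for it. */
static int conv_base_switch(int c)
{
	switch (c) {
	case 'd':
	case 'i':
	case 'u':
		return 10;
	case 'x':
	case 'p':
		return 16;
	default:
		return 0;
	}
}

/* Variant 2: range check followed by mask tests on a single bit. */
static int conv_base_mask(int c)
{
	unsigned int bit;

	/* (c & 31) is only unambiguous within 'a'..'z'. */
	if (c < 'a' || c > 'z')
		return 0;

	bit = CH_BIT(c);
	if (bit & MASK_DEC)
		return 10;
	if (bit & MASK_HEX)
		return 16;
	return 0;
}

/* Both variants must agree for every possible byte value. */
static void conv_base_selfcheck(void)
{
	int c;

	for (c = 0; c < 256; c++)
		assert(conv_base_switch(c) == conv_base_mask(c));
}
```

Building this with and without -fno-jump-tables and comparing the output
of "size" should show whether a given compiler emits a table for the
switch variant; results will vary with the gcc version and optimization
level.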

I also noticed that we can reduce the loop by ~40 bytes by moving the
literal copy after the block that deals with format sequences,
because it eases comparisons, but that's no big deal for now since your
subsequent patches are going to change all that.

At least I wanted to understand what was causing this difference
between our results, and whether it was likely to be permanent or not,
so now this patch is OK for me.

Acked-by: Willy Tarreau <w@....eu>

Willy
