Message-ID: <aYihTXM3titVomKc@1wt.eu>
Date: Sun, 8 Feb 2026 15:44:29 +0100
From: Willy Tarreau <w@....eu>
To: David Laight <david.laight.linux@...il.com>
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
linux-kernel@...r.kernel.org, Cheng Li <lechain@...il.com>
Subject: Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify
__nolibc_printf()
Hi David,
On Sun, Feb 08, 2026 at 12:20:31PM +0000, David Laight wrote:
> On Sat, 7 Feb 2026 23:50:19 +0000
> David Laight <david.laight.linux@...il.com> wrote:
>
> > On Sat, 7 Feb 2026 21:05:42 +0100
> > Willy Tarreau <w@....eu> wrote:
> >
> > > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@...il.com wrote:
> > > > From: David Laight <david.laight.linux@...il.com>
> > > >
> > > > Move the check for the length modifiers into the format processing
> > > > between the field width and conversion specifier.
> > > > This lets the loop be simplified and a 'fast scan' for a format start
> > > > used.
> > > >
> > > > If an error is detected (eg an invalid conversion specifier) then
> > > > copy the invalid format to the output buffer.
> > > >
> > > > Reduces code size by about 10% on x86-64.
> > >
> > > I'm surprised, because for me it's the opposite:
> > >
> > > > $ size hello-patch*
> > > >    text    data     bss     dec     hex filename
> > > >    1859      48      24    1931     78b hello-patch1
> > > >    2071      48      24    2143     85f hello-patch2
> > > >    2091      48      24    2163     873 hello-patch3
> > > >    2422      48      24    2494     9be hello-patch4
> > >
> > > The whole program grew by almost 16%, and that's a 30% increase since
> > > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > > 15 bytes since previous patch.
> > >
> > > I have not figured what makes this change yet, I'm still digging.
> >
> > Running scripts/bloat-o-meter will give more detail.
> >
> > > Willy
> >
> > I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> > The object looks like what I'd expect, so might be -O2.
> >
> > Is it constant folding the #defines.
> > For me it generating the (1 << (c & 31)) & 0xxxxx as you might hope.
>
> Further thoughts:
>
> On some of the builds I've done gcc duplicated the code following an 'if'
> into both the 'then' and 'else' clauses.
> This isn't good for code size.
That's common in loops for example. That's also one reason for avoiding
"else" statements in compact code.
However, here I finally found what inflates the code by disassembling
the whole function: with the multiple "if" statements moved, recent
compilers manage to turn them into a jump table, which considerably
inflates both .rodata and the function itself. With -fno-jump-tables,
the size drops by ~500 bytes:
   text    data     bss     dec     hex filename
   2422      48      24    2494     9be hello-patch4
   1917      48      24    1989     7c5 hello-patch4-alt <---
Building with a gcc older than 13 also avoids this table, which explains
why you had better code with gcc-12.
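For reference, here is a minimal sketch of the two shapes discussed above. It
is not taken from the patch. The first is a chain of "if" tests on the
conversion character, which a recent gcc may or may not turn into a table
depending on version and flags. The second is the (1 << (c & 31)) mask test
quoted earlier. The names is_conv_ifchain(), is_conv_mask(), CH_BIT() and
CONV_MASK, and the set of characters, are made up for the illustration:

```c
/* Hypothetical helper: bit position of a character within its 32-char row. */
#define CH_BIT(c) (1u << ((c) & 31))

/* Hypothetical set of conversion specifiers, folded to a constant at build time. */
#define CONV_MASK (CH_BIT('c') | CH_BIT('d') | CH_BIT('s') | \
                   CH_BIT('u') | CH_BIT('x'))

/* Variant 1: one comparison per specifier. The compiler is free to lower
 * this however it likes (compare chain, bit test or table). */
static int is_conv_ifchain(unsigned char c)
{
    if (c == 'c')
        return 1;
    if (c == 'd')
        return 1;
    if (c == 's')
        return 1;
    if (c == 'u')
        return 1;
    if (c == 'x')
        return 1;
    return 0;
}

/* Variant 2: explicit mask test. The range check keeps only 0x60..0x7f,
 * the row that holds the lowercase letters, so that (c & 31) is unambiguous. */
static int is_conv_mask(unsigned char c)
{
    if ((c & 0xe0) != 0x60)
        return 0;
    return (CH_BIT(c) & CONV_MASK) != 0;
}
```

Building both variants with and without -fno-jump-tables and comparing the
results with "size" should show whether a table was emitted by a given
compiler. For a plain membership test like this one the compiler may well
produce a bit test by itself. The sketch only illustrates the two source
forms.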
I also noticed that we can reduce the loop by ~40 bytes by moving the
literal copy after the block that deals with format sequences,
because it eases comparisons, but that's no big deal for now since your
subsequent patches are going to change all that.
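To illustrate the ordering only, here is a rough sketch of a loop where the
literal copy comes after the format handling. It is a toy, not the nolibc
code: mini_format() and put_char() are hypothetical names, only "%c" and "%%"
are understood, and any other sequence is copied verbatim to the output:

```c
#include <stddef.h>

/* Hypothetical helper: store one char if it fits, always count it. */
static void put_char(char *out, size_t size, size_t *len, char c)
{
    if (*len + 1 < size)
        out[*len] = c;
    (*len)++;
}

/* Hypothetical toy formatter. Returns the length the full output would
 * have, like snprintf(). */
static size_t mini_format(char *out, size_t size, const char *fmt, char arg)
{
    size_t len = 0;

    while (*fmt) {
        const char *lit = fmt;      /* start of the text copied verbatim */

        if (*fmt == '%') {
            fmt++;
            if (*fmt == 'c') {
                put_char(out, size, &len, arg);
                fmt++;
                continue;
            }
            if (*fmt == '%')
                lit = fmt;          /* "%%": keep only the second '%' */
            if (*fmt)
                fmt++;              /* unknown sequence: copied as-is from lit */
        }

        /* Literal copy, placed after the format block: skip to the next
         * '%' or the end of the string, then emit everything since lit. */
        while (*fmt && *fmt != '%')
            fmt++;
        while (lit < fmt)
            put_char(out, size, &len, *lit++);
    }

    if (size)
        out[len < size ? len : size - 1] = '\0';
    return len;
}
```

With this layout the invalid-format case and the plain-text case share the
same copy code, which is one possible reason the loop gets shorter. The
actual gain depends on the real function and on the compiler.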
At least I wanted to understand what was causing this difference
between our builds, and whether it was likely to be permanent or not,
so now this patch is OK for me.
Acked-by: Willy Tarreau <w@....eu>
Willy