[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YrFfQOSrqpX/qjhd@FVFF77S0Q05N>
Date: Tue, 21 Jun 2022 07:03:44 +0100
From: Mark Rutland <mark.rutland@....com>
To: Alexander Lobakin <alexandr.lobakin@...el.com>
Cc: Arnd Bergmann <arnd@...db.de>, Yury Norov <yury.norov@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Matt Turner <mattst88@...il.com>,
Brian Cain <bcain@...cinc.com>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>,
"David S. Miller" <davem@...emloft.net>,
Kees Cook <keescook@...omium.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Marco Elver <elver@...gle.com>, Borislav Petkov <bp@...e.de>,
Tony Luck <tony.luck@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-alpha@...r.kernel.org, linux-hexagon@...r.kernel.org,
linux-ia64@...r.kernel.org, linux-m68k@...ts.linux-m68k.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/7] bitops: let optimize out non-atomic bitops on
compile-time constants
On Mon, Jun 20, 2022 at 05:08:55PM +0200, Alexander Lobakin wrote:
> From: Mark Rutland <mark.rutland@....com>
> Date: Mon, 20 Jun 2022 15:19:42 +0100
>
> > On Fri, Jun 17, 2022 at 04:40:24PM +0200, Alexander Lobakin wrote:
> >
> > > The savings are architecture, compiler and compiler flags dependent,
> > > for example, on x86_64 -O2:
> > >
> > > GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
> > > LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
> > > LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)
> > >
> > > and ARM64 (courtesy of Mark[0]):
> > >
> > > GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
> > > LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)
> >
> > Hmm... with *this version* of the series, I'm not getting results nearly as
> > good as that when building defconfig atop v5.19-rc3:
> >
> > GCC 8.5.0: add/remove: 83/49 grow/shrink: 973/1147 up/down: 32020/-47824 (-15804)
> > GCC 9.3.0: add/remove: 68/51 grow/shrink: 1167/592 up/down: 30720/-31352 (-632)
> > GCC 10.3.0: add/remove: 84/37 grow/shrink: 1711/1003 up/down: 45392/-41844 (3548)
> > GCC 11.1.0: add/remove: 88/31 grow/shrink: 1635/963 up/down: 51540/-46096 (5444)
> > GCC 11.3.0: add/remove: 89/32 grow/shrink: 1629/966 up/down: 51456/-46056 (5400)
> > GCC 12.1.0: add/remove: 84/31 grow/shrink: 1540/829 up/down: 48772/-43164 (5608)
> >
> > LLVM 12.0.1: add/remove: 118/58 grow/shrink: 437/381 up/down: 45312/-65668 (-20356)
> > LLVM 13.0.1: add/remove: 35/19 grow/shrink: 416/243 up/down: 14408/-22200 (-7792)
> > LLVM 14.0.0: add/remove: 42/16 grow/shrink: 415/234 up/down: 15296/-21008 (-5712)
> >
> > ... and that now seems to be regressing codegen with recent versions of GCC as
> > much as it improves it LLVM.
> >
> > I'm not sure if we've improved some other code and removed the benefit between
> > v5.19-rc1 and v5.19-rc3, or whether something else it at play, but this doesn't
> > look as compelling as it did.
>
> Mostly likely it's due to that in v1 I mistakingly removed
> `volatile` from gen[eric]_test_bit(), so there was an impact for
> non-constant cases as well.
> +5 Kb sounds bad tho. Do you have CONFIG_TEST_BITMAP enabled, does
> it work?
I didn't have it enabled, but I tried that just nw with GCC 12.1.0 and it
builds cleanly, and test_bitmap_const_eval() gets optimized away entirely. If i
remove the `static` from that, GCC generates:
| <test_bitmap_const_eval>:
| paciasp
| autiasp
| ret
... which is a trivial stub.
> Probably the same reason as for m68k, more constant
> optimization -> more aggressive inlining or inlining rebalance ->
> larger code. OTOH I've no idea why sometimes compiler decides to
> uninline really tiny functions where due to this patch series some
> bitops have been converted to constants, like it goes on m68k.
Hmm.... it'd be interesting to take a look at a few architectures and see what
the common case is.
Thanks,
Mark.
Powered by blists - more mailing lists