Date:   Tue, 7 Jun 2022 14:45:41 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Alexander Lobakin <alexandr.lobakin@...el.com>
Cc:     Arnd Bergmann <arnd@...db.de>, Yury Norov <yury.norov@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Richard Henderson <rth@...ddle.net>,
        Matt Turner <mattst88@...il.com>,
        Brian Cain <bcain@...cinc.com>,
        Yoshinori Sato <ysato@...rs.sourceforge.jp>,
        Rich Felker <dalias@...c.org>,
        "David S. Miller" <davem@...emloft.net>,
        Kees Cook <keescook@...omium.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Marco Elver <elver@...gle.com>, Borislav Petkov <bp@...e.de>,
        Tony Luck <tony.luck@...el.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        alpha <linux-alpha@...r.kernel.org>,
        "open list:QUALCOMM HEXAGON..." <linux-hexagon@...r.kernel.org>,
        "linux-ia64@...r.kernel.org" <linux-ia64@...r.kernel.org>,
        linux-m68k <linux-m68k@...ts.linux-m68k.org>,
        Linux-sh list <linux-sh@...r.kernel.org>,
        sparclinux <sparclinux@...r.kernel.org>,
        Linux-Arch <linux-arch@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/6] bitops: let optimize out non-atomic bitops on
 compile-time constants

Hi Alexander,

On Mon, Jun 6, 2022 at 1:50 PM Alexander Lobakin
<alexandr.lobakin@...el.com> wrote:
> While I was working on converting some structure fields from a fixed
> type to a bitmap, I started observing code size increase not only in
> places where the code works with the converted structure fields, but
> also where the converted vars were on the stack. That said, the
> following code:
>
>         DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
>         unsigned long bar = BIT(BAR_BIT);
>         unsigned long baz = 0;
>
>         __set_bit(FOO_BIT, foo);
>         baz |= BIT(BAZ_BIT);
>
>         BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo)));
>         BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
>         BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));
>
> triggers the first assertion on x86_64, which means that the
> compiler is unable to evaluate it to a compile-time initializer
> when the architecture-specific bitop is used even if it's obvious.
> I found that this is because many architecture-specific
> non-atomic bitop implementations use inline asm or other hacks which
> are faster or more robust when working with "real" variables (i.e.
> fields from the structures etc.), but the compilers have no clue how
> to optimize them out when called on compile-time constants.
>
> So, in order to let the compiler optimize out such cases, expand the
> test_bit() and __*_bit() definitions with a compile-time condition
> check, so that they will pick the generic C non-atomic bitop
> implementations when all of the arguments passed are compile-time
> constants, which means that the result will be a compile-time
> constant as well and the compiler will produce more efficient and
> simple code in 100% of cases (no changes when there's at least one
> non-compile-time-constant argument).
> The condition itself:
>
> if (
> __builtin_constant_p(nr) &&     /* <- bit position is constant */
> __builtin_constant_p(!!addr) && /* <- compiler knows bitmap addr is
>                                       always either NULL or not */
> addr &&                         /* <- bitmap addr is not NULL */
> __builtin_constant_p(*addr)     /* <- compiler knows the value of
>                                       the target bitmap */
> )
>         /* then pick the generic C variant */
> else
>         /* old code path, arch-specific */
>
> I also tried __is_constexpr() as suggested by Andy, but it was
> always returning 0 ('not a constant') for the 2nd, 3rd and 4th
> conditions.
>
> The savings on x86_64 with LLVM are insane (.text):
>
> $ scripts/bloat-o-meter -c vmlinux.{base,test}
> add/remove: 72/75 grow/shrink: 182/518 up/down: 53925/-137810 (-83885)
>
> $ scripts/bloat-o-meter -c vmlinux.{base,mod}
> add/remove: 7/1 grow/shrink: 1/19 up/down: 1135/-4082 (-2947)
>
> $ scripts/bloat-o-meter -c vmlinux.{base,all}
> add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
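
For anyone following along, the quoted condition can be tried outside the
kernel with a rough userspace sketch like the one below. It assumes GCC or
Clang (for __builtin_constant_p). The names sketch_test_bit(),
generic_test_bit(), arch_test_bit() and sketch_demo() are made up for
illustration and are not the names used in the series; arch_test_bit() is
only a stand-in that reads through a volatile pointer so the compiler cannot
fold it, and the NULL check is spelled as a uintptr_t comparison purely to
avoid address-always-true warnings when an array is passed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Generic C variant: plain shift and mask, which the compiler can fold. */
static inline bool generic_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Hypothetical stand-in for an arch-specific bitop: the volatile read keeps
 * the compiler from evaluating it at compile time, much like inline asm.
 */
static inline bool arch_test_bit(unsigned long nr, const unsigned long *addr)
{
	const volatile unsigned long *p = addr + nr / BITS_PER_LONG;

	return (*p >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Pick the generic variant only when the bit number, the non-NULL-ness of
 * the bitmap address and the bitmap value are all compile-time constants.
 * Whether that holds depends on the compiler and optimization level; the
 * returned value is the same on either path.
 */
#define sketch_test_bit(nr, addr)					\
	((__builtin_constant_p(nr) &&					\
	  __builtin_constant_p((uintptr_t)(addr) != (uintptr_t)NULL) &&	\
	  (uintptr_t)(addr) != (uintptr_t)NULL &&			\
	  __builtin_constant_p(*(const unsigned long *)(addr))) ?	\
	 generic_test_bit(nr, addr) : arch_test_bit(nr, addr))

/* Returns 1 when both the constant and the runtime bit read back as set. */
int sketch_demo(unsigned long nr)
{
	unsigned long foo[1] = { 0 };
	unsigned long map[1] = { 0 };

	assert(nr < BITS_PER_LONG);

	foo[0] |= 1UL << 3;	/* open-coded equivalent of __set_bit(3, foo) */
	map[0] |= 1UL << nr;	/* bit number only known at run time */

	return sketch_test_bit(3, foo) && !sketch_test_bit(4, foo) &&
	       sketch_test_bit(nr, map);
}
```

Comparing the disassembly of sketch_demo() at -O2 with and without the
macro's constant check is one way to see whether a given compiler actually
folds the constant case.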

Thank you!

I gave it a try on m68k, and am a bit disappointed seeing an increase
in code size:

    add/remove: 49/13 grow/shrink: 279/138 up/down: 6434/-3342 (3092)

This is atari_defconfig on a tree based on v5.19-rc1, built with
m68k-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0 and GNU ld (GNU
Binutils for Ubuntu) 2.34.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
