lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFULd4ZSS+gLeqFs4hNSoEsN-T-UUOGVzLX9OjR07maMrP+CHQ@mail.gmail.com>
Date: Fri, 28 Feb 2025 14:13:50 +0100
From: Uros Bizjak <ubizjak@...il.com>
To: x86@...nel.org, linux-kernel@...r.kernel.org
Cc: Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic
 locking insns

On Fri, Feb 28, 2025 at 1:38 PM Uros Bizjak <ubizjak@...il.com> wrote:
>
> According to:
>
>   https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html
>
> the usage of asm pseudo directives in the asm template can confuse
> the compiler to wrongly estimate the size of the generated
> code.
>
> The LOCK_PREFIX macro expands to several asm pseudo directives, so
> its usage in atomic locking insns causes instruction length estimate
> to fail significantly (the specially instrumented compiler reports
> the estimated length of these asm templates to be 6 instructions long).
>
> This incorrect estimate further causes unoptimal inlining decisions,
> unoptimal instruction scheduling and unoptimal code block alignments
> for functions that use these locking primitives.
>
> Use asm_inline instead:
>
>   https://gcc.gnu.org/pipermail/gcc-patches/2018-December/512349.html
>
> which is a feature that makes GCC pretend some inline assembler code
> is tiny (while it would think it is huge), instead of just asm.
>
> For code size estimation, the size of the asm is then taken as
> the minimum size of one instruction, ignoring how many instructions
> compiler thinks it is.
>
> The code size of the resulting x86_64 defconfig object file increases
> for 33.264 kbytes, representing 1.2% code size increase:
>
>    text    data     bss     dec     hex filename
> 27450107        4633332  814148 32897587        1f5fa33 vmlinux-old.o
> 27483371        4633784  814148 32931303        1f67de7 vmlinux-new.o
>
> mainly due to different inlining decisions of -O2 build.

FTR, -Os (where generated code size really matters) x86_64 defconfig
object file *decreases* for 24.388 kbytes, representing 1.0% code size
*decrease*:

  text    data     bss     dec     hex filename
23883860        4617284  814212 29315356        1bf511c vmlinux-old.o
23859472        4615404  814212 29289088        1beea80 vmlinux-new.o

again mainly due to different inlining decisions of -Os build.

Uros.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ