lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 8 Apr 2024 12:42:31 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Peter Anvin <hpa@...or.com>, 
	"the arch/x86 maintainers" <x86@...nel.org>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: More annoying code generation by clang

On Mon, 8 Apr 2024 at 11:32, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> It's been reported long ago, it seems to be hard to fix.
>
> I suspect the issue is that the inline asm format is fairly closely
> related to the gcc machine descriptions (look at the machine
> descriptor files in gcc, and if you can ignore the horrid LISP-style
> syntax you see how close they are).

Actually, one of the github issues pages has more of an explanation
(and yes, it's tied to impedance issues between the inline asm syntax
and how clang works):

      https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442

so I wrote more of a commit log and did that "ASM_SOURCE_G" thing
(except I decided to call it "input" instead of "source", since that's
the standard inline asm language).

This version also has that output size fixed, and the commit message
talks about it.

This does *not* fix other inline asms to use "ASM_INPUT_G/RM".

I think it's mainly some of the bitop code that people have noticed
before - fls and variable_ffs() and friends.

I suspect clang is more common in the arm64 world than it is for
x86-64 kernel developers, and arm64 inline asm basically never uses
"rm" or "g" since arm64 doesn't have instructions that take either a
register or a memory operand.

Anyway, with gcc this generates

        cmp (%rdx),%ebx; sbb %rax,%rax  # _7->max_fds, fd, __mask

IOW, it uses the memory location for "max_fds". It couldn't do that
before, because it used to think that it always had to do the compare
in 64 bits, and the memory location is only 32-bit.

With clang, this generates

        movl    (%rcx), %eax
        cmpl    %eax, %edi
        sbbq    %rdi, %rdi

which has that extra register use, but is at least much better than
what it used to generate with crazy "load into register, spill to
stack, then compare against stack contents".

               Linus

View attachment "0001-x86-improve-array_index_mask_nospec-code-generation.patch" of type "text/x-patch" (4554 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ