Date:   Sun, 14 Aug 2022 13:29:28 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Nathan Chancellor <nathan@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Jeff Layton <jlayton@...nel.org>,
        Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
        linux-kernel@...r.kernel.org, Matthew Wilcox <willy@...radead.org>,
        clang-built-linux <llvm@...ts.linux.dev>
Subject: Re: [GIT PULL] Ceph updates for 5.20-rc1

On Sun, Aug 14, 2022 at 1:03 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Gcc does well regardless, and clang ends up really wanting to move so
> much out of the dentry_cmp() loop that it runs out of registers and
> always ends up doing a couple of spills.
>
> I think it reduced the spills by one, but not enough to generate the
> nice non-frame code that gcc does.

Note that that code was basically written to make gcc happy, so the
fact that clang then does worse is not hugely surprising.

I dug into it some more, and it is really "load_unaligned_zeropad()"
that makes clang uncomfortable.

The problem ends up being that clang sees that it's inside that inner
loop, and tries very hard to optimize the shift-and-mask that happens
if the exception happens.
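
For readers who have not looked at that helper: load_unaligned_zeropad()
does one unaligned word load, and if that load faults because the word
runs into an unmapped page, the exception fixup redoes the load from the
aligned address and shifts the result so the missing bytes read as zero.
A rough userspace model of only that fixup path is sketched below. It
assumes a little-endian machine and 64-bit words; the name
zeropad_fixup() is hypothetical, and this is an illustration, not the
kernel's code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the shift-and-mask fixup (little-endian only).
 * Assumes the aligned 8-byte word containing 'addr' is readable.
 */
static uint64_t zeropad_fixup(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t offset = a & (sizeof(uint64_t) - 1);	/* bytes past alignment */
	uint64_t word;

	/* Load the aligned word that contains the first requested byte. */
	memcpy(&word, (const void *)(a - offset), sizeof(word));

	/* Drop the bytes before 'addr'; the high bytes become zero padding. */
	return word >> (offset * 8);
}
```

In the real helper this path only runs from the exception handler, so
the common case is just the single load.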

The fact that the exception *never* happens unless DEBUG_PAGEALLOC is
enabled - and very very seldom even then - is not something we can
really explain to clang.

So it thinks that code is really hot in the inner loop, and does all
kinds of silly things due to that.

Gcc, in contrast, generates the obvious straightforward code, and
that's what I wrote and optimized that code for.

             Linus
