Message-ID: <CAK8P3a1GQSyCj1L8fFG4Pah8dr5Lanw=1yuimX1o+53ARzOX+Q@mail.gmail.com>
Date:   Tue, 30 Jul 2019 09:34:28 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Segher Boessenkool <segher@...nel.crashing.org>
Cc:     Nathan Chancellor <natechancellor@...il.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        christophe leroy <christophe.leroy@....fr>,
        kbuild test robot <lkp@...el.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        clang-built-linux <clang-built-linux@...glegroups.com>
Subject: Re: [PATCH] powerpc: workaround clang codegen bug in dcbz

On Mon, Jul 29, 2019 at 11:52 PM Segher Boessenkool
<segher@...nel.crashing.org> wrote:
>
> On Mon, Jul 29, 2019 at 01:32:46PM -0700, Nathan Chancellor wrote:
> > For the record:
> >
> > https://godbolt.org/z/z57VU7
> >
> > This seems consistent with what Michael found so I don't think a revert
> > is entirely unreasonable.
>
> Try this:
>
>   https://godbolt.org/z/6_ZfVi
>
> This matters in non-trivial loops, for example.  But all current cases
> where such non-trivial loops are done with cache block instructions are
> actually written in real assembler already, using two registers.
> Because performance matters.  Not that I recommend writing code as
> critical as memset in C with inline asm :-)

Upon a second look, I think the issue is that the "Z" operand is declared
as an input when it should be an output. clang decides that it can make a
copy of the input and pass the address of that copy into the inline asm.
This is not the most efficient way, but it seems entirely correct according
to the constraints.

Changing it to an output "=Z" constraint seems to make it work:

https://godbolt.org/z/FwEqHf

Clang still doesn't use the optimum form, but it passes the correct pointer.
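
For illustration, here is a minimal sketch of the output-constraint variant
described above. It is not the actual kernel helper: the names dcbz_output,
dcbz_demo and DEMO_BLOCK_SIZE are made up for this example, the "dcbz %y0"
template is my assumption about the asm form, and the 128-byte block size is
only a placeholder since real cache block sizes vary by CPU. On non-powerpc
targets it falls back to memset so the sketch can still be compiled and run.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: placeholder cache block size; the real value is CPU specific. */
#define DEMO_BLOCK_SIZE 128

/* Hypothetical helper: zero the cache block containing addr. */
static inline void dcbz_output(void *addr)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	/*
	 * "=Z" declares the byte at addr as an output of the asm, so the
	 * compiler has to hand over the address of the real object and
	 * cannot substitute the address of a temporary copy.
	 */
	__asm__ __volatile__("dcbz %y0"
			     : "=Z"(*(unsigned char *)addr)
			     :
			     : "memory");
#else
	/* Portable stand-in so the sketch runs on other architectures. */
	memset(addr, 0, DEMO_BLOCK_SIZE);
#endif
}

/* Fill an aligned buffer with 0xff, clear it, return its first byte. */
static int dcbz_demo(void)
{
	static unsigned char buf[DEMO_BLOCK_SIZE]
		__attribute__((aligned(DEMO_BLOCK_SIZE)));

	/* dcbz operates on whole blocks, so the buffer must be block aligned. */
	assert(((uintptr_t)buf % DEMO_BLOCK_SIZE) == 0);

	memset(buf, 0xff, sizeof(buf));
	dcbz_output(buf);
	return buf[0];
}
```

With the input form, : : "Z"(*(unsigned char *)addr), the compiler is only
promised that the asm reads that byte, which is why handing it a copy is
allowed by the constraints.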

       Arnd
