Message-ID: <87h873zs88.fsf@concordia.ellerman.id.au>
Date: Tue, 30 Jul 2019 21:17:43 +1000
From: Michael Ellerman <mpe@...erman.id.au>
To: Arnd Bergmann <arnd@...db.de>,
Segher Boessenkool <segher@...nel.crashing.org>
Cc: Nathan Chancellor <natechancellor@...il.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
christophe leroy <christophe.leroy@....fr>,
kbuild test robot <lkp@...el.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
clang-built-linux <clang-built-linux@...glegroups.com>
Subject: Re: [PATCH] powerpc: workaround clang codegen bug in dcbz
Arnd Bergmann <arnd@...db.de> writes:
> On Mon, Jul 29, 2019 at 11:52 PM Segher Boessenkool
> <segher@...nel.crashing.org> wrote:
>> On Mon, Jul 29, 2019 at 01:32:46PM -0700, Nathan Chancellor wrote:
>> > For the record:
>> >
>> > https://godbolt.org/z/z57VU7
>> >
>> > This seems consistent with what Michael found so I don't think a revert
>> > is entirely unreasonable.
>>
>> Try this:
>>
>> https://godbolt.org/z/6_ZfVi
>>
>> This matters in non-trivial loops, for example. But all current cases
>> where such non-trivial loops are done with cache block instructions are
>> actually written in real assembler already, using two registers.
>> Because performance matters. Not that I recommend writing code as
>> critical as memset in C with inline asm :-)
>
> Upon a second look, I think the issue is that the "Z" is an input argument
> when it should be an output. clang decides that it can make a copy of the
> input and pass that into the inline asm. This is not the most efficient
> way, but it seems entirely correct according to the constraints.
>
> Changing it to an output "=Z" constraint seems to make it work:
>
> https://godbolt.org/z/FwEqHf
>
> Clang still doesn't use the optimum form, but it passes the correct pointer.
Thanks Arnd. This seems like a better solution.
I'll drop the revert I have staged.
Segher, does this look OK to you?
Nathan/Nick, is one of you able to test this with your clang CI?
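For reference, a minimal sketch of the constraint change Arnd describes above: the memory operand moves from the input list ("Z") to the output list ("=Z"), so the compiler can no longer hand the asm a temporary copy of the byte. The helper names, DEMO_BLOCK_SIZE, and the non-powerpc fallback are assumptions made for illustration. This is not the kernel's actual helper, and it does not reproduce the contents of the godbolt links.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch only: a cache block is at most 128 bytes. */
#define DEMO_BLOCK_SIZE 128

/* Hypothetical helper: zero the cache block that contains addr. */
static inline void demo_dcbz(void *addr)
{
#if defined(__powerpc__)
	/*
	 * "=Z" declares the byte at *addr as an output memory operand, so the
	 * compiler has to pass the real address instead of a copy of the input.
	 * With an input-only "Z"(*(uint8_t *)addr), passing a copy is allowed.
	 */
	__asm__ __volatile__("dcbz %y0" : "=Z"(*(uint8_t *)addr) : : "memory");
#else
	/* Portable stand-in so the sketch also builds on other architectures. */
	uintptr_t base = (uintptr_t)addr & ~(uintptr_t)(DEMO_BLOCK_SIZE - 1);
	memset((void *)base, 0, DEMO_BLOCK_SIZE);
#endif
}

/* Buffer aligned to the assumed block size so the zeroing stays inside it. */
static unsigned char demo_buf[DEMO_BLOCK_SIZE]
	__attribute__((aligned(DEMO_BLOCK_SIZE)));

/* Returns 1 if the first byte of the buffer was cleared by demo_dcbz(). */
static int demo_zero_block(void)
{
	memset(demo_buf, 0xff, sizeof(demo_buf));
	demo_dcbz(demo_buf);
	return demo_buf[0] == 0;
}
```

Calling demo_zero_block() should return 1 if the correct pointer reached the instruction. As Arnd notes, clang may still not pick the optimum addressing form with this constraint.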
cheers