Message-ID: <20190729215200.GN31406@gate.crashing.org>
Date: Mon, 29 Jul 2019 16:52:00 -0500
From: Segher Boessenkool <segher@...nel.crashing.org>
To: Nathan Chancellor <natechancellor@...il.com>
Cc: Nick Desaulniers <ndesaulniers@...gle.com>, mpe@...erman.id.au,
christophe.leroy@....fr, arnd@...db.de,
kbuild test robot <lkp@...el.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
clang-built-linux@...glegroups.com
Subject: Re: [PATCH] powerpc: workaround clang codegen bug in dcbz
On Mon, Jul 29, 2019 at 01:32:46PM -0700, Nathan Chancellor wrote:
> For the record:
>
> https://godbolt.org/z/z57VU7
>
> This seems consistent with what Michael found so I don't think a revert
> is entirely unreasonable.
Try this:
https://godbolt.org/z/6_ZfVi
This matters in non-trivial loops, for example. But all current cases
where such non-trivial loops are done with cache block instructions are
actually written in real assembler already, using two registers.
Because performance matters. Not that I recommend writing code as
critical as memset in C with inline asm :-)
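
[Editorial sketch, not part of the original message.] To illustrate what "using two registers" can look like, here is a minimal C sketch of the indexed form of dcbz, where one register holds a fixed base and the other a running offset, so the loop only has to update the offset. The names zero_block, zero_range and CACHE_BLOCK are hypothetical, the 128-byte block size is an assumption (it differs between CPUs), and the memset fallback is there only so the sketch compiles on non-PowerPC hosts. It is not the kernel's code.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 128-byte cache blocks. Real code must use the CPU's
 * actual block size, which this sketch does not query. */
#define CACHE_BLOCK 128

/* Zero the cache block at base + off (hypothetical helper). */
static inline void zero_block(unsigned char *base, size_t off)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    /* dcbz RA,RB zeroes the block containing (RA|0) + RB.
     * The "b" constraint keeps r0 out of the RA slot, where it
     * would be read as the literal value 0 instead of a register. */
    __asm__ __volatile__("dcbz %0,%1" : : "b"(base), "r"(off) : "memory");
#else
    /* Portable stand-in so the sketch builds elsewhere. */
    memset(base + off, 0, CACHE_BLOCK);
#endif
}

/* Zero len bytes starting at base. base must be block-aligned and
 * len a multiple of CACHE_BLOCK. The base stays in one register;
 * only the offset changes per iteration. */
static void zero_range(unsigned char *base, size_t len)
{
    for (size_t off = 0; off < len; off += CACHE_BLOCK)
        zero_block(base, off);
}
```

Whether a compiler produces an equally tight loop from a single-register form depends on the surrounding code, which is what the godbolt links in this thread compare.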
Segher