[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <42552be3-df90-4f8f-a27b-abd08e350fc4@arm.com>
Date: Tue, 9 Jul 2024 16:52:49 +0100
From: Robin Murphy <robin.murphy@....com>
To: Yangyu Chen <cyy@...self.name>, Christoph Hellwig <hch@....de>
Cc: iommu@...ts.linux.dev,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>
Subject: Re: [PATCH -fixes] dma-mapping: add default implementation to
arch_dma_{set|clear}_uncached
On 09/07/2024 1:22 pm, Yangyu Chen wrote:
>
>
>> On Jul 9, 2024, at 19:46, Christoph Hellwig <hch@....de> wrote:
>>
>> On Tue, Jul 09, 2024 at 07:39:29PM +0800, Yangyu Chen wrote:
>>> The reason is that some optimizations failed to apply after adding
>>> some passes. I will fix the compiler later. Whatever, we should not
>>> rely on this optimization to get the code being successfully compiled.
>>
>> The Linux kernel relies on constant propagation and dead code
>> eliminiation a lot to make code simpler and more readable.
>
> Actually, the compiler is patched LLVM with -O2 optimization. I didn’t
> turn off the optimization.
Well, sorry, you did do that, by patching the compiler in a way which
makes it no longer happen as before. If LLVM is otherwise able to make
this optimisation as expected then to me that sounds like your patch
effectively causes a codegen regression in LLVM, thus it should hardly
be the concern of Linux, nor every other project which may get worse
codegen because of it, to work around it.
Regardless of whether people think it's reasonable to *depend* on
compiler optimisation or not, emitting dead code which was not
previously emitted at the same optimisation level seems like a clear
step backwards.
> You can see what we did for the compiler here[1] and compile the
> kernel with `-march=rv64imac_zicond_zicldst` added to KBUILD_CFLAGS.
> I added conditional load/store pass as Intel did for the x86 APX
> extension, which appeared last year (called hoist load/store in
> LLVM if you want to search the PR), and then LLVM failed to optimize
> this.
>
> The only failed symbol on the kernel with `ARCH=riscv defconfig`
> is `arch_dma_set_uncached` since the compiler requires all possible
> values to be known. I think a pattern like in kernel/dma/direct.c:349
> for symbol `arch_dma_clear_uncached`, which uses `if
> (IS_ENABLE(CONFIG_ARCH_HAS_xxx)) xxx` is acceptable. But for the
> symbol `arch_dma_set_uncached`, a complex analysis is needed for a
> value set in a block of branches. I think we should not rely on such
> compiler optimization in such a complex pattern.
I'm not a compiler guy, but is it really that complex when the variable
is only ever written with the same compile-time-constant value that it's
already initialised with? If the optimisation pass is so focused on
being able to use a conditional store instruction that it would rather
emit one which has no effect either way than elide it entirely, that
doesn't strike me as a particularly good optimisation :/
Thanks,
Robin.
>
> In addition, patching this way can also make this symbol safer to use.
>
> [1] https://github.com/cyyself/llvm-project/tree/zicldst-support-bugless-v3
>
> Thanks,
> Yangyu Chen
Powered by blists - more mailing lists