linux-kernel - Re: [PATCH -fixes] dma-mapping: add default implementation to arch_dma_{set|clear}

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <42552be3-df90-4f8f-a27b-abd08e350fc4@arm.com>
Date: Tue, 9 Jul 2024 16:52:49 +0100
From: Robin Murphy <robin.murphy@....com>
To: Yangyu Chen <cyy@...self.name>, Christoph Hellwig <hch@....de>
Cc: iommu@...ts.linux.dev,
 Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
 Marek Szyprowski <m.szyprowski@...sung.com>
Subject: Re: [PATCH -fixes] dma-mapping: add default implementation to
 arch_dma_{set|clear}_uncached

On 09/07/2024 1:22 pm, Yangyu Chen wrote:
> 
> 
>> On Jul 9, 2024, at 19:46, Christoph Hellwig <hch@....de> wrote:
>>
>> On Tue, Jul 09, 2024 at 07:39:29PM +0800, Yangyu Chen wrote:
>>> The reason is that some optimizations failed to apply after adding
>>> some passes. I will fix the compiler later. Whatever, we should not
>>> rely on this optimization to get the code being successfully compiled.
>>
>> The Linux kernel relies on constant propagation and dead code
>> eliminiation a lot to make code simpler and more readable.
> 
> Actually, the compiler is patched LLVM with -O2 optimization. I didn’t
> turn off the optimization.

Well, sorry, you did do that, by patching the compiler in a way which 
makes it no longer happen as before. If LLVM is otherwise able to make 
this optimisation as expected then to me that sounds like your patch 
effectively causes a codegen regression in LLVM, thus it should hardly 
be the concern of Linux, nor every other project which may get worse 
codegen because of it, to work around it.

Regardless of whether people think it's reasonable to *depend* on 
compiler optimisation or not, emitting dead code which was not 
previously emitted at the same optimisation level seems like a clear 
step backwards.

> You can see what we did for the compiler here[1] and compile the
> kernel with `-march=rv64imac_zicond_zicldst` added to KBUILD_CFLAGS.
> I added conditional load/store pass as Intel did for the x86 APX
> extension, which appeared last year (called hoist load/store in
> LLVM if you want to search the PR), and then LLVM failed to optimize
> this.
> 
> The only failed symbol on the kernel with `ARCH=riscv defconfig`
> is `arch_dma_set_uncached` since the compiler requires all possible
> values to be known. I think a pattern like in kernel/dma/direct.c:349
> for symbol `arch_dma_clear_uncached`, which uses `if
> (IS_ENABLE(CONFIG_ARCH_HAS_xxx)) xxx` is acceptable. But for the
> symbol `arch_dma_set_uncached`, a complex analysis is needed for a
> value set in a block of branches. I think we should not rely on such
> compiler optimization in such a complex pattern.

I'm not a compiler guy, but is it really that complex when the variable 
is only ever written with the same compile-time-constant value that it's 
already initialised with? If the optimisation pass is so focused on 
being able to use a conditional store instruction that it would rather 
emit one which has no effect either way than elide it entirely, that 
doesn't strike me as a particularly good optimisation :/

Thanks,
Robin.

> 
> In addition, patching this way can also make this symbol safer to use.
> 
> [1] https://github.com/cyyself/llvm-project/tree/zicldst-support-bugless-v3
> 
> Thanks,
> Yangyu Chen