[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJF2gTSRuj3-AgynXxZeXc2vGSH8Ohn5eP2hsuKi8rTzSPLhRQ@mail.gmail.com>
Date: Sat, 6 May 2023 15:25:41 +0800
From: Guo Ren <guoren@...nel.org>
To: Arnd Bergmann <arnd@...db.de>
Cc: Arnd Bergmann <arnd@...nel.org>, Christoph Hellwig <hch@....de>,
linux-kernel@...r.kernel.org, Vineet Gupta <vgupta@...nel.org>,
Will Deacon <will@...nel.org>,
Russell King <linux@...linux.org.uk>,
Neil Armstrong <neil.armstrong@...aro.org>,
Linus Walleij <linus.walleij@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Brian Cain <bcain@...cinc.com>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Michal Simek <monstr@...str.eu>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Dinh Nguyen <dinguyen@...nel.org>,
Stafford Horne <shorne@...il.com>,
Helge Deller <deller@....de>,
Michael Ellerman <mpe@...erman.id.au>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Rich Felker <dalias@...c.org>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
"David S . Miller" <davem@...emloft.net>,
Max Filippov <jcmvbkbc@...il.com>,
Robin Murphy <robin.murphy@....com>,
"Lad, Prabhakar" <prabhakar.mahadev-lad.rj@...renesas.com>,
"Conor.Dooley" <conor.dooley@...rochip.com>,
linux-snps-arc@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org,
"linux-oxnas@...ups.io" <linux-oxnas@...ups.io>,
"linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>,
linux-hexagon@...r.kernel.org, linux-m68k@...ts.linux-m68k.org,
linux-mips@...r.kernel.org,
"linux-openrisc@...r.kernel.org" <linux-openrisc@...r.kernel.org>,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-riscv@...ts.infradead.org, linux-sh@...r.kernel.org,
sparclinux@...r.kernel.org, linux-xtensa@...ux-xtensa.org
Subject: Re: [PATCH 09/21] riscv: dma-mapping: skip invalidation before
bidirectional DMA
On Fri, May 5, 2023 at 9:19 PM Arnd Bergmann <arnd@...db.de> wrote:
>
> On Fri, May 5, 2023, at 07:47, Guo Ren wrote:
> > On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann <arnd@...nel.org> wrote:
>
> >>
> >> riscv also invalidates the caches before the transfer, which does
> >> not appear to serve any purpose.
> > Yes, we can't guarantee the CPU pre-load cache lines randomly during
> > dma working.
> >
> > But I've two purposes to keep invalidates before dma transfer:
> > - We clearly tell the CPU these cache lines are invalid. The caching
> > algorithm would use these invalid slots first instead of replacing
> > valid ones.
> > - Invalidating is very cheap. Actually, flush and clean have the same
> > performance in our machine.
>
> The main purpose of the series was to get consistent behavior on
> all machines, so I really don't want a custom optimization on
> one architecture. You make a good point about cacheline reuse
> after invalidation, but if we do that, I'd suggest doing this
> across all architectures.
Yes, invalidation of DMA_FROM_DEVICE-for_device is a proposal for all
architectures.
>
> > So, how about:
> >
> > diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> > index d919efab6eba..2c52fbc15064 100644
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
> > @@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
> > ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> > break;
> > case DMA_FROM_DEVICE:
> > - ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> > - break;
> > case DMA_BIDIRECTIONAL:
> > ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> > break;
>
> This is something we can consider. Unfortunately, this is something
> that no architecture (except pa-risc, which has other problems)
> does at the moment, so we'd probably need to have a proper debate
> about this.
>
> We already have two conflicting ways to handle DMA_FROM_DEVICE,
> either invalidate/invalidate, or clean/invalidate. I can see
I vote to invalidate/invalidate.
My key point is to let DMA_FROM_DEVICE-for_device invalidate, and
DMA_BIDIRECTIONAL contains DMA_FROM_DEVICE.
So I also agree:
@@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
break;
case DMA_FROM_DEVICE:
- ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(invalidate, vaddr, size, riscv_cbom_block_size);
break;
case DMA_BIDIRECTIONAL:
ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
break;
> that flush/invalidate may be a sensible option as well, but I'd
> want to have that discussion after the series is complete, so
> we can come to a generic solution that has the same documented
> behavior across all architectures.
Yes, I agree to unify them into a generic solution first. My proposal
could be another topic in the future.
For that purpose, I give
Acked-by: Guo Ren <guoren@...nel.org>
>
> In particular, if we end up moving arm64 and riscv back to the
> traditional invalidate/invalidate for DMA_FROM_DEVICE and
> document that driver must not rely on buffers getting cleaned
After invalidation, the cache lines are also cleaned, right? So why do
we need to document it additionally?
> before a partial DMA_FROM_DEVICE, the question between clean
> or flush becomes moot as well.
>
> > @@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
> > break;
> > case DMA_FROM_DEVICE:
> > case DMA_BIDIRECTIONAL:
> > /* I'm not sure all drivers have guaranteed cacheline
> > alignment. If not, this inval would cause problems */
> > - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> > + ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
> > break;
>
> This is my original patch, and I would not mix it with the other
> change. The problem with non-aligned DMA_BIDIRECTIONAL buffers in
> is that both flush and inval would be wrong if you get simultaneous
> writes from device and cpu to the same cache line, so there is
> no way to win this. Using inval instead of flush would at least
> work if the CPU data in the cacheline is read-only from the CPU,
> so that seems better than something that is always wrong.
If CPU data in the cacheline is read-only, the cacheline would never
be dirty. Yes, It's always safe.
Okay, I agree we must keep cache-line-aligned. I comment it here, just
worry some dirty drivers couldn't work with the "invalid mechanism"
because of the CPU data corruption, and device data in the cacheline
is useless.
>
> The documented API is that sharing the cache line is not allowed
> at all, so anything that would observe a difference between the
> two is also a bug. One idea that we have considered already is
> that we could overwrite the unused bits of the cacheline with
> poison values and/or mark them as invalid using KASAN for debugging
> purposes, to find drivers that already violate this.
>
> Arnd
--
Best Regards
Guo Ren
Powered by blists - more mailing lists