[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <182840dedb4890a88c672b1c5d556920bf89a8fb.camel@synopsys.com>
Date: Fri, 18 May 2018 19:57:34 +0000
From: Alexey Brodkin <Alexey.Brodkin@...opsys.com>
To: "linux@...linux.org.uk" <linux@...linux.org.uk>
CC: "deanbo422@...il.com" <deanbo422@...il.com>,
"linux-sh@...r.kernel.org" <linux-sh@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"nios2-dev@...ts.rocketboards.org" <nios2-dev@...ts.rocketboards.org>,
"linux-xtensa@...ux-xtensa.org" <linux-xtensa@...ux-xtensa.org>,
"linux-m68k@...ts.linux-m68k.org" <linux-m68k@...ts.linux-m68k.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-hexagon@...r.kernel.org" <linux-hexagon@...r.kernel.org>,
"hch@....de" <hch@....de>,
"linux-alpha@...r.kernel.org" <linux-alpha@...r.kernel.org>,
"linux-snps-arc@...ts.infradead.org"
<linux-snps-arc@...ts.infradead.org>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"green.hu@...il.com" <green.hu@...il.com>,
"Vineet.Gupta1@...opsys.com" <Vineet.Gupta1@...opsys.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"openrisc@...ts.librecores.org" <openrisc@...ts.librecores.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"monstr@...str.eu" <monstr@...str.eu>,
"linux-parisc@...r.kernel.org" <linux-parisc@...r.kernel.org>,
"linux-c6x-dev@...ux-c6x.org" <linux-c6x-dev@...ux-c6x.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"sparclinux@...r.kernel.org" <sparclinux@...r.kernel.org>
Subject: Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH
02/20] dma-mapping: provide a generic dma-noncoherent implementation)
Hi Russel,
On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote:
> It's necessary. Take a moment to think carefully about this:
>
> dma_map_single(, dir)
>
> dma_sync_single_for_cpu(, dir)
>
> dma_sync_single_for_device(, dir)
>
> dma_unmap_single(, dir)
>
> In the case of a DMA-incoherent architecture, the operations done at each
> stage depend on the direction argument:
>
> map for_cpu for_device unmap
> TO_DEV writeback none writeback none
> TO_CPU invalidate invalidate* invalidate invalidate*
> BIDIR writeback invalidate writeback invalidate
>
> * - only necessary if the CPU speculatively prefetches.
I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even
if CPU doesn't preferch - what if we reuse the same buffer for multiple
reads from DMA device?
> The multiple invalidations for the TO_CPU case handles different
> conditions that can result in data corruption, and for some CPUs, all
> four are necessary.
I would agree that map()/unmap() a quite a special cases and so depending
on direction we need to execute in them either for_cpu() or for_device()
call-backs depending on direction.
As for invalidation in case of for_device(TO_CPU) I still don't see
a rationale behind it. Would be interesting to see a real example where
we benefit from this.
> This is what is implemented for 32-bit ARM, depending on the CPU
> capabilities, as we have DMA incoherent devices and we have CPUs that
> speculatively prefetch data, and so may load data into the caches while
> DMA is in operation.
>
>
> Things get more interesting if the implementation behind the DMA API has
> to copy data between the buffer supplied to the mapping and some DMA
> accessible buffer:
>
> map for_cpu for_device unmap
> TO_DEV copy to dma none copy to dma none
> TO_CPU none copy to cpu none copy to cpu
> BIDIR copy to dma copy to cpu copy to dma copy to cpu
>
> So, in both cases, the value of the direction argument defines what you
> need to do in each call.
Interesting enough in your seond table (which describes more complicated
case indeed) you set "none" for for_device(TO_CPU) which looks logical
to me.
So IMHO that's what make sense:
---------------------------->8-----------------------------
map for_cpu for_device unmap
TO_DEV writeback none writeback none
TO_CPU none invalidate none invalidate*
BIDIR writeback invalidate writeback invalidate*
---------------------------->8-----------------------------
* is the case for prefetching CPU.
-Alexey
Powered by blists - more mailing lists