Message-ID: <CAEbi=3cdWLfZKPmELwR_P5gDa_ocJZPoexj7N4QnuDfwx4crtQ@mail.gmail.com>
Date: Fri, 10 Nov 2017 16:13:13 +0800
From: Greentime Hu <green.hu@...il.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Greentime <greentime@...estech.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Jason Cooper <jason@...edaemon.net>,
Marc Zyngier <marc.zyngier@....com>,
Rob Herring <robh+dt@...nel.org>,
Networking <netdev@...r.kernel.org>,
Vincent Chen <vincentc@...estech.com>, deanbo422@...il.com
Subject: Re: [PATCH 13/31] nds32: DMA mapping API
2017-11-09 18:14 GMT+08:00 Arnd Bergmann <arnd@...db.de>:
> On Thu, Nov 9, 2017 at 8:12 AM, Greentime Hu <green.hu@...il.com> wrote:
>> 2017-11-08 17:09 GMT+08:00 Arnd Bergmann <arnd@...db.de>:
>>> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu <green.hu@...il.com> wrote:
>>>
>
>>> You do the same cache operations for _to_cpu and _to_device, which
>>> usually works, but is more expensive than you need. It's better to
>>> take the ownership into account and only do what you need.
>>>
>> Like this?
>>
>> static void
>> nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
>>                               size_t size, enum dma_data_direction dir)
>> {
>>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>>                         DMA_FROM_DEVICE);
>> }
>>
>> static void
>> nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
>>                                  size_t size, enum dma_data_direction dir)
>> {
>>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>>                         DMA_TO_DEVICE);
>> }
>
> No, it's more complicated than that. You need to pass both the direction of the
> DMA transaction and the ownership to consistent_sync(), and then do the
> correct cache maintenance operation for each of the six combinations.
>
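
For illustration, the dispatch over those six combinations could look
roughly like the sketch below. The ownership enum and the helpers
cache_wb_range(), cache_inv_range() and cache_wbinv_range() are
hypothetical placeholder names for the nds32 cache primitives, not the
actual API:

enum dma_sync_target {
        SYNC_FOR_DEVICE,
        SYNC_FOR_CPU,
};

static void consistent_sync(void *vaddr, size_t size,
                            enum dma_data_direction dir,
                            enum dma_sync_target target)
{
        unsigned long start = (unsigned long)vaddr;
        unsigned long end = start + size;

        if (target == SYNC_FOR_DEVICE) {
                switch (dir) {
                case DMA_TO_DEVICE:
                        /* flush dirty lines so the device sees them */
                        cache_wb_range(start, end);
                        break;
                case DMA_FROM_DEVICE:
                        /* discard stale lines before the device writes */
                        cache_inv_range(start, end);
                        break;
                case DMA_BIDIRECTIONAL:
                        /* both of the above */
                        cache_wbinv_range(start, end);
                        break;
                default:
                        BUG();
                }
        } else {
                switch (dir) {
                case DMA_TO_DEVICE:
                        /* nothing to do: no dirty lines left */
                        break;
                case DMA_FROM_DEVICE:
                case DMA_BIDIRECTIONAL:
                        /* drop lines speculatively fetched during DMA;
                         * a core without speculative loads could make
                         * this a no-op */
                        cache_inv_range(start, end);
                        break;
                default:
                        BUG();
                }
        }
}
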
> Which operation that is depends on the microarchitecture to some degree,
> e.g. on machines that can load arbitrary cache lines during speculative
> execution, you have to invalidate the caches during both
> _for_device/FROM_DEVICE and _for_cpu/FROM_DEVICE, while machines
> without speculative execution can skip the second invalidation; they
> only need to get rid of dirty cache lines before the DMA from the device.
>
> Usually you don't have to do a writeback during _for_cpu, since there
> are no dirty cache lines after the _for_device operation.
>
> It's not entirely clear what the correct behavior is for buffers that
> are not cache line aligned: some architectures use wbinval instead
> of inval for the _for_device/FROM_DEVICE operation on any partial
> cache line, but you wouldn't want to do that on the
> _for_cpu/FROM_DEVICE operation.
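
One way to handle those unaligned edges in the _for_device/FROM_DEVICE
path, as a rough sketch (cache_wbinv_line() and cache_inv_range() are
the same kind of hypothetical helpers as above; L1_CACHE_BYTES is the
usual constant from <linux/cache.h>):

static void dma_inv_range_for_device(unsigned long start, unsigned long end)
{
        unsigned long mask = L1_CACHE_BYTES - 1;

        if (start & mask) {
                /* partial first line: writeback+invalidate so the
                 * bytes outside the buffer are not lost */
                cache_wbinv_line(start & ~mask);
                start = (start & ~mask) + L1_CACHE_BYTES;
        }
        if ((end & mask) && end > start) {
                /* partial last line: same treatment */
                cache_wbinv_line(end & ~mask);
                end &= ~mask;
        }
        if (start < end)
                /* fully covered lines: invalidate only */
                cache_inv_range(start, end);
}
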
I get your point. I would prefer to keep it the way it is for now,
because handling all of these combinations correctly will be a little
bit complex. I will still study the code and see what I can improve in
the next version of the patch.