Message-ID: <CAK8P3a0BVWgUvHnJwfgUYbc9ZZqmCaG3XVe0thXX6kaaPFpZ_g@mail.gmail.com>
Date: Thu, 9 Nov 2017 11:14:56 +0100
From: Arnd Bergmann <arnd@...db.de>
To: Greentime Hu <green.hu@...il.com>
Cc: Greentime <greentime@...estech.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Jason Cooper <jason@...edaemon.net>,
Marc Zyngier <marc.zyngier@....com>,
Rob Herring <robh+dt@...nel.org>,
Networking <netdev@...r.kernel.org>,
Vincent Chen <vincentc@...estech.com>, deanbo422@...il.com
Subject: Re: [PATCH 13/31] nds32: DMA mapping API
On Thu, Nov 9, 2017 at 8:12 AM, Greentime Hu <green.hu@...il.com> wrote:
> 2017-11-08 17:09 GMT+08:00 Arnd Bergmann <arnd@...db.de>:
>> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu <green.hu@...il.com> wrote:
>>
>> You do the same cache operations for _to_cpu and _to_device, which
>> usually works,
>> but is more expensive than you need. It's better to take the ownership into
>> account and only do what you need.
>>
> Like this?
>
> static void
> nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
>                               size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>                         DMA_FROM_DEVICE);
> }
>
> static void
> nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
>                                  size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>                         DMA_TO_DEVICE);
> }

No, it's more complicated than that. You need to pass both the direction
of the DMA transaction and the ownership to consistent_sync(), and then
do the correct cache maintenance operation for each of the six
combinations.

Which operation that is depends on the microarchitecture to some degree.
E.g. on machines that can load arbitrary cache lines during speculative
execution, you have to invalidate the caches during both
_for_device/FROM_DEVICE and _for_cpu/FROM_DEVICE, while machines without
speculative execution can skip the second invalidation: they only need
to get rid of dirty cache lines before the DMA from the device. Usually
you don't have to do a writeback during _for_cpu, since there are no
dirty cache lines left after the _for_device operation.
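
As a rough sketch only (cpu_dma_inval_range(), cpu_dma_wb_range() and
cpu_dma_wbinval_range() are made-up names here, standing in for whatever
low-level cache primitives nds32 provides), the dispatch could look
something like:

static void consistent_sync(void *vaddr, size_t size,
			    enum dma_data_direction dir, bool for_cpu)
{
	unsigned long start = (unsigned long)vaddr;
	unsigned long end = start + size;

	if (for_cpu) {
		switch (dir) {
		case DMA_FROM_DEVICE:
		case DMA_BIDIRECTIONAL:
			/*
			 * a speculating CPU may have refetched stale
			 * lines while the device owned the buffer;
			 * on a core without speculation this can be
			 * a no-op
			 */
			cpu_dma_inval_range(start, end);
			break;
		case DMA_TO_DEVICE:
			/*
			 * nothing to do: no dirty lines can exist
			 * after the earlier _for_device writeback
			 */
			break;
		default:
			BUG();
		}
		return;
	}

	switch (dir) {
	case DMA_FROM_DEVICE:
		/*
		 * discard dirty lines so they cannot get written
		 * back on top of the incoming DMA data
		 */
		cpu_dma_inval_range(start, end);
		break;
	case DMA_TO_DEVICE:
		/* push dirty lines out so the device sees them */
		cpu_dma_wb_range(start, end);
		break;
	case DMA_BIDIRECTIONAL:
		cpu_dma_wbinval_range(start, end);
		break;
	default:
		BUG();
	}
}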
It's not entirely clear what the correct behavior is for buffers that
are not cache line aligned: some architectures use wbinval instead of
inval on any partial cache line in the _for_device/FROM_DEVICE
operation, so that dirty data sharing the line with the buffer is not
thrown away, but you wouldn't want to do that in the
_for_cpu/FROM_DEVICE operation, where writing back a stale line would
corrupt the data the device has just put in memory.
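
For the _for_device/FROM_DEVICE side that could look like this (again
just a sketch, using the same made-up helpers as above):

static void dma_inval_range(unsigned long start, unsigned long end)
{
	unsigned long line = cache_line_size();

	if (!IS_ALIGNED(start, line)) {
		/*
		 * the first line is shared with unrelated data:
		 * clean it before invalidating so that data is
		 * not lost
		 */
		cpu_dma_wbinval_range(start & ~(line - 1),
				      ALIGN(start, line));
		start = ALIGN(start, line);
	}
	if (!IS_ALIGNED(end, line)) {
		/* same for a partial line at the end */
		cpu_dma_wbinval_range(end & ~(line - 1),
				      ALIGN(end, line));
		end &= ~(line - 1);
	}
	if (start < end)
		cpu_dma_inval_range(start, end);
}
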
Arnd