Date:   Thu, 9 Nov 2017 11:14:56 +0100
From:   Arnd Bergmann <>
To:     Greentime Hu <>
Cc:     Greentime <>,
        Linux Kernel Mailing List <>,
        linux-arch <>,
        Thomas Gleixner <>,
        Jason Cooper <>,
        Marc Zyngier <>,
        Rob Herring <>,
        Networking <>,
        Vincent Chen <>,
Subject: Re: [PATCH 13/31] nds32: DMA mapping API

On Thu, Nov 9, 2017 at 8:12 AM, Greentime Hu <> wrote:
> 2017-11-08 17:09 GMT+08:00 Arnd Bergmann <>:
>> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu <> wrote:

>> You do the same cache operations for _to_cpu and _to_device, which
>> usually works,
>> but is more expensive than you need. It's better to take the ownership into
>> account and only do what you need.
> Like this?
> static void
> nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
>                               size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
> }
> static void
> nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
>                                  size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
> }

No, it's more complicated than that. You need to pass both the direction of the
DMA transaction and the ownership to consistent_sync(), and then do the
correct cache maintenance operation for each of the six combinations.

Which operation that is depends on the microarchitecture to some degree.
On machines that can load arbitrary cache lines during speculative
execution, you have to invalidate the caches during both
_for_device/FROM_DEVICE and _for_cpu/FROM_DEVICE. Machines
without speculative execution can skip the second invalidation: they
only need to get rid of dirty cache lines before the DMA from device.

Usually you don't have to do a writeback during _for_cpu, since there
are no dirty cache lines after the _for_device operation.
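
To illustrate, the six combinations could be laid out roughly like this.
This is only a user-space sketch, not the nds32 code: pick_cache_op(),
enum sync_owner, enum cache_op and the cpu_speculates flag are made-up
names, and the TO_DEVICE and BIDIRECTIONAL rows are my assumption of the
usual choices rather than something stated above.

```c
#include <assert.h>

/* Same values as the kernel's enum dma_data_direction, redefined here
 * so the sketch builds stand-alone outside the kernel. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
};

/* Hypothetical: who owns the buffer after the sync call. */
enum sync_owner { FOR_DEVICE, FOR_CPU };

/* Hypothetical names for the cache maintenance operations. */
enum cache_op { CACHE_NONE, CACHE_WB, CACHE_INV, CACHE_WB_INV };

/* Pick the cache operation for one (direction, ownership) pair.
 * cpu_speculates is nonzero on machines that may load arbitrary
 * cache lines during speculative execution. */
static enum cache_op
pick_cache_op(enum dma_data_direction dir, enum sync_owner owner,
	      int cpu_speculates)
{
	if (owner == FOR_DEVICE) {
		switch (dir) {
		case DMA_TO_DEVICE:
			/* assumption: device reads memory, so write dirty lines back */
			return CACHE_WB;
		case DMA_FROM_DEVICE:
			/* get rid of dirty lines before the DMA from device */
			return CACHE_INV;
		case DMA_BIDIRECTIONAL:
			/* assumption: both of the above */
			return CACHE_WB_INV;
		}
	} else {
		switch (dir) {
		case DMA_TO_DEVICE:
			/* memory was not modified by the device */
			return CACHE_NONE;
		case DMA_FROM_DEVICE:
		case DMA_BIDIRECTIONAL:
			/* never a writeback here; the second invalidate is
			 * only needed if lines may have been speculatively
			 * loaded while the device owned the buffer */
			return cpu_speculates ? CACHE_INV : CACHE_NONE;
		}
	}
	assert(!"unknown direction or owner");
	return CACHE_NONE;
}
```

A consistent_sync() that takes both arguments would then switch on the
result and issue the matching operation over the buffer range.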

It's not entirely clear what the correct behavior is for buffers that
are not cache line aligned. Some architectures use wbinval instead
of inval on any partial cache line for the _for_device/FROM_DEVICE
operation, but you wouldn't want to do that on the
_for_cpu/FROM_DEVICE operation.
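
As a rough sketch of the partial-line approach that some architectures
take for _for_device/FROM_DEVICE: split the buffer into the lines it
fully covers (plain inval) and the partially covered head and tail lines
(wbinval). The 32-byte line size, struct sync_plan and
plan_from_device() are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u	/* assumed line size, power of two */

/* Hypothetical description of how to treat one buffer. */
struct sync_plan {
	int head_partial;	/* first line only partly covered: wbinval */
	int tail_partial;	/* last line only partly covered: wbinval */
	uintptr_t inv_start;	/* start of the fully covered lines */
	uintptr_t inv_end;	/* one past the fully covered lines */
};

/* Work out which parts of [start, start + size) can be invalidated
 * outright and which share a cache line with unrelated data. */
static struct sync_plan plan_from_device(uintptr_t start, size_t size)
{
	struct sync_plan p;
	uintptr_t mask = CACHE_LINE_SIZE - 1;
	uintptr_t end = start + size;

	assert(size > 0);

	p.head_partial = (start & mask) != 0;
	p.tail_partial = (end & mask) != 0;
	p.inv_start = (start + mask) & ~mask;	/* round start up */
	p.inv_end = end & ~mask;		/* round end down */

	/* Buffer sits inside a single line: nothing is fully covered,
	 * and head and tail refer to the same line. */
	if (p.inv_end < p.inv_start)
		p.inv_end = p.inv_start;

	return p;
}
```

The caller would issue wbinval on the head and tail lines when flagged,
and inval over [inv_start, inv_end).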

