Date: Thu, 9 Nov 2017 11:14:56 +0100
From: Arnd Bergmann <arnd@...db.de>
To: Greentime Hu <green.hu@...il.com>
Cc: Greentime <greentime@...estech.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-arch <linux-arch@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Jason Cooper <jason@...edaemon.net>,
	Marc Zyngier <marc.zyngier@....com>,
	Rob Herring <robh+dt@...nel.org>,
	Networking <netdev@...r.kernel.org>,
	Vincent Chen <vincentc@...estech.com>,
	deanbo422@...il.com
Subject: Re: [PATCH 13/31] nds32: DMA mapping API

On Thu, Nov 9, 2017 at 8:12 AM, Greentime Hu <green.hu@...il.com> wrote:
> 2017-11-08 17:09 GMT+08:00 Arnd Bergmann <arnd@...db.de>:
>> On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu <green.hu@...il.com> wrote:
>>
>> You do the same cache operations for _to_cpu and _to_device, which
>> usually works, but is more expensive than you need. It's better to take
>> the ownership into account and only do what you need.
>>
> Like this?
>
> static void
> nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
>                               size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>                         DMA_FROM_DEVICE);
> }
>
> static void
> nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
>                                  size_t size, enum dma_data_direction dir)
> {
>         consistent_sync((void *)dma_to_virt(dev, handle), size,
>                         DMA_TO_DEVICE);
> }

No, it's more complicated than that. You need to pass both the direction
of the DMA transaction and the ownership to consistent_sync(), and then
do the correct cache maintenance operation for each of the six
combinations.

Which operation that is depends on the microarchitecture to some degree.
E.g. on machines that can load arbitrary cache lines during speculative
execution, you have to invalidate the caches during both
_for_device/FROM_DEVICE and _for_cpu/FROM_DEVICE, while machines without
speculative execution can skip the second invalidation; they only need
to get rid of dirty cache lines before the DMA from the device.

Usually you don't have to do a writeback during _for_cpu, since there
are no dirty cache lines after the _for_device operation.

It's not entirely clear what the correct behavior is for buffers that
are not cache line aligned. Some architectures use wbinval instead of
inval for the _for_device/FROM_DEVICE operation on any partial cache
line, but you wouldn't want to do that in the _for_cpu/FROM_DEVICE
operation.

       Arnd
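A minimal sketch of the scheme described above, assuming a CPU that can
speculatively load cache lines: consistent_sync() takes both the DMA
direction and an ownership phase and picks one of the six cache
maintenance operations. The sync_owner enum and the
cpu_dcache_wb_range()/cpu_dcache_inval_range()/cpu_dcache_wbinval_range()
helpers are hypothetical names standing in for the port's
writeback/invalidate primitives, not the actual nds32 interface.

/*
 * Sketch only; assumes <linux/dma-mapping.h> for enum dma_data_direction.
 * FOR_CPU/FOR_DEVICE is a hypothetical ownership marker, not a kernel API.
 */
enum sync_owner { FOR_CPU, FOR_DEVICE };

static void consistent_sync(void *vaddr, size_t size,
			    enum dma_data_direction dir,
			    enum sync_owner owner)
{
	unsigned long start = (unsigned long)vaddr;
	unsigned long end = start + size;

	if (owner == FOR_DEVICE) {
		switch (dir) {
		case DMA_TO_DEVICE:
			/* Device will read: write back dirty lines. */
			cpu_dcache_wb_range(start, end);
			break;
		case DMA_FROM_DEVICE:
			/*
			 * Device will write: discard lines so dirty data
			 * cannot be written back over the DMA data; wbinval
			 * may be safer for partial lines at the buffer edges.
			 */
			cpu_dcache_inval_range(start, end);
			break;
		case DMA_BIDIRECTIONAL:
			cpu_dcache_wbinval_range(start, end);
			break;
		default:
			BUG();
		}
	} else {	/* FOR_CPU */
		switch (dir) {
		case DMA_TO_DEVICE:
			/* Nothing: the device only read the buffer. */
			break;
		case DMA_FROM_DEVICE:
		case DMA_BIDIRECTIONAL:
			/*
			 * Discard lines speculatively loaded while the
			 * device owned the buffer. No writeback is needed:
			 * nothing is dirty after the _for_device operation.
			 * Non-speculating CPUs could skip this invalidation.
			 */
			cpu_dcache_inval_range(start, end);
			break;
		default:
			BUG();
		}
	}
}

The sync_single callbacks would then forward both pieces of information
instead of hardcoding one direction:

static void
nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
			      size_t size, enum dma_data_direction dir)
{
	consistent_sync((void *)dma_to_virt(dev, handle), size,
			dir, FOR_CPU);
}

static void
nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
				 size_t size, enum dma_data_direction dir)
{
	consistent_sync((void *)dma_to_virt(dev, handle), size,
			dir, FOR_DEVICE);
}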