Date:   Mon, 14 Sep 2020 14:45:22 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Huazhong Tan <tanhuazhong@...wei.com>
Cc:     <davem@...emloft.net>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <salil.mehta@...wei.com>,
        <yisen.zhuang@...wei.com>, <linuxarm@...wei.com>,
        Yunsheng Lin <linyunsheng@...wei.com>
Subject: Re: [PATCH net-next 5/6] net: hns3: use writel() to optimize the
 barrier operation

On Mon, 14 Sep 2020 20:06:56 +0800 Huazhong Tan wrote:
> From: Yunsheng Lin <linyunsheng@...wei.com>
> 
> writel() can be used to order I/O vs memory by default when
> writing portable drivers. Use writel() to replace wmb() +
> writel_relaxed(), and writel() is dma_wmb() + writel_relaxed()
> for ARM64, so there is an optimization here because dma_wmb()
> is a lighter barrier than wmb().

Cool, although lots of drivers will need a change like this now. 

And it looks like memory-barriers.txt is slightly, eh, not coherent there,
between the documentation of writeX() and dma_wmb() :S

	3. A writeX() by a CPU thread to the peripheral will first wait for the
	   completion of all prior writes to memory either issued by, or
	   propagated to, the same thread. This ensures that writes by the CPU
	   to an outbound DMA buffer allocated by dma_alloc_coherent() will be
	   visible to a DMA engine when the CPU writes to its MMIO control
	   register to trigger the transfer.



 (*) dma_wmb();
 (*) dma_rmb();

     These are for use with consistent memory to guarantee the ordering
     of writes or reads of shared memory accessible to both the CPU and a
     DMA capable device.

     For example, consider a device driver that shares memory with a device
     and uses a descriptor status value to indicate if the descriptor belongs
     to the device or the CPU, and a doorbell to notify it when new
     descriptors are available:

	if (desc->status != DEVICE_OWN) {
		/* do not read data until we own descriptor */
		dma_rmb();

		/* read/modify data */
		read_data = desc->data;
		desc->data = write_data;

		/* flush modifications before status update */
		dma_wmb();

		/* assign ownership */
		desc->status = DEVICE_OWN;

		/* notify device of new descriptors */
		writel(DESC_NOTIFY, doorbell);
	}

     The dma_rmb() allows us to guarantee the device has released ownership
     before we read the data from the descriptor, and the dma_wmb() allows
     us to guarantee the data is written to the descriptor before the device
     can see it now has ownership.  Note that, when using writel(), a prior
     wmb() is not needed to guarantee that the cache coherent memory writes
     have completed before writing to the MMIO region.  The cheaper
     writel_relaxed() does not provide this guarantee and must not be used
     here.
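
For anyone who wants to poke at the ordering outside the kernel, the
descriptor hand-off quoted above can be loosely modelled in userspace with
C11 atomics. This is only a sketch under assumptions: the model_*() helpers,
struct desc, doorbell_reg and cpu_post() are hypothetical names made up for
illustration, and the C11 fences are stand-ins that approximate, but are not,
the kernel's dma_rmb(), dma_wmb() and writel().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, NOT the kernel implementations. */
#define CPU_OWN     0u
#define DEVICE_OWN  1u
#define DESC_NOTIFY 1u

struct desc {
	uint32_t data;            /* payload shared with the "device" */
	_Atomic uint32_t status;  /* ownership flag */
};

/* Models the MMIO doorbell register (assumption: plain memory here). */
static _Atomic uint32_t doorbell_reg;

/* Assumption: acquire/release fences approximate dma_rmb()/dma_wmb(). */
static inline void model_dma_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

static inline void model_dma_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void model_writel_relaxed(uint32_t val, _Atomic uint32_t *reg)
{
	atomic_store_explicit(reg, val, memory_order_relaxed);
}

/* Models writel() as "barrier + writel_relaxed()", as the patch describes. */
static inline void model_writel(uint32_t val, _Atomic uint32_t *reg)
{
	atomic_thread_fence(memory_order_release);
	model_writel_relaxed(val, reg);
}

/*
 * Mirrors the documentation example: returns 1 if the descriptor was
 * updated and handed to the device, 0 if the device still owns it.
 */
static int cpu_post(struct desc *d, uint32_t write_data, uint32_t *read_data)
{
	if (atomic_load_explicit(&d->status, memory_order_relaxed) == DEVICE_OWN)
		return 0;

	/* do not read data until we own the descriptor */
	model_dma_rmb();

	/* read/modify data */
	*read_data = d->data;
	d->data = write_data;

	/* flush modifications before status update */
	model_dma_wmb();

	/* assign ownership */
	atomic_store_explicit(&d->status, DEVICE_OWN, memory_order_relaxed);

	/* notify device; no separate wmb()-like fence before the doorbell */
	model_writel(DESC_NOTIFY, &doorbell_reg);
	return 1;
}
```

Calling cpu_post() on a descriptor initialised with atomic_init(&d.status,
CPU_OWN) hands it over and rings the doorbell; a second call returns 0 and
leaves the data alone because the device now owns the descriptor. The point
of the model is only that the doorbell store carries its own ordering, so no
extra full barrier sits between the status update and the notify.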
