lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 15 Feb 2019 11:10:07 +0000
From:   Russell King - ARM Linux admin <linux@...linux.org.uk>
To:     Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        netdev@...r.kernel.org
Subject: Re: [REGRESSION 4.20] mvneta - DMA-API: device driver tries to sync
 DMA memory it has not allocated

On Tue, Feb 12, 2019 at 01:59:19PM +0000, Russell King - ARM Linux admin wrote:
> Hi,
> 
> Booting 4.20 on SolidRun Clearfog reliably provokes the following
> warning - this is with mvneta built in, but DSA as modules:
> 
> WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
> mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
> Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
> CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
> Hardware name: Marvell Armada 380/385 (Device Tree)
> [<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
> [<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
> [<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
> [<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
> [<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
> [<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
> [<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
> [<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
> [<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
> [<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
> [<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
> [<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
> [<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
> ...
> 
> This appears to be from:
> 
>                 if (rx_bytes <= rx_copybreak) {
>                         /* better copy a small frame and not unmap the DMA region */
>                         skb = netdev_alloc_skb_ip_align(dev, rx_bytes);
>                         if (unlikely(!skb))
>                                 goto err_drop_frame_ret_pool;
> 
>                         dma_sync_single_range_for_cpu(dev->dev.parent,
>                                                       rx_desc->buf_phys_addr,
>                                                       MVNETA_MH_SIZE + NET_SKB_PAD,
>                                                       rx_bytes,
>                                                       DMA_FROM_DEVICE);
> 
> which suggests that rx_desc->buf_phys_addr is not something that should
> be passed to dma_sync_single_range_for_cpu().  I've not been able to
> track down why that is, nor which interface is provoking that.
> 
> As I don't have the details of how the buffer management hardware works
> on Armada 388, I'm unable to debug this myself.

Doing what debugging I _can_ do, it seems that this has been a long-term
error in mvneta, but one that was merely uncovered by:

  commit 562e2f467e71f45f0400ebee5077eaa426d3e426
  Author: Yelena Krivosheev <yelena@...vell.com>
  Date:   Wed Jul 18 18:10:57 2018 +0200

The buffer that is being complained about is sync'd using a device of
dev->dev.parent 'f1070000.ethernet', but is allocated by
mvneta_bm_construct() against a different device:

mvneta_bm_construct: 0x2dd85c00 +0x140 for ee113294 (f10c8000.bm)

namely 'f10c8000.bm'.

It's long-term, because it will only trigger in older kernels if we
hit the copy-break stuff, which used to do:

	if (rx_bytes <= rx_copybreak) {
		skb = netdev_alloc_skb_ip_align(dev, rx_bytes);
		if (unlikely(!skb)) {
			...
		}
		
		dma_sync_single_range_for_cpu(dev->dev.parent,
					      phys_addr,
					      MVNETA_MH_SIZE + NET_SKB_PAD,
					      rx_bytes,
					      DMA_FROM_DEVICE);

where rx_copybreak is 256 bytes.  Quite why that hasn't been seen
already, I do not know.

Looking at the code after the commit, if mvneta is used on a
non-coherent platform, then we have problems:

	copy_size = min(skb_size, rx_bytes);
	...
	memcpy(rxq->skb->data, data + MVNETA_MH_SIZE,
	       copy_size);
	...
	if (rxq->left_size == 0) {
		int size = copy_size + MVNETA_MH_SIZE;
		dma_sync_single_range_for_cpu(dev->dev.parent,
					      phys_addr, 0,
					      size,
					      DMA_FROM_DEVICE);

Since the sync is done _after_ we've copied data from a non-coherent
buffer.

If this code has been written to assume that we're always coherent,
then is there any point at all to having the incorrect dma_sync_*()
calls at all?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Powered by blists - more mailing lists