Date:   Thu, 6 Oct 2016 09:55:27 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Russell King - ARM Linux <linux@...linux.org.uk>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: DMA-API: cpu touching an active dma mapped cacheline

On Thu, Oct 6, 2016 at 8:34 AM, Russell King - ARM Linux
<linux@...linux.org.uk> wrote:
> Hi,
>
> With DMA API debugging enabled, I'm seeing this splat from it, which to
> me looks like the DMA API debugging is getting too eager for its own
> good.
>
> The fact of the matter is that the VM passes block devices pages to be
> written out to disk which are page cache pages, which may be looked up
> and written to by write() syscalls and via mmap() mappings.  For example,
> take the case of a writable shared mapping of a page backed by a file on
> a disk - the VM will periodically notice that the page has been dirtied,
> and schedule a writeout to disk.  The disk driver has no idea that the
> page is still mapped - and arguably it doesn't matter.
>
> So, IMHO this whole "the CPU is touching a DMA mapped buffer" is quite
> bogus given our VM behaviour: we have never guaranteed exclusive access
> to DMA buffers.
>
> I don't see any maintainer listed for lib/dma-debug.c, but I see the
> debug_dma_assert_idle() stuff was introduced by Dan via akpm in 2014.

Hmm, there are benign cases where this happens, but there are also ones
that lead to data corruption, as was the case with the NET_DMA receive
offload.  Perhaps this change is enough to distinguish between the two
cases:

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index fcfa1939ac41..dd18235097d0 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -597,7 +597,7 @@ void debug_dma_assert_idle(struct page *page)
        }
        spin_unlock_irqrestore(&radix_lock, flags);

-       if (!entry)
+       if (!entry || entry->direction != DMA_FROM_DEVICE)
                return;

        cln = to_cacheline_number(entry);

...because the problem in the NET_DMA case was that the engine was
writing to a page that the process no longer cared about, because the CPU
had written to it, causing a COW copy to be established.  In the disk
DMA case it's fine if the DMA is acting on stale results in a
DMA_TO_DEVICE operation.
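
To make the effect of the one-line change concrete, here is a minimal
user-space sketch of the early-return condition from the diff above.  It
is not kernel code: the model_* names are hypothetical stand-ins, with
the enum mirroring the values of the kernel's enum dma_data_direction,
and the struct reduced to the one field the check reads.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's enum dma_data_direction
 * (same numeric values, different names so nothing here is mistaken
 * for the real definition).
 */
enum model_dma_direction {
	MODEL_DMA_BIDIRECTIONAL = 0,
	MODEL_DMA_TO_DEVICE = 1,
	MODEL_DMA_FROM_DEVICE = 2,
	MODEL_DMA_NONE = 3,
};

/* Hypothetical, reduced model of a dma-debug entry. */
struct model_dma_entry {
	enum model_dma_direction direction;
};

/*
 * Mirrors the patched condition: return false (skip the overlap
 * check, so no warning) when there is no active mapping or when the
 * mapping is anything other than device-to-memory.
 */
static bool model_should_check_overlap(const struct model_dma_entry *entry)
{
	if (!entry || entry->direction != MODEL_DMA_FROM_DEVICE)
		return false;
	return true;
}
```

With this condition a DMA_TO_DEVICE mapping (the disk writeout case) is
never checked, while a DMA_FROM_DEVICE mapping (the NET_DMA receive case)
still is.  Note that, as written, the condition also skips
DMA_BIDIRECTIONAL mappings.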
