linux-kernel - Re: DMA-API: cpu touching an active dma mapped cacheline

Message-ID: <20161006175722.GU1041@n2100.armlinux.org.uk>
Date:   Thu, 6 Oct 2016 18:57:22 +0100
From:   Russell King - ARM Linux <linux@...linux.org.uk>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: DMA-API: cpu touching an active dma mapped cacheline

On Thu, Oct 06, 2016 at 09:55:27AM -0700, Dan Williams wrote:
> On Thu, Oct 6, 2016 at 8:34 AM, Russell King - ARM Linux
> <linux@...linux.org.uk> wrote:
> > Hi,
> >
> > With DMA API debugging enabled, I'm seeing this splat from it, which to
> > me looks like the DMA API debugging is getting too eager for its own
> > good.
> >
> > The fact of the matter is that the VM passes block devices pages to be
> > written out to disk which are page cache pages, which may be looked up
> > and written to by write() syscalls and via mmap() mappings.  For example,
> > take the case of a writable shared mapping of a page backed by a file on
> > a disk - the VM will periodically notice that the page has been dirtied,
> > and schedule a writeout to disk.  The disk driver has no idea that the
> > page is still mapped - and arguably it doesn't matter.
> >
> > So, IMHO this whole "the CPU is touching a DMA mapped buffer" is quite
> > bogus given our VM behaviour: we have never guaranteed exclusive access
> > to DMA buffers.
> >
> > I don't see any maintainer listed for lib/dma-debug.c, but I see the
> > debug_dma_assert_idle() stuff was introduced by Dan via akpm in 2014.
> 
> Hmm, there are benign cases where this happens, but there are also ones
> that lead to data corruption as was the case with the NET_DMA receive
> offload.  Perhaps this change is enough to distinguish between the two
> cases:
> 
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index fcfa1939ac41..dd18235097d0 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -597,7 +597,7 @@ void debug_dma_assert_idle(struct page *page)
>         }
>         spin_unlock_irqrestore(&radix_lock, flags);
> 
> -       if (!entry)
> +       if (!entry || entry->direction != DMA_FROM_DEVICE)
>                 return;
> 
>         cln = to_cacheline_number(entry);
> 
> ...because the problem in the NET_DMA case was that the engine was
> writing to a page that the process no longer cared about because the cpu
> had written to it causing a cow copy to be established.  In the disk
> DMA case it's fine if the DMA is acting on stale results in a
> DMA_TO_DEVICE operation.

Yes, that seems to avoid the warning for me from an initial test - I'm
not sure how reproducible it is yet though.

Thanks for the patch.
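
For reference, the decision the patched check makes can be modelled in
userspace roughly as below.  This is only a sketch: the model_* names
are hypothetical stand-ins, not the real lib/dma-debug.c structures, and
the real function goes on to look up the cacheline before reporting
anything.  Only the enum values are taken from the kernel's
enum dma_data_direction.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as the kernel's enum dma_data_direction. */
enum model_dma_direction {
	MODEL_DMA_BIDIRECTIONAL = 0,
	MODEL_DMA_TO_DEVICE = 1,
	MODEL_DMA_FROM_DEVICE = 2,
	MODEL_DMA_NONE = 3,
};

/* Hypothetical stand-in for a dma-debug entry; only the field we need. */
struct model_entry {
	enum model_dma_direction direction;
};

/*
 * Model of the early-return test in the patched debug_dma_assert_idle():
 * a CPU write to a page is only worth reporting when there is an active
 * mapping and the device may be writing into that page.
 */
static bool model_cpu_write_should_warn(const struct model_entry *entry)
{
	if (!entry || entry->direction != MODEL_DMA_FROM_DEVICE)
		return false;	/* no mapping, or not a device-writes case */

	return true;
}
```

So a page cache page under writeback (DMA_TO_DEVICE) no longer trips
the check, while the NET_DMA style receive case (DMA_FROM_DEVICE) still
does.  Note that with the test written this way, DMA_BIDIRECTIONAL
mappings are skipped as well.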

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.