lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 6 Aug 2009 15:06:23 +0200
From:	Laurent Pinchart <laurent.pinchart@...asonboard.com>
To:	Ben Dooks <ben-linux@...ff.org>
Cc:	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Robin Holt <holt@....com>, linux-kernel@...r.kernel.org,
	"v4l2_linux" <linux-media@...r.kernel.org>,
	linux-arm-kernel@...ts.arm.linux.org.uk
Subject: Re: How to efficiently handle DMA and cache on ARMv7 ? (was "Is get_user_pages() enough to prevent pages from being swapped out ?")

Hi Ben,

On Thursday 06 August 2009 13:46:19 Ben Dooks wrote:
> On Thu, Aug 06, 2009 at 12:08:21PM +0200, Laurent Pinchart wrote:
[snip]
> >
> > The second problem is to ensure cache coherency. As the userspace
> > application will read data from the video buffers, those buffers will end
> > up being cached in the processor's data cache. The driver does need to
> > invalidate the cache before starting the DMA operation (userspace could
> > in theory write to the buffers, but the data will be overwritten by DMA
> > anyway, so there's no need to clean the cache).
>
> You'll need to clean the write buffers, otherwise the CPU may have data
> queued that it has yet to write back to memory.

Good points, thanks.

> > As the cache is of the VIPT (Virtual Index Physical Tag) type, cache
> > invalidation can either be done globally (in which case the cache is
> > flushed instead of being invalidated) or based on virtual addresses. In
> > the last case the processor will need to look physical addresses up,
> > either in the TLB or through hardware table walk.
> >
> > I can see three solutions to the DMA/cache problem.
> >
> > 1. Flushing the whole data cache right before starting the DMA transfer.
> > There's no API for that in the ARM architecture, so a whole I+D cache is
> > required. This is quite costly, we're talking about around 30 flushes per
> > second, but it doesn't involve the MMU. That's the solution that I
> > currently use.
> >
> > 2. Invalidating only the cache lines that store video buffer data. This
> > requires a TLB lookup or a hardware table walk, so the userspace
> > application MM context needs to be available (no problem there as where's
> > flushing in userspace context) and all pages need to be mapped properly.
> > This can be a problem as, as Hugh pointed out, pages can still be
> > unmapped from the userspace context after get_user_pages() returns. I
> > have experienced one oops due to a kernel paging request failure:
>
> If you already know the virtual addresses of the buffers, why do you need
> a TLB lookup (or am I being dense here?)

The virtual address is used to compute the cache lines index, and the physical 
address is then used when comparing the cache line tag. So the processor (or 
actually the CP15 coprocessor if I'm not wrong) does a TLB lookup to get the 
physical address during cache invalidation/flushing.

> >         Unable to handle kernel paging request at virtual address
> > 44e12000 pgd = c8698000
> >         [44e12000] *pgd=8a4fd031, *pte=8cfda1cd, *ppte=00000000
> >         Internal error: Oops: 817 [#1] PREEMPT
> >         PC is at v7_dma_inv_range+0x2c/0x44
> >
> > Fixing this requires more investigation, and I'm not sure how to proceed
> > to find out if the page fault is really caused by pages being unmapped
> > from the userspace context. Help would be appreciated.
> >
> > 3. Mark the pages as non-cacheable. Depending on how the buffers are then
> > used by userspace, the additional cache misses might destroy any benefit
> > I would get from not flushing the cache before DMA. I'm not sure how to
> > mark a bunch of pages as non-cacheable though. What usually happens is
> > that video drivers allocate DMA-coherent memory themselves, but in this
> > case I need to deal with an arbitrary buffer allocated by userspace. If
> > someone has any experience with this, it would be appreciated.

Regards,

Laurent Pinchart

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ