Message-ID: <20110428145816.GR17290@n2100.arm.linux.org.uk>
Date:	Thu, 28 Apr 2011 15:58:16 +0100
From:	Russell King - ARM Linux <linux@....linux.org.uk>
To:	Arnd Bergmann <arnd@...db.de>
Cc:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	'Benjamin Herrenschmidt' <benh@...nel.crashing.org>,
	linaro-mm-sig@...ts.linaro.org, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [Linaro-mm-sig] [RFC] ARM DMA mapping TODO, v1

On Thu, Apr 28, 2011 at 04:39:59PM +0200, Arnd Bergmann wrote:
> On Thursday 28 April 2011, Russell King - ARM Linux wrote:
> > On Thu, Apr 28, 2011 at 04:29:52PM +0200, Arnd Bergmann wrote:
> > > Given that people still want to have an interface that does what I
> > > thought this one did, I guess we have two options:
> > > 
> > > * Kill off dma_cache_sync and replace it with calls to dma_sync_*
> > >   so we can start using dma_alloc_noncoherent on ARM
> > 
> > I don't think this is an option as dma_sync_*() is part of the streaming
> > DMA mapping API (dma_map_*) which participates in the idea of buffer
> > ownership, which the noncoherent API doesn't appear to.
> 
> I thought the problem was in fact that the noncoherent API cannot be
> implemented on architectures like ARM specifically because there is
> no concept of buffer ownership. The obvious way to fix that would
> be to redefine the API. What am I missing?

You are partially correct.  With the streaming interface, we're fairly
strict with the buffer ownership stuff, as the most effective way to
implement it across all our CPUs is to deal with the mapping, sync and
unmapping in terms of buffers being passed from CPU control to DMA device
control and back again.

With the noncoherent interface, there is less of a buffer ownership idea.
For instance, to read from a noncoherent buffer, the following is required
(in order; I'm ignoring the effects of weakly ordered memory accesses):

	/* dma happens, signalled complete */
	dma_cache_invalidate(buffer, size);
	/* cpu can now see up to date data */
	message = *buffer;

Unlike the streaming API, we don't need to hand the buffer back to the
device before the CPU can repeat the above code sequence.

If we want to write to a noncoherent buffer, then we need:

	*buffer = value;
	dma_cache_writeback(buffer, size);
	/* dma can only now see new value */

and again, the same thing applies.

There is an additional problem lurking in amongst this though: code using
a buffer which is both read and written by the CPU has to be extremely
careful about cache writebacks. This, for instance, would not be legal:

	*buffer = value;
	...
	/* dma from device */
	dma_cache_invalidate(buffer, size);
	message = *buffer;

as it is not predictable whether we'll see 'value' or the DMA data - that
depends on the relative ordering of the DMA writing to RAM vs the cache
eviction of the CPU write.
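
To make the race concrete, here is a minimal user-space sketch of it. This is a toy model of my own, not kernel code: one word of RAM shadowed by a single write-back cache line. Every name in it (toy_line, cpu_write, evict, illegal_sequence and so on) is hypothetical, and the evict_after_dma flag stands in for hardware timing that real code cannot control.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one word of RAM behind one write-back line. */
struct toy_line {
	int  ram;	/* what the DMA device reads and writes */
	int  cached;	/* the CPU's cached copy */
	bool valid;	/* the cache currently holds a copy */
	bool dirty;	/* the cached copy is newer than RAM */
};

/* CPU store: lands in the cache only and marks the line dirty. */
void cpu_write(struct toy_line *l, int v)
{
	l->cached = v;
	l->valid = true;
	l->dirty = true;
}

/* CPU load: refills the line from RAM on a miss. */
int cpu_read(struct toy_line *l)
{
	if (!l->valid) {
		l->cached = l->ram;
		l->valid = true;
		l->dirty = false;
	}
	return l->cached;
}

/* Hardware eviction at an arbitrary time: write back if dirty, then drop. */
void evict(struct toy_line *l)
{
	if (l->valid && l->dirty)
		l->ram = l->cached;
	l->valid = false;
	l->dirty = false;
}

/* The device writes straight to RAM, bypassing the cache. */
void dma_write(struct toy_line *l, int v)
{
	l->ram = v;
}

/* Invalidate: discard the cached copy without writing it back. */
void toy_invalidate(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}

/*
 * The illegal sequence: CPU write, DMA from device, invalidate, CPU read.
 * evict_after_dma picks when the hardware happens to evict the dirty line.
 */
int illegal_sequence(bool evict_after_dma)
{
	struct toy_line l = { 0 };

	cpu_write(&l, 1);		/* *buffer = value; */
	if (!evict_after_dma)
		evict(&l);		/* eviction wins the race */
	dma_write(&l, 2);		/* dma from device */
	if (evict_after_dma)
		evict(&l);		/* stale CPU data overwrites DMA data */
	toy_invalidate(&l);
	return cpu_read(&l);		/* message = *buffer; */
}
```

In this model illegal_sequence(false) reads back the DMA data (2) while illegal_sequence(true) reads back the CPU's own earlier write (1), which is exactly the unpredictability described above.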

So, there is a kind of buffer ownership here:

	/* cpu owns */
	dma_cache_writeback(buffer, size);
	/* dma owns */
	dma_cache_invalidate(buffer, size);
	/* cpu owns */

but as shown above it doesn't need to be as strict as the streaming API.
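
The ownership hand-over can be sketched the same way in user space. Again this is only a toy model under my own assumptions: toy_buf, toy_writeback, toy_invalidate and toy_device_run are hypothetical stand-ins, not the kernel's noncoherent API, and the "device" simply replies with request + 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one word of RAM plus its cached copy. */
struct toy_buf {
	int  ram;	/* memory as seen by the device */
	int  cache;	/* the CPU's cached copy */
	bool valid;
	bool dirty;
};

/* CPU store: goes to the cache and marks it dirty. */
void toy_cpu_store(struct toy_buf *b, int v)
{
	b->cache = v;
	b->valid = true;
	b->dirty = true;
}

/* CPU load: refills from RAM only if the cache holds no copy. */
int toy_cpu_load(struct toy_buf *b)
{
	if (!b->valid) {
		b->cache = b->ram;
		b->valid = true;
		b->dirty = false;
	}
	return b->cache;
}

/* Writeback: push dirty data to RAM so the device can see it. */
void toy_writeback(struct toy_buf *b)
{
	if (b->valid && b->dirty) {
		b->ram = b->cache;
		b->dirty = false;
	}
}

/* Invalidate: drop the cached copy so the next load goes to RAM. */
void toy_invalidate(struct toy_buf *b)
{
	b->valid = false;
	b->dirty = false;
}

/* Pretend device: reads the request from RAM and writes back request + 1. */
void toy_device_run(struct toy_buf *b)
{
	b->ram = b->ram + 1;
}

/* One full hand-over; do_invalidate lets us see what skipping it costs. */
int toy_round_trip(int request, bool do_invalidate)
{
	struct toy_buf b = { 0 };

	/* cpu owns */
	toy_cpu_store(&b, request);
	toy_writeback(&b);
	/* dma owns */
	toy_device_run(&b);
	if (do_invalidate)
		toy_invalidate(&b);
	/* cpu owns */
	return toy_cpu_load(&b);
}
```

With the invalidate in place, toy_round_trip(41, true) returns the device's reply, 42. Without it, toy_round_trip(41, false) returns the stale cached 41.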

Also note that there's a problem lurking here with DMA cache line size:

| int
| dma_get_cache_alignment(void)
| 
| Returns the processor cache alignment.  This is the absolute minimum
| alignment *and* width that you must observe when either mapping
| memory or doing partial flushes.
| 
| Notes: This API may return a number *larger* than the actual cache
| line, but it will guarantee that one or more cache lines fit exactly
| into the width returned by this call.  It will also always be a power
| of two for easy alignment.

$ grep -L dma_get_cache_alignment $(grep dma_alloc_noncoherent drivers/ -lr)
drivers/base/dma-mapping.c
drivers/scsi/sgiwd93.c
drivers/scsi/53c700.c
drivers/net/au1000_eth.c
drivers/net/sgiseeq.c
drivers/net/lasi_82596.c
drivers/video/au1200fb.c

so we have a bunch of drivers which presumably don't take any notice of
the DMA cache line size, which may be very important.  53c700 for instance
aligns its buffers using L1_CACHE_ALIGN(), which may be smaller than
what's actually required...
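
As a rough illustration of why that matters, here is a user-space sketch with made-up numbers. The 32-byte L1 line and the 128-byte DMA alignment are assumptions chosen for the example, toy_dma_cache_alignment() is a hypothetical stand-in for dma_get_cache_alignment(), and nothing here is taken from 53c700 itself.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define TOY_L1_CACHE_BYTES	32u

/* Hypothetical stand-in for dma_get_cache_alignment(). */
size_t toy_dma_cache_alignment(void)
{
	return 128;
}

/* Round n up to a multiple of align, which must be a power of two. */
size_t toy_align_up(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/*
 * A sub-buffer is safe for partial flushes only if both its offset and
 * its length are multiples of the DMA cache alignment; otherwise it
 * shares a flush granule with neighbouring data.
 */
int toy_is_dma_safe(size_t offset, size_t len)
{
	size_t a = toy_dma_cache_alignment();

	return offset % a == 0 && len % a == 0;
}
```

Under these assumptions a 40-byte structure rounds up to 64 bytes with the L1 line size but needs 128 bytes with the DMA alignment, so a second buffer packed at offset 64 would share a flush granule with the first: toy_is_dma_safe(64, 64) is false while toy_is_dma_safe(128, 128) is true.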