Date:	Mon, 28 Apr 2008 08:07:00 -0700
From:	Arjan van de Ven <arjan@...radead.org>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
Cc:	Jeff Garzik <jeff@...zik.org>, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	"David S. Miller" <davem@...emloft.net>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [patch] x86, voyager: fix ioremap_nocache()

On Mon, 28 Apr 2008 10:29:08 -0400
James Bottomley <James.Bottomley@...senPartnership.com> wrote:

> On Mon, 2008-04-28 at 07:10 -0700, Arjan van de Ven wrote:
> > On Sun, 27 Apr 2008 18:39:24 -0400
> > Jeff Garzik <jeff@...zik.org> wrote:
> > 
> > > James Bottomley wrote:
> > > > Here's another piece of the x86 API that's designed to be
> > > > cached. The dma_declare_coherent_memory() usually represents
> > > > behind bridge memory that's fully participatory in the
> > > > coherence model.
> > > > 
> > > > Making it uncached damages the utility of this memory because
> > > > cacheline-sized burst cycles to it, when needed, are far
> > > > faster than individual byte/word/quad writes.
> > > > 
> > > > Signed-off-by: James Bottomley
> > > > <James.Bottomley@...senPartnership.com>
> > > > 
> > > > ---
> > > > 
> > > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> > > > index 388b113..df83ffd 100644
> > > > --- a/arch/x86/kernel/pci-dma.c
> > > > +++ b/arch/x86/kernel/pci-dma.c
> > > > @@ -214,7 +214,7 @@ int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
> > > >  	/* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */
> > > > -	mem_base = ioremap(bus_addr, size);
> > > > +	mem_base = ioremap_cache(bus_addr, size);
> > > >  	if (!mem_base)
> > > >  		goto out;
> > 
> > this patch is likely broken on x86; or rather, anyone who
> > uses it is... thinking you can find cache coherent memory on a PCI
> > or similar bus that is actually cacheable... keep dreaming. (for
> > now; there's talk about extending PCI)
> 
> No ... it works for me, and caching is a performance advantage for me
> too.  The only current consumer of this API is the NCR_Q720 SCSI card
> which keeps a bunch of cacheable memory remote across the MCA bus.
> 
> If you think about it logically, most busses are second-class citizens in
> the caching hierarchy: they really only get to force a flush and
> invalidate of the CPU cache line rather than being fully
> participatory in the coherence protocol. 


Cached means that the CPU can, at any time, do a speculative read of the memory.
It also means that the CPU can write the speculated cacheline back at any later time,
for instance if some speculated path was going to write to the cacheline but never actually executed.
(Before you think this is bogus: at least AMD CPUs do this, and I can't vouch for Intel
CPUs never doing it.)
If the on-the-bus hardware *ever* writes to the memory without being part of the full
cache coherence protocol, it's in trouble. Big time.
Even if it sends an invalidate first (which PCI and others just don't allow; not sure about MCA though),
it's not enough, because the CPU can just read the line right back. One needs a "take for ownership", not
an "invalidate", for this to work, and that means being part of the full protocol.
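
To make the ordering problem concrete, here is a minimal toy model in plain
userspace C, not kernel code and not how any real CPU or bus is implemented.
All names in it (toy_system, run_scenario, the three modes) are hypothetical
and exist only for illustration. It assumes, as described above, that the CPU
may speculatively read a line and may later write that same line back.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64
#define OLD_DATA  0x11	/* what memory holds before the device writes */
#define NEW_DATA  0x22	/* what the device wants memory to hold */

/* Hypothetical toy model: one line of bus memory, one CPU cache line. */
struct toy_system {
	uint8_t memory[LINE_SIZE];	/* memory behind the bridge */
	uint8_t cache[LINE_SIZE];	/* CPU's private copy of the line */
	bool cache_valid;		/* does the CPU hold a copy? */
};

enum device_mode {
	NO_COHERENCE,		/* device just writes memory */
	INVALIDATE_ONLY,	/* device invalidates, then writes */
	TAKE_OWNERSHIP,		/* CPU may not re-read until the write lands */
};

/* CPU speculatively pulls the line into its cache. */
static void cpu_speculative_read(struct toy_system *s)
{
	memcpy(s->cache, s->memory, LINE_SIZE);
	s->cache_valid = true;
}

/* CPU writes its (possibly stale) copy of the line back to memory. */
static void cpu_writeback(struct toy_system *s)
{
	if (s->cache_valid)
		memcpy(s->memory, s->cache, LINE_SIZE);
}

/* Returns the first byte of memory after the device write and CPU writeback. */
static uint8_t run_scenario(enum device_mode mode)
{
	struct toy_system s = { 0 };

	s.memory[0] = OLD_DATA;
	cpu_speculative_read(&s);		/* CPU now caches OLD_DATA */

	switch (mode) {
	case NO_COHERENCE:
		s.memory[0] = NEW_DATA;		/* CPU copy silently goes stale */
		break;
	case INVALIDATE_ONLY:
		s.cache_valid = false;		/* device invalidates the line */
		cpu_speculative_read(&s);	/* CPU reads it right back */
		s.memory[0] = NEW_DATA;		/* write lands after the re-read */
		break;
	case TAKE_OWNERSHIP:
		s.cache_valid = false;		/* device owns the line */
		s.memory[0] = NEW_DATA;		/* write lands while CPU is held off */
		cpu_speculative_read(&s);	/* CPU re-reads only afterwards */
		break;
	}

	cpu_writeback(&s);			/* the "at any time later" writeback */
	return s.memory[0];
}
```

In this model run_scenario(NO_COHERENCE) and run_scenario(INVALIDATE_ONLY)
both end with the old data in memory, i.e. the device's write is lost, while
run_scenario(TAKE_OWNERSHIP) keeps the new data. The interleavings are picked
by hand to show the bad case; real hardware only has to hit them once.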


> However, even being second
> class is enough of a speed up on slow busses because it allows
> bursting of the cache line for the bus transfers.

that's write combining, not cacheability, I suspect... at least for writes it is.
> 
> The other consumers are SoC embedded ... so yes, perhaps I should ask
> about this on linux-arch.

x86 SoC? (We're talking about an arch/x86 file here.)
Especially in SoCs, people don't bother doing cache coherency if they can get away
with it.



-- 
If you want to reach me at my work email, use arjan@...ux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org