Message-ID: <787b0d920609112130v2d855023ief2457942736ccfd@mail.gmail.com>
Date:	Tue, 12 Sep 2006 00:30:06 -0400
From:	"Albert Cahalan" <acahalan@...il.com>
To:	benh@...nel.crashing.org, jbarnes@...tuousgeek.org,
	alan@...rguk.ukuu.org.uk, davem@...emloft.net, jeff@...zik.org,
	paulus@...ba.org, torvalds@...l.org, linux-kernel@...r.kernel.org,
	akpm@...l.org, segher@...nel.crashing.org
Subject: Re: Opinion on ordering of writel vs. stores to RAM

Benjamin Herrenschmidt writes:
> On Mon, 2006-09-11 at 11:08 -0700, Jesse Barnes wrote:

>> Ok, that's fine, though I think you'd only want the very weak
>> semantics (as provided by your __raw* routines) on write
>> combined memory typically?
>
> Well, that and memory with no side effects (like frame buffers)

Oh no, it's great for regular device driver work. I used this
type of system all the time on a different PowerPC OS.

Suppose you need to set up a piece of hardware. Assume that the
hardware isn't across some nasty bridge. You do this:

hw->x = 42;
hw->y = 19;
eieio();
hw->p = 11;
hw->q = 233;
hw->r = 87;
eieio();
hw->n = 101;
hw->m = 5;
eieio();

In that fictitious example, I get 7 writes to the hardware device
with only 3 "eieio" operations. It's not hard at all. Sometimes
a "sync" is used instead, also explicitly.
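
For anyone who wants to try the pattern outside a driver, here is a minimal
sketch in C. The register layout (struct hw_regs), the function name
hw_setup() and the eieio_count counter are all made up for illustration; the
counter exists only so the number of barriers can be checked. On anything
that isn't PowerPC the eieio() wrapper falls back to a plain compiler
barrier, which is a stand-in and not an equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block; field names follow the example above. */
struct hw_regs {
    volatile uint32_t x, y, p, q, r, n, m;
};

/* Illustration only: counts how many barriers were issued. */
static unsigned eieio_count;

static inline void eieio(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    /* Real instruction: orders earlier I/O stores before later ones. */
    __asm__ __volatile__("eieio" ::: "memory");
#else
    /* Assumption: compiler-only barrier as a stand-in on other CPUs. */
    __asm__ __volatile__("" ::: "memory");
#endif
    eieio_count++;
}

/* Seven device writes, grouped so that only three barriers are needed. */
static void hw_setup(struct hw_regs *hw)
{
    assert(hw != 0);

    hw->x = 42;
    hw->y = 19;
    eieio();        /* x and y are issued before p, q, r */
    hw->p = 11;
    hw->q = 233;
    hw->r = 87;
    eieio();        /* p, q, r are issued before n and m */
    hw->n = 101;
    hw->m = 5;
    eieio();        /* n and m are issued before whatever follows */
}
```

Writes inside a group may reach the device in any order; only the group
boundaries are ordered, which is the whole point.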

To get even more speed, you can mark memory as non-coherent.
You can even do this for RAM. There are cache control instructions
to take care of any problems; simply ask the CPU to write things
out as needed.
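
As a rough sketch of "ask the CPU to write things out", something like the
following could be used on a buffer in non-coherent memory. The name
flush_range() and the 32-byte CACHE_LINE value are assumptions for
illustration; the real line size depends on the CPU. On PowerPC it issues
dcbst per cache line followed by sync; on other CPUs this sketch only walks
the range and does no cache maintenance at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines. The real size varies by CPU. */
#define CACHE_LINE 32u

/*
 * Ask the CPU to write back every cache line covering [addr, addr + len).
 * Returns the number of cache lines visited.
 */
static size_t flush_range(const void *addr, size_t len)
{
    uintptr_t start, end, p;
    size_t lines = 0;

    assert(addr != 0 || len == 0);

    start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    end = (uintptr_t)addr + len;

    for (p = start; p < end; p += CACHE_LINE) {
#if defined(__powerpc__) || defined(__PPC__)
        /* Data cache block store: write this line back to memory. */
        __asm__ __volatile__("dcbst 0,%0" : : "r"(p) : "memory");
#endif
        lines++;
    }

#if defined(__powerpc__) || defined(__PPC__)
    /* Wait for the write-backs to complete. */
    __asm__ __volatile__("sync" ::: "memory");
#else
    /* Stand-in only: compiler barrier, no cache maintenance. */
    __asm__ __volatile__("" ::: "memory");
#endif
    return lines;
}
```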

Linux should probably do this:

Plain accessors on a plain mapping are ordered like x86. If you want
the performance of loose ordering, ask for it when you get the mapping
and use read/write functions that have a "_" prefix. If you mix the "_"
versions with a plain x86-like mapping, or the other way around, the
behavior you get will be an arch-specific middle ground.
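
A sketch of what the two accessor families might look like, to make the
proposal concrete. None of these names are existing kernel functions:
io_write32(), _io_write32(), io_barrier() and barrier_count are hypothetical,
and putting the barrier after each plain store is just one possible choice.
The "ask for it when you get the mapping" half is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: counts how many barriers were issued. */
static unsigned barrier_count;

static inline void io_barrier(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ __volatile__("eieio" ::: "memory");
#else
    /* Assumption: compiler-only barrier as a stand-in on other CPUs. */
    __asm__ __volatile__("" ::: "memory");
#endif
    barrier_count++;
}

/* Plain accessor: every store is ordered, so one barrier per write. */
static inline void io_write32(volatile uint32_t *reg, uint32_t val)
{
    assert(reg != 0);
    *reg = val;
    io_barrier();
}

/*
 * Loosely ordered accessor with the "_" prefix from the proposal.
 * No barrier; the caller batches stores and calls io_barrier() itself.
 * (Outside the kernel, file-scope names starting with "_" are reserved,
 * so pick a different spelling in user-space code.)
 */
static inline void _io_write32(volatile uint32_t *reg, uint32_t val)
{
    assert(reg != 0);
    *reg = val;
}
```

With the plain version, two writes cost two barriers. With the "_" version
followed by one explicit io_barrier(), the same two writes cost one.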