Message-ID: <787b0d920609112304x3342e3bek88a8e12da62adac4@mail.gmail.com>
Date:	Tue, 12 Sep 2006 02:04:30 -0400
From:	"Albert Cahalan" <acahalan@...il.com>
To:	"Benjamin Herrenschmidt" <benh@...nel.crashing.org>
Cc:	jbarnes@...tuousgeek.org, alan@...rguk.ukuu.org.uk,
	davem@...emloft.net, jeff@...zik.org, paulus@...ba.org,
	torvalds@...l.org, linux-kernel@...r.kernel.org, akpm@...l.org,
	segher@...nel.crashing.org
Subject: Re: Opinion on ordering of writel vs. stores to RAM

On 9/12/06, Benjamin Herrenschmidt <benh@...nel.crashing.org> wrote:
>
> > Oh no, it's great for regular device driver work. I used this
> > type of system all the time on a different PowerPC OS.
> >
> > Suppose you need to set up a piece of hardware. Assume that the
> > hardware isn't across some nasty bridge. You do this:
> >
> > hw->x = 42;
> > hw->y = 19;
> > eieio();
> > hw->p = 11;
> > hw->q = 233;
> > hw->r = 87;
> > eieio();
> > hw->n = 101;
> > hw->m = 5;
> > eieio();
> >
> > In that fictitious example, I get 7 writes to the hardware device
> > with only 3 "eieio" operations. It's not hard at all. Sometimes
> > a "sync" is used instead, also explicitly.
>
> You can do that with my proposed __writel which is a simple store as
> writes to non-cacheable and guarded storage have to stay in order
> according to the PowerPC architecture. No need for __raw.

Oops, I forgot about store-store ordering being automatic.
Pretend I had some loads in my example.

A proper interface would be more explicit about what the
fence does, so that driver authors don't need to know
this detail.
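
To make that concrete, here is a minimal sketch of an explicitly
named fence used between a store and a load. The names fake_hw,
mmio_fence_store_load() and start_and_poll() are hypothetical and
not from any existing kernel API. The sketch assumes GCC-style
inline asm, and on anything but PowerPC it falls back to a
compiler barrier only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
struct fake_hw {
    volatile uint32_t cmd;     /* written by the driver */
    volatile uint32_t status;  /* read back by the driver */
};

/*
 * Hypothetical fence whose name says what it orders: earlier
 * MMIO stores before later MMIO loads. On PowerPC this issues
 * eieio; elsewhere it is only a compiler barrier (assumption).
 */
static inline void mmio_fence_store_load(void)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ __volatile__("eieio" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Store a command, fence, then load the status register. */
static uint32_t start_and_poll(struct fake_hw *hw, uint32_t cmd)
{
    assert(hw != NULL);
    hw->cmd = cmd;             /* store to the device */
    mmio_fence_store_load();   /* keep the store ahead of the load */
    return hw->status;         /* load from the device */
}
```

With a name like that, the driver author only has to know which
accesses need to stay in order, not which instruction does it.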

> > To get even more speed, you can mark memory as non-coherent.
>
> Ugh ? MMIO space is always marked non-coherent. You are not supposed to
> set the M bit if the I is set in the page tables. If you are talking
> about main memory, then it's a completely different discussion.

OK, a different discussion... though memory being used
for DMA seems rather related. You need to flush the cache
before a DMA out (the device reads memory), or invalidate
it before a DMA in (the device writes memory).
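
A rough sketch of that rule, with stand-in cache operations. The
functions cache_flush_range(), cache_invalidate_range(), dma_out()
and dma_in() are hypothetical. They only record which operation
ran, so that the ordering can be checked on an ordinary host.

```c
#include <assert.h>
#include <stddef.h>

/* Records the most recent (simulated) cache maintenance step. */
enum cache_op { OP_NONE, OP_FLUSH, OP_INVALIDATE };
static enum cache_op last_cache_op = OP_NONE;

/* Stand-in: would write dirty cache lines back to memory. */
static void cache_flush_range(const void *buf, size_t len)
{
    (void)buf; (void)len;
    last_cache_op = OP_FLUSH;
}

/* Stand-in: would discard cache lines covering the buffer. */
static void cache_invalidate_range(void *buf, size_t len)
{
    (void)buf; (void)len;
    last_cache_op = OP_INVALIDATE;
}

/* DMA out: the device reads the buffer, so flush first. */
static void dma_out(const void *buf, size_t len)
{
    assert(buf != NULL);
    cache_flush_range(buf, len);
    /* ...a real driver would start the device transfer here... */
}

/* DMA in: the device writes the buffer, so invalidate first. */
static void dma_in(void *buf, size_t len)
{
    assert(buf != NULL);
    cache_invalidate_range(buf, len);
    /* ...a real driver would start the device transfer here... */
}
```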

> > Linux should probably do this:
> >
> > Plain stuff is like x86. If you want the performance of loose
> > ordering, ask for it when you get the mapping and use read/write
> > functions that have a "_" prefix. If you mix the "_" versions
> > with a plain x86-like mapping or the other way, the behavior you
> > get will be an arch-specific middle ground.
>
> No. I want precisely defined semantics in all cases.

So you say: never mix strict mappings with loose operations,
and never mix loose mappings with strict operations.

That is an excellent rule. I see no need to stop people from
actively trying to shoot their feet though. I'm certainly not
suggesting that people mix things.

On some CPUs, you specify the ordering when you set up the
mapping. On other CPUs, the read/write code is what
determines it. So developers specify both.
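
Roughly, "specify both" could look like the sketch below. All the
names (io_map, io_map_regs(), io_write32(), loose_write32(),
io_fence()) are hypothetical. loose_write32() stands in for the
"_"-prefixed variants, and the fence is only a counted compiler
barrier (GCC-style asm assumed). The asserts are just one way to
catch mixing in a debug build, not something I'm proposing as
required.

```c
#include <assert.h>
#include <stdint.h>

/* Ordering requested when the mapping is created. */
enum io_order { IO_STRICT, IO_LOOSE };

struct io_map {
    volatile uint32_t *regs;
    enum io_order order;
};

static unsigned fence_count;   /* counts fences, for the demo */

/* Stand-in fence: a compiler barrier plus a counter. */
static void io_fence(void)
{
    __asm__ __volatile__("" ::: "memory");
    fence_count++;
}

/* The developer states the ordering once, at mapping time... */
static struct io_map io_map_regs(volatile uint32_t *regs,
                                 enum io_order order)
{
    struct io_map m = { regs, order };
    return m;
}

/* ...and again by the choice of accessor. Strict: fence each time. */
static void io_write32(struct io_map *m, unsigned idx, uint32_t v)
{
    assert(m->order == IO_STRICT);
    m->regs[idx] = v;
    io_fence();
}

/* Loose: plain store; the caller places fences explicitly. */
static void loose_write32(struct io_map *m, unsigned idx, uint32_t v)
{
    assert(m->order == IO_LOOSE);
    m->regs[idx] = v;
}
```

Two loose writes followed by one io_fence() cost one fence in
total, where the strict accessor would pay for one per write.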