lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200609101734.06839.jbarnes@virtuousgeek.org>
Date:	Sun, 10 Sep 2006 17:34:06 -0700
From:	Jesse Barnes <jbarnes@...tuousgeek.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Segher Boessenkool <segher@...nel.crashing.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	David Miller <davem@...emloft.net>, jeff@...zik.org,
	paulus@...ba.org, torvalds@...l.org, linux-kernel@...r.kernel.org,
	akpm@...l.org
Subject: Re: Opinion on ordering of writel vs. stores to RAM

On Sunday, September 10, 2006 5:12 pm, Benjamin Herrenschmidt wrote:
> Ok, so we have two different proposals here...
>
> Maybe we should cast a vote ? :)
>
>  * Option A:
>
>  - writel/readl are fully synchronous (minus mmiowb for spinlocks)
>  - we provide __writel/__readl with some barriers with relaxed
> ordering between memory and MMIO (though still _precise_ semantics,
> not arch specific)
>
>  * Option B:
>
>  - The driver decides at ioremap time wether it wants a fully ordered
> mapping or not using
>    a "special" version of ioremap (with flags ?)
>  - writel/readl() behave differently depending on the mapping
>  - __writel/__readl might exist but are architecture specific
> (ahem... still to be debated)
>
> The former seems easier to me to implement. The later might indeed be
> a bit easier for drivers writers, I'm not 100% convinced tho. The
> later means stuffing special tokens in the returned address from
> ioremap and testing for them in writel. However, the later is also
> what we need for write combining (at least for PowerPC, maybe for
> other archs, wether a mapping can write combine has to be decided by
> using flags in the page table, thus has to be done at ioremap time.
> (*)

Yeah, write combining is a good point.  After all these years we *still* 
don't have a good in-kernel interface for changing memory mapped 
attributes, so adding a 'flags' argument to ioremap might be a good 
idea (cached, uncached, write combine are the three variants I can 
think of off the top of my head).

But doing MMIO ordering this way seems somewhat expensive since it means 
extra checks in the readX/writeX routines, which are normally very 
fast.

So I guess I'm saying we should have both.
  - existing readX/writeX routines are defined to be strongly ordered
  - new MMIO accessors are added with weak semantics (not sure I like
    the __ naming though, driver authors will have to continually refer
    to documentation to figure out what they mean) along with new
    barrier macros to synchronize things appropriately
  - flags argument to ioremap for cached, uncached, write combine
    attributes (this implies some TLB flushing and other arch specific
    state flushing, also needed for proper PAT support)

Oh, and all MMIO accessors are *documented* with strongly defined 
semantics. :)

If we go this route though, can I request that we don't introduce any 
performance regressions in drivers currently using mmiowb()?  I.e. 
they'll be converted over to the new accessor routines when they become 
available along with the new barrier macros?

Thanks,
Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ