Message-Id: <1158015661.3879.88.camel@localhost.localdomain>
Date: Tue, 12 Sep 2006 09:01:01 +1000
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Jesse Barnes <jbarnes@...tuousgeek.org>
Cc: Linux Kernel list <linux-kernel@...r.kernel.org>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Jeff Garzik <jgarzik@...ox.com>,
Paul Mackerras <paulus@...ba.org>,
Linus Torvalds <torvalds@...l.org>,
Andrew Morton <akpm@...l.org>,
Segher Boessenkool <segher@...nel.crashing.org>
Subject: Re: [RFC] MMIO accessors & barriers documentation
> I think that's a separate issue? As Jeff points out, those macros are
> intended to provide memory vs. I/O ordering, but isn't PPC the only platform
> that will reorder accesses so aggressively and independently? I don't think
> ia64 for example will reorder them separately, so a regular memory barrier
> *should* be enough to ensure ordering in both domains.
Well, I don't know, that's what I'm asking since the comment in the
driver specifically mentions IA64 :)
> > Hence the question: do we provide -fully- ordered accessors in class 1,
> > or do we provide -mostly- ordered accessors, ordered in all respects except
> > rule #4 vs locks. ia64 is afaik by far the platform taking the biggest
> > hit if you have to provide #4, so I'm interested in your point of view
> > here.
>
> Either way is fine with me as long as we have a way to get at the fast and
> loose stuff (and required barriers of course) in a portable way. And that we
> don't regress the existing users of mmiowb().
Well, existing users of mmiowb() will regress in performance if we
decide that class 1 (ordered) accessors do imply rule #4 (ordering with
locks), since they'll end up doing redundant mmiowb's ;) But then,
they'll be affected anyway due to the sheer number of mmiowb's (one per
IO) unless you implement the trick I described, which would bring the
cost down to nothing except maybe the test in spin_unlock (which I still
need to measure on PowerPC).
> > We don't need counters, just a flag. We did a test implementation, seems
> > to work. We also clear the flag in spin_lock. That means that MMIOs
> > issued before a lock aren't ordered vs. the locked section. But because
> > of rule #1, they should be ordered vs. other MMIOs inside the locked
> > section and thus implicitly get ordered anyway.
>
> Oh right, a flag would be enough. Is it good enough for -mm yet? Might be
> fun to run on an Altix machine with a bunch of supported devices (not that I
> work with them anymore...).
The PowerPC patch is probably good enough for 2.6.18 in fact :) I'll let
Paulus post what he has. It's fairly ppc specific in the actual
implementation though.
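
For readers following the thread, here is a minimal, architecture-neutral
sketch of the flag trick as described above. It is a user-space simulation,
not the PowerPC patch: every name in it (sim_cpu, sim_writel, sim_spin_lock,
sim_spin_unlock, sim_mmiowb) is hypothetical, and it assumes that an MMIO
write is what sets the flag. What it takes from the thread is that one flag
suffices instead of a counter, that spin_lock clears the flag, and that
spin_unlock tests it and only then pays for the barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for per-CPU state and a spinlock; not kernel APIs. */
struct sim_cpu {
    bool io_pending;    /* set by an MMIO write, cleared by lock or barrier */
    unsigned barriers;  /* how many simulated mmiowb() calls were executed */
};

struct sim_lock {
    bool held;
};

/* Simulated MMIO write barrier: count it and clear the pending flag. */
static void sim_mmiowb(struct sim_cpu *cpu)
{
    cpu->barriers++;
    cpu->io_pending = false;
}

/* Simulated ordered MMIO store: do the store, then note that one happened. */
static void sim_writel(struct sim_cpu *cpu, volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
    cpu->io_pending = true;  /* assumption: the accessor sets the flag */
}

/* Taking the lock clears the flag, so MMIOs issued before the lock
 * do not force a barrier at unlock time. */
static void sim_spin_lock(struct sim_cpu *cpu, struct sim_lock *lock)
{
    assert(!lock->held);
    lock->held = true;
    cpu->io_pending = false;
}

/* Releasing the lock pays for the barrier only if an MMIO write
 * happened inside the locked section. */
static void sim_spin_unlock(struct sim_cpu *cpu, struct sim_lock *lock)
{
    assert(lock->held);
    if (cpu->io_pending)
        sim_mmiowb(cpu);
    lock->held = false;
}
```

In this sketch a locked section with no MMIO costs only the flag test at
unlock, and a section with several writes issues a single barrier. Letting an
explicit sim_mmiowb() clear the flag is a choice made here, not something
stated in the thread; it keeps a driver that already calls the barrier from
paying for a second one at unlock.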
> > > For ia64 in particular it doesn't matter, though there was speculation
> > > several years ago that it might be necessary. No actual examples stepped
> > > forward though, so the current implementation doesn't take an argument.
> >
> > Ok. My question is whether it would improve the implementation to take
> > it. If we define a new macro with a new name, we can do it....
>
> Right, but unless there's a real need at this point, we probably shouldn't
> bother. Let the poor sucker with the future machine needing the device
> argument do the work. :)
Ok :)
Ben.