Message-ID: <alpine.DEB.1.10.0811090919370.29560@gandalf.stny.rr.com>
Date: Sun, 9 Nov 2008 09:31:37 -0500 (EST)
From: Steven Rostedt <rostedt@...dmis.org>
To: David Howells <dhowells@...hat.com>
cc: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
akpm@...ux-foundation.org, Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
linux-kernel@...r.kernel.org, Nicolas Pitre <nico@....org>,
Ralf Baechle <ralf@...ux-mips.org>, benh@...nel.crashing.org,
paulus@...ba.org, David Miller <davem@...emloft.net>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-arch@...r.kernel.org
Subject: Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()
On Sun, 9 Nov 2008, David Howells wrote:
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > > Note that that does not guarantee that the two reads will be done in the
> > > order you want. The compiler barrier _only_ affects the compiler. It
> > > does not stop the CPU from doing the reads in any order it wants. You
> > > need something stronger than smp_rmb() if you need the reads to be so
> > > ordered.
> >
> > For reading hardware devices that can indeed be correct. But for normal
> > memory access on a uniprocessor, if the CPU were to reorder the reads that
> > would affect the actual algorithm then that CPU is broken.
Please read what I said above again.
"For reading hardware devices that can indeed be correct."
There I agree that accessing devices will require an rmb.
"But for normal memory access on a uniprocessor, if the CPU were to
reorder the reads that would affect the actual algorithm then that CPU is
broken."
Here I'm talking about accessing normal RAM. If the CPU decides to read b
before reading a then that will break the code.
> >
> > read a
> > <--- interrupt - should see read a here before read b is done.
> > read b
>
> Life isn't that simple. Go and read the section labelled "The things cpus get
> up to" in Documentation/memory-barriers.txt.
I've read it. Several times ;-)
>
> The two reads we're talking about are independent of each other. Independent
> reads and writes can be reordered and merged at will by the CPU, subject to
> restrictions imposed by barriers, cacheability attributes, MMIO attributes and
> suchlike.
>
> You can get read b happening before read a, but in such a case both
> instructions will be in the CPU's execution pipeline. When an interrupt
> occurs, the CPU will presumably finish clearing what's in its pipeline before
> going and servicing the interrupt handler.
The above sounds like you just answered my question: an smp_rmb is
enough. If an interrupt occurs, then read a and read b will both be
completed. It really does not matter in which order, as long as the
interrupt itself does not see read b before read a.
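
For illustration, here is a minimal userspace sketch of the kind of
conversion being discussed, showing where the barrier sits between read a
and read b. It is not the kernel's cnt32_to_63() macro. The names m_cnt_hi,
fake_clock, cnt32_to_63_sketch() and demo() are made up for this example,
the clock is a plain variable rather than hardware, and barrier() is a
GCC/Clang compiler-only barrier standing in for what smp_rmb() becomes on a
UP build.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax); assumed UP stand-in for smp_rmb(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

static uint32_t m_cnt_hi;    /* a: high word, kept in normal RAM */
static uint32_t fake_clock;  /* b: hypothetical stand-in for the 32-bit hardware clock */

static uint64_t cnt32_to_63_sketch(void)
{
    uint32_t hi = m_cnt_hi;    /* read a */
    barrier();                 /* stops the compiler moving read b above read a */
    uint32_t lo = fake_clock;  /* read b */

    /* Top bits disagree: the clock crossed a half period since the last update. */
    if ((int32_t)(hi ^ lo) < 0) {
        hi = (hi ^ 0x80000000u) + (hi >> 31);
        m_cnt_hi = hi;
    }
    /* Bit 63 only mirrors the clock's top bit, so mask it off. */
    return (((uint64_t)hi << 32) | lo) & 0x7fffffffffffffffULL;
}

/* Step the fake clock a quarter period at a time and check the 63-bit value. */
static uint64_t demo(void)
{
    uint64_t v = 0;

    m_cnt_hi = 0;
    for (unsigned k = 0; k < 20; k++) {
        uint64_t t = (uint64_t)k * 0x40000000ULL;

        fake_clock = (uint32_t)t;
        v = cnt32_to_63_sketch();
        assert(v == t);  /* the extended value follows the full count */
    }
    return v;
}
```

The sketch only works if the function is called at least once per half
period of the 32-bit clock, which the quarter-period steps in demo()
satisfy.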
>
> If a CPU is strictly ordered with respect to reads, do you actually need read
> barriers?
>
> The fact that a pair of reads might be part of an algorithm that is critically
> dependent on the ordering of those reads isn't something the CPU cares about.
> It doesn't know there's an algorithm there.
>
> > Now the fact that one of the reads is a hardware clock, then this
> > statement might not be too strong. But the fact that it is a clock, and
> > not some memory mapped device register, I still think smp_rmb is
> > sufficient.
>
> To quote again from memory-barriers.txt, section "CPU memory barriers":
>
> Mandatory barriers should not be used to control SMP effects, since
> mandatory barriers unnecessarily impose overhead on UP systems. They
> may, however, be used to control MMIO effects on accesses through
> relaxed memory I/O windows. These are required even on non-SMP
> systems as they affect the order in which memory operations appear to
> a device by prohibiting both the compiler and the CPU from reordering
> them.
>
> Section "Accessing devices":
>
> (2) If the accessor functions are used to refer to an I/O memory window with
> relaxed memory access properties, then _mandatory_ memory barriers are
> required to enforce ordering.
When one of the reads is a clock, my confidence that an smp_rmb is
enough is not as strong. It may not be. I'll have to think about this a
bit more.
Again, the question arises with:
read a (memory)
<---- interrupt
read b (clock)
Will b be seen before the interrupt occurred, and before a is read?
That is what will break the algorithm on UP. If we cannot guarantee
that this does not happen, then an rmb is needed.
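
To make the failure case concrete, here is a small single-threaded
simulation of the two orderings. It is only a sketch under stated
assumptions: the "interrupt" is an ordinary function call placed by hand
between the two reads, the clock is a plain variable, and all names
(m_cnt_hi, clock_now, combine(), interrupt_handler(), in_order(),
b_seen_first()) are hypothetical. It shows what goes wrong if b is sampled
early; it does not show that any real CPU does this.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t m_cnt_hi;   /* a: high word, kept in normal RAM */
static uint32_t clock_now;  /* b: hypothetical stand-in for the hardware clock */

/* Combine one sample of a and one sample of b, updating a on a half-period flip. */
static uint64_t combine(uint32_t hi, uint32_t lo)
{
    if ((int32_t)(hi ^ lo) < 0) {
        hi = (hi ^ 0x80000000u) + (hi >> 31);
        m_cnt_hi = hi;
    }
    return (((uint64_t)hi << 32) | lo) & 0x7fffffffffffffffULL;
}

/* The handler runs the same conversion, reading a and then b. */
static void interrupt_handler(void)
{
    uint32_t hi = m_cnt_hi;
    uint32_t lo = clock_now;

    (void)combine(hi, lo);
}

/* Intended order: read a, interrupt, read b. */
static uint64_t in_order(void)
{
    m_cnt_hi = 0;
    clock_now = 0x7fffffffu;

    uint32_t hi = m_cnt_hi;   /* read a */
    clock_now = 0x80000001u;  /* clock crosses the half period */
    interrupt_handler();      /* handler flips m_cnt_hi */
    uint32_t lo = clock_now;  /* read b */

    return combine(hi, lo);
}

/* The case asked about: b is seen before the interrupt and before a is read. */
static uint64_t b_seen_first(void)
{
    m_cnt_hi = 0;
    clock_now = 0x7fffffffu;

    uint32_t lo = clock_now;  /* read b, sampled early */
    clock_now = 0x80000001u;  /* clock crosses the half period */
    interrupt_handler();      /* handler flips m_cnt_hi */
    uint32_t hi = m_cnt_hi;   /* read a sees the handler's update */

    return combine(hi, lo);   /* stale b plus new a: a second, bogus flip */
}
```

In this simulation in_order() returns 0x80000001, matching the clock,
while b_seen_first() returns 0x17fffffff and leaves m_cnt_hi one half
period ahead of the clock.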
-- Steve