[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080223193620.GA89@tv-sign.ru>
Date: Sat, 23 Feb 2008 22:36:20 +0300
From: Oleg Nesterov <oleg@...sign.ru>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
dmitry.adamushko@...il.com, a.p.zijlstra@...llo.nl,
apw@...dowen.org, Ingo Molnar <mingo@...e.hu>,
nickpiggin@...oo.com.au, paulmck@...ux.vnet.ibm.com,
rusty@...tcorp.com.au, Steven Rostedt <rostedt@...dmis.org>,
linux-arch@...r.kernel.org
Subject: Re: + kthread-add-a-missing-memory-barrier-to-kthread_stop.patch added to -mm tree
On 02/23, Linus Torvalds wrote:
>
> On Sat, 23 Feb 2008, Oleg Nesterov wrote:
> >
> > Yes, but still I suspect wmb() is not enough. Note that try_to_wake_up()
> > first checks (reads) the task->state,
> >
> > if (!(old_state & state))
> > goto out;
> >
> > without the full mb() it is (in theory) possible that try_to_wake_up()
> > first reads TASK_RUNNING and only then sets CONDITION. IOW, STORE and
> > LOAD could be re-ordered.
>
> No. The spinlock can have preceding stores (and loads, for that matter)
> percolate *into* the locked region, but a spinlock can *not* have loads
> (and stores) escape *out* of the region withou being totally broken.
>
> So the spinlock that predeces that load of "old_state" absolutely *must*
> be a memory barrier as far as that load is concerned.
>
> Spinlocks are just "one-way" permeable. They stop both reads and writes,
> but they stop them from escaping up to earlier code (and the spin-unlock
> in turn stops reads and writes within the critical section from escaping
> down past the unlock).
>
> But they definitely must stop that direction, and no loads or stores that
> are inside the critical section can possibly be allowed to be done outside
> of it, or the spinlock would be pointless.
Yes sure. But I meant that the STORE (set CONDITION) can leak into the critical
section, and it could be re-ordered with the LOAD (check ->state) _inside_
the critical section.
However,
> So I think a simple smp_wmb() should be ok.
now I think you are probably right. Because (please ack/nack my understanding)
this smp_wmb() ensures that the preceding STORE can't actually go into
the locked region.
> That said, I do not *mind* the notion of "smp_mb_before/after_spinlock()",
> and just have architectures do whatever they want (with x86 just being a
> no-op). I don't think it's wrong in any way, and may be the really
> generic solution for some theoretical case where locking is done not by
> using the normal cache coherency, but over a separate lock network. But I
> would suggest against adding new abstractions unless there are real cases
> of it mattering.
>
> (The biggest reason to do it in practice is probably architectures that
> have a really slow smp_wmb() and that also have a spinlock that is already
> serializing enough, but if this is only for one single use - in
> try_to_wake_up() - even that doesn't really seem to be a very strong
> argument).
Great.
> > A bit offtopic, but let's take another example, __queue_work()->insert_work().
> > With some trivial modification we can move set_wq_data() outside of cwq->lock.
> > But according to linux's memory model we still need wmb(), which is a bit
> > annoying. Perhaps we can also add smp_wmb_before_spinlock(). Not sure this
> > is not too ugly though.
>
> Is that really so performance-critical? We still need the spinlock for the
> actual list manipulation, so why would we care?
>
> And the smp_wmb() really should be free. It's literally free on x86, but
> doing a write barrier is generally a cheap operation on any sane
> architecture (ie generally cheaper than a read barrier or a full barrier:
> it should mostly be a matter of inserting a magic entry in the write
> buffers).
Yes, yes, this was just a "on a related note", thanks!
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists