[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.0.98.0706210923000.3593@woody.linux-foundation.org>
Date: Thu, 21 Jun 2007 09:32:45 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...e.hu>
cc: Jarek Poplawski <jarkao2@...pl>,
Miklos Szeredi <miklos@...redi.hu>, cebbert@...hat.com,
chris@...ee.ca, linux-kernel@...r.kernel.org, tglx@...utronix.de,
akpm@...ux-foundation.org
Subject: Re: [BUG] long freezes on thinkpad t60
On Thu, 21 Jun 2007, Ingo Molnar wrote:
>
> so the problem was not the trylock based spin_lock() itself (no matter
> how it's structured in the assembly), the problem was actually modifying
> the lock and re-modifying it again and again in a very tight
> high-frequency loop, and hence not giving it to the other core?
That's what I think, yes.
And it matches the behaviour we've seen before (remember the whole thread
about whether Opteron or Core 2 hardware is "more fair" between cores?)
It all simply boils down to the fact that releasing and almost immediately
re-taking a lock is "invisible" to outside cores, because it will happen
entirely within the L1 cache of one core if the cacheline is already in
"owned" state.
Another core that spins wildly trying to get it ends up using a much
slower "beat" (the cache coherency clock), and with just a bit of bad luck
the L1 cache would simply never get probed, and the line never stolen, at
the point where the lock happens to be released.
The fact that writes (as in the store that releases the lock)
automatically get delayed by any x86 core by the store buffer, and the
fact that atomic read-modify-write cycles do *not* get delayed, just means
that if the "spinlock release" was "fairly close" to the "reacquire" in a
software sense, the hardware will actually make them *even*closer*.
So you could make the spinlock release be a read-modify-write cycle, and
it would make the spinlock much slower, but it would also make it much
more likely that the other core will *see* the release.
For the same reasons, if you add a "smp_mb()" in between the release and
the re-acquire, the other core is much more likely to see it: it means
that the release won't be delayed, and thus it just won't be as "close" to
the re-acquire.
So you can *hide* this problem in many ways, but it's still just hiding
it.
The proper fix is to not do that kind of "release and re-acquire"
behaviour in a loop.
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists