[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.10.0805070956060.3024@woody.linux-foundation.org>
Date: Wed, 7 May 2008 10:08:18 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Matthew Wilcox <matthew@....cx>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
"J. Bruce Fields" <bfields@...i.umich.edu>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Alexander Viro <viro@....linux.org.uk>,
linux-fsdevel@...r.kernel.org
Subject: Re: AIM7 40% regression with 2.6.26-rc1
On Wed, 7 May 2008, Linus Torvalds wrote:
>
> Which, btw, is probably true. The BKL is normally held for short times,
> and released (by that thread) for relatively much longer times. Which
> is when spinlocks tend to work the best, even when they are fair (because
> it's not so much a fairness issue, it's simply a cost-of-taking-the-lock
> issue!)
.. and don't get me wrong: the old semaphores (and the new mutexes) should
also have this property when lucky: taking the lock is often a hot-path
case.
And the spinlock+generic semaphore thing probably makes that "lucky"
behavior be exponentially less likely, because now to hit the lucky case,
rather than the hot path having just *one* access to the interesting cache
line, it has basically something like 4 accesses (spinlock, count test,
count decrement, spinunlock), in addition to various serializing
instructions, so I suspect it quite often gets serialized simply because
even the "fast path" is actually about ten times as long!
As a result, a slow "fast path" means that the thing gets saturated much
more easily, and that in turn means that the "fast path" turns into a
"slow path" more easily, which is how you end up in the scheduler rather
than just taking the fast path.
This is why sleeping locks are more expensive in general: they have a
*huge* cost from when they get contended. Hundreds of times higher than a
spinlock. And the faster they are, the longer it takes for them to get
contended under load. So slowing them down in the fast path is a double
whammy, in that it shows their bad behaviour much earlier.
And the generic semaphores really are slower than the old optimized ones
in that fast path. By a *big* amount.
Which is why I'm 100% convinced it's not even worth saving the old code.
It needs to use mutexes, or spinlocks. I bet it has *nothing* to do with
"slow path" other than the fact that it gets to that slow path much more
these days.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists