[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <25821.1176466182@redhat.com>
Date: Fri, 13 Apr 2007 13:09:42 +0100
From: David Howells <dhowells@...hat.com>
To: Nick Piggin <npiggin@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Andi Kleen <ak@...e.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [patch] generic rwsems
Nick Piggin <npiggin@...e.de> wrote:
> This patch converts all architectures to a generic rwsem implementation,
> which will compile down to the same code for i386, or powerpc, for
> example,
> and will allow some (eg. x86-64) to move away from spinlock based rwsems.
Which are better on UP kernels because spinlocks degrade to nothing, and then
you're left with a single disable/enable interrupt pair per operation, and no
requirement for atomic ops at all.
What you propose may wind up with several per op because if the CPU does not
support atomic ops directly and cannot emulate them through other atomic ops,
then these have to be emulated by:
atomic_op() {
spin_lock_irqsave
do op
spin_unlock_irqrestore
}
> Move to an architecture independent rwsem implementation, using the
> better of the two rwsem implementations
That's not necessarily the case, as I said above.
Furthermore, the spinlock implementation struct is smaller on 64-bit machines,
and is less prone to counter overrun on 32-bit machines.
> Out-of-line the fastpaths, to bring rw-semaphores into line with
> mutexes and spinlocks WRT our icache vs function call policy.
That should depend on whether you optimise for space or for speed. Function
calls are relatively heavyweight.
Please re-inline and fix Ingo's mess if you must clean up. Take the i386
version, for instance, I'd made it so that the compiler didn't know it was
taking a function call when it went down the slow path, thus meaning the
compiler didn't have to deal with that. Furthermore, it only interpolated two
or three instructions into the calling code in the fastpath. It's a real shame
that gcc inline asm doesn't allow you to use status flags as boolean returns,
otherwise I could reduce that even further.
> Spinlock based rwsems are inferior to atomic based ones one most
> architectures that can do atomic ops without spinlocks:
Note the "most" in your statement...
Think about it. This algorithm is only optimal where XADD is available. If
you don't have XADD, but you do have LL/SC or CMPXCHG, you can do better.
If the only atomic op you have is XCHG, then this is a really poor choice;
similarly if you are using a UP-compiled kernel.
David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists