Message-ID: <alpine.LFD.1.10.0808161014300.3324@nehalem.linux-foundation.org>
Date: Sat, 16 Aug 2008 10:30:54 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc: "H. Peter Anvin" <hpa@...or.com>,
Jeremy Fitzhardinge <jeremy@...p.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>, Joe Perches <joe@...ches.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86_64 : support atomic ops with 64 bits integer values
On Sat, 16 Aug 2008, Mathieu Desnoyers wrote:
>
> I have hit this problem when trying to implement a better rwlock design
> than is currently in the mainline kernel (I know the RT kernel has a
> hard time with rwlocks)
Have you looked at my sleeping rwlock trial thing?
It's very different from a spinning one, but I think the fast path should
be identical, and that's the one I tried to make fairly optimal.
See
http://git.kernel.org/?p=linux/kernel/git/torvalds/rwlock.git;a=summary
for a git tree. The sleeping version has two extra words for the sleep
events, but those would be irrelevant for the spinning version.
The fastpath is
	movl $4,%eax
	lock ; xaddl %eax,(%rdi)
	testl $3,%eax
	jne __my_rwlock_rdlock
for the read-lock (the two low bits are contention bits, so you can make
contention have any behaviour you want - including fairish, prefer-reads,
or prefer-writes).
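
For readers who prefer C to assembly, here is a rough C11 sketch of the same read fastpath. It is not taken from the rwlock.git tree: the type name, the constants and the function name are hypothetical, and the only things carried over from the assembly above are the increment of 4 per reader and the test of the two low bits on the old value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the values 4 and 3 come from the fastpath. */
#define SKETCH_READER_INC 4u   /* each reader adds 4 to the lock word */
#define SKETCH_LOW_BITS   3u   /* the two low bits tested by "testl $3" */

static_assert(SKETCH_READER_INC > SKETCH_LOW_BITS,
              "reader count must not overlap the low bits");

typedef struct {
    _Atomic uint32_t word;
} sketch_rwlock;

/*
 * Mirrors "lock ; xaddl" followed by "testl $3": add one reader and look
 * at the value the word had before the add.
 *
 * Returns true if the lock was taken on the fast path. On false the
 * caller would have to enter a slow path (__my_rwlock_rdlock in the
 * email), which is not sketched here. Note that the reader increment
 * has already been applied in both cases.
 */
static bool sketch_read_lock_fast(sketch_rwlock *l)
{
    uint32_t old = atomic_fetch_add_explicit(&l->word, SKETCH_READER_INC,
                                             memory_order_acquire);
    return (old & SKETCH_LOW_BITS) == 0;
}
```

The acquire ordering is an assumption of this sketch; on x86 the locked xadd is a full barrier anyway.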
The write fastpath is
	xorl %eax,%eax
	movl $1,%edx
	lock ; cmpxchgl %edx,(%rdi)
	jne __my_rwlock_wrlock
and the "unlock" case is actually unnecessarily complex in my
implementation, because it needs to
- wake things up in case of a conflict (not true of a spinning version,
of course)
- it's pthreads-compatible, so the same function needs to handle both a
read-unlock and a write-unlock.
but a spinning version should be much simpler.
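
As an illustration only, the write fastpath and what a simpler spinning-style unlock might look like can be sketched in C11. The write-lock function follows the cmpxchg above (0 to 1). The two unlock functions are an assumption of this sketch, not the code in the git tree: they assume that bit 0 means "writer holds the lock", that read-unlock and write-unlock are separate functions, and that nothing needs waking up. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SPIN_SKETCH_WR_HELD 1u  /* assumption: bit 0 marks a writer holding the lock */
#define SPIN_SKETCH_RD_INC  4u  /* each reader adds 4, as in the read fastpath */

typedef struct {
    _Atomic uint32_t word;
} spin_sketch_rwlock;

/*
 * Mirrors "lock ; cmpxchgl %edx,(%rdi)" with %eax = 0 and %edx = 1:
 * succeeds only if the word was completely free (no readers, no low bits).
 * On false the caller would enter a slow path (__my_rwlock_wrlock in the
 * email), which is not sketched here.
 */
static bool spin_sketch_write_lock_fast(spin_sketch_rwlock *l)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->word, &expected, SPIN_SKETCH_WR_HELD,
        memory_order_acquire, memory_order_relaxed);
}

/* Assumed spinning-style write unlock: clear the writer bit, wake nobody. */
static void spin_sketch_write_unlock(spin_sketch_rwlock *l)
{
    uint32_t old = atomic_fetch_and_explicit(&l->word, ~SPIN_SKETCH_WR_HELD,
                                             memory_order_release);
    assert(old & SPIN_SKETCH_WR_HELD);
    (void)old;
}

/* Assumed spinning-style read unlock: drop one reader increment. */
static void spin_sketch_read_unlock(spin_sketch_rwlock *l)
{
    uint32_t old = atomic_fetch_sub_explicit(&l->word, SPIN_SKETCH_RD_INC,
                                             memory_order_release);
    assert(old >= SPIN_SKETCH_RD_INC);
    (void)old;
}
```

How the second low bit and the increments left behind by readers that failed the fast path get handled is exactly the contention policy the email leaves open, so this sketch does not attempt it.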
Anyway, I haven't tried turning it into a spinning version, but it was
very much designed to
- work with both 32-bit and 64-bit x86 by making the fastpath only do
32-bit locked accesses
- have any number of pending readers/writers (which is not a big deal for
a spinning one, but at least there are no CPU count overflows).
- and because it is designed for sleeping, I'm pretty sure that you can
easily drop interrupts in the contention path, to make
write_lock_irq[save]() be reasonable.
In particular, the third bullet is the important one: because it's
designed to have a "contention" path that has _extra_ information for the
contended case, you could literally make the extra information have things
like a list of pending writers, so that you can drop interrupts on one
CPU, while you add information to let the reader side know that if the
read-lock happens on that CPU, it needs to be able to continue in order to
not deadlock.
Linus