lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.10.0808211500530.3487@nehalem.linux-foundation.org>
Date:	Thu, 21 Aug 2008 15:22:56 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, Joe Perches <joe@...ches.com>,
	linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC PATCH] Writer-biased low-latency rwlock v8



On Thu, 21 Aug 2008, Linus Torvalds wrote:
> 
>  rdlock:
> 	movl $4,%eax
> 	lock ; xaddl %eax,(%rdi)
> 	testl $3,%eax
> 	jne __rdlock_slowpath
> 	ret


Ok, so change that "testl" to "andl" to shave two bytes, and then make the 
whole lock structure look something like this:

	struct rwlock {
		u32 fastpath;
		u32 pending_readers;
		u32 pending_writers;
	};

and then the slowpath would look something like this:

	void __rdlock_slowpath(struct rwlock *lock)
	{
		/* Move the read count to the pending path and undo our xadd */
		atomic_add(1, &lock->pending_readers);
		atomic_sub(4, &lock->fastpath);

		for (;;) {
			unsigned fastpath;
			while (lock->pending_writers)
				cpu_relax();
			while ((fastpath = lock->fastpath) & 1)
				cpu_relax();
			/* This will also undo the contention bit */
			if (atomic_cmpxchg(&lock->fastpath, fastpath, (fastpath & ~2)+4)) == fastpath);
				break;
		}
		atomic_sub(1, &lock->pending_readers);
	}

and yes, there are more atomic accesses in there than would be necessary, 
in that if you can cram the whole thing into a single 64-bit word you can 
do both the initial pending fixup and the final cmpxchg/pending_readers 
thing as a single locked 64-bit op, but hey, the above is fairly simple.

You could also just use a spinlock to protect all the other data, of 
course.

There are fairness issues here too - do you really want to wait for 
pending writes every time, or just the first time through the loop? It 
depends on just how much you want to prefer writers.

The slow-paths for writers is similar, and has similar issues for the 
fairness issue. For example, instead of a simple "pending_writers" count, 
maybe you want to use a ticket lock there, to make writers be fair among 
themselves? Of course, the hope would be that there would never be that 
kind of contention by pure writers, so that sounds like overdesign, but 
the thing I'm trying to point out here is that this is all a separate 
decision from the actual fast-path, and having fair writers doesn't 
necessarily mean that the fastpath has to even know/care.

The placement of the counts is also something that can be optimized. For 
example, on teh assumption that readers are much more common than writers, 
I put the "pending_readers" next to the "fastpath" lock, exactly so that 
you -can- do the update of both with a single 64-bit locked access, even 
if the fastpath just used a 32-bit one. But then you should at least add a 
"__attribute__((aligned(8)))" to the structure definition.

And maybe you want to share the slowpath between all the irq-off etc 
versions, in which case you need to make the "irq enable while waiting" 
thing be conditional. 

And if the write path wants to wait for pending readers, and you want to 
make _that_ fair too, then you want to have that punch through if the 
writer is in an interrupt, to avoid deadlocks. And again, you might want 
to make the reader thing be some kind of ticket-based system, so that 
writers can wait for the readers that came in _before_ that writer, but 
not to later ones.

But again - none of that has anything to do with the fast-path. The 
fast-path remains exactly the same for all these issues.

And finally: NONE of the code is at all tested. It's all just written in 
my MUA. Caveat emptor.

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ