linux-kernel - Re: [PATCH] x86_64 : support atomic ops with 64 bits integer values

Message-ID: <alpine.LFD.1.10.0808161014300.3324@nehalem.linux-foundation.org>
Date:	Sat, 16 Aug 2008 10:30:54 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc:	"H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, Joe Perches <joe@...ches.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86_64 : support atomic ops with 64 bits integer
 values

On Sat, 16 Aug 2008, Mathieu Desnoyers wrote:
> 
> I have hit this problem when trying to implement a better rwlock design
> than is currently in the mainline kernel (I know the RT kernel has a
> hard time with rwlocks)

Have you looked at my sleeping rwlock trial thing?

It's very different from a spinning one, but I think the fast path should 
be identical, and that's the one I tried to make fairly optimal.

See 

	http://git.kernel.org/?p=linux/kernel/git/torvalds/rwlock.git;a=summary

for a git tree. The sleeping version has two extra words for the sleep 
events, but those would be irrelevant for the spinning version.

The fastpath is

	movl $4,%eax
	lock ; xaddl %eax,(%rdi)
	testl $3,%eax
	jne __my_rwlock_rdlock

for the read-lock (the two low bits are contention bits, so you can make 
contention have any behaviour you want - including fairish, prefer-reads, 
or prefer-writes).
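
For readers who prefer C to assembly, here is a minimal C11 sketch of that read fast path. It is not code from the git tree: the names rw_read_lock_fast, RW_CONTENTION_MASK and RW_READER_UNIT are made up for illustration, and the acquire ordering is an assumption (the locked xadd on x86 is a full barrier anyway).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the layout follows the text above. */
#define RW_CONTENTION_MASK 3u  /* two low bits: contention bits */
#define RW_READER_UNIT     4u  /* each reader adds 4 to the word */

/*
 * Read-lock fast path, mirroring "lock ; xaddl" followed by "testl $3".
 * Returns true if the lock was taken uncontended. On false the reader
 * count has already been bumped and the caller has to go to its slow
 * path (__my_rwlock_rdlock in the assembly above).
 */
static bool rw_read_lock_fast(_Atomic uint32_t *lock)
{
    uint32_t old = atomic_fetch_add_explicit(lock, RW_READER_UNIT,
                                             memory_order_acquire);
    return (old & RW_CONTENTION_MASK) == 0;
}
```

Note that the add is unconditional, so what a failed reader does next (back out, spin, or sleep) is entirely up to the slow path.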

The write fastpath is

	xorl %eax,%eax
	movl $1,%edx
	lock ; cmpxchgl %edx,(%rdi)
	jne __my_rwlock_wrlock

and the "unlock" case is actually unnecessarily complex in my 
implementation, because

 - it needs to wake things up in case of a conflict (not true of a 
   spinning version, of course)
 - it's pthreads-compatible, so the same function needs to handle both a 
   read-unlock and a write-unlock.

but a spinning version should be much simpler.
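
A rough C11 sketch of the write fast path, plus what the simpler spinning unlocks might look like. Only the cmpxchg from 0 to 1 comes from the assembly above. The separate unlock functions, all names, and the reading of bit 0 as "a writer holds the lock" are assumptions for illustration, not what the git tree does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit 0 as "writer holds the lock" is an assumption. */
#define RW_WRITER      1u
#define RW_READER_UNIT 4u

/*
 * Write-lock fast path, mirroring "lock ; cmpxchgl" with %eax = 0 and
 * %edx = 1: succeeds only if the word is exactly 0 (no readers, no
 * writer, no contention bits). Nothing is modified on failure.
 */
static bool rw_write_lock_fast(_Atomic uint32_t *lock)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(lock, &expected,
                                                   RW_WRITER,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Assumed spinning-variant unlocks: nobody to wake, one function each. */
static void rw_read_unlock_spin(_Atomic uint32_t *lock)
{
    atomic_fetch_sub_explicit(lock, RW_READER_UNIT, memory_order_release);
}

static void rw_write_unlock_spin(_Atomic uint32_t *lock)
{
    /* Clear only the writer bit, keeping counts added by waiting readers. */
    atomic_fetch_and_explicit(lock, ~RW_WRITER, memory_order_release);
}
```

With separate entry points and no wakeups, each unlock collapses to a single locked instruction, which is the sense in which a spinning version should be simpler.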

Anyway, I haven't tried turning it into a spinning version, but it was 
very much designed to

 - work with both 32-bit and 64-bit x86 by making the fastpath only do 
   32-bit locked accesses
 - have any number of pending readers/writers (which is not a big deal for 
   a spinning one, but at least there are no CPU count overflows).
 - and because it is designed for sleeping, I'm pretty sure that you can 
   easily drop interrupts in the contention path, to make 
   write_lock_irq[save]() be reasonable.

In particular, the third bullet is the important one: because it's 
designed to have a "contention" path that has _extra_ information for the 
contended case, you could literally make the extra information have things 
like a list of pending writers, so that you can drop interrupts on one 
CPU, while adding information to let the reader side know that if the 
read-lock happens on that CPU, it needs to be able to continue in order to 
not deadlock.
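
To make that last idea a bit more concrete, here is a very rough C11 sketch of "extra information" that only the contention path touches. Everything in it is hypothetical: the struct, the function names, and in particular the use of a 64-bit CPU bitmask (so at most 64 CPUs) in place of a real list of pending writers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the fast path only ever touches 'word'. */
struct rwlock_sketch {
    _Atomic uint32_t word;                /* 32-bit fast-path lock word */
    _Atomic uint64_t pending_writer_cpus; /* contention path only:
                                             bit n = CPU n has a writer
                                             waiting (assumes < 64 CPUs) */
};

/* A writer entering the contention path records which CPU it waits on. */
static void writer_announce_waiting(struct rwlock_sketch *l, unsigned cpu)
{
    atomic_fetch_or(&l->pending_writer_cpus, UINT64_C(1) << cpu);
}

/* Called once that writer has the lock or has given up waiting. */
static void writer_done_waiting(struct rwlock_sketch *l, unsigned cpu)
{
    atomic_fetch_and(&l->pending_writer_cpus, ~(UINT64_C(1) << cpu));
}

/*
 * Reader contention path: if a writer is pending on this very CPU, the
 * reader (say, in an interrupt handler) must be let through rather than
 * queued behind that writer, or the CPU deadlocks against itself.
 */
static bool reader_must_not_wait(struct rwlock_sketch *l, unsigned cpu)
{
    return (atomic_load(&l->pending_writer_cpus) >> cpu) & 1;
}
```

None of this costs anything in the uncontended case, since the fast paths never look at the second field.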

		Linus