linux-kernel - Re: x86-32: clean up rwsem inline asm statements

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4B4E67C8.501@zytor.com>
Date:	Wed, 13 Jan 2010 16:39:36 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	George Spelvin <linux@...izon.com>
CC:	linux-kernel@...r.kernel.org, schwab@...ux-m68k.org,
	torvalds@...ux-foundation.org
Subject: Re: x86-32: clean up rwsem inline asm statements

On 01/13/2010 04:27 PM, George Spelvin wrote:
>> There are a number of things that can be done better... for one thing,
>> "+m" (sem->count) and "a" (sem) is just bloody wrong.  The right thing
>> would be "a" (&sem->count) for proper robustness.
> 
> Actually, no.  The "+m" (sem->count) is telling GCC that sem->count is
> updated; "a" (&sem->count) does *not* tell it to invalidate cached
> copies of sem->count that it may have lying around.
> 
> However, we can't just use "+m" (sem->count) because GCC has a poor
> grasp on the concept of atomic operations; as far as it is concerned,
> it is exactly equivalent to copying the value into a stack slot, do the
> operation there, and copy it back.

You completely missed the point.

The reason we need both "+m" (sem->count) and "a" (sem) is because we
need the address of the semaphore in %eax in case we hit the slow path.
 They're actually orthogonal requirements, and we could implement them as:

	LOCK_PREFIX xadd %1,%0
	jz 1f
	call call_rwsem_wake
1:

	: "+m" (sem->count), "=d" (tmp)
       	: "a" (sem)

... just fine, and arguably that would be more correct in the sense that
if sem->count isn't the first member of *sem, then we still pass the
address of sem in %eax to call_rwsem_wake.  In practice, with the
zero-offset sem->count, gcc will typically generate (%eax) as the
argument since it has it available anyway, but it wouldn't have to.

The code as it currently is:

	LOCK_PREFIX xadd %1,(%2)
	jz 1f
	call call_rwsem_wake
1:

	: "+m" (sem->count), "=d" (tmp)
       	: "a" (sem)

... implicitly assumes the offset of "count" is zero but doesn't enforce
it in any way.  My comment was that since the assumption is clearly that
arguments 0 and 2 point to the same memory object, it should be
(&sem->count) in the second case, but that's actually not really the
"proper" fix because call_rwsem_wake presumably expects the value of sem.

	-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/