Message-ID: <51CB57F6.6010003@hp.com>
Date:	Wed, 26 Jun 2013 17:07:02 -0400
From:	Waiman Long <waiman.long@...com>
To:	Andi Kleen <andi@...stfloor.org>
CC:	Alexander Viro <viro@...iv.linux.org.uk>,
	Jeff Layton <jlayton@...hat.com>,
	Miklos Szeredi <mszeredi@...e.cz>,
	Ingo Molnar <mingo@...hat.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	"Chandramouleeswaran, Aswin" <aswin@...com>,
	"Norton, Scott J" <scott.norton@...com>
Subject: Re: [PATCH v2 1/2] spinlock: New spinlock_refcount.h for lockless
 update of refcount

On 06/26/2013 04:17 PM, Andi Kleen wrote:
>> + * The combined data structure is 8-byte aligned. So proper placement of this
>> + * structure in the larger embedding data structure is needed to ensure that
>> + * there is no hole in it.
> On i386 u64 is only 4-byte aligned. So you need to explicitly align
> it to 8 bytes. Otherwise you risk the two members crossing a cache line, which
> would be really expensive with atomics.

Do you mean the original i386, or the i586 that most distributions
now target? If it is the former, I recall that i386 is no longer
supported.

I also looked at some existing code that uses cmpxchg64. It doesn't
seem to use explicit alignment. I will need to investigate further to
see whether this is a real problem.
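
To make the alignment concern concrete, here is a minimal user-space sketch, not code from the patch. The names lockref_sketch and embed_sketch are hypothetical, and a 4-byte lock word is assumed. It forces the combined 64-bit word to 8-byte alignment, so the embedding structure cannot place it at a 4-byte offset on an ABI where a 64-bit integer is only 4-byte aligned. In kernel code the same effect would presumably come from the __aligned(8) attribute.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock/count combo; not the patch's layout. */
struct lockref_sketch {
	union {
		/* Whole 64-bit word, the target of a 64-bit cmpxchg. */
		alignas(8) uint64_t lock_count;
		struct {
			uint32_t lock;	/* stand-in for a 4-byte arch_spinlock_t */
			int32_t  count;	/* reference count */
		};
	};
};

/* Embedding structure: without alignas(8), a 32-bit ABI could put 'ref' at offset 4. */
struct embed_sketch {
	char tag;
	struct lockref_sketch ref;
};

/* Compile-time checks that the combo is 8-byte aligned and has no hole. */
static_assert(alignof(struct lockref_sketch) == 8, "combo must be 8-byte aligned");
static_assert(sizeof(struct lockref_sketch) == 8, "combo must be exactly 8 bytes");
static_assert(offsetof(struct embed_sketch, ref) % 8 == 0, "combo must sit on an 8-byte boundary");
```

On a 64-bit build the checks pass even without alignas(8); the explicit alignment only changes the layout on ABIs with weaker 64-bit alignment.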
>> +	/*
>> +	 * Code doesn't work if raw spinlock is larger than 4 bytes
>> +	 * or is empty.
>> +	 */
>> +	BUG_ON((sizeof(arch_spinlock_t) > 4) || (sizeof(arch_spinlock_t) == 0));
> BUILD_BUG_ON

Thanks for the suggestion; I will make the change.
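
As an illustration of the difference, the following user-space sketch moves the size check to compile time. BUILD_BUG_ON_SKETCH and arch_spinlock_sketch_t are hypothetical stand-ins built on C11 static_assert; they are not the kernel's BUILD_BUG_ON or arch_spinlock_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-byte lock type standing in for arch_spinlock_t. */
typedef struct { uint32_t slock; } arch_spinlock_sketch_t;

/* Fails the build, rather than crashing at run time, when cond is true. */
#define BUILD_BUG_ON_SKETCH(cond) static_assert(!(cond), #cond)

static inline int lock_size_ok(void)
{
	/* Same condition as the quoted BUG_ON, evaluated by the compiler. */
	BUILD_BUG_ON_SKETCH(sizeof(arch_spinlock_sketch_t) > 4 ||
			    sizeof(arch_spinlock_sketch_t) == 0);
	return 1;
}
```

Because sizeof is a compile-time constant, nothing is left to check at run time.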

>> +
>> +	spin_unlock_wait(plock);	/* Wait until lock is released */
>> +	old.__lock_count = ACCESS_ONCE(*plockcnt);
>> +	get_lock = ((threshold >= 0) && (old.count == threshold));
>> +	if (likely(!get_lock && spin_can_lock(&old.lock))) {
> What is that for? Why can't you do the CMPXCHG unconditionally?

An unconditional CMPXCHG can be as expensive as acquiring the spinlock,
so we need to check that the conditions are right before attempting the
actual CMPXCHG.
> If it's really needed, it is most likely a race?

If there is a race going on between threads, the code will fall back to 
the old way of acquiring the spinlock.
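
The check-then-CMPXCHG pattern with a locked fallback can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: the packing (lock word in the low 32 bits, count in the high 32 bits), the function names, and the simple spin loop are all hypothetical, and the threshold handling from the patch is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing: low 32 bits = lock word (0 = unlocked), high 32 bits = count. */
#define LOCK_MASK	UINT64_C(0xffffffff)
#define COUNT_ONE	(UINT64_C(1) << 32)

/* Fast path: returns true if the count was bumped without taking the lock. */
static bool refcount_inc_lockless(_Atomic uint64_t *lock_count)
{
	uint64_t old = atomic_load_explicit(lock_count, memory_order_relaxed);

	/* Cheap read first: skip the costly cmpxchg while the lock is held. */
	if ((old & LOCK_MASK) != 0)
		return false;

	/* One attempt only; failure means another thread raced with us. */
	return atomic_compare_exchange_strong(lock_count, &old, old + COUNT_ONE);
}

/* Full operation: try the fast path, else fall back to taking the lock. */
static void refcount_inc(_Atomic uint64_t *lock_count)
{
	if (refcount_inc_lockless(lock_count))
		return;

	/* Slow path: spin until the lock word changes from 0 to 1. */
	for (;;) {
		uint64_t old = atomic_load_explicit(lock_count, memory_order_relaxed);

		if ((old & LOCK_MASK) == 0 &&
		    atomic_compare_exchange_weak(lock_count, &old, old | 1))
			break;
	}

	/* Lock held: nobody else modifies the word until we release it. */
	atomic_fetch_add(lock_count, COUNT_ONE);	/* bump the count */
	atomic_fetch_and(lock_count, ~LOCK_MASK);	/* release the lock */
}
```

The plain load before the compare-and-exchange is the point of the exchange above: a thread that sees the lock held never issues the atomic read-modify-write, so it does not fight the lock holder for the cache line.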

> The duplicated code should likely be an inline.

The duplicated code is only used once in the function. I don't think an
additional inline is really needed, but I can add one if other people
also think it is a good idea.

>> +/*
>> + * The presence of either one of the CONFIG_DEBUG_SPINLOCK or
>> + * CONFIG_DEBUG_LOCK_ALLOC configuration parameter will force the
>> + * spinlock_t structure to be 8-byte aligned.
>> + *
>> + * To support the spinlock/reference count combo data type for 64-bit SMP
>> + * environment with spinlock debugging turned on, the reference count has
>> + * to be integrated into the spinlock_t data structure in this special case.
>> + * The spinlock_t data type will be 8 bytes larger if CONFIG_GENERIC_LOCKBREAK
>> + * is also defined.
> I would rather just disable the optimization when these CONFIGs are set

Looking at it from the other perspective, we may want the locking code
to behave the same whether or not spinlock debugging is enabled.
Disabling the optimization will cause the code paths to differ, which
may not be what we want. Of course, I can change it if other people also
think that is the right way to do it.

Regards,
Longman