Date:	Fri, 31 Jan 2014 14:28:45 -0500
From:	Waiman Long <waiman.long@...com>
To:	George Spelvin <linux@...izon.com>
CC:	peterz@...radead.org, akpm@...ux-foundation.org,
	andi@...stfloor.org, arnd@...db.de, aswin@...com,
	daniel@...ascale.com, halcy@...dex.ru, hpa@...or.com,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
	mingo@...hat.com, paulmck@...ux.vnet.ibm.com,
	raghavendra.kt@...ux.vnet.ibm.com, riel@...hat.com,
	rostedt@...dmis.org, scott.norton@...com, tglx@...utronix.de,
	thavatchai.makpahibulchoke@...com, tim.c.chen@...ux.intel.com,
	torvalds@...ux-foundation.org, walken@...gle.com, x86@...nel.org
Subject: Re: [PATCH v3 1/2] qspinlock: Introducing a 4-byte queue spinlock
 implementation

On 01/31/2014 02:14 PM, George Spelvin wrote:
>> Yes, we can do something like that. However I think put_qnode() needs to
>> use atomic dec as well. As a result, we will need 2 additional atomic
>> operations per slowpath invocation. The code may look simpler, but I
>> don't think it will be faster than what I am currently doing as the
>> cases where the used flag is set will be relatively rare.
> The increment does *not* have to be atomic.
>
> First of all, note that the only reader that matters is a local interrupt;
> other processors never access the variable at all, so what they see
> is irrelevant.
>
> "Okay, so I use a non-atomic RMW instruction; what about non-x86
> processors without op-to-memory?"
>
> Well, they're okay, too.  The only requirement is that the write to
> qna->cnt must be visible to the local processor (barrier()) before the
> qna->nodes[] slot is used.
>
> Remember, a local interrupt may use a slot temporarily, but will always
> return qna->cnt to its original value before returning.  So there's
> nothing wrong with
>
> - Load qna->cnt to register
> - Increment register
> - Store register to qna->cnt
>
> Because an interrupt, although it may temporarily modify qna->cnt, will
> restore it before returning so this code will never see any modification.
>
> Just like using the stack below %rsp, the only requirement is to
> ensure that the qna->cnt increment is visible *to the local processor's
> interrupt handler* before actually using the slot.
>
> The effect of the interrupt handler is that it may corrupt, at any
> time and without warning, any slot not marked in use via qna->cnt.
> But that's not a difficult thing to deal with, and does *not* require
> atomic operations.
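
For reference, the scheme George describes can be sketched roughly as
below. This is only an illustration, not the code from the patch: the
qna->cnt and qna->nodes[] names come from the discussion above, while
get_qnode(), MAX_QNODES, the payload field, the barrier() definition
(GCC/Clang inline asm) and fake_interrupt() are assumptions made up for
the example. fake_interrupt() stands in for a local interrupt that may
run at any point, borrows a free slot, and puts qna->cnt back before it
returns.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QNODES 4	/* hypothetical nesting depth, not from the patch */

/* Compiler barrier only (GCC/Clang syntax); no CPU fence is issued. */
#define barrier() __asm__ __volatile__("" ::: "memory")

struct qnode {
	int payload;		/* placeholder for the real node fields */
};

struct qnode_array {		/* assumed to be per-CPU in real code */
	int cnt;		/* number of slots currently in use */
	struct qnode nodes[MAX_QNODES];
};

/* Claim the next free slot with a plain, non-atomic load/inc/store. */
static struct qnode *get_qnode(struct qnode_array *qna)
{
	int idx = qna->cnt;		/* load */

	if (idx >= MAX_QNODES)
		return NULL;
	qna->cnt = idx + 1;		/* increment and store */
	barrier();			/* cnt is written before the slot is used */
	return &qna->nodes[idx];
}

/* Release the most recently claimed slot. */
static void put_qnode(struct qnode_array *qna)
{
	assert(qna->cnt > 0);
	barrier();			/* slot accesses finish before release */
	qna->cnt--;
}

/*
 * Stand-in for a local interrupt: it may scribble on any slot not
 * marked in use, but restores qna->cnt before returning.
 */
static void fake_interrupt(struct qnode_array *qna)
{
	struct qnode *n = get_qnode(qna);

	if (n) {
		n->payload = -1;
		put_qnode(qna);
	}
}
```

If the interrupt lands between the load and the store in get_qnode(),
it borrows the same index and hands it back, so the interrupted code
still stores the value it computed and nothing is lost. If it lands
after the store, it sees the updated count and takes the next slot.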

George, you are right. I was thinking too much from the general
perspective of RMW instructions.

-Longman