Message-Id: <200807081207.34453.nickpiggin@yahoo.com.au>
Date: Tue, 8 Jul 2008 12:07:34 +1000
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Jeremy Fitzhardinge <jeremy@...p.org>
Cc: Rik van Riel <riel@...hat.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Christoph Lameter <clameter@....com>,
Petr Tesarik <ptesarik@...e.cz>, Ingo Molnar <mingo@...e.hu>,
linux-kernel@...r.kernel.org
Subject: Re: Spinlocks: Factor out GENERIC_LOCKBREAK in order to avoid spinning with irqs disabled
On Tuesday 08 July 2008 06:14, Jeremy Fitzhardinge wrote:
> The other point, of course, is that ticket locks are massive overkill
> for the problem they're trying to solve.
No they aren't.
> It's one thing to introduce an
> element of fairness into spinlocks; it's another to impose strict FIFO
> ordering. It would be enough to make the locks "polite" by preventing a
> new lock-holder from taking the lock while it's under contention.
> Something like:
>
> union lock {
> 	unsigned short word;
> 	struct { unsigned char lock, count; };
> };
>
> spin_lock:	# ebx - lock pointer
> 	movw $0x0001, %ax	# add 1 to lock, 0 to count
> 	lock; xaddw %ax, (%ebx)	# attempt to take lock and test waiter count
> 	testw %ax, %ax
> 	jnz slow		# lock held or contended: go slow
>
> taken:	ret
>
> 	# slow path
> slow:	lock incb 1(%ebx)	# announce ourselves as a waiter
>
> 1:	rep; nop		# pause
> 	cmpb $0, (%ebx)
> 	jnz 1b			# wait for the lock to be released
>
> 	movb $1, %al		# attempt to take lock (count already increased)
> 	xchgb %al, (%ebx)	# xchg is implicitly locked
> 	testb %al, %al
> 	jnz 1b			# raced with another waiter: spin again
>
> 	lock decb 1(%ebx)	# drop waiter count
> 	jmp taken
>
> spin_unlock:
> 	movb $0, (%ebx)
> 	ret
>
>
> The uncontended fastpath is similar to the pre-ticket locks, but it
> refuses to take the lock if there are other waiters, even if the lock is
> not currently held. This prevents the rapid lock-unlock cycle on one
> CPU from starving another CPU, which I understand was the original
> problem ticket locks were trying to solve.
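To keep the discussion concrete, here is your sketch in rough,
untested C11 (the union punning and byte-order assumptions are
x86-specific, and all the names here are made up):

#include <stdatomic.h>

/* Mirrors the asm: low byte is the lock, high byte counts spinners.
 * Punning atomics through a union like this is not strictly portable
 * C, but it matches what the asm does on little-endian x86. */
typedef union {
	_Atomic unsigned short word;
	struct {
		_Atomic unsigned char lock;	/* 0 = free, nonzero = held */
		_Atomic unsigned char count;	/* number of waiters */
	};
} polite_lock_t;

static void polite_lock(polite_lock_t *l)
{
	/* Fast path (the xaddw): succeeds only if the whole word was
	 * zero, i.e. the lock was free and nobody was waiting.  A
	 * failed attempt leaves the lock byte incremented; unlock's
	 * plain store of 0 wipes that again. */
	if (atomic_fetch_add(&l->word, 1) == 0)
		return;

	/* Slow path: register as a waiter, then spin. */
	atomic_fetch_add(&l->count, 1);
	for (;;) {
		while (atomic_load(&l->lock) != 0)
			;	/* rep;nop / cpu_relax() would go here */
		if (atomic_exchange(&l->lock, 1) == 0)	/* the xchgb */
			break;
	}
	atomic_fetch_sub(&l->count, 1);
}

static void polite_unlock(polite_lock_t *l)
{
	atomic_store(&l->lock, 0);
}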
They prevent lots of unfairness and starvation problems. The most
prominent one (i.e. actually observed in Linux) was a single CPU
being totally starved by N others (to the point where lockup timers
would kick in).
As an aside, the lock you propose is also a lot more costly in the
contended path: four atomic operations on the lock cacheline (the
failed xadd, the waiter-count increment, the xchg, and the decrement)
versus the ticket lock's single xadd.
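For comparison, here is the ticket lock's shape in the same rough,
untested C11 style (a sketch, not the kernel's actual x86 asm; the
names are made up). The contended path touches the lock cacheline
with exactly one atomic RMW, the ticket grab:

#include <stdatomic.h>

typedef struct {
	_Atomic unsigned char next;	/* next ticket to hand out */
	_Atomic unsigned char owner;	/* ticket currently served */
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
	/* The only atomic RMW on the lock cacheline, contended or not. */
	unsigned char me = atomic_fetch_add(&l->next, 1);

	/* Spin with plain loads until our ticket comes up: strict FIFO. */
	while (atomic_load(&l->owner) != me)
		;	/* rep;nop / cpu_relax() would go here */
}

static void ticket_unlock(ticket_lock_t *l)
{
	/* Only the holder writes owner, so a plain increment suffices. */
	atomic_store(&l->owner, atomic_load(&l->owner) + 1);
}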
> But it also means that all the contended spinners get the lock in
> whatever order the system decides to give it to them, rather than
> imposing a strict order.
The exact problem is that the system actively does the wrong thing
when you allow it to decide.
Unlike simple cacheline access, I don't believe it is such a good
idea to, for example, batch up many acquisitions of the same lock
on one CPU. While that surely could improve performance in specific
situations, I think that if code is taking and releasing a lock many
times in quick succession, it most probably should either be reworked
to hold the lock for longer or be changed completely. And once a lock
becomes *really* contended, the cost of moving the critical section
to another CPU is drowned out by the cost of the contention itself
(all the idle time, and bouncing the lock cacheline around).
So far my theory has held up (except for virtualized systems).