lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1109021222080.12963@kaball-desktop>
Date:	Fri, 2 Sep 2011 12:22:31 +0100
From:	Stefano Stabellini <stefano.stabellini@...citrix.com>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
CC:	"H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	the arch/x86 maintainers <x86@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...nel.dk>, Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	KVM <kvm@...r.kernel.org>, Andi Kleen <andi@...stfloor.org>,
	Xen Devel <xen-devel@...ts.xensource.com>,
	Jeremy Fitzhardinge <Jeremy.Fitzhardinge@...rix.com>
Subject: Re: [PATCH 00/13] [PATCH RFC] Paravirtualized ticketlocks

On Fri, 2 Sep 2011, Jeremy Fitzhardinge wrote:
> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>
> 
> This series replaces the existing paravirtualized spinlock mechanism
> with a paravirtualized ticketlock mechanism.
> 
> Ticket locks have an inherent problem in a virtualized case, because
> the vCPUs are scheduled rather than running concurrently (ignoring
> gang scheduled vCPUs).  This can result in catastrophic performance
> collapses when the vCPU scheduler doesn't schedule the correct "next"
> vCPU, and ends up scheduling a vCPU which burns its entire timeslice
> spinning.  (Note that this is not the same problem as lock-holder
> preemption, which this series also addresses; that's also a problem,
> but not catastrophic).
> 
> (See Thomas Friebel's talk "Prevent Guests from Spinning Around"
> http://www.xen.org/files/xensummitboston08/LHP.pdf for more details.)
> 
> Currently we deal with this by having PV spinlocks, which adds a layer
> of indirection in front of all the spinlock functions, and defining a
> completely new implementation for Xen (and for other pvops users, but
> there are none at present).
> 
> PV ticketlocks keeps the existing ticketlock implemenentation
> (fastpath) as-is, but adds a couple of pvops for the slow paths:
> 
> - If a CPU has been waiting for a spinlock for SPIN_THRESHOLD
>   iterations, then call out to the __ticket_lock_spinning() pvop,
>   which allows a backend to block the vCPU rather than spinning.  This
>   pvop can set the lock into "slowpath state".
> 
> - When releasing a lock, if it is in "slowpath state", the call
>   __ticket_unlock_kick() to kick the next vCPU in line awake.  If the
>   lock is no longer in contention, it also clears the slowpath flag.
> 
> The "slowpath state" is stored in the LSB of the within the lock
> ticket.  This has the effect of reducing the max number of CPUs by
> half (so, a "small ticket" can deal with 128 CPUs, and "large ticket"
> 32768).
> 
> This series provides a Xen implementation, but it should be
> straightforward to add a KVM implementation as well.
> 
> Overall, it results in a large reduction in code, it makes the native
> and virtualized cases closer, and it removes a layer of indirection
> around all the spinlock functions.  The downside is that it does add a
> few instructions into the fastpath in the native case.
> 
> Most of the heavy lifting code is in the slowpaths, but it does have
> an effect on the fastpath code.  The inner part of ticket lock code
> becomes:
> 	inc = xadd(&lock->tickets, inc);
> 	inc.tail &= ~TICKET_SLOWPATH_FLAG;
> 
> 	for (;;) {
> 		unsigned count = SPIN_THRESHOLD;
> 
> 		do {
> 			if (inc.head == inc.tail)
> 				goto out;
> 			cpu_relax();
> 			inc.head = ACCESS_ONCE(lock->tickets.head);
> 		} while (--count);
> 		__ticket_lock_spinning(lock, inc.tail);
> 	}
> 
> which results in:
> 
>       pushq   %rbp
>       movq    %rsp,%rbp
> 
>       movl    $512, %ecx
>       lock; xaddw %cx, (%rdi)	# claim ticket
> 
>       movzbl  %ch, %edx
>       movl    $2048, %eax	# set spin count
>       andl    $-2, %edx		# mask off TICKET_SLOWPATH_FLAG
>       movzbl  %dl, %esi
> 
> 1:    cmpb    %dl, %cl		# compare head and tail
>       je      2f   		# got it!
> 
>       ### BEGIN SLOWPATH
>       rep; nop			# pause
>       decl    %eax		# dec count
>       movb    (%rdi), %cl	# re-fetch head
>       jne     1b      		# try again
> 
>       call    *pv_lock_ops	# call __ticket_lock_spinning
>       movl    $2048, %eax	# reload spin count
>       jmp     1b
>       ### END SLOWPATH
> 
> 2:    popq    %rbp
>       ret
> 
> with CONFIG_PARAVIRT_SPINLOCKS=n, the same code identical asm to the
> current ticketlock code:
> 
> 	pushq   %rbp
>         movq    %rsp, %rbp
> 
>         movl    $256, %eax
> 	lock; xaddw %ax, (%rdi)
> 
> 	movzbl  %ah, %edx
> 
> 1:	cmpb    %dl, %al	# compare head and tail
> 	je	2f   		# got it!
> 
> 	### BEGIN SLOWPATH
> 	rep; nop		# pause
> 	movb    (%rdi), %al	# reload head
> 	jmp	1b		# loop
> 	### END SLOWPATH
> 
> 2:	popq	%rbp
> 	ret
> 
> so the pv ticketlocks add 3 extra instructions to the fastpath, one of
> which really doesn't need to be there (setting up the arg for the
> slowpath function):
>       movl    $2048, %eax	# set spin count
>       andl    $-2, %edx		# mask off SLOW_PATH_FLAG
>       movzbl  %dl, %esi		# set up __ticket_lock_spinning arg
> 
> The unlock code is very straightforward:
> 	__ticket_unlock_release(lock);
> 	if (unlikely(__ticket_in_slowpath(lock)))
> 		__ticket_unlock_slowpath(lock);
> which generates:
>       addb $2, (%rdi)
>       testb    $1, 1(%rdi)
>       je       1f
>       call     __ticket_unlock_slowpath
> 1:
> 
> which, while simple, is more complex than the simple "incb (%rdi)".
> (I'm not sure whether its worth inlining this or not.)
> 
> Thoughts? Comments? Suggestions?
> 
> Thanks,
> 	J
> 
> Jeremy Fitzhardinge (12):
>   x86/spinlocks: replace pv spinlocks with pv ticketlocks
>   x86/ticketlock: collapse a layer of functions
>   xen/pvticketlock: Xen implementation for PV ticket locks
>   x86/pvticketlock: use callee-save for lock_spinning
>   x86/ticketlocks: when paravirtualizing ticket locks, increment by 2
>   x86/ticketlock: add slowpath logic
>   x86/ticketlocks: tidy up __ticket_unlock_kick()
>   xen/pvticketlock: disable interrupts while blocking
>   x86/pvticketlocks: we only need to kick if there's waiters
>   xen/pvticket: allow interrupts to be enabled while blocking
>   x86/pvticketlock: make sure unlock_kick pvop call is inlined
>   x86/pvticketlock: use __ticket_t for pvop args
> 
> Srivatsa Vaddagiri (1):
>   x86/ticketlock: only do kick after doing unlock
> 
>  arch/x86/include/asm/paravirt.h       |   30 +---
>  arch/x86/include/asm/paravirt_types.h |    8 +-
>  arch/x86/include/asm/spinlock.h       |  118 ++++++++-----
>  arch/x86/include/asm/spinlock_types.h |   16 ++-
>  arch/x86/kernel/paravirt-spinlocks.c  |   40 +++--
>  arch/x86/xen/spinlock.c               |  305 ++++++++-------------------------
>  6 files changed, 186 insertions(+), 331 deletions(-)

do you have a git tree somewhere with this series?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ