Date:	Mon, 18 May 2009 08:19:00 +0100
From:	"Jan Beulich" <JBeulich@...ell.com>
To:	"Jeremy Fitzhardinge" <jeremy@...p.org>
Cc:	"Ingo Molnar" <mingo@...e.hu>,
	"Jun Nakajima" <jun.nakajima@...el.com>,
	"Xiaohui Xin" <xiaohui.xin@...el.com>, "Xin Li" <xin.li@...el.com>,
	"Xen-devel" <xen-devel@...ts.xensource.com>,
	"Nick Piggin" <npiggin@...e.de>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [Xen-devel] Performance overhead of paravirt_ops on native identified

>>> Jeremy Fitzhardinge <jeremy@...p.org> 15.05.09 20:50 >>>
>Jan Beulich wrote:
>> A patch for the pv-ops kernel would require some time. What I can give you
>> right away - just for reference - are the sources we currently use in our kernel:
>> attached.
>
>Hm, I see.  Putting a call out to a pv-ops function in the ticket lock 
>slow path looks pretty straightforward.  The need for an extra lock on 
>the contended unlock side is a bit unfortunate; have you measured to see 
>what hit that has?  Seems to me like you could avoid the problem by 
>using per-cpu storage rather than stack storage (though you'd need to 
>copy the per-cpu data to stack when handling a nested spinlock).

I'm not sure how you imagine that working: the unlock code has to look at all
CPUs' data in either case, so an inner lock would still be required, imo.

>What's the thinking behind the xen_spin_adjust() stuff?

That's the placeholder for implementing interrupt re-enabling in the irq-save
lock path. The requirement is that if a nested lock attempt hits the same
lock on the same CPU that it earlier failed to acquire (but already got a
ticket for), the tickets for the given (lock, cpu) pair need to be shifted
circularly so that the innermost requestor gets the earliest ticket. That
will be the function's job if I ever get around to implementing this.

>> static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
>> {
>> 	unsigned int token, count;
>> 	bool free;
>>
>> 	__ticket_spin_lock_preamble;
>> 	if (unlikely(!free))
>> 		token = xen_spin_adjust(lock, token);
>> 	do {
>> 		count = 1 << 10;
>> 		__ticket_spin_lock_body;
>> 	} while (unlikely(!count) && !xen_spin_wait(lock, token));
>> }
>
>How does this work?  Doesn't it always go into the slowpath loop even if 
>the preamble got the lock with no contention?

It does indeed always enter the slowpath loop, but only for a single pass
through part of its body (the first compare in the body macro makes it exit
the loop right away: 'token' is not just the ticket here, but the full
lock->slock contents). But yes, I think you're right: one could avoid
entering the body altogether by moving the containing loop into the if(!free)
body. The logic went through a number of rewrites, so I must have overlooked
that opportunity on the last round of adjustments.

Jan

