Message-ID: <20100603134500.GN6822@laptop>
Date:	Thu, 3 Jun 2010 23:45:00 +1000
From:	Nick Piggin <npiggin@...e.de>
To:	Srivatsa Vaddagiri <vatsa@...ibm.com>
Cc:	Avi Kivity <avi@...hat.com>, Andi Kleen <andi@...stfloor.org>,
	Gleb Natapov <gleb@...hat.com>, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, hpa@...or.com, mingo@...e.hu,
	tglx@...utronix.de, mtosatti@...hat.com
Subject: Re: [PATCH] use unfair spinlock when running on hypervisor.

On Thu, Jun 03, 2010 at 06:28:21PM +0530, Srivatsa Vaddagiri wrote:
> On Thu, Jun 03, 2010 at 10:38:32PM +1000, Nick Piggin wrote:
> > Holding a ticket in the queue is effectively the same as holding the
> > lock, from the pov of processes waiting behind.
> > 
> > The difference of course is that CPU cycles do not directly reduce
> > latency of ticket holders (only the owner). Spinlock critical sections
> > should tend to be several orders of magnitude shorter than context
> > switch times. So if you preempt the guy waiting at the head of the
> > queue, then it's almost as bad as preempting the lock holder.
> 
> Ok got it - although that approach is not advisable in some cases for ex: when
> the lock holder vcpu and lock acquired vcpu are scheduled on the same pcpu by
> the hypervisor (which was experimented with in [1] where they found a huge hit in
> perf).

Sure, but adaptive yielding would solve that problem.
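
To make the quoted point about ticket queues concrete, here is a minimal
single-threaded model in C. It is only a sketch: the names (ticket_lock,
ticket_take, ticket_is_served, ticket_release) are made up for illustration,
it uses no atomics, and it is not the kernel's spinlock implementation. It
shows that a waiter further back cannot take the lock until the ticket ahead
of it has been served and released, even if the lock itself is idle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: plain integers, no atomics, no real spinning. */
struct ticket_lock {
    uint32_t next;   /* next ticket to hand out */
    uint32_t owner;  /* ticket currently allowed to hold the lock */
};

/* Join the queue; the returned ticket fixes our place in line. */
static uint32_t ticket_take(struct ticket_lock *l)
{
    return l->next++;
}

/* True once every earlier ticket has been released. */
static bool ticket_is_served(const struct ticket_lock *l, uint32_t t)
{
    return l->owner == t;
}

/* Release: pass the lock to the next ticket in strict FIFO order. */
static void ticket_release(struct ticket_lock *l)
{
    l->owner++;
}
```

If the vcpu holding the ticket at the head of the queue is preempted after
the previous owner releases, ticket_is_served() stays false for everyone
behind it until that vcpu runs again, which is why it is almost as bad as
preempting the lock holder.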

 
> I agree that in general we should look at deferring preemption of lock 
> acquirer esp when its at "head" as you suggest - I will consider that approach
> as the next step (want to incrementally progress basically!).
>
> > > > Have you also looked at how s390 checks if the owning vcpu is running
> > > > and if so it spins, if not yields to the hypervisor. Something like
> > > > turning it into an adaptive lock. This could be applicable as well.
> > > 
> > > I don't think even s390 does adaptive spinlocks. Also afaik s390 zVM does gang
> > > scheduling of vcpus, which reduces the severity of this problem very much -
> > > essentially lock acquirer/holder are run simultaneously on different cpus all
> > > the time. Gang scheduling is on my list of things to look at much later
> > > (although I have been warned that it's a scalability nightmare!).
> > 
> > It effectively is pretty well an adaptive lock. The spinlock itself
> > doesn't sleep of course, but it yields to the hypervisor if the owner
> > has been preempted. This is pretty close to analogous with Linux adaptive mutexes.
> 
> Oops you are right - sorry should have checked more closely earlier. Given that
> we may not be able to always guarantee that locked critical sections will not be
> preempted (ex: when a real-time task takes over), we will need a combination of 
> both approaches (i.e request preemption defer on lock hold path + yield on lock 
> acquire path if owner !scheduled). The advantage of former approach is that it
> could reduce job turnaround times in most cases (as lock is available when we 
> want or we don't have to wait too long for it).

I think both would be good. It might be interesting to talk with the
s390 guys and see if they can look at ticket locks and preempt defer
techniques too (considering they already do the other half of the
equation well).

 
> > s390 also has the diag9c instruction which I suppose somehow boosts
> > priority of a preempted contended lock holder. In spite of any other
> > possible optimizations in their hypervisor like gang scheduling,
> > diag9c apparently provides quite a large improvement in some cases.
> 
> Ok - thx for that pointer - will have a look at diag9c.
> 
> > So I think these things are fairly important to look at.
> 
> I agree ..
> 
> - vatsa
