linux-kernel - Re: [PATCH] use unfair spinlock when running on hypervisor.

Message-ID: <20100603125821.GJ4035@linux.vnet.ibm.com>
Date:	Thu, 3 Jun 2010 18:28:21 +0530
From:	Srivatsa Vaddagiri <vatsa@...ibm.com>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Avi Kivity <avi@...hat.com>, Andi Kleen <andi@...stfloor.org>,
	Gleb Natapov <gleb@...hat.com>, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, hpa@...or.com, mingo@...e.hu,
	tglx@...utronix.de, mtosatti@...hat.com
Subject: Re: [PATCH] use unfair spinlock when running on hypervisor.

On Thu, Jun 03, 2010 at 10:38:32PM +1000, Nick Piggin wrote:
> Holding a ticket in the queue is effectively the same as holding the
> lock, from the pov of processes waiting behind.
> 
> The difference of course is that CPU cycles do not directly reduce
> latency of ticket holders (only the owner). Spinlock critical sections
> should tend to be several orders of magnitude shorter than context
> switch times. So if you preempt the guy waiting at the head of the
> queue, then it's almost as bad as preempting the lock holder.

Ok, got it - although that approach is not advisable in some cases, e.g. when
the lock holder vcpu and the lock acquirer vcpu are scheduled on the same pcpu
by the hypervisor (which was experimented with in [1], where they found a huge
hit in perf).

I agree that in general we should look at deferring preemption of the lock
acquirer, esp. when it's at the "head" as you suggest - I will consider that
approach as the next step (want to progress incrementally, basically!).
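
A minimal sketch of the ticket-lock behaviour discussed above, assuming C11
atomics. The struct and function names are hypothetical and are not the
kernel's implementation. It only shows why a waiter at the head of the queue
blocks everyone behind it: tickets are served strictly in order, so a later
ticket cannot proceed until every earlier one has taken and released the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ticket lock; names are illustrative, not kernel code. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently allowed to hold the lock */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Join the queue: the returned ticket fixes our position for good. */
static inline unsigned ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* True once every earlier ticket has been released. */
static inline bool ticket_is_served(struct ticket_lock *l, unsigned t)
{
    return atomic_load(&l->owner) == t;
}

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned t = ticket_take(l);

    /* If the vcpu holding ticket t - 1 is preempted, we spin here even
     * when the lock itself is free. */
    while (!ticket_is_served(l, t))
        ;
}

static inline void ticket_lock_release(struct ticket_lock *l)
{
    /* Hand the lock to the next ticket in line, whoever holds it. */
    atomic_fetch_add(&l->owner, 1);
}
```

With an unfair (test-and-set style) lock, any running waiter could grab the
lock once it is free; here the release can only hand it to the next ticket.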

> > > Have you also looked at how s390 checks if the owning vcpu is running
> > > and if so it spins, if not yields to the hypervisor. Something like
> > > turning it into an adaptive lock. This could be applicable as well.
> > 
> > I don't think even s390 does adaptive spinlocks. Also afaik s390 zVM does gang
> > scheduling of vcpus, which reduces the severity of this problem very much -
> > essentially lock acquirer/holder are run simultaneously on different cpus all
> > the time. Gang scheduling is on my list of things to look at much later
> > (although I have been warned that it's a scalability nightmare!).
> 
> It effectively is pretty well an adaptive lock. The spinlock itself
> doesn't sleep of course, but it yields to the hypervisor if the owner
> has been preempted. This is pretty close to analogous with Linux adaptive mutexes.

Oops, you are right - sorry, I should have checked more closely earlier. Given
that we may not always be able to guarantee that locked critical sections will
not be preempted (e.g. when a real-time task takes over), we will need a
combination of both approaches (i.e. request preemption defer on the lock hold
path + yield on the lock acquire path if the owner is !scheduled). The
advantage of the former approach is that it could reduce job turnaround times
in most cases (as the lock is available when we want it, or we don't have to
wait too long for it).
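
A rough sketch of the acquire-side half of that combination (spin while the
owner vcpu is running, yield to the hypervisor when it is not), assuming C11
atomics. Everything here is hypothetical: the `vcpu_running` array stands in
for whatever "is this vcpu scheduled" information the hypervisor would
export, and `yield_to` stands in for a directed-yield hypercall. The
hold-side "defer preemption" request is hypervisor-specific and is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PV_NO_OWNER (-1)

/* Hypothetical lock that records which vcpu currently holds it. */
struct pv_lock {
    atomic_int owner_cpu;   /* PV_NO_OWNER when free */
};

enum pv_action { PV_ACQUIRED, PV_SPIN, PV_YIELD };

static inline void pv_lock_init(struct pv_lock *l)
{
    atomic_init(&l->owner_cpu, PV_NO_OWNER);
}

/*
 * One iteration of the acquire loop. vcpu_running[] is an assumed,
 * hypervisor-maintained table indexed by vcpu id.
 */
static inline enum pv_action pv_lock_try_once(struct pv_lock *l, int me,
                                              atomic_bool *vcpu_running)
{
    int owner = PV_NO_OWNER;

    if (atomic_compare_exchange_strong(&l->owner_cpu, &owner, me))
        return PV_ACQUIRED;

    /* The failed exchange left the current holder's id in 'owner'. */
    return atomic_load(&vcpu_running[owner]) ? PV_SPIN : PV_YIELD;
}

static inline void pv_lock_acquire(struct pv_lock *l, int me,
                                   atomic_bool *vcpu_running,
                                   void (*yield_to)(int cpu))
{
    for (;;) {
        switch (pv_lock_try_once(l, me, vcpu_running)) {
        case PV_ACQUIRED:
            return;
        case PV_YIELD:
            yield_to(atomic_load(&l->owner_cpu)); /* owner preempted */
            break;
        case PV_SPIN:
            break;                                /* owner is running */
        }
    }
}

static inline void pv_unlock(struct pv_lock *l)
{
    atomic_store(&l->owner_cpu, PV_NO_OWNER);
}
```

Note that this sketch is a plain unfair lock; combining the same owner check
with a ticket lock would also need a way to find the vcpu that holds the
next ticket.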

> s390 also has the diag9c instruction which I suppose somehow boosts
> priority of a preempted contended lock holder. In spite of any other
> possible optimizations in their hypervisor like gang scheduling,
> diag9c apparently provides quite a large improvement in some cases.

Ok - thx for that pointer - will have a look at diag9c.

> So I think these things are fairly important to look at.

I agree ..

- vatsa