linux-kernel - Re: [RFC][PATCH] Improving directed yield scalability for PLE handler


Date:	Fri, 14 Sep 2012 16:34:24 -0400
From:	Konrad Rzeszutek Wilk <konrad@...nel.org>
To:	habanero@...ux.vnet.ibm.com
Cc:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...hat.com>,
	KVM <kvm@...r.kernel.org>, chegu vinod <chegu_vinod@...com>,
	LKML <linux-kernel@...r.kernel.org>, X86 <x86@...nel.org>,
	Gleb Natapov <gleb@...hat.com>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>
Subject: Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

> The concern I have is that even though we have gone through changes to
> help reduce the candidate vcpus we yield to, we still have a very poor
> idea of which vcpu really needs to run.  The result is high cpu usage in
> the get_pid_task and still some contention in the double runqueue lock.
> To make this scalable, we either need to significantly reduce the
> occurrence of the lock-holder preemption, or do a much better job of
> knowing which vcpu needs to run (and not unnecessarily yielding to vcpus
> which do not need to run).

The patches that Raghavendra has been posting do accomplish that.
>
> On reducing the occurrence:  The worst case for lock-holder preemption
> is having vcpus of the same VM on the same runqueue.  This guarantees the
> situation of 1 vcpu running while another [of the same VM] is not.  To
> prove the point, I ran the same test, but with vcpus restricted to a
> range of host cpus, such that any single VM's vcpus can never be on the
> same runqueue.  In this case, all 10 VMs' vcpu-0's are on host cpus 0-4,
> vcpu-1's are on host cpus 5-9, and so on.  Here is the result:
>
> kvm_cpu_spin, and all
> yield_to changes, plus
> restricted vcpu placement:  8823 +/- 3.20%   much, much better
>
> On picking a better vcpu to yield to:  I really hesitate to rely on
> a paravirt hint [telling us which vcpu is holding a lock], but I am not
> sure how else to reduce the candidate vcpus to yield to.  I suspect we
> are yielding to way more vcpus than are preempted lock-holders, and that
> IMO is just work accomplishing nothing.  Trying to think of a way to
> further reduce candidate vcpus....

... the patches are posted - you could try them out?
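
For reference, the restricted vcpu placement described in the quoted
text (vcpu-0 of every VM on host cpus 0-4, vcpu-1 on host cpus 5-9, and
so on) can be sketched as a simple mapping.  This is only an
illustration: the function names, the band_width parameter, and the
assumption that "and so on" continues linearly in bands of 5 are mine,
not taken from the test setup, and the number of vcpus per VM is not
stated in the thread.

```python
def host_cpu_range(vcpu_index, band_width=5):
    """Return the inclusive (first, last) host cpu band for a vcpu index.

    band_width=5 matches the quoted example (0-4, 5-9); the linear
    continuation for higher vcpu indexes is an assumption.
    """
    first = vcpu_index * band_width
    return first, first + band_width - 1


def placement(num_vms, vcpus_per_vm, band_width=5):
    """Map (vm, vcpu) -> allowed host cpu band (hypothetical helper).

    Every VM uses the same band for a given vcpu index, so two vcpus of
    one VM are never allowed on the same host cpu, and therefore never
    share a runqueue.
    """
    return {
        (vm, vcpu): host_cpu_range(vcpu, band_width)
        for vm in range(num_vms)
        for vcpu in range(vcpus_per_vm)
    }
```

With 10 VMs, each band of 5 host cpus is shared by 10 vcpus, one per
VM, so runqueue sharing only happens between vcpus of different VMs.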