linux-kernel - Re: Enhancement for PLE handler in KVM


Message-ID: <53173961.6080301@redhat.com>
Date:	Wed, 05 Mar 2014 15:49:05 +0100
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	"Li, Bin (Bin)" <bin.bl.li@...atel-lucent.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>
CC:	"Jatania, Neel (Neel)" <Neel.Jatania@...atel-lucent.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Mike Galbraith <efault@....de>,
	Chris Wright <chrisw@...s-sol.org>,
	"ttracy@...hat.com" <ttracy@...hat.com>,
	"Nakajima, Jun" <jun.nakajima@...el.com>,
	"riel@...hat.com" <riel@...hat.com>
Subject: Re: Enhancement for PLE handler in KVM

On 05/03/2014 15:17, Li, Bin (Bin) wrote:
> Hello, Paolo,
>
> We are using a customized embedded SMP OS as guest OS. It is not meaningful to post the guest OS code.
> Also, there are no "performance numbers for common workloads" since there are no common workloads to compare with.
> In our OS, there is still a big kernel lock to protect the kernel.

Does this mean that the average spinning time for the spinlock is 
relatively high compared to Linux or Windows?

> - when the incorrect boosting happens, the vCPU spinning on the lock
> runs longer on the pCPU, leaving the lock holder vCPU less time to
> run, since they are sharing the same pCPU.

Correct.  This is an unfortunate problem in the current implementation 
of PLE.

> Adding a hypercall on every kernel entry and kernel exit is
> expensive. From the trace log collected on an i7 running @ 3.0GHz, the
> cost per hypercall is <1us.

Right, it is around 1500 cycles and 0.4-0.5 us, i.e. approximately 1 us 
for enter and exit together.

This is not too bad for a kernel with a big lock, but not acceptable if 
you do not have it (as is the case for Linux and Windows).
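
As a rough check of the arithmetic above, here is a small sketch. The 
3.0 GHz clock and the 1500-cycle figure come from this thread; the 
function name cycles_to_us is made up for illustration.

```python
def cycles_to_us(cycles, cpu_hz):
    """Convert a cycle count to microseconds at a fixed clock rate."""
    return cycles / cpu_hz * 1e6

CPU_HZ = 3.0e9            # i7 @ 3.0 GHz, from the quoted trace
HYPERCALL_CYCLES = 1500   # approximate cost of one hypercall

one_way = cycles_to_us(HYPERCALL_CYCLES, CPU_HZ)   # one hypercall
round_trip = 2 * one_way                           # kernel enter + kernel exit

print(f"{one_way:.2f} us per hypercall, {round_trip:.2f} us per enter/exit pair")
```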

> Regarding the "paravirtual ticketlock", we did try the same idea in our embedded guest OS.
> We got the following results:
>
> a) We implemented an approach similar to the Linux "paravirtual
> ticketlock". The system clock jitter is reduced a lot, but it
> still happens at a lower rate. In a few hours of system stress
> testing, we still see the big jitter a few times.

Did you find out why?  It could happen if the virtual CPU is scheduled 
out for a relatively long time.  A small number of spinning iterations 
can then account for a relatively large time.

My impression is that you're implementing a paravirtual spinlock, except 
that you're relying on PLE to decide when to go to sleep.  PLE is 
implemented using the TSC.  Can you assume the host TSC is of good 
quality?  If so, perhaps you can try to modify the pv ticketlock 
algorithm, and use a threshold based on TSC instead of an iteration count?
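
To make the difference concrete, here is a toy model of the two 
policies. It is only a sketch, not KVM or Linux code: FakeVcpu, 
spin_by_iterations, spin_by_tsc, the 100-cycle cost per attempt and the 
3,000,000-cycle stall are all assumptions chosen for illustration. In a 
real guest the clock read would be the TSC, and "giving up" would mean 
blocking the vCPU until it is kicked.

```python
class FakeVcpu:
    """Hypothetical model: each lock attempt costs `step` cycles, and the
    vCPU is scheduled out for `stall` cycles during attempt `stall_at`."""

    def __init__(self, step, stall_at, stall):
        self.now = 0          # simulated TSC value
        self.attempts = 0
        self.step, self.stall_at, self.stall = step, stall_at, stall

    def read_tsc(self):
        return self.now

    def try_lock(self):
        self.attempts += 1
        if self.attempts == self.stall_at:
            self.now += self.stall    # vCPU was descheduled here
        self.now += self.step
        return False                  # the holder never releases in this model


def spin_by_iterations(try_lock, max_iters):
    """Give up after a fixed number of attempts, whatever time they took."""
    for i in range(1, max_iters + 1):
        if try_lock():
            return True, i
    return False, max_iters


def spin_by_tsc(try_lock, read_tsc, budget_cycles):
    """Give up once `budget_cycles` have elapsed on the clock."""
    start = read_tsc()
    iters = 0
    while True:
        iters += 1
        if try_lock():
            return True, iters
        if read_tsc() - start >= budget_cycles:
            return False, iters


# Same stall in both runs: it hits during the 5th attempt.
a = FakeVcpu(step=100, stall_at=5, stall=3_000_000)
by_iters = spin_by_iterations(a.try_lock, max_iters=200)

b = FakeVcpu(step=100, stall_at=5, stall=3_000_000)
by_tsc = spin_by_tsc(b.try_lock, b.read_tsc, budget_cycles=20_000)

print("iteration threshold:", by_iters, "cycles elapsed:", a.now)
print("TSC threshold:      ", by_tsc, "cycles elapsed:", b.now)
```

In this model the two thresholds are equivalent when nothing is 
scheduled out (20,000 cycles is 200 attempts at 100 cycles each). Once 
the stall happens, the iteration count keeps spinning through its 
remaining attempts, while the TSC-based loop notices the elapsed time 
and gives up right away.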

> b) When using "paravirtual ticketlock", the threshold to decide "are
> we spinning too much" becomes an important factor that needs to be
> tuned to the final system case by case. What we found from the test is
> that different applications running in our guest OS require different
> threshold settings.

Did you also find out why this is the case?

Paolo