linux-kernel - Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1348832438.5551.5.camel@oc6622382223.ibm.com>
Date:	Fri, 28 Sep 2012 06:40:38 -0500
From:	Andrew Theurer <habanero@...ux.vnet.ibm.com>
To:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
Cc:	Avi Kivity <avi@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...hat.com>,
	Srikar <srikar@...ux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
	KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
	chegu vinod <chegu_vinod@...com>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
	Gleb Natapov <gleb@...hat.com>,
	Andrew Jones <drjones@...hat.com>
Subject: Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit
 scenarios in PLE handler

On Fri, 2012-09-28 at 11:08 +0530, Raghavendra K T wrote:
> On 09/27/2012 05:33 PM, Avi Kivity wrote:
> > On 09/27/2012 01:23 PM, Raghavendra K T wrote:
> >>>
> >>> This gives us a good case for tracking preemption on a per-vm basis.  As
> >>> long as we aren't preempted, we can keep the PLE window high, and also
> >>> return immediately from the handler without looking for candidates.
> >>
> >> 1) So do you think, deferring preemption patch ( Vatsa was mentioning
> >> long back)  is also another thing worth trying, so we reduce the chance
> >> of LHP.
> >
> > Yes, we have to keep it in mind.  It will be useful for fine grained
> > locks, not so much so coarse locks or IPIs.
> >
> 
> Agree.
> 
> > I would still of course prefer a PLE solution, but if we can't get it to
> > work we can consider preemption deferral.
> >
> 
> Okay.
> 
> >>
> >> IIRC, with defer preemption :
> >> we will have hook in spinlock/unlock path to measure depth of lock held,
> >> and shared with host scheduler (may be via MSRs now).
> >> Host scheduler 'prefers' not to preempt lock holding vcpu. (or rather
> >> give say one chance.
> >
> > A downside is that we have to do that even when undercommitted.

Hopefully vcpu preemption is very rare when undercommitted, so it should
not happen much at all.
> >
> > Also there may be a lot of false positives (deferred preemptions even
> > when there is no contention).

It will be interesting to see how this behaves with a very high lock
activity in a guest.  Once the scheduler defers preemption, is it for a
fixed amount of time, or does it know to cut the deferral short as soon
as the lock depth is reduced [by x]?
> 
> Yes. That is a worry.
> 
> >
> >>
> >> 2) looking at the result (comparing A & C) , I do feel we have
> >> significant in iterating over vcpus (when compared to even vmexit)
> >> so We still would need undercommit fix sugested by PeterZ (improving by
> >> 140%). ?
> >
> > Looking only at the current runqueue?  My worry is that it misses a lot
> > of cases.  Maybe try the current runqueue first and then others.
> >
> > Or were you referring to something else?
> 
> No. I was referring to the same thing.
> 
> However. I had tried following also (which works well to check 
> undercommited scenario). But thinking to use only for yielding in case
> of overcommit (yield in overcommit suggested by Rik) and keep 
> undercommit patch as suggested by PeterZ
> 
> [ patch is not in proper diff I suppose ].
> 
> Will test them.
> 
> Peter, Can I post your patch with your from/sob.. in V2?
> Please let me know..
> 
> ---
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 28f00bc..9ed3759 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1620,6 +1620,21 @@ bool kvm_vcpu_eligible_for_directed_yield(struct 
> kvm_vcpu *vcpu)
>   	return eligible;
>   }
>   #endif
> +
> +bool kvm_overcommitted()
> +{
> +	unsigned long load;
> +
> +	load = avenrun[0] + FIXED_1/200;
> +	load = load >> FSHIFT;
> +	load = (load << 7) / num_online_cpus();
> +
> +	if (load > 128)
> +		return true;
> +
> +	return false;
> +}
> +
>   void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>   {
>   	struct kvm *kvm = me->kvm;
> @@ -1629,6 +1644,9 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>   	int pass;
>   	int i;
> 
> +	if (!kvm_overcommitted())
> +		return;
> +
>   	kvm_vcpu_set_in_spin_loop(me, true);
>   	/*
>   	 * We boost the priority of a VCPU that is runnable but not


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/