Date:	Mon, 26 Nov 2012 15:06:45 +0100
From:	Andrew Jones <drjones@...hat.com>
To:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	riel@...hat.com
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"H. Peter Anvin" <hpa@...or.com>, Avi Kivity <avi@...hat.com>,
	Gleb Natapov <gleb@...hat.com>, Ingo Molnar <mingo@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Srikar <srikar@...ux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
	KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
	Chegu Vinod <chegu_vinod@...com>,
	"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>
Subject: Re: [PATCH V3 RFC 2/2] kvm: Handle yield_to failure return code for
 potential undercommit case

On Mon, Nov 26, 2012 at 02:43:02PM +0100, Andrew Jones wrote:
> On Mon, Nov 26, 2012 at 05:38:04PM +0530, Raghavendra K T wrote:
> > From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> > 
> > yield_to() returns -ESRCH when the run queue length of both the
> > source and the target of yield_to() is one. When we see three
> > successive failures of yield_to(), we assume we are in a potential
> > undercommit case and abort from the PLE handler.
> > The assumption is backed by the low probability of a wrong decision
> > even for worst-case scenarios such as an average runqueue length
> > between 1 and 2.
> > 
> > Note that we do not update the last boosted vcpu in failure cases.
> > Thanks to Avi for raising the question of aborting after the first
> > failure from yield_to().
> > 
> > Reviewed-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
> > Signed-off-by: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> > ---
> >  virt/kvm/kvm_main.c |   26 ++++++++++++++++----------
> >  1 file changed, 16 insertions(+), 10 deletions(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index be70035..053f494 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -1639,6 +1639,7 @@ bool kvm_vcpu_yield_to(struct kvm_vcpu *target)
> >  {
> >  	struct pid *pid;
> >  	struct task_struct *task = NULL;
> > +	bool ret = false;
> >  
> >  	rcu_read_lock();
> >  	pid = rcu_dereference(target->pid);
> > @@ -1646,17 +1647,15 @@ bool kvm_vcpu_yield_to(struct kvm_vcpu *target)
> >  		task = get_pid_task(target->pid, PIDTYPE_PID);
> >  	rcu_read_unlock();
> >  	if (!task)
> > -		return false;
> > +		return ret;
> >  	if (task->flags & PF_VCPU) {
> >  		put_task_struct(task);
> > -		return false;
> > -	}
> > -	if (yield_to(task, 1)) {
> > -		put_task_struct(task);
> > -		return true;
> > +		return ret;
> >  	}
> > +	ret = yield_to(task, 1);
> >  	put_task_struct(task);
> > -	return false;
> > +
> > +	return ret;
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_vcpu_yield_to);
> >  
> > @@ -1697,12 +1696,14 @@ bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
> >  	return eligible;
> >  }
> >  #endif
> > +
> >  void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> >  {
> >  	struct kvm *kvm = me->kvm;
> >  	struct kvm_vcpu *vcpu;
> >  	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
> >  	int yielded = 0;
> > +	int try = 3;
> >  	int pass;
> >  	int i;
> >  
> > @@ -1714,7 +1715,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> >  	 * VCPU is holding the lock that we need and will release it.
> >  	 * We approximate round-robin by starting at the last boosted VCPU.
> >  	 */
> > -	for (pass = 0; pass < 2 && !yielded; pass++) {
> > +	for (pass = 0; pass < 2 && !yielded && try; pass++) {
> >  		kvm_for_each_vcpu(i, vcpu, kvm) {
> >  			if (!pass && i <= last_boosted_vcpu) {
> >  				i = last_boosted_vcpu;
> > @@ -1727,10 +1728,15 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> >  				continue;
> >  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> >  				continue;
> > -			if (kvm_vcpu_yield_to(vcpu)) {
> > +
> > +			yielded = kvm_vcpu_yield_to(vcpu);
> > +			if (yielded > 0) {
> >  				kvm->last_boosted_vcpu = i;
> > -				yielded = 1;
> >  				break;
> > +			} else if (yielded < 0) {
> > +				try--;
> > +				if (!try)
> > +					break;
> >  			}
> >  		}
> >  	}
> > 
> 
> The check done in patch 1/2 is done before the double_rq_lock, so it's
> cheap. Now, this patch is to avoid doing too many get_pid_task calls. I
> wonder if it would make more sense to change the vcpu state from tracking
> the pid to tracking the task. If that were done, then I don't believe this
> patch would be necessary.
> 
> Rik,
> for 34bb10b79de7, was there a reason the pid was used instead of the task?

Never mind, I guess there's no way to validate the task pointer without
checking the pid, since, as your git commit says, there is no guarantee
that the same task always keeps the same vcpu. We'd only know it's valid
if it's running, and if it's running, it's of no interest.

> 
> Drew
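
The commit message's "low probability of a wrong decision" claim can be
sanity-checked with a toy model (not part of the patch; the independence
assumption and the helper name below are mine): if each candidate VCPU's
yield_to() independently returns -ESRCH with probability p, the patched
handler wrongly aborts only when the first three attempted boosts all
fail, i.e. with probability roughly p^3.

```python
import random

def estimate_abort_rate(p_esrch, vcpus=16, trials=100_000):
    """Toy model of the patched PLE loop: scan up to `vcpus` candidates;
    each yield_to() fails with -ESRCH with probability p_esrch, and
    otherwise succeeds, which ends the scan. The handler aborts once
    the `try` counter (3) is exhausted by failures."""
    aborts = 0
    for _ in range(trials):
        fails = 0
        for _ in range(vcpus):
            if random.random() < p_esrch:
                fails += 1
                if fails == 3:   # 'try' exhausted -> abort the handler
                    aborts += 1
                    break
            else:
                break            # a successful boost ends the scan
    return aborts / trials

# With an average runqueue length between 1 and 2, p_esrch is around
# 0.5, so the wrong-abort probability is about 0.5**3, i.e. ~12.5%.
```

Under this simplistic model the worst-case wrong-abort rate stays modest,
which is consistent with the argument in the commit message.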