linux-kernel - Re: [PATCH V3 RFC 2/2] kvm: Handle yield_to failure return code for potential undercommit case

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20121126134301.GB9830@turtle.usersys.redhat.com>
Date:	Mon, 26 Nov 2012 14:43:02 +0100
From:	Andrew Jones <drjones@...hat.com>
To:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	riel@...hat.com
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"H. Peter Anvin" <hpa@...or.com>, Avi Kivity <avi@...hat.com>,
	Gleb Natapov <gleb@...hat.com>, Ingo Molnar <mingo@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Srikar <srikar@...ux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
	KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
	Chegu Vinod <chegu_vinod@...com>,
	"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>
Subject: Re: [PATCH V3 RFC 2/2] kvm: Handle yield_to failure return code for
 potential undercommit case

On Mon, Nov 26, 2012 at 05:38:04PM +0530, Raghavendra K T wrote:
> From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> 
> yield_to returns -ESRCH, When source and target of yield_to
> run queue length is one. When we see three successive failures of
> yield_to we assume we are in potential undercommit case and abort
> from PLE handler.
> The assumption is backed by low probability of wrong decision
> for even worst case scenarios such as average runqueue length
> between 1 and 2.
> 
> note that we do not update last boosted vcpu in failure cases.
> Thank Avi for raising question on aborting after first fail from yield_to.
> 
> Reviewed-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
> Signed-off-by: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c |   26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index be70035..053f494 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1639,6 +1639,7 @@ bool kvm_vcpu_yield_to(struct kvm_vcpu *target)
>  {
>  	struct pid *pid;
>  	struct task_struct *task = NULL;
> +	bool ret = false;
>  
>  	rcu_read_lock();
>  	pid = rcu_dereference(target->pid);
> @@ -1646,17 +1647,15 @@ bool kvm_vcpu_yield_to(struct kvm_vcpu *target)
>  		task = get_pid_task(target->pid, PIDTYPE_PID);
>  	rcu_read_unlock();
>  	if (!task)
> -		return false;
> +		return ret;
>  	if (task->flags & PF_VCPU) {
>  		put_task_struct(task);
> -		return false;
> -	}
> -	if (yield_to(task, 1)) {
> -		put_task_struct(task);
> -		return true;
> +		return ret;
>  	}
> +	ret = yield_to(task, 1);
>  	put_task_struct(task);
> -	return false;
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_yield_to);
>  
> @@ -1697,12 +1696,14 @@ bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
>  	return eligible;
>  }
>  #endif
> +
>  void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  {
>  	struct kvm *kvm = me->kvm;
>  	struct kvm_vcpu *vcpu;
>  	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
>  	int yielded = 0;
> +	int try = 3;
>  	int pass;
>  	int i;
>  
> @@ -1714,7 +1715,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	 * VCPU is holding the lock that we need and will release it.
>  	 * We approximate round-robin by starting at the last boosted VCPU.
>  	 */
> -	for (pass = 0; pass < 2 && !yielded; pass++) {
> +	for (pass = 0; pass < 2 && !yielded && try; pass++) {
>  		kvm_for_each_vcpu(i, vcpu, kvm) {
>  			if (!pass && i <= last_boosted_vcpu) {
>  				i = last_boosted_vcpu;
> @@ -1727,10 +1728,15 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  				continue;
>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>  				continue;
> -			if (kvm_vcpu_yield_to(vcpu)) {
> +
> +			yielded = kvm_vcpu_yield_to(vcpu);
> +			if (yielded > 0) {
>  				kvm->last_boosted_vcpu = i;
> -				yielded = 1;
>  				break;
> +			} else if (yielded < 0) {
> +				try--;
> +				if (!try)
> +					break;
>  			}
>  		}
>  	}
> 

The check done in patch 1/2 is done before the double_rq_lock, so it's
cheap. Now, this patch is to avoid doing too many get_pid_task calls. I
wonder if it would make more sense to change the vcpu state from tracking
the pid to tracking the task. If that was done, then I don't believe this
patch is necessary.

Rik,
for 34bb10b79de7 was there a reason pid was used instead of task?

Drew
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/