Date:	Tue, 5 Apr 2016 12:05:12 +0200
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	Luiz Capitulino <lcapitulino@...hat.com>, kvm@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org, rkrcmar@...hat.com,
	mtosatti@...hat.com, riel@...hat.com, bsd@...hat.com
Subject: Re: [PATCH] kvm: x86: make lapic hrtimer pinned

On 04/04/2016 22:46, Luiz Capitulino wrote:
> When a vCPU runs on a nohz_full core, the hrtimer used by
> the lapic emulation code can be migrated to another core.
> When this happens, it's possible to observe millisecond
> latency when delivering timer IRQs to KVM guests.
> 
> The huge latency is mainly due to the fact that
> apic_timer_fn() expects to run during a kvm exit. It
> sets KVM_REQ_PENDING_TIMER and lets it be handled on kvm
> entry. However, if the timer fires on a different core,
> we have to wait until the next kvm exit for the guest
> to see KVM_REQ_PENDING_TIMER set.
> 
> This problem became visible after commit 9642d18ee. This
> commit changed the timer migration code to always attempt
> to migrate timers away from nohz_full cores. While it's
> debatable whether this is correct/desirable (I don't think
> it is), it's clear that the lapic emulation code has
> a requirement on firing the hrtimer in the same core
> where it was started. This is achieved by making the
> hrtimer pinned.
> 
> Lastly, note that KVM has code to migrate timers when a
> vCPU is scheduled to run on a different core. However, this
> forced migration may fail. When this happens, we can have
> the same problem. If we want 100% correctness, we'll have
> to modify apic_timer_fn() to cause a kvm exit when it runs
> on a different core than the vCPU. Not sure if this is
> possible.
> 
> Here's a reproducer for the issue being fixed:
> 
>  1. Set all cores but core0 to be nohz_full cores
>  2. Start a guest with a single vCPU
>  3. Trace apic_timer_fn() and kvm_inject_apic_timer_irqs()
> 
> You'll see that apic_timer_fn() will run in core0 while
> kvm_inject_apic_timer_irqs() runs in a different core. If
> you get both on core0, try running a program that takes 100%
> of the CPU and pin it to core0 to force the vCPU out.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@...hat.com>
> ---
>  arch/x86/kvm/lapic.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 443d2a5..1a2da0e 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1369,7 +1369,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
>  
>  		hrtimer_start(&apic->lapic_timer.timer,
>  			      ktime_add_ns(now, apic->lapic_timer.period),
> -			      HRTIMER_MODE_ABS);
> +			      HRTIMER_MODE_ABS_PINNED);
>  
>  		apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
>  			   PRIx64 ", "
> @@ -1402,7 +1402,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
>  			expire = ktime_add_ns(now, ns);
>  			expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
>  			hrtimer_start(&apic->lapic_timer.timer,
> -				      expire, HRTIMER_MODE_ABS);
> +				      expire, HRTIMER_MODE_ABS_PINNED);
>  		} else
>  			apic_timer_expired(apic);
>  
> @@ -1868,7 +1868,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
>  	apic->vcpu = vcpu;
>  
>  	hrtimer_init(&apic->lapic_timer.timer, CLOCK_MONOTONIC,
> -		     HRTIMER_MODE_ABS);
> +		     HRTIMER_MODE_ABS_PINNED);
>  	apic->lapic_timer.timer.function = apic_timer_fn;
>  
>  	/*
> @@ -2003,7 +2003,7 @@ void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>  
>  	timer = &vcpu->arch.apic->lapic_timer.timer;
>  	if (hrtimer_cancel(timer))
> -		hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
> +		hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
>  }
>  
>  /*
> 

Queued for 4.6.0-rc3, thanks.

Paolo
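
For readers of the archive, the latency argument in the quoted commit message can be illustrated with a toy userspace model. This is only a sketch, not kernel code: struct vcpu_model, next_exit_ns() and delivery_latency_ns() are hypothetical names, and the model assumes the guest exits at a fixed period and that exit/entry cost is zero. It shows why a callback running on the vCPU's own core is seen right away, while one running elsewhere waits for the next kvm exit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (hypothetical, NOT kernel code) of when a guest notices a
 * pending-timer request set by the lapic hrtimer callback.
 */
struct vcpu_model {
	int      vcpu_core;      /* core the vCPU thread currently runs on */
	uint64_t exit_period_ns; /* assumption: guest exits at t = k * period */
};

/* Time of the first guest exit at or after t_ns. */
static uint64_t next_exit_ns(const struct vcpu_model *m, uint64_t t_ns)
{
	assert(m->exit_period_ns > 0);
	uint64_t k = (t_ns + m->exit_period_ns - 1) / m->exit_period_ns;
	return k * m->exit_period_ns;
}

/*
 * Delay between the timer firing and the guest seeing the request.
 * Same core: the timer interrupt itself forces an exit, so the request
 * is handled on the following entry (modeled as zero delay).
 * Other core: the request sits until the guest's next exit.
 */
static uint64_t delivery_latency_ns(const struct vcpu_model *m,
				    int timer_core, uint64_t fire_ns)
{
	if (timer_core == m->vcpu_core)
		return 0;
	return next_exit_ns(m, fire_ns) - fire_ns;
}
```

With a vCPU on core 1 that exits every 1 ms, a timer firing at 250 us on core 1 is modeled as delivered immediately, while the same timer firing on core 0 waits 750 us. That is the kind of millisecond-scale delay the patch avoids by pinning the hrtimer to the core where it was started.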
