linux-kernel - Re: [PATCH v2 32/43] KVM: VMX: Move preemption timer <=> hrtimer dance to common x86


Message-ID: <06ed37a510347fdc7c6f7ce46fd98ce5b9ff7554.camel@redhat.com>
Date:   Thu, 28 Oct 2021 18:45:14 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Marc Zyngier <maz@...nel.org>,
        Huacai Chen <chenhuacai@...nel.org>,
        Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
        Paul Mackerras <paulus@...abs.org>,
        Anup Patel <anup.patel@....com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Janosch Frank <frankja@...ux.ibm.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     James Morse <james.morse@....com>,
        Alexandru Elisei <alexandru.elisei@....com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Atish Patra <atish.patra@....com>,
        David Hildenbrand <david@...hat.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        linux-mips@...r.kernel.org, kvm@...r.kernel.org,
        kvm-ppc@...r.kernel.org, kvm-riscv@...ts.infradead.org,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
        David Matlack <dmatlack@...gle.com>,
        Oliver Upton <oupton@...gle.com>,
        Jing Zhang <jingzhangos@...gle.com>
Subject: Re: [PATCH v2 32/43] KVM: VMX: Move preemption timer <=> hrtimer
 dance to common x86

On Fri, 2021-10-08 at 19:12 -0700, Sean Christopherson wrote:
> Handle the switch to/from the hypervisor/software timer when a vCPU is
> blocking in common x86 instead of in VMX.  Even though VMX is the only
> user of a hypervisor timer, the logic and all functions involved are
> generic x86 (unless future CPUs do something completely different and
> implement a hypervisor timer that runs regardless of mode).
> 
> Handling the switch in common x86 will allow for the elimination of the
> pre/post_blocks hooks, and also lets KVM switch back to the hypervisor
> timer if and only if it was in use (without additional params).  Add a
> comment explaining why the switch cannot be deferred to kvm_sched_out()
> or kvm_vcpu_block().
> 
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/vmx/vmx.c |  6 +-----
>  arch/x86/kvm/x86.c     | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b3bb2031a7ac..a24f19874716 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7464,16 +7464,12 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
>  
>  static int vmx_pre_block(struct kvm_vcpu *vcpu)
>  {
> -	if (kvm_lapic_hv_timer_in_use(vcpu))
> -		kvm_lapic_switch_to_sw_timer(vcpu);
> -
>  	return 0;
>  }
>  
>  static void vmx_post_block(struct kvm_vcpu *vcpu)
>  {
> -	if (kvm_x86_ops.set_hv_timer)
> -		kvm_lapic_switch_to_hv_timer(vcpu);
> +
>  }
>  
>  static void vmx_setup_mce(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e0219acfd9cf..909e932a7ae7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9896,8 +9896,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  {
> +	bool hv_timer;
> +
>  	if (!kvm_arch_vcpu_runnable(vcpu) &&
>  	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
> +		/*
> +		 * Switch to the software timer before halt-polling/blocking as
> +		 * the guest's timer may be a break event for the vCPU, and the
> +		 * hypervisor timer runs only when the CPU is in guest mode.
> +		 * Switch before halt-polling so that KVM recognizes an expired
> +		 * timer before blocking.
> +		 */

I didn't know about this until now, but it all makes sense. The comment is very good.
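
For anyone following along, here is a minimal user-space sketch of the
save/switch/restore pattern the hunk introduces. It is only a toy model,
not KVM code: the toy_* names, the enum, and the counters are all
hypothetical stand-ins, and the real kvm_lapic_* helpers do considerably
more than flip a field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of which backend currently drives the guest timer. */
enum toy_timer_backend { TOY_TIMER_NONE, TOY_TIMER_HV, TOY_TIMER_SW };

/* Hypothetical stand-in for the per-vCPU timer state. */
struct toy_vcpu {
	enum toy_timer_backend timer;
	int sw_switches;	/* number of switches to the software timer */
	int hv_switches;	/* number of switches to the hypervisor timer */
};

static bool toy_hv_timer_in_use(const struct toy_vcpu *vcpu)
{
	return vcpu->timer == TOY_TIMER_HV;
}

static void toy_switch_to_sw_timer(struct toy_vcpu *vcpu)
{
	vcpu->timer = TOY_TIMER_SW;
	vcpu->sw_switches++;
}

static void toy_switch_to_hv_timer(struct toy_vcpu *vcpu)
{
	vcpu->timer = TOY_TIMER_HV;
	vcpu->hv_switches++;
}

/*
 * Placeholder for halt-polling/blocking.  The modelled hypervisor timer
 * only runs in guest mode, so it must not be the active backend here.
 */
static void toy_block(struct toy_vcpu *vcpu)
{
	assert(!toy_hv_timer_in_use(vcpu));
}

/*
 * Remember whether the hypervisor timer was in use, switch to the software
 * timer before blocking, and switch back afterwards if and only if the
 * hypervisor timer was in use on entry.
 */
static void toy_vcpu_block(struct toy_vcpu *vcpu)
{
	bool hv_timer = toy_hv_timer_in_use(vcpu);

	if (hv_timer)
		toy_switch_to_sw_timer(vcpu);

	toy_block(vcpu);

	if (hv_timer)
		toy_switch_to_hv_timer(vcpu);
}
```

The local bool is what lets the restore happen "if and only if it was in
use" without passing extra parameters around: a vCPU that entered with the
software timer is left on the software timer.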

> +		hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
> +		if (hv_timer)
> +			kvm_lapic_switch_to_sw_timer(vcpu);
> +
>  		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>  		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
>  			kvm_vcpu_halt(vcpu);
> @@ -9905,6 +9918,9 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  			kvm_vcpu_block(vcpu);
>  		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  
> +		if (hv_timer)
> +			kvm_lapic_switch_to_hv_timer(vcpu);
> +
>  		if (kvm_x86_ops.post_block)
>  			static_call(kvm_x86_post_block)(vcpu);
>  
> @@ -10136,6 +10152,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  			r = -EINTR;
>  			goto out;
>  		}
> +		/*
> +		 * It should be impossible for the hypervisor timer to be in
> +		 * use before KVM has ever run the vCPU.
> +		 */
> +		WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
>  		kvm_vcpu_block(vcpu);
>  		if (kvm_apic_accept_events(vcpu) < 0) {
>  			r = 0;

Reviewed-by: Maxim Levitsky <mlevitsk@...hat.com>

Best regards,
	Maxim Levitsky