Date:   Sun, 23 Oct 2022 19:48:46 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Emanuele Giuseppe Esposito <eesposit@...hat.com>,
        kvm@...r.kernel.org
Cc:     Jonathan Corbet <corbet@....net>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        David Hildenbrand <david@...hat.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] KVM: use signals to abort enter_guest/blocking and
 retry

On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
> Once a vcpu executes KVM_RUN, it can be in one of two states:
> running in guest mode, or blocked/halted.
> Use a signal to make such a vcpu exit guest mode or unblock,
> so that it exits KVM_RUN and releases the read semaphore,
> allowing a pending KVM_KICK_ALL_RUNNING_VCPUS to continue.
> 
> Note that the signal is not deleted; it is used to propagate the
> exit reason up to vcpu_run().  It will be cleared only by
> KVM_RESUME_ALL_KICKED_VCPUS.  This allows the vcpu to keep retrying
> KVM_RUN and re-perform all the checks done in
> kvm_arch_vcpu_ioctl_run() before entering the guest state,
> where it will return again if the request is still set.
> 
> However, the userspace hypervisor should also try to avoid
> continuously calling KVM_RUN after invoking KVM_KICK_ALL_RUNNING_VCPUS,
> because such a call will just translate into a back-to-back down_read()
> and up_read() (thanks to the signal).
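
(For reference, the userspace flow described above would look roughly
like the untested sketch below; wait_for_resume()/notify_resume()/
handle_exit() stand in for VMM-specific code, and the two new ioctl
names are only the ones proposed by this series, they are not in
mainline <linux/kvm.h>.)

#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

extern void wait_for_resume(void);     /* VMM-specific, assumed */
extern void notify_resume(void);       /* VMM-specific, assumed */
extern void handle_exit(int vcpu_fd);  /* VMM-specific, assumed */

/*
 * vCPU thread: -EINTR from KVM_RUN means this vCPU was kicked.  Do not
 * retry KVM_RUN in a loop, it would only bounce off the read semaphore;
 * wait for the control thread to finish its update instead.
 */
static void vcpu_loop(int vcpu_fd)
{
	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0 && errno == EINTR) {
			wait_for_resume();
			continue;
		}
		handle_exit(vcpu_fd);
	}
}

/* Control thread: quiesce every vCPU, update the memory map, resume. */
static void memory_update(int vm_fd,
			  struct kvm_userspace_memory_region *mem)
{
	/* Returns once no vCPU holds the read side of the semaphore. */
	ioctl(vm_fd, KVM_KICK_ALL_RUNNING_VCPUS, 0);
	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, mem);
	/* Clears KVM_REQ_USERSPACE_KICK so vCPUs can enter the guest again. */
	ioctl(vm_fd, KVM_RESUME_ALL_KICKED_VCPUS, 0);
	notify_resume();
}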

Since userspace should avoid this effectively-busy wait anyway, what 
about clearing the request after the first exit?  The cancellation 
ioctl can be kept for vCPUs that are never entered after 
KVM_KICK_ALL_RUNNING_VCPUS.  Alternatively, kvm_clear_all_cpus_request 
could be done right before up_write().
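
That is, roughly (untested, just to show where the clear would sit in
the KVM_KICK_ALL_RUNNING_VCPUS hunk above):

	case KVM_KICK_ALL_RUNNING_VCPUS: {
		kvm_make_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);

		/*
		 * Wait for all running vCPUs to drop the read side, then
		 * clear the request before releasing the write side so
		 * that a subsequent KVM_RUN does not immediately return
		 * -EINTR again.
		 */
		down_write(&memory_transaction);
		kvm_clear_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
		up_write(&memory_transaction);
		break;
	}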

Paolo

> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@...hat.com>
> ---
>   arch/x86/include/asm/kvm_host.h |  2 ++
>   arch/x86/kvm/x86.c              |  8 ++++++++
>   virt/kvm/kvm_main.c             | 21 +++++++++++++++++++++
>   3 files changed, 31 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index aa381ab69a19..d5c37f344d65 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -108,6 +108,8 @@
>   	KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>   #define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
>   	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> +#define KVM_REQ_USERSPACE_KICK		\
> +	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT)
>   
>   #define CR0_RESERVED_BITS                                               \
>   	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b0c47b41c264..2af5f427b4e9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10270,6 +10270,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   	}
>   
>   	if (kvm_request_pending(vcpu)) {
> +		if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu)) {
> +			r = -EINTR;
> +			goto out;
> +		}
>   		if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) {
>   			r = -EIO;
>   			goto out;
> @@ -10701,6 +10705,10 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
>   			r = vcpu_block(vcpu);
>   		}
>   
> +		/* vcpu exited guest/unblocked because of this request */
> +		if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
> +			return -EINTR;
> +
>   		if (r <= 0)
>   			break;
>   
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ae0240928a4a..13fa7229b85d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3431,6 +3431,8 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>   		goto out;
>   	if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
>   		goto out;
> +	if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
> +		goto out;
>   
>   	ret = 0;
>   out:
> @@ -4668,6 +4670,25 @@ static long kvm_vm_ioctl(struct file *filp,
>   		r = kvm_vm_ioctl_enable_cap_generic(kvm, &cap);
>   		break;
>   	}
> +	case KVM_KICK_ALL_RUNNING_VCPUS: {
> +		/*
> +		 * Notify all running vcpus that they have to stop.
> +		 * Caught in kvm_arch_vcpu_ioctl_run()
> +		 */
> +		kvm_make_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
> +
> +		/*
> +		 * Use wr semaphore to wait for all vcpus to exit from KVM_RUN.
> +		 */
> +		down_write(&memory_transaction);
> +		up_write(&memory_transaction);
> +		break;
> +	}
> +	case KVM_RESUME_ALL_KICKED_VCPUS: {
> +		/* Remove all requests sent with KVM_KICK_ALL_RUNNING_VCPUS */
> +		kvm_clear_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
> +		break;
> +	}
>   	case KVM_SET_USER_MEMORY_REGION: {
>   		struct kvm_userspace_memory_region kvm_userspace_mem;
>   
