linux-kernel - Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically

Message-ID: <4734379d-97c4-44c8-ae40-be46da6e6239@linux.intel.com>
Date: Thu, 31 Oct 2024 12:56:14 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, pbonzini@...hat.com,
 isaku.yamahata@...el.com, rick.p.edgecombe@...el.com, kai.huang@...el.com,
 yuan.yao@...ux.intel.com, xiaoyao.li@...el.com
Subject: Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace
 generically




On 10/31/2024 4:49 AM, Sean Christopherson wrote:
> On Mon, Aug 26, 2024, Binbin Wu wrote:
>> Check whether a KVM hypercall needs to exit to userspace based on the
>> hypercall_exit_enabled field of struct kvm_arch.
>>
>> Userspace can request that a hypercall exit to userspace for handling by
>> enabling KVM_CAP_EXIT_HYPERCALL; the enabled hypercalls are recorded in
>> hypercall_exit_enabled.  Make the check generic based on that field.
>>
>> Signed-off-by: Binbin Wu <binbin.wu@...ux.intel.com>
>> Reviewed-by: Kai Huang <kai.huang@...el.com>
>> ---
>>   arch/x86/kvm/x86.c | 5 +++--
>>   arch/x86/kvm/x86.h | 4 ++++
>>   2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 966fb301d44b..e521f14ad2b2 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10220,8 +10220,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>>   	cpl = kvm_x86_call(get_cpl)(vcpu);
>>   
>>   	ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
>> -	if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
>> -		/* MAP_GPA tosses the request to the user space. */
>> +	/* Check !ret first to make sure nr is a valid KVM hypercall. */
>> +	if (!ret && user_exit_on_hypercall(vcpu->kvm, nr))
> I don't love that the caller has to re-check for user_exit_on_hypercall().
Agreed, it is not ideal.

But if __kvm_emulate_hypercall() returns 0 to indicate an exit to userspace
and 1 to indicate success, then every caller has to convert the return code
before setting the return value for the guest.  E.g., the TDX code would also
need to do the conversion.
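
To make the conversion concrete, here is a minimal standalone sketch of what
each caller would have to do under that convention.  It is not KVM code: the
macro and function names are hypothetical, and only the 0/1 meanings and the
32-bit truncation are taken from the proposal quoted below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the internal convention; not taken from KVM. */
#define HC_EXIT_TO_USERSPACE	0UL
#define HC_SUCCESS		1UL

/*
 * Translate an internal return value into the value written to the guest.
 * Returns false if the caller must exit to userspace, in which case
 * *guest_ret is left untouched.
 */
bool hc_ret_to_guest(unsigned long ret, bool op_64_bit,
		     unsigned long *guest_ret)
{
	if (ret == HC_EXIT_TO_USERSPACE)
		return false;

	if (ret == HC_SUCCESS)
		ret = 0;		/* guest ABI: 0 means success */
	else if (!op_64_bit)
		ret = (uint32_t)ret;	/* 32-bit callers see a 32-bit code */

	*guest_ret = ret;
	return true;
}

/* Usage example: every caller would need this kind of handling. */
void hc_ret_to_guest_example(void)
{
	unsigned long guest_ret = 0xdead;

	/* Exit to userspace: nothing is written for the guest. */
	assert(!hc_ret_to_guest(HC_EXIT_TO_USERSPACE, true, &guest_ret));
	assert(guest_ret == 0xdead);

	/* Internal success becomes 0 in the guest ABI. */
	assert(hc_ret_to_guest(HC_SUCCESS, true, &guest_ret));
	assert(guest_ret == 0);
}
```

A caller with different result registers (such as the TDX path) would need the
same translation, just with a different place to store guest_ret.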

> I also don't love that there's a surprising number of checks lurking in
> __kvm_emulate_hypercall(), e.g. that CPL==0, especially since the above comment
> about "a valid KVM hypercall" can be interpreted as meaning KVM is *only* checking
> if the hypercall number is valid.
>
> E.g. my initial reaction was that we could add a separate path for userspace
> hypercalls, but that would be subtly wrong.  And my second reaction was to hoist
> the common checks out of __kvm_emulate_hypercall(), but then I remembered that
> the only reason __kvm_emulate_hypercall() is separate is to allow it to be called
> by TDX with different source/destination registers.
>
> So, I'm strongly leaning towards dropping the above change, squashing the addition
> of the helper with patch 2, and then landing this on top.
>
> Thoughts?
I have no strong preference and am OK with the proposal below.

Just note that some cases don't get the return value right, as Kai pointed
out in another thread:
https://lore.kernel.org/kvm/3f158732a66829faaeb527a94b8df78d6173befa.camel@intel.com/


>
> --
> Subject: [PATCH] KVM: x86: Use '0' in __kvm_emulate_hypercall() to signal
>   "exit to userspace"
>
> Rework __kvm_emulate_hypercall() to use '0' to indicate an exit to
> userspace instead of relying on the caller to manually check for success
> *and* if user_exit_on_hypercall() is true.  Use '1' for "success" to
> (mostly) align with KVM's de facto return codes, where '0' == exit to
> userspace, '1' == resume guest, and -errno == failure.  Unfortunately,
> some of the PV error codes returned to the guest are positive values, so
> the pattern doesn't exactly match KVM's "standard", but it should be close
> enough to be intuitive for KVM readers.
>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>   arch/x86/kvm/x86.c | 21 +++++++++++++++------
>   1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e09daa3b157c..5fdeb58221e2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10024,7 +10024,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>   
>   	switch (nr) {
>   	case KVM_HC_VAPIC_POLL_IRQ:
> -		ret = 0;
> +		ret = 1;
>   		break;
>   	case KVM_HC_KICK_CPU:
>   		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
> @@ -10032,7 +10032,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>   
>   		kvm_pv_kick_cpu_op(vcpu->kvm, a1);
>   		kvm_sched_yield(vcpu, a1);
> -		ret = 0;
> +		ret = 1;
>   		break;
>   #ifdef CONFIG_X86_64
>   	case KVM_HC_CLOCK_PAIRING:
> @@ -10050,7 +10050,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>   			break;
>   
>   		kvm_sched_yield(vcpu, a0);
> -		ret = 0;
> +		ret = 1;
>   		break;
>   	case KVM_HC_MAP_GPA_RANGE: {
>   		u64 gpa = a0, npages = a1, attrs = a2;
> @@ -10111,12 +10111,21 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>   	cpl = kvm_x86_call(get_cpl)(vcpu);
>   
>   	ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
> -	if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
> -		/* MAP_GPA tosses the request to the user space. */
> +	if (!ret)
>   		return 0;
>   
> -	if (!op_64_bit)
> +	/*
> +	 * KVM's ABI with the guest is that '0' is success, and any other value
> +	 * is an error code.  Internally, '0' == exit to userspace (see above)
> +	 * and '1' == success, as KVM's de facto standard return codes are that
> +	 * plus -errno == failure.  Explicitly check for '1' as some PV error
> +	 * codes are positive values.
> +	 */
I didn't understand the last sentence:
"Explicitly check for '1' as some PV error codes are positive values."

The functions called in __kvm_emulate_hypercall() for PV features return
-KVM_EXXX as the error code.
Did you mean that functions like kvm_pv_enable_async_pf(), which return
1 on error, might be called from __kvm_emulate_hypercall() in the future?
If that is the concern, then we cannot simply convert 1 to 0.
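
A small standalone sketch of the collision I have in mind.  The handler below
is hypothetical and only assumes a helper that reports failure as 1; the
conversion mirrors the "if (ret == 1) ret = 0;" in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical internal "success" sentinel, as in the proposal. */
#define PV_SUCCESS_SENTINEL	1UL

/* Hypothetical PV helper that reports failure as a positive 1. */
unsigned long hypothetical_pv_handler(bool fail)
{
	return fail ? 1UL : PV_SUCCESS_SENTINEL;
}

/* Same conversion as the quoted diff: internal '1' becomes guest '0'. */
unsigned long pv_to_guest_ret(unsigned long ret)
{
	return ret == PV_SUCCESS_SENTINEL ? 0 : ret;
}

/* Failure and success both reach the guest as 0, so the error is lost. */
void pv_sentinel_collision_example(void)
{
	assert(pv_to_guest_ret(hypothetical_pv_handler(true)) ==
	       pv_to_guest_ret(hypothetical_pv_handler(false)));
}
```

Negative -KVM_EXXX style codes do not have this problem, since they never
compare equal to the sentinel.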

> +	if (ret == 1)
> +		ret = 0;
> +	else if (!op_64_bit)
>   		ret = (u32)ret;
> +
>   	kvm_rax_write(vcpu, ret);
>   
>   	return kvm_skip_emulated_instruction(vcpu);
>
> base-commit: 675248928970d33f7fc8ca9851a170c98f4f1c4f