Message-ID: <4734379d-97c4-44c8-ae40-be46da6e6239@linux.intel.com>
Date: Thu, 31 Oct 2024 12:56:14 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, pbonzini@...hat.com,
isaku.yamahata@...el.com, rick.p.edgecombe@...el.com, kai.huang@...el.com,
yuan.yao@...ux.intel.com, xiaoyao.li@...el.com
Subject: Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace
generically
On 10/31/2024 4:49 AM, Sean Christopherson wrote:
> On Mon, Aug 26, 2024, Binbin Wu wrote:
>> Check whether a KVM hypercall needs to exit to userspace or not based on
>> the hypercall_exit_enabled field of struct kvm_arch.
>>
>> Userspace can request a hypercall to exit to userspace for handling by
>> enabling KVM_CAP_EXIT_HYPERCALL, and the enabled hypercall will be set in
>> hypercall_exit_enabled. Make the check code generic based on it.
>>
>> Signed-off-by: Binbin Wu <binbin.wu@...ux.intel.com>
>> Reviewed-by: Kai Huang <kai.huang@...el.com>
>> ---
>> arch/x86/kvm/x86.c | 5 +++--
>> arch/x86/kvm/x86.h | 4 ++++
>> 2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 966fb301d44b..e521f14ad2b2 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10220,8 +10220,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>> cpl = kvm_x86_call(get_cpl)(vcpu);
>>
>> ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
>> - if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
>> - /* MAP_GPA tosses the request to the user space. */
>> + /* Check !ret first to make sure nr is a valid KVM hypercall. */
>> + if (!ret && user_exit_on_hypercall(vcpu->kvm, nr))
> I don't love that the caller has to re-check for user_exit_on_hypercall().
Agreed, it is not ideal.
But if __kvm_emulate_hypercall() returns 0 to indicate user exit and 1 to
indicate success, then the callers have to convert the return code before
setting the return value for the guest. E.g., the TDX code would also need
to do the conversion.
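
For illustration only, the caller-side re-check discussed above can be
modeled outside the kernel as follows. This is a minimal sketch, assuming
hypercall_exit_enabled is a bitmask with one bit per hypercall number;
struct toy_kvm, TOY_HC_MAP_GPA_RANGE and the toy_* helpers are hypothetical
stand-ins, not the code added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct kvm_arch. */
struct toy_kvm {
	uint64_t hypercall_exit_enabled;	/* assumed: one bit per hypercall nr */
};

/* Illustrative hypercall number; see kvm_para.h for the real value. */
#define TOY_HC_MAP_GPA_RANGE 12

/* True if userspace asked for hypercall 'nr' to be forwarded to it. */
static bool toy_user_exit_on_hypercall(const struct toy_kvm *kvm,
				       unsigned long nr)
{
	/* Out-of-range numbers can never be enabled; also avoids an UB shift. */
	if (nr >= 64)
		return false;

	return (kvm->hypercall_exit_enabled >> nr) & 1;
}

/*
 * v3-style caller check: the handler returned 0 (so nr passed all of the
 * handler's checks) *and* userspace enabled exiting for this hypercall.
 */
static bool toy_v3_needs_user_exit(const struct toy_kvm *kvm,
				   unsigned long nr, unsigned long ret)
{
	return !ret && toy_user_exit_on_hypercall(kvm, nr);
}
```

With only the MAP_GPA_RANGE bit set, toy_v3_needs_user_exit() is true for
(nr == TOY_HC_MAP_GPA_RANGE, ret == 0) and false for any other hypercall
number or any non-zero ret.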
> I also don't love that there's a surprising number of checks lurking in
> __kvm_emulate_hypercall(), e.g. that CPL==0, especially since the above comment
> about "a valid KVM hypercall" can be interpreted as meaning KVM is *only* checking
> if the hypercall number is valid.
>
> E.g. my initial reaction was that we could add a separate path for userspace
> hypercalls, but that would be subtly wrong. And my second reaction was to hoist
> the common checks out of __kvm_emulate_hypercall(), but then I remembered that
> the only reason __kvm_emulate_hypercall() is separate is to allow it to be called
> by TDX with different source/destination registers.
>
> So, I'm strongly leaning towards dropping the above change, squashing the addition
> of the helper with patch 2, and then landing this on top.
>
> Thoughts?
I have no strong preference and am OK with the proposal below.
There are just some cases that don't get the return value right, as pointed
out by Kai in another thread:
https://lore.kernel.org/kvm/3f158732a66829faaeb527a94b8df78d6173befa.camel@intel.com/
>
> --
> Subject: [PATCH] KVM: x86: Use '0' in __kvm_emulate_hypercall() to signal
> "exit to userspace"
>
> Rework __kvm_emulate_hypercall() to use '0' to indicate an exit to
> userspace instead of relying on the caller to manually check for success
> *and* if user_exit_on_hypercall() is true. Use '1' for "success" to
> (mostly) align with KVM's de facto return codes, where '0' == exit to
> userspace, '1' == resume guest, and -errno == failure. Unfortunately,
> some of the PV error codes returned to the guest are positive values, so
> the pattern doesn't exactly match KVM's "standard", but it should be close
> enough to be intuitive for KVM readers.
>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> arch/x86/kvm/x86.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e09daa3b157c..5fdeb58221e2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10024,7 +10024,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>
> switch (nr) {
> case KVM_HC_VAPIC_POLL_IRQ:
> - ret = 0;
> + ret = 1;
> break;
> case KVM_HC_KICK_CPU:
> if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
> @@ -10032,7 +10032,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>
> kvm_pv_kick_cpu_op(vcpu->kvm, a1);
> kvm_sched_yield(vcpu, a1);
> - ret = 0;
> + ret = 1;
> break;
> #ifdef CONFIG_X86_64
> case KVM_HC_CLOCK_PAIRING:
> @@ -10050,7 +10050,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> break;
>
> kvm_sched_yield(vcpu, a0);
> - ret = 0;
> + ret = 1;
> break;
> case KVM_HC_MAP_GPA_RANGE: {
> u64 gpa = a0, npages = a1, attrs = a2;
> @@ -10111,12 +10111,21 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> cpl = kvm_x86_call(get_cpl)(vcpu);
>
> ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
> - if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
> - /* MAP_GPA tosses the request to the user space. */
> + if (!ret)
> return 0;
>
> - if (!op_64_bit)
> + /*
> + * KVM's ABI with the guest is that '0' is success, and any other value
> + * is an error code. Internally, '0' == exit to userspace (see above)
> + * and '1' == success, as KVM's de facto standard return codes are that
> + * plus -errno == failure. Explicitly check for '1' as some PV error
> + * codes are positive values.
> + */
I didn't understand the last sentence:
"Explicitly check for '1' as some PV error codes are positive values."
The functions called in __kvm_emulate_hypercall() for PV features return
-KVM_EXXX as the error code.
Did you mean that functions like kvm_pv_enable_async_pf(), which return
1 on error, might be called in __kvm_emulate_hypercall() in the future?
If that is the concern, then we cannot simply convert 1 to 0.
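
To make the concern concrete, here is a minimal userspace sketch of the
conversion in the hunk below. The toy_* names are hypothetical; only the
'1' -> '0' mapping and the (u32) truncation are taken from the proposed
patch, and the caller is assumed to have already handled ret == 0 (exit to
userspace) before converting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Internal convention from the proposed patch (names are hypothetical). */
enum {
	TOY_EXIT_TO_USERSPACE = 0,
	TOY_SUCCESS = 1,
};

/*
 * Convert the internal return code into the value written to the guest's
 * RAX, where the guest ABI is '0' == success and anything else is an error.
 * Must not be called with TOY_EXIT_TO_USERSPACE.
 */
static unsigned long toy_ret_to_guest_rax(unsigned long ret, bool op_64_bit)
{
	if (ret == TOY_SUCCESS)
		ret = 0;
	else if (!op_64_bit)
		ret = (uint32_t)ret;	/* 32-bit guests only see the low 32 bits */

	return ret;
}
```

A negative error such as -1000 passes through unchanged (or truncated for
a 32-bit guest), but any handler that used a positive 1 as an error code
would be indistinguishable from TOY_SUCCESS and would be reported to the
guest as 0.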
> + if (ret == 1)
> + ret = 0;
> + else if (!op_64_bit)
> ret = (u32)ret;
> +
> kvm_rax_write(vcpu, ret);
>
> return kvm_skip_emulated_instruction(vcpu);
>
> base-commit: 675248928970d33f7fc8ca9851a170c98f4f1c4f