Message-ID: <05129de6-c8d9-de94-89e7-6257197433ef@redhat.com>
Date: Tue, 20 Apr 2021 20:39:46 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
srutherford@...gle.com, joro@...tes.org, brijesh.singh@....com,
thomas.lendacky@....com, venu.busireddy@...cle.com,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...e.de>,
x86@...nel.org, Ashish Kalra <ashish.kalra@....com>
Subject: Re: [PATCH 0/3] KVM: x86: guest interface for SEV live migration
On 20/04/21 19:31, Sean Christopherson wrote:
>> + case KVM_HC_PAGE_ENC_STATUS: {
>> + u64 gpa = a0, npages = a1, enc = a2;
>> +
>> + ret = -KVM_ENOSYS;
>> + if (!vcpu->kvm->arch.hypercall_exit_enabled)
>
> I don't follow, why does the hypercall need to be gated by a capability? What
> would break if this were changed to?
>
> if (!guest_pv_has(vcpu, KVM_FEATURE_HC_PAGE_ENC_STATUS))
The problem is that it's valid to take KVM_GET_SUPPORTED_CPUID and send
it unmodified to KVM_SET_CPUID2. For this reason, features that are
conditional on other ioctls, or that require some kind of userspace
support, must not be in KVM_GET_SUPPORTED_CPUID. For example:
- TSC_DEADLINE, because it is only implemented after KVM_CREATE_IRQCHIP
  (or after KVM_ENABLE_CAP of KVM_CAP_IRQCHIP_SPLIT)
- MONITOR, because it only makes sense if userspace enables
  KVM_CAP_X86_DISABLE_EXITS

X2APIC is reported even though it shouldn't be. It is too late to fix
that, I think.
In this particular case, if userspace sets the bit in CPUID2 but doesn't
handle KVM_EXIT_HYPERCALL, the guest will probably trigger some kind of
assertion failure as soon as it invokes the HC_PAGE_ENC_STATUS hypercall.
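As a rough illustration of the opt-in rule above (not code from the patch), a VMM could start from the value reported by KVM_GET_SUPPORTED_CPUID and add the bit itself only when it handles the exit. The bit position, the struct, and the helper name below are hypothetical placeholders; the real definitions belong to the KVM uapi headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define EXAMPLE_FEATURE_HC_PAGE_ENC_STATUS (1u << 16)

/* Hypothetical summary of what this VMM implements. */
struct vmm_caps {
	bool handles_hypercall_exit; /* VMM services KVM_EXIT_HYPERCALL */
};

/*
 * Build the PV features value to pass to KVM_SET_CPUID2.
 * supported_eax is assumed to come from KVM_GET_SUPPORTED_CPUID, which
 * does not report the bit; the VMM opts in only when it can service
 * the resulting exit.
 */
static uint32_t build_pv_features(uint32_t supported_eax,
				  const struct vmm_caps *caps)
{
	uint32_t eax = supported_eax;

	if (caps->handles_hypercall_exit)
		eax |= EXAMPLE_FEATURE_HC_PAGE_ENC_STATUS;
	return eax;
}
```

With this shape, a VMM that forwards the supported value unmodified never exposes the hypercall to the guest, which is the property the capability gate is meant to preserve.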
(I should document that; Jim has asked many times for documentation
around KVM_GET_SUPPORTED_CPUID and KVM_GET_MSR_INDEX_LIST.)
Paolo
>> + break;
>> +
>> + if (!PAGE_ALIGNED(gpa) || !npages ||
>> + gpa_to_gfn(gpa) + npages <= gpa_to_gfn(gpa)) {
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
>> + vcpu->run->hypercall.nr = KVM_HC_PAGE_ENC_STATUS;
>> + vcpu->run->hypercall.args[0] = gpa;
>> + vcpu->run->hypercall.args[1] = npages;
>> + vcpu->run->hypercall.args[2] = enc;
>> + vcpu->run->hypercall.longmode = op_64_bit;
>> + vcpu->arch.complete_userspace_io = complete_hypercall_exit;
>> + return 0;
>> + }
>> default:
>> ret = -KVM_ENOSYS;
>> break;
>
> ...
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 590cc811c99a..d696a9f13e33 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3258,6 +3258,14 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> vcpu->arch.msr_kvm_poll_control = data;
>> break;
>>
>> + case MSR_KVM_MIGRATION_CONTROL:
>> + if (data & ~KVM_PAGE_ENC_STATUS_UPTODATE)
>> + return 1;
>> +
>> + if (data && !guest_pv_has(vcpu, KVM_FEATURE_HC_PAGE_ENC_STATUS))
>
> Why let the guest write '0'? Letting the guest do WRMSR but not RDMSR is
> bizarre.
Because it seemed the simplest way to write the code, but returning 0
unconditionally from RDMSR is actually simpler.
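A rough model of that simplification, assuming reads ignore the feature bit and always yield zero while writes keep the checks from the quoted hunk. This is an illustration, not the patch: the constant's value and the function names are hypothetical, and the return convention (0 for success, 1 for failure) mirrors the quoted kvm_set_msr_common() code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value, for illustration only. */
#define EXAMPLE_PAGE_ENC_STATUS_UPTODATE (1ull << 0)

/* Models WRMSR: returns 0 on success, 1 to signal failure to the guest. */
static int example_set_migration_control(bool guest_has_feature, uint64_t data)
{
	/* Any bit other than the one defined bit is reserved. */
	if (data & ~EXAMPLE_PAGE_ENC_STATUS_UPTODATE)
		return 1;

	/* A nonzero write is only valid when the feature is exposed. */
	if (data && !guest_has_feature)
		return 1;

	return 0;
}

/* Models RDMSR under the simplification: always succeeds, reads as 0. */
static int example_get_migration_control(uint64_t *data)
{
	*data = 0;
	return 0;
}
```

In this model a guest without the feature can both read the MSR and write zero to it, so the two directions are no longer asymmetric.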
Paolo
>> + return 1;
>> + break;
>> +
>> case MSR_IA32_MCG_CTL:
>> case MSR_IA32_MCG_STATUS:
>> case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
>> @@ -3549,6 +3557,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> if (!guest_pv_has(vcpu, KVM_FEATURE_ASYNC_PF))
>> return 1;
>>
>> + msr_info->data = 0;
>> + break;
>> + case MSR_KVM_MIGRATION_CONTROL:
>> + if (!guest_pv_has(vcpu, KVM_FEATURE_HC_PAGE_ENC_STATUS))
>> + return 1;
>> +
>> msr_info->data = 0;
>> break;
>> case MSR_KVM_STEAL_TIME:
>> --
>> 2.26.2
>>
>