Message-ID: <2ea3137a-c63c-cf1e-7d90-ed6416ba443d@redhat.com>
Date: Tue, 9 Jan 2018 12:07:48 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Liran Alon <LIRAN.ALON@...CLE.COM>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Cc: jmattson@...gle.com, aliguori@...zon.com, thomas.lendacky@....com,
dwmw@...zon.co.uk, bp@...en8.de
Subject: Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU
On 08/01/2018 21:00, Liran Alon wrote:
>
>
> On 08/01/18 20:08, Paolo Bonzini wrote:
>> From: Tom Lendacky <thomas.lendacky@....com>
>>
>> Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
>> going to run a VCPU different from what was previously run. Nested
>> virtualization uses the same VMCB for the second level guest, but the
>> L1 hypervisor should be using IBRS to protect itself.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@....com>
>> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
>> ---
>> arch/x86/kvm/svm.c | 31 +++++++++++++++++++++++++++++++
>> 1 file changed, 31 insertions(+)
>>
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 779981a00d01..dd744d896cec 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -289,6 +289,7 @@ struct amd_svm_iommu_ir {
>> module_param(vgif, int, 0444);
>>
>> static bool __read_mostly have_spec_ctrl;
>> +static bool __read_mostly have_ibpb_support;
>>
>> static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>> static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>> @@ -540,6 +541,7 @@ struct svm_cpu_data {
>> struct kvm_ldttss_desc *tss_desc;
>>
>> struct page *save_area;
>> + struct vmcb *current_vmcb;
>> };
>>
>> static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
>> @@ -1151,6 +1153,11 @@ static __init int svm_hardware_setup(void)
>> pr_info("kvm: SPEC_CTRL available\n");
>> else
>> pr_info("kvm: SPEC_CTRL not available\n");
>> + have_ibpb_support = have_spec_ctrl || cpu_has_ibpb_support();
>> + if (have_ibpb_support)
>> + pr_info("kvm: IBPB_SUPPORT available\n");
>> + else
>> + pr_info("kvm: IBPB_SUPPORT not available\n");
>>
>> return 0;
>>
>> @@ -1725,11 +1732,19 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
>> __free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
>> kvm_vcpu_uninit(vcpu);
>> kmem_cache_free(kvm_vcpu_cache, svm);
>> +
>> + /*
>> + * The VMCB could be recycled, causing a false negative in
>> + * svm_vcpu_load; block speculative execution.
>> + */
>> + if (have_ibpb_support)
>> + wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>> }
>>
>> static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>> {
>> struct vcpu_svm *svm = to_svm(vcpu);
>> + struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
>> int i;
>>
>> if (unlikely(cpu != vcpu->cpu)) {
>> @@ -1758,6 +1773,12 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>> if (static_cpu_has(X86_FEATURE_RDTSCP))
>> wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>>
>> + if (sd->current_vmcb != svm->vmcb) {
>> + sd->current_vmcb = svm->vmcb;
>> + if (have_ibpb_support)
>> + wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>> + }
>> +
>> avic_vcpu_load(vcpu, cpu);
>> }
>>
>> @@ -2798,6 +2819,11 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>> if (!nested_vmcb)
>> return 1;
>>
>> + /*
>> + * No need for IBPB here, the L1 hypervisor should be running with
>> + * IBRS=1 and inserts one already when switching L2 VMs.
>> + */
>> +
>
> I am not yet fully convinced this is OK from a security perspective.
> From the CPU point-of-view, both L1 & L2 run in the same prediction-mode
> (as they are both running in SVM guest-mode). Therefore, IBRS will not
> hide the BHB & BTB of L1 from L2.
Indeed, IBRS will not hide the indirect branch predictor state of L2
user mode from L1 user mode.
On current generation processors, as I understand it, IBRS simply
disables the indirect branch predictor when set to 1. Therefore, as
long as the L1 hypervisor sets IBRS=1, the indirect branch predictor
state left by L2 does not affect execution of the L1 hypervisor.
Future processors might have a mode that says "just set IBRS=1 and I'll
DTRT". If that's done by indexing the BTB using the full virtual
address, the CPL _and_ the ASID/PCID/VPID, then IBPB is not needed here
because the L2 VM uses a different ASID. If that's done by only
augmenting the BTB index with the CPL, we'd need an IBPB. But in this
new world we've been thrown into, that would be a processor bug...
Note that AMD currently doesn't implement IBRS at all. In order to
mitigate against CVE-2017-5715, the hypervisor should issue an IBPB on
every vmexit (in both L0 and L1). However, the cost of that is very
high. While we may want a paranoia mode that does it, it is outside the
scope of this series (which is more of a "let's unblock QEMU patches"
thing than a complete fix).
> 6. One can implicitly assume that the L1 hypervisor did issue an IBPB when
> loading a VMCB and therefore it doesn't have to do it again in L0.
> However, I see 2 problems with this approach:
> (a) We don't protect old non-patched L1 hypervisors from Spectre even
> though we could easily do that from L0 with small performance hit.
Yeah, that's a nice thing to have. However, I disagree on the "small"
performance hit... on a patched hypervisor, the cost of a double fix can
be very noticeable.
> (b) When the L1 hypervisor runs only a single L2 VM, it doesn't have to
> issue an IBPB when loading the VMCB (as it didn't run any other VMCB
> before) and it should be sufficient from L1's perspective to just write 1
> to IBRS. However, in our nested case, this is a security hole.
This is the main difference in our reasoning. I think IBRS=1 (or IBPB
on vmexit) should be enough.
Paolo
> Am I misunderstanding something?
>
> Regards,
> -Liran
>
>> /* Exit Guest-Mode */
>> leave_guest_mode(&svm->vcpu);
>> svm->nested.vmcb = 0;
>> @@ -3061,6 +3087,11 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
>> if (!nested_vmcb)
>> return false;
>>
>> + /*
>> + * No need for IBPB here, since the nested VM is less privileged. The
>> + * L1 hypervisor inserts one already when switching L2 VMs.
>> + */
>> +
>> if (!nested_vmcb_checks(nested_vmcb)) {
>> nested_vmcb->control.exit_code = SVM_EXIT_ERR;
>> nested_vmcb->control.exit_code_hi = 0;
>>
>