Message-ID: <aaccbf48-ec7d-10bf-9980-8db0ae36b506@redhat.com>
Date:   Thu, 25 Jan 2018 22:39:14 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Radim Krčmář <rkrcmar@...hat.com>,
        Liran Alon <liran.alon@...cle.com>
Cc:     vkuznets@...hat.com, x86@...nel.org, pbonzini@...hat.com,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        "Michael S. Tsirkin" <mst@...hat.com>
Subject: Re: [PATCH] x86/kvm: disable fast MMIO when running nested



On 2018-01-25 22:16, Radim Krčmář wrote:
> 2018-01-25 01:55-0800, Liran Alon:
>> ----- vkuznets@...hat.com wrote:
>>> I was investigating an issue with seabios >= 1.10 which stopped working
>>> for nested KVM on Hyper-V. The problem appears to be in
>>> handle_ept_violation() function: when we do fast mmio we need to skip
>>> the instruction so we do kvm_skip_emulated_instruction(). This, however,
>>> depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS.
>>> However, this is not the case.
>>>
>>> Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when
>>> EPT MISCONFIG occurs. While on real hardware it was observed to be set,
>>> some hypervisors follow the spec and don't set it; we end up advancing
>>> IP with some random value.
>>>
>>> I checked with Microsoft and they confirmed they don't fill
>>> VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG.
>>>
>>> Fix the issue by disabling fast mmio when running nested.
>>>
>>> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
>>> ---
>>>   arch/x86/kvm/vmx.c | 9 ++++++++-
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index c829d89e2e63..54afb446f38e 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -6558,9 +6558,16 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
>>>   	/*
>>>   	 * A nested guest cannot optimize MMIO vmexits, because we have an
>>>   	 * nGPA here instead of the required GPA.
>>> +	 * Skipping instruction below depends on undefined behavior: Intel's
>>> +	 * manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set in VMCS
>>> +	 * when EPT MISCONFIG occurs and while on real hardware it was observed
>>> +	 * to be set, other hypervisors (namely Hyper-V) don't set it, we end
>>> +	 * up advancing IP with some random value. Disable fast mmio when
>>> +	 * running nested and keep it for real hardware in hope that
>>> +	 * VM_EXIT_INSTRUCTION_LEN will always be set correctly.
>> If Intel manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set in VMCS on EPT_MISCONFIG,
>> I don't think we should do this on real-hardware as-well.
> Neither do I, but you can see the last discussion on this topic,
> https://patchwork.kernel.org/patch/9903811/.  In short, we've agreed to
> limit the hack to real hardware and wait for Intel or virtio changes.
>
> Michael and Jason, any progress on implementing a fast virtio mechanism
> that doesn't rely on undefined behavior?
>
> (Encode writing instruction length into last 4 bits of MMIO address,
>   side-channel say that accesses to the MMIO area always use certain
>   instruction length, use hypercall, ...)
>
> Thanks.

No progress from my side. But we can use PIO for virtio 1.0, and it's
faster than fast MMIO (QEMU supports a modern PIO notification BAR; we can
make it the default). It looks to me like neither the encoding nor the
hypercall approach will work for a real hardware virtio device.
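
For reference, the "encode the instruction length into the last 4 bits of
the MMIO address" idea quoted above could look roughly like the sketch
below. This is only an illustration: every name in it (NOTIFY_LEN_MASK,
notify_addr_encode() and so on) is hypothetical and comes from neither KVM,
QEMU nor the virtio spec, and it assumes a notification region aligned to
16 bytes. An x86 instruction is at most 15 bytes long, so the length fits
in 4 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout: the notification region is 16-byte aligned and the
 * low 4 bits of the accessed address carry the length of the writing
 * instruction. None of these names exist in KVM, QEMU or virtio.
 */
#define NOTIFY_LEN_BITS 4u
#define NOTIFY_LEN_MASK ((UINT64_C(1) << NOTIFY_LEN_BITS) - 1)

/* Guest side: choose the address to write to for a given instruction length. */
static uint64_t notify_addr_encode(uint64_t base, unsigned int insn_len)
{
	assert((base & NOTIFY_LEN_MASK) == 0);   /* region must be 16-byte aligned */
	assert(insn_len >= 1 && insn_len <= 15); /* x86 instructions are 1..15 bytes */
	return base | insn_len;
}

/* Host side: recover the instruction length from the faulting address. */
static unsigned int notify_addr_insn_len(uint64_t gpa)
{
	return (unsigned int)(gpa & NOTIFY_LEN_MASK);
}

/* Host side: recover the base of the notification region. */
static uint64_t notify_addr_base(uint64_t gpa)
{
	return gpa & ~NOTIFY_LEN_MASK;
}
```

With something like this, the host could advance the guest IP by
notify_addr_insn_len(gpa) instead of reading VM_EXIT_INSTRUCTION_LEN. It
relies on the guest driver and the device model agreeing on the layout.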

Thanks
