lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 6 Mar 2018 11:16:49 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Radim Krčmář <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH RFC 0/3] x86/kvm: avoid expensive rdmsrs for FS/GS base
 MSRs

On 02/03/2018 11:55, Vitaly Kuznetsov wrote:
> Some time ago Paolo suggested to take a look at probably unneeded expensive
> rdmsrs for FS/GS base MSR in vmx_save_host_state(). This is called on every
> vcpu run when we need to handle vmexit in userspace.
> 
> I have to admit I got a bit lost in our kernel FS/GS magic. I managed to 
> convince myself that in the well defined context (ioctl from userspace)
> we can always get the required values from in-kernel variables and avoid
> rdmsrs. But I may have missed something really important, thus RFC.
> 
> My debug shows we're shaving off 240 cpu cycles (E5-2603 v3).
> 
> In case these patches turn out to be worthwile AMD SVM can probably be
> optimized the ame way.

SVM is a bit different, because it uses VMLOAD/VMSAVE and so it doesn't
have an equivalent of vmx_save_host_state().  Unfortunately, you cannot
really eliminate VMLOAD/VMSAVE because it's the only way to load the
hidden state of TR and LDTR---so you might as well use it to load FS and
GS, even in 64-bit mode.

In order to decrease the cost of vmload/vmsave, we could single out the
simplest vmexit handlers and process them without even getting out of
svm_vcpu_run, thus skipping all four of stgi/vmload/vmsave/clgi.
However, this probably couldn't be done for the really common vmexits
such as nested page fault, PIO or most MSR accesses.  We _could_ do it
for nested virt-related vmexits, but the advantage of that is getting
smaller too, since Zen provides hardware support for nested GIF and
nested VMLOAD/VMSAVE.

Paolo

> Vitaly Kuznetsov (3):
>   x86/kvm/vmx: read MSR_FS_BASE from current->thread
>   x86/kvm/vmx: read MSR_KERNEL_GS_BASE from current->thread
>   x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE
> 
>  arch/x86/kernel/cpu/common.c | 1 +
>  arch/x86/kvm/vmx.c           | 7 ++++---
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ