Message-ID: <c293cdbd-502c-d598-3166-4e177ac21c7a@redhat.com>
Date:   Mon, 8 Feb 2021 19:12:22 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Jing Liu <jing2.liu@...ux.intel.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, jing2.liu@...el.com
Subject: Re: [PATCH RFC 3/7] kvm: x86: XSAVE state and XFD MSRs context switch

On 08/02/21 19:04, Sean Christopherson wrote:
>> That said, the case where we saw MSR autoload as faster involved EFER, and
>> we decided that it was due to TLB flushes (commit f6577a5fa15d, "x86, kvm,
>> vmx: Always use LOAD_IA32_EFER if available", 2014-11-12). Do you know if
>> RDMSR/WRMSR is always slower than MSR autoload?
> RDMSR/WRMSR may be marginally slower, but only because the autoload stuff avoids
> serializing the pipeline after every MSR.

That's probably adding up quickly...

> The autoload paths are effectively
> just wrappers around the WRMSR ucode, plus some extra VM-Enter specific checks,
> as ucode needs to perform all the normal fault checks on the index and value.
> On the flip side, if the load lists are dynamically constructed, I suspect the
> code overhead of walking the lists negates any advantages of the load lists.

... but yeah this is not very encouraging.

Context switch time is a problem for XFD.  In a VM that uses AMX, most 
threads in the guest will have nonzero XFD but the vCPU thread itself 
will have zero XFD.  So as soon as one thread in the VM forces the vCPU 
thread to clear XFD, you pay a price on all vmexits and vmentries.

However, running the host with _more_ bits set than necessary in XFD 
should not be a problem as long as the host doesn't use the AMX 
instructions.  So perhaps Jing can look into keeping XFD=0 for as little 
time as possible, and XFD=host_XFD|guest_XFD as much as possible.
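
A minimal user-space sketch of the idea above, not kernel code: it models the
XFD MSR as a plain variable and counts how many writes each policy would
issue. The names (struct xfd_model, xfd_write, xfd_lazy_value,
xfd_count_writes) are hypothetical and exist only for illustration. It also
assumes a write is skipped when the value is unchanged, and that the host
never touches AMX between vmexit and vmentry.

```c
#include <stdint.h>

/* Hypothetical model: a simulated XFD MSR plus a count of simulated WRMSRs. */
struct xfd_model {
	uint64_t msr;      /* current value of the simulated XFD MSR */
	unsigned writes;   /* number of simulated WRMSR operations */
};

/* Assumption: the caller caches the last value and skips redundant writes. */
static void xfd_write(struct xfd_model *m, uint64_t val)
{
	if (m->msr != val) {
		m->msr = val;
		m->writes++;
	}
}

/* Value left in XFD while the host runs without using AMX instructions. */
static uint64_t xfd_lazy_value(uint64_t host_xfd, uint64_t guest_xfd)
{
	return host_xfd | guest_xfd;
}

/*
 * Count simulated XFD writes over a number of vmentry/vmexit round trips.
 * lazy == 0: restore host_xfd on every vmexit.
 * lazy != 0: leave host_xfd | guest_xfd in place on vmexit.
 */
static unsigned xfd_count_writes(uint64_t host_xfd, uint64_t guest_xfd,
				 unsigned roundtrips, int lazy)
{
	struct xfd_model m = { .msr = host_xfd, .writes = 0 };
	uint64_t in_host = lazy ? xfd_lazy_value(host_xfd, guest_xfd) : host_xfd;
	unsigned i;

	for (i = 0; i < roundtrips; i++) {
		xfd_write(&m, guest_xfd);  /* vmentry: load the guest value */
		xfd_write(&m, in_host);    /* vmexit: value for host execution */
	}
	return m.writes;
}
```

With host_XFD=0 and a nonzero guest_XFD, the eager policy in this model
writes XFD twice per round trip, while the lazy policy writes it once in
total, because host_XFD|guest_XFD equals guest_XFD. When host_XFD has bits
that guest_XFD lacks, the two values differ and the lazy policy saves
nothing in this model.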

Paolo
