linux-kernel - Re: [PATCH RFC 3/7] kvm: x86: XSAVE state and XFD MSRs context switch

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YCGJHme0fBk+sno3@Konrads-MacBook-Pro.local>
Date:   Mon, 8 Feb 2021 13:55:26 -0500
From:   Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Jing Liu <jing2.liu@...ux.intel.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, jing2.liu@...el.com
Subject: Re: [PATCH RFC 3/7] kvm: x86: XSAVE state and XFD MSRs context switch

On Mon, Feb 08, 2021 at 07:12:22PM +0100, Paolo Bonzini wrote:
> On 08/02/21 19:04, Sean Christopherson wrote:
> > > That said, the case where we saw MSR autoload as faster involved EFER, and
> > > we decided that it was due to TLB flushes (commit f6577a5fa15d, "x86, kvm,
> > > vmx: Always use LOAD_IA32_EFER if available", 2014-11-12). Do you know if
> > > RDMSR/WRMSR is always slower than MSR autoload?
> > RDMSR/WRMSR may be marginally slower, but only because the autoload stuff avoids
> > serializing the pipeline after every MSR.
> 
> That's probably adding up quickly...
> 
> > The autoload paths are effectively
> > just wrappers around the WRMSR ucode, plus some extra VM-Enter specific checks,
> > as ucode needs to perform all the normal fault checks on the index and value.
> > On the flip side, if the load lists are dynamically constructed, I suspect the
> > code overhead of walking the lists negates any advantages of the load lists.
> 
> ... but yeah this is not very encouraging.
> 
> Context switch time is a problem for XFD.  In a VM that uses AMX, most
> threads in the guest will have nonzero XFD but the vCPU thread itself will
> have zero XFD.  So as soon as one thread in the VM forces the vCPU thread to
> clear XFD, you pay a price on all vmexits and vmentries.
> 
> However, running the host with _more_ bits set than necessary in XFD should
> not be a problem as long as the host doesn't use the AMX instructions.  So
> perhaps Jing can look into keeping XFD=0 for as little time as possible, and
> XFD=host_XFD|guest_XFD as much as possible.

This sounds like the lazy-fpu (eagerfpu?) that used to be part of the
kernel? I recall that we had a CVE for that - so it may also be worth
double-checking that we don't reintroduce that one.