lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALMp9eQaqYG4F6f9gm0_a9v+6A_1jXBxX5Wy3J-pDBk8iar1YA@mail.gmail.com>
Date: Mon, 16 Dec 2024 12:04:28 -0800
From: Jim Mattson <jmattson@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/5] KVM: x86: Address xstate_required_size() perf regression

On Tue, Dec 10, 2024 at 5:33 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> Fix a hilarious/revolting performance regression (relative to older CPU
> generations) in xstate_required_size() that pops up due to CPUID _in the
> host_ taking 3x-4x longer on Emerald Rapids than Skylake.
>
> The issue rears its head on nested virtualization transitions, as KVM
> (unnecessarily) performs runtime CPUID updates, including XSAVE sizes,
> multiple times per transition.  And calculating XSAVE sizes, especially
> for vCPUs with a decent number of supported XSAVE features and compacted
> format support, can add up to thousands of cycles.
>
> To fix the immediate issue, cache the CPUID output at kvm.ko load.  The
> information is static for a given CPU, i.e. doesn't need to be re-read
> from hardware every time.  That's patch 1, and eliminates pretty much all
> of the meaningful overhead.
>
> Patch 2 is a minor cleanup to try and make the code easier to read.
>
> Patch 3 fixes a wart in CPUID emulation where KVM does a moderately
> expensive (though cheap compared to CPUID, lol) MSR lookup that is likely
> unnecessary for the vast majority of VMs.
>
> Patches 4 and 5 address the problem of KVM doing runtime CPUID updates
> multiple times for each nested VM-Enter and VM-Exit, at least half of
> which are completely unnecessary (CPUID is a mandatory intercept on both
> Intel and AMD, so ensuring dynamic CPUID bits are up-to-date while running
> L2 is pointless).  The idea is fairly simple: lazily do the CPUID updates
> by deferring them until something might actually consume guest the relevant
> bits.
>
> This applies on the cpu_caps overhaul[*], as patches 3-5 would otherwise
> conflict, and I didn't want to think about how safe patch 5 is without
> the rework.
>
> That said, patch 1, which is the most important and tagged for stable,
> applies cleanly on 6.1, 6.6, and 6.12 (and the backport for 5.15 and
> earlier shouldn't be too horrific).
>
> Side topic, I can't help but wonder if the CPUID latency on EMR is a CPU
> or ucode bug.  For a number of leaves, KVM can emulate CPUID faster than
> the CPUID can execute the instruction.  I.e. the entire VM-Exit => emulate
> => VM-Enter sequence takes less time than executing CPUID on bare metal.
> Which seems absolutely insane.  But, it shouldn't impact guest performance,
> so that's someone else's problem, at least for now.

Virtualization aside, perhaps Linux should set
MSR_FEATURE_ENABLES.CPUID_GP_ON_CPL_GT_0[bit 0] on EMR, and emulate
the CPUID instruction in the kernel?  :)

> [*] https://lore.kernel.org/all/20241128013424.4096668-1-seanjc@google.com
>
> Sean Christopherson (5):
>   KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
>   KVM: x86: Use for-loop to iterate over XSTATE size entries
>   KVM: x86: Apply TSX_CTRL_CPUID_CLEAR if and only if the vCPU has RTM
>     or HLE
>   KVM: x86: Query X86_FEATURE_MWAIT iff userspace owns the CPUID feature
>     bit
>   KVM: x86: Defer runtime updates of dynamic CPUID bits until CPUID
>     emulation
>
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/cpuid.c            | 63 ++++++++++++++++++++++++---------
>  arch/x86/kvm/cpuid.h            | 10 +++++-
>  arch/x86/kvm/lapic.c            |  2 +-
>  arch/x86/kvm/smm.c              |  2 +-
>  arch/x86/kvm/svm/sev.c          |  2 +-
>  arch/x86/kvm/svm/svm.c          |  2 +-
>  arch/x86/kvm/vmx/vmx.c          |  2 +-
>  arch/x86/kvm/x86.c              | 22 +++++++++---
>  9 files changed, 78 insertions(+), 28 deletions(-)
>
>
> base-commit: 06a4919e729761be751366716c00fb8c3f51d37e
> --
> 2.47.0.338.g60cca15819-goog
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ