Date:   Mon, 28 Mar 2022 00:53:18 +0000
From:   Jon Kohler <jon@...anix.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
CC:     Jon Kohler <jon@...anix.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: x86: optimize PKU branching in
 kvm_load_{guest|host}_xsave_state



> On Mar 27, 2022, at 6:43 AM, Paolo Bonzini <pbonzini@...hat.com> wrote:
> 
> On 3/26/22 02:37, Jon Kohler wrote:
>>>>    Flip the ordering of the || condition so that XFEATURE_MASK_PKRU is
>>>>    checked first, which when instrumented in our environment appeared
>>>>    to be always true and less overall work than kvm_read_cr4_bits.
>>> 
>>> If it's always true, then it should be checked last, not first.  And if
>> Sean thanks for the review. This would be a left handed || short circuit, so
>> wouldn’t we want always true to be first?
> 
> Yes.

Ack, thanks.
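
For reference, the flipped condition would look roughly like this
(paraphrasing kvm_load_guest_xsave_state() in arch/x86/kvm/x86.c from
memory, not a verbatim diff):

        if (static_cpu_has(X86_FEATURE_PKU) &&
            ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
             kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) &&
            vcpu->arch.pkru != vcpu->arch.host_pkru)
                write_pkru(vcpu->arch.pkru);

With || evaluated left to right, the xcr0 test (effectively always true in
our environment) short-circuits the condition, so we skip the
kvm_read_cr4_bits() call on the common path.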

> 
>>> Not that it really matters, since static_cpu_has() will patch out all the branches,
>>> and in practice who cares about a JMP or NOP(s)?  But...
>> The reason I’ve been pursuing this is that the guest+host xsave adds up
>> to a bit over ~1% as measured by perf top in an exit-heavy workload. This
>> is the first in a few patches we’ve drummed up to get it back towards
>> zero. I’ll send the rest out next week.
> 
> Can you add a testcase to x86/vmexit.c in kvm-unit-tests, too?

Sure, I’ll check that out and see what I can do. 

Here’s a preview of the larger issue: we’re seeing a regression on SKX/CLX
hosts when supporting live migration from these older hosts to ICX hosts
in the same logical compute cluster. Our control plane automatically masks
features to the lowest common denominator, so MPX is masked away from the
guests on the older systems, and the xsave state the guests see ends up
different from the host’s. When that happens, we spend quite a bit of time
on xsetbv on every load of guest or host state; this is the case whenever
the host and guest don’t match. It’s even more expensive when the guest
also sees PKU, because on top of the xsetbv cost for the MPX mismatch we
pay for all the rdpkru/wrpkru work as well.
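
To make that concrete, the two hot spots are roughly these (paraphrased
from kvm_load_guest_xsave_state()/kvm_load_host_xsave_state() in
arch/x86/kvm/x86.c, not verbatim):

        /*
         * Guest-load side: fires on every entry once the guest xcr0
         * differs from host_xcr0, which is exactly the MPX-masked case.
         */
        if (vcpu->arch.xcr0 != host_xcr0)
                xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);

        /*
         * Host-load side: when the guest also sees PKU (under the same
         * PKU/PKE condition as in the snippet above), we save and
         * restore PKRU on top of the xsetbv cost.
         */
        vcpu->arch.pkru = rdpkru();
        if (vcpu->arch.pkru != vcpu->arch.host_pkru)
                write_pkru(vcpu->arch.host_pkru);

So every guest/host state load pays for the xsetbv, plus the rdpkru/wrpkru
pair whenever PKU is exposed.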

It turns out the xsave bringup code only checks the features that are
actually running (we see the same behavior if we compile out PKU too), so
the patches we’ve created add a knob for MPX in disabled-features and give
that early code the ability to respect the disabled-features mask. That
way it’s easy to make the host and guest match without doing BIOS-level
things.
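
As a rough illustration of the direction (hypothetical names here, loosely
following the existing arch/x86/include/asm/disabled-features.h pattern;
the actual patches may well look different):

        /* CONFIG_X86_DISABLE_MPX is a made-up Kconfig symbol for this sketch */
        #ifdef CONFIG_X86_DISABLE_MPX
        # define DISABLE_MPX    (1 << (X86_FEATURE_MPX & 31))
        #else
        # define DISABLE_MPX    0
        #endif

DISABLE_MPX would then get OR’d into the DISABLED_MASK word that covers
X86_FEATURE_MPX, and the early xsave setup would mask those disabled bits
out of the supported xfeatures before programming XCR0.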

Between those patches and this one, these two code paths are pretty cheap
now. I’ll make sure to copy the list when those patches go out, with a
detailed cover letter for the issue.

> 
> Thanks,
> 
> Paolo
> 
