[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <387E8E8B-81B9-40FF-8D52-76821599B7E4@nutanix.com>
Date: Sat, 26 Mar 2022 01:37:03 +0000
From: Jon Kohler <jon@...anix.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Jon Kohler <jon@...anix.com>, Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: x86: optimize PKU branching in
kvm_load_{guest|host}_xsave_state
> On Mar 25, 2022, at 8:15 PM, Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Wed, Mar 23, 2022, Jon Kohler wrote:
>> kvm_load_{guest|host}_xsave_state handles xsave on vm entry and exit,
>> part of which is managing memory protection key state. The latest
>> arch.pkru is updated with a rdpkru, and if that doesn't match the base
>> host_pkru (which about 70% of the time), we issue a __write_pkru.
>>
>> To improve performance, implement the following optimizations:
>> 1. Reorder if conditions prior to wrpkru in both
>> kvm_load_{guest|host}_xsave_state.
>>
>> Flip the ordering of the || condition so that XFEATURE_MASK_PKRU is
>> checked first, which when instrumented in our environment appeared
>> to be always true and less overall work than kvm_read_cr4_bits.
>
> If it's always true, then it should be checked last, not first. And if
Sean thanks for the review. This would be a left handed || short circuit, so
wouldn’t we want always true to be first?
> kvm_read_cr4_bits() is more expensive then we should address that issue, not
> try to micro-optimize this stuff. X86_CR4_PKE can't be guest-owned, and so
> kvm_read_cr4_bits() should be optimized down to:
>
> return vcpu->arch.cr4 & X86_CR4_PKE;
Ok good tip, thank you. I’ll dig into that a bit more and see what I can find. To be
fair, kvm_read_cr4_bits isn’t super expensive, it’s just more expensive than the
code I proposed. That said, I’ll see what I can do to simplify this.
>
> If the compiler isn't smart enough to do that on its own then we should rework
> kvm_read_cr4_bits() as that will benefit multiple code paths.
>
>> For kvm_load_guest_xsave_state, hoist arch.pkru != host_pkru ahead
>> one position. When instrumented, I saw this be true roughly ~70% of
>> the time vs the other conditions which were almost always true.
>> With this change, we will avoid 3rd condition check ~30% of the time.
>
> If the guest uses PKRU... If PKRU is used by the host but not the guest, the
> early comparison is pure overhead because it will almost always be true (guest
> will be zero, host will non-zero),
In a multi tenant environment with a variety of hosts and customer controlled
guests, we’ve seen this be a mixed bag. I’d be ok moving this back to the way
it is currently, I can do that on the next revision to simplify this.
>
>> 2. Wrap PKU sections with CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS,
>> as if the user compiles out this feature, we should not have
>> these branches at all.
>
> Not that it really matters, since static_cpu_has() will patch out all the branches,
> and in practice who cares about a JMP or NOP(s)? But...
The reason I’ve been pursuing this is that the guest+host xsave adds up to
a bit over ~1% as measured by perf top in an exit heavy workload. This is
the first in a few patch we’ve drummed up to to get it back towards zero.
I’ll send the rest out next week.
>
> #ifdeffery is the wrong way to handle this. Replace static_cpu_has() with
> cpu_feature_enabled(); that'll boil down to a '0' and omit all the code when
> CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=n, without the #ifdef ugliness.
Great idea, thanks. I’ll tune this up and send a v2 patch.
>
>> Signed-off-by: Jon Kohler <jon@...anix.com>
>> ---
>> arch/x86/kvm/x86.c | 14 +++++++++-----
>> 1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 6db3a506b402..2b00123a5d50 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -950,11 +950,13 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
>> wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
>> }
>>
>> +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
>> if (static_cpu_has(X86_FEATURE_PKU) &&
>> - (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
>> - (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
>> - vcpu->arch.pkru != vcpu->arch.host_pkru)
>> + vcpu->arch.pkru != vcpu->arch.host_pkru &&
>> + ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
>> + kvm_read_cr4_bits(vcpu, X86_CR4_PKE)))
>> write_pkru(vcpu->arch.pkru);
>> +#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
>> }
>> EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
>>
>> @@ -963,13 +965,15 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
>> if (vcpu->arch.guest_state_protected)
>> return;
>>
>> +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
>> if (static_cpu_has(X86_FEATURE_PKU) &&
>> - (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
>> - (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
>> + ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
>> + kvm_read_cr4_bits(vcpu, X86_CR4_PKE))) {
>> vcpu->arch.pkru = rdpkru();
>> if (vcpu->arch.pkru != vcpu->arch.host_pkru)
>> write_pkru(vcpu->arch.host_pkru);
>> }
>> +#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
>>
>> if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {
>>
>> --
>> 2.30.1 (Apple Git-130)
>>
Powered by blists - more mailing lists