Message-ID: <2FD095E7-5C74-4B58-953F-3195BA97ABEF@nutanix.com>
Date: Mon, 17 May 2021 02:58:25 +0000
From: Jon Kohler <jon@...anix.com>
To: Andy Lutomirski <luto@...nel.org>
CC: Dave Hansen <dave.hansen@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Jon Kohler <jon@...anix.com>, Babu Moger <babu.moger@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Yu-cheng Yu <yu-cheng.yu@...el.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Tony Luck <tony.luck@...el.com>,
Uros Bizjak <ubizjak@...il.com>,
Petteri Aimonen <jpa@....mail.kapsi.fi>,
Al Viro <viro@...iv.linux.org.uk>,
Kan Liang <kan.liang@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Rapoport <rppt@...nel.org>,
Fan Yang <Fan_Yang@...u.edu.cn>,
Juergen Gross <jgross@...e.com>,
Benjamin Thiel <b.thiel@...teo.de>,
Dave Jiang <dave.jiang@...el.com>,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Arvind Sankar <nivedita@...m.mit.edu>,
LKML <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH v3] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
> On May 14, 2021, at 12:46 AM, Andy Lutomirski <luto@...nel.org> wrote:
>
>
>
> On Wed, May 12, 2021, at 11:33 AM, Dave Hansen wrote:
>> On 5/12/21 12:41 AM, Peter Zijlstra wrote:
>> > On Tue, May 11, 2021 at 01:05:02PM -0400, Jon Kohler wrote:
>> >> diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
>> >> index 8d33ad80704f..5bc4df3a4c27 100644
>> >> --- a/arch/x86/include/asm/fpu/internal.h
>> >> +++ b/arch/x86/include/asm/fpu/internal.h
>> >> @@ -583,7 +583,13 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
>> >> if (pk)
>> >> pkru_val = pk->pkru;
>> >> }
>> >> - __write_pkru(pkru_val);
>> >> +
>> >> + /*
>> >> + * WRPKRU is relatively expensive compared to RDPKRU.
>> >> + * Avoid WRPKRU when it would not change the value.
>> >> + */
>> >> + if (pkru_val != rdpkru())
>> >> + wrpkru(pkru_val);
>> > Just wondering; why aren't we having that in a per-cpu variable? The
>> > usual per-cpu MSR shadow approach avoids issuing any 'special' ops
>> > entirely.
>>
>> It could be a per-cpu variable. When I wrote this originally I figured
>> that a rdpkru would be cheaper than a load from memory (even per-cpu
>> memory).
>>
>> But, now that I think about it, assuming that 'pkru_val' is in %rdi, doing:
>>
>> cmp %gs:0x1234, %rdi
>>
>> might end up being cheaper than clobbering a *pair* of GPRs with rdpkru:
>>
>> xor %ecx,%ecx
>> rdpkru
>> cmp %rax, %rdi
>>
>> I'm too lazy to go figure out what would be faster in practice, though.
>> Does anyone care?
Strictly from a profiling perspective, my observation is that rdpkru
is pretty quick; it's the wrpkru that seems heavier under the covers, so
any speedup in rdpkru would likely go unnoticed by comparison. That
said, if this per-cpu variable could somehow get rid of the underlying
instruction and just emulate the whole thing, that might be interesting.
From an incremental-change perspective, though, this patch puts
us in a better spot. Happy to take a look at future work if y'all have
some tips on top of this.
>
> RDPKRU gets bonus points for being impossible to get out of sync.
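
For anyone following along, the trade-off between the two checks discussed
in this thread can be sketched in user space. This is only a rough model,
not kernel code: sim_pkru, sim_pkru_shadow, sim_rdpkru(), sim_wrpkru(),
write_pkru_checked() and write_pkru_shadowed() are all hypothetical names
invented for illustration, and the "register" is just a plain variable.

```c
#include <stdint.h>

/* All names below are hypothetical; this models the register in user space. */
static uint32_t sim_pkru;               /* stand-in for the hardware PKRU */
static uint32_t sim_pkru_shadow;        /* stand-in for a per-CPU shadow copy */
static unsigned long sim_wrpkru_calls;  /* how many simulated WRPKRUs ran */

static uint32_t sim_rdpkru(void)
{
	return sim_pkru;
}

static void sim_wrpkru(uint32_t val)
{
	sim_pkru = val;
	sim_wrpkru_calls++;
}

/* Patch-style check: ask the register itself, write only on a change. */
static void write_pkru_checked(uint32_t val)
{
	if (val != sim_rdpkru())
		sim_wrpkru(val);
}

/* Shadow-style check: compare against the cached copy instead of reading. */
static void write_pkru_shadowed(uint32_t val)
{
	if (val != sim_pkru_shadow) {
		sim_wrpkru(val);
		sim_pkru_shadow = val;
	}
}
```

In this model, any path that calls sim_wrpkru() without also updating
sim_pkru_shadow leaves the shadow stale, and write_pkru_shadowed() can then
skip a write that was actually needed. write_pkru_checked() has no such
failure mode, since it always compares against the register contents.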