Message-ID: <20190410145725.GB10760@linux.intel.com>
Date: Wed, 10 Apr 2019 07:57:25 -0700
From: Sean Christopherson <sean.j.christopherson@...el.com>
To: David Laight <David.Laight@...LAB.COM>
Cc: 'Paolo Bonzini' <pbonzini@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH] KVM: x86: optimize check for valid PAT value
On Wed, Apr 10, 2019 at 12:55:53PM +0000, David Laight wrote:
> From: Paolo Bonzini
> > Sent: 10 April 2019 10:55
> >
> > This check will soon be done on every nested vmentry and vmexit,
> > "parallelize" it using bitwise operations.
> >
> > Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> > ---
> ...
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index 28406aa1136d..7bc7ac9d2a44 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -347,4 +347,12 @@ static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu)
> > 	__this_cpu_write(current_vcpu, NULL);
> > }
> >
> > +static inline bool kvm_pat_valid(u64 data)
> > +{
> > +	if (data & 0xF8F8F8F8F8F8F8F8)
> > +		return false;
> > +	/* 0, 1, 4, 5, 6, 7 are valid values. */
> > +	return (data | ((data & 0x0202020202020202) << 1)) == data;
> > +}
> > +
>
> How about:
> 	/*
> 	 * Each byte must be 0, 1, 4, 5, 6 or 7.
> 	 * Convert 001x to 011x then 100x so 2 and 3 fail the test.
> 	 */
> 	data |= (data ^ 0x0404040404040404ULL) + 0x0202020202020202ULL;
> 	if (data & 0xF8F8F8F8F8F8F8F8ULL)
> 		return false;
Woah. My vote is for Paolo's version as the separate checks allow the
reader to walk through step-by-step. The generated assembly isn't much
different from a performance perspective since the TEST+JNE will not be
taken in the fast path.
Fancy:
0x000000000004844f <+255>: movabs $0xf8f8f8f8f8f8f8f8,%rcx
0x0000000000048459 <+265>: xor %eax,%eax
0x000000000004845b <+267>: test %rcx,%rdx
0x000000000004845e <+270>: jne 0x4848b <kvm_mtrr_valid+315>
0x0000000000048460 <+272>: movabs $0x202020202020202,%rax
0x000000000004846a <+282>: and %rdx,%rax
0x000000000004846d <+285>: add %rax,%rax
0x0000000000048470 <+288>: or %rdx,%rax
0x0000000000048473 <+291>: cmp %rdx,%rax
0x0000000000048476 <+294>: sete %al
0x0000000000048479 <+297>: retq
Really fancy:
0x0000000000048447 <+247>: movabs $0x404040404040404,%rcx
0x0000000000048451 <+257>: movabs $0x202020202020202,%rax
0x000000000004845b <+267>: xor %rdx,%rcx
0x000000000004845e <+270>: add %rax,%rcx
0x0000000000048461 <+273>: movabs $0xf8f8f8f8f8f8f8f8,%rax
0x000000000004846b <+283>: or %rcx,%rdx
0x000000000004846e <+286>: test %rax,%rdx
0x0000000000048471 <+289>: sete %al
0x0000000000048474 <+292>: retq
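
For anyone who wants to check that the two variants accept exactly the same
values, here is a minimal userspace sketch. It is not kernel code: the
function names (pat_valid_ref, pat_valid_two_step, pat_valid_one_test,
pat_count_mismatches, pat_self_check) are made up for illustration,
uint64_t stands in for u64, and the ULL suffixes on the constants in the
two-step version are my addition. It only sweeps every 16-bit pattern in
each pair of adjacent bytes, with the other bytes held at 0 or 7, which
exercises carries between neighboring bytes but is not an exhaustive
64-bit proof.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference: look at each of the eight bytes one at a time. */
static inline bool pat_valid_ref(uint64_t data)
{
	for (int i = 0; i < 8; i++) {
		uint8_t b = (data >> (8 * i)) & 0xff;

		/* Only 0, 1, 4, 5, 6 and 7 are accepted. */
		if (b > 7 || b == 2 || b == 3)
			return false;
	}
	return true;
}

/* Two separate checks, as in the patch. */
static inline bool pat_valid_two_step(uint64_t data)
{
	/* Any of bits 3..7 set in a byte means the byte is > 7. */
	if (data & 0xF8F8F8F8F8F8F8F8ULL)
		return false;
	/* Bit 1 set requires bit 2 set, which rejects 2 and 3. */
	return (data | ((data & 0x0202020202020202ULL) << 1)) == data;
}

/* Single test, as in the suggested alternative. */
static inline bool pat_valid_one_test(uint64_t data)
{
	/* 2 and 3 become 8 and 9; the OR keeps any high bits already set. */
	data |= (data ^ 0x0404040404040404ULL) + 0x0202020202020202ULL;
	return !(data & 0xF8F8F8F8F8F8F8F8ULL);
}

/* Count inputs on which either variant disagrees with the reference. */
static inline unsigned int pat_count_mismatches(void)
{
	static const uint64_t fills[] = { 0, 0x0707070707070707ULL };
	unsigned int bad = 0;

	for (int f = 0; f < 2; f++) {
		for (int lane = 0; lane < 7; lane++) {
			uint64_t mask = 0xffffULL << (8 * lane);

			for (uint64_t v = 0; v <= 0xffff; v++) {
				uint64_t data = (fills[f] & ~mask) |
						(v << (8 * lane));
				bool want = pat_valid_ref(data);

				if (pat_valid_two_step(data) != want ||
				    pat_valid_one_test(data) != want)
					bad++;
			}
		}
	}
	return bad;
}

/* Aborts if the sweep finds any disagreement. */
static inline void pat_self_check(void)
{
	assert(pat_count_mismatches() == 0);
}
```

Calling pat_self_check() from a small main() is enough to run the sweep.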