[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <eeed31d55f20561f9ef06afa40f0fa9d7f3032af.camel@redhat.com>
Date: Thu, 14 Jul 2022 13:57:31 +0300
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] KVM: x86: Add dedicated helper to get CPUID entry
with significant index
On Tue, 2022-07-12 at 16:35 +0000, Sean Christopherson wrote:
> On Tue, Jul 12, 2022, Maxim Levitsky wrote:
> > On Tue, 2022-07-12 at 00:06 +0000, Sean Christopherson wrote:
> > > static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
> > > - struct kvm_cpuid_entry2 *entries, int nent, u32 function, u32 index)
> > > + struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index)
> > How I wish that this would be just called EAX and ECX... Anyway....
>
> Heh, I strongly disagree. EAX and ECX are how the CPUID instruction specifies
> the function and index, CPUID the lookup itself operates on function+index,
> e.g. there are plenty of situations where KVM queries CPUID info without the
> inputs coming from EAX/ECX.
Just a matter of taste I guess - note that outputs of CPUID instructions are always called
as registers (EAX/EBX/ECX/EDX). But anyway it doesn't really mattter to me.
>
> > > {
> > > struct kvm_cpuid_entry2 *e;
> > > int i;
> > > @@ -77,9 +85,22 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
> > > for (i = 0; i < nent; i++) {
> > > e = &entries[i];
> > >
> > > - if (e->function == function &&
> > > - (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index))
> > > + if (e->function != function)
> > > + continue;
> > > +
> > > + /*
> > > + * If the index isn't significant, use the first entry with a
> > > + * matching function. It's userspace's responsibilty to not
> > > + * provide "duplicate" entries in all cases.
> > > + */
> > > + if (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index)
> > > return e;
> > > +
> > > + /*
> > > + * Function matches and index is significant; not specifying an
> > > + * exact index in this case is a KVM bug.
> > > + */
> > Nitpick: Why KVM bug? Bad userspace can also provide a index-significant entry for cpuid
> > leaf for which index is not significant in the x86 spec.
>
> Ugh, you're right.
>
> > We could arrange a table of all known leaves and for each leaf if it has an index
> > in the x86 spec, and warn/reject the userspace CPUID info if it doesn't match.
>
> We have such a table, cpuid_function_is_indexed(). The alternative would be to
> do:
>
> WARN_ON_ONCE(index == KVM_CPUID_INDEX_NOT_SIGNIFICANT &&
> cpuid_function_is_indexed(function));
>
> The problem with rejecting userspace CPUID on mismatch is that it could break
> userspace :-/ Of course, this entire patch would also break userspace to some
> extent, e.g. if userspace is relying on an exact match on index==0. The only
> difference being the guest lookups with an exact index would still work.
>
> I think the restriction we could put in place would be that userspace can make
> a leaf more relaxed, e.g. to play nice if userspace forgets to set the SIGNFICANT
> flag, but rejects attempts to make guest CPUID more restrictive, i.e. disallow
> setting the SIGNFICANT flag on leafs that KVM doesn't enumerate as significant.
Makes sense.
>
> > > + WARN_ON_ONCE(index == KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > > }
> > >
> > > return NULL;
> > > @@ -96,7 +117,8 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
> > > * The existing code assumes virtual address is 48-bit or 57-bit in the
> > > * canonical address checks; exit if it is ever changed.
> > > */
> > > - best = cpuid_entry2_find(entries, nent, 0x80000008, 0);
> > > + best = cpuid_entry2_find(entries, nent, 0x80000008,
> > > + KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> > OK.
>
> Thanks for looking through all these!
No problem!
>
> > > static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
> > > @@ -219,7 +242,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
> > > struct kvm_cpuid_entry2 *best;
> > > u64 guest_supported_xcr0 = cpuid_get_supported_xcr0(entries, nent);
> > >
> > > - best = cpuid_entry2_find(entries, nent, 1, 0);
> > > + best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> >
> > Leaf 1, no index indeed.
> >
> > Offtopic: I wonder why we call this 'best'?
>
> Awful, awful historic code. IIRC, for functions whose index is not significant,
> KVM would iterate over all entries and look for an exact function+index match
> anyways. If there was at least one partial match (function match only) but no
> full match, KVM would use the first partial match, which it called the "best" match.
>
> We've been slowly/opportunistically killing off the "best" terminology.
Thanks for the explanation. I also noticed that you removed recently the 'stateful'
CPUID code. Good riddance!
>
> > > -struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
> > > - u32 function, u32 index)
> > > +struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
> > > + u32 function, u32 index)
> > Nitpick: could you fix the indention while at it?
>
> The indentation is correct, it's only the diff that appears misaligned.
I think I already heard about this once, I will try to not forget and keep
that in mind next time I notice this. Sorry!
>
> > > @@ -1353,11 +1384,11 @@ get_out_of_range_cpuid_entry(struct kvm_vcpu *vcpu, u32 *fn_ptr, u32 index)
> > > return NULL;
> > >
> > > if (function >= 0x40000000 && function <= 0x4fffffff)
> > > - class = kvm_find_cpuid_entry(vcpu, function & 0xffffff00, 0);
> > > + class = kvm_find_cpuid_entry(vcpu, function & 0xffffff00);
> > > else if (function >= 0xc0000000)
> > > - class = kvm_find_cpuid_entry(vcpu, 0xc0000000, 0);
> > > + class = kvm_find_cpuid_entry(vcpu, 0xc0000000);
> > > else
> > > - class = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
> > > + class = kvm_find_cpuid_entry(vcpu, function & 0x80000000);
> > This assumes that all the classes has first entry whose EAX specifies max leaf
> > for this class. True for sure for basic and extended features, don't know
> > if true for hypervisor and Centaur entries. Seems OK.
>
> It holds true for all known hypervisors. There's no formal definition for using
> 0x400000yy as the hypervisor range, but the de facto standard is to use EBX, ECX,
> and EDX for the signature, and EAX for the max leaf.
>
> The Centaur behavior is very much a guess, but odds are it's a correct guess. When
> I added the Centaur code, I spent far too much time trying (and failing) to hunt
> down documentation.
I understand very well what you mean.
Best regards,
Maxim Levitsky
>
Powered by blists - more mailing lists