Message-ID: <aO2EFiOHSuvmHvq_@google.com>
Date: Mon, 13 Oct 2025 15:58:30 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 08/12] KVM: selftests: Use 'leaf' instead of hugepage to
describe EPT entries

On Mon, Oct 13, 2025, Yosry Ahmed wrote:
> On Mon, Oct 13, 2025 at 02:41:56PM -0700, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, Yosry Ahmed wrote:
> > > From: Yosry Ahmed <yosryahmed@...gle.com>
> > >
> > > The assertions use 'hugepage' to describe a terminal EPT entry, but
> > > 'leaf' is more accurate as a PG_LEVEL_4K EPT entry is a leaf but not a
> > > hugepage.
> >
> > Yes, it's more accurate, but also less precise. I'm guessing the assert message
> > and comment talked about hugepages because that's the type of mappings that
> > caused problems at the time.
>
> Given that it refers to PG_LEVEL_4K entries too, I wouldn't call it less
> precise. All callers actually create 4K mappings so it is never actually
> a hugepage in the current context :D

nested_identity_map_1g()?

> > Ah, actually, I bet the code was copy+pasted from virt_create_upper_pte(), in
> > which case the assumptions about wanting to create a hugepage are both accurate
> > and precise.
> >
> > > The distinction will be useful in coming changes that will pass
> > > the value around and 'leaf' is clearer than hugepage or page_size.
> >
> > What value?
>
> 'leaf'. The following changes will pass 'leaf' in as a boolean instead
> of checking 'current_level == target_level' here. So passing in
> 'hugepage' would be inaccurate, and 'page_size' is not as clear (but
> still works).
>
> >
> > > Leave the EPT bit named page_size to keep it conforming to the manual.
> > >
> > > Signed-off-by: Yosry Ahmed <yosry.ahmed@...ux.dev>
> > > ---
> > > tools/testing/selftests/kvm/lib/x86/vmx.c | 10 +++++-----
> > > 1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/kvm/lib/x86/vmx.c b/tools/testing/selftests/kvm/lib/x86/vmx.c
> > > index 04c4b97bcd1e7..673756b27e903 100644
> > > --- a/tools/testing/selftests/kvm/lib/x86/vmx.c
> > > +++ b/tools/testing/selftests/kvm/lib/x86/vmx.c
> > > @@ -380,15 +380,15 @@ static void nested_create_pte(struct kvm_vm *vm,
> > > pte->address = vm_alloc_page_table(vm) >> vm->page_shift;
> > > } else {
> > > /*
> > > - * Entry already present. Assert that the caller doesn't want
> > > - * a hugepage at this level, and that there isn't a hugepage at
> > > - * this level.
> > > + * Entry already present. Assert that the caller doesn't want a
> > > + * leaf entry at this level, and that there isn't a leaf entry
> > > + * at this level.
> > > */
> > > TEST_ASSERT(current_level != target_level,
> > > - "Cannot create hugepage at level: %u, nested_paddr: 0x%lx",
> > > + "Cannot create leaf entry at level: %u, nested_paddr: 0x%lx",
> > > current_level, nested_paddr);
> > > TEST_ASSERT(!pte->page_size,
> > > - "Cannot create page table at level: %u, nested_paddr: 0x%lx",
> > > + "Leaf entry already exists at level: %u, nested_paddr: 0x%lx",
> >
> > This change is flat out wrong. The existing PRESENT PTE _might_ be a 4KiB leaf
> > entry, but it might also be an existing non-leaf page table.
>
> Hmm if pte->page_size is true then it has to be a leaf page table,
> right?

No, because bit 7 is ignored by hardware for 4KiB entries. I.e. it can be 0 or
1 depending on the whims of software. Ugh, this code uses bit 7 to flag leaf
entries. That's lovely.
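
To make the bit 7 point concrete, here is a minimal sketch of how a walker could decide whether an EPT entry is a leaf without trusting bit 7 at the 4KiB level. The helper name, the macro, and the level numbering (1 = 4KiB, 2 = 2MiB, 3 = 1GiB) are assumptions for illustration, not the selftest's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 7 of an EPT entry; hypothetical macro name for this sketch. */
#define SKETCH_EPT_PAGE_SIZE	(1ULL << 7)

/*
 * Assumed level numbering: 1 = 4KiB entry, 2 = 2MiB, 3 = 1GiB.
 * At level 1 the entry is always a leaf and bit 7 carries no meaning for
 * hardware, so only the level is consulted. At higher levels bit 7
 * distinguishes a large-page leaf from a pointer to the next table.
 */
static bool sketch_ept_entry_is_leaf(uint64_t entry, int level)
{
	if (level == 1)
		return true;

	return !!(entry & SKETCH_EPT_PAGE_SIZE);
}
```

With this shape, a 4KiB entry reports as a leaf whether software left bit 7 at 0 or set it to 1.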

> If it's an existing non-leaf page table we shouldn't fail,

Ah, right, current_level can never be less than target_level because the first
assert will fail on iteration-1.

> the assertion here is when we try to override a leaf page table IIUC.
>
> > Instead of hacking on the nested code, can we instead tweak __virt_pg_map() to
> > work with nested TDP? At a glance, it's already quite close, e.g. "just" needs
> > to be taught about EPT RWX bits and allow the call to pass in the root pointer.
>
> That would be ideal, I'll take a look. In case I don't have time for
> that unification, can this be a follow-up change?

Part of me wants to be nice and say "yes", but most of me wants to say "no".
Struct overlays for PTEs suck. At best, they generate poor code and obfuscate
simple logic (e.g. vm->page_size vs pte->page_size is a confusion that simply
should not be possible). At worst, they lead to hard-to-debug issues like the
one that led to commit f18b4aebe107 ("kvm: selftests: do not use bitfields larger
than 32-bits for PTEs").
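
As a rough illustration of the alternative to struct overlays, a PTE can be kept as a plain 64-bit value and manipulated with masks. All names below are hypothetical and chosen for this sketch only; the address mask assumes a 4KiB-aligned physical address in bits 51:12:

```c
#include <stdint.h>

/* Hypothetical names; the PTE is a plain u64 rather than a bitfield struct. */
#define SKETCH_PTE_PRESENT	(1ULL << 0)
#define SKETCH_PTE_LARGE	(1ULL << 7)
#define SKETCH_PTE_ADDR_MASK	0x000ffffffffff000ULL	/* bits 51:12 */

/* Build an entry from a page-aligned physical address and flag bits. */
static uint64_t sketch_pte_make(uint64_t paddr, uint64_t flags)
{
	return (paddr & SKETCH_PTE_ADDR_MASK) | flags;
}

/* Extract the physical address the entry points at. */
static uint64_t sketch_pte_paddr(uint64_t pte)
{
	return pte & SKETCH_PTE_ADDR_MASK;
}
```

Because the flag is only ever spelled as a mask on a u64, there is no pte->page_size field to confuse with vm->page_size, and no wide bitfield for the compiler to lay out.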

eptPageTableEntry obviously isn't your fault, but nptPageTableEntry is. :-D

And I suspect the hardest part of unification will be adding the globals to
deal with variable bit positions that are currently being handled by the struct
overlays.
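
One possible shape for handling variable bit positions, sketched under the assumption that a per-format table of masks is acceptable (the struct, its fields, and both instances are hypothetical, not an existing selftest API). The bit positions shown are the architectural ones: accessed/dirty are bits 5/6 in the regular x86 format and bits 8/9 in EPT:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-format description of where each PTE flag lives. */
struct sketch_pte_masks {
	uint64_t map_bits;	/* bits set when creating a usable mapping */
	uint64_t accessed;
	uint64_t dirty;
	uint64_t leaf;		/* large-page bit at non-4KiB levels */
};

/* Regular x86 paging (also the layout used by NPT): P | W. */
static const struct sketch_pte_masks sketch_x86_masks = {
	.map_bits = (1ULL << 0) | (1ULL << 1),
	.accessed = 1ULL << 5,
	.dirty    = 1ULL << 6,
	.leaf     = 1ULL << 7,
};

/* EPT: R | W | X instead of a single present bit. */
static const struct sketch_pte_masks sketch_ept_masks = {
	.map_bits = (1ULL << 0) | (1ULL << 1) | (1ULL << 2),
	.accessed = 1ULL << 8,
	.dirty    = 1ULL << 9,
	.leaf     = 1ULL << 7,
};

/* The same walker code works for either format by taking the mask table. */
static bool sketch_pte_is_dirty(const struct sketch_pte_masks *m, uint64_t pte)
{
	return !!(pte & m->dirty);
}
```

A shared mapping helper could then take a pointer to the relevant table along with the root pointer, rather than hard-coding one layout.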