[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFMl6DOcKfH6ampb@google.com>
Date: Wed, 18 Jun 2025 13:47:36 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Oliver Upton <oliver.upton@...ux.dev>
Cc: James Houghton <jthoughton@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>, Marc Zyngier <maz@...nel.org>, Yan Zhao <yan.y.zhao@...el.com>,
Nikita Kalyazin <kalyazin@...zon.com>, Anish Moorthy <amoorthy@...gle.com>,
Peter Gonda <pgonda@...gle.com>, Peter Xu <peterx@...hat.com>,
David Matlack <dmatlack@...gle.com>, wei.w.wang@...el.com, kvm@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v3 03/15] KVM: arm64: x86: Require "struct kvm_page_fault"
for memory fault exits
On Wed, Jun 18, 2025, Oliver Upton wrote:
> On Wed, Jun 18, 2025 at 04:24:12AM +0000, James Houghton wrote:
> > +#ifdef CONFIG_KVM_GENERIC_PAGE_FAULT
> > +
> > +#define KVM_ASSERT_TYPE_IS(type_t, x) \
> > +do { \
> > + type_t __maybe_unused tmp; \
> > + \
> > + BUILD_BUG_ON(!__types_ok(tmp, x) || !__typecheck(tmp, x)); \
> > +} while (0)
> > +
> > static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
> > - gpa_t gpa, gpa_t size,
> > - bool is_write, bool is_exec,
> > - bool is_private)
> > + struct kvm_page_fault *fault)
> > {
> > + KVM_ASSERT_TYPE_IS(gfn_t, fault->gfn);
> > + KVM_ASSERT_TYPE_IS(bool, fault->exec);
> > + KVM_ASSERT_TYPE_IS(bool, fault->write);
> > + KVM_ASSERT_TYPE_IS(bool, fault->is_private);
> > + KVM_ASSERT_TYPE_IS(struct kvm_memory_slot *, fault->slot);
> > +
> > vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> > - vcpu->run->memory_fault.gpa = gpa;
> > - vcpu->run->memory_fault.size = size;
> > + vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
> > + vcpu->run->memory_fault.size = PAGE_SIZE;
> >
> > /* RWX flags are not (yet) defined or communicated to userspace. */
> > vcpu->run->memory_fault.flags = 0;
> > - if (is_private)
> > + if (fault->is_private)
> > vcpu->run->memory_fault.flags |= KVM_MEMORY_EXIT_FLAG_PRIVATE;
> > }
> > +#endif
>
> This *is not* the right direction of travel for arm64. Stage-2 aborts /
> EPT violations / etc. are extremely architecture-specific events.
Yes and no. 100% agreed there are arch/vendor specific aspects of stage-2 faults,
but there are most definitely commonalites as well.
> What I would like to see on arm64 is that for every "KVM_EXIT_MEMORY_FAULT"
> we provide as much syndrome information as possible. That could imply
> some combination of a sanitised view of ESR_EL2 and, where it is
> unambiguous, common fault flags that have shared definitions with x86.
Me confused, this is what the above does? "struct kvm_page_fault" is arch
specific, e.g. x86 has a whole pile of stuff in there beyond gfn, exec, write,
is_private, and slot.
The approach is non-standard, but I think my justification/reasoning for having
the structure be arch-defined still holds:
: Rather than define a common kvm_page_fault and kvm_arch_page_fault child,
: simply assert that the handful of required fields are provided by the
: arch-defined structure. Unlike vCPU and VMs, the number of common fields
: is expected to be small, and letting arch code fully define the structure
: allows for maximum flexibility with respect to const, layout, etc.
If we could use anonymous struct field, i.e. could embed a kvm_arch_page_fault
without having to bounce through an "arch" field, I would vote for the approach.
Sadly, AFAIK, we can't yet use those in the kernel.
> This could incur some minor code duplication, but even then we can share
> helpers for the software bits (like userfault).
Again, that is what is proposed here.
> FEAT_MTE_PERM is a very good example for this. There exists a "Tag"
> permission at stage-2 which is unrelated to any of the 'normal'
> read/write permissions. There's also the MostlyReadOnly permission from
> FEAT_THE which grants write permission to a specific set of instructions.
>
> I don't want to paper over these nuances and will happily maintain an
> arm64-specific flavor of "kvm_prepare_memory_fault_exit()"
Nothing prevents arm64 (or any arch) from wrapping kvm_prepare_memory_fault_exit()
and/or taking action after it's invoked. That's not an accident; the "prepare
exit" helpers (x86 has a few more) were specifically designed to not be used as
the "return" to userspace. E.g. this one returns "void" instead of -EFAULT
specifically so that the callers isn't "required" to ignore the return if the
caller wants to populate (or change, but hopefully that's never the case) fields
after calling kvm_prepare_memory_fault_exit), and so that arch can return an
entirely different error code, e.g. -EHWPOISON when appropriate.
And it's not just kvm_prepare_memory_fault_exit() that I want to use kvm_page_fault;
kvm_faultin_pfn() is another case where having a common "struct kvm_page_fault"
would clean up some ugly/annoying boilerplate.
Powered by blists - more mailing lists