[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <timpvklyyl5juo5ajjzuxwazc5w2t6ffcx7llnv6f2a5qzot3b@hnj3wqwtla6c>
Date: Tue, 16 Dec 2025 21:34:57 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 24/26] KVM: nSVM: Restrict mapping VMCB12 on nested
VMRUN
On Tue, Dec 16, 2025 at 04:35:04PM +0000, Yosry Ahmed wrote:
> On Mon, Dec 15, 2025 at 07:27:19PM +0000, Yosry Ahmed wrote:
> > All accesses to the VMCB12 in the guest memory on nested VMRUN are
> > limited to nested_svm_vmrun() and nested_svm_failed_vmrun(). However,
> > the VMCB12 remains mapped throughout nested_svm_vmrun(). Mapping and
> > unmapping around usages is possible, but it becomes easy-ish to
> > introduce bugs where 'vmcb12' is used after being unmapped.
> >
> > Move reading the VMCB12 and copying to cache from nested_svm_vmrun()
> > into a new helper, nested_svm_copy_vmcb12_to_cache(), that maps the
> > VMCB12, caches the needed fields, and unmaps it. Use
> > kvm_vcpu_map_readonly() as only reading the VMCB12 is needed.
> >
> > Similarly, move mapping the VMCB12 on VMRUN failure into
> > nested_svm_failed_vmrun(). Inject a triple fault if the mapping fails,
> > similar to nested_svm_vmexit().
> >
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@...ux.dev>
> > ---
> > arch/x86/kvm/svm/nested.c | 55 ++++++++++++++++++++++++++++-----------
> > 1 file changed, 40 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 48ba34d8b713..d33a2a27efe5 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -1055,23 +1055,55 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm, struct vmcb *vmcb12)
> > kvm_queue_exception(vcpu, DB_VECTOR);
> > }
> >
> > -static void nested_svm_failed_vmrun(struct vcpu_svm *svm, struct vmcb *vmcb12)
> > +static void nested_svm_failed_vmrun(struct vcpu_svm *svm, u64 vmcb12_gpa)
> > {
> > + struct kvm_vcpu *vcpu = &svm->vcpu;
> > + struct kvm_host_map map;
> > + struct vmcb *vmcb12;
> > + int r;
> > +
> > WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);
> >
>
> Ugh I missed leave_guest_mode() here, which means guest state won't be
> cleaned up properly and the triple fault won't be inject correctly if
> unmap fails. I re-introduced the bug I fixed earlier in the series :')
>
> Should probably add WARN_ON_ONCE(is_guest_mode()) in
> __nested_svm_vmexit() since the caller is expected to
> leave_guest_mode().
Although none of the selftests or KUTs failed because of this, running
them again now I noticed a couple of WARNs firing, which is reassuring
because the problem is detectable. Namely:
In nested_svm_vmexit(), the WARN introduced earlier in the series:
WARN_ON_ONCE(svm->vmcb != svm->nested.vmcb02.ptr);
In vcpu_enter_guest():
/*
* Assert that vCPU vs. VM APICv state is consistent. An APICv
* update must kick and wait for all vCPUs before toggling the
* per-VM state, and responding vCPUs must wait for the update
* to complete before servicing KVM_REQ_APICV_UPDATE.
*/
WARN_ON_ONCE((kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu)) &&
(kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED));
Not sure why the latter fired, but probably due to is_guest_mode()
returning the wrong thing in some code path.
Anyway, with the following diff no WARNs are produced with the selftests
or KUTs:
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 7af701e92c81..58e168e8c1ee 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1039,6 +1039,8 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm, struct vmcb *vmcb12)
struct vmcb *vmcb01 = svm->vmcb01.ptr;
struct kvm_vcpu *vcpu = &svm->vcpu;
+ WARN_ON_ONCE(is_guest_mode(vcpu));
+
svm->nested.vmcb12_gpa = 0;
svm->nested.ctl.nested_cr3 = 0;
@@ -1075,6 +1077,8 @@ static void nested_svm_failed_vmrun(struct vcpu_svm *svm, u64 vmcb12_gpa)
WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);
+ leave_guest_mode(vcpu);
+
r = kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map);
if (r) {
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
Sean, I will wait until you get a chance to look through the series, at
which point let me know how you want to proceed. I assume it'll depend
on whether or not other patches require a new version.
Powered by blists - more mailing lists