lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <timpvklyyl5juo5ajjzuxwazc5w2t6ffcx7llnv6f2a5qzot3b@hnj3wqwtla6c>
Date: Tue, 16 Dec 2025 21:34:57 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Paolo Bonzini <pbonzini@...hat.com>, 
	Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 24/26] KVM: nSVM: Restrict mapping VMCB12 on nested
 VMRUN

On Tue, Dec 16, 2025 at 04:35:04PM +0000, Yosry Ahmed wrote:
> On Mon, Dec 15, 2025 at 07:27:19PM +0000, Yosry Ahmed wrote:
> > All accesses to the VMCB12 in the guest memory on nested VMRUN are
> > limited to nested_svm_vmrun() and nested_svm_failed_vmrun(). However,
> > the VMCB12 remains mapped throughout nested_svm_vmrun().  Mapping and
> > unmapping around usages is possible, but it becomes easy-ish to
> > introduce bugs where 'vmcb12' is used after being unmapped.
> > 
> > Move reading the VMCB12 and copying to cache from nested_svm_vmrun()
> > into a new helper, nested_svm_copy_vmcb12_to_cache(),  that maps the
> > VMCB12, caches the needed fields, and unmaps it. Use
> > kvm_vcpu_map_readonly() as only reading the VMCB12 is needed.
> > 
> > Similarly, move mapping the VMCB12 on VMRUN failure into
> > nested_svm_failed_vmrun(). Inject a triple fault if the mapping fails,
> > similar to nested_svm_vmexit().
> > 
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@...ux.dev>
> > ---
> >  arch/x86/kvm/svm/nested.c | 55 ++++++++++++++++++++++++++++-----------
> >  1 file changed, 40 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 48ba34d8b713..d33a2a27efe5 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -1055,23 +1055,55 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm, struct vmcb *vmcb12)
> >  		kvm_queue_exception(vcpu, DB_VECTOR);
> >  }
> >  
> > -static void nested_svm_failed_vmrun(struct vcpu_svm *svm, struct vmcb *vmcb12)
> > +static void nested_svm_failed_vmrun(struct vcpu_svm *svm, u64 vmcb12_gpa)
> >  {
> > +	struct kvm_vcpu *vcpu = &svm->vcpu;
> > +	struct kvm_host_map map;
> > +	struct vmcb *vmcb12;
> > +	int r;
> > +
> >  	WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);
> >  
> 
> Ugh I missed leave_guest_mode() here, which means guest state won't be
> cleaned up properly and the triple fault won't be inject correctly if
> unmap fails. I re-introduced the bug I fixed earlier in the series :')
> 
> Should probably add WARN_ON_ONCE(is_guest_mode()) in
> __nested_svm_vmexit() since the caller is expected to
> leave_guest_mode().

Although none of the selftests or KUTs failed because of this, running
them again now I noticed a couple of WARNs firing, which is reassuring
because the problem is detectable. Namely:

In nested_svm_vmexit(), the WARN introduced earlier in the series:

	WARN_ON_ONCE(svm->vmcb != svm->nested.vmcb02.ptr);

In vcpu_enter_guest():

	/*
	 * Assert that vCPU vs. VM APICv state is consistent.  An APICv
	 * update must kick and wait for all vCPUs before toggling the
	 * per-VM state, and responding vCPUs must wait for the update
	 * to complete before servicing KVM_REQ_APICV_UPDATE.
	 */
	WARN_ON_ONCE((kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu)) &&
		     (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED));


Not sure why the latter fired, but probably due to is_guest_mode()
returning the wrong thing in some code path.

Anyway, with the following diff no WARNs are produced with the selftests
or KUTs:

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 7af701e92c81..58e168e8c1ee 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1039,6 +1039,8 @@ static void __nested_svm_vmexit(struct vcpu_svm *svm, struct vmcb *vmcb12)
        struct vmcb *vmcb01 = svm->vmcb01.ptr;
        struct kvm_vcpu *vcpu = &svm->vcpu;

+       WARN_ON_ONCE(is_guest_mode(vcpu));
+
        svm->nested.vmcb12_gpa = 0;
        svm->nested.ctl.nested_cr3 = 0;

@@ -1075,6 +1077,8 @@ static void nested_svm_failed_vmrun(struct vcpu_svm *svm, u64 vmcb12_gpa)

        WARN_ON(svm->vmcb == svm->nested.vmcb02.ptr);

+       leave_guest_mode(vcpu);
+
        r = kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map);
        if (r) {
                kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);


Sean, I will wait until you get a chance to look through the series, at
which point let me know how you want to proceed. I assume it'll depend
on whether or not other patches require a new version.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ