lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220521131700.3661-2-jiangshanlai@gmail.com>
Date:   Sat, 21 May 2022 21:16:49 +0800
From:   Lai Jiangshan <jiangshanlai@...il.com>
To:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        David Matlack <dmatlack@...gle.com>,
        Lai Jiangshan <jiangshan.ljs@...group.com>
Subject: [PATCH V3 01/12] KVM: X86/MMU: Verify PDPTE for nested NPT in PAE paging mode when page fault

From: Lai Jiangshan <jiangshan.ljs@...group.com>

When nested NPT enabled and L1 is PAE paging, mmu->get_pdptrs()
which is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE
from memory unconditionally for each call.

The guest PAE root page is not write-protected.

The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get a value
different from previous calls or different from the return value of
mmu->get_pdptrs() in mmu_alloc_shadow_roots().

It will cause FNAME(fetch) installs the spte in a wrong sp or links
a sp to a wrong parent if the return value of mmu->get_pdptrs()
is not verified unchanged since FNAME(gpte_changed) can't check
this kind of change.

Verify the return value of mmu->get_pdptrs() (only the gfn in it
needs to be checked) and do kvm_mmu_free_roots() like load_pdptr()
if the gfn isn't matched.

Do the verifying unconditionally when the guest is PAE paging no
matter whether it is nested NPT or not to avoid complicated code.

The commit e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE
from guest memory") fixs the same problem for non-nested case via
caching the PDPTEs which is also the same way as how hardware caches
the PDPTEs.

Under SVM, however, when the processor is in guest mode with PAE
enabled, the guest PDPTE entries are not cached or validated at this
point, but instead are loaded and checked on demand in the normal course
of address translation, just like page directory and page table entries.
Any reserved bit violations are detected at the point of use, and result
in a page-fault (#PF) exception rather than a general-protection (#GP)
exception.

So using caches can not fix the problem for shadowing nested NPT for
32bit L1.

Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory")
Signed-off-by: Lai Jiangshan <jiangshan.ljs@...group.com>
---
 arch/x86/kvm/mmu/paging_tmpl.h | 39 ++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index db80f7ccaa4e..6e3df84e8455 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -870,6 +870,44 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (is_page_fault_stale(vcpu, fault, mmu_seq))
 		goto out_unlock;
 
+	/*
+	 * When nested NPT enabled and L1 is PAE paging, mmu->get_pdptrs()
+	 * which is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE
+	 * from memory unconditionally for each call.
+	 *
+	 * The guest PAE root page is not write-protected.
+	 *
+	 * The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get a value
+	 * different from previous calls or different from the return value of
+	 * mmu->get_pdptrs() in mmu_alloc_shadow_roots().
+	 *
+	 * It will cause FNAME(fetch) installs the spte in a wrong sp or links
+	 * a sp to a wrong parent if the return value of mmu->get_pdptrs()
+	 * is not verified unchanged since FNAME(gpte_changed) can't check
+	 * this kind of change.
+	 *
+	 * Verify the return value of mmu->get_pdptrs() (only the gfn in it
+	 * needs to be checked) and do kvm_mmu_free_roots() like load_pdptr()
+	 * if the gfn isn't matched.
+	 *
+	 * Do the verifying unconditionally when the guest is PAE paging no
+	 * matter whether it is nested NPT or not to avoid complicated code.
+	 */
+	if (vcpu->arch.mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
+		u64 pdpte = vcpu->arch.mmu->pae_root[(fault->addr >> 30) & 3];
+		struct kvm_mmu_page *sp = NULL;
+
+		if (IS_VALID_PAE_ROOT(pdpte))
+			sp = to_shadow_page(pdpte & PT64_BASE_ADDR_MASK);
+
+		if (!sp || walker.table_gfn[PT32E_ROOT_LEVEL - 2] != sp->gfn) {
+			write_unlock(&vcpu->kvm->mmu_lock);
+			kvm_mmu_free_roots(vcpu->kvm, vcpu->arch.mmu,
+					   KVM_MMU_ROOT_CURRENT);
+			goto release_clean;
+		}
+	}
+
 	r = make_mmu_pages_available(vcpu);
 	if (r)
 		goto out_unlock;
@@ -877,6 +915,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
+release_clean:
 	kvm_release_pfn_clean(fault->pfn);
 	return r;
 }
-- 
2.19.1.6.gb485710b

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ