lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220327205803.739336-1-mizhang@google.com>
Date:   Sun, 27 Mar 2022 20:58:03 +0000
From:   Mingwei Zhang <mizhang@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Ben Gardon <bgardon@...gle.com>,
        David Matlack <dmatlack@...gle.com>,
        Mingwei Zhang <mizhang@...gle.com>
Subject: [PATCH] KVM: x86/mmu: add lockdep check before lookup_address_in_mm()

Add a lockdep check before invoking lookup_address_in_mm().
lookup_address_in_mm() walks all levels of host page table without
accquiring any lock. This is usually unsafe unless we are walking the
kernel addresses (check other usage cases of lookup_address_in_mm and
lookup_address_in_pgd).

Walking host page table (especially guest addresses) usually requires
holding two types of locks: 1) mmu_lock in mm or the lock that protects
the reverse maps of host memory in range; 2) lock for the leaf paging
structures.

One exception case is when we take the mmu_lock of the secondary mmu.
Holding mmu_lock of KVM MMU in either read mode or write mode prevents host
level entities from modifying the host page table concurrently. This is
because all of them will have to invoke KVM mmu_notifier first before doing
the actual work. Since KVM mmu_notifier invalidation operations always take
the mmu write lock, we are safe if we hold the mmu lock here.

Note: this means that KVM cannot allow concurrent multiple mmu_notifier
invalidation callbacks by using KVM mmu read lock. Since, otherwise, any
host level entity can cause race conditions with this one. Walking host
page table here may get us stale information or may trigger NULL ptr
dereference that is hard to reproduce.

Having a lockdep check here will prevent or at least warn future
development that directly walks host page table simply in a KVM ioctl
function. In addition, it provides a record for any future development on
KVM mmu_notifier.

Cc: Sean Christopherson <seanjc@...gle.com>
Cc: Ben Gardon <bgardon@...gle.com>
Cc: David Matlack <dmatlack@...gle.com>

Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
---
 arch/x86/kvm/mmu/mmu.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1361eb4599b4..066bb5435156 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2820,6 +2820,24 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	 */
 	hva = __gfn_to_hva_memslot(slot, gfn);
 
+	/*
+	 * lookup_address_in_mm() walks all levels of host page table without
+	 * accquiring any lock. This is not safe when KVM does not take the
+	 * mmu_lock. Holding mmu_lock in either read mode or write mode prevents
+	 * host level entities from modifying the host page table. This is
+	 * because all of them will have to invoke KVM mmu_notifier first before
+	 * doing the actual work. Since KVM mmu_notifier invalidation operations
+	 * always take the mmu write lock, we are safe if we hold the mmu lock
+	 * here.
+	 *
+	 * Note: this means that KVM cannot allow concurrent multiple
+	 * mmu_notifier invalidation callbacks by using KVM mmu read lock.
+	 * Otherwise, any host level entity can cause race conditions with this
+	 * one. Walking host page table here may get us stale information or may
+	 * trigger NULL ptr dereference that is hard to reproduce.
+	 */
+	lockdep_assert_held(&kvm->mmu_lock);
+
 	pte = lookup_address_in_mm(kvm->mm, hva, &level);
 	if (unlikely(!pte))
 		return PG_LEVEL_4K;
-- 
2.35.1.1021.g381101b075-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ