linux-kernel - Re: [PATCH 5/6] KVM: x86/mmu: Protect kvm->memslots with a mutex

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <16b2f0f3-c9a8-c455-fff0-231c2fe04a8e@redhat.com>
Date:   Thu, 29 Apr 2021 01:42:48 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Ben Gardon <bgardon@...gle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
        Peter Xu <peterx@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Peter Shier <pshier@...gle.com>,
        Junaid Shahid <junaids@...gle.com>,
        Jim Mattson <jmattson@...gle.com>,
        Yulei Zhang <yulei.kernel@...il.com>,
        Wanpeng Li <kernellwp@...il.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Xiao Guangrong <xiaoguangrong.eric@...il.com>
Subject: Re: [PATCH 5/6] KVM: x86/mmu: Protect kvm->memslots with a mutex

On 28/04/21 23:46, Ben Gardon wrote:
> On Wed, Apr 28, 2021 at 2:41 PM Paolo Bonzini <pbonzini@...hat.com> wrote:
>>
>> On 28/04/21 22:40, Ben Gardon wrote:
>>> ... However with the locking you propose below, we might still run
>>> into issues on a move or delete, which would mean we'd still need the
>>> separate memory allocation for the rmaps array. Or we do some
>>> shenanigans where we try to copy the rmap pointers from the other set
>>> of memslots.
>>
>> If that's (almost) as easy as passing old to
>> kvm_arch_prepare_memory_region, that would be totally okay.
> 
> Unfortunately it's not quite that easy because it's all the slots
> _besides_ the one being modified where we'd need to copy the rmaps.

Ah, now I understand the whole race.  And it seems to me that if one
kvm_dup_memslots within the new lock fixed a bug, two kvm_dup_memslots
within the new lock are going to fix two bugs. :)

Seriously: unless I'm missing another case (it's late here...), it's
not ugly and it's still relatively easy to explain.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2799c6660cce..48929dd5fb29 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1270,7 +1270,7 @@ static int check_memory_region_flags(const struct kvm_userspace_memory_region *m
  	return 0;
  }
  
-static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
+static void install_new_memslots(struct kvm *kvm,
  		int as_id, struct kvm_memslots *slots)
  {
  	struct kvm_memslots *old_memslots = __kvm_memslots(kvm, as_id);
@@ -1280,7 +1280,9 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
  	slots->generation = gen | KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS;
  
  	rcu_assign_pointer(kvm->memslots[as_id], slots);
+	mutex_unlock(&kvm->slots_arch_lock);
  	synchronize_srcu_expedited(&kvm->srcu);
+	kvfree(old_memslots);
  
  	/*
  	 * Increment the new memslot generation a second time, dropping the
@@ -1302,8 +1304,6 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
  	kvm_arch_memslots_updated(kvm, gen);
  
  	slots->generation = gen;
-
-	return old_memslots;
  }
  
  /*
@@ -1342,6 +1342,7 @@ static int kvm_set_memslot(struct kvm *kvm,
  	struct kvm_memslots *slots;
  	int r;
  
+	mutex_lock(&kvm->slots_arch_lock);
  	slots = kvm_dup_memslots(__kvm_memslots(kvm, as_id), change);
  	if (!slots)
  		return -ENOMEM;
@@ -1353,14 +1354,7 @@ static int kvm_set_memslot(struct kvm *kvm,
  		 */
  		slot = id_to_memslot(slots, old->id);
  		slot->flags |= KVM_MEMSLOT_INVALID;
-
-		/*
-		 * We can re-use the old memslots, the only difference from the
-		 * newly installed memslots is the invalid flag, which will get
-		 * dropped by update_memslots anyway.  We'll also revert to the
-		 * old memslots if preparing the new memory region fails.
-		 */
-		slots = install_new_memslots(kvm, as_id, slots);
+		install_new_memslots(kvm, as_id, slots);
  
  		/* From this point no new shadow pages pointing to a deleted,
  		 * or moved, memslot will be created.
@@ -1370,6 +1364,9 @@ static int kvm_set_memslot(struct kvm *kvm,
  		 *	- kvm_is_visible_gfn (mmu_check_root)
  		 */
  		kvm_arch_flush_shadow_memslot(kvm, slot);
+
+		mutex_lock(&kvm->slots_arch_lock);
+		slots = kvm_dup_memslots(__kvm_memslots(kvm, as_id), change);
  	}
  
  	r = kvm_arch_prepare_memory_region(kvm, new, mem, change);
@@ -1377,16 +1374,17 @@ static int kvm_set_memslot(struct kvm *kvm,
  		goto out_slots;
  
  	update_memslots(slots, new, change);
-	slots = install_new_memslots(kvm, as_id, slots);
+	install_new_memslots(kvm, as_id, slots);
  
  	kvm_arch_commit_memory_region(kvm, mem, old, new, change);
-
-	kvfree(slots);
  	return 0;
  
  out_slots:
-	if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
+	if (change == KVM_MR_DELETE || change == KVM_MR_MOVE) {
+		slot = id_to_memslot(slots, old->id);
+		slot->flags &= ~KVM_MEMSLOT_INVALID;
  		slots = install_new_memslots(kvm, as_id, slots);
+	}
  	kvfree(slots);
  	return r;
  }

One could optimize things a bit by reusing the allocation and only
doing a memcpy from the new memslots array to the old one under the
slots_arch_lock.  (Plus the above still lacks a mutex_init and
should be split in two patches, with the mutex going in the second;
but you get the idea and code sometimes is easier than words).

Paolo