linux-kernel - Re: [PATCH RFC 1/6] KVM: fix guest_mode optimization in kvm_make_all_cpus

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a6091fa0-8107-239a-ad89-34ca2cba4fa8@redhat.com>
Date:   Tue, 11 Apr 2017 13:25:04 +0800
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     James Hogan <james.hogan@...tec.com>,
        Radim Krčmář <rkrcmar@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Christoffer Dall <cdall@...aro.org>,
        Andrew Jones <drjones@...hat.com>,
        Marc Zyngier <marc.zyngier@....com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Cornelia Huck <cornelia.huck@...ibm.com>,
        Paul Mackerras <paulus@...abs.org>
Subject: Re: [PATCH RFC 1/6] KVM: fix guest_mode optimization in
 kvm_make_all_cpus_request()



On 07/04/2017 05:02, James Hogan wrote:
> This presumably changes the behaviour on x86, from != OUTSIDE_GUEST_MODE
> to == IN_GUEST_MODE. so:
> - you'll no longer get IPIs if its in READING_SHADOW_PAGE_TABLES (which
>   MIPS also now uses when accessing mappings outside of guest mode and
>   depends upon to wait until the old mappings are no longer in use).

This is wrong, the purpose of READING_SHADOW_PAGE_TABLES is "kvm_flush_remote_tlbs
should send me an IPI, because I want to stop kvm_flush_remote_tlbs until I'm done
reading the page tables".

> - you'll no longer get IPIs if its in EXITING_GUEST_MODE (i.e. if you
>   get two of these in quick succession only the first will wait for the
>   IPI, which might work as long as they're already serialised but it
>   still feels wrong).

But this is okay---avoiding multiple IPIs is the exact purpose of
EXITING_GUEST_MODE.

There are evidently multiple uses of kvm_make_all_cpus_request, and we
should avoid smp_call_function_many(..., true) if possible.  So perhaps:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a17d78759727..20e3bd60bdda 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -169,7 +169,7 @@ static void ack_flush(void *_completed)
 {
 }
 
-bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
+bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req, bool wait)
 {
 	int i, cpu, me;
 	cpumask_var_t cpus;
@@ -182,18 +182,19 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		kvm_make_request(req, vcpu);
 		cpu = vcpu->cpu;
+		if (cpus == NULL || cpu == -1 || cpu == me)
+			continue;
 
 		/* Set ->requests bit before we read ->mode. */
 		smp_mb__after_atomic();
-
-		if (cpus != NULL && cpu != -1 && cpu != me &&
-		      kvm_vcpu_exiting_guest_mode(vcpu) != OUTSIDE_GUEST_MODE)
+		if (kvm_arch_vcpu_should_kick(vcpu) ||
+		    (wait && vcpu->mode != OUTSIDE_GUEST_MODE))
 			cpumask_set_cpu(cpu, cpus);
 	}
 	if (unlikely(cpus == NULL))
-		smp_call_function_many(cpu_online_mask, ack_flush, NULL, 1);
+		smp_call_function_many(cpu_online_mask, ack_flush, NULL, wait);
 	else if (!cpumask_empty(cpus))
-		smp_call_function_many(cpus, ack_flush, NULL, 1);
+		smp_call_function_many(cpus, ack_flush, NULL, wait);
 	else
 		called = false;
 	put_cpu();
@@ -221,7 +222,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 	 * kvm_make_all_cpus_request() reads vcpu->mode. We reuse that
 	 * barrier here.
 	 */
-	if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
+	if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH, true))
 		++kvm->stat.remote_tlb_flush;
 	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
 }
@@ -230,7 +231,7 @@ EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
 
 void kvm_reload_remote_mmus(struct kvm *kvm)
 {
-	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
+	/* FIXME, is wait=true really needed?  */
+	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD, true);
 }
 
 int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)


Other users do not need wait=false.

Or another idea is to embed wait in the request number, as suggested in the
ARM thread, so that for example:

- bits 0-4 = bit number in vcpu->requests

- bit 8 = wait when making request

- bit 9 = kick after making request


Responding to Andrew, I agree that "we should do away with
kvm_arch_vcpu_should_kick(), putting the x86 implementation of it
directly in the common code" (inlining kvm_vcpu_exiting_guest_mode,
I may add).  However, kvm_arch_vcpu_should_kick is just an optimization,
it's not a bug not to use it.  So let's first iron out
kvm_make_all_cpus_request.

Paolo