Message-ID: <78be3e02-aac5-0f5b-339e-5969a14974d7@redhat.com>
Date: Wed, 20 Oct 2021 23:32:47 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
wanpengli@...cent.com
Subject: Re: [PATCH] KVM: Avoid atomic operations when kicking the running vCPU
On 20/10/21 21:34, Sean Christopherson wrote:
>>
>> +	/*
>> +	 * The only state change done outside the vcpu mutex is IN_GUEST_MODE
>> +	 * to EXITING_GUEST_MODE.  Therefore the moderately expensive "should
>> +	 * kick" check does not need atomic operations if kvm_vcpu_kick is used
>> +	 * within the vCPU thread itself.
>> +	 */
>> +	if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
>> +		if (vcpu->mode == IN_GUEST_MODE)
>> +			WRITE_ONCE(vcpu->mode, EXITING_GUEST_MODE);
>
> Fun. I had a whole thing typed out about this being unsafe because it implicitly
> relies on a pending request and that there's a kvm_vcpu_exit_request() check _after_
> this kick. Then I saw your other patches, and then I realized we already have this
> bug in the kvm_arch_vcpu_should_kick() below.
Yeah, the three patches are independent but part of the same rabbit hole.
> Anyways, I also think we should do:
>
> 	if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
> 		if (vcpu->mode == IN_GUEST_MODE &&
> 		    !WARN_ON_ONCE(!kvm_request_pending(vcpu)))
> 			WRITE_ONCE(vcpu->mode, EXITING_GUEST_MODE);
> 		goto out;
> 	}
>
> The idea being that delaying or even missing an event in case of a KVM bug is
> preferable to letting the vCPU state become invalid due to running in the guest
> with EXITING_GUEST_MODE.
On one hand I like the idea of having a safety net; for example a test
similar to this one would have triggered for the naked
kvm_vcpu_exiting_guest_mode(vcpu) call in vmx_sync_pir_to_irr.
On the other hand, "request-less VCPU kicks", as
Documentation/virt/kvm/vcpu-requests.rst calls them, are a thing; PPC
book3s_hv does not use vcpu->requests at all. For an artificial but more
relatable example, the ON bit takes the role of vcpu->requests when
processing PIR. Hence the code below would be suboptimal but still correct:
	for (;;) {
		exit_fastpath = static_call(kvm_x86_run)(vcpu);
		if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
			break;

		if (vcpu->arch.apicv_active && pi_test_on(vcpu))
			kvm_vcpu_kick(vcpu);

		if (unlikely(kvm_vcpu_exit_request(vcpu))) {
			exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
			break;
		}
	}
All that really matters is that every call to kvm_x86_run is guarded by
kvm_vcpu_exit_request(vcpu), and indeed that's what is restored by "KVM:
x86: check for interrupts before deciding whether to exit the fast
path". The other architectures also have similar checks, though again
it's a bit hard to find it for book3s_hv (due to not using
vcpu->requests) and MIPS (which only uses KVM_REQ_TLB_FLUSH).
Paolo