Message-Id: <20220822170659.2527086-1-pbonzini@redhat.com>
Date: Mon, 22 Aug 2022 13:06:52 -0400
From: Paolo Bonzini <pbonzini@...hat.com>
To: linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc: mlevitsk@...hat.com, seanjc@...gle.com
Subject: [PATCH v3 0/7] KVM: x86: never write to memory from kvm_vcpu_check_block

The following backtrace:

[ 1355.807187] kvm_vcpu_map+0x159/0x190 [kvm]
[ 1355.807628] nested_svm_vmexit+0x4c/0x7f0 [kvm_amd]
[ 1355.808036] ? kvm_vcpu_block+0x54/0xa0 [kvm]
[ 1355.808450] svm_check_nested_events+0x97/0x390 [kvm_amd]
[ 1355.808920] kvm_check_nested_events+0x1c/0x40 [kvm]
[ 1355.809396] kvm_arch_vcpu_runnable+0x4e/0x190 [kvm]
[ 1355.809892] kvm_vcpu_check_block+0x4f/0x100 [kvm]
[ 1355.811259] kvm_vcpu_block+0x6b/0xa0 [kvm]

can occur due to kmap being called in non-sleepable (!TASK_RUNNING) context.

The fix is to extend kvm_x86_ops->nested_ops.hv_timer_pending() to cover
all events not already checked in kvm_arch_vcpu_runnable(), and then
get rid of the annoying (and wrong) call to kvm_check_nested_events()
that kvm_vcpu_check_block() reaches through kvm_arch_vcpu_runnable().
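
To illustrate the split described above, here is a minimal userspace
sketch. It is not KVM code: struct toy_vcpu and every toy_* function are
hypothetical stand-ins, and the memory_writes counter only simulates the
guest-memory access that the nested vmexit performs. The point is that
the runnable check is reduced to pure predicates, while the
side-effecting event processing runs later, from the block path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified vCPU state; not the real struct kvm_vcpu. */
struct toy_vcpu {
	bool nested_event_pending;	/* event that would force a nested vmexit */
	bool interrupt_pending;		/* ordinary wakeup source */
	bool in_guest_mode;		/* vCPU is running a nested guest */
	int memory_writes;		/* counts simulated guest-memory writes */
};

/* Pure predicate: reports pending work and touches no memory. */
static bool toy_has_nested_events(const struct toy_vcpu *v)
{
	assert(v != NULL);
	return v->in_guest_mode && v->nested_event_pending;
}

/* Side-effecting path: simulates the vmexit, which writes guest memory. */
static void toy_process_nested_events(struct toy_vcpu *v)
{
	if (toy_has_nested_events(v)) {
		v->memory_writes++;
		v->nested_event_pending = false;
		v->in_guest_mode = false;
	}
}

/* Runnable check, as used from the non-sleepable wait loop. */
static bool toy_vcpu_runnable(const struct toy_vcpu *v)
{
	return v->interrupt_pending || toy_has_nested_events(v);
}

/* Block path: the actual waiting is elided; events are processed here. */
static bool toy_vcpu_block(struct toy_vcpu *v)
{
	bool runnable = toy_vcpu_runnable(v);	/* no writes happen here */

	if (runnable)
		toy_process_nested_events(v);	/* writes are allowed here */
	return runnable;
}
```

In this model, calling toy_vcpu_runnable() any number of times leaves
memory_writes untouched; only toy_vcpu_block() bumps it.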

Beware, this is not a complete fix, because kvm_guest_apic_has_interrupt()
might still _read_ memory from non-sleepable context. The fix for that is
probably to make kvm_arch_vcpu_runnable() return -EAGAIN, and in that
case do a round of kvm_vcpu_check_block() polling in sleepable context.
Nevertheless, it is a good start as it pushes the vmexit into vcpu_block().
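
The -EAGAIN idea is only floated above, not implemented by this series.
One possible shape, as a userspace sketch under loud assumptions: the
may_sleep parameter, the 1/0/-EAGAIN return convention and the
sketch_* names are all invented here for illustration, and the
interrupt_in_memory field merely stands in for state that can only be
learned by reading guest memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vCPU model; the fields are illustrative only. */
struct sketch_vcpu {
	bool interrupt_cached;		/* known without touching memory */
	bool interrupt_in_memory;	/* visible only via a memory read */
};

/*
 * Returns 1 if runnable, 0 if not, and -EAGAIN if answering would
 * require a memory read while the caller is not allowed to sleep.
 */
static int sketch_vcpu_runnable(const struct sketch_vcpu *v, bool may_sleep)
{
	assert(v != NULL);
	if (v->interrupt_cached)
		return 1;
	if (!may_sleep)
		return -EAGAIN;
	return v->interrupt_in_memory ? 1 : 0;
}

/* One polling round: retry in sleepable context when asked to. */
static bool sketch_check_block(const struct sketch_vcpu *v)
{
	int r = sketch_vcpu_runnable(v, false);	/* non-sleepable attempt */

	if (r == -EAGAIN)
		r = sketch_vcpu_runnable(v, true);	/* models the retry */
	return r > 0;
}
```

In a real implementation the retry would have to happen after the task
state has been set back to TASK_RUNNING; the sketch does not model that.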

The series also does a small cleanup pass on kvm_vcpu_check_block(),
removing KVM_REQ_UNHALT in favor of simply calling kvm_arch_vcpu_runnable()
again. Now that kvm_check_nested_events() is not called anymore by
kvm_arch_vcpu_runnable(), it is much easier to see that KVM will never
consume the event that caused kvm_vcpu_has_events() to return true,
and therefore it is safe to evaluate it again.
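
The reasoning above can be modelled in a few lines of userspace C. This
is a sketch with hypothetical names (struct unhalt_vcpu and the
unhalt_* functions are not kernel symbols): because the runnable check
no longer consumes the event, the caller can evaluate it a second time
after blocking instead of relying on a request bit left behind by the
wait loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vCPU model; not the real struct kvm_vcpu. */
struct unhalt_vcpu {
	bool event_pending;	/* wakeup event, not consumed by the check */
	bool halted;		/* models a halted vCPU state */
};

/* Side-effect-free check: safe to call any number of times. */
static bool unhalt_runnable(const struct unhalt_vcpu *v)
{
	assert(v != NULL);
	return v->event_pending;
}

/* Wait loop stand-in: polls once and leaves no flag behind. */
static void unhalt_block(const struct unhalt_vcpu *v)
{
	(void)unhalt_runnable(v);
}

/* Caller: re-evaluate the check instead of testing a request bit. */
static void unhalt_after_block(struct unhalt_vcpu *v)
{
	unhalt_block(v);
	if (unhalt_runnable(v))
		v->halted = false;
}
```

If the check did consume the event, the second call would return false
and the vCPU would stay halted, which is the hazard a request bit
papers over.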

The alternative of propagating the return value of
kvm_arch_vcpu_runnable() up to kvm_vcpu_{block,halt}() is inferior
because it does not quite get right the edge cases where the vCPU becomes
runnable right before schedule() or right after kvm_vcpu_check_block().
While these edge cases are unlikely to truly matter in practice, it is
also pointless to get them "wrong".

Paolo

v2->v3:
- do not propagate the return value of kvm_arch_vcpu_runnable() up to
  kvm_vcpu_{block,halt}()
- move and reformat the comment in vcpu_block()
- move KVM_REQ_UNHALT removal last

Paolo Bonzini (6):
  KVM: x86: check validity of argument to KVM_SET_MP_STATE
  KVM: x86: make vendor code check for all nested events
  KVM: x86: lapic does not have to process INIT if it is blocked
  KVM: x86: never write to memory from kvm_vcpu_check_block
  KVM: mips, x86: do not rely on KVM_REQ_UNHALT
  KVM: remove KVM_REQ_UNHALT

Sean Christopherson (1):
  KVM: nVMX: Make an event request when pending an MTF nested VM-Exit

 Documentation/virt/kvm/vcpu-requests.rst | 28 +------------
 arch/arm64/kvm/arm.c                     |  1 -
 arch/mips/kvm/emulate.c                  |  6 +--
 arch/powerpc/kvm/book3s_pr.c             |  1 -
 arch/powerpc/kvm/book3s_pr_papr.c        |  1 -
 arch/powerpc/kvm/booke.c                 |  1 -
 arch/powerpc/kvm/powerpc.c               |  1 -
 arch/riscv/kvm/vcpu_insn.c               |  1 -
 arch/s390/kvm/kvm-s390.c                 |  2 -
 arch/x86/include/asm/kvm_host.h          |  3 +-
 arch/x86/kvm/i8259.c                     |  4 +-
 arch/x86/kvm/lapic.h                     |  2 +-
 arch/x86/kvm/vmx/nested.c                |  9 +++-
 arch/x86/kvm/vmx/vmx.c                   |  6 ++-
 arch/x86/kvm/x86.c                       | 53 ++++++++++++++++++------
 arch/x86/kvm/x86.h                       |  5 ---
 arch/x86/kvm/xen.c                       |  1 -
 include/linux/kvm_host.h                 |  3 +-
 virt/kvm/kvm_main.c                      |  4 +-
 19 files changed, 63 insertions(+), 69 deletions(-)
--
2.31.1