Message-ID: <f904b674-98ba-4e13-a64c-fd30b6ac4a2e@bytedance.com>
Date: Thu, 28 Aug 2025 23:13:25 +0800
From: Fei Li <lifei.shirley@...edance.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, liran.alon@...cle.com, hpa@...or.com,
wanpeng.li@...mail.com, kvm@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [External] Re: [PATCH] KVM: x86: Latch INITs only in specific CPU
states in KVM_SET_VCPU_EVENTS
On 8/28/25 12:08 AM, Paolo Bonzini wrote:
> On Wed, Aug 27, 2025 at 6:01 PM Sean Christopherson <seanjc@...gle.com> wrote:
>> On Wed, Aug 27, 2025, Fei Li wrote:
>>> Commit ff90afa75573 ("KVM: x86: Evaluate latched_init in
>>> KVM_SET_VCPU_EVENTS when vCPU not in SMM") changed the
>>> KVM_SET_VCPU_EVENTS handler to set the pending LAPIC INIT event
>>> regardless of whether the vCPU is in SMM mode or not.
>>>
>>> However, latching INIT without checking the CPU state introduces a
>>> race condition, which causes the loss of an INIT event. This is fatal
>>> during the VM startup process because it leaves some AP never
>>> switching to non-root mode. Just as commit f4ef19108608 ("KVM: X86:
>>> Fix loss of pending INIT due to race") said:
>>>    BSP                                    AP
>>>    kvm_vcpu_ioctl_x86_get_vcpu_events
>>>      events->smi.latched_init = 0
>>>
>>>                                           kvm_vcpu_block
>>>                                             kvm_vcpu_check_block
>>>                                               schedule
>>>
>>>    send INIT to AP
>>>    kvm_vcpu_ioctl_x86_set_vcpu_events
>>>     (e.g. `info registers -a` when VM starts/reboots)
>>>      if (events->smi.latched_init == 0)
>>>        clear INIT in pending_events
>> This is a QEMU bug, no?
> I think I agree.
Actually this is a bug triggered by a monitor tool in our production
environment. The monitor executes the 'info registers -a' HMP command at
a fixed frequency, even during the VM startup process, which makes some
APs stay in KVM_MP_STATE_UNINITIALIZED forever. But this race only
occurs with extremely low probability, about 1~2 VM hangs per week.
Considering that other emulators, like cloud-hypervisor and firecracker,
may have similar potential race issues, I think KVM had better do some
handling. But anyway, I will check the QEMU code to avoid such a race.
Thanks for both of your comments. 🙂
Have a nice day, thanks
Fei
>
>> IIUC, it's invoking kvm_vcpu_ioctl_x86_set_vcpu_events()
>> with stale data.
> More precisely, it's not expecting other vCPUs to change the pending
> events asynchronously.
Yes, will sort out a more complete calling process later.
>
>> I'm also a bit confused as to how QEMU is even gaining control
>> of the vCPU to emit KVM_SET_VCPU_EVENTS if the vCPU is in
>> kvm_vcpu_block().
> With a signal. :)
>
> Paolo
>