Message-ID: <20240927012239.34406-1-eric.mackay@oracle.com>
Date: Thu, 26 Sep 2024 18:22:39 -0700
From: Eric Mackay <eric.mackay@...cle.com>
To: boris.ostrovsky@...cle.com
Cc: eric.mackay@...cle.com, imammedo@...hat.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, pbonzini@...hat.com, seanjc@...gle.com
Subject: Re: [PATCH] KVM/x86: Do not clear SIPI while in SMM

> On 9/24/24 5:40 AM, Igor Mammedov wrote:
>> On Fri, 19 Apr 2024 12:17:01 -0400
>> boris.ostrovsky@...cle.com wrote:
>> 
>>> On 4/17/24 9:58 AM, boris.ostrovsky@...cle.com wrote:
>>>>
>>>> I noticed that I was using a few months old qemu bits, and now I am
>>>> having trouble reproducing this on the latest bits. Let me see if I
>>>> can get this to fail with the latest first and then try to trace why
>>>> the processor is in this unexpected state.
>>>
>>> Looks like 012b170173bc "system/qdev-monitor: move drain_call_rcu call
>>> under if (!dev) in qmp_device_add()" is what makes the test stop failing.
>>>
>>> I need to understand whether the lack of failures is a side effect of
>>> timing changes that simply make hotplug failures less likely, or if
>>> this is an actual (but seemingly unintentional) fix.
>> 
>> Agreed, we should find out the culprit of the problem.
>
>
> I haven't been able to spend much time on this, unfortunately; Eric is
> now starting to look at this again.
>
> One of my theories was that ich9_apm_ctrl_changed() sends SMIs to
> vcpus serially, while on HW my understanding is that this is done as a
> broadcast, so I thought this could cause a race. I had a quick test
> with pausing and resuming all vcpus around the loop, but that didn't
> help.
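
For reference, the loop in question looks roughly like this (a
simplified sketch loosely based on QEMU's hw/isa/lpc_ich9.c, not
verbatim upstream code):

    static void ich9_apm_ctrl_changed(uint32_t val, void *arg)
    {
        ICH9LPCState *lpc = arg;
        CPUState *cs;

        (void)val;  /* APM_CNT value, unused in this sketch */

        if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
            CPU_FOREACH(cs) {
                /* One vCPU at a time; real HW asserts the APM SMI to
                 * all CPUs simultaneously, hence the suspected race
                 * between the first and last vCPU entering SMM. */
                cpu_interrupt(cs, CPU_INTERRUPT_SMI);
            }
        }
    }
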
>
>
>> 
>> PS:
>> also if you are using an AMD host, there was a regression in OVMF
>> where a vCPU that OSPM was already online-ing was yanked out from
>> under OSPM's feet by OVMF (which, depending on timing, could
>> manifest as a lost SIPI).
>> 
>> edk2 commit that should fix it is:
>>      https://github.com/tianocore/edk2/commit/1c19ccd5103b
>> 
>> Switching to an Intel host should rule that out at least.
>> (or use the fixed edk2-ovmf-20240524-5.el10.noarch package from
>> CentOS, if you are forced to use an AMD host)

I haven't been able to reproduce the issue on an Intel host thus far,
but it may not be an apples-to-apples comparison because my AMD hosts
have a much higher core count.

>
> I just tried with the latest bits that include this commit and was
> still able to reproduce the problem.
>
>
>-boris

The initial hotplug of each CPU appears to complete from the
perspective of OVMF and OSPM. SMBASE relocation succeeds, and the new
CPU reports back from the pen. It seems to be the later INIT-SIPI-SIPI
sequence sent from the guest that doesn't complete.
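
For concreteness, the wakeup being lost is the standard xAPIC
INIT-SIPI-SIPI sequence; a minimal sketch in xAPIC MMIO terms (the
apic_write() helper and wake_ap() are illustrative, not code from the
guest in question):

    #include <stdint.h>

    #define APIC_ICR_LOW   0x300   /* ICR offsets from the xAPIC MMIO base */
    #define APIC_ICR_HIGH  0x310

    /* Illustrative MMIO write helper; 0xFEE00000 is the default
     * xAPIC base. */
    static void apic_write(uint32_t reg, uint32_t val)
    {
        *(volatile uint32_t *)(0xFEE00000u + reg) = val;
    }

    static void wake_ap(uint8_t apic_id, uint8_t start_vector)
    {
        /* INIT: puts the AP into wait-for-SIPI, unless the AP is in
         * SMM, where INIT is held off until the next RSM. */
        apic_write(APIC_ICR_HIGH, (uint32_t)apic_id << 24);
        apic_write(APIC_ICR_LOW, 0x00004500);

        /* Two startup IPIs; these are ignored if the AP is still in
         * SMM, which is the suspected lost-SIPI window. */
        apic_write(APIC_ICR_HIGH, (uint32_t)apic_id << 24);
        apic_write(APIC_ICR_LOW, 0x00004600 | start_vector);
        apic_write(APIC_ICR_HIGH, (uint32_t)apic_id << 24);
        apic_write(APIC_ICR_LOW, 0x00004600 | start_vector);
    }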

My working theory has been that some CPU/AP is lagging behind the others
when the BSP is waiting for all the APs to go into SMM, and the BSP just
gives up and moves on. Presumably the INIT-SIPI-SIPI is then sent while
that lagging CPU is finally entering SMM and the other CPUs are back in
normal mode.
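
In other words, a rendezvous of roughly this shape (a hypothetical
sketch loosely modeled on edk2's SmmWaitForApArrival(); the names and
the counter are illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    /* Bumped by each AP as it arrives in its SMI handler. */
    static volatile uint32_t cpus_in_smm;

    static bool bsp_wait_for_aps(uint32_t total_cpus, uint64_t timeout)
    {
        for (uint64_t spins = 0; spins < timeout; spins++) {
            if (cpus_in_smm == total_cpus) {
                return true;    /* everyone made it into SMM */
            }
        }
        /* Timed out: the BSP services the SMI without the laggard,
         * which then enters SMM on its own later, right when the
         * guest's INIT-SIPI-SIPI may be in flight. */
        return false;
    }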

I've been able to observe that the SMI handler for the problematic CPU
will sometimes start running when no BSP has been elected. This means we
have a window of time where that CPU will ignore SIPI while at least 1
CPU (the BSP) is in normal mode and capable of sending INIT-SIPI-SIPI
from the guest.
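
If KVM consumes the pending SIPI during that window instead of keeping
it latched, the wakeup is lost for good, which is what $SUBJECT is
about. As pseudologic (a hypothetical sketch with illustrative names,
not actual KVM code):

    #include <stdbool.h>
    #include <stdint.h>

    enum mp_state { MP_RUNNING, MP_WAIT_FOR_SIPI };

    struct vcpu {
        enum mp_state mp_state;
        bool in_smm;
        bool sipi_pending;      /* set by the sender's ICR write */
        uint8_t sipi_vector;
    };

    static void start_ap(struct vcpu *v, uint8_t vector)
    {
        (void)v; (void)vector;  /* stub: begin execution at the vector */
    }

    static void accept_pending_events(struct vcpu *v)
    {
        if (v->sipi_pending) {
            v->sipi_pending = false;        /* SIPI consumed here */
            if (v->mp_state == MP_WAIT_FOR_SIPI && !v->in_smm) {
                start_ap(v, v->sipi_vector);
            }
            /* Otherwise the SIPI is simply gone. While in SMM it
             * should instead stay latched until RSM, per "Do not
             * clear SIPI while in SMM". */
        }
    }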

