linux-kernel - Re: [PATCH v2 5/8] KVM: TDX: Handle TDG.VP.VMCALL<MapGPA>

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <cbda86c4-f56f-4142-93eb-12736b2e3719@linux.intel.com>
Date: Wed, 19 Feb 2025 08:49:12 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Chao Gao <chao.gao@...el.com>, Yan Zhao <yan.y.zhao@...el.com>,
 pbonzini@...hat.com, kvm@...r.kernel.org, rick.p.edgecombe@...el.com,
 kai.huang@...el.com, adrian.hunter@...el.com, reinette.chatre@...el.com,
 xiaoyao.li@...el.com, tony.lindgren@...el.com, isaku.yamahata@...el.com,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 5/8] KVM: TDX: Handle TDG.VP.VMCALL<MapGPA>



On 2/19/2025 8:29 AM, Sean Christopherson wrote:
> On Mon, Feb 17, 2025, Binbin Wu wrote:
>> On 2/13/2025 11:17 PM, Sean Christopherson wrote:
>>> On Thu, Feb 13, 2025, Binbin Wu wrote:
>>>> On 2/13/2025 11:23 AM, Binbin Wu wrote:
>>>>> On 2/13/2025 2:56 AM, Sean Christopherson wrote:
>>>>>> On Wed, Feb 12, 2025, Binbin Wu wrote:
>>>>>>> On 2/12/2025 8:46 AM, Sean Christopherson wrote:
>>>>>>>> I am completely comfortable saying that KVM doesn't care about STI/SS shadows
>>>>>>>> outside of the HALTED case, and so unless I'm missing something, I think it makes
>>>>>>>> sense for tdx_protected_apic_has_interrupt() to not check RVI outside of the HALTED
>>>>>>>> case, because it's impossible to know if the interrupt is actually unmasked, and
>>>>>>>> statistically it's far, far more likely that it _is_ masked.
>>>>>>> OK. Will update tdx_protected_apic_has_interrupt() in "TDX interrupts" part.
>>>>>>> And use kvm_vcpu_has_events() to replace the open code in this patch.
>>>>>> Something to keep an eye on: kvm_vcpu_has_events() returns true if pv_unhalted
>>>>>> is set, and pv_unhalted is only cleared on transitions KVM_MP_STATE_RUNNABLE.
>>>>>> If the guest initiates a spurious wakeup, pv_unhalted could be left set in
>>>>>> perpetuity.
>>>>> Oh, yes.
>>>>> KVM_HC_KICK_CPU is allowed in TDX guests.
>>> And a clever guest can send a REMRD IPI.
>>>
>>>>> The change below looks good to me.
>>>>>
>>>>> One minor issue is when guest initiates a spurious wakeup, pv_unhalted is
>>>>> left set, then later when the guest want to halt the vcpu, in
>>>>> __kvm_emulate_halt(), since pv_unhalted is still set and the state will not
>>>>> transit to KVM_MP_STATE_HALTED.
>>>>> But I guess it's guests' responsibility to not initiate spurious wakeup,
>>>>> guests need to bear the fact that HLT could fail due to a previous
>>>>> spurious wakeup?
>>>> Just found a patch set for fixing the issue.
>>> FWIW, Jim's series doesn't address spurious wakeups per se, it just ensures
>>> pv_unhalted is cleared when transitioning to RUNNING.  If the vCPU is already
>>> RUNNING, __apic_accept_irq() will set pv_unhalted and nothing will clear it
>>> until the next transition to RUNNING (which implies at least an attempted
>>> transition away from RUNNING).
>>>
>> Indeed.
>>
>> I am wondering why KVM doesn't clear pv_unhalted before the vcpu entering guest?
>> Is the additional memory access a concern or is there some other reason?
> Not clearing pv_unhalted when entering the guest is necessary to avoid races.
> Stating the obvious, the guest must set up all of its lock tracking before executing
> HLT, which means that the soon-to-be-blocking vCPU is eligible for wakeup *before*
> it executes HLT.  If an asynchronous exit happens on the vCPU at just the right
> time, KVM could clear pv_unhalted before the vCPU executes HLT.
>
Got it.
Thanks for the explanation.