Message-ID: <ed6ccd719241ef6df1558b69ec81073a3b3cf77c.camel@intel.com>
Date: Mon, 14 Oct 2024 17:36:48 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai"
	<kai.huang@...el.com>, "Zhao, Yan Y" <yan.y.zhao@...el.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Yao, Yuan"
	<yuan.yao@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"nik.borisov@...e.com" <nik.borisov@...e.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "isaku.yamahata@...il.com"
	<isaku.yamahata@...il.com>, "dmatlack@...gle.com" <dmatlack@...gle.com>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with
 operand SEPT

On Mon, 2024-10-14 at 10:54 +0000, Huang, Kai wrote:
> On Thu, 2024-10-10 at 21:53 +0000, Edgecombe, Rick P wrote:
> > On Thu, 2024-10-10 at 10:33 -0700, Sean Christopherson wrote:
> > > > 
> > > > 1st: "fault->is_private != kvm_mem_is_private(kvm, fault->gfn)" is found.
> > > > 2nd-6th: try_cmpxchg64() fails on each level SPTEs (5 levels in total)
> > 
> > Isn't there a more general scenario:
> > 
> > vcpu0                              vcpu1
> > 1. Freezes PTE
> > 2. External op to do the SEAMCALL
> > 3.                                 Faults same PTE, hits frozen PTE
> > 4.                                 Retries N times, triggers zero-step
> > 5. Finally finishes external op
> > 
> > Am I missing something?
> 
> I must be missing something.  I thought KVM is going to 
> 

"Is going to", as in "will be changed to"? Or "does today"?

> retry internally for
> step 4 (retries N times) because it sees the frozen PTE, but will never go back
> to guest after the fault is resolved?  How can step 4 triggers zero-step?

Steps 3-4 are saying that vcpu1 will go back to the guest and fault again.


As far as what KVM will do in the future, I think it is still open. I've not had
the chance to think about this for more than 30 min at a time, but the plan to
handle OPERAND_BUSY by taking an expensive path to break any contention (i.e.
kick+lock + whatever TDX module changes we come up with) seems to be the
leading idea.

Retry N times is too hacky. Retry internally forever might be awkward to
implement. Because of the signal_pending() check, you would have to handle
exiting to userspace and going back to an EPT violation next time the vcpu tries
to enter.
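
To make the awkwardness concrete, here is a minimal user-space sketch of a
"retry internally forever" loop, not actual KVM code. Every name in it
(toy_vcpu, toy_try_fault(), toy_signal_pending(), toy_handle_fault()) is
hypothetical and only models the control flow: the fault keeps hitting the
frozen PTE, and a pending signal forces an exit to userspace before the fault
is resolved, so the same fault has to be taken again on the next entry.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names are KVM or TDX module APIs. */
enum fault_result { FAULT_FIXED, FAULT_EXIT_TO_USER };

struct toy_vcpu {
	int busy_left;    /* fault attempts that will still see the frozen PTE */
	int signal_after; /* signal becomes pending after this many retries, -1 = never */
	int retries;      /* total internal retries performed so far */
};

/* Stand-in for the signal_pending() check in the retry loop. */
static bool toy_signal_pending(const struct toy_vcpu *v)
{
	return v->signal_after >= 0 && v->retries >= v->signal_after;
}

/* One attempt to resolve the fault; fails while the PTE is still frozen. */
static bool toy_try_fault(struct toy_vcpu *v)
{
	if (v->busy_left > 0) {
		v->busy_left--;
		return false;
	}
	return true;
}

/*
 * Retry without ever returning to the guest, so the guest cannot re-fault
 * and accumulate zero-step counts. The loop still has to give up when a
 * signal is pending, leaving the fault unresolved.
 */
enum fault_result toy_handle_fault(struct toy_vcpu *v)
{
	for (;;) {
		if (toy_try_fault(v))
			return FAULT_FIXED;
		if (toy_signal_pending(v))
			return FAULT_EXIT_TO_USER;
		v->retries++;
	}
}
```

In this model, the caller that sees FAULT_EXIT_TO_USER has to handle the
signal and then call toy_handle_fault() again, which corresponds to going back
to an EPT violation the next time the vcpu tries to enter.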