
Message-ID: <ZwiN0sdoLJ6YmpDR@yzhao56-desk.sh.intel.com>
Date: Fri, 11 Oct 2024 10:30:42 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
CC: "seanjc@...gle.com" <seanjc@...gle.com>, "Yao, Yuan" <yuan.yao@...el.com>,
	"Huang, Kai" <kai.huang@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "isaku.yamahata@...il.com"
	<isaku.yamahata@...il.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"dmatlack@...gle.com" <dmatlack@...gle.com>, "nik.borisov@...e.com"
	<nik.borisov@...e.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY
 with operand SEPT

On Fri, Oct 11, 2024 at 05:53:29AM +0800, Edgecombe, Rick P wrote:
> On Thu, 2024-10-10 at 10:33 -0700, Sean Christopherson wrote:
> > > 
> > > 1st: "fault->is_private != kvm_mem_is_private(kvm, fault->gfn)" is found.
> > > 2nd-6th: try_cmpxchg64() fails on each level SPTEs (5 levels in total)
> 
> Isn't there a more general scenario:
> 
> vcpu0                              vcpu1
> 1. Freezes PTE
> 2. External op to do the SEAMCALL
> 3.                                 Faults same PTE, hits frozen PTE
> 4.                                 Retries N times, triggers zero-step
> 5. Finally finishes external op
> 
> Am I missing something?
Yes. This is a follow-up to Sean's proposal [1] to have the TDX code retry on
RET_PF_RETRY_FROZEN in order to avoid zero-step.
My worry is that merely keeping vCPUs that see FROZEN_SPTE from entering the
guest is not enough to prevent zero-step.
These two examples show that zero-step is possible without re-entering the
guest for FROZEN_SPTE:
- The selftest [2]: a single vCPU can trigger zero-step when userspace does
  something wrong (even though KVM is correct).
- The above case: nothing is wrong in KVM/QEMU; the vCPU is just extremely
  unlucky.


[1] https://lore.kernel.org/all/ZuR09EqzU1WbQYGd@google.com/
[2] https://lore.kernel.org/all/ZvPrqMj1BWrkkwqN@yzhao56-desk.sh.intel.com/
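
To make the counting in the "1st + 2nd-6th" case concrete, here is a minimal
userspace sketch. It is only a toy model, not KVM or TDX module code: the
struct, the function names and ZERO_STEP_THRESHOLD are all made up for
illustration, and the value 6 is simply the number quoted in this thread. How
the TDX module really counts and when it really triggers is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold, taken from the numbers discussed in this thread. */
#define ZERO_STEP_THRESHOLD 6

/* Hypothetical per-vCPU state: faults seen since the guest last progressed. */
struct toy_vcpu {
	unsigned int faults_without_progress;
};

/* The guest retired an instruction, so the counter starts over. */
void toy_guest_progress(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->faults_without_progress = 0;
}

/*
 * Record one fault that returns to the guest without progress.
 * Returns true once the assumed threshold is reached.
 */
bool toy_fault_without_progress(struct toy_vcpu *v)
{
	assert(v != NULL);
	v->faults_without_progress++;
	return v->faults_without_progress >= ZERO_STEP_THRESHOLD;
}

/*
 * Replay the sequence from the thread: one fault lost to the is_private
 * mismatch, then cmpxchg_failures faults lost to try_cmpxchg64() failing.
 * Returns true if the assumed threshold was hit along the way.
 */
bool toy_replay(struct toy_vcpu *v, unsigned int cmpxchg_failures)
{
	bool hit = toy_fault_without_progress(v);	/* 1st: mismatch */

	for (unsigned int i = 0; i < cmpxchg_failures; i++)
		hit = toy_fault_without_progress(v) || hit;

	return hit;
}
```

With a zeroed struct toy_vcpu, toy_replay(&v, 5) reaches the assumed threshold
(1 + 5 faults), while toy_replay(&v, 4) does not. None of these faults involves
re-entering the guest on a FROZEN_SPTE.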

> 
> > 
> > Very technically, this shouldn't be possible.  The only way for there to be
> > contention on the leaf SPTE is if some other KVM task installed a SPTE, i.e.
> > the 6th attempt should succeed, even if the faulting vCPU wasn't the one to
> > create the SPTE.
> > 
> > That said, a few thoughts:
> > 
> > 1. Where did we end up on the idea of requiring userspace to pre-fault memory?
> 
> For others' reference, I think you are referring to the idea to pre-fault the
> entire S-EPT even for GFNs that usually get AUGed, not the mirrored EPT
> pre-faulting/PAGE.ADD dance we are already doing.
> 
> The last discussion with Paolo was to resume the retry solution discussion on
> the v2 posting because it would be easier "with everything else already
> addressed". There was also some discussion that it was not immediately
> obvious how prefaulting everything would work for memory hotplug (i.e.
> memslots added during runtime).
> 
> > 
> > 2. The zero-step logic really should have a slightly more conservative
> >    threshold.  I have a hard time believing that e.g. 10 attempts would
> >    create a side channel, but 6 attempts is "fine".
> 
> No idea where the threshold came from. I'm not sure if it affects the KVM
> design? We can look into it for curiosity's sake in either case.
> 
> > 
> > 3. This would be a good reason to implement a local retry in
> >    kvm_tdp_mmu_map().  Yes, I'm being somewhat hypocritical since I'm so
> >    against retrying for the S-EPT case, but my objection to retrying for
> >    S-EPT is that it _should_ be easy for KVM to guarantee success.
> > 
> > E.g. for #3, the below (compile tested only) patch should make it impossible
> > for the S-EPT case to fail, as dirty logging isn't (yet) supported and mirror
> > SPTEs should never trigger A/D assists, i.e. retry should always succeed.
> 
> I don't see how it addresses the scenario above. More retries could just make it
> rarer, but never fix it. Very possible I'm missing something though.
I'm also not 100% sure that zero-step cannot happen after this change, even
when KVM/QEMU do nothing wrong.