[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1bbe3a78-8746-4db9-a96c-9dc5f1190f16@redhat.com>
Date: Tue, 10 Sep 2024 15:15:28 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>,
Rick P Edgecombe <rick.p.edgecombe@...el.com>
Cc: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Yan Y Zhao <yan.y.zhao@...el.com>, Yuan Yao <yuan.yao@...el.com>,
"nik.borisov@...e.com" <nik.borisov@...e.com>,
"dmatlack@...gle.com" <dmatlack@...gle.com>, Kai Huang
<kai.huang@...el.com>, "isaku.yamahata@...il.com"
<isaku.yamahata@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with
operand SEPT
On 9/9/24 23:11, Sean Christopherson wrote:
> In general, I am_very_ opposed to blindly retrying an SEPT SEAMCALL, ever. For
> its operations, I'm pretty sure the only sane approach is for KVM to ensure there
> will be no contention. And if the TDX module's single-step protection spuriously
> kicks in, KVM exits to userspace. If the TDX module can't/doesn't/won't communicate
> that it's mitigating single-step, e.g. so that KVM can forward the information
> to userspace, then that's a TDX module problem to solve.
In principle I agree but we also need to be pragmatic. Exiting to
userspace may not be practical in all flows, for example.
First of all, we can add a spinlock around affected seamcalls. This way
we know that "busy" errors must come from the guest and have set
HOST_PRIORITY. It is still kinda bad that guests can force the VMM to
loop, but the VMM can always say enough is enough. In other words,
let's assume that a limit of 16 is probably appropriate but we can also
increase the limit and crash the VM if things become ridiculous.
Something like this:
static u32 max = 16;
int retry = 0;
spin_lock(&kvm->arch.seamcall_lock);
for (;;) {
args_in = *in;
ret = seamcall_ret(op, in);
if (++retry == 1) {
/* protected by the same seamcall_lock */
kvm->stat.retried_seamcalls++;
} else if (retry == READ_ONCE(max)) {
pr_warn("Exceeded %d retries for S-EPT operation\n", max);
if (KVM_BUG_ON(kvm, retry == 1024)) {
pr_err("Crashing due to lock contention in the TDX module\n");
break;
}
cmpxchg(&max, retry, retry * 2);
}
}
spin_unlock(&kvm->arch.seamcall_lock);
This way we can do some testing and figure out a useful limit.
For zero step detection, my reading is that it's TDH.VP.ENTER that
fails; not any of the MEM seamcalls. For that one to be resolved, it
should be enough to do take and release the mmu_lock back to back, which
ensures that all pending critical sections have completed (that is,
"write_lock(&kvm->mmu_lock); write_unlock(&kvm->mmu_lock);"). And then
loop. Adding a vCPU stat for that one is a good idea, too.
Paolo
Powered by blists - more mailing lists