Message-ID: <4cd18b6e-5e64-4b7d-9dbc-fd4c293cb4db@intel.com>
Date: Fri, 6 Sep 2024 13:41:37 +1200
From: "Huang, Kai" <kai.huang@...el.com>
To: Rick Edgecombe <rick.p.edgecombe@...el.com>, <seanjc@...gle.com>,
<pbonzini@...hat.com>, <kvm@...r.kernel.org>
CC: <dmatlack@...gle.com>, <isaku.yamahata@...il.com>, <yan.y.zhao@...el.com>,
<nik.borisov@...e.com>, <linux-kernel@...r.kernel.org>, Yuan Yao
<yuan.yao@...el.com>
Subject: Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with
operand SEPT
On 4/09/2024 3:07 pm, Rick Edgecombe wrote:
> From: Yuan Yao <yuan.yao@...el.com>
>
> TDX module internally uses locks to protect internal resources. It tries
> to acquire the locks. If it fails to obtain the lock, it returns
> TDX_OPERAND_BUSY error without spin because its execution time limitation.
>
> TDX SEAMCALL API reference describes what resources are used. It's known
> which TDX SEAMCALL can cause contention with which resources. VMM can
> avoid contention inside the TDX module by avoiding contentious TDX SEAMCALL
> with, for example, spinlock. Because OS knows better its process
> scheduling and its scalability, a lock at OS/VMM layer would work better
> than simply retrying TDX SEAMCALLs.
>
> TDH.MEM.* API except for TDH.MEM.TRACK operates on a secure EPT tree and
> the TDX module internally tries to acquire the lock of the secure EPT tree.
> They return TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT in case of failure to
> get the lock. TDX KVM allows sept callbacks to return error so that TDP
> MMU layer can retry.
>
> Retry TDX TDH.MEM.* API on the error because the error is a rare event
> caused by zero-step attack mitigation.

The last paragraph could be improved:

It seems to say that "TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT" can only be
caused by zero-step attack detection/mitigation, which isn't true according
to the previous paragraph.

In fact, I think this patch can be dropped:

1) The TDH_MEM_xx()s can return BUSY due to the nature of the TDP MMU, but
all the callers of TDH_MEM_xx()s already retry explicitly, as can be seen in
the patch "KVM: TDX: Implement hooks to propagate changes of TDP MMU
mirror page table" -- they either return PF_RETRY to let the fault
happen again or explicitly loop until no BUSY is returned. So I am not
sure why we need to "loop SEAMCALL_RETRY_MAX (16) times" in the common code.

2) TDH_VP_ENTER explicitly retries immediately in such a case:

	/* See the comment of tdx_seamcall_sept(). */
	if (unlikely(vp_enter_ret == TDX_ERROR_SEPT_BUSY))
		return EXIT_FASTPATH_REENTER_GUEST;

3) That means the _ONLY_ reason to retry in the common code for
TDH_MEM_xx()s is to mitigate zero-step attack by reducing the number of
times the guest is allowed to fault on the same instruction.

I don't think we need to handle zero-step attack mitigation in the first
TDX support submission. So I think we can just remove this patch.
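
To make the two retry levels from 1) and 3) concrete, here is a minimal
user-space sketch, not kernel code. Every name in it (fake_seamcall(),
bounded_retry(), handle_fault_once(), the FAKE_* values) is hypothetical and
only models the behaviour under discussion; the status values are not the
real TDX encodings. Note one deliberate difference from the quoted helper:
the sketch issues each call on a scratch copy, so every attempt starts from
the caller's original inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes; NOT the real TDX encodings. */
#define FAKE_SUCCESS	0ULL
#define FAKE_SEPT_BUSY	1ULL
#define RETRY_MAX	16

/* Hypothetical in/out argument block, loosely modelled on a register set. */
struct fake_args {
	uint64_t rcx;
	uint64_t rdx;
};

static int fake_busy_left;	/* BUSY results the fake module still returns */
static int fake_calls;		/* total number of fake SEAMCALLs issued */

/* Fake SEAMCALL: overwrites its argument block with output on every call. */
static uint64_t fake_seamcall(struct fake_args *args)
{
	fake_calls++;
	if (fake_busy_left > 0) {
		fake_busy_left--;
		args->rcx = ~0ULL;	/* error detail clobbers the input */
		return FAKE_SEPT_BUSY;
	}
	args->rcx += 1;			/* successful output */
	return FAKE_SUCCESS;
}

/* Common-code style: bounded retry, then hand the result to the caller. */
static uint64_t bounded_retry(struct fake_args *inout)
{
	struct fake_args scratch;
	int retry = RETRY_MAX;
	uint64_t ret;

	assert(inout);
	do {
		scratch = *inout;	/* restart from the original inputs */
		ret = fake_seamcall(&scratch);
	} while (ret == FAKE_SEPT_BUSY && retry-- > 0);

	*inout = scratch;		/* publish output of the last attempt */
	return ret;
}

enum fault_result { FAULT_DONE, FAULT_RETRY };

/* Caller style: no loop here; BUSY is reported so the fault happens again. */
static enum fault_result handle_fault_once(struct fake_args *inout)
{
	struct fake_args scratch = *inout;

	if (fake_seamcall(&scratch) == FAKE_SEPT_BUSY)
		return FAULT_RETRY;
	*inout = scratch;
	return FAULT_DONE;
}
```

With "retry-- > 0" and a starting value of 16, the bounded helper issues at
most 17 calls (one initial attempt plus 16 retries) before giving up. The
caller-style function never loops, so each BUSY costs one more pass through
the fault path, which is the cost the common-code retry is trying to reduce.
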
>
> Signed-off-by: Yuan Yao <yuan.yao@...el.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
> ---
> TDX MMU part 2 v1:
> - Updates from seamcall overhaul (Kai)
>
> v19:
> - fix typo TDG.VP.ENTER => TDH.VP.ENTER,
> TDX_OPRRAN_BUSY => TDX_OPERAND_BUSY
> - drop the description on TDH.VP.ENTER as this patch doesn't touch
> TDH.VP.ENTER
> ---
> arch/x86/kvm/vmx/tdx_ops.h | 48 ++++++++++++++++++++++++++++++++------
> 1 file changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
> index 0363d8544f42..8ca3e252a6ed 100644
> --- a/arch/x86/kvm/vmx/tdx_ops.h
> +++ b/arch/x86/kvm/vmx/tdx_ops.h
> @@ -31,6 +31,40 @@
> #define pr_tdx_error_3(__fn, __err, __rcx, __rdx, __r8) \
> pr_tdx_error_N(__fn, __err, "rcx 0x%llx, rdx 0x%llx, r8 0x%llx\n", __rcx, __rdx, __r8)
>
> +/*
> + * TDX module acquires its internal lock for resources. It doesn't spin to get
> + * locks because of its restrictions of allowed execution time. Instead, it
> + * returns TDX_OPERAND_BUSY with an operand id.
> + *
> + * Multiple VCPUs can operate on SEPT. Also with zero-step attack mitigation,
> + * TDH.VP.ENTER may rarely acquire SEPT lock and release it when zero-step
> + * attack is suspected. It results in TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT
> + * with TDH.MEM.* operation. Note: TDH.MEM.TRACK is an exception.
> + *
> + * Because TDP MMU uses read lock for scalability, spin lock around SEAMCALL
> + * spoils TDP MMU effort. Retry several times with the assumption that SEPT
> + * lock contention is rare. But don't loop forever to avoid lockup. Let TDP
> + * MMU retry.
> + */
> +#define TDX_ERROR_SEPT_BUSY (TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)
> +
> +static inline u64 tdx_seamcall_sept(u64 op, struct tdx_module_args *in)
> +{
> +#define SEAMCALL_RETRY_MAX 16
> + struct tdx_module_args args_in;
> + int retry = SEAMCALL_RETRY_MAX;
> + u64 ret;
> +
> + do {
> + args_in = *in;
> + ret = seamcall_ret(op, in);
> + } while (ret == TDX_ERROR_SEPT_BUSY && retry-- > 0);
> +
> + *in = args_in;
> +
> + return ret;
> +}
> +
> static inline u64 tdh_mng_addcx(struct kvm_tdx *kvm_tdx, hpa_t addr)
> {
> struct tdx_module_args in = {
> @@ -55,7 +89,7 @@ static inline u64 tdh_mem_page_add(struct kvm_tdx *kvm_tdx, gpa_t gpa,
> u64 ret;
>
> clflush_cache_range(__va(hpa), PAGE_SIZE);
> - ret = seamcall_ret(TDH_MEM_PAGE_ADD, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_PAGE_ADD, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -76,7 +110,7 @@ static inline u64 tdh_mem_sept_add(struct kvm_tdx *kvm_tdx, gpa_t gpa,
>
> clflush_cache_range(__va(page), PAGE_SIZE);
>
> - ret = seamcall_ret(TDH_MEM_SEPT_ADD, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_SEPT_ADD, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -93,7 +127,7 @@ static inline u64 tdh_mem_sept_remove(struct kvm_tdx *kvm_tdx, gpa_t gpa,
> };
> u64 ret;
>
> - ret = seamcall_ret(TDH_MEM_SEPT_REMOVE, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_SEPT_REMOVE, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -123,7 +157,7 @@ static inline u64 tdh_mem_page_aug(struct kvm_tdx *kvm_tdx, gpa_t gpa, hpa_t hpa
> u64 ret;
>
> clflush_cache_range(__va(hpa), PAGE_SIZE);
> - ret = seamcall_ret(TDH_MEM_PAGE_AUG, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_PAGE_AUG, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -140,7 +174,7 @@ static inline u64 tdh_mem_range_block(struct kvm_tdx *kvm_tdx, gpa_t gpa,
> };
> u64 ret;
>
> - ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_RANGE_BLOCK, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -335,7 +369,7 @@ static inline u64 tdh_mem_page_remove(struct kvm_tdx *kvm_tdx, gpa_t gpa,
> };
> u64 ret;
>
> - ret = seamcall_ret(TDH_MEM_PAGE_REMOVE, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_PAGE_REMOVE, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;
> @@ -361,7 +395,7 @@ static inline u64 tdh_mem_range_unblock(struct kvm_tdx *kvm_tdx, gpa_t gpa,
> };
> u64 ret;
>
> - ret = seamcall_ret(TDH_MEM_RANGE_UNBLOCK, &in);
> + ret = tdx_seamcall_sept(TDH_MEM_RANGE_UNBLOCK, &in);
>
> *rcx = in.rcx;
> *rdx = in.rdx;