Message-ID: <9b4b3581-925b-32a8-8a4f-fdd8d98f2164@intel.com>
Date: Thu, 24 Feb 2022 10:42:56 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
luto@...nel.org, peterz@...radead.org
Cc: sathyanarayanan.kuppuswamy@...ux.intel.com, aarcange@...hat.com,
ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
hpa@...or.com, jgross@...e.com, jmattson@...gle.com,
joro@...tes.org, jpoimboe@...hat.com, knsathya@...nel.org,
pbonzini@...hat.com, sdeep@...are.com, seanjc@...gle.com,
tony.luck@...el.com, vkuznets@...hat.com, wanpengli@...cent.com,
thomas.lendacky@....com, brijesh.singh@....com, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCHv4 08/30] x86/tdx: Add HLT support for TDX guests
On 2/24/22 07:56, Kirill A. Shutemov wrote:
> The HLT instruction is a privileged instruction; executing it stops
> instruction execution and places the processor in a HALT state. It
> is used in the kernel for cases like reboot, idle loop and exception fixup
> handlers. For the idle case, interrupts will be enabled (using STI)
> before the HLT instruction (this is also called safe_halt()).
>
> To support the HLT instruction in TDX guests, it needs to be emulated
> using TDVMCALL (hypercall to VMM). More details about it can be found
> in Intel Trust Domain Extensions (Intel TDX) Guest-Host-Communication
> Interface (GHCI) specification, section TDVMCALL[Instruction.HLT].
>
> In TDX guests, executing HLT instruction will generate a #VE, which is
> used to emulate the HLT instruction. But #VE based emulation will not
> work for the safe_halt() flavor, because it requires STI instruction to
> be executed just before the TDCALL. Since idle loop is the only user of
> safe_halt() variant, handle it as a special case.
>
> To avoid *safe_halt() call in the idle function, define the
> tdx_guest_idle() and use it to override the "x86_idle" function pointer
> for a valid TDX guest.
>
> Alternative choices like PV ops have been considered for adding
> safe_halt() support. But they were rejected because HLT paravirt calls
> only exist under PARAVIRT_XXL, and enabling it in a TDX guest just for
> the safe_halt() use case is not worth the cost.
Thanks for all the history and background here.
> diff --git a/arch/x86/coco/tdcall.S b/arch/x86/coco/tdcall.S
> index c4dd9468e7d9..3c35a056974d 100644
> --- a/arch/x86/coco/tdcall.S
> +++ b/arch/x86/coco/tdcall.S
> @@ -138,6 +138,19 @@ SYM_FUNC_START(__tdx_hypercall)
>
> movl $TDVMCALL_EXPOSE_REGS_MASK, %ecx
>
> + /*
> + * For the idle loop STI needs to be called directly before the TDCALL
> + * that enters idle (EXIT_REASON_HLT case). STI instruction enables
> + * interrupts only one instruction later. If there is a window between
> + * STI and the instruction that emulates the HALT state, there is a
> + * chance for interrupts to happen in this window, which can delay the
> + * HLT operation indefinitely. Since this is not the desired
> + * result, conditionally call STI before TDCALL.
> + */
> + testq $TDX_HCALL_ISSUE_STI, %rsi
> + jz .Lskip_sti
> + sti
> +.Lskip_sti:
> tdcall
>
> /*
> diff --git a/arch/x86/coco/tdx.c b/arch/x86/coco/tdx.c
> index 86a2f35e7308..0a2e6be0cdae 100644
> --- a/arch/x86/coco/tdx.c
> +++ b/arch/x86/coco/tdx.c
> @@ -7,6 +7,7 @@
> #include <linux/cpufeature.h>
> #include <asm/coco.h>
> #include <asm/tdx.h>
> +#include <asm/vmx.h>
>
> /* TDX module Call Leaf IDs */
> #define TDX_GET_INFO 1
> @@ -59,6 +60,62 @@ static void get_info(void)
> td_info.attributes = out.rdx;
> }
>
> +static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
> +{
> + struct tdx_hypercall_args args = {
> + .r10 = TDX_HYPERCALL_STANDARD,
> + .r11 = EXIT_REASON_HLT,
> + .r12 = irq_disabled,
> + };
> +
> + /*
> + * Emulate HLT operation via hypercall. More info about ABI
> + * can be found in TDX Guest-Host-Communication Interface
> + * (GHCI), section 3.8 TDG.VP.VMCALL<Instruction.HLT>.
> + *
> + * The VMM uses the "IRQ disabled" param to understand IRQ
> + * enabled status (RFLAGS.IF) of the TD guest and to determine
> + * whether or not it should schedule the halted vCPU if an
> + * IRQ becomes pending. E.g. if IRQs are disabled, the VMM
> + * can keep the vCPU in virtual HLT, even if an IRQ is
> + * pending, without hanging/breaking the guest.
> + */
> + return __tdx_hypercall(&args, do_sti ? TDX_HCALL_ISSUE_STI : 0);
> +}
> +
> +static bool handle_halt(void)
> +{
> + /*
> + * Since non safe halt is mainly used in CPU offlining
> + * and the guest will always stay in the halt state, don't
> + * call the STI instruction (set do_sti as false).
> + */
> + const bool irq_disabled = irqs_disabled();
> + const bool do_sti = false;
> +
> + if (__halt(irq_disabled, do_sti))
> + return false;
> +
> + return true;
> +}
One other note: I really do like the silly:

	const bool do_sti = false;

variables as opposed to doing gunk like:

	__halt(irq_disabled, false);

Thanks for doing that.
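
To make the readability point concrete, here is a minimal userspace sketch of the pattern. All names (fake_hypercall(), fake_halt(), fake_handle_halt(), FAKE_HCALL_ISSUE_STI) are hypothetical stand-ins and not the kernel functions from the patch; the "hypercall" only records its arguments so the call can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, standing in for TDX_HCALL_ISSUE_STI. */
#define FAKE_HCALL_ISSUE_STI 0x1ULL

/* Records what the last fake hypercall was asked to do. */
struct fake_call_record {
	bool irq_disabled;
	uint64_t flags;
};

static struct fake_call_record last_call;

/* Stand-in for the real hypercall: stores its arguments, returns 0 (success). */
static uint64_t fake_hypercall(bool irq_disabled, uint64_t flags)
{
	last_call.irq_disabled = irq_disabled;
	last_call.flags = flags;
	return 0;
}

/* Mirrors the shape of __halt(): two bools that are easy to mix up. */
static uint64_t fake_halt(const bool irq_disabled, const bool do_sti)
{
	return fake_hypercall(irq_disabled, do_sti ? FAKE_HCALL_ISSUE_STI : 0);
}

/* Named constants document each argument at the call site. */
static bool fake_handle_halt(bool irqs_off)
{
	const bool irq_disabled = irqs_off;
	const bool do_sti = false;	/* plain halt: never issue STI */

	return fake_halt(irq_disabled, do_sti) == 0;
}

/* Small self-check showing how the pieces are used. */
static void fake_selfcheck(void)
{
	assert(fake_handle_halt(true));
	assert(last_call.irq_disabled);
	assert(last_call.flags == 0);
}
```

With a bare fake_halt(irqs_off, false) the reader has to look up the prototype to learn what "false" means; the named constant costs one line and makes the call site self-describing.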
Acked-by: Dave Hansen <dave.hansen@...ux.intel.com>