lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 9 Feb 2022 17:34:07 +0300
From:   "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>, bp@...en8.de,
        aarcange@...hat.com, ak@...ux.intel.com, dan.j.williams@...el.com,
        dave.hansen@...el.com, david@...hat.com, hpa@...or.com,
        jgross@...e.com, jmattson@...gle.com, joro@...tes.org,
        jpoimboe@...hat.com, knsathya@...nel.org,
        linux-kernel@...r.kernel.org, luto@...nel.org, mingo@...hat.com,
        pbonzini@...hat.com, peterz@...radead.org,
        sathyanarayanan.kuppuswamy@...ux.intel.com, sdeep@...are.com,
        tony.luck@...el.com, vkuznets@...hat.com, wanpengli@...cent.com,
        x86@...nel.org
Subject: Re: [PATCHv2.1 05/29] x86/tdx: Add HLT support for TDX guests

On Mon, Feb 07, 2022 at 10:52:19PM +0000, Sean Christopherson wrote:
> On Fri, Feb 04, 2022, Kirill A. Shutemov wrote:
> > > > Looks dubious to me, I donno. I worry about possible conflicts with the
> > > > spec in the future.
> > > 
> > > The spec should have a reserved space for such things :)
> 
> Heh, the problem is someone has to deal with munging the two things together.
> E.g. if there's a EXIT_REASON_SAFE_HLT then the hypervisor would need a handler
> that's identical to EXIT_REASON_HLT, except with guest.EFLAGS.IF forced to '1'.
> The guest gets the short end of the stick because EXIT_REASON_HLT is already an
> established VM-Exit reason.
> 
> > > But you might think about having a in/out struct similar to the module
> > > call or just an array of u64.
> > > 
> > > and the signature would become:
> > > 
> > > __tdx_hypercall(u64 op, u64 flags, struct inout *args)
> > > __tdx_hypercall(u64 op, u64 flags, u64 *args)
> > > 
> > > and have flag bits:
> > > 
> > >     HCALL_ISSUE_STI
> > >     HCALL_HAS_OUTPUT
> > > 
> > > Hmm?
> > 
> > We have two distinct cases: standard hypercalls (defined in GHCI) and KVM
> > hypercalls. In the first case R10 is 0 (indicating standard TDVMCALL) and
> > R11 defines the operation. For KVM hypercalls R10 encodes the operation
> > (KVM hypercalls indexed from 1) and R11 is the first argument. So we
> > cannot get away with simple "RDI is op" interface.
> > 
> > And we need to return two values: RAX indicates if TDCALL itself was
> > successful and R10 is result of the hypercall. So we cannot easily get
> > away without output struct. HCALL_HAS_OUTPUT is not needed.
> 
> But __tdx_hypercall() should never fail TDCALL.  The TDX spec even says:
> 
>   RAX TDCALL instruction return code. Always returns Intel TDX_SUCCESS (0).
> 
> IIRC, the original PoC went straight to a ud2 if tdcall failed.  Why not do
> something similar?  That would get rid of the bajillion instances of:
> 
> 	if (__tdx_hypercall(...))
> 		panic("Hypercall fn %llu failed (Buggy TDX module!)\n", fn);

Okay, below is my take on it. Boot tested.

I used UD2 as a way to deal with it-never-heppens condition. Not sure if
it the right way, but stack trace looks useful and I see it being used for
CONFIG_DEBUG_ENTRY.

Any comments?

/*
 * __tdx_hypercall() - Make hypercalls to a TDX VMM.
 *
 * Transforms values in  function call argument struct tdx_hypercall_args @args
 * into the TDCALL register ABI. After TDCALL operation, VMM output is saved
 * back in @args.
 *
 *-------------------------------------------------------------------------
 * TD VMCALL ABI:
 *-------------------------------------------------------------------------
 *
 * Input Registers:
 *
 * RAX                 - TDCALL instruction leaf number (0 - TDG.VP.VMCALL)
 * RCX                 - BITMAP which controls which part of TD Guest GPR
 *                       is passed as-is to the VMM and back.
 * R10                 - Set 0 to indicate TDCALL follows standard TDX ABI
 *                       specification. Non zero value indicates vendor
 *                       specific ABI.
 * R11                 - VMCALL sub function number
 * RBX, RBP, RDI, RSI  - Used to pass VMCALL sub function specific arguments.
 * R8-R9, R12-R15      - Same as above.
 *
 * Output Registers:
 *
 * RAX                 - TDCALL instruction status (Not related to hypercall
 *                        output).
 * R10                 - Hypercall output error code.
 * R11-R15             - Hypercall sub function specific output values.
 *
 *-------------------------------------------------------------------------
 *
 * __tdx_hypercall() function ABI:
 *
 * @args  (RDI)        - struct tdx_hypercall_args for input and output
 * @flags (RSI)        - TDX_HCALL_* flags
 *
 * On successful completion, return the hypercall error code.
 */
SYM_FUNC_START(__tdx_hypercall)
	FRAME_BEGIN

	/* Save callee-saved GPRs as mandated by the x86_64 ABI */
	push %r15
	push %r14
	push %r13
	push %r12

	/* Mangle function call ABI into TDCALL ABI: */
	/* Set TDCALL leaf ID (TDVMCALL (0)) in RAX */
	xor %eax, %eax

	/* Copy hypercall registers from arg struct: */
	movq TDX_HYPERCALL_r10(%rdi), %r10
	movq TDX_HYPERCALL_r11(%rdi), %r11
	movq TDX_HYPERCALL_r12(%rdi), %r12
	movq TDX_HYPERCALL_r13(%rdi), %r13
	movq TDX_HYPERCALL_r14(%rdi), %r14
	movq TDX_HYPERCALL_r15(%rdi), %r15

	movl $TDVMCALL_EXPOSE_REGS_MASK, %ecx

	/*
	 * For the idle loop STI needs to be called directly before
	 * the TDCALL that enters idle (EXIT_REASON_HLT case). STI
	 * instruction enables interrupts only one instruction later.
	 * If there is a window between STI and the instruction that
	 * emulates the HALT state, there is a chance for interrupts to
	 * happen in this window, which can delay the HLT operation
	 * indefinitely. Since this is the not the desired result,
	 * conditionally call STI before TDCALL.
	 */
	testq $TDX_HCALL_ISSUE_STI, %rsi
	jz .Lskip_sti
	sti
.Lskip_sti:
	tdcall

	/*
	 * TDVMCALL leaf does not suppose to fail. If it fails something
	 * is horribly wrong with TDX module. Stop the world.
	 */
	test %rax, %rax
	je .Lsuccess
	ud2
.Lsuccess:
	/* TDVMCALL leaf return code is in R10 */
	movq %r10, %rax

	/* Copy hypercall result registers to arg struct if needed */
	testq $TDX_HCALL_HAS_OUTPUT, %rsi
	jz .Lout

	movq %r10, TDX_HYPERCALL_r10(%rdi)
	movq %r11, TDX_HYPERCALL_r11(%rdi)
	movq %r12, TDX_HYPERCALL_r12(%rdi)
	movq %r13, TDX_HYPERCALL_r13(%rdi)
	movq %r14, TDX_HYPERCALL_r14(%rdi)
	movq %r15, TDX_HYPERCALL_r15(%rdi)
.Lout:
	/*
	 * Zero out registers exposed to the VMM to avoid
	 * speculative execution with VMM-controlled values.
	 * This needs to include all registers present in
	 * TDVMCALL_EXPOSE_REGS_MASK (except R12-R15).
	 * R12-R15 context will be restored.
	 */
	xor %r10d, %r10d
	xor %r11d, %r11d

	/* Restore callee-saved GPRs as mandated by the x86_64 ABI */
	pop %r12
	pop %r13
	pop %r14
	pop %r15

	FRAME_END

	retq
SYM_FUNC_END(__tdx_hypercall)
-- 
 Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ