lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220310212059.6abpmnsgodqqqnfm@black.fi.intel.com>
Date:   Fri, 11 Mar 2022 00:20:59 +0300
From:   "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     tglx@...utronix.de, mingo@...hat.com, dave.hansen@...el.com,
        luto@...nel.org, peterz@...radead.org,
        sathyanarayanan.kuppuswamy@...ux.intel.com, aarcange@...hat.com,
        ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
        hpa@...or.com, jgross@...e.com, jmattson@...gle.com,
        joro@...tes.org, jpoimboe@...hat.com, knsathya@...nel.org,
        pbonzini@...hat.com, sdeep@...are.com, seanjc@...gle.com,
        tony.luck@...el.com, vkuznets@...hat.com, wanpengli@...cent.com,
        thomas.lendacky@....com, brijesh.singh@....com, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCHv5 03/30] x86/tdx: Add __tdx_module_call() and
 __tdx_hypercall() helper functions

On Thu, Mar 10, 2022 at 04:30:57PM +0100, Borislav Petkov wrote:
> On Wed, Mar 02, 2022 at 05:27:39PM +0300, Kirill A. Shutemov wrote:
> > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> > 
> > Guests communicate with VMMs with hypercalls. Historically, these
> > are implemented using instructions that are known to cause VMEXITs
> > like VMCALL, VMLAUNCH, etc. However, with TDX, VMEXITs no longer
> > expose the guest state to the host. This prevents the old hypercall
> > mechanisms from working. So, to communicate with VMM, TDX
> > specification defines a new instruction called TDCALL.
> > 
> > In a TDX based VM, since the VMM is an untrusted entity, an intermediary
> > layer -- TDX module -- facilitates secure communication between the host
> > and the guest. TDX module is loaded like a firmware into a special CPU
> > mode called SEAM. TDX guests communicate with the TDX module using the
> > TDCALL instruction.
> > 
> > A guest uses TDCALL to communicate with both the TDX module and VMM.
> > The value of the RAX register when executing the TDCALL instruction is
> > used to determine the TDCALL type. A variant of TDCALL used to communicate
> > with the VMM is called TDVMCALL.
> > 
> > Add generic interfaces to communicate with the TDX module and VMM
> > (using the TDCALL instruction).
> > 
> > __tdx_hypercall()    - Used by the guest to request services from the
> > 		       VMM (via TDVMCALL).
> > __tdx_module_call()  - Used to communicate with the TDX module (via
> > 		       TDCALL).
> 
> Ok, you need to fix this: this sounds to me like there are two insns:
> TDCALL and TDVMCALL. But there's only TDCALL.
> 
> And I'm not even clear on how the differentiation is done - I guess
> with %r11 which contains the VMCALL subfunction number in the
> __tdx_hypercall() case but I'm not sure.

TDVMCALL is a leaf of TDCALL. The leaf number is encoded in RAX: RAX==0 is
TDVMCALL.

I'm not completely sure what has to be fixed. Make it clear that TDVMCALL
is leaf of TDCALL? Something like this:

	__tdx_module_call()  - Used to communicate with the TDX module (via
			       TDCALL instruction).
	__tdx_hypercall()    - Used by the guest to request services from the
			       VMM (via TDVMCALL leaf of TDCALL).

?

> And when explaining this, pls put it in the comment over the function so
> that it is clear how the distinction is made.

But it's already there:

/*
 * __tdx_module_call()  - Used by TDX guests to request services from
 * the TDX module (does not include VMM services).
 *
 * Transforms function call register arguments into the TDCALL
 * register ABI.  After TDCALL operation, TDX module output is saved
 * in @out (if it is provided by the user)
 *
 ...
 */

and

/*
 * __tdx_hypercall() - Make hypercalls to a TDX VMM.
 *
 * Transforms values in  function call argument struct tdx_hypercall_args @args
 * into the TDCALL register ABI. After TDCALL operation, VMM output is saved
 * back in @args.
 *
 *-------------------------------------------------------------------------
 * TD VMCALL ABI:
 *-------------------------------------------------------------------------
 *
 * Input Registers:
 *
 * RAX                 - TDCALL instruction leaf number (0 - TDG.VP.VMCALL)
 .....
 */

Hm?

> > Also define an additional wrapper _tdx_hypercall(), which adds error
> > handling support for the TDCALL failure.
> > 
> > The __tdx_module_call() and __tdx_hypercall() helper functions are
> > implemented in assembly in a .S file.  The TDCALL ABI requires
> > shuffling arguments in and out of registers, which proved to be
> > awkward with inline assembly.
> > 
> > Just like syscalls, not all TDVMCALL use cases need to use the same
> > number of argument registers. The implementation here picks the current
> > worst-case scenario for TDCALL (4 registers). For TDCALLs with fewer
> > than 4 arguments, there will end up being a few superfluous (cheap)
> > instructions. But, this approach maximizes code reuse.
> > 
> > For registers used by the TDCALL instruction, please check TDX GHCI
> > specification, the section titled "TDCALL instruction" and "TDG.VP.VMCALL
> > Interface".
> > 
> > Based on previous patch by Sean Christopherson.
> > 
> > Reviewed-by: Tony Luck <tony.luck@...el.com>
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > ---
> >  arch/x86/coco/Makefile        |   2 +-
> >  arch/x86/coco/tdcall.S        | 188 ++++++++++++++++++++++++++++++++++
> >  arch/x86/coco/tdx.c           |  18 ++++
> 
> Those should be
> 
> arch/x86/coco/tdx/tdcall.S
> arch/x86/coco/tdx/tdx.c
> 
> like we said:
> 
> "- confidential computing guest stuff: arch/x86/coco/{sev,tdx}"

Okay, will change. But it is not what we agreed about before:

https://lore.kernel.org/all/Yg5q742GsjCRHXZL@zn.tnic

> >  arch/x86/include/asm/tdx.h    |  27 +++++
> >  arch/x86/kernel/asm-offsets.c |  10 ++
> >  5 files changed, 244 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/x86/coco/tdcall.S
> 
> ...
> 
> > +SYM_FUNC_START(__tdx_hypercall)
> > +	FRAME_BEGIN
> > +
> > +	/* Save callee-saved GPRs as mandated by the x86_64 ABI */
> > +	push %r15
> > +	push %r14
> > +	push %r13
> > +	push %r12
> > +
> > +	/* Mangle function call ABI into TDCALL ABI: */
> > +	/* Set TDCALL leaf ID (TDVMCALL (0)) in RAX */
> > +	xor %eax, %eax
> > +
> > +	/* Copy hypercall registers from arg struct: */
> > +	movq TDX_HYPERCALL_r10(%rdi), %r10
> > +	movq TDX_HYPERCALL_r11(%rdi), %r11
> > +	movq TDX_HYPERCALL_r12(%rdi), %r12
> > +	movq TDX_HYPERCALL_r13(%rdi), %r13
> > +	movq TDX_HYPERCALL_r14(%rdi), %r14
> > +	movq TDX_HYPERCALL_r15(%rdi), %r15
> > +
> > +	movl $TDVMCALL_EXPOSE_REGS_MASK, %ecx
> > +
> > +	tdcall
> > +
> > +	/*
> > +	 * RAX==0 indicates a failure of the TDVMCALL mechanism itself and that
> > +	 * something has gone horribly wrong with the TDX module.
> > +	 *
> > +	 * The return status of the hypercall operation is in a separate
> > +	 * register (in R10). Hypercall errors are a part of normal operation
> > +	 * and are handled by callers.
> > +	 */
> > +	testq %rax, %rax
> > +	jne .Lpanic
> 
> Hm, can this call a C function which does the panic so that a proper
> error message is dumped to the user so that at least she knows where the
> panic comes from?

Sure we can. But it would look somewhat clunky.
Wouldn't backtrace be enough for this (never to be seen) case?

So far the panic would look like this:

	invalid opcode: 0000 [#1] PREEMPT SMP
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc5-00181-geb9d1dde679a-dirty #1883
	RIP: 0010:__tdx_hypercall+0x6d/0x70
	Code: 00 00 74 17 4c 89 17 4c 89 5f 08 4c 89 67 10 4c 89 6f 18 4c 89 77 20 4c 89 7f 28 45 31 d2 45 31 db 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 00 55 53 41 54 41 55 41 56 41 57 48 89 a7 98 1b 00 00 48 8b
	RSP: 0000:ffffffff82803e80 EFLAGS: 00010002
	RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000fc00
	RDX: ff11000004c3d448 RSI: 0000000000000002 RDI: ffffffff82803ea8
	RBP: ffffffff8283e840 R08: 0000000000000000 R09: 000000005eee98b1
	R10: 0000000000000000 R11: 000000000000000c R12: 0000000000000000
	R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
	FS:  0000000000000000(0000) GS:ff1100001c400000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: ff1100001c9ff000 CR3: 0000000002834001 CR4: 0000000000771ef0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
	PKRU: 55555554
	Call Trace:
	 <TASK>
	 tdx_safe_halt+0x46/0x80
	 default_idle_call+0x4f/0x90
	 do_idle+0xeb/0x250
	 cpu_startup_entry+0x15/0x20
	 start_kernel+0x372/0x3d1
	 secondary_startup_64_no_verify+0xe5/0xeb
	 </TASK>

To me "invalid opcode" at "RIP: 0010:__tdx_hypercall+0x6d/0x70" is pretty
clear where it comes from, no?

> > +
> > +	/* TDVMCALL leaf return code is in R10 */
> > +	movq %r10, %rax
> > +
> > +	/* Copy hypercall result registers to arg struct if needed */
> > +	testq $TDX_HCALL_HAS_OUTPUT, %rsi
> > +	jz .Lout
> > +
> > +	movq %r10, TDX_HYPERCALL_r10(%rdi)
> > +	movq %r11, TDX_HYPERCALL_r11(%rdi)
> > +	movq %r12, TDX_HYPERCALL_r12(%rdi)
> > +	movq %r13, TDX_HYPERCALL_r13(%rdi)
> > +	movq %r14, TDX_HYPERCALL_r14(%rdi)
> > +	movq %r15, TDX_HYPERCALL_r15(%rdi)
> > +.Lout:
> > +	/*
> > +	 * Zero out registers exposed to the VMM to avoid speculative execution
> > +	 * with VMM-controlled values. This needs to include all registers
> > +	 * present in TDVMCALL_EXPOSE_REGS_MASK (except R12-R15). R12-R15
> > +	 * context will be restored.
> > +	 */
> > +	xor %r10d, %r10d
> > +	xor %r11d, %r11d
> > +
> > +	/* Restore callee-saved GPRs as mandated by the x86_64 ABI */
> > +	pop %r12
> > +	pop %r13
> > +	pop %r14
> > +	pop %r15
> > +
> > +	FRAME_END
> > +
> > +	retq
> > +.Lpanic:
> > +	ud2
> > +SYM_FUNC_END(__tdx_hypercall)
> 
> ...
> 
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index 7dca52f5cfc6..0b465e7d0a2f 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -74,6 +74,16 @@ static void __used common(void)
> >  	OFFSET(TDX_MODULE_r10, tdx_module_output, r10);
> >  	OFFSET(TDX_MODULE_r11, tdx_module_output, r11);
> >  
> > +#ifdef CONFIG_INTEL_TDX_GUEST
> 
> Those have ifdeffery around them - why don't the TDX_MODULE_* ones need
> it too?

I will drop the #ifdef. There's no harm in generating it for !TDX build.

> > +	BLANK();
> > +	OFFSET(TDX_HYPERCALL_r10, tdx_hypercall_args, r10);
> > +	OFFSET(TDX_HYPERCALL_r11, tdx_hypercall_args, r11);
> > +	OFFSET(TDX_HYPERCALL_r12, tdx_hypercall_args, r12);
> > +	OFFSET(TDX_HYPERCALL_r13, tdx_hypercall_args, r13);
> > +	OFFSET(TDX_HYPERCALL_r14, tdx_hypercall_args, r14);
> > +	OFFSET(TDX_HYPERCALL_r15, tdx_hypercall_args, r15);
> > +#endif
> > +
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

-- 
 Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ