lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7882ef34-416f-9627-dcbe-7bf88c866dc8@intel.com>
Date:   Wed, 31 Mar 2021 14:49:54 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/1] x86/tdx: Handle MWAIT, MONITOR and WBINVD

On 3/31/21 2:09 PM, Kuppuswamy Sathyanarayanan wrote:
> As per Guest-Host Communication Interface (GHCI) Specification
> for Intel TDX, sec 2.4.1, TDX architecture does not support
> MWAIT, MONITOR and WBINVD instructions. So in non-root TDX mode,
> if MWAIT/MONITOR instructions are executed with CPL != 0 it will
> trigger #UD, and for CPL = 0 case, virtual exception (#VE) is
> triggered. WBINVD instruction behavior is also similar to
> MWAIT/MONITOR, but for CPL != 0 case, it will trigger #GP instead
> of #UD.

Could we give it a go to try this in plain English before jumping in and
quoting the exact spec section?  Also, the CPL language is nice and
precise for talking inside Intel, but it's generally easier for me to
read kernel descriptions when we just talk about the kernel.

	When running as a TDX guest, there are a number of existing,
	privileged instructions that do not work.  If the guest kernel
	uses these instructions, the hardware generates a #VE.

Which reminds me...  The SDM says: MWAIT will "#UD ... If
CPUID.01H:ECX.MONITOR[bit 3] = 0".  So, is this an architectural change?
 The guest is *supposed* to see that CPUID bit as 0, so shouldn't it
also get a #UD?  Or is this all so that if SEAM *forgets* to clear the
CPUID bit, the guest gets #VE?

What are we *actually* mitigating here?

Also, FWIW, MWAIT/MONITOR and WBINVD are pretty different beasts.  I
think this would all have been a lot more clear if this would have been
two patches instead of shoehorning them into one.

> To prevent TD guest from using these unsupported instructions,
> following measures are adapted:
> 
> 1. For MWAIT/MONITOR instructions, support for these instructions
> are already disabled by TDX module (SEAM). So CPUID flags for
> these instructions should be in disabled state. Also, just to be
> sure that these instructions are disabled, forcefully unset
> X86_FEATURE_MWAIT CPU cap in OS.
> 
> 2. For WBINVD instruction, we use audit to find the code that uses
> this instruction and disable them for TD.

Really?  Where are those patches?

> +static inline bool cpuid_has_mwait(void)
> +{
> +	if (cpuid_ecx(1) & (1 << (X86_FEATURE_MWAIT % 32)))
> +		return true;
> +
> +	return false;
> +}
> +
>  bool is_tdx_guest(void)
>  {
>  	return static_cpu_has(X86_FEATURE_TDX_GUEST);
> @@ -301,12 +309,25 @@ static int tdg_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>  	return insn.length;
>  }
>  
> +/* Initialize TDX specific CPU capabilities */
> +static void __init tdx_cpu_cap_init(void)
> +{
> +	setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
> +
> +	if (cpuid_has_mwait()) {
> +		WARN(1, "TDX Module failed to disable MWAIT\n");

WARN(1, "TDX guest enumerated support for MWAIT, disabling it").

> +		/* MWAIT is not supported in TDX platform, so suppress it */
> +		setup_clear_cpu_cap(X86_FEATURE_MWAIT);
> +	}
> +
> +}

Extra newline.

>  void __init tdx_early_init(void)
>  {
>  	if (!cpuid_has_tdx_guest())
>  		return;
>  
> -	setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
> +	tdx_cpu_cap_init();
>  
>  	tdg_get_info();
>  
> @@ -362,6 +383,27 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
>  	case EXIT_REASON_EPT_VIOLATION:
>  		ve->instr_len = tdg_handle_mmio(regs, ve);
>  		break;
> +	case EXIT_REASON_WBINVD:
> +		/*
> +		 * TDX architecture does not support WBINVD instruction.
> +		 * Currently, usage of this instruction is prevented by
> +		 * disabling the drivers which uses it. So if we still
> +		 * reach here, it needs user attention.
> +		 */

This comment is awfully vague.  "TDX architecture..." what?  Any CPUs
supporting the TDX architecture?  TDX VMM's?  TDX Guests?

Let's also not waste byte on stating the obvious.  If it didn't need
attention we wouldn't be warning about it, eh?

So, let's halve the size of the comment and say:

		/*
		 * WBINVD is not supported inside TDX guests.  All in-
		 * kernel uses should have been disabled.
		 */

> +		pr_err("TD Guest used unsupported WBINVD instruction\n");
> +		BUG();
> +		break;
> +	case EXIT_REASON_MONITOR_INSTRUCTION:
> +	case EXIT_REASON_MWAIT_INSTRUCTION:
> +		/*
> +		 * MWAIT/MONITOR features are disabled by TDX Module (SEAM)
> +		 * and also re-suppressed in kernel by clearing
> +		 * X86_FEATURE_MWAIT CPU feature flag in tdx_early_init(). So
> +		 * if TD guest still executes MWAIT/MONITOR instruction with
> +		 * above suppression, it needs user attention.
> +		 */

Again, let's trim this down:

		/*
		 * Something in the kernel used MONITOR or MWAIT despite
		 * X86_FEATURE_MWAIT being cleared for TDX guests.
		 */

Rather than naming the function, this makes it quite greppable to find
where it could have *possibly* been cleared.

> +		WARN(1, "TD Guest used unsupported MWAIT/MONITOR instruction\n");
> +		break;
>  	default:
>  		pr_warn("Unexpected #VE: %d\n", ve->exit_reason);
>  		return -EFAULT;
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ