linux-kernel - Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YCEQiDNSHTGBXBcj@hirez.programming.kicks-ass.net>
Date:   Mon, 8 Feb 2021 11:20:56 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        linux-kernel@...r.kernel.org,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [RFC v1 05/26] x86/traps: Add #VE support for TDX guest

On Fri, Feb 05, 2021 at 03:38:22PM -0800, Kuppuswamy Sathyanarayanan wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> 
> The TDX module injects #VE exception to the guest TD in cases of
> disallowed instructions, disallowed MSR accesses and subset of CPUID
> leaves. Also, it's theoretically possible for CPU to inject #VE
> exception on EPT violation, but the TDX module makes sure this does
> not happen, as long as all memory used is properly accepted using
> TDCALLs. You can find more details about it in, Guest-Host-Communication
> Interface (GHCI) for Intel Trust Domain Extensions (Intel TDX)
> specification, sec 2.3.
> 
> Add basic infrastructure to handle #VE. If there is no handler for a
> given #VE, since its a unexpected event (fault case), treat it as a
> general protection fault and handle it using do_general_protection()
> call.
> 
> TDCALL[TDGETVEINFO] provides information about #VE such as exit reason.
> 
> More details on cases where #VE exceptions are allowed/not-allowed:
> 
> The #VE exception do not occur in the paranoid entry paths, like NMIs.
> While other operations during an NMI might cause #VE, these are in the
> NMI code that can handle nesting, so there is no concern about
> reentrancy. This is similar to how #PF is handled in NMIs.
> 
> The #VE exception also cannot happen in entry/exit code with the
> wrong gs, such as the SWAPGS code, so it's entry point does not
> need "paranoid" handling.

All of the above are arranged by using the below secure EPT for init
text and data?

> Any memory accesses can cause #VE if it causes an EPT
> violation.  However, the VMM is only in direct control of some of the
> EPT tables.  The Secure EPT tables are controlled by the TDX module
> which guarantees no EPT violations will result in #VE for the guest,
> once the memory has been accepted.

Which is supposedly then set up to avoid #VE during the syscall gap,
yes? Which then results in #VE not having to be IST.

> +#ifdef CONFIG_INTEL_TDX_GUEST
> +DEFINE_IDTENTRY(exc_virtualization_exception)
> +{
> +	struct ve_info ve;
> +	int ret;
> +
> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> +	/* Consume #VE info before re-enabling interrupts */

So what happens if NMI happens here, and triggers a nested #VE ?

> +	ret = tdx_get_ve_info(&ve);
> +	cond_local_irq_enable(regs);
> +	if (!ret)
> +		ret = tdx_handle_virtualization_exception(regs, &ve);
> +	/*
> +	 * If #VE exception handler could not handle it successfully, treat
> +	 * it as #GP(0) and handle it.
> +	 */
> +	if (ret)
> +		do_general_protection(regs, 0);
> +	cond_local_irq_disable(regs);
> +}
> +#endif