linux-kernel - Re: [PATCHv2 04/29] x86/traps: Add #VE support for TDX guest

Message-ID: <YfmlnJ6LS935AMS4@google.com>
Date:   Tue, 1 Feb 2022 21:26:52 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...el.com,
        luto@...nel.org, peterz@...radead.org,
        sathyanarayanan.kuppuswamy@...ux.intel.com, aarcange@...hat.com,
        ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
        hpa@...or.com, jgross@...e.com, jmattson@...gle.com,
        joro@...tes.org, jpoimboe@...hat.com, knsathya@...nel.org,
        pbonzini@...hat.com, sdeep@...are.com, tony.luck@...el.com,
        vkuznets@...hat.com, wanpengli@...cent.com, x86@...nel.org,
        linux-kernel@...r.kernel.org,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [PATCHv2 04/29] x86/traps: Add #VE support for TDX guest

On Tue, Feb 01, 2022, Thomas Gleixner wrote:
> On Mon, Jan 24 2022 at 18:01, Kirill A. Shutemov wrote:
> > diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> > index df0fa695bb09..1da074123c16 100644
> > --- a/arch/x86/kernel/idt.c
> > +++ b/arch/x86/kernel/idt.c
> > @@ -68,6 +68,9 @@ static const __initconst struct idt_data early_idts[] = {
> >  	 */
> >  	INTG(X86_TRAP_PF,		asm_exc_page_fault),
> >  #endif
> > +#ifdef CONFIG_INTEL_TDX_GUEST
> > +	INTG(X86_TRAP_VE,		asm_exc_virtualization_exception),
> > +#endif
> >  
> > +bool tdx_get_ve_info(struct ve_info *ve)
> > +{
> > +	struct tdx_module_output out;
> > +
> > +	/*
> > +	 * NMIs and machine checks are suppressed. Before this point any
> > +	 * #VE is fatal. After this point (TDGETVEINFO call), NMIs and
> > +	 * additional #VEs are permitted (but it is expected not to
> > +	 * happen unless kernel panics).
> 
> I really do not understand that comment. #NMI and #MC are suppressed
> according to the above. How long are they suppressed and what's the
> mechanism? Are they unblocked on return from __tdx_module_call() ?

TDX_GET_VEINFO is a call into the TDX module to get the data from the #VE info
struct pointed at by the VMCS.  Doing TDX_GET_VEINFO also clears the "valid" flag
in the struct.  It's basically a CMPXCHG on the #VE info struct, except that it
routes through the TDX module.
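
As a rough user-space analogy only (this is not TDX module or kernel code, and
the struct layout, field names and ve_area_consume() are made up for
illustration), the "read the info and clear the valid flag" behavior could be
sketched like this:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout; the real #VE info area is defined by the TDX spec. */
struct ve_area {
	atomic_bool valid;	/* set when a #VE is delivered, cleared on read */
	uint64_t exit_reason;
	uint64_t gpa;
};

/*
 * Copy out the pending #VE info and clear "valid".  Returns false if no
 * info was pending, in which case the outputs are left untouched.
 */
static bool ve_area_consume(struct ve_area *area, uint64_t *exit_reason,
			    uint64_t *gpa)
{
	if (!atomic_load(&area->valid))
		return false;

	*exit_reason = area->exit_reason;
	*gpa = area->gpa;

	/* In this model, clearing the flag is what re-arms delivery. */
	atomic_store(&area->valid, false);
	return true;
}
```

A second ve_area_consume() call without a new #VE in between returns false,
which is the "valid flag already consumed" case.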

The TDX module treats virtual NMIs as blocked if the #VE valid flag is set, i.e.
it refuses to inject an NMI until the guest does TDX_GET_VEINFO to retrieve the
info for the last #VE.

I don't understand the blurb about #MC.  Unless things have changed, the TDX module
doesn't support injecting #MC into the guest.

> What prevents a nested #VE? If it happens what makes it fatal? Is it
> converted to a #DF or detected by software?

A #VE that would occur is morphed to a #DF by the TDX module if the #VE info valid
flag is already set.  But nested #VE should work, so long as the nested #VE happens
after TDX_GET_VEINFO.
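
To make the ordering concrete, here is a toy model of the rules described so
far.  It is only a sketch of my reading of the behavior, not the TDX module's
implementation; struct vcpu_model, enum event and the three helpers are all
hypothetical names.

```c
#include <stdbool.h>

enum event { EV_NONE, EV_VE, EV_DF, EV_NMI };

/* Toy per-vCPU state; both fields are illustrative, not real TDX state. */
struct vcpu_model {
	bool ve_valid;		/* "#VE info valid" flag */
	bool nmi_pending;	/* an NMI is waiting to be injected */
};

/* A condition that would raise #VE: morphed to #DF if info is unread. */
static enum event raise_ve(struct vcpu_model *v)
{
	if (v->ve_valid)
		return EV_DF;

	v->ve_valid = true;
	return EV_VE;
}

/* Guest retrieves the #VE info, which clears the valid flag. */
static void get_veinfo(struct vcpu_model *v)
{
	v->ve_valid = false;
}

/* NMI injection attempt: held back while the valid flag is set. */
static enum event try_inject_nmi(struct vcpu_model *v)
{
	if (!v->nmi_pending || v->ve_valid)
		return EV_NONE;

	v->nmi_pending = false;
	return EV_NMI;
}
```

In this model, raise_ve() followed by another raise_ve() yields EV_DF, while
raise_ve(), get_veinfo(), raise_ve() yields EV_VE twice, and a pending NMI is
only returned by try_inject_nmi() once get_veinfo() has run.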

> Also I do not understand what the last sentence tries to tell me. If the
> suppression of #NMI and #MC is lifted on return from tdcall then both
> can be delivered immediately afterwards, right?

Yep, NMI can be injected on the instruction following the TDCALL.  

Something like this?
	
	/*
	 * Retrieve the #VE info from the TDX module, which also clears the "#VE
	 * valid" flag.  This must be done before anything else as any #VE that
	 * occurs while the valid flag is set, i.e. before the previous #VE info
	 * was consumed, is morphed to a #DF by the TDX module.  Note, the TDX
	 * module also treats virtual NMIs as inhibited if the #VE valid flag is
	 * set, e.g. so that NMI=>#VE will not result in a #DF.
	 */
 
> I assume the additional #VE is triggered by software or a bug in the
> kernel.

I'm curious whether that will even hold true; there's sooo much stuff that can
happen from NMI context.  I don't see much value in speculating about what
will/won't happen after retrieving the #VE info.