linux-kernel - Re: [PATCH v5 11/12] x86/tdx: Don't write CSTAR MSR on Intel

Message-ID: <a177ac69-552d-9cd1-7125-6cb92d07d604@linux.intel.com>
Date:   Wed, 4 Aug 2021 15:23:04 -0700
From:   "Kuppuswamy, Sathyanarayanan" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Sean Christopherson <seanjc@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Peter H Anvin <hpa@...or.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 11/12] x86/tdx: Don't write CSTAR MSR on Intel



On 8/4/21 2:48 PM, Dave Hansen wrote:
>> No, #GP is triggered by guest.
> ...
>>> Regardless of #GP versus #VE, "Table 16.2 MSR Virtualization" needs
>>> to state the actual behavior.
>> Even in this case, it will trigger #VE. But since CSTAR MSR is not
>> supported, write to it will fail and leads to #VE fault.
> Sathya, I think there might be a mixup of terminology here that's
> confusing.  I'm confused by this exchange.
> 
> In general, we refer to hardware exceptions by their architecture names:
> #GP for general protection fault, #PF for page fault, #VE for
> Virtualization Exception.
> 
> Those hardware exceptions are wired up to software handlers:
> #GP lands in asm_exc_general_protection
> #PF ends up in exc_page_fault
> #VE ends up in exc_virtualization_exception
> ... and more of course
> 
> But, to add to the confusion, the #VE handler
> (exc_virtualization_exception()) itself calls (or did once upon a time
> call) do_general_protection() when it can't handle something.
> do_general_protection() is (was?) *ALSO* called by the #GP handler.
> 
> So, is that what you meant?  By "#GP is triggered by guest", you mean
> that a write to the CSTAR MSR and the resulting #VE will end up being
> handled in a way that is similar to how a #GP hardware exception would
> have been handled?
> 
> If that's what you meant, I'm not _sure_ that's totally accurate.  Could
> you elaborate on this a bit?  It also would be really handy if you were
> able to adopt the terminology I talked about above.  It will really make
> things less confusing.


In a TDX guest, an MSR write triggers a #VE, which is handled by
exc_virtualization_exception()->tdg_handle_virtualization_exception().
Internally, this exception handler emulates the MSR write using a
hypercall. If the hypercall returns failure, it means we failed to
handle the #VE. In that case, exc_virtualization_exception() produces
#GP-like behavior using ve_raise_fault(), which is a customized version
of do_general_protection(). This is what I meant by "guest triggers
#GP(0)".
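
To make the control flow concrete, here is a rough userspace model of that
path. It is only a sketch and not kernel code: every model_* name is made up
for illustration, and the "host" is assumed to accept LSTAR writes and reject
CSTAR writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MSR indices; everything named model_* is hypothetical. */
#define MODEL_MSR_LSTAR 0xC0000082u
#define MODEL_MSR_CSTAR 0xC0000083u

enum model_outcome { MODEL_EMULATED, MODEL_FAULT_RAISED };

/* Stand-in for the information the guest fetches about a pending #VE. */
struct model_ve_info {
    uint32_t msr;
    uint64_t val;
};

/* Stand-in for the hypercall: 0 on success, non-zero on failure.
 * Assumption: the modeled host refuses CSTAR writes only. */
int model_hypercall_wrmsr(uint32_t msr, uint64_t val)
{
    (void)val;
    return msr == MODEL_MSR_CSTAR ? -1 : 0;
}

/* Plays the role of tdg_handle_virtualization_exception(). */
int model_handle_ve(const struct model_ve_info *ve)
{
    assert(ve != NULL);
    return model_hypercall_wrmsr(ve->msr, ve->val);
}

/* Plays the role of exc_virtualization_exception(): try to emulate,
 * and fall back to #GP-like handling if emulation fails. */
enum model_outcome model_exc_virtualization_exception(const struct model_ve_info *ve)
{
    if (model_handle_ve(ve))
        return MODEL_FAULT_RAISED; /* where ve_raise_fault(regs, 0) would run */
    return MODEL_EMULATED;
}

/* A guest MSR write: in the model it always enters through the #VE handler. */
enum model_outcome model_guest_wrmsr(uint32_t msr, uint64_t val)
{
    struct model_ve_info ve = { msr, val };
    return model_exc_virtualization_exception(&ve);
}
```

In this model, model_guest_wrmsr(MODEL_MSR_CSTAR, ...) takes the whole
#VE -> hypercall -> failure -> fault path, which is the round trip described
above.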

Since the CSTAR MSR is not supported/used on Intel platforms, instead of
going through this whole path before hitting the failure, we have
special-cased it so that the write is skipped before it is attempted.
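
As an illustration of skipping the write up front, a vendor check could be
modeled as below. This is a hedged sketch and not the patch itself: the
model_cpu structure, the vendor enum and model_wrmsrl_cstar() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor tags for the model. */
enum model_vendor { MODEL_VENDOR_INTEL, MODEL_VENDOR_AMD };

/* Hypothetical CPU state: tracks what actually reached the CSTAR MSR. */
struct model_cpu {
    enum model_vendor vendor;
    unsigned int cstar_writes;
    uint64_t cstar;
};

/* Write CSTAR unless the CPU is Intel. Returns true if the write was
 * performed, false if it was skipped. On the skipped path no MSR access
 * happens at all, so in a TDX guest no #VE would be taken for it. */
bool model_wrmsrl_cstar(struct model_cpu *cpu, uint64_t val)
{
    assert(cpu != NULL);

    if (cpu->vendor == MODEL_VENDOR_INTEL)
        return false;

    cpu->cstar = val;
    cpu->cstar_writes++;
    return true;
}
```

The point of checking before the write, rather than letting the #VE handler
fail, is that the guest never enters the exception path for an MSR it has no
use for.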

The implementation details are as follows:

static void ve_raise_fault(struct pt_regs *regs, long error_code)
{
         struct task_struct *tsk = current;

         if (user_mode(regs)) {
                 tsk->thread.error_code = error_code;
                 tsk->thread.trap_nr = X86_TRAP_VE;

                 /*
                  * Not fixing up VDSO exceptions similar to #GP handler
                  * because we don't expect the VDSO to trigger #VE.
                  */
                 show_signal(tsk, SIGSEGV, "", VEFSTR, regs, error_code);
                 force_sig(SIGSEGV);
                 return;
         }

         if (fixup_exception(regs, X86_TRAP_VE, error_code, 0))
                 return;

         tsk->thread.error_code = error_code;
         tsk->thread.trap_nr = X86_TRAP_VE;

         /*
          * To be potentially processing a kprobe fault and to trust the result
          * from kprobe_running(), we have to be non-preemptible.
          */
         if (!preemptible() &&
             kprobe_running() &&
             kprobe_fault_handler(regs, X86_TRAP_VE))
                 return;

         notify_die(DIE_GPF, VEFSTR, regs, error_code, X86_TRAP_VE, SIGSEGV);

         die_addr(VEFSTR, regs, error_code, 0);
}


DEFINE_IDTENTRY(exc_virtualization_exception)
{
         struct ve_info ve;
         int ret;

         RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");

         inc_irq_stat(tdg_ve_count);

         /*
          * NMIs/Machine-checks/Interrupts will be in a disabled state
          * till TDGETVEINFO TDCALL is executed. This prevents #VE
          * nesting issue.
          */
         ret = tdg_get_ve_info(&ve);

         cond_local_irq_enable(regs);

         if (!ret)
                 ret = tdg_handle_virtualization_exception(regs, &ve);
         /*
          * If tdg_handle_virtualization_exception() could not process
          * it successfully, treat it as #GP(0) and handle it.
          */
         if (ret)
                 ve_raise_fault(regs, 0);

         cond_local_irq_disable(regs);

}
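
For readers who want the decision order of ve_raise_fault() at a glance, it
can be summarized with the small userspace model below. This is only a
sketch: the model_* names are hypothetical, each boolean stands in for the
result of the corresponding kernel check, and MODEL_DIE stands for the final
notify_die() plus die_addr() step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_fault_result {
    MODEL_SIGSEGV_TO_TASK, /* user mode: force_sig(SIGSEGV) */
    MODEL_FIXED_UP,        /* kernel mode: fixup_exception() succeeded */
    MODEL_KPROBE_HANDLED,  /* kernel mode: kprobe fault handler took it */
    MODEL_DIE              /* kernel mode: notify_die() + die_addr() */
};

/* Each field models the outcome of one check made by ve_raise_fault(). */
struct model_fault_ctx {
    bool user_mode;
    bool has_fixup;
    bool preemptible;
    bool kprobe_running;
    bool kprobe_handles_fault;
};

enum model_fault_result model_ve_raise_fault(const struct model_fault_ctx *c)
{
    assert(c != NULL);

    if (c->user_mode)
        return MODEL_SIGSEGV_TO_TASK;

    if (c->has_fixup)
        return MODEL_FIXED_UP;

    /* The kprobe result is only trusted when non-preemptible. */
    if (!c->preemptible && c->kprobe_running && c->kprobe_handles_fault)
        return MODEL_KPROBE_HANDLED;

    return MODEL_DIE;
}
```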
-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer