lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 21 May 2021 11:31:29 -0700
From:   "Kuppuswamy, Sathyanarayanan" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>
Cc:     Tony Luck <tony.luck@...el.com>, Andi Kleen <ak@...ux.intel.com>,
        Kirill Shutemov <kirill.shutemov@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan <knsathya@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Raj Ashok <ashok.raj@...el.com>,
        Sean Christopherson <seanjc@...gle.com>,
        linux-kernel@...r.kernel.org,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [RFC v2-fix-v2 1/1] x86/boot: Avoid #VE during boot for TDX
 platforms



On 5/21/21 9:11 AM, Dave Hansen wrote:
>> Avoid operations which will inject #VE during boot process.
>> They're easy to avoid and it is less complex than handling
>> the exceptions.
> 
> This puts the solution before the problem.  I'd also make sure to
> clearly connect this solution to the problem.  For instance, if you
> refer to register "modification", ensure that you reflect that language
> here.  Don't call them "modifications" in one part of the changelog and
> "operations" here.  I'd also qualify them as "superfluous".
> 
> Please reorder this in the following form:
> 
> 1. Background
> 2. Problem
> 3. Solution
> 
> Please do this for all of your patches.
> 
>> There are a few MSRs and control register bits which the
>> kernel normally needs to modify during boot.  But, TDX
>> disallows modification of these registers to help provide
>> consistent security guarantees ( and avoid generating #VE
>> when updating them).
> 
> No, the TDX architecture does not avoid generating #VE.  The *kernel*
> does that.  This sentence conflates those two things.
> 
>> Fortunately, TDX ensures that these are
>> all in the correct state before the kernel loads, which means
>> the kernel has no need to modify them.
>>
>> The conditions we need to avoid are:
>>
>>    * Any writes to the EFER MSR
>>    * Clearing CR0.NE
>>    * Clearing CR3.MCE
> 
> Sathya, there have been repeated issues in your changelogs with "we's".
>   Remember, speak in imperative voice.  Please fix this in your tooling
> to find these so that reviewers don't have to.

How about the following commit log?

In TDX guests, Virtualization Exceptions (#VE) are delivered
to TDX guests due to specific guest actions like MSR writes,
CPUID leaf accesses or I/O access. But in early boot code, #VE
cannot be allowed because the required exception handler setup
support code is missing. If #VE is triggered without proper
handler support, it would lead to triple fault or kernel hang.
So, avoid operations which will inject #VE during boot process.
They're easy to avoid and it is less complex than handling the
exceptions.

There are a few MSRs and control register bits which the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent
security guarantees. Fortunately, TDX ensures that these are all
in the correct state before the kernel loads, which means the
kernel has no need to modify them.

The conditions to avoid are:

   * Any writes to the EFER MSR
   * Clearing CR0.NE
   * Clearing CR3.MCE

If above conditions are not avoided, it would lead to triple
fault or kernel hang.

> 
>> +	/*
>> +	 * Preserve current value of EFER for comparison and to skip
>> +	 * EFER writes if no change was made (for TDX guest)
>> +	 */
>> +	movl    %eax, %edx
>>   	btsl	$_EFER_SCE, %eax	/* Enable System Call */
>>   	btl	$20,%edi		/* No Execute supported? */
>>   	jnc     1f
>>   	btsl	$_EFER_NX, %eax
>>   	btsq	$_PAGE_BIT_NX,early_pmd_flags(%rip)
>> -1:	wrmsr				/* Make changes effective */
>>   
>> +	/* Avoid writing EFER if no change was made (for TDX guest) */
>> +1:	cmpl	%edx, %eax
>> +	je	1f
>> +	xor	%edx, %edx
>> +	wrmsr				/* Make changes effective */
>> +1:
> 
> Just curious, but what if this goes wrong?  Say the TDX firmware didn't
> set up EFER correctly and this code does the WRMSR.  What ends up
> happening? 

It would lead to triple fault.

  Do we get anything out on the console, or is it essentially
> undebuggable?
> 

We can still get logs with debug TDX module. So it is still debugable.

>>
>> +	/*
>> +	 * Skip writing to EFER if the register already has desiered
>> +	 * value (to avoid #VE for TDX guest).
>> +	 */
> 
> 
> 							spelling ^
> 
> There are lots of editors that can do spell checking, even in C
> comments.  You might want to look into that for your editor.
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ