linux-kernel - Re: [PATCH v3 04/21] x86/virt/tdx: Add skeleton for detecting and initializing TDX on demand

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3731a852-71b8-b081-2426-3b0a650e174c@intel.com>
Date:   Thu, 28 Apr 2022 07:27:01 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Kai Huang <kai.huang@...el.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     seanjc@...gle.com, pbonzini@...hat.com, len.brown@...el.com,
        tony.luck@...el.com, rafael.j.wysocki@...el.com,
        reinette.chatre@...el.com, dan.j.williams@...el.com,
        peterz@...radead.org, ak@...ux.intel.com,
        kirill.shutemov@...ux.intel.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com,
        isaku.yamahata@...el.com
Subject: Re: [PATCH v3 04/21] x86/virt/tdx: Add skeleton for detecting and
 initializing TDX on demand

On 4/27/22 17:00, Kai Huang wrote:
> On Wed, 2022-04-27 at 07:49 -0700, Dave Hansen wrote:
> I think we can use pr_info_once() when all_cpus_booted() returns false, and get
> rid of printing "SEAMRR not enabled" in seamrr_enabled().  How about below?
> 
> static bool seamrr_enabled(void)
> {
> 	if (!all_cpus_booted())
> 		pr_info_once("Not all present CPUs have been booted.  Report
> SEAMRR as not enabled");
> 
> 	return __seamrr_enabled();
> }
> 
> And we don't print "SEAMRR not enabled".

That's better, but even better than that would be removing all that
SEAMRR gunk in the first place.

>>>>> +	/*
>>>>> +	 * TDX requires at least two KeyIDs: one global KeyID to
>>>>> +	 * protect the metadata of the TDX module and one or more
>>>>> +	 * KeyIDs to run TD guests.
>>>>> +	 */
>>>>> +	return tdx_keyid_num >= 2;
>>>>> +}
>>>>> +
>>>>> +static int __tdx_detect(void)
>>>>> +{
>>>>> +	/* The TDX module is not loaded if SEAMRR is disabled */
>>>>> +	if (!seamrr_enabled()) {
>>>>> +		pr_info("SEAMRR not enabled.\n");
>>>>> +		goto no_tdx_module;
>>>>> +	}
>>>>
>>>> Why even bother with the SEAMRR stuff?  It sounded like you can "ping"
>>>> the module with SEAMCALL.  Why not just use that directly?
>>>
>>> SEAMCALL will cause #GP if SEAMRR is not enabled.  We should check whether
>>> SEAMRR is enabled before making SEAMCALL.
>>
>> So...  You could actually get rid of all this code.  if SEAMCALL #GP's,
>> then you say, "Whoops, the firmware didn't load the TDX module
>> correctly, sorry."
> 
> Yes we can just use the first SEAMCALL (TDH.SYS.INIT) to detect whether TDX
> module is loaded.  If SEAMCALL is successful, the module is loaded.
> 
> One problem is currently the patch to flush cache for kexec() uses
> seamrr_enabled() and tdx_keyid_sufficient() to determine whether we need to
> flush the cache.  The reason is, similar to SME, the flush is done in
> stop_this_cpu(), but the status of TDX module initialization is protected by
> mutex, so we cannot use TDX module status in stop_this_cpu() to determine
> whether to flush.
> 
> If that patch makes sense, I think we still need to detect SEAMRR?

Please go look at stop_this_cpu() closely.  What are the AMD folks doing
for SME exactly?  Do they, for instance, do the WBINVD when the kernel
used SME?  No, they just use a pretty low-level check if the processor
supports SME.

Doing the same kind of thing for TDX is fine.  You could check the MTRR
MSR bits that tell you if SEAMRR is supported and then read the MSR
directly.  You could check the CPUID enumeration for MKTME or
CPUID.B.0.EDX (I'm not even sure what this is but the SEAMCALL spec says
it is part of SEAMCALL operation).

Just like the SME test, it doesn't even need to be precise.  It just
needs to be 100% accurate in that it is *ALWAYS* set for any system that
might have dirtied cache aliases.

I'm not sure why you are so fixated on SEAMRR specifically for this.


...
> "During initializing the TDX module, one step requires some SEAMCALL must be
> done on all logical cpus enabled by BIOS, otherwise a later step will fail. 
> Disable CPU hotplug during the initialization process to prevent any CPU going
> offline during initializing the TDX module.  Note it is caller's responsibility
> to guarantee all BIOS-enabled CPUs are in cpu_present_mask and all present CPUs
> are online."

But, what if a CPU went offline just before this lock was taken?  What
if the caller make sure all present CPUs are online, makes the call,
then a CPU is taken offline.  The lock wouldn't do any good.

What purpose does the lock serve?