Message-ID: <b7c07ef95669260be75a25307da00cee3b3ff9b1.camel@intel.com>
Date:   Mon, 19 Jun 2023 23:28:37 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "david@...hat.com" <david@...hat.com>
CC:     "Hansen, Dave" <dave.hansen@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        "bagasdotme@...il.com" <bagasdotme@...il.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Yamahata, Isaku" <isaku.yamahata@...el.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "Shahar, Sagi" <sagis@...gle.com>,
        "imammedo@...hat.com" <imammedo@...hat.com>,
        "Gao, Chao" <chao.gao@...el.com>,
        "Brown, Len" <len.brown@...el.com>,
        "sathyanarayanan.kuppuswamy@...ux.intel.com" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>
Subject: Re: [PATCH v11 07/20] x86/virt/tdx: Add skeleton to enable TDX on
 demand

On Mon, 2023-06-19 at 15:16 +0200, David Hildenbrand wrote:
> On 04.06.23 16:27, Kai Huang wrote:
> > To enable TDX the kernel needs to initialize TDX from two perspectives:
> > 1) Do a set of SEAMCALLs to initialize the TDX module to make it ready
> > to create and run TDX guests; 2) Do the per-cpu initialization SEAMCALL
> > on one logical cpu before the kernel wants to make any other SEAMCALLs
> > on that cpu (including those involved during module initialization and
> > running TDX guests).
> > 
> > The TDX module can be initialized only once in its lifetime.  Instead
> > of always initializing it at boot time, this implementation chooses an
> > "on demand" approach that defers initializing TDX until there is a real
> > need (e.g. when requested by KVM).  This approach has the following
> > advantages:
> > 
> > 1) It avoids consuming the memory that must be allocated by kernel and
> > given to the TDX module as metadata (~1/256th of the TDX-usable memory),
> > and also saves the CPU cycles of initializing the TDX module (and the
> > metadata) when TDX is not used at all.
> > 
> > 2) The TDX module design allows it to be updated while the system is
> > running.  The update procedure shares quite a few steps with this "on
> > demand" initialization mechanism.  The hope is that much of the "on
> > demand" mechanism can be shared with a future "update" mechanism.  A boot-time
> > TDX module implementation would not be able to share much code with the
> > update mechanism.
> > 
> > 3) Making SEAMCALL requires VMX to be enabled.  Currently, only the KVM
> > code mucks with VMX enabling.  If the TDX module were to be initialized
> > separately from KVM (like at boot), the boot code would need to be
> > taught how to muck with VMX enabling and KVM would need to be taught how
> > to cope with that.  Making KVM itself responsible for TDX initialization
> > lets the rest of the kernel stay blissfully unaware of VMX.
> > 
> > Similar to module initialization, also make the per-cpu initialization
> > "on demand" as it also depends on VMX being enabled.
> > 
> > Add two functions, tdx_enable() and tdx_cpu_enable(), to enable the TDX
> > module and enable TDX on local cpu respectively.  For now tdx_enable()
> > is a placeholder.  The TODO list will be pared down as functionality is
> > added.
> > 
> > In tdx_enable() use a state machine protected by mutex to make sure the
> > initialization will only be done once, as tdx_enable() can be called
> > multiple times (e.g. the KVM module can be reloaded) and may be called
> > concurrently by other kernel components in the future.
> > 
> > The per-cpu initialization on each cpu can only be done once during the
> > module's lifetime.  Use a per-cpu variable to track its status to make
> > sure it is only done once in tdx_cpu_enable().
> > 
> > Also, a SEAMCALL to do TDX module global initialization must be done
> > once on any logical cpu before any per-cpu initialization SEAMCALL.  Do
> > it inside tdx_cpu_enable() too (if it hasn't been done).
> > 
> > tdx_enable() can potentially invoke SEAMCALLs on any online cpus.  The
> > per-cpu initialization must be done before those SEAMCALLs are invoked
> > on some cpu.  To keep things simple, in tdx_cpu_enable(), always do the
> > per-cpu initialization regardless of whether the TDX module has been
> > initialized or not.  And in tdx_enable(), don't call tdx_cpu_enable()
> > but assume the caller has disabled CPU hotplug, done VMXON and
> > tdx_cpu_enable() on all online cpus before calling tdx_enable().
> > 
> > Signed-off-by: Kai Huang <kai.huang@...el.com>
> > Reviewed-by: Isaku Yamahata <isaku.yamahata@...el.com>
> > ---
> > 
> > v10 -> v11:
> >   - Return -ENODEV instead of -EINVAL when CONFIG_INTEL_TDX_HOST is off.
> >   - Return the actual error code for tdx_enable() instead of -EINVAL.
> >   - Added Isaku's Reviewed-by.
> > 
> > v9 -> v10:
> >   - Merged the patch to handle per-cpu initialization to this patch to
> >     tell the story better.
> >   - Changed how to handle the per-cpu initialization to only provide a
> >     tdx_cpu_enable() function to let the user of TDX do it when the
> >     user wants to run TDX code on a certain cpu.
> >   - Changed tdx_enable() to not call cpus_read_lock() explicitly, but
> >     call lockdep_assert_cpus_held() to assume the caller has done that.
> >   - Improved comments around tdx_enable() and tdx_cpu_enable().
> >   - Improved changelog to tell the story better accordingly.
> > 
> > v8 -> v9:
> >   - Removed detailed TODO list in the changelog (Dave).
> >   - Added back steps to do module global initialization and per-cpu
> >     initialization in the TODO list comment.
> >   - Moved the 'enum tdx_module_status_t' from tdx.c to local tdx.h
> > 
> > v7 -> v8:
> >   - Refined changelog (Dave).
> >   - Removed "all BIOS-enabled cpus" related code (Peter/Thomas/Dave).
> >   - Add a "TODO list" comment in init_tdx_module() to list all steps of
> >     initializing the TDX Module to tell the story (Dave).
> >   - Made tdx_enable() universally return -EINVAL, and removed nonsense
> >     comments (Dave).
> >   - Simplified __tdx_enable() to only handle success or failure.
> >   - TDX_MODULE_SHUTDOWN -> TDX_MODULE_ERROR
> >   - Removed TDX_MODULE_NONE (not loaded) as it is not necessary.
> >   - Improved comments (Dave).
> >   - Pointed out 'tdx_module_status' is a software thing (Dave).
> > 
> > v6 -> v7:
> >   - No change.
> > 
> > v5 -> v6:
> >   - Added code to set status to TDX_MODULE_NONE if TDX module is not
> >     loaded (Chao)
> >   - Added Chao's Reviewed-by.
> >   - Improved comments around cpus_read_lock().
> > 
> > v3 -> v5 (no feedback on v4):
> >   - Removed the check that SEAMRR and TDX KeyID have been detected on
> >     all present cpus.
> >   - Removed tdx_detect().
> >   - Added num_online_cpus() to MADT-enabled CPUs check within the CPU
> >     hotplug lock and return early with error message.
> >   - Improved dmesg printing for TDX module detection and initialization.
> > 
> > 
> > ---
> >   arch/x86/include/asm/tdx.h  |   4 +
> >   arch/x86/virt/vmx/tdx/tdx.c | 179 ++++++++++++++++++++++++++++++++++++
> >   arch/x86/virt/vmx/tdx/tdx.h |  13 +++
> >   3 files changed, 196 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> > index b489b5b9de5d..03f74851608f 100644
> > --- a/arch/x86/include/asm/tdx.h
> > +++ b/arch/x86/include/asm/tdx.h
> > @@ -102,8 +102,12 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
> >   
> >   #ifdef CONFIG_INTEL_TDX_HOST
> >   bool platform_tdx_enabled(void);
> > +int tdx_cpu_enable(void);
> > +int tdx_enable(void);
> >   #else	/* !CONFIG_INTEL_TDX_HOST */
> >   static inline bool platform_tdx_enabled(void) { return false; }
> > +static inline int tdx_cpu_enable(void) { return -ENODEV; }
> > +static inline int tdx_enable(void)  { return -ENODEV; }
> >   #endif	/* CONFIG_INTEL_TDX_HOST */
> >   
> >   #endif /* !__ASSEMBLY__ */
> > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > index e62e978eba1b..bcf2b2d15a2e 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -13,6 +13,10 @@
> >   #include <linux/errno.h>
> >   #include <linux/printk.h>
> >   #include <linux/smp.h>
> > +#include <linux/cpu.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/percpu-defs.h>
> > +#include <linux/mutex.h>
> >   #include <asm/msr-index.h>
> >   #include <asm/msr.h>
> >   #include <asm/archrandom.h>
> > @@ -23,6 +27,18 @@ static u32 tdx_global_keyid __ro_after_init;
> >   static u32 tdx_guest_keyid_start __ro_after_init;
> >   static u32 tdx_nr_guest_keyids __ro_after_init;
> >   
> > +static unsigned int tdx_global_init_status;
> > +static DEFINE_RAW_SPINLOCK(tdx_global_init_lock);
> > +#define TDX_GLOBAL_INIT_DONE	_BITUL(0)
> > +#define TDX_GLOBAL_INIT_FAILED	_BITUL(1)
> > +
> > +static DEFINE_PER_CPU(unsigned int, tdx_lp_init_status);
> > +#define TDX_LP_INIT_DONE	_BITUL(0)
> > +#define TDX_LP_INIT_FAILED	_BITUL(1)
> 
> I'm curious, why do we have to track three states: uninitialized 
> (!done), initialized (done + !failed), permanent error (done + failed)?
> 
> [besides: why can't you use an enum and share that between global and pcpu?]
> 
> Why can't you have a pcpu "bool tdx_lp_initialized" and "bool 
> tdx_global_initialized"?
> 
> I mean, if there was an error during previous initialization, it's not 
> initialized: you'd try initializing again -- and possibly fail again -- 
> on the next attempt. I doubt that a "try to cache failed status to keep 
> failing fast" is really required.
> 
> Is there any other reason (e.g., second init attempt would set your 
> computer on fire) why it can't be simpler?

No other reason beyond the one you mentioned above: I didn't want to
retry in case of a permanent error.

Yes, I agree we can have a pcpu "bool tdx_lp_initialized" and a "bool
tdx_global_initialized" to simplify the logic.
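
A rough userspace sketch of what the two-bool version could look like is
below.  It is not the actual patch: the array stands in for a per-cpu
variable, the stub_*() helpers are hypothetical stand-ins for the global
and per-cpu initialization SEAMCALLs, SKETCH_NR_CPUS and the
tdx_cpu_enable_sketch() name are made up for illustration, and the
locking around the global init that the patch does with
tdx_global_init_lock is left out.  The point is only that a failed
attempt leaves the flag clear, so the next call simply retries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* The two flags discussed above; the array stands in for a per-cpu variable. */
static bool tdx_global_initialized;
static bool tdx_lp_initialized[SKETCH_NR_CPUS];

/* Hypothetical stand-ins for the two initialization SEAMCALLs. */
static int nr_global_calls, nr_lp_calls;
static bool fail_next_lp_init;	/* hook: make the next per-cpu init fail */

static int stub_global_init(void)
{
	nr_global_calls++;
	return 0;
}

static int stub_lp_init(int cpu)
{
	(void)cpu;
	nr_lp_calls++;
	if (fail_next_lp_init) {
		fail_next_lp_init = false;
		return -EIO;
	}
	return 0;
}

/* Module global init: done at most once successfully, retried on failure. */
static int try_init_module_global(void)
{
	int ret;

	if (tdx_global_initialized)
		return 0;

	ret = stub_global_init();
	if (!ret)
		tdx_global_initialized = true;

	return ret;
}

/* Per-cpu init: no cached "failed" state, only "initialized or not". */
static int tdx_cpu_enable_sketch(int cpu)
{
	int ret;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	if (tdx_lp_initialized[cpu])
		return 0;

	/* The global init must succeed before any per-cpu init. */
	ret = try_init_module_global();
	if (ret)
		return ret;

	ret = stub_lp_init(cpu);
	if (ret)
		return ret;	/* flag stays false, so a later call retries */

	tdx_lp_initialized[cpu] = true;
	return 0;
}
```

Compared with the DONE/FAILED bit pair, the trade-off is that a
permanent error is no longer remembered, so each later call repeats the
failing init attempt instead of failing fast.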

Thanks!
