lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d3236016c46da2cbdf314839255e8806ae23f228.camel@intel.com>
Date:   Wed, 03 Aug 2022 13:30:09 +1200
From:   Kai Huang <kai.huang@...el.com>
To:     Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Cc:     seanjc@...gle.com, pbonzini@...hat.com, len.brown@...el.com,
        tony.luck@...el.com, rafael.j.wysocki@...el.com,
        reinette.chatre@...el.com, dan.j.williams@...el.com,
        peterz@...radead.org, ak@...ux.intel.com,
        kirill.shutemov@...ux.intel.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com,
        isaku.yamahata@...el.com
Subject: Re: [PATCH v5 12/22] x86/virt/tdx: Convert all memory regions in
 memblock to TDX memory

On Fri, 2022-07-08 at 11:34 +1200, Kai Huang wrote:
> > Why not just entirely remove the lower 1MB from the memblock structure
> > on TDX systems?  Do something equivalent to adding this on the kernel
> > command line:
> > 
> >  	memmap=1M$0x0
> 
> I will explore this option.  Thanks!

Hi Dave,

After investigating and testing, we cannot simply remove first 1MB from e820
table which is similar to what 'memmap=1M$0x0' does, as the kernel needs low
memory as trampoline to bring up all APs.

Currently I am doing below:

--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -65,6 +65,17 @@ void __init reserve_real_mode(void)
         * setup_arch().
         */
        memblock_reserve(0, SZ_1M);
+
+       /*
+        * As one step of initializing the TDX module (on-demand), the
+        * kernel will later verify all memory regions in memblock are
+        * truly TDX-capable and convert all of them to TDX memory.
+        * The first 1MB may not be enumerated as TDX-capable memory.
+        * To avoid failure to verify, explicitly remove the first 1MB
+        * from memblock for a TDX (BIOS) enabled system.
+        */
+       if (platform_tdx_enabled())
+               memblock_remove(0, SZ_1M);

I tested an it worked (I didn't observe any problem), but am I missing
something?

Also, regarding to whether we can remove platform_tdx_enabled() at all, I looked
into the spec again and there's no MSR or CPUID from which we can check TDX is
enabled by BIOS -- except checking the SEAMRR_MASK MSR, which is basically
platform_tdx_enabled() also did.

Checking MSR_MTRRcap.SEAMRR bit isn't enough as it will be true as long as the
hardware supports SEAMRR, but it doesn't tell whether SEAMRR(TDX) is enabled by
BIOS.

So if above code is reasonable, I think we can still detect TDX during boot and
keep platform_tdx_enabled().  

It also detects TDX KeyIDs, which isn't necessary for removing the first 1MB
here (nor for kexec() support), but detecting TDX KeyIDs must be done anyway
either during kernel boot or during initializing TDX module.

Detecting TDX KeyID at boot time also has an advantage that in the future we can
expose KeyIDs via /sysfs and userspace can know how many TDs the machine can
support w/o having to initializing the  TDX module first (we received such
requirement from customer but yes it is arguable).

Any comments?

-- 
Thanks,
-Kai


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ