Message-ID: <20230609101435.xmz3kgydseddrty7@box.shutemov.name>
Date:   Fri, 9 Jun 2023 13:14:35 +0300
From:   kirill.shutemov@...ux.intel.com
To:     Kai Huang <kai.huang@...el.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-mm@...ck.org, dave.hansen@...el.com, tony.luck@...el.com,
        peterz@...radead.org, tglx@...utronix.de, seanjc@...gle.com,
        pbonzini@...hat.com, david@...hat.com, dan.j.williams@...el.com,
        rafael.j.wysocki@...el.com, ying.huang@...el.com,
        reinette.chatre@...el.com, len.brown@...el.com, ak@...ux.intel.com,
        isaku.yamahata@...el.com, chao.gao@...el.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, bagasdotme@...il.com,
        sagis@...gle.com, imammedo@...hat.com
Subject: Re: [PATCH v11 17/20] x86/kexec: Flush cache of TDX private memory

On Mon, Jun 05, 2023 at 02:27:30AM +1200, Kai Huang wrote:
> There are two problems in terms of using kexec() to boot to a new kernel
> when the old kernel has enabled TDX: 1) Part of the memory pages are
> still TDX private pages; 2) There might be dirty cachelines associated
> with TDX private pages.
> 
> The first problem doesn't matter on platforms without the "partial write
> machine check" erratum.  KeyID 0 has no integrity check.  If the
> new kernel wants to use any non-zero KeyID, it needs to convert the
> memory to that KeyID, and such a conversion works from any KeyID.
> 
> However the old kernel needs to guarantee there's no dirty cacheline
> left behind before booting to the new kernel to avoid silent corruption
> from later cacheline writeback (Intel hardware doesn't guarantee cache
> coherency across different KeyIDs).
> 
> There are two things that the old kernel needs to do to achieve that:
> 
> 1) Stop accessing TDX private memory mappings:
>    a. Stop making TDX module SEAMCALLs (TDX global KeyID);
>    b. Stop TDX guests from running (per-guest TDX KeyID).
> 2) Flush any cachelines from previous TDX private KeyID writes.
> 
> For 2), use wbinvd() to flush cache in stop_this_cpu(), following SME
> support.  And in this way 1) happens for free as there's no TDX activity
> between wbinvd() and the native_halt().
> 
> Flushing cache in stop_this_cpu() only flushes cache on remote cpus.  On
> the cpu which does kexec(), unlike SME which does the cache flush in
> relocate_kernel(), do the cache flush right after stopping remote cpus
> in machine_shutdown().  This is because on platforms with the above
> erratum, the kernel needs to convert all TDX private pages back to
> normal before a fast warm reset reboot or before booting to the new
> kernel in kexec().  Flushing cache in relocate_kernel() only covers
> kexec(), not the fast warm reset reboot.
> 
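
For readers following the ordering argument above, here is a rough
user-space model of it, not the patch itself.  Every sim_* name is
made up for illustration, and the "flush" is just a flag standing in
for wbinvd(); the only point is that remote CPUs flush right before
they halt, and the kexec-ing CPU flushes after all remotes are stopped.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS 4	/* hypothetical CPU count for the model */

struct sim_cpu {
	bool cache_flushed;	/* stands in for wbinvd() having run */
	bool halted;		/* stands in for the native_halt() loop */
};

static struct sim_cpu sim_cpus[SIM_NR_CPUS];

static void sim_reset(void)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
		sim_cpus[cpu].cache_flushed = false;
		sim_cpus[cpu].halted = false;
	}
}

/* Models the remote-CPU path: flush, then halt, nothing in between. */
static void sim_stop_this_cpu(int cpu, bool tdx_enabled_by_bios)
{
	if (tdx_enabled_by_bios)
		sim_cpus[cpu].cache_flushed = true;
	sim_cpus[cpu].halted = true;
}

/* Models the CPU doing kexec(): stop remotes first, then flush locally. */
static void sim_machine_shutdown(int self, bool tdx_enabled_by_bios)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (cpu != self)
			sim_stop_this_cpu(cpu, tdx_enabled_by_bios);

	/* The local flush must only happen once every remote is stopped. */
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (cpu != self)
			assert(sim_cpus[cpu].halted);

	if (tdx_enabled_by_bios)
		sim_cpus[self].cache_flushed = true;
}

/* True when no CPU in the model could still hold dirty cachelines. */
static bool sim_all_caches_flushed(void)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (!sim_cpus[cpu].cache_flushed)
			return false;
	return true;
}
```

Calling sim_reset() and then sim_machine_shutdown(0, true) leaves every
modelled CPU flushed, with CPU 0 still running so it can go on to the
reboot or kexec step.
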
> Theoretically, a cache flush is only needed when the TDX module has been
> initialized.  However, the TDX module is initialized on demand at
> runtime, and reading the module status takes a mutex.  Instead, just
> check whether TDX is enabled by the BIOS to decide whether to flush.
> 
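
The trade-off in the paragraph above can be sketched as follows.  This
is only an illustration under assumed names: struct sim_tdx_state and
the sim_* helpers are hypothetical and do not correspond to kernel
symbols.  The BIOS flag is modelled as a plain boolean that is set once
and never changes, while the module status is the field that, in the
real kernel, would need a mutex to read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module states for the model. */
enum sim_tdx_status {
	SIM_TDX_UNKNOWN,	/* module not initialized yet */
	SIM_TDX_INITIALIZED,	/* module initialized on demand */
	SIM_TDX_ERROR,		/* initialization failed */
};

struct sim_tdx_state {
	bool enabled_by_bios;			/* fixed at boot, lock-free read */
	enum sim_tdx_status module_status;	/* mutex-protected in the real kernel */
};

/* The check the commit message settles on: BIOS enablement only. */
static bool sim_need_cache_flush(const struct sim_tdx_state *s)
{
	return s->enabled_by_bios;
}

/* The theoretically exact condition, which would need the module status. */
static bool sim_flush_strictly_required(const struct sim_tdx_state *s)
{
	return s->enabled_by_bios &&
	       s->module_status == SIM_TDX_INITIALIZED;
}

/* The cheap check may flush needlessly but never skips a required flush. */
static void sim_check_conservative(const struct sim_tdx_state *s)
{
	if (sim_flush_strictly_required(s))
		assert(sim_need_cache_flush(s));
}
```

In this model the cheap check can only err on the side of an extra
flush, which is harmless on the shutdown path.
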
> Signed-off-by: Kai Huang <kai.huang@...el.com>
> Reviewed-by: Isaku Yamahata <isaku.yamahata@...el.com>

Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
