lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 15 Sep 2023 11:42:52 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
CC:     "Raj, Ashok" <ashok.raj@...el.com>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "david@...hat.com" <david@...hat.com>,
        "bagasdotme@...il.com" <bagasdotme@...il.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "Yamahata, Isaku" <isaku.yamahata@...el.com>,
        "nik.borisov@...e.com" <nik.borisov@...e.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "Luck, Tony" <tony.luck@...el.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "Shahar, Sagi" <sagis@...gle.com>,
        "imammedo@...hat.com" <imammedo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>, "Gao, Chao" <chao.gao@...el.com>,
        "Brown, Len" <len.brown@...el.com>,
        "sathyanarayanan.kuppuswamy@...ux.intel.com" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v13 20/22] x86/kexec(): Reset TDX private memory on
 platforms with TDX erratum

On Thu, 2023-09-14 at 21:36 +0000, Edgecombe, Rick P wrote:
> On Sat, 2023-08-26 at 00:14 +1200, Kai Huang wrote:
> > diff --git a/arch/x86/kernel/machine_kexec_64.c
> > b/arch/x86/kernel/machine_kexec_64.c
> > index 1a3e2c05a8a5..03d9689ef808 100644
> > --- a/arch/x86/kernel/machine_kexec_64.c
> > +++ b/arch/x86/kernel/machine_kexec_64.c
> > @@ -28,6 +28,7 @@
> >  #include <asm/setup.h>
> >  #include <asm/set_memory.h>
> >  #include <asm/cpu.h>
> > +#include <asm/tdx.h>
> >  
> >  #ifdef CONFIG_ACPI
> >  /*
> > @@ -301,6 +302,14 @@ void machine_kexec(struct kimage *image)
> >         void *control_page;
> >         int save_ftrace_enabled;
> >  
> > +       /*
> > +        * For platforms with TDX "partial write machine check"
> > erratum,
> > +        * all TDX private pages need to be converted back to normal
> > +        * before booting to the new kernel, otherwise the new kernel
> > +        * may get unexpected machine check.
> > +        */
> > +       tdx_reset_memory();
> > +
> >  #ifdef CONFIG_KEXEC_JUMP
> >         if (image->preserve_context)
> >                 save_processor_state();
> 
> Without a ton of knowledge on TDX arch stuff, I'm mostly looked at the
> kexec flow with respect to anything that might be tinkering with the
> PAMT. Everything there looked good to me.
> 
> But I'm wondering if you want to skip the tdx_reset_memory() in the
> KEXEC_JUMP/preserve_context case. Somehow (I'm not clear on all the
> details), kexec can be configured to have the new kernel jump back to
> the old kernel and resume execution as if nothing happened. Then I
> think you would want to keep the TDX data around. Does that make any
> sense?
> 

Good point.  Thanks!

Based on my understanding, it should be OK to skip tdx_reset_memory() (or better
to) when preserve_context is on.  The second kernel shouldn't touch first
kernel's memory anyway otherwise it may corrupt the first kernel state (if it
does this maliciously or accidentally, then the first kernel isn't guaranteed to
work anyway).  

In fact, if we do tdx_reset_memory() when preserve_memory is on, we will need to
do additional things to mark TDX as dead otherwise after jumping back other
kernel code will still believe TDX is alive and continue to use TDX.

I'll do this if I don't hear objection from other people.  

Something like below?

diff --git a/arch/x86/kernel/machine_kexec_64.c
b/arch/x86/kernel/machine_kexec_64.c
index 03d9689ef808..73ed01360408 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -307,12 +307,18 @@ void machine_kexec(struct kimage *image)
         * all TDX private pages need to be converted back to normal
         * before booting to the new kernel, otherwise the new kernel
         * may get unexpected machine check.
+        *
+        * But skip this when preserve_context is on.  The second kernel
+        * shouldn't touch the first kernel's memory anyway.  Skipping
+        * this also avoids killing TDX in the first kernel, which would
+        * require more complicated handling.
         */
-       tdx_reset_memory();
-
 #ifdef CONFIG_KEXEC_JUMP
        if (image->preserve_context)
                save_processor_state();
+       else
+#else
+       tdx_reset_memory();
 #endif


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ