Message-ID: <4c7effc3abe71aa1cbee41f3bd46b97aed40be26.camel@intel.com>
Date:   Mon, 12 Jun 2023 10:27:44 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
CC:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "david@...hat.com" <david@...hat.com>,
        "bagasdotme@...il.com" <bagasdotme@...il.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "Yamahata, Isaku" <isaku.yamahata@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Shahar, Sagi" <sagis@...gle.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "imammedo@...hat.com" <imammedo@...hat.com>,
        "Gao, Chao" <chao.gao@...el.com>,
        "Brown, Len" <len.brown@...el.com>,
        "sathyanarayanan.kuppuswamy@...ux.intel.com" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>
Subject: Re: [PATCH v11 18/20] x86: Handle TDX erratum to reset TDX private
 memory during kexec() and reboot

On Mon, 2023-06-12 at 10:58 +0300, kirill.shutemov@...ux.intel.com wrote:
> On Mon, Jun 12, 2023 at 03:06:48AM +0000, Huang, Kai wrote:
> > On Fri, 2023-06-09 at 16:23 +0300, kirill.shutemov@...ux.intel.com wrote:
> > > On Mon, Jun 05, 2023 at 02:27:31AM +1200, Kai Huang wrote:
> > > > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > > > index 8ff07256a515..0aa413b712e8 100644
> > > > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > > > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > > > @@ -587,6 +587,14 @@ static int tdmr_set_up_pamt(struct tdmr_info *tdmr,
> > > >  		tdmr_pamt_base += pamt_size[pgsz];
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * tdx_memory_shutdown() also reads TDMR's PAMT during
> > > > +	 * kexec() or reboot, which could happen at anytime, even
> > > > +	 * during this particular code.  Make sure pamt_4k_base
> > > > +	 * is firstly set otherwise tdx_memory_shutdown() may
> > > > +	 * get an invalid PAMT base when it sees a valid number
> > > > +	 * of PAMT pages.
> > > > +	 */
> > > 
> > > Hmm? What prevents compiler from messing this up. It can reorder as it
> > > wishes, no?
> > 
> > Hmm.. Right. Sorry I missed.
> > 
> > > 
> > > Maybe add a proper locking? Anything that prevent preemption would do,
> > > right?
> > > 
> > > >  	tdmr->pamt_4k_base = pamt_base[TDX_PS_4K];
> > > >  	tdmr->pamt_4k_size = pamt_size[TDX_PS_4K];
> > > >  	tdmr->pamt_2m_base = pamt_base[TDX_PS_2M];
> > > 
> > 
> > I think a simple memory barrier will do.  How does below look?
> > 
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -591,11 +591,12 @@ static int tdmr_set_up_pamt(struct tdmr_info *tdmr,
> >          * tdx_memory_shutdown() also reads TDMR's PAMT during
> >          * kexec() or reboot, which could happen at anytime, even
> >          * during this particular code.  Make sure pamt_4k_base
> > -        * is firstly set otherwise tdx_memory_shutdown() may
> > -        * get an invalid PAMT base when it sees a valid number
> > -        * of PAMT pages.
> > +        * is firstly set and place a __mb() after it otherwise
> > +        * tdx_memory_shutdown() may get an invalid PAMT base
> > +        * when it sees a valid number of PAMT pages.
> >          */
> >         tdmr->pamt_4k_base = pamt_base[TDX_PS_4K];
> > +       __mb();
> 
> If you want to play with barriers, assign pamt_4k_base the last with
> smp_store_release() and read it first in tdmr_get_pamt() with
> smp_load_acquire(). If it is non-zero, all pamt_* fields are valid.
> 
> Or just drop this non-sense and use a spin lock for serialization.
> 

We don't need to guarantee that all other pamt_* fields are valid whenever
pamt_4k_base is valid.  Instead, we need to guarantee that whenever (at least)
_one_ of pamt_*_size is valid, pamt_4k_base is valid too.

For example,

	pamt_4k_base  	-> valid
	pamt_4k_size	-> invalid (0)
	pamt_2m_size	-> invalid
	pamt_1g_size	-> invalid

and

	pamt_4k_base	-> valid
	pamt_4k_size	-> valid
	pamt_2m_size	-> invalid
	pamt_1g_size	-> invalid

are both OK.

The reason is that the PAMTs are only written by the TDX module in init_tdmrs().
So if tdx_memory_shutdown() sees only part of the PAMT (the second case above),
those PAMT pages are not yet TDX private pages, and converting just that part of
the PAMT is fine.

The invalid case is when any pamt_*_size is valid but pamt_4k_base is invalid,
e.g.:

	pamt_4k_base	-> invalid
	pamt_4k_size	-> valid
	pamt_2m_size	-> invalid
	pamt_1g_size	-> invalid

as in this case tdx_memory_shutdown() would convert an incorrect (not merely
partial) PAMT area.
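
To make the three cases above concrete, here is a minimal user-space sketch of
the condition being argued for.  The struct and the helper name
pamt_snapshot_ok() are hypothetical, made up for illustration only; they are
not part of the patch.  Treating 0 as "invalid" follows the "(0)" annotation
above and is otherwise an assumption.

```c
#include <stdbool.h>

/*
 * Hypothetical snapshot of the fields a concurrent reader such as
 * tdx_memory_shutdown() might observe.  Assumption: 0 means "not yet set".
 */
struct pamt_snapshot {
	unsigned long pamt_4k_base;
	unsigned long pamt_4k_size;
	unsigned long pamt_2m_size;
	unsigned long pamt_1g_size;
};

/*
 * A snapshot is acceptable if no size has been set yet, or if the base
 * is already set.  A size without a base is the only bad combination.
 */
static bool pamt_snapshot_ok(const struct pamt_snapshot *s)
{
	bool any_size = s->pamt_4k_size || s->pamt_2m_size || s->pamt_1g_size;

	return !any_size || s->pamt_4k_base != 0;
}
```

With this helper, the first two cases above (base set, with or without
pamt_4k_size) pass the check, and the third (size set, base unset) fails it.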

So I think a __mb() after setting tdmr->pamt_4k_base should be good enough, as
it guarantees that by the time any pamt_*_size is set, the valid pamt_4k_base
is already visible to other CPUs.
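
As a rough user-space analogue of that ordering (not kernel code), the idea
could be sketched with C11 atomics.  Here atomic_thread_fence() with
memory_order_seq_cst stands in for __mb(), and struct tdmr_sketch and both
function names are hypothetical.  The fence on the reader side is an
assumption of this sketch: the C11 model needs it for the guarantee to hold,
and nothing above says what the real reader does.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the PAMT fields of struct tdmr_info. */
struct tdmr_sketch {
	_Atomic uint64_t pamt_4k_base;
	_Atomic uint64_t pamt_4k_size;
};

/* Writer: publish the base first, then a full fence, then the size. */
static void sketch_set_up_pamt(struct tdmr_sketch *t,
			       uint64_t base, uint64_t size)
{
	atomic_store_explicit(&t->pamt_4k_base, base, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for __mb() */
	atomic_store_explicit(&t->pamt_4k_size, size, memory_order_relaxed);
}

/*
 * Reader: look at the size first.  If it is non-zero, fence and then read
 * the base, which the writer stored before its own fence.
 * Returns 1 and fills *base / *size if a PAMT was seen, 0 otherwise.
 */
static int sketch_get_pamt(struct tdmr_sketch *t,
			   uint64_t *base, uint64_t *size)
{
	uint64_t sz = atomic_load_explicit(&t->pamt_4k_size,
					   memory_order_relaxed);

	if (!sz)
		return 0;

	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the writer */
	*base = atomic_load_explicit(&t->pamt_4k_base, memory_order_relaxed);
	*size = sz;
	return 1;
}
```

The point of the sketch is only the order: a reader that observes a non-zero
size can never observe the base from before the writer's fence.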

Does it make sense?
