[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <be85aa450b221bb730013d3b6ec0d4e71b51c228.camel@intel.com>
Date: Thu, 12 Jan 2023 11:33:06 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Hansen, Dave" <dave.hansen@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: "Luck, Tony" <tony.luck@...el.com>,
"bagasdotme@...il.com" <bagasdotme@...il.com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"Christopherson,, Sean" <seanjc@...gle.com>,
"Chatre, Reinette" <reinette.chatre@...el.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"Shahar, Sagi" <sagis@...gle.com>,
"imammedo@...hat.com" <imammedo@...hat.com>,
"Gao, Chao" <chao.gao@...el.com>,
"Brown, Len" <len.brown@...el.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"sathyanarayanan.kuppuswamy@...ux.intel.com"
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
"Huang, Ying" <ying.huang@...el.com>,
"Williams, Dan J" <dan.j.williams@...el.com>
Subject: Re: [PATCH v8 07/16] x86/virt/tdx: Use all system memory when
initializing TDX module as TDX memory
On Mon, 2023-01-09 at 08:51 -0800, Dave Hansen wrote:
> > > > > > + list_for_each_entry(tmb, &tdx_memlist, list) {
> > > > > > + /*
> > > > > > + * The new range is TDX memory if it is fully
> > > > > > covered by
> > > > > > + * any TDX memory block.
> > > > > > + *
> > > > > > + * Note TDX memory blocks are originated from
> > > > > > memblock
> > > > > > + * memory regions, which can only be contiguous when
> > > > > > two
> > > > > > + * regions have different NUMA nodes or flags.
> > > > > > Therefore
> > > > > > + * the new range cannot cross multiple TDX memory
> > > > > > blocks.
> > > > > > + */
> > > > > > + if (start_pfn >= tmb->start_pfn && end_pfn <= tmb-
> > > > > > >end_pfn)
> > > > > > + return true;
> > > > > > + }
> > > > > > + return false;
> > > > > > +}
> > > >
> > > > I don't really like that comment. It should first state its behavior
> > > > and assumptions, like:
> > > >
> > > > This check assumes that the start_pfn<->end_pfn range does not
> > > > cross multiple tdx_memlist entries.
> > > >
> > > > Only then should it describe why that is OK:
> > > >
> > > > A single memory hotplug even across mutliple memblocks (from
> > > > which tdx_memlist entries are derived) is impossible. ... then
> > > > actually explain
> > > >
> >
> > How about below?
> >
> > /*
> > * This check assumes that the start_pfn<->end_pfn range does not
> > cross
> > * multiple tdx_memlist entries. A single memory hotplug event
> > across
> > * multiple memblocks (from which tdx_memlist entries are derived)
> > is
> > * impossible. That means start_pfn<->end_pfn range cannot exceed a
> > * tdx_memlist entry, and the new range is TDX memory if it is
> > fully
> > * covered by any tdx_memlist entry.
> > */
>
> I was hoping you would actually explain why it is impossible.
>
> Is there something fundamental that keeps a memory area that spans two
> nodes from being removed and then a new area added that is comprised of
> a single node?
>
> Boot time:
>
> | memblock | memblock |
> <--Node=0--> <--Node=1-->
>
> Funky hotplug... nothing to see here, then:
>
> <--------Node=2-------->
>
> I would believe that there is no current bare-metal TDX system that has
> an implementation like this. But, the comments above speak like it's
> fundamentally impossible. That should be clarified.
>
> In other words, that comment talks about memblock attributes as being
> the core underlying reason that that simplified check is OK. Is that
> it, or is it really the reduced hotplug feature set on TDX systems?
Hi Dave,
I think I have been forgetting that we have switched to reject non-TDX memory in
memory online, but not in memory hot-add.
Memory offline/online is done on granularity of 'struct memory_block', but not
memblock. In fact, the hotpluggable memory region (one memblock) must be
multiple of memory_block, and a "to-be-online" memory_block must be full range
memory (no memory hole).
So if I am not missing something, IIUC that means if the start_pfn<->end_pfn is
TDX memory, it must be fully within some @tdx_memlist entry, but cannot cross
multiple small entries. And the memory hotplug case in your above diagram
actually shouldn't matter.
If above stands, how about below?
/*
* This check assumes that the start_pfn<->end_pfn range does not
* cross multiple @tdx_memlist entries. A single memory online
* event across multiple @tdx_memlist entries (which are derived
* from memblocks at the time of module initialization) is not
* possible.
*
* This is because memory offline/online is done on granularity
* of 'struct memory_block', and the hotpluggable memory region
* (one memblock) must be multiple of memory_block. Also, the
* "to-be-online" memory_block must be full memory (no memory
* hole, i.e. containing multiple small memblocks).
*
* This means if the start_pfn<->end_pfn range is TDX memory, it
* must be fully within one @tdx_memlist entry, but cannot cross
* multiple small entries.
*/
list_for_each_entry(tmb, &tdx_memlist, list) {
if (start_pfn >= tmb->start_pfn && end_pfn <= tmb->end_pfn)
return true;
}
Powered by blists - more mailing lists