[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ddba6e9e284e3a4b4d04ea4cf552822973181c40.camel@intel.com>
Date: Fri, 29 Apr 2022 19:32:45 +1200
From: Kai Huang <kai.huang@...el.com>
To: Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Cc: seanjc@...gle.com, pbonzini@...hat.com, len.brown@...el.com,
tony.luck@...el.com, rafael.j.wysocki@...el.com,
reinette.chatre@...el.com, dan.j.williams@...el.com,
peterz@...radead.org, ak@...ux.intel.com,
kirill.shutemov@...ux.intel.com,
sathyanarayanan.kuppuswamy@...ux.intel.com,
isaku.yamahata@...el.com
Subject: Re: [PATCH v3 11/21] x86/virt/tdx: Choose to use all system RAM as
TDX memory
On Thu, 2022-04-28 at 08:54 -0700, Dave Hansen wrote:
> On 4/5/22 21:49, Kai Huang wrote:
> > As one step of initializing the TDX module, the memory regions that the
> > TDX module can use must be configured to it via an array of 'TD Memory
>
> "can use must be"?
"can use" applies to "the TDX module". "must be" applies to "the memory
regions".
Sorry for bad english. Yes it's not good. I'll use your words suggested in
another patch:
The kernel configures TDX-usable memory regions by passing an
array of "TD Memory Regions" (TDMRs) to the TDX module.
>
> > Regions' (TDMR). The kernel is responsible for choosing which memory
> > regions to be used as TDX memory and building the array of TDMRs to
> > cover those memory regions.
> >
> > The first generation of TDX-capable platforms basically guarantees all
> > system RAM regions during machine boot are Convertible Memory Regions
> > (excluding the memory below 1MB) and can be used by TDX. The memory
> > pages allocated to TD guests can be any pages managed by the page
> > allocator. To avoid having to modify the page allocator to distinguish
> > TDX and non-TDX memory allocation, adopt a simple policy to use all
> > system RAM regions as TDX memory. The low 1MB pages are excluded from
> > TDX memory since they are not in CMRs in some platforms (those pages are
> > reserved at boot time and won't be managed by page allocator anyway).
> >
> > This policy could be revised later if future TDX generations break
> > the guarantee or when the size of the metadata (~1/256th of the size of
> > the TDX usable memory) becomes a concern. At that time a CMR-aware
> > page allocator may be necessary.
>
> Remember that you have basically three or four short sentences to get a
> reviewer's attention. There's a lot of noise in that changelog. Can
> you trim it down or at least make the first bit less jargon-packed and
> more readable?
>
> > Also, on the first generation of TDX-capable machine, the system RAM
> > ranges discovered during boot time are all memory regions that kernel
> > can use during its runtime. This is because the first generation of TDX
> > architecturally doesn't support ACPI memory hotplug
>
> "Architecturally" usually means: written down and agreed to by hardware
> and software alike. Is this truly written down somewhere? I don't
> recall seeing it in the architecture documents.
>
> I fear this is almost the _opposite_ of architecture: it's basically a
> fortunate coincidence.
>
> > (CMRs are generated
> > during machine boot and are static during machine's runtime). Also, the
> > first generation of TDX-capable platform doesn't support TDX and ACPI
> > memory hotplug at the same time on a single machine. Another case of
> > memory hotplug is user may use NVDIMM as system RAM via kmem driver.
> > But the first generation of TDX-capable machine doesn't support TDX and
> > NVDIMM simultaneously, therefore in practice it cannot happen. One
> > special case is user may use 'memmap' kernel command line to reserve
> > part of system RAM as x86 legacy PMEMs, and user can theoretically add
> > them as system RAM via kmem driver. This can be resolved by always
> > treating legacy PMEMs as TDX memory.
>
> Again, there's a ton of noise here. I'm struggling to get the point.
>
> > Implement a helper to loop over all RAM entries in e820 table to find
> > all system RAM ranges, as a preparation to covert all of them to TDX
> > memory. Use 'e820_table', rather than 'e820_table_firmware' to honor
> > 'mem' and 'memmap' command lines.
>
> *How* does this honor them? For instance, if I do mem=4G, will the TDX
> code limit itself to converting 4GB for TDX?
Yes.
>
> > Following e820__memblock_setup(),
> > both E820_TYPE_RAM and E820_TYPE_RESERVED_KERN types are treated as TDX
> > memory, and contiguous ranges in the same NUMA node are merged together.
>
> Again, you're just rehashing the code's logic in English. That's not
> what a changelog is for.
Sorry you are right.
I'll address rest of your comments after we settle memory hotplug handling
discussion.
Thanks!
--
Thanks,
-Kai
Powered by blists - more mailing lists