[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1d821bc32ea87f6a54b88775e7827598f6cb1a1a.camel@intel.com>
Date: Thu, 24 Nov 2022 09:51:47 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Hansen, Dave" <dave.hansen@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: "Luck, Tony" <tony.luck@...el.com>,
"bagasdotme@...il.com" <bagasdotme@...il.com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"Christopherson,, Sean" <seanjc@...gle.com>,
"Chatre, Reinette" <reinette.chatre@...el.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"Shahar, Sagi" <sagis@...gle.com>,
"imammedo@...hat.com" <imammedo@...hat.com>,
"Gao, Chao" <chao.gao@...el.com>,
"Brown, Len" <len.brown@...el.com>,
"sathyanarayanan.kuppuswamy@...ux.intel.com"
<sathyanarayanan.kuppuswamy@...ux.intel.com>,
"Huang, Ying" <ying.huang@...el.com>,
"Williams, Dan J" <dan.j.williams@...el.com>
Subject: Re: [PATCH v7 11/20] x86/virt/tdx: Add placeholder to construct TDMRs
to cover all TDX memory regions
On Wed, 2022-11-23 at 14:17 -0800, Dave Hansen wrote:
> On 11/20/22 16:26, Kai Huang wrote:
> > TDX provides increased levels of memory confidentiality and integrity.
> > This requires special hardware support for features like memory
> > encryption and storage of memory integrity checksums. Not all memory
> > satisfies these requirements.
> >
> > As a result, the TDX introduced the concept of a "Convertible Memory
>
> s/the TDX introduced/TDX introduces/
>
> > Region" (CMR). During boot, the firmware builds a list of all of the
> > memory ranges which can provide the TDX security guarantees. The list
> > of these ranges is available to the kernel by querying the TDX module.
> >
> > The TDX architecture needs additional metadata to record things like
> > which TD guest "owns" a given page of memory. This metadata essentially
> > serves as the 'struct page' for the TDX module. The space for this
> > metadata is not reserved by the hardware up front and must be allocated
> > by the kernel and given to the TDX module.
> >
> > Since this metadata consumes space, the VMM can choose whether or not to
> > allocate it for a given area of convertible memory. If it chooses not
> > to, the memory cannot receive TDX protections and can not be used by TDX
> > guests as private memory.
> >
> > For every memory region that the VMM wants to use as TDX memory, it sets
> > up a "TD Memory Region" (TDMR). Each TDMR represents a physically
> > contiguous convertible range and must also have its own physically
> > contiguous metadata table, referred to as a Physical Address Metadata
> > Table (PAMT), to track status for each page in the TDMR range.
> >
> > Unlike a CMR, each TDMR requires 1G granularity and alignment. To
> > support physical RAM areas that don't meet those strict requirements,
> > each TDMR permits a number of internal "reserved areas" which can be
> > placed over memory holes. If PAMT metadata is placed within a TDMR it
> > must be covered by one of these reserved areas.
> >
> > Let's summarize the concepts:
> >
> > CMR - Firmware-enumerated physical ranges that support TDX. CMRs are
> > 4K aligned.
> > TDMR - Physical address range which is chosen by the kernel to support
> > TDX. 1G granularity and alignment required. Each TDMR has
> > reserved areas where TDX memory holes and overlapping PAMTs can
> > be put into.
>
> s/put into/represented/
>
> > PAMT - Physically contiguous TDX metadata. One table for each page size
> > per TDMR. Roughly 1/256th of TDMR in size. 256G TDMR = ~1G
> > PAMT.
> >
> > As one step of initializing the TDX module, the kernel configures
> > TDX-usable memory regions by passing an array of TDMRs to the TDX module.
> >
> > Constructing the array of TDMRs consists below steps:
> >
> > 1) Create TDMRs to cover all memory regions that the TDX module can use;
>
> Slight tweak:
>
> 1) Create TDMRs to cover all memory regions that the TDX module will use
> for TD memory
>
> The TDX module "uses" more memory than strictly the TMDR's.
>
> > 2) Allocate and set up PAMT for each TDMR;
> > 3) Set up reserved areas for each TDMR.
>
> s/Set up/Designate/
Thanks. All above will be addressed.
>
> > Add a placeholder to construct TDMRs to do the above steps after all
> > TDX memory regions are verified to be truly convertible. Always free
> > TDMRs at the end of the initialization (no matter successful or not)
> > as TDMRs are only used during the initialization.
>
> The changelog here actually looks really good to me so far.
>
> > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > index 32af86e31c47..26048c6b0170 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -445,6 +445,63 @@ static int build_tdx_memory(void)
> > return ret;
> > }
> >
> > +/* Calculate the actual TDMR_INFO size */
> > +static inline int cal_tdmr_size(void)
>
> I think we can spare the bytes to add "culate" in the function name so
> we don't think these are California TDMRs.
Sure will do.
>
> > +{
> > + int tdmr_sz;
> > +
> > + /*
> > + * The actual size of TDMR_INFO depends on the maximum number
> > + * of reserved areas.
> > + *
> > + * Note: for TDX1.0 the max_reserved_per_tdmr is 16, and
> > + * TDMR_INFO size is aligned up to 512-byte. Even it is
> > + * extended in the future, it would be insane if TDMR_INFO
> > + * becomes larger than 4K. The tdmr_sz here should never
> > + * overflow.
> > + */
> > + tdmr_sz = sizeof(struct tdmr_info);
> > + tdmr_sz += sizeof(struct tdmr_reserved_area) *
> > + tdx_sysinfo.max_reserved_per_tdmr;
>
> First, I think 'tdx_sysinfo' should probably be a local variable in
> init_tdx_module() and have its address passed in here. Having global
> variables always makes it more opaque about who is initializing it.
>
> Second, if this code is making assumptions about
> 'max_reserved_per_tdmr', then let's actually add assertions or sanity
> checks. For instance:
>
> if (tdx_sysinfo.max_reserved_per_tdmr > MAX_TDMRS)
> return -1;
>
> or even:
>
> if (tdmr_sz > PAGE_SIZE)
> return -1;
I can add this.
>
> It does almost no good to just assert what the limits are in a comment.
>
> > + /*
> > + * TDX requires each TDMR_INFO to be 512-byte aligned. Always
> > + * round up TDMR_INFO size to the 512-byte boundary.
> > + */
>
> <sigh> More silly comments.
>
> The place to document this is TDMR_INFO_ALIGNMENT. If anyone wants to
> know what the alignment is, exactly, they can look at the definition.
> They don't need to be told *TWICE* what TDMR_INFO_ALIGNMENT #defines to
> in one comment.
I see. Then I think we don't even need this comment since the name of
TDMR_INFO_ALIGNMENT already implies?
>
> > + return ALIGN(tdmr_sz, TDMR_INFO_ALIGNMENT);
> > +}
> > +
> > +static struct tdmr_info *alloc_tdmr_array(int *array_sz)
> > +{
> > + /*
> > + * TDX requires each TDMR_INFO to be 512-byte aligned.
> > + * Use alloc_pages_exact() to allocate all TDMRs at once.
> > + * Each TDMR_INFO will still be 512-byte aligned since
> > + * cal_tdmr_size() always returns 512-byte aligned size.
> > + */
>
> OK, I think you're just trolling me now. Two *MORE* mentions of the
> 512-byte alignment?
I'll remove.
>
> > + *array_sz = cal_tdmr_size() * tdx_sysinfo.max_tdmrs;
> > +
> > + /*
> > + * Zero the buffer so 'struct tdmr_info::size' can be
> > + * used to determine whether a TDMR is valid.
> > + *
> > + * Note: for TDX1.0 the max_tdmrs is 64 and TDMR_INFO size
> > + * is 512-byte. Even they are extended in the future, it
> > + * would be insane if the total size exceeds 4MB.
> > + */
> > + return alloc_pages_exact(*array_sz, GFP_KERNEL | __GFP_ZERO);
> > +}
>
> This looks massively over complicated.
>
> Get rid of this function entirely. Then create:
>
> static int tdmr_array_size(void)
> {
> return tdmr_size_single() * tdx_sysinfo.max_tdmrs;
> }
>
> The *caller* can do:
>
> tdmr_array = alloc_pages_exact(tdmr_array_size(),
> GFP_KERNEL | __GFP_ZERO);
> if (!tdmr_array) {
> ...
>
> Then the error path is:
>
> free_pages_exact(tdmr_array, tdmr_array_size());
>
> Then, there are no size pointers going back and forth. Easy peasy. I'm
> OK with a little arithmetic being repeated.
Yes. Will do.
>
> > +/*
> > + * Construct an array of TDMRs to cover all TDX memory ranges.
> > + * The actual number of TDMRs is kept to @tdmr_num.
> > + */
> > +static int construct_tdmrs(struct tdmr_info *tdmr_array, int *tdmr_num)
> > +{
> > + /* Return -EINVAL until constructing TDMRs is done */
> > + return -EINVAL;
> > +}
> > +
> > /*
> > * Detect and initialize the TDX module.
> > *
> > @@ -454,6 +511,9 @@ static int build_tdx_memory(void)
> > */
> > static int init_tdx_module(void)
> > {
> > + struct tdmr_info *tdmr_array;
> > + int tdmr_array_sz;
> > + int tdmr_num;
>
> I tend to write these like:
>
> "tdmr_num" is the number of *a* TDMR.
>
> "nr_tdmrs" is the number of TDMRs.
Indeed. Will do.
Powered by blists - more mailing lists