[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a8a14103-80a3-4b03-a214-f55429c46f93@intel.com>
Date: Fri, 30 Aug 2024 15:02:58 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: "Huang, Kai" <kai.huang@...el.com>, "Hansen, Dave"
<dave.hansen@...el.com>, "seanjc@...gle.com" <seanjc@...gle.com>,
"bp@...en8.de" <bp@...en8.de>, "peterz@...radead.org"
<peterz@...radead.org>, "hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"Williams, Dan J" <dan.j.williams@...el.com>
Cc: "Gao, Chao" <chao.gao@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"x86@...nel.org" <x86@...nel.org>, "Yamahata, Isaku"
<isaku.yamahata@...el.com>
Subject: Re: [PATCH v3 7/8] x86/virt/tdx: Reduce TDMR's reserved areas by
using CMRs to find memory holes
On 30/08/24 14:52, Huang, Kai wrote:
> On Fri, 2024-08-30 at 13:50 +0300, Hunter, Adrian wrote:
>> On 27/08/24 10:14, Kai Huang wrote:
>>> A TDX module initialization failure was reported on a Emerald Rapids
>>> platform:
>>>
>>> virt/tdx: initialization failed: TDMR [0x0, 0x80000000): reserved areas exhausted.
>>> virt/tdx: module initialization failed (-28)
>>>
>>> As part of initializing the TDX module, the kernel informs the TDX
>>> module of all "TDX-usable memory regions" using an array of TDX defined
>>> structure "TD Memory Region" (TDMR). Each TDMR must be in 1GB aligned
>>> and in 1GB granularity, and all "non-TDX-usable memory holes" within a
>>> given TDMR must be marked as "reserved areas". The TDX module reports a
>>> maximum number of reserved areas that can be supported per TDMR.
>>
>> The statement:
>>
>> ... all "non-TDX-usable memory holes" within a
>> given TDMR must be marked as "reserved areas".
>>
>> is not exactly true, which is essentially the basis of this fix.
>
> Hmm I think I see what you mean. Perhaps the "must be marked as" confuses
> you?
>
> The "TDX-usable memory" here means all pages that can potentially be used by
> TDX. They don't have to be actually used by TDX, i.e., "TDX-usable memory" vs
> "TDX-used memory".
>
> And the "non-TDX-usable memory holes" means the memory regions that cannot be
> possibly used by TDX at all.
>
> Is below better if I change "must be marked as" to "are"?
>
> As part of initializing the TDX module, the kernel informs the TDX
> module of all "TDX-usable memory regions" using an array of TDX defined
> structure "TD Memory Region" (TDMR). Each TDMR must be in 1GB aligned
> and in 1GB granularity, and all "non-TDX-usable memory holes" within a
> given TDMR are marked as "reserved areas". The TDX module reports a
> maximum number of reserved areas that can be supported per TDMR.
Yes, that's better.
>
> Note in my logic here we don't need to mention CMR. Here I just want to tell
> the fact that each TDMR has number of "reserved areas" and the maximum number
> is reported by TDX module.
>
>>
>> The relevant requirements are (from the spec):
>>
>> Any non-reserved 4KB page within a TDMR must be convertible
>> i.e., it must be within a CMR
>
> Yes.
>
>>
>> Reserved areas within a TDMR need not be within a CMR.
>
> Yes. They need not to be, but they can be.
>
>>
>> PAMT areas must not overlap with TDMR non-reserved areas;
>> however, they may reside within TDMR reserved areas
>> (as long as these are convertible).
>
> Yes. However in implementation PAMTs are out of page allocator so they are
> all within TDMRs thus need to be put to reserved areas.
>
> Those are TDX architectural requirements. They are not all related to the fix
> of this problem. The most important thing here is:
>
> Any non-reserved memory within a TDMR must be within CMR.
>
> That means as long as one memory region is CMR, it doesn't need to be in
> "reserved area" from TDX architecture's prespective.
>
> That means we can include more memory regions (even they cannot be used by TDX
> at all) as "non-reserved" areas in TDMRs to reduce the number of "reserved
> areas" as long as those regions are within CMR.
>
> This is the logic behind this fix.
>
>>
>>>
>>> Currently, the kernel finds those "non-TDX-usable memory holes" within a
>>> given TDMR by walking over a list of "TDX-usable memory regions", which
>>> essentially reflects the "usable" regions in the e820 table (w/o memory
>>> hotplug operations precisely, but this is not relevant here).
>>
>> But including e820 table regions that are not "usable" in the TDMR
>> reserved areas is not necessary - it is not one of the rules.
>
> True. That's why we can do this fix.
>
>>
>> What confused me initially was that I did not realize the we already
>> require that the TDX Module does not touch memory in the TDMR
>> non-reserved areas not specifically allocated to it. So it makes no
>> difference to the TDX Module what the pages that have not been allocated
>> to it, are used for.
>>
>>>
>>> As shown above, the root cause of this failure is when the kernel tries
>>> to construct a TDMR to cover address range [0x0, 0x80000000), there
>>> are too many memory holes within that range and the number of memory
>>> holes exceeds the maximum number of reserved areas.
>>>
>>> The E820 table of that platform (see [1] below) reflects this: the
>>> number of memory holes among e820 "usable" entries exceeds 16, which is
>>> the maximum number of reserved areas TDX module supports in practice.
>>>
>>> === Fix ===
>>>
>>> There are two options to fix this: 1) reduce the number of memory holes
>>> when constructing a TDMR to save "reserved areas"; 2) reduce the TDMR's
>>> size to cover fewer memory regions, thus fewer memory holes.
>>
>> Probably better to try and get rid of this "two options" stuff and focus
>> on how this is a simple and effective fix.
>
> As I mentioned in another reply I would prefer to keep those options since I
> believe they can provide a full view to the reviewers.
Powered by blists - more mailing lists