Message-ID: <PH7PR11MB84558F2095C87CF98AFBBFB09A832@PH7PR11MB8455.namprd11.prod.outlook.com>
Date: Wed, 30 Apr 2025 14:39:27 +0000
From: "Miao, Jun" <jun.miao@...el.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
CC: "Li, Zhiquan1" <zhiquan1.li@...el.com>, "Hansen, Dave"
<dave.hansen@...el.com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "x86@...nel.org" <x86@...nel.org>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>, "mingo@...hat.com"
<mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>, "Du, Fan"
<fan.du@...el.com>
Subject: RE: [V2 PATCH] x86/tdx: add VIRT_CPUID2 virtualization if REDUCE_VE
was not successful
>
>On Wed, Apr 30, 2025 at 11:10:32AM +0000, Miao, Jun wrote:
>> >
>> >On Wed, Apr 30, 2025 at 10:15:05AM +0800, Zhiquan Li wrote:
>> >>
>> >> On 2025/4/29 22:50, Dave Hansen wrote:
>> >> > On 4/29/25 07:31, Jun Miao wrote:
>> >> >> REDUCE_VE can only be enabled if x2APIC_ID has been properly
>> >> >> configured with unique values for each VCPU. Check if VMM has
>> >> >> provided an activated topology configuration first as it is the
>> >> >> prerequisite of REDUCE_VE and ENUM_TOPOLOGY, so move it to
>> >> >> reduce_unnecessary_ve(). The function
>> >> >> enable_cpu_topology_enumeration() is very small and can be
>> >> >> folded into reduce_unnecessary_ve().
>> >> >
>> >> > Isn't this just working around VMM bugs? Shouldn't we just panic
>> >> > as quickly as possible so the VMM config gets fixed rather than
>> >> > adding kludges?
>> >>
>> >>
>> >> Currently, failing to virtualize these two cases causes a TD VM
>> >> regression versus a legacy VM. Do you mean panicking only for the
>> >> #VE caused by CPUID leaf 0x2? Or should both cases (including the
>> >> VMM not configuring topology) panic?
>> >>
>> >> Currently most customer complaints come from CPUID leaf 0x2 not
>> >> being virtualized, and most of the accesses come from user space.
>> >> Is it appropriate for such behavior to directly cause a guest
>> >> kernel panic?
>> >
>> >The appropriate behavior would be to fix VMM to configure APIC IDs
>> >correctly and use TDX module that supports REDUCE_VE.
>> >
>>
>> Yes, I completely agree with your point about fixing the VMM APIC IDs.
>> The idea here is only to avoid this panic by using the guest component
>> even when the host is incomplete, thereby improving the robustness of
>> the kernel code. Moreover, even if the VMM becomes complete later, the
>> adjusted logic will still work. (^v^)
>
>VIRT_CPUID2 was introduced as a stopgap until REDUCE_VE landed. I don't see
>the point in enabling it at this stage. REDUCE_VE covers many more broken
>corner cases. CPUID 0x2 is just the most prominent one because of the glibc bug.
>
Hmm, at this stage, there may indeed be a pressing need to resolve this glibc issue in
real applications from the user's perspective, such as [Bug Report with Redhat/Rocky9.2 qcow].
The goal is to leverage an existing resource (VIRT_CPUID2) to resolve this panic, and we hope
the VMM side will prioritize implementing the ability to set x2APIC IDs for each TD vCPU.
Thank you again for your patient explanation.
---Jun Miao
>--
> Kiryl Shutsemau / Kirill A. Shutemov