Message-ID: <50e67263-33ba-9921-1bc2-a37b99bc2459@intel.com>
Date: Tue, 11 Apr 2023 18:21:25 -0700
From: Fenghua Yu <fenghua.yu@...el.com>
To: "Chang S. Bae" <chang.seok.bae@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>
CC: linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
Chintan M Patel <chintan.m.patel@...el.com>,
Thiago Macieira <thiago.macieira@...el.com>
Subject: Re: [RFC PATCH] x86/fpu/xstate: Add more diagnostic information on
inconsistent xstate sizes
Hi, Chang,
On 4/11/23 09:29, Chang S. Bae wrote:
> On 4/10/2023 1:43 PM, Fenghua Yu wrote:
>> On 4/7/23 11:22, Chang S. Bae wrote:
>>> On 4/5/2023 11:39 AM, Fenghua Yu wrote:
>>>>
>>>> diff --git a/arch/x86/kernel/fpu/xstate.c
>>>> b/arch/x86/kernel/fpu/xstate.c
>>>> index 0bab497c9436..5f27fcdc6c90 100644
>>>> --- a/arch/x86/kernel/fpu/xstate.c
>>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>>> @@ -602,8 +602,37 @@ static bool __init
>>>> paranoid_xstate_size_valid(unsigned int kernel_size)
>>>> }
>>>> }
>>>> size = xstate_calculate_size(fpu_kernel_cfg.max_features,
>>>> compacted);
>>>> - XSTATE_WARN_ON(size != kernel_size,
>>>> - "size %u != kernel_size %u\n", size, kernel_size);
>>>> + if (size != kernel_size) {
>>>> + u64 xcr0, ia32_xss;
>>>> +
>>>> + XSTATE_WARN_ON(1, "size %u != kernel_size %u\n",
>>>> + size, kernel_size);
>>>> +
>>>> + /* Show more information to help diagnose the size issue. */
>>>> + pr_info("x86/fpu: max_features=0x%llx\n",
>>>> + fpu_kernel_cfg.max_features);
>>>> + print_xstate_offset_size();
>>>> + pr_info("x86/fpu: total size: %u bytes\n", size);
>>>> + xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
>>>> + if (compacted) {
>>>> + rdmsrl(MSR_IA32_XSS, ia32_xss);
>>>
>>> This shouldn't be directly read here because of the LBR state component.
>>>
>>> See the function comment:
>>>
>>> * Independent XSAVE features allocate their own buffers and are not
>>> * covered by these checks. Only the size of the buffer for task->fpu
>>> * is checked here.
>>>
>>> But, isn't that max_features bitmask pretty much about it?
>>
>> How about getting IA32_XSS from xfeatures_mask_supervisor()? That's
>> how to get kernel_size by setting IA32_XSS without independent
>> features in get_xsave_compacted_size()
> I think what it tests here is comparing the sizes between the kernel
> code and microcode calculations on the same input, which is the
> max_features bitmask.
>
> We know that the kernel code calculates the size based on it and also
> takes it to write down there -- XCR0 and IA32_XSS. Then, showing that
> bitmask looks to be enough I thought, no?

First of all, max_features is already shown.

kernel_size comes from CPUID.0xd.0x1:EBX, which takes XCR0 | IA32_XSS as
input. If the platform uses a wrong XCR0 or IA32_XSS, it reports a wrong
kernel_size. The purpose of this patch is to provide more information to
help debug such platform/kernel issues. So instead of showing only the
whole max_features, using xgetbv() to get XCR0 and
xfeatures_mask_supervisor() to get IA32_XSS provides more debug info in
case the platform has an issue in XCR0 or IA32_XSS.

In other words, splitting max_features into XCR0 and IA32_XSS and
showing them individually provides more useful debug info than a single
max_features value.

Does that make sense?

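
As a rough illustration of the split described above, here is a minimal
userspace sketch in C. It is not kernel code: struct xfeature_split,
split_features() and print_split() are hypothetical names, and the
supervisor mask is passed in by the caller because which bits count as
supervisor state depends on the kernel version and the platform.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the two halves of a combined feature mask. */
struct xfeature_split {
	uint64_t user;		/* bits that would be programmed into XCR0 */
	uint64_t supervisor;	/* bits that would be programmed into IA32_XSS */
};

/*
 * Split a combined feature bitmask into user and supervisor parts.
 * supervisor_mask is an assumption supplied by the caller; in the kernel
 * the supervisor part comes from xfeatures_mask_supervisor().
 */
struct xfeature_split split_features(uint64_t max_features,
				     uint64_t supervisor_mask)
{
	struct xfeature_split s;

	s.supervisor = max_features & supervisor_mask;
	s.user = max_features & ~supervisor_mask;
	return s;
}

/* Print the whole mask and both halves, one value per line. */
void print_split(uint64_t max_features, uint64_t supervisor_mask)
{
	struct xfeature_split s = split_features(max_features, supervisor_mask);

	printf("max_features=0x%" PRIx64 "\n", max_features);
	printf("XCR0=0x%" PRIx64 "\n", s.user);
	printf("IA32_XSS=0x%" PRIx64 "\n", s.supervisor);
}
```

Printing the two halves separately makes it visible which register a
bad bit would belong to, which a single combined value does not show.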
>
> I still expect some acknowledgment of what is coded here for the kernel
> calculation details.

The kernel calculation is shown by

+ print_xstate_offset_size();
+ pr_info("x86/fpu: total size: %u bytes\n", size);

Isn't showing the offset and size of each xstate component, plus the sum
of the sizes, detailed enough?

After that,

+ pr_info("x86/fpu: kernel_size from CPUID.0xd.0x%x:EBX: %u bytes\n",
+ compacted ? 1 : 0, kernel_size);

shows how kernel_size is obtained from CPUID.

With the above debug info, a real platform CPUID issue shows up clearly.
What other details are needed?

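
The comparison this check performs can be sketched in userspace C as
follows. This is only an approximation of the idea, not the kernel's
xstate_calculate_size(): struct xcomp, compacted_size() and
sizes_agree() are hypothetical names, and any table passed in is the
caller's assumption. The 576-byte base (512-byte legacy area plus
64-byte XSAVE header) and the optional 64-byte alignment of components
in the compacted format follow the architectural layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* 512-byte legacy FXSAVE area + 64-byte XSAVE header. */
#define LEGACY_AND_HEADER_BYTES 576u

/* Hypothetical per-component description, indexed by feature bit number. */
struct xcomp {
	unsigned int size;	/* size of the component in bytes */
	bool align64;		/* component must start on a 64-byte boundary */
};

/*
 * Sum the compacted-format size for the enabled features.
 * Components 0 and 1 (x87, SSE) live in the legacy area, so the walk
 * starts at bit 2. n is the number of table entries and must be <= 64.
 */
unsigned int compacted_size(uint64_t features, const struct xcomp *table,
			    unsigned int n)
{
	unsigned int offset = LEGACY_AND_HEADER_BYTES;
	unsigned int i;

	for (i = 2; i < n; i++) {
		if (!(features & (1ULL << i)))
			continue;
		if (table[i].align64)
			offset = (offset + 63u) & ~63u;
		offset += table[i].size;
	}
	return offset;
}

/* True when the locally calculated size matches the reported one. */
bool sizes_agree(unsigned int calculated, unsigned int reported)
{
	return calculated == reported;
}
```

For example, with a table in which component 2 is 256 bytes and needs no
alignment, compacted_size(0x7, table, 3) gives 576 + 256 = 832. If the
CPUID-reported value differs from that, the inputs to CPUID (XCR0 and
IA32_XSS) are the first things to look at.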
Thanks.
-Fenghua