[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c1ee3d5b-5c54-4eea-b3d7-7385d39bae45@intel.com>
Date: Fri, 8 Dec 2023 23:57:00 +0800
From: "Yang, Weijiang" <weijiang.yang@...el.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
CC: <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
<dave.hansen@...el.com>, <pbonzini@...hat.com>,
<seanjc@...gle.com>, <peterz@...radead.org>, <chao.gao@...el.com>,
<rick.p.edgecombe@...el.com>, <john.allen@....com>
Subject: Re: [PATCH v7 04/26] x86/fpu/xstate: Introduce
XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
On 12/7/2023 12:11 AM, Maxim Levitsky wrote:
> On Wed, 2023-12-06 at 11:00 +0800, Yang, Weijiang wrote:
>> On 12/5/2023 5:55 PM, Maxim Levitsky wrote:
>>> On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
>>>> On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
>>>>> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>>>>>> Define new XFEATURE_MASK_KERNEL_DYNAMIC set including the features can be
>>>>> I am not sure though that this name is correct, but I don't know if I can
>>>>> suggest a better name.
>>>> It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
>>>>>> optionally enabled by kernel components, i.e., the features are required by
>>>>>> specific kernel components. Currently it's used by KVM to configure guest
>>>>>> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
>>>>>>
>>>>>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
>>>>>> supported by host as they're enabled in xsaves/xrstors operating xfeature set
>>>>>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
>>>>>> not enabled in host kernel so it can be omitted for normal fpstate by default.
>>>>>>
>>>>>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
>>>>>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
>>>>>> optimized by HW for normal fpstate.
>>>>>>
>>>>>> Suggested-by: Dave Hansen <dave.hansen@...el.com>
>>>>>> Signed-off-by: Yang Weijiang <weijiang.yang@...el.com>
>>>>>> ---
>>>>>> arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>>>>>> arch/x86/kernel/fpu/xstate.c | 1 +
>>>>>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
>>>>>> index 3b4a038d3c57..a212d3851429 100644
>>>>>> --- a/arch/x86/include/asm/fpu/xstate.h
>>>>>> +++ b/arch/x86/include/asm/fpu/xstate.h
>>>>>> @@ -46,9 +46,12 @@
>>>>>> #define XFEATURE_MASK_USER_RESTORE \
>>>>>> (XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>>>>>>
>>>>>> -/* Features which are dynamically enabled for a process on request */
>>>>>> +/* Features which are dynamically enabled per userspace request */
>>>>>> #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA
>>>>>>
>>>>>> +/* Features which are dynamically enabled per kernel side request */
>>>>> I suggest to explain this a bit better. How about something like that:
>>>>>
>>>>> "Kernel features that are not enabled by default for all processes, but can
>>>>> be still used by some processes, for example to support guest virtualization"
>>>> It looks good to me, will apply it in next version, thanks!
>>>>
>>>>> But feel free to keep it as is or propose something else. IMHO this will
>>>>> be confusing this way or another.
>>>>>
>>>>>
>>>>> Another question: kernel already has a notion of 'independent features'
>>>>> which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
>>>>>
>>>>> Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
>>>>> from independent buffer (in case of LBRs, perf code cares for this).
>>>>>
>>>>> Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
>>>> CET_S here refers to PL{0,1,2}_SSP, right?
>>>>
>>>> IMHO, perf relies on dedicated code to switch LBR MSRs for various reason, e.g., overhead, the feature
>>>> owns dozens of MSRs, remove xfeature bit will offload the burden of common FPU/xsave framework.
>>> This is true, but the question that begs to be asked, is what is the true purpose of the 'independent features' is
>>> from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
>>> that it enables it in IA32_XSS (and in XCR0 in the future).
>> This is the origin intention for introducing independent features(firstly called dynamic feature, renamed later), from the
>> changelog the major concern is overhead:
> Yes, and to some extent the reason why we want to have CET supervisor state not saved on normal thread's FPU state is also overhead,
> because in theory if the kernel did save it, the MSRs will be in INIT state and thus XSAVES shouldn't have any functional impact,
> even if it saves/restores them for nothing.
CET supervisor state in normal thread's FPU state won't always be in INIT state. Per SDM, it's INIT state is defined only if 3 MSRs are 0,
but if guest is using supervisor CET, then with vCPU migration between pCPUs, more and more MSRs would hold non-zero contents.
This doesn't impact host kernel behavior because host CET_S is still disabled, but it does impact host XSAVES/XRSTORS behavior.
> In other words, as I said, independent features = features that FPU state doesn't manage, and are just optionally enabled,
> so that a custom code can do a custom xsave(s)/xrstor(s), likely from/to a custom area to save/load these features.
>
> It might make sense to rename independent features again to something like 'unmanaged features' or 'manual features' or something
> like that.
>
>
> Another interesting question that arises here, is once KVM supports arch LBRs, it will likely need to expose the XFEATURE_LBR
> to the guest and will need to context switch it similar to CET_S state, which strengthens the argument that CET_S should
> be in the same group as the 'independent features'.
>
> Depending on the performance impact, XFEATURE_LBR might even need to be dynamically allocated.
This is most likely true for fpu_guest_cfg instead of fpu_kernel_cfg, let me think it over, thanks for bring up this brilliant idea :-)
> For the reference this is the patch series that introduced the arch LBRs to KVM:
> https://www.spinics.net/lists/kvm/msg296507.html
>
>
> Best regards,
> Maxim Levitsky
>
>> commit f0dccc9da4c0fda049e99326f85db8c242fd781f
>> Author: Kan Liang <kan.liang@...ux.intel.com>
>> Date: Fri Jul 3 05:49:26 2020 -0700
>>
>> x86/fpu/xstate: Support dynamic supervisor feature for LBR
>>
>> "However, the kernel should not save/restore the LBR state component at
>> each context switch, like other state components, because of the
>> following unique features of LBR:
>> - The LBR state component only contains valuable information when LBR
>> is enabled in the perf subsystem, but for most of the time, LBR is
>> disabled.
>> - The size of the LBR state component is huge. For the current
>> platform, it's 808 bytes.
>> If the kernel saves/restores the LBR state at each context switch, for
>> most of the time, it is just a waste of space and cycles."
>>
>>> For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
>>> I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
>>> from complexity point of view to add CET there. It's up to you to decide.
>>>
>>> Sean what do you think?
>>>
>>> Best regards,
>>> Maxim Levitsky
>>>
>>>
>>>> But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
>>>> Enabling it in common FPU framework would make the switch/swap much easier without additional
>>>> support code.
>>>>
>>>>> and maybe rename the
>>>>> 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
>>>>> (terrible name, but you might think of a better name)
>>>>>
>>>>>
>>>>>> +#define XFEATURE_MASK_KERNEL_DYNAMIC XFEATURE_MASK_CET_KERNEL
>>>>>> +
>>>>>> /* All currently supported supervisor features */
>>>>>> #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>>>>>> XFEATURE_MASK_CET_USER | \
>>>>>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>>>>>> index b57d909facca..ba4172172afd 100644
>>>>>> --- a/arch/x86/kernel/fpu/xstate.c
>>>>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>>>>> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>>>>> /* Clean out dynamic features from default */
>>>>>> fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>>>>>> fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>>> + fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>>>>>>
>>>>>> fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>>>>> fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>> Best regards,
>>>>> Maxim Levitsky
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>
Powered by blists - more mailing lists