[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <13b360ac-4292-5056-6d00-6eb1d4186b11@amd.com>
Date: Thu, 28 Sep 2023 10:08:56 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Maciej Wieczór-Retman
<maciej.wieczor-retman@...el.com>
Cc: Fenghua Yu <fenghua.yu@...el.com>,
Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/4] x86/resctrl: Enable non-contiguous bits in Intel
CAT
Hi Maciej,
On 9/28/23 02:06, Maciej Wieczór-Retman wrote:
> Hi, thanks for reviewing the series,
>
> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote:
>> Hi Maciej,
>>
>> How about this subject line?
>>
>> x86/resctrl: Enable non-contiguous CBMs on Intel CAT
>
> Changing "bits" to "CBMs" does indeed seem sensible so I'll do that if
> there are no objections.
>
> But I'm not sure the preposition collocation change from "in" to "on"
> would be grammatical (at least from what I've read in docs about Intel
> CAT so far).
>
>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
>>> The setting for non-contiguous 1s support in Intel CAT is
>>
>>> hardcoded to false. On these systems, writing non-contiguous
>>> 1s into the schemata file will fail before resctrl passes
>>> the value to the hardware.
>>>
>>> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
>>> being reserved and now carry information about non-contiguous 1s
>>> value support for L3 and L2 cache respectively. The CAT
>>> capacity bitmask (CBM) supports a non-contiguous 1s value if
>>> the bit is set.
>>>
>>> Replace the hardcoded non-contiguous support value with
>>> the support learned from the hardware. Add hardcoded non-contiguous
>>> support value to Haswell probe since it can't make use of CPUID for
>>> Cache allocation.
>>>
>>> Originally-by: Fenghua Yu <fenghua.yu@...el.com>
>>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>
>>> ---
>>> Changelog v2:
>>> - Rewrite part of a comment concerning Haswell. (Reinette)
>>>
>>> arch/x86/kernel/cpu/resctrl/core.c | 9 ++++++---
>>> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
>>> arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++
>>> 3 files changed, 21 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>> index 030d3b409768..c783a873147c 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>> r->cache.cbm_len = 20;
>>> r->cache.shareable_bits = 0xc0000;
>>> r->cache.min_cbm_bits = 2;
>>> + r->cache.arch_has_sparse_bitmaps = false;
>>
>> Is this change required?
>>
>> This is always set to false in rdt_init_res_defs_intel().
>
> The logic behind moving this variable initialization from
> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and
> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a
> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and
> CPUID.0x10.2:ECX[3] bits were reserved.
>
> Now for the general case the variable is dependent on CPUID output.
> And only for Haswell case it needs to be hardcoded to "false", so the
> assignment makes more sense in Haswell probe rather than in the default
> section.
Here is the current sequence order with your change.
1.
resctrl_late_init -> check_quirks -> __check_quirks_intel ->
cache_alloc_hsw_probe
r->cache.arch_has_sparse_bitmaps = false; (new code)
2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel
r->cache.arch_has_sparse_bitmaps = false; (old code)
3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources ->
rdt_get_cache_alloc_cfg
r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code)
The code in (3) is going to overwrite whatever is set in (1) or (2).
I would say you can just remove initialization in both (1) and (2). That
makes the code clearer to me. I assume reserved bits in Intel is always 0.
Thanks
Babu
>
>>> r->alloc_capable = true;
>>> rdt_alloc_capable = true;
>>> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>>> {
>>> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>> union cpuid_0x10_1_eax eax;
>>> + union cpuid_0x10_x_ecx ecx;
>>> union cpuid_0x10_x_edx edx;
>>> - u32 ebx, ecx;
>>> + u32 ebx;
>>> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
>>> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>>> hw_res->num_closid = edx.split.cos_max + 1;
>>> r->cache.cbm_len = eax.split.cbm_len + 1;
>>> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>>> r->cache.shareable_bits = ebx & r->default_ctrl;
>>> r->data_width = (r->cache.cbm_len + 3) / 4;
>>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>>> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>>> r->alloc_capable = true;
>>> }
>>> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void)
>>> if (r->rid == RDT_RESOURCE_L3 ||
>>> r->rid == RDT_RESOURCE_L2) {
>>> - r->cache.arch_has_sparse_bitmaps = false;
>>
>> Why do you have to remove this one here? This seems like a right place to
>> initialize.
>
> Look at the previous comment.
>
--
Thanks
Babu Moger
Powered by blists - more mailing lists