[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d03ac86a-5e7b-8319-7747-4b0299aaa625@intel.com>
Date: Wed, 18 Nov 2020 14:18:19 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, bp@...en8.de
Cc: fenghua.yu@...el.com, x86@...nel.org, linux-kernel@...r.kernel.org,
mingo@...hat.com, hpa@...or.com, tglx@...utronix.de
Subject: Re: [PATCH] x86/resctrl: Fix AMD L3 QOS CDP enable/disable
Hi Babu,
On 11/6/2020 12:14 PM, Babu Moger wrote:
> When the AMD QoS feature CDP(code and data prioritization) is enabled
> or disabled, the CDP bit in MSR 0000_0C81 is written on one of the
> cpus in L3 domain(core complex). That is not correct. The CDP bit needs
> to be updated all the logical cpus in the domain.
Could you please use CPU instead of cpu throughout, in commit message as
well as the new code comments?
>
> This was not spelled out clearly in the spec earlier. The specification
> has been updated. The updated specification, "AMD64 Technology Platform
> Quality of Service Extensions Publication # 56375 Revision: 1.02 Issue
> Date: October 2020" is available now. Refer the section: Code and Data
> Prioritization.
>
> Fix the issue by adding a new flag arch_needs_update_all in rdt_cache
> data structure.
I understand that naming is hard and could be a sticky point. Even so, I
am concerned that this name is too generic. For example, there are other
cache settings that are successfully set on a single CPU in the L3
domain (the bitmasks for example). This new name and its description in
the code comments below does not make it clear which cache settings it
applies to.
I interpret this change to mean that the L[23]_QOS_CFG MSR has CPU scope
while the other L3 QoS configuration registers have the same scope as
the L3 cache. Could this new variable thus perhaps be named
"arch_has_per_cpu_cfg"? I considered "arch_has_per_cpu_cdp" but when a
new field is added to that register it may cause confusion.
> The documentation can be obtained at the links below:
> https://developer.amd.com/wp-content/resources/56375.pdf
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>
> Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature")
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 3 +++
> arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++++++--
> 3 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index e5f4ee8f4c3b..142c92a12254 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -570,6 +570,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>
> if (d) {
> cpumask_set_cpu(cpu, &d->cpu_mask);
> + if (r->cache.arch_needs_update_all)
> + rdt_domain_reconfigure_cdp(r);
> return;
> }
>
> @@ -943,6 +945,7 @@ static __init void rdt_init_res_defs_amd(void)
> r->rid == RDT_RESOURCE_L2CODE) {
> r->cache.arch_has_sparse_bitmaps = true;
> r->cache.arch_has_empty_bitmaps = true;
> + r->cache.arch_needs_update_all = true;
> } else if (r->rid == RDT_RESOURCE_MBA) {
> r->msr_base = MSR_IA32_MBA_BW_BASE;
> r->msr_update = mba_wrmsr_amd;
The current pattern is to set these flags on all the architectures.
Could you thus please set the flag within rdt_init_defs_intel()? I
confirmed that the scope is the same as the cache domain in Intel RDT so
the flag should be false.
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 80fa997fae60..d23262d59a51 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -360,6 +360,8 @@ struct msr_param {
> * executing entities
> * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
> * @arch_has_empty_bitmaps: True if the '0' bitmap is valid.
> + * @arch_needs_update_all: True if arch needs to update the cache
> + * settings on all the cpus in the domain.
Please do update this to make it clear what "cache settings" are
referred to. Since this is in struct rdt_cache perhaps something like
"QOS_CFG register for this cache level has CPU scope."
> */
> struct rdt_cache {
> unsigned int cbm_len;
> @@ -369,6 +371,7 @@ struct rdt_cache {
> unsigned int shareable_bits;
> bool arch_has_sparse_bitmaps;
> bool arch_has_empty_bitmaps;
> + bool arch_needs_update_all;
> };
>
> /**
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index af323e2e3100..a005e90b373a 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1905,8 +1905,13 @@ static int set_cache_qos_cfg(int level, bool enable)
>
> r_l = &rdt_resources_all[level];
> list_for_each_entry(d, &r_l->domains, list) {
> - /* Pick one CPU from each domain instance to update MSR */
> - cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
> + if (r_l->cache.arch_needs_update_all)
> + /* Pick all the cpus in the domain instance */
> + for_each_cpu(cpu, &d->cpu_mask)
> + cpumask_set_cpu(cpu, cpu_mask);
> + else
> + /* Pick one CPU from each domain instance to update MSR */
> + cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
> }
> cpu = get_cpu();
> /* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
>
The solution looks good to me, thank you very much.
Reinette
Powered by blists - more mailing lists