Date:   Wed, 18 Nov 2020 14:18:19 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Babu Moger <babu.moger@....com>, bp@...en8.de
Cc:     fenghua.yu@...el.com, x86@...nel.org, linux-kernel@...r.kernel.org,
        mingo@...hat.com, hpa@...or.com, tglx@...utronix.de
Subject: Re: [PATCH] x86/resctrl: Fix AMD L3 QOS CDP enable/disable

Hi Babu,

On 11/6/2020 12:14 PM, Babu Moger wrote:
> When the AMD QoS feature CDP(code and data prioritization) is enabled
> or disabled, the CDP bit in MSR 0000_0C81 is written on one of the
> cpus in L3 domain(core complex). That is not correct. The CDP bit needs
> to be updated all the logical cpus in the domain.

Could you please use CPU instead of cpu throughout, in commit message as 
well as the new code comments?

> 
> This was not spelled out clearly in the spec earlier. The specification
> has been updated. The updated specification, "AMD64 Technology Platform
> Quality of Service Extensions Publication # 56375 Revision: 1.02 Issue
> Date: October 2020" is available now. Refer the section: Code and Data
> Prioritization.
> 
> Fix the issue by adding a new flag arch_needs_update_all in rdt_cache
> data structure.

I understand that naming is hard and could be a sticky point. Even so, I 
am concerned that this name is too generic. For example, there are other 
cache settings that are successfully set on a single CPU in the L3 
domain (the bitmasks for example). This new name and its description in 
the code comments below do not make it clear which cache settings they 
apply to.

I interpret this change to mean that the L[23]_QOS_CFG MSR has CPU scope 
while the other L3 QoS configuration registers have the same scope as 
the L3 cache. Could this new variable thus perhaps be named 
"arch_has_per_cpu_cfg"? I considered "arch_has_per_cpu_cdp" but when a 
new field is added to that register it may cause confusion.
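
To illustrate the distinction behind the suggested name, here is a minimal 
userspace sketch, not kernel code. It assumes a cache domain can be modelled 
as a 64-bit mask with one bit per logical CPU; struct model_cache and 
cfg_update_mask() are hypothetical names, and picking the lowest set bit 
merely stands in for cpumask_any().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the relevant part of struct rdt_cache. */
struct model_cache {
	/* Proposed flag name: QOS_CFG register has CPU scope. */
	bool arch_has_per_cpu_cfg;
};

/*
 * Return the mask of CPUs on which QOS_CFG has to be written for one
 * domain. domain_cpus has one bit set per logical CPU in the domain.
 */
static uint64_t cfg_update_mask(const struct model_cache *c,
				uint64_t domain_cpus)
{
	/* A domain is assumed to contain at least one CPU. */
	assert(domain_cpus != 0);

	if (c->arch_has_per_cpu_cfg)
		return domain_cpus;	/* every CPU in the domain */

	/* Register has cache scope: one CPU (lowest set bit) is enough. */
	return domain_cpus & (~domain_cpus + 1);
}
```

With CPUs 4-7 in a domain (mask 0xF0), the per-CPU case selects all four 
CPUs while the cache-scope case selects a single one.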

> The documentation can be obtained at the links below:
> https://developer.amd.com/wp-content/resources/56375.pdf
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> 
> Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature")
> 
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
>   arch/x86/kernel/cpu/resctrl/core.c     |    3 +++
>   arch/x86/kernel/cpu/resctrl/internal.h |    3 +++
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c |    9 +++++++--
>   3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index e5f4ee8f4c3b..142c92a12254 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -570,6 +570,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>   
>   	if (d) {
>   		cpumask_set_cpu(cpu, &d->cpu_mask);
> +		if (r->cache.arch_needs_update_all)
> +			rdt_domain_reconfigure_cdp(r);
>   		return;
>   	}
>   
> @@ -943,6 +945,7 @@ static __init void rdt_init_res_defs_amd(void)
>   		    r->rid == RDT_RESOURCE_L2CODE) {
>   			r->cache.arch_has_sparse_bitmaps = true;
>   			r->cache.arch_has_empty_bitmaps = true;
> +			r->cache.arch_needs_update_all = true;
>   		} else if (r->rid == RDT_RESOURCE_MBA) {
>   			r->msr_base = MSR_IA32_MBA_BW_BASE;
>   			r->msr_update = mba_wrmsr_amd;

The current pattern is to set these flags on all the architectures. 
Could you thus please also set the flag within rdt_init_res_defs_intel()? 
I confirmed that in Intel RDT the scope is the same as the cache domain, 
so the flag should be false there.
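
A rough userspace sketch of that pattern, under the assumption that the 
flag ends up being called arch_has_per_cpu_cfg as proposed earlier in this 
mail. The model_* types and functions are hypothetical stand-ins for the 
real init helpers and only show the flag being assigned explicitly for 
each vendor.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the flag that would live in struct rdt_cache. */
struct model_cache {
	bool arch_has_per_cpu_cfg;
};

/* Models the Intel init path: QOS_CFG has the same scope as the cache. */
static void model_init_res_defs_intel(struct model_cache *c)
{
	c->arch_has_per_cpu_cfg = false;
}

/* Models the AMD init path: QOS_CFG must be written on every CPU. */
static void model_init_res_defs_amd(struct model_cache *c)
{
	c->arch_has_per_cpu_cfg = true;
}
```

Setting the flag in both places keeps the value explicit for each vendor 
instead of relying on zero-initialization for one of them.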

> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 80fa997fae60..d23262d59a51 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -360,6 +360,8 @@ struct msr_param {
>    *			executing entities
>    * @arch_has_sparse_bitmaps:	True if a bitmap like f00f is valid.
>    * @arch_has_empty_bitmaps:	True if the '0' bitmap is valid.
> + * @arch_needs_update_all:	True if arch needs to update the cache
> + *				settings on all the cpus in the domain.

Please do update this to make it clear what "cache settings" are 
referred to. Since this is in struct rdt_cache perhaps something like 
"QOS_CFG register for this cache level has CPU scope."

>    */
>   struct rdt_cache {
>   	unsigned int	cbm_len;
> @@ -369,6 +371,7 @@ struct rdt_cache {
>   	unsigned int	shareable_bits;
>   	bool		arch_has_sparse_bitmaps;
>   	bool		arch_has_empty_bitmaps;
> +	bool		arch_needs_update_all;
>   };
>   
>   /**
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index af323e2e3100..a005e90b373a 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1905,8 +1905,13 @@ static int set_cache_qos_cfg(int level, bool enable)
>   
>   	r_l = &rdt_resources_all[level];
>   	list_for_each_entry(d, &r_l->domains, list) {
> -		/* Pick one CPU from each domain instance to update MSR */
> -		cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
> +		if (r_l->cache.arch_needs_update_all)
> +			/* Pick all the cpus in the domain instance */
> +			for_each_cpu(cpu, &d->cpu_mask)
> +				cpumask_set_cpu(cpu, cpu_mask);
> +		else
> +			/* Pick one CPU from each domain instance to update MSR */
> +			cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
>   	}
>   	cpu = get_cpu();
>   	/* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
> 

The solution looks good to me, thank you very much.

Reinette
