lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <faabc0ef-6c81-4ca9-b96d-e2e679cfc1a0@intel.com>
Date: Tue, 20 Feb 2024 21:10:47 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Thomas Gleixner <tglx@...utronix.de>
CC: Borislav Petkov <bp@...en8.de>, James Morse <james.morse@....com>,
	"x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "Yu, Fenghua" <fenghua.yu@...el.com>, "Ingo
 Molnar" <mingo@...hat.com>, H Peter Anvin <hpa@...or.com>, Babu Moger
	<Babu.Moger@....com>, "shameerali.kolothum.thodi@...wei.com"
	<shameerali.kolothum.thodi@...wei.com>, D Scott Phillips OS
	<scott@...amperecomputing.com>, "carl@...amperecomputing.com"
	<carl@...amperecomputing.com>, "lcherian@...vell.com" <lcherian@...vell.com>,
	"bobo.shaobowang@...wei.com" <bobo.shaobowang@...wei.com>,
	"tan.shaopeng@...itsu.com" <tan.shaopeng@...itsu.com>,
	"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>, Jamie Iles
	<quic_jiles@...cinc.com>, Xin Hao <xhao@...ux.alibaba.com>,
	"peternewman@...gle.com" <peternewman@...gle.com>, "dfustini@...libre.com"
	<dfustini@...libre.com>, "amitsinght@...vell.com" <amitsinght@...vell.com>,
	David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH] x86/resctrl: Fix WARN in get_domain_from_cpu()

Hi Tony,

Regarding the implication made in the subject ...
from what I understand the WARN is a false positive.

On 2/20/2024 4:34 PM, Tony Luck wrote:
> reset_all_ctrls() and resctrl_arch_update_domains() use
> on_each_cpu_mask() to call rdt_ctrl_update() on potentially
> one CPU from each domain.
> 
> But this means rdt_ctrl_update() needs to figure out which domain
> to apply changes to. Doing so requires a search of all domains
> in a resource, which can only be done safely if cpus_lock is
> held. Both callers do hold this lock, but there isn't a way
> for a function called on another CPU via IPI to verify this.
> 
> Fix by adding the target domain to the msr_param structure and
> calling for each domain separately using smp_call_function_single()

This sounds reasonable to me. Thank you for the proposal.

> Signed-off-by: Tony Luck <tony.luck@...el.com>
> 
> ---
> Either apply on top of tip x86/cache:
> 
>  fb700810d30b ("x86/resctrl: Separate arch and fs resctrl locks")
> 
> or merge this into that commit.

I do not know if it would be preferred to take this approach as
part of this work or just remove the WARN and add this
improvement/refactoring later as a follow-up. 

> ---
>  arch/x86/kernel/cpu/resctrl/internal.h    |  1 +
>  arch/x86/kernel/cpu/resctrl/core.c        | 10 +----
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 50 +++++------------------
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 14 ++-----
>  4 files changed, 16 insertions(+), 59 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index c99f26ebe7a6..c30d7697b431 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -383,6 +383,7 @@ static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
>   */
>  struct msr_param {
>  	struct rdt_resource	*res;
> +	struct rdt_domain	*dom;
>  	u32			low;
>  	u32			high;
>  };
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 8a4ef4f5bddc..8d8b8abcda98 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -390,16 +390,8 @@ void rdt_ctrl_update(void *arg)
>  	struct msr_param *m = arg;
>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(m->res);
>  	struct rdt_resource *r = m->res;
> -	int cpu = smp_processor_id();
> -	struct rdt_domain *d;
>  
> -	d = get_domain_from_cpu(cpu, r);
> -	if (d) {
> -		hw_res->msr_update(d, m, r);
> -		return;
> -	}
> -	pr_warn_once("cpu %d not found in any domain for resource %s\n",
> -		     cpu, r->name);
> +	hw_res->msr_update(m->dom, m, r);

It looks redundant to provide struct msr_param as well as two of its
members as parameters. 

>  }
>  
>  /*
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 7997b47743a2..aed702d06314 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -272,22 +272,6 @@ static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
>  	}
>  }
>  
> -static bool apply_config(struct rdt_hw_domain *hw_dom,
> -			 struct resctrl_staged_config *cfg, u32 idx,
> -			 cpumask_var_t cpu_mask)
> -{
> -	struct rdt_domain *dom = &hw_dom->d_resctrl;
> -
> -	if (cfg->new_ctrl != hw_dom->ctrl_val[idx]) {
> -		cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask);
> -		hw_dom->ctrl_val[idx] = cfg->new_ctrl;
> -
> -		return true;
> -	}
> -
> -	return false;
> -}
> -
>  int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_domain *d,
>  			    u32 closid, enum resctrl_conf_type t, u32 cfg_val)
>  {
> @@ -315,17 +299,13 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
>  	struct rdt_hw_domain *hw_dom;
>  	struct msr_param msr_param;
>  	enum resctrl_conf_type t;
> -	cpumask_var_t cpu_mask;
>  	struct rdt_domain *d;
> +	int cpu;
>  	u32 idx;
>  
>  	/* Walking r->domains, ensure it can't race with cpuhp */
>  	lockdep_assert_cpus_held();
>  
> -	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
> -		return -ENOMEM;
> -
> -	msr_param.res = NULL;
>  	list_for_each_entry(d, &r->domains, list) {
>  		hw_dom = resctrl_to_arch_dom(d);
>  		for (t = 0; t < CDP_NUM_TYPES; t++) {
> @@ -334,29 +314,19 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
>  				continue;
>  
>  			idx = get_config_index(closid, t);
> -			if (!apply_config(hw_dom, cfg, idx, cpu_mask))
> +			if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
>  				continue;
> -
> -			if (!msr_param.res) {
> -				msr_param.low = idx;
> -				msr_param.high = msr_param.low + 1;
> -				msr_param.res = r;
> -			} else {
> -				msr_param.low = min(msr_param.low, idx);
> -				msr_param.high = max(msr_param.high, idx + 1);
> -			}
> +			hw_dom->ctrl_val[idx] = cfg->new_ctrl;
> +			cpu = cpumask_any(&d->cpu_mask);
> +
> +			msr_param.low = idx;
> +			msr_param.high = msr_param.low + 1;
> +			msr_param.res = r;
> +			msr_param.dom = d;
> +			smp_call_function_single(cpu, rdt_ctrl_update, &msr_param, 1);

When CDP is enabled, could this not end up sending IPI to the same CPU twice, each
requesting CPU to do one MSR write instead of sending an IPI once to write all
needed MSRs?


Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ