Message-ID: <Z7b0KPqtwGX4ffY7@linux.ibm.com>
Date: Thu, 20 Feb 2025 14:51:44 +0530
From: Vishal Chourasia <vishalc@...ux.ibm.com>
To: Phil Auld <pauld@...hat.com>
Cc: linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Waiman Long <longman@...hat.com>,
        Vineeth Reddy <vineethr@...ux.ibm.com>
Subject: Re: [PATCH v2] sched/isolation: Make use of more than one
 housekeeping cpu

On Tue, Feb 18, 2025 at 06:46:18PM +0000, Phil Auld wrote:
> The existing code uses housekeeping_any_cpu() to select a cpu for
> a given housekeeping task. However, this often ends up calling
> cpumask_any_and() which is defined as cpumask_first_and() which has
> the effect of always using the first cpu among those available.
> 
> The same applies when multiple NUMA nodes are involved. In that
> case the first cpu in the local node is chosen which does provide
> a bit of spreading but with multiple HK cpus per node the same
> issues arise.
> 
> We have numerous cases where a single HK cpu just cannot keep up
> and the remote_tick warning fires. It also can lead to the other
> things (orchestration sw, HA keepalives etc) on the HK cpus getting
> starved which leads to other issues.  In these cases we recommend
> increasing the number of HK cpus.  But... that only helps the
> userspace tasks somewhat. It does not help the actual housekeeping
> part.
> 
> Spread the HK work out by having housekeeping_any_cpu() and
> sched_numa_find_closest() use cpumask_any_and_distribute()
> instead of cpumask_any_and().
> 
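For readers following along, the difference between the two selection
strategies can be modelled in userspace. This is only a rough sketch and
not the kernel implementation: it assumes a 64-bit word in place of
struct cpumask, and pick_first(), pick_distribute() and NR_CPUS are
hypothetical names invented for the illustration. The kernel keeps the
"previous pick" in per-CPU state, while this model passes it in
explicitly.

```c
#include <stdint.h>

#define NR_CPUS 64u	/* assumption: one 64-bit word stands in for a cpumask */

/*
 * Model of first-match selection: return the lowest set bit in (a & b),
 * or NR_CPUS if the intersection is empty.
 */
static unsigned int pick_first(uint64_t a, uint64_t b)
{
	uint64_t m = a & b;

	for (unsigned int i = 0; i < NR_CPUS; i++)
		if (m & (UINT64_C(1) << i))
			return i;
	return NR_CPUS;
}

/*
 * Model of distributed selection: return the next set bit in (a & b)
 * after *prev, wrapping around, and remember it for the next call.
 * Returns NR_CPUS and leaves *prev untouched if the intersection is empty.
 */
static unsigned int pick_distribute(uint64_t a, uint64_t b, unsigned int *prev)
{
	uint64_t m = a & b;

	for (unsigned int n = 1; n <= NR_CPUS; n++) {
		unsigned int i = (*prev + n) % NR_CPUS;

		if (m & (UINT64_C(1) << i)) {
			*prev = i;
			return i;
		}
	}
	return NR_CPUS;
}
```

With a housekeeping mask of 0x0E (cpus 1-3) and all of cpus 0-7 online,
pick_first() returns cpu 1 on every call, while successive
pick_distribute() calls starting from prev = 0 return 1, 2, 3 and then
wrap back to 1.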
LGTM.

Reviewed-by: Vishal Chourasia <vishalc@...ux.ibm.com>

> Signed-off-by: Phil Auld <pauld@...hat.com>
> Reviewed-by: Waiman Long <longman@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Juri Lelli <juri.lelli@...hat.com>
> Cc: Frederic Weisbecker <frederic@...nel.org>
> Cc: Waiman Long <longman@...hat.com>
> Cc: linux-kernel@...r.kernel.org
> Link: https://lore.kernel.org/lkml/20250211141437.GA349314@pauld.westford.csb/
> 
> ---
> 
> v2: Fix subject line. Update commit message. No code change. 
> 
>  kernel/sched/isolation.c | 2 +-
>  kernel/sched/topology.c  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 81bc8b329ef1..93b038d48900 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -40,7 +40,7 @@ int housekeeping_any_cpu(enum hk_type type)
>  			if (cpu < nr_cpu_ids)
>  				return cpu;
>  
> -			cpu = cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
> +			cpu = cpumask_any_and_distribute(housekeeping.cpumasks[type], cpu_online_mask);
>  			if (likely(cpu < nr_cpu_ids))
>  				return cpu;
>  			/*
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index c49aea8c1025..94133f843485 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2101,7 +2101,7 @@ int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
>  	for (i = 0; i < sched_domains_numa_levels; i++) {
>  		if (!masks[i][j])
>  			break;
> -		cpu = cpumask_any_and(cpus, masks[i][j]);
> +		cpu = cpumask_any_and_distribute(cpus, masks[i][j]);
>  		if (cpu < nr_cpu_ids) {
>  			found = cpu;
>  			break;
> -- 
> 2.47.1
> 
