Message-ID: <c9a39d2e-6829-4bc5-b560-347ee79ff2e8@efficios.com>
Date: Mon, 2 Dec 2024 09:21:13 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Gabriele Monaco <gmonaco@...hat.com>, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] sched: Optimise task_mm_cid_work duration

On 2024-12-02 09:07, Gabriele Monaco wrote:
> The current behaviour of task_mm_cid_work is to loop through all
> possible CPUs twice to clean up old mm_cid remotely; this can waste
> resources, especially for tasks with a restricted CPU affinity.
> 
> This patch reduces the set of CPUs involved in the remote CID cleanup
> carried out by task_mm_cid_work.
> 
> Using the mm_cidmask for the remote cleanup can considerably reduce the
> function runtime in highly isolated environments, where each process has
> affinity to a single core.  In the worst case, the mask is equivalent to
> all possible CPUs and the runtime matches the current behaviour.
> 
> Signed-off-by: Gabriele Monaco <gmonaco@...hat.com>
> ---
>   kernel/sched/core.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 95e40895a519..57b50b5952fa 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -10553,14 +10553,14 @@ static void task_mm_cid_work(struct callback_head *work)
>   		return;
>   	cidmask = mm_cidmask(mm);
>   	/* Clear cids that were not recently used. */
> -	for_each_possible_cpu(cpu)
> +	for_each_cpu_from(cpu, cidmask)

Hi Gabriele,

Thanks for looking into this. I understand that you are after minimizing the
latency introduced by task_mm_cid_work on isolated cores. I think we'll need
to think a bit harder, because the proposed solution does not work:

  * for_each_cpu_from - iterate over CPUs present in @mask, from @cpu to the end of @mask.

cpu is uninitialized here, so this is completely broken. Was this tested
against a workload that actually uses concurrency IDs to ensure it does
not break the whole thing? Did you run the rseq selftests?

Also, the mm_cidmask is a mask of concurrency IDs, not a mask of CPUs,
so using it to iterate over CPUs is wrong.

Mathieu

>   		sched_mm_cid_remote_clear_old(mm, cpu);
>   	weight = cpumask_weight(cidmask);
>   	/*
>   	 * Clear cids that are greater or equal to the cidmask weight to
>   	 * recompact it.
>   	 */
> -	for_each_possible_cpu(cpu)
> +	for_each_cpu_from(cpu, cidmask)
>   		sched_mm_cid_remote_clear_weight(mm, cpu, weight);
>   }
>   

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
