Message-ID: <20240520023823.31604-1-lizhe.67@bytedance.com>
Date: Mon, 20 May 2024 10:38:23 +0800
From: lizhe.67@...edance.com
To: lizhe.67@...edance.com
Cc: juri.lelli@...hat.com,
	linux-kernel@...r.kernel.org,
	mingo@...hat.com,
	peterz@...radead.org,
	vincent.guittot@...aro.org,
	dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	bristot@...hat.com,
	vschneid@...hat.com
Subject: Re: sched/isolation: Fix CPU affinity issues for several task

On Mon, 29 Apr 2024 17:44:27 +0800, Li Zhe wrote:

>If the "nohz_full=" command-line parameter contains CPU 0, the CPU affinity
>of the kernel threads "kthreadd", "rcu_sched", "rcuos%d" and "rcuog%d" will
>always be 0x01, that is, these threads can only run on CPU 0. This is
>not in line with the original design.
>
>The root cause is that the variable 'cpu_valid_mask' in
>__set_cpus_allowed_ptr_locked() contains only CPU 0 before SMP
>initialization is completed. If set_cpus_allowed_ptr() is called with a
>cpumask that does not contain CPU 0, the call fails.
>The threads "kthreadd" and "rcu_sched" call set_cpus_allowed_ptr()
>early in system startup. The threads "rcuos%d" and "rcuog%d" inherit the
>wrong CPU affinity from "kthreadd".
>
>I tried to fix this problem by adapting set_cpus_allowed_ptr(), but
>task_struct->cpus_ptr is referenced or modified during scheduling, which
>makes a fix in set_cpus_allowed_ptr() more difficult. So this patch clears
>CPU 0 from the nohz_full range instead.
>
>Signed-off-by: Li Zhe <lizhe.67@...edance.com>
>---
> kernel/sched/isolation.c | 7 +++++++	
> 1 file changed, 7 insertions(+)
>
>diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>index 5891e715f00d..7b9bcfcd3c55 100644
>--- a/kernel/sched/isolation.c
>+++ b/kernel/sched/isolation.c
>@@ -152,6 +152,13 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
> 	if (cpumask_empty(non_housekeeping_mask))
> 		goto free_housekeeping_staging;
> 
>+	if ((flags & HK_FLAG_KTHREAD) &&
>+		cpumask_test_cpu(smp_processor_id(), non_housekeeping_mask)) {
>+		pr_warn("Housekeeping: Clearing cpu %d from nohz_full range\n", smp_processor_id());
>+		__cpumask_set_cpu(smp_processor_id(), housekeeping_staging);
>+		__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
>+	}
>+
> 	if (!housekeeping.flags) {
> 		/* First setup call ("nohz_full=" or "isolcpus=") */
> 		enum hk_type type;

Friendly ping. Could somebody give me some advice?
