Message-Id: <20250310074044.3656-3-wuyun.abel@bytedance.com>
Date: Mon, 10 Mar 2025 15:40:42 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Josh Don <joshdon@...gle.com>,
	Tianchen Ding <dtcccc@...ux.alibaba.com>
Cc: Abel Wu <wuyun.abel@...edance.com>,
	linux-kernel@...r.kernel.org (open list:SCHEDULER)
Subject: [RFC PATCH 2/2] sched/fair: Do not specialcase SCHED_IDLE cpus in select slowpath

A SCHED_IDLE cgroup, i.e. one whose cpu.idle equals 1, only means
something relative to its siblings, due to the cgroup hierarchical
behavior. So a SCHED_IDLE cpu does NOT necessarily imply any of the
following:

 - It is a less loaded cpu (since the parent of its topmost idle
   ancestor could be a 'giant' entity with large cpu.weight).

 - It can be expected to be preempted by a newly woken task soon
   enough (which actually depends on the ancestors of the two tasks
   that share a common parent).

A less loaded cpu is probably better able to serve the newly woken
task. This also holds among SCHED_IDLE cpus, since a less loaded
SCHED_IDLE cpu is likely to be preempted more easily and quickly. So
let's not special-case SCHED_IDLE cpus, at least in the select slowpath.

Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
---
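Not part of the commit: a minimal userspace sketch of the selection
order after this patch, for illustration only. The struct cpu_info
snapshot and the pick_cpu() helper are hypothetical names, not kernel
API, and the idle-state handling is simplified (no cpumask filtering,
no NULL idle state). It only aims to show that a SCHED_IDLE cpu now
competes on load like any other busy cpu.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cpu snapshot; the kernel reads these from struct rq. */
struct cpu_info {
	bool idle;			/* nothing runnable at all */
	bool sched_idle;		/* only SCHED_IDLE tasks; deliberately ignored below */
	unsigned int exit_latency;	/* idle-state exit latency, lower is shallower */
	uint64_t idle_stamp;		/* time at which the cpu went idle */
	unsigned long load;		/* stand-in for cpu_load() */
};

/* Prefer the shallowest idle cpu, otherwise fall back to the least loaded one. */
static int pick_cpu(const struct cpu_info *cpus, int n, int this_cpu)
{
	unsigned long min_load = ULONG_MAX;
	unsigned int min_exit_latency = UINT_MAX;
	uint64_t latest_idle_timestamp = 0;
	int least_loaded_cpu = this_cpu;
	int shallowest_idle_cpu = -1;
	int i;

	assert(n > 0 && this_cpu >= 0 && this_cpu < n);

	for (i = 0; i < n; i++) {
		const struct cpu_info *c = &cpus[i];

		if (c->idle) {
			if (c->exit_latency < min_exit_latency) {
				/* Shallower idle state: cheaper to wake up. */
				min_exit_latency = c->exit_latency;
				latest_idle_timestamp = c->idle_stamp;
				shallowest_idle_cpu = i;
			} else if (c->exit_latency == min_exit_latency &&
				   c->idle_stamp > latest_idle_timestamp) {
				/* Same depth: prefer the most recently idled cpu. */
				latest_idle_timestamp = c->idle_stamp;
				shallowest_idle_cpu = i;
			}
		} else if (shallowest_idle_cpu == -1) {
			/* No special case for sched_idle: compare load only. */
			if (c->load < min_load) {
				min_load = c->load;
				least_loaded_cpu = i;
			}
		}
	}

	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
}
```

With three busy cpus where the only SCHED_IDLE one also carries the
highest load, this sketch picks the cpu with the lowest load instead
of the SCHED_IDLE one. A truly idle cpu still wins over both.
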
 kernel/sched/fair.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 379764bd2795..769505cf519b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7446,7 +7446,7 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
 	unsigned int min_exit_latency = UINT_MAX;
 	u64 latest_idle_timestamp = 0;
 	int least_loaded_cpu = this_cpu;
-	int shallowest_idle_cpu = -1, si_cpu = -1;
+	int shallowest_idle_cpu = -1;
 	int i;
 
 	/* Check if we have any choice: */
@@ -7481,12 +7481,13 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
 				latest_idle_timestamp = rq->idle_stamp;
 				shallowest_idle_cpu = i;
 			}
-		} else if (shallowest_idle_cpu == -1 && si_cpu == -1) {
-			if (sched_idle_cpu(i)) {
-				si_cpu = i;
-				continue;
-			}
-
+		} else if (shallowest_idle_cpu == -1) {
+			/*
+			 * A SCHED_IDLE cpu does not necessarily mean anything
+			 * to @p due to the cgroup hierarchical behavior. But it
+			 * is almost certain that the wakee will be better served
+			 * if the cpu is less loaded.
+			 */
 			load = cpu_load(cpu_rq(i));
 			if (load < min_load) {
 				min_load = load;
@@ -7495,11 +7496,7 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
 		}
 	}
 
-	if (shallowest_idle_cpu != -1)
-		return shallowest_idle_cpu;
-	if (si_cpu != -1)
-		return si_cpu;
-	return least_loaded_cpu;
+	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
 }
 
 static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct task_struct *p,
-- 
2.37.3

