Message-ID: <20250627035420.37712-1-yangyicong@huawei.com>
Date: Fri, 27 Jun 2025 11:54:20 +0800
From: Yicong Yang <yangyicong@...wei.com>
To: <mingo@...hat.com>, <peterz@...radead.org>, <juri.lelli@...hat.com>,
<vincent.guittot@...aro.org>
CC: <dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
<mgorman@...e.de>, <vschneid@...hat.com>, <linux-kernel@...r.kernel.org>,
<linuxarm@...wei.com>, <prime.zeng@...ilicon.com>, <yangyicong@...ilicon.com>
Subject: [PATCH] sched/deadline: Don't count nr_running twice for dl_server proxy tasks
From: Yicong Yang <yangyicong@...ilicon.com>

On CPU offline, the kernel stalled with the call traces below:
INFO: task kworker/0:1:11 blocked for more than 120 seconds.
Tainted: G O 6.15.0-rc4+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:1 state:D stack:0 pid:11 tgid:11 ppid:2 task_flags:0x4208060 flags:0x00000008
Workqueue: events vmstat_shepherd
Call trace:
__switch_to+0x118/0x188 (T)
__schedule+0x31c/0x1300
schedule+0x3c/0x120
percpu_rwsem_wait+0x12c/0x1b0
__percpu_down_read+0x78/0x188
cpus_read_lock+0xc4/0xe8
vmstat_shepherd+0x30/0x138
process_one_work+0x154/0x3c8
worker_thread+0x2e8/0x400
kthread+0x154/0x230
ret_from_fork+0x10/0x20

INFO: task kworker/1:1:116 blocked for more than 120 seconds.
Tainted: G O 6.15.0-rc4+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:1 state:D stack:0 pid:116 tgid:116 ppid:2 task_flags:0x4208060 flags:0x00000008
Workqueue: events work_for_cpu_fn
Call trace:
__switch_to+0x118/0x188 (T)
__schedule+0x31c/0x1300
schedule+0x3c/0x120
schedule_timeout+0x10c/0x120
__wait_for_common+0xc4/0x1b8
wait_for_completion+0x28/0x40
cpuhp_kick_ap_work+0x114/0x3c8
_cpu_down+0x130/0x4b8
__cpu_down_maps_locked+0x20/0x38
work_for_cpu_fn+0x24/0x40
process_one_work+0x154/0x3c8
worker_thread+0x2e8/0x400
kthread+0x154/0x230
ret_from_fork+0x10/0x20

The cpuhp thread holds the cpu hotplug lock endlessly, stalling
vmstat_shepherd. This happens because nr_running is counted twice when
enqueuing the cpuhp thread, so the cpuhp wait condition can never be met:

  enqueue_task_fair() // pick cpuhp from idle, rq->nr_running = 0
    dl_server_start()
      [...]
      add_nr_running() // rq->nr_running = 1
    add_nr_running() // rq->nr_running = 2
  [switch to cpuhp, waiting on balance_hotplug_wait()]
  rcuwait_wait_event(rq->nr_running == 1 && ...) // failed, rq->nr_running=2
  schedule() // wait again

It doesn't make sense to count one single task twice in rq->nr_running.
Fix this by skipping the rq->nr_running accounting for dl_server
entities, since the tasks they serve as a proxy for are already
accounted by their own sched class.
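
For illustration only (not part of the patch): below is a minimal
userspace C sketch of the accounting described above. struct rq,
inc_dl_tasks() and enqueue_cpuhp() are simplified stand-ins for the
kernel's code, and the "fixed" flag toggles the dl_server() check this
patch adds.

/*
 * Minimal userspace model of the double accounting. The kernel
 * structures are reduced to a bare nr_running counter.
 */
#include <stdbool.h>
#include <stdio.h>

struct rq {
	unsigned int nr_running;
};

/* Stand-in for inc_dl_tasks(): once fixed, dl_server entities are skipped. */
static void inc_dl_tasks(struct rq *rq, bool is_dl_server, bool fixed)
{
	if (!(fixed && is_dl_server))
		rq->nr_running++;	/* add_nr_running() */
}

/*
 * Stand-in for the enqueue path in the changelog: enqueue_task_fair()
 * starts the dl_server (one accounting via inc_dl_tasks()) and then
 * accounts the fair task itself (a second accounting for one task).
 */
static void enqueue_cpuhp(struct rq *rq, bool fixed)
{
	inc_dl_tasks(rq, true, fixed);	/* dl_server_start() path */
	rq->nr_running++;		/* fair-class add_nr_running() */
}

int main(void)
{
	struct rq rq = { .nr_running = 0 };

	enqueue_cpuhp(&rq, false);
	/* balance_hotplug_wait(): only the cpuhp thread may remain. */
	printf("before fix: nr_running=%u, wait %s\n", rq.nr_running,
	       rq.nr_running == 1 ? "succeeds" : "fails");

	rq.nr_running = 0;
	enqueue_cpuhp(&rq, true);
	printf("after fix:  nr_running=%u, wait %s\n", rq.nr_running,
	       rq.nr_running == 1 ? "succeeds" : "fails");

	return 0;
}

Running the sketch prints "before fix: nr_running=2, wait fails" and
"after fix: nr_running=1, wait succeeds", matching the trace above.
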
Signed-off-by: Yicong Yang <yangyicong@...ilicon.com>
---
kernel/sched/deadline.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ad45a8fea245..59fb178762ee 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1894,7 +1894,9 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 	u64 deadline = dl_se->deadline;
 
 	dl_rq->dl_nr_running++;
-	add_nr_running(rq_of_dl_rq(dl_rq), 1);
+
+	if (!dl_server(dl_se))
+		add_nr_running(rq_of_dl_rq(dl_rq), 1);
 
 	inc_dl_deadline(dl_rq, deadline);
 }
@@ -1904,7 +1906,9 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 {
 	WARN_ON(!dl_rq->dl_nr_running);
 	dl_rq->dl_nr_running--;
-	sub_nr_running(rq_of_dl_rq(dl_rq), 1);
+
+	if (!dl_server(dl_se))
+		sub_nr_running(rq_of_dl_rq(dl_rq), 1);
 
 	dec_dl_deadline(dl_rq, dl_se->deadline);
 }
--
2.24.0