Message-ID: <ZyF4rw_nvfpHfouv@slm.duckdns.org>
Date: Tue, 29 Oct 2024 14:07:11 -1000
From: Tejun Heo <tj@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, David Vernet <void@...ifault.com>,
	sched-ext@...a.com
Subject: [RFC PATCH sched/urgent] sched: Task still delay-dequeued after
 switched from fair

On the current tip/sched/urgent, the following can be easily triggered by
running `tools/testing/selftests/sched_ext/runner -t reload_loop`:

  p->se.sched_delayed
  WARNING: CPU: 0 PID: 1686 at kernel/sched/fair.c:13191 switched_to_fair+0x7a/0x80
  ...
  Sched_ext: maximal (disabling)
  RIP: 0010:switched_to_fair+0x7a/0x80
  Code: a6 fe ff 5b 41 5e c3 cc cc cc cc cc 4c 89 f7 5b 41 5e e9 49 7f fe ff c6 05 53 c0 80 02 01 48 c7 c7 27 4a e6 82 e8 c6 8f fa ff <0f> 0b eb a2 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
  RSP: 0018:ffffc90001253d40 EFLAGS: 00010086
  RAX: 0000000000000013 RBX: ffff888103a6d380 RCX: 0000000000000027
  RDX: 0000000000000002 RSI: 00000000ffffdfff RDI: ffff888237c1b448
  RBP: 0000000000030380 R08: 0000000000001fff R09: ffffffff8368e000
  R10: 0000000000005ffd R11: 0000000000000004 R12: ffffc90001253d58
  R13: ffffffff82eda0c0 R14: ffff888237db0380 R15: ffff888103a6d380
  FS:  0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fa289417000 CR3: 0000000003e58000 CR4: 0000000000750eb0
  PKRU: 55555554
  Call Trace:
   <TASK>
   scx_ops_disable_workfn+0x71b/0x930
   kthread_worker_fn+0x105/0x2a0
   kthread+0xe8/0x110
   ret_from_fork+0x33/0x40
   ret_from_fork_asm+0x1a/0x30
   </TASK>

The problem is that when a task is switched from fair to ext, it can remain
delay-dequeued, which triggers the above warning when the task goes back to
fair. I can work around it with the following patch, but it doesn't seem like
the right way to handle it. Shouldn't e.g. fair->switched_from() cancel
delayed dequeue?
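
To illustrate the sequence, here is a userspace toy model. It is not kernel
code: every name in it (toy_task, toy_switch_class, and so on) is
hypothetical, and the flag only stands in for p->se.sched_delayed. It shows
the flag surviving a fair -> ext -> fair round trip unless something clears
it when the task leaves fair.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code; every name here is hypothetical. */
enum toy_class { TOY_FAIR, TOY_EXT };

struct toy_task {
	enum toy_class cls;
	bool sched_delayed;	/* stands in for p->se.sched_delayed */
};

/*
 * Switch the task to another class. If cancel_delayed is set, a pending
 * delayed dequeue is cleared when the task leaves the fair class.
 */
static void toy_switch_class(struct toy_task *p, enum toy_class to,
			     bool cancel_delayed)
{
	if (p->cls == TOY_FAIR && to != TOY_FAIR && cancel_delayed)
		p->sched_delayed = false;
	p->cls = to;
}

/* Models the check that fires in the trace: true means it would warn. */
static bool toy_switched_to_fair_warns(const struct toy_task *p)
{
	return p->cls == TOY_FAIR && p->sched_delayed;
}

/* Run fair -> ext -> fair and report whether the warning would trigger. */
static bool toy_round_trip_warns(bool cancel_delayed)
{
	struct toy_task p = { .cls = TOY_FAIR, .sched_delayed = true };

	toy_switch_class(&p, TOY_EXT, cancel_delayed);
	assert(p.cls == TOY_EXT);
	toy_switch_class(&p, TOY_FAIR, cancel_delayed);
	return toy_switched_to_fair_warns(&p);
}
```

In this model, toy_round_trip_warns(false) reports the stale flag, while
toy_round_trip_warns(true) does not. Whether the real cancellation belongs
in the switched_from path or at the call site is exactly the open question.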

Thanks.

---
 kernel/sched/ext.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 65334c13ffa5..601aad1a2625 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5205,8 +5205,12 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
 	while ((p = scx_task_iter_next_locked(&sti))) {
 		const struct sched_class *old_class = p->sched_class;
 		struct sched_enq_and_set_ctx ctx;
+		int deq_flags = DEQUEUE_SAVE | DEQUEUE_MOVE;
 
-		sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);
+		if (p->se.sched_delayed)
+			deq_flags |= DEQUEUE_SLEEP | DEQUEUE_DELAYED;
+
+		sched_deq_and_put_task(p, deq_flags, &ctx);
 
 		p->scx.slice = SCX_SLICE_DFL;
 		p->sched_class = __setscheduler_class(p->policy, p->prio);
