Message-ID: <20241030104934.GK14555@noisy.programming.kicks-ass.net>
Date: Wed, 30 Oct 2024 11:49:34 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, David Vernet <void@...ifault.com>,
	sched-ext@...a.com
Subject: Re: [RFC PATCH sched/urgent] sched: Task still delay-dequeued after
 switched from fair

On Tue, Oct 29, 2024 at 02:07:11PM -1000, Tejun Heo wrote:
> On the current tip/sched/urgent, the following can be easily triggered by
> running `tools/testing/selftests/sched_ext/runner -t reload_loop`:

> The problem is that when tasks are switched from fair to ext, they can
> remain delay-dequeued, triggering the above warning when a task goes
> back to fair.

> I can work around it with the following patch but it
> doesn't seem like the right way to handle it. Shouldn't e.g.
> fair->switched_from() cancel delayed dequeue?

->switched_from() used to do this, but by then it is too late. I have an
item fairly high on my TODO list to rework the whole
switch{ing,ed}_{from,to} hookery to make all this more sane.

But yeah, it seems I missed the below case where we are switching class.

> ---
>  kernel/sched/ext.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 65334c13ffa5..601aad1a2625 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -5205,8 +5205,12 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
>  	while ((p = scx_task_iter_next_locked(&sti))) {
>  		const struct sched_class *old_class = p->sched_class;
>  		struct sched_enq_and_set_ctx ctx;
> +		int deq_flags = DEQUEUE_SAVE | DEQUEUE_MOVE;
>  
> -		sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);
> +		if (p->se.sched_delayed)
> +			deq_flags |= DEQUEUE_SLEEP | DEQUEUE_DELAYED;
> +
> +		sched_deq_and_put_task(p, deq_flags, &ctx);

I don't think this is quite right; the problem is that in this case
ctx.queued reports true, even though you want it to be false.

This is why 98442f0ccd82 ("sched: Fix delayed_dequeue vs switched_from_fair()")
adds a second dequeue.
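
To illustrate the ctx.queued point, here is a minimal toy model in plain
userspace C. It is only a sketch: every toy_* type and helper below is a
hypothetical stand-in, not the kernel's code, and it assumes that the
save/restore pair simply snapshots "is the task queued?" and re-enqueues
when the snapshot is true. Under that assumption, a delay-dequeued task
still looks queued at snapshot time unless the delayed dequeue is
completed first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not the kernel's definitions. */
enum toy_class { TOY_FAIR, TOY_EXT };

struct toy_task {
	enum toy_class class;
	bool on_rq;		/* task is accounted as queued on the runqueue */
	bool sched_delayed;	/* a dequeue was requested but deferred */
};

struct toy_ctx {
	struct toy_task *p;
	bool queued;		/* snapshot: re-enqueue after the class change? */
};

/* Models a save-style dequeue: remember the queued state, then remove. */
static void toy_deq_and_put(struct toy_task *p, struct toy_ctx *ctx)
{
	assert(p && ctx);
	ctx->p = p;
	ctx->queued = p->on_rq;
	p->on_rq = false;
}

/* Models restoring whatever state the snapshot recorded. */
static void toy_enq_and_set(struct toy_ctx *ctx)
{
	if (ctx->queued)
		ctx->p->on_rq = true;
}

/* Models completing the deferred dequeue: the task really leaves the rq. */
static void toy_finish_delayed_dequeue(struct toy_task *p)
{
	p->sched_delayed = false;
	p->on_rq = false;
}

/*
 * Switch the task to new_class. Returns true if the task ends up queued
 * in the new class. With extra_dequeue set, the delayed dequeue is
 * completed before the snapshot is taken.
 */
static bool toy_switch_class(struct toy_task *p, enum toy_class new_class,
			     bool extra_dequeue)
{
	struct toy_ctx ctx;

	if (extra_dequeue && p->class != new_class && p->sched_delayed)
		toy_finish_delayed_dequeue(p);

	toy_deq_and_put(p, &ctx);
	p->class = new_class;
	toy_enq_and_set(&ctx);

	return p->on_rq;
}
```

Calling toy_switch_class() on a task with on_rq and sched_delayed both
set, without the extra dequeue, leaves it queued in the new class with
sched_delayed still set. With the extra dequeue, the snapshot sees the
task as not queued, so nothing is re-enqueued.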

Also, you seem to have a second instance of all that.

Does the below work for you? I suppose I might as well go work on that
TODO item now.

---
 kernel/sched/ext.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 40bdfe84e4f0..587e7d1a1e96 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4489,11 +4489,16 @@ static void scx_ops_disable_workfn(struct kthread_work *work)
 	scx_task_iter_start(&sti);
 	while ((p = scx_task_iter_next_locked(&sti))) {
 		const struct sched_class *old_class = p->sched_class;
+		const struct sched_class *new_class =
+			__setscheduler_class(p->policy, p->prio);
 		struct sched_enq_and_set_ctx ctx;
 
+		if (old_class != new_class && p->se.sched_delayed)
+			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
+
 		sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);
 
-		p->sched_class = __setscheduler_class(p->policy, p->prio);
+		p->sched_class = new_class;
 		check_class_changing(task_rq(p), p, old_class);
 
 		sched_enq_and_set_task(&ctx);
@@ -5199,12 +5204,17 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
 	scx_task_iter_start(&sti);
 	while ((p = scx_task_iter_next_locked(&sti))) {
 		const struct sched_class *old_class = p->sched_class;
+		const struct sched_class *new_class =
+			__setscheduler_class(p->policy, p->prio);
 		struct sched_enq_and_set_ctx ctx;
 
+		if (old_class != new_class && p->se.sched_delayed)
+			dequeue_task(task_rq(p), p, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
+
 		sched_deq_and_put_task(p, DEQUEUE_SAVE | DEQUEUE_MOVE, &ctx);
 
 		p->scx.slice = SCX_SLICE_DFL;
-		p->sched_class = __setscheduler_class(p->policy, p->prio);
+		p->sched_class = new_class;
 		check_class_changing(task_rq(p), p, old_class);
 
 		sched_enq_and_set_task(&ctx);
