Message-ID: <20240709211001.1198145-2-tj@kernel.org>
Date: Tue,  9 Jul 2024 11:09:42 -1000
From: Tejun Heo <tj@...nel.org>
To: void@...ifault.com
Cc: linux-kernel@...r.kernel.org,
	kernel-team@...a.com,
	schatzberg.dan@...il.com,
	Tejun Heo <tj@...nel.org>
Subject: [PATCH 1/2] sched_ext: Reimplement scx_bpf_reenqueue_local()

scx_bpf_reenqueue_local() is used to re-enqueue tasks on the local DSQ from
ops.cpu_release(). Because the BPF scheduler may dispatch tasks back to the
same local DSQ, to avoid processing the same tasks repeatedly, it first reads
the number of queued tasks and then processes the task at the head of the
queue that many times.
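
For context, the typical caller looks roughly like the sketch below. This is
illustrative only, modeled on the in-tree example schedulers under
tools/sched_ext; the BPF_STRUCT_OPS() convenience macro comes from those
examples and the example_* name is made up:

  /* BPF scheduler side: when a higher-priority sched class takes the CPU,
   * hand every task still queued on the local DSQ back to ops.enqueue()
   * so it can be placed somewhere else.
   */
  void BPF_STRUCT_OPS(example_cpu_release, s32 cpu,
                      struct scx_cpu_release_args *args)
  {
          scx_bpf_reenqueue_local();
  }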

This is incorrect because a task can be dispatched back to the same local DSQ
with SCX_ENQ_HEAD, which puts it at the head of the queue. Such a task is
processed repeatedly until the count is exhausted, and the tasks queued behind
it are never processed at all.
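
As an illustration of how this can happen, consider an ops.enqueue() that
answers the SCX_ENQ_REENQ re-enqueue with a head insertion into the same
local DSQ, roughly like the sketch below (illustrative only; it uses the
scx_bpf_dispatch() kfunc and the SCX_DSQ_LOCAL/SCX_DSQ_GLOBAL/SCX_ENQ_HEAD
flags, and the example_* name is made up). Every pass of the count-based
loop then finds the same task at the head of the queue:

  void BPF_STRUCT_OPS(example_enqueue, struct task_struct *p, u64 enq_flags)
  {
          if (enq_flags & SCX_ENQ_REENQ) {
                  /* Keep the task on this CPU, at the head of the local
                   * DSQ. first_local_task() then returns this same task
                   * again on the next iteration of the count-based loop.
                   */
                  scx_bpf_dispatch(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL,
                                   SCX_ENQ_HEAD);
                  return;
          }
          scx_bpf_dispatch(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags);
  }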

Fix it by first moving all candidate tasks to a private list and then
processing that list. While at it, remove the WARNs; they're rather
superfluous as later steps check the same conditions anyway.

Signed-off-by: Tejun Heo <tj@...nel.org>
Fixes: 245254f7081d ("sched_ext: Implement sched_ext_ops.cpu_acquire/release()")
Cc: David Vernet <void@...ifault.com>
---
 kernel/sched/ext.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index f16d72d72635..0fabfe19327c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5628,8 +5628,10 @@ __bpf_kfunc_start_defs();
  */
 __bpf_kfunc u32 scx_bpf_reenqueue_local(void)
 {
-	u32 nr_enqueued, i;
+	LIST_HEAD(tasks);
+	u32 nr_enqueued = 0;
 	struct rq *rq;
+	struct task_struct *p, *n;
 
 	if (!scx_kf_allowed(SCX_KF_CPU_RELEASE))
 		return 0;
@@ -5638,22 +5640,20 @@ __bpf_kfunc u32 scx_bpf_reenqueue_local(void)
 	lockdep_assert_rq_held(rq);
 
 	/*
-	 * Get the number of tasks on the local DSQ before iterating over it to
-	 * pull off tasks. The enqueue callback below can signal that it wants
-	 * the task to stay on the local DSQ, and we want to prevent the BPF
-	 * scheduler from causing us to loop indefinitely.
+	 * The BPF scheduler may choose to dispatch tasks back to
+	 * @rq->scx.local_dsq. Move all candidate tasks off to a private list
+	 * first to avoid processing the same tasks repeatedly.
 	 */
-	nr_enqueued = rq->scx.local_dsq.nr;
-	for (i = 0; i < nr_enqueued; i++) {
-		struct task_struct *p;
-
-		p = first_local_task(rq);
-		WARN_ON_ONCE(atomic_long_read(&p->scx.ops_state) !=
-			     SCX_OPSS_NONE);
-		WARN_ON_ONCE(!(p->scx.flags & SCX_TASK_QUEUED));
-		WARN_ON_ONCE(p->scx.holding_cpu != -1);
+	list_for_each_entry_safe(p, n, &rq->scx.local_dsq.list,
+				 scx.dsq_list.node) {
 		dispatch_dequeue(rq, p);
+		list_add_tail(&p->scx.dsq_list.node, &tasks);
+	}
+
+	list_for_each_entry_safe(p, n, &tasks, scx.dsq_list.node) {
+		list_del_init(&p->scx.dsq_list.node);
 		do_enqueue_task(rq, p, SCX_ENQ_REENQ, -1);
+		nr_enqueued++;
 	}
 
 	return nr_enqueued;
-- 
2.45.2

