[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aN4pUAfE30rF6-n4@gpd4>
Date: Thu, 2 Oct 2025 09:27:12 +0200
From: Andrea Righi <arighi@...dia.com>
To: Ryan Newton <rrnewton@...il.com>
Cc: linux-kernel@...r.kernel.org, tj@...nel.org, newton@...a.com
Subject: Re: [PATCH 2/3] sched_ext: optimize first_task update logic
On Wed, Oct 01, 2025 at 10:57:20PM -0400, Ryan Newton wrote:
> From: Ryan Newton <newton@...a.com>
>
> This is a follow-on optimization to the prior commit which added a
> lockless peek operation on DSQs. That implementation is correct and
> simple, but elides several optimizations.
>
> Previously, we read the first_task using the same slowpath, irrespective
> of where we enqueue the task. With this change, we instead base the
> update on what we know about the calling context. On both insert and
> removal we can break down whether the change (1) definitely, (2) never,
> or (3) sometimes changes first task. In some cases we know what the new
> first task will be, and can set it more directly.
>
> Signed-off-by: Ryan Newton <newton@...a.com>
> ---
> kernel/sched/ext.c | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index fd0121c03311..1cb10aa9913a 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -953,8 +953,11 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq,
> container_of(rbp, struct task_struct,
> scx.dsq_priq);
> list_add(&p->scx.dsq_list.node, &prev->scx.dsq_list.node);
> + /* first task unchanged - no update needed */
> } else {
> list_add(&p->scx.dsq_list.node, &dsq->list);
> + /* new task is at head - use fastpath */
> + rcu_assign_pointer(dsq->first_task, p);
> }
> } else {
> /* a FIFO DSQ shouldn't be using PRIQ enqueuing */
> @@ -962,15 +965,20 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq,
> scx_error(sch, "DSQ ID 0x%016llx already had PRIQ-enqueued tasks",
> dsq->id);
>
> - if (enq_flags & (SCX_ENQ_HEAD | SCX_ENQ_PREEMPT))
> + if (enq_flags & (SCX_ENQ_HEAD | SCX_ENQ_PREEMPT)) {
> list_add(&p->scx.dsq_list.node, &dsq->list);
> - else
> + /* new task inserted at head - use fastpath */
> + rcu_assign_pointer(dsq->first_task, p);
> + } else {
> + bool was_empty;
> +
> + was_empty = list_empty(&dsq->list);
> list_add_tail(&p->scx.dsq_list.node, &dsq->list);
> + if (was_empty)
> + rcu_assign_pointer(dsq->first_task, p);
> + }
> }
>
> - /* even the add_tail code path may have changed the first element */
> - dsq_update_first_task(dsq);
> -
> /* seq records the order tasks are queued, used by BPF DSQ iterator */
> dsq->seq++;
> p->scx.dsq_seq = dsq->seq;
> @@ -1023,9 +1031,15 @@ static void task_unlink_from_dsq(struct task_struct *p,
> p->scx.dsq_flags &= ~SCX_TASK_DSQ_ON_PRIQ;
> }
>
> + if (dsq->first_task == p) {
> + if (dsq->id & SCX_DSQ_FLAG_BUILTIN)
> + rcu_assign_pointer(dsq->first_task,
> + list_next_entry(p, scx.dsq_list.node));
nit: no need to split in two lines, it should fit in the 100 characters per
line limit.
> + else
> + dsq_update_first_task(dsq);
> + }
However, from my comment in PATCH 1/3, if we allow to use
scx_bpf_dsq_peek() only with user DSQs this would become:
if (!(dsq->id & SCX_DSQ_FLAG_BUILTIN) && dsq->first_task == p)
dsq_update_first_task(dsq);
> list_del_init(&p->scx.dsq_list.node);
> dsq_mod_nr(dsq, -1);
> - dsq_update_first_task(dsq);
> }
>
> static void dispatch_dequeue(struct rq *rq, struct task_struct *p)
> --
> 2.51.0
>
Thanks,
-Andrea
Powered by blists - more mailing lists