linux-kernel - Re: [PATCH] sched_ext: Clear direct dispatch state on dequeue when dsq is NULL

Message-ID: <aXFA4-b4WLPjxCME@gpd4>
Date: Wed, 21 Jan 2026 22:10:59 +0100
From: Andrea Righi <arighi@...dia.com>
To: Daniel Hodges <hodgesd@...a.com>
Cc: tj@...nel.org, void@...ifault.com, changwoo@...lia.com,
	sched-ext@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched_ext: Clear direct dispatch state on dequeue when
 dsq is NULL

Hi Daniel,

On Wed, Jan 21, 2026 at 07:56:02AM -0800, Daniel Hodges wrote:
> When a task is direct-dispatched from ops.select_cpu() or ops.enqueue(),
> ddsp_dsq_id is set to indicate the target DSQ. If the task is dequeued
> before dispatch_enqueue() completes (e.g., task killed or receives a
> signal), dispatch_dequeue() is called with dsq == NULL.
> 
> In this case, the task is unlinked from ddsp_deferred_locals and
> holding_cpu is cleared, but ddsp_dsq_id and ddsp_enq_flags are left
> stale. On the next wakeup, when ops.select_cpu() calls
> scx_bpf_dsq_insert(), mark_direct_dispatch() finds ddsp_dsq_id already
> set and triggers:
> 
>   WARNING: CPU: 56 PID: 2323042 at kernel/sched/ext.c:2157
>            scx_bpf_dsq_insert+0x16b/0x1d0
> 
> Fix this by clearing ddsp_dsq_id and ddsp_enq_flags in dispatch_dequeue()
> when dsq is NULL, ensuring clean state for subsequent wakeups.

I tried to fix this a while ago (same issue as this, right?
https://github.com/sched-ext/scx/issues/2758). I remember applying
exactly the same patch, but I was still able to trigger the warning.

IIRC there's also a race involving ttwu_queue_wakelist tasks and
sched_setscheduler() that can hit the stale ddsp_dsq_id (and maybe other
cases).

Long story short, the only thing that worked reliably for me was clearing
ddsp_dsq_id and ddsp_enq_flags in select_task_rq_scx(), but I thought that
was overkill, and I never finished investigating the real issue...
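
To make the stale-state sequence concrete, here is a minimal userspace
sketch, not kernel code. Everything prefixed with model_ or MODEL_ is
hypothetical and only mirrors the field names from the patch; the
placeholder value for the invalid DSQ id is an assumption and is not the
real SCX_DSQ_INVALID. It only models the dequeue-with-NULL-dsq path, not
the other races mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder sentinel for this model only (assumption, not the kernel value). */
#define MODEL_DSQ_INVALID UINT64_MAX

/* Hypothetical stand-in for the direct dispatch fields in p->scx. */
struct model_task {
	uint64_t ddsp_dsq_id;
	uint64_t ddsp_enq_flags;
	int holding_cpu;
};

/* Counts how often the "ddsp_dsq_id already set" condition is hit. */
static int model_warnings;

/* Models the check described for mark_direct_dispatch(): warn if state is stale. */
static void model_mark_direct_dispatch(struct model_task *p, uint64_t dsq_id,
				       uint64_t enq_flags)
{
	if (p->ddsp_dsq_id != MODEL_DSQ_INVALID)
		model_warnings++;
	p->ddsp_dsq_id = dsq_id;
	p->ddsp_enq_flags = enq_flags;
}

/* Models the dsq == NULL branch of dispatch_dequeue(), with or without the fix. */
static void model_dequeue_no_dsq(struct model_task *p, bool clear_ddsp)
{
	if (p->holding_cpu >= 0)
		p->holding_cpu = -1;

	if (clear_ddsp) {
		p->ddsp_dsq_id = MODEL_DSQ_INVALID;
		p->ddsp_enq_flags = 0;
	}
}

/* Wakeup, dequeue before the dispatch completes, then a second wakeup. */
static int model_run(bool clear_ddsp)
{
	struct model_task p = {
		.ddsp_dsq_id = MODEL_DSQ_INVALID,
		.ddsp_enq_flags = 0,
		.holding_cpu = -1,
	};

	model_warnings = 0;
	model_mark_direct_dispatch(&p, 1, 0);	/* first wakeup */
	model_dequeue_no_dsq(&p, clear_ddsp);	/* dequeued early */
	assert(p.holding_cpu == -1);
	model_mark_direct_dispatch(&p, 1, 0);	/* next wakeup */

	return model_warnings;
}
```

In this model, model_run(false) counts one warning because the second
wakeup still sees the id left by the first, while model_run(true) counts
none.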

In conclusion, I think this fixes some of the warnings that we see and
it's probably good to apply, but it doesn't fix all of them.

Anyway, I'll do some tests with this patch and report back!

Thanks,
-Andrea

> 
> Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
> Signed-off-by: Daniel Hodges <hodgesd@...a.com>
> ---
>  kernel/sched/ext.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index b563b8c3fd24..fdfef3fd8814 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1143,20 +1143,28 @@ static void dispatch_dequeue(struct rq *rq, struct task_struct *p)
>  
>  		/*
>  		 * When dispatching directly from the BPF scheduler to a local
>  		 * DSQ, the task isn't associated with any DSQ but
>  		 * @p->scx.holding_cpu may be set under the protection of
>  		 * %SCX_OPSS_DISPATCHING.
>  		 */
>  		if (p->scx.holding_cpu >= 0)
>  			p->scx.holding_cpu = -1;
>  
> +		/*
> +		 * Clear direct dispatch state. The task may have been
> +		 * direct-dispatched from ops.select_cpu() or ops.enqueue()
> +		 * but dequeued before the dispatch completed.
> +		 */
> +		p->scx.ddsp_dsq_id = SCX_DSQ_INVALID;
> +		p->scx.ddsp_enq_flags = 0;
> +
>  		return;
>  	}
>  
>  	if (!is_local)
>  		raw_spin_lock(&dsq->lock);
>  
>  	/*
>  	 * Now that we hold @dsq->lock, @p->holding_cpu and @p->scx.dsq_* can't
>  	 * change underneath us.
>  	*/
> -- 
> 2.47.3
>