linux-kernel - Re: [PATCH 2/2] selftests/sched_ext: Add test to validate ops.dequeue() semantics


Message-ID: <aYreQnz5h4E08E4a@gpd4>
Date: Tue, 10 Feb 2026 08:29:06 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: Emil Tsalapatis <emil@...alapatis.com>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>,
	Kuba Piecuch <jpiecuch@...gle.com>,
	Christian Loehle <christian.loehle@....com>,
	Daniel Hodges <hodgesd@...a.com>, sched-ext@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] selftests/sched_ext: Add test to validate
 ops.dequeue() semantics

On Mon, Feb 09, 2026 at 02:42:04PM -1000, Tejun Heo wrote:
> Hello, Andrea.
> 
> On Mon, Feb 09, 2026 at 11:22:19PM +0100, Andrea Righi wrote:
> ...
> > Ok, what you're saying is that a direct dispatch from ops.select_cpu() is
> > just a shortcut for work that would otherwise happen at the head of
> > ops.enqueue().
> >
> > So, while ops.select_cpu() itself is not "being in scheduler custody", the
> > semantic operation of dispatching a task is still the scheduler taking
> > control of the task. As a result, a dispatch to a user DSQ from
> > ops.select_cpu() should be treated the same as a dispatch to a user DSQ
> > from ops.enqueue() for the purpose of triggering ops.dequeue(). The fact
> > that this happens in ops.select_cpu() rather than ops.enqueue() is an
> > implementation detail, not a semantic boundary.
> 
> Yes.
> 
> > Under this interpretation, storing a task in BPF internal data structures
> > from ops.select_cpu() should not trigger ops.dequeue(), since the task has
> > not been put under scheduler control yet. However, dispatching a task to a
> 
> Also, ops.select_cpu() putting the task in a BPF struct doesn't affect
> what's happening in the enqueue path. ops.enqueue() will still be invoked
> and the task will be transferred to BPF side iff ops.enqueue() does not
> perform a direct dispatch. Imagine the following (unlikely but possible)
> scenario:
> 
>    CPU A                                   CPU B
> 
>    ops.select_cpu() puts task in a BPF
>      data structure
>                                            ops.dispatch() sees the task, dequeues it and
>                                            dispatches it to CPU B's local DSQ.
> 
>                                            finish_dispatch() runs but the task is still
>                                            SCX_OPSS_NONE and dispatch attempt is ignored.
> 
>    ops.enqueue() runs and returns without
>    doing anything. Task transitions to
>    SCX_OPSS_QUEUED.
> 
> Afterwards, the kernel considers the task to be owned by BPF but the BPF
> side thinks the task has already been dispatched. It just doesn't make much
> sense to do BPF enqueue operation from ops.select_cpu(). The only reason it
> works for direct dispatch is because the kernel defers the operation to the
> enqueue time behind the scene.

Makes sense. Storing a task in a BPF internal data structure from
ops.select_cpu() doesn't prevent ops.enqueue() from being called and it can
introduce racy behavior. So, putting a task in a BPF queue at that point is
just extra overhead and provides no real benefit.
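
To make the interleaving above concrete, here is a minimal toy model of it
in plain C. This is only a sketch: the toy_* names, the struct and the
two-value enum are hypothetical stand-ins invented for illustration (only
the NONE/QUEUED state names come from the scenario above), and none of it
is the actual kernel or BPF code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the per-task ops state; not the kernel definitions. */
enum toy_opss { TOY_OPSS_NONE, TOY_OPSS_QUEUED };

struct toy_task {
	enum toy_opss state;	/* what the "kernel" believes */
	bool in_bpf_queue;	/* what the "BPF scheduler" believes */
	bool on_local_dsq;	/* task really landed on a local DSQ */
};

/* CPU A: ops.select_cpu() stashes the task in a BPF data structure. */
void toy_select_cpu_stash(struct toy_task *t)
{
	t->in_bpf_queue = true;
}

/*
 * CPU B: ops.dispatch() pops the task and tries to dispatch it.
 * Returns true only if the "kernel" honors the dispatch.
 */
bool toy_dispatch(struct toy_task *t)
{
	if (!t->in_bpf_queue)
		return false;
	t->in_bpf_queue = false;	/* BPF side now considers it gone */

	/* Stand-in for finish_dispatch(): ignore tasks that are not queued. */
	if (t->state != TOY_OPSS_QUEUED)
		return false;

	t->state = TOY_OPSS_NONE;
	t->on_local_dsq = true;
	return true;
}

/* CPU A: ops.enqueue() does nothing, then the task is marked queued. */
void toy_enqueue_noop(struct toy_task *t)
{
	assert(t->state == TOY_OPSS_NONE);
	t->state = TOY_OPSS_QUEUED;
}

/* True if the "kernel" thinks BPF owns a task that BPF no longer tracks. */
bool toy_task_is_lost(const struct toy_task *t)
{
	return t->state == TOY_OPSS_QUEUED && !t->in_bpf_queue &&
	       !t->on_local_dsq;
}

/* Replay the CPU A / CPU B interleaving in order. */
struct toy_task toy_replay_race(void)
{
	struct toy_task t = { .state = TOY_OPSS_NONE };

	toy_select_cpu_stash(&t);	/* CPU A */
	(void)toy_dispatch(&t);		/* CPU B: attempt is ignored */
	toy_enqueue_noop(&t);		/* CPU A */
	return t;
}
```

In this model, toy_task_is_lost() is true for the task returned by
toy_replay_race(): it is marked queued, but it sits neither in the BPF
queue nor on a local DSQ.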

In other words, the only thing that makes sense for ops.select_cpu() is
direct dispatch. Attempting any other form of enqueue from there is
pointless because the ops.enqueue() path will be invoked anyway.

We should probably document this behavior to make it explicit.

Thanks,
-Andrea