Message-Id: <DFAXGN12DFCL.2AOFRHS9WWB9Q@etsalapatis.com>
Date: Mon, 29 Dec 2025 13:55:26 -0500
From: "Emil Tsalapatis" <emil@...alapatis.com>
To: "Andrea Righi" <arighi@...dia.com>, "Tejun Heo" <tj@...nel.org>
Cc: "David Vernet" <void@...ifault.com>, "Changwoo Min"
 <changwoo@...lia.com>, "Daniel Hodges" <hodgesd@...a.com>,
 <sched-ext@...ts.linux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] sched_ext: Fix ops.dequeue() semantics

On Mon Dec 29, 2025 at 12:07 PM EST, Andrea Righi wrote:
> Hi Tejun,
>
> On Sun, Dec 28, 2025 at 01:38:01PM -1000, Tejun Heo wrote:
>> Hello again, again.
>> 
>> On Sun, Dec 28, 2025 at 01:28:04PM -1000, Tejun Heo wrote:
>> ...
>> > So, please ignore that part. That's nonsense. I still wonder whether we can
>> > create some interlocking between scx_bpf_dsq_insert() and ops.dequeue()
>> > without making the hot path slower. I'll think more about it.
>> 
>> And we can't create an interlocking between scx_bpf_dsq_insert() and
>> ops.dequeue() without adding extra atomic operations in hot paths. The only
>> thing shared is the task's rq lock, and the dispatch path can't take it
>> synchronously. So, yeah, it looks like the best we can do is to always let
>> the BPF scheduler know and let it figure out the locking and whether the
>> task needs to be dequeued from the BPF side.
>
> How about setting a flag in deq_flags to distinguish between a "dispatch"
> dequeue vs a real dequeue (due to property changes or other reasons)?
>
> We should be able to pass this information in a reliable way without any
> additional synchronization in the hot paths. This would let schedulers that
> use arena data structures check the flag instead of doing their own
> internal lookups.
>
> And it would also allow us to provide both semantics:
> 1) Catch real dequeues that need special BPF-side actions (check the flag)
> 2) Track all ops.enqueue()/ops.dequeue() pairs for accounting purposes
>    (ignore the flag)
>
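FWIW, with the flag the BPF side could look roughly like this (sketch only;
SCX_DEQ_FROM_DISPATCH is just a placeholder name for the proposed deq_flags
bit, and the foo_*() helpers stand in for scheduler-specific bookkeeping):

#include <scx/common.bpf.h>

/* Per-scheduler accounting, covering semantics (2) above. */
static u64 nr_queued;

void BPF_STRUCT_OPS(foo_dequeue, struct task_struct *p, u64 deq_flags)
{
	/*
	 * Every ops.enqueue() gets a matching ops.dequeue(), so always
	 * update the accounting regardless of why we were called.
	 */
	__sync_fetch_and_sub(&nr_queued, 1);

	/*
	 * Dequeue caused by the task being dispatched: nothing to undo
	 * on the BPF side.
	 */
	if (deq_flags & SCX_DEQ_FROM_DISPATCH)
		return;

	/*
	 * Real dequeue (e.g. sched_setaffinity()): drop the task from
	 * the scheduler's own queues.
	 */
	foo_remove_from_queue(p);
}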

IMO the extra flag suffices for arena-based queueing; the arena data
structures already have to track the state of the task:

Even without the flag it should be possible to infer what state the task is
in from inside the BPF code. For example, calling .dequeue() while the task
is not in an arena queue means the task got dequeued _after_ being
dispatched, while calling .dequeue() on a task that is still queued in the
arena means we are removing it because of a true dequeue event (e.g.
sched_setaffinity() was called). The only edge case in this logic is a true
dequeue event happening between .dispatch() and .dequeue(), but a new flag
would take care of that.
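
Concretely, the flagless variant would look something like this (again a
sketch; arena_queue_remove() is a stand-in for whatever per-task lookup and
removal the arena structures provide, returning false if .dispatch() already
pulled the task out, and foo_undo_enqueue() for the enqueue-time cleanup):

void BPF_STRUCT_OPS(foo_dequeue, struct task_struct *p, u64 deq_flags)
{
	if (arena_queue_remove(p)) {
		/*
		 * The task was still queued on the BPF side, so this is a
		 * true dequeue (affinity/priority change, ...): undo
		 * whatever state we set up at enqueue time.
		 */
		foo_undo_enqueue(p);
		return;
	}

	/*
	 * The task already left the arena queue via .dispatch(), so this
	 * dequeue happened after the dispatch. This is also where the
	 * .dispatch()/.dequeue() race described above would be
	 * misclassified without the new flag.
	 */
}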


> Thanks,
> -Andrea

