Message-ID: <20251219224450.2537941-1-arighi@nvidia.com>
Date: Fri, 19 Dec 2025 23:43:13 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>
Cc: Emil Tsalapatis <emil@...alapatis.com>,
	Daniel Hodges <hodgesd@...a.com>,
	sched-ext@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: [PATCH 0/2] sched_ext: Implement proper ops.dequeue() semantics

Currently, ops.dequeue() is invoked only for tasks that are still owned by
the BPF scheduler (i.e., not yet dispatched to any DSQ). However, BPF
schedulers may need to track every task ownership transition reliably.

The issue is that, once a task is dispatched, the BPF scheduler loses
visibility into when the task leaves its ownership. This makes it
impossible to maintain accurate accounting (e.g., per-DSQ queued runtime
sums) or to track task lifecycle events properly.

This series fixes the semantics of ops.dequeue() so that every
ops.enqueue() is balanced by a corresponding ops.dequeue() call.

With this, a task is considered "enqueued" from the moment ops.enqueue() is
called until it either:
1. is dispatched (moved to a local DSQ for execution), or
2. is removed from the scheduler (e.g., it blocks, or properties such as
   CPU affinity or priority are changed).

When either happens, ops.dequeue() is invoked, ensuring a reliable 1:1
pairing between enqueue and dequeue operations.

This allows BPF schedulers to reliably track task ownership and maintain
accurate accounting.
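
As an illustration of the kind of accounting this pairing enables, here is
a minimal user-space sketch in plain C. It is not BPF code and does not use
any sched_ext API: struct model_task, struct model_sched and the model_*()
helpers are hypothetical names made up for this example. It only models a
queued runtime sum that stays correct because every enqueue is matched by
exactly one dequeue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are sched_ext APIs. */
struct model_task {
	uint64_t runtime;	/* runtime charged while the task is queued */
	bool enqueued;		/* true between the enqueue and dequeue calls */
};

struct model_sched {
	uint64_t queued_runtime_sum;	/* sum over currently enqueued tasks */
	uint64_t nr_enqueue;		/* number of modeled enqueue calls */
	uint64_t nr_dequeue;		/* number of modeled dequeue calls */
};

/* Stands in for ops.enqueue(): the scheduler takes ownership of the task. */
static void model_enqueue(struct model_sched *s, struct model_task *t)
{
	assert(!t->enqueued);	/* a second enqueue would break the pairing */
	t->enqueued = true;
	s->queued_runtime_sum += t->runtime;
	s->nr_enqueue++;
}

/*
 * Stands in for ops.dequeue(): assumed to be called exactly once per
 * enqueue, either on dispatch or on removal from the scheduler.
 */
static void model_dequeue(struct model_sched *s, struct model_task *t)
{
	assert(t->enqueued);	/* a dequeue without an enqueue is a bug */
	t->enqueued = false;
	s->queued_runtime_sum -= t->runtime;
	s->nr_dequeue++;
}

/* True when no task is owned by the scheduler and the sum has drained. */
static bool model_is_balanced(const struct model_sched *s)
{
	return s->nr_enqueue == s->nr_dequeue && s->queued_runtime_sum == 0;
}
```

If any dequeue were skipped in this model (as can happen today once a task
has been dispatched), queued_runtime_sum would never return to zero and
model_is_balanced() would stay false.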

Andrea Righi (2):
      sched_ext: Fix ops.dequeue() semantics
      selftests/sched_ext: Add test to validate ops.dequeue()

 Documentation/scheduler/sched-ext.rst           |  22 +++
 include/linux/sched/ext.h                       |   1 +
 kernel/sched/ext.c                              |  27 +++-
 tools/testing/selftests/sched_ext/Makefile      |   1 +
 tools/testing/selftests/sched_ext/dequeue.bpf.c | 139 +++++++++++++++++++
 tools/testing/selftests/sched_ext/dequeue.c     | 172 ++++++++++++++++++++++++
 6 files changed, 361 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/sched_ext/dequeue.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/dequeue.c
