linux-kernel - Re: [PATCH 3/3] sched_ext: Allow scx_bpf_reenqueue

Message-ID: <20251027181028.GB988547@noisy.programming.kicks-ass.net>
Date: Mon, 27 Oct 2025 19:10:28 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>,
	Andrea Righi <andrea.righi@...ux.dev>,
	Changwoo Min <changwoo@...lia.com>, linux-kernel@...r.kernel.org,
	sched-ext@...ts.linux.dev, Wen-Fang Liu <liuwenfang@...or.com>
Subject: Re: [PATCH 3/3] sched_ext: Allow scx_bpf_reenqueue_local() to be
 called from anywhere

On Mon, Oct 27, 2025 at 06:00:00AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Mon, Oct 27, 2025 at 10:18:22AM +0100, Peter Zijlstra wrote:
> ...
> > > The main use case for cpu_release() was calling scx_bpf_reenqueue_local() when
> > > a CPU gets preempted by a higher priority scheduling class. However, the old
> > > scx_bpf_reenqueue_local() could only be called from cpu_release() context.
> > 
> > I'm a little confused. Isn't this the problem where balance_one()
> > migrates a task to the local rq and we end up having to RETRY_TASK
> > because another (higher) rq gets modified?
> 
> That's what I thought too and the gap between balance() and pick_task() can
> be closed that way. However, while plugging that, I realized there's another
> bigger gap between ttwu() and pick_task() because ttwu() can directly
> dispatch a task into the local DSQ of a CPU. That one, there's no way to
> close without a global hook.

Just for my elucidation and such.. This is when ttwu() happens and the
CPU is idle and you dispatch directly to it, expecting it to then go run
that task. After which another wakeup/balance movement happens which
places/moves a task from a higher priority class to that CPU, such that
your initial (ext) task doesn't get to run after all. Right?
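
The sequence described above can be sketched as a toy model. This is not
kernel code and makes no claim about how the scheduler core is implemented.
ToyCPU, CLASS_PRIO, and the method names are hypothetical, invented for
illustration. The reenqueue_local() method only loosely stands in for the
role scx_bpf_reenqueue_local() plays in this thread, namely handing stranded
tasks back so they can be placed elsewhere.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical class ordering for this toy model only; a higher number wins.
CLASS_PRIO = {"ext": 0, "fair": 1, "rt": 2}


@dataclass
class ToyCPU:
    # One FIFO per scheduling class; the "ext" queue stands in for the
    # CPU's local DSQ.
    queues: dict = field(
        default_factory=lambda: {cls: deque() for cls in CLASS_PRIO}
    )

    def enqueue(self, sched_class, task):
        """Place a task on this CPU's queue for the given class."""
        self.queues[sched_class].append(task)

    def pick_task(self):
        """Return (class, task) from the highest-priority non-empty queue."""
        for cls in sorted(CLASS_PRIO, key=CLASS_PRIO.get, reverse=True):
            if self.queues[cls]:
                return cls, self.queues[cls].popleft()
        return None

    def reenqueue_local(self):
        """Drain the ext queue and hand the stranded tasks back."""
        stranded = list(self.queues["ext"])
        self.queues["ext"].clear()
        return stranded


cpu = ToyCPU()
cpu.enqueue("ext", "ext_task")    # 1. wakeup dispatches directly to the idle CPU
cpu.enqueue("rt", "rt_task")      # 2. a higher-class task lands before the pick
picked = cpu.pick_task()          # 3. the pick goes to the higher class
stranded = cpu.reenqueue_local()  # 4. the ext task is pulled back for re-placement
```

In this model the ext task is queued first but never chosen at step 3, which
is the window being asked about: without some hook between steps 2 and 4,
nothing tells the ext scheduler that its task is sitting behind another class.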