linux-kernel - Re: [PATCH 5/5] sched: Rework sched_class::wakeup_preempt() and rq_modified_*()


Message-ID: <20251130113227.GB411057@noisy.programming.kicks-ass.net>
Date: Sun, 30 Nov 2025 12:32:27 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, tj@...nel.org,
	void@...ifault.com, arighi@...dia.com, changwoo@...lia.com,
	sched-ext@...ts.linux.dev, mingo@...nel.org,
	vincent.guittot@...aro.org
Subject: Re: [PATCH 5/5] sched: Rework sched_class::wakeup_preempt() and
 rq_modified_*()

On Sat, Nov 29, 2025 at 11:38:49PM +0530, Shrikanth Hegde wrote:

> > @@ -2174,10 +2172,14 @@ void wakeup_preempt(struct rq *rq, struc
> >   {
> >   	struct task_struct *donor = rq->donor;
> > -	if (p->sched_class == donor->sched_class)
> > -		donor->sched_class->wakeup_preempt(rq, p, flags);
> > -	else if (sched_class_above(p->sched_class, donor->sched_class))
> > +	if (p->sched_class == rq->next_class) {
> > +		rq->next_class->wakeup_preempt(rq, p, flags);
> > +
> > +	} else if (sched_class_above(p->sched_class, rq->next_class)) {
> > +		rq->next_class->wakeup_preempt(rq, p, flags);
> 
> What's the logic of calling wakeup_preempt here?
> 
> Say rq was running CFS, now RT is waking up. But the first thing we do is
> return if not fair_sched_class. It is effectively resched_curr, right?

Yes, as-is this patch seems silly, but that is mostly to preserve
current semantics :-)

The idea is that classes *could* do something else. Notably this was a
request from sched_ext. There are cases where they pull a task from the
global runqueue and stick it on the local runqueue, but then get
preempted by a higher priority class (say RT). In that case they would
want to stick the task back on the global runqueue so that another CPU
can select it, instead of having that task linger on a CPU that is not
available.

This issue has come up in the past as well but was never addressed.

Anyway, this is just foundational work. It would let a class respond to
losing the runqueue to a higher priority class.

I suppose I should go write a better changelog.

> 
> >   		resched_curr(rq);
> > +		rq->next_class = p->sched_class;
> 
> Since resched will happen and __schedule can set next_class, is it
> necessary to set it even earlier?

Yes, because we can have another wakeup before that schedule.

Imagine running a fair class, getting a fifo wakeup and then a dl
wakeup. You want the fair class, then the rt class to get a preemption
notification.

> > @@ -3899,6 +3876,7 @@ void move_queued_task_locked(struct rq *
> >   	deactivate_task(src_rq, task, 0);
> >   	set_task_cpu(task, dst_rq->cpu);
> >   	activate_task(dst_rq, task, 0);
> > +	wakeup_preempt(dst_rq, task, 0);
> 
> What's the need for wakeup_preempt here?

Everything that places a task on the runqueue should do a 'wakeup'
preemption to make sure the above mentioned class preemption stuff
works.

It doesn't really matter if the task is new due to an actual wakeup or
due to a migration, the task is 'new' to this CPU and stuff might need
to 'move'.

IIRC this was the only such place that missed the check.