[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <e611b632-0156-4204-a748-78d44b2d14c8@arm.com>
Date: Thu, 9 Oct 2025 18:50:55 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: tj@...nel.org, linux-kernel@...r.kernel.org, mingo@...nel.org,
juri.lelli@...hat.com, vincent.guittot@...aro.org, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
longman@...hat.com, hannes@...xchg.org, mkoutny@...e.com,
void@...ifault.com, arighi@...dia.com, changwoo@...lia.com,
cgroups@...r.kernel.org, sched-ext@...ts.linux.dev, liuwenfang@...or.com,
tglx@...utronix.de
Subject: Re: [PATCH 03/12] sched: Fold sched_class::switch{ing,ed}_{to,from}()
into the change pattern
On 09.10.25 16:09, Peter Zijlstra wrote:
> On Thu, Oct 09, 2025 at 03:54:08PM +0200, Peter Zijlstra wrote:
>> On Thu, Oct 09, 2025 at 03:30:02PM +0200, Dietmar Eggemann wrote:
>>> On 06.10.25 12:44, Peter Zijlstra wrote:
>>>> Add {DE,EN}QUEUE_CLASS and fold the sched_class::switch* methods into
>>>> the change pattern. This completes and makes the pattern more
>>>> symmetric.
>>>>
>>>> This changes the order of callbacks slightly:
>>>>
>>>> |
>>>> | switching_from()
>>>> dequeue_task(); | dequeue_task()
>>>> put_prev_task(); | put_prev_task()
>>>> | switched_from()
>>>> |
>>>> ... change task ... | ... change task ...
>>>> |
>>>> switching_to(); | switching_to()
>>>> enqueue_task(); | enqueue_task()
>>>> set_next_task(); | set_next_task()
>>>> prev_class->switched_from() |
>>>> switched_to() | switched_to()
>>>> |
>>>>
>>>> Notably, it moves the switched_from() callback right after the
>>>> dequeue/put. Existing implementations don't appear to be affected by
>>>> this change in location -- specifically the task isn't enqueued on the
>>>> class in question in either location.
>>>>
>>>> Make (CLASS)^(SAVE|MOVE), because there is nothing to save-restore
>>>> when changing scheduling classes.
>>>
>>> This one causes a DL bw related warning when I run a simple 1 DL task
>>> rt-app workload:
>>
>>> Not sure yet how this is related to switched_from_dl() being now called earlier?
>>
>> Ooh, I might see a problem. task_non_contending() uses dl_task(), which
>> uses p->prio. The move above means it is now called using the 'old'
>> prio, whereas it used to run with the 'new' prio.
>>
>> Let me see if I can figure out something for this.
>
> Does this help? /me goes find rt-app.
Yes, but there seems to be more ... missing DEQUEUE_SAVE (a.k.a.
ENQUEUE_RESTORE) in
enqueue_dl_entity()
if (flags & (ENQUEUE_RESTORE|ENQUEUE_MIGRATING))
^^^^^^^^^^^^^^^
...
add_running_bw(dl_se, dl_rq)
and
__sched_setscheduler()
...
if (prev_class != next_class)
queue_flags |= DEQUEUE_CLASS;
queue_flags &= ~(DEQUEUE_SAVE | DEQUEUE_MOVE);
^^^^^^^^^^^^
as well as
sched_change_begin()
...
if (flags & DEQUEUE_CLASS) {
if (WARN_ON_ONCE(flags & (DEQUEUE_SAVE | DEQUEUE_MOVE)))
flags &= ~(DEQUEUE_SAVE | DEQUEUE_MOVE);
^^^^^^^^^^^^
With your patch and this the issue went away:
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 884926d3dd95..35074799e9ad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10844,9 +10844,6 @@ struct sched_change_ctx
*sched_change_begin(struct task_struct *p, unsigned int
lockdep_assert_rq_held(rq);
if (flags & DEQUEUE_CLASS) {
- if (WARN_ON_ONCE(flags & (DEQUEUE_SAVE | DEQUEUE_MOVE)))
- flags &= ~(DEQUEUE_SAVE | DEQUEUE_MOVE);
-
if (p->sched_class->switching_from)
p->sched_class->switching_from(rq, p);
}
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 007d1440374b..bcef5c72d287 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -684,10 +684,8 @@ int __sched_setscheduler(struct task_struct *p,
prev_class = p->sched_class;
next_class = __setscheduler_class(policy, newprio);
- if (prev_class != next_class) {
+ if (prev_class != next_class)
queue_flags |= DEQUEUE_CLASS;
- queue_flags &= ~(DEQUEUE_SAVE | DEQUEUE_MOVE);
- }
Powered by blists - more mailing lists