Message-ID: <f9e4e4a2-dadd-4f79-a83e-48ac4663f91c@amd.com>
Date: Wed, 14 Jan 2026 12:17:11 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Peter Zijlstra <peterz@...radead.org>, Pierre Gondois
<pierre.gondois@....com>
CC: <tj@...nel.org>, <linux-kernel@...r.kernel.org>, <mingo@...nel.org>,
<juri.lelli@...hat.com>, <vincent.guittot@...aro.org>,
<dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
<mgorman@...e.de>, <vschneid@...hat.com>, <longman@...hat.com>,
<hannes@...xchg.org>, <mkoutny@...e.com>, <void@...ifault.com>,
<arighi@...dia.com>, <changwoo@...lia.com>, <cgroups@...r.kernel.org>,
<sched-ext@...ts.linux.dev>, <liuwenfang@...or.com>, <tglx@...utronix.de>,
Christian Loehle <christian.loehle@....com>
Subject: Re: [PATCH 05/12] sched: Move sched_class::prio_changed() into the
change pattern

Hello Peter,

On 1/13/2026 5:17 PM, Peter Zijlstra wrote:
> Hum... so this one is a little more tricky.
>
> So the normal rules are that DEQUEUE_SAVE + ENQUEUE_RESTORE should be as
> invariant as possible.
>
> But what I think happens here is that at the point of dequeue we are
> effectively ready to throttle/replenish, but we don't.
>
> Then at enqueue, we do. The replenish changes the deadline and we're up
> a creek.

I have the following data from a scenario in which I observe the same
splat as Pierre, with the two fixes on top of tip:
yes-4108 [194] d..2. 53.396872: get_prio_dl: get_prio_dl: clock(53060728757)
yes-4108 [194] d..2. 53.396873: update_curr_dl_se: update_curr_dl_se: past throttle label
yes-4108 [194] d..2. 53.396873: update_curr_dl_se: dl_throttled(0) dl_overrun(0) timer_queued(0) server?(0)
yes-4108 [194] d..2. 53.396873: update_curr_dl_se: dl_se->runtime(190623) rq->dl.overloaded(0)
yes-4108 [194] d..2. 53.396874: get_prio_dl: get_prio_dl: deadline(53060017809)
yes-4108 [194] d..2. 53.396878: enqueue_dl_entity: ENQUEUE_RESTORE update_dl_entity
yes-4108 [194] d..2. 53.396878: enqueue_dl_entity: setup_new_dl_entity
yes-4108 [194] d..2. 53.396878: enqueue_dl_entity: Replenish: Old: 53060017809 dl_deadline(1000000)
yes-4108 [194] d..2. 53.396879: enqueue_dl_entity: Replenish: New: 53061728757
yes-4108 [194] d..2. 53.396882: prio_changed_dl.part.0: Woops! prio_changed_dl: CPU(194) clock(53060728757) overloaded(0): Task: yes(4108), Curr: yes(4108) deadline: 53060017809 -> 53061728757

get_prio_dl() sees "deadline < rq->clock", but dl_se->runtime is still
positive, so update_curr_dl_se() doesn't touch the deadline.
ENQUEUE_RESTORE then sees the deadline in the past of "rq->clock" and
calls setup_new_dl_entity(), which replenishes.
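
For reference, the two checks that disagree look roughly like this in
mainline kernel/sched/deadline.c (paraphrased; the series may have
moved things around):

	/* The dequeue side only throttles once the runtime is gone: */
	static inline bool dl_runtime_exceeded(struct sched_dl_entity *dl_se)
	{
		return (dl_se->runtime <= 0);
	}

	/*
	 * ... whereas the enqueue side replenishes as soon as the
	 * absolute deadline is in the past, irrespective of the
	 * remaining runtime:
	 */
	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
	    dl_entity_overflow(dl_se, rq_clock(rq)))
		replenish_dl_new_period(dl_se, rq);

With deadline(53060017809) already behind clock(53060728757) but
runtime(190623) still positive, dequeue leaves the deadline alone while
enqueue pushes it a full dl_deadline(1000000) past the clock, matching
the 53061728757 in the trace.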

sched_change_end() will call prio_changed() with the old deadline
snapshotted by get_prio_dl(), but enqueue has advanced the deadline, so
we land in a pickle.
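
Spelling the ordering out, the change pattern presumably reduces to
something like the following (an illustrative sketch pieced together
from the $SUBJECT patch and the trace above; names and details may not
match the actual code):

	/* sched_change_begin() */
	oldprio = p->sched_class->get_prio(rq, p);	/* snapshots dl_se->deadline (53060017809) */
	dequeue_task(rq, p, DEQUEUE_SAVE);		/* runtime > 0: deadline untouched */

	/* ... attribute update ... */

	/* sched_change_end() */
	enqueue_task(rq, p, ENQUEUE_RESTORE);		/* deadline < clock: replenish to 53061728757 */
	p->sched_class->prio_changed(rq, p, oldprio);	/* compares against the stale snapshot -> splat */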
>
> Let me think about this for a bit...

Should prio_changed_dl() instead only care about "dl_se->dl_deadline"
having changed within the sched_change guard, since that is the
attribute that can actually be changed via sched_setattr()?
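
i.e., something along these lines (purely illustrative and untested
against your series; "old_dl_deadline" is a hypothetical snapshot taken
by get_prio_dl()):

	static void prio_changed_dl(struct rq *rq, struct task_struct *p,
				    u64 old_dl_deadline)
	{
		struct sched_dl_entity *dl_se = &p->dl;

		/*
		 * A replenish within the change guard can legitimately
		 * advance the absolute dl_se->deadline, so only react
		 * to a change of the sched_setattr()-visible relative
		 * deadline parameter.
		 */
		if (dl_se->dl_deadline == old_dl_deadline)
			return;

		/* ... reschedule / push-pull as before ... */
	}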
--
Thanks and Regards,
Prateek