Message-ID: <f99f63b4a777ff990832a27fa6d171aaf1206a75.camel@redhat.com>
Date: Mon, 26 Jan 2026 15:20:12 +0100
From: Gabriele Monaco <gmonaco@...hat.com>
To: Andrea Righi <arighi@...dia.com>, Ingo Molnar <mingo@...hat.com>, Peter
Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Tejun Heo
<tj@...nel.org>, Joel Fernandes <joelagnelf@...dia.com>, David Vernet
<void@...ifault.com>, Changwoo Min <changwoo@...lia.com>, Daniel Hodges
<hodgesd@...a.com>, sched-ext@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched/deadline: Reset dl_server execution state on
stop
On Fri, 2026-01-23 at 17:16 +0100, Andrea Righi wrote:
> dl_server_stop() can leave a deadline server in an inconsistent internal
> state across stop/start transitions, causing it to bypass its required
> deferral phase when restarted. This breaks the scheduler invariant that
> a restarted server must re-establish eligibility before being allowed to
> execute.
>
> When the server is stopped (e.g., because the associated task blocks),
> it's expected to transition back to an inactive, initial state. However,
> dl_server_stop() does not fully reset the execution state. As a result,
> the server can be logically inactive while still appearing to be
> running.
>
> When the server is restarted via dl_server_start(), the following
> sequence occurs:
> 1. dl_server_start() calls enqueue_dl_entity(ENQUEUE_WAKEUP),
> 2. enqueue_dl_entity() calls update_dl_entity(),
> 3. update_dl_entity() checks (!dl_se->dl_defer_running) to decide
> whether to arm the deferral mechanism,
> 4. because dl_defer_running is stale, the check fails,
> 5. dl_defer_armed and dl_throttled are not set,
> 6. enqueue_dl_entity() skips start_dl_timer(), because
> dl_throttled == 0,
> 7. the server is enqueued via __enqueue_dl_entity(),
> 8. the scheduler picks the server to run,
> 9. update_curr_dl_se() detects that the server has exhausted its
> runtime (or has negative runtime), as it wasn't properly
> replenished/deferred,
> 10. the server is throttled (dl_throttled set to 1) and dequeued,
> 11. the server repeatedly cycles through wakeup and throttling,
> effectively receiving no usable CPU bandwidth.
Hello,

I remember wondering why defer_running was kept after stop, and Peter suggested
it's to avoid penalising tasks with short sleeps. [1]

Clearing defer_running on stop is in fact removing the edge from A:init to
D:running, isn't it? The server should be able to start as running and not only
as deferred (dl_defer_armed and dl_throttled set).

In the sequence you described above, I wonder why the enqueue never
replenishes. As far as I understand, the runtime should remain <= 0 only as long
as the enqueue occurs before the deadline; after that, it should simply replenish
a new period (pushing the deadline and restoring the runtime).

What am I missing here?
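
A toy userspace sketch of the expectation above, not the kernel
implementation: every toy_* name is hypothetical, and only the two cases
mentioned in this thread are modelled (the deadline check on enqueue and the
!dl_defer_running check). The real update_dl_entity() handles more cases,
which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the sched_dl_entity fields discussed here. */
struct toy_server {
	int64_t  runtime;       /* remaining budget, may be negative */
	uint64_t deadline;      /* absolute deadline */
	int64_t  dl_runtime;    /* budget granted per period */
	uint64_t dl_deadline;   /* relative deadline */
	bool     defer_running;
	bool     defer_armed;
	bool     throttled;
};

/*
 * Toy model of the enqueue-time update.
 * Returns true if a new period was started (deadline pushed, runtime
 * restored), false if the previous deadline and runtime were kept.
 */
static bool toy_update_on_enqueue(struct toy_server *se, uint64_t now)
{
	assert(se != NULL);

	/* Wrap-safe "deadline is before now" comparison. */
	if ((int64_t)(se->deadline - now) < 0) {
		se->deadline = now + se->dl_deadline;
		se->runtime = se->dl_runtime;
		return true;
	}

	/* Deadline still in the future: arm deferral unless defer_running. */
	if (!se->defer_running) {
		se->defer_armed = true;
		se->throttled = true;
	}
	return false;
}
```

In this model, a server with stale defer_running and negative runtime stays
unthrottled and unreplenished only while the enqueue happens before the old
deadline; an enqueue after the deadline restores the runtime.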
Thanks,
Gabriele
[1] -
https://lore.kernel.org/lkml/20251111111716.GL278048@noisy.programming.kicks-ass.net
>
> This results in starvation of the tasks serviced by the deadline server
> in the presence of competing RT workloads.
>
> This issue can be confirmed by adding debugging traces, which show that the
> server skips the deferral timer and is immediately throttled upon
> execution with negative runtime:
>
> DEBUG: dl_server_start: dl_defer_running=1 active=0
> DEBUG: enqueue_dl_entity: flags=1 dl_throttled=0 dl_defer=1
> DEBUG: update_dl_entity: dl_defer_running=1
> DEBUG: enqueue_dl_entity: SKIPPING start_dl_timer! dl_throttled=0
> ...
> DEBUG: update_curr_dl_se: THROTTLED runtime=-954758
>
> Fix this by properly resetting dl_defer_running in dl_server_stop(),
> ensuring the server correctly enters the defer phase upon restart.
>
> This issue is quite difficult to observe when only the fair server
> is present, as the required stop/start patterns are relatively rare.
> However, it becomes easier to trigger with an additional deadline server
> with more frequent server lifecycle transitions (such as a sched_ext
> deadline server).
>
> This change is a prerequisite for introducing a sched_ext deadline
> server, as it ensures correct and predictable behavior across server
> stop/start cycles.
>
> Link: https://lore.kernel.org/all/aXEMat4IoNnGYgxw@gpd4/
> Signed-off-by: Andrea Righi <arighi@...dia.com>
> ---
> Changes in v2:
> - Update state machine documentation
> - Link to v1:
> https://lore.kernel.org/all/20260122140833.1655020-1-arighi@nvidia.com/
>
> kernel/sched/deadline.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index c509f2e7d69de..e42867061ea77 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1615,7 +1615,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
> * dl_server_active = 0
> * dl_throttled = 0
> * dl_defer_armed = 0
> - * dl_defer_running = 0/1
> + * dl_defer_running = 0
> * dl_defer_idle = 0
> *
> * [B] - zero_laxity-wait
> @@ -1704,6 +1704,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
> * hrtimer_try_to_cancel();
> * dl_defer_armed = 0;
> * dl_throttled = 0;
> + * dl_defer_running = 0;
> * dl_server_active = 0;
> * // [A]
> * return p;
> @@ -1813,6 +1814,7 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
> hrtimer_try_to_cancel(&dl_se->dl_timer);
> dl_se->dl_defer_armed = 0;
> dl_se->dl_throttled = 0;
> + dl_se->dl_defer_running = 0;
> dl_se->dl_defer_idle = 0;
> dl_se->dl_server_active = 0;
> }