Message-ID: <f99f63b4a777ff990832a27fa6d171aaf1206a75.camel@redhat.com>
Date: Mon, 26 Jan 2026 15:20:12 +0100
From: Gabriele Monaco <gmonaco@...hat.com>
To: Andrea Righi <arighi@...dia.com>, Ingo Molnar <mingo@...hat.com>, Peter
 Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
 <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
 <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Tejun Heo
 <tj@...nel.org>, Joel Fernandes <joelagnelf@...dia.com>, David Vernet
 <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>, Daniel Hodges
 <hodgesd@...a.com>, sched-ext@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched/deadline: Reset dl_server execution state on
 stop

On Fri, 2026-01-23 at 17:16 +0100, Andrea Righi wrote:
> dl_server_stop() can leave a deadline server in an inconsistent internal
> state across stop/start transitions, causing it to bypass its required
> deferral phase when restarted. This breaks the scheduler invariant that
> a restarted server must re-establish eligibility before being allowed to
> execute.
> 
> When the server is stopped (e.g., because the associated task blocks),
> it's expected to transition back to an inactive, initial state. However,
> dl_server_stop() does not fully reset the execution state. As a result,
> the server can be logically inactive while still appearing to be
> running.
> 
> When the server is restarted via dl_server_start(), the following
> sequence occurs:
>   1. dl_server_start() calls enqueue_dl_entity(ENQUEUE_WAKEUP),
>   2. enqueue_dl_entity() calls update_dl_entity(),
>   3. update_dl_entity() checks (!dl_se->dl_defer_running) to decide
>      whether to arm the deferral mechanism,
>   4. because dl_defer_running is stale, the check fails,
>   5. dl_defer_armed and dl_throttled are not set,
>   6. enqueue_dl_entity() skips start_dl_timer(), because
>      dl_throttled == 0,
>   7. the server is enqueued via __enqueue_dl_entity(),
>   8. the scheduler picks the server to run,
>   9. update_curr_dl_se() detects that the server has exhausted its
>      runtime (or has negative runtime), as it wasn't properly
>      replenished/deferred,
>  10. the server is throttled (dl_throttled set to 1) and dequeued,
>  11. the server repeatedly cycles through wakeup and throttling,
>      effectively receiving no usable CPU bandwidth.

Hello,

I remember wondering why dl_defer_running was kept after stop, and Peter
suggested it is there to avoid penalising tasks with short sleeps. [1]

Clearing dl_defer_running on stop in fact removes the edge from A:init to
D:running, doesn't it? The server should be able to start directly as running,
and not only as deferred (dl_defer_armed and dl_throttled set).
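
To make the state-machine point concrete, here is a minimal userspace sketch,
not kernel code. The model_* names and the clear_defer_running switch are
hypothetical; only the flag names and the state labels come from the patch.
It covers just the branch quoted in step 3 (the !dl_defer_running check) and
ignores timers, runtime and deadlines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the states named in the documentation comment. */
enum model_state {
	MODEL_A_INIT,
	MODEL_B_ZERO_LAXITY_WAIT,
	MODEL_D_RUNNING,
};

/* Only the flags discussed in this thread, not the full sched_dl_entity. */
struct model_flags {
	unsigned int dl_server_active : 1;
	unsigned int dl_throttled : 1;
	unsigned int dl_defer_armed : 1;
	unsigned int dl_defer_running : 1;
};

/* Start from [A]: arm the deferral unless dl_defer_running was kept. */
static enum model_state model_start(struct model_flags *f)
{
	assert(!f->dl_server_active);
	f->dl_server_active = 1;

	if (!f->dl_defer_running) {
		f->dl_defer_armed = 1;
		f->dl_throttled = 1;
		return MODEL_B_ZERO_LAXITY_WAIT;
	}
	return MODEL_D_RUNNING;		/* the A:init -> D:running edge */
}

/* Stop back to [A]; clear_defer_running selects the patched behaviour. */
static void model_stop(struct model_flags *f, bool clear_defer_running)
{
	f->dl_defer_armed = 0;
	f->dl_throttled = 0;
	if (clear_defer_running)
		f->dl_defer_running = 0;
	f->dl_server_active = 0;
}
```

In this model, a stop with clear_defer_running set means every later start
lands in the zero_laxity-wait state, so D:running is no longer reachable
directly from A:init.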

In the sequence you described above, I wonder why the enqueue never
replenishes. As far as I understand, the runtime should remain <= 0 only as
long as the enqueue occurs before the deadline; after that, the enqueue should
simply replenish a new period (pushing the deadline and restoring the runtime).
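
The behaviour I would expect can be sketched as follows. This is a simplified
userspace model under stated assumptions, not the kernel implementation: the
model_* names are hypothetical, the relative deadline is assumed equal to the
period, and clock wraparound and the bandwidth overflow check are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: just enough state to express the replenish rule. */
struct model_dl_se {
	int64_t  runtime;	/* remaining budget in ns, may be negative */
	uint64_t deadline;	/* absolute deadline in ns */
	int64_t  dl_runtime;	/* budget granted per period */
	uint64_t dl_period;	/* period length in ns */
};

/*
 * Model of the enqueue-time decision: returns true when a new period is
 * started. Before the deadline the old (possibly <= 0) runtime is kept;
 * otherwise the deadline is pushed and the runtime restored.
 */
static bool model_enqueue_replenish(struct model_dl_se *se, uint64_t now)
{
	assert(se->dl_runtime > 0 && se->dl_period > 0);

	if (now < se->deadline)
		return false;

	se->deadline = now + se->dl_period;
	se->runtime = se->dl_runtime;
	return true;
}
```

With this rule, a server restarted with negative runtime stays depleted only
until its old deadline passes, which is why the endless throttling in steps
9-11 is surprising to me.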

What am I missing here?

Thanks,
Gabriele

[1] https://lore.kernel.org/lkml/20251111111716.GL278048@noisy.programming.kicks-ass.net

> 
> This results in starvation of the tasks serviced by the deadline server
> in the presence of competing RT workloads.
> 
> This issue can be confirmed by adding debugging traces, which show that
> the server skips the deferral timer and is immediately throttled upon
> execution with negative runtime:
> 
>  DEBUG: dl_server_start: dl_defer_running=1 active=0
>  DEBUG: enqueue_dl_entity: flags=1 dl_throttled=0 dl_defer=1
>  DEBUG: update_dl_entity: dl_defer_running=1
>  DEBUG: enqueue_dl_entity: SKIPPING start_dl_timer! dl_throttled=0
>  ...
>  DEBUG: update_curr_dl_se: THROTTLED runtime=-954758
> 
> Fix this by properly resetting dl_defer_running in dl_server_stop(),
> ensuring the server correctly enters the defer phase upon restart.
> 
> This issue is quite difficult to observe when only the fair server
> is present, as the required stop/start patterns are relatively rare.
> However, it becomes easier to trigger with an additional deadline server
> with more frequent server lifecycle transitions (such as a sched_ext
> deadline server).
> 
> This change is a prerequisite for introducing a sched_ext deadline
> server, as it ensures correct and predictable behavior across server
> stop/start cycles.
> 
> Link: https://lore.kernel.org/all/aXEMat4IoNnGYgxw@gpd4/
> Signed-off-by: Andrea Righi <arighi@...dia.com>
> ---
> Changes in v2:
>  - Update state machine documentation
>  - Link to v1:
> https://lore.kernel.org/all/20260122140833.1655020-1-arighi@nvidia.com/
> 
>  kernel/sched/deadline.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index c509f2e7d69de..e42867061ea77 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1615,7 +1615,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
>   *   dl_server_active = 0
>   *   dl_throttled = 0
>   *   dl_defer_armed = 0
> - *   dl_defer_running = 0/1
> + *   dl_defer_running = 0
>   *   dl_defer_idle = 0
>   *
>   * [B] - zero_laxity-wait
> @@ -1704,6 +1704,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
>   *       hrtimer_try_to_cancel();
>   *       dl_defer_armed = 0;
>   *       dl_throttled = 0;
> + *       dl_defer_running = 0;
>   *       dl_server_active = 0;
>   *       // [A]
>   *   return p;
> @@ -1813,6 +1814,7 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
>  	hrtimer_try_to_cancel(&dl_se->dl_timer);
>  	dl_se->dl_defer_armed = 0;
>  	dl_se->dl_throttled = 0;
> +	dl_se->dl_defer_running = 0;
>  	dl_se->dl_defer_idle = 0;
>  	dl_se->dl_server_active = 0;
>  }