Message-ID: <aXOW3_2pU6SvABEI@gpd4>
Date: Fri, 23 Jan 2026 16:42:23 +0100
From: Andrea Righi <arighi@...dia.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
	Joel Fernandes <joelagnelf@...dia.com>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>,
	Daniel Hodges <hodgesd@...a.com>, sched-ext@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/deadline: Reset dl_server execution state on stop

Hi Juri,

On Fri, Jan 23, 2026 at 08:11:35AM +0100, Juri Lelli wrote:
> Hello,
> 
> On 22/01/26 15:08, Andrea Righi wrote:
> > dl_server_stop() can leave a deadline server in an inconsistent internal
> > state across stop/start transitions, causing it to bypass its required
> > deferral phase when restarted. This breaks the scheduler invariant that
> > a restarted server must re-establish eligibility before being allowed to
> > execute.
> > 
> > When the server is stopped (e.g., because the associated task blocks),
> > it's expected to transition back to an inactive, initial state. However,
> > dl_server_stop() does not fully reset the execution state. As a result,
> > the server can be logically inactive while its flags still indicate that
> > it is running.
> > 
> > When the server is restarted via dl_server_start(), the following
> > sequence occurs:
> >   1. dl_server_start() calls enqueue_dl_entity(ENQUEUE_WAKEUP),
> >   2. enqueue_dl_entity() calls update_dl_entity(),
> >   3. update_dl_entity() checks (!dl_se->dl_defer_running) to decide
> >      whether to arm the deferral mechanism,
> >   4. because dl_defer_running is stale (still 1), the check fails,
> >   5. dl_defer_armed and dl_throttled are not set,
> >   6. enqueue_dl_entity() skips start_dl_timer(), because
> >      dl_throttled == 0,
> >   7. the server is enqueued via __enqueue_dl_entity(),
> >   8. the scheduler picks the server to run,
> >   9. update_curr_dl_se() detects that the server has exhausted its
> >      runtime (or has negative runtime), as it wasn't properly
> >      replenished/deferred,
> >  10. the server is throttled (dl_throttled set to 1) and dequeued,
> >  11. the server repeatedly cycles through wakeup and throttling,
> >      effectively receiving no usable CPU bandwidth.
> > 
> > This results in starvation of the tasks serviced by the deadline server
> > in the presence of competing RT workloads.
> > 
> > This issue can be confirmed by adding debugging traces, which show that the
> > server skips the deferral timer and is immediately throttled upon
> > execution with negative runtime:
> > 
> >  DEBUG: dl_server_start: dl_defer_running=1 active=0
> >  DEBUG: enqueue_dl_entity: flags=1 dl_throttled=0 dl_defer=1
> >  DEBUG: update_dl_entity: dl_defer_running=1
> >  DEBUG: enqueue_dl_entity: SKIPPING start_dl_timer! dl_throttled=0
> >  ...
> >  DEBUG: update_curr_dl_se: THROTTLED runtime=-954758
> > 
> > Fix this by properly resetting dl_defer_running in dl_server_stop(),
> > ensuring the server correctly enters the defer phase upon restart.
> > 
> > This issue is quite difficult to observe when only the fair server
> > is present, as the required stop/start patterns are relatively rare.
> > However, it becomes easier to trigger with an additional deadline server
> > with more frequent server lifecycle transitions (such as a sched_ext
> > deadline server).
> > 
> > This change is a prerequisite for introducing a sched_ext deadline
> > server, as it ensures correct and predictable behavior across server
> > stop/start cycles.
> > 
> > Link: https://lore.kernel.org/all/aXEMat4IoNnGYgxw@gpd4/
> > Signed-off-by: Andrea Righi <arighi@...dia.com>
> > ---
> >  kernel/sched/deadline.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index c509f2e7d69de..214fe62a59723 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1813,6 +1813,7 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
> >  	hrtimer_try_to_cancel(&dl_se->dl_timer);
> >  	dl_se->dl_defer_armed = 0;
> >  	dl_se->dl_throttled = 0;
> > +	dl_se->dl_defer_running = 0;
> >  	dl_se->dl_defer_idle = 0;
> >  	dl_se->dl_server_active = 0;
> >  }
> 
> The fix looks good to me, thanks!
> 
> State machine above dl_server_start() might need updating, though. Don't
> we want to add dl_defer_running = 0 under dl_server_stop() for case [4]
> D->A? Also for '[A] - init', dl_defer_running = 0 (remove /1)?

Definitely! I'll send a v2 with the updated state machine documentation.

Thanks,
-Andrea
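
For readers following the thread, the stop/start sequence described in the commit message can be approximated with a small user-space model. This is only a sketch under stated assumptions: the field names mirror the flags mentioned in the thread, but struct dl_server_model, model_stop(), model_start() and restart_enters_defer() are hypothetical helpers that reduce steps 1-6 to two conditionals. They are not the kernel implementation.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the flags discussed in the thread.
 * Only the decisions from steps 3-6 of the commit message are modelled;
 * the real logic lives in kernel/sched/deadline.c and is more involved.
 */
struct dl_server_model {
	bool dl_server_active;
	bool dl_defer_running;
	bool dl_defer_armed;
	bool dl_throttled;
	bool timer_started;	/* stands in for start_dl_timer() being called */
};

/* Model of the stop path, with the proposed reset made optional. */
static void model_stop(struct dl_server_model *s, bool reset_defer_running)
{
	s->dl_defer_armed = false;
	s->dl_throttled = false;
	if (reset_defer_running)
		s->dl_defer_running = false;	/* the one-line fix */
	s->dl_server_active = false;
}

/* Model of the start path: arm deferral only if not already "running". */
static void model_start(struct dl_server_model *s)
{
	s->dl_server_active = true;
	s->timer_started = false;

	/* Step 3: the !dl_defer_running check decides whether to defer. */
	if (!s->dl_defer_running) {
		s->dl_defer_armed = true;
		s->dl_throttled = true;
	}

	/* Step 6: the timer is only started when the server is throttled. */
	if (s->dl_throttled)
		s->timer_started = true;
}

/* Returns true if a stop/start cycle re-enters the deferral phase. */
static bool restart_enters_defer(bool reset_defer_running)
{
	struct dl_server_model s = {
		.dl_server_active = true,
		.dl_defer_running = true,	/* server was running when stopped */
	};

	model_stop(&s, reset_defer_running);
	model_start(&s);
	return s.timer_started;
}
```

In this model, restart_enters_defer(false) returns false, which corresponds to the "SKIPPING start_dl_timer" trace, while restart_enters_defer(true) returns true because the stale flag was cleared on stop.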