Message-ID: <20251014102541.GS3245006@noisy.programming.kicks-ass.net>
Date: Tue, 14 Oct 2025 12:25:41 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Gabriele Monaco <gmonaco@...hat.com>
Cc: linux-kernel@...r.kernel.org, Juri Lelli <juri.lelli@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Clark Williams <williams@...hat.com>
Subject: Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with
expired deadline
On Tue, Oct 14, 2025 at 12:05:06PM +0200, Gabriele Monaco wrote:
> On Tue, 2025-10-14 at 11:54 +0200, Peter Zijlstra wrote:
> > On Tue, Oct 07, 2025 at 02:29:04PM +0200, Gabriele Monaco wrote:
> > > Recent changes to the deadline server leave it running when the system
> > > is idle. If the system is idle for longer than the dl_server period and
> > > the first scheduling occurs after a fair task wakes up, the algorithm
> > > picks the server as the earliest deadline (in the past) and that boosts
> > > the fair task that just woke up while:
> > > * the deadline is in the past
> > > * the server consumed all its runtime (in background)
> > > * there is no starvation (idle for about a period)
> > >
> > > Prevent the server from boosting a task when the deadline is in the
> > > past. Instead, replenish a new period and start the server as deferred.
> >
> > I'm a bit confused, should not enqueue ensure deadline is in the future?
> > And if it doesn't shouldn't we fix the enqueue path somewhere?
>
> Enqueue of a deadline task should handle the case, here the CPU is idle and the
> deadline server did not stop yet (and won't until the next schedule, if I'm not
> mistaken).
> The following enqueue of a fair task triggers a schedule where the server (no
> longer deferred) boosts the task straight away.
>
> Now the only check for deadline is in pick_next_dl_entity, where the earliest
> one is chosen, despite being in the past.
>
> Do you mean to check for deadline when enqueueing the fair task too? I believe
> again nothing happens here because the server is still up.
>
> Does it make sense or am I missing something?
Let's be confused together :-)
So dl_server is active, but machine is otherwise idle, this means
dl_server_timer is pending, right?
This timer is in one of two states:
- waiting for replenish; which will trigger and switch to 0-laxity.
- waiting for 0-laxity
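For readers unfamiliar with the term, "0-laxity" here is the latest instant at which the entity can start running and still consume its remaining budget before its deadline. A minimal userspace sketch of that arithmetic follows; the function names are hypothetical and this is not the kernel's timer code, only the definition of laxity applied to an absolute deadline and a remaining runtime.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not kernel API.
 * All values are nanoseconds on the same clock.
 */

/* Instant at which laxity reaches zero: deadline minus remaining budget. */
static uint64_t zero_laxity_time(uint64_t deadline, uint64_t runtime)
{
    assert(runtime <= deadline); /* guard against unsigned underflow */
    return deadline - runtime;
}

/* Signed slack left at 'now'; zero or negative means "must run now". */
static int64_t laxity(uint64_t now, uint64_t deadline, uint64_t runtime)
{
    return (int64_t)(deadline - now) - (int64_t)runtime;
}
```

With a deadline at 1000 and 50 of budget left, the zero-laxity instant is 950; on an idle machine the budget is not consumed, so that instant does not move.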
So that 0-laxity case is the interesting one; when the machine really is
idle, no fair tasks will run and its runtime budget will not get
depleted. Therefore, once we hit 0-laxity, it will do
enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH).
This enqueue should ensure dl_se->deadline is in the future, right?
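The expectation that a replenishing enqueue leaves the deadline in the future can be illustrated with a simplified userspace model. This is a sketch under assumptions, not the code in kernel/sched/deadline.c: the struct and function names are made up, bandwidth scaling and the deferred-server handling are left out, and only the two steps relevant here are kept (postpone by whole periods while the budget is exhausted, then restart from the current time if the deadline is still behind the clock).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a deadline entity (times in ns). */
struct dl_entity_model {
    uint64_t dl_runtime;  /* budget granted per period */
    uint64_t dl_deadline; /* relative deadline */
    uint64_t dl_period;   /* period */
    int64_t  runtime;     /* remaining budget, may be <= 0 */
    uint64_t deadline;    /* absolute deadline */
};

/* Wraparound-safe "a is before b". */
static int time_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* Model of a replenishing enqueue at time 'now'. */
static void model_replenish(struct dl_entity_model *se, uint64_t now)
{
    assert(se->dl_runtime > 0 && se->dl_period > 0);

    /* Postpone the deadline one period at a time until budget is positive. */
    while (se->runtime <= 0) {
        se->deadline += se->dl_period;
        se->runtime += (int64_t)se->dl_runtime;
    }

    /* Still lagging behind the clock: start a fresh period from 'now'. */
    if (time_before(se->deadline, now)) {
        se->deadline = now + se->dl_deadline;
        se->runtime = (int64_t)se->dl_runtime;
    }
}
```

In this model an entity whose budget was never consumed (the idle case) skips the loop entirely, so the final check is the only thing that moves a stale deadline forward.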
Anyway, we run this deadline entity (there ain't nothing else to do
anyway), and it finds there aren't any fair tasks, it does
dl_server_stop().
Then, if a fair task wakes (nr_running: 0->1) and dl_server isn't
active, we do dl_server_start(), which in turn does enqueue_dl_entity().
Now this enqueue is supposed to check if the dl_entity can still run;
does it still have time left in its current period? If not, it's
replenish timer time.
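The "can it still run in its current period" check is, in CBS terms, a bandwidth test at wakeup: the remaining budget over the time left to the deadline must not exceed the reserved runtime/deadline ratio. The following is a minimal userspace sketch of that test, assuming small values so the products cannot overflow; the names are hypothetical and the model simply starts a fresh period on failure rather than arming a timer, so it should not be read as the actual enqueue path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a deadline entity (times in ns). */
struct dl_entity_model {
    uint64_t dl_runtime;  /* budget granted per period */
    uint64_t dl_deadline; /* relative deadline */
    uint64_t dl_period;   /* period */
    int64_t  runtime;     /* remaining budget */
    uint64_t deadline;    /* absolute deadline */
};

/* Wraparound-safe "a is before b". */
static int time_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/*
 * True if runtime / (deadline - now) > dl_runtime / dl_deadline,
 * written cross-multiplied to avoid division.
 * Assumes deadline >= now and a non-negative remaining budget.
 */
static int model_overflow(const struct dl_entity_model *se, uint64_t now)
{
    assert(se->runtime >= 0);
    uint64_t left = se->dl_deadline * (uint64_t)se->runtime;
    uint64_t right = (se->deadline - now) * se->dl_runtime;
    return left > right;
}

/* Returns 1 if the entity had to be given a fresh period, 0 otherwise. */
static int model_update(struct dl_entity_model *se, uint64_t now)
{
    if (time_before(se->deadline, now) || model_overflow(se, now)) {
        se->deadline = now + se->dl_deadline;
        se->runtime = (int64_t)se->dl_runtime;
        return 1;
    }
    return 0;
}
```

Under this model a stale deadline can never survive an enqueue that goes through the check, which is why a past deadline on an on_rq entity points at a path that bypasses it.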
So where exactly does the fair task start, and result in dl_se being
on_rq such that dl_se->deadline is in the past? That sounds like an enqueue
problem to me.