Message-ID: <20251014095407.GM4067720@noisy.programming.kicks-ass.net>
Date: Tue, 14 Oct 2025 11:54:07 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Gabriele Monaco <gmonaco@...hat.com>
Cc: linux-kernel@...r.kernel.org, Juri Lelli <juri.lelli@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Clark Williams <williams@...hat.com>
Subject: Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with expired deadline
On Tue, Oct 07, 2025 at 02:29:04PM +0200, Gabriele Monaco wrote:
> Recent changes to the deadline server leave it running when the system
> is idle. If the system is idle for longer than the dl_server period and
> the first scheduling occurs after a fair task wakes up, the algorithm
> picks the server as the earliest deadline (in the past) and that boosts
> the fair task that just woke up while:
> * the deadline is in the past
> * the server consumed all its runtime (in background)
> * there is no starvation (idle for about a period)
>
> Prevent the server from boosting a task when the deadline is in the
> past. Instead, replenish a new period and start the server as deferred.
>
> Fixes: 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
> To: Juri Lelli <juri.lelli@...hat.com>
> Cc: Clark Williams <williams@...hat.com>
> Signed-off-by: Gabriele Monaco <gmonaco@...hat.com>
> ---
>
> This behaviour was observed using the RV monitors in [1] and the patch
> was validated on an adapted version of the models. The models are not
> exhaustively validating the dl_server behaviour.
>
> [1] - https://lore.kernel.org/lkml/20250919140954.104920-21-gmonaco@redhat.com
>
> kernel/sched/deadline.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 72c1f72463c7..b3e3d506a18d 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2371,6 +2371,17 @@ static struct task_struct *__pick_task_dl(struct rq *rq)
>  			dl_server_stop(dl_se);
>  			goto again;
>  		}
> +		/*
> +		 * If the CPU was idle for long enough time and wakes up
> +		 * because of a fair task, the dl_server may run after its
> +		 * period elapsed. Replenish a new period as deferred, since we
> +		 * are clearly not handling starvation here.
> +		 */
> +		if (dl_time_before(dl_se->deadline, rq_clock(rq))) {
> +			dl_se->dl_defer_running = 0;
> +			replenish_dl_new_period(dl_se, rq);
> +			goto again;
> +		}
>  		rq->dl_server = dl_se;
>  	} else {
>  		p = dl_task_of(dl_se);
>
I'm a bit confused: shouldn't enqueue ensure the deadline is in the future?
And if it doesn't, shouldn't we fix the enqueue path somewhere?
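
To make the question concrete, here is a minimal user-space sketch of the
invariant being asked about: the same expired-deadline check the patch adds
to __pick_task_dl(), applied at enqueue time instead. This is an
illustration only, not kernel code. The struct and every model_* name are
hypothetical stand-ins, and the sketch assumes a new period simply starts at
"now" with a full budget.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a deadline entity. */
struct model_dl_se {
	uint64_t deadline;    /* absolute deadline, in ns */
	int64_t  runtime;     /* remaining budget, in ns */
	uint64_t dl_runtime;  /* budget granted per period, in ns */
	uint64_t dl_deadline; /* relative deadline, in ns */
};

/* Wrap-safe "a is before b" comparison on 64-bit timestamps. */
static bool model_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Assumption: a new period starts at "now" with the full budget. */
static void model_replenish_new_period(struct model_dl_se *se, uint64_t now)
{
	se->deadline = now + se->dl_deadline;
	se->runtime = (int64_t)se->dl_runtime;
}

/*
 * Enqueue-time fixup: if the deadline already lies in the past, start a
 * fresh period so the pick path never sees an expired deadline.
 * Returns true when the entity was replenished.
 */
static bool model_enqueue_fixup(struct model_dl_se *se, uint64_t now)
{
	if (model_time_before(se->deadline, now)) {
		model_replenish_new_period(se, now);
		return true;
	}
	return false;
}
```

If the enqueue path upheld this invariant, an entity whose deadline expired
during a long idle stretch would be given a new period before it could be
picked, and the extra check in the pick path would never trigger.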