Message-ID: <1f2ad071e59db2ed8bc0b382ae202b7474d07afc.camel@redhat.com>
Date: Fri, 31 Oct 2025 14:24:17 +0100
From: Gabriele Monaco <gmonaco@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>, linux-kernel@...r.kernel.org, Ingo
Molnar <mingo@...hat.com>, Clark Williams <williams@...hat.com>,
arighi@...dia.com
Subject: Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with
expired deadline
On Fri, 2025-10-31 at 14:05 +0100, Peter Zijlstra wrote:
> On Thu, Oct 30, 2025 at 07:42:05PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 22, 2025 at 12:11:51PM +0200, Gabriele Monaco wrote:
> > >
> > > Is this expected?
> >
> > Sort of, that was next on the list. Let me see if I can make it stop a
> > little more.
>
> OK, so I've gone over things again and all I got was a comment.
>
> That is, today I think it all works as expected.
>
> The dl_server will stop once the fair class goes idle long enough. Can
> you confirm this?
>
I'm going to go through your comment more carefully, but what I can observe now
is a bit different:

After this patch, consuming bandwidth in the background is equivalent whether
fair tasks are running or the CPU is idle. Updating the idle time effectively
replenishes the server once its runtime is exhausted, so we never stop the
server (IMO this is the correct behaviour only when fair tasks are running,
since there's potentially something to do).
At least this is the behaviour I get on a mostly idle system.

The scenario is different if I keep the CPU busy with other tasks (e.g. RT
policies): there I can see the server stopping and starting again.
After I do this I seem to get a different behaviour (even some boosting after
idle); I'm still trying to understand what's going on.

Does this behaviour make sense to you?
Thanks,
Gabriele
> ---
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1152,6 +1152,94 @@ static void __push_dl_task(struct rq *rq
> /* a defer timer will not be reset if the runtime consumed was <
> dl_server_min_res */
> static const u64 dl_server_min_res = 1 * NSEC_PER_MSEC;
>
> +
> +/*
> + * dl_server && dl_defer:
> + * dl_defer_armed = 0
> + * dl_defer_running = 0
> + * dl_throttled = 0
> + *
> + * [1] dl_server_start()
> + * dl_server_active = 1;
> + * enqueue_dl_entity()
> + * update_dl_entity(WAKEUP)
> + * if (!dl_defer_running)
> + * dl_defer_armed = 1;
> + * dl_defer_throttled = 1;
> + * if (dl_throttled && start_dl_timer())
> + * return;
> + * // start server into waiting for zero-laxity
> + *
> + * // deplete server runtime from fair-class
> + * [2] update_curr_dl_se()
> + * if (dl_defer && dl_throttled && dl_runtime_exceeded())
> + * dl_defer_running = 0;
> + * hrtimer_try_to_cancel(); // stop timer
> + * replenish_dl_new_period()
> + * // advance period
> + * dl_throttled = 1;
> + * dl_defer_armed = 1;
> + * start_dl_timer(); // restart timer
> + * // back into waiting for zero-laxity
> + *
> + * // timer actually fires means we have runtime
> + * [4] dl_server_timer()
> + * if (dl_defer_armed)
> + * dl_defer_running = 1;
> + * enqueue_dl_entity(REPLENISH)
> + * replenish_dl_entity()
> + * opt-fwd-period
> + * if (dl_throttled)
> + * dl_throttled = 0;
> + * if (dl_defer_armed)
> + * dl_defer_armed = 0;
> + * __enqueue_dl_entity();
> + * // server queued
> + *
> + * // schedule server
> + * [5] pick_task_dl()
> + * p = server_pick_task();
> + * if (!p)
> + * dl_server_stop()
> + * dequeue_dl_entity();
> + * hrtimer_try_to_cancel();
> + * dl_defer_armed = 0;
> + * dl_throttled = 0;
> + * dl_server_active = 0;
> + * // goto [1]
> + *
> + * // server running
> + * [6] update_curr_dl_se()
> + * if (dl_runtime_exceeded())
> + * dl_throttled = 1;
> + * dequeue_dl_entity();
> + * start_dl_timer();
> + * // replenish-timer
> + *
> + * // goto [2]
> + *
> + * [7] dl_server_timer()
> + * enqueue_dl_entity(REPLENISH)
> + * replenish_dl_entity()
> + * fwd-period
> + * if (dl_throttled)
> + * dl_throttled = 0;
> + * __enqueue_dl_entity();
> + * // goto [5]
> + *
> + * Notes:
> + *
> + * - When there are fair tasks running the most likely loop is [2]->[2].
> + * the dl_server never actually runs, the timer never fires.
> + *
> + * - When there is actual fair starvation; the timer fires and starts the
> + * dl_server. This will then throttle and replenish like a normal DL
> + * task. Notably it will not 'defer' again.
> + *
> + * - When fair goes idle, it will not consume dl_server budget so the server
> + * will start. However, it will find there are no fair tasks to run and
> + * stop itself.
> + */
> static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct
> sched_dl_entity *dl_se)
> {
> struct rq *rq = rq_of_dl_se(dl_se);
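
For readers following the thread, the flag transitions listed in the quoted
comment can be modelled in userspace. The sketch below is a toy, not kernel
code: struct toy_server, the toy_* helpers and the timer_armed/queued fields
are hypothetical names introduced for illustration, and it mirrors only the
steps the comment itself lists ([1], [2], [4]/[7], [5], [6]).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; the first four fields mirror flags named in the quoted comment. */
struct toy_server {
	bool active;        /* dl_server_active */
	bool defer_armed;   /* dl_defer_armed */
	bool defer_running; /* dl_defer_running */
	bool throttled;     /* dl_throttled */
	bool timer_armed;   /* hypothetical stand-in for the hrtimer */
	bool queued;        /* hypothetical stand-in for "enqueued as DL entity" */
};

/* [1] start: a deferred server begins throttled, waiting for zero-laxity. */
static void toy_start(struct toy_server *s)
{
	s->active = true;
	if (!s->defer_running) {
		s->defer_armed = true;
		s->throttled = true;
	}
	if (s->throttled)
		s->timer_armed = true;
}

/* [2] fair tasks ran and used up the runtime while the server was deferred. */
static void toy_fair_depletes(struct toy_server *s)
{
	if (!s->active || !s->throttled)
		return;
	s->defer_running = false;
	s->timer_armed = false;   /* stop timer */
	/* the period would be advanced here */
	s->throttled = true;
	s->defer_armed = true;
	s->timer_armed = true;    /* restart timer */
}

/* [4]/[7] timer fires: replenish and queue the server. */
static void toy_timer_fires(struct toy_server *s)
{
	assert(s->timer_armed);   /* a timer that was never armed cannot fire */
	s->timer_armed = false;
	if (s->defer_armed)
		s->defer_running = true;
	s->throttled = false;
	s->defer_armed = false;
	s->queued = true;
}

/* [5] pick: with nothing to run the server stops itself. Returns true if a
 * fair task was picked. Only the flags the comment lists are cleared. */
static bool toy_pick(struct toy_server *s, bool fair_has_tasks)
{
	if (!s->queued)
		return false;
	if (fair_has_tasks)
		return true;
	s->queued = false;
	s->timer_armed = false;
	s->defer_armed = false;
	s->throttled = false;
	s->active = false;
	return false;
}

/* [6] the running server exhausts its runtime: throttle, wait for replenish. */
static void toy_server_exhausts(struct toy_server *s)
{
	if (!s->queued)
		return;
	s->throttled = true;
	s->queued = false;
	s->timer_armed = true;
}
```

In this model, toy_start() followed by toy_timer_fires() and
toy_pick(&s, false) walks [1] -> [4] -> [5] and leaves the server inactive,
matching the last note in the comment. Calling toy_fair_depletes() repeatedly
after toy_start() instead keeps the server throttled with the timer armed,
which is the [2]->[2] loop from the first note.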