Message-ID: <1f2ad071e59db2ed8bc0b382ae202b7474d07afc.camel@redhat.com>
Date: Fri, 31 Oct 2025 14:24:17 +0100
From: Gabriele Monaco <gmonaco@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>, linux-kernel@...r.kernel.org, Ingo
 Molnar <mingo@...hat.com>, Clark Williams <williams@...hat.com>,
 arighi@...dia.com
Subject: Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with
 expired deadline

On Fri, 2025-10-31 at 14:05 +0100, Peter Zijlstra wrote:
> On Thu, Oct 30, 2025 at 07:42:05PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 22, 2025 at 12:11:51PM +0200, Gabriele Monaco wrote:
> > > 
> > > Is this expected?
> > 
> > Sort of, that was next on the list. Let me see if I can make it stop a
> > little more.
> 
> OK, so I've gone over things again and all I got was a comment.
> 
> That is, today I think it all works as expected.
> 
> The dl_server will stop once the fair class goes idle long enough. Can
> you confirm this?
> 

I'm going to go through your comment more carefully, but what I can observe
right now is a bit different:

After this patch, consuming the server's bandwidth from background fair tasks
and from idle time is equivalent: updating idle time does replenish the server
after its runtime is exhausted, and we never stop it (IMO that is the correct
behaviour only for fair tasks, since there is potentially something to do).
At least this is the behaviour I get on a mostly idle system.
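
To make sure we are looking at the same path, this is my reading of step [2]
in your comment as a condensed sketch (paraphrased from the flow you describe,
not copied from deadline.c, so the exact conditions may well be off):

static void dl_defer_loop_sketch(struct sched_dl_entity *dl_se)
{
	/* fair tasks *or* idle time consumed the server's bandwidth */
	if (dl_se->dl_defer && dl_se->dl_throttled && dl_runtime_exceeded(dl_se)) {
		dl_se->dl_defer_running = 0;

		/* stop the zero-laxity timer ... */
		hrtimer_try_to_cancel(&dl_se->dl_timer);

		/*
		 * ... advance the period, set dl_throttled/dl_defer_armed
		 * again and restart the timer: back to waiting for
		 * zero-laxity.
		 */
		replenish_dl_new_period(dl_se, rq_of_dl_se(dl_se));

		/*
		 * Nothing here distinguishes "fair tasks ate the budget"
		 * from "idle time ate the budget", so on a mostly idle
		 * system we keep looping [2]->[2] and never stop the server.
		 */
	}
}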

The scenario is different if the CPU is busy with other tasks (e.g. with RT
policies): there I can see the server stopping and starting again.
After doing this I seem to get a different behaviour (even some boosting after
idle); I'm still trying to understand what's going on.
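
For reference, something along these lines is enough to get into the "CPU busy
with RT" case (just a standalone sketch, the CPU number and the priority below
are arbitrary):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;
	struct sched_param sp = { .sched_priority = 10 };

	/* pin ourselves to the CPU under test (CPU 1 here, arbitrary) */
	CPU_ZERO(&set);
	CPU_SET(1, &set);
	if (sched_setaffinity(0, sizeof(set), &set)) {
		perror("sched_setaffinity");
		return 1;
	}

	/* become a SCHED_FIFO task and spin, starving fair tasks there */
	if (sched_setscheduler(0, SCHED_FIFO, &sp)) {
		perror("sched_setscheduler");
		return 1;
	}

	for (;;)
		;
}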

Does this behaviour make sense to you?

Thanks,
Gabriele

> ---
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1152,6 +1152,94 @@ static void __push_dl_task(struct rq *rq
>  /* a defer timer will not be reset if the runtime consumed was < dl_server_min_res */
>  static const u64 dl_server_min_res = 1 * NSEC_PER_MSEC;
>  
> +
> +/*
> + * dl_server && dl_defer:
> + *   dl_defer_armed = 0
> + *   dl_defer_running = 0
> + *   dl_throttled = 0
> + *
> + * [1] dl_server_start()
> + *   dl_server_active = 1;
> + *   enqueue_dl_entity()
> + *     update_dl_entity(WAKEUP)
> + *       if (!dl_defer_running)
> + *         dl_defer_armed = 1;
> + *         dl_defer_throttled = 1;
> + *     if (dl_throttled && start_dl_timer())
> + *       return;
> + *       // start server into waiting for zero-laxity
> + *
> + * // deplete server runtime from fair-class
> + * [2] update_curr_dl_se()
> + *   if (dl_defer && dl_throttled && dl_runtime_exceeded())
> + *     dl_defer_running = 0;
> + *     hrtimer_try_to_cancel();   // stop timer
> + *     replenish_dl_new_period()
> + *       // advance period
> + *       dl_throttled = 1;
> + *       dl_defer_armed = 1;
> + *       start_dl_timer();        // restart timer
> + *       // back into waiting for zero-laxity
> + *
> + * // timer actually fires means we have runtime
> + * [4] dl_server_timer()
> + *   if (dl_defer_armed)
> + *     dl_defer_running = 1;
> + *   enqueue_dl_entity(REPLENISH)
> + *     replenish_dl_entity()
> + *       opt-fwd-period
> + *       if (dl_throttled)
> + *         dl_throttled = 0;
> + *       if (dl_defer_armed)
> + *         dl_defer_armed = 0;
> + *     __enqueue_dl_entity();
> + *     // server queued
> + *
> + * // schedule server
> + * [5] pick_task_dl()
> + *   p = server_pick_task();
> + *   if (!p)
> + *     dl_server_stop()
> + *       dequeue_dl_entity();
> + *       hrtimer_try_to_cancel();
> + *       dl_defer_armed = 0;
> + *       dl_throttled = 0;
> + *       dl_server_active = 0;
> + *       // goto [1]
> + *
> + * // server running
> + * [6] update_curr_dl_se()
> + *   if (dl_runtime_exceeded())
> + *     dl_throttled = 1;
> + *     dequeue_dl_entity();
> + *     start_dl_timer();
> + *     // replenish-timer
> + *
> + * // goto [2]
> + *
> + * [7] dl_server_timer()
> + *   enqueue_dl_entity(REPLENISH)
> + *     replenish_dl_entity()
> + *       fwd-period
> + *       if (dl_throttled)
> + *         dl_throttled = 0;
> + *     __enqueue_dl_entity();
> + *     // goto [5]
> + *
> + * Notes:
> + *
> + *  - When there are fair tasks running the most likely loop is [2]->[2].
> + *    the dl_server never actually runs, the timer never fires.
> + *
> + *  - When there is actual fair starvation; the timer fires and starts the
> + *    dl_server. This will then throttle and replenish like a normal DL
> + *    task. Notably it will not 'defer' again.
> + *
> + *  - When fair goes idle, it will not consume dl_server budget so the server
> + *    will start. However, it will find there are no fair tasks to run and
> + *    stop itself.
> + */
>  static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_dl_entity *dl_se)
>  {
>  	struct rq *rq = rq_of_dl_se(dl_se);

