linux-kernel - Re: [PATCH v5 6/7] sched/deadline: Deferrable dl server

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZUulyj7tXnZzv5V6@localhost.localdomain>
Date:   Wed, 8 Nov 2023 16:14:18 +0100
From:   Juri Lelli <juri.lelli@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Daniel Bristot de Oliveira <bristot@...nel.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Ingo Molnar <mingo@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org,
        Luca Abeni <luca.abeni@...tannapisa.it>,
        Tommaso Cucinotta <tommaso.cucinotta@...tannapisa.it>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vineeth Pillai <vineeth@...byteword.org>,
        Shuah Khan <skhan@...uxfoundation.org>,
        Phil Auld <pauld@...hat.com>
Subject: Re: [PATCH v5 6/7] sched/deadline: Deferrable dl server

Hi Peter,

On 08/11/23 13:44, Peter Zijlstra wrote:
> On Tue, Nov 07, 2023 at 07:50:28PM +0100, Daniel Bristot de Oliveira wrote:
> > > The code is not doing what I intended because I thought it was doing overload
> > > control on the replenishment, but it is not (my bad).
> > > 
> > 
> > I am still testing but... it is missing something like this (famous last words).
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 1092ca8892e0..6e2d21c47a04 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -842,6 +842,8 @@ static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
> >   * runtime, or it just underestimated it during sched_setattr().
> >   */
> >  static int start_dl_timer(struct sched_dl_entity *dl_se);
> > +static bool dl_entity_overflow(struct sched_dl_entity *dl_se, u64 t);
> > +
> >  static void replenish_dl_entity(struct sched_dl_entity *dl_se)
> >  {
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> > @@ -852,9 +854,18 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se)
> >  	/*
> >  	 * This could be the case for a !-dl task that is boosted.
> >  	 * Just go with full inherited parameters.
> > +	 *
> > +	 * Or, it could be the case of a zerolax reservation that
> > +	 * was not able to consume its runtime in background and
> > +	 * reached this point with current u > U.
> > +	 *
> > +	 * In both cases, set a new period.
> >  	 */
> > -	if (dl_se->dl_deadline == 0)
> > -		replenish_dl_new_period(dl_se, rq);
> > +	if (dl_se->dl_deadline == 0 ||
> > +		(dl_se->dl_zerolax_armed && dl_entity_overflow(dl_se, rq_clock(rq)))) {
> > +			dl_se->deadline = rq_clock(rq) + pi_of(dl_se)->dl_deadline;
> > +			dl_se->runtime = pi_of(dl_se)->dl_runtime;
> > +	}
> > 
> >  	if (dl_se->dl_yielded && dl_se->runtime > 0)
> >  		dl_se->runtime = 0;
> 
> Should we rather not cap the runtime, something like so?
> 
> Because the above also causes period drift, which we do not want.

I was honestly also concerned with the drift, but then thought it might
not be an issue for the dl_server (zerolax), as it doesn't have a
userspace counterpart that relies on synchronized clocks?

> 
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 58b542bf2893..1453a2cd0680 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -829,10 +829,12 @@ static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
>   */
>  static void replenish_dl_entity(struct sched_dl_entity *dl_se)
>  {
> +	struct sched_dl_entity *pi_se = pi_of(dl_se);
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
> +	u64 dl_runtime = pi_se->dl_runtime;
>  
> -	WARN_ON_ONCE(pi_of(dl_se)->dl_runtime <= 0);
> +	WARN_ON_ONCE(dl_runtime <= 0);
>  
>  	/*
>  	 * This could be the case for a !-dl task that is boosted.
> @@ -851,10 +853,13 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se)
>  	 * arbitrary large.
>  	 */
>  	while (dl_se->runtime <= 0) {
> -		dl_se->deadline += pi_of(dl_se)->dl_period;
> -		dl_se->runtime += pi_of(dl_se)->dl_runtime;
> +		dl_se->deadline += pi_se->dl_period;
> +		dl_se->runtime += dl_runtime;
>  	}
>  
> +	if (dl_se->zerolax && dl_se->runtime > dl_runtime)
> +		dl_se->runtime = dl_runtime;
> +

Anyway, I have the impression that this breaks EDF/CBS, as we are letting
the dl_server run with full dl_runtime w/o postponing the period
(essentially an u = 1 reservation until runtime is depleted).

I would say we need to either do

dl_se->deadline += pi_of(dl_se)->dl_period;
dl_se->runtime = pi_of(dl_se)->dl_runtime;

or (as Daniel proposed)

dl_se->deadline = rq_clock(rq) + pi_of(dl_se)->dl_deadline;
dl_se->runtime = pi_of(dl_se)->dl_runtime;

and I seem to be inclined towards the latter, as the former would
essentially reduce dl_server bandwidth under dl_runtime/dl_period at
times.

Best,
Juri