lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170504141721.5f6tlayqwyq4cxnw@hirez.programming.kicks-ass.net>
Date:   Thu, 4 May 2017 16:17:21 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Daniel Bristot de Oliveira <bristot@...hat.com>
Cc:     Ingo Molnar <mingo@...hat.com>, Xunlei Pang <xpang@...hat.com>,
        Juri Lelli <juri.lelli@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Luca Abeni <luca.abeni@...tannapisa.it>,
        Tommaso Cucinotta <tommaso.cucinotta@...up.it>,
        Romulo Silva de Oliveira <romulo.deoliveira@...c.br>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] sched/deadline: Use the revised wakeup rule for
 suspending constrained dl tasks

On Mon, Apr 24, 2017 at 05:18:35PM +0200, Daniel Bristot de Oliveira wrote:
> We have been facing some problems with self-suspending constrained
> deadline tasks. The main reason is that the original CBS was not
> designed for such sort of tasks.
> 
> One problem reported by Xunlei Pang takes place when a task
> suspends, and then is awakened before the deadline, but so close
> to the deadline that its remaining runtime can cause the task
> to have an absolute density higher than allowed. In such situation,
> the original CBS assumes that the task is facing an early activation,
> and so it replenishes the task and set another deadline, one deadline
> in the future. This rule works fine for implicit deadline tasks.
> Moreover, it allows the system to adapt the period of a task in which
> the external event source suffered from a clock drift.
> 
> However, this opens the window for bandwidth leakage for constrained
> deadline tasks. For instance, a task with the following parameters:
> 
>   runtime   = 5 ms
>   deadline  = 7 ms
>   [density] = 5 / 7 = 0.71
>   period    = 1000 ms
> 
> If the task runs for 1 ms, and then suspends for another 1ms,
> it will be awakened with the following parameters:
> 
>   remaining runtime = 4
>   laxity = 5
> 
> presenting a absolute density of 4 / 5 = 0.80.
> 
> In this case, the original CBS would assume the task had an early
> wakeup. Then, CBS will reset the runtime, and the absolute deadline will
> be postponed by one relative deadline, allowing the task to run.
> 
> The problem is that, if the task runs this pattern forever, it will keep
> receiving bandwidth, being able to run 1ms every 2ms. Following this
> behavior, the task would be able to run 500 ms in 1 sec. Thus running
> more than the 5 ms / 1 sec the admission control allowed it to run.
> 
> Trying to address the self-suspending case, Luca Abeni, Giuseppe
> Lipari, and Juri Lelli [1] revisited the CBS in order to deal with
> self-suspending tasks. In the new approach, rather than
> replenishing/postponing the absolute deadline, the revised wakeup rule
> adjusts the remaining runtime, reducing it to fit into the allowed
> density.
> 
> A resumed version of the idea is:
> 
> At a given time t, the maximum absolute density of a task cannot be
> higher than its relative density, that is:
> 
>   runtime / (deadline - t) <= dl_runtime / dl_deadline
> 
> Knowing the laxity of a task (deadline - t), it is possible to move
> it to the other side of the equality, thus enabling to define max
> remaining runtime a task can use within the absolute deadline, without
> over-running the allowed density:
> 
>   runtime = (dl_runtime / dl_deadline) * (deadline - t)
> 
> For instance, in our previous example, the task could still run:
> 
>   runtime = ( 5 / 7 ) * 4
>   runtime = 2.85 ms
> 
> Without causing damage for other deadline tasks. It is note worth that
> the laxity cannot be negative because that would cause a negative
> runtime. Thus, this patch depends on the patch:
> 
>   edf5835 sched/deadline: Throttle a constrained deadline task activated
>           after the deadline

My git tree says that is:

df8eac8cafce ("sched/deadline: Throttle a constrained deadline task activated after the deadline")

> Which throttles a constrained deadline task activated after the
> deadline.
> 
> Finally, it is also possible to use the revised wakeup rule for
> all other tasks, but that would require some more discussions
> about pros and cons.
> 
> Reported-by: Xunlei Pang <xpang@...hat.com>
> Signed-off-by: Daniel Bristot de Oliveira <bristot@...hat.com>
> Cc: Xunlei Pang <xpang@...hat.com>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Juri Lelli <juri.lelli@....com>
> Cc: Steven Rostedt <rostedt@...dmis.org>
> Cc: Luca Abeni <luca.abeni@...tannapisa.it>
> Cc: Tommaso Cucinotta <tommaso.cucinotta@...up.it>
> Cc: Romulo Silva de Oliveira <romulo.deoliveira@...c.br>
> Cc: linux-kernel@...r.kernel.org
> ---
>  kernel/sched/deadline.c | 67 +++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 60 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index a2ce590..71e5bcf 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -484,13 +484,63 @@ static bool dl_entity_overflow(struct sched_dl_entity *dl_se,
>  }
>  
>  /*
> + * Revised wakeup rule [1]: For self-suspending tasks, rather then
> + * re-initializing task's runtime and deadline, the revised wakeup
> + * rule adjusts the task's runtime to avoid the task to overrun its
> + * density.
> + *
> + * Reasoning: a task may overrun the density if:
> + *    runtime / (deadline - t) > dl_runtime / dl_deadline

When reading that, I have the instant question: "why / how ?" I suspect
the blurb below (at update_dl_entity) has the answer, if so this can use
a reference thereto.

> + *
> + * Therefore, runtime can be adjusted to:
> + *     runtime = (dl_runtime / dl_deadline) * (deadline - t)
> + *
> + * In such way that runtime will be equals to the maximum density
> + * the task can use without breaking any rule.
> + *
> + * [1] Luca Abeni, Giuseppe Lipari, and Juri Lelli. 2015. Constant
> + * bandwidth server revisited. SIGBED Rev. 11, 4 (January 2015), 19-24.
> + */
> +static void
> +update_dl_revised_wakeup(struct sched_dl_entity *dl_se, struct rq *rq)
> +{
> +	u64 density = div64_u64(dl_se->dl_runtime << 20, dl_se->dl_deadline);
> +	u64 laxity = dl_se->deadline - rq_clock(rq);
> +
> +	BUG_ON(laxity < 0);

Compiler will make that go away, by virtue of laxity being unsigned.

> +
> +	dl_se->runtime = density * laxity >> 20;
> +}
> +
/*
 * XXX comment that explains what constrained is ? per the definition
 * below that is any task where deadline != period?
 */
> +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se)
> +{
> +	return dl_se->dl_deadline < dl_se->dl_period;
> +}
> +
> +/*
>   * When a -deadline entity is queued back on the runqueue, its runtime and
>   * deadline might need updating.
>   *
> + * Currently, we are using two different CBS rules, 1) the original CBS
> + * for implicit deadline tasks; 2) the revisited CBS for constrained
> + * deadline ones. The reason is that the original CBS can cause a constrained
> + * deadline task to be replenished deadline/period times in a period, in
> + * the worst case, hence allowing it to run more than runtime/period.
> + * In order to prevent this misbehave, the revisited CBS is used for
> + * constrained deadline tasks. In the revisited CBS, in the case of an
> + * overload, rather than replenishing & postponing the deadline, the
> + * remaining runtime of a task is reduced to avoid runtime overflow.

We use two difference CBS rules:

 1) the original CBS rule for implicit deadline tasks;
 2) the revised CBS rule for constrained deadline tasks.

( and here I have the question, wth is implicit / constrained )

The reason is that the original CBS rul can cause a constrained deadline
task to be replenished deadline/period times in a period (worst case),
hence allowing it to run more than runtime/period.

( the deadline/period thing could use a few words extra; in the example
  above that divides to: 7/1000 = .007 which does not have an integer
  component. You cannot replenish fractional times etc.. )

In order to prevent this, the revised CBS rule reduces the remaining
runtime (by limiting the max density).

> + *
> + * So, for implicit deadline tasks, the policy here is that the runtime &
> + * deadline of an entity are update if and only if., either:

updated ? IFF

> + *  - the current deadline is in the past, or
>   *  - using the remaining runtime with the current deadline would make
>   *    the entity exceed its bandwidth.
> + *
> + * For constrained deadline tasks, the policy here is that the runtime
> + * is reduced to avoid exceeding its bandwidth if:

> + *   - using the remaining runtime with the current deadline would make
> + *     the entity exceed its bandwidth.
>   */

In general the comment on comments is to add whitespace. That greatly
increases readability.

>  static void update_dl_entity(struct sched_dl_entity *dl_se,
>  			     struct sched_dl_entity *pi_se)
> @@ -500,6 +550,14 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
> +
> +		if (unlikely(dl_is_constrained(dl_se) &&
> +		    !dl_time_before(dl_se->deadline, rq_clock(rq)) &&

alignment is misleading, this is still inside the unlikely(.

> +		    !dl_se->dl_boosted)){
> +			update_dl_revised_wakeup(dl_se, rq);
> +			return;
> +		}
> +
>  		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
>  		dl_se->runtime = pi_se->dl_runtime;
>  	}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ