Date:	Tue, 2 Sep 2014 23:45:38 +0200
From:	Henrik Austad <henrik@...tad.us>
To:	Juri Lelli <juri.lelli@....com>
Cc:	peterz@...radead.org, luca.abeni@...tn.it, rdunlap@...radead.org,
	mingo@...hat.com, raistlin@...ux.it, juri.lelli@...il.com,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 3/4] Documentation/scheduler/sched-deadline.txt:
 improve and clarify AC bits

On Thu, Aug 28, 2014 at 11:00:28AM +0100, Juri Lelli wrote:
> From: Luca Abeni <luca.abeni@...tn.it>
> 
> Admission control is of key importance for SCHED_DEADLINE, since it guarantees
> system schedulability (or tells us something about the degree of guarantees
> we can provide to the user).
> 
> This patch improves and clarifies bits and pieces regarding AC, both for UP
> and SMP systems.
> 
> Signed-off-by: Luca Abeni <luca.abeni@...tn.it>
> Signed-off-by: Juri Lelli <juri.lelli@....com>
> Cc: Randy Dunlap <rdunlap@...radead.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Henrik Austad <henrik@...tad.us>
> Cc: Dario Faggioli <raistlin@...ux.it>
> Cc: Juri Lelli <juri.lelli@...il.com>
> Cc: linux-doc@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> ---
>  Documentation/scheduler/sched-deadline.txt | 89 +++++++++++++++++++++++++-----
>  1 file changed, 75 insertions(+), 14 deletions(-)
> 
> diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt
> index 0aff2d5..641395e 100644
> --- a/Documentation/scheduler/sched-deadline.txt
> +++ b/Documentation/scheduler/sched-deadline.txt
> @@ -38,16 +38,17 @@ CONTENTS
>  ==================
>  
>   SCHED_DEADLINE uses three parameters, named "runtime", "period", and
> - "deadline" to schedule tasks. A SCHED_DEADLINE task is guaranteed to receive
> + "deadline", to schedule tasks. A SCHED_DEADLINE task should receive
>   "runtime" microseconds of execution time every "period" microseconds, and
>   these "runtime" microseconds are available within "deadline" microseconds
>   from the beginning of the period.  In order to implement this behaviour,
>   every time the task wakes up, the scheduler computes a "scheduling deadline"
>   consistent with the guarantee (using the CBS[2,3] algorithm). Tasks are then
>   scheduled using EDF[1] on these scheduling deadlines (the task with the
> - closest scheduling deadline is selected for execution). Notice that this
> - guaranteed is respected if a proper "admission control" strategy (see Section
> - "4. Bandwidth management") is used.
> + closest scheduling deadline is selected for execution). Notice that the
> + task actually receives "runtime" time units within "deadline" if a proper
> + "admission control" strategy (see Section "4. Bandwidth management") is used
> + (clearly, if the system is overloaded this guarantee cannot be respected).
>  
>   Summing up, the CBS[2,3] algorithms assigns scheduling deadlines to tasks so
>   that each task runs for at most its runtime every period, avoiding any
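
For anyone reading along, a userspace sketch of how these three parameters
actually get set (untested, written from memory; the struct layout and the
SCHED_DEADLINE value are my reading of include/uapi/linux/sched.h, and the
syscall number is the x86_64 one, so treat both as assumptions and check
your headers):

/*
 * Untested sketch: put the calling thread into SCHED_DEADLINE with
 * runtime=10ms, deadline=30ms, period=100ms (all in nanoseconds).
 */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE	6		/* value from include/uapi/linux/sched.h */
#endif

#ifndef __NR_sched_setattr
# ifdef __x86_64__
#  define __NR_sched_setattr	314	/* assumption; check your arch's unistd.h */
# endif
#endif

struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	/* SCHED_DEADLINE parameters, in nanoseconds */
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr = {
		.size           = sizeof(attr),
		.sched_policy   = SCHED_DEADLINE,
		.sched_runtime  =  10 * 1000 * 1000,	/*  10 ms */
		.sched_deadline =  30 * 1000 * 1000,	/*  30 ms */
		.sched_period   = 100 * 1000 * 1000,	/* 100 ms */
	};

	/* pid 0 means "the calling thread" */
	if (syscall(__NR_sched_setattr, 0, &attr, 0)) {
		perror("sched_setattr");
		return 1;
	}
	/* ... periodic work goes here ... */
	return 0;
}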
> @@ -134,6 +135,50 @@ CONTENTS
>   A real-time task can be periodic with period P if r_{j+1} = r_j + P, or
>   sporadic with minimum inter-arrival time P is r_{j+1} >= r_j + P. Finally,
>   d_j = r_j + D, where D is the task's relative deadline.
> + The utilisation of a real-time task is defined as the ratio between its
> + WCET and its period (or minimum inter-arrival time), and represents
> + the fraction of CPU time needed to execute the task.
> +
> + If the total utilisation sum_i(WCET_i/P_i) is larger than M (with M equal
> + to the number of CPUs), then the scheduler is unable to respect all the
> + deadlines.
> + Note that total utilisation is defined as the sum of the utilisations
> + WCET_i/P_i over all the real-time tasks in the system. When considering
> + multiple real-time tasks, the parameters of the i-th task are indicated
> + with the "_i" suffix.
> + Moreover, if the total utilisation is larger than M, then we risk starving
> + non- real-time tasks by real-time tasks.
> + If, instead, the total utilisation is smaller than M, then non real-time
> + tasks will not be starved and the system might be able to respect all the
> + deadlines.
> + As a matter of fact, in this case it is possible to provide an upper bound
> + for tardiness (defined as the maximum between 0 and the difference
> + between the finishing time of a job and its absolute deadline).
> + More precisely, it can be proven that using a global EDF scheduler the
> + maximum tardiness of each task is smaller or equal than
> +	((M − 1) · WCET_max − WCET_min)/(M − (M − 2) · U_max) + WCET_max
> + where WCET_max = max_i{WCET_i} is the maximum WCET, WCET_min=min_i{WCET_i}
> + is the minimum WCET, and U_max = max_i{WCET_i/P_i} is the maximum utilisation.
> +
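
Since the bound above is easy to misread in ASCII, here is how I read it, as
a throwaway userspace sketch (the task set and all numbers are made up,
purely illustrative):

/*
 * Illustrative sketch only (not kernel code): total utilisation and the
 * global-EDF tardiness bound quoted above, for a made-up task set.
 * All times are in the same unit (say, ms).
 */
#include <stdio.h>

struct task { double wcet, period; };

int main(void)
{
	const struct task ts[] = { { 5, 20 }, { 10, 50 }, { 20, 50 } };
	const int n = sizeof(ts) / sizeof(ts[0]);
	const int M = 2;			/* number of CPUs */
	double u_tot = 0, u_max = 0, wcet_max = 0, wcet_min = ts[0].wcet;

	for (int i = 0; i < n; i++) {
		double u = ts[i].wcet / ts[i].period;

		u_tot += u;
		if (u > u_max)
			u_max = u;
		if (ts[i].wcet > wcet_max)
			wcet_max = ts[i].wcet;
		if (ts[i].wcet < wcet_min)
			wcet_min = ts[i].wcet;
	}

	printf("total utilisation: %.2f (M = %d)\n", u_tot, M);
	if (u_tot <= M)
		printf("tardiness bound: %.2f\n",
		       ((M - 1) * wcet_max - wcet_min) /
		       (M - (M - 2) * u_max) + wcet_max);
	return 0;
}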
> + If M=1 (uniprocessor system), or in case of partitioned scheduling (each
> + real-time task is statically assigned to one and only one CPU), it is
> + possible to formally check if all the deadlines are respected.
> + If D_i = P_i for all tasks, then EDF is able to respect all the deadlines
> + of all the tasks executing on a CPU if and only if the total utilisation
> + of the tasks running on such a CPU is smaller or equal than 1.
> + If D_i != P_i for some task, then it is possible to define the density of
> + a task as C_i/min{D_i,T_i}, and EDF is able to respect all the deadlines
> + of all the tasks running on a CPU if the sum sum_i C_i/min{D_i,T_i} of the
> + densities of the tasks running on such a CPU is smaller or equal than 1
> + (notice that this condition is only sufficient, and not necessary).
> +
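
The per-CPU test in the previous paragraph is simple enough to sketch as well
(again illustrative only; I'm taking C_i and T_i to mean WCET_i and P_i, and
the task parameters are invented):

/*
 * Sketch of the sufficient per-CPU test described above:
 * sum_i WCET_i / min(D_i, P_i) <= 1. With D_i == P_i for all tasks it
 * reduces to the exact utilisation test.
 */
#include <stdbool.h>
#include <stdio.h>

struct task { double wcet, deadline, period; };

static bool cpu_admits(const struct task *ts, int n)
{
	double density = 0;

	for (int i = 0; i < n; i++) {
		double d = ts[i].deadline < ts[i].period ?
			   ts[i].deadline : ts[i].period;

		density += ts[i].wcet / d;
	}
	return density <= 1.0;
}

int main(void)
{
	const struct task ts[] = { { 2, 8, 10 }, { 5, 20, 20 } };

	printf("admit: %s\n", cpu_admits(ts, 2) ? "yes" : "no");
	return 0;
}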
> + On multiprocessor systems with global EDF scheduling (non partitioned
> + systems), a sufficient test for schedulability can not be based on the
> + utilisations (it can be shown that task sets with utilisations slightly
> + larger than 1 can miss deadlines regardless of the number of CPUs M).
> + However, as previously stated, enforcing that the total utilisation is smaller
> + than M is enough to guarantee that non real-time tasks are not starved and
> + that the tardiness of real-time tasks has an upper bound.

I'd _really_ appreciate a link to a paper where all of this is presented 
and proved!

>   SCHED_DEADLINE can be used to schedule real-time tasks guaranteeing that
>   the jobs' deadlines of a task are respected. In order to do this, a task
> @@ -163,14 +208,22 @@ CONTENTS
>  4. Bandwidth management
>  =======================
>  
> - In order for the -deadline scheduling to be effective and useful, it is
> - important to have some method to keep the allocation of the available CPU
> - bandwidth to the tasks under control. This is usually called "admission
> - control" and if it is not performed at all, no guarantee can be given on
> - the actual scheduling of the -deadline tasks.
> -
> - The interface used to control the fraction of CPU bandwidth that can be
> - allocated to -deadline tasks is similar to the one already used for -rt
> + As previously mentioned, in order for -deadline scheduling to be
> + effective and useful (that is, to be able to provide "runtime" time units
> + within "deadline"), it is important to have some method to keep the allocation
> + of the available fractions of CPU time to the various tasks under control.
> + This is usually called "admission control" and if it is not performed, then
> + no guarantee can be given on the actual scheduling of the -deadline tasks.
> +
> + As already stated in Section 3, a necessary condition to be respected to
> + correctly schedule a set of real-time tasks is that the total utilisation
> + is smaller than M. When talking about -deadline tasks, this requires to
> + impose that the sum of the ratio between runtime and period for all tasks
> + is smaller than M.

"This requires to impose that .." uhm, what? Drop 'to impose'.

> [...] Notice that the ratio runtime/period is equivalent to
> + the utilisation of a "traditional" real-time task, and is also often
> + referred to as "bandwidth".
> + The interface used to control the CPU bandwidth that can be allocated
> + to -deadline tasks is similar to the one already used for -rt
>   tasks with real-time group scheduling (a.k.a. RT-throttling - see
>   Documentation/scheduler/sched-rt-group.txt), and is based on readable/
>   writable control files located in procfs (for system wide settings).
> @@ -232,8 +285,16 @@ CONTENTS
>   950000. With rt_period equal to 1000000, by default, it means that -deadline
>   tasks can use at most 95%, multiplied by the number of CPUs that compose the
>   root_domain, for each root_domain.
> -
> - A -deadline task cannot fork.
> + This means that non -deadline tasks will receive at least 5% of the CPU time,
> + and that -deadline tasks will receive their runtime with a guaranteed
> + worst-case delay respect to the "deadline" parameter. If "deadline" = "period"
> + and the cpuset mechanism is used to implement partitioned scheduling (see
> + Section 5), then this simple setting of the bandwidth management is able to
> + deterministically guarantee that -deadline tasks will receive their runtime
> + in a period.

About the 950000 / 1000000 default: is that at least 50 *consecutive* ms 
given to non-rt/dl tasks every second, or is this more fine-grained now?

If the 50ms can be given in a single go, then I don't think you can 
guarantee that deadline-tasks will receive their runtime in a period - a 
period can be <50ms, no?
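
Either way, for checking what the knobs are actually set to on a given box,
something like this does the trick (sketch; the proc paths are the ones
documented in sched-rt-group.txt):

/* Read the system-wide rt/dl bandwidth knobs and print the resulting
 * per-CPU fraction. */
#include <stdio.h>

static long read_long(const char *path)
{
	long val = -1;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%ld", &val) != 1)
			val = -1;
		fclose(f);
	}
	return val;
}

int main(void)
{
	long runtime = read_long("/proc/sys/kernel/sched_rt_runtime_us");
	long period  = read_long("/proc/sys/kernel/sched_rt_period_us");

	if (runtime < 0 || period <= 0)
		return 1;
	printf("rt/dl tasks may use %.1f%% of each CPU\n",
	       100.0 * runtime / period);
	return 0;
}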

> +
> + Finally, notice that in order not to jeopardize this admission control a
> + -deadline task cannot fork.

s/this/the
(there aren't any other admission controls in the kernel)

>  
>  5. Tasks CPU affinity
>  =====================
> -- 
> 2.0.4
> 
> 

-- 
Henrik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
