linux-kernel - Re: [PATCH v3 3/4] Documentation/scheduler/sched-deadline.txt: improve and clarify AC bits

Message-ID: <5406DCE6.7@arm.com>
Date:	Wed, 03 Sep 2014 10:18:30 +0100
From:	Juri Lelli <juri.lelli@....com>
To:	Luca Abeni <luca.abeni@...tn.it>, Henrik Austad <henrik@...tad.us>
CC:	"peterz@...radead.org" <peterz@...radead.org>,
	"rdunlap@...radead.org" <rdunlap@...radead.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"raistlin@...ux.it" <raistlin@...ux.it>,
	"juri.lelli@...il.com" <juri.lelli@...il.com>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 3/4] Documentation/scheduler/sched-deadline.txt: improve
 and clarify AC bits

Hi,

On 03/09/14 07:49, Luca Abeni wrote:
> Hi,
> 
> On 09/02/2014 11:45 PM, Henrik Austad wrote:
> [...]
>>> + On multiprocessor systems with global EDF scheduling (non partitioned
>>> + systems), a sufficient test for schedulability can not be based on the
>>> + utilisations (it can be shown that task sets with utilisations slightly
>>> + larger than 1 can miss deadlines regardless of the number of CPUs M).
>>> + However, as previously stated, enforcing that the total utilisation is smaller
>>> + than M is enough to guarantee that non real-time tasks are not starved and
>>> + that the tardiness of real-time tasks has an upper bound.
>>
>> I'd _really_ appreciate a link to a paper where all of this is presented
>> and proved!
> Well, my original plan was to add the bibliography in the next round of patches...
> Is this ok?
> 
> [...]
>>> + As already stated in Section 3, a necessary condition to be respected to
>>> + correctly schedule a set of real-time tasks is that the total utilisation
>>> + is smaller than M. When talking about -deadline tasks, this requires to
>>> + impose that the sum of the ratio between runtime and period for all tasks
>>> + is smaller than M.
>>
>> "This requires to impose that .." uhm, what? Drop 'to impose'.
> Ok. I'll send an updated patch to Juri in a few days.
> 
> 
>>> [...] Notice that the ratio runtime/period is equivalent to
>>> + the utilisation of a "traditional" real-time task, and is also often
>>> + referred to as "bandwidth".
>>> + The interface used to control the CPU bandwidth that can be allocated
>>> + to -deadline tasks is similar to the one already used for -rt
>>>    tasks with real-time group scheduling (a.k.a. RT-throttling - see
>>>    Documentation/scheduler/sched-rt-group.txt), and is based on readable/
>>>    writable control files located in procfs (for system wide settings).
>>> @@ -232,8 +285,16 @@ CONTENTS
>>>    950000. With rt_period equal to 1000000, by default, it means that -deadline
>>>    tasks can use at most 95%, multiplied by the number of CPUs that compose the
>>>    root_domain, for each root_domain.
>>> -
>>> - A -deadline task cannot fork.
>>> + This means that non -deadline tasks will receive at least 5% of the CPU time,
>>> + and that -deadline tasks will receive their runtime with a guaranteed
>>> + worst-case delay with respect to the "deadline" parameter. If "deadline" = "period"
>>> + and the cpuset mechanism is used to implement partitioned scheduling (see
>>> + Section 5), then this simple setting of the bandwidth management is able to
>>> + deterministically guarantee that -deadline tasks will receive their runtime
>>> + in a period.
>>
>> With the whole 950000 / 1000000 setting, are at least 50 *consecutive* ms
>> given to non rt/dl tasks every second, or is this more fine-grained now?
>>
>> If the 50ms can be given in a single go, then I don't think you can
>> guarantee that deadline-tasks will receive their runtime in a period - a
>> period can be <50ms, no?
> Uhmm... Maybe there is something I am missing in how the SCHED_DEADLINE admission
> control is implemented, but I do not know about any "50 consecutive ms to non dl
> tasks" rule. I agree that if there is such a rule then deadline tasks are screwed.
> Juri?
> 
> 

In SCHED_DEADLINE we use those values only at admission control time (when
the user calls sched_setattr()). Then, at runtime, we use tasks' parameters
to perform scheduling. So there is no consecutive 50ms time for !SCHED_DEADLINE
tasks.
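
To make the point concrete, here is a minimal sketch of the utilisation
comparison done at admission control time. It is only a model: the helper
name admits(), the (runtime, period) tuples and the use of exact fractions
are assumptions made for illustration, not the kernel's actual code, and the
non-strict comparison is my reading of "at most 95%". The two default values
are the ones quoted above.

```python
from fractions import Fraction

# Default procfs values quoted in this thread, in microseconds.
RT_RUNTIME_US = 950_000
RT_PERIOD_US = 1_000_000


def admits(existing, candidate, cpus,
           rt_runtime_us=RT_RUNTIME_US, rt_period_us=RT_PERIOD_US):
    """Return True if `candidate` fits under the -deadline bandwidth cap.

    Tasks are (runtime_us, period_us) pairs. Hypothetical helper: it models
    only the sum(runtime_i / period_i) check, nothing about runtime behaviour.
    """
    # Cap for one root_domain: configured fraction times the number of CPUs.
    cap = Fraction(rt_runtime_us, rt_period_us) * cpus
    # Total utilisation of the already admitted tasks plus the new one.
    total = sum(Fraction(r, p) for r, p in list(existing) + [candidate])
    return total <= cap


# One CPU, one admitted task using 2ms every 10ms (20%).
running = [(2_000, 10_000)]
print(admits(running, (7_000, 10_000), cpus=1))  # 90% total, under the cap
print(admits(running, (8_000, 10_000), cpus=1))  # 100% total, over the cap
```

Note that the tasks in the example have a 10ms period, shorter than 50ms.
Only the ratios enter the test; the 950000/1000000 pair is never used as a
time window in this model.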

We could probably clarify this aspect in the previous patch with something
like this:

[snip]
+ The interface used to control the fraction of CPU bandwidth that can be
+ allocated to -deadline tasks is similar to the one already used for -rt
+ tasks with real-time group scheduling (a.k.a. RT-throttling - see
+ Documentation/scheduler/sched-rt-group.txt), and is based on readable/
+ writable control files located in procfs (for system wide settings).
+ Notice that per-group settings (controlled through cgroupfs) are still not
+ defined for -deadline tasks, because more discussion is needed in order to
+ figure out how we want to manage SCHED_DEADLINE bandwidth at the task group
+ level.
+
+ A main difference between deadline bandwidth management and RT-throttling
  is that -deadline tasks have bandwidth on their own (while -rt ones don't!),
- and thus we don't need an higher level throttling mechanism to enforce the
- desired bandwidth.
+ and thus we don't need a higher level throttling mechanism to enforce the
---->
+ desired bandwidth. In other words, this means that interface parameters are
+ only used at admission control time (i.e., when the user calls
+ sched_setattr()). Scheduling is then performed considering actual tasks'
+ parameters, so that CPU bandwidth is allocated to SCHED_DEADLINE tasks
+ respecting their needs in terms of granularity. Therefore, using this simple
<---
+ interface we can put a cap on total utilization of -deadline tasks (i.e.,
+ \Sum (runtime_i / period_i) < some_desired_value).
[snip]

What do you think?

Thanks,

- Juri

>>> + Finally, notice that in order not to jeopardize this admission control a
>>> + -deadline task cannot fork.
>>
>> s/this/the
>> (there aren't any other admission controls in the kernel)
> Ok; this will go in my updated patch
> 
> 
> 
> 			Thanks,
> 				Luca
> 
