Message-ID: <150729050549.744832.17481160177674200884.stgit@buzz>
Date:   Fri, 06 Oct 2017 14:48:25 +0300
From:   Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org
Cc:     Tejun Heo <tj@...nel.org>
Subject: [PATCH RFC] sched/cgroup: allow overcommit of rt group runtime

Currently the RT group scheduler enforces a strict non-overcommit policy:

sum(child_runtime / child_period) <= parent_runtime / parent_period

This is reasonable for true real-time applications, but in messy, nested
containerized environments it makes configuration very complicated.

This patch adds the scheduler feature RT_RUNTIME_OVERCOMMIT, which replaces
the strict policy with a restriction similar to CFS bandwidth. A finite
child runtime must not exceed the parent's runtime limit:

max(child_runtime / child_period) <= parent_runtime / parent_period

Infinite child runtime is also allowed if the parent runtime is non-zero.

I.e. a zero rt runtime (the default) forbids realtime tasks inside the
hierarchy.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
---
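
For illustration only, here is a minimal userspace sketch of the two
policies described in the changelog. It is not the kernel code: struct
rt_bw, bw_ratio(), strict_ok() and overcommit_ok() are hypothetical names,
RUNTIME_INF is a stand-in for the kernel constant, and bw_ratio() is only
modeled on to_ratio() (a fixed-point runtime/period ratio).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RUNTIME_INF UINT64_MAX	/* assumption: stand-in for the kernel value */
#define BW_SHIFT 20		/* fixed-point shift; 1 << 20 means 100% */

/* Hypothetical container for one group's rt bandwidth, in nanoseconds. */
struct rt_bw {
	uint64_t period_ns;
	uint64_t runtime_ns;
};

/* Fixed-point runtime/period ratio; infinite runtime counts as 100%. */
uint64_t bw_ratio(struct rt_bw bw)
{
	if (bw.runtime_ns == RUNTIME_INF || bw.period_ns == 0)
		return 1ULL << BW_SHIFT;
	return (bw.runtime_ns << BW_SHIFT) / bw.period_ns;
}

/* Strict policy: the sum of the children's ratios must fit in the parent. */
bool strict_ok(struct rt_bw parent, const struct rt_bw *children, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += bw_ratio(children[i]);
	return sum <= bw_ratio(parent);
}

/*
 * Overcommit policy: each finite child must fit in the parent on its own;
 * an infinite child is accepted only if the parent runtime is non-zero.
 */
bool overcommit_ok(struct rt_bw parent, const struct rt_bw *children, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (children[i].runtime_ns == RUNTIME_INF) {
			if (parent.runtime_ns == 0)
				return false;
		} else if (bw_ratio(children[i]) > bw_ratio(parent)) {
			return false;
		}
	}
	return true;
}
```

With a parent of 500ms per 1s and two children of 300ms per 1s each,
strict_ok() rejects the configuration (0.3 + 0.3 > 0.5) while
overcommit_ok() accepts it, since each child fits in the parent on its own.
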
 Documentation/scheduler/sched-rt-group.txt |   12 ++++++++++++
 kernel/sched/features.h                    |    1 +
 kernel/sched/rt.c                          |   26 ++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt
index d8fce3e78457..123117e86051 100644
--- a/Documentation/scheduler/sched-rt-group.txt
+++ b/Documentation/scheduler/sched-rt-group.txt
@@ -145,6 +145,18 @@ For now, this can be simplified to just the following (but see Future plans):
    \Sum_{i} runtime_{i} <= global_runtime
 
 
+2.4 Overcommit behaviour
+------------------------
+
+Feature RT_RUNTIME_OVERCOMMIT disables the strict non-overcommit behaviour and
+only requires that each child's bandwidth not exceed its parent's bandwidth:
+
+    child_runtime / child_period <= parent_runtime / parent_period
+
+Infinite child runtime is also allowed if the parent runtime is non-zero.
+
+I.e. a zero rt runtime (the default) forbids realtime tasks inside the hierarchy.
+
 3. Future plans
 ===============
 
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index d3fb15555291..aa1ddb35adac 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -78,6 +78,7 @@ SCHED_FEAT(RT_PUSH_IPI, true)
 #endif
 
 SCHED_FEAT(RT_RUNTIME_SHARE, true)
+SCHED_FEAT(RT_RUNTIME_OVERCOMMIT, true)
 SCHED_FEAT(LB_MIN, false)
 SCHED_FEAT(ATTACH_AGE_LOAD, true)
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index e25b460d051f..e2c269394456 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2486,6 +2486,7 @@ static int tg_rt_schedulable(struct task_group *tg, void *data)
 	struct task_group *child;
 	unsigned long total, sum = 0;
 	u64 period, runtime;
+	u64 p_period, p_runtime;
 
 	period = ktime_to_ns(tg->rt_bandwidth.rt_period);
 	runtime = tg->rt_bandwidth.rt_runtime;
@@ -2509,6 +2510,31 @@ static int tg_rt_schedulable(struct task_group *tg, void *data)
 
 	total = to_ratio(period, runtime);
 
+	if (tg->parent == d->tg) {
+		p_period = d->rt_period;
+		p_runtime = d->rt_runtime;
+	} else if (tg->parent) {
+		p_period = ktime_to_ns(tg->parent->rt_bandwidth.rt_period);
+		p_runtime = tg->parent->rt_bandwidth.rt_runtime;
+	} else {
+		p_period = global_rt_period();
+		p_runtime = global_rt_runtime();
+	}
+
+	/*
+	 * Child runtime should not exceed parent runtime,
+	 * but infinite runtime allowed if parent runtime is non-zero.
+	 */
+	if (sched_feat(RT_RUNTIME_OVERCOMMIT)) {
+		if (runtime == RUNTIME_INF) {
+			if (!p_runtime)
+				return -EINVAL;
+		} else if (total > to_ratio(p_period, p_runtime))
+			return -EINVAL;
+
+		return 0;
+	}
+
 	/*
 	 * Nobody can have more than the global setting allows.
 	 */
