Message-ID: <20241216201305.19761-1-mkoutny@suse.com>
Date: Mon, 16 Dec 2024 21:12:56 +0100
From: Michal Koutný <mkoutny@...e.com>
To: cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Frederic Weisbecker <fweisbecker@...e.com>
Subject: [RFC PATCH 0/9] Add kernel cmdline option for rt_group_sched
Although RT_GROUP_SCHED is only available on cgroup v1, there are still
some users of this feature. General-purpose distros (e.g. [1][2][3][4])
cannot enable CONFIG_RT_GROUP_SCHED easily:
- since it prevents creation of RT tasks unless RT runtime is determined
and distributed into cgroup tree,
- grouping of RT threads is not what is desired by default on such
systems,
- it prevents use of cgroup v2 with RT tasks.
This changeset aims to defer the decision whether to have
CONFIG_RT_GROUP_SCHED behavior or not until boot time.
By default RT groups are available as before, but the user can pass
the rt_group_sched=0 kernel cmdline parameter, which disables the
grouping so that the behavior matches !CONFIG_RT_GROUP_SCHED (with some
runtime overhead).
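To illustrate the intended semantics of the parameter, here is a minimal userspace sketch that decides whether RT grouping would be enabled for a given command-line string. It is not the in-kernel parser: the function name rt_group_sched_enabled() is hypothetical, and the sketch assumes that only the literal values 0 and 1 are recognized and that the last occurrence wins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch (not kernel code): report whether RT group scheduling
 * would be enabled for the given kernel command line.
 * Assumptions: only "rt_group_sched=0" and "rt_group_sched=1" are
 * recognized, and the last occurrence wins.
 */
static bool rt_group_sched_enabled(const char *cmdline)
{
	static const char key[] = "rt_group_sched=";
	const size_t keylen = sizeof(key) - 1;
	bool enabled = true;	/* default: RT groups available */
	const char *p = cmdline;

	while (*p) {
		size_t len;

		/* Skip whitespace separating the parameters. */
		p += strspn(p, " \t\n");
		len = strcspn(p, " \t\n");

		/* Match exactly "rt_group_sched=" followed by one character. */
		if (len == keylen + 1 && strncmp(p, key, keylen) == 0) {
			if (p[keylen] == '0')
				enabled = false;
			else if (p[keylen] == '1')
				enabled = true;
		}
		p += len;
	}
	return enabled;
}
```

A caller could feed it the contents of /proc/cmdline to check how the running kernel was booted.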
The series is organized as follows:
1) generic ifdefs cleanup, no functional changes,
2) preparing root_task_group to be used in places that take shortcuts in
the case of !CONFIG_RT_GROUP_SCHED,
3) boot cmdline option that controls cgroup (v1) attributes,
4) conditional bypass of non-root task groups,
5) checks and comments refresh.
The crux of the series is these two patches:
sched: Skip non-root task_groups with disabled RT_GROUP_SCHED
sched: Bypass bandwidth checks with runtime disabled RT_GROUP_SCHED
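The idea behind skipping non-root task groups can be modeled in a few lines of userspace C. This is only a sketch under assumptions: struct task_group_model, rt_group_sched_on and rt_effective_group() are hypothetical names for illustration and do not come from the patches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the scheduler's task group. */
struct task_group_model {
	const char *name;
	struct task_group_model *parent;	/* NULL for the root group */
};

static struct task_group_model root_group_model = { "root", NULL };

/* In this model the flag is set once at boot (rt_group_sched=0/1). */
static bool rt_group_sched_on = true;

/*
 * Pick the group whose RT runqueue a task is accounted to. With
 * grouping disabled, every task is treated as a member of the root
 * group, which is what !CONFIG_RT_GROUP_SCHED amounts to.
 */
static struct task_group_model *
rt_effective_group(struct task_group_model *tg)
{
	if (!rt_group_sched_on)
		return &root_group_model;
	return tg;
}
```

With the flag off, a task in any child group resolves to the root group, so no per-group RT bandwidth needs to be distributed for it.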
Further notes:
- it is not a sched_feat() flag because that can be flipped at any time
- runtime disablement is not implemented as infinite per-cgroup RT limit
since that'd still employ group scheduling which is unlike
!CONFIG_RT_GROUP_SCHED
RFC notes:
- there remain two variants of various functions, one for
CONFIG_RT_GROUP_SCHED and one for !CONFIG_RT_GROUP_SCHED; those could
be folded into one, with runtime-evaluated guards in the folded
functions (I haven't posted that yet because the performance benefit
is unclear)
- I noticed some lockdep issues around rt_runtime_lock but those are also
present in an unpatched kernel (and they seem to have been there for a
long time without complications)
[1] Debian (https://salsa.debian.org/kernel-team/linux/-/blob/debian/latest/debian/config/kernelarch-x86/config)
[2] ArchLinux (https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config)
[3] Fedora (https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel-x86_64-fedora.config)
[4] openSUSE TW (https://github.com/SUSE/kernel-source/blob/stable/config/x86_64/default)
Michal Koutný (9):
sched: Convert CONFIG_RT_GROUP_SCHED macros to code conditions
sched: Remove unneeded macro wrap
sched: Always initialize rt_rq's task_group
sched: Add commandline option for RT_GROUP_SCHED toggling
sched: Skip non-root task_groups with disabled RT_GROUP_SCHED
sched: Bypass bandwidth checks with runtime disabled RT_GROUP_SCHED
sched: Do not construct nor expose RT_GROUP_SCHED structures if
disabled
sched: Add RT_GROUP WARN checks for non-root task_groups
sched: Add annotations to RT_GROUP_SCHED fields
.../admin-guide/kernel-parameters.txt | 5 ++
init/Kconfig | 11 +++
kernel/sched/core.c | 69 +++++++++++++++----
kernel/sched/rt.c | 51 +++++++++-----
kernel/sched/sched.h | 34 +++++++--
kernel/sched/syscalls.c | 5 +-
6 files changed, 137 insertions(+), 38 deletions(-)
base-commit: f92f4749861b06fed908d336b4dee1326003291b
--
2.47.1