Message-ID: <20250731105543.40832-1-yurand2000@gmail.com>
Date: Thu, 31 Jul 2025 12:55:18 +0200
From: Yuri Andriaccio <yurand2000@...il.com>
To: Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>
Cc: linux-kernel@...r.kernel.org,
	Luca Abeni <luca.abeni@...tannapisa.it>,
	Yuri Andriaccio <yuri.andriaccio@...tannapisa.it>
Subject: [RFC PATCH v2 00/25]  Hierarchical Constant Bandwidth Server

Hello,

This is v2 of the Hierarchical Constant Bandwidth Server (HCBS) patchset, which
aims to replace the current RT_GROUP_SCHED mechanism with something more robust
and theoretically sound. The patchset was presented at OSPM25
(https://retis.sssup.it/ospm-summit/), and a summary of its inner workings can
be found at https://lwn.net/Articles/1021332/ . A link to v1 of this patchset
is at the bottom of this message; its cover letter explains in more detail what
the patchset is about and how it is implemented.

The big update in v2 is the addition of migration code, which allows tasks to
be migrated between CPUs (respecting, of course, their affinity settings).

As requested, we've split the big patches into smaller chunks to improve
readability. Additionally, the series has been rebased on the latest tip/master
to keep up with the latest scheduler updates and the new dl_server features.

Last but not least, the first patch, which was posted separately at
https://lore.kernel.org/all/20250725164412.35912-1-yurand2000@gmail.com/ , is
necessary to fully utilize the deadline bandwidth while keeping the fair-servers
active. You can refer to that link for details. The issue addressed by this
patch also affects HCBS: in the current version of the kernel, by default, 5%
of the realtime bandwidth is reserved for fair-servers, 5% is not usable, and
only the remaining 90% can be used by deadline tasks or, in our case, by HCBS
dl_servers. The first patch addresses this issue and makes the full default 95%
of bandwidth available to rt-tasks/servers.
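
As a rough illustration of the arithmetic above, here is a minimal sketch that
is not part of the patchset. The function name and the separate_accounting
flag are made up for this example; the 95% and 5% figures are the defaults
quoted in this message, and the model is per-CPU and deliberately simplified.

```python
from fractions import Fraction

# Defaults quoted in the cover letter, as fractions of one CPU.
RT_LIMIT = Fraction(95, 100)     # default real-time bandwidth cap
FAIR_SERVER = Fraction(5, 100)   # bandwidth reserved for the fair-server


def usable_dl_bandwidth(rt_limit, fair_server, separate_accounting):
    """Bandwidth left for deadline tasks / HCBS dl_servers on one CPU.

    Hypothetical helper: it only models the accounting described in the
    text, not the kernel's actual admission-control code.
    """
    if separate_accounting:
        # With patch 1: fair-servers are accounted separately, so they no
        # longer consume part of the real-time cap.
        return rt_limit
    # Current kernel: the fair-server reservation is charged against the cap.
    return rt_limit - fair_server


before = usable_dl_bandwidth(RT_LIMIT, FAIR_SERVER, separate_accounting=False)
after = usable_dl_bandwidth(RT_LIMIT, FAIR_SERVER, separate_accounting=True)
print(f"before patch 1: {float(before):.0%}, after patch 1: {float(after):.0%}")
```

The difference between the two results is exactly the fair-server
reservation, which is the bandwidth the first patch gives back.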

Summary of the patches:
     1) Account fair-servers bw separately from other dl tasks and servers bw.
   2-5) Preparation patches, so that the RT classes' code can be used both
        for normal and cgroup scheduling.
  6-15) Implementation of HCBS, without migration and with only a single-level
        hierarchy. The old RT_GROUP_SCHED code is removed.
 16-18) Remove cgroups v1 in favour of v2.
    19) Add support for deeper hierarchies.
 20-25) Add support for task migration.

Updates from v1:
- Rebase to tip/master.
- Add migration code.
- Split big patches for better readability.
- Refactor code to use guarded locks where applicable.
- Remove unnecessary patches from v1 which have been addressed differently by
  mainline updates.
- Remove unnecessary checks and general code cleanup.

Notes:
Task migration support needs some extra work to reduce its invasiveness,
especially patches 22-23.

Testing v2:
The HCBS mechanism has been further evaluated on two fully-fledged distros,
rather than in virtual machines, and this latest version proved stable.
A small suite of regression tests shows that the newly added mechanism does not
break fair-servers and other scheduling mechanisms. Stress tests show that our
implementation is robust, while time-based tests demonstrate that the
theoretical analysis of real-time tasksets matches the implementation.

The tests can be found at https://github.com/Yurand2000/HCBS-rust-initrd . The
executables are essentially the same as the ones mentioned in v1, with some
minor updates. You can refer to the v1 cover letter for additional details.

Future Work:

We want to test this patchset further and provide better documentation of the
test suite, so that it can be fully automated and run by other people as well.
Additionally, we will finish the currently partial and untested implementation
of HCBS with different runtimes per CPU, instead of having the same runtime
allocated on all CPUs, to include it in a future RFC.
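
To make the difference between the two allocation schemes concrete, here is a
small sketch that is not taken from the patchset. The function names and the
runtime/period values are purely illustrative assumptions; it only computes the
runtime/period bandwidth each CPU's server would receive in each case.

```python
from fractions import Fraction


def uniform_bandwidth(runtime_us, period_us, nr_cpus):
    """Same runtime on every CPU: each server gets runtime/period."""
    return [Fraction(runtime_us, period_us)] * nr_cpus


def per_cpu_bandwidth(runtimes_us, period_us):
    """Hypothetical future variant: one runtime value per CPU."""
    return [Fraction(runtime, period_us) for runtime in runtimes_us]


# Illustrative numbers only: 4 CPUs, 100 ms period.
uniform = uniform_bandwidth(20_000, 100_000, nr_cpus=4)
varied = per_cpu_bandwidth([40_000, 20_000, 20_000, 0], 100_000)

# Both reserve the same total bandwidth, distributed differently across CPUs.
print("uniform:", [str(bw) for bw in uniform], "total", sum(uniform))
print("varied: ", [str(bw) for bw in varied], "total", sum(varied))
```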

Future patches:
 - HCBS with different runtimes per CPU.
 - Capacity-aware bandwidth reservation.
 - Enable/disable dl_servers when a CPU goes online/offline.

Have a nice day,
Yuri

v1: https://lore.kernel.org/all/20250605071412.139240-1-yurand2000@gmail.com/

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Yuri Andriaccio (6):
  sched/deadline: Remove fair-servers from real-time task's bandwidth
    accounting
  sched/rt: Disable RT_GROUP_SCHED
  sched/deadline: Account rt-cgroups bandwidth in
    sched_dl_global_validate
  sched/rt: Remove support for cgroups-v1
  sched/rt: Zero rt-cgroups default bandwidth
  sched/core: Execute enqueued balance callbacks when migrating task
    betweeen cgroups

luca abeni (19):
  sched/deadline: Do not access dl_se->rq directly
  sched/deadline: Distinct between dl_rq and my_q
  sched/rt: Pass an rt_rq instead of an rq where needed
  sched/rt: Move some functions from rt.c to sched.h
  sched/rt: Introduce HCBS specific structs in task_group
  sched/deadline: Account rt-cgroups bandwidth in deadline tasks
    schedulability tests.
  sched/core: Initialize root_task_group
  sched/deadline: Add dl_init_tg
  sched/rt: Add {alloc/free}_rt_sched_group and dl_server specific
    functions
  sched/rt: Add HCBS related checks and operations for rt tasks
  sched/rt: Update rt-cgroup schedulability checks
  sched/rt: Remove old RT_GROUP_SCHED data structures
  sched/core: Cgroup v2 support
  sched/deadline: Allow deeper hierarchies of RT cgroups
  sched/rt: Add rt-cgroup migration
  sched/rt: add HCBS migration related checks and function calls
  sched/deadline: Make rt-cgroup's servers pull tasks on timer
    replenishment
  sched/deadline: Fix HCBS migrations on server stop
  sched/core: Execute enqueued balance callbacks when changing allowed
    CPUs

 include/linux/sched.h    |   10 +-
 kernel/sched/autogroup.c |    4 +-
 kernel/sched/core.c      |   68 +-
 kernel/sched/deadline.c  |  311 ++--
 kernel/sched/debug.c     |    6 -
 kernel/sched/fair.c      |    6 +-
 kernel/sched/rt.c        | 3024 ++++++++++++++++++--------------------
 kernel/sched/sched.h     |  140 +-
 kernel/sched/syscalls.c  |    6 +-
 kernel/sched/topology.c  |    8 -
 10 files changed, 1829 insertions(+), 1754 deletions(-)

-- 
2.50.1

