Message-Id: <20230531163405.2200292-1-longman@redhat.com>
Date:   Wed, 31 May 2023 12:33:59 -0400
From:   Waiman Long <longman@...hat.com>
To:     Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>, Shuah Khan <shuah@...nel.org>
Cc:     linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kselftest@...r.kernel.org,
        Juri Lelli <juri.lelli@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Mrunal Patel <mpatel@...hat.com>,
        Ryan Phillips <rphillips@...hat.com>,
        Brent Rowsell <browsell@...hat.com>,
        Peter Hunt <pehunt@...hat.com>, Phil Auld <pauld@...hat.com>,
        Waiman Long <longman@...hat.com>
Subject: [PATCH v2 0/6] cgroup/cpuset: Support remote isolated partitions

 v2:
  - [v1] https://lore.kernel.org/lkml/20230412153758.3088111-1-longman@redhat.com/
  - Drop the special "isolcpus" partition that was in v1
  - Add the root-only "cpuset.cpus.reserve" control file for reserving
    CPUs used for remote isolated partitions.
  - Update the test_cpuset_prs.sh test script and documentation
    accordingly.

This patch series introduces a new category of cpuset partition called
remote partitions. The existing partition category where the partition
roots have to be clustered around the root cgroup in a hierarchical way
is now referred to as adjacent partitions.

A remote partition can be formed far from the root cgroup with no
partition root parent. The only commonality is that the CPUs that are
used in the partition as specified in "cpuset.cpus" have to be present
in the "cpuset.cpus" of all its ancestors.
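
The ancestor constraint above can be illustrated with a small userspace
sketch. It is not code from this series: the helper names
parse_cpulist() and remote_partition_cpus_ok() are hypothetical, and the
model treats an empty "cpuset.cpus" string as containing no CPUs, which
is a simplification.

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def remote_partition_cpus_ok(partition_cpus, ancestor_cpus):
    """Return True if every partition CPU appears in each ancestor's list.

    partition_cpus: "cpuset.cpus" string of the would-be partition root.
    ancestor_cpus:  "cpuset.cpus" strings of its ancestors, in any order
                    (supplied by the caller; this sketch reads no files).
    """
    wanted = parse_cpulist(partition_cpus)
    return all(wanted <= parse_cpulist(a) for a in ancestor_cpus)
```

For example, CPUs "2-3" pass against ancestors "0-7" and "2-5", but fail
if one ancestor only lists "3-5".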

It is relatively rare to have applications that require creation of
a separate scheduling domain (root). However, it is more common to
have applications that require the use of isolated CPUs (isolated),
e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command line
options to get that statically. The "isolated" partition is another way
to achieve that dynamically.

Modern container orchestration tools like Kubernetes use the cgroup
hierarchy to manage different containers, and they rely on other
middleware like systemd to help manage it. If a container needs to
use isolated CPUs, it is hard to get those with adjacent partitions,
as that requires the administrative parent cgroup to be a partition
root too, which tools like systemd may not be ready to manage.

With this patch series, a new root-cgroup-only "cpuset.cpus.reserve"
file is added to specify the set of CPUs that can be used in partitions
(whether remote or adjacent). To create a remote partition, the set
of CPUs to be used in that partition (the "cpuset.cpus" file of the
partition root) has to be reserved by manually adding them to that
control file first. Then that partition can be activated by writing
"isolated" into its "cpuset.cpus.partition". CPU reservation of adjacent
partitions is done automatically without touching "cpuset.cpus.reserve"
at all.
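
The two-step sequence described above (reserve first, then activate) can
be sketched as follows. This is only an illustration under stated
assumptions: the function name and its parameters are hypothetical, the
cgroup v2 mount point is passed in by the caller, and the sketch simply
writes the partition's CPU list into "cpuset.cpus.reserve", which may
replace whatever was reserved before. The exact write syntax accepted by
that file is defined by the patches, not by this sketch.

```python
import os


def activate_remote_isolated_partition(cgroup_root, rel_path):
    """Reserve a cgroup's CPUs, then mark it as an isolated partition.

    cgroup_root: path of the cgroup v2 mount (e.g. "/sys/fs/cgroup").
    rel_path:    path of the target cgroup relative to cgroup_root.
    Returns the CPU list string that was reserved.
    """
    part_dir = os.path.join(cgroup_root, rel_path)

    # The CPUs of the future partition come from its own "cpuset.cpus".
    with open(os.path.join(part_dir, "cpuset.cpus")) as f:
        cpus = f.read().strip()

    # Step 1: reserve those CPUs in the root-only control file.
    # Assumption: a plain CPU list is written, overwriting the old value.
    with open(os.path.join(cgroup_root, "cpuset.cpus.reserve"), "w") as f:
        f.write(cpus)

    # Step 2: activate the partition by writing "isolated".
    with open(os.path.join(part_dir, "cpuset.cpus.partition"), "w") as f:
        f.write("isolated")

    return cpus
```

Because the mount point is a parameter, the sequence can be tried against
a scratch directory containing ordinary files before it is pointed at a
real cgroup hierarchy on a kernel that carries this series.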

Currently, only remote isolated partitions are supported. We could
support a scheduling partition ("root") in the future if the need arises.
Additional isolation attributes like those with the "isolcpus" or "nohz"
boot command line options may be supported in the isolated partitions
in the future.

Waiman Long (6):
  cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE
    handling
  cgroup/cpuset: Improve temporary cpumasks handling
  cgroup/cpuset: Add cpuset.cpus.reserve for top cpuset
  cgroup/cpuset: Introduce remote isolated partition
  cgroup/cpuset: Documentation update for partition
  cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition

 Documentation/admin-guide/cgroup-v2.rst       |  92 ++-
 kernel/cgroup/cpuset.c                        | 749 +++++++++++++++---
 .../selftests/cgroup/test_cpuset_prs.sh       | 403 ++++++----
 3 files changed, 988 insertions(+), 256 deletions(-)

-- 
2.31.1
