Message-ID: <ZMrERWeIeEOGzXHO@slm.duckdns.org>
Date:   Wed, 2 Aug 2023 11:01:57 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Waiman Long <longman@...hat.com>
Cc:     Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Christian Brauner <brauner@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <shuah@...nel.org>, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org, Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Michal Koutný <mkoutny@...e.com>,
        Giuseppe Scrivano <gscrivan@...hat.com>
Subject: Re: [PATCH v5 4/5] cgroup/cpuset: Documentation update for partition

Hello, Waiman.

On Thu, Jul 13, 2023 at 01:26:00PM -0400, Waiman Long wrote:
...
> +	When a valid partition is created, the value of this file will
> +	be automatically set to the largest subset of "cpuset.cpus"
> +	that can be granted for exclusive access from its parent if
> +	its value isn't explicitly set before.
> +
> +	Users can also manually set it to a value that is different from
> +	"cpuset.cpus".	In this case, its value becomes invariant and
> +	may no longer reflect the effective value that is being used
> +	to create a valid partition if some dependent cpuset control
> +	files are modified.
> +
> +	There are constraints on what values are acceptable to this
> +	control file.  If a null string is provided, it will invalidate a
> +	valid partition root and reset its invariant state.  Otherwise,
> +	its value must be a subset of the cgroup's "cpuset.cpus" value
> +	and the parent cgroup's "cpuset.cpus.exclusive" value.

As I wrote before, the hidden state really bothers me. This is fine when
there is one person configuring the system, but working with automated
management and monitoring tools can be really confusing at scale when there
are hidden states like this as there's no way to determine the current state
by looking at what's visible at the interface level.

Can't we do something like the following?

* cpuset.cpus.exclusive can be set to any possible cpus. While I'm not
  completely against failing certain writes (e.g. siblings having
  overlapping masks is never correct or useful), expanding that to
  hierarchical checking quickly gets into trouble around what happens when
  an ancestor retracts a CPU.

  I don't think it makes sense to reject writes if the applied rules can't
  be invariants for the same reason given for avoiding hidden states - the
  system can be managed by multiple agents at different delegation levels.
  One layer changing resource configuration shouldn't affect the success or
  failure of configuration operations in other layers.

* cpuset.cpus.exclusive.effective shows what's currently available for
  exclusive usage - i.e. what would be used for a partition if the cgroup
  were to become a partition at that point.

  This, I think, gets rid of the need for the hidden states. If .exclusive
  of a child of a partition is empty, its .exclusive.effective can show all
  the CPUs allowed in it. If .exclusive is set, then .exclusive.effective
  shows the available subset.
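
The proposed semantics can be sketched roughly as follows. This is an
illustrative Python model, not kernel code: the function name and the
"available" argument (the CPUs the cgroup could currently be granted for
exclusive use) are assumptions made for the example, and sibling overlap
handling is left out.

```python
def exclusive_effective(available, exclusive):
    """Model of the proposed cpuset.cpus.exclusive.effective value.

    available: CPUs currently grantable to this cgroup for exclusive use
               (hypothetical input; the kernel would derive it from the
               ancestors).
    exclusive: the configured cpuset.cpus.exclusive value, possibly empty.
    """
    available = frozenset(available)
    exclusive = frozenset(exclusive)
    if not exclusive:
        # Nothing configured: show everything the cgroup is allowed.
        return available
    # Configured: show the subset that is actually available right now.
    return exclusive & available


# The configured value is never rewritten or rejected; only the
# effective value changes when an ancestor retracts CPUs.
configured = {2, 3, 4}
before = exclusive_effective({0, 1, 2, 3, 4, 5}, configured)
after = exclusive_effective({0, 1, 2, 3}, configured)  # ancestor dropped 4, 5
unset = exclusive_effective({0, 1, 2, 3}, set())
```

In this model both the configured and the effective value stay visible at
the interface level, so an agent at any delegation level can read the
current state without knowing the history of earlier writes.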

What do you think?

Thanks.

-- 
tejun
