Message-ID: <20211011141737.GA58758@blackbody.suse.cz>
Date: Mon, 11 Oct 2021 16:17:37 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: "Pratik R. Sampat" <psampat@...ux.ibm.com>, bristot@...hat.com,
christian@...uner.io, ebiederm@...ssion.com,
lizefan.x@...edance.com, tj@...nel.org, hannes@...xchg.org,
mingo@...nel.org, juri.lelli@...hat.com,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
cgroups@...r.kernel.org, containers@...ts.linux.dev,
containers@...ts.linux-foundation.org, pratik.r.sampat@...il.com
Subject: Re: [RFC 0/5] kernel: Introduce CPU Namespace
On Mon, Oct 11, 2021 at 12:11:24PM +0200, Christian Brauner <christian.brauner@...ntu.com> wrote:
> Fundamentally I think making this a new namespace is not the correct
> approach.
I tend to agree.
Also, generally, this is not only a problem of cpuset but of some other
controllers as well (the original letter mentions CPU bandwidth limits;
another example is memory limits, and I wonder whether some apps already
adjust their behavior to available IO characteristics).
The problem as I see it is the mapping from real, dedicated HW to a
cgroup-restricted environment ("container"), which can be shared. In
this instance, the virtualized view would not be able to represent a
situation where a CPU is assigned non-exclusively to multiple cpusets.
(Although, one speciality of the CPU namespace approach here is the
remapping/scrambling of the CPU topology. Not sure if good or bad.)
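To make the shared case concrete, here is a small Python sketch with a
made-up layout (the cpuset names "a" and "b", the CPU numbers and the
helper shared_cpus are all hypothetical). Each container could be shown
its own renumbered CPUs, but both views of CPU 2 would be backed by the
same hardware, and nothing in a per-container CPU list says so.

```python
from collections import defaultdict


def shared_cpus(cpusets):
    """Map each CPU used by more than one cpuset to the cpusets using it."""
    users = defaultdict(list)
    for name, cpus in cpusets.items():
        for cpu in cpus:
            users[cpu].append(name)
    return {cpu: names for cpu, names in users.items() if len(names) > 1}


# Hypothetical layout: containers "a" and "b" may both run on CPU 2.
example = {"a": {0, 1, 2}, "b": {2, 3}}
print(shared_cpus(example))  # {2: ['a', 'b']}
```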
> I think that either we need to come up with new non-syscall based
> interfaces that allow to query virtualized cpu information and buy into
> the process of teaching userspace about them. This is even independent
> of containers.
For the reason above, I also agree with this. And I think this interface
(mostly) exists -- userspace could query the cgroup files
(cpuset.cpus.effective in this case), and it would even have the liberty
to decide between querying the resources available in its "container"
(the root cgroup of the cgroup NS) or in a further subdivision of that
(the immediately enclosing cgroup).
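As a rough illustration of such a query, a minimal Python sketch follows.
It assumes cgroup v2 mounted at /sys/fs/cgroup and the cpuset controller
enabled for the cgroups in question; the helper names (parse_cpulist,
own_cgroup_dir, effective_cpus) are made up for this example.

```python
from pathlib import Path

CGROUP_MOUNT = Path("/sys/fs/cgroup")  # assumption: cgroup v2 unified hierarchy


def parse_cpulist(text):
    """Parse the kernel's CPU list format, e.g. "0-3,8,10-11", into a set."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray separator
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def own_cgroup_dir():
    """Directory of the immediately enclosing cgroup (v2 entry "0::/path")."""
    for line in Path("/proc/self/cgroup").read_text().splitlines():
        hier_id, _, path = line.split(":", 2)
        if hier_id == "0":
            return CGROUP_MOUNT / path.lstrip("/")
    raise RuntimeError("no cgroup v2 entry found in /proc/self/cgroup")


def effective_cpus(cgroup_dir):
    """CPUs actually usable by tasks in the given cgroup directory."""
    return parse_cpulist((cgroup_dir / "cpuset.cpus.effective").read_text())
```

Inside a cgroup namespace, effective_cpus(CGROUP_MOUNT) would give the
container-wide view and effective_cpus(own_cgroup_dir()) the view of the
immediately enclosing cgroup; the application picks whichever fits.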
On Sat, Oct 09, 2021 at 08:42:38PM +0530, "Pratik R. Sampat" <psampat@...ux.ibm.com> wrote:
> Existing solutions to the problem include userspace tools like LXCFS
> which can fake the sysfs information by mounting onto the sysfs online
> file to be in coherence with the limits set through cgroup cpuset.
> However, LXCFS is an external solution and needs to be explicitly setup
> for applications that require it. Another concern is also that tools
> like LXCFS don't handle all the other display mechanism like procfs load
> stats.
>
> Therefore, the need of a clean interface could be advocated for.
I'd like to write something in support of your approach, but I'm afraid
that the problem of the mapping (dedicated vs shared) makes this most
suitable for some external/separate entity, such as the already existing
LXCFS.
My .02€,
Michal