Message-ID: <20211011141737.GA58758@blackbody.suse.cz>
Date:   Mon, 11 Oct 2021 16:17:37 +0200
From:   Michal Koutný <mkoutny@...e.com>
To:     Christian Brauner <christian.brauner@...ntu.com>
Cc:     "Pratik R. Sampat" <psampat@...ux.ibm.com>, bristot@...hat.com,
        christian@...uner.io, ebiederm@...ssion.com,
        lizefan.x@...edance.com, tj@...nel.org, hannes@...xchg.org,
        mingo@...nel.org, juri.lelli@...hat.com,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        cgroups@...r.kernel.org, containers@...ts.linux.dev,
        containers@...ts.linux-foundation.org, pratik.r.sampat@...il.com
Subject: Re: [RFC 0/5] kernel: Introduce CPU Namespace

On Mon, Oct 11, 2021 at 12:11:24PM +0200, Christian Brauner <christian.brauner@...ntu.com> wrote:
> Fundamentally I think making this a new namespace is not the correct
> approach.

I tend to agree. 

Also, generally, this is not only a problem of cpuset but of some other
controllers as well (the original letter mentions CPU bandwidth limits; memory
limits are another example, and I wonder whether some apps already adjust their
behavior to available IO characteristics).

The problem as I see it is the mapping from real dedicated HW to a
cgroup-restricted environment ("container"), which can be shared. In
this instance, the virtualized view would not be able to represent a
situation in which a CPU is assigned non-exclusively to multiple cpusets.

(Although, one speciality of the CPU namespace approach here is the
remapping/scrambling of the CPU topology. Not sure if good or bad.)
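
To make the shared-CPU case above concrete, here is a small sketch. The
cpuset names and CPU numbers are made up for illustration; nothing here
comes from the patch set. It only shows that when cpusets overlap, the
per-container CPU counts add up to more than the CPUs actually backing
them, which a "dedicated HW" view cannot express.

```python
# Hypothetical cpuset assignments; names and CPU numbers are illustrative only.
cpusets = {
    "container_a": set(range(0, 4)),  # CPUs 0-3
    "container_b": set(range(2, 6)),  # CPUs 2-5
}


def shared_cpus(cpusets):
    """Return the CPUs that appear in more than one cpuset."""
    seen, shared = set(), set()
    for cpus in cpusets.values():
        shared |= seen & cpus  # CPUs already claimed by an earlier cpuset
        seen |= cpus
    return shared


# Sum of what each container would see if its view were taken as dedicated.
apparent = sum(len(cpus) for cpus in cpusets.values())
# Number of distinct CPUs actually behind those views.
physical = len(set().union(*cpusets.values()))

print("shared:", sorted(shared_cpus(cpusets)))
print("apparent:", apparent, "physical:", physical)
```

Each container sees four CPUs, yet two of them are the same physical CPUs.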

> I think that either we need to come up with new non-syscall based
> interfaces that allow to query virtualized cpu information and buy into
> the process of teaching userspace about them. This is even independent
> of containers.

For the reason above, I also agree with this. And I think this interface
(mostly) exists -- userspace could query the cgroup files
(cpuset.cpus.effective in this case). It would even have the liberty to
decide between querying the resources available in its "container" (the
root cgroup of its cgroup NS) or in a further subdivision of that (the
immediately encompassing cgroup).
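
A minimal sketch of such a query from userspace, under some assumptions:
the cgroup v2 unified hierarchy is mounted at /sys/fs/cgroup, the cpuset
controller is enabled for the cgroups in question, and the process runs
inside its own cgroup namespace so that the mount root is the "container"
root. The helper names are hypothetical.

```python
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy mounted at the usual location.
CGROUP_MOUNT = Path("/sys/fs/cgroup")


def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,8,10-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def own_cgroup_path(proc_cgroup_text):
    """Return the cgroup v2 path from /proc/self/cgroup contents, or None."""
    for line in proc_cgroup_text.splitlines():
        hierarchy, controllers, path = line.split(":", 2)
        if hierarchy == "0" and controllers == "":
            return path  # the v2 entry has the form "0::/path"
    return None


def effective_cpus(cgroup_dir):
    """Read cpuset.cpus.effective in cgroup_dir; None if it cannot be read."""
    try:
        text = (Path(cgroup_dir) / "cpuset.cpus.effective").read_text()
    except OSError:
        return None
    return parse_cpu_list(text)


if __name__ == "__main__":
    # View 1: the root of the visible cgroup hierarchy (the "container").
    print("container root:", effective_cpus(CGROUP_MOUNT))
    # View 2: the immediately encompassing cgroup of this process.
    try:
        rel = own_cgroup_path(Path("/proc/self/cgroup").read_text())
    except OSError:
        rel = None
    if rel is not None:
        print("own cgroup:", effective_cpus(CGROUP_MOUNT / rel.lstrip("/")))
```

Which of the two views an application should size itself against is the
policy choice mentioned above; the sketch only shows that both are
readable without a new interface.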


On Sat, Oct 09, 2021 at 08:42:38PM +0530, "Pratik R. Sampat" <psampat@...ux.ibm.com> wrote:
> Existing solutions to the problem include userspace tools like LXCFS
> which can fake the sysfs information by mounting onto the sysfs online
> file to be in coherence with the limits set through cgroup cpuset.
> However, LXCFS is an external solution and needs to be explicitly setup
> for applications that require it. Another concern is also that tools
> like LXCFS don't handle all the other display mechanism like procfs load
> stats.
>
> Therefore, the need of a clean interface could be advocated for.

I'd like to write something in support of your approach but I'm afraid that
the problem of the mapping (dedicated vs shared) makes this most suitable for
some external/separate entity such as the already existing LXCFS.

My .02€,
Michal
