linux-kernel - Re: [PATCH 0/3] cgroup/misc: Add hwcap masks to the misc controller


Message-ID: <CANaxB-yOfS1KPZaZJ_4WG8XeZnB9M_shtWOOONTXQ2CW4mqsSA@mail.gmail.com>
Date: Fri, 5 Dec 2025 12:19:04 -0800
From: Andrei Vagin <avagin@...il.com>
To: Chen Ridong <chenridong@...weicloud.com>
Cc: Andrei Vagin <avagin@...gle.com>, Kees Cook <kees@...nel.org>, linux-kernel@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, cgroups@...r.kernel.org, 
	criu@...ts.linux.dev, Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, 
	Michal Koutný <mkoutny@...e.com>, 
	Vipin Sharma <vipinsh@...gle.com>, Jonathan Corbet <corbet@....net>
Subject: Re: [PATCH 0/3] cgroup/misc: Add hwcap masks to the misc controller

On Fri, Dec 5, 2025 at 2:04 AM Chen Ridong <chenridong@...weicloud.com> wrote:
>
>
>
> On 2025/12/5 14:39, Andrei Vagin wrote:
> > On Thu, Dec 4, 2025 at 6:52 PM Chen Ridong <chenridong@...weicloud.com> wrote:
> >>
> >>
> >>
> >> On 2025/12/5 8:58, Andrei Vagin wrote:
> >>> This patch series introduces a mechanism to mask hardware capabilities
> >>> (AT_HWCAP) reported to user-space processes via the misc cgroup
> >>> controller.
> >>>
> >>> To support C/R operations (snapshots, live migration) in heterogeneous
> >>> clusters, we must ensure that processes utilize CPU features available
> >>> on all potential target nodes. To solve this, we need to advertise a
> >>> common feature set across the cluster. This patchset allows users to
> >>> configure a mask for AT_HWCAP, AT_HWCAP2. This ensures that applications
> >>> within a container only detect and use features guaranteed to be
> >>> available on all potential target hosts.
> >>>
> >>
> >> Could you elaborate on how this mask mechanism would be used in practice?
> >>
> >> Based on my understanding of the implementation, the parent’s mask is effectively a subset of the
> >> child’s mask, meaning the parent does not impose any additional restrictions on its children. This
> >> behavior appears to differ from typical cgroup controllers, where children are further constrained
> >> by their parent’s settings. This raises the question: is the cgroup model an appropriate fit for
> >> this functionality?
> >
> > Chen,
> >
> > Thank you for the question. I think I was not clear enough in the
> > description.
> >
> > The misc.mask file works by masking out available features; any feature
> > bit set in the mask will not be advertised to processes within that
> > cgroup. When a child cgroup is created, its effective mask is a
> > combination of its own mask and its parent's effective mask. This means
> > any feature masked by either the parent or the child will be hidden from
> > processes in the child cgroup.
> >
> > For example:
> > - If a parent cgroup masks out feature A (mask=0b001), processes in it
> >   won't see feature A.
> > - If we create a child cgroup under it and set its mask to hide feature
> >   B (mask=0b010), the effective mask for processes in the child cgroup
> >   becomes 0b011. They will see neither feature A nor B.
> >
> Let me ask some basic questions:
>
> When is the misc.mask typically set? Is it only configured before starting a container (e.g., before
> docker run), or can it be adjusted dynamically while processes are already running?

For C/R use cases, it should be configured when the container is
started. It can be adjusted dynamically, but changes affect only new
processes, because the auxiliary vector is set up at execve.
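
As a rough illustration of why a running process does not notice a later
mask change: user space normally reads these values from its own
auxiliary vector, for example with glibc's getauxval(). The sketch below
is not part of the patches; the struct and helper names are made up for
this example.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP, AT_HWCAP2 */

/* Hypothetical container for the values the kernel placed in the
 * auxiliary vector when this program was exec'ed. */
struct hwcaps {
	unsigned long hwcap;
	unsigned long hwcap2;
};

/* Read AT_HWCAP and AT_HWCAP2 from the process's own auxiliary vector.
 * getauxval() returns 0 if an entry is not present. */
static struct hwcaps read_hwcaps(void)
{
	struct hwcaps caps = {
		.hwcap  = getauxval(AT_HWCAP),
		.hwcap2 = getauxval(AT_HWCAP2),
	};
	return caps;
}

/* True if every bit in feature_bits is set in the advertised value.
 * The meaning of each bit is architecture-specific. */
static bool hwcap_advertised(unsigned long hwcap, unsigned long feature_bits)
{
	return (hwcap & feature_bits) == feature_bits;
}
```

Calling read_hwcaps() repeatedly in the same process returns the same
values, since the vector is only built at execve.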

>
> I'm concerned about a potential scenario: If a child process initially has access to a CPU feature,
> but then its parent cgroup masks that feature out, could the child process remain unaware of this
> change?
>
> Specifically, if a process has already cached or relied on a CPU capability before the mask was
> applied, would it continue to assume it has that capability, leading to potential issues if it
> attempts to use instructions that are now masked out?

I wouldn't classify this behavior as an issue; it's designed to function
this way. It's important to understand that this isn't enforcement, but
rather information for processes regarding which features are
"guaranteed" to them. A process can choose to utilize unexposed
features at its own risk, potentially encountering problems after
migration to a different host.
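
To tie this back to the parent/child example quoted above, here is a
minimal sketch of the arithmetic. It is only an illustration of the
semantics described in this thread, not the code from the series, and
the function names are invented for the example.

```c
/* Effective mask of a cgroup: a bit is masked if either the cgroup
 * itself or any ancestor masks it (union of the two masks). */
static unsigned long effective_mask(unsigned long parent_effective,
				    unsigned long own_mask)
{
	return parent_effective | own_mask;
}

/* Value advertised to a newly exec'ed process: the hardware bits with
 * every masked bit cleared. This only changes what is reported; it
 * does not prevent the process from executing the instructions. */
static unsigned long visible_hwcap(unsigned long hw_hwcap,
				   unsigned long effective)
{
	return hw_hwcap & ~effective;
}
```

With a parent mask of 0b001 and a child mask of 0b010, the child's
effective mask is 0b011, so hardware reporting 0b111 would be shown to
processes in the child cgroup as 0b100.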

Thanks,
Andrei