Message-ID: <ad4mel7m2tfybp54vqfl5c6sownjr5kq3xa5ytucfkqecfakga@aw65fx3rziyj>
Date: Fri, 28 Feb 2025 14:46:08 +0100
From: Michal Koutný <mkoutny@...e.com>
To: Aleksandr Mikhalitsyn <aleksandr.mikhalitsyn@...onical.com>
Cc: brauner@...nel.org, stgraber@...raber.org, tycho@...ho.pizza, 
	cyphar@...har.com, yun.zhou@...driver.com, joel.granados@...nel.org, 
	rostedt@...dmis.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/2] pid_namespace: namespacify sysctl kernel.pid_max

Hello Aleksandr.

On Tue, Feb 25, 2025 at 07:01:21PM +0100, Aleksandr Mikhalitsyn <aleksandr.mikhalitsyn@...onical.com> wrote:
> We see some kernel global limit or setting and consider if it's safe
> to be namespaced in some way
> and if it is safe and if it makes sense then we do it.

I know there are ucounts for various per-userns limits (NB RLIMIT_NPROC
among them).
Do you have any other precedents in mind?

In my thinking (biased towards raw resources, not ucounts) the model is
one global limit + cgroup limits for non-root groups, hence my surprise
at the per-namespace granularity of pid_max.

> Second reason for having this is that we have a real use case scenario
> with 32-bit Android Bionic libc
> where we need to set a limit for PID *value*. And here, unfortunately,
> pids controller does not help either.

(I think if there were no pids controller, namespaced pid_max would be
a very good approach to implementing this. But it sounds a little bit
redundant now that the pids controller exists.)

pid namespaces are definitely a good place to tackle this since they do
pid number virtualization after all. The challenge is how to limit both
the number (amount) of tasks and the numbers (pid values) they get.
Note that besides the pids controller, pid_max and RLIMIT_NPROC, there's
also the threads-max limit. Namespacing pid_max makes the configuration
space even more complex :-/ In contrast with pids.max, there's no external
visibility of the namespace's pid_max (you must nsenter it) and pid_max
failures are more difficult to troubleshoot (a mere failed fork(2)).
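
To illustrate the troubleshooting point, here is a minimal Python sketch
(not part of the patch series) that collects the limits named above as
seen by the calling process. The helper names are hypothetical, and it
assumes cgroup v2 mounted at /sys/fs/cgroup. It only looks at the
process's own cgroup, not its ancestors, and it sees the pid_max of
whatever pid namespace's procfs it reads, which is exactly the
visibility problem mentioned above.

```python
import os
import resource

UNLIMITED = float("inf")


def parse_limit(text):
    """Parse a limit value; cgroup v2 files use the literal 'max' for no limit."""
    text = text.strip()
    return UNLIMITED if text == "max" else int(text)


def read_limit(path):
    """Return the limit stored at path, or None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return parse_limit(f.read())
    except (OSError, ValueError):
        return None


def own_cgroup_dir(root="/sys/fs/cgroup"):
    """Directory of the caller's cgroup v2 group (assumes the usual mount point)."""
    try:
        with open("/proc/self/cgroup") as f:
            for line in f:
                parts = line.rstrip("\n").split(":", 2)
                # The cgroup v2 entry has the form "0::/some/path".
                if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
                    return os.path.join(root, parts[2].lstrip("/"))
    except OSError:
        pass
    return None


def collect_limits():
    """Gather the task-related limits that can make fork(2) fail."""
    limits = {
        "kernel.pid_max": read_limit("/proc/sys/kernel/pid_max"),
        "kernel.threads-max": read_limit("/proc/sys/kernel/threads-max"),
    }
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    limits["RLIMIT_NPROC (soft)"] = (
        UNLIMITED if soft == resource.RLIM_INFINITY else soft
    )
    cgroup_dir = own_cgroup_dir()
    if cgroup_dir is not None:
        # pids.max does not exist in the root cgroup; read_limit() returns None then.
        limits["pids.max"] = read_limit(os.path.join(cgroup_dir, "pids.max"))
    return limits


def tightest(limits):
    """Return (name, value) of the numerically smallest known limit, or None.

    This is only a rough hint: pid_max bounds pid *values*, while the other
    entries bound task *counts*.
    """
    known = {name: value for name, value in limits.items() if value is not None}
    return min(known.items(), key=lambda kv: kv[1]) if known else None


if __name__ == "__main__":
    for name, value in collect_limits().items():
        print(f"{name}: {value}")
```

Running this both outside and inside the namespace (via nsenter) would be
one way to compare the two views; a per-namespace pid_max would only show
up in the latter.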

Admittedly, I'm slightly hesitant to pursue the pids controller based
approach due to ns_last_pid. (Also, how does starting those 32b apps
work?  Do they themselves adjust the limits inside the pidns or is this
done by some launcher (which may need privileges to set pids.max)?)


One more idea I have would be to rebase my original pid_max default
value elimination [1] on top of the namespaced pid_max and not to copy
from the parent but start unlimited in the ns too. (Or keep the global
default value and unlimit only descendants, so that the semantics are
similar to ucounts.)


> I hope I explained above why I believe that this does not duplicate an
> existing mechanism.

The 32b scenario is certainly a sensible thing to resolve. But I'm still
worried people would start adjusting both of those and (presumably
different) people would run into unexpected fork failures.

Thanks,
Michal

[1] https://lore.kernel.org/all/20240408145819.8787-1-mkoutny@suse.com/
