Message-ID: <87r0lyad40.fsf@email.froward.int.ebiederm.org>
Date:   Fri, 13 Oct 2023 08:03:27 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     yunhui cui <cuiyunhui@...edance.com>
Cc:     akpm@...ux-foundation.org, keescook@...omium.org,
        brauner@...nel.org, jeffxu@...gle.com, frederic@...nel.org,
        mcgrof@...nel.org, cyphar@...har.com, rongtao@...tc.cn,
        linux-kernel@...r.kernel.org,
        Linux Containers <containers@...ts.linux.dev>
Subject: Re: [External] Re: [PATCH] pid_ns: support pidns switching between
 sibling

yunhui cui <cuiyunhui@...edance.com> writes:

> Hi Eric,
>
> On Thu, Oct 12, 2023 at 11:31 AM Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>>
>> The check you are deleting is what verifies that the pid namespace you
>> are attempting to change pid_ns_for_children to is a member of the
>> task's current pid namespace (aka task_active_pid_ns).
>>
>>
>> There is a perfectly good comment describing why what you are attempting
>> to do is unsupportable.
>>
>>         /*
>>          * Only allow entering the current active pid namespace
>>          * or a child of the current active pid namespace.
>>          *
>>          * This is required for fork to return a usable pid value and
>>          * this maintains the property that processes and their
>>          * children can not escape their current pid namespace.
>>          */
>>
>>
>> If you pick a pid namespace that does not meet the restrictions you are
>> removing, the pid of the new child cannot be mapped into the pid
>> namespace of the parent that called setns.
>>
>> AKA the following code can not work.
>>
>> pid = fork();
>> if (!pid) {
>>         /* child */
>>         do_something();
>>         _exit(0);
>> }
>> waitpid(pid);
>
> Sorry, I don't understand what you mean here.

What I mean is that if your simple patch were adopted,
the classic way of controlling a fork would fail.

	pid = fork();
	^--------------- Would return 0 in both parent and child.
	^--------------- Look at pid_nr_ns to understand why.
	if (!pid) {
		/* child */
		do_something();
		_exit(0);
	}
	waitpid(pid);
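
To make the pid_nr_ns point concrete, here is a simplified userspace
model of the lookup, not the kernel source itself.  The struct layouts
are trimmed down, and MAX_LEVEL and model_alloc_pid are hypothetical
names made up for this sketch.  It assumes that the value fork returns
to the parent is the child's pid as seen from the parent's active pid
namespace, and that a pid only carries numbers for the namespace it was
allocated in and that namespace's ancestors.

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_LEVEL 4	/* assumption: nesting limit for this model only */

struct pid_namespace {
	unsigned int level;		/* depth; 0 is the initial namespace */
	struct pid_namespace *parent;	/* NULL for the initial namespace */
};

struct upid {
	pid_t nr;			/* pid value seen in namespace ns */
	struct pid_namespace *ns;
};

struct pid {
	unsigned int level;		/* level of the allocating namespace */
	struct upid numbers[MAX_LEVEL + 1];
};

/*
 * Hypothetical helper: record a pid for ns and every ancestor of ns.
 * The same number is reused at each level to keep the model short.
 */
static void model_alloc_pid(struct pid *pid, struct pid_namespace *ns,
			    pid_t nr)
{
	pid->level = ns->level;
	for (struct pid_namespace *p = ns; p; p = p->parent) {
		pid->numbers[p->level].nr = nr;
		pid->numbers[p->level].ns = p;
	}
}

/*
 * Model of the lookup: return the pid value as seen from ns, or 0
 * when ns is neither the allocating namespace nor one of its ancestors.
 */
static pid_t pid_nr_ns(const struct pid *pid, const struct pid_namespace *ns)
{
	pid_t nr = 0;

	if (pid && ns->level <= pid->level) {
		const struct upid *upid = &pid->numbers[ns->level];

		if (upid->ns == ns)
			nr = upid->nr;
	}
	return nr;
}
```

With an initial namespace and two sibling namespaces a and b below it,
a pid allocated in b resolves to its number when looked up from b or
from the initial namespace, but to 0 when looked up from a.  A parent
whose active pid namespace is a would therefore see 0 from fork.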

For your use case there are more serious problems as well.  The entire
process hierarchy that gets built would be incorrect, which means that
children signaling their parents when they exit would not work
correctly, and parents would not be able to wait on their children.
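
For reference, a minimal runnable version of the parent/child pattern
that depends on this working is sketched below.  The helper name
run_child_and_wait is hypothetical.  The parent can only reap the child
because fork hands it a nonzero pid that is valid in its own pid
namespace.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits with `code`, wait for
 * it in the parent, and return the exit status the parent observed.
 * Returns -1 if fork or waitpid fails or the child did not exit
 * normally.
 */
static int run_child_and_wait(int code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* child: fork returned 0 */
		_exit(code);
	}

	/* parent: pid must name the child in the parent's pid namespace */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If fork returned 0 in the parent as well, both processes would take the
child branch and nothing would be left to call waitpid.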

I do understand the desire to COW (copy-on-write) the memory space of
all of the processes.  That can potentially save a lot of resources.

In other checkpoint/restart scenarios people have been using userfaultfd
to get a similar benefit.

I suggest you look at the CRIU project.

Eric

