Message-ID: <4AC66B5E.9060200@librato.com>
Date: Fri, 02 Oct 2009 17:06:38 -0400
From: Oren Laadan <orenl@...rato.com>
To: Alexey Dobriyan <adobriyan@...il.com>
CC: "Serge E. Hallyn" <serue@...ibm.com>, arnd@...db.de,
Containers <containers@...ts.linux-foundation.org>,
Nathan Lynch <nathanl@...tin.ibm.com>,
linux-kernel@...r.kernel.org,
"Eric W. Biederman" <ebiederm@...ssion.com>, hpa@...or.com,
mingo@...e.hu, Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
torvalds@...ux-foundation.org, Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [RFC][v7][PATCH 0/9] Implement clone2() system call
Alexey Dobriyan wrote:
> On Wed, Sep 30, 2009 at 01:41:45PM -0400, Oren Laadan wrote:
>> Alexey Dobriyan wrote:
>>> On Thu, Sep 24, 2009 at 01:35:56PM -0500, Serge E. Hallyn wrote:
>>>> Quoting Alexey Dobriyan (adobriyan@...il.com):
>>>>> I don't like this even more.
>>>>>
>>>>> Pid namespaces are hierarchical _and_ anonymous, so a simple
>>>>> set of numbers doesn't describe the final object.
>>>>>
>>>>> struct pid isn't special, it's just another invariant if you like
>>>>> as far as C/R is concerned, but system call is made special wrt pids.
>>>>>
>>>>> What will be in an image? I hope "struct kstate_image_pid" with several
>>>> Sure pid namespaces are anonymous, but we will give each an 'objref'
>>>> valid only for a checkpoint image, and store the relationship between
>>>> pid namespaces based on those objrefs. Basically the same way that user
>>>> structs and hierarchical user namespaces are handled right now.
>>> OK, that's certainly doable.
>>>
>>> You're committing yourself to creation of tasks in userspace if this goes in. :-\
>>> Which can lead you into putting the wrong kind of relations into the image.
>> A malicious user can put the "wrong" kind of relations into the image,
>> regardless of whether the tasks are created in the kernel or in
>> userspace. As long as the creation follows the "instructions" in
>> the image, the result would be the same.
>
> Wrong as in "fundamentally wrong", not malicious.
> In case of uts_ns, the correct data to put into image is "task uses this uts_ns",
> not "at this point do clone(CLONE_NEWUTS)".
So we are in total agreement: that's how it is done now.
Only task creation per se, including pid-ns (future work), is done
in userspace. Network namespaces will probably be created in userspace
but attached to tasks in the kernel. Remaining namespaces are covered
in the kernel the way you described.
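
To make the "task uses this uts_ns" form concrete, here is a rough
userspace sketch of how a restore step could resolve such records. It is
only an illustration: the record layouts, the objref table and every name
in it (img_uts_ns, img_task, objhash, restore_task_uts) are hypothetical
and are not the actual checkpoint/restart image format or kernel code.
The image states which namespace each task uses; the namespace is
instantiated once, the first time its objref is seen, and shared after
that. Nothing in the image says "call clone(CLONE_NEWUTS) here".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJS 16

/* Hypothetical image records; NOT the real checkpoint image layout. */
struct img_uts_ns {
	int objref;		/* identifier valid only inside this image */
	char nodename[65];
};

struct img_task {
	int pid;
	int uts_objref;		/* declarative: "task uses this uts_ns" */
};

/* Restore-side object, standing in for a kernel uts namespace. */
struct uts_ns {
	char nodename[65];
	int users;
};

/* Table mapping objref -> restored object; must start zeroed. */
struct objhash {
	struct uts_ns *slot[MAX_OBJS];
	struct uts_ns pool[MAX_OBJS];
	int used;
};

/* Instantiate the namespace on first reference, reuse it afterwards. */
static struct uts_ns *uts_ns_get(struct objhash *h,
				 const struct img_uts_ns *rec)
{
	assert(rec->objref >= 0 && rec->objref < MAX_OBJS);
	if (!h->slot[rec->objref]) {
		struct uts_ns *ns;

		assert(h->used < MAX_OBJS);
		ns = &h->pool[h->used++];
		strncpy(ns->nodename, rec->nodename, sizeof(ns->nodename) - 1);
		ns->nodename[sizeof(ns->nodename) - 1] = '\0';
		ns->users = 0;
		h->slot[rec->objref] = ns;
	}
	return h->slot[rec->objref];
}

/* Attach a task to the namespace its record names; NULL if unknown. */
static struct uts_ns *restore_task_uts(struct objhash *h,
				       const struct img_uts_ns *recs,
				       size_t nrecs,
				       const struct img_task *t)
{
	size_t i;

	for (i = 0; i < nrecs; i++) {
		if (recs[i].objref == t->uts_objref) {
			struct uts_ns *ns = uts_ns_get(h, &recs[i]);

			ns->users++;
			return ns;
		}
	}
	return NULL;	/* image refers to an objref it never defined */
}
```

Two tasks whose records carry the same uts_objref end up sharing one
restored object, and a task naming an undefined objref is rejected, so
the sharing relation comes from the image data rather than from the
order in which tasks happen to be created.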
>
> BTW, now I'm convinced that nsproxy should not even be mentioned in an image,
> it's irrelevant technical detail, not future-proof at all.
It's helpful (as in, more efficient) to keep it for now. We can always
decide to ignore it in the future.
Thanks,
Oren.