Message-ID: <m1mxywyege.fsf@fess.ebiederm.org>
Date: Thu, 25 Feb 2010 17:26:41 -0800
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Matt Helsley <matthltc@...ibm.com>
Cc: hadi@...erus.ca, Daniel Lezcano <dlezcano@...ibm.com>,
Patrick McHardy <kaber@...sh.net>,
Linux Netdev List <netdev@...r.kernel.org>,
containers@...ts.linux-foundation.org,
Netfilter Development Mailinglist
<netfilter-devel@...r.kernel.org>,
Ben Greear <greearb@...delatech.com>,
Serge Hallyn <serue@...ibm.com>
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Matt Helsley <matthltc@...ibm.com> writes:
> On Thu, Feb 25, 2010 at 12:57:02PM -0800, Eric W. Biederman wrote:
>>
>> Introduce two new system calls:
>> int nsfd(pid_t pid, unsigned long nstype);
>> int setns(unsigned long nstype, int fd);
>>
>> These two new system calls address three specific problems that can
>> make namespaces hard to work with.
>> - Namespaces require a dedicated process to pin them in memory.
>> - It is not possible to use a namespace unless you are the
>> child of the original creator.
>> - Namespaces don't have names that userspace can use to talk
>> about them.
>>
>> The nsfd() system call returns a file descriptor that can
>> be used to talk about a specific namespace, and to keep
>> the specified namespace alive.
>>
>> The fd returned by nsfd() can be bind mounted as:
>> mount --bind /proc/self/fd/N /some/filesystem/path
>> to keep the namespace alive indefinitely as long as
>> it is mounted.
>>
>> open works on the fd returned by nsfd() so another
>> process can get a hold of it and do interesting things.
>>
>> Overall that allows for persistent naming of namespaces
>> according to userspace policy.
>>
>> setns() allows changing the namespace of the current process
>> to a namespace that originates with nsfd().
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@...ssion.com>
>> ---
>>
>> This is just my first pass at this, and not yet compiled tested.
>> I was pleasantly surprised at how easy all of this was to implement.
>
> <snip>
>
>> +SYSCALL_DEFINE2(setns, unsigned long, nstype, int, fd)
>> +{
>> +	struct file *file;
>> +
>> +	if (!capable(CAP_SYS_ADMIN))
>> +		return -EPERM;
>
> Is this check preliminary? In the future would we check against the
> owner of the target namespace too? Naturally that will require tagging
> each namespace with an owner but I thought that was already part of the
> plan...
We aren't modifying the namespace here, so the namespace owner is
irrelevant.

We are modifying the process, so the caller needs CAP_SYS_ADMIN in the
process's credential/uid namespace.
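
As a rough userspace illustration of that rule, here is a minimal sketch.
It assumes the interface that mainline Linux later adopted,
setns(int fd, int nstype) with descriptors opened from /proc/<pid>/ns/*,
which is not the nsfd()/setns(nstype, fd) pair proposed in this RFC.
join_namespace() is a hypothetical helper name used only for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the RFC): move the calling process
 * into the namespace referred to by `path`, e.g. "/proc/1234/ns/net".
 * Assumes the later mainline setns(fd, nstype) interface.
 * Returns 0 on success or a negative errno value on failure.
 */
static int join_namespace(const char *path, int nstype)
{
	int fd, ret;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -errno;

	/* The kernel checks the caller's capabilities here. */
	ret = (setns(fd, nstype) == 0) ? 0 : -errno;

	close(fd);
	return ret;
}
```

A caller might use join_namespace("/proc/1234/ns/net", CLONE_NEWNET),
where 1234 is a placeholder pid. A caller without CAP_SYS_ADMIN would
typically see -EPERM from the setns() step, whoever created the target
namespace.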
Eric