Message-ID: <m1lje2qzf4.fsf@fess.ebiederm.org>
Date: Mon, 08 Mar 2010 13:25:03 -0800
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Daniel Lezcano <daniel.lezcano@...e.fr>
Cc: Pavel Emelyanov <xemul@...allels.com>,
Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
Serge Hallyn <serue@...ibm.com>,
Linux Netdev List <netdev@...r.kernel.org>,
containers@...ts.linux-foundation.org,
Netfilter Development Mailinglist
<netfilter-devel@...r.kernel.org>,
Ben Greear <greearb@...delatech.com>
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Daniel Lezcano <daniel.lezcano@...e.fr> writes:
> Eric W. Biederman wrote:
>> Daniel Lezcano <daniel.lezcano@...e.fr> writes:
>>
>>
>>> Eric W. Biederman wrote:
>>>
>>>> Daniel Lezcano <daniel.lezcano@...e.fr> writes:
>>>>
>>>>
>>>>> Eric W. Biederman wrote:
>>>>>
>>>>>> Daniel Lezcano <daniel.lezcano@...e.fr> writes:
>>>>>>
>>>>>>
>>>>>>> Eric W. Biederman wrote:
>>>>>>>
>>>>>>>> I have taken a snapshot of my development tree and placed it at:
>>>>>>>>
>>>>>>>>
>>>>>>>> git://git.kernel.org/pub/scm/linux/people/ebiederm/linux-2.6.33-nsfd-v5.git
>>>>>>>>
>>>>>>> Hi Eric,
>>>>>>>
>>>>>>> thanks for the pointer.
>>>>>>>
>>>>>>> I tried to boot the kernel under qemu and I got this oops:
>>>>>>>
>>>>>> I am clearly running an old userspace on my test machine. No udev.
>>>>>> It looks like udev has a long-standing netlink misfeature, where
>>>>>> it does not initialize NETLINK_CB....
>>>>>>
>>>>>>
>>>>>> From 8d85e3ab88718eda3d94cf8e1be14b69dae2b8f1 Mon Sep 17 00:00:00 2001
>>>>>> From: Eric W. Biederman <ebiederm@...ssion.com>
>>>>>> Date: Mon, 8 Mar 2010 09:25:20 -0800
>>>>>> Subject: [PATCH] kobject_uevent: Use the netlink allocator helper...
>>>>>>
>>>>>> Signed-off-by: Eric W. Biederman <ebiederm@...ssion.com>
>>>>>>
>>>>> Thanks.
>>>>>
>>>>> I was able to boot but I have the following warning:
>>>>>
>>>> Thanks for the bug report.
>>>>
>>> Thanks to you for the patchset :)
>>>
>>>
>>>> For the moment you might want to drop:
>>>> af_netlink: Allow credentials to work across namespaces.
>>>> af_netlink: Debugging in case I have missed something.
>>>>
>>>> Although I am curious if you hit my debugging messages in
>>>> netlink recv.
>>>>
>>> No, it does not appear (looked for "missing NETLINK_CB proto").
>>>
>>>
>>>> I guess if the goal is to test my nsfd bits you can drop everything
>>>> starting with my 'scm: Reorder scm_cookie.' commit. The rest is what
>>>> it takes to get uids, gids and pids translated when they cross
>>>> namespaces on an af_unix or an af_netlink socket.
>>>>
>>>> At least in the af_netlink case it appears clear I have missed
>>>> something.
>>>>
>>>> This is a warning that netlink throws when the packet accounting is
>>>> messed up. So it sounds like you are exercising another path that I
>>>> failed to exercise and fix.
>>>>
>>> I will follow up if I find more clues about this warning.
>>>
>>> In the meantime I was able to enter the container with the following
>>> ugly program:
>>>
>>> #include <unistd.h>
>>> #include <stdlib.h>
>>> #include <stdio.h>
>>> #include <syscall.h>
>>> #include <sys/types.h>
>>> #include <sys/stat.h>
>>> #include <fcntl.h>
>>> #include <sys/param.h>
>>>
>>> #define __NR_setns 300
>>>
>>> int setns(int nstype, int fd)
>>> {
>>>     return syscall(__NR_setns, nstype, fd);
>>> }
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>     char path[MAXPATHLEN];
>>>     char *ns[] = { "pid", "mnt", "net", "pid", "uts" };
>>>     const int size = sizeof(ns) / sizeof(char *);
>>>     int fd[size];
>>>     int i;
>>>
>>>     if (argc != 3) {
>>>         fprintf(stderr, "mynsenter <pid> <command>\n");
>>>         exit(1);
>>>     }
>>>
>>>     for (i = 0; i < size; i++) {
>>>         sprintf(path, "/proc/%s/ns/%s", argv[1], ns[i]);
>>>
>>>         fd[i] = open(path, O_RDONLY);
>>>         if (fd[i] < 0) {
>>>             perror("open");
>>>             return -1;
>>>         }
>>>
>>>     }
>>>
>>>     for (i = 0; i < size; i++) {
>>>
>>>         if (setns(0, fd[i])) {
>>>             perror("setns");
>>>             return -1;
>>>         }
>>>     }
>>>
>>>     execve(argv[2], &argv[2], NULL);
>>>     perror("execve");
>>>
>>>     return 0;
>>> }
>>>
>>> At first glance, no problem :)
>>>
>>
>> No fork(), so your process is completely in the pid namespace?
>>
> What I do is to attach "/bin/sh" to the container with this program.
> The container is a VPS running busybox with the full isolation.
>
> echo $$ gives the real pid.
> All the forked processes appear in the pid namespace, they are visible through
> /proc with the virtual pid.
> I am not able to change to the /proc/self directory (I assume this is normal).

I guess what I mean is that I was expecting:

    child = fork();
    if (child == 0) {
        execve(...);
    }
    waitpid(child, NULL, 0);

This puts /bin/sh in the container as well.
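
Spelled out a little more, the fork/exec/wait pattern could look like the
following. This is only a minimal sketch: run_in_child() is a hypothetical
helper name, and it assumes the setns() calls have already been made in the
parent before it is called.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patchset): fork, exec the command
 * in the child, and wait for it in the parent.  Assumes the caller has
 * already switched namespaces, so the child is created inside them.
 * Returns the child's exit status, 127 if the exec failed, or -1 on error.
 */
int run_in_child(char *const argv[])
{
    int status;
    pid_t child = fork();

    if (child < 0) {
        perror("fork");
        return -1;
    }
    if (child == 0) {
        /* Child: replace ourselves with the requested command. */
        execve(argv[0], argv, NULL);
        perror("execve");
        _exit(127);
    }
    /* Parent: stay around until the child is done. */
    if (waitpid(child, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the program quoted above this would replace the bare execve() call, e.g.
return run_in_child(&argv[2]);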

I'm not certain about the /proc/self thing; I have never encountered that.
But I guess if your pid is outside of the pid namespace of that instance
of proc, /proc/self will be a broken symlink.
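
One way to check that guess from inside the attached process is to read the
link directly. A minimal sketch, with hypothetical helper names
read_link_target() and report_proc_self(); exactly how the failure shows up
(which errno, or a dangling target) is an assumption to verify, not something
confirmed here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the target of a symlink such as /proc/self.
 * Returns 0 and fills buf on success, -1 if the link cannot be read.
 */
int read_link_target(const char *path, char *buf, size_t len)
{
    ssize_t n;

    if (len == 0)
        return -1;
    n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';  /* readlink() does not NUL-terminate */
    return 0;
}

/* Print what /proc/self points at next to what getpid() reports. */
void report_proc_self(void)
{
    char target[64];

    if (read_link_target("/proc/self", target, sizeof(target)) == 0)
        printf("/proc/self -> %s (getpid() = %ld)\n",
               target, (long)getpid());
    else
        perror("/proc/self");
}
```

Calling report_proc_self() right after the setns() loop, and again in a
forked child, would show whether the two cases differ.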
Eric