Message-ID: <20080429193417.GA19282@sergelap.austin.ibm.com>
Date: Tue, 29 Apr 2008 14:34:17 -0500
From: "Serge E. Hallyn" <serue@...ibm.com>
To: Greg KH <gregkh@...e.de>
Cc: "Serge E. Hallyn" <serue@...ibm.com>,
Benjamin Thery <benjamin.thery@...l.net>,
linux-kernel@...r.kernel.org, Al Viro <viro@....linux.org.uk>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Tejun Heo <htejun@...il.com>,
Daniel Lezcano <dlezcano@...ibm.com>,
Pavel Emelyanov <xemul@...nvz.org>, netdev@...r.kernel.org
Subject: Re: [PATCH 00/10] sysfs tagged directories
Quoting Greg KH (gregkh@...e.de):
> On Tue, Apr 29, 2008 at 01:04:45PM -0500, Serge E. Hallyn wrote:
> > Quoting Greg KH (gregkh@...e.de):
> > > On Tue, Apr 29, 2008 at 07:10:15PM +0200, Benjamin Thery wrote:
> > > > Here is the announcement Eric wrote back in December to introduce his
> > > > patchset:
> > >
> > > <snip>
> > >
> > > Are the objections that Al Viro made to this patchset when it was last
> > > sent out addressed in this new series?
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > Which objections were those? The last submission which I see by Eric
> > was http://lkml.org/lkml/2007/12/1/15 this past December. I see no
> > response from Al and get the feeling you were ok with them.
> >
> > So my hunch would be that Eric had addressed those before that last
> > submission, but if not I'm sorry, and please do set me straight.
>
> See the thread from Al starting with:
> Date: Mon, 7 Jan 2008 10:24:17 +0000
> From: Al Viro <viro@...IV.linux.org.uk>
> To: "Eric W. Biederman" <ebiederm@...ssion.com>
> Cc: linux-kernel@...r.kernel.org, htejun@...il.com,
> linux-fsdevel@...r.kernel.org, gregkh@...e.de
> Subject: [RFC] netns / sysfs interaction
> Message-ID: <20080107072301.GW27894@...IV.linux.org.uk>
>
> He had a lot of questions and objections to this way forward, and I
> share those objections.
Ah I see it, thanks.
All of Al's questions appear to be about how a task migration will be handled
in the face of funky userspace usage of sysfs files. But it seems clear the
first use of these will not be for migration but for vservers. The key
thing to remember is that we don't (as decided at kernel-summit 06) aim
to hide from userspace the fact that it's in a vserver, we just give it
what it needs so that it can pretend.
As we start implementing checkpoint and restart to effect migration,
*clearly* if we're trying to restart a task which has cwd or an open fd
in /sys/class/net/eth42/, but that directory doesn't exist on the target
machine, then the restart (and hence migrate) fails.
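To make that failure condition concrete, here is a minimal sketch of the kind of check a restart tool might run on the target machine before resuming a task. The function name `missing_paths`, and the idea of passing the saved cwd and open-fd targets as arguments, are assumptions for illustration only; nothing in this patchset provides such a tool.

```shell
#!/bin/sh
# Hypothetical pre-restart check: print every saved path (cwd, open fd
# targets) that does not exist on this machine, and return non-zero if
# at least one of them is missing.
missing_paths() {
    rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "$p"
            rc=1
        fi
    done
    return $rc
}

# Example: missing_paths /sys/class/net/eth42 || echo "restart must fail"
```

If any path is reported, the restart (and hence the migration) would be refused rather than resumed with a dangling cwd or fd.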
There was a concern about
/sys/devices/pci0000\:00/0000\:00\:0a.0/net:eth0. Since that's a
symlink to ../../../class/net/eth0, it will either point nowhere or
point to the virtualized eth0, if veth1 (or vethN) was renamed to eth0
in the container. (see below) If that is the wrong thing to do we
could try to address it in this patchset, but I suspect it is better
left until device namespaces are implemented. Does that sound sane?
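For anyone who wants to check which of the two cases they are in, a small sketch like the following tells a dead link from a live one. The helper name `link_state` is made up for this example; it relies only on the standard `-L` and `-e` tests, where `-e` follows the symlink and so fails for a dangling one.

```shell
#!/bin/sh
# Report whether a path is a live symlink, a dead (dangling) symlink,
# or not a symlink at all.
link_state() {
    if [ ! -L "$1" ]; then
        echo "not-a-link"
    elif [ -e "$1" ]; then
        echo "live"
    else
        echo "dead"
    fi
}

# Example: link_state '/sys/devices/pci0000:00/0000:00:0a.0/net:eth0'
```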
The last question of Al's which went unanswered was
> Excuse me, _what_? Are you seriously suggesting going through all dentry
> trees, doing d_move() in each? I want to see your locking. It's promising
> to be worse than devfs had ever been. Much worse.
I think this is answered in patch 4. So yeah, it does d_move() in each
sysfs mount. It's all done under the sysfs_rename_mutex. Judging by
the phrasing of the question, is that not acceptable?
Finally, to give an idea about how the trees end up looking, here is
what I just did on my test box:
/usr/sbin/ip link add type veth
mount --bind /mnt /mnt
mkdir /mnt/sys
mount --make-shared /mnt
ns_exec -cmn /bin/sh # unshare netns and mounts ns
# At this point, I still see eth0 and friends under /sys/class/net etc
mount -t sysfs none /sys
# At this point, /sys/class/net has only lo0 and sit0, and
# /sys/devices/pci0000:00/0000:00:03.0/net:eth0 is a dead link
mount --bind /sys /mnt/sys
echo $$
3050
(back in another shell):
/usr/sbin/ip link set veth1 netns 3050
(back in container shell):
/usr/sbin/ip link set veth1 name eth0
# Now /sys/devices/pci0000:00/0000:00:03.0/net:eth0 is a live link to
# the /sys/class/net/eth0 which is really the original veth1
exit
ls /mnt/sys/class/net
# empty directory
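To compare what each sysfs mount shows at the various steps above, a helper along these lines can be run against both /sys and /mnt/sys. This is only a sketch: `sysfs_netdevs` is a name invented here, and it assumes the usual class/net layout under whatever mount point is passed in.

```shell
#!/bin/sh
# List the network interfaces visible under a given sysfs mount point,
# one name per line. Prints nothing if class/net is empty or absent.
sysfs_netdevs() {
    for d in "$1"/class/net/*; do
        # -e rejects the unexpanded glob; -L keeps dead links in the list.
        if [ -e "$d" ] || [ -L "$d" ]; then
            basename "$d"
        fi
    done
}

# Example: sysfs_netdevs /sys; sysfs_netdevs /mnt/sys
```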
thanks,
-serge