Message-ID: <2023101632-circle-delegate-39dd@gregkh>
Date: Mon, 16 Oct 2023 19:20:26 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Daniel Gröber <dxld@...kboxed.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
netdev@...r.kernel.org, Richard Weinberger <richard@....at>,
Serge Hallyn <serge.hallyn@...onical.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [BUG] rtnl_newlink: Rogue MOVE event delivered on netns change

On Mon, Oct 16, 2023 at 07:32:51AM -0700, Jakub Kicinski wrote:
> On Sat, 14 Oct 2023 10:58:20 +0200 Greg Kroah-Hartman wrote:
> > On Fri, Oct 13, 2023 at 03:43:02PM -0700, Jakub Kicinski wrote:
> > > On Fri, 13 Oct 2023 15:36:05 -0700 Jakub Kicinski wrote:
> > > > kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
> > > > dev_net_set(dev, net);
> > > > kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
> > >
> > > Greg, we seem to have a problem in networking with combined
> > > netns move and name change.
> > >
> > > We have this code in __dev_change_net_namespace():
> > >
> > > kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
> > > dev_net_set(dev, net);
> > > kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
> > >
> > > err = device_rename(&dev->dev, dev->name);
> > >
> > > Is there any way we can only get the REMOVE (old name) and ADD
> > > (new name) events, without the move? I.e. silence the rename?
> > >
> > > Daniel is reporting that with current code target netns sees an
> > > add of an interface with the old (duplicated) name. And then a rename.
> >
> > But that's how this has always been, right? What problems is this
> > causing?
>
> Original report is up-thread:
> https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/
> With a link to a GH issue for lxc:
> https://github.com/lxc/incus/issues/146
>
> > > Without a silent move best we can do is probably:
> > >
> > > kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
> > > dev_net_set(dev, net);
> > > err = device_rename(&dev->dev, dev->name);
> > > kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
> > >
> > > which will give us:
> > >
> > > MOVE new-name
> > > ADD new-name
> > >
> > > in target netns, which, hm.
> >
> > That wouldn't make much sense.
> >
> > What is the real problem here? What changed to cause a problem?
>
> IIUC what happens is:
>
> - systemd controls "real" eth0
> - we move a "to be renamed" eth0 from a container into main ns
> - we rename "to be renamed" eth0 to something else
> - seeing the rename of eth0 systemd thinks it's the "real" one
> that is being renamed, ergo there's no eth0 any more,
> so it shuts down its "unit" for eth0
>
> I don't think anything changed. Sounds more like someone finally tried
> to use this in anger.

Then they get to keep the broken pieces that they created here.

"moving" a network connection to a container needs to either be added to
systemd if it is going to manage the network connections, or just stop
using systemd to handle the connection entirely as they want to do
something that systemd doesn't support.

I don't think your proposed change is going to do much here as you would
have multiple adds for the same device without any removes, which is
odd.

thanks,

greg k-h
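
For readers following the thread, here is a minimal user-space sketch of the two event orderings discussed above. It is not kernel code: `struct sim_dev`, `struct event_log`, `emit()`, `sim_rename()`, `current_order()` and `proposed_order()` are hypothetical names made up for illustration, and the model assumes only what the thread states, namely that a uevent is delivered to whichever netns the device belongs to at the time and that the rename produces a MOVE event.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_EVENTS 8
#define EVENT_LEN  64

/* Hypothetical event log: one entry per simulated uevent. */
struct event_log {
    char lines[MAX_EVENTS][EVENT_LEN];
    int count;
};

/* Hypothetical stand-in for a device: only its netns and its sysfs name. */
struct sim_dev {
    const char *netns;
    char name[16];
};

/* Record "ACTION name@netns", i.e. which netns would see the event. */
static void emit(struct event_log *log, const char *action,
                 const struct sim_dev *dev)
{
    assert(log->count < MAX_EVENTS);
    snprintf(log->lines[log->count++], EVENT_LEN, "%s %s@%s",
             action, dev->name, dev->netns);
}

/* Models the rename step: change the name, then report a MOVE. */
static void sim_rename(struct event_log *log, struct sim_dev *dev,
                       const char *new_name)
{
    snprintf(dev->name, sizeof(dev->name), "%s", new_name);
    emit(log, "MOVE", dev);
}

/* Ordering quoted from __dev_change_net_namespace(): REMOVE, ADD, rename. */
void current_order(struct event_log *log, struct sim_dev *dev,
                   const char *target_ns, const char *new_name)
{
    emit(log, "REMOVE", dev);
    dev->netns = target_ns;          /* models dev_net_set() */
    emit(log, "ADD", dev);           /* target netns sees the old name */
    sim_rename(log, dev, new_name);
}

/* Ordering proposed in the thread: REMOVE, rename, ADD. */
void proposed_order(struct event_log *log, struct sim_dev *dev,
                    const char *target_ns, const char *new_name)
{
    emit(log, "REMOVE", dev);
    dev->netns = target_ns;          /* models dev_net_set() */
    sim_rename(log, dev, new_name);  /* MOVE now lands in the target netns */
    emit(log, "ADD", dev);
}
```

Calling `current_order()` on a device named `eth0` in a netns labelled `container`, with target `host` and new name `eth1`, logs a REMOVE in `container`, then an ADD of `eth0` in `host`, then a MOVE to `eth1` in `host`. That is the add-with-old-name followed by a rename that Daniel reported. `proposed_order()` logs the same REMOVE, then a MOVE and an ADD that both carry `eth1` in `host`, which is the "MOVE new-name / ADD new-name" sequence from the quoted proposal.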