[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2023112132-crummiest-onion-de48@gregkh>
Date: Tue, 21 Nov 2023 06:40:42 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com,
Daniel Gröber <dxld@...kboxed.org>
Subject: Re: [PATCH net-next] net: do not send a MOVE event when netdev
changes netns
On Mon, Nov 20, 2023 at 10:41:40AM -0800, Jakub Kicinski wrote:
> Networking supports changing netdevice's netns and name
> at the same time. This allows avoiding name conflicts
> and having to rename the interface in multiple steps.
> E.g. netns1={eth0, eth1}, netns2={eth1} - we want
> to move netns1:eth1 to netns2 and call it eth0 there.
> If we can't rename "in flight" we'd need to (1) rename
> eth1 -> $tmp, (2) change netns, (3) rename $tmp -> eth0.
>
> To rename the underlying struct device we have to call
> device_rename(). The rename()'s MOVE event, however, doesn't
> "belong" to either the old or the new namespace.
> If there are conflicts on both sides it's actually impossible
> to issue a real MOVE (old name -> new name) without confusing
> user space. And Daniel reports that such confusions do in fact
> happen for systemd, in real life.
>
> Since we already issue explicit REMOVE and ADD events
> manually - suppress the MOVE event completely. Move
> the ADD after the rename, so that the REMOVE uses
> the old name, and the ADD the new one.
>
> If there is no rename this changes the picture as follows:
>
> Before:
>
> old ns | KERNEL[213.399289] remove /devices/virtual/net/eth0 (net)
> new ns | KERNEL[213.401302] add /devices/virtual/net/eth0 (net)
> new ns | KERNEL[213.401397] move /devices/virtual/net/eth0 (net)
>
> After:
>
> old ns | KERNEL[266.774257] remove /devices/virtual/net/eth0 (net)
> new ns | KERNEL[266.774509] add /devices/virtual/net/eth0 (net)
>
> If there is a rename and a conflict (using the exact eth0/eth1
> example explained above) we get this:
>
> Before:
>
> old ns | KERNEL[224.316833] remove /devices/virtual/net/eth1 (net)
> new ns | KERNEL[224.318551] add /devices/virtual/net/eth1 (net)
> new ns | KERNEL[224.319662] move /devices/virtual/net/eth0 (net)
>
> After:
>
> old ns | KERNEL[333.033166] remove /devices/virtual/net/eth1 (net)
> new ns | KERNEL[333.035098] add /devices/virtual/net/eth0 (net)
>
> Note that "in flight" rename is only performed when needed.
> If there is no conflict for old name in the target netns -
> the rename will be performed separately by dev_change_name(),
> as if the rename was a different command, and there will still
> be a MOVE event for the rename:
>
> Before:
>
> old ns | KERNEL[194.416429] remove /devices/virtual/net/eth0 (net)
> new ns | KERNEL[194.418809] add /devices/virtual/net/eth0 (net)
> new ns | KERNEL[194.418869] move /devices/virtual/net/eth0 (net)
> new ns | KERNEL[194.420866] move /devices/virtual/net/eth1 (net)
>
> After:
>
> old ns | KERNEL[71.917520] remove /devices/virtual/net/eth0 (net)
> new ns | KERNEL[71.919155] add /devices/virtual/net/eth0 (net)
> new ns | KERNEL[71.920729] move /devices/virtual/net/eth1 (net)
>
> If deleting the MOVE event breaks some user space we should insert
> an explicit kobject_uevent(MOVE) after the ADD, like this:
>
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -11192,6 +11192,12 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
> kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
> netdev_adjacent_add_links(dev);
>
> + /* User space wants an explicit MOVE event, issue one unless
> + * dev_change_name() will get called later and issue one.
> + */
> + if (!pat || new_name[0])
> + kobject_uevent(&dev->dev.kobj, KOBJ_MOVE);
> +
> /* Adapt owner in case owning user namespace of target network
> * namespace is different from the original one.
> */
>
> CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Reported-by: Daniel Gröber <dxld@...kboxed.org>
> Link: https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
Acked-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Powered by blists - more mailing lists