lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231013153605.487f5a74@kernel.org>
Date: Fri, 13 Oct 2023 15:36:05 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Daniel Gröber <dxld@...kboxed.org>
Cc: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 netdev@...r.kernel.org, Richard Weinberger <richard@....at>, Serge Hallyn
 <serge.hallyn@...onical.com>, "Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [BUG] rtnl_newlink: Rogue MOVE event delivered on netns change

On Tue, 10 Oct 2023 14:10:03 +0200 Daniel Gröber wrote:
> Changing a device's netns and renaming it with one RTM_NEWLINK call causes
> a rogue MOVE uevent to be delivered to the new netns in addition to the
> expected ADD uevent.
> 
> iproute2 reproducer:
> 
>     $ ip netns add test
>     $ ip link add dev eth0 netns test type dummy
>     $ ip link add dev eth0 type dummy
> 
>     $ ip -netns test link set dev eth0 netns 1 name eth123
> 
> With the last command, which renames the device and moves it out of the
> netns, we get the following:
> 
>     $ udevadm monitor -k
>     KERNEL[230953.424834] add      /devices/virtual/net/eth0 (net)
>     KERNEL[230953.424914] move     /devices/virtual/net/eth0 (net)
>     KERNEL[230953.425066] move     /devices/virtual/net/eth123 (net)

FTR I don't see the move on current net-next, I see two adds 
and one move.

 [ ~]# udevadm monitor -k &
 monitor will print the received events for:
 KERNEL - the kernel uevent

 [ ~]# ip netns add test
 [ ~]# ip link add dev eth0 netns test type dummy
 KERNEL[115.393650] add      /module/dummy (module)
 [ ~]# ip link add dev eth0 type dummy
 KERNEL[121.702300] add      /devices/virtual/net/eth0 (net)
 KERNEL[121.704608] add      /devices/virtual/net/eth0/queues/rx-0 (queues)
 KERNEL[121.704733] add      /devices/virtual/net/eth0/queues/tx-0 (queues)
 [ ~]# ip -netns test link set dev eth0 netns 1 name eth123
 KERNEL[135.598907] add      /devices/virtual/net/eth0 (net)
 KERNEL[135.600425] move     /devices/virtual/net/eth123 (net)

I don't think it matters for the problem you're describing, tho.

> The problem is the MOVE event hooribly confuses userspace. The particular
> symptom we're seing is that systemd will bring down the ifup@...0.service
> on the host as it handles the MOVE of eth0->eth123 as a stop for the
> BoundTo sys-subsystem-net-devices-eth0.device unit.
> 
> I also create a clashing eth0 device on the host in the repro to
> demonstrate that the RTM_NETLINK move+rename call is atomic and so the MOVE
> event is entirely nonsensical.
> 
> Looking at the code in __rtnl_newlink I see do_setlink first calls
> __dev_change_net_namespace and then dev_change_name. My guess is the order
> is just wrong here.

Interesting. My analysis is slightly different but only in low level
aspects.. tell me if I'm wrong:

1. we have tb[IFLA_IFNAME] set, so do_setlink() will populate ifname

2. Because of #1, __dev_change_net_namespace() gets called with 
   new name provide (pat = eth123)

3. It will do netdev_name_in_use(), which returns true.

4. It will then call dev_get_valid_name() which, (confusingly?) already
   sets the new name on the netdevice itself.

5. It then calls device shutdown from the source netns.
   This results in Bug#1, the netlink notification carries 
   the new name in the old netns.

 [ ~]# ip -netns test link add dev eth0 netns test type dummy
 [ ~]# ip -netns test link add dev eth1 netns test type dummy
 [ ~]# ip -netns test link set dev eth0 netns 1 name eth1

ip monitor inside netns:

 5: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
     link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff
 6: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
     link/ether 1e:4a:34:36:e3:cd brd ff:ff:ff:ff:ff:ff

 Deleted inet eth0 
 Deleted inet6 eth0 
 Deleted 5: eth1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default 
     link/ether be:4d:58:f9:d5:40 brd ff:ff:ff:ff:ff:ff new-netnsid 0 new-ifindex 7

tells us we deleted eth1, ifindex 5, which is not true. It was eth0.

Small sidebar - altnames are completely broken when it comes to netns,
too:

 [ ~]# ip link add dev eth0 type dummy
 [ ~]# ip link add dev eth1 type dummy
 [ ~]# ip link property add dev eth1 altname eth0
 RTNETLINK answers: File exists

^ it's not letting us use eth0 because device eth0 exists, but

 [ ~]# ip netns add test
 [ ~]# ip -netns test link add dev ethX netns test type dummy
 [ ~]# ip -netns test link property add dev ethX altname eth0
 [ ~]# ip -netns test link set dev ethX netns 1  

 [ ~]# ip li
 ...
 3: eth0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
     link/ether 02:40:88:62:ec:b8 brd ff:ff:ff:ff:ff:ff
 ...
 5: ethX: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
     link/ether 26:b7:28:78:38:0f brd ff:ff:ff:ff:ff:ff
     altname eth0

and we have eth0 altname. So that's Bug#2.
Picking back up after the shutdown in old netns.

6. We call:

   kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE);
   dev_net_set(dev, net);
   kobject_uevent(&dev->dev.kobj, KOBJ_ADD);

Those are the calls you see in udev, recall that device core has
its own naming, so both of those calls will use the _old_ name.
REMOVE in the source netns and ADD in the destination netns.

The kobject calls were added by Serge in 4e66ae2ea371c, in 2012,
curiously it states:

    v2: also send KOBJ_ADD to new netns.  There will then be a
    _MOVE event from the device_rename() call, but that should
    be innocuous.

Which was based on this conversation:
https://lore.kernel.org/all/20121012031328.GA5472@sergelap/

7. Now we finally call:

   err = device_rename(&dev->dev, dev->name);

Which tells device core that the name has changed, and gives you 
the (second) MOVE event. This time with the correct name.

Which is what you're seeing, Bug#3, the ADD event should be after
the call to device_rename()...

Bug#1 and Bug#2 we can fix in networking. Bug#3 is a bit more tricky,
because what we want is a "silent" rename, without generating the MOVE.
This email is a bit long, so let me cut off here..

Powered by blists - more mailing lists