Date:   Tue, 28 Sep 2021 14:54:59 +0200
From:   Antoine Tenart <atenart@...nel.org>
To:     davem@...emloft.net, kuba@...nel.org
Cc:     Antoine Tenart <atenart@...nel.org>, pabeni@...hat.com,
        gregkh@...uxfoundation.org, ebiederm@...ssion.com,
        stephen@...workplumber.org, herbert@...dor.apana.org.au,
        juri.lelli@...hat.com, netdev@...r.kernel.org
Subject: [RFC PATCH net-next 8/9] net: delay device_del until run_todo

Move the deletion of the device from unregister_netdevice_many to
netdev_run_todo and move it outside the rtnl lock.

An ABBA deadlock between net-sysfs and the netdevice unregistration was
reported 12 years ago[1]. The issue was the following:

              A                            B

   unregister_netdevice_many         sysfs access
   rtnl_lock                         sysfs refcount
                                     rtnl_lock
   drain sysfs files
   => waits for B                    => waits for A

This was avoided thanks to two patches[2][3], which used rtnl_trylock in
net-sysfs and restarted the syscall when the rtnl lock was already
taken. This way kernfs nodes were not blocking the netdevice
unregistration anymore.
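
As a rough userspace illustration of the trylock/restart pattern described
above (not the net-sysfs code itself): fake_rtnl, mtu, attr_store and demo
are hypothetical names, a pthread mutex stands in for the rtnl lock, and
-EAGAIN stands in for the syscall restart.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the rtnl lock. */
pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical attribute protected by the lock. */
int mtu = 1500;

/*
 * Analogue of a sysfs store method: it never sleeps on the lock.
 * Returns 0 on success, or -EAGAIN when the lock is busy and the
 * caller has to retry (the kernel restarts the syscall instead).
 */
int attr_store(int new_value)
{
	if (pthread_mutex_trylock(&fake_rtnl) != 0)
		return -EAGAIN;

	mtu = new_value;
	pthread_mutex_unlock(&fake_rtnl);
	return 0;
}

/* Usage: a busy lock yields "retry"; a free lock lets the store through. */
void demo(void)
{
	pthread_mutex_lock(&fake_rtnl);
	assert(attr_store(9000) == -EAGAIN);
	assert(mtu == 1500);
	pthread_mutex_unlock(&fake_rtnl);

	assert(attr_store(9000) == 0);
	assert(mtu == 9000);
}
```

The accessor never blocks while pinning the file, so the unregistration
side can always drain it; the cost is that a caller may have to retry many
times while the lock is contended.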

This was fine at the time but now causes problems: creating and moving
interfaces makes userspace (systemd, NetworkManager or others) spin
heavily as syscalls are restarted, which hurts performance. This happens
for example when creating pods. While userspace applications could be
improved, fixing this in the kernel addresses the root cause.

The sysfs removal is done in device_del; moving it outside of the rtnl
lock fixes the initial deadlock, since the sysfs files are no longer
drained with the lock held. With that, the trylock/restart logic can be
removed in a follow-up patch.
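
The ordering argument can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not kernel code: big_lock plays
the rtnl lock, active_refs plays the sysfs active references, and all
function names are hypothetical. Unlike the real sysfs removal, the model
does not block new accesses while draining.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

enum dev_state { REGISTERED, UNREGISTERING, UNREGISTERED };

pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* plays rtnl */
atomic_int active_refs;                               /* plays sysfs refs */
atomic_int state = REGISTERED;

/* "B": pins the file first, then takes the lock with a blocking call. */
void *sysfs_access(void *arg)
{
	(void)arg;
	atomic_fetch_add(&active_refs, 1);
	pthread_mutex_lock(&big_lock);
	(void)atomic_load(&state);	/* read something under the lock */
	pthread_mutex_unlock(&big_lock);
	atomic_fetch_sub(&active_refs, 1);
	return NULL;
}

/* "A": does the locked work, drops the lock, and only then drains. */
void *unregister_dev(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&big_lock);
	atomic_store(&state, UNREGISTERING);
	pthread_mutex_unlock(&big_lock);

	/* Drain outside the lock: B can still take the lock and finish. */
	while (atomic_load(&active_refs) > 0)
		sched_yield();

	atomic_store(&state, UNREGISTERED);
	return NULL;
}

/* Runs A and B concurrently and returns the final device state. */
int run_demo(void)
{
	pthread_t a, b;

	pthread_create(&b, NULL, sysfs_access, NULL);
	pthread_create(&a, NULL, unregister_dev, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	assert(atomic_load(&active_refs) == 0);
	return atomic_load(&state);
}
```

Both threads finish whatever the interleaving, because the drain loop never
runs with big_lock held. Moving the loop inside the locked region would
bring back the A/B wait cycle shown in the diagram.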

[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/
(I'm referencing the full thread but the sysfs issue was discussed later
in the thread).
[2] 336ca57c3b4e ("net-sysfs: Use rtnl_trylock in sysfs methods.")
[3] 5a5990d3090b ("net: Avoid race between network down and sysfs")

Co-developed-by: Paolo Abeni <pabeni@...hat.com>
Signed-off-by: Antoine Tenart <atenart@...nel.org>
---
 net/core/dev.c       | 2 ++
 net/core/net-sysfs.c | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a1eab120bb50..d774fbec5d63 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10593,6 +10593,8 @@ void netdev_run_todo(void)
 			continue;
 		}
 
+		device_del(&dev->dev);
+
 		dev->reg_state = NETREG_UNREGISTERED;
 
 		netdev_wait_allrefs(dev);
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 21c3fdeccf20..e754f00c117b 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1955,8 +1955,6 @@ void netdev_unregister_kobject(struct net_device *ndev)
 	remove_queue_kobjects(ndev);
 
 	pm_runtime_set_memalloc_noio(dev, false);
-
-	device_del(dev);
 }
 
 /* Create sysfs entries for network device. */
-- 
2.31.1
