Message-ID: <CANn89i+dN11K7EushTwsT0tchEytceTWHqiB23KqrYvfauRjWg@mail.gmail.com>
Date: Tue, 7 Jan 2025 21:46:41 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: "David S . Miller" <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org, 
	Simon Horman <horms@...nel.org>, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()

On Tue, Jan 7, 2025 at 9:22 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Jan 7, 2025 at 9:11 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Tue,  7 Jan 2025 17:38:34 +0000 Eric Dumazet wrote:
> > > One major source of RTNL contention resides in unregister_netdevice()
> > >
> > > Due to RCU protection of various network structures, and
> > > unregister_netdevice() being a synchronous function,
> > > it is calling potentially slow functions while holding RTNL.
> > >
> > > I think we can release RTNL in two points, so that three
> > > slow functions are called while RTNL can be used
> > > by other threads.
> >
> > I think we'll need:
> >
> > diff --git a/net/devlink/port.c b/net/devlink/port.c
> > index 939081a0e615..cdfa22453a55 100644
> > --- a/net/devlink/port.c
> > +++ b/net/devlink/port.c
> > @@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
> >                 __devlink_port_type_set(devlink_port, devlink_port->type,
> >                                         netdev);
> >                 break;
> > +       case NETDEV_UNREGISTERING:
>
> Not sure I follow?
>
> >         case NETDEV_UNREGISTER:
> >                 if (devlink_net(devlink) != dev_net(netdev))
> >                         return NOTIFY_OK;
> >
> >
> > There is no other way to speed things up? Use RT prio for the work?
> > Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?
> >
> > I'm not gonna stand in your way but in general re-taking caller locks
> > in a callee is a bit ugly :(
>
> We might restrict this stuff to the cleanup_net() caller only: we know the
> netns are disappearing and that no other thread can mess with them.

i.e. something like:

diff --git a/net/core/dev.c b/net/core/dev.c
index 9e93b13b9a76bd256d93d05a13d21dca883d6ab8..a555e82adbeda90672c72700e9235a5d271be8fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11414,6 +11414,23 @@ struct net_device *alloc_netdev_dummy(int sizeof_priv)
 }
 EXPORT_SYMBOL_GPL(alloc_netdev_dummy);

+static bool from_cleanup_net(void)
+{
+       return current == cleanup_net_task;
+}
+
+static void rtnl_drop_if_cleanup(void)
+{
+       if (from_cleanup_net())
+               __rtnl_unlock();
+}
+
+static void rtnl_acquire_if_cleanup(void)
+{
+       if (from_cleanup_net())
+               rtnl_lock();
+}
+
 /**
  *     synchronize_net -  Synchronize with packet receive processing
  *
@@ -11423,7 +11440,7 @@ EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
 void synchronize_net(void)
 {
        might_sleep();
-       if (current == cleanup_net_task || rtnl_is_locked())
+       if (from_cleanup_net() || rtnl_is_locked())
                synchronize_rcu_expedited();
        else
                synchronize_rcu();
@@ -11527,10 +11544,10 @@ void unregister_netdevice_many_notify(struct list_head *head,
                WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERING);
        }

-       __rtnl_unlock();
+       rtnl_drop_if_cleanup();
        flush_all_backlogs();
        synchronize_net();
-       rtnl_lock();
+       rtnl_acquire_if_cleanup();

        list_for_each_entry(dev, head, unreg_list) {
                struct sk_buff *skb = NULL;
@@ -11590,9 +11607,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
 #endif
        }

-       __rtnl_unlock();
+       rtnl_drop_if_cleanup();
        synchronize_net();
-       rtnl_lock();
+       rtnl_acquire_if_cleanup();

        list_for_each_entry(dev, head, unreg_list) {
                netdev_put(dev, &dev->dev_registered_tracker);
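
To make the intent of the helpers above easier to see outside the kernel, here is a minimal userspace sketch of the same pattern: drop the big lock around a slow call only when the caller is the one known cleanup task. Everything in it is a hypothetical stand-in (struct task, current_task, cleanup_task, big_lock_held, slow_sync(), unregister_many(), run_as()); it is not kernel code, it is single-threaded, and it models lock state with a plain flag rather than a real mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct task { const char *name; };

static struct task *current_task;   /* models `current` */
static struct task *cleanup_task;   /* models cleanup_net_task */
static bool big_lock_held;          /* models RTNL being held */
static int slow_calls_unlocked;     /* slow calls made with the lock dropped */
static int slow_calls_locked;       /* slow calls made with the lock held */

static bool from_cleanup(void)
{
	return current_task == cleanup_task;
}

/* Release the lock only when running as the cleanup task. */
static void lock_drop_if_cleanup(void)
{
	if (from_cleanup()) {
		assert(big_lock_held);
		big_lock_held = false;
	}
}

/* Re-take the lock only if lock_drop_if_cleanup() released it. */
static void lock_acquire_if_cleanup(void)
{
	if (from_cleanup()) {
		assert(!big_lock_held);
		big_lock_held = true;
	}
}

/* Stand-in for a slow call; records whether the lock was held. */
static void slow_sync(void)
{
	if (big_lock_held)
		slow_calls_locked++;
	else
		slow_calls_unlocked++;
}

/* Caller must hold the lock; returns with the lock held again. */
static void unregister_many(void)
{
	assert(big_lock_held);
	lock_drop_if_cleanup();
	slow_sync();
	lock_acquire_if_cleanup();
	assert(big_lock_held);
}

/* Runs unregister_many() as `who`; returns how many slow calls ran unlocked. */
static int run_as(struct task *who)
{
	slow_calls_locked = 0;
	slow_calls_unlocked = 0;
	current_task = who;
	big_lock_held = true;
	unregister_many();
	big_lock_held = false;
	return slow_calls_unlocked;
}
```

In this model, run_as() with the cleanup task reports the slow call as having run unlocked, while any other task keeps the lock held across it, which is the restriction discussed above.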
