Message-ID: <CANn89i+dN11K7EushTwsT0tchEytceTWHqiB23KqrYvfauRjWg@mail.gmail.com>
Date: Tue, 7 Jan 2025 21:46:41 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: "David S . Miller" <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Simon Horman <horms@...nel.org>, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
On Tue, Jan 7, 2025 at 9:22 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Jan 7, 2025 at 9:11 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Tue, 7 Jan 2025 17:38:34 +0000 Eric Dumazet wrote:
> > > One major source of RTNL contention resides in unregister_netdevice()
> > >
> > > Due to RCU protection of various network structures, and
> > > unregister_netdevice() being a synchronous function,
> > > it is calling potentially slow functions while holding RTNL.
> > >
> > > I think we can release RTNL in two points, so that three
> > > slow functions are called while RTNL can be used
> > > by other threads.
> >
> > I think we'll need:
> >
> > diff --git a/net/devlink/port.c b/net/devlink/port.c
> > index 939081a0e615..cdfa22453a55 100644
> > --- a/net/devlink/port.c
> > +++ b/net/devlink/port.c
> > @@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
> > __devlink_port_type_set(devlink_port, devlink_port->type,
> > netdev);
> > break;
> > + case NETDEV_UNREGISTERING:
>
> Not sure I follow ?
>
> > case NETDEV_UNREGISTER:
> > if (devlink_net(devlink) != dev_net(netdev))
> > return NOTIFY_OK;
> >
> >
> > There is no other way to speed things up? Use RT prio for the work?
> > Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?
> >
> > I'm not gonna stand in your way but in general re-taking caller locks
> > in a callee is a bit ugly :(
>
> We might restrict this stuff to cleanup_net() caller only, we know the
> netns are disappearing
> and that no other thread can mess with them.
i.e., something like:
diff --git a/net/core/dev.c b/net/core/dev.c
index 9e93b13b9a76bd256d93d05a13d21dca883d6ab8..a555e82adbeda90672c72700e9235a5d271be8fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11414,6 +11414,23 @@ struct net_device *alloc_netdev_dummy(int sizeof_priv)
}
EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
+static bool from_cleanup_net(void)
+{
+ return current == cleanup_net_task;
+}
+
+static void rtnl_drop_if_cleanup(void)
+{
+ if (from_cleanup_net())
+ __rtnl_unlock();
+}
+
+static void rtnl_acquire_if_cleanup(void)
+{
+ if (from_cleanup_net())
+ rtnl_lock();
+}
+
/**
* synchronize_net - Synchronize with packet receive processing
*
@@ -11423,7 +11440,7 @@ EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
void synchronize_net(void)
{
might_sleep();
- if (current == cleanup_net_task || rtnl_is_locked())
+ if (from_cleanup_net() || rtnl_is_locked())
synchronize_rcu_expedited();
else
synchronize_rcu();
@@ -11527,10 +11544,10 @@ void unregister_netdevice_many_notify(struct list_head *head,
WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERING);
}
- __rtnl_unlock();
+ rtnl_drop_if_cleanup();
flush_all_backlogs();
synchronize_net();
- rtnl_lock();
+ rtnl_acquire_if_cleanup();
list_for_each_entry(dev, head, unreg_list) {
struct sk_buff *skb = NULL;
@@ -11590,9 +11607,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
#endif
}
- __rtnl_unlock();
+ rtnl_drop_if_cleanup();
synchronize_net();
- rtnl_lock();
+ rtnl_acquire_if_cleanup();
list_for_each_entry(dev, head, unreg_list) {
netdev_put(dev, &dev->dev_registered_tracker);
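
For readers who want to experiment with the idea outside the kernel, the conditional drop/re-acquire pattern in the diff above can be sketched in userspace. This is only an analogy under stated assumptions: a pthread mutex (`big_lock`) stands in for RTNL, a recorded `pthread_t` (`cleanup_thread`) stands in for `cleanup_net_task`, and every name below is hypothetical rather than kernel API. Unlike `unregister_netdevice_many_notify()`, whose caller already holds RTNL, the demo function takes the lock itself so that it is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for RTNL: one process-wide, non-recursive mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for cleanup_net_task: the one privileged thread. */
static pthread_t cleanup_thread;
static bool cleanup_thread_set;

/* Record the calling thread as the cleanup thread. */
void mark_current_as_cleanup(void)
{
	cleanup_thread = pthread_self();
	cleanup_thread_set = true;
}

/* Mirrors from_cleanup_net(): is the caller the cleanup thread? */
bool from_cleanup(void)
{
	return cleanup_thread_set &&
	       pthread_equal(pthread_self(), cleanup_thread);
}

/* Mirrors rtnl_drop_if_cleanup(): only the cleanup thread lets go. */
void lock_drop_if_cleanup(void)
{
	if (from_cleanup())
		pthread_mutex_unlock(&big_lock);
}

/* Mirrors rtnl_acquire_if_cleanup(): re-take only what was dropped. */
void lock_acquire_if_cleanup(void)
{
	if (from_cleanup())
		pthread_mutex_lock(&big_lock);
}

/*
 * Run a "slow phase" bracketed by the two helpers and report whether
 * the lock was available to other users while the slow phase ran.
 */
bool slow_phase_ran_unlocked(void)
{
	bool lock_was_free = false;

	pthread_mutex_lock(&big_lock);
	/* Fast bookkeeping would happen here, under the lock. */

	lock_drop_if_cleanup();
	/* Slow phase: probe whether the lock could be taken right now. */
	if (pthread_mutex_trylock(&big_lock) == 0) {
		lock_was_free = true;
		pthread_mutex_unlock(&big_lock);
	}
	lock_acquire_if_cleanup();

	/* In both cases the lock is held again at this point. */
	assert(pthread_mutex_trylock(&big_lock) == EBUSY);
	pthread_mutex_unlock(&big_lock);
	return lock_was_free;
}
```

Calling `slow_phase_ran_unlocked()` from an ordinary thread keeps the lock held throughout, while calling it after `mark_current_as_cleanup()` frees the lock for the duration of the slow phase. That is the restriction proposed above: only the caller known to be tearing down a disappearing netns releases the lock, so other callers keep their usual locking guarantees.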