[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAAVpQUDfftGhVvsiHy999avT=Tz8Gyw-Dwvrd3=rPvU8GyzP2w@mail.gmail.com>
Date: Wed, 2 Jul 2025 00:44:50 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>, "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Jamal Hadi Salim <jhs@...atatu.com>, Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
eric.dumazet@...il.com, Vlad Buslov <vladbu@...dia.com>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
Subject: Re: [PATCH net-next] net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit()
On Wed, Jul 2, 2025 at 12:08 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Jul 1, 2025 at 7:35 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> >
> > On Tue, Jul 01, 2025 at 01:30:06PM +0000, Eric Dumazet wrote:
> > > tc_action_net_exit() got an rtnl exclusion in commit
> > > a159d3c4b829 ("net_sched: acquire RTNL in tc_action_net_exit()")
> > >
> > > Since then, commit 16af6067392c ("net: sched: implement reference
> > > counted action release") made this RTNL exclusion obsolete.
> >
> > I am not sure removing RTNL is safe even we have action refcnt.
> >
> > For example, are you sure tcf_action_offload_del() is safe to call
> > without RTNL?
>
> My thinking was that at the time of these calls, devices were already
> gone from the dismantling netns,
I thought the same because tcf_register_action() uses
register_pernet_subsys(), and subsys_ops->exit() is always executed
after other device_ops including default_device_exit_batch() that should
wait for all dev to go away.
> but this might be wrong.
>
> We can conditionally acquire rtnl from tcf_idrinfo_destroy() when
> there is at least one offloaded action in the idr.
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 057e20cef3754f33357c4c1e30034f6b9b872d91..9e468e46346710c85c3a85b905d27dfe3972916a
> 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -933,18 +933,25 @@ void tcf_idrinfo_destroy(const struct tc_action_ops *ops,
> struct tcf_idrinfo *idrinfo)
> {
> struct idr *idr = &idrinfo->action_idr;
> + bool mutex_taken = false;
> struct tc_action *p;
> - int ret;
> unsigned long id = 1;
> unsigned long tmp;
> + int ret;
>
> idr_for_each_entry_ul(idr, p, tmp, id) {
> + if (tc_act_in_hw(p) && !mutex_taken) {
> + rtnl_lock();
> + mutex_taken = true;
> + }
> ret = __tcf_idr_release(p, false, true);
> if (ret == ACT_P_DELETED)
> module_put(ops->owner);
> else if (ret < 0)
> return;
> }
> + if (mutex_taken)
> + rtnl_unlock();
> idr_destroy(&idrinfo->action_idr);
> }
> EXPORT_SYMBOL(tcf_idrinfo_destroy);
>
>
> >
> > What are you trying to improve here?
>
> Yeah, some of us are spending months of work to improve the RTNL
> situation, and we do not copy/paste why on every single patch :)
>
> I will capture the following in V2, thanks !
>
> Most netns do not have actions, yet deleting them is adding a lot of
> pressure on RTNL, which is for us the most contended mutex in the
> kernel.
>
> We are moving to a per-netns 'rtnl', so tc_action_net_exit() will not
> be able to grab 'rtnl' a single time for a batch of netns.
>
>
> Before the patch:
>
> perf probe -a rtnl_lock # Note: This does not capture all calls, some
> of them might be inlined in net/core/rtnetlink.c
>
> perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1'
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.305 MB perf.data (25 samples) ]
>
> After the patch:
>
> perf record -e probe:rtnl_lock -a /bin/bash -c 'unshare -n "/bin/true"; sleep 1'
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.304 MB perf.data (9 samples) ]
Powered by blists - more mailing lists