Message-ID: <Zyyj1KUDDJXEJjkd@calendula>
Date: Thu, 7 Nov 2024 12:26:12 +0100
From: Pablo Neira Ayuso <pablo@...filter.org>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netfilter-devel@...r.kernel.org, davem@...emloft.net,
netdev@...r.kernel.org, kuba@...nel.org, edumazet@...gle.com,
fw@...len.de
Subject: Re: [PATCH net 1/1] netfilter: nf_tables: wait for rcu grace period on net_device removal

On Thu, Nov 07, 2024 at 11:55:47AM +0100, Paolo Abeni wrote:
> Hi,
> On 11/7/24 00:58, Pablo Neira Ayuso wrote:
> > 8c873e219970 ("netfilter: core: free hooks with call_rcu") removed the
> > synchronize_net() call when unregistering a basechain hook. However, the
> > net_device removal event handler for NFPROTO_NETDEV was not updated to
> > wait for an RCU grace period.
> >
> > Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks
> > on net_device removal") does not remove basechain rules on device
> > removal. I was later advised to remove rules on net_device removal, see
> > 5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on
> > netdevice removal").
> >
> > Although the NETDEV_UNREGISTER event is guaranteed to be handled after
> > the synchronize_net() call, this path needs to wait for an RCU grace
> > period via an RCU callback before releasing basechain hooks if the netns
> > is alive, because a netlink dump could be in progress (sockets hold a
> > reference on the netns).
> >
> > Note that nf_tables_pre_exit_net() unregisters and releases basechain
> > hooks, but it is possible to see NETDEV_UNREGISTER at a later stage in
> > the netns exit path, e.g. a veth peer device in another netns:
> >
> > cleanup_net()
> >  default_device_exit_batch()
> >   unregister_netdevice_many_notify()
> >    notifier_call_chain()
> >     nf_tables_netdev_event()
> >      __nft_release_basechain()
> >
> > In this particular case, the same rule of thumb applies: if the netns is
> > alive, then wait for an RCU grace period because a netlink dump in the
> > other netns could be in progress. Otherwise, if the other netns is going
> > away, then no netlink dump can be in progress and basechain hooks can be
> > released immediately.
> >
> > While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
> > validation, which should not ever happen.
> >
> > Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
> > Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
> > ---
> > include/net/netfilter/nf_tables.h | 2 ++
> > net/netfilter/nf_tables_api.c | 41 +++++++++++++++++++++++++------
> > 2 files changed, 36 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
> > index 91ae20cb7648..8dd8e278843d 100644
> > --- a/include/net/netfilter/nf_tables.h
> > +++ b/include/net/netfilter/nf_tables.h
> > @@ -1120,6 +1120,7 @@ struct nft_chain {
> >  	char *name;
> >  	u16 udlen;
> >  	u8 *udata;
> > +	struct rcu_head rcu_head;
>
> I'm sorry to be pedantic but the CI is complaining about the lack of
> kdoc for this field...
>
> >
> >  	/* Only used during control plane commit phase: */
> >  	struct nft_rule_blob *blob_next;
> > @@ -1282,6 +1283,7 @@ struct nft_table {
> >  	struct list_head sets;
> >  	struct list_head objects;
> >  	struct list_head flowtables;
> > +	possible_net_t net;
>
> ... and this one ...
>
> >  	u64 hgenerator;
> >  	u64 handle;
> >  	u32 use;
>
> [...]
> > +static void nft_release_basechain_rcu(struct rcu_head *head)
> > +{
> > +	struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
> > +	struct nft_ctx ctx = {
> > +		.family	= chain->table->family,
> > +		.chain	= chain,
> > +		.net	= read_pnet(&chain->table->net),
> > +	};
> > +
> > +	__nft_release_basechain_now(&ctx);
> > +	put_net(ctx.net);
>
> ... and also about deprecated API usage here, the put_net_track()
> version should be preferred.
>
> Given this change will likely land on very old trees I guess the tracker
> conversion is better handled as a follow-up net-next patch.

Agreed.

> Would you mind addressing the kdoc above? Today's PR will be handled by
> Jakub quite a bit later, so there is a bit of time.

I will fix kdoc and resubmit.
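
For readers following the thread, the rule of thumb from the patch description (defer the release while the netns is alive, release immediately otherwise) can be sketched in plain userspace C. This is only an illustration under stated assumptions: toy_chain, cb_head, defer_call() and run_deferred() are hypothetical stand-ins and not kernel APIs, and run_deferred() merely simulates a grace period elapsing. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Recover the enclosing object from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Callbacks queued for later; real RCU runs them after a grace period. */
static struct cb_head *pending;

/* Hypothetical analogue of call_rcu(): queue func to run later. */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
	assert(func != NULL);
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Simulate "the grace period has elapsed": run every queued callback.
 * Returns the number of callbacks that were run. */
static int run_deferred(void)
{
	int n = 0;

	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;
		head->func(head);
		n++;
	}
	return n;
}

/* Toy object with an embedded callback head, like nft_chain.rcu_head. */
struct toy_chain {
	const char *name;
	struct cb_head cb;
};

/* Counts how many toy chains have actually been freed. */
static int released_count;

static struct toy_chain *toy_chain_alloc(const char *name)
{
	struct toy_chain *chain = calloc(1, sizeof(*chain));

	if (chain)
		chain->name = name;
	return chain;
}

static void toy_chain_release_now(struct toy_chain *chain)
{
	released_count++;
	free(chain);
}

/* Deferred path: map the callback head back to its chain, then free it. */
static void toy_chain_release_cb(struct cb_head *head)
{
	toy_chain_release_now(container_of(head, struct toy_chain, cb));
}

/* Rule of thumb: defer while the netns is alive, else release at once. */
static void toy_chain_release(struct toy_chain *chain, int netns_alive)
{
	if (netns_alive)
		defer_call(&chain->cb, toy_chain_release_cb);
	else
		toy_chain_release_now(chain);
}
```

Calling toy_chain_release(chain, 1) leaves released_count unchanged until run_deferred() is invoked, whereas toy_chain_release(chain, 0) frees the object on the spot. The embedded cb_head is what lets the callback find its object again through container_of().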