Message-ID: <362a90d1-a331-4bcc-8f14-495baf5c2309@redhat.com>
Date: Thu, 7 Nov 2024 11:55:47 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Pablo Neira Ayuso <pablo@...filter.org>, netfilter-devel@...r.kernel.org
Cc: davem@...emloft.net, netdev@...r.kernel.org, kuba@...nel.org,
edumazet@...gle.com, fw@...len.de
Subject: Re: [PATCH net 1/1] netfilter: nf_tables: wait for rcu grace period
on net_device removal
Hi,
On 11/7/24 00:58, Pablo Neira Ayuso wrote:
> 8c873e219970 ("netfilter: core: free hooks with call_rcu") removed
> synchronize_net() call when unregistering basechain hook, however,
> net_device removal event handler for the NFPROTO_NETDEV was not updated
> to wait for RCU grace period.
>
> Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks
> on net_device removal") does not remove basechain rules on device
> removal, I was hinted to remove rules on net_device removal later, see
> 5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on
> netdevice removal").
>
> Although NETDEV_UNREGISTER event is guaranteed to be handled after
> synchronize_net() call, this path needs to wait for rcu grace period via
> rcu callback to release basechain hooks if netns is alive because an
> ongoing netlink dump could be in progress (sockets hold a reference on
> the netns).
>
> Note that nf_tables_pre_exit_net() unregisters and releases basechain
> hooks but it is possible to see NETDEV_UNREGISTER at a later stage in
> the netns exit path, e.g. veth peer device in another netns:
>
>  cleanup_net()
>   default_device_exit_batch()
>    unregister_netdevice_many_notify()
>     notifier_call_chain()
>      nf_tables_netdev_event()
>       __nft_release_basechain()
>
> In this particular case, same rule of thumb applies: if netns is alive,
> then wait for rcu grace period because netlink dump in the other netns
> could be in progress. Otherwise, if the other netns is going away then
> no netlink dump can be in progress and basechain hooks can be released
> immediately.
>
> While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
> validation, which should not ever happen.
>
> Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
> Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
> ---
> include/net/netfilter/nf_tables.h | 2 ++
> net/netfilter/nf_tables_api.c | 41 +++++++++++++++++++++++++------
> 2 files changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
> index 91ae20cb7648..8dd8e278843d 100644
> --- a/include/net/netfilter/nf_tables.h
> +++ b/include/net/netfilter/nf_tables.h
> @@ -1120,6 +1120,7 @@ struct nft_chain {
> char *name;
> u16 udlen;
> u8 *udata;
> + struct rcu_head rcu_head;
I'm sorry to be pedantic but the CI is complaining about the lack of
kdoc for this field...
>
> /* Only used during control plane commit phase: */
> struct nft_rule_blob *blob_next;
> @@ -1282,6 +1283,7 @@ struct nft_table {
> struct list_head sets;
> struct list_head objects;
> struct list_head flowtables;
> + possible_net_t net;
... and this one ...
> u64 hgenerator;
> u64 handle;
> u32 use;
[...]
> +static void nft_release_basechain_rcu(struct rcu_head *head)
> +{
> +	struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
> +	struct nft_ctx ctx = {
> +		.family	= chain->table->family,
> +		.chain	= chain,
> +		.net	= read_pnet(&chain->table->net),
> +	};
> +
> +	__nft_release_basechain_now(&ctx);
> +	put_net(ctx.net);
... and also about deprecated API usage here: the put_net_track()
version should be preferred.

Given this change will likely land on very old trees, I guess the tracker
conversion is better handled as a follow-up net-next patch.

Would you mind addressing the kdoc above? Today's PR will be handled by
Jakub quite a bit later, so there is still some time.
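
For illustration only, the two missing kdoc entries could look roughly
like the sketch below. The description wording is a suggestion rather
than text from the patch, the structs are heavily abridged, and the two
stand-in type definitions at the top are assumptions added so the
fragment compiles outside the kernel tree.

```c
/* Userspace stand-ins (assumptions) for the kernel types used below. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};
typedef struct { void *net; } possible_net_t;

/**
 * struct nft_chain - nf_tables chain (abridged for illustration)
 * @name: name of the chain
 * @rcu_head: rcu head for deferred release (suggested wording)
 */
struct nft_chain {
	char *name;
	struct rcu_head rcu_head;
};

/**
 * struct nft_table - nf_tables table (abridged for illustration)
 * @net: network namespace this table belongs to (suggested wording)
 * @use: number of chain references to this table
 */
struct nft_table {
	possible_net_t net;
	unsigned int use;
};
```

Each new member gets one "@member:" line in the existing comment block
above its struct, which is what the kdoc check looks for.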
Thanks!
Paolo