[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fa5e105b-9109-1c30-c07c-5a2b0720ffec@stressinduktion.org>
Date: Wed, 23 Nov 2016 18:47:03 +0100
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
To: Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org
Cc: davem@...emloft.net, idosch@...lanox.com, eladr@...lanox.com,
yotamg@...lanox.com, nogahf@...lanox.com, arkadis@...lanox.com,
ogerlitz@...lanox.com, roopa@...ulusnetworks.com,
dsa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
andy@...yhouse.net, vivien.didelot@...oirfairelinux.com,
andrew@...n.ch, f.fainelli@...il.com, alexander.h.duyck@...el.com,
kaber@...sh.net
Subject: Re: [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB
dump
On 23.11.2016 15:34, Jiri Pirko wrote:
> From: Ido Schimmel <idosch@...lanox.com>
>
> Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
> introduced a new notification chain to notify listeners (f.e., switchdev
> drivers) about addition and deletion of routes.
>
> However, upon registration to the chain the FIB tables can already be
> populated, which means potential listeners will have an incomplete view
> of the tables.
>
> Solve that by adding an API to request a FIB dump. The dump itself it
> done using RCU in order not to starve consumers that need RTNL to make
> progress.
>
> For each net namespace the integrity of the dump is ensured by reading
> the atomic change sequence counter before and after the dump. This
> allows us to avoid the problematic situation in which the dumping
> process sends a ENTRY_ADD notification following ENTRY_DEL generated by
> another process holding RTNL.
>
> Signed-off-by: Ido Schimmel <idosch@...lanox.com>
> Signed-off-by: Jiri Pirko <jiri@...lanox.com>
> ---
> include/net/ip_fib.h | 1 +
> net/ipv4/fib_trie.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 118 insertions(+)
>
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 6c67b93..c76303e 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -221,6 +221,7 @@ enum fib_event_type {
> FIB_EVENT_RULE_DEL,
> };
>
> +bool fib_notifier_dump(struct notifier_block *nb);
> int register_fib_notifier(struct notifier_block *nb);
> int unregister_fib_notifier(struct notifier_block *nb);
> int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index b1d2d09..9770edfe 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -86,6 +86,67 @@
>
> static ATOMIC_NOTIFIER_HEAD(fib_chain);
>
> +static int call_fib_notifier(struct notifier_block *nb, struct net *net,
> + enum fib_event_type event_type,
> + struct fib_notifier_info *info)
> +{
> + info->net = net;
> + return nb->notifier_call(nb, event_type, info);
> +}
> +
> +static void fib_rules_notify(struct net *net, struct notifier_block *nb,
> + enum fib_event_type event_type)
> +{
> +#ifdef CONFIG_IP_MULTIPLE_TABLES
> + struct fib_notifier_info info;
> +
> + if (net->ipv4.fib_has_custom_rules)
> + call_fib_notifier(nb, net, event_type, &info);
> +#endif
> +}
> +
> +static void fib_notify(struct net *net, struct notifier_block *nb,
> + enum fib_event_type event_type);
> +
> +static int call_fib_entry_notifier(struct notifier_block *nb, struct net *net,
> + enum fib_event_type event_type, u32 dst,
> + int dst_len, struct fib_info *fi,
> + u8 tos, u8 type, u32 tb_id, u32 nlflags)
> +{
> + struct fib_entry_notifier_info info = {
> + .dst = dst,
> + .dst_len = dst_len,
> + .fi = fi,
> + .tos = tos,
> + .type = type,
> + .tb_id = tb_id,
> + .nlflags = nlflags,
> + };
> + return call_fib_notifier(nb, net, event_type, &info.info);
> +}
> +
> +bool fib_notifier_dump(struct notifier_block *nb)
> +{
> + struct net *net;
> + bool ret = true;
> + rcu_read_lock();
> + for_each_net_rcu(net) {
> + int fib_seq = atomic_read(&net->ipv4.fib_seq);
> +
> + fib_rules_notify(net, nb, FIB_EVENT_RULE_ADD);
> + fib_notify(net, nb, FIB_EVENT_ENTRY_ADD);
> + if (atomic_read(&net->ipv4.fib_seq) != fib_seq) {
> + ret = false;
> + goto out_unlock;
> + }
Hmm, I think you need to read the sequence counter under rtnl_lock to
have an ordering with the rest of the updates to the RCU trie. Otherwise
you don't know if the fib trie has the correct view regarding to the
incoming notifications as a whole. This is also necessary during restarts.
You can also try to register the notifier after the dump and check for
the sequence number after registering the notifier, maybe that is easier
(and restart unregisters and does the same).
Bye,
Hannes
Powered by blists - more mailing lists