[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56d2179d-00ff-bcc5-e365-845179cbe672@stressinduktion.org>
Date: Wed, 16 Nov 2016 20:43:25 +0100
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
To: Ido Schimmel <idosch@...sch.org>
Cc: Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
davem@...emloft.net, idosch@...lanox.com, eladr@...lanox.com,
yotamg@...lanox.com, nogahf@...lanox.com, arkadis@...lanox.com,
ogerlitz@...lanox.com, roopa@...ulusnetworks.com,
dsa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
andy@...yhouse.net, vivien.didelot@...oirfairelinux.com,
andrew@...n.ch, f.fainelli@...il.com, alexander.h.duyck@...el.com,
kuznet@....inr.ac.ru, jmorris@...ei.org, yoshfuji@...ux-ipv6.org,
kaber@...sh.net
Subject: Re: [patch net-next 6/8] ipv4: fib: Add an API to request a FIB dump
On 16.11.2016 19:51, Ido Schimmel wrote:
> Hi,
>
> On Wed, Nov 16, 2016 at 06:35:45PM +0100, Hannes Frederic Sowa wrote:
>> On 16.11.2016 16:18, Ido Schimmel wrote:
>>> On Wed, Nov 16, 2016 at 03:51:01PM +0100, Hannes Frederic Sowa wrote:
>>>> On 16.11.2016 15:09, Jiri Pirko wrote:
>>>>> From: Ido Schimmel <idosch@...lanox.com>
>>>>>
>>>>> Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
>>>>> introduced a new notification chain to notify listeners (f.e., switchdev
>>>>> drivers) about addition and deletion of routes.
>>>>>
>>>>> However, upon registration to the chain the FIB tables can already be
>>>>> populated, which means potential listeners will have an incomplete view
>>>>> of the tables.
>>>>>
>>>>> Solve that by adding an API to request a FIB dump. The dump itself it
>>>>> done using RCU in order not to starve consumers that need RTNL to make
>>>>> progress.
>>>>>
>>>>> Signed-off-by: Ido Schimmel <idosch@...lanox.com>
>>>>> Signed-off-by: Jiri Pirko <jiri@...lanox.com>
>>>>
>>>> Have you looked at potential inconsistencies resulting of RCU walking
>>>> the table and having concurrent inserts?
>>>
>>> Yes. I did try to think about situations in which this approach will
>>> fail, but I could only find problems with concurrent removals, which I
>>> addressed in 5/8. In case of concurrent insertions, even if you missed
>>> the node, you would still get the ENTRY_ADD event to your listener.
>>
>> Theoretically a node could still be installed while the deletion event
>> fired before registering the notifier. E.g. a synchronize_net before
>> dumping could help here?
>
> If the deletion event fired for some fib alias, then by 5/8 we are
> guaranteed that it was already unlinked from the fib alias list in the
> leaf in which it was contained. So, while it's possible we didn't
> register our listener in time for the deletion event, we won't traverse
> this fib alias while dumping the trie anyway. Did I understand you
> correctly?
>
Theoretically we can have the same problem for insertion:
You receive a delete event from the notifier that is queued up first but
the dump will still see the entry in the fib due to being managed by RCU
(the notifier running on another CPU).
The problem is that the fib_remove_alias->hlist_del_rcu->WRITE_ONCE is
still not strongly ordered against the local fib dump trie walk.
>> I don't know how you prepare the data structures for inserting in into
>> the hardware, but if ordering matters, the notifier for a delete event
>> can be called before the dump installed the fib entry?
>
> Right. It's possible for the listener to receive a deletion event for a
> fib entry it doesn't have, in which case it should just ignore it (as
> current listeners do).
Yep, for this specific case.
Bye,
Hannes
Powered by blists - more mailing lists