lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a81fdb9c-4bf7-3e0c-85b0-9504b6780b48@stressinduktion.org>
Date:   Thu, 24 Nov 2016 13:34:08 +0100
From:   Hannes Frederic Sowa <hannes@...essinduktion.org>
To:     Ido Schimmel <idosch@...sch.org>
Cc:     Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
        davem@...emloft.net, idosch@...lanox.com, eladr@...lanox.com,
        yotamg@...lanox.com, nogahf@...lanox.com, arkadis@...lanox.com,
        ogerlitz@...lanox.com, roopa@...ulusnetworks.com,
        dsa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
        andy@...yhouse.net, vivien.didelot@...oirfairelinux.com,
        andrew@...n.ch, f.fainelli@...il.com, alexander.h.duyck@...el.com,
        kaber@...sh.net
Subject: Re: [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB
 dump

On 24.11.2016 09:47, Ido Schimmel wrote:
> On Thu, Nov 24, 2016 at 12:04:57AM +0100, Hannes Frederic Sowa wrote:
>> On 23.11.2016 20:53, Ido Schimmel wrote:
>>> On Wed, Nov 23, 2016 at 06:47:03PM +0100, Hannes Frederic Sowa wrote:
>>>> Hmm, I think you need to read the sequence counter under rtnl_lock to
>>>> have an ordering with the rest of the updates to the RCU trie. Otherwise
>>>> you don't know if the fib trie has the correct view regarding to the
>>>> incoming notifications as a whole. This is also necessary during restarts.
>>>
>>> I spent quite a lot of time thinking about this specific issue, but I
>>> couldn't convince myself that the read should be done under RTNL and I'm
>>> not sure I understand your reasoning. Can you please elaborate?
>>>
>>> If, before each notification sent, we call atomic_inc() and then call
>>> atomic_read() at the end, then how can we be tricked?
>>
>> The race I am suspecting to happen is:
>>
>> <CPU0> fib_register()
>>
>> <CPU1> delete route by notifier
>> <CPU1> enqueue delete cmd into ordered queue
>>
>> <CPU0> starts dump
>> <CPU0> sees deleted route by CPU1 because route not yet removed from RCU
>> <CPU0> enqueues route for addition
> 
> Yea, I missed this trivial case... My mind was fixed on problems that
> could happen after the dump already started. :(
> 
> Regarding your suggestion, I think the API will be more useful if we
> don't bundle fib_register() and fib_dump() together. We can do the
> following instead:
> 
> 1) Sum 'fib_seq' (doesn't need to be atomic_t anymore) from all net
> namespaces under RTNL

You anyway only support init_net, no?

I didn't fully understood what you mean by sum? Using one for the whole
system?

We already have net->ipv4.rt_genid as a per-namespace routing change
counter, have you looked at that?

> 2) Dump FIB tables under RCU
> 3) Do 1) again
> 4) Compare results from 1) and 3) and retry (according to sysctl limit)
> if results differ. Before each retry the module's callback (if passed)
> will be invoked.
> 
> Sounds OK?

Ah, you want to sum up all the fib_seq from all namespaces. Now I got it.

Not sure if that is such a good idea actually. It might make problems
later on if offloading will maybe one day become a per-netns knob for
the respective admins.

But semantically it should work.

If it turns out to be much easier than doing it per-netns, I think this
approach should work.

Bye,
Hannes

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ