netdev - Re: [patch net-next v2 10/11] mlxsw: spectrum_router: Request a dump of FIB tables during init

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20161123170436.GC1873@nanopsycho>
Date:   Wed, 23 Nov 2016 18:04:36 +0100
From:   Jiri Pirko <jiri@...nulli.us>
To:     Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, idosch@...lanox.com,
        eladr@...lanox.com, yotamg@...lanox.com, nogahf@...lanox.com,
        arkadis@...lanox.com, ogerlitz@...lanox.com,
        roopa@...ulusnetworks.com, dsa@...ulusnetworks.com,
        nikolay@...ulusnetworks.com, andy@...yhouse.net,
        vivien.didelot@...oirfairelinux.com, andrew@...n.ch,
        f.fainelli@...il.com, alexander.h.duyck@...el.com, kaber@...sh.net
Subject: Re: [patch net-next v2 10/11] mlxsw: spectrum_router: Request a dump
 of FIB tables during init

Wed, Nov 23, 2016 at 05:59:05PM CET, hannes@...essinduktion.org wrote:
>On Wed, Nov 23, 2016, at 17:04, Jiri Pirko wrote:
>> Wed, Nov 23, 2016 at 05:00:00PM CET, hannes@...essinduktion.org wrote:
>> >On Wed, Nov 23, 2016, at 15:48, Jiri Pirko wrote:
>> >> From: Ido Schimmel <idosch@...lanox.com>
>> >> 
>> >> Make sure the device has a complete view of the FIB tables by invoking
>> >> their dump during module init.
>> >> 
>> >> Signed-off-by: Ido Schimmel <idosch@...lanox.com>
>> >> Signed-off-by: Jiri Pirko <jiri@...lanox.com>
>> >> ---
>> >>  drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 16
>> >>  ++++++++++++++++
>> >>  1 file changed, 16 insertions(+)
>> >> 
>> >> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> >> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> >> index 14bed1d..36a71d2 100644
>> >> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> >> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> >> @@ -2027,6 +2027,21 @@ static int mlxsw_sp_router_fib_event(struct
>> >> notifier_block *nb,
>> >>  	return NOTIFY_DONE;
>> >>  }
>> >>  
>> >> +static void mlxsw_sp_router_fib_dump(struct mlxsw_sp *mlxsw_sp)
>> >> +{
>> >> +       while (!fib_notifier_dump(&mlxsw_sp->fib_nb)) {
>> >> +               /* Flush pending FIB notifications and then flush the
>> >> +                * device's table before requesting another dump. Do
>> >> +                * that with RTNL held, as FIB notification block is
>> >> +                * already registered.
>> >> +                */
>> >> +               mlxsw_core_flush_owq();
>> >> +               rtnl_lock();
>> >> +               mlxsw_sp_router_fib_flush(mlxsw_sp);
>> >> +               rtnl_unlock();
>> >> +       }
>> >> +}
>> >
>> >I think it is fine to use this kind of synchronization.
>> >
>> >But I think that this part of the logic still belongs into the core
>> 
>> Core does not know how driver handles the offloaded fibs. So only driver
>> knows how/if he needs to do flush in case of retry.
>
>Sure, but an abort function can be provided to the kernel anyway and the
>driver can care about that.

Ok, how?


>
>> >kernel. I still think it could happen that we will loop here
>> >indefinitely because of a lot of routing updates and as such would need
>> >to abort this loop after a number of tries.
>> 
>> In theory, it is possible, howevery quite unlikely.
>
>I think the "quite unlikely" already got us down the path to not using
>rtnl_lock in the first place.
>
>As I said, I am not sure about this as I didn't try any hardware
>offloading before and delays how long it needs to be transferred to
>hardware, but having a fail case for that seems like a nice improvement.
>At the same time I know of Linux boxes running in internet exchanges
>having several peers. The high update rates actually led to bgp
>implementation specifying flap damping which is actually nowadays
>considered harmful.
>
>Seriously, while most of the time convergence in routing protocols is
>good and most updates only hit the BGP user space table anyway and the
>change is suppressed because recursive routing lookup idempotence, quite
>unlikely events happen to the internet now and then:
>http://research.dyn.com/2009/02/longer-is-not-better/, which caused *a
>lot* of flapping and ongoing events on BGP routers throughout the world.
>
>I agree it is unlikely that you have to refresh your hw dump during this
>time, but who knows what customers do and what admins do in case
>something like this happens. I just don't favor to looping endlessly
>trying to sync up and getting into a stable state but tell the admin to
>detach the control plane from the forwarding plane and sync up then.
>
>That said, I think a sysctl for a maximum number of loops respected by
>drivers that needs to do so, should be enough for the time being.

Okay. Point taken.