[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200526.152800.1859140520396255826.davem@davemloft.net>
Date: Tue, 26 May 2020 15:28:00 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: dsahern@...nel.org
Cc: netdev@...r.kernel.org, kuba@...nel.org,
nikolay@...ulusnetworks.com, dahern@...italocean.com
Subject: Re: [PATCH net 0/5] nexthops: Fix 2 fundamental flaws with nexthop
groups
From: David Ahern <dsahern@...nel.org>
Date: Tue, 26 May 2020 09:01:09 -0600
> From: David Ahern <dahern@...italocean.com>
>
> Nik's torture tests have exposed 2 fundamental mistakes with the initial
> nexthop code for groups. First, the nexthops entries and num_nh in the
> nh_grp struct should not be modified once the struct is set under rcu.
> Doing so has major affects on the datapath seeing valid nexthop entries.
>
> Second, the helpers in the header file were convenient for not repeating
> code, but they cause datapath walks to potentially see 2 different group
> structs after an rcu replace, disrupting a walk of the path objects.
> This second problem applies solely to IPv4 as I re-used too much of the
> existing code in walking legs of a multipath route.
>
> Patches 1 is refactoring change to simplify the overhead of reviewing and
> understanding the change in patch 2 which fixes the update of nexthop
> groups when a compnent leg is removed.
>
> Patches 3-5 address the second problem. Patch 3 inlines the multipath
> check such that the mpath lookup and subsequent calls all use the same
> nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
> with iterative calls to fib_info_nhc.
>
> fib_info_num_path can be used in control plane path in a 'for loop' with
> subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
> only changed while holding the rtnl; the combination can not be used in
> the data plane with external nexthops as it involves repeated dereferences
> of nh_grp struct which can change between calls.
>
> Similarly, nexthop_is_multipath can be used for branching decisions in
> the datapath since the nexthop type can not be changed (a group can not
> be converted to standalone and vice versa).
>
> Patch set developed in coordination with Nikolay Aleksandrov. He did a
> lot of work creating a good reproducer, discussing options to fix it
> and testing iterations.
>
> I have adapted Nik's commands into additional tests in the nexthops
> selftest script which I will send against -next.
Series applied and queued up for -stable, thanks David.
Powered by blists - more mailing lists