lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b4441b5-70ba-3462-64af-293ec3955d6e@cumulusnetworks.com>
Date:   Wed, 27 May 2020 01:37:09 +0300
From:   Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To:     David Miller <davem@...emloft.net>, dsahern@...nel.org
Cc:     netdev@...r.kernel.org, kuba@...nel.org, dahern@...italocean.com
Subject: Re: [PATCH net 0/5] nexthops: Fix 2 fundamental flaws with nexthop
 groups

On 5/27/20 1:35 AM, Nikolay Aleksandrov wrote:
> On 5/27/20 1:28 AM, David Miller wrote:
>> From: David Ahern <dsahern@...nel.org>
>> Date: Tue, 26 May 2020 09:01:09 -0600
>>
>>> From: David Ahern <dahern@...italocean.com>
>>>
>>> Nik's torture tests have exposed 2 fundamental mistakes with the initial
>>> nexthop code for groups. First, the nexthops entries and num_nh in the
>>> nh_grp struct should not be modified once the struct is set under rcu.
>>> Doing so has major affects on the datapath seeing valid nexthop entries.
>>>
>>> Second, the helpers in the header file were convenient for not repeating
>>> code, but they cause datapath walks to potentially see 2 different group
>>> structs after an rcu replace, disrupting a walk of the path objects.
>>> This second problem applies solely to IPv4 as I re-used too much of the
>>> existing code in walking legs of a multipath route.
>>>
>>> Patches 1 is refactoring change to simplify the overhead of reviewing and
>>> understanding the change in patch 2 which fixes the update of nexthop
>>> groups when a compnent leg is removed.
>>>
>>> Patches 3-5 address the second problem. Patch 3 inlines the multipath
>>> check such that the mpath lookup and subsequent calls all use the same
>>> nh_grp struct. Patches 4 and 5 fix datapath uses of fib_info_num_path
>>> with iterative calls to fib_info_nhc.
>>>
>>> fib_info_num_path can be used in control plane path in a 'for loop' with
>>> subsequent fib_info_nhc calls to get each leg since the nh_grp struct is
>>> only changed while holding the rtnl; the combination can not be used in
>>> the data plane with external nexthops as it involves repeated dereferences
>>> of nh_grp struct which can change between calls.
>>>
>>> Similarly, nexthop_is_multipath can be used for branching decisions in
>>> the datapath since the nexthop type can not be changed (a group can not
>>> be converted to standalone and vice versa).
>>>
>>> Patch set developed in coordination with Nikolay Aleksandrov. He did a
>>> lot of work creating a good reproducer, discussing options to fix it
>>> and testing iterations.
>>>
>>> I have adapted Nik's commands into additional tests in the nexthops
>>> selftest script which I will send against -next.
>>
>> Series applied and queued up for -stable, thanks David.
>>
> 
> Dave this was v1, there were some whitespace errors which were fixed
> in v2: http://patchwork.ozlabs.org/project/netdev/list/?series=179428
> 

We can send a simple incremental against this set to fix 'em up. If David
doesn't me beat to it, I'll send a fix tomorrow.

Cheers

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ