[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <999e09f0-2593-d079-e8f4-f9db6f2f85af@cumulusnetworks.com>
Date: Tue, 19 May 2020 11:48:01 +0300
From: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To: David Ahern <dsahern@...il.com>,
Roopa Prabhu <roopa@...ulusnetworks.com>, davem@...emloft.net
Cc: netdev@...r.kernel.org, jiri@...lanox.com, idosch@...lanox.com,
petrm@...lanox.com
Subject: Re: [PATCH net-next 1/6] nexthop: dereference nh only once in
nexthop_select_path
On 19/05/2020 06:25, David Ahern wrote:
> On 5/18/20 8:14 PM, Roopa Prabhu wrote:
>> From: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
>>
>> the ->nh pointer might become suddenly null while we're selecting the
>> path and we may dereference it. Dereference it only once in the
>> beginning and use that if it's not null, we rely on the refcounting and
>> rcu to protect against use-after-free. (This is needed for later
>> vxlan patches that exposes the problem)
>>
>> Signed-off-by: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
>> Signed-off-by: Roopa Prabhu <roopa@...ulusnetworks.com>
>> ---
>> net/ipv4/nexthop.c | 13 +++++++++----
>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>
>
> Reviewed-by: David Ahern <dsahern@...il.com>
>
> Should this be a bug fix? Any chance the route path can hit it?
>
That was fast, I didn't expect to see it on upstream so quickly. :)
So I haven't had time to inspect it in detail, but it did seem to me
that it should be possible to hit from the route path. When I tried
running a few basic tests to make it happen I couldn't mainly due to the
fib flush done at nexthop removal, but I still believe with the right
timing one could hit it.
In fact I'd go 1 step further and add a null check from the return
of nexthop_select_path() in the helpers which dereference the value it
returns like: nexthop_path_fib6_result() and nexthop_path_fib_result()
The reason is that the .nh ptr is set and read without any sync primitives
so in theory it can become null at any point (being cleared on nh group removal),
and also the nh count in a group (num_nh), when it becomes == 0 while destroying a nh group
and we hit it then in nexthop_select_path() rc would remain == NULL and we'll
deref a null ptr. We did see the above with the vxlan code due to it missing the equivalent
of a fib flush (or rather it being more relaxed), but I haven't had time to see how feasible
it is to hit it via the route path yet.
Powered by blists - more mailing lists