Message-ID: <YZqIBVcFwIzj6VZG@shredder>
Date: Sun, 21 Nov 2021 19:55:17 +0200
From: Ido Schimmel <idosch@...sch.org>
To: Nikolay Aleksandrov <razor@...ckwall.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
dsahern@...il.com, Nikolay Aleksandrov <nikolay@...dia.com>
Subject: Re: [PATCH net 0/3] net: nexthop: fix refcount issues when replacing groups
On Sun, Nov 21, 2021 at 05:24:50PM +0200, Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov <nikolay@...dia.com>
>
> Hi,
> This set fixes a refcount bug when replacing nexthop groups and
> modifying routes. It is complex because the objects look valid when
> debugging memory dumps, but we end up having refcount dependency between
> unlinked objects which can never be released, so in turn they cannot
> free their resources and refcounts. The problem happens because we can
> have stale IPv6 per-cpu dsts in nexthops which were removed from a
> group. Even though the IPv6 gen is bumped, the dsts won't be released
> until traffic passes through them or the nexthop is freed, which can take
> an arbitrarily long time, and even worse, we can create a scenario[1] where
> they can never be released. The fix is to release the IPv6 per-cpu dsts of
> replaced nexthops after an RCU grace period so no new ones can be
> created. To do that we add a new IPv6 stub - fib6_nh_release_dsts, which
> is used by the nexthop code only when necessary. We can further optimize
> group replacement, but that is more suited for net-next as these patches
> would have to be backported to stable releases.
Will run regression with these patches tonight and report tomorrow.