[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190604215334.wdvogz7qeg3jbbvl@kafai-mbp.dhcp.thefacebook.com>
Date: Tue, 4 Jun 2019 21:53:54 +0000
From: Martin Lau <kafai@...com>
To: David Ahern <dsahern@...il.com>
CC: Wei Wang <weiwan@...gle.com>, David Ahern <dsahern@...nel.org>,
"David S . Miller" <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
"idosch@...lanox.com" <idosch@...lanox.com>,
"saeedm@...lanox.com" <saeedm@...lanox.com>
Subject: Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in
a fib6_info
On Tue, Jun 04, 2019 at 03:13:38PM -0600, David Ahern wrote:
> On 6/4/19 3:06 PM, Martin Lau wrote:
> > On Tue, Jun 04, 2019 at 02:17:28PM -0600, David Ahern wrote:
> >> On 6/3/19 11:29 PM, Martin Lau wrote:
> >>> On Mon, Jun 03, 2019 at 07:36:06PM -0600, David Ahern wrote:
> >>>> On 6/3/19 6:58 PM, Martin Lau wrote:
> >>>>> I have concern on calling ip6_create_rt_rcu() in general which seems
> >>>>> to trace back to this commit
> >>>>> dec9b0e295f6 ("net/ipv6: Add rt6_info create function for ip6_pol_route_lookup")
> >>>>>
> >>>>> This rt is not tracked in pcpu_rt, rt6_uncached_list or exception bucket.
> >>>>> In particular, how to react to NETDEV_UNREGISTER/DOWN like
> >>>>> the rt6_uncached_list_flush_dev() does and calls dev_put()?
> >>>>>
> >>>>> The existing callers seem to do dst_release() immediately without
> >>>>> caching it, but still concerning.
> >>>>
> >>>> those are the callers that don't care about the dst_entry, but are
> >>>> forced to deal with it. Removing the tie between fib lookups an
> >>>> dst_entry is again the right solution.
> >>> Great to know that there will be a solution. It would be great
> >>> if there is patch (or repo) to show how that may look like on
> >>> those rt6_lookup() callers.
> >>
> >> Not 'will be', 'there is' a solution now. Someone just needs to do the
> >> conversions and devise the tests for the impacted users.
> > I don't think everyone will convert to the new nexthop solution
> > immediately.
> >
> > How about ensuring the existing usage stays solid first?
>
> Use of nexthop objects has nothing to do with separating fib lookups
> from dst_entries, but with the addition of fib6_result it Just Works.
>
> Wei converted ipv6 to use exception caches instead of adding them to the
> FIB.
>
> I converted ipv6 to use separate data structures for fib entries, added
> direct fib6 lookup functions and added fib6_result. See the
> net/core/filter.c.
>
> The stage is set for converting users.
>
> For example, ip6_nh_lookup_table does not care about the dst entry, only
> the fib entry. This converts it:
>
> static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg,
> const struct in6_addr *gw_addr, u32 tbid,
> int flags, struct fib6_result *res)
> {
> struct flowi6 fl6 = {
> .flowi6_oif = cfg->fc_ifindex,
> .daddr = *gw_addr,
> .saddr = cfg->fc_prefsrc,
> };
> struct fib6_table *table;
> struct rt6_info *rt;
>
> table = fib6_get_table(net, tbid);
> if (!table)
> return -EINVAL;
>
> if (!ipv6_addr_any(&cfg->fc_prefsrc))
> flags |= RT6_LOOKUP_F_HAS_SADDR;
>
> flags |= RT6_LOOKUP_F_IGNORE_LINKSTATE;
>
> fib6_table_lookup(net, table, cfg->fc_ifindex, fl6, res, flags);
> if (res.f6i == net->ipv6.fib6_null_entry)
> return -ENETUNREACH;
>
> fib6_select_path(net, &res, fl6, oif, false, NULL, flags);
>
> return 0;
> }
Thanks for the example. This patch/example is what I am looking for.
These changes are still "will be" done.
AFAICT, there are more than 10 rt6_lookup() usages which may end up calling
ip6_create_rt_rcu() and return an untracked rt.
It is very difficult to audit the future changes on them to ensure they will
not cache them till the above suggested changes to be done or
completely remove ip6_create_rt_rcu() for now.
Powered by blists - more mailing lists