Message-ID: <CAM_iQpXtWUZoGS_KEqfZc+ZbzYxaF1hbAV7H9Qd82=r4auJojw@mail.gmail.com>
Date: Mon, 15 May 2017 11:34:57 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Julian Anastasov <ja@....bg>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
David Miller <davem@...emloft.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Andrey Konovalov <andreyknvl@...gle.com>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [Patch net] ipv4: restore rt->fi for reference counting
On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov <ja@....bg> wrote:
> Now the main question: is FIB_LOOKUP_NOREF used
> everywhere in IPv4? I guess so. If not, it means
> someone can walk its res->fi NHs which is bad. I think,
> this will delay the unregistration for long time and we
> can not solve the problem.
>
> If yes, free_fib_info() should not use call_rcu.
> Instead, fib_release_info() will start RCU callback to
> drop everything via a common function for fib_release_info
> and free_fib_info. As result, the last fib_info_put will
> just need to free fi->fib_metrics and fi.
Yes, it is used. But this is a different problem from the
dev refcnt issue, right? I can send a separate patch to
address it.
>> Are you sure we are safe to call dev_put() in fib_release_info()
>> for _all_ paths, especially non-unregister paths? See:
>
> Yep, dev_put is safe there...
>
>> commit e49cc0da7283088c5e03d475ffe2fdcb24a6d5b1
>> Author: Yanmin Zhang <yanmin_zhang@...ux.intel.com>
>> Date: Wed May 23 15:39:45 2012 +0000
>>
>> ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
>
> ...as long as we do not set nh_dev to NULL
>
OK, fair enough. Then I think the best solution here is to move
the dev_put() from free_fib_info_rcu() to fib_release_info();
the fib_nh is already removed from the hash there anyway.
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index da449dd..cb712d1 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -205,8 +205,6 @@ static void free_fib_info_rcu(struct rcu_head *head)
 	struct fib_info *fi = container_of(head, struct fib_info, rcu);
 
 	change_nexthops(fi) {
-		if (nexthop_nh->nh_dev)
-			dev_put(nexthop_nh->nh_dev);
 		lwtstate_put(nexthop_nh->nh_lwtstate);
 		free_nh_exceptions(nexthop_nh);
 		rt_fibinfo_free_cpus(nexthop_nh->nh_pcpu_rth_output);
@@ -246,6 +244,14 @@ void fib_release_info(struct fib_info *fi)
 			if (!nexthop_nh->nh_dev)
 				continue;
 			hlist_del(&nexthop_nh->nh_hash);
+			/* We have to release these nh_dev here because a dst
+			 * could still hold a fib_info via rt->fi, we can't wait
+			 * for GC, a socket could hold the dst for a long time.
+			 *
+			 * This is safe, dev_put() alone does not really free
+			 * the netdevice, we just have to put the refcnt back.
+			 */
+			dev_put(nexthop_nh->nh_dev);
 		} endfor_nexthops(fi)
 		fi->fib_dead = 1;
 		fib_info_put(fi);
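
To illustrate the intended effect, here is a rough user-space sketch of
the refcounting, not kernel code. All toy_* names are hypothetical, the
model has a single nexthop, and it ignores RCU and locking entirely. It
only shows that once the device reference is dropped at release time, a
dst that still pins the fib_info no longer pins the device, while
nh_dev itself stays non-NULL:

```c
/* Toy user-space model; NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdlib.h>

struct toy_dev {
	int refcnt;             /* stands in for the netdevice refcount */
};

struct toy_fib_info {
	int refcnt;             /* loosely stands in for the fib_info refcount */
	int dead;               /* stands in for fi->fib_dead */
	struct toy_dev *nh_dev; /* single nexthop for brevity */
};

static void toy_dev_hold(struct toy_dev *dev) { dev->refcnt++; }
static void toy_dev_put(struct toy_dev *dev)  { dev->refcnt--; }

static struct toy_fib_info *toy_fib_create(struct toy_dev *dev)
{
	struct toy_fib_info *fi = calloc(1, sizeof(*fi));

	if (!fi)
		return NULL;
	fi->refcnt = 1;         /* reference owned by the FIB itself */
	fi->nh_dev = dev;
	toy_dev_hold(dev);      /* the nexthop pins the device */
	return fi;
}

/* Last put frees only the fib_info; the device ref is already gone. */
static void toy_fib_info_put(struct toy_fib_info *fi)
{
	if (--fi->refcnt == 0)
		free(fi);
}

/* Models the patched fib_release_info(): drop the device reference
 * right away, but leave nh_dev non-NULL for concurrent readers.
 */
static void toy_fib_release_info(struct toy_fib_info *fi)
{
	toy_dev_put(fi->nh_dev);
	fi->dead = 1;
	toy_fib_info_put(fi);
}

/* Returns the device refcount seen while a dst still pins the fib_info. */
static int toy_demo(void)
{
	struct toy_dev dev = { .refcnt = 1 };  /* 1 = registration ref */
	struct toy_fib_info *fi = toy_fib_create(&dev);
	int seen;

	if (!fi)
		return -1;
	fi->refcnt++;                /* a cached dst takes rt->fi */
	toy_fib_release_info(fi);    /* route deleted, dst still alive */
	seen = dev.refcnt;           /* only the registration ref remains */
	assert(fi->nh_dev == &dev);  /* pointer kept, never set to NULL */
	toy_fib_info_put(fi);        /* dst finally released, fi freed */
	return seen;
}
```

In this model toy_demo() observes the device refcount back at its
registration value while the fib_info is still alive, which is the
situation the unregister path needs. With the dev_put() left in the
free path instead, the count would stay elevated until the last holder
of the fib_info went away.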
Thanks!