Date:   Mon, 15 May 2017 23:37:30 +0300 (EEST)
From:   Julian Anastasov <ja@....bg>
To:     Cong Wang <xiyou.wangcong@...il.com>
cc:     Eric Dumazet <eric.dumazet@...il.com>,
        David Miller <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: [Patch net] ipv4: restore rt->fi for reference counting


	Hello,

On Mon, 15 May 2017, Cong Wang wrote:

> On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov <ja@....bg> wrote:
> >         Now the main question: is FIB_LOOKUP_NOREF used
> > everywhere in IPv4? I guess so. If not, it means
> > someone can walk its res->fi NHs which is bad. I think,
> > this will delay the unregistration for long time and we
> > can not solve the problem.
> >
> >         If yes, free_fib_info() should not use call_rcu.
> > Instead, fib_release_info() will start RCU callback to
> > drop everything via a common function for fib_release_info
> > and free_fib_info. As result, the last fib_info_put will
> > just need to free fi->fib_metrics and fi.
> 
> 
> Yes it is used. But this is a different problem from the
> dev refcnt issue, right? I can send a separate patch to
> address it.

	Any user that does not set FIB_LOOKUP_NOREF
will need nh_dev refcounts. The assumption is that the
NHs are accessed, who knows, maybe even after the RCU grace
period. As a result, we cannot use dev_put on NETDEV_UNREGISTER.
So, we should check whether there are users that do not
set FIB_LOOKUP_NOREF; at first look, I don't see any
for IPv4.
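
A minimal userspace sketch of the distinction above, for illustration only. It assumes that a lookup without the NOREF flag takes a client reference on the fib_info and a lookup with it does not. All names here (noref_fi, noref_result, NOREF_FLAG, noref_lookup, noref_result_put) are hypothetical stand-ins, not kernel API, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for FIB_LOOKUP_NOREF; the value is arbitrary. */
#define NOREF_FLAG 0x1

/* Mock of a fib_info: only the client reference count is modelled. */
struct noref_fi {
	int clntref;
};

/* Mock of a lookup result that points at the matched fib_info. */
struct noref_result {
	struct noref_fi *fi;
};

/*
 * Lookup: with NOREF_FLAG the caller borrows fi and must stay inside
 * the RCU read-side section; without it the caller owns a reference
 * and may walk the nexthops for an unbounded time.
 */
static void noref_lookup(struct noref_fi *fi, int flags,
			 struct noref_result *res)
{
	res->fi = fi;
	if (!(flags & NOREF_FLAG))
		fi->clntref++;
}

/* Release the result; only a referenced lookup has something to drop. */
static void noref_result_put(struct noref_result *res, int flags)
{
	if (!(flags & NOREF_FLAG)) {
		assert(res->fi->clntref > 0);
		res->fi->clntref--;
	}
	res->fi = NULL;
}
```

In this model only the caller that omits NOREF_FLAG keeps the fib_info (and so its nexthop devices) pinned after the lookup returns, which is the case that would prevent an early dev_put.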

> >> Are you sure we are safe to call dev_put() in fib_release_info()
> >> for _all_ paths, especially non-unregister paths? See:
> >
> >         Yep, dev_put is safe there...
> >
> >> commit e49cc0da7283088c5e03d475ffe2fdcb24a6d5b1
> >> Author: Yanmin Zhang <yanmin_zhang@...ux.intel.com>
> >> Date:   Wed May 23 15:39:45 2012 +0000
> >>
> >>     ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
> >
> >         ...as long as we do not set nh_dev to NULL
> >
> 
> OK, fair enough, then I think the best solution here is to move
> the dev_put() from free_fib_info_rcu() to fib_release_info(),
> fib_nh is already removed from hash there anyway.

	free_fib_info still needs to put the references,
that is the reason for the common fib_info_release() in
my example. This is the case in fib_create_info(), where
free_fib_info() is called. The func names in my example can
be corrected, if needed.

> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index da449dd..cb712d1 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -205,8 +205,6 @@ static void free_fib_info_rcu(struct rcu_head *head)
>         struct fib_info *fi = container_of(head, struct fib_info, rcu);
> 
>         change_nexthops(fi) {
> -               if (nexthop_nh->nh_dev)
> -                       dev_put(nexthop_nh->nh_dev);
>                 lwtstate_put(nexthop_nh->nh_lwtstate);
>                 free_nh_exceptions(nexthop_nh);
>                 rt_fibinfo_free_cpus(nexthop_nh->nh_pcpu_rth_output);
> @@ -246,6 +244,14 @@ void fib_release_info(struct fib_info *fi)
>                         if (!nexthop_nh->nh_dev)
>                                 continue;
>                         hlist_del(&nexthop_nh->nh_hash);
> +                       /* We have to release these nh_dev here because a dst
> +                        * could still hold a fib_info via rt->fi, we can't wait
> +                        * for GC, a socket could hold the dst for a long time.
> +                        *
> +                        * This is safe, dev_put() alone does not really free
> +                        * the netdevice, we just have to put the refcnt back.
> +                        */
> +                       dev_put(nexthop_nh->nh_dev);
>                 } endfor_nexthops(fi)
>                 fi->fib_dead = 1;

	Such a solution needs the fib_dead = 1|2 game to
know who dropped the nh_dev reference: fib_release_info (2) or
fib_create_info (1). You cannot remove the dev_put calls
from free_fib_info_rcu.
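
One possible reading of the fib_dead = 1|2 idea, as a userspace sketch only. It assumes that value 2 means "fib_release_info already dropped the nh_dev references" and value 1 means "the fib_info died in the create path with the references still held", so the final free has to check which case applies. The mock_* names and the fixed-size nexthop array are hypothetical and only model the refcount bookkeeping, not RCU or hashing.

```c
#include <assert.h>
#include <stddef.h>

/* Mock net device: only the reference count is modelled. */
struct mock_dev {
	int refcnt;
};

enum { MOCK_MAX_NH = 4 };

/* Mock fib_info with a small fixed array of nexthop devices. */
struct mock_fib_info {
	int fib_dead;	/* 0: live, 1: died in create path, 2: released */
	int nhs;
	struct mock_dev *nh_dev[MOCK_MAX_NH];
};

static void mock_dev_put(struct mock_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/* Attach a nexthop device and take a reference on it. */
static void mock_fib_info_add_nh(struct mock_fib_info *fi,
				 struct mock_dev *dev)
{
	assert(fi->nhs < MOCK_MAX_NH);
	dev->refcnt++;
	fi->nh_dev[fi->nhs++] = dev;
}

/* Drop every nexthop device reference; nh_dev itself is not cleared. */
static void mock_put_nh_devs(struct mock_fib_info *fi)
{
	for (int i = 0; i < fi->nhs; i++)
		if (fi->nh_dev[i])
			mock_dev_put(fi->nh_dev[i]);
}

/* Models fib_release_info(): drops the device refs early, marks 2. */
static void mock_fib_release_info(struct mock_fib_info *fi)
{
	mock_put_nh_devs(fi);
	fi->fib_dead = 2;
}

/* Models the fib_create_info() error path: refs still held, marks 1. */
static void mock_fib_create_failed(struct mock_fib_info *fi)
{
	fi->fib_dead = 1;
}

/* Models free_fib_info_rcu(): puts the refs unless already dropped. */
static void mock_free_fib_info_rcu(struct mock_fib_info *fi)
{
	if (fi->fib_dead != 2)
		mock_put_nh_devs(fi);
}
```

With this bookkeeping each device reference is dropped exactly once on either path, which is why the final free cannot simply lose its dev_put calls.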

>                 fib_info_put(fi);

Regards

--
Julian Anastasov <ja@....bg>
