lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 2 Dec 2021 20:06:29 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     dust.li@...ux.alibaba.com
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        netdev <netdev@...r.kernel.org>,
        Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH net-next 00/19] net: add preliminary netdev refcount tracking

On Thu, Dec 2, 2021 at 7:46 PM dust.li <dust.li@...ux.alibaba.com> wrote:
>
> On Thu, Dec 02, 2021 at 07:13:12AM -0800, Eric Dumazet wrote:
> >On Thu, Dec 2, 2021 at 4:49 AM dust.li <dust.li@...ux.alibaba.com> wrote:
> >>
> >> On Wed, Dec 01, 2021 at 07:21:20PM -0800, Eric Dumazet wrote:
> >> >From: Eric Dumazet <edumazet@...gle.com>
> >> >
> >> >Two first patches add a generic infrastructure, that will be used
> >> >to get tracking of refcount increments/decrements.
> >> >
> >> >The general idea is to be able to precisely pair each decrement with
> >> >a corresponding prior increment. Both share a cookie, basically
> >> >a pointer to private data storing stack traces.
> >> >
> >> >Then a series of 17 patches converts some dev_hold()/dev_put()
> >> >pairs to new hepers : dev_hold_track() and dev_put_track().
> >> >
> >> >Hopefully this will be used by developpers and syzbot to
> >> >root cause bugs that cause netdevice dismantles freezes.
> >>
> >> Hi Eric:
> >>
> >> I really like the idea of adding debug informantion for debugging
> >> refcnt problems, we have met some bugs of leaking netdev refcnt in
> >> the past and debugging them is really HARD !!
> >>
> >> Recently, when investigating a sk_refcnt double free bug in SMC,
> >> I added some debug code a bit similar like this. I'm curious have
> >> you considered expanding the ref tracker infrastructure into other
> >> places like sock_hold/put() or even some hot path ?
> >
> >Sure, this should be absolutely doable with this generic infra.
> >
> >>
> >> AFAIU, ref tracker add a tracker inside each object who want to
> >> hold a refcnt, and stored the callstack into the tracker.
> >> I have 2 questions here:
> >> 1. If we want to use this in the hot path, looks like the overhead
> >>    is a bit heavy ?
> >
> >Much less expensive than lockdep (we use one spinlock per struct
> >ref_tracker_dir), so I think this is something doable.
>
> Yeah, I'm thinking enable this feature by default in our kernel
> so I care about this won't bring regression.

I doubt it, especially if some workload depended on device refcount being percpu
(CONFIG_PCPU_DEV_REFCNT=y)

> I did a simple test in case this might be helpful to other.
>
> Testing was triggered by test_ref_tracker.ko run on a Intel Xeon server.
> Added the following debug code:
>
> @@ -12,8 +13,14 @@ struct ref_tracker {
>         bool                    dead;
>         depot_stack_handle_t    alloc_stack_handle;
>         depot_stack_handle_t    free_stack_handle;
> +       unsigned long long      lat;
>  };
>
> +static void dump_tracker_latency(struct ref_tracker *tracker)
> +{
> +       printk("tracker %pK alloc cost(ns): %llu\n", tracker, tracker->lat);
> +}
> +
>  void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
>  {
>         struct ref_tracker *tracker, *n;
> @@ -23,6 +30,7 @@ void ref_tracker_dir_exit(struct ref_tracker_dir *dir)
>         spin_lock_irqsave(&dir->lock, flags);
>         list_for_each_entry_safe(tracker, n, &dir->quarantine, head) {
>                 list_del(&tracker->head);
> +               dump_tracker_latency(tracker);
>                 kfree(tracker);
>                 dir->quarantine_avail++;
>         }
>  ...
>  }
>
> @@ -70,7 +78,9 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
>         struct ref_tracker *tracker;
>         unsigned int nr_entries;
>         unsigned long flags;
> +       unsigned long long t;
>
> +       t = sched_clock();
>         *trackerp = tracker = kzalloc(sizeof(*tracker), gfp | __GFP_NOFAIL);
>         if (unlikely(!tracker)) {
>                 pr_err_once("memory allocation failure, unreliable refcount tracker.\n");
> @@ -84,6 +94,7 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
>         spin_lock_irqsave(&dir->lock, flags);
>         list_add(&tracker->head, &dir->list);
>         spin_unlock_irqrestore(&dir->lock, flags);
> +       tracker->lat = sched_clock() - t;
>         return 0;
>  }
>
> result:
> [   13.187325] tracker ffff888113285980 alloc cost(ns): 3488
> [   13.188050] tracker ffff888113285800 alloc cost(ns): 2612
> [   13.188760] tracker ffff888113285940 alloc cost(ns): 2763
> [   13.189482] tracker ffff888113285bc0 alloc cost(ns): 1063
> [   13.190227] tracker ffff888113285340 alloc cost(ns): 982
> [   13.190933] tracker ffff888113285f80 alloc cost(ns): 883
> [   13.191638] tracker ffff888113285080 alloc cost(ns): 1006
> [   13.192355] tracker ffff888113285cc0 alloc cost(ns): 1007
> [   13.193105] tracker ffff888113285dc0 alloc cost(ns): 901
> [   13.193811] tracker ffff8881132858c0 alloc cost(ns): 882
> [   13.194522] tracker ffff888113285e40 alloc cost(ns): 1000
> [   13.195253] tracker ffff888113285a00 alloc cost(ns): 896
> [   13.195973] tracker ffff888113285ac0 alloc cost(ns): 903
> [   13.196673] tracker ffff888113285c80 alloc cost(ns): 893
> [   13.197384] tracker ffff888113285d00 alloc cost(ns): 1220
> [   13.198130] tracker ffff888113285140 alloc cost(ns): 979
> [   13.198858] tracker ffff888113285180 alloc cost(ns): 893
> [   13.199571] tracker ffff8881132850c0 alloc cost(ns): 893
>
> The average cost for a ref_tracker_alloc() is about 0.9us
> And most of the time is spent in stack_trace_save().
>
> I'm not sure whether __builtin_return_address(0) is enough ?

Probably not.

Perhaps we can make the depth of the stack trace configurable,
but  this seems premature right now.


>
>
> >
> >> 2. Since we only store 1 callstack in 1 tracker, what if some object
> >>    want to hold and put refcnt in different places ?
> >
> >
> >You can use a tracker on the stack (patch 6/19 net: add net device
> >refcount tracker to ethtool_phys_id())
> >
> >For generic uses of dev_hold(), like in dev_get_by_index(), we will
> >probably have new helpers
> >so that callers can provide where the tracker is put.
> >
> >Or simply use this sequence to convert a generic/untracked reference
> >to a tracked one.
> >
> >dev = dev_get_by_index(),;
> >...
> >
> >p->dev = dev;
> >dev_hold_track(dev, &p->dev_tracker, GFP...)
> >dev_put(dev);
>
> Yeah, I see. This looks good for netdev.
> For sock_hold/put, I feel it's more complicate, I'm not sure if we
> need add lots of tracker.

I fail to see why there would be a lot of trackers.

I am betting less than 50, if you only focus on the usual suspects.

>
> >
> >
> >>
> >> Thanks.
> >>
> >> >
> >> >With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
> >> >some class of bugs, but too late (when too many dev_put()
> >> >were happening).
> >> >
> >> >Eric Dumazet (19):
> >> >  lib: add reference counting tracking infrastructure
> >> >  lib: add tests for reference tracker
> >> >  net: add dev_hold_track() and dev_put_track() helpers
> >> >  net: add net device refcount tracker to struct netdev_rx_queue
> >> >  net: add net device refcount tracker to struct netdev_queue
> >> >  net: add net device refcount tracker to ethtool_phys_id()
> >> >  net: add net device refcount tracker to dev_ifsioc()
> >> >  drop_monitor: add net device refcount tracker
> >> >  net: dst: add net device refcount tracking to dst_entry
> >> >  ipv6: add net device refcount tracker to rt6_probe_deferred()
> >> >  sit: add net device refcount tracking to ip_tunnel
> >> >  ipv6: add net device refcount tracker to struct ip6_tnl
> >> >  net: add net device refcount tracker to struct neighbour
> >> >  net: add net device refcount tracker to struct pneigh_entry
> >> >  net: add net device refcount tracker to struct neigh_parms
> >> >  net: add net device refcount tracker to struct netdev_adjacent
> >> >  ipv6: add net device refcount tracker to struct inet6_dev
> >> >  ipv4: add net device refcount tracker to struct in_device
> >> >  net/sched: add net device refcount tracker to struct Qdisc
> >> >
> >> > include/linux/inetdevice.h  |   2 +
> >> > include/linux/netdevice.h   |  53 ++++++++++++++
> >> > include/linux/ref_tracker.h |  73 +++++++++++++++++++
> >> > include/net/devlink.h       |   3 +
> >> > include/net/dst.h           |   1 +
> >> > include/net/if_inet6.h      |   1 +
> >> > include/net/ip6_tunnel.h    |   1 +
> >> > include/net/ip_tunnels.h    |   3 +
> >> > include/net/neighbour.h     |   3 +
> >> > include/net/sch_generic.h   |   2 +-
> >> > lib/Kconfig                 |   4 ++
> >> > lib/Kconfig.debug           |  10 +++
> >> > lib/Makefile                |   4 +-
> >> > lib/ref_tracker.c           | 140 ++++++++++++++++++++++++++++++++++++
> >> > lib/test_ref_tracker.c      | 116 ++++++++++++++++++++++++++++++
> >> > net/Kconfig                 |   8 +++
> >> > net/core/dev.c              |  10 ++-
> >> > net/core/dev_ioctl.c        |   5 +-
> >> > net/core/drop_monitor.c     |   4 +-
> >> > net/core/dst.c              |   8 +--
> >> > net/core/neighbour.c        |  18 ++---
> >> > net/core/net-sysfs.c        |   8 +--
> >> > net/ethtool/ioctl.c         |   5 +-
> >> > net/ipv4/devinet.c          |   4 +-
> >> > net/ipv4/route.c            |   7 +-
> >> > net/ipv6/addrconf.c         |   4 +-
> >> > net/ipv6/addrconf_core.c    |   2 +-
> >> > net/ipv6/ip6_gre.c          |   8 +--
> >> > net/ipv6/ip6_tunnel.c       |   4 +-
> >> > net/ipv6/ip6_vti.c          |   4 +-
> >> > net/ipv6/route.c            |  10 +--
> >> > net/ipv6/sit.c              |   4 +-
> >> > net/sched/sch_generic.c     |   4 +-
> >> > 33 files changed, 481 insertions(+), 52 deletions(-)
> >> > create mode 100644 include/linux/ref_tracker.h
> >> > create mode 100644 lib/ref_tracker.c
> >> > create mode 100644 lib/test_ref_tracker.c
> >> >
> >> >--
> >> >2.34.0.rc2.393.gf8c9666880-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ