Message-Id: <cover.1638849511.git.lucien.xin@gmail.com>
Date:   Mon,  6 Dec 2021 23:02:03 -0500
From:   Xin Long <lucien.xin@...il.com>
To:     network dev <netdev@...r.kernel.org>
Cc:     Eric Dumazet <edumazet@...gle.com>, davem@...emloft.net,
        kuba@...nel.org,
        Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        Davide Caratti <dcaratti@...hat.com>,
        Paolo Abeni <pabeni@...hat.com>
Subject: [PATCH net-next 0/5] net: add refcnt tracking for some common objects

This patchset provides a simple library (obj_cnt) that counts operations on
arbitrary objects and saves the counts in a global hashtable. Each node in
this hashtable is identified by a calltrace and an object pointer. A
calltrace is the chain of callers of a function, such as dev_hold() called
by:

    inetdev_init+0xff/0x1c0
    inetdev_event+0x4b7/0x600
    raw_notifier_call_chain+0x41/0x50
    register_netdevice+0x481/0x580

and an object pointer would be the dev that this function is accessing:

    dev_hold(dev).

When such a call reaches the object, a node holding calltrace + object +
counter is created if it doesn't exist yet; if it already exists, the
counter in this node is incremented. Pretty simple.
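
The create-or-increment step can be sketched in userspace C as below. This
is only an illustration, not the code in lib/obj_cnt.c: cnt_node,
cnt_record(), MAX_FRAMES and NR_BUCKETS are hypothetical names, the
calltrace is passed in as an array of return addresses, and there is no
locking and no kmem_cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FRAMES 8   /* hypothetical cap on saved frames */
#define NR_BUCKETS 64  /* hypothetical table size */

/* One node per unique (calltrace, object) pair. */
struct cnt_node {
	struct cnt_node *next;
	const void *obj;
	size_t nr_frames;
	uintptr_t frames[MAX_FRAMES];
	unsigned long count;
};

static struct cnt_node *buckets[NR_BUCKETS];

/* Mix the object pointer and the frames into a bucket index. */
static size_t bucket_of(const void *obj, const uintptr_t *frames, size_t n)
{
	uintptr_t h = (uintptr_t)obj;

	for (size_t i = 0; i < n; i++)
		h = h * 31 + frames[i];
	return (size_t)(h % NR_BUCKETS);
}

/* Returns the updated counter, or 0 if a new node cannot be allocated. */
static unsigned long cnt_record(const void *obj, const uintptr_t *frames,
				size_t n)
{
	size_t b = bucket_of(obj, frames, n);
	struct cnt_node *node;

	assert(n <= MAX_FRAMES);

	/* Existing node: same object and same calltrace, just count. */
	for (node = buckets[b]; node; node = node->next) {
		if (node->obj == obj && node->nr_frames == n &&
		    memcmp(node->frames, frames, n * sizeof(*frames)) == 0)
			return ++node->count;
	}

	/* First time this calltrace touches this object: create a node. */
	node = calloc(1, sizeof(*node));
	if (!node)
		return 0;
	node->obj = obj;
	node->nr_frames = n;
	memcpy(node->frames, frames, n * sizeof(*frames));
	node->count = 1;
	node->next = buckets[b];
	buckets[b] = node;
	return 1;
}
```

Repeated calls with the same trace and object only bump one counter, while
a different trace or a different object gets a node of its own.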

So naturally this lib can be used to track the refcnt of any object: all
that is needed is a call to obj_cnt_track() in the places where the object
is held or put. After checking that both this object and this type of
access (hold/put) are being tracked, it counts how many times this call
has operated on the object.
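
The check that precedes the counting could look roughly like the sketch
below. It is an assumption-laden illustration, not obj_cnt_track() itself:
track_filter, should_track() and the ACC_* bit values are hypothetical, and
treating an unset name (NULL) or index (0) as "no condition" is my own
simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical access-type bits; the real values come from the patches. */
#define ACC_HOLD 0x1u
#define ACC_PUT  0x2u

/* Hypothetical mirror of the type/name/index settings. */
struct track_filter {
	unsigned int type_mask; /* which access types are tracked */
	const char *name;       /* NULL means no name condition */
	int index;              /* 0 means no index condition */
};

/* True if this access type is tracked and the object matches by name or index. */
static bool should_track(const struct track_filter *f, unsigned int type,
			 const char *obj_name, int obj_index)
{
	if (!(f->type_mask & type))
		return false;
	if (f->name && obj_name && strcmp(f->name, obj_name) == 0)
		return true;
	if (f->index != 0 && f->index == obj_index)
		return true;
	return false;
}
```

Only when a check of this kind passes would the call go on to look up or
create the node for the object.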

After the 1st lib patch, the other patches add refcnt tracking for
netdev, dst, in6_dev and xfrm_state, and each has an example of how to
use it in its changelog. The common usage is:

    # sysctl -w obj_cnt.control="clear" # clear the old result

    # sysctl -w obj_cnt.type=0x1     # track operations of type 0x1
    # sysctl -w obj_cnt.name=test    # match name == test or
    # sysctl -w obj_cnt.index=1      # match index == 1
    # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace

    ... (reproduce the issue)

    # sysctl -w obj_cnt.control="scan"  # print the new result

Note that I decided to post this patchset after seeing Eric's patchset
for refcnt tracking, as this implementation has some benefits which I
think are worth sharing:

  - it runs fast:
    1. it doesn't create new nodes for repeated calls on the same
       object, which saves memory and time.
    2. the depth of the calltrace to record is configurable; most of the
       time a short calltrace also saves memory and time without
       affecting the analysis.
    3. the use of kmem_cache also contributes to the performance.

  - easy to use:
    1. it doesn't add any members to the object structure; it just places
       an API call in the hold/put functions, which keeps the kernel code
       clean and won't break any ABIs.
    2. three types of matching conditions for tracking can be set up:
       int and string by sysctl and API, and pointer by API.
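
To illustrate the configurable depth: if only the first nr_entries frames
take part in a node's identity, two calls whose traces differ only in
deeper frames share one node. The sketch below assumes exactly that;
key_len() and same_key() are hypothetical helpers, not functions from this
patchset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of frames that take part in the key: at most nr_entries. */
static size_t key_len(size_t captured, size_t nr_entries)
{
	return captured < nr_entries ? captured : nr_entries;
}

/* Two traces map to the same node if their first nr_entries frames match. */
static bool same_key(const uintptr_t *a, size_t na,
		     const uintptr_t *b, size_t nb, size_t nr_entries)
{
	size_t la = key_len(na, nr_entries);
	size_t lb = key_len(nb, nr_entries);

	return la == lb && memcmp(a, b, la * sizeof(*a)) == 0;
}
```

With a depth of 2, traces {1, 2, 3, 4} and {1, 2, 9, 9} compare equal; with
a depth of 4 they do not, so a smaller depth means fewer nodes at the cost
of less context per node.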

This patchset has helped solve quite a few refcnt leaks, from netdev to
dst, in6_dev and xfrm_dst. There are also some difficult cases that we've
addressed with this patchset:

  - some leaks were only reproducible in a customer's environment after
    running for a couple of months; the "probe" data was too huge to save
    and analyse, so saving memory is crucial.

  - some could not be reproduced if the tracking patch ran slowly, e.g.
    when not using kmem cache, so running fast is important.

  - some leaks were a chain: a leak that appeared to be a dev leak was
    caused by a dst or in6_dev leak, and that dst or in6_dev leak was in
    turn caused by another object, so tracking multiple types at the same
    time is effective.

Xin Long (5):
  lib: add obj_cnt infrastructure
  net: track netdev refcnt with obj_cnt
  net: track dst refcnt with obj_cnt
  net: track in6_dev refcnt with obj_cnt
  net: track xfrm_state refcnt with obj_cnt

 include/linux/netdevice.h |  11 ++
 include/linux/obj_cnt.h   |  20 +++
 include/net/addrconf.h    |   7 +-
 include/net/dst.h         |   8 +-
 include/net/sock.h        |   3 +-
 include/net/xfrm.h        |  11 ++
 lib/Kconfig.debug         |   7 +
 lib/Makefile              |   1 +
 lib/obj_cnt.c             | 285 ++++++++++++++++++++++++++++++++++++++
 net/core/dst.c            |   2 +
 10 files changed, 352 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/obj_cnt.h
 create mode 100644 lib/obj_cnt.c

-- 
2.27.0
