netdev - Re: [PATCH net-next 0/5] net: add refcnt tracking for some common objects


Message-ID: <CANn89iJyiDbGdvm-oNKBBk5r3-0+3h+3ui1pL3rOTrz2BOztmA@mail.gmail.com>
Date:   Mon, 6 Dec 2021 20:41:18 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     Xin Long <lucien.xin@...il.com>
Cc:     network dev <netdev@...r.kernel.org>, davem@...emloft.net,
        kuba@...nel.org,
        Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        Davide Caratti <dcaratti@...hat.com>,
        Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net-next 0/5] net: add refcnt tracking for some common objects

On Mon, Dec 6, 2021 at 8:02 PM Xin Long <lucien.xin@...il.com> wrote:
>
> This patchset provides a simple lib(obj_cnt) to count the operations on any
> objects, and saves them into a global hashtable. Each node in this hashtable
> can be identified with a calltrace and an object pointer. A calltrace could
> be a function called from somewhere, like dev_hold() called by:
>
>     inetdev_init+0xff/0x1c0
>     inetdev_event+0x4b7/0x600
>     raw_notifier_call_chain+0x41/0x50
>     register_netdevice+0x481/0x580
>
> and an object pointer would be the dev that this function is accessing:
>
>     dev_hold(dev).
>
> When this call comes to this object, a node including calltrace + object +
> counter will be created if it doesn't exist, and the counter in this node
> will increment if it already exists. Pretty simple.
>
> So naturally this lib can be used to track the refcnt of any objects, all
> it has to do is put obj_cnt_track() to the place where this object is
> held or put. It will count how many times this call has operated this
> object after checking if this object and this type(hold/put) accessing
> are being tracked.
>
> After the 1st lib patch, the other patches add the refcnt tracking for
> netdev, dst, in6_dev and xfrm_state, and each has an example of how to use
> it in the changelog. The common use is:
>
>     # sysctl -w obj_cnt.control="clear" # clear the old result
>
>     # sysctl -w obj_cnt.type=0x1     # track type 0x1 operations
>     # sysctl -w obj_cnt.name=test    # match name == test or
>     # sysctl -w obj_cnt.index=1      # match index == 1
>     # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace
>
>     ... (reproduce the issue)
>
>     # sysctl -w obj_cnt.control="scan"  # print the new result
>
> Note that after seeing Eric's other patchset for refcnt tracking I
> decided to post this patchset. This implementation has some
> benefits which I think are worth sharing:
>

How can your code coexist with ref_tracker?

>   - it runs fast:
>     1. it doesn't create nodes for the repetitive calls to the same
>        objects, and it saves memory and time.
>     2. the depth of the calltrace to record is configurable; most of
>        the time a small calltrace also saves memory and time, and will
>        not affect the analysis.
>     3. kmem_cache used also contributes to the performance.

Points 2/3 can be implemented right away in the ref_tracker infra,
please send patches.

Quite frankly, using a global hash table seems wrong. stack_depot
already has this logic, so why reimplement it?
stack_depot is damn fast (no spinlock in the fast path).
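
To illustrate the deduplication idea behind stack_depot, here is a minimal
userspace sketch. It is not the kernel's implementation or API: the names
depot_save() and depot_nr_traces(), the fixed-size pool, and the FNV-1a hash
are all assumptions made for the example, and the sketch is single-threaded,
so it does not show the lockless fast path. What it does show is that saving
the same calltrace twice returns the same small handle instead of storing
the frames again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 16   /* hypothetical cap on frames per trace */
#define NR_BUCKETS 64   /* hypothetical hash table size */
#define MAX_TRACES 128  /* hypothetical pool capacity */

/* One stored trace. Handles are pool indices offset by one, so 0 means failure. */
struct trace_rec {
	uintptr_t frames[MAX_FRAMES];
	unsigned int nr;
	uint32_t hash;
	int next;               /* next record in the same bucket, or -1 */
};

static struct trace_rec pool[MAX_TRACES];
static int buckets[NR_BUCKETS];
static int pool_used;
static int initialized;

/* FNV-1a over the raw bytes of the frame array. */
static uint32_t hash_trace(const uintptr_t *frames, unsigned int nr)
{
	const unsigned char *p = (const unsigned char *)frames;
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < nr * sizeof(*frames); i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Intern a trace: return the existing handle if already stored, else store it. */
unsigned int depot_save(const uintptr_t *frames, unsigned int nr)
{
	if (!initialized) {
		for (int i = 0; i < NR_BUCKETS; i++)
			buckets[i] = -1;
		initialized = 1;
	}
	if (nr == 0 || nr > MAX_FRAMES)
		return 0;

	uint32_t h = hash_trace(frames, nr);
	int b = (int)(h % NR_BUCKETS);

	/* Lookup: an identical trace costs no new storage. */
	for (int i = buckets[b]; i != -1; i = pool[i].next) {
		if (pool[i].hash == h && pool[i].nr == nr &&
		    memcmp(pool[i].frames, frames, nr * sizeof(*frames)) == 0)
			return (unsigned int)i + 1;
	}

	if (pool_used == MAX_TRACES)
		return 0;

	/* Miss: copy the frames once and link the record into its bucket. */
	int idx = pool_used++;
	memcpy(pool[idx].frames, frames, nr * sizeof(*frames));
	pool[idx].nr = nr;
	pool[idx].hash = h;
	pool[idx].next = buckets[b];
	buckets[b] = idx;
	return (unsigned int)idx + 1;
}

/* Number of distinct traces stored so far. */
int depot_nr_traces(void)
{
	return pool_used;
}
```

With this shape, a per-object tracker only needs to keep the handle, and
repeated calls from the same call site share one stored trace.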

Seeing that your patches add chunks in lib/obj_cnt.c, I do not see how
you can claim this is generic code.

I don't know, it seems very strange to send this patch series now that I
have done about 60 patches on these issues.

And by doing this work, I have already found two bugs in our stack.

You can be sure syzbot will send us many reports, most syzbot repros
use a very limited number of objects.

About performance: you use a single spinlock to protect your hash table.
In my implementation, there is a spinlock per 'directory' (e.g. one
spinlock per struct net_device, one spinlock per struct net), which is
more scalable.
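
A rough userspace sketch of the per-directory locking idea follows. It is
not the ref_tracker code: struct tracker_dir, tracker_hold(), tracker_put()
and tracker_live() are hypothetical names, and a pthread mutex stands in for
a kernel spinlock. The point is only that each tracked object embeds its own
lock, so holds and puts on one object never contend with another object's.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-object directory: the lock and the state it protects
 * live together inside the object being tracked. */
struct tracker_dir {
	pthread_mutex_t lock;   /* protects nr_live for this object only */
	long nr_live;           /* references currently held */
};

/* Stand-in for an object such as a net_device that embeds its directory. */
struct fake_netdev {
	const char *name;
	struct tracker_dir refs;
};

void tracker_dir_init(struct tracker_dir *dir)
{
	int err = pthread_mutex_init(&dir->lock, NULL);

	assert(err == 0);
	(void)err;
	dir->nr_live = 0;
}

/* Record one hold; only this directory's lock is taken. */
void tracker_hold(struct tracker_dir *dir)
{
	pthread_mutex_lock(&dir->lock);
	dir->nr_live++;
	pthread_mutex_unlock(&dir->lock);
}

/* Record one put; returns -1 if there was no matching hold. */
int tracker_put(struct tracker_dir *dir)
{
	int ret = 0;

	pthread_mutex_lock(&dir->lock);
	if (dir->nr_live > 0)
		dir->nr_live--;
	else
		ret = -1;       /* unbalanced put: a bug worth reporting */
	pthread_mutex_unlock(&dir->lock);
	return ret;
}

/* Snapshot of the live reference count for this object. */
long tracker_live(struct tracker_dir *dir)
{
	pthread_mutex_lock(&dir->lock);
	long n = dir->nr_live;
	pthread_mutex_unlock(&dir->lock);
	return n;
}
```

By contrast, a single global table forces every hold and put in the system
through the same lock, whatever object it touches.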

My tests have not shown a significant cost of the ref_tracker
(the major cost comes from stack_trace_save() which you also use)