linux-kernel - Re: [RFC 00/12] locking: Separate lock tracepoints from lockdep/lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAM9d7ci=N2NVj57k=W0ebqBzfW+ThBqYSrx-CZbgwGcbOSrEGA@mail.gmail.com>
Date:   Wed, 9 Feb 2022 11:22:44 -0800
From:   Namhyung Kim <namhyung@...nel.org>
To:     Waiman Long <longman@...hat.com>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        rostedt <rostedt@...dmis.org>,
        Byungchul Park <byungchul.park@....com>,
        Radoslaw Burny <rburny@...gle.com>, Tejun Heo <tj@...nel.org>,
        rcu <rcu@...r.kernel.org>, cgroups <cgroups@...r.kernel.org>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        intel-gfx@...ts.freedesktop.org,
        "Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [RFC 00/12] locking: Separate lock tracepoints from
 lockdep/lock_stat (v1)

Hello,

On Wed, Feb 9, 2022 at 11:02 AM Waiman Long <longman@...hat.com> wrote:
>
> On 2/9/22 13:29, Mathieu Desnoyers wrote:
> > ----- On Feb 9, 2022, at 1:19 PM, Waiman Long longman@...hat.com wrote:
> >
> >> On 2/9/22 04:09, Peter Zijlstra wrote:
> >>> On Tue, Feb 08, 2022 at 10:41:56AM -0800, Namhyung Kim wrote:
> >>>
> >>>> Eventually I'm mostly interested in the contended locks only and I
> >>>> want to reduce the overhead in the fast path.  By moving that, it'd be
> >>>> easy to track contended locks with timing by using two tracepoints.
> >>> So why not put in two new tracepoints and call it a day?
> >>>
> >>> Why muck about with all that lockdep stuff just to preserve the name
> >>> (and in the process continue to blow up data structures etc..). This
> >>> leaves distros in a bind, will they enable this config and provide
> >>> tracepoints while bloating the data structures and destroying things
> >>> like lockref (which relies on sizeof(spinlock_t)), or not provide this
> >>> at all.
> >>>
> >>> Yes, the name is convenient, but it's just not worth it IMO. It makes
> >>> the whole proposition too much of a trade-off.
> >>>
> >>> Would it not be possible to reconstruct enough useful information from
> >>> the lock callsite?
> >>>
> >> I second that as I don't want to see the size of a spinlock exceeds 4
> >> bytes in a production system.
> >>
> >> Instead of storing additional information (e.g. lock name) directly into
> >> the lock itself. Maybe we can store it elsewhere and use the lock
> >> address as the key to locate it in a hash table. We can certainly extend
> >> the various lock init functions to do that. It will be trickier for
> >> statically initialized locks, but we can probably find a way to do that too.
> > If we go down that route, it would be nice if we can support a few different
> > use-cases for various tracers out there.
> >
> > One use-case (a) requires the ability to query the lock name based on its address as key.
> > For this a hash table is a good fit. This would allow tracers like ftrace to
> > output lock names in its human-readable output which is formatted within the kernel.
> >
> > Another use-case (b) is to be able to "dump" the lock { name, address } tuples
> > into the trace stream (we call this statedump events in lttng), and do the
> > translation from address to name at post-processing. This simply requires
> > that this information is available for iteration for both the core kernel
> > and module locks, so the tracer can dump this information on trace start
> > and module load.
> >
> > Use-case (b) is very similar to what is done for the kernel tracepoints. Based
> > on this, implementing the init code that iterates on those sections and populates
> > a hash table for use-case (a) should be easy enough.
>
> Yes, that are good use cases for this type of functionality. I do need
> to think about how to do it for statically initialized lock first.

Thank you all for the review and good suggestions.

I'm also concerning dynamic allocated locks in a data structure.
If we keep the info in a hash table, we should delete it when the
lock is gone.  I'm not sure we have a good place to hook it up all.

Thanks,
Namhyung