Message-ID: <CAM9d7cgb+-treat5Mf_hitEjLDJH8B-RFZYoDxzaGXu0VbNr8A@mail.gmail.com>
Date: Thu, 29 Feb 2024 13:53:21 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Ian Rogers <irogers@...gle.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-perf-users@...r.kernel.org, Song Liu <song@...nel.org>, bpf@...r.kernel.org
Subject: Re: [PATCH v2] perf lock contention: Account contending locks too

On Thu, Feb 29, 2024 at 9:23 AM Arnaldo Carvalho de Melo
<acme@...nel.org> wrote:
>
> On Wed, Feb 28, 2024 at 01:19:12PM -0800, Namhyung Kim wrote:
> > On Wed, Feb 28, 2024 at 12:16 PM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > > On Wed, Feb 28, 2024 at 12:01:55PM -0800, Namhyung Kim wrote:
> > > > On Wed, Feb 28, 2024 at 4:22 AM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > > > > On Tue, Feb 27, 2024 at 09:33:35PM -0800, Namhyung Kim wrote:
> > > > > > Currently it accounts the contention using the delta between timestamps in
> > > > > > the lock:contention_begin and lock:contention_end tracepoints.  But that means
> > > > > > the lock should see both events during the monitoring period.
>
> > > > > > Actually there are 4 cases that happen with the monitoring:
>
> > > > > >                 monitoring period
> > > > > >             /                       \
> > > > > >             |                       |
> > > > > >  1:  B------+-----------------------+--------E
> > > > > >  2:    B----+-------------E         |
> > > > > >  3:         |           B-----------+----E
> > > > > >  4:         |     B-------------E   |
> > > > > >             |                       |
> > > > > >             t0                      t1
>
> > > > > > where B and E mean contention BEGIN and END, respectively.  So it only
> > > > > > accounts for case 4 now.  There seems to be no way to handle case 1.
> > > > > > Case 2 might be handled if it saved the timestamp (t0), but it lacks
> > > > > > the information from the BEGIN event, notably the flags which show the
> > > > > > lock type.  Also it could be a nested lock, which it currently ignores.
> > > > > > So I think we should ignore case 2.
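The four cases above can be sketched as a small classifier over the
interval endpoints.  This is an illustrative sketch, not code from the
perf source; the enum and function names are made up:

```c
#include <assert.h>

/* Classify a contention interval [b, e] against the monitoring
 * window [t0, t1], mirroring the four cases in the diagram above.
 * Hypothetical helper -- names are illustrative only. */
enum contention_case {
	CASE_SPANS  = 1,	/* 1: B before t0, E after t1: no event seen */
	CASE_HEAD   = 2,	/* 2: B before t0, E inside: only END seen   */
	CASE_TAIL   = 3,	/* 3: B inside, E after t1: only BEGIN seen  */
	CASE_INSIDE = 4,	/* 4: B and E both inside: accounted today   */
};

static enum contention_case classify(unsigned long b, unsigned long e,
				     unsigned long t0, unsigned long t1)
{
	if (b < t0)
		return e > t1 ? CASE_SPANS : CASE_HEAD;
	return e > t1 ? CASE_TAIL : CASE_INSIDE;
}
```

Only CASE_INSIDE has both timestamps available from the tracepoints, so
it is the only one that can be accounted from the deltas alone.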
>
> > > > > Perhaps have a separate output listing locks that were found to be with
> > > > > at least tE - t0 time, with perhaps a backtrace at that END time?
>
> > > > Do you mean long contentions in case 3?  I'm not sure what you
> > > > mean by tE, but they started after t0 so cannot be greater
>
> > > case 2
>
> > >                 monitoring period
> > >             /                       \
> > >             |                       |
> > >  2:    B----+-------------E         |
> > >             |             |         |
> > >             t0            tE        t1
> > >
> > > We get a notification for event E, right? We don't have one for B,
> > > because it happened before we were monitoring.
> >
> > Ah, ok.  But there could be too many events in case 2 and
> > I don't think users want to see them all.  And they don't have
>
> So maybe a summary, something like:
>
>   N locks that were locked before this session started have been
>   released, no further info besides this histogram of in-session
>   durations:
>
>     0-N units of time: ++
>   N+1-M units of time: ++++
>     ...

Summary could work.  But I'm not sure about the histogram
since different locks would have different behavior - spinlock
vs. mutex/semaphore.  Maybe it's more meaningful when
you have filters or separate histograms for each lock.
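A log2-bucketed summary along the lines Arnaldo suggests could look
roughly like the sketch below; the bucket layout and all names here are
assumptions, not perf code:

```c
#include <assert.h>

/* Hypothetical log2 histogram of the in-session portion (tE - t0)
 * of case-2 contentions.  The bucket layout is an assumption. */
#define NR_BUCKETS 32

static unsigned int hist[NR_BUCKETS];

/* Index of the highest set bit, i.e. floor(log2(d)); 0 maps to bucket 0. */
static unsigned int bucket(unsigned long d)
{
	unsigned int b = 0;

	while (d >>= 1)
		b++;
	return b < NR_BUCKETS ? b : NR_BUCKETS - 1;
}

/* Account one case-2 contention that ended at te within [t0, t1]. */
static void account_partial(unsigned long te, unsigned long t0)
{
	hist[bucket(te - t0)]++;
}
```

Separate histograms per lock type would still require the flags, which
case 2 lacks, consistent with the concern above.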


>
> > flags.  But maybe we can update the flag when it sees exactly
> > the same callstack later.
>
>   The callstack, if it goes all the way to userspace, may include the
> workload targeted on the command line (some pid, tid, CPU, etc.) and thus
> would point to things the user is probably interested in, rather than some
> other lock that may affect it only indirectly.

It doesn't collect user callstacks yet since that requires recording the
memory address space of every process - basically FORK and MMAP events.
Maybe we could use callstacks with build-IDs and offsets, but that also
requires a mapping from build-ID to binary somewhere.
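A build-ID-keyed callstack entry might look roughly like the struct
below; the layout and names are illustrative assumptions, not the perf
implementation:

```c
#include <assert.h>
#include <stdint.h>

#define BUILD_ID_SIZE 20	/* SHA-1 build-id, as in ELF build-id notes */

/* Hypothetical callstack entry keyed by build-ID + offset instead of a
 * virtual address, so symbolization can happen later without tracking
 * per-process FORK/MMAP events. */
struct callstack_entry {
	uint8_t  build_id[BUILD_ID_SIZE];
	uint64_t offset;	/* file offset within the binary */
};
```

Symbolizing later still needs the build-ID-to-binary mapping mentioned
above, e.g. something like perf's build-id cache.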

Anyway, it's good to add more features to it.  Let me process this
patch first and think about the rest later. :)

Thanks,
Namhyung
