Message-ID: <CACT4Y+Z4Yr_jNV56NvdiJeUzMDmubFL5BtkACDXWaTGxZC2deg@mail.gmail.com>
Date: Sat, 31 May 2025 08:31:07 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Ian Rogers <irogers@...gle.com>, 
	Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-perf-users@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC/PATCH] perf report: Support latency profiling in system-wide mode

On Sat, 31 May 2025 at 00:05, Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Fri, May 30, 2025 at 07:50:45AM +0200, Dmitry Vyukov wrote:
> > On Wed, 28 May 2025 at 20:38, Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > Hello,
> > >
> > > On Tue, May 27, 2025 at 09:14:34AM +0200, Dmitry Vyukov wrote:
> > > > On Wed, 21 May 2025 at 09:30, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > > > >
> > > > > > Maybe we can use this
> > > > > > only for the frequency mode which means user didn't use -c option or
> > > > > > similar in the event description.
> > > >
> > > >
> > > > All-in-all I think the best option for now is using CPU IDs to track
> > > > parallelism as you suggested, but be more precise with idle detection.
> > > > 2 passes over the trace may be fine to detect idle points. I see the
> > > > most time now spent in hist_entry__cmp, which accesses other entries
> > > > and is like a part of O(N*logN) processing, so a simple O(N) pass
> > > > shouldn't slow it down much.
> > > > That's what I would try. But I would also try to assess the precision
> > > > of this approach by comparing with results of using explicit switch
> > > > events.
> > >
> > > It's not clear to me how you want to maintain the idle info in the 2
> > > pass approach.  Please feel free to propose something based on this
> > > work.
> >
> >
> > What part of it is unclear?
> >
> > Basically, in the first pass we only mark events as sched_out/in.
> > When we don't see samples on a CPU for 2*period, we mark the previous
> > sample on the CPU as sched_out:
> >
> >   // Assuming the period is stable across time and CPUs.
> >   for_each_cpu(cpu) {
> >       if (current[cpu]->last_timestamp + 2*period < sample->timestamp) {
> >           if (current[cpu]->thread != idle)
> >               current[cpu]->last_sample->sched_out = true;
> >       }
> >   }
> >
> >   leader = machine__findnew_thread(machine, sample->pid);
> >   if (current[sample->cpu]->thread != leader) {
> >     current[sample->cpu]->last_sample->sched_out = true;
> >     sample->sched_in = true;
> >   }
> >   current[sample->cpu]->thread = leader;
> >   current[sample->cpu]->last_sample = sample;
> >   current[sample->cpu]->last_timestamp = sample->timestamp;
>
> Oh, you wanted to save the info in the sample.  But I'm afraid it won't
> work since it's stack allocated for one-time use in the
> perf_session__deliver_event().

No, I just showed the algorithm. I don't know perf well enough to say
how to implement it.
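
But roughly, a minimal sketch of one way it could look: keep the
bookkeeping in tool-private per-CPU state and record the sched_in/out
decision in a small per-sample mark that outlives
perf_session__deliver_event(). The names below (sched_mark,
cpu_sched_state, first_pass_sample) are made up for illustration,
not existing perf code:

  #include <stdbool.h>
  #include <stdint.h>
  #include <sys/types.h>

  /* Decision recorded for one sample, kept alive until the second pass. */
  struct sched_mark {
          uint64_t timestamp;
          uint32_t cpu;
          bool sched_in;
          bool sched_out;
  };

  /* Per-CPU state carried across samples; assumed initialized with
     cur_pid = -1 and last_mark = NULL before the first pass. */
  struct cpu_sched_state {
          pid_t cur_pid;           /* -1 => idle/unknown */
          uint64_t last_timestamp;
          struct sched_mark *last_mark;
  };

  /* First pass: called for every sample in timestamp order. */
  static void first_pass_sample(struct cpu_sched_state *cpus, int nr_cpus,
                                uint64_t period, uint32_t cpu, pid_t pid,
                                uint64_t ts, struct sched_mark *mark)
  {
          int c;

          /* Idle detection: no samples on a CPU for 2*period means the
             previous sample on that CPU was effectively the switch-out. */
          for (c = 0; c < nr_cpus; c++) {
                  if (cpus[c].cur_pid >= 0 &&
                      cpus[c].last_timestamp + 2 * period < ts) {
                          cpus[c].last_mark->sched_out = true;
                          cpus[c].cur_pid = -1;
                  }
          }

          /* Context switch: a different process shows up on this CPU. */
          if (cpus[cpu].cur_pid != pid) {
                  if (cpus[cpu].last_mark)
                          cpus[cpu].last_mark->sched_out = true;
                  mark->sched_in = true;
          }

          cpus[cpu].cur_pid = pid;
          cpus[cpu].last_mark = mark;
          cpus[cpu].last_timestamp = ts;
  }

The second pass would then walk the marks in timestamp order and update
the per-process parallelism counter exactly as in the pseudocode, without
ever touching the stack-allocated struct perf_sample.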

> > On the second pass we use the precomputed sched_in/out to calculate parallelism:
> >
> >   leader = machine__findnew_thread(machine, sample->pid);
> >   if (sample->sched_in)
> >     leader->parallelism++;
> >   sample->parallelism = leader->parallelism;
> >   if (sample->sched_out)
> >     leader->parallelism--;
> >
> > This is more precise b/c we don't consider a thread running for
> > 2*period after it stopped running.
>
> IIUC it can make some samples have less parallelism right before
> they go idle.
>
> > A more precise approach would probably be to consider the thread
> > running for 0.5*period after the last sample (and similarly for
> > 0.5*period before the first sample), but it would require injecting
> > sched_in/out events into the trace at these points.
>
> Yep, that will fix the issue.  But then how to inject the events is the
> problem.

Yes, but incorrect data is an incomparably worse problem than writing
a bit of code.
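
For the 0.5*period variant, one way to avoid injecting real events into
the trace could be to synthesize the switch points into a tool-private,
timestamp-sorted list during the first pass and merge them with the
samples during the second pass. Everything below (boundary_event,
emit_idle_gap, apply_boundaries) is a hypothetical sketch, not existing
perf infrastructure:

  #include <stdbool.h>
  #include <stdint.h>
  #include <sys/types.h>

  /* Hypothetical switch point synthesized by the tool, not a real
     PERF_RECORD_* event written into the trace. */
  struct boundary_event {
          uint64_t timestamp;
          pid_t pid;               /* process (leader) affected */
          bool sched_in;           /* true: +1 parallelism, false: -1 */
  };

  /* First pass, when an idle gap on a CPU is detected: the thread is
     assumed to stop running period/2 after its last sample and to start
     running period/2 before its next sample. Boundaries from different
     CPUs are appended out of order, so the list needs one sort by
     timestamp before the second pass. */
  static void emit_idle_gap(struct boundary_event *out, int *nr,
                            pid_t pid, uint64_t last_ts, uint64_t next_ts,
                            uint64_t period)
  {
          out[(*nr)++] = (struct boundary_event) {
                  .timestamp = last_ts + period / 2,
                  .pid = pid,
                  .sched_in = false,
          };
          out[(*nr)++] = (struct boundary_event) {
                  .timestamp = next_ts - period / 2,
                  .pid = pid,
                  .sched_in = true,
          };
  }

  /* Second pass: before handling a sample with timestamp 'ts', apply all
     synthesized boundaries up to that point to the per-process counters. */
  static void apply_boundaries(const struct boundary_event *evs, int nr,
                               int *pos, uint64_t ts,
                               void (*adjust)(pid_t pid, int delta))
  {
          while (*pos < nr && evs[*pos].timestamp <= ts) {
                  adjust(evs[*pos].pid, evs[*pos].sched_in ? +1 : -1);
                  (*pos)++;
          }
  }

Whether the extra bookkeeping is worth it over the simpler 2*period
cutoff would still need to be checked against results obtained from
explicit switch events, as discussed earlier in the thread.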
