Message-ID: <CACT4Y+ameQFd3n=u+bjd+vKR6svShp3NNQzjsUo_UUBCZPzrBw@mail.gmail.com>
Date: Tue, 6 May 2025 07:55:25 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Ian Rogers <irogers@...gle.com>, 
	Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Peter Zijlstra <peterz@...radead.org>, 
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-perf-users@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC/PATCH] perf report: Support latency profiling in system-wide mode

On Tue, 6 May 2025 at 07:30, Namhyung Kim <namhyung@...nel.org> wrote:
>
> Hello,
>
> On Mon, May 05, 2025 at 10:08:17AM +0200, Dmitry Vyukov wrote:
> > On Sat, 3 May 2025 at 02:36, Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > When profiling a target process (and its children), it's
> > > straightforward to track parallelism using sched-switch info.  The
> > > parallelism is kept at the machine level in this case.
> > >
> > > But when profiling multiple processes, as in system-wide mode, it's
> > > not clear how to apply the (machine-level) parallelism to different
> > > tasks.  That's why latency profiling was disabled for system-wide
> > > mode.
> > >
> > > But it should be possible to track parallelism in each process, and
> > > it'd be useful for profiling latency issues in multi-threaded
> > > programs.  So this patch tries to enable it.
> > >
> > > However, using sched-switch info can be a problem since it may emit
> > > a lot more data, with more chances of losing data when perf cannot
> > > keep up with it.
> > >
> > > Instead, it can maintain the current process for each CPU as it
> > > sees samples.
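The sample-based tracking described above could be sketched roughly as follows (a hypothetical Python illustration of the idea only, not perf's actual code; all names are made up):

```python
# Hypothetical sketch, not perf code: remember which task each CPU was
# last seen running, and count on how many CPUs each pid is currently
# attributed -- that count is the pid's parallelism when a sample arrives.
from collections import defaultdict

class ParallelismTracker:
    def __init__(self):
        self.cpu_curr = {}               # cpu -> pid last sampled on it
        self.running = defaultdict(int)  # pid -> CPUs currently attributed

    def on_sample(self, cpu, pid):
        prev = self.cpu_curr.get(cpu)
        if prev != pid:
            if prev is not None:
                self.running[prev] -= 1  # previous task left this CPU
            self.running[pid] += 1
            self.cpu_curr[cpu] = pid
        return self.running[pid]         # pid's parallelism at this sample

t = ParallelismTracker()
t.on_sample(0, 42)   # pid 42 on CPU 0 -> parallelism 1
t.on_sample(1, 42)   # pid 42 also on CPU 1 -> parallelism 2
t.on_sample(1, 0)    # idle task sampled on CPU 1 -> pid 42 back to 1
```

Note the sketch only ever corrects a CPU's state when a new sample arrives there, which is exactly what the questions below probe.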
> >
> > Interesting.
> >
> > A few questions:
> > 1. Do we always see a CPU sample when a CPU becomes idle? Otherwise we
> > will think that the last thread keeps running on that CPU arbitrarily
> > long, when it actually doesn't.
>
> No, a sample for the idle task is not guaranteed.  So right, it can
> miscalculate the parallelism for the last task.  If we could emit
> sched-switches only when a CPU goes to the idle task, it'd be accurate.

Then I think the profile can be significantly off if the system wasn't
~100% loaded, right?

> > 2. If yes, can we also lose that "terminating" event when a CPU becomes
> > idle? If yes, then it looks equivalent to missing a context switch
> > event.
>
> I'm not sure what you are asking.  When it loses some records because
> the buffer is full, it'll still see the task of the last sample on each
> CPU.  Maybe we want to reset the current task after PERF_RECORD_LOST.

This probably does not matter much if the answer to question 1 is No.

But what I was asking is the following:

let's say we have samples:
Sample 1 for Pid 42 on Cpu 10
Sample 2 for idle task on Cpu 10
... no samples for some time on Cpu 10 ...

When we process sample 2, we decrement the counter of running tasks
for Pid 42, right?
Now if sample 2 is lost, then we don't do the decrement and the
accounting becomes off.
In a sense this is equivalent to the problem of losing a context switch
event.
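This failure mode, and the reset-on-PERF_RECORD_LOST mitigation mentioned earlier in the thread, can be shown with a self-contained toy model (hypothetical code, not perf's):

```python
# Toy model, not perf code: the per-CPU "current task" bookkeeping only
# corrects itself when a new sample arrives on that CPU, so a dropped
# idle sample leaves the old pid counted as running indefinitely.
cpu_curr = {}   # cpu -> pid of the last sample seen there
running = {}    # pid -> number of CPUs attributed to it

def on_sample(cpu, pid):
    prev = cpu_curr.get(cpu)
    if prev != pid:
        if prev is not None:
            running[prev] -= 1
        running[pid] = running.get(pid, 0) + 1
        cpu_curr[cpu] = pid

def on_lost(cpu):
    # One possible mitigation: on PERF_RECORD_LOST, forget the CPU's
    # state so a stale pid isn't carried forward after dropped records.
    prev = cpu_curr.pop(cpu, None)
    if prev is not None:
        running[prev] -= 1

on_sample(10, 42)        # sample 1: Pid 42 on Cpu 10
# on_sample(10, 0)       # sample 2 (idle task) was lost in the buffer
# ... no samples on Cpu 10 for a while: running[42] stays at 1 ...
on_lost(10)              # after a LOST record, drop the stale state
# running[42] is back to 0
```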


> > 3. Does this mode kick in even for non system-wide profiles (collected
> > w/o context switch events)? If yes, do we properly understand when a
> > thread stops running for such profiles? How do we do that? There won't
> > be samples for idle/other tasks.
>
> For non-system-wide profiles, the problem is that it cannot know when
> the current task is scheduled out, so it cannot decrease the
> parallelism count.  So this approach cannot work and sched-switch info
> is required.

Where does the patch check that this mode is used only for system-wide profiles?
Is it that PERF_SAMPLE_CPU is present only for system-wide profiles?
