Message-ID: <aBmei7cMf-MzzX5W@google.com>
Date: Mon, 5 May 2025 22:30:51 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC/PATCH] perf report: Support latency profiling in
 system-wide mode

Hello,

On Mon, May 05, 2025 at 10:08:17AM +0200, Dmitry Vyukov wrote:
> On Sat, 3 May 2025 at 02:36, Namhyung Kim <namhyung@...nel.org> wrote:
> >
> > When it profiles a target process (and its children), it's
> > straight-forward to track parallelism using sched-switch info.  The
> > parallelism is kept in machine-level in this case.
> >
> > But when it profiles multiple processes like in the system-wide mode,
> > it might not be clear how to apply the (machine-level) parallelism to
> > different tasks.  That's why it disabled the latency profiling for
> > system-wide mode.
> >
> > But it should be able to track parallelism in each process and it'd
> > be useful to profile latency issues in multi-threaded programs.  So this
> > patch tries to enable it.
> >
> > However using sched-switch info can be a problem since it may emit a lot
> > more data and more chances for losing data when perf cannot keep up with
> > it.
> >
> > Instead, it can maintain the current process for each CPU when it sees
> > samples.
> 
> Interesting.
> 
> Few questions:
> 1. Do we always see a CPU sample when a CPU becomes idle? Otherwise we
> will think that the last thread runs on that CPU for arbitrary long,
> when it's actually not.

No, a sample for the idle task is not guaranteed.  So you're right, it
can miscalculate the parallelism for the last task on that CPU.  If we
could emit sched-switch records only when a CPU switches to the idle
task, it'd be accurate.
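
A minimal sketch of the sample-based tracking discussed above, and of how
a missing idle sample inflates the count.  This is illustrative Python,
not the perf code: the class name, the method names and the use of pid 0
for the idle task are all assumptions.

```python
from collections import defaultdict

IDLE_PID = 0  # assumption: the idle task shows up with pid 0


class ParallelismTracker:
    """Hypothetical per-process parallelism tracker driven only by samples."""

    def __init__(self):
        self.cpu_pid = {}                    # cpu -> pid of the last sample seen there
        self.parallelism = defaultdict(int)  # pid -> number of CPUs attributed to it

    def on_sample(self, cpu, pid):
        prev = self.cpu_pid.get(cpu)
        if prev != pid:
            # The CPU changed owner: move its contribution to the new process.
            if prev is not None and prev != IDLE_PID:
                self.parallelism[prev] -= 1
            if pid != IDLE_PID:
                self.parallelism[pid] += 1
            self.cpu_pid[cpu] = pid
        return 0 if pid == IDLE_PID else self.parallelism[pid]


t = ParallelismTracker()
t.on_sample(0, 100)
t.on_sample(1, 100)
# If CPU 1 now goes idle without producing a sample, pid 100 is still
# counted on two CPUs when the next sample arrives on CPU 0.
stale = t.on_sample(0, 100)
# An explicit idle sample (or a switch-to-idle record) corrects the count.
t.on_sample(1, IDLE_PID)
fixed = t.parallelism[100]
```

Here `stale` is 2 and `fixed` is 1, which is the overestimation that a
switch-to-idle record would avoid.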


> 2. If yes, can we also lose that "terminating" event when a CPU becomes
> idle? If yes, then it looks equivalent to missing a context switch
> event.

I'm not sure what you are asking.  When it loses some records because
the buffer is full, it'll keep attributing each CPU to the task of the
last sample it saw there.  Maybe we want to reset the current task after
PERF_RECORD_LOST.
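
The reset idea could look roughly like the sketch below.  It is
illustrative Python with hypothetical names, and it assumes the simplest
policy: forget the current task on every CPU when a lost record is seen.
A real implementation might reset only the CPUs whose buffer overflowed.

```python
class LossAwareTracker:
    """Hypothetical tracker that drops its state when records are lost."""

    def __init__(self):
        self.cpu_pid = {}      # cpu -> pid of the last sample seen there
        self.parallelism = {}  # pid -> number of CPUs attributed to it

    def on_sample(self, cpu, pid):
        prev = self.cpu_pid.get(cpu)
        if prev != pid:
            if prev is not None:
                self.parallelism[prev] -= 1
            self.parallelism[pid] = self.parallelism.get(pid, 0) + 1
            self.cpu_pid[cpu] = pid
        return self.parallelism[pid]

    def on_lost(self):
        # Assumption: after a loss nothing is known about who runs where,
        # so start over instead of trusting stale per-CPU state.
        self.cpu_pid.clear()
        self.parallelism.clear()


t = LossAwareTracker()
t.on_sample(0, 100)
t.on_sample(1, 100)
before_loss = t.on_sample(0, 100)
t.on_lost()
after_loss = t.on_sample(0, 100)
```

With this policy `before_loss` is 2 and `after_loss` is 1: the count is
rebuilt from samples that arrive after the loss.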


> 3. Does this mode kick in even for non system-wide profiles (collected
> w/o context switch events)? If yes, do we properly understand when a
> thread stops running for such profiles? How do we do that? There won't
> be samples for idle/other tasks.

For non system-wide profiles, the problem is that perf cannot know when
the current task is scheduled out, so it cannot decrease the parallelism
count.  So this approach cannot work there and sched-switch info is
required.
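
For contrast, a sketch of the switch-driven accounting that per-process
profiles depend on.  This is illustrative Python with hypothetical names:
the counter only goes down on an explicit switch-out event, which
samples alone cannot provide.

```python
class SwitchTracker:
    """Hypothetical machine-level parallelism counter fed by switch events."""

    def __init__(self):
        self.parallelism = 0

    def on_switch(self, switch_out):
        if switch_out:
            # A profiled thread left a CPU; clamp in case the matching
            # switch-in happened before recording started.
            self.parallelism = max(0, self.parallelism - 1)
        else:
            self.parallelism += 1
        return self.parallelism


t = SwitchTracker()
t.on_switch(switch_out=False)          # thread A scheduled in
peak = t.on_switch(switch_out=False)   # thread B scheduled in
after = t.on_switch(switch_out=True)   # thread A scheduled out
```

Here `peak` is 2 and `after` is 1.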

Thanks,
Namhyung

