Message-ID: <87ldujkjsi.fsf@linux.intel.com>
Date: Thu, 06 Feb 2025 10:30:37 -0800
From: Andi Kleen <ak@...ux.intel.com>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: namhyung@...nel.org,  irogers@...gle.com,
  linux-perf-users@...r.kernel.org,  linux-kernel@...r.kernel.org,  Arnaldo
 Carvalho de Melo <acme@...nel.org>
Subject: Re: [PATCH v5 0/8] perf report: Add latency and parallelism profiling

Dmitry Vyukov <dvyukov@...gle.com> writes:

> There are two notions of time: wall-clock time and CPU time.
> For a single-threaded program, or a program running on a single-core
> machine, these notions are the same. However, for a multi-threaded/
> multi-process program running on a multi-core machine, these notions are
> significantly different. Each second of wall-clock time we have
> number-of-cores seconds of CPU time.

I'm curious how this interacts with the time / --time-quantum sort key.

I assume it just works, but it might be worth checking.

It was intended to address some of these issues too.

> Optimizing CPU overhead is useful to improve 'throughput', while
> optimizing wall-clock overhead is useful to improve 'latency'.
> These profiles are complementary and are not interchangeable.
> Examples of where latency profile is needed:
>  - optimizing build latency
>  - optimizing server request latency
>  - optimizing ML training/inference latency
>  - optimizing running time of any command line program
>
> CPU profile is useless for these use cases at best (if a user understands
> the difference), or misleading at worst (if a user tries to use a wrong
> profile for a job).

I would agree in the general case, but not if the time sort key
is chosen with a suitable quantum. You can then see how the parallelism
changes over time, which is often a good enough proxy.
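
To illustrate the idea, here is a minimal sketch, not perf's actual
code. It assumes samples are available as (timestamp in ms, thread id)
pairs; the function name and the sample data are hypothetical. Samples
are bucketed by a fixed time quantum and the distinct threads seen in
each bucket are counted. This only approximates parallelism, since a
thread that was running but never sampled in a quantum is not counted.

```python
from collections import defaultdict


def parallelism_by_quantum(samples, quantum_ms):
    """Count distinct sampled threads per time bucket.

    samples: iterable of (timestamp_ms, tid) pairs (assumed input format).
    quantum_ms: bucket width in milliseconds.
    Returns a dict mapping bucket start time (ms) to the number of
    distinct thread ids sampled in that bucket.
    """
    buckets = defaultdict(set)
    for timestamp_ms, tid in samples:
        # Integer division assigns each sample to its time bucket.
        buckets[timestamp_ms // quantum_ms].add(tid)
    return {
        index * quantum_ms: len(tids)
        for index, tids in sorted(buckets.items())
    }


# Hypothetical samples: three threads active early, then mostly one.
samples = [(10, 1), (20, 2), (50, 3), (120, 1), (150, 1), (210, 1), (230, 2)]
for start, count in parallelism_by_quantum(samples, 100).items():
    print(f"{start:>5} ms: {count} thread(s) sampled")
```

A bucket with few distinct threads marks a mostly serial phase, which
is where wall-clock time is most likely being lost.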


> We still default to the CPU profile, so it's up to users to learn
> about the second profiling mode and use it when appropriate.

You should add it to tips.txt, then.

>  .../callchain-overhead-calculation.txt        |   5 +-
>  .../cpu-and-latency-overheads.txt             |  85 ++++++++++++++
>  tools/perf/Documentation/perf-record.txt      |   4 +
>  tools/perf/Documentation/perf-report.txt      |  54 ++++++---
>  tools/perf/Documentation/tips.txt             |   3 +
>  tools/perf/builtin-record.c                   |  20 ++++
>  tools/perf/builtin-report.c                   |  39 +++++++
>  tools/perf/ui/browsers/hists.c                |  27 +++--
>  tools/perf/ui/hist.c                          | 104 ++++++++++++------
>  tools/perf/util/addr_location.c               |   1 +
>  tools/perf/util/addr_location.h               |   7 +-
>  tools/perf/util/event.c                       |  11 ++
>  tools/perf/util/events_stats.h                |   2 +
>  tools/perf/util/hist.c                        |  90 ++++++++++++---
>  tools/perf/util/hist.h                        |  32 +++++-
>  tools/perf/util/machine.c                     |   7 ++
>  tools/perf/util/machine.h                     |   6 +
>  tools/perf/util/sample.h                      |   2 +-
>  tools/perf/util/session.c                     |  12 ++
>  tools/perf/util/session.h                     |   1 +
>  tools/perf/util/sort.c                        |  69 +++++++++++-
>  tools/perf/util/sort.h                        |   3 +-
>  tools/perf/util/symbol.c                      |  34 ++++++
>  tools/perf/util/symbol_conf.h                 |   8 +-

We traditionally haven't done it, but test coverage of perf report
is generally too low, so I would recommend adding a simple test case
to the perf test scripts.

-Andi
