Date: Tue, 28 May 2024 11:47:16 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Artem Savkov <asavkov@...hat.com>
Cc: Guilherme Amadio <amadio@...too.org>, Ian Rogers <irogers@...gle.com>,
	linux-perf-users@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>, Namhyung Kim <namhyung@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	"Liang, Kan" <kan.liang@...ux.intel.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf record: add a shortcut for metrics

On Tue, May 28, 2024 at 01:45:25PM +0200, Artem Savkov wrote:
> On Mon, May 27, 2024 at 02:28:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, May 27, 2024 at 02:04:54PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, May 27, 2024 at 02:02:33PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Mon, May 27, 2024 at 12:15:19PM +0200, Artem Savkov wrote:
> > > > > Add -M/--metrics option to perf-record providing a shortcut to record
> > > > > metrics and metricgroups. This option mirrors the one in perf-stat.
> >
> > > > > Suggested-by: Arnaldo Carvalho de Melo <acme@...nel.org>
> > > > > Signed-off-by: Artem Savkov <asavkov@...hat.com>

<SNIP>

> > How did you test this?
> >
> > I'm trying:
> >
> > perf list metric
> >
> > pick a metric then:
> >
> > perf record -M tma_core_bound
> >
> > And it gets in a long loop doing perf_event_open() calls...
> 
> [snip]
> 
> > (gdb) bt
> > #0  0x00007ffff6f21804 in close () from /lib64/libc.so.6
> > #1  0x000000000061fbd2 in perf_evsel__close_fd_cpu (evsel=0xdab470, cpu_map_idx=6) at evsel.c:188
> > #2  0x000000000061fc22 in perf_evsel__close_fd (evsel=0xdab470) at evsel.c:197
> > #3  0x000000000061fc9b in perf_evsel__close (evsel=0xdab470) at evsel.c:211
> > #4  0x00000000004e0b5f in evlist.reset_weak_group ()
> > #5  0x0000000000423bb9 in __cmd_record.constprop.0 ()
> > #6  0x00000000004276c5 in cmd_record ()
> > #7  0x00000000004c4579 in run_builtin ()
> > #8  0x00000000004c4889 in handle_internal_command ()
> > #9  0x0000000000410e57 in main ()
> > (gdb) c
> > Continuing.
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x00007ffff6f21804 in close () from /lib64/libc.so.6
> > (gdb)
> >
> > So you should investigate this further.
> 
> I tried a bunch of random metrics from perf list but didn't hit this.
> 
> It spins forever in evlist__for_each_entry() loop in record__open() with
> the same error:
> 
>         Weak group for TOPDOWN.SLOTS/5 failed
> 
> Looks like the culprit is one of those unsupported metrics, will
> investigate.

Right, when trying something new, in a different way than the
pre-existing codebase was envisioned to be used, we may uncover latent
problems. That endless loop seems like something we want fixed :-)
 
> > The idea, from my notes, was to be able to have extra columns in 'perf
> > report' with things like IPC and other metrics, probably not all metrics
> > will apply. We need to find a way to find out which ones are OK for that
> > purpose, for instance:
> >
> > Opening: cpu_core/topdown-bad-spec/
> > ------------------------------------------------------------
> > perf_event_attr:
> >   type                             4 (cpu_core)
> >   size                             136
> >   config                           0x8100 (topdown-bad-spec)
> >   { sample_period, sample_freq }   4000
> >   sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
> >   read_format                      ID|LOST
> >   disabled                         1
> >   inherit                          1
> >   freq                             1
> >   sample_id_all                    1
> >   exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off PERF_FORMAT_LOST support
> > Opening: cpu_core/topdown-bad-spec/
> 
> Is it just metrics containing unsupported events that need to be skipped
> or there are other cases that wouldn't make much sense? If the latter
> maybe it will be easier to just tag the ones that are supported (or not) in
> pmu-events?

Maybe we can use some criteria to look at the metric and filter out
things that are not working right now? As you go on studying the
codebase you will figure out the reasons: sometimes it's a bug (the
forever loop above), sometimes it plainly doesn't make sense and we just
skip it. That leaves things like IPC: we have instructions, we have
cycles, that is what is needed for IPC, so that makes sense and we should
have an IPC column when collecting both cycles and instructions, just
like it has been done in an ad hoc way for IPC in perf stat since forever.
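
To make the IPC case concrete, here is a minimal sketch of the
arithmetic such a column would need, assuming both counts are available
for the same entry. The helper name sketch_ipc() is hypothetical and is
not an existing perf function; how the two counts would be obtained in
'perf report' is left open here.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: derive instructions per cycle
 * from two counts gathered for the same entry.
 */
static double sketch_ipc(uint64_t instructions, uint64_t cycles)
{
	/* No cycles counted: report 0 instead of dividing by zero. */
	if (cycles == 0)
		return 0.0;

	return (double)instructions / (double)cycles;
}
```

A metric that only needs events that can be opened together, like this
one, looks like the kind that could plausibly pass the filtering
criteria discussed above.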

People want to have those columns in 'perf report' and 'perf top'.

- Arnaldo
