[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXAitSZuRPppAjH=38Ua6BFyhou0sSj7xmfNakqPUQqPw@mail.gmail.com>
Date: Thu, 21 Nov 2024 10:55:00 -0800
From: Ian Rogers <irogers@...gle.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] perf tests: Fix hwmon parsing with PMU name test
On Thu, Nov 21, 2024 at 9:22 AM Arnaldo Carvalho de Melo
<acme@...nel.org> wrote:
>
> On Thu, Nov 21, 2024 at 02:04:45PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Wed, Nov 20, 2024 at 04:09:55PM -0800, Ian Rogers wrote:
> > > Incorrectly the hwmon with PMU name test didn't pass "true". Fix and
> > > address issue with hwmon_pmu__config_terms needing to load events - a
> > > load bearing assert fired. Also fix missing list deletion when putting
> > > the hwmon test PMU and lower some debug warnings to make the hwmon PMU
> > > less spammy in verbose mode.
>
> > After applying this, with this series of patches on a Fedora 40 system,
> > I get output from -v where before I needed, IIRC, to use -vv:
> >
> > f8244bb9bfa66f79 (HEAD -> perf-tools-next, x1/perf-tools-next) perf tests: Fix hwmon parsing with PMU name test
> > 9ae6c7a4bd02acbd perf hwmon_pmu: Ensure hwmon key union is zeroed before use
> > 3e37de098af38179 perf tests hwmon_pmu: Remove double evlist__delete()
> > 0565017a0ac824c2 perf test: Correct hwmon test PMU detection
> > 85c60a01b85ee956 (perf-tools-next/tmp.perf-tools-next, perf-tools-next/perf-tools-next) perf: Remove unused del_perf_probe_events()
> > ⬢ [acme@...lbox perf-tools-next]$
>
> <SNIP>
>
> > root@...ber:~# perf test -v 11
> > 11: Hwmon PMU :
> > 11.1: Basic parsing test : Ok
> > --- start ---
> > test child forked, pid 1823259
> > Testing 'temp_test_hwmon_event1'
> > Using CPUID GenuineIntel-6-B7-1
> > FAILED tests/hwmon_pmu.c:159 failed to parse event 'temp_test_hwmon_event1', err 1
> > event syntax error: 'temp_test_hwmon_event1'
> > \___ Bad event name
> >
> > Unable to find event on a PMU of 'temp_test_hwmon_event1'
>
> In gdb it fails on the first call to do_test() from the first test__hwmon_pmu()
>
> Starting program: /root/bin/perf test -F -vv 11
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> --- start ---
> ---- end ----
> 11.1: Basic parsing test : Ok
> --- start ---
>
> Breakpoint 1, test__hwmon_pmu (with_pmu=false) at tests/hwmon_pmu.c:203
> 203 {
> (gdb) n
> 205 struct perf_pmu *pmu = test_pmu_get(dir, sizeof(dir));
> (gdb) n
> 206 int ret = TEST_OK;
> (gdb) p pmu
> $2 = (struct perf_pmu *) 0xf50450
> (gdb) p *pmu
> $3 = {name = 0xf50ac0 "hwmon_a_test_hwmon_pmu", alias_name = 0xf50aa0 "hwmon1234", id = 0x0, type = 4294902994, selectable = false, is_core = false, is_uncore = false, auxtrace = false,
> formats_checked = false, config_masks_present = false, config_masks_computed = false, max_precise = 0, perf_event_attr_init_default = 0x0, cpus = 0xf4fbf0, format = {next = 0xf50488,
> prev = 0xf50488}, aliases = {next = 0xf50498, prev = 0xf50498}, events_table = 0x0, sysfs_aliases = 0, cpu_json_aliases = 0, sys_json_aliases = 0, sysfs_aliases_loaded = false,
> cpu_aliases_added = false, caps_initialized = false, nr_caps = 0, caps = {next = 0xf504c8, prev = 0xf504c8}, list = {next = 0xedc090 <other_pmus>, prev = 0xedc090 <other_pmus>},
> config_masks = {0, 0, 0, 0}, missing_features = {exclude_guest = false, checked = false}, mem_events = 0x0}
Thanks for helping, I'm not able to repro this, so extra debugging
would be useful for me. Here sysfs_aliases_loaded is false as we'll
load the PMU aliases when there is a request to pmu__have_event. This
looks pretty ordinary.
> (gdb) s
> 208 if (!pmu)
> (gdb) s
> 211 for (size_t i = 0; i < ARRAY_SIZE(test_events); i++) {
> (gdb) s
> 212 ret = do_test(i, with_pmu, /*with_alias=*/false);
> (gdb) s
> do_test (i=0, with_pmu=false, with_alias=false) at tests/hwmon_pmu.c:136
> 136 {
> (gdb) n
> 137 const char *test_event = with_alias ? test_events[i].alias : test_events[i].name;
> (gdb) n
> 138 struct evlist *evlist = evlist__new();
> (gdb) n
> 143 bool found = false;
> (gdb) n
> 145 if (!evlist) {
> (gdb) n
> 150 if (with_pmu)
> (gdb) n
> 153 strlcpy(str, test_event, sizeof(str));
> (gdb) n
> 155 pr_debug("Testing '%s'\n", str);
> (gdb) p str
> $4 = "temp_test_hwmon_event1\000\000\004\000\000\000\000\000\000\000\274\204z\000\000\000\000\000΄z\000\000\000\000\000\021\000\000\000\000\000\000\000 \305\377\377\377\177\000\000߄z\000\000\000\000\000\353\204z\000\000\000\000\000\376\204z\000\000\000\000\000\n\205z\000\000\000\000\000\021\205z\000\000\000\000\000\035\205z\000\000\000\000\0000\205z\000\000\000\000\000<\205z\000\000\000\000"
> (gdb) n
> Testing 'temp_test_hwmon_event1'
> 156 parse_events_error__init(&err);
So there was no parse event output like the expected:
```
Attempt to add: hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
..after resolving event: hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
```
The wildcard PMU lookup will call perf_pmu__have_event trying to find
a PMU with the event:
```
Breakpoint 1, perf_pmu__have_event (pmu=0x555556157f90,
name=0x5555560ce470 "temp_test_hwmon_event1") at util/pmu.c:1816
1816 {
(gdb) bt
#0 perf_pmu__have_event (pmu=0x555556157f90, name=0x5555560ce470
"temp_test_hwmon_event1")
at util/pmu.c:1816
#1 0x00005555557ab143 in parse_events_multi_pmu_add
(parse_state=0x7fffffffbf00,
event_name=0x5555560ce470 "temp_test_hwmon_event1", hw_config=10,
const_parsed_terms=0x0,
listp=0x7fffffffa270, loc_=0x7fffffffb120) at util/parse-events.c:1592
#2 0x00005555558108cc in parse_events_parse
(_parse_state=0x7fffffffbf00, scanner=0x555556138aa0)
at util/parse-events.y:293
#3 0x00005555557abdc2 in parse_events__scanner (str=0x7fffffffc000
"temp_test_hwmon_event1",
input=0x0, parse_state=0x7fffffffbf00) at util/parse-events.c:1870
#4 0x00005555557ac735 in __parse_events (evlist=0x55555613ca40,
str=0x7fffffffc000 "temp_test_hwmon_event1", pmu_filter=0x0,
err=0x7fffffffbff0,
fake_pmu=false, warn_if_reordered=true, fake_tp=false) at
util/parse-events.c:2139
#5 0x00005555557448ca in parse_events (evlist=0x55555613ca40,
str=0x7fffffffc000 "temp_test_hwmon_event1", err=0x7fffffffbff0)
at /home/irogers/kernel.org2/tools/perf/util/parse-events.h:41
#6 0x0000555555744f6e in do_test (i=0, with_pmu=false,
with_alias=false) at tests/hwmon_pmu.c:156
#7 0x00005555557452dd in test__hwmon_pmu (with_pmu=false) at
tests/hwmon_pmu.c:212
#8 0x000055555574538d in test__hwmon_pmu_without_pmu
(test=0x5555560a3740 <suite.hwmon_pmu>,
subtest=1) at tests/hwmon_pmu.c:229
#9 0x00005555556fc935 in start_test (test=0x5555560a3740
<suite.hwmon_pmu>, i=10, subi=1,
child=0x55555613c528, width=64, pass=1) at tests/builtin-test.c:424
#10 0x00005555556fd014 in __cmd_test (suites=0x55555613c0f0, argc=1,
argv=0x7fffffffd9c0,
skiplist=0x0) at tests/builtin-test.c:571
#11 0x00005555556fdb29 in cmd_test (argc=1, argv=0x7fffffffd9c0) at
tests/builtin-test.c:773
#12 0x000055555568043a in run_builtin (p=0x55555608f950 <commands+624>, argc=4,
argv=0x7fffffffd9c0) at perf.c:351
#13 0x00005555556806e1 in handle_internal_command (argc=4,
argv=0x7fffffffd9c0) at perf.c:404
#14 0x000055555568083a in run_argv (argcp=0x7fffffffd7bc,
argv=0x7fffffffd7b0) at perf.c:448
#15 0x0000555555680b83 in main (argc=4, argv=0x7fffffffd9c0) at perf.c:560
(gdb) p pmu->name
$1 = 0x5555560ce940 "cpu"
```
Repeating this the test hwmon_a_test_hwmon_pmu test PMU should be tested:
```
(gdb) c
Continuing.
(gdb) p pmu->name
$2 = 0x5555560d6b20 "breakpoint"
(gdb) c
Continuing.
Breakpoint 1, perf_pmu__have_event (pmu=0x555556158060,
name=0x5555560ce470 "temp_test_hwmon_event1") at util/pmu.c:1816
1816 {
(gdb) p pmu->name
$3 = 0x5555560d7ec0 "cstate_core"
(gdb) c
Continuing.
Breakpoint 1, perf_pmu__have_event (pmu=0x555556158610,
name=0x5555560ce470 "temp_test_hwmon_event1") at util/pmu.c:1816
1816 {
(gdb) p pmu->name
$4 = 0x5555560c84b0 "cstate_pkg"
(gdb) c
Continuing.
Breakpoint 1, perf_pmu__have_event (pmu=0x5555561360b0,
name=0x5555560ce470 "temp_test_hwmon_event1") at util/pmu.c:1816
1816 {
(gdb) p pmu->name
$5 = 0x5555560d46a0 "hwmon_a_test_hwmon_pmu"
```
which should then go into the hwmon_pmu__have_event:
```
(gdb) n
1817 if (!name)
(gdb)
1819 if (perf_pmu__is_tool(pmu) && tool_pmu__skip_event(name))
(gdb)
1821 if (perf_pmu__is_hwmon(pmu))
(gdb) n
1822 return hwmon_pmu__have_event(pmu, name);
(gdb) s
hwmon_pmu__have_event (pmu=0x5555561360b0, name=0x5555560ce470
"temp_test_hwmon_event1")
at util/hwmon_pmu.c:559
```
hwmon_pmu__have_event should return true but it is either not getting
called in your case or it is returning false. Not getting called I
find hard to understand as your output shows the test PMU was created.
It seems more likely reading the "events" and then doing the hashmap
lookup in hwmon_pmu__have_event fails. This was failing for me with
undefined behavior sanitizer because of the under initialized unions.
But that should be fixed by: "perf hwmon_pmu: Ensure hwmon key union
is zeroed before use". The particular event "temp_test_hwmon_event1"
is going to search all events as it uses the "label" name, so that
makes me think the bug is here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/hwmon_pmu.c?h=perf-tools-next#n584
but I can't eye ball an issue and the test works for me even when
trying to be aggressive with sanitizers. If you could help me look I'd
appreciate it.
Thanks,
Ian
> (gdb) n
> 157 ret = parse_events(evlist, str, &err);
> (gdb) n
> Using CPUID GenuineIntel-6-B7-1
> 158 if (ret) {
> (gdb) p ret
> $5 = 1
> (gdb) n
> 159 pr_debug("FAILED %s:%d failed to parse event '%s', err %d\n",
> (gdb)
>
>
Powered by blists - more mailing lists