[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fVKvDLUMw_HXBoRLK3FyPvUCWGOZNECer6_fyhbTZTM6A@mail.gmail.com>
Date: Wed, 13 Nov 2024 10:06:13 -0800
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
Athira Jajeev <atrajeev@...ux.vnet.ibm.com>, James Clark <james.clark@...aro.org>,
Dominique Martinet <asmadeus@...ewreck.org>, Yang Li <yang.lee@...ux.alibaba.com>,
Colin Ian King <colin.i.king@...il.com>, Yang Jihong <yangjihong@...edance.com>,
"Steinar H. Gunderson" <sesse@...gle.com>, Oliver Upton <oliver.upton@...ux.dev>,
Ilkka Koskinen <ilkka@...amperecomputing.com>, Ze Gao <zegao2021@...il.com>,
Weilin Wang <weilin.wang@...el.com>, Ben Gainey <ben.gainey@....com>,
zhaimingbing <zhaimingbing@...s.chinamobile.com>, Zixian Cai <fzczx123@...il.com>,
Andi Kleen <ak@...ux.intel.com>, Paran Lee <p4ranlee@...il.com>,
Thomas Falcon <thomas.falcon@...el.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
"Steven Rostedt (Google)" <rostedt@...dmis.org>
Subject: Re: [PATCH v4 0/6] Avoid parsing tracepoint format just for id
On Sat, Nov 9, 2024 at 9:04 AM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Sat, Nov 09, 2024 at 08:26:20AM -0800, Ian Rogers wrote:
> > On Fri, Nov 8, 2024 at 10:45 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > On Fri, Nov 08, 2024 at 10:47:45AM -0800, Ian Rogers wrote:
> > > > Ian Rogers (6):
> > > > tool api fs: Correctly encode errno for read/write open failures
> > > > perf trace-event: Constify print arguments
> > > > perf trace-event: Always build trace-event-info.c
> > > > perf evsel: Add/use accessor for tp_format
> > > > perf evsel: Allow evsel__newtp without libtraceevent
> > > > perf tests: Enable tests disabled due to tracepoint parsing
> > >
> > > After applying this series, I'm seeing some test failures. But I don't
> > > understand why it affects non-tracepoint events though.
> > >
> > > $ sudo ./perf test -v pipe
> > > --- start ---
> > > test child forked, pid 3036123
> > > 1bde35-1bdecc l noploop
> > > perf does have symbol 'noploop'
> > >
> > > Record+report pipe test
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.210 MB - ]
> > > [ perf record: Woken up 2 times to write data ]
> > > [ perf record: Captured and wrote 0.517 MB - ]
> > > [ perf record: Woken up 2 times to write data ]
> > > [ perf record: Captured and wrote 0.516 MB - ]
> > > Record+report pipe test [Success]
> > >
> > > Inject -B build-ids test
> > > 0xa5c [0x17a4]: failed to process type: 80
> > > Error:
> > > failed to process sample
> > > Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > > Inject -b build-ids test
> > > 0xa5c [0x17a4]: failed to process type: 80
> > > Error:
> > > failed to process sample
> > > Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > > Inject --buildid-all build-ids test
> > > 0xa5c [0x17a4]: failed to process type: 80
> > > Error:
> > > failed to process sample
> > > Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > >
> > > Inject --mmap2-buildid-all build-ids test
> > > 0xa5c [0x17a4]: failed to process type: 80
> > > Error:
> > > failed to process sample
> > > Inject build-ids test [Failed - cannot find noploop function in pipe #1]
> > > ---- end(-1) ----
> > > 84: perf pipe recording and injection test : FAILED!
> > >
> > > $ sudo ./perf test -v Zstd
> > > --- start ---
> > > test child forked, pid 3036097
> > > Collecting compressed record file:
> > > 500+0 records in
> > > 500+0 records out
> > > 256000 bytes (256 kB, 250 KiB) copied, 0.00169127 s, 151 MB/s
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.032 MB /tmp/perf.data.KBo, compressed (original 0.004 MB, ratio is 3.324) ]
> > > Checking compressed events stats:
> > > Couldn't decompress data
> > > 0x7ca8 [0x4f2]: failed to process type: 81 [Operation not permitted]
> > > Error:
> > > failed to process sample
> > > ---- end(-1) ----
> > > 86: Zstd perf.data compression/decompression : FAILED!
> > >
> > > Thanks,
> > > Namhyung
> >
> > I'm not able to reproduce:
> > ```
> > $ git log --oneline |head
> > a59bca6eb0a6 perf test: Add a runs-per-test flag
> > 0d0c002eb45c perf tests: Enable tests disabled due to tracepoint parsing
> > 4b8f5c9dfbda perf evsel: Allow evsel__newtp without libtraceevent
> > 7f57057c7884 perf evsel: Add/use accessor for tp_format
> > c27d357d2d4c perf trace-event: Always build trace-event-info.c
> > 20bf7a2cd68a perf trace-event: Constify print arguments
> > f18b07ee2af1 tool api fs: Correctly encode errno for read/write open failures
> > ...
> > $ sudo /tmp/perf/perf test -r 10 Zstd pipe -v
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 84: perf pipe recording and injection test : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > 86: Zstd perf.data compression/decompression : Ok
> > ```
> > Similarly not as root or if runs-per-test is 100.
> >
> > Agreed that the changes are for tracepoints and these tests aren't for
> > tracepoints, so an interaction wouldn't be expected. If you have a
> > reliable reproduction perhaps you can bisect it.
>
> it says:
>
> 9c10de391840a35ab095b65e9a5c4fad0ac52068 is the first bad commit
> commit 9c10de391840a35ab095b65e9a5c4fad0ac52068 (HEAD)
> Author: Ian Rogers <irogers@...gle.com>
> Date: Fri Nov 8 10:47:46 2024 -0800
>
> tool api fs: Correctly encode errno for read/write open failures
>
> Switch from returning -1 to -errno so that callers can determine types
> of failure.
>
> Signed-off-by: Ian Rogers <irogers@...gle.com>
> Acked-by: Arnaldo Carvalho de Melo <acme@...hat.com>
>
> tools/lib/api/fs/fs.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
So I tried to eye-ball/grep all callers to spot assumptions on the
return value like:
```
err = ...__read_int
if (err == -1)
```
Didn't spot anything.
It seems in the test log the record is failing so I ran this under
gdb, set breakpoints on the 3 modified functions and then looked up
the call stack to spot bad return value assumptions. Everything looks
good.
I then tried inject and report, the only file read by these functions
is /proc/sys/kernel/perf_event_paranoid as part of symbol
initialization (nit, this should probably be read lazily and the
restriction should really come from the perf.data file, not the
running system) and those calls look good.
The change is small and not critical for the series. It improves the
error message when reading the tracepoint id fails. So we could move
forward with the rest of the series, but that could be annoying for
tracepoint users.
If I had a reproducer I'd revert the 1 line change on each function to
find out which is causing the regression. Once you have that then you
can binary search to find the bad call by using some global counter
where the first 'n' calls use the new return value and the later use
the old value. You can then vary 'n' to binary search and find the bad
caller.
Is there any chance you can help diagnose this or help me to find the
reproducer?
Thanks,
Ian
Powered by blists - more mailing lists