[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZpBSbjF3QP81_637@google.com>
Date: Thu, 11 Jul 2024 14:45:18 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: "Wang, Weilin" <weilin.wang@...el.com>
Cc: Ian Rogers <irogers@...gle.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
"Hunter, Adrian" <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
"linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Taylor, Perry" <perry.taylor@...el.com>,
"Alt, Samantha" <samantha.alt@...el.com>,
"Biggers, Caleb" <caleb.biggers@...el.com>
Subject: Re: [RFC PATCH v16 8/8] perf test: Add test for Intel TPEBS counting
mode
Hello,
On Tue, Jul 09, 2024 at 06:23:51AM +0000, Wang, Weilin wrote:
> > On Mon, Jul 8, 2024 at 9:58 PM Wang, Weilin <weilin.wang@...el.com>
> > wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Namhyung Kim <namhyung@...nel.org>
> > > > Sent: Monday, July 8, 2024 9:44 PM
> > > > To: Wang, Weilin <weilin.wang@...el.com>
> > > > Cc: Ian Rogers <irogers@...gle.com>; Arnaldo Carvalho de Melo
> > > > <acme@...nel.org>; Peter Zijlstra <peterz@...radead.org>; Ingo Molnar
> > > > <mingo@...hat.com>; Alexander Shishkin
> > > > <alexander.shishkin@...ux.intel.com>; Jiri Olsa <jolsa@...nel.org>; Hunter,
> > > > Adrian <adrian.hunter@...el.com>; Kan Liang <kan.liang@...ux.intel.com>;
> > > > linux-perf-users@...r.kernel.org; linux-kernel@...r.kernel.org; Taylor,
> > Perry
> > > > <perry.taylor@...el.com>; Alt, Samantha <samantha.alt@...el.com>;
> > Biggers,
> > > > Caleb <caleb.biggers@...el.com>
> > > > Subject: Re: [RFC PATCH v16 8/8] perf test: Add test for Intel TPEBS
> > counting
> > > > mode
> > > >
> > > > Hello Weilin,
> > > >
> > > > On Sat, Jul 6, 2024 at 4:30 PM <weilin.wang@...el.com> wrote:
> > > > >
> > > > > From: Weilin Wang <weilin.wang@...el.com>
> > > > >
> > > > > Intel TPEBS sampling mode is supported through perf record. The
> > counting
> > > > mode
> > > > > code uses perf record to capture retire_latency value and use it in metric
> > > > > calculation. This test checks the counting mode code.
> > > > >
> > > > > Signed-off-by: Weilin Wang <weilin.wang@...el.com>
> > > > > ---
> > > > > .../perf/tests/shell/test_stat_intel_tpebs.sh | 18
> > ++++++++++++++++++
> > > > > 1 file changed, 18 insertions(+)
> > > > > create mode 100755 tools/perf/tests/shell/test_stat_intel_tpebs.sh
> > > > >
> > > > > diff --git a/tools/perf/tests/shell/test_stat_intel_tpebs.sh
> > > > b/tools/perf/tests/shell/test_stat_intel_tpebs.sh
> > > > > new file mode 100755
> > > > > index 000000000000..fea8cb1b8367
> > > > > --- /dev/null
> > > > > +++ b/tools/perf/tests/shell/test_stat_intel_tpebs.sh
> > > > > @@ -0,0 +1,18 @@
> > > > > +#!/bin/bash
> > > > > +# test Intel TPEBS counting mode
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +
> > > > > +set -e
> > > > > +
> > > > > +# Use this event for testing because it should exist in all platforms
> > > > > +event=cache-misses:R
> > > > > +
> > > > > +# Without this cmd option, default value or zero is returned
> > > > > +echo "Testing without --record-tpebs"
> > > > > +result=$(perf stat -e "$event" true 2>&1)
> > > > > +[[ "$result" =~ $event ]] || exit 1
> > > > > +
> > > > > +# In platforms that do not support TPEBS, it should execute without
> > error.
> > > > > +echo "Testing with --record-tpebs"
> > > > > +result=$(perf stat -e "$event" --record-tpebs -a sleep 0.01 2>&1)
> > > >
> > > > It never finishes on my AMD machine.
> > > >
> > > Hi Namhyung,
> > >
> > > Do you see any message while it executes? Is the perf record forked
> > successfully
> > > but failed to return?
> >
> > I don't know.. all I can get is like below:
> >
> > $ sudo ./perf test tpebs -vv
> > 121: test Intel TPEBS counting mode:
> > --- start ---
> > test child forked, pid 583475
> > Testing without --record-tpebs
> > Testing with --record-tpebs
> > ^C
>
> I think the problem is that the forked "perf record" encountered error, which
> caused the control fifo failed to send a "ACK" back and the PIPE hanged.
>
> Could you please try out the diff below and see if the test would finish?
>
> As for the "perf record" error, I think it might because of the fake
> event(cache-misses:R) cannot be supported in AMD. Could you please test run
> a "perf stat -e cache-misses:R --record-tpebs true" and see if it complains about
> the event?
So I tried the below patch and it worked.
$ ./perf test -v tpebs
121: test Intel TPEBS counting mode:
--- start ---
test child forked, pid 2190174
Testing without --record-tpebs
Testing with --record-tpebs
---- end(-1) ----
121: test Intel TPEBS counting mode : FAILED!
It would be better if it can skip rather than fail on
non-supported machines.
Also I saw this message when I run the command manually.
$ ./perf stat -e cache-misses:R --record-tpebs -v true
Control descriptor is not initialized
Retire_latency of event cache-misses:R is required
Prepare perf record for retire_latency
Error:
The cache-misses:pu event is not supported.
incompatible file format
incompatible file format (rerun with -v to learn more)
failed: did not received an ack
cache-misses:R: 0 1 1
Performance counter stats for 'true':
0 cache-misses:R
0.000004939 seconds time elapsed
0.000000000 seconds user
0.000000000 seconds sys
I'm not sure why it showed the incompatible file format message.
>
> diff --git a/tools/perf/util/intel-tpebs.c b/tools/perf/util/intel-tpebs.c
> index a0585a6571b5..5b8e104f36f1 100644
> --- a/tools/perf/util/intel-tpebs.c
> +++ b/tools/perf/util/intel-tpebs.c
> @@ -263,6 +263,7 @@ int tpebs_start(struct evlist *evsel_list)
> }
>
> if (tpebs_event_size > 0) {
> + struct pollfd pollfd = { .events = POLLIN, };
> int control_fd[2], ack_fd[2], len;
> char ack_buf[8];
>
> @@ -297,6 +298,19 @@ int tpebs_start(struct evlist *evsel_list)
> goto out;
> }
>
> + /* wait for an ack */
> + pollfd.fd = ack_fd[0];
> +
> + if (!poll(&pollfd, 1, 2000)) {
Is it 2 seconds? Any specific reason for the value?
At least we need a comment to explain the value (or just saying it's a
random one).
Thanks,
Namhyung
> + pr_err("failed: perf record ack timeout\n");
> + goto out;
> + }
> +
> + if (!(pollfd.revents & POLLIN)) {
> + pr_err("failed: did not received an ack\n");
> + goto out;
> + }
> +
> ret = read(ack_fd[0], ack_buf, sizeof(ack_buf));
> if (ret > 0)
> ret = strcmp(ack_buf, "ack\n");
>
> Thanks,
> Weilin
Powered by blists - more mailing lists