Message-ID: <CAP-5=fUg-DFKM4SQa7P2fWRd62y7kDiP+qP2kP-TiZMy2EX7mQ@mail.gmail.com>
Date: Sat, 2 Nov 2024 21:58:03 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Kan Liang <kan.liang@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH] perf test: Fix LBR test by adding indirect calls

On Sat, Nov 2, 2024 at 5:24 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> I've noticed that the perf record LBR tests sometimes fail on the indirect
> call test because it sees more empty branch stacks than expected.
>
> The test workload (thloop) spawns a thread and calls a loop function for
> 1 second from both the main thread and the new thread. However, neither
> of them makes indirect calls in its body, so the samples end up with empty
> branch stacks.
>
> LBR any indirect call test
> [ perf record: Woken up 21 times to write data ]
> [ perf record: Captured and wrote 5.607 MB /tmp/__perf_test.perf.data.pujKd (7924 samples) ]
> LBR any indirect call test: 7924 samples
> LBR any indirect call test [Failed empty br stack ratio exceed 2%: 3%]
>
> Refactor the test workload to call test_loop() both directly and
> indirectly. The expected share of indirect calls is now 50%, but let's add
> some margin for the startup and finish routines.
>
> Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> ---
> tools/perf/tests/shell/record_lbr.sh | 2 +-
> tools/perf/tests/workloads/thloop.c | 9 ++++++---
> 2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh
> index 8d750ee631f877fd..7a23b2095be8acba 100755
> --- a/tools/perf/tests/shell/record_lbr.sh
> +++ b/tools/perf/tests/shell/record_lbr.sh
> @@ -121,7 +121,7 @@ lbr_test "-j any_ret" "any ret" 2
> lbr_test "-j ind_call" "any indirect call" 2
> lbr_test "-j ind_jmp" "any indirect jump" 100
> lbr_test "-j call" "direct calls" 2
> -lbr_test "-j ind_call,u" "any indirect user call" 100
> +lbr_test "-j ind_call,u" "any indirect user call" 52
> lbr_test "-a -b" "system wide any branch" 2
> lbr_test "-a -j any_call" "system wide any call" 2
>
> diff --git a/tools/perf/tests/workloads/thloop.c b/tools/perf/tests/workloads/thloop.c
> index 457b29f91c3ee277..fa5547939882cf6c 100644
> --- a/tools/perf/tests/workloads/thloop.c
> +++ b/tools/perf/tests/workloads/thloop.c
> @@ -18,14 +18,16 @@ static void sighandler(int sig __maybe_unused)
>
> noinline void test_loop(void)
> {
> - while (!done);
> + for (volatile int i = 0; i < 10000; i++)

I don't think the volatile here will stop a sufficiently eager
optimizing compiler from removing the loop. I think the counter may
need to be static as well.

Thanks,
Ian

> + continue;
> }
>
> static void *thfunc(void *arg)
> {
> void (*loop_fn)(void) = arg;
>
> - loop_fn();
> + while (!done)
> + loop_fn();
> return NULL;
> }
>
> @@ -42,7 +44,8 @@ static int thloop(int argc, const char **argv)
> alarm(sec);
>
> pthread_create(&th, NULL, thfunc, test_loop);
> - test_loop();
> + while (!done)
> + test_loop();
> pthread_join(th, NULL);
>
> return 0;
> --
> 2.47.0.163.g1226f6d8fa-goog
>
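
For reference, the comment above about the volatile loop counter could be
read as follows. This is only a sketch of one possible interpretation, not
the change that was applied to thloop.c. The names loop_counter,
LOOP_ITERATIONS, run_indirect() and final_counter() are made up for
illustration, and __attribute__((noinline)) is the GCC/Clang spelling that
stands in for the kernel's noinline macro.

```c
/* Same loop bound as the quoted patch. */
#define LOOP_ITERATIONS 10000

/*
 * Hypothetical file-scope counter: static storage plus volatile, so every
 * read and write in the loop below is an access to a volatile object.
 * Caveat: in the real two-thread workload a single shared counter would be
 * touched by both threads at once; this sketch does not address that.
 */
static volatile int loop_counter;

/* Kept out of line so that callers emit a real call instruction. */
__attribute__((noinline)) void test_loop(void)
{
	for (loop_counter = 0; loop_counter < LOOP_ITERATIONS; loop_counter++)
		continue;
}

/*
 * Call fn through a function pointer 'rounds' times and return how many
 * calls were made. A compiler that can see the pointer's value may still
 * turn this into a direct call, so the generated code should be checked.
 */
int run_indirect(void (*fn)(void), int rounds)
{
	int calls = 0;

	for (int i = 0; i < rounds; i++) {
		fn();
		calls++;
	}
	return calls;
}

/* Helper for inspecting the counter after the loop has finished. */
int final_counter(void)
{
	return loop_counter;
}
```

Whether the extra static is actually needed depends on the compiler, so
looking at the disassembly of test_loop() would be the way to confirm that
the loop survives optimization.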