[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXRODCKJexjxqkW_Wkf5MTxFJeq8zmu-iUbpEJM-9gjgg@mail.gmail.com>
Date: Tue, 9 Apr 2024 08:39:10 -0700
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@....com>
Cc: linux-perf-users@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, "Liang, Kan" <kan.liang@...ux.intel.com>,
Athira Rajeev <atrajeev@...ux.vnet.ibm.com>, Leo Yan <leo.yan@...ux.dev>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] perf tests: Skip "test data symbol" on Neoverse N1
On Tue, Apr 9, 2024 at 1:48 AM James Clark <james.clark@....com> wrote:
>
> To prevent anyone from seeing a test failure appear as a regression and
> thinking that it was caused by their code change, just skip the test on
> N1.
>
> It can be caused by any unrelated change that shifts the loop into an
> unfortunate position in the Perf binary which is almost impossible to
> debug as the root cause of the test failure. Ultimately it's caused by
> the referenced errata.
>
> Fixes: 60abedb8aa90 ("perf test: Introduce script for data symbol testing")
> Signed-off-by: James Clark <james.clark@....com>
This change makes me sad :-( Is there no hope of aligning the loop? We
have little enough testing coverage for memory events and even precise
events on ARM that anything take away testing coverage feels like we
should try to do better.
Which models are we losing coverage for, presumably neoverse-n1 but
what about neoverse-v1 and neoverse-n2-v2?
If aligning the loop doesn't work, could we use objdump and check its
alignment skipping when it is off? Or run the test and turn fails to
skip on neoverse-n1 - so we get some coverage testing.
It would also be nice if the change didn't add a dependency on lscpu
for the sake of parsing /proc/cpuinfo, I see another arm test already
did this test_arm_callgraph_fp.sh - that case looks like it should be
using uname.
Thanks,
Ian
> ---
> tools/perf/tests/shell/test_data_symbol.sh | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tools/perf/tests/shell/test_data_symbol.sh b/tools/perf/tests/shell/test_data_symbol.sh
> index 3dfa91832aa8..ffc641d00aa4 100755
> --- a/tools/perf/tests/shell/test_data_symbol.sh
> +++ b/tools/perf/tests/shell/test_data_symbol.sh
> @@ -16,6 +16,12 @@ skip_if_no_mem_event() {
> return 2
> }
>
> +# Skip on Arm N1 due to errata 1694299. Bias exists in SPE sampling
> +# which can cause the load and store instructions to be skipped
> +# entirely. This comes and goes randomly depending on the offset the
> +# linker places the datasym loop at in the Perf binary.
> +lscpu | grep -q "Neoverse-N1" && exit 2
> +
> skip_if_no_mem_event || exit 2
>
> skip_test_missing_symbol buf1
> --
> 2.34.1
>
Powered by blists - more mailing lists