Message-ID: <f5a367cf-4154-498d-8985-b4d5498ff201@arm.com>
Date: Wed, 10 Apr 2024 10:08:38 +0100
From: James Clark <james.clark@....com>
To: Ian Rogers <irogers@...gle.com>
Cc: linux-perf-users@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 "Liang, Kan" <kan.liang@...ux.intel.com>,
 Athira Rajeev <atrajeev@...ux.vnet.ibm.com>, Leo Yan <leo.yan@...ux.dev>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] perf tests: Skip "test data symbol" on Neoverse N1



On 09/04/2024 16:39, Ian Rogers wrote:
> On Tue, Apr 9, 2024 at 1:48 AM James Clark <james.clark@....com> wrote:
>>
>> To prevent anyone from seeing a test failure appear as a regression and
>> thinking that it was caused by their code change, just skip the test on
>> N1.
>>
>> It can be caused by any unrelated change that shifts the loop into an
>> unfortunate position in the Perf binary which is almost impossible to
>> debug as the root cause of the test failure. Ultimately it's caused by
>> the referenced errata.
>>
>> Fixes: 60abedb8aa90 ("perf test: Introduce script for data symbol testing")
>> Signed-off-by: James Clark <james.clark@....com>
> 
> This change makes me sad :-( Is there no hope of aligning the loop? We
> have little enough testing coverage for memory events and even precise
> events on ARM that anything that takes away testing coverage feels like we
> should try to do better.
> 
> Which models are we losing coverage for, presumably neoverse-n1 but
> what about neoverse-v1 and neoverse-n2-v2?
> 
> If aligning the loop doesn't work, could we use objdump and check its
> alignment skipping when it is off? Or run the test and turn fails to
> skip on neoverse-n1 - so we get some coverage testing.
> 
> It would also be nice if the change didn't add a dependency on lscpu
> for the sake of parsing /proc/cpuinfo, I see another arm test already
> did this test_arm_callgraph_fp.sh - that case looks like it should be
> using uname.
> 

I'll make the change to add the noise to the loop, which will drop this
lscpu addition. And I'll fix up test_arm_callgraph_fp.sh while I'm at it.
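
For reference, the other option raised above (run the test and turn a
failure into a skip on N1, without depending on lscpu) could look roughly
like the sketch below. It is not part of this series. The helper names
is_neoverse_n1 and fail_or_skip are hypothetical, and the check assumes the
arm64 /proc/cpuinfo layout with "CPU implementer" 0x41 (Arm) and "CPU part"
0xd0c (Neoverse N1).

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the perf tree.

# Return 0 if the given cpuinfo file (default /proc/cpuinfo) lists an
# Arm Neoverse N1 core. Assumes implementer 0x41 and part 0xd0c, and
# that "CPU implementer" precedes "CPU part" for each processor entry.
is_neoverse_n1() {
	cpuinfo="${1:-/proc/cpuinfo}"
	[ -r "$cpuinfo" ] || return 1
	awk -F': *' '
		/^CPU implementer/ { impl = $2 }
		/^CPU part/ { if (impl == "0x41" && $2 == "0xd0c") found = 1 }
		END { exit !found }
	' "$cpuinfo"
}

# Print the exit code the test script should use: a failure (1) is
# downgraded to a skip (2) only on affected CPUs, anything else is
# passed through unchanged.
fail_or_skip() {
	result="$1"
	cpuinfo="${2:-/proc/cpuinfo}"
	if [ "$result" -eq 1 ] && is_neoverse_n1 "$cpuinfo"; then
		echo 2
	else
		echo "$result"
	fi
}
```

A test script would then end with something like
'exit "$(fail_or_skip "$err")"', where err holds the result of the checks,
so the test still runs on N1 and passes are still reported as passes.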

> Thanks,
> Ian
> 
>> ---
>>  tools/perf/tests/shell/test_data_symbol.sh | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/perf/tests/shell/test_data_symbol.sh b/tools/perf/tests/shell/test_data_symbol.sh
>> index 3dfa91832aa8..ffc641d00aa4 100755
>> --- a/tools/perf/tests/shell/test_data_symbol.sh
>> +++ b/tools/perf/tests/shell/test_data_symbol.sh
>> @@ -16,6 +16,12 @@ skip_if_no_mem_event() {
>>         return 2
>>  }
>>
>> +# Skip on Arm N1 due to errata 1694299. Bias exists in SPE sampling
>> +# which can cause the load and store instructions to be skipped
>> +# entirely. This comes and goes randomly depending on the offset the
>> +# linker places the datasym loop at in the Perf binary.
>> +lscpu | grep -q "Neoverse-N1" && exit 2
>> +
>>  skip_if_no_mem_event || exit 2
>>
>>  skip_test_missing_symbol buf1
>> --
>> 2.34.1
>>
