linux-kernel - Re: [PATCH 08/14] perf test: Add memcpy thread test shell script


Message-ID: <5dbda300-2ad3-ca23-6013-f3dd3126ba30@foss.arm.com>
Date:   Fri, 8 Jul 2022 10:19:04 +0100
From:   Carsten Haitzler <carsten.haitzler@...s.arm.com>
To:     James Clark <james.clark@....com>, linux-kernel@...r.kernel.org
Cc:     coresight@...ts.linaro.org, mathieu.poirier@...aro.org,
        mike.leach@...aro.org, linux-perf-users@...r.kernel.org,
        acme@...nel.org, Suzuki K Poulose <suzuki.poulose@....com>
Subject: Re: [PATCH 08/14] perf test: Add memcpy thread test shell script

On 7/5/22 15:25, James Clark wrote:
> 
> 
> On 01/07/2022 13:07, carsten.haitzler@...s.arm.com wrote:
>> From: "Carsten Haitzler (Rasterman)" <raster@...terman.com>
>>
>> Add a script to drive the threaded memcpy test that gathers data so
>> it passes a minimum bar for amount and quality of content that we
>> extract from the kernel's perf support.
>>
> 
> On this one I get a failure about 1/50 times on N1SDP (I ran it about 150

I also see inconsistent results. The whole point of these tests is to 
expose this, provide data to track it, and eventually lead to 
improvements/fixes. A failing test is probably good - it found a 
problem. perf test has lots of failures for me, so my position is that 
failures in perf test are normally OK as long as you know what those 
failures are and why they happen.

> times and saw 3 failures so it's quite consistent). Usually it records
> about a 1.4MB file with one aux record. But when it fails the file is
> only 20K and has one small aux record:
> 
>     0 0 0x1a10 [0x30]: PERF_RECORD_AUXTRACE size: 0x1820  offset: 0  ref: 0x1c23126d7ff3d2ab  idx: 3  tid: 682799  cpu: 3
> 
> Nothing was dropped, and the load on the system wasn't any different
> to when it passes. So I'm not sure if this is a real coresight bug
> or that the test is flaky. There was a bug in SPE before where

The binary is the same, with the same content, running the same perf 
command every time. The workload doesn't change. The perf data captured 
does change. It sometimes captures so little that it misses even the low 
pass bar set in the test.

> threads weren't followed after forking, but only very rarely. It feels
> a bit like that.

That ... would be a "CoreSight" bug though I think, not the test.

> It could also be some contention issue because 10 threads are launched
> but the machine only has 4 cores.

We still should be capturing data reliably (in theory). If you have 10 
threads on a 4 core machine it'll take longer to run for the same 
workload as the threads will have to share the same cores, but this 
should still result in decent data collection as the cores switch 
between threads. That's the point.

> The failure message from the test looks like this:
> 
>     77: CoreSight / Memcpy 16k 10 Threads                               :
>     --- start ---
>     Couldn't synthesize bpf events.
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 0.012 MB ./perf-memcpy_thread-16k_10.data ]
>     Sanity check number of ASYNC is too low (3 < 10)
>      ---- end ----
>     CoreSight / Memcpy 16k 10 Threads: FAILED!
> 
> I didn't see this issue on any of the other tests. Sometimes very small
> files were made if I loaded the system, but the tests still passed.

For me the "Check TID" tests fail very often... but as I said - the 
point here is to find issues and ensure they are reported in results. 
The tests even track the results over time/many runs in the csv files so 
you get a good idea of consistency and even how it may statistically 
change over time, which you can match up to changes in the kernel.

Unless of course you think it's acceptable that sometimes perf record + 
CoreSight will output essentially no data (your 20k example). :)
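
The failure rate discussed above (3 failures in about 150 runs) can be measured by rerunning a command in a loop and counting non-zero exit codes. The following is only a minimal sketch: `count_failures` is a hypothetical helper, not part of perf or of this patch series, and the stand-in command `false` is used so the example runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): run a command N times and
# print how many of those runs exited with a non-zero status.
# Usage: count_failures N cmd [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only its exit status matters.
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed"
}

# Stand-in command that always fails, so every run is counted.
count_failures 3 false   # prints 3
```

For the test in question the command would be something along the lines of `count_failures 150 perf test 77`, assuming 77 is still the index of the test on the build being used (the index comes from James's output and may differ elsewhere).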

> Thanks
> James
> 
>> Signed-off-by: Carsten Haitzler <carsten.haitzler@....com>
>> ---
>>   .../shell/coresight/memcpy_thread_16k_10.sh    | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>   create mode 100755 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
>>
>> diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
>> new file mode 100755
>> index 000000000000..d21ba8545938
>> --- /dev/null
>> +++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
>> @@ -0,0 +1,18 @@
>> +#!/bin/sh -e
>> +# CoreSight / Memcpy 16k 10 Threads
>> +
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Carsten Haitzler <carsten.haitzler@....com>, 2021
>> +
>> +TEST="memcpy_thread"
>> +. $(dirname $0)/../lib/coresight.sh
>> +ARGS="16 10 1"
>> +DATV="16k_10"
>> +DATA="$DATD/perf-$TEST-$DATV.data"
>> +
>> +perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
>> +
>> +perf_dump_aux_verify "$DATA" 10 10 10
>> +
>> +err=$?
>> +exit $err
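
The `perf_dump_aux_verify "$DATA" 10 10 10` call in the patch and the "ASYNC is too low (3 < 10)" message quoted earlier suggest a minimum-count check on the decoded trace. The sketch below illustrates that kind of check only. `check_min` is a hypothetical function, the input is a plain text file rather than real perf output, and it makes no claim about how lib/coresight.sh actually implements the check or what each of the three numeric arguments means.

```shell
#!/bin/sh
# Hypothetical stand-in for a minimum-count sanity check; this is NOT
# the real perf_dump_aux_verify from lib/coresight.sh.
# Usage: check_min NAME MIN FILE
check_min() {
    name=$1
    min=$2
    file=$3
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps that from being an error.
    count=$(grep -c "$name" "$file" 2>/dev/null || true)
    # An unreadable file yields an empty string; treat that as zero.
    count=${count:-0}
    if [ "$count" -lt "$min" ]; then
        echo "Sanity check number of $name is too low ($count < $min)"
        return 1
    fi
    return 0
}

# Demo with a fake dump containing three ASYNC lines.
tmp=$(mktemp)
printf 'ASYNC\nASYNC\nASYNC\n' > "$tmp"
check_min ASYNC 10 "$tmp" || echo "FAILED"
rm -f "$tmp"
```

With a threshold of 10 and only three matching lines, the demo reports the same kind of message as the failing run quoted above.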