linux-kernel - Re: [PATCH 5/5] perf arm-spe: Snapshot mode test

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ef1d4b66-1743-529f-7c5b-fbc4ada7113b@arm.com>
Date:   Tue, 2 Nov 2021 15:37:10 +0000
From:   James Clark <james.clark@....com>
To:     Leo Yan <leo.yan@...aro.org>, German Gomez <german.gomez@....com>
Cc:     linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        John Garry <john.garry@...wei.com>,
        Will Deacon <will@...nel.org>,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Mike Leach <mike.leach@...aro.org>,
        linux-arm-kernel@...ts.infradead.org, coresight@...ts.linaro.org
Subject: Re: [PATCH 5/5] perf arm-spe: Snapshot mode test



On 02/11/2021 14:07, James Clark wrote:
> 
> 
> On 20/10/2021 14:13, Leo Yan wrote:
>> On Thu, Sep 16, 2021 at 04:46:35PM +0100, German Gomez wrote:
>>> Shell script test_arm_spe.sh has been added to test the recording of SPE
>>> tracing events in snapshot mode.
>>>
>>> Reviewed-by: James Clark <james.clark@....com>
>>> Signed-off-by: German Gomez <german.gomez@....com>
>>> ---
>>>  tools/perf/tests/shell/test_arm_spe.sh | 91 ++++++++++++++++++++++++++
>>>  1 file changed, 91 insertions(+)
>>>  create mode 100755 tools/perf/tests/shell/test_arm_spe.sh
>>>
>>> diff --git a/tools/perf/tests/shell/test_arm_spe.sh b/tools/perf/tests/shell/test_arm_spe.sh
>>> new file mode 100755
>>> index 000000000000..9ed817e76f95
>>> --- /dev/null
>>> +++ b/tools/perf/tests/shell/test_arm_spe.sh
>>> @@ -0,0 +1,91 @@
>>> +#!/bin/sh
>>> +# Check Arm SPE trace data recording and synthesized samples
>>> +
>>> +# Uses the 'perf record' to record trace data of Arm SPE events;
>>> +# then verify if any SPE event samples are generated by SPE with
>>> +# 'perf script' and 'perf report' commands.
>>> +
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# German Gomez <german.gomez@....com>, 2021
>>> +
>>> +skip_if_no_arm_spe_event() {
>>> +	perf list | egrep -q 'arm_spe_[0-9]+//' && return 0
>>> +
>>> +	# arm_spe event doesn't exist
>>> +	return 2
>>> +}
>>> +
>>> +skip_if_no_arm_spe_event || exit 2
>>> +
>>> +perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
>>> +glb_err=0
>>> +
>>> +cleanup_files()
>>> +{
>>> +	rm -f ${perfdata}
>>> +	trap - exit term int
>>> +	kill -2 $$ # Forward sigint to parent
>>
>> I understand you copy this code from Arm cs-etm testing, but I found
>> the sentence 'kill -2 $$' will cause a failure at my side with the
>> command:
>>
>> root@...ntu:/home/leoy/linux/tools/perf# ./perf test 85 -v
>> 85: Check Arm SPE trace data recording and synthesized samples      :
>> --- start ---
>> test child forked, pid 29053
>> Recording trace with snapshot mode /tmp/__perf_test.perf.data.uughb
>> Looking at perf.data file for dumping samples:
>> Looking at perf.data file for reporting samples:
>> SPE snapshot testing: PASS
>> test child finished with -1
>> ---- end ----
>> Check Arm SPE trace data recording and synthesized samples: FAILED!
>>
>> I changed to use below code and looks it works for me:
>>
>>         if [[ "$1" == "int" ]]; then
>>                 kill -SIGINT $$
>>         fi
>>         if [[ "$1" == "term" ]]; then
>>                 kill -SIGTERM $$
>>         fi
>>
>> Thanks,
>> Leo
> 
> This is quite interesting. It looks like the issue is caused by the update from dash 0.5.8
> on Ubuntu 18 to dash 0.5.10 on Ubuntu 20. Specifically the commit that causes the issue is:
> 
>    commit 9e5cd41d9605e4caaac3aacdc0482f6ee220a298
>    Author: Herbert Xu <herbert@...dor.apana.org.au>
>    Date:   Mon May 7 00:40:34 2018 +0800
> 
>     jobs - Do not block when waiting on SIGCHLD
>     
>     Because of the nature of SIGCHLD, the process may have already been
>     waited on and therefore we must be prepared for the case that wait
>     may block.  So ensure that it doesn't by using WNOHANG.
>     
>     Furthermore, multiple jobs may have exited when gotsigchld is set.
>     Therefore we need to wait until there are no zombies left.
>     
>     Lastly, waitforjob needs to be called with interrupts off and
>     the original patch broke that.
>     
>     Fixes: 03876c0743a5 ("eval: Reap zombies after built-in...")
>     Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
> 
> 
> This also means that the Coresight shell test will not be working anymore because I added
> the same trap to it so that it could be run in a loop. I'm going to compare the bahaviour
> to bash to see which one is doing the right thing and what the correct change to make to 
> fix it is. Or a bug needs to be reported.
> 
> Thanks
> James
> 

Ok, it seems like I was relying on buggy dash behaviour for my original change. Even with this:

        if [[ "$1" == "int" ]]; then
                kill -SIGINT $$
        fi
        if [[ "$1" == "term" ]]; then
                kill -SIGTERM $$
        fi

it still doesn't allow you to break out of running it in a while loop. This is only because of
the exit code, rather than any kind of signal propagation. Actually it's possible to stop it
with Ctrl-\ rather than Ctrl-C, and that doesn't require any extra handling in the script.

For that reason I'm happy to go with Leo's original suggestion when I first added this which was
to not have any extra kill at all.

Another fix could be this, but I'm not too keen on it because I don't think any other tests behave
like this:

        [ "$1" = "int" ] || exit 1
        [ "$1" = "term" ] || exit 1

>>
>>> +	exit $glb_err
>>> +}
>>> +
>>> +trap cleanup_files exit term int
>>> +
>>> +arm_spe_report() {
>>> +	if [ $2 != 0 ]; then
>>> +		echo "$1: FAIL"
>>> +		glb_err=$2
>>> +	else
>>> +		echo "$1: PASS"
>>> +	fi
>>> +}
>>> +
>>> +perf_script_samples() {
>>> +	echo "Looking at perf.data file for dumping samples:"
>>> +
>>> +	# from arm-spe.c/arm_spe_synth_events()
>>> +	events="(ld1-miss|ld1-access|llc-miss|lld-access|tlb-miss|tlb-access|branch-miss|remote-access|memory)"
>>> +
>>> +	# Below is an example of the samples dumping:
>>> +	#	dd  3048 [002]          1    l1d-access:      ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
>>> +	#	dd  3048 [002]          1    tlb-access:      ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
>>> +	#	dd  3048 [002]          1        memory:      ffffaa64999c __GI___libc_write+0x3c (/lib/aarch64-linux-gnu/libc-2.27.so)
>>> +	perf script -F,-time -i ${perfdata} 2>&1 | \
>>> +		egrep " +$1 +[0-9]+ .* +${events}:(.*:)? +" > /dev/null 2>&1
>>> +}
>>> +
>>> +perf_report_samples() {
>>> +	echo "Looking at perf.data file for reporting samples:"
>>> +
>>> +	# Below is an example of the samples reporting:
>>> +	#   73.04%    73.04%  dd    libc-2.27.so      [.] _dl_addr
>>> +	#    7.71%     7.71%  dd    libc-2.27.so      [.] getenv
>>> +	#    2.59%     2.59%  dd    ld-2.27.so        [.] strcmp
>>> +	perf report --stdio -i ${perfdata} 2>&1 | \
>>> +		egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1
>>> +}
>>> +
>>> +arm_spe_snapshot_test() {
>>> +	echo "Recording trace with snapshot mode $perfdata"
>>> +	perf record -o ${perfdata} -e arm_spe// -S \
>>> +		-- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 &
>>> +	PERFPID=$!
>>> +
>>> +	# Wait for perf program
>>> +	sleep 1
>>> +
>>> +	# Send signal to snapshot trace data
>>> +	kill -USR2 $PERFPID
>>> +
>>> +	# Stop perf program
>>> +	kill $PERFPID
>>> +	wait $PERFPID
>>> +
>>> +	perf_script_samples dd &&
>>> +	perf_report_samples dd
>>> +
>>> +	err=$?
>>> +	arm_spe_report "SPE snapshot testing" $err
>>> +}
>>> +
>>> +arm_spe_snapshot_test
>>> +exit $glb_err
>>> \ No newline at end of file
>>> -- 
>>> 2.17.1
>>>