Message-ID: <f0ac7523-edce-4b0b-a142-14c03c912720@arm.com>
Date: Sat, 25 Nov 2023 19:10:25 +0000
From: Nick Forrington <nick.forrington@....com>
To: Leo Yan <leo.yan@...aro.org>, Michael Petlan <mpetlan@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
vmolnaro@...hat.com
Subject: Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
On 25/11/2023 03:05, Leo Yan wrote:
> Hi all,
>
> On Fri, Nov 24, 2023 at 08:57:52PM +0100, Michael Petlan wrote:
>> On Thu, 2 Nov 2023, Nick Forrington wrote:
>>> The current use of atomics can lead to test failures, as tests (such as
>>> tests/shell/record.sh) search for samples with "test_loop" as the
>>> top-most stack frame, but find frames related to the atomic operation
>>> (e.g. __aarch64_ldadd4_relax).
> I am confused by the above description. I went through the script
> record.sh, which is the only test invoking the program 'test_loop',
> but I can't find any test that relates to stack frames.
>
> Am I missing anything? I went through record.sh but found no clue as to
> why the failure would be caused by a stack frame. All the tests use:
>
> if ! perf report -i "${perfdata}" -q | grep -q "${testsym}"
> ...
> fi
>
> @Nick, could you narrow down which specific test case is causing the
> failure?
>
> [...]
Any of the checks for ${testsym} in record.sh (including the example you
quote) can fail, because the expected symbol (test_loop) is not the
top-most function on the stack, and is therefore not the symbol
associated with the sample.
Example perf report output:
# Overhead  Command  Shared Object          Symbol
# ........  .......  .....................  .............................
#
    99.53%  perf     perf                   [.] __aarch64_ldadd4_relax
...
You can see the issue when recording/reporting with call stacks:
# Children      Self  Command  Shared Object          Symbol
# ........  ........  .......  .....................  ..........................................................
#
    99.52%    99.52%  perf     perf                   [.] __aarch64_ldadd4_relax
            |
            |--49.77%--0xffffb905a5dc
            |          0xffffb8ff0aec
            |          thfunc
            |          test_loop
            |          __aarch64_ldadd4_relax
...
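
To illustrate where the extra frame comes from, here is a minimal sketch (not the perf test source; spin_atomic, spin_plain and the counters are hypothetical names). On arm64 toolchains that use GCC's outline atomics, the __atomic_fetch_add builtin may be emitted as a call to a libgcc helper such as __aarch64_ldadd4_relax, so samples land in the helper instead of the calling function. A plain increment of a volatile counter stays inline.

```c
/* Hypothetical names for illustration; this is not the perf workload. */
static unsigned int atomic_counter;
static volatile unsigned int plain_counter;

/*
 * Atomic variant: with outline atomics on arm64 the builtin may become
 * a call to a helper such as __aarch64_ldadd4_relax, which then shows
 * up as the top-most frame in samples taken inside this loop.
 */
__attribute__((noinline)) unsigned int spin_atomic(unsigned int iterations)
{
	for (unsigned int i = 0; i < iterations; i++)
		__atomic_fetch_add(&atomic_counter, 1, __ATOMIC_RELAXED);
	return __atomic_load_n(&atomic_counter, __ATOMIC_RELAXED);
}

/*
 * Non-atomic variant: the volatile qualifier keeps the increment in the
 * loop body, and no helper call is needed, so samples are attributed to
 * spin_plain itself.
 */
__attribute__((noinline)) unsigned int spin_plain(unsigned int iterations)
{
	for (unsigned int i = 0; i < iterations; i++)
		plain_counter++;
	return plain_counter;
}
```

Whether the helper call is actually emitted depends on the compiler, its version and the target flags, so treat the comments above as a description of one likely outcome.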
>
>> I believe that it was there to prevent the compiler from optimizing the
>> loop out, or for some similar reason. Hopefully, it will work even without
>> that on all architectures with all compilers that are used for building perf...
> Agreed.
>
> As said above, I'd like to step back a bit to make clear exactly what
> failure is caused by the program.
I don't think this loop could be sensibly optimised away, as it depends
on "done", which is defined at file scope (and assigned by a signal
handler).
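
As a rough sketch of that pattern (names such as on_signal, spin_until_signalled and arm_and_spin are hypothetical, and I am assuming the flag is declared volatile sig_atomic_t), the loop below has no atomics and still cannot be dropped, because every iteration has to re-read a flag that a signal handler may change:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for alarm() under strict -std= modes */
#endif
#include <signal.h>
#include <unistd.h>

/* File-scope flag written by the signal handler (assumed volatile). */
static volatile sig_atomic_t done;
static volatile unsigned int count;

static void on_signal(int sig)
{
	(void)sig;
	done = 1;
}

/* The loop condition reads "done" on every iteration. */
__attribute__((noinline)) unsigned int spin_until_signalled(void)
{
	while (!done)
		count++;
	return count;
}

/*
 * Hypothetical driver: arm a one-second alarm, spin until the handler
 * sets the flag, then return the flag value (or -1 on setup failure).
 */
int arm_and_spin(void)
{
	done = 0;
	if (signal(SIGALRM, on_signal) == SIG_ERR)
		return -1;
	alarm(1);
	spin_until_signalled();
	return (int)done;
}
```

Calling arm_and_spin() should return after about one second, once SIGALRM has been delivered and the handler has set the flag.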
Cheers,
Nick
>
> Thanks,
> Leo
>