Message-ID: <20220112095408.GB3322255@leoy-ThinkPad-X240s>
Date: Wed, 12 Jan 2022 17:54:08 +0800
From: Leo Yan <leo.yan@...aro.org>
To: Carsten Haitzler <carsten.haitzler@...s.arm.com>
Cc: linux-kernel@...r.kernel.org, coresight@...ts.linaro.org,
suzuki.poulose@....com, mathieu.poirier@...aro.org,
	mike.leach@...aro.org, linux-perf-users@...r.kernel.org,
acme@...nel.org
Subject: Re: [PATCH 07/12] perf test: Add simple bubblesort test for
coresight aux data

Hi Carsten,

On Tue, Jan 04, 2022 at 03:13:08PM +0000, Carsten Haitzler wrote:
> On 1/3/22 08:00, Leo Yan wrote:
[...]
> > Furthermore, I expect the bubble sort to be used for testing the
> > CoreSight configuration, e.g. it can be used to test the strobing
> > mode (and for validating AutoFDO).
> >
> > What do you think about this?
>
> I actually didn't include any autofdo testing as this is mostly a matter
> of tooling after you have collected a trace: running through the trace
> data to build up a good image of the target's execution would probably
> belong in tooling outside of the kernel. The idea here was to see if we
> collect sufficient amounts of data and that the data looks "sane".

Yeah, this is consistent with what Suzuki told me: the main target of
this patch set is to verify the quality of the CoreSight trace data.

> This is all about checking whether we only get a single block, or only 2
> or 3 blocks and then it stops, or no blocks at all, and then applying
> various stresses on the kernel (memory heavy, CPU heavy) to see if
> anything will greatly affect this.
>
> The bubble sort does provide a basis to build some fdo tests on, but
> having a baseline of "does it collect data at all" to start with is a
> good call. I had not tested the strobing yet, as that was probably
> another phase of this. Most of this was about getting the core
> infrastructure in so that we can add lots of little test tools, plus the
> harnesses to run them and collect statistical data over time.
>
> Just a side note - the asm loop is arm64 specific and thus good for
> testing against an exact expected result, but bubble sort is portable.
> That would allow us to use this on an arm64 platform like the Morello
> board. I've been keeping in mind "be somewhat portable" for this reason.
>
> The only downside of keeping this test, I think, is that the whole test
> suite takes a bit longer to run. Is this a sufficient concern to remove
> this test from the patch set, given the above?

So my essential purpose is to condense the test cases as much as
possible :)

For example, although the Arm64 asm loop case and the bubble sort case
have different execution flows, both of them actually verify a complete
process of CoreSight trace data recording and reporting (so they cover
the CoreSight drivers, the perf tool and the OpenCSD library).

Since we can pass a different loop number to a test program, and we
already have one case that tests very small trace data with the Arm64
asm loop, why do we still need the bubble sort case with a small array?
It seems to me that more cases are not a bad thing, but if both cases
exercise the same integration flow, I personally think the two cases do
not give us significant benefit over a single one.
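
Just to illustrate what I mean by "the same integration flow", a rough
sketch (the binary names and the small loop/array sizes below are only
placeholders, not the exact tools in your series):

  # Both runs exercise the identical path: CoreSight drivers ->
  # perf AUX recording -> OpenCSD decode when reporting.
  perf record -o loop.data -e cs_etm//u -- ./asm_pure_loop
  perf record -o sort.data -e cs_etm//u -- ./sort 10
  perf report --stdio -i loop.data
  perf report --stdio -i sort.data

Whichever workload we pick, the software components under test are the
same, so a single small-data case should be enough.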

Throughout the whole patch set, another concern of mine is that some
test cases are platform dependent. E.g. if the mainline kernel contains
these test cases and later a developer reports a test case failure, it
is difficult for us to figure out whether the failure is caused by
platform factors (e.g. memory usage, timeouts, etc.) or whether it
genuinely exposes an issue in the software components.

So for a test case that requires very few resources we can set strict
criteria, and for a test case with a big chunk of trace data we can
report a percentage value as the profiling quality metric (e.g. we
expect 1000 branch samples but the result only contains 100 branch
samples, so we can output the quality metric as 100 / 1000 = 10%).
This allows us to easily conclude that the underlying mechanism works
well but the profiling quality is bad due to lost trace data. In other
words, we convert the quality result from a binary pass/fail into a
range value [0% .. 100%].
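
For instance, a hedged sketch of how a harness could compute such a
metric from a recorded trace (the expected count of 1000 is just the
example number above; "--itrace=b" asks perf to synthesize branch
samples from the AUX trace):

  expected=1000
  samples=$(perf script -i perf.data --itrace=b | wc -l)
  echo "branch sample quality: $((samples * 100 / expected))%"

The harness could then track this percentage over time instead of only
a pass/fail result.
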
Thanks,
Leo